
Verbs of Caused-Separation in Thai and Khmer:

Lexical Semantics and Language Convergence

in Mainland Southeast Asia

Nitipong Pichetpan

A thesis submitted in fulfilment of

the requirements for the degree of

Doctor of Philosophy

Department of

Faculty of Arts and Social Sciences

The University of Sydney

January 2021

Statement of Originality

This is to certify that to the best of my knowledge, the content of this thesis is my own work. This thesis has not been submitted for any degree or other purposes.

I certify that the intellectual content of this thesis is the product of my own work and that all the assistance received in preparing this thesis and sources have been acknowledged.

(Nitipong Pichetpan)

January 2021


Abstract

This dissertation investigates the question of whether and to what extent language convergence within a linguistic area may extend into the domain of lexical semantics.

To investigate this question, it examines similarities and differences among verbs of separation in Thai and Khmer, two genealogically unrelated languages that both fall within the Mainland Southeast Asian (MSEA) linguistic area. Descriptions of caused-separation events were first elicited from native speakers of the languages. Cluster analyses (Jaccard’s index and average linkage), together with analyses of the verbs’ distributional patterns, were then performed to determine the domain’s categorisation. A comparison was made to uncover (dis)similarity in semantic categorisation and provide input for discussion on areal semantics. The findings reveal that Thai and Khmer are both parallel to and different from each other at the lexical semantic level. The two languages have comparable but not identical numbers of semantic categories. Also, in their organisation of caused-separation events, Thai and Khmer are sometimes similar and sometimes different. The groupings present cross-linguistic trends (cf. Majid et al., 2004, 2008), parallel distinctions not widely reported in the cross-linguistic research, and language-specific differentiation. Further, parallelism specific to Thai and Khmer is evaluated as evidence of area-specific convergence, thus enhancing MSEA’s status as a linguistic area at the lexical semantic level. To assess the evidence of convergence, a method of triangulation with languages outside the immediate area is utilised. The study opens the way for further research regarding general context and specific mechanisms associated with patterns of Thai-Khmer semantic convergence.


Acknowledgements

Kodhaṃ paññāya ucchinde

Cut away your anger with (the of) wisdom. (Naya-Aṅ.Sattaka. 23/100)

This dissertation would not have been possible without the help and support of many people. First and foremost, I would like to thank my supervisors, Nick Enfield and Mark W. Post. Their guidance and insight have been valuable over the past years. My special thanks go to Anthony V. N. Diller for his inspiring discussions about linguistics as well as many fruitful conversations about all things related to good academic writing. I also thank Angela Terrill for comments and proofreading of this dissertation.

A special mention goes to my colleagues for their assistance in the data collection: Sarawanee Sankaburanurak and Pimpa Verochanakorn for collaboration assistance in Thailand and Cambodia, respectively; Phan Sotheara for assistance. I express my gratitude to anonymous language consultants, native speakers of Thai and Khmer, with whom I conducted fieldwork, for sharing their knowledge of these languages with me. Thanks also go to Maryam Montazerolghaem at Sydney Informatics Hub, Angus Wheeler, and Panrawee Rungskunroch for statistical assistance.

I am grateful to Klairung Amratisha, Natthaporn Panpothong, and Siriporn Phakdeephasook for support to start the long years of my PhD, to the Department of Thai and Eastern Language and Culture, Thammasat University for allowing me to pursue my PhD overseas, and to the Office of Educational Affairs, Royal Thai Embassy of Australia for administrative assistance. I would like to extend my thanks to the people at the Department of Linguistics and PGARC Woolley, University of Sydney for the friendly atmosphere; it has a direct influence on my success. I take this opportunity to thank Zhang Dongbing, Mun Kihong, Meng Weijian, Faris Yothasamuth, and Chavalin Svetanant for lunches, cups of coffee, and ideas. I also owe thanks to Napakorn Sricomnerd, Thani Saenchaiban, Thosapol Thittitanavanich, Napakadol Kittisenee, Apitsara Sriamorn, and all relatives and friends that cheered me up and made my PhD years more colourful.

My acknowledgements would not be complete without thanking the most important people in my life. To Granny, Mum and Auntie (Bunchuea Michalat†, Jurairat Sriamorn and Asama Sriamorn), I thank you for your encouragement in everything that I did and all of your love and support. I also thank Paul Vaivasa; you have helped me achieve what I have done, and you have been there with me through all of the ups and downs.


Funding Disclosure

This PhD research was supported by Thammasat University’s Scholarship for Study Overseas (2016-2020).

I greatly appreciate the grants from the Faculty of Arts and Social Sciences Doctoral Research Travel Grant Scheme (2017) and the Faculty’s Emergency Bursary (2020).

I am grateful to Chulalongkorn Summer School of Southeast Asian Linguistics organised by the Department of Linguistics, Faculty of Arts, Chulalongkorn University for the grant that allowed me to attend wonderful classes in June 2017.

Also, I am thankful to the University of Sydney’s Postgraduate Research Support Scheme (PRSS) for conference expenses in 2019.


Table of Contents

Statement of Originality ...... i

Abstract ...... ii

Acknowledgements ...... iii

Funding Disclosure ...... v

Table of Contents ...... vi

List of Tables ...... xv

List of Figures ...... xix

List of Abbreviations and Conventions ...... xxiii

CHAPTER 1 Introduction ...... 1

1.1 Premises of the study ...... 2

1.1.1 Background ...... 2

1.1.2 Subject of the study ...... 5

1.2 Research questions ...... 14

1.3 Thesis structure ...... 19

CHAPTER 2 Literature review and methodology ...... 22

2.1 Literature review ...... 23

2.1.1 Mainland Southeast Asia (MSEA): The linguistic area ...... 23

2.1.1.1 Geographical area ...... 23

2.1.1.2 MSEA as a linguistic area and its areal linguistic support ...... 24

2.1.1.3 Areal semantics: Studies of contact-induced semantic convergence ...... 31

2.1.1.4 MSEA languages to be examined: Thai and Khmer ...... 34

2.1.2 Lexico-semantic typological analysis of events ...... 37

2.1.2.1 Lexico-semantic categorisation: Analysis of Denotational Range ...... 37

2.1.2.2 Semantic maps ...... 46

2.1.2.3 Consistency of lexical descriptions ...... 49

2.1.3 Domain of caused-separation events ...... 53

2.1.3.1 Nature of caused-separation events ...... 53

2.1.3.2 Caused-separation as core of separation events ...... 57

2.1.3.3 Previous work on caused-separation in languages ...... 59

2.2 Methodology ...... 76

2.2.1 Purpose of the study ...... 76

2.2.2 Elicitation tool: MPI’s ‘cut’ and ‘break’ clips ...... 77

2.2.3 Pilot study ...... 79

2.2.3.1 Location and participants for the pilot study ...... 80

2.2.3.2 Objectives of pilot research ...... 80

2.2.3.3 Pilot Interviews: outcomes and improvement ...... 81

2.2.4 Data collection ...... 83

2.2.4.1 Planning and preparation for fieldwork ...... 83

2.2.4.2 Data collection sites ...... 85

2.2.4.3 Participants and recruitment ...... 86

2.2.4.4 Data capturing ...... 86

2.2.4.5 Ethical aspect ...... 92

2.2.5 Data processing ...... 93

2.2.5.1 Data transcription ...... 93

2.2.5.2 Data coding for references ...... 93

2.2.6 Methods of data analysis ...... 94

2.2.6.1 Summary of research design ...... 94

2.2.6.2 Statistical approaches ...... 98

CHAPTER 3 Denotational range of caused-separation in Thai ...... 112

3.1 Lexical expressions of caused-separation in Thai: Prescriptive accounts ...... 113

3.2 Lexical items for caused-separation events and their structural patterns in Thai: Elicited data ...... 129

3.2.1 List of Thai caused-separation verbs ...... 130

3.2.2 Summary of structural patterns of Thai caused-separation verbs ...... 135

3.2.2.1 Verbs of caused-separation in single-verb constructions in Thai ...... 137

3.2.2.2 Verbs of caused-separation in multi-verb construction in Thai ...... 140

3.3 Granularity of caused-separation categories in Thai ...... 145

3.3.1 Caused-separation categories in Thai ...... 148

3.3.1.1 Patterning of caused-separation categories by verbs in Thai ...... 148

3.3.1.2 Lexical variation within caused-separation categories in Thai ...... 153

3.3.2 Partitioning of caused-separation categories into subcategories in Thai ...... 156

3.3.3 Asymmetric lexical resources and granularity in caused-separation categories in Thai ...... 164

3.3.4 Summary ...... 167

3.4 Boundary locations of caused-separation categories in Thai ...... 168

3.4.1 Placement of caused-separation category boundaries in Thai ...... 168

3.4.1.1 Grouping of caused-separation events in Thai ...... 169

3.4.1.2 Overlaps between caused-separation categories in Thai ...... 176

3.4.2 Inconsistency in naming caused-separation events in Thai ...... 184

3.4.2.1 Degrees of inconsistency in caused-separation descriptions in Thai ...... 184

3.4.2.2 Inconsistency in naming core versus peripheral scenes of caused-separation categories in Thai ...... 185

3.4.2.3 Inconsistency levels in descriptions across caused-separation categories in Thai ...... 187

3.4.3 Summary ...... 188

3.5 Semantic organisation of caused-separation categories in Thai ...... 189

3.5.1 Semantic characteristics of caused-separation categories in Thai ...... 190

3.5.2 Lexicalisation of caused-separation in Thai ...... 212

3.5.3 Summary ...... 218

CHAPTER 4 Denotational range of caused-separation in Khmer ...... 220

4.1 Lexical expressions of caused-separation in Khmer: Prescriptive accounts ...... 221

4.1.1 Morphosyntax of caused-separation verbs in Khmer ...... 221

4.1.2 Semantic treatment of caused-separation verbs in Khmer ...... 225

4.2 Lexical items for caused-separation events and their structural patterns in Khmer: The speakers’ elicited data ...... 228

4.2.1 List of Khmer caused-separation verbs and their occurrences ...... 229

4.2.2 Summary of structural patterns of Khmer caused-separation verbs ...... 235

4.2.2.1 Verbs of caused-separation in single-verb constructions in Khmer ...... 238

4.2.2.2 Verbs of caused-separation in multi-verb construction in Khmer ...... 240

4.3 Granularity of caused-separation categories in Khmer ...... 251

4.3.1 Caused-separation categories in Khmer ...... 252

4.3.1.1 Patterning of caused-separation categories by verbs in Khmer ...... 252

4.3.1.2 Lexical variation within caused-separation categories in Khmer ...... 257

4.3.2 Partitioning of caused-separation categories into subcategories in Khmer ...... 260

4.3.3 Asymmetric lexical resources and granularity in caused-separation categories in Khmer ...... 269

4.3.4 Summary ...... 272

4.4 Boundary locations of caused-separation categories in Khmer ...... 273

4.4.1 Placement of caused-separation category boundaries in Khmer ...... 273

4.4.1.1 Grouping of caused-separation events in Khmer ...... 273

4.4.1.2 Overlaps between caused-separation categories in Khmer ...... 281

4.4.2 Inconsistency in naming caused-separation events in Khmer ...... 287

4.4.2.1 Degrees of inconsistency in caused-separation descriptions in Khmer ...... 288

4.4.2.2 Inconsistency in naming core versus peripheral scenes of caused-separation categories in Khmer ...... 288

4.4.2.3 Inconsistency levels in descriptions across caused-separation categories in Khmer ...... 291

4.4.3 Summary ...... 293

4.5 Semantic organisation of caused-separation categories in Khmer ...... 293

4.5.1 Semantic characteristics of caused-separation categories in Khmer ...... 294

4.5.2 Lexicalisation of caused-separation in Khmer ...... 316

4.5.3 Summary ...... 322

CHAPTER 5 Comparison of semantic categorisation of caused-separation across Thai and Khmer ...... 324

5.1 Verb usage for caused-separation domain in Thai and Khmer: The data from fieldwork elicitation ...... 325

5.1.1 Frequency analysis in caused-separation domain across Thai and Khmer ...... 325

5.1.2 Verbs of caused-separation used by Thai and Khmer speakers ...... 327

5.2 Comparison of granularity of caused-separation domain across Thai and Khmer ...... 332

5.2.1 Granular categorisation in caused-separation domain across Thai and Khmer ...... 333

5.2.1.1 Numbers of categories across Thai and Khmer ...... 333

5.2.1.2 Internal organisation of caused-separation categories in Thai and Khmer: Finer granularity and depth of hierarchy structure ...... 337

5.2.2 Asymmetry in encoding of caused-separation across Thai and Khmer: A look at lexical resources ...... 340

5.2.2.1 Asymmetry in lexical resources ...... 341

5.2.2.2 Asymmetry in hierarchy of semantic categories ...... 345

5.3 Comparison of category boundaries in caused-separation domain across Thai and Khmer ...... 349

5.3.1 Event grouping for caused-separation categories across Thai and Khmer ...... 350

5.3.2 Semantic distinctions in caused-separation domain across Thai and Khmer ...... 356

5.3.3 Overlap in category extension: A view through characteristic divergence ...... 360

5.4 Comparison of semantic organisation of caused-separation domain across Thai and Khmer ...... 365

5.4.1 Semantic components in caused-separation domain across Thai and Khmer ...... 366

5.4.2 Semantic characteristic conflation in lexicalisation of caused-separation verbs across Thai and Khmer ...... 371

CHAPTER 6 Areal lexico-semantics in Mainland Southeast Asian (MSEA) Sprachbund: A case of caused separation in Thai and Khmer ...... 376

6.1 Research findings in summary ...... 376

6.1.1 Lexical semantic categorisation of caused-separation in Thai and Khmer ...... 377

6.1.2 Similar and different lexical semantic distinctions of caused-separation across Thai and Khmer ...... 379

6.1.2.1 Convergence to cross-linguistic distinction trend ...... 379

6.1.2.2 Mutual parallelism across Thai and Khmer ...... 381

6.1.2.3 Divergence across Thai and Khmer ...... 383

6.2 Area-specific lexical semantics: A look at mutual parallelism across Thai and Khmer ...... 385

6.2.1 Local lexical semantic convergence in Thai and Khmer against mutual parallelism in Hindi and Tamil (Narasimhan, 2007) ...... 386

6.2.2 Thai and Khmer lexical semantic convergence as areal traits ...... 389

6.2.3 Evidence of MSEA linguistic area from Thai and Khmer areal semantics ...... 391

6.3 Research implications, limitations, and recommendations ...... 391

6.3.1 Implications ...... 392

6.3.2 Limitations ...... 394

6.3.3 Recommendations ...... 395

References ...... 397

Appendices

A. ‘Cut’ and ‘Break’ Clips (Bohnemeyer et al., 2001) ...... 415

B. Gini-Simpson’s Diversity-Index Scores for Thai and Khmer ...... 419

C. Ethics Approval Letter ...... 421


List of Tables

Table 2.1 Some MSEA areal features in phonology, morphology, syntax and lexical semantics ...... 27

Table 2.2 Shared semantic components of meaning with reference to hit-class, cut-class, and break-class ...... 62

Table 2.3 Pattern of syntactic behaviour of three different caused-separation verb classes: hit-class, cut-class, and break-class, with respect to diathesis alternations ...... 62

Table 2.4 Subtype events of separation in ‘cut’ and ‘break’ clips ...... 78

Table 2.5 Prompting questions in English and Thai, for ‘cut’ and ‘break’ clips ...... 81

Table 2.6 Formal outline for note-taking during an interview ...... 91

Table 2.7 Dummy distributions of verb occurrences in Language A ...... 99

Table 2.8 Gini-Simpson’s D used to numerically specify how Scenes X and Y are diverse regarding verb types ...... 102

Table 2.9 Binary distributions of two fish species discovered in four dummy rivers ...... 105

Table 2.10 Jaccard’s similarity indices of four dummy rivers, measured by two fish species ...... 107

Table 2.11 Distance measures of four dummy rivers ...... 109

Table 2.12 All average linkage hierarchical clustering algorithm of Rivers A, B, C, and D ...... 110

Table 3.1 Cutting not found in some of the four dictionaries ...... 115

Table 3.2 Circularity in dictionary definitions: 11 cutting words in RIT (1950, 1982) ...... 117

Table 3.3 English translations used for definitions of 11 Thai cutting words in McFarland (1944) and Haas (1964) ...... 118

Table 3.4 Theme objects specified for cutting words’ definition proposed by Premsrirat (1987) ...... 119

Table 3.5 Manners/purposes/emotions specified for cutting words’ definition proposed by Premsrirat (1987) ...... 121

Table 3.6 Caused-separation words found in Phathumetha’s (2016) Thai-Thai thesaurus ...... 125

Table 3.7 Thai verbs used to describe 43 caused-separation scene clips ...... 131

Table 3.8 Pairing of caused-separation verbs and their potential succeeding verbs of resulting separation ...... 144

Table 3.9 Summary of predominant Thai verbs in six caused-separation categories ...... 152

Table 3.10 Thai verb types and percentage of occurrences for six categories of caused-separation ...... 155

Table 3.11 Simplified placement of category boundaries in Thai caused-separation domain ...... 176

Table 3.12 Gini-Simpson’s diversity indices and verbs relating to periphery within the /tàt/-, /tʰúp/-, and /tɕʰìːk/-categories ...... 186

Table 3.13 Semantic organisation patterns in classification of caused-separation domain in Thai ...... 211

Table 3.14a Lexicalisation of caused-separation at category level in Thai ...... 213

Table 3.14b Lexicalisation of caused-separation at subcategory level in Thai ...... 213

Table 3.14c Lexicalisation of caused-separation at subdivision level in Thai ...... 214

Table 4.1 Some Khmer separation verbs with ideal examples of typical objects being separated or divided and instruments involved ...... 226

Table 4.2 Khmer verbs used to describe 43 caused-separation clips ...... 230

Table 4.3 Pairing of preceding caused-separation verbs and their potential succeeding transitive verbs ...... 247

Table 4.4 Patterns of transitive verbs of caused-separation in serialisation ...... 249

Table 4.5 Pairing of preceding caused-separation verbs and their potential succeeding intransitive verbs of resulting separation ...... 250

Table 4.6 Summary of predominant Khmer verbs in seven caused-separation categories ...... 256

Table 4.7 Khmer verb types and percentage of occurrences for seven categories of caused-separation ...... 259

Table 4.8 Simplified placement of category boundaries in Khmer caused-separation domain ...... 281

Table 4.9 Gini-Simpson’s diversity indices and verbs relating to periphery within the /kat/-, /dɑm/-, /puh/-, and /kac/-categories ...... 289

Table 4.10 Summary of caused-separation scenes described by /pdac/ ‘separate’ in /dɑm/-, /cak/-, and /tieɲ/-categories with regards to certain semantic features ...... 312

Table 4.11 Semantic organisation patterns in classification of caused-separation domain in Khmer ...... 316

Table 4.12a Lexicalisation of caused-separation at category level in Khmer ...... 317

Table 4.12b Lexicalisation of caused-separation at subcategory level in Khmer ...... 318

Table 4.12c Lexicalisation of caused-separation at subdivision level in Khmer ...... 318

Table 5.1 Summary of frequency analysis for caused-separation verbs in Thai and Khmer ...... 326

Table 5.2 Summary of cluster analysis for caused-separation categories in Thai and Khmer ...... 334

Table 5.3 Grid-like design of core set of caused-separation scenes ...... 342

Table 5.4a Lexical resources available in each category linked to core caused-separation scenes in Thai ...... 343

Table 5.4b Lexical resources available in each category linked to core caused-separation scenes in Khmer ...... 343

Table 5.5 Organisation of caused-separation events in relation to categories across Thai and Khmer ...... 356

Table 5.6 Semantic distinctions in structuring of caused-separation domain across Thai and Khmer ...... 357

Table 5.7 Semantic component patterns belonging to caused-separation categories across Thai and Khmer ...... 366

Table 5.8 Additional semantic components for subcategories across Thai and Khmer ...... 369

Table 5.9 Comparison of Thai and Khmer semantic conflation patterns for some verbs of caused-separation ...... 372


List of Figures

Figure 1.1. Triangulation of Thai, Khmer, and Hindi and Tamil, in explorations of MSEA’s areal lexico-semantics ...... 14

Figure 2.1. Hierarchical order of distinguishing components in the classification of cutting in Thai after Premsrirat (1987) ...... 71

Figure 2.2. Pre-fieldwork tasks taken from October 2016 to June 2017 ...... 84

Figure 2.3. Interview procedures for eliciting data using the ‘cut’ and ‘break’ clips (Bohnemeyer et al., 2001) ...... 88

Figure 2.4. Dendrogram of four dummy rivers ...... 111

Figure 3.1. Numbers of scenes with different numbers of verbs in Thai ...... 134

Figure 3.2. Hierarchical clustering of caused-separation scenes, based on corresponding verbs in Thai ...... 150

Figure 3.3a. Cluster tree and verb frequency for /tàt/-category ...... 158

Figure 3.3b. Cluster tree and verb frequency for /tʰúp/-category ...... 160

Figure 3.3c. Cluster tree and verb frequency for /hàk/-category ...... 160

Figure 3.3d. Cluster tree and verb frequency for /tɕʰìːk/-category ...... 162

Figure 3.3e. Cluster tree and verb frequency for /tʰîm/- and /krìːt/-categories ...... 162

Figure 3.4. Lexical density within six categories of caused-separation in Thai ...... 165

Figure 3.5. Simplified hierarchical structure of semantic domain “caused-separation” in Thai ...... 166

Figure 3.6. Average of diversity values for six categories in caused-separation in Thai ...... 187

Figure 3.7. Chisel-type instruments potentially interpreted as having either a sharp point or a sharp short ...... 192

Figure 3.8. Outer border of hand imitating blade of knife, in karate-chopping ...... 193

Figure 3.9. Perpendicular-held blade characteristic for /sàp/-subdivision of /fan/-subcategory ...... 194

Figure 3.10. Use of contained in all scenes in /tɕaːm/-subdivision ...... 195

Figure 3.11. One- and two-dimensional rigid objects as characteristic for /hàk/-category ...... 199

Figure 3.12. Specific types of one- and two-dimensional flexible objects as characteristic for /tɕʰìːk/-category ...... 200

Figure 3.13. Hole in object’s surface as characteristic for /tʰîm/-category ...... 201

Figure 3.14. Mapping of semantic characteristics onto six categories (as well as subcategories) of caused-separation in Thai ...... 208

Figure 4.1. Numbers of scenes with different numbers of verbs in Khmer ...... 234

Figure 4.2. Hierarchical clustering of caused-separation scenes, based on corresponding verbs in Khmer ...... 254

Figure 4.3a. Cluster tree and verb frequency for /kat/-category ...... 262

Figure 4.3b. Cluster tree and verb frequency for /dɑm/-category ...... 262

Figure 4.3c. Cluster tree and verb frequency for /puh/-category ...... 265

Figure 4.3d. Cluster tree and verb frequency for /cak/-category ...... 265

Figure 4.3e. Cluster tree and verb frequency for /kac/-category ...... 265

Figure 4.3f. Cluster tree and verb frequency for /tieɲ/-category ...... 266

Figure 4.3g. Cluster tree and verb frequency for /heak/-category ...... 266

Figure 4.4. Lexical density within seven categories of caused-separation in Khmer ...... 269

Figure 4.5. Simplified hierarchical structure of semantic domain “caused-separation” in Khmer ...... 270

Figure 4.6. Average of diversity values for seven categories in caused-separation in Khmer ...... 292

Figure 4.7. Scene 15 SAW STICK, showing the back-and-forth action with use of saw to cause separation ...... 297

Figure 4.8. Scene 4 CHOP STRETCHED CLOTH W/ KNIFE, showing the person repeatedly striking blows with small knife ...... 298

Figure 4.9a-c. /han/-category (a-b) consistent with use of supporting surface, whereas /ʔaa/-category (c) not requiring implementation of supplementary support ...... 299

Figure 4.10. Some scenes in /dɑm/-category involving use of (blunt-headed) hammer ...... 300

Figure 4.11. Some scenes in /dɑm/-category involving use of (blunt-edged) knife hand ...... 300

Figure 4.12. Blunt edge of knife hand comparable to blunt hammerhead ...... 301

Figure 4.13a-c. Three scenes relating intended results of multiple-fragment versus one-location separation ...... 303

Figure 4.14. Scenes in /puh/-category consistent with lengthwise direction of separation ...... 304

Figure 4.15. Scenes in /kap/-subcategory involving use of big or ...... 305

Figure 4.16. Some scenes in /puh/-category involving use of small knives, as opposed to those in /kap/-subcategory ...... 306

Figure 4.17. Scenes in /cak/-category involving both (intended) full and partial separation ...... 307

Figure 4.18. One- and two-dimensional rigid objects as characteristic for /kac/-category ...... 309

Figure 4.19. /tieɲ/-category insensitive to whether caused-separation was made with or without intensity ...... 310

Figure 4.20. /haek/-category insensitive to whether full separation was expected ...... 311

Figure 4.21. Mapping of semantic characteristics onto seven categories (as well as subcategories) of caused-separation in Khmer ...... 314

Figure 5.1. Numbers of verb types per scene and descriptions per scene across Thai and Khmer ...... 327

Figure 5.2. Caused-separation verbs used by Thai speakers ...... 328

Figure 5.3. Caused-separation verbs used by Khmer speakers ...... 329

Figure 5.4. Portions in percentage of different categories in caused-separation domain in Thai and Khmer ...... 335

Figure 5.5a. Hierarchy of caused-separation categories with relevant subcategories and smaller divisions in Thai ...... 339

Figure 5.5b. Hierarchy of caused-separation categories with relevant subcategories and smaller divisions in Khmer ...... 339

Figure 5.6. Overlapping ranges of categories in Thai and Khmer, with relevant core and peripheral scenes ...... 363


List of Abbreviations and Conventions

1 first person

3 third person

ADVP adverb phrase

CAUS causative

CLF classifier

CONJ conjunction

COP copular

DET determiner

FP final particle

INTR intransitive

N noun

NEG negative

NP noun phrase

O object

OBJ object

PASS passive

PL plural

POL polite

POSS possessive

PREP preposition

PROG progressive aspect

PURP purposive


QUOT quotative

RECP reciprocal

REL relativiser

RESP resultative phrase

SBJ subject

SG singular

TR transitive

V verb

VB verb

VP verb phrase

• Proper names are glossed with initial capital and period.

• Bold is added in examples to indicate items of interest.

• Bracketed transcriptions and glosses in examples indicate portions that are omitted from the original materials but are part of the sentence or passage meaning.

• Bracketed sections in text denote syntactic structures, or represent the lexicalisation properties of verbs.

• Slashes are used to indicate Thai and Khmer words or phrases in text.

• Single quotation marks are used to indicate translations or glosses in text.

• Double-headed arrows represent translations between languages.


CHAPTER 1

Introduction

The Mainland Southeast Asia Sprachbund, or linguistic area, is well established, indeed considered one of the best, if not the best, examples of linguistic areas in the world. This linguistic area is defined on the basis of a wide range of structurally convergent traits—at various linguistic levels: e.g., phonology, morphology and syntax—among the area’s genealogically unrelated but sympatric languages.

Nevertheless, despite a large body of previous work focused on revealing such structural convergence among those languages, research on lexical-semantic parallelism relating to the postulation of the Sprachbund is still underdeveloped. The present study accordingly seeks to examine the lexico-semantic relatedness of two Mainland Southeast Asia languages, namely Thai and Khmer, to determine whether evidence at the semantic level in those languages can be found to support the notion of the Mainland Southeast Asia linguistic area, and in turn to expand our understanding of areal semantics.

In the following sections, background information is provided on the area of Mainland Southeast Asia investigated in the present study. This orientation includes how the area has been treated as a linguistic area. Use of areal (lexical) semantics to support the existence of the Mainland Southeast Asia linguistic area is introduced in § 1.1.1. In § 1.1.2, I consider the study’s limitation to two genetically unrelated but neighbouring languages, Thai and Khmer, selected to serve as representative Mainland Southeast Asia languages. Also introduced is the experiential event domain of caused-separation, presented as the domain chosen to uncover areal lexico-semantic trends. The research questions posed in this study are subsequently addressed in § 1.2, followed by an overview of the dissertation structure in § 1.3.

1.1 Premises of the study

1.1.1 Background

Mainland Southeast Asia (MSEA) as used here focuses on the core area occupied by Indochina (Cambodia, Laos, and Vietnam) and Thailand (Comrie, 2007). A wider regional grouping would include Myanmar, Peninsular Malaysia, Southern and Southwestern China and some states of Northeast India (Enfield, 2001, 2005, 2019). The area is hence understood as incorporating two corresponding notions: core MSEA and greater MSEA. Not only do the four core states share recognised borders, but they also share historically rooted political bonds. These derive mainly from the predecessor polity, the Khmer empire, which prevailed over most parts of the four nations at the peak of its power in the 11th and 12th centuries (Sercombe & Tupas, 2014; Siebenhütter, 2019). The past and present relationships among these nations help considerably in understanding the long-standing ethnolinguistic contact situations in this small region, which define the core area of MSEA. The latter notion, greater MSEA, includes parts of present nation-states peripheral to central MSEA, namely the lowlands of China south of the Yangtze River, India’s seven sister states (Arunachal Pradesh, Assam, Meghalaya, Manipur, Mizoram, Nagaland and Tripura), and the Himalayan region (e.g., Sikkim) (Matisoff, 2001).


Within core MSEA, there are five recognised language families commingling: in alphabetical order, Austroasiatic, Austronesian¹, Hmong-Mien, Sino-Tibetan, and Tai-Kadai (Comrie, 2007; Dahl, 2008; Enfield, 2001, 2005, 2011a; Matisoff, 2001). Enfield (2019) considers MSEA languages of these families typical in that they show many common structural properties observed in other language areas around the world. Also, however, there are significant unusual area-wide features. Post (2015) notes such characteristic relatedness stretching across the language families of MSEA, possibly through a variety of contact-related phenomena, pointing to a case of linguistic convergence in the area. Based on this convergence, the MSEA linguistic area or Sprachbund² is consistent with Enfield’s (2005, p. 190) broad definition (see details in § 2.1.1.2).

The MSEA linguistic area or Sprachbund is well established (Bisang, 2006; Enfield, 2005, 2011a) and is even considered “the prime example” of the world’s linguistic areas (Gil, 2015, p. 267; Siebenhütter, 2019, pp. 16-17) or “the ultimate Sprachbund” (Dahl, 2008, p. 2018), as qualified by previous studies on many areally distributed traits. Enfield (2005, p. 182) establishes that MSEA areal structural features are shared “at multiple levels” recognisable from phonology, morphology, and lexicon, through to syntax.

However, despite the number of studies on structural relatedness of MSEA languages across different language families (e.g., Bisang, 1991, 1999; Budge, 1980; Capell, 1979; Clark & Prasithrathsint, 1985; Clark, 1985, 1989, 1996; Enfield, 2003, 2004, 2005, 2011a; Li & Thompson, 1981; Matisoff, 1973, 1991; Migliazza, 1996, Chapter 11), studies evaluating semantic convergence in particular have been patchy and few. Enfield (2005, p. 196) noted: “Even within the semiotic and cultural phenomena most closely tied to linguistic structure, little is known about the geographical distribution of variation”. Specifically, for the MSEA linguistic area, areal lexico-semantic explorations appeared with Matisoff’s (1978) seminal work on Tibeto-Burman variational semantics, which refers in part to areal parallel lexicalisations. Later, in the summary that backgrounds his analysis of areal prosodic diffusibility, Matisoff (2001) includes a list of MSEA lexico-semantic convergent traits presented in his and his contemporaries’ work. These include expressions referring to psychological phenomena (Matisoff, 1986), as well as fine-grained lexical specificity (e.g., of verbs of manipulation; cf. Diffloth, 1994). Recently, Siebenhütter (2019, mainly Chapter 6) has introduced an analysis of strategies for encoding spatial relations in specific MSEA languages as evidence for areally diffused semantic values. Overall, the literature on MSEA areal semantic convergence can be described as patchy and sporadic, thereby justifying more investigation into this domain.

1 Austronesian is partly represented in MSEA by Cham, spoken in South Vietnam and Cambodia (cf. Enfield, 2005; Matisoff, 2001).

2 Although the terms “Sprachbund” and “linguistic area” can be employed for different notions (Thomason & Kaufman, 1988), both are treated as synonymous in the present dissertation, following other scholars (Enfield, 2005; Siebenhütter, 2019, among others).

To sum up, there seems to be no need to redefine the MSEA Sprachbund since it is already well established (Enfield, 2005). However, at the semantic level, detailed studies are far from sufficient. In particular, there are few studies documenting and explaining how certain areal semantic features spread and become diffused among languages in the area. For progress in this line of enquiry, a controlled methodology is required. The next section deals with how MSEA areal semantics can be investigated through understanding the lexico-semantic organisation of an event domain in chosen MSEA languages, as measured against those in certain languages outside the area.


1.1.2 Subject of the study

The subject of this dissertation is the lexico-semantic categorisation of an event domain, as revealed by its lexical encoding, in certain MSEA languages. To investigate this subject, the study proposes the following steps (explained in more detail in Chapter 2):

(a) One experiential event domain is selected as a basis for comparative investigation.

(b) Two genealogically unrelated but sympatric languages in the MSEA region are selected. From these, linguistic data are captured using a controlled experimental elicitation task to provide corpus data sets for language-internal and cross-linguistic lexico-semantic explorations.

(c) Two genetically unrelated languages outside the MSEA region are selected and surveyed for comparative purposes to probe the areal extent of lexico-semantic commonalities established for the languages in (b).

The following paragraphs first describe the field of lexico-semantic (event) categorisation in human language, and why investigations of patterns of lexico-semantic organisation are promoted in this research as input for analysing areal linguistic relationships. Further, I point out which experiential domain was chosen for the present study and explain the rationale for its selection. Lastly, I introduce the languages providing lexico-semantic evidence: two MSEA languages and two other languages outside the area for comparison.

In the present research, the study of lexico-semantic categorisation is mainly focused on how given domains like objects or events are organised in terms of their lexical encoding. In other words, it concerns how lexical expressions play their part in speakers’ conceptualisation and categorisation of diverse objects and events in the physical world. The semantic categories in such domains are revealed in speakers’ lexical expressions. Studies of lexico-semantic categorisation are accordingly a type of exploration of systematic relations between meaning and form (e.g., lexical expressions, in the case of this study; cf. Talmy, 2003).

Narasimhan, Kopecka, Bowerman, Gullberg, and Majid (2012) note that previous research on semantic categorisation (much of which is, of course, lexically oriented) espouses at least three competing positions as to how humans as speakers recognise, i.e., package and structure, physical objects or experiential events. In this study’s terms, these standpoints are universal categorisation, cross-linguistic diversity in categorisation, and cross-linguistic convergent categorisation with language-specificity (see more details in Chapter 2). In brief, the first two are opposites of each other. Universal categorisation postulates that humans share either innate universal recognition (cf. Fodor, 1975) or levels of exposure to the same correlated properties (cf. Rosch & Mervis, 1975) and fine-grained discriminations (cf. Rogers & McClelland, 2004), and therefore have universal linguistic organisation. By contrast, the stance of significant cross-linguistic variation in categorisation holds that speakers of different languages differ in semantic patterning, mostly due to linguistic arbitrariness (cf. Bloomfield, 1933; Gleason, 1961). The last of the three viewpoints takes a middle course, in that it acknowledges the role of biases in perception and arbitrary conventions in localising semantic organisation but does not entirely exclude a degree of universal convergence (cf. Levinson, Meira & the Language and Cognition Group, 2003; Majid, Bowerman, van Staden & Boster, 2007a, among others).


Methodologically speaking, despite differences in opinions, the authors cited above all theorise categorisation patterns of a given domain across different languages. That is, previous research on semantic categorisation considers how the presuppositions about a domain’s justifiable categories are practically useful for cross-linguistic typological comparisons—potentially in determining cross-linguistic convergence and diversity.

The present study consequently approaches the same subject in such a way that lexico-semantic patterning (i.e., of a given domain) can be obtained for MSEA languages and used as evidence of lexical-semantic characteristics to contrast with those of other languages outside the area. Fundamentally, these analytical steps provide a reasonable basis for examining MSEA’s areal semantics.

Additionally, explorations of lexical-semantic organisation in this research are mainly discussed with reference to three denotational aspects summarised by Evans (2010): semantic granularity, placement of semantic boundaries, and semantic grouping, based on probabilistic semantic maps (cf. Koptjevskaja-Tamm, Rakhilina, & Vanhove, 2016; see more detail in Chapter 2). Briefly, these aspects help produce a systematic description of a given domain’s semantic architecture as determined by relevant lexical descriptors, i.e., verbs in this study. Semantic granularity concerns how many categories are minimally presupposed. Boundary location refers to where category boundaries are located. Semantic grouping has to do with what are treated as instances (e.g., subcategories) of the same assumed categories and what criteria are relevant in defining types, or in determining (both grouping and dissecting) reasonable categories. Consequently, analysing the lexico-semantic surface characterisation of a given domain across languages can help determine systematic class-defining properties, thus providing robust data for typological comparison: e.g., of areal semantic relatedness.
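The three denotational aspects described above can be made concrete with a small sketch. The Python below is illustrative only: the clip identifiers, the verb labels (`verb_A`, `verb_P`, and so on), and the function names are hypothetical and are not drawn from this study's data. It assumes, as a simplification, that each stimulus clip is assigned a single dominant verb, so that granularity is the number of distinct verbs, grouping is the set of clips each verb names, and a boundary mismatch is any pair of clips grouped together in one language but separated in the other.

```python
from itertools import combinations

# Hypothetical toy data: clip id -> dominant verb label (not real elicitation data).
language_x = {"clip1": "verb_A", "clip2": "verb_A", "clip3": "verb_B", "clip4": "verb_C"}
language_y = {"clip1": "verb_P", "clip2": "verb_Q", "clip3": "verb_Q", "clip4": "verb_R"}


def categories(assignment):
    """Semantic grouping: map each verb to the set of clips it names."""
    groups = {}
    for clip, verb in assignment.items():
        groups.setdefault(verb, set()).add(clip)
    return groups


def granularity(assignment):
    """Semantic granularity: the number of distinct categories."""
    return len(categories(assignment))


def co_grouped_pairs(assignment):
    """Pairs of clips that share a category, i.e. with no boundary between them."""
    pairs = set()
    for clips in categories(assignment).values():
        pairs.update(frozenset(p) for p in combinations(sorted(clips), 2))
    return pairs


def boundary_mismatches(a, b):
    """Clip pairs grouped together in one language but separated in the other."""
    return co_grouped_pairs(a) ^ co_grouped_pairs(b)


print(granularity(language_x), granularity(language_y))
print(sorted(sorted(p) for p in boundary_mismatches(language_x, language_y)))
```

In this toy case both languages have the same granularity, yet they place their boundaries differently, which is the kind of contrast the three aspects are meant to separate.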

Next, I explore the chosen domain employed in the present study to help identify commonalities (and diversity) of lexico-semantic organisation across languages. The selection of two MSEA languages and two comparative languages outside the area, from which the lexico-semantic insights for this research derive, is then justified.

In this research, I selected the experiential domain of caused-separation as the focus for lexical-semantic explorations, both enabled by, and building on, earlier research. This event domain comprises events of material destruction (Majid, van Staden, Boster, & Bowerman, 2004), in which agents manipulate theme objects so that they undergo an irreversible change of state, or “state change involving some kind of separation in an object” (Majid et al., 2007a). In effect, agents perform actions causing change-of-state events that lead “to an observable disruption in the continuity of a figure in an irreversible manner” (Devylder & Zlatev, 2020, p. 257)3. The actions typically involve some type of manipulation. Lexical descriptors or verbs of caused-separation are accordingly regarded as verbs of manipulation (Matisoff, 2001) or verbs of change of state (Guerssel, Hale, Laughren, Levin & White, 1985; Levin, 1993; Majid et al., 2007a).

Additionally, caused-separations are seen as core events of general separations, as compared with other non-destructive but reversible separations (e.g., opening, or taking-apart) or removals of outer layers (e.g., peeling) (cf. Majid, Boster, & Bowerman, 2008; Majid et al., 2004, 2007a). The present study focusses only on events of caused-separation, which involve the destructibility of objects and irreversibility of state change.

3 The definition of caused-separation in this study focuses on (a) observable material destruction and (b) irreversible manner (see Devylder and Zlatev, 2020, p. 257). These two factors appear adequate for classifying events as caused-separation without the use of additional ones. Though complete versus partial and intentional versus accidental separation of a theme object are extra factors that one can consider, they do not affect the established domain with respect to its event-member inclusion. In addition, the MPI’s stimulus labels (see § 2.2.2 and Appendix A) concerning these extra details may not be exact. They are only an approximate description for convenience.

Caused-separations are central to human cognition. Majid et al. (2004, 2007a) argue for this domain’s relevance and significance to human cognition, building upon two notions: one regarding the easy discernibility of caused-separations, and the other concerning the use of common knowledge to deal with such events. First, the notion of accessibility means that caused-separation events are readily observable. They are not hidden human activities but are commonly observed in everyday life, since human interactions and behaviours usually include actions of this kind: e.g., from preparing food in a village kitchen to trimming paper after printing in an office in Bangkok. Second, and correspondingly, talking about events of caused-separation depends only on common knowledge. Speakers of a language do not have to rely on specialised knowledge to mention a caused-separation, nor do listeners have to possess specific expertise to understand references to such an event. Data collection methods in this domain can therefore be applied widely to speakers of particular languages, since they tend to feel comfortable when asked to describe events of this universally recognisable kind.

A further selection criterion of importance is ongoing disciplinary interest in the domain of caused-separation. Leading experts in lexico-semantic analysis have conducted studies across more than 30 languages (e.g., Guerssel et al., 1985; Hale & Keyser, 1987; Majid et al., 2004; Majid et al., 2007a-b; Narasimhan, 2007, among others). Such research has contributed to a considerable body of existing literature on this experiential event domain. Methodological tools have been developed which are appropriate for the current study, as discussed in Chapter 2. These studies provide important evidence establishing robust grounds for comparative typological studies.

These factors indicate why the present research has opted to focus on the experiential domain of caused-separation events. In the present study, this event domain has been used to show how speaker-informants of two chosen MSEA languages referred to events of the specified kind in free descriptions of controlled visual stimuli. Linguistic information obtained in this way, consisting of verbal descriptors, is coded and employed to characterise lexico-semantic convergence among the selected MSEA languages. Finally, features of lexical semantics established for the MSEA region investigated are compared with relevant findings in non-MSEA languages. This enables generalisations as to the extent of areally diffused lexico-semantic characteristics.

Two languages, Thai and Khmer, have been chosen as representative of those in the MSEA region. Data evaluated in this study were collected from native speakers of these languages. Lexico-semantic observations from both languages have then been compared and contrasted with insights derived from two non-MSEA languages. In a previous study, Narasimhan (2007) examined the same event domain of caused-separation as it was categorised by speakers of Hindi and Tamil. Findings from these languages are thus useful for the present study’s comparative purposes.

Selection of the representative MSEA languages, Thai and Khmer, and of the languages used for comparison, Hindi and Tamil, is further explained and justified below. First, some preliminary background on these languages follows.


Siamese (Enfield, 2005; Gedney, 1989; Li, 1977), Standard Thai (Diller, 2012; Lewis, 2009), or simply Thai as used in this research, refers to an MSEA language of the Tai-Kadai language family. It is the official and national language of Thailand (Smalley, 1994). Although Thai is widely spoken across the country, the Thai language studied here is mainly based on normative varieties spoken in Bangkok, the capital, and related Central Thai varieties (cf. Kosonen & Person, 2014). Thai, as the language with the greatest prestige and importance in Thailand, is dominant in schooling, government offices, business and commerce, and the (national) media.

Khmer and Cambodian are alternative names for the second MSEA language providing field data for the present study. The Cambodian language (Haiman, 2011), or simply Khmer as used here, is the best-known Austroasiatic language and, like Thai in Thailand, has national-language status in the Kingdom of Cambodia (cf. Enfield, 2005). In addition, there are significant numbers of speakers in neighbouring Thailand and Vietnam (Frewer, 2014). Correspondingly, Khmer has been important in typological linguistic studies and has often represented Austroasiatic, i.e., Mon-Khmer, languages in linguistic analyses (e.g., Gregerson, 1976; Huffman, 1976; Shorto, 2006).

Thai and Khmer are chosen for this study due to their representativeness among MSEA languages in both quantitative and qualitative terms. Notably, despite the vast diversity of languages in the MSEA region (Enfield, 2003, 2005), Thai and Khmer together are spoken by nearly one-third of all MSEA speakers (cf. Draper, 2019). Qualitatively speaking, according to Comrie (2007), Thai4 and Khmer possess large numbers of linguistic features characteristic of MSEA languages, and are thus representative of the area’s languages.

4 Dahl (2008, p. 218) also claims that Thai can be seen as “the pivot” of languages spoken in the MSEA Sprachbund since it possesses a very substantial number of structural traits typical of the area (cf. Comrie, 2007).

Besides, although they belong to different language families, Thai and Khmer resemble each other in many important traits, i.e., Indic script, vocabulary (Enfield, 2005), and salient features of common syntactic structure (Huffman, 1973). Dahl (2008) even gives an intriguing mathematical insight into Khmero-Thai linguistic relatedness permeating across genealogical boundaries. These resemblances motivate a research focus on the open questions of whether convergence between Thai and Khmer can also be recognised at the semantic level, and of the extent to which it can be effectively studied and understood.

A further stage of analysis is required to determine whether Thai-Khmer convergence at the semantic level should be taken as due to areal influence. For this stage, evidence from Thai and Khmer is to be triangulated with input from other languages outside MSEA to exclude potential universality and in turn to support implications of areal relatedness.

In the present study, Hindi and Tamil have been assessed and selected for a triangulation analysis. Three factors have gone into this assessment: genealogical unrelatedness, geographically immediate neighbourhood, and availability of lexico-semantic data on caused-separation. The case for selection can be summarised as follows.

First, Hindi and Tamil are genetically unrelated to Thai and Khmer. Modern Standard Hindi, or simply Hindi, is an Indo-European (IE) language prevailing over the northern two-thirds of the Indian subcontinent (Shapiro, 1989), whereas Tamil is a Dravidian (DR) language spoken predominantly in southern India and adjacent north-eastern Sri Lanka (Annamalai & Steever, 1998). In addition, Hindi and Tamil, being spoken in South Asia (SA), are geographically separated from Thai and Khmer, which are spoken in proximity to one another. Consequently, any lexico-semantic parallelism found across genealogically unrelated Thai and Khmer but not found in Hindi and Tamil could be tentatively postulated as being an MSEA areal phenomenon, since potential universality can be reasonably ruled out using the triangulation measurement (see Figure 1.1).

Second, despite being in different language families, Hindi and Tamil belong to the same SA linguistic area, a region with common characteristic traits cutting across genetic boundaries (Shapiro & Schiffman, 2019). According to Post (2015), the SA linguistic area is located adjacent to the MSEA linguistic area, as attested by language contact observed among North-east Indian and MSEA languages.

Lastly, there has already been comparable lexico-semantic research on Hindi and Tamil (Narasimhan, 2007) that provides helpful insights into the domain of caused-separation. Selection of these two SA languages for comparative purposes accordingly enables the triangulation analysis along with Thai and Khmer. That said, it is worth noting that I am not suggesting that the contact conditions of Hindi and Tamil, which are only distantly comparable, are parallel in all respects to those of Thai and Khmer.


Figure 1.1. Triangulation of Thai, Khmer, and Hindi and Tamil, in explorations of MSEA’s areal lexico-semantics: Thai and Khmer are first mutually compared in order to establish lexico-semantic relatedness; then, lexico-semantic observations of the two languages are measured against insights from the comparative languages: Hindi and Tamil.

To sum up, the present study examines semantic categorisation of the caused-separation domain in the two MSEA languages: Thai and Khmer. The SA languages Hindi and Tamil provide a critical comparative dimension. They provide, in effect, a benchmark against which the supposed uniqueness of MSEA features can be assessed.
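The logic of the triangulation summarised above can be sketched with simple set operations. The Python below is a minimal illustration under stated assumptions: each label (`pattern_1`, `pattern_2`, and so on) is a hypothetical placeholder for one categorisation pattern, the sets are invented rather than taken from any of the four languages, and the function name `triangulate` is introduced here for illustration only.

```python
# Hypothetical placeholders: each label stands for one categorisation pattern.
thai = {"pattern_1", "pattern_2", "pattern_3"}
khmer = {"pattern_1", "pattern_2", "pattern_4"}
hindi = {"pattern_1", "pattern_5"}
tamil = {"pattern_1", "pattern_6"}


def triangulate(msea_a, msea_b, outside):
    """Split patterns shared by two MSEA languages by whether they recur outside the area."""
    shared = msea_a & msea_b
    attested_outside = set().union(*outside)
    return {
        # Parallels between the two MSEA languages.
        "shared_msea": shared,
        # Shared patterns also found outside the area: possibly wider trends.
        "also_outside": shared & attested_outside,
        # Shared patterns not found outside: tentative areal candidates.
        "areal_candidates": shared - attested_outside,
    }


result = triangulate(thai, khmer, [hindi, tamil])
print(result["areal_candidates"])
```

Only patterns in the last group would be tentatively attributed to areal influence; those also attested in the comparison languages cannot be distinguished from wider cross-linguistic trends on this evidence alone.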

1.2 Research questions

In this section, I review findings from previous research on lexico-semantic categorisation of the (caused) separation domain. Many of the findings point to considerable agreement in the domain’s distinctions across languages, albeit with certain persistent language-specific differences. Such considerations of universal convergence along with language-specificity in semantic encoding of caused-separation are appreciated here as helpful in subsequently laying out the research questions for the present study.

The event domain of (caused) separation and its (lexical) expressions have prompted a considerable number of studies in different languages, both cross-linguistically and language-internally (i.e., within a single language), since Fillmore’s (1970) influential approach to universal distinctions based on semantic and syntactic behaviour of hit-type verbs versus break-type verbs. In each of these studies, the ever-present question relates to how the domain of (caused) separation is to be organised and classified, as revealed by linguistic expressions: e.g., lexical items or grammatical constructions. The studies discuss both universal convergence and language-specific divergence. Semantic encoding of (caused) separation observed in different languages has consequently been elaborated in several ways. Three established stances have been especially prominent: the position of universal categorisation, that of cross-linguistic diversity in categorisation, and that of cross-linguistic convergent categorisation with language-specificity.

Research into other experiential event domains has shown similar diversity in approach. Semantic categorisation of caused-separation events seems to align with studies of many other event domains. This may be thanks in part to its potential consistency and in part to shared approaches in analysing the human linguistic interpretation of the world. This includes theoretical recognition of how language structures and represents events (cf. Narasimhan et al., 2012).

Studies based on universal categorisation (Fillmore, 1970; Guerssel et al., 1985; Hale & Keyser, 1987; Keyser & Roeper, 1984; Kroeger, 2010; cf. Levin, 1993) together point to the conclusion that there appears to be universal behaviour of verb classes relevant to the (caused) separation domain, suggesting that speakers of different languages should recognise similar distinctions. For example, Guerssel et al. (1985) propose, on the basis of Berber, Warlpiri, and Winnebago, that cut-type verbs and break-type verbs are universally characterised according to their semantics, i.e., systematic meaning components, and syntax, i.e., diathesis alternation, thereby advocating two universally available categories in the domain of separation events: cutting versus breaking.

By contrast, Pye, Loeb, and Pao (1995) cast doubt on the notion of universality, suggesting that languages categorise separation events differently and that the universality of categorisation is far from settled.

Majid et al. (2004) are the first to advocate the position of cross-linguistic convergence with language-specificity for the domain of separation. Given 28 typologically, genealogically, and areally diverse languages, they show that speakers of such languages show considerable agreement in lexical-semantic categorisation of the domain of cutting and breaking,5 converging on a shared similarity space. In this shared space, all the languages together distinguish events of this kind based on levels of agents’ control over the location of separation. Consequently, events featuring precise control and those with imprecise control are universally categorised into two extreme-end groups, i.e., roughly, cutting events versus breaking events. Having said that, there is still room for language-specificity to play its part. As Majid et al. point out, separation events with an intermediate level of agents’ control are handled variably across languages. For example, speakers of different languages appear to have different ideas about tearing: i.e., whether to distinguish it from other actions or group it into some other category: e.g., of breaking. Variation seems to be activated by language-specific influences on the separation domain, which in turn bring about differences in numbers of categories and placement of category boundaries.

5 In Majid et al.’s (2004) work, the researchers use cutting and breaking to refer to both caused and spontaneous separation, which involves irreversible material destruction. However, though spontaneous events were included, they did not change outcomes about how caused-separation events were analytically perceived with reference to semantic categorisation.

Several recent and more experimentally designed studies have appeared to support in particular the stance of cross-linguistically convergent distinctions with local divergence (Lüpke, 2007; Majid et al., 2007a-b, 2008; Rounti, 2018; van Staden, 2007, among others).

Narasimhan (2007) examines semantic categorisation of caused-separation events in Hindi and Tamil, using a comparative verb-distribution table as a classic semantic map. She establishes similarities in Hindi and Tamil’s distinctions between cutting and breaking as exhibited cross-linguistically (cf. Majid et al., 2004). Particularly, typical cutting events in both languages have a high degree of predictability (i.e., relating to precise control) of the locus of separation, while breaking events show near-complete unpredictability. Additionally, Hindi and Tamil are similar in discriminating events of tearing from others in the same domain.

Nevertheless, Hindi and Tamil bilaterally converge on an unusual grouping of snapping and smashing event descriptions. This grouping is utterly opposed to the way many other languages distinguish these events (Majid et al., 2004, 2007a, 2008). Such a shared departure, found in Hindi (IE) and Tamil (DR) despite their different genealogical roots, raises the question of why they come together on this event organisation. Might such a convergent linguistic trait be induced by some pressure other than genetic influence? Language contact seems to be implicated here, since unlike Hindi and Tamil, other IE languages which are spread throughout the European linguistic area (cf. Haspelmath, 2000), like English, Swedish, German, and Dutch (cf. Majid et al., 2008), make a robust categorical distinction between snapping and smashing.6

There are still differences between Hindi and Tamil (Narasimhan, 2007). For instance, the breaking category boundaries are placed differently in the two languages since Tamil’s breaking refers to more specific kinds of actions than does the Hindi counterpart. In particular, while Hindi’s breaking can involve either rigid or non-rigid objects, Tamil’s is restricted to actions upon rigid objects only. Other important differences between Hindi and Tamil include the different subdistinctions in smashing versus breaking, and the different encodings of opening and cutting (Narasimhan, 2007).

These studies point to the applicability of lexico-semantic typological structures of the domain as potentially determining semantic convergent traits across the different languages described. Such convergence is then taken to provide evidence of significance for further areal assessments.

The present study follows in a similar direction, using the major observational and experimental approaches in the literature mentioned above, in order to determine lexical-semantic architectures in certain MSEA languages. Comparison is also made with equivalent insights from non-MSEA languages. In this way, patterns of areal semantic cohesion specifically within the MSEA linguistic area are detected. These are considered to be at least potential areal features. In particular, if there are found to be lexical semantic parallels between Thai and Khmer which are not represented in geographically distant languages like Hindi or Tamil, these could be taken as implying effects triggered by areal influences.

6 Narasimhan (2007) does not intend to compare Hindi and Tamil in their areal context. However, since the two languages are genealogically unrelated but spoken in close geographical proximity, their lexical-semantic parallelism raises an issue, as against the universal trend. This leads to a conjecture that the convergences are triggered by areal influences.

This study consequently poses the three major research questions:

(1) What are the lexico-semantic categorisations of caused-separation events as measured by verbal descriptors across Thai and Khmer, with respect to different denotational aspects of the domain?

(2) What are lexico-semantic convergent traits across Thai and Khmer, as derived from (1)?

(3) How can Thai and Khmer lexico-semantic parallels, as seen in (2), represent potential areal/Sprachbund lexical semantic traits, as triangulated with non-MSEA Hindi and Tamil, thereby taking evidence at the lexico-semantic level as enhancing MSEA’s status as a linguistic area?

Answers provided in this research characterise lexico-semantic typology of the chosen domain of caused-separation in Thai and Khmer. The findings are then used to assess the extent to which the two languages’ semantic convergence can be attributed to areal influences.

1.3 Thesis structure

This dissertation is presented in six chapters, including this first chapter, the introduction. The second chapter breaks down into two main parts: the literature review, and the methodology. The first part of Chapter 2 covers three sections: (1) the review of the earlier literature on the Mainland Southeast Asia (MSEA) Sprachbund, (2) theoretical background to the lexico-semantic typological analysis, and (3) previous related studies on (cross-linguistic) event categorisation, concentrating on (caused) separation events. The second part of Chapter 2 describes the methodology: e.g., the methods of collecting data by an elicitation task and how the data are processed; also mentioned are ethical considerations, as well as the context of the data collected for this research. In this part, I also outline, concisely yet comprehensively, certain statistical tools that help display the semantic categorisation of verbs of caused-separation events more illustratively, and their relevance for typology.

Chapters 3, 4 and 5 are the analytical chapters of the dissertation. Chapters 3 and 4 concern Thai and Khmer data respectively. These chapters present the denotational analysis of the caused-separation domain as determined by Thai and Khmer verbs used in naming the stimulus video clips. The analysis presented shows three significant aspects of the domain in Thai and Khmer: (1) granularity of caused-separation categories, (2) placement of category boundaries in the caused-separation semantic space, and (3) semantic organisation of caused-separation events.

Chapter 5 is devoted to a comparison of the results of the two preceding chapters. This draws conclusions about lexico-semantic convergence as semantically related evidence for defining and reinforcing the MSEA linguistic area. This topic is more fully treated in this dissertation’s last chapter.

Finally, Chapter 6 attempts to show how lexico-semantic parallelism attested across Thai and Khmer may be unusual with respect to genetically unrelated languages spoken in other regions of the world. The Hindi–Tamil pair is taken as illustrative of difference from what is established for Thai–Khmer. A key issue is how such convergent evidence can be explained by the likelihood that Thai and Khmer have been mutually influenced due to geographical proximity despite belonging to different language families. Such convergence consequently supports the MSEA Sprachbund. In this way, the last chapter locates the analytical results and conclusions advocated in this dissertation, especially those from Chapter 5, in a wider context. Findings are related to larger debates about applications of lexico-semantic parameters in defining linguistic areas.

------⁂ ------

CHAPTER 2

Literature review and methodology

This chapter consists of two main parts: Literature review and Methodology. The literature review contains three sections. The first (§ 2.1.1) provides background on Mainland Southeast Asia, how it has been considered as a linguistic area or Sprachbund due to areal convergence, and the languages to be investigated in this study. The second section (§ 2.1.2) is concerned with certain concepts of lexico-semantic typology, i.e., semantic categorisation of a given domain, semantic maps in typological analysis, consistency in lexical descriptions, and lexicalisation patterns. The third section (§ 2.1.3) is about the domain of caused-separation events and the relevant previous studies in different languages.

The methodology part first explains this study’s objective: to characterise semantic categorisations of caused-separation in Thai and Khmer, thereby contributing to an understanding of lexico-semantic convergence in the MSEA linguistic area (§ 2.2.1). Next, the use of an experimental elicitation task and the pilot study are discussed (§§ 2.2.2 and 2.2.3). The remaining subsections look at the data collection (§ 2.2.4), how to prepare the data for the analysis (§ 2.2.5), and the data analysis methods, particularly regarding the statistical methods employed (§ 2.2.6).

2.1 Literature review

2.1.1 Mainland Southeast Asia (MSEA): The linguistic area

2.1.1.1 Geographical area

Mainland Southeast Asia (MSEA hereafter) is the area encompassing Indochina and Thailand, possibly with extended zones to the west, the north, or the south. MSEA is consequently indefinite with respect to exact boundaries. However, the area can be either narrowly or broadly defined, leading to two senses, i.e., core MSEA and greater MSEA.

In the narrower sense, MSEA refers specifically to core MSEA, i.e., Cambodia, Laos, Vietnam, and Thailand, since different scholars (e.g., Comrie, 2007; Enfield, 2005; Enfield & Comrie, 2015; Enfield, 2019; Jenny, 2015, among others) concur in selecting this smaller but central region of MSEA as representative of the MSEA area understood more widely. The notion of a core MSEA area is not merely for subjective convenience. Multilateral trade and commerce, along with other private and state interrelations, have increased the integration of the core MSEA region. This has led to a set of cohesive defining factors. Moreover, the four nation-states share historical roots through common precursors, thus potentially setting this area off from the surrounding regions that lie outside the common source. For example, the Khmer Empire at its peak (Sercombe & Tupas, 2014; Siebenhütter, 2019) flourished across core MSEA, ruling and spreading its politico-cultural shadow over large portions of what are now the four nations. As a result, Khmer influence can be seen in the design of old towns and cities, other sociocultural traditions, and even local languages in these countries, but it rarely filtered outward to other regions. Such factors contribute to the demarcation of core MSEA.

According to the broad definition, by contrast, MSEA can mean the areas extending from the core area, i.e., Indochina and Thailand, to flanking regions in at least three directions. Westward, northward, and southward extensions potentially determine a region called greater MSEA. To the west, MSEA may comprise Myanmar (e.g., Enfield, 2005; Jenny, 2015; Vittrant, 2015) and the sister states in Northeast India (e.g., Hyslop, Morey, & Post, 2011, 2012, 2013; Matisoff, 2001; Morey & Post, 2008, 2010). To the north, some authorities extend the MSEA area to encompass regions of China south of the Yangtze River (e.g., de Sousa, 2015; Bauer, 1996; Ansaldo & Matthews, 2001; Chappell, 2001). Lastly, to the south, MSEA may extend to the Malay Peninsula (e.g., Adelaar & Himmelmann, 2005; Blust, 1994).

Notably, given those possibly extensive areas of MSEA, Enfield (2019, p. 2) remarks that the term “Mainland” may not be literally applicable. However, following common inclusive usage, the present study recognises this broadly defined MSEA.

To sum up, MSEA may refer to two geographical senses, core MSEA and greater MSEA, with the latter incorporating the former. However, for convenience, the term MSEA in this study henceforth refers specifically to core MSEA, where Thai and Khmer, the languages chosen for the present analysis, are spoken.

2.1.1.2 MSEA as a linguistic area and its areal linguistic support

MSEA is regarded as “linguistically highly complex” (Siebenhütter, 2019, p. 22) in three senses (cf. Nettle, 1999), i.e., language diversity, phylogenetic diversity, and structural diversity (Enfield, 2011a). First, MSEA has a relatively high diversity of languages; specifically, this language diversity is measured by the number of languages per square kilometre (Enfield, 2019). Second, the phylogenetic diversity in MSEA is determined by the intermingling of five major language families in the area. These are: Austronesian (i.e., languages of the Chamic group spoken in South Vietnam and Cambodia) (cf. Matisoff, 2001), Mon-Khmer (i.e., languages of sub-branches spoken in highlands of central Laos and Vietnam, and Northeast Cambodia) (cf. Enfield, 2005), Tai-Kadai (i.e., languages of the Tai branch spoken in Laos and Thailand), Sino-Tibetan (i.e., particularly Tibeto-Burman spoken in northern Laos and northern Thailand) (cf. Bradley, 2003), and Hmong-Mien (spoken in Thailand, Laos, and Vietnam). Third, however, MSEA shows a surprisingly low degree of typological diversity, since many linguistic features in phonology and morphosyntax are common to all or most MSEA languages (i.e., from different language families) (cf. Bisang, 1991, 1999; Clark, 1985, 1989, 1996; Clark & Prasithrathsint, 1985; Enfield, 2003). Thus, linguistic complexity in MSEA is characterised by the fact that the area is home to structurally or typologically convergent languages despite their sheer numbers and their diverse genetic groupings.

Considering the typological relatedness across different languages in this particular region, MSEA is recognised as one of the world’s linguistic areas (e.g., Comrie, 2007; Dahl, 2008; Enfield, 2001, 2003, 2011a, 2019; Gil, 2015; Goddard, 2005; Matisoff, 2001; Post, 2015; Siebenhütter, 2019), consistent with Enfield’s (2005) broad definition:

… a geographical region in which neighbouring languages belonging to different language families show a significant set of structural properties in common, where the commonality in structure is due to contact and where the shared structural properties are not found in languages immediately outside the area (ideally where these include languages belonging to the same families as those spoken inside the area). (p. 190)

Given this, the linguistic complexity of MSEA is based on areal linguistic phenomena. That is, the widespread structural similarity that cuts across genetic affiliation is contact-induced, i.e., triggered by diffusional influences due to sociohistorical contact among neighbouring language communities in close proximity. Such contact seems able to transfer any linguistic feature from any language to any other across genetic barriers (Thomason & Kaufman, 1988).

Previous studies on typological convergence across MSEA languages (e.g., Bisang, 1991, 1999; Budge, 1980; Capell, 1979; Clark & Prasithrathsint, 1985; Clark, 1985, 1989, 1996; Enfield, 2001, 2003, 2005, 2011a; Li & Thompson, 1981; Matisoff, 1973, 1991; Migliazza, 1996, Chapter 11) taken together show that languages of different genetic affiliations in the area have come to share a substantial number of structural features, i.e., both actual forms and grammatical patterns, at multiple levels.

Table 2.1 summarises some of the MSEA areal features across the five language families7 (references cited in the table where relevant), ignoring exceptions within families. The “±” sign below marks variation within certain families whose relevant features are present in some members, but absent in others.

7 Following Enfield’s (2005, p. 51) discussion on MSEA areal features, Austronesian is not currently included in the table and Sino-Tibetan as one of the main language families in the area was further split into Sinitic and Tibeto-Burman.

vocabulary, referring to phenomena where not the forms of words but the content and structure of the vocabulary were shared across languages of different genealogical origins (cf. Enfield, 2001). Matisoff explains such phenomena as being triggered by long cultural contact. This sort of contact is responsible for people having come to share common worldviews and consensus on what to think and talk about. Matisoff (2001) describes MSEA lexico-semantic convergent traits presented in his and his contemporaries’ work: e.g., expressions referring to psychological phenomena (Matisoff, 1986), and fine-grained lexical specificity (e.g., of verbs of manipulation) (Diffloth, 1994). In 2004, Matisoff again devotes a section to Southeast Asian lexico-semantic areal features in his discussion on the status of areal semantics (pp. 365-370). Enfield (2011b) discusses underlying lexical semantic distinctions in the taste/flavour domain across Kri and Lao, both spoken in Laos. Recently, Siebenhütter (2019, mainly Chapter 6) has introduced an analysis of how spatial relations are encoded in Thai, Lao, Khmer, and Vietnamese, consequently evidencing the areal distribution of semantic features in the MSEA linguistic area. Compared with studies on other areal structures in the MSEA region, those on areal semantics are considerably fewer.

To sum up, the MSEA linguistic area has been well established (cf. Enfield, 2005, 2011a, 2019). The few forerunner studies on lexical semantics that have been conducted raise the question of whether, and to what extent, evidential support for the area can be found in this field of study. If it can, that could also point to a type of convergence mechanism about which little is yet known. Consequently, more research on potential areal semantic traits and their utility in actual communicative practice should be undertaken (cf. Ameka & Wilkins, 1996; Koptjevskaja-Tamm & Liljegren, 2017; Matisoff, 2004), and the present study intends to respond to this call. To frame the present study’s analysis, notions regarding the focus and potential of areal semantics are summarised in the following section.

2.1.1.3 Areal semantics: Studies of contact-induced semantic convergence

Areal semantics is concerned with the diffusional spread of semantic features across neighbouring languages in a geographical space. Theoretically, its scope ranges from the convergence of individual lexemes (i.e., loanwords), through the patterning of semantic domains, to the lexical profile of a language, such as derivational mechanisms or ratios of verbs versus nouns (Koptjevskaja-Tamm & Liljegren, 2017, p. 211).

Correspondingly, areal semantics asks questions of whether and how lexico-semantic parallels as such can serve as indicators—like other areal surface structural parallelism—of areal clustering, especially when the semantic convergence attracts the curious eyes of outsiders to an area, and is observed across languages of distinct families in the area.

Despite the expansive continuum of the areal semantic scope, Koptjevskaja-Tamm and Liljegren (2017) show that the subject of individual-lexeme influence at the lowest point of that range may be collapsed into issues of area-specific lexicalisations, while at the highest point the organisation of lexicons, as possibly involving grammaticalisation, should be precluded to bolster a distinction between grammar and lexicon. Consequently, there seem to be at least three groups of lexico-semantic phenomena worth discussing for this research area (Gast & Koptjevskaja-Tamm, 2018):

(i) Lexico-semantic parallels. These phenomena can be broken down into two finer-grained subgroups, i.e., polysemy calquing and lexico-constructional calquing. Polysemy calquing is concerned with the expression of two supposedly different concepts using one lexeme, or colexification (cf. François, 2008; Gast & Koptjevskaja-Tamm, 2018): e.g., ‘fruit’ = ‘child’ across West African languages. Lexico-constructional patterns are about translation loans which show interlingual matches of the same semantic and structural patterning, such as ‘sun’ = ‘eye of the day’ in many MSEA languages.

(ii) Shared formulaic expressions. A good example is the farewell expression in many European languages: e.g., au revoir (French), or auf Wiedersehen (German) (Koptjevskaja-Tamm & Liljegren, 2017).

(iii) Area-specific lexicalisation and shared organisation of semantic domains. A case of shared areal lexicalisations is the highly specialised vocabulary for dairy products in the Greater Hindukush languages (Koptjevskaja-Tamm & Liljegren, 2017). Shared semantic domain organisations concern whether and how languages lexically converge in given semantic domains, such as TASTE and FLAVOUR across Kri and Lao, two neighbouring languages in Laos, within MSEA (Enfield, 2011b).

Additionally, an exploration of areal lexico-semantic parallels requires methodological care, since it is liable to fallacies arising from researchers’ perceptual bias in defining important area-specific patterns. Accordingly, Koptjevskaja-Tamm and Liljegren (2017, p. 224) outline a methodological guideline: to be established as areal, properties should be tested both across languages belonging to different families within an area and across languages outside it. Koptjevskaja-Tamm and Liljegren also consider how explorations of lexico-semantic areality can be conducted at different scales: while a macrotypological study can help assess the uniqueness of a lexico-semantic property across the languages of the world or signpost the areas where it might be predominant, a microtypological study will assist in estimating how a property is systematically organised in a specific area, perhaps rather than being located more widely (cf. Koptjevskaja-Tamm, 2011).

Considering Koptjevskaja-Tamm’s (2011) and Koptjevskaja-Tamm and Liljegren’s (2017) principles as well as Enfield’s (2005) definition of a linguistic area (see § 2.1.1.2), an investigation of areal semantic properties in support of a linguistic area may need to identify at least:

(i) Geographical area(s): one may choose either a single specific area (e.g., Brenzinger and Fehn, 2013, Central Kalahari; Treis, 2010, the Ethio-Eritrean area) or multiple areas across the globe (e.g., Urban, 2009, 2010, 2012).

(ii) Languages of different genetic affiliation in the area(s), in (i): one may start from two languages (cf. Enfield, 2011b, in Kri and Lao).

(iii) Comparative languages outside the area(s), in (i): one has to choose outsider languages to those in (ii) for comparison.

(iv) Situation that the languages, in (ii), share lexical semantic features: one can choose to investigate any lexico-semantic features with reference to the above-discussed groups.

(v) Likelihood that the semantic features, in (iv), are unusual to the outsider languages, in (iii): one should measure potential lexical semantic peculiarities observed in the chosen languages against insights of the outsider languages.

The principles above can be used to enrich the notion of the MSEA linguistic area through applying lexico-semantic analysis. In this way, shared lexico-semantic organisation and modes of categorisation of a selected domain can become a resource to probe areal relationships. As introduced in the previous chapter, for this study caused-separation is taken as a promising semantic domain to investigate. Fieldwork research in this domain was consequently undertaken in two core MSEA languages, Thai and Khmer. Recent relevant studies relating to these languages are summarised in the following section (§ 2.1.1.4). In addition, Narasimhan (2007) has studied the caused-separation lexico-semantic domain for Hindi and Tamil. In line with (iii) and (v) in the preceding scheme, her analysis provides an appropriate outsider case for comparison. Her work is introduced in § 2.1.3.3.

2.1.1.4 MSEA languages to be examined: Thai and Khmer

Section 1.1.2 introduced Thai and Khmer and where they are spoken. This section outlines why they are considered suitable for the present research.

Despite belonging to different families, Thai and Khmer share strong typological similarities with each other (Enfield, 2003). Below is a summary8 of some phonological, lexical, and syntactic parallels across Thai and Khmer.

Phonology

(i) Sesquisyllabicity: a shift of stress to final syllables and weakening of non-final syllables (cf. Nacaskul, 1962)9

(ii) Initial velar nasals (Pothipath, 2018)

8 Since Thai and Khmer also exhibit the structural similarities found across MSEA languages (see § 2.1.1.2), this list avoids repeating such convergent traits and focusses particularly on bilaterally shared features. Additionally, it is noted that none of the studies on Khmero-Thai linguistic parallelism cited here, as far as is known, particularly develops the notion of areal (lexical) semantics for the MSEA linguistic area, or at least the area where Thai and Khmer are spoken, except for Enfield’s (2003) detailed discussion of areal patterns of ‘acquire’.

9 Nacaskul (1962) does not make specific use of the term sesquisyllabicity in her comparative work on Thai and Khmer (Cambodian in her study). However, she mentions and explains phonological parallelism phenomena of syllable structures.

Lexicon

(iii) Semi-lexical expressions: /kʰâːŋ lâːŋ/ in Thai and /kʰaaŋ kraom/ in Khmer (side under) ‘underneath’ (Gorgoniev, 1966); Thai /tʰâː jaŋŋán/ and Khmer /baə doocnuh/ (if like-that) ‘in that case’ or Thai /tʰîː nîː/ and Khmer /tii nih/ (spot this) ‘here’ (Huffman, 1973); synonymous compounds (Pothipath, 2018)

(iv) Functional range of individual forms: /kʰɔ̌ːŋ/ in Thai versus /rɔbɑh/ in Khmer ‘thing’ for possessive use (Gorgoniev, 1966); the parallel participation of Thai /dâj/ and Khmer /baan/ ‘acquire’ (Enfield, 2003; Huffman, 1973), or of Thai /hâj/ and Khmer /ʔaoy/ ‘give’ or Thai /wâː/ and Khmer /tʰaa/ (say) ‘say; QUOT.REL.CONJ’ (Huffman, 1973)

Syntax

(v) Copula constructions: in Thai-Khmer order, /pen/ and /cie/, and /kʰɯː/ and /kɨɨ/ constructions (Huffman, 1973; Martini, 1956)

(vi) Predicative and attributive uses of verbs with qualitative meaning (Martini, 1957)

(vii) Word order: numeral-classifier (Gorgoniev, 1966; Minegishi, 2004)

(viii) Goal/instrument and object/instrument complement: e.g., those after Thai /paj/ and Khmer /tɨv/ ‘go’ potentially read as a goal or instrument without the help of prepositions (Minegishi, 2004)

The linguistic convergence in Thai and Khmer, despite their different families, is salient and overshadows the relationships of each language with its genetic relatives. For example, Dahl (2008, p. 219) has calculated the typological distance between Thai and Khmer to be 12.3. This value is almost the same as that between Russian and Polish, two Slavic languages, which he takes to be 12.8. He finds it remarkable to contrast this with 48, the distance metric he assigns between Khmer and Mundari, two Austroasiatic languages. That being the case, there seems to be a potential for Thai and Khmer to show parallelism at the (lexical) semantic level as well. An exploration of Thai and Khmer’s lexico-semantic convergence is accordingly potentially useful and may uncover more (areal) convergent traits across the languages.

Additionally, looking at the wider scope of the MSEA convergence, Thai and Khmer possess very large numbers of areal features characteristic of the MSEA linguistic area. According to Comrie’s (2007, p. 44) measure, Thai exhibits 19 out of 21 MSEA characteristic traits, while Khmer has 18. Simply put, both languages possess no less than 85% of the areally diffused features in the area. This striking percentage reflects how typical Thai and Khmer appear when measured by the MSEA features cited by Comrie.
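The percentage quoted above follows directly from the two feature counts. A minimal sketch of the arithmetic is given below; the counts are those reported in the text, while the variable and function names are illustrative only.

```python
# Feature counts as reported in the text (Comrie, 2007, p. 44).
TOTAL_FEATURES = 21
counts = {"Thai": 19, "Khmer": 18}


def share(count, total=TOTAL_FEATURES):
    """Return the percentage of areal features a language exhibits."""
    return 100 * count / total


# Percentage per language, rounded to one decimal place.
shares = {lang: round(share(n), 1) for lang, n in counts.items()}
print(shares)
```

Both values lie above the 85% threshold mentioned in the text, with Khmer (18 of 21) being the lower of the two.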

To present examples of Thai and Khmer, the present study adopts the transcription conventions outlined by Tingsabadh and Abramson (1993) for Thai and those outlined by Headley (1977) for Khmer. Following those conventions, tones in Thai are marked using diacritic accents, except that the mid tone is left unmarked here. Long vowels in Thai are indicated with a special symbol for lengthening, “ː”. For Khmer, by contrast, vowel length is rendered by doubling the vowel symbol: e.g., “ɑɑ”. Slashes are used to indicate Thai and Khmer words or phrases when put next to other text, except for the interlinear gloss text. Additionally, bolding is used on some example words or phrases to draw attention to matters under discussion.

2.1.2 Lexico-semantic typological analysis of events

Lexical semantic, or lexico-semantic, typology—semantically oriented lexical typology in Koptjevskaja-Tamm, Vanhove, and Koch’s (2007) terms—is primarily concerned with the focal question of what meanings can be expressed by a single word, such as a verb, in different languages. Lexico-semantic typology includes issues of universal versus language-specific patterns of lexicalisation and categorisation within lexical fields or semantic domains.

Koptjevskaja-Tamm et al. (2007) note that prior systematic studies on cross-linguistic semantic patterning have only been achieved for a limited number of semantic domains encoded by words, especially nouns or verbs: e.g., BODY (e.g., Anderson, 1978; Enfield, Majid, & van Staden, 2006), COLOUR (e.g., Berlin & Kay, 1969; Hood & Finkelstein, 1983; Kay, Berlin, & Merrifield, 1991) or KINSHIP (e.g., Goodenough, 1965; Wallace & Atkins, 1960). Among these studies, the major focus has so far been on the characterisation of entities (cf. Murphy, 2002), while little research has yet been done on event structuring despite an upsurge of interest in event domains in recent years (Majid et al., 2008).

According to Majid et al. (2008), investigations into the linguistic structuring of events centre on two critical questions: (1) how meaning elements are packaged into the lexical elements of a sentence (cf. Talmy, 1985), and (2) how similarly a given event domain is partitioned cross-linguistically (cf. Choi & Bowerman, 1991, for the motion domain). In this study, I focus particularly on event categorisation.

2.1.2.1 Lexico-semantic categorisation: Analysis of Denotational Range

In the present study, categorisation is concerned with whether a language treats different entities (such as objects, events, relationships, or attributes) as the same kind (cf. Majid et al., 2008). For example, chairs and tables, though dissimilar perceptually, can be categorised under the same semantic category, i.e., “furniture” (Rosch, 1975, 1978).

Despite this quite straightforward definition, scholars hold different positions as to how categorisation of semantic domains across languages is to be understood. Specifically, as reflected in prior research or discussion, there have so far been three different positions on linguistic structuring (Narasimhan et al., 2012): (1) universal categorisation, (2) cross-linguistic diversity in categorisation, and (3) cross-linguistic convergent categorisation with language-specificity, as it is conveniently termed in this study.

Given the viewpoint of universal categorisation, different languages categorise a given domain similarly for either of the following theoretical reasons. First, according to radical concept nativism, lexical concepts are atomic, i.e., primitive, and cannot be learnt, being innate (Fodor, 1975, 1981). Consequently, were lexical concepts to be natively available, conceptualisation, i.e., categorisation, by lexical means should be consistent cross-linguistically. Second, though not homogenous, natural entities contain patterns of correlations: e.g., a strong correlation between having feathers and having wings (cf. Frye, 2011). Accordingly, as speakers of different languages learn to capture the same patterns of correlated properties or regularities between attributes of diverse entities in the real world (Rosch & Mervis, 1975; Rogers & McClelland, 2004), they should conceptualise and categorise them in a similar way.

The position of cross-linguistic diversity in categorisation strongly opposes the above-mentioned hypothesis of either the mind or the world structuring real-world entities. In line with this view, Bloomfield (1933) posits that linguistic categorisation of entities is a somewhat arbitrary process since language is an autonomous system, and therefore not deliberately modified or derived by variables outside the system. Consequently, categorisation of a given domain cannot be expected to be expressed similarly cross-linguistically. Gleason (1961) supports this hypothesis by discussing the semantics of colour vocabulary. He asserts that language sets its arbitrary colour boundaries without any non-linguistic pressure: either the boundary in the spectrum or the human perception of these hue categories. Different languages may then have different numbers of basic colour terms in an unpredictable way. That said, this extreme position of limitless and unsystematic variation has been falsified (cf. Berlin & Kay, 1969, on cross-linguistic constraints on colour lexicons). Accordingly, it does seem to merit presentation as an alternative fact of linguistic categorisation.

The last point of view on categorisation involves a compromise between the above extremes. According to it, categorisation across languages is not significantly divergent but converges to some extent, as influenced by both perceptual or cognitive factors and linguistic arbitrariness (cf. Levinson et al., 2003; Majid et al., 2007a, among others). Also, there appear to be other pressures on semantic structuring, such as communicative needs (Malt, Sloman, Gennari, Shi & Wang, 1999), or cultural influences (Levinson et al., 2003).

No matter how the above positions on semantic categorisation are theoretically opposed to one another with reference to the notions of causality and potential outcomes, it is clear that the linguistic categorisation matter ultimately needs investigating empirically.

Semantic categorisation of a given event domain can be determined by the semantics of lexical items (Jessen, 2013; Vanhove, 2008). Specifically, measuring extension patterns and taxonomic depth of lexical descriptors, like verbs, when used to carve up semantic domains, or lexical fields (in Vanhove’s term), would contribute to identification of domain architecture, i.e., the extent to which the domain is partitioned lexico-semantically, thereby suggesting minimally presupposed categories within it. Evans (2010) proposes that lexico-semantic categorisation may be systematically investigated with respect to three different aspects, especially for cross-linguistic comparisons: (1) semantic granularity, (2) category-boundary location, and (3) semantic grouping or dissection (or semantic organisation in this study). Below are details and approaches for treating each of these aspects.

a. Semantic granularity

Evans (2010, p. 511) explains that granularity of semantic categorisation involves the question of how many categories there are in a semantic domain. This aspect consequently relates to the degree of subdivision in such a domain (cf. Gast, König, & Moyse-Faurie, 2014). For example, Vulchanova, Martinez, and Vulchanov (2012, p. 24) demonstrate that English sometimes makes a cut that Bulgarian does not, i.e., between ‘crawl’ and ‘slither’, merged as pŭlzja ‘crawl’, and at other times ignores a division that Bulgarian makes (i.e., katerja se ‘climb/clamber up’, and slizam ‘go down’ versus ‘climb’ in English). Degrees of semantic granularity can be compared not only cross-linguistically but also language-internally, to clarify asymmetric distinctions in a language. For instance, Kopecka (2012) shows that Polish speakers encode events of placement in a finer-grained manner than removal events, thereby substantiating the “Source/Goal asymmetry” hypothesis (cf. Narasimhan et al., 2012).

Additionally, though a language may contain coarse-grained categories for a given domain, such categories may in fact be more finely subcategorised, thereby indicating the hierarchical patterning of the domain structure. A case in point is the loosely defined ‘break’ category in English (Majid, Gullberg, van Staden, & Bowerman, 2007b). It taxonomically incorporates finer-grained distinctions, such as ‘smash’ or ‘snap’.

Methodologically, a category of an event domain can be presupposed if a number of event members are together labelled with a single lexical item, i.e., a verb. Later, if different categories in the domain are again described together with a verb, a more coarse-grained category should be suggested. At a given level of categorisation, there tend to be preferred categories as defined by frequent verbs of the domain, as against the more coarse- or finer-grained categories labelled with infrequently used verbs. Those meaningful categories are at the basic level, given formalised levels of abstraction in categorisation (Rosch, 1975, 1978). Rosch posits that the most natural, preferred level for categorisation should be basic because it carries a relatively large amount of information at a relatively low cost. Specifically, a basic-level category is so informative that the relevant features or information can be easily inferred, while being distinctive enough from other categories at the same level (cf. Guastavino, 2018).
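The first step described above, grouping events that share a verb label and counting the resulting groups, can be sketched as follows. This is an illustration only, not the procedure used in this study: the clip identifiers, the English placeholder verbs, and the rule of taking the most frequent verb per clip are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical elicitation data: clip ID -> verbs supplied by individual speakers.
responses = {
    "clip01": ["cut", "cut", "cut"],
    "clip02": ["cut", "cut", "slice"],
    "clip03": ["break", "break", "snap"],
    "clip04": ["tear", "tear", "tear"],
}


def dominant_verb(verbs):
    """Return the most frequent verb for one clip (ties broken alphabetically)."""
    counts = Counter(verbs)
    top = max(counts.values())
    return min(v for v, n in counts.items() if n == top)


def categories_by_verb(data):
    """Group clips under the verb most often used to describe them."""
    groups = defaultdict(list)
    for clip, verbs in sorted(data.items()):
        groups[dominant_verb(verbs)].append(clip)
    return dict(groups)


categories = categories_by_verb(responses)
granularity = len(categories)  # number of verb-defined categories
print(categories, granularity)
```

On this toy data the dominant verbs yield three categories; infrequent verbs such as "slice" or "snap" would point to finer-grained distinctions inside them.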

To sum up, a cross-linguistic study on semantic granularity of an event domain, as determined by lexical verbs, primarily helps specify how many relevant categories can be involved in individual languages, and whether they relate to one another in a similar way across languages.

b. Placement of semantic boundaries

The second aspect pointed out by Evans (2010, p. 511) is concerned with the question of where the boundaries between categories within a given domain are placed. Evans points out that events as intangible entities are more susceptible to different interpretations than physical entities like the body, which has visual and functional discontinuities. Consequently, category boundaries of an event domain tend either to vary markedly across languages or to be fuzzy owing to interspeaker variability within a language. For cross-linguistic variation, a good case in point is Bowerman and Choi’s (2001, p. 255) illustration of English ‘open’ and its Korean near-equivalents. The category boundaries given by English ‘open’ do not align with those of any of its Korean near-equivalents; rather, each of the Korean categories only partially overlaps with English ‘open’. For interpersonal variation, take events of motion in Russian for example. Vulchanova et al. (2012, p. 34) observe that speakers of Russian distinguish between ‘walk’ and ‘creep’, labelled with idti and polzti respectively. However, the speakers found it harder to describe motions that involve a low posture, like creeping, but are more removed from the substrate, like supported motions such as walking: for example, a chameleon crawling on all four legs. Consequently, they varied in selecting either idti or polzti for those motions, thereby giving a fuzzy border for the ‘walk’ and ‘creep’ categories in Russian.

The placement of category boundaries of a given event domain in a language can be described as follows. Two categories determined by two different verbs (or sets of verbs) are seen as lying distinctly apart from each other when they show no overlap. In contrast, two categories specified by (but not limited to) two different verbs are considered to ‘adjoin’ each other in an abstract space when some members of one category can equally be labelled by verbs that are normally used for the other. That being the case, the two categories are not only adjacent but also share fuzzy boundaries, giving rise to transitional zones or overlapping areas. Category members can also be divided into two kinds. First, those that are exclusively and unanimously labelled with verbs of a category are seen as central to the category. Second, those that can be labelled with verbs of a category as well as with verbs of others are relatively outer to the category. Accordingly, labelling of central event members is uniform, whereas that of outer members is less consistent. These phenomena in turn reflect varying degrees of consistency or insecurity in event descriptions for different members, member subgroups, or categories (see more details on consistency in naming events in § 2.2.6.2a).
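The distinction between central and outer members can be illustrated with a minimal sketch. The clip identifiers, the verbs, and the verb-to-category assignment below are hypothetical and serve only as an illustration; the sketch assumes that a clip counts as central when all of its responses fall within one category, and as outer when its responses span more than one.

```python
from collections import defaultdict

# Hypothetical elicitation data: clip id -> verbs produced by individual speakers.
responses = {
    "clip_01": ["cut", "cut", "cut", "cut"],
    "clip_02": ["cut", "cut", "slice", "cut"],
    "clip_03": ["break", "break", "break", "break"],
    "clip_04": ["break", "snap", "cut", "break"],
}

# Hypothetical assignment of verbs to categories (an assumption for illustration).
verb_category = {"cut": "CUT", "slice": "CUT", "break": "BREAK", "snap": "BREAK"}


def classify_members(responses, verb_category):
    """Label each clip as a central or outer member of every category it touches."""
    result = defaultdict(dict)
    for clip, verbs in responses.items():
        categories = {verb_category[v] for v in verbs}
        # Central: all responses use verbs of one category; outer: verbs of several.
        status = "central" if len(categories) == 1 else "outer"
        for category in categories:
            result[category][clip] = status
    return {category: dict(members) for category, members in result.items()}


membership = classify_members(responses, verb_category)
for category in sorted(membership):
    print(category, membership[category])
```

In this toy data set, clip_04 is an outer member of both categories, which corresponds to a transitional zone between them.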

More generally, the notion of central and outer members of a category is reminiscent of Rosch’s (1975, 1978) prototype theory of categorisation. She proposes that category members relate to one another by having certain features in common with one or more members, but few, or no, features in common with all others in the same category. Some members can be roughly typical representatives of a category. The core, i.e., most central, member of the category is called a prototype, for two reasons. First, it has more features in common with other members. Second, it shares the fewest features, or none, with members of other categories. A prototype is consequently seen as best reflecting the category. To this extent, central events of an event category are not only described exclusively and consistently by speakers, as noted above, but also represent the features that are prototypical for the category.

To conclude, an investigation of boundary locations of categories in a given domain essentially centres on how categories as such relate to one another, thus illuminating the semantic architecture of the domain. Additionally, a category boundary helps identify core members central to the category as well as the category’s outer members at the category borderline. Category relationships with respect to placement of boundaries are then useful for typological comparisons of semantic structuring of events across languages.

c. Semantic organisation

For event descriptions, semantic organisation relates to the criterial properties that an event category singles out (Evans, 2010, p. 511). On the one hand, it is concerned with the question of which properties are used to differentiate low-level components of the event (or event phases) that deserve dedicated types, thereby grouping events with these identifiable phases together. For example, in English, the criterion of whether the consumed object is solid food or liquid distinguishes the type eat from the type drink (Gast et al., 2014). Consequently, it clusters different events of consumption under the eat and drink categories, which focus on ingestion of different object types. On the other hand, semantic organisation asks how an event, as a complex phenomenon, is semantically decomposed into parts. A case in point is the dissection of motion events into Manner versus Path, which are then encoded in either verbs or satellites in different languages (Talmy, 1985): e.g., English encodes manner in verbs and path in prepositional phrases, whereas Spanish verbalises path and encodes manner in gerundive phrases. By contrast, features by which an event category may be decided can be perceived as parameters (e.g., Pederson, Danziger, Wilkins, Levinson, Kitae, & Senft, 1998; Levinson et al., 2003). Specifically, these parametric features help identify potential event phases that classify different happenings into types or categories, consequently structuring an event domain.

Methodologically, the semantic organisation of an event category can be inferred from the event properties, i.e., features, that most category members share. However, if a category is large, one can logically start by looking at common features among the central members of the category, since members of this kind theoretically best denote the category (cf. Rosch, 1975, 1978).

Additionally, categories at a coarse-grained level would require a small number of parameters, while those at finer-grained levels would depend on a more substantial range of fine-grained parameters. For instance, Vulchanova et al. (2012) claim that at the most coarse-grained level, motions cross-linguistically break down into two categories, i.e., suspended versus supported gaits, based on the salient feature of velocity. Then, other fine-grained features (e.g., medium, posture, species, path orientation, or figure orientation) are progressively employed to further subdivide such coarse-grained categories, thus representing hierarchical links between categories and subcategories. Correspondingly, if we proceed deeper into the hierarchy to the level where no more fine-grained categories can be identified, we may specify the lowest identifiable phases logically possible in a language.

Furthermore, wherever event categories are anchored by verbs, shared features among category members could be recognised as linked to the verbs, while other event information might be semantically assigned to other linguistic expressions, or simply ignored (cf. Evans, 2010).

In the same spirit as prior research on event typology (e.g., Talmy, 1985; Majid et al., 2007a; Bowerman, 2005; Kopecka & Narasimhan, 2012; Vulchanova et al., 2012), a study of the semantic organisation of an event domain can give insight into the extent to which the domain categories can be taken as contiguous, thereby helping clarify the lexicalisation of event components, i.e., into verbs or other linguistic devices. In this way, components of lexical structures can be compared cross-linguistically.


In conclusion, this section has introduced how the partitioning of a semantic domain can be assessed systematically with reference to the three different aspects of categorisation. With these facets, differences in the extensions of lexical expressions and the taxonomic depth of presupposed categories can be identified. In the next section, I explain how a graphic display, i.e., a semantic map, can be used to account for semantic structuring across languages, since it provides a relatively unbiased way of modelling language data.

2.1.2.2 Semantic maps

A semantic map is a tool used to visualise common cross-language patterns shown by semantically or functionally similar linguistic expressions of specific languages.

Applications of semantic maps in lexical typology were taken up by Haspelmath (1997, 2003, 2004) and have been further developed from the second half of the 2000s to the present (e.g., Clancy, 2006; Cysouw, 2007; Cysouw, 2010; François, 2008; Narrog & van der Auwera, 2011; Narrog & Ito, 2007). Such graphic displays are prominent in research on domain categorisation, particularly research based on similarity data (e.g., Bowerman, Gullberg, Majid, & Narasimhan, 2004; Majid et al., 2004; Majid et al., 2007a-b; Majid et al., 2008; Malt & Majid, 2013; Malt, Gennari, Imai, Ameel, Tsuda, & Majid, 2008; Wälchli, 2010).

Koptjevskaja-Tamm et al. (2016, pp. 444-445) postulate that the “contiguity/connectivity requirement” would be the main operational method for producing a cross-linguistically generalised semantic map (cf. Gast & Koptjevskaja-Tamm, 2018). To be precise, for data to be objectively comparable, a postulate is required: functions (such as uses, meanings, or contexts) that are usually correlated with the same linguistic expressions are more similar semantically than those associated with different expressions. With this principle, such functions can consequently be arranged to represent a continuous region in semantic space.
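One common way of reading the connectivity requirement is that the functions covered by a single expression should occupy one connected region of the map. The sketch below checks this for a toy map; the map, its function labels A to D, and the helper name `is_contiguous` are hypothetical and are not taken from the sources cited above.

```python
from collections import deque

# Hypothetical semantic map: functions are nodes, edges link adjacent functions.
semantic_map = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}


def is_contiguous(functions, graph):
    """Return True if the given functions form one connected region of the graph."""
    functions = set(functions)
    if not functions:
        return True
    start = next(iter(functions))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Only follow edges that stay inside the expression's own set of functions.
        for neighbour in graph[node] & functions:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == functions


print(is_contiguous({"A", "B", "C"}, semantic_map))  # True: A-B-C is a connected path
print(is_contiguous({"A", "C"}, semantic_map))       # False: B is missing between A and C
```

Under this reading, an expression covering A and C but not B would count as evidence that the map needs to be rearranged.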

Although semantic maps generally embody cross-linguistic generalisations of lexical items’ semantic range, a proviso or two should be taken into account. First, a semantic map does not necessarily display the relation between the range of use of a linguistic expression and its descriptive intensional meanings. In other words, it may or may not correspond to speakers’ mental representations (cf. Haspelmath, 2003; Wälchli & Cysouw, 2012). Also, Haspelmath (2003) comments that a correlation of one and the same lexical item with several uses or contexts in a semantic map may not express multiple possible meanings. In this approach, multifunctionality need not force a strict commitment to either monosemic or polysemic interpretations. The semantic map approach thus enables a degree of tractability regarding these issues, which makes semantic maps suitable for typological frameworks comparing elicited lexical data.

A popular semantic map method for visualising similarity or categorisation data is the Venn diagram, which is used to graphically represent a pool of etic items such as stimulus items (Moore, Donelson, Eggleton, & Bohnemeyer, 2015). This type of diagram, however, has some limitations. For example, when a considerable number of languages or response types are involved, this technique typically makes it impossible to discern patterns (Moore et al., 2015).

Other types of techniques were therefore introduced to produce simplified models, allowing principal dimensions of agreement and variability to emerge. Researchers, however, have to select from among these techniques to maintain an acceptable tradeoff between simplification and information loss (Moore et al., 2015).

Such simplifying methods usually include Multidimensional Scaling (MDS), Correspondence Analysis (CA), and Hierarchical Cluster Analysis (HCA), which share the same design of visualising data in order to facilitate cluster inspection; accordingly, the overall structure of these clusters may allow interpretations of co-classifications.

The research methodology of the present study exploits Hierarchical Cluster Analysis. As suggested by Rosch (1978), objects (including lexical items as complex objects with semantic dimensions) are not only grouped into logical categories but also nested in a hierarchical taxonomy. Material-destruction verbs, like those of caused-separation, tend to have a hierarchical relationship. As shown in prior research, events labelled with different lexical items in one context were at other times assigned the same descriptor: e.g., snap and smash versus break in English (Majid et al., 2007b). Accordingly, to reveal underlying hierarchical structures, this study applies HCA, as it has an advantage over other statistical techniques with respect to such taxonomic and dimensional aspects.

Using agglomerative clustering over a similarity matrix, the HCA method creates a linear form of semantic map called a dendrogram, i.e., a tree representation. This version of the semantic map is useful for characterising a semantic domain, and/or making predictions: e.g., of parametric values (Koptjevskaja-Tamm, 2016). In a tree diagram, function elements (in this study, the stimulus video clips) that are most similar to one another are clustered with the shortest links (or lines), usually starting from the left or the bottom side of the dendrogram (cf. Majid et al., 2007b). Similarity, in this case, is determined by use of a single expression, i.e., a verb, that was not used elsewhere. A cluster of two functions correspondingly reflects their degree of similarity. If a cluster is anchored in a larger general cluster, there is a hierarchical relationship among the verbs used to describe the functions. That is, at least one verb is used to cover all the functions in the widest-range cluster, and at least one verb is used for all the functions in the subgroup. Furthermore, an outlier, known as a runt, may appear in a dendrogram as forming a branch by itself. Although this outlier may give an idea of how it is distinguished from other clusters, it does not express any categorisation knowledge and is therefore commonly excluded from discussion of the semantic partitioning of a given domain.
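The agglomeration procedure can be sketched in a few lines. The sketch assumes Jaccard’s index as the similarity measure and average linkage as the merging criterion, as named in the abstract; the clips, the verb sets, and the function names are hypothetical, and the procedure actually followed in this study is the one described in § 2.2.6.2b.

```python
from itertools import combinations

# Hypothetical data: clip id -> set of verbs used for that clip across speakers.
clip_verbs = {
    "c1": {"cut"},
    "c2": {"cut", "slice"},
    "c3": {"break"},
    "c4": {"break", "snap"},
}


def jaccard(a, b):
    """Jaccard similarity: shared verbs divided by all verbs used for either clip."""
    return len(a & b) / len(a | b)


def average_linkage(cluster_a, cluster_b, data):
    """Mean pairwise Jaccard similarity between the members of two clusters."""
    pairs = [(x, y) for x in cluster_a for y in cluster_b]
    return sum(jaccard(data[x], data[y]) for x, y in pairs) / len(pairs)


def agglomerate(data):
    """Repeatedly merge the two most similar clusters; return the merge history."""
    clusters = [(clip,) for clip in sorted(data)]
    history = []
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], data),
        )
        similarity = average_linkage(a, b, data)
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
        history.append((a, b, round(similarity, 3)))
    return history


history = agglomerate(clip_verbs)
for step in history:
    print(step)
```

The merge history corresponds to reading a dendrogram from its leaves upwards: the two ‘cut’ clips and the two ‘break’ clips are joined first, and the two resulting clusters are joined last, at similarity 0, because they share no verb.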

As noted above, statistical methods are approaches that simplify data for visualisation. Accordingly, graphic displays generated by such methods may entail information reduction. In this way, a dendrogram may conceal a large amount of diversity, since it can cluster daughters that are very different from one another under a single node on the basis of a single similarity measure. Categorisation commonly concerns similarity rather than difference. Even so, coarse-grained categorisation can provide insights into how a language classifies events, so dendrograms are considered a useful mode of representation for this study.

The detailed description of how the HCA technique can be performed to construct dendrograms through steps of agglomeration is given in § 2.2.6.2b.

2.1.2.3 Consistency of lexical descriptions

Consistency of lexical descriptions refers to interspeaker agreement in lexical expressions for an entity, or the degree to which the entity was consistently named by particular words. Brown and Lenneberg (1954) assert that consistency is one of the three indicators of codability in language. The other two are length of response (i.e., number of syllables or words) and reaction time. To this extent, high consistency, or low diversity, in naming a certain entity reflects that the entity is accessible to language (i.e., high codability) and perhaps to consciousness (Majid et al., 2018).

Therefore, an exploration of average lexical consistency for categories in a given domain may help to probe questions about the lexico-semantic partitioning of the domain. Specifically, some empirical issues with respect to consistency of vocabulary in a lexical field are listed below.

- In a category, do the central members possess a higher degree of average consistency of lexical descriptions than the outer members? If so, are the central members, as the prototypes, relatively more codable than their outer counterparts?

- What does low lexical consistency for the outer members of a category tell us about insecurity in naming such outer members? (cf. Vulchanova et al., 2012)

- Do event categories with a high degree of average lexical consistency tend to be named with coarse-grained expressions, i.e., basic verbs? Note that Majid et al. (2018) observe that the codability of a domain was higher when a group of speakers were more likely to use abstract terminology, as opposed to specific detailed words, to refer to that domain.

- Are descriptions of categories with high lexical consistency shorter than those of categories with a low degree of lexical consistency? As noted above, both lexical consistency and the length of description can indicate codability; accordingly, they should be aligned or mutually reinforcing (cf. Majid et al., 2018, for more codable stimuli with shorter descriptions on average).

There appear to be three influential metrics widely used for measuring consistency of lexical descriptions, or lexical diversity: (1) number of different words (e.g., Klee, 1992; Miller, 1991; Watkins, Kelly, Harbers, & Hollis, 1995), (2) type-token ratio (e.g., Fletcher, 1985; MacWhinney, 1994; Pan, 1994; Stickler, 1987; Templin, 1957), and (3) Simpson’s diversity index (e.g., Jessen & Cadierno, 2013; Majid & Burenhult, 2014; Majid et al., 2007b; Vulchanova et al., 2012). The following discussion notes potential flaws of the first two and defends the use of only the last measure in this research.

The number of different words (NDW) is straightforward in that it considers only the range of different words employed. To this extent, the range of vocabulary displayed by a speaker or a group of speakers indicates how diversely they named an entity. However, when two different ranges of vocabulary are derived from two different numbers of total words, doubts arise because like is not being compared with like (Malvern, Richards, Chipere, & Duran, 2004). Standardisation of the sample length, such as by randomisation, was consequently introduced to improve this measure; however, Malvern et al. (2004, p. 18) found that despite the same body of data, “different trials will select different samples and could lead to differing results”. That being the case, NDW may not yield reliable diversity measurements.

The type-token ratio, or TTR, may provide a better indication of diversity than NDW since the total number of words, or the sample size, is directly included in the TTR calculation. Malvern et al. (2004, p. 19) explain that the type count is the number of different words a language sample contains, while the token count is the total number of words in it; TTR is then obtained by dividing the number of types by the number of tokens. However, they still found that the direct use of tokens in TTR was not as helpful as it should have been, because different sample sizes nevertheless bias potential characterisations of diversity. As they summarise: “Higher values will be obtained from shorter samples and lower ones from larger samples” (pp. 22-23).
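The sample-size sensitivity of the two measures can be seen in a small sketch. The verb samples are hypothetical; the function names `ndw` and `ttr` are introduced here only for illustration and follow the definitions of type count and token count given above.

```python
def ndw(tokens):
    """Number of different words: the count of distinct types in the sample."""
    return len(set(tokens))


def ttr(tokens):
    """Type-token ratio: number of types divided by number of tokens."""
    return ndw(tokens) / len(tokens)


# Hypothetical samples drawn from the same three-verb vocabulary.
short_sample = ["cut", "slice", "chop", "cut"]
long_sample = short_sample * 5  # same types and proportions, five times as many tokens

print(ndw(short_sample), ttr(short_sample))  # 3 0.75
print(ndw(long_sample), ttr(long_sample))    # 3 0.15
```

Although both samples contain the same three verbs in the same proportions, the TTR drops from 0.75 to 0.15 simply because the second sample is longer, which is the bias Malvern et al. (2004) describe.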

Unlike the TTR, Simpson’s index,10 or D for short, as a measure of diversity is likely to mitigate the sample size bias, in that its calculation considers not only the total number of different types (richness) and the number of tokens, but also the distribution of the tokens across the individual types (evenness). A diversity measurement is thus based on both richness and evenness of types with reference to their tokens. No matter how much larger or smaller a sample becomes and how many types there are, the degree of diversity would remain relatively high if the types were uniformly distributed, and would be substantially reduced if the types were very unevenly distributed (cf. Boenigk, Wodniok, & Glücksman, 2015).

Consequently, Simpson’s diversity index is more appropriate for the present study as an estimate of the degree of variation in the lexical items used in naming the target video clip stimuli (i.e., ‘cut’ and ‘break’ clips, see § 2.2.2), thereby reflecting the different levels of homogeneity, i.e., consistency of lexical choices (verbs), across participants for the same clips (cf. Vulchanova et al., 2012).

D can first be calculated individually for each clip in each investigated language, thereby displaying the diversity in the naming of that clip. The D for each assumed category in the chosen domain and for each of the relevant languages can then be estimated by averaging the per-clip indices for that category and that language respectively. Diversity levels determined in this way may help answer the above-mentioned questions.

10 Simpson’s diversity index (Simpson, 1949) was borrowed from ecology, where it is used for measuring species diversity in a community (Majid et al., 2018).

As for the interpretation of D, it ranges from 0 to 1 and equals 0 at maximum diversity. To this extent, the index is counterintuitive and contrary to its name: higher values mean lower diversity. This study therefore adopts the complement representation, Gini-Simpson’s index, calculated as 1 - D. With this derived index, the higher the value, the higher the diversity. Simpson’s original formulation and the derivation of Gini-Simpson diversity are more fully considered in § 2.2.6.2 on the statistical approaches used in this research.
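The per-clip calculation and the per-category averaging can be sketched as follows. The sketch assumes the finite-sample form of Simpson’s index, D = Σ n(n - 1) / N(N - 1), which equals 0 when every response is a different verb; the formulation actually used in this study is the one given in § 2.2.6.2, and the clips and responses below are hypothetical.

```python
from collections import Counter


def simpson_d(verbs):
    """Simpson's D (assumed finite-sample form): the probability that two
    responses drawn without replacement use the same verb."""
    total = len(verbs)
    if total < 2:
        raise ValueError("at least two responses are needed")
    counts = Counter(verbs).values()
    return sum(n * (n - 1) for n in counts) / (total * (total - 1))


def gini_simpson(verbs):
    """Complement 1 - D: higher values mean higher naming diversity."""
    return 1 - simpson_d(verbs)


# Hypothetical responses for three clips assumed to belong to one category.
category_clips = {
    "clip_01": ["cut", "cut", "cut", "cut"],      # full agreement
    "clip_02": ["cut", "cut", "slice", "slice"],  # two verbs, evenly split
    "clip_03": ["cut", "slice", "chop", "saw"],   # every response differs
}

per_clip = {clip: gini_simpson(verbs) for clip, verbs in category_clips.items()}
category_mean = sum(per_clip.values()) / len(per_clip)

for clip, value in per_clip.items():
    print(clip, round(value, 3))
print("category mean", round(category_mean, 3))
```

In this toy example the unanimously named clip has a diversity of 0, the clip on which no two speakers agree has a diversity of 1, and the category mean lies between the two.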

2.1.3 Domain of caused-separation events

In this study, the experiential domain of caused-separation was selected for exploration since its events are central to human cognition and readily accessible. Also, speakers of a language do not need to rely on specialised knowledge to talk about situations of this kind, nor do audiences need specific expertise to understand references to them.

The following pages address the nature of caused-separation as related to other types of separation and its status as the core of separation events. This section ends with a discussion of previous studies on the caused-separation event domain across languages, especially Thai, one of the languages to be examined here.

2.1.3.1 Nature of caused-separation events

Events of caused-separation11 are essentially defined as separation of material integrity (cf. Hale & Keyser, 1987; Fillmore, 1970; Majid et al., 2007b, 2008; Bouveret & Sweetser, 2009; Devylder, 2017; Thepchuaysuk & Thepkanjana, 2017).

11 Some previous research adopts a different name, cutting and breaking (e.g., Majid et al., 2007a-b; Bowerman, Majid, Erkelens, Narasimhan, & Chen, 2004), perhaps reflecting the putatively universal classification of cut-type versus break-type events in this domain. This study instead employs the label “caused-separation” because it expresses the nature of the domain, which involves material disintegration of objects (intentionally) caused by agents.

However, they may be incorporated into a larger domain which encompasses other types of separation (e.g., opening or taking-apart) (cf. Majid et al., 2004, 2008). This broader domain of separation can be established because its event members all involve objects undergoing a change of state (cf. Fillmore, 1970; Thepchuaysuk & Thepkanjana, 2017), i.e., from integral or whole objects into separated parts or pieces.

From this point of view, caused-separation events and other separating actions are similar in being physical processes where entities move from one state to another, regardless of the assumed results of the activities or the types of objects affected. On the other hand, caused-separation events are also considered inherently distinct from other types of separation, in that the former do not carry certain implications which the latter can convey.

Specifically, caused-separation events are not tied to certain physical effects and potential implications, whereas other types of separation are firmly tied to them. Consider caused-separation, opening, taking-apart, and peeling, using the English terms here as labels for comparative purposes. These four kinds of events are generally distinct in nature; for example, opening involves revocable effects, and taking-apart and peeling possibly do the same, but caused-separation does not share this feature. Next, some salient characteristics of caused-separation, opening, taking-apart, and peeling are presented, with discussion of significant variation among these categories, that is, the internal variation across different types of general separation.

Like caused-separation, taking-apart, and peeling, opening is associated with events where something is caused to separate or come apart, either wholly or partially. In particular, should we incorporate opening into events of separation? It would likely be considered reasonable, since in an opening event an object may be made to separate by hand or using a device. Accordingly, one thing possibly either becomes two (or many) things or is caused to leave a blank space or gap. Nevertheless, opening also diverges from other types of separation in at least two fundamental ways: functionality and reversibility.

Firstly, the possibility of being opened can be viewed as a function of objects. For instance, we can open a door because it is functionally capable of being opened. In contrast, the idea that objects have destructive separability as their function is not so convincing. Take cutting a tomato as an example. The possibility of a tomato being cut by a knife does not convince us that being separated is the tomato’s use or function. It is only a natural quality of the tomato that it does not resist cutting forces. Correspondingly, the functionality of being opened is intimately linked to the other point being discussed: the capability of being reversed, or reversibility.

Logically, reversibility would be expected if objects are regarded as having the function of being opened. To illustrate, a door can be opened at some point, shut at a subsequent time, and re-opened later. Ostensibly, the opening of the door is revocable, as a closing can take place after an opening. The reversibility of opening then allows the door to be ready for reopening, thereby optimising its functionality of opening.

In contrast, cutting and breaking objects do not typically appear to involve reversibility. For example, once we have cut a tomato or broken a twig, neither object can be restored to its original state. Thus, the likelihood of being cut or broken is not customarily viewed as a use or function of theme objects.

Additionally, such reversibility versus irreversibility of separation is supported by evidence from statistical analysis (Majid et al., 2008). Majid et al. claim that reversibility is the relevant dimension supporting a cross-linguistic distinction between caused-separation events and those of opening. In particular, using the data elicited with the stimuli (cf. Bohnemeyer, Bowerman, & Brown, 2001) from 28 typologically and genetically different languages, Majid et al. were able to construct a Multidimensional Scaling diagram showing possible dimensions of similarity over which the languages group some events of separation together and distinguish others. According to the diagram, the first dimension supports a distinction between reversible separations and irreversible ones. The reversible events are later regarded as opening events (p. 240).

Moving on now to events of taking-apart, they possibly express revocable actions, as do events of opening. Generally, they refer to the disassembly of grouped sets of objects, such as a pile of plastic glasses, or a set of a table and side chairs. Take an event of pulling a chair away from a table as an example. The action is reversible since the chair can be pushed back to its original position. However, the reversibility of taking-apart essentially differs from that of opening. The capability of a chair to be pulled away and then pushed back cannot be regarded as its function, since no chair is primarily designed or made to be pushed aside and drawn back to its “well-matched” table. In other words, it makes no sense to say that the chair was moved because of its functionality or suitability for serving such a purpose.

Again, Majid et al. (2008) refer to another dimension that statistically distinguishes an event of taking-apart, i.e., pushing a chair backwards from a table (displayed in video clip no. 7 in Bohnemeyer et al., 2001), from all the other separation events (including those of opening) in the multidimensional scaling diagram.


We now turn to events of peeling. They are more closely related to caused-separation than to opening and taking-apart. According to Majid et al.’s (2008) multidimensional scaling diagram, different events of peeling (in Bohnemeyer et al., 2001) are placed closer to the cluster of caused-separation events. However, caused-separation and peeling are still considered relatively distinct, because actions of peeling are associated with shared knowledge of loose skins or outer coverings and the way of removing such things, whereas those of caused-separation are not. Therefore, peeling is not only a matter of dividing things into pieces, but is done precisely to cause outer coverings to come apart from inner portions or segments.

Taken together, certain characteristics (e.g., reversibility, or the implied removal of outer layers) suggest how opening, taking-apart, and peeling can be regarded as characteristically diverging from caused-separation. There is thus significant internal variation in the broad domain of separation. Consequently, caused-separation events may not be equated with those of general separation, which incorporates other separation types. The different types of separation should not be treated indiscriminately, without regard to those fundamental differences. Having said that, caused-separation can reasonably be considered core or typical in the domain of separation (cf. Majid et al., 2008). Next, how caused-separation typically characterises events of separation is discussed in more detail.

2.1.3.2 Caused-separation as core of separation events

As noted above, events of caused-separation can be regarded as exemplars of separation.

Generally, actions of division or separation often involve physical force aimed at disturbing the integrity of objects by disjointing their parts, giving greater prominence to permanent destruction of the whole caused by humans. As in Phanthumetha’s Khlang kham [Thai Thesaurus] (2016), the semantically related verb class of “separating parts” contains a large number of verbs of cutting, breaking, slicing, slashing and the like. Since most of the verbs in the Thai semantic class of separation involve activities of material destruction brought about by human physical force, events of this kind are deemed to be typical of separation in general.

In addition, historically speaking, separating operations have long roots in human culture and have commonly called to mind certain representative activities and interactions. For example, in postharvest reports (see Kader, 1992), when farmers talked about separation, they often referred to cutting off or breaking apart substandard, unformed, or broken products, such as extraneous kernels or unripened buds, or to breaking open the shells of nuts, as in shelling walnuts; they sometimes also mentioned peeling the outer layers of vegetables among such actions of separation. In postharvest reports of typical separating operations, opening and taking apart were rare. Note that these do not bring about the irreversible destruction of object integrity that is usually required in agricultural production. To this extent, the closeness between caused-separation and peeling can be sensibly perceived in regulating human activities, as distinct from other separation types like opening or taking-apart.

However, as mentioned, events of peeling are essentially unlike those of caused-separation. That is, peeling evokes the immediately surrounding outer layers of the objects separated. For example, the event of the mother peeling a mango means that the mother removed a mango’s skin, as the skin itself is implicitly and automatically inferred even though no lexical unit associated with it was explicitly stated. In contrast, cutting and breaking an object do not imply a pre-existing layer as such. As a matter of fact, the actions are generally regarded as disregarding the existence of the outer layer or skin (if any). For instance, in saying we cut a mango, we do not, through the verb, hint at the mango’s skin: the mango might not have had a skin before the action of cutting, or if it had a skin, the action would have no specific bearing on the skin, which would still remain attached to the fruit after the cut. Consequently, in a strict sense, peeling deviates from other caused-separation events, and may not be as truly representative of caused destructive separation.

Although events of separation incorporate a wide range of activities (e.g., cutting, breaking, peeling, opening, or taking apart), they commonly call attention to caused-separation involving irreversible material destruction, as conveyed by the prototypical verbs of the semantic class of separation. Also, major operations of separation in the history of humankind mainly involve caused-separation of integral objects in order to pursue livelihoods. Thus, I regard caused-separation events as encompassing cutting, breaking, tearing, or the like. The main focus of this study is accordingly on these separation events and on the verbs used to code such caused-separation exemplars.

2.1.3.3 Previous work on caused-separation in languages

This section reviews prior research on caused-separation events, focussing on earlier interpretations of the domain and determinations of its semantic structuring. The first part (§ 2.1.3.3a) covers cross-linguistic studies of caused-separation in various languages of the world, showing important findings compatible with divergent theoretical stances on semantic categorisation (cf. Narasimhan et al., 2012). The second part (§ 2.1.3.3b) specifically discusses how the caused-separation domain was previously characterised with respect to semantic patterning in Thai, and thus provides an important orientation for the present investigation. Note that, given the lack of relevant prior studies on the semantic characterisation of caused-separation in Khmer, there is no equivalent treatment of that language here.

a. Cross-linguistic research on caused-separation domain

As noted above, events of caused-separation, i.e., separation with material destruction, seem central to human cognition and require no specialised knowledge for descriptions (Majid et al., 2004, 2007a).

Consequently, the event domain of caused-separation and its vocabulary have prompted a considerable number of studies in many different languages since Fillmore’s (1970) discussion of a universal classification based on the semantic and syntactic behaviour of hit-type versus break-type verbs. Each of these studies addresses questions of semantic categorisation in the domain of caused-separation events, as conveyed by linguistic expressions. They assess topics such as universal convergence or language-specific divergence in the domain’s distinctions.

So far, the category characterisation of caused-separation observed in different languages has been discussed with reference to one of three established positions: universal categorisation, cross-linguistic diversity in categorisation, and cross-linguistically convergent categorisation with language-specificity.

Important findings in this previous cross-linguistic research are reviewed in the following sections.

UNIVERSAL PATTERNING OF CATEGORISATION

Studies of caused-separation consistent with the stance of universal categorisation (e.g., Fillmore, 1970; Guerssel et al., 1985; Hale & Keyser, 1987; Keyser & Roeper, 1984; Kroeger, 2010; Levin, 1993) similarly conclude that verb classes can be defined by the shared behaviour of their members with respect to diathesis alternations, as well as by shared aspects of meaning, and that this way of classification is reflected cross-linguistically. In line with this claim, the presumably universal behaviour of verb classes is taken as a means of categorising the caused-separation domain. This suggests that similar distinctions can be directly recognised across different languages.

Fillmore (1970) is a critical early analysis of the semantic and syntactic issues involved with caused-separation verbs. This study contrasts hitting-class and breaking-class verbs and establishes useful methodological tests to distinguish them. Fillmore considers sets of verbs exemplifying the hitting/breaking difference and distinguishes the semantic components of surface contact versus change of state. He shows that verbs of the former set permit the body-part possessor ascension alternation, whereas those of the latter set permit the causative/inchoative alternation (see Table 2.3 below). Probing procedures of the type used by Fillmore have become standard in analyses of caused-separation verbs. Following Fillmore, Guerssel et al. (1985) propose, based on Berber, Warlpiri, and Winnebago, that cut-type verbs and break-type verbs are universally characterised by their semantics (+MOTION, +CONTACT versus -MOTION, -CONTACT) and syntax (conative versus non-conative), thereby promoting two cross-linguistically valid categories in the caused-separation domain: cutting versus breaking. Levin (1993) builds on Fillmore’s and Guerssel et al.’s categorisation, using a wider range of semantic properties and diathesis alternations to determine potentially universal caused-separation verb classes, as described in Tables 2.2 and 2.3, respectively.


Table 2.2
Shared semantic components of meaning with reference to hit-class, cut-class, and break-class (adapted from Levin, 1993, p. 10)

Class         MOTION   CONTACT   CHANGE
hit verbs     +        +         -
cut verbs     +        +         +
break verbs   -        -         +

Table 2.3
Pattern of syntactic behaviour of three different caused-separation verb classes: hit-class, cut-class, and break-class, with respect to diathesis alternations (adapted from Levin, 1993, p. 7)

Alternation                     hit   cut   break   Example12
Conative                        Yes   Yes   No      A. Carla hit at the door.
                                                    B. Margaret cut at the bread.
                                                    C. *Janet broke at the vase.
Body-part possessor ascension   Yes   Yes   No      A1. Carla hit Bill’s back.
                                                    A2. Carla hit Bill on the back.
                                                    B1. Margaret cut Bill’s arm.
                                                    B2. Margaret cut Bill on the arm.
                                                    C1. Janet broke Bill’s finger.
                                                    C2. *Janet broke Bill on the finger.
Middle                          No    Yes   Yes     A. *Door frames hit easily.
                                                    B. The bread cuts easily.
                                                    C. Crystal vases break easily.

12 Adapted from Levin (1993, pp. 6-7)

Referring to Tables 2.2 and 2.3, the conative alternation is compatible with the hit and cut classes, which share the meaning components CONTACT and MOTION. Body-part possessor ascension is possible only for the hit and cut classes, whose meanings include CONTACT.13 The middle construction, in turn, is available to the cut and break classes, whose meanings incorporate CHANGE. By means of this shared semantic and syntactic behaviour, the three caused-separation verb classes are clearly distinguished. Also, according to Levin (1993), these distinctions are grammatically relevant not only to English but also to other languages: Lhasa Tibetan (cf. DeLancey, 1995; 2000), Berber, Warlpiri, and Winnebago (cf. Guerssel et al., 1985), Kimaragang Dusun (cf. Kroeger, 2010), and Jarawara (cf. Vogel, 2003).
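The correspondence between Tables 2.2 and 2.3 can be restated as a small lookup. The following Python sketch is illustrative only: it assumes that each alternation is licensed by exactly the meaning components named in the discussion above (conative by MOTION and CONTACT, body-part possessor ascension by CONTACT, middle by CHANGE), and the names VERB_CLASSES, ALTERNATION_REQUIRES, permits, and alternation_table are hypothetical labels that do not come from Levin (1993).

```python
# Feature values as given in Table 2.2 (True = "+", False = "-").
VERB_CLASSES = {
    "hit":   {"MOTION": True,  "CONTACT": True,  "CHANGE": False},
    "cut":   {"MOTION": True,  "CONTACT": True,  "CHANGE": True},
    "break": {"MOTION": False, "CONTACT": False, "CHANGE": True},
}

# Assumption for illustration: the components each alternation depends on,
# as stated in the paragraph discussing Tables 2.2 and 2.3.
ALTERNATION_REQUIRES = {
    "conative": ("MOTION", "CONTACT"),
    "body-part possessor ascension": ("CONTACT",),
    "middle": ("CHANGE",),
}


def permits(verb_class, alternation):
    """Return True if the class carries every component the alternation needs."""
    features = VERB_CLASSES[verb_class]
    return all(features[c] for c in ALTERNATION_REQUIRES[alternation])


def alternation_table():
    """Derive the Yes/No pattern for every alternation and verb class."""
    return {
        alt: {cls: permits(cls, alt) for cls in VERB_CLASSES}
        for alt in ALTERNATION_REQUIRES
    }


for alternation, row in alternation_table().items():
    print(alternation, row)
```

Under these assumptions, the derived pattern matches the Yes/No values in the three rows of Table 2.3.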

Levin and Hovav (1995) add to the semantic structure of cutting-type verbs the component of INSTRUMENT. They posit that a cut-class verb always requires an instrument which a volitional agent implements to cause the change of state represented by the verb; however, a break-class verb does not do so.

CROSS-LINGUISTIC DIFFERENCES IN CATEGORISATION

By contrast, Pye et al. (1995) express doubt about the notion of universality, arguing that, despite certain common semantic properties (Pye, 1996), equivalent caused-separation lexical fields can vary significantly across languages with respect to extension patterns in the domain. For example, Pye found that in K’iche’ Maya, events that would all be labelled with the English verb break are described using up to 42 different cutting/breaking verbs, and no neutral equivalent to English break is available in the language. Accordingly, the K’iche’ Maya phenomena of breaking may suggest a lexical-field patterning distinct from that of English.

13 Levin (1993) demonstrates that body-part possessor ascension is associated only with CONTACT and not with MOTION, because it is also compatible with the touch-class, which does not involve MOTION. The touch-class is excluded from the present discussion because it is not commonly considered part of the caused-separation domain.

Another case in point is cross-linguistic variation in caused-separation classification with respect to various combinations of parametric properties (Pye, 1994, 1996; Pye et al., 1995). Bowerman (2005, pp. 230-232) explains such differences as follows. Individual languages select different features to attend to when describing the complex domain of cutting and breaking, and determine how to combine them. In English, the verb break characteristically applies to rigid objects and occasionally to one- and three-dimensional flexible objects (e.g., a thread or a baguette). Material disruptions involving two-dimensional flexible objects, such as a sheet of paper or a blanket, are incompatible with break; instead, tear or rip is more appropriate in these cases. Other languages, such as Mandarin, Thai, and K’iche’ Mayan, do not follow the English distinction between breaking and tearing events. Instead, they use different combinations of properties such as the shape, size, and material of the objects being broken.

Such linguistic phenomena seem to suggest that significant differences may lie in the way that languages classify separation events, and the universality of categorisation is far from established—or may be wrong.

UNIVERSAL CONVERGENCE WITH LANGUAGE-SPECIFIC INTERFERENCE

Majid et al. (2004) were the first to promote the position of cross-linguistic convergence with language-specificity in the caused-separation domain. They used an elicitation task (i.e., the ‘cut’ and ‘break’ clips by Bohnemeyer et al., 2001) and the extensional approach to investigate the semantic patterning of caused-separation verbs across 28 typologically, genealogically, and areally diverse languages. They show that there seems to be a cross-linguistic consensus in the lexical-semantic categorisation of the domain of core cutting and breaking (caused-separation, as it is termed in this study), as speakers of these different languages converge on a similarity space. In this shared space, events are well distinguished by the agent’s degree of control over the location of separation. Consequently, events with precise control and those with imprecise control are universally categorised into two extreme groups, namely cutting versus breaking. Having said that, language-specificity still plays a role. According to Majid et al., separation events with intermediate control are handled variably across languages. Such variation seems to be driven by language-specific influences on the separation domain, which in turn lead to differences in the number of categories and the location of category boundaries.

Following Majid et al.’s (2004) approach, several more studies on caused-separation verb semantics (e.g., Majid et al., 2007a-b; Majid et al., 2008; Narasimhan, 2007) have subsequently been conducted, supporting the finding of cross-linguistic similarities with language-specific differences in the domain. In particular, the most criterial cross-linguistic feature of caused-separation events is whether the location of separation can be predicted precisely or not. However, other local differences are also evident in the domain’s arrangement across different languages, as described below.

Majid et al. (2007b) point to different arrangements of chopping events, which involve intermediate predictability of the separation location (the agent’s intermediate control), in different but genetically related Germanic languages. Specifically, English groups chopping events with precisely controlled cutting events, while German, Dutch, and Swedish group them with imprecisely controlled breaking events. Beyond Majid et al.’s European languages, Ameka and Essegbey (2007) study Ewe, a dialect group spoken in south-eastern Ghana. They suggest that the highly agentive caused-separation verbs of the language are typically classified according to kinds of instruments (e.g., a sharp instrument) and sorts of purposes or manners (e.g., a sweeping movement). By contrast, in Tzeltal, the three main properties distinguishing caused-separation categories are (1) spatial and tactile properties of the objects being separated, (2) separation with respect to the object’s axes or parts, and (3) completion of separation (Brown, 2007). Much like Tzeltal, Mandarin fundamentally differentiates its caused-separation verbs by instruments (e.g., single versus double blades), manners, and affected-object properties (Chen, 2007). Jalonke (Lüpke, 2007) seems to run more parallel to Ewe in its use of manner and instrument as classifying features. However, it also recognises other distinct criterial aspects, i.e., the concept of wholeness (Gestalt whole versus componential whole) and stereotypical versus unexpected actions of separation. Hindi (IE) and Tamil (DR) are described by Narasimhan (2007) as also following the cross-linguistic categorisation by the predictability of the separating location. However, several local distinctions recognised in many languages, e.g., the type of instrument used and the manner of execution, are also found in the two languages’ caused-separation categorisation, albeit in different combinations.

The last language to be considered here is Lao, a Tai-Kadai language. Enfield (2007) shows, based on some frequently used caused-separation verbs, that caused-separation categorisation in Lao depends on whether an instrument is present, what manner is used, what the properties of the separated objects are, and whether the action is completed. Given that Lao is closely related to Thai, a focus of this study, and that the two are spoken in geographic proximity, similarities and differences between Lao and Thai categorisations in the caused-separation domain can be informative, especially regarding how key contrasts (if any) are made among caused-separation cognates.

The findings described above show that distinctions between caused-separation categories are more complex than the original analyses by Fillmore (1970) and Levin and Hovav (1995) suggest. Despite some cross-linguistic dimensional convergence (e.g., regarding high versus low predictability of the locus of separation), several language-specific semantic distinctions are found to classify the caused-separation domain, reflecting variation in its patterning across different languages. The distinctions reported so far include the following:

- Intermediate predictability of locus of separation

- Instruments (e.g., different kinds of sharp instruments)

- Manners (e.g., a sweeping movement, or a forceful manner)

- Spatial properties of actions (e.g., separation across axes)

- Textural properties of objects (e.g., a long and rigid object)

- Result or completion of actions (e.g., full separation)

- Gestalt whole versus componential whole (e.g., whether objects being separated are considered whole)

- Conventionality (e.g., whether actions are regarded as stereotypical)

To conclude, previous studies using the extensional approach together suggest cross-linguistic convergence with local differences in the categorisation of the caused-separation domain. Still, they appear to have restricted their consideration to certain identical features. To this extent, a more inclusive approach is required for further cross-linguistic research on similarities and differences in the way various languages pattern the domain, and on whether such similarities are universal or language-specific.

Taken together, previous research on the semantic categorisation of caused-separation events has so far been compatible with one or another of the existing theoretical proposals regarding the linguistic structuring and representation of experiential events (see the three viewpoints of semantic categorisation in § 2.1.2.1; cf. Narasimhan et al., 2012). Despite this, recent and more experimentally designed studies appear to have strengthened the stance of cross-linguistically convergent classification with local divergence (e.g., Lüpke, 2007; Majid et al., 2007a-b, 2008; Rounti, 2018; van Staden, 2007, among others). These studies are valuable for further investigating the structuring of caused-separation events in other languages of the world for two reasons. First, they provide background information on both shared and language-specific semantic components that caused-separation verbs may lexically encode. Second, they offer a methodological resource on how the semantic categorisation of caused-separation events can be experimentally and analytically evaluated.

The next part provides more information about the linguistic structuring of the caused-separation domain, but focusses particularly on Thai.

b. Studies on caused-separation events in Thai

Studies on the semantic categorisation of caused-separation in Thai have been carried out by Premsrirat (1987), Thepchuaysuk (2016), and Thepchuaysuk & Thepkanjana (2017). Premsrirat adopts the lexicalist approach to determine the semantic structure of caused-separation verbs in Thai, for comparison with that of their Khmu equivalents. However, in the interest of space and to avoid issues beyond the present study’s scope, only her discussion of Thai is included here. Thepchuaysuk and Thepkanjana, by contrast, extend Majid et al.’s (2004, 2007a, 2008) cross-linguistic research. They conduct the first investigation of the caused-separation domain in Thai using the elicitation task (Bohnemeyer et al., 2001). This research is reviewed below, and their findings on how the domain is patterned in Thai are considered. Gaps still existing in this area of research are identified in the concluding part.

PREMSRIRAT (1987)

Premsrirat’s (1987) research on the cutting field is the first of its kind for Thai (as well as Khmu). Her aim is to specify the semantic components of Thai cut-type verbs, as intensionally defined by dictionaries and native speakers, and their hierarchical relationships, using the method of componential analysis. Premsrirat explains that the cutting lexical field in her terms refers to events involving “a human manual activity to divide an object into parts with the help of sharp-edged instrument” (p. 150). Under this definition, her cutting field seems to be narrower than, but still included in, the caused-separation domain discussed in this study (see § 2.1.3.1).

Premsrirat (1987) collected matched data on cutting verbs from four different sources: (1) her own intuition as a native speaker of Thai, (2) three available dictionaries, (3) a discussion by Thai students of the contrastive characteristics of such verbs, and (4) answers to questionnaires on the characteristic components of cutting verbs. A data set of 55 Thai cutting verbs was constructed, showing that the cutting field in Thai requires up to 89 derived semantic components. Using these data, Premsrirat then generates a component-by-verb matrix following Jakobson, Fant, and Halle (1951) to identify four basic classes of distinguishing components, with their subclasses, in Thai, adapted as follows (p. 159):


(1) Instruments
    a. Kinds of instruments
    b. Parts of instruments

(2) Objects being cut
    a. Substances and shapes
    b. Areas to cut

(3) Actions
    a. Directions
    b. Movements
    c. Emotional attitudes

(4) Results
    a. Effects (physical results)
    b. Uses (purposes)

Premsrirat (1987) suggests a plausible arrangement for them, following Nida’s (1975, p. 37) logical-temporal order of internal relations. As a result, she claims that the cutting-field classification in Thai proceeds from what instruments are used to cause a cut, to what kinds of objects are cut, to what actions are made, and finally to what effects or uses are intended or expected, in hierarchically sequential order.

Premsrirat’s establishment of the hierarchical order of the components in the cutting domain can be more simply visualised as follows:


Figure 2.1. Hierarchical order of distinguishing components in the classification of cutting in Thai after Premsrirat (1987).

Premsrirat (1987, p. 160) also calls for “the systematic way of describing the meanings of a [cutting] word in Thai”, as opposed to the inadequate definitions in dictionaries, which omit components. For example, instead of “to cut (into pieces), slice” (Haas, 1964), /hàn/ should be defined based on hierarchically ordered components: “/hàn/ is a manual activity (implies “human action”) for dividing things into parts by using an instrument with (1)14 a single sharp narrow blade to cut especially (2) food items such as meat, vegetables or fruit, with (3) a vertical motion into (4) slices or chunks” (p. 160).

Premsrirat (1987) contributes a systematic research approach (cf. Jakobson et al., 1951) to the semantic systems of cutting words in Thai. With this study’s methodology, the meaning of a cut-type verb can be described with reference to distinctive components which uncover contrasts in the lexical structure. A strong point of Premsrirat’s study is its identification of semantic components that are ordered hierarchically in the organisation of the cutting field.

14 The numbering and bolding do not appear in the original source but are added here to reflect the hierarchy of semantic components.

THEPCHUAYSUK (2016); THEPCHUAYSUK & THEPKANJANA (2017)

Following Majid et al.’s (2004, 2008) technique, Thepchuaysuk (2016) and Thepchuaysuk and Thepkanjana (2017) base their research on the video clip stimuli displaying separation events sensu lato (Bohnemeyer et al., 2001). Specifically, these events cover a broad range of separation actions that differ in nature: opening and taking-apart events (both implying reversible separation), peeling events (removal of outer layers), caused-separation, and spontaneous separation, the last two involving irreversible material destruction. Thepchuaysuk elicited 38 different verbs from 48 Thai informants as they described the general separation stimuli.

Using a Multidimensional Scaling (MDS) solution based on a similarity matrix,15 Thepchuaysuk (2016), and Thepchuaysuk and Thepkanjana (2017) generalise parametric features involved in classifying different events of general separation. The analysis is based on shared attributes present in all event members in the same clusters and subclusters in a semantic space. Despite the different kinds of separation involved, Thepchuaysuk’s findings show that caused-separation events can be adequately described and predicted by the proposed semantic parameters. Below is a summary of the semantic parameters used to categorise the separation domain as established in his study.

(a) Presence of instruments: Thepchuaysuk and Thepkanjana (2017) mention two aspects of this criterial feature. First, they explain that no separation events involving instruments, e.g., cutting,16 can be grouped together with those that do not involve instruments, e.g., snapping. The specific issue is whether the involvement of an instrument in a separation action helps to classify that event in the domain. Second, different types of instruments tend to influence the subclassification of separation events. As Thepchuaysuk remarks, his Thai informants, for example, always used /tàt/ ‘cut’ to label separations caused by double-bladed scissors, whereas those caused by small single-bladed implements were variably described with a wider range of cutting verbs, including /tàt/. Another case is the involvement of blunt-headed implements such as a hammer. Separation events with this kind of instrument seem to be set apart from others, as they are commonly named distinctly with /tʰúp/ ‘hit/smash’.

15 The pairwise similarity matrix is generated based on whether a pair of video clips is considered semantically similar. If at least one informant described two clips with the same verb, the two clips are scored 1 (completely similar). If no informant described them with the same verb, they are scored 0 (completely dissimilar) (Thepchuaysuk & Thepkanjana, 2017, p. 288; cf. Majid et al., 2008). Given 61 different video stimuli in Thepchuaysuk’s research, there are (61 × 60) ÷ 2 = 1,830 pairs of stimuli in total.

16 The use of the terms cutting, breaking, snapping, or the like, and of English glosses for languages other than English, does not imply identity with, but only putative comparability to, the English counterparts, and is merely for convenience of discussion.

(b) Use of force: In Thai, events in which the agent forcefully causes a separation are named differently from those in which the agent does not act forcefully. For example, the event where the agent used a sharp implement to separate a rope in a non-forceful manner tended to be labelled with /tàt/, whereas those in which the agent used the same kind of instrument to separate a carrot forcefully were generally named with two different verbs: /sàp/ ‘chop’ and /pʰàː/ ‘chop/cleave’.

(c) Orientations of separation: The Thai semantic patterning of separation events seems sensitive to crosswise versus lengthwise orientations of separation. For instance, the Thai informants tended to describe the event where the agent used a small knife to separate a carrot along its long axis with /pʰàː/ ‘chop/cleave’, whereas otherwise similar events in which the separation ran along the short axis were mostly labelled with /hàn/ ‘cut’ instead.

(d) Predictability of locus of separation: Whether the result of separation can be precisely predicted or not plays a role in the semantic categorisation of this domain. Consequently, smashing events, where the location of separation can hardly be predicted because the theme object may shatter into pieces, were never labelled with the same verbs as their cutting/slitting counterparts, in which the separation locus is more precisely predictable.

(e) Theme object properties: For events without instruments, the properties of the objects being acted upon seem to help in event classification. Specifically, two-dimensional flexible objects like a sheet of paper tend to invoke different verbs from one-dimensional rigid objects like a twig.
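The binary scoring described in footnote 15 can be sketched as follows. The example is a minimal Python illustration and not Thepchuaysuk and Thepkanjana’s (2017) implementation: the informant responses are invented toy data with placeholder verb labels (V1, V2, V3), and the names similarity_matrix and n_pairs are hypothetical.

```python
from itertools import combinations

# Invented toy data: informant -> {clip number: verb used for that clip}.
# V1, V2, V3 are placeholder labels, not attested responses.
RESPONSES = {
    "informant_1": {1: "V1", 2: "V1", 3: "V2"},
    "informant_2": {1: "V3", 2: "V1", 3: "V2"},
}


def similarity_matrix(responses, clips):
    """Score a clip pair 1 if at least one informant used the same verb
    for both clips, and 0 otherwise."""
    scores = {}
    for a, b in combinations(sorted(clips), 2):
        shared = any(
            a in answers and b in answers and answers[a] == answers[b]
            for answers in responses.values()
        )
        scores[(a, b)] = 1 if shared else 0
    return scores


def n_pairs(n_clips):
    """Number of unordered clip pairs: n(n - 1) / 2."""
    return n_clips * (n_clips - 1) // 2


matrix = similarity_matrix(RESPONSES, clips=[1, 2, 3])
print(matrix)
print(n_pairs(61))
```

With 61 clips, n_pairs gives the 1,830 pairs stated in footnote 15. Because a single shared description is enough to score a pair as 1, this scheme does not distinguish pairs that many informants grouped together from pairs that only one informant grouped together.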

Thepchuaysuk and Thepkanjana (2017) also remark that these semantic parameters can work together to classify separation events into categories, or more precisely, in their words, “to help select a verb for describing a separation event” (p. 297). For example, an event of tearing a piece of cloth by hand may be categorised by a combination of [-instrument] and [+2-dimensional object].

Note that Thepchuaysuk (2016) also analyses the diathesis alternations of separation verbs. He attempts to show that verbs of separation can likewise be classified into categories according to shared syntactic behaviour. However, no clear links are established between semantic and syntactic properties with respect to the linguistic structuring of the separation field (cf. Fillmore, 1970; Levin, 1993, among others), as the two are treated quite separately in his discussion.

Thepchuaysuk (2016) and Thepchuaysuk and Thepkanjana (2017) introduced an experiment that uses the video clip stimuli assembled by Bohnemeyer et al. (2001) to elicit linguistic data for an investigation of the semantic patterning of the separation domain. The analysis, based on etically collected data, concludes that the semantic domain of separation events is structured by certain parameters, some of which are also observed, albeit in different combinations and to different degrees, in other languages. For example, Enfield (2007) discusses separation events with single- versus double-bladed implements in Lao.

Given the emphasis on findings based on observable data, Thepchuaysuk’s (2016) and Thepchuaysuk and Thepkanjana’s (2017) research is more etic-driven in exploring the separation field, in contrast to Premsrirat’s (1987) lexicalist approach to emic categorisation. Additionally, Thepchuaysuk’s implementation of Correspondence Analysis and Multidimensional Scaling gives his study a quantitative orientation. However, at least two research gaps currently exist for Thai:

(1) The potential hierarchical structuring of the semantic field of separation has not yet been discussed using data from the etic-driven methodology. Despite analysing separation classification with reference to the etically standardised stimuli (cf. Bohnemeyer et al., 2001), Thepchuaysuk did not examine the extent to which the domain in question would be organised in a taxonomic model. Consequently, it is not clear whether and how his derived criterial features for categorisation are related to one another in a hierarchical fashion.

(2) No discussion has been developed of different lexical descriptions for the same events. In particular, the studies by Premsrirat and Thepchuaysuk are similar in inferring that combinations of criterial semantic attributes rigidly partition the separation domain. Premsrirat even claims that sets of characteristic components are so distinctive that individual cutting verbs should be defined according to them. However, such claims are questionable. If semantic classification could be so rigidly defined, why can some separation events be described by different verbs in most real-life situations?


To respond to these research gaps, the present study explores extensions of caused-separation categories in Thai and their taxonomic depth to generalise semantic parameters at different categorical levels, using the clustering analysis explained below (see §§ 2.1.2.2 and 2.2.6.2). It determines the groups of caused-separation verbs strongly associated with different semantic categories. It also investigates frequent versus rare verbs of caused-separation events and specifies certain basic verbs corresponding to coarse caused-separation categories in Thai in order to better delineate the semantic domain of caused-separation in the language.
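As the abstract notes, the cluster analyses in this study rely on Jaccard’s index and average linkage. The following Python sketch illustrates the general technique only and should not be read as the procedure set out in § 2.2.6.2: it assumes that each clip is represented by the set of verbs used to describe it, the verb sets are invented toy data with placeholder labels, and the stopping threshold is an arbitrary illustrative parameter.

```python
from itertools import combinations

# Invented toy data: clip -> set of verbs used for it (placeholder labels).
CLIP_VERBS = {
    "clip_A": {"V1"},
    "clip_B": {"V1", "V2"},
    "clip_C": {"V3"},
    "clip_D": {"V3", "V4"},
}


def jaccard_distance(a, b):
    """1 minus Jaccard's index, i.e. 1 - |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(items, threshold):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest mean pairwise distance, stopping once that distance exceeds
    `threshold` (an illustrative cut-off, not a value from the thesis)."""
    clusters = [frozenset([name]) for name in items]

    def dist(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard_distance(items[x], items[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        if dist(c1, c2) > threshold:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


for cluster in average_linkage(CLIP_VERBS, threshold=0.6):
    print(sorted(cluster))
```

Lowering the threshold yields more, finer clusters, while raising it yields fewer, coarser ones, which is one way of reading categories at different taxonomic depths from a single set of descriptions.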

This research focusses only on caused-separation, which involves the destruction of objects and irreversible changes of state. Other separation types are therefore excluded.

2.2 Methodology

The study’s purpose is first described to inform the direction of this research. The following three sections introduce and discuss methods of data collection (§ 2.2.4), processing (§ 2.2.5), and analysis (§ 2.2.6).

2.2.1 Purpose of the study

This study aims to demonstrate how analysis of lexico-semantic parallelism in a controlled lexical field can contribute to the notion of a linguistic area. The objective is thus to establish the main features of a particular lexico-semantic domain as it is encoded in two neighbouring but unrelated languages, Thai and Khmer, both in the Mainland Southeast Asian (MSEA) areal grouping. How these unrelated languages encode caused-separation events is the lexico-semantic issue to be investigated. Semantic categorisations in the two languages are compared and contrasted, with parallelism shown to reflect integrative lexico-semantic cohesion characterising the MSEA linguistic area.

2.2.2 Elicitation tool: MPI’s ‘cut’ and ‘break’ clips

Bohnemeyer et al.’s (2001) 61 ‘cut’ and ‘break’ clips (enumerated and described in Appendix A) were produced as one of the experimental tools used by various research teams at the Max Planck Institute for Psycholinguistics (MPI) in a wide-ranging research project called “Event Representation”. This stimulus set was first used to elicit descriptions of actions and state changes, in order to collect empirical data for determining how 28 different languages treat the semantic domain of separation and for identifying cross-linguistic lexical description patterns for the domain. As mentioned in § 2.1.3.3, the stimuli have already been successfully employed in cross-linguistic comparison (Narasimhan, 2007). The findings there provide a matched counterpart to the results of this research, which compares the patterning of caused-separation event descriptions in Thai and Khmer.

Furthermore, the ‘cut’ and ‘break’ clips are culturally appropriate for the intended audience in the present study (see § 2.2.2) for two reasons. First, all the events in the stimuli show basic and realistic phenomena. There are no cultural barriers, only actions commonly encountered in daily life, e.g., cutting cloth or fish. Second, all the questions used to prompt speakers with this stimulus set (e.g., What did the agent do?) are objective. These questions are also non-leading, thus allowing the speakers to give descriptions without external bias.

Despite the name ‘cut’ and ‘break’, the stimulus set is actually composed of various subtypes of separation events, namely caused and spontaneous separation, opening, taking-apart, and peeling. The subtypes in the stimulus set, however, are not quantitatively equal, as shown in Table 2.4 below.

Table 2.4

Subtype events of separation in ‘cut’ and ‘break’ clips (Bohnemeyer et al., 2001).

| Subtype of separation event | Clip No. | F | % |
|---|---|---|---|
| Caused-separation | 1-6, 9, 10, 12-15, 18-21, 23-28, 31, 32, 34-40, 42, 43, 45, 48-51, 53, 54, 56, 57, and 61 | 43 | 70.49 |
| Spontaneous separation | 8, 16, 17, and 46 | 4 | 6.56 |
| Opening | 33, 41, 44, 47, 52, 55, 58, 59, and 60 | 9 | 14.75 |
| Taking apart | 7, 11, and 22 | 3 | 4.92 |
| Peeling | 29 and 30 | 2 | 3.28 |
| Total | | 61 | 100 |
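The frequencies and percentages in Table 2.4 can be re-derived from the clip numbers alone. The following is a minimal sketch of that arithmetic; the helper names `expand` and `summarise` are illustrative and not part of the original stimulus documentation.

```python
def expand(spec):
    """Expand a clip specification such as "1-6, 9, 10" into a list of clip numbers."""
    clips = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            clips.extend(range(int(start), int(end) + 1))
        else:
            clips.append(int(part))
    return clips


# Clip numbers per subtype, copied from Table 2.4.
SUBTYPES = {
    "Caused-separation": "1-6, 9, 10, 12-15, 18-21, 23-28, 31, 32, 34-40, "
                         "42, 43, 45, 48-51, 53, 54, 56, 57, 61",
    "Spontaneous separation": "8, 16, 17, 46",
    "Opening": "33, 41, 44, 47, 52, 55, 58, 59, 60",
    "Taking apart": "7, 11, 22",
    "Peeling": "29, 30",
}


def summarise(subtypes):
    """Return {subtype: (frequency, percentage of all clips)}."""
    counts = {name: len(expand(spec)) for name, spec in subtypes.items()}
    total = sum(counts.values())
    return {name: (n, round(100 * n / total, 2)) for name, n in counts.items()}


summary = summarise(SUBTYPES)
for name, (freq, pct) in summary.items():
    print(f"{name}: F = {freq}, {pct:.2f}%")
```

Expanding the ranges also makes it easy to check that each of the 61 clips is assigned to exactly one subtype.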

In Table 2.4 most of the clips are classified as caused-separation (70.49%), while the rest account for the other types: spontaneous separation (6.56%), opening (14.75%), taking-apart (4.92%), and peeling (3.28%). Although the whole set of stimuli was used in my fieldwork, only caused-separation events¹⁷ are discussed in the present study.

¹⁷ In the creation of the videoclip stimuli, knife/karate hands were designed to be included as instruments, along with hammers and bladed tools such as axes, chisels, knives, machetes, saws, and scissors (Majid et al., 2007a, 2008). The “karate hand” as instrument was not only adopted for the videoclip production but was also supported by cross-linguistic evidence (e.g., Ameka & Essegbey, 2007; Majid et al., 2007a; O’Connor, 2007).

The non-destructive separation category, i.e., opening, taking-apart, and peeling, was later excluded from further analyses because its properties are distinct from those of caused-separation with material destruction, while the spontaneous separation type was also ruled out for two reasons:

(1) Although involving material disintegration, spontaneous separation events are less salient in human perception (Thepchuaysuk & Thepkanjana, 2017), as opposed to caused-separation, since separation events are in general caused by human-related forces, physical forces, or other external forces. Also, it is difficult to think of any separation events triggered by no apparent cause due to their rarity in the natural world. To this extent, spontaneous separation events may essentially differ from the caused counterparts.

(2) The number of video clips displaying spontaneous separation is too small to determine how speakers of Thai and Khmer would depict events of this kind. Moreover, investigation of spontaneous separation was not the primary purpose of the stimulus production (Bohnemeyer et al., 2001). As Majid et al. (2007a, p. 148) explained, spontaneous separation video clips were included merely to “explore questions of argument structure”.

All the video clip files were installed in a specific folder on a personal laptop, both for the pilot study and for the field surveys. The files were played in either Windows Media Player or GOM Media Player.

2.2.3 Pilot study

This section gives an overview of the pilot study, which was conducted in two separate sessions in March 2017, each lasting about 30 minutes on average. The first interview was audio-recorded only, whereas the second was both videotaped and audio-recorded. The location and participants for the pilot study are described below, as well as the objectives and how the pilot interviews revealed some ways of making methodological improvements.

2.2.3.1 Location and participants for the pilot study

The pilot study phase was conducted in Sydney, where I was based during the pre-fieldwork stage.

Two participants (aged 35.5 on average in 2017) took part in the two separate pilot interviews. The first was a Thai female language student (referred to as Informant A, hereafter). She used to live in Bangkok, Thailand, and worked as an investor relations officer for a power company. The other was a Thai male doctoral student at the University of Sydney (Informant B, below). He has been a university lecturer, teaching Thai literature to bachelor’s degree students. While staying in Sydney, both participants actively communicated with other native speakers in Thai. Also, they had arrived in Sydney only about two months before the pilot study (which took place in March 2017), so they had been away from the language community in Thailand for only a short time.

The location of each interview was chosen to suit the participant’s convenience: Informant A’s private rental room, and Informant B’s university campus at Camperdown. Additionally, in accordance with the code of ethics, any other personal information on the pilot participants is deliberately not provided here.

2.2.3.2 Objectives of pilot research

The aim of the pilot study was to provide a general basis for the full-scale field surveys in Thailand and Cambodia, with reference to tool uses, prompting questions, time frames for interviews, and determining the best ways of capturing data.

2.2.3.3 Pilot interviews: Outcomes and improvement

The pilot interviews were directed at the elicitation task specified in Bohnemeyer et al. (2001). The procedure resembled conditions expected in the field sites. Below I discuss the results of the pilot, especially relating to the prompting questions and the elicitation duration, and how to improve field elicitation processes.

The prompting questions suggested by Bohnemeyer et al. (2001) in English and the Thai version are presented in Table 2.5.

Table 2.5

Prompting questions in English and Thai, for ‘cut’ and ‘break’ clips (Bohnemeyer et al., 2001).

| Event types | English | Thai |
|---|---|---|
| Caused-separation; other separation types | What did the agent do? | kʰǎw (or kʰáw) tʰam ʔaraj |
| | | 3SG do what |
| | | ‘What did he/she do?’ |
| Spontaneous separation | What happened to the object? | kɤ̀ːt ʔaraj kʰɯ̂n |
| | | happen what go.up |
| | | ‘What happened?’ |

As Table 2.5 shows, the Thai questions are not literal translations of the English ones but adaptations into Thai. The Thai versions are likely to sound more natural when used to elicit data from speakers of the language. Additionally, I found that the questions sometimes led to unanticipated or incomplete answers. For example, some answers were given in an elliptical construction, as in (2.1).

(2.1) Researcher: kʰǎw (or kʰáw) tʰam ʔaraj

3SG do what

‘What did he/she do?’

Informant B: [kʰáw] hàn plaː

[3SG] slice fish

‘[He/she] sliced the fish.’

The bracketed word here shows that the pronoun kʰáw ‘he/she’ was omitted, since it is contextually given or derivable in Thai thanks to its topic status. I therefore needed to add some short questions (e.g., Who? for Who did that?) to prompt Informant B to give a complete-sentence answer, thereby deriving the full verbal argument structure. During the elicitation, the participants gradually learnt to give only complete sentences. This means that the prompting questions eventually sufficed to elicit complete-sentence answers; other supplementary questions then became unnecessary.

As for the duration of the interviews, the participants were first told that the individual interview would take 25 to 30 minutes, including all replays, all the prompting questions, and small discussions, and that it would be a one-off session. In practice, however, short questions were added to elicit alternative answers, and short breaks were requested. As a result, the interviews took more time than expected.

Therefore, the duration of the interviews was extended to 30 minutes or a little more. Additionally, although the interview was a one-off session, each informant needed to be informed that future contact might be required for consultation about the data obtained. Accordingly, all informants were asked to give the necessary contact information.

To summarise this section, the pilot study pointed to some improvements. Regarding the elicitation questions, the pilot outcomes indicate that some short supplementary questions may need to be asked. Also, the interview duration needs adjusting to allow for certain conditions (i.e., extra time for unstructured discussions and breaks).

2.2.4 Data collection

2.2.4.1 Planning and preparation for fieldwork

Collecting linguistic data from dictionaries or written materials may be misleading and may misrepresent actual usage.

For this study, fieldwork collection of empirical data needs to rely on an objective non-linguistic tool. This is to prevent excessive reliance on local languages (in this case, Thai and Khmer) as mediums of communication. Use of these languages as intermediaries could easily trigger unintentional prompts.

Moreover, a non-linguistic tool can be used as a standard grid for cross-linguistic comparisons. In other words, if the same set of experimental stimuli is delivered to speakers of Thai and Khmer, the way speakers of one language describe the caused-separation events displayed by the stimuli can be evaluated by comparison with speakers of the other language performing the same experiment.

Accordingly, fieldwork data collection in relevant language communities in Thailand and Cambodia is a main feature of this research project. Field surveys require careful preparation to best manage a systematic data collection process for gathering all necessary data. The preparation for fieldwork was thus planned in stages: a literature review, a variety of administrative tasks, and a fieldwork preparation session. The timeline below illustrates the various steps taken from October 2016 to early June 2017 based on the chronological sequence of pre-fieldwork tasks.

Figure 2.2. Pre-fieldwork tasks taken from October 2016 to June 2017.

In the timeline, apart from the literature review, pre-field activities break down into three main tasks. These are tool preparation, pilot experiment, and ethics application. During the tool preparation stage, I found that the set of video clip stimuli, the ‘cut’ and ‘break’ clips created by Bohnemeyer et al. (2001), could be effectively used in this study. These materials can help create situations that induce speakers of Thai and Khmer to describe events of interest (i.e., caused-separation). Next, to test the specified set of stimulus materials, I conducted a pilot study on a small sample of participants to evaluate the feasibility of the tool and find ways to improve the performance of a full-scale fieldwork project (see more detail in § 2.2.3). Finally, and in parallel, I wrote and produced all supporting documentation required for the ethics application and then submitted it in early 2017.

Next, I describe the study sites (§ 2.2.4.2), the recruitment of language informants (§ 2.2.4.3), and the procedures for capturing language data (§ 2.2.4.4). I conclude this data collection section with the ethical aspect (§ 2.2.4.5).

2.2.4.2 Data collection sites

The data collection sites are in two countries where Thai and Khmer are native: Thailand and Cambodia, respectively.

For Thai, the field investigations were conducted at Khlong Luang, Pathum Thani, Thailand. This location is about 40 kilometres north of the capital, Bangkok. People in this area predominantly speak Thai in daily life, especially outside the home, at school, or with passers-by. Thammasat University (Rangsit campus), where all Thai informants are affiliated or living nearby, is also located at this site. The University’s Department of Thai and Eastern Language and Culture provided a venue for the interviews.

For Khmer, I conducted the fieldwork in Battambang province in Cambodia. Battambang is located in the northwest of the country and is home to the University of Battambang; all my Khmer informants are students of this University. I selected this Cambodian province as the study site since the Khmer variant spoken there is “more representative of the of the  majority of the population” (Tiwary & Kumar, 2009, p. 132).

2.2.4.3 Participants and recruitment

All Thai and Khmer speakers who participated in the present research were native speakers of their language and were living in the relevant speech community. They spoke Thai or Khmer as their first language and used it in daily life. Each language is represented by seven participants, for a total of 14 people. The participants ranged in age from 19 to 52 (M = 30.64; SD = 11.15), and 50% were female. All had more than a high-school education.

To recruit participants for this research, I sent invitation letters. For Thai fieldwork, Asst. Prof. Sarawanee Sankaburanurak, Thammasat University Lecturer in the Department of Thai and Eastern Language and Culture, assisted in the recruitment of faculty members of the Department. For Khmer fieldwork, letters were sent to Khmer-speaking students at the University of Battambang. Assistance was provided by Phan Sotheara, then a fourth-year university student and my language assistant during the field trip in Battambang.

In the end, 10 informants of each language took part in the elicitation task, but only the linguistic data given by seven randomly selected informants per language were retained for further analysis.

2.2.4.4 Data capturing

The following is an account of how linguistic data were obtained using the ‘cut’ and ‘break’ stimuli of Bohnemeyer et al. (2001) at the two field sites. The process of conducting individual elicitation interviews is introduced, as well as how audio-visual recordings were created and how fieldnotes were taken.

a. Elicitation task procedures

Elicitation interviews conducted on site proceeded in three main stages:

Stage 1: The interview appointments were arranged in advance. Since the researcher had planned to take detailed notes immediately after each interview, only a few participants were invited for interview per day.

Stage 2: The pre-interview stage was held on the same day as each interview. It began with setting up the tools (i.e., audio-visual recorders) before an interview. When each participant arrived at his or her private interview, I started by explaining the potential benefits and rights, and the potential difficulties that individual participants might encounter during the interview (see also § 2.2.4.5). Other questions raised by some informants were also answered, for example about the average duration of an interview or about withdrawal from the study.

Stage 3: After the completion of the ethics forms, an elicitation interview began with an explanation of what interview procedures would be performed in order to obtain accurate descriptions for the stimulus clips (Bohnemeyer et al., 2001), from each informant, as illustrated in the figure below.

Figure 2.3. Interview procedures for eliciting data using the ‘cut’ and ‘break’ clips (Bohnemeyer et al., 2001).

The figure shows that an elicitation interview starts with the stimuli. In other words, I first displayed a stimulus clip to the participant before engaging him or her with prompting questions (for example, “What did the agent do?”). After the participant provided his or her description(s) based on what he or she saw in the displayed video clip scene, I might further ask some supplementary informal questions; the participant could at this point express his or her agreement or disagreement with the initial answer(s). In the latter case, he or she could initiate a revision. Subsequently, as suggested by Bohnemeyer et al. (2001), I turned to questions for eliciting alternative descriptions, so that the participant could give additional possible descriptions. Note that the participant could request video clip replays at any time within a single cycle of eliciting descriptions for a video clip. After a cycle was completed, the participant and I proceeded to the next one.

Additionally, the interaction and communication in an interview were only in Thai and Khmer respectively; all the questions used were versioned in these languages.

Furthermore, in strict accordance with Bohnemeyer et al. (2001), the above-mentioned prompting questions were well structured to draw out linguistic data. In contrast, the supplementary questions for requesting revisions and the questions used to elicit alternative expressions were loosely organised and depended on the extent to which an informant answered. Consequently, the elicitation task of this study can be defined as semi-structured since the obtained information was accessed through both structured questions and conversation-like questions.

Finally, after an interview, each informant was provided with a financial reward and snack refreshments.

b. Audio-visual recordings

The present study recorded linguistic data both audibly and visually. In what follows, I briefly explain how audio-visual recordings were made in each of the interviews.

Before starting to record, I once again informed each informant that their interview was going to be recorded. Note that some informants might feel worried about being recorded, but giving them time to prepare themselves, for example with some small talk, was usually helpful. Then, after obtaining their permission, I started recording using an audio recorder (i.e., Remax® Voice Recorder RP1) and a video camera (i.e., Sony Handycam® PJ410). During an interview, the audio recorder and the video camera were almost always turned on to avoid missing important data. Also, after an informal and friendly talk, I carried out a small check to make sure that the frame of the informant and the volume of the two recorders were set correctly. The LED screens of the recorders were always flipped out towards me so that optimal recording quality could be easily monitored. Recording quality, however, depended on recording conditions and environments.

When the participant requested a short break and refreshments, the two recorders were temporarily turned off. Sometimes, even when the participant did not ask for a break, I initiated a short one when I sensed a tense atmosphere, since either anxiety or excitement could negatively affect the linguistic data being obtained. After a pause or break, I rechecked whether the recorders were turned on and ready to record.

c. Field note-taking

In addition to the audio-visual records, I took notes during the data collection phase. These notes are in two categories: one was carefully recorded during the individual elicitation interviews, and the other was taken outside the interviews.

In particular, as I took notes during individual elicitation interviews, they usually include vocabulary units (i.e., verbs) that each informant chose to describe different video clips, some critical comments on the choices of words and structures, and various issues, and ideas relating to the data elicited. These notes invaluably assisted in the analysis when I returned to review the raw data. Below is the formal outline used for note-taking during individual interviews.

Table 2.6

Formal outline for note-taking during an interview.

Name:

Date: Age (if reported):

Stimulus set name: ‘cut’ and ‘break’ clips Location: Pathum Thani, Thailand

No. Description

S1 …

Other notes were occasionally taken outside elicitation interviews; some of these concern technical difficulties and practical limitations involved in the interview appointments and the pre- and post-interview stages. Some notes relate to interesting encounters while I was at the field sites. I once heard, for example, a male student saying that he could easily cut (or, more accurately, /tàt/ in Thai) a peeled hard-boiled egg using a wooden skewer, by pressing the long side of the stick onto the egg in order to make it separate. The skewer is evidently not one of the sharp instruments commonly used for severance, but a stick with a pointed tip, which is, at least in Thai culture, customarily used to pierce a series of cubes of meat or meatballs. Unconventional uses of this kind were worth jotting down.

To conclude, two types of fieldnotes were taken, during the elicitation interviews and on occasion in the field, to reflect what I discovered and discussed with the informants and other native speakers. These notes also show my initial and emerging understanding of how speakers of Thai and Khmer probably describe caused-separation happenings in the natural world.

2.2.4.5 Ethical aspect

In this section, some basic ethical principles are briefly described.

a. Critical Information for Informants

All informants are encouraged to learn some critical information and are then asked to fill out a consent form prior to their participation in this research. Also, the informants can access the Participant Information Statement (PIS), which presents the research objectives, the methods of capturing data, and the potential outside participant(s), e.g., the researcher’s supervisors or local assistants, so that they can decide whether to begin, refuse, or discontinue their participation.

Furthermore, the PIS clearly states that this research is the basis for the degree of Doctor of Philosophy (Linguistics) at the University of Sydney. Also, conducting this research is not expected to produce financial benefits for the researcher.

b. Payment and Working Conditions

The PIS also specifies financial benefits, non-financial benefits (provided during an interview), and working conditions. Specifically, it confirms each informant’s right to withdraw at any time by informing the researcher, even if they initially decided to participate in this research and later changed their mind.

c. Data Management and Result Dissemination

The PIS includes matters of data management and result dissemination. By providing the consent, each informant agrees that I can collect their given information for research purposes.

In particular, after this research is completed, all identifying information and contact data are to be destroyed so that other people will not know whose information it is: they will not know who the participating informants were, and they will be unable to link any of them to any of the information given. Also, findings directly or partially related to this research will be published in a research thesis, publications, or conference reports, but, again, individual informants will not be identifiable in these publications.

2.2.5 Data processing

2.2.5.1 Data transcription

The audio-visual recordings total about 210 minutes (from seven randomly selected participants); they consist of MP3 files made by the audio recorder and WAV files from the video camera. However, only the linguistic descriptions recorded in the WAV format and their directly converted MP3 files¹⁸ (not those recorded by the audio recorder) were subsequently transcribed and coded, using the multimodal digital annotation tool ELAN¹⁹ and Excel 2016, respectively. Note that I first used ELAN to quickly transcribe the language data before exporting them to Excel 2016 for further coding.

¹⁸ Note that I converted all WAV files to corresponding MP3 files and imported the two types of data to ELAN. This was more efficient than importing the WAV files from the video camera and the MP3 data recorded by the audio recorder, since no time synchronisation of the two-sourced data was required.
¹⁹ ELAN. (n.d.). The language archive, Max Planck Institute for Psycholinguistics [Computer programme]. Retrieved June 1, 2017, from https://tla.mpi.nl/tools/tla-tools/elan/

2.2.5.2 Data coding for references

In the data coding (in Excel) for references in the analysis, all linguistic descriptions from ELAN were coded for (1) the stimulus set name, (2) the video clip number, and (3) the participant’s initials. For example, the code CB-S11-VP is read as the description of Scene 11 in the ‘cut’ and ‘break’ clips, given by VP, where V stands for the informant’s family name and P for the first name.
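The coding scheme can be illustrated with a small sketch that builds and decodes such reference codes. The exact pattern is an assumption inferred from the single example CB-S11-VP (set name, then `S` plus the scene number, then two initials); the names `Reference`, `parse_reference` and `make_reference` are hypothetical and do not come from the study's own materials.

```python
import re
from typing import NamedTuple


class Reference(NamedTuple):
    stimulus_set: str  # e.g. "CB" for the 'cut' and 'break' clips
    scene: int         # video clip (scene) number
    initials: str      # informant's initials, family name first


# Pattern assumed from the example CB-S11-VP; other layouts are not handled.
CODE_PATTERN = re.compile(r"^([A-Z]+)-S(\d+)-([A-Z]{2})$")


def parse_reference(code: str) -> Reference:
    """Split a reference code into set name, scene number, and initials."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a valid reference code: {code!r}")
    stimulus_set, scene, initials = match.groups()
    return Reference(stimulus_set, int(scene), initials)


def make_reference(stimulus_set: str, scene: int,
                   family_name: str, first_name: str) -> str:
    """Build a reference code; the family-name initial comes first."""
    initials = family_name[0].upper() + first_name[0].upper()
    return f"{stimulus_set}-S{scene}-{initials}"


ref = parse_reference("CB-S11-VP")
print(ref.stimulus_set, ref.scene, ref.initials)
```

Keeping the scene number as an integer makes it straightforward to sort or filter descriptions by clip later in the analysis.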

2.2.6 Methods of data analysis

This section discusses the quantitative analytical methods in the present research. In § 2.2.6.1, I outline the steps to manage and analyse the data obtained from the elicitation task, and how to discuss the lexico-semantic convergence based on the semantic categorisations in the caused-separation domain across Thai and Khmer. In § 2.2.6.2, I describe the employed statistical approaches, i.e., Simpson’s diversity index and Hierarchical Cluster Analysis (HCA).

2.2.6.1 Summary of research design

In the following, I summarise the important steps by which the data collected for this study were arranged in tables and matrices and subjected to statistical analysis using the Hierarchical Clustering approach and the Gini-Simpson’s Index to identify lexico-semantic patterning of caused-separation in Thai and Khmer. I also briefly outline how such characterisations in the languages were explored for potential semantic convergent traits in the areal context.

a. Data handling and analysis

1. I identified caused-separation verbs in all descriptions for the 43 target events (cf. Bohnemeyer et al., 2001) as units of analysis (UoA). Following Majid et al. (2004), I took the target events to be “the change in an object from a state of integrity to a state of separation or material destruction” (p. 886). I then concentrated only on constituents of descriptions which may lexically relate to these targets²⁰. For example, in Thai, the event of “a man cutting a carrot” may be encoded as /kʰǎw hàn kʰɛːrɔ̀t/ (3SG cut carrot) ‘He sliced the carrot’. As seen, the caused change of state is expressed solely by the main verb /hàn/; accordingly, this lexical descriptor is considered a UoA for the present analysis. However, identification of UoA may not always be straightforward, as Thai and Khmer allow multiverb constructions. In particular, speakers of the languages may express other subevents, such as results of separation, in serialisation with the state change. Verbs used to describe such other subevents were not included in my analysis, in strict accordance with the method of UoA identification mentioned above and in order to simplify the analysis. For instance, in Khmer, a multiverb description for the event of cutting a branch could be /kat pdac mɛɛkcʰəə/ (cut make.separation branch) ‘cut the branch off’; only the verb /kat/ is used for analysis here, since it is considered to primarily encode the state-change subevent. Having said that, /pdac/ may be employed as a UoA on the occasions when it occurs alone to represent the caused action, as in /pdac mɛɛkcʰəə/ (make.separation branch) ‘separated the branch’.

²⁰ Caused-separation verbs such as those of change of state in Thai and Khmer would naturally implicate a result-state rather than encode or entail it; the resulting implicature varies in degrees of strength. For instance, with verbs like /tʰúp/ ‘smash’ or /tàt/ ‘cut’ in Thai and /kat/ ‘cut’ in Khmer, it appears more likely that speakers think separation will occur or has occurred, unlike with others such as Thai /dɯŋ/ ‘tug’ or Khmer /tieɲ/ ‘tug’, which are less likely to point to a result-state. Thus, the difference between encoding and implicating caused-separation, in the case of Thai and Khmer verbs for example, is arguably gradient rather than binary. Yet, in line with much of the MPI work (Enfield, 2007; Majid et al., 2007a; Narasimhan, 2007, inter alia) and in order to keep the Thai and Khmer comparison manageable, the issue cannot be fully pursued in this study. Relatedly, the assignment of a feature like SEP in Tables 3.14a-c and 4.12a-c is an arbitrary decision of the present study’s researcher.

2. Different verbs (i.e., types) used in naming the target-event stimuli and their frequency of occurrence (i.e., tokens) were tallied across informants in Thai and Khmer. A table was generated to illustrate the type-token relationship for each language. From these tables, I can present lexical resources in the caused-separation domain and specify, in a way, high-frequency caused-separation verbs and possible long-tailed verbs of the kind in Thai and Khmer.

3. For both Thai and Khmer, I also tabulated verb types and their tokens across the informants against the 43 video clips. Specifically, a clip-by-verb matrix was created for each language: in each matrix, the clips are shown in rows and the verb types in columns. The number in each matrix cell represents how many times the relevant verb was employed by all the informants to describe the relevant video clip.

4. Using the clip-by-verb matrices for Thai and Khmer, I conducted Jaccard distance-based Hierarchical Clustering using SPSS software to produce dendrograms. Each leaf of a dendrogram represents a video clip marked by a short description tag; the leaves are clustered together in branches running from the left toward the apex of the dendrogram at the right end.

5. Juxtaposition of the dendrogram and the clip-by-verb matrix in each of the languages was used to determine, from the top of the dendrogram, whether and how a single verb would span different clusters of the video clips (cf. Andics, 2012). By this process, I specified the most coarse-grained partitions as revealed by verb extensions, thereby uncovering minimally presupposed categories in the caused-separation domain for Thai and Khmer. In each category across the languages, the same process can then be repeated to obtain the remaining subcategories. Note that the rows of the clip-by-verb matrices were rearranged to match the dendrogram leaves in each language to facilitate the process of category determination. Also, I went back to consult some of the Thai and Khmer informants and other native speakers when encountering ill-structured problems regarding this patterning process. I have mentioned these problems where relevant (in Chapters 3 and 4).

6. With the dendrograms juxtaposed to the clip-by-verb tables, I was able to determine how the caused-separation domain was organised in Thai and Khmer with reference to three different aspects of semantic categorisation, as Evans (2010) suggests: semantic granularity, placement of category boundaries, and semantic organisation (i.e., grouping).

b. Study of lexico-semantic convergence in Thai and Khmer

1. Based on the insights into the semantic structuring in the caused-separation domain in Thai and Khmer, I identified similarities and differences in how the two languages categorise the domain under discussion. Specifically, I started by comparing the sets of lexical resources used by the informants of Thai and Khmer in naming the selected stimuli. Then, the numbers of the derived caused-separation categories and subcategories in Thai and Khmer were compared to analyse how similarly and differently the caused-separation domain is partitioned across the languages, and whether the two languages’ category boundaries are drawn comparably, especially with respect to potential transitional zones between the categories. Last, I considered how the categories and subcategories at different categorical levels are semantically organised across the languages, as reflected by the criterial semantic features involved.

2. The similarities in the caused-separation patterning across Thai and Khmer were also examined to consider whether they might be aligned with cross-linguistic convergence in the semantic structuring of caused-separation events (cf. Majid et al., 2004, 2007a, 2008), or driven by language-specific distinctions shared between these two languages.

3. I specifically determined the language-specific or local parallels between Thai

and Khmer in the caused-separation categorisation to argue for potential

semantic convergence of the two languages. To this extent, the proposed

semantic parallelism between Thai and Khmer can expand the understanding

of Khmero-Thai linguistic convergence, which has already been documented at

multiple levels, into the (lexical) semantic level.

4. Finally, I measured such language-specific lexico-semantic parallelism across

Thai and Khmer as two MSEA languages against the caused-separation

semantic partitioning in Hindi and Tamil, two non-MSEA languages (cf.

Narasimhan, 2007). By this means, areal semantic features are explored

through the case of Thai and Khmer’s caused-separation semantic

categorisations, accordingly supporting the notion of the MSEA linguistic

area.

2.2.6.2 Statistical approaches

This section discusses how the statistical analyses of the present study were carried out. In the first part, I explain Simpson’s diversity index used to calculate the amount of lexical (i.e., verb) diversity, which accordingly suggests degrees of consistency in naming different video clip stimuli, and semantic categories and subcategories in this research. The second part focusses on the Hierarchical Clustering approach to producing dendrograms, which help illustrate Thai and Khmer’s semantic categorisations in the caused-separation domain.


a. Simpson’s diversity index for lexical diversity

The number of different verbs used to describe an event, whether large or small, does not by itself indicate how consistently speakers describe that event. The mock distributions of dummy verbs across events in Language A below illustrate this point.

Table 2.7

Dummy distributions of verb occurrences in Language A: the scenes in the leftmost column are described by the mock verbs in the first row, quantified by numbers in the cells.

Event Verb 1 Verb 2 Verb 3 Verb 4 Verb 5 Verb 6 Verb 7

Event X 1 1 1 1 1 1 1

Event Y 4 0 0 1 1 1 0

Referring to Table 2.7, Event X is described with seven different verbs, whereas Event Y is described with only four. The former therefore seems richer in verb types, and one might say that Event X was described more diversely than Event Y, based on the numbers of different verbs used. Suppose, however, that there was another event, Event K, described with the same number of different verbs as Event Y (four verbs). Should we conclude that Events Y and K show the same degree of lexical diversity? We cannot say so unless the relative abundance of the individual verb types in the two events has been considered. Say the verb distribution of Event K is 1:0:0:1:1:1:0, whereas that of Event Y is 4:0:0:1:1:1:0, as seen in Table 2.7. The verb distribution of Event K is more even, and hence more diverse, since no verb type dominates. In contrast, Event Y is less even, and hence less diverse, because the first verb predominates for the event, marginalising the other verbs, which occur only rarely.

Now, we can see how richness (i.e., numbers of types) and evenness (distributions of types)21 are the two fundamental factors in measuring diversity; Simpson’s diversity index thus comes into play.

Simpson’s diversity index (Simpson, 1949) is commonly used in ecology and paleoethnobotany to measure diversity in terms of species evenness and richness (cf. Marston, 2014). It has three related variants: Simpson’s index (D) itself, the Gini-Simpson’s index (1 – D), and the inverse Simpson’s index (1/D, or D⁻¹).

Though all three represent the same notion of diversity, their values are read differently. For example, a value of 0.82 for the Gini-Simpson’s is not equivalent to a value of 0.82 for Simpson’s index. In the present study, the Gini-Simpson’s was adopted, since the bigger the value, the higher the diversity. In other words, the Gini-Simpson’s is more straightforward to read, because a high value intuitively entails wide divergence22. However, values for Simpson’s index (D) need to be calculated first and then converted to those for the Gini-Simpson’s, because the latter depends on the former: 1 – D.

21 Another, more straightforward example of richness and evenness in measuring diversity: suppose there were two fictitious nations, each containing three different ethnic groups. On richness alone, the ethnic diversity of the two countries would be at the same level. Yet, if we add how many people there are within each group, say 50:50:50 in the first nation and 140:5:5 in the second, we can tell that, despite the same total population (N = 150), the first is more diverse than the second in terms of the relative abundance of the individual groups, considering the basic probability (P) of encountering people from the different groups (i.e., approximately 0.33:0.33:0.33 for the first nation as against 0.93:0.03:0.03 for the second). Specifically, we have an even chance of meeting each group in the first nation, but uneven chances in the second. This is how evenness also enters into measuring diversity.

22 Simpson’s index gives values with the reverse reading to the Gini-Simpson’s. It ranges from 0 for ideal (infinite) diversity to 1 for no diversity (i.e., counterintuitively, the higher the value, the lower the diversity). As for the inverse Simpson’s, it has the same value reading as the Gini-Simpson’s.

In this study, diversity is measured with the following equation, as given by Kelly (2017):

$$\lambda = \frac{\sum_{i=1}^{R} n_i (n_i - 1)}{N(N - 1)} \quad \left(\text{or, simply put, } D' = \frac{\sum n(n - 1)}{N(N - 1)}\right)$$

where $n_i$ is the total number of tokens belonging to the $i$th type, $R$ is the number of types, and $N$ is the total number of tokens in the dataset. Then, D′ (Simpson’s D) is converted to the Gini-Simpson’s (1 – D, or D″) by subtracting it from 1, using the formula (Kelly, 2017):

$$D'' = 1 - \left[\frac{\sum n(n - 1)}{N(N - 1)}\right]$$

With this version of Gini-Simpson’s index, 0 designates complete uniformity, and 1 represents complete diversity, or the higher the value, the higher the diversity. Now, we can calculate how much diversity is characteristic of verb types for each of the events in Table 2.7, as shown below in Table 2.8.

The inverse Simpson’s index was not selected for this study because its value is undefined in ordinary arithmetic (i.e., 1/0) for an entirely evenly distributed case in which each verb type occurs only once (D = 0).

Table 2.8

Gini-Simpson’s D″ used to numerically specify how diverse Events X and Y are regarding verb types.

Event   Computation   D″

X   $1 - \left[\frac{1(1-1) + 1(1-1) + 1(1-1) + 1(1-1) + 1(1-1) + 1(1-1) + 1(1-1)}{7(7-1)}\right]$   1

Y   $1 - \left[\frac{4(4-1) + 1(1-1) + 1(1-1) + 1(1-1)}{7(7-1)}\right]$   0.71

According to Table 2.8, Event X is higher in diversity than Event Y (i.e., D″X > D″Y).
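The computation in Table 2.8 can be reproduced with a few lines of code. The following is a minimal sketch rather than the procedure actually used in the study; the function name `gini_simpson` and the list representation of verb counts are assumptions introduced here for illustration.

```python
def gini_simpson(counts):
    """Gini-Simpson's index D'' = 1 - sum(n(n-1)) / (N(N-1)).

    `counts` holds the number of tokens per verb type for one event
    (hypothetical representation; zero counts are simply ignored).
    """
    counts = [n for n in counts if n > 0]
    total = sum(counts)
    if total < 2:
        raise ValueError("at least two tokens are needed")
    simpson_d = sum(n * (n - 1) for n in counts) / (total * (total - 1))
    return 1 - simpson_d


# Verb distributions of Events X and Y from Table 2.7
event_x = [1, 1, 1, 1, 1, 1, 1]
event_y = [4, 0, 0, 1, 1, 1, 0]

print(round(gini_simpson(event_x), 2))  # 1.0
print(round(gini_simpson(event_y), 2))  # 0.71
```

Because zero counts contribute nothing to the sum, the result depends only on the verbs that were actually used and on the total number of tokens.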

Besides, if we considered the two events as a set of events, we could calculate an average diversity value for the set with the ordinary arithmetic mean (AM) equation:

$$AM = \frac{1}{n}\sum_{i=1}^{n} D''_i = \frac{D''_1 + D''_2 + \cdots + D''_n}{n}$$

By the formula, if n diversity values are specified, each represented by D″i where i = 1, 2, …, n, the arithmetic mean is the sum of the diversity values divided by n (here, the number of events in the set). Therefore, according to Table 2.8, the diversity average for the set is:

$$\frac{1 + 0.71}{2} \approx 0.86$$


In this study, the Gini-Simpson’s helps determine consistency of the speakers’ descriptions for the caused-separation domain within and across Thai and Khmer.

Correspondingly, this notion of consistency can provide information on how Thai and Khmer characterise events of this kind, especially which events they regard as central or marginal to particular semantic categories in the domain.

b. Hierarchical Cluster Analysis

With the aim of exploring semantic categorisation of the caused-separation domain in Thai and Khmer, this research used cluster analysis based on similarity matrices. These capture how different video clip stimuli were denoted with the same verbs, or how different verbs were employed to label the same stimuli. On this basis, tree diagrams (henceforth, dendrograms) were constructed to display this similarity. Discussion of the measure of (dis)similarity and the method of clustering follows.

How do we measure such (dis)similarity degrees/values? There exist several

(statistical) methods to obtain such sensible degrees. Commonly, statisticians and researchers use a distance measure called Euclidean distance for describing dissimilarity between pairs of types, objects, species, or things. As Aldenderfer &

Blashfield (2006) summarise distance measures:

… most of the more popular coefficients demonstrate similarity by high

values within their ranges, but distance measures are scaled in the reverse.

Two cases are identical if each one is described by variables with the same

magnitudes. In this case, the distance between them is zero. Distance

measures normally have no upper bounds, and are scale-dependent. Among


the more popular representations of distance is Euclidean distance … (pp. 25-

26)

Thus, Euclidean distance indices (ED) are dissimilarity values: the higher the ED, the higher the dissimilarity. If a pair of types or things showed complete similarity, its ED would be 0; were a pair to show no similarity at all, its ED could grow arbitrarily large, since such measures have, as stated above, “no upper bounds” (Aldenderfer & Blashfield, 2006, p. 26).

However, there is one major issue in measuring (dis)similarity using Euclidean distance: it includes absences of variables in the calculation of similarity, so that joint absences (negative matches) count towards similarity. Negative matches are acceptable in some calculations; for example, people (as the cases) who have never committed any crimes (as the variables) could be considered mutually similar in the sense of innocence. In other cases, however, joint absences are not informative. In the present study, it would not make sense to consider two different video clip stimuli similar to one another merely because neither was ever described with a given verb by any informant. Additionally, the fact that two clips were not labelled with a particular verb does not help predict whether they could be named with the same other verbs on other occasions.
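The effect of joint absences can be made concrete with a small sketch. The two clips and six verbs below are hypothetical, and rescaling the squared distance by the number of verbs is an assumption made here only to obtain a similarity between 0 and 1. For binary data, the squared Euclidean distance equals the number of mismatches, so any verb that neither clip received leaves the distance untouched and makes the pair look more alike.

```python
# Two hypothetical clips coded over six hypothetical verbs (1 = verb used, 0 = not used).
clip_1 = [1, 0, 0, 0, 0, 0]
clip_2 = [0, 1, 0, 0, 0, 0]

# Squared Euclidean distance: for binary data this is the number of mismatches.
squared_ed = sum((x - y) ** 2 for x, y in zip(clip_1, clip_2))

# Positive matches (verb used for both clips) and negative matches (used for neither).
shared = sum(1 for x, y in zip(clip_1, clip_2) if x == y == 1)
joint_absent = sum(1 for x, y in zip(clip_1, clip_2) if x == y == 0)

# Assumption for illustration: rescale by the number of verbs to get a similarity in [0, 1].
similarity = 1 - squared_ed / len(clip_1)

print(squared_ed, shared, joint_absent, round(similarity, 2))  # 2 0 4 0.67
```

The two clips share no verb at all, yet the rescaled measure rates them as fairly similar, and the whole of that similarity comes from the four verbs used for neither clip.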

Moreover, Euclidean distance usually works on continuous data, like the reading on a scale. However, in this research, I primarily target the abstract extension of verbs representing caused-separation events: which verbs are used to describe which events. Consequently, only two-level values are required here: participation and non-participation (i.e., of different verbs for different video clips). These values are categorical (attribute) data reflecting characteristics of caused-separation verbs, which can be used for further (dis)similarity analysis.

Based on the above two issues, Euclidean distance is considered inappropriate for this study. A different measure is required, one that indicates similarity between cases defined by binary attributes (for example, 1 for participation/presence and 0 for non-participation/absence) while avoiding negative matches in estimates of similarity.

Jaccard’s coefficient satisfies this requirement. This coefficient is one of the similarity coefficient methods which measure the association or agreement between pairs of cases by a series of categorical variables. Take the binary-coded data in Table 2.9 as an example.

Table 2.9

Binary distributions of two fish species discovered in four dummy rivers.

Rivers Asian river catfish Great snakehead

A 0 0

B 0 0

C 1 1

D 1 1

As seen in Table 2.9, each variable can only take on either of the two possible values:

1 refers to the presence of a variable and 0 to its absence (Aldenderfer & Blashfield,

2006). Therefore, if we look for Jaccard’s coefficient of a pair of cases in Table 2.9, it is most convenient to analyse it from a 2 × 2 association. Take the pair of C and D,


for example. The 2 × 2 association of common fish species between these two rivers is:

        D = 1    D = 0
C = 1   a = 2    b = 0
C = 0   c = 0    d = 0

In the association, these two rivers have two shared species: Asian river catfish and great snakehead (shown in Cell a). The two rivers do not include any species found in one river but absent from the other (shown in Cells b and c). Also, no species was absent from both rivers (shown in Cell d). Jaccard’s similarity index is calculated by dividing the number of fish species (= variables) shared by the two rivers C and D (= cases) by the sum of the number of shared species (Cell a) and the number of species present in only one of the two rivers (Cells b and c). Thus,

$$J = \frac{2}{2 + 0 + 0} = 1.000$$

Unlike Euclidean distance, Jaccard’s coefficient provides an index of similarity; therefore, the higher the value, the higher the similarity, and its upper limit is 1.000, as complete similarity requires that all relevant variables be shared by the cases compared. Thus, Jaccard’s coefficient can be defined as the equation below:

$$J = \frac{a}{a + b + c}$$

Jaccard’s coefficient is naturally prompted by the question of what proportion of the variables present in either of two cases is shared by both. Applying the formula to the five other logically possible pairs (AB, AC, AD, BC, and BD) yields the entire 4 × 4 similarity matrix:

Table 2.10

Jaccard’s similarity indices of four dummy rivers, measured by two fish species.

A B C D

A - 0.000 0.000 0.000

B - 0.000 0.000

C - 1.000

D -

In the six cells of the 4 × 4 similarity matrix, there is only one non-zero value of J (= 1.000), which reflects the fact that Rivers C and D have the same variables (= fish species) and are thus completely similar.
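The values in Table 2.10 can be recomputed from the binary data in Table 2.9 with the sketch below. It is illustrative only: the dictionary `rivers` and the function `jaccard` are names introduced here, and one convention has to be assumed. For Rivers A and B no species is present in either river, so a + b + c = 0 and the ratio is strictly undefined; the sketch returns 0 in that case, which agrees with the value reported in Table 2.10.

```python
from itertools import combinations

# Table 2.9: presence (1) or absence (0) of the two fish species in each river
rivers = {"A": (0, 0), "B": (0, 0), "C": (1, 1), "D": (1, 1)}


def jaccard(u, v):
    """Jaccard's coefficient J = a / (a + b + c) for two binary vectors."""
    a = sum(1 for x, y in zip(u, v) if x == 1 and y == 1)  # present in both cases
    b_plus_c = sum(1 for x, y in zip(u, v) if x != y)      # present in only one case
    # Assumption: when a + b + c = 0 the coefficient is set to 0 rather than left undefined.
    return a / (a + b_plus_c) if (a + b_plus_c) else 0.0


similarity = {
    pair: jaccard(rivers[pair[0]], rivers[pair[1]])
    for pair in combinations(rivers, 2)
}

for (first, second), value in similarity.items():
    print(first, second, f"{value:.3f}")
```

Joint absences (Cell d) never enter the calculation, which is the property that motivates the choice of this coefficient.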

Having defined the meaning of similarity and considering the similarity measure used in this study, below, I discuss Cluster Analysis and how to illustrate clustering in a dendrogram.


Cluster Analysis is generally regarded as a tool for looking into possible case groups (i.e., clusters) in specified data sets and characteristics of these case clusters.

The analysis essentially involves how individual cases can be linked, on the basis of their closeness, to form clusters, and how the clusters closest to each other are in turn connected into higher-level clusters. This clustering process continues until all the cases are merged in one cluster. Therefore, the question concerned here is what method should be used for linking clusters in this study.

In statistics, there are many different linkage methods (Spencer, 2013).

Average linkage between groups (also known as the unweighted pair-group method with arithmetic mean, UPGMA) is compatible with cluster analysis using Jaccard’s similarity coefficient (e.g., Majid et al., 2008; cf. Vulchanova et al., 2012). The UPGMA grouping method is accordingly selected for this study.

Average-linkage clustering defines the distance between a case and a cluster, or between two clusters, as the average of the distances between all possible pairs of points, one from each of the two groups being compared (Wilks, 2011). Take the similarity indices in Table 2.10 as an example. We know that the closest cases (= rivers) are C and D. First, these cases can be joined to form a cluster. Which of the other cases (i.e., A or B) should next be joined to this cluster? At this stage, the average linkage criterion is involved. However, since this criterion depends on distance measures, the equation D = 1 – S needs to be used to convert all the Jaccard’s indices in Table 2.10 to a distance matrix, as shown in Table 2.11.


Table 2.11

Distance measures of four dummy rivers, calculated from similarity indices in Table 2.10, through conversion formula: D = 1 – S.

A B C D

A - 1.000 1.000 1.000

B - 1.000 1.000

C - 0.000

D -

According to the above table, when considering how far, say, River A is from the cluster of Rivers C and D, we involve two distance measures: A to C and A to D. To define the distance between the case and the cluster, the average-linkage criterion computes the average of these two distances. The averaged distance of A to C and A to D is:

$$\frac{1.000 + 1.000}{2} = 1$$

suggesting complete dissimilarity between the case and the cluster, so that no meaningful larger cluster is generated. Likewise, the distance between River B and the cluster of Rivers C and D shows the same pattern, with an average distance of 1 and no larger cluster derived. Additionally, Rivers A and B themselves do not form a cluster, given their distance of 1. Table 2.12 summarises the average-linkage hierarchy between Rivers A, B, C, and D.


Table 2.12

Average-linkage hierarchical clustering steps for Rivers A, B, C, and D.

Step Joining distance Joining Joining New cluster called

1 0.000 C D C/D

2 1.000 C/D A N/A

3 1.000 C/D B N/A
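The joining steps can also be traced with a naive average-linkage routine over the distances in Table 2.11. This is a sketch under stated assumptions, not the software used in the study: the names `dist`, `average_linkage`, and `steps` are introduced here, and ties are broken simply by list order. Since every remaining distance equals 1.000 once C and D are joined, the order of the later joins is arbitrary and need not match the row order of Table 2.12.

```python
from itertools import combinations

# Pairwise distances from Table 2.11 (D = 1 - S)
dist = {
    frozenset("AB"): 1.0, frozenset("AC"): 1.0, frozenset("AD"): 1.0,
    frozenset("BC"): 1.0, frozenset("BD"): 1.0, frozenset("CD"): 0.0,
}


def average_linkage(cluster_1, cluster_2):
    """Mean of all pairwise distances between members of two clusters."""
    pairs = [(x, y) for x in cluster_1 for y in cluster_2]
    return sum(dist[frozenset((x, y))] for x, y in pairs) / len(pairs)


clusters = [("A",), ("B",), ("C",), ("D",)]
steps = []
while len(clusters) > 1:
    # Join the closest pair of clusters; ties are resolved by list order (arbitrary).
    c1, c2 = min(combinations(clusters, 2), key=lambda pair: average_linkage(*pair))
    steps.append((c1, c2, average_linkage(c1, c2)))
    clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]

for c1, c2, distance in steps:
    print("/".join(c1), "+", "/".join(c2), f"at {distance:.3f}")
```

The first join is always C with D at 0.000; every later join happens at 1.000, which corresponds to the complete dissimilarity described above.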

A graphic illustrating the clustering of cases is called a dendrogram (or more simply, tree diagram) (Spencer, 2013). In a dendrogram, individual cases start from one side, mostly at the bottom or at the left or right end, and then successively merge into a single cluster at the other side. The scale of the diagram indicates the distance at which clusters are joined. Thus, in Figure 2.4, taking the results from Table 2.12 as an example, Rivers C and D form a cluster at the distance of 0.000. Subsequently, the newly created cluster (Rivers C and D), River A, and River B are technically merged at the distance of 1.000, showing complete dissimilarity.


Figure 2.4. Dendrogram of four dummy rivers: A, B, C, and D, showing individual cases continually clustered from the right end to a single cluster at the left end.

All in all, we can see how to perform Hierarchical Clustering through case similarity as determined by variables. In the Hierarchical Cluster Analysis of this research, lexical descriptors, i.e., verbs of caused-separation events, are treated as the variables, while the cases are the video clip stimuli (cf. Bohnemeyer et al., 2001). The dendrograms derived from the probes are used to help visually determine the semantic categorisations in the caused-separation domain across Thai and Khmer.

------⁂ ------


CHAPTER 3

Denotational range of caused-separation in Thai

This chapter discusses elicitation and analysis of the Thai data. As outlined in the previous chapter, the field investigation made use of stimulus materials developed and explained by Bohnemeyer et al. (2001). These materials are designed to produce descriptions for 43 caused-separation stimulus events by using video clips to elicit relevant lexical verbs. In line with the semantic typology approach of Evans (2010), field data in the caused-separation domain were then analysed over three areas of the denotational range: granularity of linguistic encoding, lexico-semantic boundary locations, and semantic organisation.

I start with a brief description of the Thai caused-separation semantics addressed prescriptively in the previous scholarly literature (§ 3.1). I then investigate the issue of granularity: how native Thai speakers carve up the caused-separation domain into categories. Next, I analyse how these categories are placed with reference to one another in the semantic space of caused-separation and detect overlapping areas that compromise the category boundaries (§ 3.3). Last, I examine the semantic organisation within the caused-separation domain to identify semantic components in the encoding of the domain in Thai (cf. Majid et al., 2007b).


3.1 Lexical expressions of caused-separation in Thai:

Prescriptive accounts

Vocabulary of this semantic domain is commonly provided in monolingual and bilingual dictionaries, as well as in a thesaurus. This section reviews Premsrirat’s (1987) listing of cutting words found in four dictionaries (Haas, 1964; McFarland, 1944; Royal Institute of Thailand (RIT), 1950, 1982), the definitions of some common caused-separation words in the revised edition of the RIT dictionary (2011), and categories of caused-separation words as classified by semantic similarity in a Thai-Thai thesaurus (Phanthumetha, 2016).

The following paragraphs summarise the limitations of the definition of caused-separation words by synonyms and distinctive characteristics. In differentiating between words of the lexico-semantic domain, there appear to be unclear definitional points. Definitions of caused-separation terms given in the works considered above raise these concerns:

(1) Circularity of definitions: deficits of this kind are commonly seen in Thai-Thai

dictionaries. When moving through networks of word-defining synonyms,

often we eventually arrive at the starting point again. Then, semantic

distinctions among caused-separation terms may not be drawn in any revealing

way.

(2) Vagueness of definitions: in Thai-English dictionaries, the definition of a

caused-separation term often does not clearly distinguish it from others of the

same kind.

(3) Missing distinctive components: monolingual and bilingual dictionaries fail to

include important components, being under-informative regarding distinctions.


(4) Unclear status and relations of distinguishing components: it has yet to be

made clear whether different distinguishing components are of equal

importance to speakers of Thai in discerning caused-separation words and in

picking out any of them to describe events of caused separation.

(5) Status of academically constructed categorisation: in a thesaurus-dictionary

(Phanthumetha, 2016), caused-separation terms are classified into different

subtle (or fine-grained) subcategories. Still, it is arguable whether such

partitioning is simply an arbitrary scholarly classification. If this is the case, it

may not reflect how Thai speakers comprehend extensions of terms when

actually using them.

What follows discusses the means of giving prescriptive definitions for caused-separation terms referred to above, together with their potential defects.

A full list of cutting words in Thai was first proposed by Premsrirat (1987) based on two monolingual dictionaries: Royal Institute of Thailand (1950, 1982), and two bilingual dictionaries: McFarland (1944) and Haas (1964). According to

Premsrirat, cutting words refer specifically to “a human manual activity to divide an object into parts with the help of a sharp-edged instrument” (p. 150). Thus, cutting words in her terms do not completely equate semantically with caused-separation terms considered in this research. They are only the part which expresses material destruction. Looking at these Thai cutting words, however, helps understand how their prescriptive meanings are discerned.

Premsrirat (1987) summarises the distribution of her list of 55 cutting words occurring in the dictionaries: some are found in all of the four dictionaries while


others are not. Table 3.1 shows how 17 out of 55 cutting words are absent from some sources (p. 164).

Table 3.1

Cutting words not found in some of the four dictionaries; x marks absence of relevant words.

Cutting word23 Haas (1964) McFarland (1944) RIT (1950) RIT (1982)

/sɔːj/ x

/sɔ́ʔ/ x

/tɕìːak/ x

/bàːk/ x

/pàːt/ x

/pɔ̀ːk/ x

/kʰwân/ x

/tɕʰɔ̀ʔ/ x

/kɛ̀ʔ/ or /kɛ̀ʔsa làk/ x

/lɔ́ʔ/ x

/tɕǐːan/ x

/krìːt/ x

/tɕʰalɛ̀ːp/ x x

/tɕìːak/ x

/lít/ x x

/kʰlìp/ x x

/sǎj/ x x

23 The original forms of transcription in Premsrirat (1987) are adapted to suit the format used in this study.

Premsrirat (1987) raises two basic issues observed from the definitions of 55 cutting words in the dictionaries. One concerns circularity: the use of Thai cutting words to define each other in RIT (1950, 1982). The other issue is concerned with how McFarland (1944) and Haas (1964) use English translated synonyms to define cutting words in Thai.

To illustrate these issues, I follow Premsrirat in selecting eleven cutting words to describe and compare how words of this set are described in dictionaries. The shortlist of cutting words includes /tɕʰalɛ̀ːp/, /tɕʰɯ̌ːan/, /tɕʰɯ̂ːat/, /tɕʰamlɛ̀ʔ/, /hàn/, /fǎːn/, /lɛ̂ː/, /lɔ́ʔ/, /pàːt/, /sǎj/, and /tʰɯ̌ːa/ (pp. 165-169).

Table 3.2 below demonstrates how the eleven cutting words listed above are described by ten words of the same kind in Thai (RIT, 1950, 1982; cf. Premsrirat, 1987). It is apparent that seven of the defining words are themselves among the eleven words: /tɕʰalɛ̀ːp/, /tɕʰɯ̌ːan/, /tɕʰɯ̂ːat/, /fǎːn/, /lɛ̂ː/, /pàːt/, and /tʰɯ̌ːa/, thereby lending somewhat circular meanings. For example, based on RIT (1950, 1982), /tɕʰɯ̂ːat/ may be defined with /tʰɯ̌ːa/, which is circularly described with /tɕʰɯ̂ːat/ in the : /tɕʰɯ̂ːat-tʰɯ̌ːa/. Another case in point is /tɕʰɯ̌ːan/ prescriptively meaning /tɕʰɯ̂ːat/, which may be defined with /lɛ̂ː/. However, /lɛ̂ː/ itself links back to /tɕʰɯ̌ːan/ again.
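The circularity described here can be checked mechanically by treating the definitions as a directed graph and searching for a path that returns to its starting word. The sketch below is illustrative only: it encodes just the definitional links named in the paragraph above (on the reading that /tɕʰɯ̂ːat/ is the word defined with /lɛ̂ː/), and the variable and function names are introduced here as assumptions.

```python
# The four words involved in the two examples given in the text
CHUEAT = "tɕʰɯ̂ːat"
THUEA = "tʰɯ̌ːa"
CHUEAN = "tɕʰɯ̌ːan"
LAE = "lɛ̂ː"

# word -> words used to define it (only the links mentioned in the paragraph above)
defined_with = {
    CHUEAT: [THUEA, LAE],
    THUEA: [CHUEAT],
    CHUEAN: [CHUEAT],
    LAE: [CHUEAN],
}


def find_cycle(start, graph):
    """Depth-first search for a chain of definitions that leads back to `start`."""
    stack = [(start, [start])]
    while stack:
        word, path = stack.pop()
        for next_word in graph.get(word, []):
            if next_word == start:
                return path + [start]
            if next_word not in path:
                stack.append((next_word, path + [next_word]))
    return None


for word in defined_with:
    cycle = find_cycle(word, defined_with)
    if cycle:
        print(" -> ".join(f"/{w}/" for w in cycle))
```

Every one of the four words lies on at least one such loop, which is what the text means by definitions that eventually arrive at the starting point again.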


Table 3.2

Circularity in dictionary definitions: 11 cutting words in RIT (1950, 1982).

Columns: Cutting word; ten Thai defining terms (among them /tɕʰalɛ̀ːp/, /tɕʰɯ̌ːan/, /tɕʰɯ̂ːat/, /fǎːn/, /lɛ̂ː/, /pàːt/, /tʰɯ̌ːa/, and /fan/); Remark

/tɕʰalɛ̀ːp/ ✓ n/a in RIT (1982)

/tɕʰɯ̌ːan/ ✓ ✓

/tɕʰɯ̂ːat/ ✓ ✓ ✓ ✓

/tɕʰamlɛ̀ʔ/ ✓ ✓ ✓

/hàn/ ✓

/fǎːn/ ✓ ✓ ✓ ✓ ✓

/lɛ̂ː/ ✓

/lɔ́ʔ/ ✓ n/a in RIT (1982)

/pàːt/ ✓ ✓ ✓

/sǎj/ n/a in both

/tʰɯ̌ːa/ ✓ ✓

Given the use of Thai synonyms for definitions, overt circularity is present, thus making no clear distinctions for some cutting terms. This could be regarded as essentially meaningless since defining some Thai cutting words by using other words of the same lexico-semantic domain is likely to produce misleading definitions, including implications about full synonymy. As a matter of fact, despite being defined with other Thai words, one cutting word is not entirely synonymous with another one.

In general, they merely share senses in some contexts but differ in others.

Considering the same 11 shortlisted cutting words again, but in the dictionaries of McFarland (1944) and Haas (1964), I found that English translation equivalents are employed for their definitions (Table 3.3). Although 15 different English words are involved in describing the meaning of the 11 Thai cutting words, only three English partial synonyms seem predominant in the definitions. In fact, 9 out of 11 Thai words are defined only with the English terms, i.e., slice, carve, and cut. Consequently, the English translations do not seem to provide adequate information to distinguish some cutting terms in Thai. For example, (1) /tɕʰɯ̂ːat/ and /tɕʰamlɛ̀ʔ/, and (2) /hàn/ and /fǎːn/ are similarly defined using the English translated synonyms: (1) carve, slice, and cut, and (2) slice and cut. Given such translations into English, one would find it quite difficult to say how the words in each pair differ.
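The point about indistinguishable pairs can be illustrated by grouping words according to their sets of English glosses: any group with more than one member contains words that the translations alone cannot tell apart. The sketch below uses only the four words and glosses stated in the paragraph above; the dictionary `glosses` and the other names are assumptions introduced for illustration.

```python
from collections import defaultdict

# English glosses for the two pairs discussed in the text
glosses = {
    "tɕʰɯ̂ːat": {"carve", "slice", "cut"},
    "tɕʰamlɛ̀ʔ": {"carve", "slice", "cut"},
    "hàn": {"slice", "cut"},
    "fǎːn": {"slice", "cut"},
}

# Group the Thai words by their (order-independent) set of glosses
groups = defaultdict(list)
for word, gloss_set in glosses.items():
    groups[frozenset(gloss_set)].append(word)

# Words sharing an identical gloss set cannot be told apart from the glosses alone
indistinguishable = [words for words in groups.values() if len(words) > 1]

for words in indistinguishable:
    print(" / ".join(f"/{w}/" for w in words))
```

Applied to a full clip-by-gloss table, the same grouping would list every set of cutting words that a purely translation-based definition collapses together.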

Table 3.3

English translations used for definitions of 11 Thai cutting words in McFarland (1944) and Haas

(1964).

Cutting word   English translated synonyms   Remark

carve slice cut excise prune level pare chip break dissect dress shave trim plane hack

/tɕʰalɛ̀ːp/ ✓ n/a in McFarland (1944)

/tɕʰɯ̌ːan/ ✓ ✓ ✓ ✓ ✓

/tɕʰɯ̂ːat/ ✓ ✓ ✓

/tɕʰamlɛ̀ʔ/ ✓ ✓ ✓

/hàn/ ✓ ✓

/fǎːn/ ✓ ✓

/lɛ̂ː/ ✓ ✓ ✓ ✓ ✓

/lɔ́ʔ/ ✓ ✓

/pàːt/ ✓ ✓ ✓ ✓ ✓ ✓

/sǎj/ ✓

/tʰɯ̌ːa/ ✓ ✓ ✓ ✓

Additionally, Premsrirat (1987, p. 162) points out that even though other detail has been included in definitions by Thai words in RIT (1950, 1982) and English translated synonyms in McFarland (1944) and Haas (1964), this detail is inadequate to distinguish between different terms of cutting given the lack of crucial distinguishing


components. She proposes that distinctions in objects being acted upon, and manners of action/purposes/emotions should be included for the clearer, i.e., more distinctive, meaning of individual cutting words. Tables 3.4 and 3.5 outline the theme objects and the manners/purposes/emotions, respectively, added by Premsrirat to improve definitions of the 11 Thai cutting words (pp. 165-169).

Table 3.4

Theme objects specified for cutting words’ definition proposed by Premsrirat (1987).

Cutting word   Theme object

meat fruit vegetable carcass flesh gristle ice wood throat wrist organ part dress paper leather

/tɕʰalɛ̀ːp/ ✓ ✓ ✓ ✓

/tɕʰɯ̌ːan/ ✓ ✓ ✓

/tɕʰɯ̂ːat/ ✓ ✓

/tɕʰamlɛ̀ʔ/ ✓ ✓

/hàn/ ✓ ✓ ✓

/fǎːn/ ✓ ✓

/lɛ̂ː/ ✓ ✓ ✓

/lɔ́ʔ/ ✓ ✓

/pàːt/ ✓ ✓ ✓ ✓

/sǎj/ ✓ ✓ ✓ ✓

/tʰɯ̌ːa/ ✓

Table 3.4 shows 14 kinds of objects taken to be representative for different actions represented by the individual cutting terms. These theme objects added by Premsrirat are distinctive in that no two of the 11 cutting words have exactly the same list of objects being acted upon. This might imply that the dictionary definitions, as interpreted by Premsrirat, can strongly associate the individual terms with specific theme objects. However, I suggest that specific theme-object components have less predictive power than assumed. Though theme objects like meat, fruit, or vegetable would help formulate the meaning of cutting words in Thai, they do not predict whether a verb can describe cutting actions on objects that are different but similar in some respects, such as meat versus gelatin. A more generalised aspect of these theme objects, which can be referred to as object types, may have more predictive usefulness. For instance, instead of defining /pàːt/ for meat, fruit, vegetable, and organ parts, the word would be better defined in association with the three-dimensional object type.

In addition to theme objects, Premsrirat (1987) proposes that manners of action, purposes or emotions are helpful as distinguishing characteristics for cutting terms. Table 3.5 shows manners/purposes/emotions potentially contained in the semantics of the selected 11 cutting words in Thai, as proposed by Premsrirat (pp.

165-169).


Table 3.5

Manners/purposes/emotions specified for cutting words’ definition proposed by Premsrirat (1987).

Cutting word   Manner/purpose/emotion specified in definitions

annoyance forcefully repeatedly rapidly diagonally horizontally vertically crosswise lengthwise precisely neatly roughly little by little sawing violently destructive

/tɕʰalɛ̀ːp/ ✓ ✓ ✓

/tɕʰɯ̌ːan/ ✓

/tɕʰɯ̂ːat/ ✓ ✓ ✓ ✓

/tɕʰamlɛ̀ʔ/ ✓

/hàn/ ✓

/fǎːn/ ✓ ✓ ✓ ✓

/lɛ̂ː/ ✓ ✓ ✓

/lɔ́ʔ/ ✓ ✓

/pàːt/ ✓ ✓ ✓

/sǎj/ ✓

/tʰɯ̌ːa/ ✓ ✓ ✓ ✓

Again, manners of action, purposes, or emotions appear to distinguish the individual cutting words semantically, as in Table 3.5. Identical lists of potential manners/purposes are not shared by any of the words demonstrated in the table, thereby achieving semantic distinctions between them.

To sum up, the four monolingual and bilingual dictionaries appear to be deficient in their definitions. They are unable to specify adequate distinctive meanings for cutting words in Thai. They give circular definitions and English synonym translations that are vague. In response to these deficiencies in the dictionaries,

Premsrirat (1987) suggests the addition of certain components: e.g., theme objects and


manners/purposes, into the meaning of words that represent cutting actions in Thai to help distinguish them semantically.

Turning now to a more recent edition of the Thai-Thai Royal Institute’s Dictionary or RID (RIT, 2011), I expand consideration from Thai cutting terms to those of caused-separation more generally. This includes verbs of material destruction: e.g., those of smashing, snapping, or tearing. For purposes of brief discussion here, I select eight caused-separation verbs based on their high frequency (> 700 tokens) in the Thai National Corpus (TNC).24 They are listed below with definitions translated from dictionary entries:

- /fan/. To take a sharp-edged object like a sword, and then use it to slash (/fâːt/) something.

- /sàp/. To take a sharp instrument like a knife or an axe, and either swiftly or repeatedly hack away (/fan/) at something, such as hacking a pig bone or hacking a papaya; or to take a pointed instrument, then prick or pierce (/tɕɔ̀ʔ/) something with it, such as piercing an elephant with a sharp metal hook.

- /tàt/. To make (something) ‘separate’ with a sharp instrument: cutting paper, cutting cloth.

- /hàn/. To put something on a support and cut (/tàt/) it into small pieces.

- /pʰàː/. To cause something to split (/jɛ̂ːk/) lengthwise, using a knife or an axe, such as cutting firewood, or dissecting an abscess.

- /tʰúp/. To use a hard object like a hammer or a round object to hit (/tiː/) down on something, thereby causing it to shatter (/tɛ̀ːk/): e.g., smashing a coconut, a brick, or a rock; or causing it to be easily chewable (/nûm/) or crushed (/lɛ̀ːk/), such as smashing beef or pork; or causing someone [animals included] to die, such as crushing the head of a fish.

- /tɕʰìːk/25 (TR). To cause something to be torn (/kʰàːt/) or to come apart (/jɛ̂ːk/): e.g., tearing cloth, tearing a durian.

- /hàk/ (TR). To fold (/pʰáp/) or bend (/ŋɔː/) something so that it tears (/kʰàːt/) or falls apart (/lùt/).

24 Aroonmanakun, W. (2007). TNC: Thai National Corpus (Third Edition). Retrieved October 1, 2020, from http://www.arts.chula.ac.th/~ling/tnc3/.

Similar to the earlier editions of RID (RIT, 1950, 1982), the 2011 edition still uses synonymous Thai words, together with other optional distinctive descriptions (e.g., implied results, spatial patterns of action, and relevant items), to define caused-separation terms. Alongside those details, prototypical instruments and exemplary theme objects are sometimes given to draw distinctions between different caused-separation words. For example, /sàp/ is defined by means of /fan/, and both are interpreted as generally involving a blow with a sharp-bladed tool.

RIT (2011) does not methodically define caused-separation words with all the above characteristic components (e.g., instruments or theme objects) and necessary information. Take /tɕʰìːk/ and /hàk/ as examples. These two words are defined as referring to caused-separation without sufficiently specifying whether any implement is used.

Native-speaker adults may not react to definitions of these two words as lacking information regarding instrument, since they either experientially or culturally recognise that happenings represented by /tɕʰìːk/ and /hàk/ usually involve hand actions, or do not involve other types of instruments.

Young learners of Thai and non-native users may find the definitions under-informative, since they would view them as reporting only meagre characteristics of words that do not prove very helpful in providing specific mental representations or “pictures in the mind”. For instance, the above meaning of /hàk/ does not seem to exclude the application of the word to theme objects such as 70-gram A4 paper, although no such use is attested.

25 /tɕʰìːk/ and /hàk/ can be classified as intransitive verbs under certain sentence conditions. In their intransitive sense, both verbs may refer to either spontaneous separation or resulting separation, the latter derived from caused-separation expressed by the same verb forms.

Lastly, moving to the extensive thesaurus compiled by Phanthumetha (2016), I found that this prescriptively positioned text contains a large vocabulary of Thai caused-separation words, as evidenced in Table 3.6 below.


Table 3.6

Caused-separation words found in Phanthumetha’s (2016) Thai-Thai thesaurus, as involving 12 meaningful fine-grained categories; some words (underlined) are classified into two subtle subcategories.

Mega-category All things Man and all things Category Things, nature, condition, characteristics, and behaviour Things, nature, condition, characteristics, and behaviour Sub-category Changing Changing textures Combining and separating Mobilising and changing Behaviour of things Fine-grained sub-category Separating some parts Breaking Changing shapes Separating adjacent parts Changing sizes Making holes Scratching Levelling Separating Impacting Making entering or going into Touching things Term /bàn/ /bùp/ /tɕi aranaj/ /hɛ̀ k/ /tàt/ /prùʔ//bà k/ /krɔ / /bɛ̀ŋ//kʰrû t/ /tɕìm/ /bà t/ /bìʔ//fâ t/ /tɕi a/ /tɕɔ̀ʔ//bâŋ//pà t//jɛ̂ k//kʰù t/ /tʰalu aŋ/ /tʰîm/ /dèt/ /rabɤ̀ t//klɔ̀ m/ /tɕʰaj/ /kʰù t/ /sǎj/ /sakàt/ /tʰúp/ /tʰɛ ŋ/ /fǒn/ /tɔ̀ j//lǎw/ /tɕʰalùʔ//kʰwân/ /tɕi a/ /tʰalǔŋ/ /tʰîm/ /hàk/ /tʰúp/ /sî am/ /tʰalu aŋ/ /krì t/ /hâm/ /hàn/ /kàt/ /katʰɔ́ʔ/ /kì aw/ /krì ak/ /lɔ́ʔ/ /lɛ̂ / /lít/ /pʰà / /ra n/ /rɔ n/ /rɔ́ʔ/ /sàp/ /sɔ j/ /sɔ̌ j/ /sɔ́ʔ/ /sǐn/ /tàt/ /tɕàk/ /tɕamrɤ n/ /tɕʰalì k/ /tɕʰamlɛ̀ʔ/ /tɕʰɔ̀ʔ/ /tɕʰì k/ /tɕʰǐn/ /tɕʰɯ̌ an/ /tɕʰɯ̂ at/ /tɕi an/ /tɕǐ an/ /tʰɔ n/


Table 3.6 shows as many as 73 different caused-separation words in Thai, classified by 12 different fine-grained semantic relations (e.g., separating some parts, breaking, changing shapes, or making holes) according to similarities26 in meaning. As proposed by Phanthumetha (2016), the largest group of caused-separation words (= 36) belongs to the subtle category of separating some parts, and mostly overlaps with the list of cutting words of Premsrirat (1987). The smallest fine-grained categories are those of separating adjacent parts and changing sizes. Note that some caused-separation words are found in two different fine-grained subcategories: e.g., /tàt/ is found both in the set of separating some parts and in that of changing sizes.

When all the subtle subcategories of the caused-separation words are considered, they are found mainly to involve actions that bring about state changes. To this extent, Phanthumetha’s (2016) categorisation is consistent with observations of other scholars (e.g., Guerssel et al., 1985; Levin, 1993; Majid et al., 2007a, 2008). That being said, some fine-grained subcategories are not so well typified in this respect despite their caused-separation word members denoting material destruction involving a change of state. For example, two caused-separation words, i.e., /bàːt/ and /tʰîm/, are typically regarded as verbs of material destruction (cutting and piercing, respectively), but they are classified into the subtle subcategory of touching things. This naming is unilluminating: it neglects any association with potential material destruction, that is, with caused state-change events.

26 Phanthumetha (2016) defines whether different words are considered similar in meaning on four bases: (1) synonymous words, (2) words with synonymous meanings but different at language registers (/tàt/ (ordinary) versus /tɕʰǐn/ (poetic)), (3) partially synonymous words sharing at least one sense, and (4) words with similar meanings (pp. (12)-(13)).


Similar to the above Thai-Thai dictionaries (RIT, 1950, 1982, 2011) and Premsrirat’s (1987) study, Phanthumetha’s (2016) Thai-Thai thesaurus also uses circular approaches to define caused-separation words. Specifically, Thai synonyms are principally employed along with other information, such as examples of theme objects (e.g., chives or rose branches for /tàt/), prototypical tools (e.g., sickle for /kìːaw/), manners of action (e.g., forcefully for /tɕʰɯ̂ːat/), implied/expected results (e.g., into pieces for /tʰɔːn/), and relevant items (e.g., supporting board for /hàn/). These provide short definitions for each caused-separation term. Nevertheless, Phanthumetha does not always use a systematic combination of these components to define individual caused-separation words. Instead, she clarifies her objectives this way: since her primary purpose is to categorise Thai words with similar meanings into the same subtle or fine-grained subcategories (p. 10), she only gives short and clear summary meanings to individual words in order that they can be semantically distinguished from others in the same fine-grained subcategories. Examples are offered only when necessary (p. 13).

As discussed above, the monolingual and bilingual dictionaries (Haas, 1964; McFarland, 1944; RIT, 1950, 1982, 2011), the research study on Thai cutting words’ prescriptive definitions (Premsrirat, 1987), and the thesaurus (Phanthumetha, 2016) provide approaches to defining caused-separation words in Thai that are essentially lexicographical. These approaches may be broken down into two groups: those using synonyms, and those using distinctive features. As for definitions by synonyms, Phanthumetha, for example, uses /tàt/ to define many caused-separation words, such as /kìːaw/, /sakàt/, or /raːn/. In the bilingual dictionaries, translated English synonyms are usually used to label Thai cutting words: for example, cut, carve, slice for /tɕʰɯ̂ːat/, /tɕʰamlɛ̀ʔ/, /lɛ̂ː/ in Thai. However, Thai or English synonyms are not used alone to define caused-separation words; they are often intertwined with distinguishing components or distinctive information to form short definitions or explanations for each caused-separation term. Take /sakàt/ as an example. Premsrirat uses the distinguishing components of instrument and theme objects to define the term. Phanthumetha, by contrast, applies components of an object type (i.e., solids), exemplary objects (e.g., steel or stones), and potentially implied or expected results (e.g., [scratched] into pieces), but does not mention instruments. According to the above references, distinctive characteristics for caused-separation words may be summarised as follows:

- Theme objects or object types

- Instruments

- Expected results

- Manners of action, purposes, or emotions

Synonymous words along with distinguishing components in prescriptive definitions appear on the one hand to distinguish between Thai caused-separation words in given cases (cf. Premsrirat, 1987). On the other hand, if shared either fully or in part across the meanings of different caused-separation vocabulary terms, such synonyms and characteristics may aid in classifying words of the general semantic domain into semantic subcategories (cf. Phanthumetha, 2016).
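The double role of distinguishing components described here (telling words apart, and grouping words that share a component) can be sketched in a few lines of Python. The feature values below are an illustrative reading of five of the translated RID (2011) definitions quoted earlier in this section; the encoding, the component labels, and the names RID_FEATURES and SHARP are assumptions made for this sketch, not part of any of the cited works.

```python
from itertools import combinations

# Hypothetical encoding: each verb maps to a set of (component, value) pairs.
# Only the first sense of /sàp/ is encoded; values paraphrase the definitions.
SHARP = ("instrument", "sharp tool")

RID_FEATURES = {
    "fan":  {SHARP},
    "sàp":  {SHARP, ("manner", "swiftly or repeatedly")},
    "tàt":  {SHARP, ("result", "separated")},
    "hàn":  {("relevant item", "support"), ("result", "small pieces")},
    "pʰàː": {SHARP, ("manner", "lengthwise"), ("result", "split")},
}

# Distinguishing: list any pair of verbs whose feature sets are identical.
clashes = [
    (a, b)
    for a, b in combinations(RID_FEATURES, 2)
    if RID_FEATURES[a] == RID_FEATURES[b]
]

# Classifying: collect the verbs that share one component.
sharp_group = [verb for verb, feats in RID_FEATURES.items() if SHARP in feats]

print("identical feature sets:", clashes)
print("verbs sharing a sharp instrument:", sharp_group)
```

Under this encoding no two verbs receive the same feature set, while four of the five share the instrument component, which is the sense in which the same components can both distinguish and classify.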

In summary, caused-separation words have been commonly included in different monolingual and bilingual dictionaries (Haas, 1964; McFarland, 1944; RIT, 1950, 1982, 2011). In such dictionaries, terms that express caused-separation events are often prescriptively defined with synonymous words and other information like relevant distinguishing features. However, there appear to be certain shortcomings with prescriptive definitions of this type. Descriptions and explanations of caused-separation words can end up with circular or otherwise vague definitions of terms.

Some open issues concern the distinctive components deployed methodically to distinguish caused-separation terms, and the prescriptive semantic classifications of words of this lexico-semantic domain: are they perceptually accurate or acceptable to native speakers of Thai?

Next, I further consider caused-separation terms in Thai as used by a group of native speakers of Thai to name events of caused-separation with varying features, hoping to find descriptively positioned semantic categorisation and natural-sounding parametric features in the caused-separation lexical field.

3.2 Lexical items for caused-separation events and their structural patterns in Thai: Elicited data

This section reports on the field research introduced in § 2.2. Analysis is based on the descriptions of caused-separation events27 elicited from seven Thai speakers using the MPI’s ‘cut’ and ‘break’ clips (Bohnemeyer et al., 2001; only 43 caused-separation clips out of the total were selected for field investigation). The following sections present accounts of relevant lexical descriptors and their constructional structures. The Thai verbs extracted from the speakers’ descriptions address lexical diversity for the caused-separation domain in Thai (§ 3.2.1). The basic clause patterns speakers used for their reports with Thai caused-separation verbs are also analysed (§ 3.2.2).

27 The terms cutting and breaking, and other subevent names are used here as quick ways of describing ordinary-life separation events. Using these English labels is for convenience; across different languages there cannot be any implication that their meanings are homogenous. For example, cutting in Thai may not be comparable with cutting in other languages.


3.2.1 List of Thai caused-separation verbs

What follows shows the extent to which the seven native speakers of Thai consistently employed various caused-separation verbs to name actions shown in the individual stimulus video clips, thus describing their events (henceforth referred to as scenes).

Certain caused-separation verbs are found extensively applying to a wide range of different scenes, occurring more frequently than others in the datasets.

To ensure homogeneous conditions in the datasets, I have discarded cases in which the intended event of caused-separation was not coherently described and where incoherent descriptions were first articulated but later withdrawn by the speaker. The present research took all remaining descriptions into consideration (334 cases in total). Also excluded in the first instance are intransitive resulting-state separation verbs, e.g., /kʰàːt/ ‘be.torn’ or /tɛ̀ːk/ ‘be.shattered’, and the directional path verbs /ʔɔ̀ːk/ ‘go out; leave’ and /paj/ ‘go’, which occasionally occur serialised to preceding caused-separation verbs, and seem appropriate for emphasising change-of-state (or change-of-location) (cf. Slobin, 1996; Muansuwan, 2001; Talmy, 2003; Thepkanjana & Uehara, 2009)28. This is because these items do not directly encode caused actions, that is, caused-separation, which is the focus of the present study (see § 2.1.3.2).

However, I reflect on those verbs in § 3.2.2, where I briefly discuss certain constructional patterns related to caused-separation verbs.

Table 3.7 below shows that there are 24 different verbs used by the Thai speakers to depict the 43 scenes; their frequency of occurrence ranges from 1 to 79 tokens. The nine most common verbs, /tàt/ ‘cut’, /tʰúp/ ‘smash’, /fan/ ‘hack’, /hàk/ ‘snap’, /hàn/ ‘cut’, /sàp/ ‘chop’, /tɕʰìːk/ ‘tear’, /pʰàː/ ‘split’, /dɯŋ/ ‘tug’29, are used in more than 80% of all the scene descriptions. This means that less than 38% of the 24 verbs account for the majority of cases. The remaining 15 verbs (about 62% of the verb types), such as /lɯ̂aj/ ‘saw’ or /tɕɔ̀ʔ/ ‘puncture’, are responsible for the smaller number of cases: together they are deployed for less than 19% of the descriptions. These verbs can be considered rarely used and long-tailed in the datasets. Furthermore, about one-third of the verbs appear even more infrequently, each in less than 1% of the valid cases. These seven verbs each occur fewer than four times: /tɔ̀ːk/ ‘pound’, /dèt/ ‘pluck’, /tɕîm/ ‘jab’, /katʰɔ́ʔ/ ‘crack’, /kratɕʰâːk/ ‘drag off’, /lít/ ‘prune’, /sɔːj/ ‘slice’.

28 Despite the multi-verb sequence being used, I have taken the single-verb construction to be sufficient to describe caused separation in Thai and Khmer since it in general more often occurred in the event descriptions. Also, I conjecture that the resultative/multi-verb construction transpired occasionally only for emphasis of “accomplished” results.

Table 3.7

Thai verbs used to describe 43 caused-separation scene clips, in descending order of their distribution of verb tokens.

No.   Verb type      Approximate meaning30                            Frequency   Percentage of frequency   No. of scenes where the verb was applied

1.    /tàt/          ‘cut, sever’                                     79          23.65%                    19
2.    /tʰúp/         ‘hammer; hit, smash’                             42          12.57%                    7
3.    /fan/          ‘slash, cut, chop, sever’                        40          11.98%                    15
4.    /hàk/          ‘break, snap’                                    29          8.68%                     4
5.    /hàn/          ‘cut into pieces, slice’                         22          6.59%                     7
6.    /sàp/          ‘hack, chop, slash’                              21          6.29%                     8
7.    /tɕʰìːk/       ‘tear, rip; to get torn, ripped’                 16          4.79%                     4
8.    /pʰàː/         ‘split, cut, hew’                                13          3.89%                     3
9.    /dɯŋ/          ‘pull, draw, tug’                                10          2.99%                     2
10.   /bàːt/         ‘cut, wound, scar’                               9           2.69%                     1
11.   /lɯ̂aj/         ‘saw’                                            8           2.40%                     1
12.   /tɕɔ̀ʔ/         ‘puncture, bore, make a hole, punch a hole’      8           2.40%                     5
13.   /tɕaːm/        ‘strike’                                         7           2.10%                     4
14.   /tɕʰɯ̌an/       ‘cut off, slice off’                             5           1.50%                     3
15.   /krìːt/        ‘cut, slash, scratch’                            5           1.50%                     1
16.   /tʰîm/         ‘pierce, thrust in, stab’                        5           1.50%                     1
17.   /tʰɛːŋ/        ‘pierce, stab; to stick into, put into’          4           1.20%                     3
18.   /tɔ̀ːk/         ‘hammer, nail, pound’                            3           0.90%                     3
19.   /dèt/          ‘pluck, nip off, pinch off’                      2           0.60%                     2
20.   /tɕîm/         ‘jab’                                            2           0.60%                     1
21.   /katʰɔ́ʔ/       ‘flake, chip off; to crack’                      1           0.30%                     1
22.   /kratɕʰâːk/    ‘drag off’                                       1           0.30%                     1
23.   /lít/          ‘trim, prune’                                    1           0.30%                     1
24.   /sɔːj/         ‘mince, cut, slice (into small pieces)’          1           0.30%                     1

29 The English glosses here were selected to adequately specify the Thai caused-separation verbs and assist in readers’ ready comprehension. Still, one may earnestly recommend other possible better glosses for some of the verbs: e.g., ‘hit’, ‘pull’, ‘stab’ for the respective verbs /tʰúp/, /dɯŋ/, and /tɕîm/, thus there is a need to support the present study’s decisions with some justifications. As for /tʰúp/, though it does not necessarily implicate resultant completion, its more common occurrences in a single-verb construction in describing caused-separation with a result of shattered pieces (cf. Thepchuaysuk & Thepkanjana, 2017) are evidence for the gloss ‘smash’ rather than ‘hit’. English ‘hit’ should be good for /tiː/, not for /tʰúp/ since both ‘hit’ and /tiː/ are similar in being higher up (or more general) in the semantic hierarchy than /tʰúp/. Next, /dɯŋ/ is better matched with ‘tug’ than ‘pull’ since the former involves “being tight” which is not really emphasised with the latter. Last, /tɕîm/ is glossed with ‘jab’, probably better than ‘stab’ which seemingly implies more force or effort, more like /tʰɛːŋ/.

30 The approximate meanings are based on Thai-English Student’s Dictionary (Haas, 1964). Glosses for each verb may be different in this study from these prescriptively positioned meanings.
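The token shares discussed around Table 3.7 follow from simple arithmetic on its Frequency column. The Python sketch below reproduces that arithmetic; the frequencies are copied from the table, while the variable and function names (TOKENS, pct, top_nine, tail) are illustrative choices for this sketch only.

```python
# Token frequencies copied from Table 3.7 (verb -> number of tokens).
TOKENS = {
    "tàt": 79, "tʰúp": 42, "fan": 40, "hàk": 29, "hàn": 22, "sàp": 21,
    "tɕʰìːk": 16, "pʰàː": 13, "dɯŋ": 10, "bàːt": 9, "lɯ̂aj": 8, "tɕɔ̀ʔ": 8,
    "tɕaːm": 7, "tɕʰɯ̌an": 5, "krìːt": 5, "tʰîm": 5, "tʰɛːŋ": 4, "tɔ̀ːk": 3,
    "dèt": 2, "tɕîm": 2, "katʰɔ́ʔ": 1, "kratɕʰâːk": 1, "lít": 1, "sɔːj": 1,
}


def pct(part, whole):
    """Percentage share of part in whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


total = sum(TOKENS.values())  # all valid descriptions
ranked = sorted(TOKENS.items(), key=lambda item: item[1], reverse=True)
top_nine = sum(count for _, count in ranked[:9])  # nine most frequent verbs
tail = total - top_nine  # tokens of the remaining 15 verbs

print("total tokens:", total)
print("top nine verbs:", top_nine, "tokens,", pct(top_nine, total), "%")
print("remaining verbs:", tail, "tokens,", pct(tail, total), "%")
print("top nine as share of verb types:", pct(9, len(TOKENS)), "%")
```

On these figures the nine most frequent verbs cover 272 of the 334 tokens, and the other 15 verbs cover the remaining 62.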

The right-most column in Table 3.7 confirms the different distributional characteristics of the individual caused-separation verbs in Thai. For the total set of 43 scenes, a limited subset of Thai caused-separation verbs is widely distributed: these verbs are used to describe relatively large proportions of the total scenes. In contrast, other items were employed to describe only relatively small numbers of the scenes. More specifically, more than 44% of the scenes (19 of 43) could be named with the single high-frequency verb /tàt/ ‘cut’, which clearly indicates its generality in meaning. By contrast, the nine infrequent verbs, /bàːt/ ‘cut/wound’, /lɯ̂aj/ ‘saw’, /krìːt/ ‘cut/slash’, /tʰîm/ ‘pierce’, /tɕîm/ ‘jab’, /katʰɔ́ʔ/ ‘crack’, /kratɕʰâːk/ ‘drag off’, /lít/ ‘prune’, and /sɔːj/ ‘slice’, were each used to describe merely a single scene, together accounting for less than 21% of the designated scenes. In more general terms, more than 93% of all the scenes were described with one of the high-frequency verbs, while the rest were described with only one low-frequency verb. This set of single-item verbs is referred to as long-tailed.
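The scene-range figures can be checked the same way from the right-most column of Table 3.7. In the sketch below the scene counts are copied from the table; the names SCENES_PER_VERB, N_SCENES, and the derived variables are illustrative choices for this sketch.

```python
# Number of scenes each verb was applied to, copied from Table 3.7.
SCENES_PER_VERB = {
    "tàt": 19, "tʰúp": 7, "fan": 15, "hàk": 4, "hàn": 7, "sàp": 8,
    "tɕʰìːk": 4, "pʰàː": 3, "dɯŋ": 2, "bàːt": 1, "lɯ̂aj": 1, "tɕɔ̀ʔ": 5,
    "tɕaːm": 4, "tɕʰɯ̌an": 3, "krìːt": 1, "tʰîm": 1, "tʰɛːŋ": 3, "tɔ̀ːk": 3,
    "dèt": 2, "tɕîm": 1, "katʰɔ́ʔ": 1, "kratɕʰâːk": 1, "lít": 1, "sɔːj": 1,
}
N_SCENES = 43  # stimulus clips used in the study

# Verb with the widest extension across scenes.
widest_verb = max(SCENES_PER_VERB, key=SCENES_PER_VERB.get)
widest_share = round(100 * SCENES_PER_VERB[widest_verb] / N_SCENES, 2)

# Verbs restricted to a single scene.
single_scene = [verb for verb, n in SCENES_PER_VERB.items() if n == 1]
single_share = round(100 * len(single_scene) / N_SCENES, 2)

# Sum of per-verb scene counts: each (verb, scene) pairing counted once.
verb_scene_pairs = sum(SCENES_PER_VERB.values())

print(widest_verb, widest_share)
print(len(single_scene), single_share)
print(verb_scene_pairs, round(verb_scene_pairs / N_SCENES, 2))
```

Because the per-verb scene counts sum to 98 while only 43 scenes exist, a scene received a little over two different verbs on average across the seven speakers.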

Some caused-separation verbs were employed in descriptions of more than one scene. According to Table 3.7, the sum of the scene counts of the individual verbs (= 98) is larger than the number of the scenes used for the study (= 43). This shows that some scenes were described with multiple caused-separation verbs. Figure 3.1 summarises numbers of scenes that were named with different numbers of lexical verbs.

Figure 3.1. Numbers of scenes with different numbers of verbs in Thai.

As illustrated in Figure 3.1, Thai speakers in the field study showed variation in how consistently certain scenes were described. For more than a third of all the scenes, a single caused-separation verb was used in descriptions. This was followed by the use of two, three, or four different caused-separation verbs for about 23%, 20% and 12% of all the scenes, respectively. This means that over 80% of all the scenes were described using only one to three different caused-separation verbs. The application of five different caused-separation verbs across the Thai speakers for an individual scene is sporadic: there are only three scenes (about 7% of the 43) displaying this phenomenon, i.e., Scene 43 CHOP CARROT W/ CHISEL, Scene 53 CHOP STICK, and Scene 54 CHOP CARROT W/ AXE. These differences in the number of caused-separation verbs used for individual scenes reflect varying degrees of agreement between the speakers of Thai in naming particular caused-separation events. This is discussed in more detail in § 3.4.
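As a consistency check on the proportions reported for Figure 3.1, the sketch below uses scene counts that are inferred rather than reported: 16, 10, 9, 5, and 3 scenes for one to five different verbs. These are integer values that sum to 43 scenes, reproduce the total of 98 verb-scene pairings from Table 3.7, and approximate the stated percentages; only the count of three scenes with five verbs is given explicitly in the text, so the other four counts are assumptions.

```python
# Assumed counts: number of different verbs -> number of scenes.
# Only the value for 5 verbs (three scenes) is stated in the text;
# the rest are inferred from the reported percentages and totals.
SCENES_BY_VERB_COUNT = {1: 16, 2: 10, 3: 9, 4: 5, 5: 3}

n_scenes = sum(SCENES_BY_VERB_COUNT.values())
verb_scene_pairs = sum(k * n for k, n in SCENES_BY_VERB_COUNT.items())

# Share of scenes for each number of different verbs, in percent.
shares = {
    k: round(100 * n / n_scenes, 1) for k, n in SCENES_BY_VERB_COUNT.items()
}

# Scenes described with one to three different verbs.
one_to_three = round(
    100 * sum(SCENES_BY_VERB_COUNT[k] for k in (1, 2, 3)) / n_scenes, 1
)

print(n_scenes, verb_scene_pairs)
print(shares)
print(one_to_three)
```

Under these assumed counts, three scenes out of 43 come to roughly 7%, and scenes with one to three different verbs make up a little over 81%.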

In the cases considered thus far, the Thai speakers used 24 different caused-separation verbs in response to the scenes. The most frequent nine verbs account for more than four-fifths of the scene descriptions. The remaining 15 verbs were employed in a low proportion only. Also, the Thai caused-separation verbs were found to differ in how speakers extended their ranges. Some were used in descriptions that spanned a wide range of scenes, whereas others were restricted to a single scene. More than a third of the scenes were named with only one caused-separation verb; less than 7% of the scenes were described with as many as five different verbs.

Since the nine most frequent verbs account for the great majority of scenes (= more than 93%), a preliminary observation is that this uneven distribution is of significance for characterising the lexico-semantic structure of the Thai caused-separation domain. The extension of these caused-separation verbs is described in more detail in § 3.3, as is the means of determining semantic categories in the caused-separation domain. In the next section, I illustrate and summarise the structural patterns of Thai caused-separation verbs as found in the field data.

3.2.2 Summary of structural patterns of Thai caused-separation verbs

Verbs of caused-separation listed in Table 3.7 were found in different syntactic constructions in the datasets used in this study. Their structural characteristics are illustrated in the following examples:


(3.1) pʰûːtɕʰaːj  kʰon     níː  hàn    kʰɛːròt
      man         CLF:man  DET  slice  carrot
      SBJ                       VB     OBJ
      ‘This man cut the carrot.’ [CB-S10-OS]

(3.2) kʰǎw  hàk   kìŋmáj  dûaj  nâːkʰǎː
      3SG   snap  twig    with  lap
      SBJ   VB    OBJ1    PREP  OBJ2
      ‘He snapped the twig with his lap.’ [CB-S5-RW]

(3.3) pʰûːtɕʰaːj  tɕʰáj  sìw     tàt  tɕʰɯ̂ːak
      man         use    chisel  cut  rope
      SBJ         VB1    OBJ1    VB2  OBJ2
      ‘The man used a chisel to cut the rope.’ [CB-S2-CP]

(3.4) pʰûːjǐŋ  tɕʰìːk  pʰâː   ʔɔ̀ːk  tɕàːk  kan
      woman    tear    cloth  exit  from   RECP
      SBJ      VB1     OBJ    VB2   PREP   RECP
      ‘The woman tore the cloth apart.’ [CB-S1-PA]

(3.5) pʰûːjǐŋ  tɕʰáj  kʰɔ́ːn   tʰúp   tɕaːn  tɛ̀ːk
      woman    use    hammer  smash  plate  be.shattered
      SBJ      VB1    OBJ1    VB2    OBJ2   VB3
      ‘The woman smashed the plate with a hammer.’ [CB-S40-YY]

(3.6) pʰûːtɕʰaːj  tʰúp   tɕʰɯ̂ːak  hâj        kʰàːt
      man         smash  rope     GIVE.CAUS  be.torn
      SBJ         VB1    OBJ      PURP       VB2
      ‘The man broke the rope.’ [CB-S50-CP]

(3.7) pʰûːtɕʰaːj  tɕʰáj  kʰɔ́ːn   tʰúp   tɕʰɯ̂ːak  tɕon   [tɕʰɯ̂ːak]  kʰàːt
      man         use    hammer  smash  rope     until  [rope]     be.torn
      SBJ         VB1    OBJ1    VB2    OBJ2i    CONJ   [SBJi]     VB3
      ‘The man hammered the rope until it broke.’ [CB-S50-YY]

(3.8) kʰǎw  fan    kìŋmáj  lǎːj  tʰiː      tɕon   [kìŋmáj]  hàk
      3SG   slash  twig    many  CLF:time  until  [twig]    break
      SBJ   VB1    OBJi    ADVP            CONJ   [SBJi]    VB2
      ‘He slashed the twig several times until it broke.’ [CB-S3-SW]

The above examples show that in general, the descriptions of different events by Thai speakers appeared in either mono-clausal contexts (in (3.1) – (3.6)) or in bi-clausal contexts (in (3.7) – (3.8)). Looking especially at the clauses where a caused-separation verb occurred to express causal actions, one can see that it can either function as a single main verb for the entire clause (in (3.1)–(3.2) and (3.8)) or occur serially with one or more verbs (in (3.3)–(3.7)). The structures [SBJ VB OBJ] and [SBJ VB1 OBJ1 VB2 OBJ2] are found respectively in 33% and 32% of all the descriptive clauses containing caused-separation verbs (positions indicated in boldface). The basic syntax of the elicited Thai descriptions can thus be interpreted as moderately uniform: these two constructions together account for nearly two-thirds of the descriptions. In the following, I briefly describe how each verb of caused-separation in Thai is expressed either in a mono-verbal construction (in § 3.2.2.1) or in a serial verb, i.e., multi-verb, construction (in § 3.2.2.2).

3.2.2.1 Verbs of caused-separation in single-verb constructions in Thai

According to the examples (3.1) – (3.2) and (3.8), the structure of a description in mono-verbal construction can be summarised as follows:


(SBJ) VB OBJ (ADVP) / (PREP OBJ)

The predicating element of this construction is always verbal, with a bare transitive verb, i.e., the verb of caused-separation. This caused-separation verb is usually preceded by a noun phrase: either nominal or pronominal, identifying the subject of the construction. However, given the information structure imposed by the method of elicitation, some informants occasionally did not include the subject phrase. A noun phrase describing the object immediately follows the caused-separation verb. The object phrase is sporadically followed by an adverbial phrase (as in (3.8)) or by a prepositional object phrase (as in (3.2)).

(Note that the use of the slash punctuation in the above is to indicate that either an adverbial or prepositional phrase may occur immediately after the main verb plus object of caused-separation. However, despite not being found in this study, some informants suggest a possible co-occurrence of the two types of phrases in any logical order.)

Correspondingly, in the Thai data, semantic roles seem to uniformly map onto the syntactic representations of a description in this single-verb construction.

Specifically, the agent of the caused-separation event is coded as the syntactic subject.

The action of caused-separation is consistently mapped into the main verb. The syntactic object expresses the theme of the verb that is the separation target. The adverbial phrase and prepositional object are sometimes realised to specify how the relevant action was done, and characterise the instrumental function, respectively.

This pattern is illustrated below.


SBJ      VB                   OBJ      ADVP     PREP OBJ
|        |                    |        |        |
AGENT    CAUSED-SEPARATION    THEME    MANNER   INSTRUMENT
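The slot-to-role mapping of the single-verb construction can also be written down as a small lookup table. The Python sketch below applies it to examples (3.2) and (3.8); the mapping itself follows the description above, whereas the names SINGLE_VERB_ROLES and annotate, and the segmentation of the examples into slots, are illustrative assumptions.

```python
# Slot-to-role mapping for the single-verb construction, as described above.
SINGLE_VERB_ROLES = {
    "SBJ": "AGENT",
    "VB": "CAUSED-SEPARATION",
    "OBJ": "THEME",
    "ADVP": "MANNER",
    "PREP OBJ": "INSTRUMENT",
}


def annotate(clause):
    """Pair each (slot, words) item of a clause with its semantic role."""
    return [(words, slot, SINGLE_VERB_ROLES[slot]) for slot, words in clause]


# Example (3.2): 'He snapped the twig with his lap.'
example_3_2 = [
    ("SBJ", "kʰǎw"),
    ("VB", "hàk"),
    ("OBJ", "kìŋmáj"),
    ("PREP OBJ", "dûaj nâːkʰǎː"),
]

# Example (3.8), first clause only: 'He slashed the twig several times'.
example_3_8 = [
    ("SBJ", "kʰǎw"),
    ("VB", "fan"),
    ("OBJ", "kìŋmáj"),
    ("ADVP", "lǎːj tʰiː"),
]

for words, slot, role in annotate(example_3_2):
    print(f"{words:<14}{slot:<10}{role}")
```

Neither example fills both optional slots at once, in line with the observation that the full pattern is rare in the data.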

This extensive mono-verbal clausal description containing a caused-separation verb along with manner and instrumental items is possible, but not frequently used in practice. Cases of the full construction are less than 1% of all the descriptions.

Instead, the most common description in single-verb structure is [(SBJ) VB OBJ], accounting for 33% of all the cases; in this minimal structure, the syntactic constituents of manners and instruments are not realised.

The usual absence of syntactic elements in manner and instrumental functions in descriptions using mono-verbal constructions has at least two implications. On the one hand, when speakers described any caused-separation event, relevant features of manners and instruments might not be commonly salient perceptually to them, and so were not coded formally. Such semantic features may be syntactically realised only when speakers consider that specified manner or instrumental perspectives are important and should be explicitly expressed to ensure clear descriptions of how caused-separation was carried out, or which instrument is responsible for an action. On the other hand, caused-separation verbs may generally entail or imply some manner and instrumental features. Relying on such semantic entailment or implication of verbs, speakers do not feel that they need to specify semantic features of manners or instruments explicitly unless other non-entailed or less conventionally implied information is necessary to mention.


The minimal description pattern [SBJ VB OBJ] is not seen as a dedicated construction for expressions of caused-separation since it is just a transitive construction commonly found in Thai.

3.2.2.2 Verbs of caused-separation in multi-verb construction in Thai

Verbs of caused-separation in this study did not occur only in the single-verb construction but were also found in multi-verb constructions. The working definition of multi-verb constructions in this study follows Enfield (2008) in that they refer to formally unmarked sequences or serialisation of verbs, where NP arguments may intervene between the verbs or follow them. The three basic constructions shown below were relatively common in the data. They are multi-verb clausal structures, where the verb of caused-separation is in boldface:

(1) (SBJ) VB1 OBJ1 VB2 OBJ2 (VB3)

(2) (SBJ) VB1 OBJ1 VB2 (PREP OBJ3)

(3) (SBJ) VB1 OBJ GIVE VB2

As for the NP arguments occurring in each of the structures, they consistently refer to specific semantic roles. The agent is mapped to the NP argument in the subject position, whereas the NP occurring immediately after the first-positioned verb can be the instrument, as in (1), or the theme, as in (2) and (3). The NP argument after the second-positioned verb in (1) is tied to the theme, with OBJ2 referring to the item undergoing caused-separation. The prepositional phrase in (2) can be either the prepositional reciprocal pronoun or the prepositional object phrase. As for the prepositional pronoun, it is /tɕàːk kan/ PREP:from RECP, indicating ‘from each other’, whose antecedent is considered to be the relevant theme separating ‘reciprocally’ into (two) parts. Take (3.4): the reciprocal pronoun refers to ‘the cloth’ being mutually torn apart. The prepositional object is resultative, showing change of state of the theme.

Also, all the verbs in the sequences in (1) – (3) consistently refer to certain actions or results of actions involved in events of caused-separation. In (1), VB1 comes from the closed set including /ʔaw/ ‘take’ and /tɕʰáj/ ‘use’, representing an action of ‘holding’ or ‘using’ a tool for caused-separation. At the same time, VB2 is tied to a causal action, which corresponds to the open set of caused-separation verbs. The clause-final verb, i.e., VB3, in (1), expresses the resulting separation attributable to the relevant causal action. In (2), VB1 represents an action expressed by a caused-separation verb, and VB2 commonly corresponds to PATH expressed by a directional path verb, e.g., /ʔɔ̀ːk/ ‘exit’, or to a resulting situation represented by a resulting separation verb. The structure (3) contains a grammaticalised causative verb /hâj/ ‘GIVE31’ preceding VB2, which consistently represents the resulting separation. The three patterns of multi-verb constructions are illustrated below:

(1) (SBJ)    VB1              OBJ1         VB2                 OBJ2    (VB3)
    |        |                |            |                   |       |
    AGENT    HOLDING/USING    INSTRUMENT   CAUSED-SEPARATION   THEME   RESULTING SEPARATION

(2) (SBJ)    VB1                 OBJ1    VB2                           PREP OBJ3
    |        |                   |       |                             |
    AGENT    CAUSED-SEPARATION   THEME   PATH / RESULTING SEPARATION   SEPARATE PARTS / RESULTATIVE

(3) (SBJ)    VB1                 OBJ     GIVE   VB2
    |        |                   |       |      |
    AGENT    CAUSED-SEPARATION   THEME   CAUS   RESULTING SEPARATION

31 The gloss of /hâj/ is given in small capital letters to specify its lexical source, i.e., the ‘give’ verb, while marking its grammaticalised status of losing verbal properties: e.g., loss of the semantics of ‘transporting objects’, and its acquisition of more grammatical properties: e.g., merging two structures of a causing and resulting situation into one. Also, note that a variant /tʰamhâj/ of /hâj/ was once used as well, thereby pointing to sporadicity given its uncommon use in the spoken language.

As the above configurations show, patterns of multi-verb structures are not dedicated constructions for expressing events of caused-separation since they are common grammatical means in Thai. These multi-verb structures are three different types of serialisation: sequential, resultative, and causative, as defined by Iwasaki and Ingkhaphirom (2005).

First, without VB3, the structure (1) becomes the same as (2) in that they are instances of sequential serialisation, which is a conjoined structure without an overt linker like /lɛ́ʔ/ ‘and’, describing “a series of events that happen one after another with respect to the same subject in sequential serialisation” (Iwasaki & Ingkhaphirom, 2005, p. 233). This means that an action of holding or taking an implement is considered to happen first in the series and is followed by an action of caused-separation.

Second, the structure (1) with the realisation of VB3 is seen as another hybrid type of serialisation in Thai: resultative serialisation, where the verb phrases have different subjects. Specifically, in (1), VB1 and VB2 sequentially express a tool-taking action and caused-separation. These components are conjoined as one situation, again without an overt linker. The actions are both predicated of the syntactic subject. In contrast, the syntactic object of VB2 is, in turn, the semantic subject of VB3, which represents a resulting situation. Iwasaki and Ingkhaphirom (2005, p. 239) highlight the nature of the resulting situation: it is what is highly expected from the preceding causing situation.

Last, the structure (3) is causative serialisation with a grammaticalised causative verb /hâj/ ‘GIVE’. This type of causative serialisation looks superficially similar to the resultative type since it comprises a situation with both cause and result.

In fact, Iwasaki and Ingkhaphirom (2005) suggest a distinction between these types.

Unlike the resultative, the causative does not require a result that is a natural and expected consequence of the relevant causal action, but merely a resulting situation that is caused accordingly by it. However, according to the elicited descriptions involving causative serialisation, such a definition does not seem to clearly distinguish this type from the resultative type since the results in causative serialisation are usually found to be naturally expected consequences: e.g., /kʰàːt/ ‘be.torn’ by /tàt/ ‘cut’. The relevant examples suggest a different distinction. What is discerned in this causative type, as distinct from the resultative counterpart, is the use of a grammaticalised verb /hâj/ ‘GIVE’ which specifies a causative-purposive sense for the construction. Causative serialisation with /hâj/ can be interpreted as purposive in that the result is not always realised, as against that of resultative serialisation, which always implies a realised consequence.

Looking more precisely at serialisation of caused-separation verbs and resulting separation verbs in (1) and (3), I summarise how verbs of caused-separation may occur serially in tandem with certain resulting separation verbs.


should apply. Otherwise, /kʰàːt/ should be used instead. For /tàt/, when the theme object is one-dimensional rigid, /hàk/ should be deployed. Otherwise, the resulting separation verb /kʰàːt/ is appropriate. However, it is not the intent of the present study to address verbs of resulting separation in Thai; further discussion on this topic is not pursued here.

In summary, this section has accounted for descriptions of caused-separation events in both mono-clausal and bi-clausal structures. The particular clause types where verbs of caused-separation were found to occur were both single-verb and multi-verb constructions. In single-verb constructions, verbs of caused-separation commonly occurred in the basic Thai transitive construction: [SBJ VB OBJ]. In multi-verb constructions, verbs of the relevant kind were able to be serialised with preceding verbs of ‘holding/using’, such as /tɕʰáj/ ‘use’, or with following path verbs or verbs of resulting separation. Consequently, an important finding for this stage of analysis is that no dedicated construction has been detected that specifically expresses caused-separation events in Thai. Instead, the events are described using more general Thai construction types. The section that follows moves on to consider how the extensional use of the individual verbs of caused-separation that were introduced in § 3.2.1 can facilitate the partitioning of the semantic domain in the Thai case.

3.3 Granularity of caused-separation categories in Thai

The previous section showed that a total of 24 verbs were used by Thai speakers to describe the 43 individual scenes of caused-separation events. Also, the verbs were shown to have different distributions in the descriptions. Thus, a range of the scenes was named with the same verbs. The distribution of these verb extensions provides a way to classify the different scenes according to semantic categories in the domain of caused-separation.

In this section, a cluster approach is followed (cf. Majid et al., 2007a-b).

Different scenes of caused-separation that were described across the Thai speakers using the same verbs are taken to be semantically similar to one another; those described with different verbs show less similarity; scenes that were never named with the same verbs should give rise to separate clusters. For example, Scene 49 CUT ROPE W/ KNIFE and Scene 56 CUT CLOTH W/ SCISSORS are both named with /tàt/ ‘cut’. This differs from Scene 57 SNAP CARROT BY HAND, which is described with /hàk/ ‘snap’. Through this type of distribution, semantic similarity among the scenes can be measured to determine which scenes are alike and which are not. Such measurements can then lead to a type of semantic categorisation in the caused-separation domain in Thai.
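As a rough illustration of this reasoning, the sketch below compares scenes by the sets of verbs used to describe them. The verb sets are invented for illustration and are not the elicited data; the function name `jaccard` and the dictionary `scene_verbs` are hypothetical. The overlap measure is the Jaccard index, i.e., the number of shared verbs divided by the number of verbs used for either scene.

```python
# Toy illustration: scenes are compared by the sets of verbs used for them.
# The verb sets below are hypothetical, not the elicited Thai data.

def jaccard(a: set, b: set) -> float:
    """Shared verbs divided by all verbs used for either scene."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

scene_verbs = {
    "S49 CUT ROPE W/ KNIFE": {"tàt", "fan"},
    "S56 CUT CLOTH W/ SCISSORS": {"tàt"},
    "S57 SNAP CARROT BY HAND": {"hàk"},
}

# Print the similarity of every pair of scenes.
names = list(scene_verbs)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        sim = jaccard(scene_verbs[first], scene_verbs[second])
        print(f"{first} ~ {second}: {sim:.2f}")
```

In this toy setting, the two cutting scenes share one of two verbs and so are half similar, whereas neither shares any verb with the snapping scene.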

I apply the statistical tool of cluster analysis to investigate the similarities of scenes by the verbs used by the different Thai speakers. To begin with, a scene-by-verb matrix is generated with reference to the 43 scenes (in rows) and 24 different Thai verbs (in columns). Subsequently, cluster analysis is applied to the matrix using Jaccard’s coefficient and between-group (average) linkage. This method clusters together scenes with homogeneous verb distributions: the more similar the verb occurrence patterns two scenes exhibit, the more similar the scenes are according to Jaccard’s coefficient.
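The procedure can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the software used in the study: the five scenes and their verb sets are invented, each scene is reduced to the set of verbs attested for it (a binary row of the scene-by-verb matrix), and the names `jaccard_distance` and `average_linkage` are hypothetical.

```python
from itertools import combinations

def jaccard_distance(a: set, b: set) -> float:
    """1 minus the Jaccard index; verbs used for neither scene play no role."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def average_linkage(scenes: dict, k: int) -> list:
    """Repeatedly merge the two closest clusters until k clusters remain."""
    clusters = [[name] for name in scenes]

    def dist(c1, c2):
        # Between-group (average) linkage: mean distance over all cross pairs.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(jaccard_distance(scenes[x], scenes[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index j is still valid
        del clusters[j]
    return clusters

# Hypothetical scenes (rows) with the verbs attested for each.
scene_verbs = {
    "A": {"tàt", "fan"},
    "B": {"tàt"},
    "C": {"tàt", "hàn"},
    "D": {"hàk"},
    "E": {"hàk"},
}

print(average_linkage(scene_verbs, 2))
```

Recording the order and height of the merges, rather than stopping at a fixed number of clusters, would yield the tree structure that a dendrogram displays.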

The multivariate statistical investigation results in a hierarchical clustering tree or dendrogram³² for all the scenes. This dendrogram potentially reveals the hierarchical category structure of the caused-separation event domain underlying the descriptions: e.g., the number of categories, category boundaries, the relationship between categories, and the generality of verb meanings. It points toward the semantic distinctions between categories. Note that Jaccard’s coefficient was chosen because it excludes joint absences from the analysis. Because this index is insensitive to zero-zero matches, two scenes were prevented from being clustered together solely because a particular verb was used for neither of them.

32 The exact placement of scenes within a cluster is not meaningful; the illustrated location of a scene in a cluster closer to a neighbouring cluster does not always represent more proximity than others in the cluster as such. The inclusion of all scenes in a cluster only means they are mutually similar (Majid, Evans, Gaby, & Levinson, 2011).

Additionally, to judge what clusters of scenes are of importance for discussion, I follow Vulchanova et al. (2012, p. 22) in that the membership in the subtrees is determined at least to some degree by the proportions of verbs occurring in the scene descriptions. Therefore, for each subtree category in this study, it makes sense to look into the proportion of verb usage that is meaningful (in terms of features).

Categories identified in this way meet at least three conditions. First, there is at least one verb that applies to all member scenes of a category (cf. Andics, 2012); the relatively large extensions of such verbs help partition the domain into broad and meaningful categories. Second, and correspondingly, these domain-partitioning verbs are the most frequent verbs in their individual categories, and the categories in turn cover a large proportion of all occurrences of the verbs. Vulchanova et al. (2012) consider this kind of predominant verb as representative of pertinent categories. Last, all member scenes of a category appear to have the same feature values that can be reasonably linked to the category’s representative verbs.
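The first two of these conditions lend themselves to a simple calculation. The sketch below is illustrative only: the token counts are invented, and the function and key names (`verb_profile`, `covers_all_scenes`, `share_of_category`, `share_of_verb`) are hypothetical rather than taken from the study.

```python
# Hypothetical token counts: counts[scene][verb] = number of descriptions
# of that scene containing the verb. These are not the elicited data.
counts = {
    "A": {"tàt": 6, "fan": 2},
    "B": {"tàt": 8},
    "C": {"tàt": 5, "hàn": 3},
    "D": {"hàk": 8},
}

def verb_profile(verb: str, category: list, counts: dict) -> dict:
    """Summarise how well a verb represents a candidate category of scenes."""
    in_cat = sum(counts[s].get(verb, 0) for s in category)
    cat_total = sum(sum(counts[s].values()) for s in category)
    verb_total = sum(c.get(verb, 0) for c in counts.values())
    return {
        # Condition 1: the verb is attested for every member scene.
        "covers_all_scenes": all(counts[s].get(verb, 0) > 0 for s in category),
        # Condition 2a: share of the category's descriptions containing the verb.
        "share_of_category": in_cat / cat_total if cat_total else 0.0,
        # Condition 2b: share of all the verb's occurrences inside the category.
        "share_of_verb": in_cat / verb_total if verb_total else 0.0,
    }

profile = verb_profile("tàt", ["A", "B", "C"], counts)
print(profile)
```

The third condition, shared feature values across member scenes, depends on a qualitative coding of the scenes and is not modelled here.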


Below I discuss underlying semantic categories (§ 3.3.1) and subcategories of each category (§ 3.3.2) in the domain of caused-separation events that are produced by the dendrogram generated for the Thai data. This is followed by a comparison of the individual categories, noting whether and how they differ in asymmetric hierarchical relationships (§ 3.3.3).

3.3.1 Caused-separation categories in Thai

3.3.1.1 Patterning of caused-separation categories by verbs in Thai

In Figure 3.2, the left side of the dendrogram shows that there are six high-level branching categories in the caused-separation event domain. This arrangement is based on the verb distributional pattern (on the right) in Thai. The category coverages are shown with an orange background in each pertinent verb column. Two relatively large categories, determined by /tàt/ ‘cut’ and /tʰúp/ ‘smash’ respectively, together cover 60.5% of all the scenes in this study. The four others, determined by the verbs /hàk/ ‘snap’, /tɕʰìːk/ ‘tear’, /tʰîm/ ‘stab’, and /krìːt/ ‘slit’, together account for about 23.2% of all the scenes.

In addition, note that there appear to be seven specialised scenes in the dendrogram: Scene 37 CHOP CARROT W/ AXE, Scene 51 CHOP LEMON W/ KNIFE, Scene 4 CHOP CLOTH W/ KNIFE, Scene 10 SLICE CARROT W/ KNIFE, Scene 15 SAW STICK, Scene 9 SLICE CARROT LENGTHWISE W/ KNIFE, and Scene 18 CUT FINGER ACCIDENTALLY. In actual elicitation responses, Thai speakers used specialised verbs to describe these scenes. They did not describe them with any of the more general partitioning verbs that distinguish the six large categories.

However, in later follow-up questioning, many of them, as well as other Thai native speakers, confirmed that six scenes could be properly named with the more general verb /tàt/ ‘cut’: Scenes 37, 51, 4, 10, 15, and 9. Consequently, the six special scenes were not considered as separate categories. As for Scene 18, it is considered an outlier³³ for the present study because it was exclusively labelled with /bàːt/ ‘cut’ and is unrelated to any of the established categories. Therefore, this scene is not discussed further.

In more specific terms, 19 scenes make up the first and largest category. These scenes, joined at the top of the dendrogram, are characterised by the verb /tàt/ ‘cut’, which was used to describe them all and accounts for 52.3% of the descriptions for the category. For this count, I have factored out the previously-mentioned six special scenes to limit consideration to the actual responses from fieldwork elicitation. These 19 scenes cover 100% of the /tàt/ ‘cut’ occurrences. Note that three scenes of the /tàt/-category, i.e., Scenes 43, 53, and 2, appear to be superficially grouped in the hierarchical clustering with another category (see the fifth category below). This is because other infrequent verbs, e.g., /tɕɔ̀ʔ/ ‘puncture’, were selected by some of the speakers. However, these scenes are still classified under the first category, given that /tàt/ ‘cut’ was predominantly used for them.

33 The exclusion of S18 CUT FINGER ACCIDENTALLY followed the prior MPI study’s methodology (Majid et al., 2007a). Again, the verbs /bàːt/ in Thai and /mut/ (see later in § 4.3.1.1) in Khmer are the dedicated verbs for the scene in this study. In other words, these verbs are unable to apply to actions commonly described by other verbs like /tàt/ in Thai or /kat/ in Khmer, pointing to the division of accidental versus intentional separation.


Figure 3.2. Hierarchical clustering of caused-separation scenes, based on corresponding verbs in Thai.


The second-largest category is structured around seven scenes that are considered similar to each other due to the same naming verb /tʰúp/ ‘smash’. This verb accounts for 82.4% of the descriptions for the category. The member scenes of the second category cover 100% of the verb occurrences. The third and fourth categories contain four scenes each. The predominant verbs in the scene descriptions in these categories are /hàk/ ‘snap’ (100%) and /tɕʰìːk/ ‘tear’ (55.2%), respectively. The third and fourth categories’ member scenes cover 100% of the /hàk/ and /tɕʰìːk/ occurrences, respectively. Finally, there are two single-scene categories. Scene 45 POKE HOLE IN CLOTH W/ TWIG was best characterised with /tʰîm/ ‘stab’, accounting for 45.5% of the descriptions for this category. Regarding the other scene, the Thai speakers employed /krìːt/ ‘slit’ for Scene 14 CUT MELON W/ KNIFE. This verb constitutes 62.5% of the descriptions for the last category. The fifth and sixth categories’ scenes cover 100% of the /tʰîm/ ‘stab’ and /krìːt/ ‘slit’ occurrences, respectively. Though these two categories are superficially grouped in the dendrogram closely with the first category, none of the consultants accepted the use of /tàt/ for the relevant scenes. The verbs /tʰîm/ and /krìːt/ were applied to them with high frequency. It is accordingly appropriate to partition them as separate categories.


Table 3.9

Summary of predominant Thai verbs in six caused-separation categories.

Domain: Caused-separation

Category (CAT)                                 I        II       III      IV         V        VI
Predominant verb (Vx)                          /tàt/    /tʰúp/   /hàk/    /tɕʰìːk/   /tʰîm/   /krìːt/
                                               ‘cut’    ‘smash’  ‘snap’   ‘tear’     ‘stab’   ‘slit’
No. of scenes                                  19       7        4        4          1        1
(% of 43 scenes)                               (44.2%)  (16.3%)  (9.3%)   (9.3%)     (2.3%)   (2.3%)
% of descriptions containing Vx in CAT         52.3%    82.4%    100%     55.2%      45.5%    62.5%
% of all occurrences of Vx in this research    100%     100%     100%     100%       100%     100%

Table 3.9 is a summary of predominant verbs for the six main dendrogram categories in the caused-separation domain in Thai. It shows the numbers of scenes in each category, the percentages of the categories’ descriptions containing the predominant verbs, and the percentages of all occurrences of each category’s predominant verb in this study.

As shown in Table 3.9, the caused-separation domain can be characterised by a set of six different Thai verbs. For the sake of convenience, the corresponding event categories will be referred to as the /tàt/-, /tʰúp/-, /hàk/-, /tɕʰìːk/-, /tʰîm/-, and /krìːt/-categories. The Thai semantic categorisation in the domain of caused-separation is also aligned with the frequency analysis (given in § 3.2.1) in that four of the six categories reflect the applications of four of the nine frequent verbs, i.e., /tàt/ ‘cut’, /tʰúp/ ‘smash’, /hàk/ ‘snap’, and /tɕʰìːk/ ‘tear’. The remaining two categories are discussed further in § 3.4. They are characterised by the two long-tailed verbs /tʰîm/ and /krìːt/, whose categories may involve specific salient features.


A further issue in interpreting Table 3.9 is that in the descriptions for most of the categories, there appear to be other caused-separation verbs in addition to the corresponding predominant verbs. Therefore, to some extent, the table represents the vocabulary resources for each category. Next, I discuss the extent to which relatively infrequent verbs relating to specific semantic categories of caused-separation events were used.

3.3.1.2 Lexical variation within caused-separation categories in Thai

As mentioned above, the six Thai verbs featured in Table 3.9 occurred predominantly in descriptions for the corresponding six categories of caused-separation; however, other less frequent verbs were also used by speakers to label most categories.

For the /tàt/-category, another nine less frequent verbs were used for some of the member scenes. In particular, trailing the predominant verb /tàt/ at a distance in frequency, /fan/ ‘hack’ accounts for 16.6% of the descriptions for the category, covering 11 of the category’s 19 scenes (57.9%). The verbs /sàp/ ‘hack’ and /hàn/ ‘cut’ stand at 8.6% and 9.3% of the category’s descriptions, extending over 7 (36.8%) and 5 (26.3%) of the 19 scenes, respectively. Of the descriptions for the category, 13.2% contain the sporadic verbs, i.e., /tɕaːm/ ‘strike’ (4%), /tɕɔ̀ʔ/ ‘puncture’ (3.3%), /tɕʰɯ̌an/ ‘slice’ (2%), /tɔ̀ːk/ ‘crack’ (2%), /tʰɛːŋ/ ‘stab’ (1.3%), and /katʰɔ́ʔ/ ‘crack’ (0.7%); these verbs also differ in their extensions in the /tàt/-category: 4 (21.1%), 2 (10.5%), 3 (15.8%), 2 (10.5%), 3 (15.8%), and 1 (5.3%) of the 19 scenes, respectively.

The /tʰúp/-category contains only three other infrequently used verbs, aside from the predominant verb /tʰúp/. Of this category’s descriptions, 7.8% involve /fan/ ‘hack’ while the same percentage applies to /sàp/ ‘hack’; these cover respectively three and one of the seven member scenes. A rare verb for the category is /lít/ ‘prune’, contained in 2% of the answers describing the category, covering only one scene.

Following the predominant verb /tɕʰìːk/ ‘tear’ for the /tɕʰìːk/-category, /dɯŋ/ ‘tug’ was employed in 34.5% of the descriptions for the category, covering two of the category’s four scenes. Other verbs contained in the naming answers include /dèt/ ‘pluck’ (6.9%) and /kratɕʰâːk/ ‘drag.off’ (3.4%). These two verbs account, respectively, for two and one of the category’s scenes. In total, then, three additional verbs were used relatively infrequently in the descriptions for this category, instead of the predominant item /tɕʰìːk/ ‘tear’.

The /tʰîm/- and /krìːt/-categories are each associated with only one scene: Scene 45 POKE HOLE IN CLOTH W/ TWIG and Scene 14 CUT MELON W/ KNIFE. Despite this limited distribution, other infrequent verbs were also used in the descriptions instead of the corresponding predominant verbs. Of the naming responses for the /tʰîm/-category, 18.2% contain /tɕɔ̀ʔ/ ‘puncture’, followed by /tʰɛːŋ/ ‘stab’ and /tɕîm/ ‘jab’ at the same percentage individually. As for the /krìːt/-category’s descriptions, they contain /tɕʰɯ̌an/ ‘slice’ (25%) and /tɕɔ̀ʔ/ ‘puncture’ (12.5%).

Note that, as suggested by the frequency analysis in Table 3.9, the /hàk/-category was not found described with other verbs of caused-separation apart from /hàk/ itself. This shows lexical homogeneity for the /hàk/ semantic category: an unusually high level of consistency across the Thai speakers in naming the caused-separation events in the category. Table 3.10 below summarises the lexical multiplicity observed in the six categories in the domain of caused-separation in Thai.


Table 3.10

Thai verb types and percentage of occurrences for six categories of caused-separation; verbs underlined are those appearing in more than one category.

Caused-separation category in Thai

/tàt/-             /tʰúp/-          /hàk/-        /tɕʰìːk/-            /tʰîm/-           /krìːt/-
/tàt/ 52.3%        /tʰúp/ 82.4%     /hàk/ 100%    /tɕʰìːk/ 55.2%       /tʰîm/ 45.5%      /krìːt/ 62.5%
/fan/ 16.6%        /fan/ 7.8%       -             /dɯŋ/ 34.5%          /tɕɔ̀ʔ/ 18.2%      /tɕʰɯ̌an/ 25%
/hàn/ 9.3%         /sàp/ 7.8%       -             /dèt/ 6.9%           /tʰɛːŋ/ 18.2%     /tɕɔ̀ʔ/ 12.5%
/sàp/ 8.6%         /lít/ 2%         -             /kratɕʰâːk/ 3.4%     /tɕîm/ 18.2%      -
/tɕaːm/ 4%         -                -             -                    -                 -
/tɕɔ̀ʔ/ 3.3%        -                -             -                    -                 -
/tʰɛːŋ/ 1.3%       -                -             -                    -                 -
/tɕʰɯ̌an/ 2%        -                -             -                    -                 -
/tɔ̀ːk/ 2%          -                -             -                    -                 -
/katʰɔ́ʔ/ 0.7%      -                -             -                    -                 -

Note: verb frequencies are expressed as percentages of occurrence in descriptions for each category.

At least three observations can be made from this table. One concerns the infrequent verbs for a given category, which occur much less often than the predominant verb.

This suggests that the Thai speakers mostly named all the categories with a limited set of caused-separation vocabulary, using other uncommonly occurring verbs only occasionally.

Another observation concerns the different numbers of verbs in each category and their varying distributions. These reflect naming consistency, i.e., the extent to which Thai speakers name some categories less consistently than others. For example, while the /tàt/-category reflects widespread variation in naming caused-separation events, as characterised by an extensive list of vocabulary alternates, the /hàk/-category was named unanimously with a single verb type across all of the Thai speakers in the study. Another case in point is the /tɕʰìːk/- and /tʰîm/-categories. Although they involve the same number of verbs (four each), they have different verb occurrence distributions. The infrequently used verbs of the /tʰîm/-category were used more evenly by the Thai speakers than those of the /tɕʰìːk/-category, in turn mirroring a higher diversity of potential lexical descriptors for the former category than for the latter. This observation is discussed further in § 3.4.3.
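One possible way to make “more evenly” concrete is normalised Shannon entropy. This measure is not used in the present study; it is introduced here only as an illustrative assumption, applied to the percentages of the infrequent verbs reported in Table 3.10. The function name `evenness` is hypothetical.

```python
import math

def evenness(percentages: list) -> float:
    """Normalised Shannon entropy: 1.0 means all verbs are used equally often."""
    total = sum(percentages)
    probs = [p / total for p in percentages if p > 0]
    if len(probs) < 2:
        return 0.0  # a single verb leaves no room for variation
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(probs))

# Percentages of the infrequent verbs only, as given in Table 3.10.
chiik_infrequent = [34.5, 6.9, 3.4]   # /dɯŋ/, /dèt/, /kratɕʰâːk/
thim_infrequent = [18.2, 18.2, 18.2]  # /tɕɔ̀ʔ/, /tʰɛːŋ/, /tɕîm/

print(f"/tɕʰìːk/-category: {evenness(chiik_infrequent):.2f}")
print(f"/tʰîm/-category:  {evenness(thim_infrequent):.2f}")
```

Under this measure, the three infrequent verbs of the /tʰîm/-category are maximally even, since they occur equally often, while those of the /tɕʰìːk/-category score lower because /dɯŋ/ dominates.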

The last observation is that there are five infrequent verbs of caused-separation in Thai involving more than one category. For example, /fan/ ‘hack’ was used to describe certain scenes of both the /tàt/- and the /tʰúp/-categories. This shows that some Thai semantic categories in the caused-separation domain may be understood as potentially overlapping, given these “double-faced” verb types. This issue is explored in § 3.4.2.
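The overlap noted here can be recovered mechanically from Table 3.10. The following sketch simply re-encodes the table's percentages (the variable names `table`, `categories_of`, `shared`, and `verb_types` are hypothetical) and lists the verbs attested in more than one category, together with the number of verb types per category.

```python
from collections import defaultdict

# Percentages from Table 3.10, keyed by category and then by verb.
table = {
    "tàt": {"tàt": 52.3, "fan": 16.6, "hàn": 9.3, "sàp": 8.6, "tɕaːm": 4.0,
            "tɕɔ̀ʔ": 3.3, "tʰɛːŋ": 1.3, "tɕʰɯ̌an": 2.0, "tɔ̀ːk": 2.0, "katʰɔ́ʔ": 0.7},
    "tʰúp": {"tʰúp": 82.4, "fan": 7.8, "sàp": 7.8, "lít": 2.0},
    "hàk": {"hàk": 100.0},
    "tɕʰìːk": {"tɕʰìːk": 55.2, "dɯŋ": 34.5, "dèt": 6.9, "kratɕʰâːk": 3.4},
    "tʰîm": {"tʰîm": 45.5, "tɕɔ̀ʔ": 18.2, "tʰɛːŋ": 18.2, "tɕîm": 18.2},
    "krìːt": {"krìːt": 62.5, "tɕʰɯ̌an": 25.0, "tɕɔ̀ʔ": 12.5},
}

# Invert the table: for each verb, collect the categories it occurs in.
categories_of = defaultdict(list)
for category, verbs in table.items():
    for verb in verbs:
        categories_of[verb].append(category)

# Verbs attested in more than one category.
shared = {verb: cats for verb, cats in categories_of.items() if len(cats) > 1}
# Number of distinct verb types per category.
verb_types = {category: len(verbs) for category, verbs in table.items()}

for verb, cats in shared.items():
    print(f"/{verb}/ occurs in: {', '.join(cats)}")
print(verb_types)
```

The same encoding also makes it easy to check that the percentages in each column sum to roughly 100%, allowing for rounding.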

What is more, referring back to Figure 3.2, some infrequently used verbs of caused-separation are associated with specific patterns across scenes in each individual category, in turn pointing to more specific subcategories. In the next section, I explore in more detail subcategories of this type, based on infrequent verb distributional patterns.

3.3.2 Partitioning of caused-separation categories into subcategories in Thai

As discussed above, some of the categories in the caused-separation event domain in Thai include other verbs used less frequently than the corresponding predominant verbs. Within each category, some of these uncommon verbs show certain patterns of occurrence. These patterns allow subcategorisation to be determined for the individual categories in a way similar to how the six categories were defined.

From Figure 3.2, potential subcategories in the hierarchical cluster structure of each caused-separation category are also evident as characterised by certain infrequent corresponding verbs. For convenience, the figure is repeated in slightly modified form in this section, breaking it down into five separate figures (Figure 3.3a – 3.3e) to facilitate discussion of caused-separation subpartitioning in Thai.

Figure 3.3a shows that the /tàt/-category can be characterised by what can be called special outsider scenes. These scenes relate to the occurrence patterns of /fan/ ‘hack’, /hàn/ ‘slice’, /pʰàː/ ‘hew’, and /tɕɔ̀ʔ/ ‘puncture’. These items subdivide the major category into four subcategories, covering respectively 90%, 86.4%, 100%, and 62.5% of all relevant occurrences in the descriptions. Although potentially occurring separately, several of these four subcategorising verbs were sometimes found in association. Specifically, some scenes were characterised by either /fan/ or /hàn/, and some by either /fan/ or /pʰàː/. This suggests a type of semantic overlap in the relevant subcategories. For convenience, these subcategories are referred to by the corresponding representative verbs, i.e., the /fan/-, /hàn/-, /pʰàː/-, and /tɕɔ̀ʔ/-subcategories.


Figure 3.3a. Cluster tree and verb frequency for /tàt/-category.


Furthermore, as Figure 3.3a shows, two subcategories within the /tàt/-category contain smaller subdivisions. First, the /fan/-subcategory may be subdivided with reference to the /sàp/ ‘chop’ and /tɕaːm/ ‘strike’ distributions. Specifically, although disparate in distribution, the occurrences of /sàp/ are patterned as various subclusters in the hierarchical structure, suggesting a /sàp/-subdivision inside the /fan/-subcategory. The verb /sàp/ for this subdivision represents 81% of all its occurrences in the descriptions. The other subdivision is more clearly structured around four scenes inside the /fan/-subcategory. It involves 100% of all the /tɕaːm/ occurrences, constituting a robust /tɕaːm/-subdivision. Note that some scenes were characterised with both /sàp/ and /tɕaːm/, showing that meanings of these verbs might sometimes intersect. Second, the /hàn/-subcategory likewise seems to have a subdivision characterised by /lɯ̂aj/ ‘saw’. Though involving only Scene 15 SAW STICK, the use of /lɯ̂aj/ for this scene precisely accounts for all its descriptions in this study, pointing to specificity of the /lɯ̂aj/-subdivision within the /hàn/-subcategory.

Figure 3.3b illustrates how the /tʰúp/-category is less hierarchically complex than the above /tàt/-category, with its four subcategories. By contrast, the /tʰúp/-category has only one subcategory, characterised by /fan/ ‘hack’. This subcategory is structured around a subcluster of three scenes, which cover 10% of the /fan/ occurrences in the descriptions. Additionally, as mentioned above, the verb /sàp/ forms a small subdivision within the /fan/-subcategory. This verb was also found labelling one of the three scenes named with /fan/ in the /tʰúp/-category.


Figure 3.3b. Cluster tree and verb frequency for /tʰúp/-category.

Figure 3.3c. Cluster tree and verb frequency for /hàk/-category.


Unlike the above two semantic categories in the domain of caused-separation events, the /hàk/-category does not have an internal hierarchy as shown above in Figure 3.3c. As a result, this category has the flattest structure in the semantic categorisation of the domain in Thai. This flatness suggests that the Thai speakers had no leeway to make alternative descriptions. The likely inference is that they were strongly influenced by perceptually salient features inherent in the scenes which make up the category.

Figure 3.3d displays the hierarchical structure of the /tɕʰìːk/-category with only one subcategory. This subcategory is characterised by /dɯŋ/ ‘tug’, covering 100% of this verb’s occurrences in the descriptions. This subcategory is structured around two scenes: Scene 35 BREAK YARN W/ INTENSITY, and Scene 38 BREAK YARN. Since the /dɯŋ/-subcategory was consistently named by the Thai speakers in describing these two scenes, perceptually salient characteristics undoubtedly featured prominently in the stimulus materials.

Figure 3.3e illustrates that, despite the small size with only two scenes in total, the /tʰîm/- and /krìːt/-categories are still seen as occupying positions in the hierarchical arrangement. This is because some other less frequently used verbs can be identified as finer-grained alternatives. Within the /tʰîm/-category, the three other infrequent verbs, i.e., /tɕɔ̀ʔ/ ‘puncture’, /tʰɛːŋ/ ‘stab’, and /tɕîm/ ‘jab’, constitute specific /tɕɔ̀ʔ/-, /tʰɛːŋ/-, and /tɕîm/-subcategories applying only to one scene. The category as a whole is represented by 25% of the /tɕɔ̀ʔ/ occurrences, 50% of the /tʰɛːŋ/ occurrences, and 100% of the /tɕîm/ occurrences. Likewise, in the /krìːt/-category, the /tɕʰɯ̌an/-subcategory can be seen as characterised by /tɕʰɯ̌an/ ‘slice’, 40% of whose occurrences in the descriptions are covered by the only scene of this category.


Figure 3.3d. Cluster tree and verb frequency for /tɕʰìːk/-category.

Figure 3.3e. Cluster tree and verb frequency for /tʰîm/- and /krìːt/-categories.


A methodological effect of the approach used here is that, compared to their counterparts in the other categories, the subcategories within the /tʰîm/- and /krìːt/-categories look less robust. The scarcity of the scenes involved in the display prevents us from seeing potential extensions or relationships with corresponding subclass-characterising verbs. In particular, according to the present results, it is impossible to tell whether the three subcategories assumed for the /tʰîm/-category can be subordinate to one another. The present study treats them as distinct subcategories; further research must be conducted to determine their relations more precisely.

To conclude this section, note that some of the caused-separation categories in Thai can have subcategories characterised by certain infrequent verbs occurring in the data for each category. The /tàt/-category is the most hierarchically complex. It is subcategorised into four branches, two of which can be even further subdivided. Most of the other categories contain one subcategory each, while the /tʰîm/-category contains three; all of them thus show relatively more straightforward hierarchical structures. In stark contrast, the /hàk/-category has no subcategory. This category has the flattest hierarchy, indicating a maximal degree of consistency among the Thai speakers in naming the relevant event scenes.

Furthermore, the use of infrequent verbs uncovers the likelihood that some categories of caused-separation may mutually overlap in meaning, as may some subcategories. These issues of degree of consistency in naming caused-separation, and possible overlaps in meaning between categories and subcategories, are further developed with more lexical-semantic accounts in § 3.4. Before that, the next section briefly describes asymmetric lexical density and granularity within the caused-separation event categories in Thai as reflected by different ranges of lexical resources and the fine-grained domain partitioning.


3.3.3 Asymmetric lexical resources and granularity in caused-separation categories in Thai

The number of lexical items and the fine-grained structures of each category, as discussed in §§ 3.3.1 and 3.3.2, also reflect the asymmetrical nature of lexical resources and degrees of granularity within the caused-separation event domain.

Figure 3.4 illustrates the lexical variation across the different categories in the caused-separation event domain in Thai, based on Table 3.10. We can see that the /tàt/-category is notable for the considerable number of verbs that are potential alternatives. If the unusual scenes associated with this category were also taken into account, two more verb types, i.e., /pʰàː/ ‘hew’ and /lɯ̂aj/ ‘saw’, would increase the already high lexical density even more. By contrast, each of the other categories contains fewer than half of the number of verbs attested in the /tàt/-category. The /tʰúp/-, /tɕʰìːk/-, /tʰîm/-, /krìːt/-, and /hàk/-categories involve only four, four, four, three, and one verb, respectively. The relative availability of lexical resources for the individual categories accordingly reveals lexical asymmetry in the caused-separation event domain.


Figure 3.4. Lexical density within six categories of caused-separation in Thai.

The large number of verbs associated with the /tàt/-category displays a higher level of linguistic elaboration of the category. In contrast, the smaller number of verbs involved in the other categories suggests fewer distinctions. Specifically, unless all the verbs in each category were merely synonymous, they can be taken to show how different scenes prompted different verbs in line with semantic classification in Thai. The insights into the hierarchical structure of each caused-separation category in the language, as stated in § 3.3.2, seem to resonate with the above assumption from the lexical perspective. Figure 3.5 shows a simplified version of hierarchical structuring for the categories and subcategories of caused-separation events in Thai.


Figure 3.5. Simplified hierarchical structure of semantic domain “caused-separation” in Thai; the symbol “-” marks the verbs to specify the relevant categories, subcategories, and subdivisions.

The figure shows that six categories are inferred in the caused-separation domain as investigated here. The /tàt/-category shows the deepest hierarchy and finest-grained distinctions among those categories, breaking down into four different subcategories. Further subdivisions apply in some subcategories. The subcategories are characterised by /fan/ ‘hack’, /hàn/ ‘slice’, /pʰàː/ ‘hew’, and /tɕɔ̀ʔ/ ‘puncture’; two of them, /fan/ and /hàn/, can be subdivided further as determined by the applicability of /sàp/ ‘chop’ and /tɕaːm/ ‘strike’, and /lɯ̂aj/ ‘saw’ as indicated in the figure. The four smaller categories include either one or three subcategories each, with no finer-grained subdivision. Specifically, the /tʰîm/-category contains the /tɕɔ̀ʔ/-, /tʰɛːŋ/-, and /tɕîm/-subcategories. The /tʰúp/-category is with the /fan/-subcategory, the /tɕʰìːk/-category with the /dɯŋ/-subcategory, and the /krìːt/-category with the /tɕʰɯ̌an/-subcategory. The only one with no subdistinction is the /hàk/-category, showing the flattest hierarchy in the domain. The differences in the depth of hierarchical structures and the subcategorisation of some categories point to asymmetry in granularity. Specifically, the data show that Thai draws finer-grained distinctions in certain categories, while it subclassifies other categories more coarsely.

Taken together, the categorisation of the caused-separation domain in Thai reveals lexical asymmetry and asymmetric granularity between the categories. Also, the two forms of asymmetry seem to be aligned in that the larger the number of verbs for a given category, the finer-grained the distinction in that category.

3.3.4 Summary

Analysis of Thai speakers’ descriptions of the caused-separation scenes results in recognition of six different categories in the lexico-semantic domain. These categories are labelled here after the predominant verbs that characterise them: the /tàt/-, /tʰúp/-, /hàk/-, /tɕʰìːk/-, /tʰîm/-, and /krìːt/-categories. Each of these categories has been found to contain a specific number of vocabulary items. The /tàt/-category shows the greatest variety of verbs used for the member scenes. In contrast, the /hàk/-category is maximally economical, with only one verb type. As for other infrequently used verbs relating to the categories, their occurrence patterns helped identify the underlying hierarchical structures. Certain subcategories and small subdivisions have been revealed under some categories. The /tàt/-category comprises the hierarchically deepest and finest-grained distinctions with four subcategories, two of which have further small subdivisions. The other categories involve only one or three subcategories, except for the /hàk/-category, which has none and thus the flattest hierarchy. The differences in numbers of lexical items and granularity of partitioning relevant to each of the categories appear to display asymmetry in lexical density and subtle (fine-grained) distinctions in the caused-separation event domain in Thai.

3.4 Boundary locations of caused-separation categories in Thai

This section examines relevant semantic distinctions between the categories in the caused-separation event domain in Thai. Criteria are discussed that establish the semantic boundaries of the categories and aid in determining the semantic relationship between them. The section also provides semantic analysis of certain issues left unresolved from the preceding section: for example, potential overlaps in meaning between categories and subcategories, and consistency in scene naming viewed through a semantic lens. In § 3.4.1, I start with both the internal and external relations of the categories in the domain, including the groupings of caused-separation scenes, the core versus peripheral scene groups within the categories, and the overlaps between the categories or subcategories. § 3.4.2 explains the Diversity Index for scenes in the domain to discuss (in)consistency in naming different scenes, with a central versus peripheral parameter applied to categories with and without indeterminacy. I give a summary in § 3.4.3.

3.4.1 Placement of caused-separation category boundaries in Thai

In this subsection I consider the extent to which semantically meaningful categories are related to one another within the domain of caused-separation events in Thai (§ 3.4.1.1). Additionally, as mentioned in § 3.3, overlaps between certain categories and subcategories are further examined here (§ 3.4.1.2). Regarding this issue, the first line of my analysis is to identify “grey zones” or overlapping areas according to the verb distributional patterns. Then, I briefly explain how such subcategories intersect with one another in the same categories. After that, I determine centrality versus periphery within certain categories, specifying marginal scenes that may induce category overlaps. This provides semantic accounts for the presence of overlapping areas between the categories.

3.4.1.1 Grouping of caused-separation events in Thai

The dendrogram in Figure 3.2 visualises the hierarchical clustering of caused-separation events in Thai examined in this study. Thai groups events of caused-separation in accordance with the generalised patterns present in many other different languages of the world (Majid et al., 2008). Yet, language-specific variation in the grouping of events is still apparent in this language. The understanding of how the speakers of Thai grouped and distinguished between different events of caused-separation accordingly illuminates the relationship of event-type categories in this semantic domain in Thai.
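The kind of clustering that underlies a dendrogram such as Figure 3.2 can be sketched as follows. This is a minimal illustration only: it assumes that each scene is represented by the set of verbs used to describe it, that dissimilarity is Jaccard distance, and that clusters are merged by average linkage. The scene names, the verb assignments, and the function names are invented for the example and are not the data or the software used in the study.

```python
from itertools import combinations

# Hypothetical input: scene -> set of verbs elicited for it (not the thesis data).
scene_verbs = {
    "cut_rope": {"tàt"},
    "cut_cloth": {"tàt"},
    "chop_stick": {"tàt", "fan"},
    "smash_plate": {"tʰúp"},
    "smash_pot": {"tʰúp"},
    "snap_twig": {"hàk"},
}

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|: 0 for identical verb sets, 1 for disjoint ones."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def average_linkage(items, dist, n_clusters):
    """Repeatedly merge the two clusters with the smallest mean pairwise
    distance until only n_clusters remain."""
    clusters = [frozenset([name]) for name in items]

    def mean_dist(pair):
        c1, c2 = pair
        total = sum(dist(items[x], items[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=mean_dist)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters

clusters = average_linkage(scene_verbs, jaccard_distance, n_clusters=3)
for cluster in clusters:
    print(sorted(cluster))
```

With this toy input the three remaining clusters are the cutting and chopping scenes, the two smashing scenes, and the single snapping scene. Stopping at a fixed number of clusters is a simplification; a full dendrogram would record every merge and its height.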

The Thai speakers in the study were found to distinguish events of caused-separation according to the predictability of locus of separation in the acted-upon objects (cf. Majid et al., 2004, 2008). Looking specifically at scenes that involve the use of an instrument, there seems to be a quite secure boundary between those in which the location of separation cannot be determined because of the potential involvement of multiple resulting fractures and those where the location of separation is predictable with great accuracy. The former scene set includes Scene 39 SMASH FLOWER POT W/ HAMMER, Scene 40 SMASH PLATE W/ HAMMER, and Scene 31 SMASH STICK W/ HAMMER, contained entirely in the /tʰúp/-category, whereas the latter set involves scenes like Scene 49 CUT ROPE W/ KNIFE (i.e., typical cutting) or Scene 15 SAW STICK (i.e., typical sawing) from the /tàt/-category. None of the speakers ever named scenes from the two different categories with the same verb. Events that were non-instrument-assisted showed a similar pattern. Those with low predictability involved a hand action: e.g., where the agent grabbed a long and rigid object then exerted force on its end until it broke somewhere between the two hands (such as Scene 25 SNAP TWIG). These events were not labelled with the same verbs as those with relatively intermediate predictability, such as Scene 1 TEAR CLOTH.

However, as Majid et al. (2008) emphasise, the notion of predictability does not restrict focus only to a checklist of features of an event, but also requires attention to how such features relate logically to one another. In other words, the degree of predictability depends on the interrelations of features inherent in the relevant event. For example, a strike of a hammer upon a rigid acted-upon object is commonly consistent with low predictability of location of separation given the non-bladed, blunt instrument, the ballistic motion, and the intensity. Events with these features were regularly described with a verb different from that for events with high predictability. However, for some speakers, the same implement and manner applied to a different object type, e.g., a non-rigid rope, may involve a relatively higher level of predictability since the rope separates in one location where the hammer made contact, or one place next to it. Some speakers then labelled an event like Scene 50 CHOP ROPE W/ HAMMER with /tʰúp/ ‘smash’, which was predominantly used for scenes with low predictability, while others applied /fan/ ‘hack’, a verb for scenes with high predictability.

Next, we move on to another pattern of grouping caused-separation events that is similar to the cross-linguistic partitioning, i.e., the snapping versus smashing distinction (see Dimension 3 in Majid et al., 2008). The Thai speakers distinguished rigidly between these two categories: snapping events were clustered as the /hàk/-category, involving one-dimensional rigid objects separated into two pieces by applying pressure to the ends of the objects (Scenes 25, 57, and 19) or smacking them over the knee (Scene 5). Smashing events (Scenes 39, 40, 31 and 21) are grouped into the /tʰúp/-category, which reflects rigid objects fragmented into many pieces by a blow of a hammer. None of the snapping scenes was named with the same verb as any of the smashing scenes.

In accordance with Majid et al.’s (2008) cross-linguistic observation, Thai sets one scene apart from the others: Scene 45 POKE HOLE IN CLOTH, an intermediate predictability event. However, the Thai speakers did not use a dedicated verb for this small partition. They used the verbs applied to this scene for other clips as well. For example, the verb /tɕɔ̀ʔ/ ‘puncture’ was used for Scene 45 and for Scene 43, where a person used a sharp-pointed chisel to divide a carrot. Although Scene 43 does not involve partial separation producing a hole, unlike Scene 45, it apparently contains some other features similar to the event of poking a hole: for example, the use of a pointed instrument, or a (vertical) downward blow of the instrument.

The above similarities to the cross-linguistic patterns notwithstanding, Thai also shows language-specificities in grouping events of caused-separation.

Not only does Thai assign some events with intermediate predictability of separation location (e.g., Scene 45 above) to a distinct category as the majority (or first) method, but it also lumps some of them with those of either high or low predictability (as the second and third methods, respectively). According to the second method, the speakers of Thai followed the cross-linguistic trend in that they used /tàt/ ‘cut’, the most frequent verb ordinarily used for high-predictability events of cutting, to describe scenes that express intermediate-predictability karate-chopping and typical chopping by use of sharp-bladed tools, showing that these scenes were generally lumped together. However, an exception was Scene 32 CUT CARROT W/ HAND USED KNIFE-LIKE. In this study, Scene 32 was lumped with the low predictability cluster, using /tʰúp/ ‘smash’, therefore suggesting the application of the third method.

The split of this scene and the others relating to karate-chopping reflects, on the one hand, the significance of immediacy for events of this type in Thai. They may be sometimes clustered with events with high predictability. On the other hand, the manner of karate-chopping may become less salient to speakers when some other features become more important. Scene 32 is the only karate-chopping scene showing a two-dimensional rigid theme object, i.e., a carrot. Others contain one-dimensional rigid (i.e., a stick) or flexible (i.e., a rope) and two-dimensional flexible objects (i.e., stretched cloth), which each involve a separation in one place. For Scene 32, object type may play a more decisive role.

The Thai speakers also distinguish one scene of slitting or making an incision on a melon (i.e., Scene 14) from other events of cutting. Specifically, the scene features the agent placing a sharp blade on the surface of a melon then pressing the blade into the melon’s flesh merely to cause a slit on the melon—note that the melon was not cut off into parts. This phenomenon was not recognised as a categorisation pattern in Majid et al. (2008).


The grouping of tearing events in Thai is another difference from the cross-linguistic partitioning in the caused-separation domain. Specifically, in Thai Scenes 1 TEAR CLOTH INTO TWO PIECES and 36 TEAR CLOTH HALF-WAY did not form a single event type as suggested by Majid et al. (2008). There is no dedicated verb for these scenes. In this study, the verb /tɕʰìːk/ ‘tear’ used for tearing cloth was extended to pulling yarn apart (i.e., Scene 35 BREAK YARN INTO PIECES and Scene 38 BREAK SINGLE PIECE OFF YARN). The scenes of breaking yarn were also described with a dedicated verb /dɯŋ/ ‘tug’, thereby forming a subcluster within the lump.

Not predicted by Majid et al.’s (2008) cross-linguistic patterning of caused-separation, Thai seems to observe a different distinctive notion: an instrument versus non-instrument distinction. On overall inspection, events that contain any kind of instrument were never described similarly to those without the use of a tool or caused by a karate hand. Correspondingly, no overlapping uses of any verb types for scenes of these two kinds are present, suggesting a definite category boundary between them. In Figure 3.2, we see scenes with a knife, a chisel, a hammer, etc. clustered at the top and the mid-region of the dendrogram, separate from scenes with the bare hands to snap, tear, or pull a theme object apart.
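The claim that no verb type is shared across the two kinds of scenes amounts to a simple set-disjointness check, which might be sketched as below. The scene labels and the manipulation tags are hypothetical, and the verb assignments only loosely echo examples mentioned in the text; the structure of the `responses` mapping is an assumption made for illustration.

```python
# Hypothetical responses: scene -> (manipulation type, verbs elicited for the scene).
responses = {
    "cut_rope_w_knife": ("instrument", {"tàt"}),
    "smash_plate_w_hammer": ("instrument", {"tʰúp"}),
    "chop_rope_w_hammer": ("instrument", {"tʰúp", "fan"}),
    "tear_cloth": ("manual", {"tɕʰìːk"}),
    "snap_twig": ("manual", {"hàk"}),
}

def verbs_by_manipulation(data):
    """Pool all verbs used for instrument scenes and for manual scenes."""
    grouped = {}
    for manipulation, verbs in data.values():
        grouped.setdefault(manipulation, set()).update(verbs)
    return grouped

grouped = verbs_by_manipulation(responses)

# A categorical boundary means the two pooled verb sets share no member.
shared = grouped["instrument"] & grouped["manual"]
boundary_is_categorical = not shared
print(shared, boundary_is_categorical)
```

An empty intersection is consistent with a strict boundary; any verb appearing in `shared` would point to a scene pair that crosses it.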

The predictability of locus of separation can be cast as controlling boundaries between events of caused-separation in Thai only when we first take account of instrument and manual manipulations. This means that though the locus predictability would be still important in the event grouping, the instrument versus non-instrument distinction needs to be more strongly emphasised as the first step towards defining category boundaries in this domain. Thai has a set of verbs for instrument-assisted caused-separation events, such as those of cutting or smashing, and another distinct set for events of separation caused by manual treatments like those of tearing or snapping. However, the notion of this distinction is, though important, rather non-representational since no general verbs for all scenes with an assisting tool and for those with a hand action are available in Thai, and the distinction can be accentuated only when exposed by the visualisation of this domain’s partitioning (see Table 3.11).

To sum up, category boundaries for Thai in the caused-separation event domain are shown to be both similar to and different from the cross-linguistic generalisations (Majid et al., 2008). The speakers of Thai in this study consistently made splits between events with and without the use of an instrument. For these speakers, the /tàt/-, /tʰúp/-, /krìːt/-, and /tʰîm/-categories are never grouped with the /hàk/- and /tɕʰìːk/-categories. Significantly, this distinction was not observed as a distinguishing dimension in Majid et al.’s (2008) cross-linguistic study. On closer inspection, the Thai speakers in this study did observe the cross-linguistic notion of the predictability of location of separation. Events high in locus predictability were not labelled with the same verbs as those with low predictability; some exceptional cases are also observed, but infrequently. The speakers complied with the solid category boundary between the /hàk/- and /tʰúp/-categories (snapping versus smashing events, respectively), since no sharing of verbs across the categories was observed. An event of piercing or poking a hole in the cloth (i.e., Scene 45) was found to produce another pattern that Thai appears to share with the cross-linguistic tendency. However, more like English and Kuuk Thaayorre (Majid et al., 2008, p. 243), the speakers of Thai applied the verb for this scene more widely as well. Unlike what Majid et al. found, the Thai speakers formed a distinct category for slitting, detached from general cutting. They did not organise a separate group for tearing events but sorted them along with a category of another event type: pulling-apart. As for events of karate-chopping and chopping, which Majid et al. consider to have intermediate predictability of locus of separation, they were lumped together by the speakers of Thai with events whose predictability is high, i.e., events of cutting or sawing.

Table 3.11 summarises in simplified form the placement of the category boundaries in the caused-separation domain in Thai based on the discussion of distinctive characteristics in this section. Boundary locations represent the hierarchical organisation of the categories in the domain. Note that the simplification here involves reduction of information. The table omits details of semantic subcategories and small semantic divisions and does not indicate overlapping regions within the domain; the latter are discussed in more depth in § 3.4.1.2.


Table 3.11

Simplified placement of category boundaries in Thai caused-separation domain.

First distinction: Type of manipulation
    Instrument: Slitting, Sawing, Cutting, Chopping, Karate-chopping, Piercing, Smashing
    Manual: Tearing, Snapping

Second distinction: Predictability of locus of separation
    High/intermediate: Slitting, Sawing, Cutting, Chopping, Karate-chopping
    Intermediate/low: Piercing, Smashing
    Intermediate: Tearing
    Low: Snapping

Third distinction: Instrument type or object type
    Sharp/pointed: Slitting, Sawing, Cutting, Chopping, Karate-chopping
    Pointed: Piercing
    Blunt: Smashing
    Flexible: Tearing
    Rigid: Snapping

Fourth distinction: Blade placement on object surface
    Placement: Slitting
    No placement: Sawing, Cutting, Chopping, Karate-chopping

Resulting categories
    /krìːt/-category: Slitting
    /tàt/-category: Sawing, Cutting, Chopping, Karate-chopping
    /tʰîm/-category: Piercing
    /tʰúp/-category: Smashing
    /tɕʰìːk/-category: Tearing
    /hàk/-category: Snapping

3.4.1.2 Overlaps between caused-separation categories in Thai

As mentioned above, the Thai verb distributional patterns reveal category and subcategory overlaps in the caused-separation domain. Below I discuss the extent to which overlapping areas or “grey zones” can be discerned between some semantic categories and subcategories. A rationale is presented, based on semantic features associated with the areas where boundaries are covered by other boundaries and where they are not.

Extensions of relevant class-defining verbs in Thai suggest that in some of the six major categories, subcategory boundaries intersect with one another. In the /tàt/-category, the /fan/-subcategory partially overlaps with the /pʰàː/-subcategory; in the /tʰîm/-category, the /tɕɔ̀ʔ/-, /tʰɛːŋ/-, and /tɕîm/-subcategories mutually overlap.

According to the verb extensions, there appear to be two overlapping areas between the six categories. The first area lies between the /tàt/-category and the /tʰúp/-category, while the second is found between the /tàt/-category and the /tʰîm/-category.

Below, I first examine the overlapping areas between these subcategories before moving on to consider boundaries between the categories.

The overlap between the /fan/-subcategory and the /pʰàː/-subcategory arose because Scene 37 CHOP CARROT LENGTHWISE W/ AXE and Scene 51 CHOP MELON W/ KNIFE were characterised by /fan/ ‘hack’ or by /pʰàː/ ‘hew’. In the first place, these two scenes can be considered specialised scenes. They can subsequently be grouped into the /tàt/-category (see § 3.3.1.1). Though the relevant scenes were not named with /tàt/ ‘cut’ in the present study’s responses, native speakers of Thai consulted later confirmed as acceptable the use of the verb /tàt/ ‘cut’ for these scenes. This allows their inclusion into the category for broader analytic purposes. The relevant scenes share the most salient features found in other scenes of the /tàt/-category: the actions are accomplished through sharp-bladed/pointed instruments (see Table 3.11). They could accordingly be described with other infrequently used verbs. Such verbs still portray the most prominent semantic feature, while specifying other detailed event attributes. In these cases, for the speakers in the study, capturing the feature of a forceful striking motion found across the scenes must have been salient, so they apparently preferred using /fan/ ‘hack’ to represent this forceful-impact nuance. The speakers who focussed on the feature of lengthwise separation instead chose /pʰàː/ ‘hew’ for their description.


Likewise, overlaps among the three subcategories within the /tʰîm/-category were attributable to the fact that Scene 45 was characterised by one of the verbs: /tɕɔ̀ʔ/ ‘puncture’, /tʰɛːŋ/ ‘stab’, or /tɕîm/ ‘jab’. The scene contains the prominent features of the category: the action is performed (1) with intensity; (2) by a pointed instrument. Consequently, three infrequent verbs are eligible for descriptions for the category. These verbs express both of the required salient features and also some additional specific feature. In effect, the Thai speakers using these alternate forms made different interpretations of the same scene and represented them using different verb types. For example, certain speakers chose to specify the event interpretation of downward blow of the pointed implement (that regularly implies a resulting hole); for this they preferred the use of /tɕɔ̀ʔ/ ‘puncture’. By contrast, other speakers instead chose to particularise the manner of pressing the pointed end of the tool against the theme object’s surface; they then applied the verb /tɕîm/ ‘jab’.

These subcategory overlaps introduce at least two generalisations relating to the semantics of verbs and semantic reasons for subcategory overlaps. First, verbs that exhibit a common subcategory pattern within a category show overlap in meaning as they need to share at least the most salient and important category-characterising feature(s), while capturing some additional specific meanings. Second, a subcategory overlap is considered to be analytically valid since different speakers are argued to have different construals of the same scenes, focussing on different distinct characteristics. They have then chosen different verbs that semantically specify such particularities, while still reflecting a common feature in the scene descriptions. The extension of different verbs for the same scenes thus induces overlapping subcategory boundaries.


Not only do certain subcategory boundaries cover part of the same areas, but some category partitions are also in overlapping relation. Such overlaps are attributed to the verb occurrence patterns that account for subcategories in accordance with the model used in this study. Specifically, verbs for subcategories within a category can exhibit a pattern of extension from one category to another. Given this, certain categories are depicted as spanning across each other. In this study, /fan/ ‘hack’ patterns as a subcategory in both the /tàt/- and /tʰúp/-categories, while /tɕɔ̀ʔ/ ‘puncture’ assigns a subcategory each to the /tàt/- and the /tʰîm/-categories.

Certain predominant verbs appear to be associated with well-imaged features that the model treats as prominent or demarcating characteristics. Although different categories capture a range of prominent features, some scenes included in a given category can reflect incompatibility and divergence with regard to other scenes in the same category. Features of the latter divergent kind tend to make those scenes negotiable or construable differently both within and across the Thai speakers.

Because of varying interpretations of the scenes, verbs describing a negotiable subcategory could differ, despite ordinary compliance with major category divisions.

Individual speakers might perceive or interpret a scene in a particular way and produce the scene’s description to reflect that interpretation.

To highlight the function of divergent characteristics in prompting, or at least enabling, overlaps between the categories, I apply the notion of “core” and “periphery” to relevant categories, following Vulchanova et al. (2012). This distinction relates to determining how “outer” scenes contain characteristics divergent from features manifested by “central” scenes of a category. Is it possible to prove that these peripheral scenes normally lie in an overlapping area between categories, justifying feature divergence as the condition for category overlaps?


Let us start with the /tàt/-category. The analysis discussed above determines four core scenes, i.e., Scene 49 CUT ROPE W/ KNIFE, Scene 56 CUT CLOTH W/ SCISSORS, Scene 27 CUT HAIR W/ SCISSORS, and Scene 24 CUT ROPE W/ SCISSORS. These are clustered with the shortest branches as shown in Figure 3.3a; they are consistently characterised by the predominant verb /tàt/ ‘cut’. As these scenes are seen as more prototypical than others potentially included in the category, their common characteristics can be taken as representative of the core category. The four scenes emphasise the feature generalised above for the /tàt/-category as containing the actions of dividing objects into parts using sharp-bladed instruments, i.e., scissors (Scenes 56, 27, and 24) and a knife (Scene 49). Also, considering this feature in relation to the theme objects, i.e., rope (Scenes 49 and 24), cloth (Scene 56), and hair (Scene 27), the predictability of location of separation is recognised as relatively high.

According to the clustering analysis (Figure 3.3a), we may identify two sets of peripheral scenes of this /tàt/-category, since these scene sets join the core scenes from distant subtrees. One of these includes three scenes of chopping and karate-chopping (Scenes 34, 61, and 6), from the top end of the dendrogram. These scenes together express the use of sharp-bladed instruments like the core but with blows of the instruments. They are characterised by an intermediate level of predictability. Unlike the core scenes, this peripheral set features ballistic motion of the implements. Aside from /tàt/ ‘cut’, the relevant events were described with two other verbs, i.e., /fan/ ‘hack’, and /sàp/ ‘chop’, constituting the /fan/-subcategory.

The other set of peripheral scenes contains three scenes from the bottom part of the category cluster. They are Scenes 43, 53, and 2, each of which portrays a blow of a sharp-pointed implement such as a chisel. With intermediate predictability, the blow causes full separation of different theme objects: a carrot (Scene 43), a stick (Scene 53), and a stretched rope (Scene 2). The scenes were characterised by /tɕɔ̀ʔ/ ‘puncture’, as an alternative to /tàt/ ‘cut’, as well as other sporadic verbs which are then included in the /tɕɔ̀ʔ/-subcategory. Comparing the features of the periphery to the core, we can generalise by observing that the peripheral features are divergent in emphasising the striking motion of the tools used and the level of predictability associated with the effect of particular motion.

Next, we turn to the /tʰúp/-category. In line with the preceding analysis, three core scenes in this category establish the predominant /tʰúp/ ‘smash’ characteristic that models the category. The shortest branches of the cluster show a consistent instrumental characterisation. The three scenes conjointly express events featuring the blows of blunt instruments such as hammers, resulting in multiple fragments. Taking into account the relationship of these characteristics, the scenes reflect low predictability of locus of separation. The three peripheral scenes appear appended to the core from the distant subcluster at the bottom end of the category. They are Scenes 23, 50, and 32, showing the use of blunt implements, i.e., hammers, except for the last scene. Scene 32, by contrast to the others, involves a knife-like hand making a karate-chop. For some speakers in the study, this hand shape was perhaps construed as a blunt-edged tool. As a result, Scene 32 was distantly grouped with the above two scenes. Note that all three of the peripheral scenes depict the blow of instruments or an instrument-like ‘knife hand’ upon the stretched cloth (Scene 23), the stretched rope (Scene 50), and the carrot (Scene 32), which suggests separation in one place rather than several resulting fragments. The actions displayed by the scenes are associated with relatively intermediate locus predictability, thus supporting a modest core/periphery contrast for this category.


The /tʰîm/-category does not show a demarcation between core and periphery due to its limitation to a single scene member. However, it is worth repeating here for further discussion that the category involves an event with intermediate locus predictability that features partial separation of cloth caused by the vertically downward blow of a pointed twig. This single scene was predominantly described with /tʰîm/ ‘stab’ that patterns the category. Other less frequent alternatives were /tɕɔ̀ʔ/ ‘puncture’, /tʰɛːŋ/ ‘stab’, and /tɕîm/ ‘jab’, taken to model subcategories.

Now, looking back at the overlaps between the /tàt/- and the /tʰúp/-categories, and between the /tàt/- and /tʰîm/-categories, we note that for the former pair, a /fan/-subcategory induced the overlap between the categories, appearing as included in the peripheral scenes of both categories. Similarly, the latter case shows that within the /tàt/-category, the /tɕɔ̀ʔ/-subcategory that initiated overlap with the /tʰîm/-category also applies to the peripheral scenes. Accordingly, since these peripheral scenes are argued to contain divergent features, we may surmise that there is a connection between divergent features and the overlapping areas between the categories. In other words, I argue for feature divergence of certain scenes within a category as the threshold for inducing a category overlap. Specifically, the characteristics of the peripheral scenes characterised by /fan/ ‘hack’ within the /tàt/-category are divergent, while corresponding in some sense to those of the peripheral scenes within the /tʰúp/-category. What the two peripheral cases have in common is both the blows of the tools and the intermediate level of locus predictability. These factors are opposed, to some extent, to those characterising the core of each category. As for the divergent features of the peripheral scenes characterised by /tɕɔ̀ʔ/ ‘puncture’ within the /tàt/-category, they are closer to those of the /tʰîm/-category through the perspective of the /tɕɔ̀ʔ/-subcategory. The pertinent peripheral event features involve a downward blow of a pointed instrument with a slight difference potentially involved in whether the resulting separation is full or partial.

Not only do peripheral scenes reflect feature divergence, potentially causing overlapping areas between categories, but the Thai speakers described them with a wider variety of verbs. For example, while the core of the /tàt/-category was consistently named with the predominant verb, the two sets of peripheries were constituted by infrequent verbs. One might explain this as insecurity among the Thai speakers in naming peripheral scenes along the lines of Vulchanova et al. (2012, p. 29), who maintain that such insecurity is due to ‘increased distance’ or divergence from ‘default’ or prototypical features of a semantic category. Insecurity would then be reflected in intra-speaker and inter-speaker inconsistency of event descriptors (i.e., use of multiple verbs). Two restrictions could be postulated at this point:

(1) The core within a category should exhibit a higher degree of consistency in naming than the periphery in the same category.

(2) Because overlapping areas comprise peripheral scenes, categories that overlap with others should show lower consistency in event descriptions than categories with no overlap.

However, unlike Vulchanova et al., I question a measure of (in)consistency based merely on the number of verbs used (pp. 26 and 29). The same number of verb types in different circumstances may not indicate the same degree of consistency if the types are not used or distributed evenly. The number of occurrences of each type needs to be considered as well. Next, I examine an alternative approach: (in)consistency in naming as analysed through Gini-Simpson coefficients. This precedes reconsideration of whether the two restrictions postulated above are still supported numerically with respect to diversity-index scores.

3.4.2 Inconsistency in naming caused-separation events in Thai

For some scenes, verb choice was unanimous. By contrast, other scenes were encoded using several types of verbs. This implies internal disagreement among the consultants, thereby decreasing the degree of consistency in describing the scenes. In this subsection, the Gini-Simpson index of diversity (1 - D; see § 2.2.6.2a) is used to measure diversity values of the individual scenes of caused-separation based on verbs and their frequency as used by the Thai speakers in the scene descriptions. In § 3.4.2.1, I explain the degree of lexical-description diversity in the caused-separation event domain in Thai. In the remainder, diversity scores of the core and peripheral scenes in certain categories (§ 3.4.2.2) are examined along with categories involving an overlap or no overlap (§ 3.4.2.3). This is to evaluate the two restrictions relating to (in)consistency in event naming postulated above.

3.4.2.1 Degrees of inconsistency in caused-separation descriptions in Thai

For each scene in the Thai data collected, diversity is first calculated individually in scene descriptions. All per-scene indices are reported in Appendix B.

Then the overall index score for Thai is determined as the average of all per-scene diversity values. The average Gini-Simpson index value for Thai is found to be 0.36, which suggests that Thai shows relatively high homogeneity or consistency in verb choices across speakers for the same scenes. Note that, for the Gini-Simpson index, values close to zero represent high homogeneity, and values near one represent high heterogeneity, i.e., many verbs rather evenly distributed for each scene.
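As a rough illustration of the calculation described here, the following Python sketch computes a per-scene diversity score from a list of verb tokens and then averages the per-scene scores. It assumes the finite-sample form of the Gini-Simpson index, 1 - Σ n_i(n_i - 1) / (N(N - 1)); the definition actually adopted is the one given in § 2.2.6.2a and may differ. The function names and the verb counts are hypothetical and are not taken from the elicitation data.

```python
from collections import Counter


def gini_simpson(verbs):
    """Return 1 - D for one scene, given one verb token per description.

    Assumption: D is the finite-sample Simpson index,
    sum(n_i * (n_i - 1)) / (N * (N - 1)).
    """
    counts = Counter(verbs)
    total = sum(counts.values())
    if total < 2:
        # With fewer than two descriptions there is nothing to compare.
        return 0.0
    d = sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))
    return 1.0 - d


def mean_diversity(scenes):
    """Average the per-scene indices over a dict of scene -> verb tokens."""
    scores = [gini_simpson(verbs) for verbs in scenes.values()]
    return sum(scores) / len(scores)


# Hypothetical data: eight descriptions per scene.
scenes = {
    "scene_a": ["tàt"] * 8,             # unanimous verb choice
    "scene_b": ["tʰúp"] * 7 + ["fan"],  # one dissenting description
}

for name, verbs in scenes.items():
    print(name, round(gini_simpson(verbs), 4))
print("mean", round(mean_diversity(scenes), 4))
```

Under these assumptions the unanimous scene scores 0 and the scene with one dissenting description scores 0.25, so the mean over the two hypothetical scenes is 0.125.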


3.4.2.2 Inconsistency in naming core versus peripheral scenes of caused-separation categories in Thai

Considering the categories in the domain, three categories appear to show a divide between core and periphery. Two of them are the /tàt/- and /tʰúp/-categories, which have been specified and discussed above. The third is the /tɕʰìːk/-category, which contains two core scenes, i.e., Scene 1 TEAR CLOTH INTO TWO PIECES and Scene 36 TEAR CLOTH HALF-WAY, judged to be the shortest branches in the cluster (see Figure 3.3d). There are also two peripheral scenes that are joined to the core from a distant subcluster, namely Scenes 35 and 38.

According to the diversity indices, all the core scenes in each of the three categories have a value of zero because they were described consistently with the relevant predominant verbs of the categories. Specifically, the core in the /tàt/-category was described with /tàt/ ‘cut’, that in the /tʰúp/-category with /tʰúp/ ‘smash’, and that in the /tɕʰìːk/-category with /tɕʰìːk/ ‘tear’. By contrast, the peripheral scenes contained in the three categories were named with both the predominant verbs and some other less frequent verbs of the categories, reaching higher diversity scores, as summarised in Table 3.12.


Table 3.12

Gini-Simpson’s diversity indices and verbs relating to the periphery within the /tàt/-, /tʰúp/-, and /tɕʰìːk/-categories.

Category / Scene                         Diversity index   Verbs

/tàt/-category, Set 1
  Scene 34 KARATE-CHOP CLOTH             0.7619            /tàt/; /fan/; /sàp/
  Scene 61 KARATE-CHOP ROPE              0.6667            /tàt/; /fan/; /sàp/
  Scene 6 CHOP CARROTS W/ KNIFE          0.6667            /tàt/; /fan/; /sàp/

/tàt/-category, Set 2
  Scene 43 CHOP CARROT W/ CHISEL         0.7857            /tàt/; /tɕɔ̀ʔ/; /hàn/; /tʰɛːŋ/; /tɔ̀ːk/*
  Scene 53 CHOP STICK W/ CHISEL          0.7857            /tàt/; /tɕɔ̀ʔ/; /tʰɛːŋ/; /tɔ̀ːk/*; /katʰɔ́ʔ/*
  Scene 2 CHOP ROPE W/ CHISEL            0.6389            /tàt/; /tɕɔ̀ʔ/; /tɔ̀ːk/*

/tʰúp/-category
  Scene 23 CHOP CLOTH W/ HAMMER          0.2500            /tʰúp/; /fan/
  Scene 50 CHOP ROPE W/ HAMMER           0.2857            /tʰúp/; /fan/
  Scene 32 KARATE-CHOP CARROT            0.6667            /tʰúp/; /fan/; /sàp/

/tɕʰìːk/-category
  Scene 35 BREAK YARN INTO PIECES        0.6429            /tɕʰìːk/; /dɯŋ/; /dèt/*; /kratɕʰâːk/*
  Scene 38 BREAK SINGLE PIECE OFF YARN   0.5238            /tɕʰìːk/; /dɯŋ/; /dèt/*

Note that the asterisk (*) indicates a verb that occurred in less than 1% of all the descriptions.

These diversity indices help verify that the core scenes show the highest consistency in naming, represented by the zero values, whereas the peripheral counterparts have lower consistency, as indicated by diversity values ranging from 0.25 to 0.79. On close inspection, despite the same numbers of verbs involved, the different diversity-index scores show that basing analysis only on lexical resources (i.e., the number of verbs used) may not accurately reflect the diversity of descriptions (i.e., inconsistency). Usage frequency (how many times each lexeme occurred in the descriptions) also exerts a significant influence.
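The point that type counts alone can mislead can be illustrated with a small sketch. The two hypothetical scenes below each attract three verbs across nine descriptions, yet the index differs because the tokens are distributed differently. The counts are invented for illustration, and the finite-sample form of the index is an assumption (see § 2.2.6.2a for the definition used in this study).

```python
def gini_simpson_from_counts(counts):
    """Return 1 - D from a list of per-verb token counts for one scene.

    Assumption: D = sum(n_i * (n_i - 1)) / (N * (N - 1)).
    """
    total = sum(counts)
    if total < 2:
        return 0.0
    return 1.0 - sum(n * (n - 1) for n in counts) / (total * (total - 1))


# Both hypothetical scenes have three verb types and nine descriptions.
even_counts = [3, 3, 3]    # verbs used equally often
skewed_counts = [7, 1, 1]  # one predominant verb, two rare ones

print(len(even_counts), round(gini_simpson_from_counts(even_counts), 4))
print(len(skewed_counts), round(gini_simpson_from_counts(skewed_counts), 4))
```

With equal numbers of verb types, the evenly distributed scene receives the higher diversity score (0.75 against roughly 0.42 here), which is the distinction a simple count of verbs cannot capture.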


3.4.2.3 Inconsistency levels in descriptions across caused-separation categories in Thai

The preceding section has shown that peripheral scenes involve relatively high inconsistency in event naming and are usually included in specific categories. A related inference is that the overall naming of the categories overlapping with others should also show higher inconsistency than that of the categories showing no overlap.

To substantiate this hypothesis, the average diversity-index scores of individual categories in the domain are calculated, as illustrated in Figure 3.6. (For clarity of presentation, the special scenes associated with the /tàt/-category are omitted.)

Figure 3.6. Average of diversity values for six categories in caused-separation in Thai; categories showing overlap are indicated by orange colour, while non-overlapping categories are indicated by grey colour.


The above figure shows that the /tàt/- and /tʰúp/-categories, which contain overlapping regions, do not each have diversity scores higher than those of the remaining categories, i.e., the /krìːt/-, /tɕʰìːk/-, and /hàk/-categories. Only the /tʰîm/-category is exceptional in having a higher score. Specifically, the diversity-index value of the /tàt/-category is lower than that of the /krìːt/-category, and the diversity of the /tʰúp/-category is lower than that of the /tɕʰìːk/-category.

However, when average diversity-index scores are calculated for the three categories that show overlap and for the other three with no overlap, the average of the former group is 0.53, whereas that of the latter is 0.26. The interpretation is that a high diversity-index score reflecting inconsistency in naming caused-separation for a given category may not only be due to inherent divergence in peripheral scenes. It may also be due in part to heterogeneity among the speakers with regard to describing the individual event types related to the semantic categories. For example, despite having no overlap with other categories, the /tɕʰìːk/-category has higher inconsistency in naming than the /tʰúp/-category. This is because speakers used the pertinent verbs less uniformly for the former than for the latter. Yet, if diversity values of the categories are assessed in overall terms, the tendency remains: categories with overlaps are prone to higher inconsistency because of divergence in peripheral scenes.
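The two observations are arithmetically compatible: a single high-scoring member can raise a group mean above that of another group even when the remaining members score lower than some of their counterparts. The sketch below uses placeholder category labels and invented scores, not the values plotted in Figure 3.6.

```python
from statistics import mean

# Invented per-category average diversity scores (placeholders only).
overlapping = {"A": 0.30, "B": 0.25, "C": 0.90}      # C is the outlier
non_overlapping = {"D": 0.35, "E": 0.30, "F": 0.10}

# Two overlapping categories score lower than two non-overlapping ones...
pairwise_lower = (
    overlapping["A"] < non_overlapping["D"]
    and overlapping["B"] < non_overlapping["E"]
)

# ...yet the group mean is still higher for the overlapping group.
mean_overlapping = mean(overlapping.values())
mean_non_overlapping = mean(non_overlapping.values())

print(pairwise_lower, mean_overlapping > mean_non_overlapping)
```

This is only a check on the logic of comparing group averages; it makes no claim about the actual per-category scores.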

3.4.3 Summary

To conclude this section, it is appropriate to note that Thai groups event types of caused-separation both similarly to and differently from the cross-linguistic generalisations noted by Majid et al. (2004, 2008), as reflected by the placement of category boundaries in the domain. Relations between categories demonstrate that Thai does not show a completely and clearly articulated form of categorisation, given that at least three different categories may be involved in category overlaps. Additionally, certain subcategories are seen extending over others in the same category, thereby producing significant subcategory overlaps. These overlaps may be caused by different intra- and inter-speaker construals. Speakers may describe the same scenes with varying featural specificity for the subcategory cases, and with characteristic divergence for the category cases. Furthermore, the overlapping regions are shown to involve a wide variety of verbs used in the descriptions. This lends support to restrictions postulated along the lines of Vulchanova et al. (2012) regarding low consistency in naming among speakers. Diversity-index calculations also support these proposals.

3.5 Semantic organisation of caused-separation categories in Thai

In the previous two main sections, we have discussed the partitioning of the caused-separation domain into semantic categories and subcategories in Thai by verb distributional patterns (§ 3.3) and category relations (§ 3.4). In this last section of the chapter, remaining investigations are carried out on semantic organisation. This refers to experiential categories in the semantic field evoked by Thai speakers as they described events of the domain. In § 3.5.1, I develop an approach to semantic specification of the verb-patterned categories and subcategories of caused-separation events. This is to determine which semantic elements are treated as important in each category/subcategory. Based on such semantic elements, I generalise semantic organisation patterns that underlie the categories and subcategories (as well as small semantic divisions) in the domain. In § 3.5.2, I argue for certain semantic elements being lexicalised or conflated (cf. Choi & Bowerman, 1991; Talmy, 1985) into different caused-separation verbs, which in turn reveal a great deal about how the domain is semantically categorised in Thai.

3.5.1 Semantic characteristics of caused-separation categories in Thai

The categories established by the verb occurrence patterns are viewed as meaningful in terms of feature values shared by all relevant scenes. Such semantic features also distinguish the scenes from others in the remaining categories (cf. Vulchanova et al., 2012, p. 22). Through this type of analysis, we examine underlying semantic characteristics for the individual categories and subcategories. Furthermore, since all the categories and subcategories are connected to certain class-defining verbs in the process of their establishment (see § 3.3.1), the category/subcategory semantic features also link to these verbs. In what follows, I start by generalising semantic properties for the established categories and subcategories. The properties are argued to function as semantic parameters in semantic categorisation. With reference to prior research (see § 2.1.3.3), I consider a series of event characteristics: instruments, manners, spatial properties of actions (i.e., action orientations), textural properties of objects, or results of actions. These event characteristics were found to be significant for speakers’ semantic classification of event descriptions in the caused-separation domain.

In the first instance, consider the 19 scenes included in the /tàt/-category. All of them depict agents performing actions with sharp-bladed instruments, i.e., a big knife, a knife, a machete, a chisel, and scissors; also included here are blade-like hands, i.e., knife hands.34 A second common feature is that they all involve complete separation as the result of the actions. Next, consider in detail the scenes where the pointed chisels/awls were used (i.e., Scene 43 CUT CARROT W/ CHISEL, Scene 53 CHOP STICK W/ CHISEL, and Scene 2 CHOP ROPE W/ CHISEL) or the knife hands (i.e., Scene 34 CHOP CLOTH W/ KARATE HAND GESTURE, Scene 61 KARATE-CHOP ROPE, and Scene 42 KARATE-CHOP STICK). The chisel looks different from other sharp-bladed implements. It has a small sharp edge used for causing separation. Speakers might interpret the sharp end as a short blade (see Figure 3.7). Consequently, in this study, the chisel/pike instruments can be read by the Thai speakers as either pointed or sharp-edged implements. This dual interpretation of chisels/pikes is discussed again when the scenes with these instruments play their part in grey zones between categories. For simplification, chisels are considered short sharp-bladed tools for the /tàt/-category.

As for the knife hands, their gesture can be viewed as the blade of a sharp tool; specifically, the lateral outer border of the hand is likened to a sharp edge (see Figure 3.8). A hand position of this kind could conceivably be imitating sharp-bladed implements. The knife hands are then instrument-like for this category. Additionally, the /tàt/-category’s scenes cover a variety of manners of action and theme objects. The manners can involve a vertically downward blow of an instrument or instrument-like implement with intensity, a back-and-forth motion of a saw or a knife (i.e., a sawing motion), or repeated strikes. The objects in the relevant scenes included a piece of cloth, a rope, one or more carrots, a branch/stick, hair, and a fish. We can see that the /tàt/-category can involve one-dimensional flexible, two-dimensional fabric-like or rigid, three-dimensional lump objects. It is thus insensitive to different types of theme objects. Furthermore, in the scenes of the category, the orientations of actions can vary, being either across or along the objects’ length: e.g., Scene 54 CUT CARROT CROSSWISE W/ AXE versus Scene 48 CHOP BRANCH LENGTHWISE AND CROSSWISE W/ AXE. All the above semantic properties for the /tàt/-category still hold, even when the six special scenes related to this category are taken into consideration.

34 A knife hand (or 手刀 [onyomi: shutō], Japanese for ‘hand-sword’), as its name shows, is a kind of gesture which resembles a knife/sword blade. To form such a knife hand, the fingers of one’s hand are extended tightly together, and the open hand and wrist muscles are stretched. Although the hand’s outer side is likened to a blade, it does not effectively afford cutting of an object because it lacks a sharp, straight edge, unlike a real knife or sword. Accordingly, a stroke of such a knife hand is needed to cause either full or partial separation.

Figure 3.7. Chisel-type instruments potentially interpreted as having either a sharp point (left) or a sharp short blade (right).


Figure 3.8. Outer border of hand imitating blade of knife, in karate-chopping.

Taken together, the /tàt/-category can be generalised as caused-separation by means of sharp-bladed instruments. More specific circumstances need to be involved when we look at finer-grained classification inside the /tàt/-category. In other words, the subcategories are more specific versions of the higher category in that they involve extra conditioned semantic properties, as considered below.

To begin with, the /fan/-subcategory comprises the scenes that show the actions of caused-separation by a sharp-edged implement, but with a specific manner of action. There is a blow of an instrument done quickly and forcefully in a striking manner. Again, this subcategory is not sensitive to different types of theme objects and kinds of action orientations (i.e., lengthwise versus crosswise).

For the /sàp/- and /tɕaːm/-subdivisions of this subcategory, all of the semantic conditions mentioned above still apply: caused-separation (1) with forceful intensity and (2) by a sharp-edged instrument. In addition, each subdivision has its own key elements. The /sàp/-subdivision is highly sensitive to a typical manner of holding a knife so that its blade is perpendicular to the objects or to the plane or ground (see Figure 3.9). The /tɕaːm/-subdivision is not concerned with the manner of grasping or holding an implement but specifies the implement itself. It always involves scenes including the use of an axe (Figure 3.10).

Figure 3.9. Perpendicular-held blade characteristic for /sàp/-subdivision of /fan/-subcategory: from left to right, Scene 61 KARATE-CHOP ROPE, and Scene 6 CHOP CARROT W/ KNIFE, as examples.


Figure 3.10. Use of axe contained in all scenes in /tɕaːm/-subdivision: from left to right, Scene 48 CHOP BRANCH W/ AXE, Scene 54 CUT CARROT CROSSWISE W/ AXE, Scene 13 CUT ROPE W/ AXE, and Scene 37 CUT CARROT LENGTHWISE W/ AXE.

The /hàn/-subcategory generally centres on caused-separation with the use of sharp-bladed tools like others in the /tàt/-category. Additionally, it covers the scenes which depict a specific movement (i.e., a sawing or parallel back-and-forward motion) of sharp-bladed implements. Also included are those scenes portraying a feature that specifies implement position for /hàn/ ‘slice’ that is characteristic of this subcategory. This coincides with Phanthumetha’s (2016) proposed definition: /hàn/ means to “cut on a supporting surface” (p. 925). This is true of most of the scenes in this subcategory. Nevertheless, occurrences of /hàn/ were infrequently extended to actions without the use of a supporting surface. In practical use, the verb can even be found describing an event which completely lacks a surface as such, as in (3.9) from an online procedural video.


(3.9) hàn hǔːahɔ̌ːm mâj tɕʰáj kʰǐːaŋ mâj sɛ̀ːp taː dûaj náʔ

slice onion NEG use chopping.board NEG irritate eye also FP

‘Cutting onions without a cutting board. Not irritating your eyes as well.’ (Pamachae, 2018 [YouTube video clip]35)

Consequently, although /hàn/ ‘slice’ strongly implies the involvement of a supporting surface, this implication can sometimes be overridden.

The /lɯ̂aj/-subdivision in the /hàn/-subcategory is also sensitive to the application of a sharp tool and sawing motions. Additionally, it involves a typical kind of implement: a saw, as displayed in Scene 15 SAW STICK. The use of a saw as the implement in that scene clarifies why /lɯ̂aj/ ‘saw (VB)’ described the only scene of this small division. The appropriateness of the verb for Scene 15 can be discussed in two related ways. First, it is a denominal verb derived directly from the instrumental noun /lɯ̂aj/ ‘saw’; in Thai, saying /lɯ̂aj[TR] máːj[OBJ]/ lit. saw wood ‘sawing wood’ sounds more natural than using other verbs with the explicit expression of /lɯ̂aj/[N]: e.g., /tàt máj dûaj lɯ̂ːaj/ lit. cut wood with saw ‘cutting wood with a saw’, /tɕʰáj lɯ̂ːaj tàt máj/ lit. use saw cut wood ‘using a saw to cut the wood’, or /hàn máj dûaj lɯ̂ːaj/ lit. slice wood with saw ‘cutting wood with a saw’, in a description of an event in which the agent used a saw to perform separation. Second, /lɯ̂aj/[VB] can entail a sawing motion, being derived from the name of an instrument that is conventionally used with a push-and-pull or backward-and-forward motion.

35 Pamachae. (2018, November 4). hàn hǔːahɔ̌ːm mâj tɕʰáj kʰǐːaŋ mâj sɛ̀ːp taː dûaj náʔ [Cutting onions without a cutting board. Not irritating your eyes as well]. YouTube. Retrieved August 1, 2020, from https://youtu.be/vfJgAWcFRYI.

Still within the /tàt/-category, its /pʰàː/-subcategory comprises Scene 37 CUT CARROT LENGTHWISE W/ AXE, Scene 51 CUT MELON W/ KNIFE, and Scene 9 SLICE CARROT LENGTHWISE W/ KNIFE. These scenes not only reflect caused-separation by means of sharp-edged implements, but also specify orientation of separation actions. According to the scene events, all the agents performed actions of separating objects with the same orientation, namely the lengthwise direction. Note that the forceful manner does not seem to be involved in the characterisation of this subcategory since some scenes, e.g., Scenes 37 and 51, express actions with intensity whereas others, e.g., Scene 9, do not.

The /tɕɔ̀ʔ/-subcategory is the last one within the /tàt/-category. Its scenes are consonant with the generalised characteristic of caused-separation by sharp tools, similar to the above three subcategories. However, all the scenes of this subcategory express a specific manner of action performed by agents of the separation: that of vertical downward blows of pointed/short sharp-edged implements (i.e., Scene 43 CHOP CARROT CROSSWAYS W/ CHISEL, Scene 53 CHOP STICK W/ CHISEL, and Scene 2 CHOP ROPE W/ CHISEL). This semantic feature is accordingly considered as the principal condition in the characterisation of the subcategory.

Semantic characterisation of the /tʰúp/-category is based on seven different scenes. Six of these show the agent performing an action with a blunt-headed instrument such as a hammer, with forceful intensity. Furthermore, this category involves different types of theme objects undergoing caused-separation: one-dimensional flexible (like a rope), two-dimensional rigid (like a carrot or a stick), two-dimensional flat (like flexible cloth or a rigid plate), and three-dimensional (like a flowerpot). These factors contribute to a core characterisation for this verb. Only Scene 32 (KARATE-CHOP CARROT CROSSWAYS) depicts the use of a knife-hand as implement. This exceptional scene raises questions that merit special discussion.

The action in Scene 32 was described with three different verbs: /tʰúp/ ‘smash’, /fan/ ‘hack’, and /sàp/ ‘chop’. The latter two verbs were predominant in the descriptions of this scene. Since none of the Thai speakers named Scene 32 with another possible verb like /tàt/ ‘cut’, and since the /fan/-subcategory is considered to be within the /tʰúp/-category, the clustering analysis consequently assigned the scene to the /tʰúp/-category. This assignment was made despite considerable divergence from the rest with reference to the implement used. The reason that some of the Thai speakers described Scene 32 similarly to scenes in the /tʰúp/-category can be understood by considering the properties of the knife-hand and the theme object, a carrot. Specifically, the use of a handshape as an implement for any event of caused-separation may allow room for different interpretations. A hand can be formed into many gestures, e.g., a knife-hand shape imitating a blade, or a fist shape representing a hammerhead. As for the object acted upon, a carrot is hard and crunchy. For some speakers, its separation would not easily be caused by a hand blade but would more likely be caused by a blunt rounded edge pounding on it. Given this, for a few speakers, Scene 32 was better described with /tʰúp/. Since the scene was predominantly characterised with /fan/ ‘hack’ and /sàp/ ‘chop’, only a minority favoured the interpretation of a blunt edge over that of a blade hand.

Additionally, all the scenes combined in the /tʰúp/-category show a resulting complete separation, whether into parts or into small fragments. The /fan/-subcategory within the /tʰúp/-category also generally reflects the use of a blunt-headed implement with intensity to cause separation. However, a characteristic condition for this subcategory is seen in that all the results of actions portrayed by the related scenes specify separation into two parts, not into small fragments. This kind of resulting separation is aligned with factors involved with the /fan/-subcategory within the /tàt/-category: e.g., Scene 61 where a rope stretched between two tables is karate-chopped by the agent into two parts.

As for the semantic characterisation of the /hàk/-category, its scenes do not involve the use of any implement other than the hands. Agents are shown snapping objects by bending the trailing edges, as in Scene 25 SNAP TWIG W/ TWO HANDS, or over the knee, as in Scene 5 SNAP STICK OVER KNEE.36 Also, this category appears to require a specific type of object: (two-dimensional) rigid. This property applies to all objects in the pertinent scenes (see Figure 3.11).

Figure 3.11. One- and two-dimensional rigid objects as characteristic for /hàk/-category: from left to right, Scene 25 with a twig, Scene 57 with a carrot, Scene 19 with a twig, and Scene 5 with a stick.

36 Note that even though the agent in this scene began with snapping the stick over his knee, he bent and ripped it off later.


The /tɕʰìːk/-category comprises different scenes showing an agent performing a hand action of separating a flexible object, either one-dimensional or two-dimensional, as shown in Figure 3.12 below. The category contains only one subcategory, characterised by the distributional pattern of /dɯŋ/ ‘tug’, which is predominant in its descriptions. This subcategory covers the subcluster of two scenes, i.e., Scenes 35 and 38, featuring caused-separation by hand actions of a specific type of theme object: one-dimensional flexible. Additionally, neither the /tɕʰìːk/-category nor its /dɯŋ/-subcategory is characterised by a forceful manner of action. Either full or partial resulting separation may be involved.

Figure 3.12. Specific types of one- and two-dimensional flexible objects as characteristic for /tɕʰìːk/-category: from left to right, Scene 1 TEAR CLOTH INTO TWO PIECES BY HAND, Scene 36 TEAR CLOTH ABOUT HALF-WAY THROUGH W/ TWO HANDS, Scene 35 BREAK YARN INTO PIECES W/ INTENSITY, and Scene 38 BREAK SINGLE PIECE OFF YARN BY HAND.

The /tʰîm/-category comprises only one scene in this study, showing the agent poking a hole in the stretched cloth. The scene explicitly involves the manner of a vertical downward forceful strike of the pointed implement, a twig, causing a hole in the theme object. Semantic characteristics of the /tʰîm/-category are consequently close to those of the /tàt/-category through the /tɕɔ̀ʔ/-subcategory, which is also considered a subcategory of the /tʰîm/-category. A relative difference between the /tʰîm/-category and the /tàt/-category is that the former does not require resulting complete or full separation of theme objects into parts but only partial destruction as a cut-through hole in the objects’ surface (see Figure 3.13).

Figure 3.13. Hole in object’s surface as characteristic for /tʰîm/-category: Scene 45.

In this study, the /tʰîm/-category incorporates three possibilities characterised by three different verbs: /tɕɔ̀ʔ/ ‘puncture’, /tʰɛːŋ/ ‘stab’, and /tɕîm/ ‘jab’. It is hard to define distinct semantic characterisations for the three subcategories based on this study’s clustering analysis since only one scene, Scene 45, is represented. However, considering more widely some examples in practical use from TNC (Aroonmanakun, 2007)37 helps shed light on possible distinguishing characteristics relating to the individual subcategories:

(3.10) sîːkʰroːŋ hàk tʰîm pɔ̀ːt

rib break stab lung

‘The rib broke and then punctured the lung.’

(adapted from NACMD101 in Aroonmanakun, 2007)

(3.11) pʰûːak-pʰrâj tɕʰáj máj-jaːw lâj tʰîm tɕʰáːŋ

PL-commoner use stick-long chase stab elephant

‘The commoners chased elephants and used a long stick to stab them.’

(adapted from PRNV010 in Aroonmanakun, 2007)

(3.12) wajrûn tɕʰɔ̂ːp tɕɔ̀ʔ tɕamûːk tɕɔ̀ʔ lín

teenager like pierce nose pierce tongue

‘Teens like piercing their nose and tongue.’

(adapted from NACHM038 in Aroonmanakun, 2007)

37 Aroonmanakun, W. (2007). TNC: Thai National Corpus (Third Edition). Retrieved October 1, 2020, from http://www.arts.chula.ac.th/~ling/tnc3/.


(3.13) lûːksìt mâj jàːk hâj tɕɔ̀ʔ kalòːk ʔaːtɕaːn

student NEG want give pierce skull teacher

‘The students did not want to have the teacher’s skull perforated (trepanned).’

(adapted from NACSS078 in Aroonmanakun, 2007)

(3.14) tɕâwmɛ́ːwpǒŋ pen-bâː tʰɛːŋ kʰɔː tuːa-ʔeːŋ taːj

Chao Miao Pong COP-crazy stab neck body-self die

‘C. M. P. became mentally deranged; he stabbed himself in the neck and died.’

(adapted from ACSS102 in Aroonmanakun, 2007)

(3.15) pʰûːtɕʰaːj lâj tʰɛːŋ plaː

man chase stab fish

‘The men chased and stabbed fish.’

(adapted from PRNV174 in Aroonmanakun, 2007)

(3.16) tɕʰaːj-nùm tɕʰáj sɔ̂ːm tɕîm nɯ́ːa sàj pàːk

man-young use fork jab meat put.in mouth

‘The young man poked the meat into his mouth, using a fork.’

(adapted from PRNV175 in Aroonmanakun, 2007)

(3.17) pʰîː-driːm ʔaw níw tɕîm nâːpʰàːk tɕʰǎn rɛːŋ-rɛːŋ

Brother-D. take finger tap forehead 1SG violent-violent

‘Brother D. forcefully poked my forehead with his finger.’

(adapted from PRNV085 in Aroonmanakun, 2007)


Events characterised by /tʰîm/ in examples (3.10)-(3.11) do not specifically exemplify the manner noted above of a vertical downward blow of a sharp-edged, pointed instrument. They involve instead a manner of either a projectile or nondescript motion of pointed instrument-like implements or instruments, with or without an agent’s intention, e.g., people ballistically stabbing elephants with sticks in (3.11); the rib accidentally puncturing the lung in (3.10). Additionally, those events portray the agents’ action with intensity, potentially causing separation.

Events characterised by /tɕɔ̀ʔ/ ‘puncture’, /tʰɛːŋ/ ‘stab’, and /tɕîm/ ‘jab’ in the preceding examples appear to focus more on specific subevent characteristics. In (3.12)-(3.13), the events portrayed by /tɕɔ̀ʔ/ ‘puncture’ specifically indicate cut-through holes as the expected results of the actions. Also, the actions are regarded as purposive rather than non-purposive or accidental. For example, (3.12) involves the event of piercing a nose for the resulting decoration. Examples (3.14)-(3.15) show that the events characterised by /tʰɛːŋ/ do not have to result in cut-through holes, but need only cause hollows or cuts in the themes. The /tʰɛːŋ/ events also imply a purposive manner of action aimed at certain outcomes. For instance, (3.14) expresses the agent causing a cut on his neck to kill himself. Events described with /tɕîm/, as in (3.16)-(3.17), can express the particular manner of pressing the instrument-like implement or instrument against the objects, with or without asserting their penetration into the objects’ body or portion. In sum, the /tʰîm/-category seems to cover events that contain the forceful/violent use of a (sharp) pointed implement to cause separation. In contrast, its subcategories simply spotlight certain specific characteristics of caused-separation by sharp-pointed tools: the purposiveness of actions (as characterised by /tɕɔ̀ʔ/ or /tʰɛːŋ/) or the manner of pressing a sharp point forward into or pressing the end of an index finger against theme objects (as characterised by /tɕîm/).


Consequently, the three /tɕɔ̀ʔ/-, /tʰɛːŋ/-, and /tɕîm/-subcategories are seen as more semantically specific versions of the /tʰîm/-category, at least in the present study’s data. However, further research needs to be conducted to bring more understanding or clarity to the above-mentioned specific distinctive features of the three subcategories and to justify whether they are reserved for the /tʰîm/-category or have potential to be separate categories overlapping with one another and also with the /tʰîm/-category.

Like the /tʰîm/-category, the /krìːt/-category includes a single scene given the clustering analysis results: Scene 14 CUT MELON W/ KNIFE. This scene depicts the agent slitting a melon using a sharp-edged instrument, a knife, causing a long and narrow cut or a partial split in the melon. Specifically, the agent makes a separation with a specific manner of placing the blade on the desired location, then pressing it into the object. This category has only one small partition, i.e., the /tɕʰɯ̌an/-subcategory. Again, given the minimal inclusion of a single scene in the present study’s results, it is difficult to semantically characterise the subcategory. To remedy this, I have located examples from TNC (Aroonmanakun, 2007)38, which could better illustrate how events represented by /krìːt/ and /tɕʰɯ̌an/ might be interpreted.

(3.18) kʰǔnpʰɛ̌ːn krìːt nâːtʰɔ́ːŋ pʰanjaː

Khun Phaen slit belly wife

‘Khun Phaen slit his wife’s belly.’

(adapted from ACHM040 in Aroonmanakun, 2007)

38 Aroonmanakun, Wirote. (2007). TNC: Thai National Corpus (Third Edition). Retrieved October 1, 2020 from http://www.arts.chula.ac.th/~ling/tnc3/.


(3.19) kʰon-ráːj tɕʰáj mîːt krìːt múŋlûːat

man-bad use knife slit wire.screen

‘The thief used a knife to cut the wire screen.’

(adapted from NWRP_CR006 in Aroonmanakun, 2007)

(3.20) kʰon-ráːj krìːt kratɕòk kʰâw paj

man-bad slit mirror enter go

‘The thief slit the mirror then broke in.’

(adapted from NWRP_CR012 in Aroonmanakun, 2007)

(3.21) (tɕâwlaj) dɯŋ dàːp maː tɕʰɯ̌ːan tʰɔ́ːŋ-kʰɛ̌ːn

(C. L.) pull sword come slash belly-arm

‘(C. L.) pulled the sword and slashed his forearm.’

(adapted from PRNV in Aroonmanakun, 2007)

(3.22) seːtɕoː kamlaŋ pʰajaːjaːm tɕʰɯ̌ːan pʰǒn-sôm

S. PROG try slash fruit-orange

‘S. was trying to slash an orange.’

(adapted from PRNV052 in Aroonmanakun, 2007)

(3.23) daːlâː tɕʰáj mîːt tɕʰɯ̌ːan tʰîː kʰɔ̂ː-mɯː kʰɔ̌ːŋ tɕeːsǎn

Darla USE knife slash at ankle-hand POSS Jason

‘D. slashed J.’s wrist with a knife.’

(adapted from PRNV095 in Aroonmanakun, 2007)


The events characterised by /krìːt/ in (3.18)-(3.20) express agents causing a long and narrow cavity in a variety of different object types, such as bodily flesh or a two-dimensional object that is either flexible and flat or rigid and fragile. Details of these events are regarded as characteristic of the /krìːt/-category. Likewise, the events described with /tɕʰɯ̌an/ ‘slash’, as in (3.21)-(3.23), help approximate specific features of the /tɕʰɯ̌an/-subcategory within the /krìːt/-category. They reflect the subcategory’s characteristics in that agents were envisioned as causing a long and narrow wound or cut precisely in the flesh of the body part or the fruit (i.e., the pulpy part), but not in other kinds of objects that might occur with /krìːt/ generally.

The semantic characteristics discussed above can be mapped onto individual categories as in Figure 3.14 below. Features associated with the subcategories are marked individually with the ‘plus’ symbol (+) to identify their specific version in addition to those applying to the large categories. Note that these illustrated features are given, based primarily on the 43 scenes in this study and a certain number of examples from TNC (Aroonmanakun, 2007).


Figure 3.14. Mapping of semantic characteristics onto six categories (as well as subcategories) of caused-separation in Thai represented by class-defining verbs presented with hyphenation.

The figure shows that, for relevant scenes, generalised semantic characteristics for the categories are aligned with abstract notions. These include: instrument versus manual manipulation; predictability of location of separation; differences in instrument types; differences in theme object types; and other semantic distinctions, such as whether a blade was placed on a surface before an action, as observed in discussion of the category boundary location in § 3.4.1.1. The four categories in the top panel are controlled by Instrument manipulation, while the rest are controlled by Manual manipulation. Within these two types of manipulation, categories are subsequently distinguished with respect to Predictability (see Table 3.11). Then Instrument type39 (e.g., sharp-bladed tools) or Object type (e.g., rigid objects) comes into play, followed by Blade placement influencing the partitioning of the categories. On close inspection, other characteristics can also be associated with the subcategories. Generally, they include Manner of action (e.g., how the agent holds an implement, or how s/he uses the tool to perform the action), and Object subtype (e.g., how many dimensions the object has). Within the /fan/-subcategory, small event-type divisions are further specified by semantic features such as Specific instrument and Manner of holding instrument. Note that the subcategories and small divisions are still semantically tied to the characteristics of the germane higher partitions. For example, the use of sharp-bladed instruments is characteristically specified for the /tàt/-category, so it is also specified for its /fan/-subcategory, and for all of the three small divisions relating to the subcategory. However, I should note here that considering the individual categories in Figure 3.14 one by one, degree of influence can

39 Note that a handshape like a knife hand or a hammer fist is here treated as instrument, then incorporated into the type of instrument that it resembles, i.e., a knife hand into the sharp-bladed type, and a fist or a hard blunt edge of a hand into the blunt-headed type.


differ. The categorisations influenced by Predictability and Instrument manipulation may not be as helpful as Instrument type, Object type, or Blade placement in distinguishing between categories at the level determinable by the verb occurrence patterns. In other words, manipulation type and its level of predictability may be of small account in classifying a caused-separation event into a specific category. Also, Instrument manipulation seems in practice redundant in that Instrument type entails its particular type of manipulation. Accordingly, the two non-concrete notions are not developed further in the analysis of the semantic organisation of semantic parameters below.

As we can see, all the above-generalised semantic components are associated with individual categories/subcategories/subdivisions. At the same time, they can be viewed as serving to distinguish between them. This provides a basis to argue for recognising these semantic components as semantic parameters that determine distinctions between the six semantic categories represented in Figure 3.14. Table 3.13 summarises how the semantic parameters are organised in different ways (‘semantic organisation patterns’) to classify the categories, subcategories, and small divisions in the caused-separation domain in Thai.


Table 3.13

Semantic organisation patterns in classification of caused-separation domain in Thai.

Level of classification | Semantic organisation pattern
Categories
/tàt/-, /tʰúp/-, /tʰîm/- | Instrument type
/krìːt/- | Instrument type + Blade placement
/hàk/-, /tɕʰìːk/- | Manual manipulation + Object type
Subcategories
/fan/-, /hàn/-, /tɕɔ̀ʔ/-, /tʰɛːŋ/-, /tɕîm/- | Instrument type + Manner of action + (Object subtype)
/pʰàː/- | Instrument type + Manner of action + Direction of separation
/tɕʰɯ̌an/- | Instrument type + Blade placement + Object subtype
/dɯŋ/- | Manual manipulation + Object type + Object subtype
Small divisions
/sàp/- | Instrument type + Manner of action + Manner of holding instrument
/tɕaːm/-, /lɯ̂aj/- | Instrument type + Manner of action + Specific instrument

Note that (1) the class-defining verbs with hyphenation are used to present the categories, subcategories, and small divisions; (2) the boldface indicates semantic components which play a role in the specification of each level of classification; (3) the /fan/-subcategory appears to entertain Object subtype when involving the blunt-instrument type.


3.5.2 Lexicalisation of caused-separation in Thai

Since the categories and their subcategories are modelled according to verb distributional patterns, their generalised semantic features should also relate to each relevant verb.

Vulchanova et al. (2012) argue along these lines: “If a meaningful subtree [i.e., category] is connected to a particular verb or verbs, we may surmise that there is a connection between the underlying feature of the subtree and these verbs” (p. 22).

This subsection aims to suggest how such a connection applies to data analysed in this study. Characteristics relating to the individual categories of caused-separation as in Figure 3.14 are conflated under a main feature of CAUSED-SEPARATION (see § 3.2.2) in the lexicalisation of the verbs that pattern the categories. Table 3.13 indicates how those lexicalisation patterns mirror the semantic organisation patterns in the categorisation of this domain. Tables 3.14a-c below display in greater detail basic lexicalisation patterns in the predominant verbs that establish the six categories.

These tables indicate how semantic organisation patterns categorise the verbs into subcategories. Note that square brackets are used to enclose the lexicalisation properties of verbs, labelled in an abbreviated manner. For the tables below, [SEP] stands for [CAUSED-SEPARATION ACTION]40, [INSTR TYP] for [INSTRUMENT TYPE], [BLADE] for [BLADE PLACEMENT], [MANUAL] for [MANUAL MANIPULATION], [OBJ TYP] for [THEME OBJ TYPE], [MANNER] for [MANNER OF ACTION], [OBJ SUBTYP] for [OBJECT SUBTYPE], [DIR] for [DIRECTION OF SEPARATION], [MAN HOLD] for [MANNER OF HOLDING INSTRUMENT], and [SPEC INSTR] for [SPECIFIC INSTRUMENT].

40 The featural assignment of [SEP] in Table 3.14a-c as well as Table 4.12a-c was arbitrarily imposed for this study, since whether a result-state is entailed or strongly/weakly implicated by the verbs under consideration is still open to discussion.

Table 3.14a

Lexicalisation of caused-separation at category level in Thai.

Conflation (1) [SEP + INSTR TYP]
/tàt/ – [SEP + SHARP-BLADED]
/tʰúp/ – [SEP + BLUNT-HEADED]
/tʰîm/ – [SEP + POINTED]

Conflation (2) [SEP + INSTR TYP + BLADE]
/krìːt/ – [SEP + SHARP-BLADED + BLADE]

Conflation (3) [SEP + MANUAL + OBJ TYP]
/hàk/ – [SEP + MANUAL + RIGID]
/tɕʰìːk/ – [SEP + MANUAL + FLEXIBLE]

Table 3.14b

Lexicalisation of caused-separation at subcategory level in Thai.

Conflation (1) [SEP + INSTR TYP + MANNER + (OBJ SUBTYP)]
/fan/ – [SEP + SHARP-BLADED + STRIKING]
/fan/ – [SEP + BLUNT-HEADED + STRIKING + 1-D]
/hàn/ – [SEP + SHARP-BLADED + SAWING]
/tɕɔ̀ʔ/ – [SEP + POINTED + DOWNWARD BLOW]
/tʰɛːŋ/ – [SEP + POINTED + PURPOSIVE]
/tɕîm/ – [SEP + POINTED + PRESSING-AGAINST]

Conflation (2) [SEP + INSTR TYP + DIR]
/pʰàː/ – [SEP + SHARP-BLADED + LENGTHWISE]

Conflation (3) [SEP + INSTR TYP + BLADE + OBJ SUBTYP]
/tɕʰɯ̌an/ – [SEP + SHARP-BLADED + BLADE + FLESH]

Conflation (4) [SEP + MANUAL + OBJ TYP + OBJ SUBTYP]
/dɯŋ/ – [SEP + MANUAL + FLEXIBLE + 1-D]

Table 3.14c

Lexicalisation of caused-separation at subdivision level in Thai.

Conflation (1) [SEP + INSTR TYP + MANNER + MAN HOLD]
/sàp/ – [SEP + SHARP-BLADED + STRIKING + PERPENDICULAR-HELD BLADE]

Conflation (2) [SEP + INSTR TYP + MANNER + SPEC INSTR]
/tɕaːm/ – [SEP + SHARP-BLADED + STRIKING + AXE]
/lɯ̂aj/ – [SEP + SHARP-BLADED + STRIKING + SAW]

According to the tables, the lexicalisation patterns in verbs of caused-separation in Thai in a way reaffirm lexicographic approaches to distinguishing components of caused-separation words: the citing of usual theme objects, instruments, and manners (see § 3.1). As can be seen, although object type does not play a key role in the proposed categorisation in general, it does come into play for verbs that relate to Manual manipulation, i.e., /hàk/ ‘snap’ and /tɕʰìːk/ ‘tear’. Features of Instrument type and Manner of action are quite active in the semantic organisation patterns for many verbs of cutting (e.g., /fan/ ‘hack’, or /hàn/ ‘slice’). Additionally, the lexicalisation patterns suggest at least two important findings:

(a) As lexicalisations are based on the characteristics of the individual categories through the surmised connection between the categories and the verbs that define them, such characteristics can in a way be seen as semantic parameters that not only distinguish the event types corresponding to the categories but also most likely have an effect when speakers make lexical choices to describe events of caused-separation in Thai (cf. Andics, 2012).

(b) Lexicalisation in the class-defining verbs appears to express two main degrees of semantic generality. From the small number of conflated semantic elements, it follows that the verbs that define the broad categories are rather generic. They simply express that separation was performed with a certain kind of tool. By way of comparison, the verbs that model the subcategories appear to express a more specific level of meaning.
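The relation between the two degrees of generality in (b) can be made concrete with a small sketch. The Python fragment below transcribes a subset of the feature bundles from Tables 3.14a-c as sets and checks that each subcategory or small division carries every feature of the partition above it, plus at least one more. The parent links follow the hierarchy described for Figure 3.14. The variable and function names are illustrative assumptions rather than part of the analysis, and only the sharp-bladed pattern of /fan/ is encoded.

```python
# Feature bundles transcribed from Tables 3.14a-c (subset; SEP = caused-separation).
# Assumption: /fan/ is represented by its sharp-bladed pattern only.
LEXICALISATION = {
    "tàt": {"SEP", "SHARP-BLADED"},
    "fan": {"SEP", "SHARP-BLADED", "STRIKING"},
    "sàp": {"SEP", "SHARP-BLADED", "STRIKING", "PERPENDICULAR-HELD BLADE"},
    "tɕaːm": {"SEP", "SHARP-BLADED", "STRIKING", "AXE"},
    "krìːt": {"SEP", "SHARP-BLADED", "BLADE"},
    "tɕʰɯ̌an": {"SEP", "SHARP-BLADED", "BLADE", "FLESH"},
    "tʰîm": {"SEP", "POINTED"},
    "tɕîm": {"SEP", "POINTED", "PRESSING-AGAINST"},
}

# Child partition -> parent partition, as described in the text.
PARENT = {
    "fan": "tàt",
    "sàp": "fan",
    "tɕaːm": "fan",
    "tɕʰɯ̌an": "krìːt",
    "tɕîm": "tʰîm",
}


def inherits(child):
    """True if the child verb conflates every feature of its parent verb."""
    return LEXICALISATION[PARENT[child]].issubset(LEXICALISATION[child])


def added_features(child):
    """Features the child verb conflates beyond those of its parent verb."""
    return LEXICALISATION[child] - LEXICALISATION[PARENT[child]]


for child in PARENT:
    print(child, inherits(child), sorted(added_features(child)))
```

On this encoding, every lower partition is a strict superset of the one above it, which is one way of reading the claim that the category-defining verbs are the more generic ones.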

It is additionally essential to return here to an important question already mentioned in § 3.4.1 (see also § 3.2.2.1): whether resulting separation is complete or partial. Should resulting complete separation be regarded as an entailment? Consider examples (3.24)-(3.25), from project field data:

(3.24) pʰûːtɕʰaːj tàt pʰǒm pʰûːjǐŋ

man cut hair woman

‘The man cut the woman’s hair.’ (CB-S27-CP)


(3.25) tʰɤː tɕʰáj kìŋmáj tɕɔ̀ʔ pʰâː

3SG use branch puncture cloth

‘She pierced the cloth with a branch.’ (CB-S45-VP)

The examples are associated with Scene 27 CUT HAIR W/ SCISSORS and Scene 45 POKE HOLE IN CLOTH W/ TWIG. These show respectively the complete separation of the woman’s hair and what could be viewed as partial separation: a hole made in the cloth.

To unpack this semantic contrast, I asked some other native speakers of Thai, without showing them the relevant video-clip scenes, whether they would think of separation as having been caused, either fully or partially, when hearing the two descriptions. Their answers were affirmative. However, I also found that it is better not to treat resulting separation as an entailment of verbs of this kind. Consider the above examples again in the constructed versions (3.26)-(3.27).

(3.26) pʰûːtɕʰaːj tàt pʰǒm pʰûːjǐŋ mâj kʰàːt

man cut hair woman NEG torn

‘The man (tried to) cut the woman’s hair; it was not off.’


(3.27) tʰɤː tɕʰáj kìŋmáj tɕɔ̀ʔ pʰâː tɛ̀ː mâj kʰâw

3SG use branch puncture cloth CONJ:but NEG enter.into

‘She used a stick to try to poke a hole in the cloth, but did not make it.’

They still sound well-formed semantically, though without meanings of either complete (3.26) or partial (3.27) resulting separation. Such separations therefore are not regarded as entailed. In other words, in (3.26)-(3.27), the “separation result” meanings are cancelled by the additional information provided by the resultative verb phrase (3.26) or by the following clause (3.27). The conclusion must therefore be that the semantic behaviour of Thai verbs of material destruction, i.e., in our case, caused-separation verbs like /tàt/ ‘cut’ and /tɕɔ̀ʔ/ ‘puncture’, differs from that of destruction verbs like English cut or smash, which entail a result: a cut or small fragments, respectively41.

That being the case, I argue against either complete or partial resulting separation being lexicalised by verbs of caused-separation, but for such results as being implied due to conventional norms of interpretation, perhaps as pragmatic defaults. Accordingly, an

41 Verbs used to describe events involving destruction (e.g., caused-separation) are accomplishment verbs in many languages. This kind of verb signifies a durative process leading to a patient’s state-change. Yet, it is a general property of Thai and Khmer verbs of (material) destruction that they do not necessarily entail a change of state, as in Mandarin (see 杀 ‘kill’ in Lin, 2019, pp. 168-169). Rather, they imply a resultant state, certainly with different implicature strength than verbs with no reference to a resultant change of state in their semantics, such as activity verbs. Therefore, pragmatic cancellation of an implied change of state is acceptable to speakers of Thai and of Khmer (‘I killed it, but it would not die’; see more in § 4.5.2). In this regard, Thai and Khmer caused-separation verbs may not be true accomplishment verbs. They may perhaps be “projected accomplishment verbs”, which entail an activity as well as the subject’s purpose to achieve a result (Enfield, 2008; cf. Talmy, 2000). Still, the entailment of the agent’s intention does not seem to hold for the two MSEA languages, since such verbs can co-occur with adverbs meaning ‘accidentally’ or ‘unintentionally’. Therefore, Thai and Khmer caused-separation verbs could be activity verbs with a result-state implicature: they are often interpreted as implying a resultant change of state.


implied state change by these verbs varies in implicature strength, with different degrees characterising individual verbs, sometimes correlating with type of affected object. For instance, /tàt/ strongly implies full separation of objects when no cancellation is involved. It was often used to describe circumstances where such complete separation was achieved: for example, /kʰǎw tàt kradàːt/ 3SG cut paper ‘He cut paper’. By contrast, /tʰúp/ implies resulting separation mostly when acted-upon objects are considered brittle, or as breaking, splitting, or powdering easily. Otherwise, a separation result is less expected: compare, for example, /kʰǎw tʰúp mɔ̂ː-din/ 3SG smash pot-soil ‘He smashed the clay pot’ versus /kʰǎw tʰúp pʰâː/ 3SG smash cloth ‘He hit the cloth’. In the analysis for /tʰúp/ proposed here, total separation is neither lexicalised nor entailed.

3.5.3 Summary

In this section, I have determined the characteristics of the domain’s individual categories, subcategories, and subdivisions based on the common semantic values among the relevant scenes. These characteristics subsequently lead to generalising necessary semantic components which play a role in the categorisation of the caused-separation event domain in Thai. The components include Instrument type, Manual manipulation, Blade placement, Object type, Manner of action, Object subtype, Manner of holding instrument, and Specific instrument. As the semantic properties are characteristic for the partitions that verb occurrence patterns establish, I have inferred that we are able to connect the characteristics to the relevant class-defining verbs. Semantic organisation patterns apply to specific lexical verbs. The last part of this section presents how the semantic elements are associated in the fundamental categorisation of caused-separation. Lexicalisation patterns of all of the Thai verbs in the project’s data are compared and contrasted. Also discussed is why some semantic elements like Result of separation are not considered entailed in lexicalisation. In making this argument, I am viewing lexicalisation patterns as mirroring semantic organisation in the domain’s classification.

------⁂ ------


CHAPTER 4

Denotational range of caused-separation in Khmer

This chapter addresses the Khmer data collected and analysed in this study. In parallel to the analysis of the Thai data, the sections below treat verbal descriptions for the 43 video stimuli of caused-separation produced by Bohnemeyer et al. (2001), introduced in Chapters 2 and 3. The organisation of the semantic domain is discussed in terms of the granularity of its linguistic encoding and lexical semantic boundary locations. As in the preceding chapter, I begin with an overview of the semantics of caused-separation verbs as explicated in prior related literature on Khmer (§ 4.1). In § 4.2, I present an investigation to describe the granularity of the caused-separation domain in Khmer, as represented by the extension of verbs used in naming the caused-separation situations. I then analyse how the established categories lie with respect to one another in the caused-separation semantic space in Khmer. Certain overlapping areas between categories are revealed, implying fuzzy categorisation judgements for the domain (§ 4.3). It is shown that such categorical unclarity is so far-reaching in Khmer because the language appears to contain one lexeme applicable to more than half of the putative categories. This likely affects general semantic distinctions in the domain. In the last section (§ 4.5), I explore attributes of the individual (sub)categories based on the pertinent scenes of the categories, before generalising the semantic organisation within the domain for Khmer. I am then in a position to present an analysis of the lexicalisation patterns of Khmer caused-separation verbs.


4.1 Lexical expressions of caused-separation in Khmer: Prescriptive accounts

Below, I outline morphological and syntactic characteristics of caused-separation verbs in Khmer and how scholars have described them, as manifested in monolingual and bilingual dictionaries (Headley, 1977; Headley, Chim, & Soeum, 1997; Nath, 1967) and a reference grammar (Haiman, 2011). Note that none of these sources is specific to the lexical semantic domain of caused-separation and its lexical resources in Khmer. I summarise and reflect on how relevant verbs are suggested for different morpho-syntactic environments and how they are ideally defined. The brevity of this section reflects the fact that the lexical semantic domain of caused-separation in the language is still relatively understudied (compare § 3.1 for Thai). Therefore, this chapter not only provides an analysis of semantic category information and lexical semantics in the domain of caused-separation in Khmer but may also be the first focussed study on this issue for the language.

4.1.1 Morphosyntax of caused-separation verbs in Khmer

The morpho-syntactic environments discussed by Haiman (2011) and Nath (1967) are essential for analysing verbs of cutting and breaking in Khmer. These sources refer to affixation, verbid prepositions, instrument subjects, or decorative symmetry and similar processes. Like many Khmer verbs, verbs of separation can involve affixation, resulting in different morpho-syntactic (and morpho-phonological) indications of causativisation, intransitivisation, or nominalisation. We may combine intransitive verbs of separation with the productive causative prefixes /bɑn-/ or /trɑ-/ to form the causative/transitive variants. For example, /baek/ ‘broken; cracked’ is prefixed with /bɑn-/ for /bɑm-baek/ ‘break (TR)’, or /bak/ with /bɑn-/ and /trɑ-/ for /bɑm-bak/ ‘break (TR)’ and /trɑ-bak/ ‘snap’, respectively (Haiman, 2011, p. 59).

Some prefixed causative verbs have been subsequently affected by the loss of the rhyme portion of the initial unstressed syllable, preconsonantal devoicing, and anaptyctic insertion: e.g., /bɑmbaek/ > /*bbaek/ > /pəbaek/ (Haiman, 2011, p. 21). Some transitive caused-separation verbs, such as /haek/ ‘tear’ and /beh/ ‘pluck’, can be prefixed with /rɔ-/, deriving the intransitive: /rɔhaek/ ‘be.torn’ and /rɔbeh/ ‘peel.off’, respectively. These grammatical changes, i.e., from non-causative to causative verbs or from transitive to intransitive verbs, do not imperil the essential semantic relations of these pairs of verbs.

We can trace the meanings back and forth between stem verbs and prefixed derivatives, since the derivational rules yield systematic semantic effects. For affixation in nominalisation, we can make some transitive verbs of caused-separation nominal using infixation, i.e. /-rɑ(n ~ m)-/. Cases in point are /kat/ ‘cut’ and /bok/ ‘pound’ being infixed to form /k-rɑn-at/ ‘fabric; coupon’, and /p-rɑm-ok/ ‘pounding’, respectively (Haiman, 2011, pp. 67-68). However, the nominalising affixation appears to produce random semantic effects as opposed to that of causativisation and intransitivisation.
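The prefixal derivations cited above can be sketched as simple string operations. The fragment below is only an illustration: the function name is hypothetical, and the rule that the final nasal of /bɑn-/ surfaces as /m/ before a stem-initial /b/ is an assumption read off the two cited forms /bɑm-baek/ and /bɑm-bak/, not a rule stated in this section. The infix /-rɑ(n ~ m)-/ is left out because the conditioning of its allomorphs is not described here.

```python
def add_prefix(prefix, stem):
    """Attach a derivational prefix to a verb stem.

    Assumption (inferred from the cited forms only): a prefix-final /n/
    becomes /m/ before a stem that begins with /b/.
    """
    if prefix.endswith("n") and stem.startswith("b"):
        prefix = prefix[:-1] + "m"
    return prefix + stem


# Causative/transitive derivation with /bɑn-/ and /trɑ-/.
print(add_prefix("bɑn", "baek"))  # causative of 'broken; cracked'
print(add_prefix("trɑ", "bak"))   # 'snap'

# Intransitive derivation with /rɔ-/.
print(add_prefix("rɔ", "haek"))   # 'be.torn'
print(add_prefix("rɔ", "beh"))    # 'peel.off'
```

The sketch covers only the synchronic concatenation; the later reduction of forms such as /bɑmbaek/ to /pəbaek/ is a separate historical development and is not modelled.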

Some verbs of separation in Khmer also derive so-called verbid prepositions, as the language is deemed to have too few dedicated prepositions, and the PREP N order can be primarily understood as a subcase of the V O order (Haiman, 2011, p. 214). Below are examples of /kat/ used as a preposition of direction.


(4.1) rʊət kat prey

run PREP:cut forest

‘run through the forest’

(Haiman, 2011, p. 214)

(4.2) rɔteh bɑɑ kat dom ʔac

cart drag PREP:cut piece shit

‘a cart (that was) dragged over a piece of shit’

(Haiman, 2011, p. 291)

Khmer verbs of separation may occur in sentences where “characterised instruments” turn up as oblique subjects (cf. Nath, 1967). In other words, such verbs permit instrument subjects, as in (4.3)-(4.5).

(4.3) coŋ-kvaev pdac voal yaaŋ pʰoy

tip-machete separate vine kind easy

‘The blade of the machete cut the vines easily.’

(Haiman, 2011, p. 204)

(4.4) kambət mut day

knife cut hand

‘The knife went through the hand.’

(Nath, 1967)


(4.5) bɑnlaa mut cəəŋ

thorn cut foot

‘The thorn stuck into the foot.’

(Nath, 1967)

Khmer is accordingly like English in that they both allow the instrument-subject construction for certain caused-separation verbs like ‘break’ (Levin, 1993, p. 80). Levin points out that in English not only do verbs condition the turning up of instruments as subjects, but the instruments also play a role in whether we can promote them based on the notion of intermediary versus enabling/facilitating instruments. However, no study has been conducted to investigate whether or not the notion comes into play in Khmer, leaving the potential for further research.

Some verbs of cutting in Khmer are involved in decorative, symmetrical compounding as a particular stylistic feature of the language (Haiman, 2011). Four-word symmetrical and asyndetic coordination of two VPs comprises the same beginning components repeated with different succeeding adjacent ones, thus setting an ABAC alliteration wherein both B and C are almost synonymous with each other. Haiman points out that in no case does the formal repetition of A and B signal repetition. The reason for this rhythm is “elegance” or “coherence” (p. 86); that is why it is considered decorative.


Take (4.6) as an example of how the verb of caused-separation /kat/ ‘cut’42 occurs in the A-position, forming a symmetrical compound.

(4.6) kat cət kat tlaəm

cut heart cut liver

‘Let go of one’s suffering’

(Haiman, 2011, p. 87)

Note that /cət/ ‘heart’ and /tlaəm/ ‘liver’ refer to two different organs. However, in Khmer culture they are stereotyped as seats of emotions, metaphorically meaning ‘mind’ or ‘spirit’ (Nath, 1967). We may thus consider them as synonyms at the general level on account of having the same denotation in this case.
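The formal side of the ABAC pattern can be stated as a simple check on a four-word sequence. The sketch below is an illustration with a hypothetical function name. It tests only the repetition of the A element and the distinctness of B and C, since the near-synonymy of B and C is a semantic judgement that a string comparison cannot capture.

```python
def is_abac(words):
    """True if a four-word sequence has the shape A B A C.

    Only form is checked: positions 1 and 3 are identical, and the
    B and C elements differ from A and from each other.
    """
    if len(words) != 4:
        return False
    a1, b, a2, c = words
    return a1 == a2 and b != c and b != a1 and c != a1


# Example (4.6): 'cut heart cut liver'
print(is_abac("kat cət kat tlaəm".split()))
```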

4.1.2 Semantic treatment of caused-separation verbs in Khmer

The three monolingual and bilingual dictionaries (Headley, 1977; Headley et al., 1997; Nath, 1967) show that certain Khmer verbs of caused-separation are linked to particular types of theme objects, as suggested in their definitions and usage examples. For instance, /sɑk/ ‘hair’ or /cəəŋ sɑk/ ‘hairline’ is given for separation characterised by the verbs /kat/ ‘cut’, /crəp/ ‘trim’, or /cie/ ‘trim’, while a tree, i.e., /cʰəə/, is given for partitioning denoted by /puh/ ‘break’, /ʔaa/ ‘saw’, /bɑmbak/ ‘break (TR)’, or /kat/ ‘cut’. I also found that the dictionaries offer instrument prototypes for individual caused-separation verbs:

42 The Khmer /kat/ ‘cut’ is likely homophonous with the English but etymologically unrelated.


e.g., an axe (or /puutʰav/ in Khmer) used in an action expressed by /puh/, or a hammer, i.e., /ɲɔɲuə/, in that represented by /dɑm/ ‘pound (TR)’.

Table 4.1 below shows exemplary theme objects and instruments prescribed as associated with certain Khmer verbs of caused-separation, as given in Headley (1977), Headley et al. (1997), and Nath (1967).

Table 4.1

Some Khmer separation verbs with ideal examples of typical objects being separated or divided and instruments involved (Headley, 1977; Headley et al., 1997; Nath, 1967).

(Caused) separation verb | Exemplary object being acted upon | Typical instrument
/bɑmbak/ ‘break (TR)’ | /(mɛɛk-)cʰəə/ ‘tree (branch)’ | n/a
/cəɲcram/ ‘cut’ | /sac cruuk/ ‘pork meat’ | n/a
/cət/ ‘slit’ | /bɑnlae/ ‘vegetable’ | n/a
/cie/ ‘prune’ | /slǝk/ ‘leaf’; /cəəŋ sɑk/ ‘hairline’ | n/a
/crəp/ ‘cut’ | /cəəŋ sɑk/ ‘hairline’; /sac/ ‘meat’ | /kɑntray/ ‘scissors’
/criək/ ‘slit’ | /ŋiet/ ‘dried product’ | n/a
/dɑm/ ‘pound’ | /trəy/ ‘fish’ | /ɲɔɲuə/ ‘hammer’; /ʔɑnluuŋ/ ‘mallet; club’
/haek/ ‘tear’ | /sɑmpʊət/ ‘sarong’; /krɑdaah/ ‘paper’; /moat/ ‘mouth’; /kee/ ‘reputation; honour’ | n/a
/kap/ ‘hack’ | n/a | /daav/ ‘sword; saber’; /kambət/ ‘knife’
/kat/ ‘cut’ | /cʰəə/ ‘tree’; /sɑk/ ‘hair’; /sɑmpʊət/ ‘sarong’ | /kɑntray/ ‘scissors’; /rɔnaa/ ‘saw’
/mut/ ‘cut’ | /day/ ‘hand’; /cəəŋ/ ‘foot’; /dəy/ ‘earth; ground’ | /kambət/ ‘knife’; /bɑnlaa/ ‘thorn’
/puh/ ‘chop’ | /ʔoh/ ‘firewood’; /cʰəə/ ‘tree’; /cʊəɲceaŋ/ ‘wall’ | /puutʰav/ ‘axe; hatchet’
/veah/ ‘slit’ | /ŋiet/ ‘dried product’; /pʊəh trəy/ ‘fish belly’ | n/a
/ʔaa/ ‘saw (TR)’ | /cʰəə/ ‘tree’; /sbaek/ ‘skin; leather’ | /rɔnaa/ ‘saw (N)’; /kambət/ ‘knife’

The cutting and breaking definitions in the Khmer-Khmer and Khmer-English dictionaries (Headley, 1977; Headley et al., 1997; Nath, 1967) involve overly simplistic, sweeping statements as well as circularity. In concrete terms, some definitions (or translations) are circular in that one term is used to define another, and vice versa. A good example is /kac/ ‘break (TR)’, which is defined by the verb /bɑmbak/ ‘break’, and with the relations reversed, in Nath’s (1967) Khmer-Khmer dictionary. Also, oversimplifications or sweeping definitions in the dictionaries (Headley, 1977; Headley et al., 1997) combine to make prescriptive meanings (ones that prescribe how terms should be interpreted) insufficient for practical uses. For example, the two different verbs /dɑm/ and /viey ~ vay/ are together generalised, or oversimplified, through three different English verbs: ‘beat’, ‘hit’, and ‘strike’. Based on that, we do not know if the verbs differ from each other, or how they should be used in any given case. Even worse, the defining English words might lead us to regard both Khmer verbs as synonymous, without clear evidence.

The following sections present the findings of this study on Khmer. I show how Khmer speakers categorise their experience of caused-separation. In particular, the effective granularity, semantic classification, and semantic organisation of the domain are presented and discussed. How Khmer speakers conceptualise this semantic field is investigated, with little or no reliance on formalised definitional practice. Rather than prescriptively positioned dictionaries, use is made of data elicited by the video stimulus task (Bohnemeyer et al., 2001), supplemented where necessary with examples from corpora.

4.2 Lexical items for caused-separation events and their structural patterns in Khmer: The speakers’ elicited data

This section presents field research findings and analysis relating to Khmer verbs of caused-separation. It gives accounts of relevant lexical descriptors and their related structural patterns. Data analysed are from descriptions of caused-separation events elicited from seven Khmer speakers using the video stimuli task (Bohnemeyer et al., 2001) described in § 2.2.2. Specifically, I begin with the list of Khmer verbs of caused-separation collected from the speakers’ free descriptions to explain lexical diversity for this semantic domain in Khmer (§ 4.2.1). In § 4.2.2, I outline the use of caused-separation verbs as they engage certain basic clausal patterns in the language.


4.2.1 List of Khmer caused-separation verbs and their occurrences

The methodology parallels that used for Thai in Chapter 3. I determined a relevant subset of the larger set of 61 ‘cut’ and ‘break’ video stimuli (Bohnemeyer et al., 2001) for elicitation to assemble the inventory of Khmer verbs used as labels of events. These individual stimulus clips simulated different caused-separation subtypes, such as cutting, breaking, hitting, and poking (cf. Majid et al., 2007a). Note that, to achieve homogeneity in the data, I discarded cases in which the speaker described the intended events incoherently, as identified by my Khmer assistant or other native speakers, and cases in which erroneous descriptions were withdrawn by the speaker during or after the elicitation task. The remaining 335 descriptions were taken into account.

In addition, intransitive result verbs like /dac/ ‘separate (NTR)’ or /rɔhaek/ ‘be.torn’, and transitive variants derived from intransitive result verbs like /pdac/ ‘separate (TR)’, /bɑmbak/ ‘break (TR); make.broken’, or /bɑmbaek/ ‘make.shattered’, are occasionally serialised after preceding caused-separation verbs. I have excluded these intransitives from the analysis since they do not encode causal actions. Items included in the analysis encode causes and manners of action, the focus of this study (see § 2.2.6.1). Furthermore, omitting the excluded intransitive verbs does not make the clausal descriptions of caused-separation events ungrammatical. Note that though we may view derived transitive verbs as referring to caused-separation, they differ from non-derived or ordinary verbs of caused-separation: they do not specify causes and manners of action, and were thus excluded for simplification of analysis. That said, if derived transitive verbs occurred as the main verb of the clause, they were included in the analysis. In this study, fewer than 4% of the descriptions involved the use of derived transitive verbs as the single main verb.

A single two-verb case requires comment: /cəɲcram/ ‘cut’ preceding non-derived transitive /kap/ ‘hack’. This was found only once. Accordingly, only the preceding verb was included in the analysis, factoring out the subsequent one, in accordance with the data screening method. Serialisation of verbs is mentioned briefly in § 4.2.2, which primarily discusses constructional patterns related to caused-separation verbs in Khmer.

The caused-separation verbs applied unevenly to different scenes, with some occurring more frequently than others in the datasets. Table 4.2 shows the uneven distribution characterising verbs in the descriptions, sorted by frequency with the most frequent one at the top of the list.

Table 4.2

Khmer verbs used to describe 43 caused-separation clips, in descending order of their distribution of verb tokens.

No.  Verb type         Approximate meaning43                     Frequency  Percentage of No. of scenes where
                                                                            frequency     the verb was applied
1    /kat/             ‘cut, slice, slit’                        73         21.79%        21
2    /dɑm/             ‘strike, beat, hit, smash’                37         11.04%        10
3    /kap/             ‘cut, hack’                               31         9.25%         10
4    /kac/             ‘break, break off, snap’                  29         8.66%         4
5    /puh/             ‘chop, hack, cut, split’                  23         6.87%         7
6    /pdac/            ‘break’                                   20         5.97%         8
7    /ʔaa/             ‘saw, cut’                                17         5.07%         4
8    /viey/ or /vay/44 ‘beat, hit, slap, strike’                 16         4.78%         9
9    /haek/            ‘tear’                                    15         4.48%         2
10   /cak/             ‘pierce, stab, inject’                    14         4.18%         4
11   /tieɲ/            ‘pull, drag, tug’                         10         2.99%         3
12   /han/             ‘slice, cut’                              8          2.39%         3
13   /mut/             ‘cut, pierce, stab’                       7          2.09%         1
14   /cəɲcram/         ‘chop, mince, hash’                       6          1.79%         2
15   /cət/             ‘pare, peel’                              6          1.79%         3
16   /bɑmbaek/         ‘break up, smash, shatter’                4          1.19%         3
17   /bok/             ‘pound, crush’                            4          1.19%         4
18   /bɑmbak/          ‘break off, smash’                        3          0.90%         3
19   /tumluh/          ‘perforate, to bore, to drill (a hole),   3          0.90%         1
                        to punch, to puncture’
20   /cie/             ‘clip, prune, trim, slice’                2          0.60%         2
21   /criək/           ‘cut, shred, split’                       2          0.60%         2
22   /veah/            ‘cut (open), slit, make an incision’      2          0.60%         2
23   /crəp/            ‘cut, trim’                               2          0.60%         1
24   /ʔok/             ‘hit down’                                1          0.30%         1

43 These English glosses for the individual Khmer verbs are largely based on the Khmer-English dictionaries (Headley, 1977; Headley et al., 1997). They are given to assist readers, and may not capture actual meanings accurately.

44 The variant /vay/ is more colloquial than /viey/ (Headley, 1977). Both were used by different speakers in this study, with the former occurring eight times more often than the latter in the spoken Khmer data collected. The two variants are presented together in Table 4.2, and /vay/ is used for both below.

We can see from Table 4.2 that 24 different verbs were used in the Khmer speakers’ descriptions of the 43 scenes. The frequency of the verbs ranges from 1 to 73 occurrences, with /kat/ ‘cut’ the most frequent and /ʔok/ ‘hit.down’ the least frequent. The ten most common verbs, /kat/ ‘cut’, /dɑm/ ‘smash’, /kap/ ‘hack’, /kac/ ‘snap’, /puh/ ‘chop’, /pdac/ ‘separate (TR)’, /ʔaa/ ‘saw’, /viey/ or /vay/ ‘strike’, /haek/ ‘tear’ and /cak/ ‘stab’45, are used in over 80% of all the scene descriptions, meaning that fewer than 42% of the 24 verbs account for the great majority of the cases. By contrast, over 58% of the verbs account for less than 18% of the description cases; this group of infrequently occurring verbs is considered ‘long-tailed’ because of its low frequency in the datasets. Moreover, seven of the 24 observed verbs (about 29%; i.e., /bɑmbak/ ‘break (TR)’, /tumluh/ ‘puncture’, /cie/ ‘prune’, /criək/ ‘slit’, /veah/ ‘slit’, /crəp/ ‘cut’, and /ʔok/ ‘hit.down’) appeared even less frequently: each occurs in less than 1% of the cases, since the speakers used each of them no more than three times.

45 As with their Thai counterparts, the English glosses were chosen to represent the common meaning of each Khmer caused-separation verb in this study and to aid readers’ comprehension. They are regarded as the most workable glosses for the present research, although other translations are possible.

The rightmost column of Table 4.2 shows how individual Khmer verbs of caused-separation differ in distribution. Some verbs named large proportions of the stimulus scenes. In this study, the Khmer speakers could label a little less than half of the scenes with the most frequent verb /kat/ ‘cut’, which suggests a generic meaning for this verb. In contrast, other verbs were used to describe very few scenes. Infrequent verbs such as /mut/ ‘cut’, /tumluh/ ‘puncture’, /crəp/ ‘cut’, and /ʔok/ ‘hit.down’ each described only a single scene, together accounting for less than 10% of all the scenes. In more general terms, more than 97% of all the scenes could be described with one of the ten most frequent verbs above, while the rest were described with one of the long-tailed verbs.
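Several of the proportions cited for Table 4.2 can be re-derived directly from its frequency and scene columns. The following Python sketch is illustrative only and is not part of the original analysis: the variable and function names are hypothetical, the counts are transcribed from Table 4.2, and /viey/ and /vay/ are pooled under /vay/ as in the table.

```python
# (verb, token frequency, number of scenes), transcribed from Table 4.2.
# Variable names are illustrative assumptions, not the author's.
TABLE_4_2 = [
    ("kat", 73, 21), ("dɑm", 37, 10), ("kap", 31, 10), ("kac", 29, 4),
    ("puh", 23, 7), ("pdac", 20, 8), ("ʔaa", 17, 4), ("vay", 16, 9),
    ("haek", 15, 2), ("cak", 14, 4), ("tieɲ", 10, 3), ("han", 8, 3),
    ("mut", 7, 1), ("cəɲcram", 6, 2), ("cət", 6, 3), ("bɑmbaek", 4, 3),
    ("bok", 4, 4), ("bɑmbak", 3, 3), ("tumluh", 3, 1), ("cie", 2, 2),
    ("criək", 2, 2), ("veah", 2, 2), ("crəp", 2, 1), ("ʔok", 1, 1),
]
TOTAL_SCENES = 43  # number of stimulus clips named in the Table 4.2 caption


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


total_tokens = sum(freq for _, freq, _ in TABLE_4_2)
top_ten_tokens = sum(freq for _, freq, _ in TABLE_4_2[:10])

# Share of all descriptions covered by the ten most frequent verbs,
# and by the remaining 'long-tailed' verbs.
top_ten_share = percent(top_ten_tokens, total_tokens)
long_tail_share = percent(total_tokens - top_ten_tokens, total_tokens)

# Share of scenes that could be labelled with /kat/.
kat_scene_share = percent(TABLE_4_2[0][2], TOTAL_SCENES)

# Verbs applied to exactly one scene, and the share of scenes they cover.
single_scene_verbs = [verb for verb, _, scenes in TABLE_4_2 if scenes == 1]
single_scene_share = percent(len(single_scene_verbs), TOTAL_SCENES)

print(total_tokens, top_ten_share, long_tail_share)
print(kat_scene_share, single_scene_verbs, single_scene_share)
```

The token total recovers the 335 retained descriptions, and the remaining values fall within the bounds stated in the text (over 80%, less than 18%, a little less than half, and less than 10%, respectively).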

The sum of verb applications (= 110) is greater than the total number of scenes, which shows that some scenes can be described with more than one verb type. Figure 4.1 below shows how many verb types were used for the scenes, ranging from one to six. The Khmer speakers described a little more than 27% of the scenes with three different verbs, making this the most common value for the Khmer descriptions (coloured in dark orange). Approximately half of the scenes were labelled with either one or two verb types. Less than 12% involved four different verb types, and less than 7% involved five. Only one scene (Scene 32 KARATE-CHOP CARROT; coloured in light orange) was named with as many as six different verbs, i.e., /kat/ ‘cut’, /dɑm/ ‘smash’, /kap/ ‘hack’, /puh/ ‘chop’, /vay/ ‘smash’, and /bɑmbaek/ ‘smash’. This means that over 90% of the scenes were described using one to four different verbs of caused-separation. The differences in the number of caused-separation verbs used for individual scenes reflect varying degrees of agreement among the Khmer speakers in naming different caused-separation events. This is discussed in more detail in § 4.4.2.

Figure 4.1. Numbers of scenes with different numbers of verbs in Khmer.

In the next section, I summarise the structural patterns of Khmer caused-separation verbs to show how they can appear in different constructions, i.e., single-verb and multi-verb construction types.


4.2.2 Summary of structural patterns of Khmer caused-separation verbs

This section describes how the different verbs listed in Table 4.2 were used to describe caused-separation events in single-verb constructions. In addition, certain caused-separation verbs could also accompany other verbs in multi-verb constructions. What follows goes into more detail about verbs of caused-separation in both kinds of verb-phrase construction.

In the linguistic data considered in this study, Khmer caused-separation verbs occurred in descriptions with varying structures, as exemplified below.

(4.7)   kee  haek  krɑnat
        3SG  tear  cloth
        SBJ  VB    OBJ
        ‘He tore the cloth’ [CB-S36-SS]

(4.8)   koat     kat  kaarot  cie   kaŋ    kaŋ
        3SG.POL  cut  carrot  BE    piece  piece
        SBJ      VB   OBJ     RESP
        ‘He broke the carrot into pieces’ [CB-S10-CN]

(4.9)   koat     praə  kɑntray   kat  ksae
        3SG.POL  use   scissors  cut  rope
        SBJ      VB1   OBJ1      VB2  OBJ2
        ‘She cut the string with scissors.’ [CB-S24-MS]

(4.10)  koat     yɔɔk  ɲɔɲoa   mɔɔk  dɑm    kaarot
        3SG.POL  take  hammer  COME  smash  carrot
        SBJ      VB1   OBJ1    PURP  VB2    OBJ2
        ‘He smashed the carrot with a hammer.’ [CB-S21-KC]

(4.11)  kee  tieɲ  ksae  pdac
        3SG  tug   rope  separate
        SBJ  VB1   OBJ   VB2
        ‘She pulled the string, breaking it off.’ [CB-S35-SS]

(4.12)  nieŋ  kac   bɑmbak  mɛɛk-cʰəə
        girl  snap  break   branch
        SBJ   VB1   VB2     OBJ
        ‘She snapped the branch, breaking it off.’ [CB-S25-MS]

(4.13)  koat     yɔɔk  day   tɨv   bok    ksae  pdac
        3SG.POL  take  hand  GO    pound  rope  separate
        SBJ      VB1   OBJ1  PURP  VB2    OBJ2  VB3
        ‘He used his hand to beat the rope, breaking it off.’ [CB-S2-KC]

(4.14)  koat     yɔɔk  ɲɔɲoa   mɔɔk  dɑm    ksae  pdac
        3SG.POL  take  hammer  COME  smash  rope  separate
        SBJ      VB1   OBJ1    PURP  VB2    OBJ2  VB3
        pontae  ksae  baan     pdac      nɨv   lǝǝk  tii-bəy
        but     rope  acquire  separate  at    time  third
        CONJ    SBJ   PASS     VB        PREP  NP
        ‘He struck the rope with a hammer, but the string was torn apart on the third time.’ [CB-S50-KC]

(4.15)  nieŋ  haek  krɑnat  dac       cie   pii
        girl  tear  cloth   separate  COP   two
        SBJ   VB1   OBJ     VB2       RESP
        ‘She tore the cloth in two.’ [CB-S1-CP]

(4.16)  koat     kat  dac       ksae  cie   pii
        3SG.POL  cut  separate  rope  COP   two
        SBJ      VB1  VB2       OBJ   RESP
        ‘He cut the string in two’ [CB-S24-PS]

(4.17)  koat     yɔɔk  ɲɔɲoa   mɔɔk  vay     lǝǝ   krɑnat  ʔaoy  dac
        3SG.POL  take  hammer  COME  strike  on    cloth   GIVE  separate
        SBJ      VB1   OBJ1    PURP  VB2     PREP  OBJ2    PURP  VB3
        ‘He struck the cloth with a hammer.’ [CB-S23-CP]

According to these examples, the caused-separation descriptions in Khmer usually appeared in mono-clausal contexts (in 4.7 – 4.13 and 4.15 – 4.17), and less often in bi-clausal contexts (in 4.14). Looking particularly at the clauses in which the verb of caused-separation expresses a causal action, a verb of caused-separation either occurred as the single main verb of the clause (in 4.7 – 4.8) or was serialised with one or more preceding or subsequent verbs (in 4.9 – 4.17).

In the data, clauses containing caused-separation verbs (in boldface) occurred in [SBJ VB OBJ] constructions in approximately two fifths of the scene descriptions, and in [SBJ VB1 VB2 OBJ] constructions in over a tenth. We can therefore describe the basic syntax of the elicited Khmer descriptions as quite consistent, since slightly more than half of the descriptions appeared in these two constructions. In § 4.2.2.1, I describe the occurrences of Khmer verbs of caused-separation in single-verb structures, followed by those in serialisation (§ 4.2.2.2).

4.2.2.1 Verbs of caused-separation in single-verb constructions in Khmer

Caused-separation verbs in Khmer may appear as the single verb of a description, as in (4.7) – (4.8). Across variants of this constructional type, the mono-verbal structure can be summarised as below:

SBJ VB OBJ (RESP)

The predicate of this construction is always a bare transitive verb of caused-separation. The caused-separation verb is preceded by a noun phrase, either nominal or pronominal (e.g., /kee/ ‘s/he’, or the more polite /koat/ ‘he/she’). This noun phrase represents the construction’s subject. The noun phrase immediately following the main verb of caused-separation is the syntactic object. The object may be followed by a resultative phrase headed by copular /cie/ ‘be.PREP’. Given its infrequency, the resultative phrase can be considered optional, hence the parentheses.

The Khmer data show that certain semantic roles map consistently to each of the above syntactic positions. The agent of the action is encoded by the syntactic subject, as in (4.7), where /kee/ ‘s/he’ refers pronominally to the man who performed the action. The action of caused-separation is represented by the single main verb, e.g., /kat/ ‘cut’ in (4.8). Note that some of the caused-separation verbs occurring in VB can be morphologically derived forms, e.g., /pdac/ ‘separate (TR)’ [CB-S2-MS], which correspond to intransitive verbs, e.g., /dac/ ‘separate (INTR)’, that express a state, i.e., the resulting separation. The syntactic object following the verb, for example /krɑnat/ ‘cloth’ in (4.7), expresses the theme or acted-upon object. This object phrase is occasionally followed by a prepositional phrase with a resultative interpretation. In (4.8), the phrase /cie kaŋ kaŋ/ (COP piece piece) ‘into pieces’ specifies the form of the separation that results from the /kat/ action. The mapping of the semantic roles to the syntactic functions can be formalised as below.

SBJ    VB                 OBJ    RESP
|      |                  |      |
AGENT  CAUSED-SEPARATION  THEME  RESULT STATE

The above maximal single-verb clausal description containing a caused-separation verb is possible but infrequent: fewer than 6% of all the descriptions employ this extended structure. As noted, the smaller mono-verbal construction without the prepositional phrase, [SBJ VB OBJ], is more common. The absence of a prepositional phrase expressing the resultant state has some implications. The event feature of resulting separation may not be important enough for speakers of Khmer to realise it in a surface form, i.e., a prepositional phrase. Instead, a state resulting from caused-separation would already be entailed or implied by the main verb. Consequently, a phrase that does specify such a state could be a form of emphasis or added detail. This emphatic expression can be omitted, perhaps when considered unnecessary.

Furthermore, although [SBJ VB OBJ] is predominant in the descriptions, it is not dedicated to expressions of caused-separation. In Khmer, the main verb in this construction can be of any action type, and the construction itself is basic for transitive verbs.

4.2.2.2 Verbs of caused-separation in multi-verb constructions in Khmer

Caused-separation verbs in Khmer also occurred in sequence with preceding or subsequent verbs in multi-verb constructions. Enfield (2008) explains that a construction of this kind consists of unmarked sequences of verbs, in which NP arguments can be interposed between the verbs or come at the end of the series. What follows examines the range of multi-verb constructions in the Khmer descriptions of caused-separation events, and the mapping between semantic roles and syntactic constituents in those constructions.

Five multi-verb structures are found in the Khmer descriptions, as shown below. Note that the syntactic object of a caused-separation verb may either be inserted between the verbs in sequence or be placed at the end of the sequence without semantic effect; thus (1a) and (1b) are alternatives, as are (2a) and (2b).

(1a) SBJ VB1 OBJ VB2 (RESP)

(1b) SBJ VB1 VB2 OBJ (RESP)

(2a) SBJ VB1 OBJ1 COME/GO VB2 OBJ2 (VB3)


(2b) SBJ VB1 OBJ1 COME/GO VB2 VB3 OBJ2

(2c) SBJ VB1 OBJ1 COME/GO VB2 (PREP) OBJ2 GIVE VB3

Before turning to the mapping of semantic roles to the syntactic arguments (and adjuncts, if any) in each multi-verb construction, let us look at the different types of serial verbs that occurred in these constructions. In (1a-b), VB2 represents the serial verb following the main verb of each clause construction. It can be transitive or intransitive. In (2a-b), the serial verbs represented by VB2 and VB3 are found in this study only as transitive; however, the grammar of Khmer allows an intransitive verb as VB3. In (2c), the serial verb VB2 is transitive, whereas VB3 is always intransitive.

A key point in distinguishing the serial-verb types is whether the syntactic subject of a following verb is co-referential with that of the main verb. If the serial verb is intransitive, the NP that acts as the object of the main or preceding verb is the subject of the following intransitive verb. For example, both (4.11) and (4.15) reflect the (1a) multi-verb pattern, with the slight difference that (4.11) does not realise the resultant state as a resultative phrase, whereas (4.15) does, with /cie pii/ (be two) ‘into two’. In (4.11), the serial verb is /pdac/ ‘separate (TR)’, whose syntactic subject is co-referential with that of the main verb /tieɲ/ ‘tug’. This means that the one who performed the action represented by the main verb also executed the action expressed by the serial verb. In (4.15), the serial verb /dac/ ‘separate (INTR)’ and the main verb /haek/ ‘tear’ have different logical subjects. The NP argument /krɑnat/ ‘cloth’, acting as the syntactic object of the main verb, functions as the syntactic subject of the serial verb.


Differentiating serial-verb types involves not only the (non-)coreference of the subjects but also the pragmatically interpreted temporal relationship between the verbs. In (4.15), the event expressed by the intransitive serial verb (i.e., /dac/) is regarded as taking place after the event represented by the preceding main verb. We can understand this temporal order as the intransitive serial verb predicating the result of the main verb. The temporal sequence thus mirrors the resultative relationship between the two verb events, and the different-subject serial verb construction in this case can be viewed as a resultative construction. In (4.11), it is unclear whether the event represented by the transitive serial verb /pdac/ is subsequent but close to, or simultaneous (or at least overlapping) with, that of the main verb /tieɲ/. We may therefore read the clause as either ‘He tugged the rope then he separated it’, or ‘He tugged and separated the rope’. Note that /pdac/ is derived from /dac/, which was used in a well-formed context to express the resultant state (see example 4.15). This relationship could give rise to the uncertainty about the order of the events: the derivational correspondence licenses an interpretation of the serial verb event as the result of the cause or manner expressed by the main verb. However, Khmer also allows a parallel reading for the primary and secondary predications in same-subject multi-verb constructions: [SBJ VB1 VB2 OBJ] entails [SBJ VB1 OBJ] + [SBJ VB2 OBJ] (see 4.18). The clause (4.11) may accordingly have a distributive interpretation in which the serial verb describes an event that took place at the time of the main verb’s event. If the resultative reading held, the structural patterns [SBJ VB1 OBJ VB2 (RESP)] and [SBJ VB1 VB2 OBJ (RESP)], with VB2 transitive in each, would be viewed as same-subject resultative constructions. If the distributive reading held instead, the construction could be regarded as asyndetic and coordinating. Syntactically, the (same-subject) resultative construction seems to be a poor label for the two patterns (with VB2 being transitive), since they cannot be medially negated (compare 4.11′ to the constructed example 4.19). The case for the coordinating construction is accordingly stronger, since asyndetic coordination in Khmer may disallow medial negation (see 4.18′) (cf. Enfield, 2008, for Lao).

(4.18)  vie  daə   rɔɔk  cɑmnəy    [parallel reading]
        3SG  walk  find  feed
        ‘It was walking and finding feed,’ or ‘it foraged.’ (SEAlang Library Khmer Text Corpus, 2007)46

(4.11′) kee  tieɲ  ksae  *mɨn  pdac
        3SG  tug   rope  NEG   separate

(4.19)  kee  kat  mɛɛkcʰəə  mɨn  dac    [resultative reading]
        3SG  cut  branch    NEG  separate
        ‘He cut the branch; but it was not off,’ or ‘he tried to cut the branch.’ (constructed example)

(4.18′) vie  daə   *mɨn  rɔɔk  cɑmnəy
        3SG  walk  NEG   find  feed

46 SEAlang Library Khmer Text Corpus. (2007). SEAlang Library Khmer. Retrieved November 1, 2020, from http://sealang.net/khmer/corpus.htm.


Turning now to the semantic mapping of these syntactic structures, the descriptions in this study show that the syntactic functions in each of the structural patterns correspond to certain semantic roles. Let us start with the verb predicates. In (1a-b), VB1 represents an action of caused-separation; as this covers a large number of verbs, the action could be any type of cutting and breaking. Intransitive VB2 refers to a resultant effect of the VB1 event. In this study, intransitive VB2 expressions involve a small group of verbs, i.e., /dac/ ‘separate (INTR)’, /rɔhaek/ ‘be.torn’, and /bak/ ‘break (INTR)’, each corresponding to certain VB1 expressions (see below for more details on the correspondence between caused-separation VB1 and resulting-separation VB2). Like VB1, transitive VB2 expresses an act of caused-separation, but it is semantically “lighter” in that it does not give manner-of-action information (Headley, 1977; Headley et al., 1997; Nath, 1967), and it is limited to a restricted set of verbs: in this study, /pdac/ ‘separate (TR)’, /bɑmbak/ ‘break (TR)’, and /bɑmbaek/ ‘shatter’. Each of these transitive verbs occurring as VB2 combined specifically with certain caused-separation verbs in VB1. I discuss the relationship between transitive VB1 and transitive VB2 expressions in the last part of this section.

In (2a-c), VB1 is from a closed set of only two verbs: /yɔɔk/ ‘take’ and /praə/ ‘use’. The two verbs represent an action of holding/taking or using an implement for an action of caused-separation, which is expressed by VB2. In the position before VB2 there appears one of the deverbal purpose linkers, /mɔɔk/ ‘COME’ or /tɨv/ ‘GO’ (cf. Haiman, 2011; Heine & Kuteva, 2002). According to Haiman (2011, p. 312), the deverbal linker /tɨv/ ‘GO’ is optional and seems limited to marking same-subject purpose. Using /tɨv/ ‘GO’ to link the preceding and succeeding predications thus helps indicate a same-subject serialisation: VB1 and VB2 share the same subject. Native speakers suggest that the deverbal linker /mɔɔk/ ‘COME’ is like the linker /tɨv/ ‘GO’ in its optionality and same-subject indication. They also comment that omitting such purpose linkers makes the clause sound abrupt. Last, VB3 expresses an action of caused-separation, like transitive VB2 in (1a-b).

In (2c), VB1 and VB2 are like their equivalents in (2a-b). However, the deverbal purpose linker /ʔaoy/ ‘GIVE’ (Haiman, 2011; Heine & Kuteva, 2002) is placed before the clause-final verb, i.e., VB3. According to Haiman (2011, p. 309), the deverbal linker /ʔaoy/ ‘GIVE’ marks a change of subject, as opposed to /tɨv/ ‘GO’, a same-subject purpose clause linker. VB3 and the preceding verb, VB2, thus do not have the same subject; in (2c), the subject of VB3 is the syntactic object of VB2.

Consideration now turns to NP arguments and adjuncts. In describing event scenes, the Khmer speakers mapped the agent to the NP argument in the subject position of all the structures (1a-b) and (2a-c). The object NP of the main verb in (1a-b) is tied to the theme. Note that if VB2 is transitive, the object NP of VB1 is also that of VB2, representing the acted-upon object. The resultative adjunct headed by copular /cie/ ‘BE’ represents a resultant state of the main verb event. Following Enfield (2008, p. 409), the resultative phrase is a copula adjunct expressing the physical form of the object of VB1 (and transitive VB2) as resulting from the preceding predication of physical transformation.

In (2a-c), the first object NP of the main verb consistently expresses an implement used for caused-separation, while the Khmer speakers tied the second object NP to the theme. Only in (2c) may a preposition like /lǝǝ/ ‘on’ occur to introduce this second object NP, causing VB2 to take part in a conative construction. Following Levin (1993), verbs of caused-separation in VB2 of (2c) sometimes occur in conative constructions because they are sensitive to both contact and motion meaning components. We may surmise that a speaker would realise a preposition, as in (2c), if s/he wanted to construe the motion subevent in this way.

The three patterns of multi-verb constructions mapped with certain semantic roles are illustrated below:

(1a)  SBJ    VB1                OBJ    VB2                   (RESP)
      |      |                  |      |                     |
      AGENT  CAUSED-SEPARATION  THEME  CAUSED-SEPARATION/    RESULT
                                       RESULTING SEPARATION

(1b)  SBJ    VB1                VB2                   OBJ    (RESP)
      |      |                  |                     |      |
      AGENT  CAUSED-SEPARATION  CAUSED-SEPARATION/    THEME  RESULT
                                RESULTING SEPARATION

(2a)  SBJ    VB1           OBJ1        COME/GO    VB2                OBJ2   (VB3)
      |      |             |           |          |                  |      |
      AGENT  TAKING/USING  INSTRUMENT  PURPOSIVE  CAUSED-SEPARATION  THEME  CAUSED-SEPARATION

(2b)  SBJ    VB1           OBJ1        COME/GO    VB2                VB3                OBJ2
      |      |             |           |          |                  |                  |
      AGENT  TAKING/USING  INSTRUMENT  PURPOSIVE  CAUSED-SEPARATION  CAUSED-SEPARATION  THEME


Table 4.3 shows that many verbs of caused-separation can precede a handful of verbs, including /pdac/ ‘separate (TR)’, /bɑmbak/ ‘break’, and /bɑmbaek/ ‘smash; shatter’ (Headley, 1977; Headley et al., 1997; Nath, 1967). All the preceding verbs can be serialised with the succeeding verb /pdac/. Judging by its dictionary meaning, /pdac/ is compatible with any of the preceding verbs since it denotes an action of caused-separation without referring to any specific intended kind of result. The succeeding verb /bɑmbak/ ‘break’ appears serialised after four different verbs: /kac/ ‘snap’, /vay/ ‘strike’, /dɑm/ ‘smash’, and /cak/ ‘stab’. These preceding verbs share a common meaning component: all involve intended separation at one place on the theme object. This component is also present in /bɑmbak/ ‘break’. The last succeeding verb, /bɑmbaek/ ‘shatter’, is compatible with /vay/ ‘strike’, /dɑm/ ‘smash’, /cak/ ‘stab’, /puh/ ‘chop’, and /bok/ ‘pound’. These five verbs contain a manner component of striking the theme object with an implement. The meaning component of a blow with a tool is consistent with the succeeding verb /bɑmbaek/ ‘shatter’, which refers to an action of shattering something into multiple pieces or fractures.

The above table also suggests that all the succeeding transitive verbs are derived from intransitive sources, i.e., /dac/ ‘separate (INTR)’, /bak/ ‘break (INTR)’, and /baek/ ‘be.shattered’. In the present study, I also found one case where the non-derived verb /kap/ ‘hack’ was serialised after /cəɲcram/ ‘cut’, indicating that two non-derived verbs of caused-separation can be strung in sequence in Khmer. Furthermore, the order of transitive-verb serialisation is strict in that there is no case where a derived verb precedes a non-derived verb of caused-separation. Table 4.4 summarises the patterns of sequencing transitive verbs in the descriptions of caused-separation.

Table 4.4

Patterns of transitive verbs of caused-separation in serialisation.

Verb (preceding)  Verb (succeeding)  Example
non-derived       derived            /kac/ ‘snap’ + /pdac/ ‘separate (TR)’
non-derived       non-derived        /cəɲcram/ ‘cut’ + /kap/ ‘hack’

In contrast to transitive-verb serialisation, Table 4.5 below shows that a smaller number of caused-separation verbs can occur before verbs of resulting separation. Four different verbs, /haek/ ‘tear’, /dɑm/ ‘smash’, /kac/ ‘snap’, and /vay/ ‘strike’, were found preceding /dac/ ‘separate (INTR)’ in serialisation. Two of these verbs, /haek/ ‘tear’ and /dɑm/ ‘smash’, are also compatible with another result verb, /rɔhaek/ ‘be.torn’. Additionally, only /kac/ ‘snap’ may precede /bak/ ‘break (INTR)’. The combinability of /dac/ ‘separate (INTR)’ with more verbs than the other two result verbs can be understood semantically: like its derivative /pdac/ ‘separate (TR)’, /dac/ does not refer to any specific result type of the caused-separation verb event. Accordingly, /dac/ ‘separate (INTR)’ does not contradict the implications of any preceding verbs of caused-separation, which allude to different types of intended results. The result verb /rɔhaek/ ‘be.torn’ is morphologically derived from /haek/ ‘tear’, so it can be serialised with /haek/, emphasising the direct result of the /haek/ event. The verb /rɔhaek/ ‘be.torn’ is strung to /dɑm/ ‘smash’ since it may indicate that the result of the /dɑm/ event


either transitive or intransitive verbs. The forms following transitive verbs were far more often morphologically derived from intransitive verbs of resulting separation; there is only one case in which a non-derived verb was used in the present study. As for intransitive serial verbs, they are those expressing resultant states of separation.

Additionally, different transitive and intransitive serial verbs appeared to vary in how they were able to be strung with different preceding verbs of caused-separation.

4.3 Granularity of caused-separation categories in Khmer

The preceding section shows that the Khmer speakers made use of 24 different verbs to name the 43 stimulus scenes of caused-separation events. The 24 verbs have different distributions in that speakers applied some to many more scenes than others. On the other hand, the data also show ranges of scenes that were named with the same verbs. These verb distributions provide a means to classify different scenes into semantic categories in the caused-separation domain (cf. Vulchanova et al., 2012). In what follows I consider how Khmer partitions the caused-separation event domain, and the extent to which different degrees of lexical variation can be observed in each caused-separation category.

To make a preliminary semantic categorisation of the domain, I relied on semantic similarity across the different event stimuli (compare with Chapter 3 of this study). I first assessed such similarity through how the individual stimulus scenes were named by the speakers. Scenes described with the same verb by several of the speakers are regarded as semantically similar to one another and were classified into the same event type or (sub)category. If no common verb was ever used to name a given set of scenes, those scenes are considered semantically different and are identified as belonging to different (sub)categories. The method of assessing similarity for Khmer, i.e., cluster analysis, accordingly follows that described for Thai in the preceding chapter.
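The abstract names Jaccard’s index and average linkage as the basis of the cluster analyses. The following pure-Python sketch shows how scenes could be compared by the sets of verbs used to name them and then merged step by step. It is a minimal illustration under stated assumptions, not the software or settings used in this study: the function names are hypothetical, and all verb sets are invented toy data except that of Scene 32, which is taken from § 4.2.1.

```python
from itertools import combinations

# Verb sets per scene. Only "scene_32" comes from the text (§ 4.2.1);
# the "toy_*" scenes are HYPOTHETICAL and exist only to make the sketch run.
scene_verbs = {
    "scene_32": {"kat", "dɑm", "kap", "puh", "vay", "bɑmbaek"},
    "toy_A": {"kat", "han"},
    "toy_B": {"kat", "han", "cət"},
    "toy_C": {"dɑm", "vay"},
    "toy_D": {"haek"},
}


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|: 0 for identical verb sets, 1 for no shared verb."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def average_linkage(items, distance):
    """Agglomerative clustering; returns the merges as (left, right, distance)."""

    def linkage(pair):
        # Average linkage: mean pairwise distance between two clusters.
        left, right = pair
        dists = [distance(items[x], items[y]) for x in left for y in right]
        return sum(dists) / len(dists)

    clusters = [frozenset([name]) for name in sorted(items)]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        left, right = min(combinations(clusters, 2), key=linkage)
        merges.append((sorted(left), sorted(right), round(linkage((left, right)), 3)))
        clusters = [c for c in clusters if c not in (left, right)] + [left | right]
    return merges


merges = average_linkage(scene_verbs, jaccard_distance)
for left, right, dist in merges:
    print(left, "+", right, "at distance", dist)
```

The sequence of merges is what a dendrogram draws: scenes named with largely the same verbs join early at small distances, while a scene sharing no verb with the others joins last at distance 1.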

I consider a cluster of scenes a category on the basis of distribution: the proportion of descriptions using the same verb determines scene membership in a class, thus modelling a category in the domain (Vulchanova et al., 2012, p. 22). Each category is thus subject to the following conditions, as proposed by Vulchanova et al. First, in any category, at least one verb type applies to all member scenes. Second, the category-defining verb is predominant: it is the most frequent verb in the descriptions for the category. Last, given their membership in a category, all member scenes should reflect the same feature values.
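The first two conditions can be checked mechanically once verb counts per scene are available. The sketch below is a hypothetical illustration, not the procedure used in this study: the function name and the toy counts are assumptions, and the third condition (shared feature values) is not modelled because it depends on the feature coding of the stimuli.

```python
from collections import Counter


def defining_verbs(category):
    """Return the verbs that (1) occur in every member scene and
    (2) are the most frequent verb across the category's descriptions.

    `category` maps a scene label to a {verb: token count} mapping.
    """
    totals = Counter()
    for counts in category.values():
        totals.update(counts)  # adds the counts verb by verb
    if not totals:
        return []
    top = max(totals.values())
    return sorted(
        verb
        for verb, n in totals.items()
        if n == top and all(counts.get(verb, 0) > 0 for counts in category.values())
    )


# HYPOTHETICAL counts, invented for illustration only.
coherent = {
    "toy_scene_1": {"haek": 7, "tieɲ": 1},
    "toy_scene_2": {"haek": 8},
}
incoherent = {
    "toy_scene_1": {"haek": 7},
    "toy_scene_3": {"kac": 8},
}

print(defining_verbs(coherent))
print(defining_verbs(incoherent))
```

An empty result signals that no single verb both covers all member scenes and predominates, so the scenes would not form one category under these two conditions.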

Below, I first discuss semantic categories in the caused-separation domain in Khmer (§ 4.3.1). Then I enquire further into subcategorisation within each category (§ 4.3.2). The analysis in both subsections relies on the resulting dendrogram for Khmer and the related verb distributional patterns. In § 4.3.3, I compare the individual categories to see how they may differ through asymmetry in lexical resources and hierarchical organisation.

4.3.1 Caused-separation categories in Khmer

4.3.1.1 Patterning of caused-separation categories by verbs in Khmer

Figure 4.2 juxtaposes the dendrogram (on the left) to the verb distributions (on the right), showing seven different categories characterising the caused-separation event domain.

252

Colouring shows how each relevant column specifies proportions for each category. Two larger categories defined by /kat/ ‘cut’ and /dɑm/ ‘smash’ are portrayed as covering 53.5% of all the scenes used in this study. Five others together occupy the remaining 46.5% of the total scenes: /puh/ ‘chop’, /cak/ ‘stab’, /kac/ ‘snap’, /tieɲ/ ‘tug’, and /haek/ ‘tear’. Note that there are three scenes whose verb distributions do not form a pattern with any category given the Khmer speakers’ responses; they are Scene 42 KARATE-CHOP STICK, Scene 15 SAW STICK, and Scene 18 CUT FINGER ACCIDENTALLY. I refer to them as special scenes hereafter. For Scene 42, some speakers in the present study and other native speakers of Khmer confirm that this scene can be described with /dɑm/ ‘smash’; similarly, for Scene 15, with /kat/ ‘cut’. These two scenes are therefore not considered separate categories. Scene 18, in contrast, is a true outlier since it was only described with /mut/ ‘cut’, and thus was not classified into any of the categories indicated in Figure 4.2. I therefore exclude the scene from further discussion. Additionally, I exclude cases where a verb type occurred fewer than three times for a scene, unless they helped expose a potential verb distributional pattern. This means that, in a subcluster of scenes, I took into consideration one or two occurrences of a verb for a scene when they patterned with highly frequent uses of the same verb for other adjacent scenes. For example, /dɑm/ occurred only once for Scenes 34 and 61, but these occurrences formed a pattern with adjacent Scene 23 in the same subcluster, for which the speakers used the verb up to four times, supporting the establishment of the /dɑm/-category. In this case, the occurrences of /dɑm/ for the two scenes were counted.
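The retention rule just described can be sketched as a small filter. This is an illustrative reading rather than the exact procedure applied: it assumes that "highly frequent" means meeting the same threshold of three occurrences, and it reads the /dɑm/ example as one occurrence each for Scenes 34 and 61. The function name and the second call's scene numbers are hypothetical.

```python
MIN_COUNT = 3  # threshold stated in the text: fewer than three occurrences are set aside


def retained_scenes(verb_counts, subcluster):
    """Return the scenes of one subcluster for which a verb's occurrences are kept.

    A scene is kept if the verb meets the threshold there, or if the verb
    occurs at least once and some other scene in the same subcluster meets
    the threshold (the 'patterned with adjacent scenes' exception).
    """
    anchored = any(verb_counts.get(s, 0) >= MIN_COUNT for s in subcluster)
    return [
        s for s in subcluster
        if verb_counts.get(s, 0) >= MIN_COUNT
        or (anchored and verb_counts.get(s, 0) > 0)
    ]


# Occurrences of the 'smash' verb as reported in the text for three adjacent scenes.
dam_counts = {34: 1, 61: 1, 23: 4}
kept = retained_scenes(dam_counts, [34, 61, 23])
print(kept)

# Hypothetical contrast: two isolated uses with no frequent neighbour are dropped.
print(retained_scenes({7: 2}, [7, 8]))
```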


Figure 4.2. Hierarchical clustering of caused-separation scenes, based on corresponding verbs in Khmer.
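For readers unfamiliar with the clustering behind a dendrogram of this kind, the following is a minimal sketch of agglomerative clustering with Jaccard distance and average linkage, the two choices named in the abstract. It is not the software or data representation used for Figure 4.2: it assumes each scene is represented simply by the set of verb types used for it, the scene labels and verb sets are hypothetical, and the verb labels are ASCII stand-ins.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| for two sets of verb types."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(scene_verbs):
    """Merge scenes bottom-up and return the merge history.

    Each entry is (cluster_a, cluster_b, distance), where the distance
    between two clusters is the mean pairwise Jaccard distance between
    their member scenes (average linkage).
    """
    clusters = [frozenset([scene]) for scene in scene_verbs]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            dists = [jaccard_distance(scene_verbs[x], scene_verbs[y])
                     for x in c1 for y in c2]
            d = sum(dists) / len(dists)
            if best is None or d < best[2]:
                best = (c1, c2, d)
        c1, c2, d = best
        merges.append((c1, c2, d))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


# Hypothetical toy input: scene label -> set of verb types used for it.
scene_verbs = {
    "S1": {"kat", "kap"},
    "S2": {"kat"},
    "S3": {"dam", "vay"},
    "S4": {"dam"},
}

merges = average_linkage(scene_verbs)
for left, right, dist in merges:
    print(sorted(left), sorted(right), round(dist, 2))
```

In this toy case the two pairs that share a verb merge first, and the resulting clusters join last at the maximum distance, which is the kind of structure a dendrogram displays.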


Of the descriptions for the first and largest category (at the middle of the dendrogram), 56.19% contain the class-defining verb /kat/ ‘cut’, used for all 14 member scenes; the special /kat/ scene noted above is temporarily removed from consideration at this point. These scenes cover 80.82% of all /kat/ ‘cut’ occurrences.

The next largest category is based on nine scenes, all described with the same verb /dɑm/ ‘smash’. Of the descriptions for this category, 50% contain this defining verb, and the nine scenes cover 97.3% of all the verb’s occurrences. The third category is defined by /puh/ ‘chop’, incorporating five scenes. This verb accounts for 52.5% of the category’s descriptions, while the member scenes cover 91.3% of all /puh/ occurrences. The fourth and fifth categories each contain four scenes. The predominant verbs contained in the descriptions of these categories are /cak/ ‘stab’ (41.18%) and /kac/ ‘snap’ (90.63%), respectively. Their scenes account for 100% of both verbs’ occurrences. The sixth and seventh categories are tiny, as each includes just two scenes. The former is defined by /tieɲ/ ‘tug’, contained in 64.29% of the descriptions for the category; the latter, by /haek/ ‘tear’, contained in 93.75% of the descriptions for the category. The scenes of these two categories cover 90% and 100% of all /tieɲ/ and /haek/ occurrences, respectively. Since both of these categories are tiny, doubt arises as to whether they should be incorporated into other bigger categories. Nevertheless, their precision as categories is adequately maintained: no other verbs associated with other categories apply to the scenes of these categories, except for /pdac/ ‘separate’ relating to the /tieɲ/-category, which is discussed later. Also, the overlapping region between the sixth and seventh categories is too small to collapse them into a single category.

Table 4.6 below summarises how Khmer partitions the caused-separation event domain by seven different verbs. The table shows that these seven verbs are the predominant ones in each category, covering all the member scenes. We can consider them as class-defining verbs in Khmer and make use of them to label the individual corresponding categories: /kat/-, /dɑm/-, /puh/-, /cak/-, /kac/-, /tieɲ/-, and /haek/-categories.

Table 4.6

Summary of predominant Khmer verbs in seven caused-separation categories.

Domain: Caused-separation

| Category (CAT) | I | II | III | IV | V | VI | VII |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Predominant verb (Vx) | /kat/ ‘cut’ | /dɑm/ ‘smash’ | /puh/ ‘chop’ | /cak/ ‘stab’ | /kac/ ‘snap’ | /tieɲ/ ‘tug’ | /haek/ ‘tear’ |
| No. of scenes (% of 43 scenes) | 14 (32.56%) | 9 (20.93%) | 5 (11.63%) | 4 (9.3%) | 4 (9.3%) | 2 (4.65%) | 2 (4.65%) |
| % of descriptions containing Vx in CAT | 56.19% | 50% | 52.5% | 41.18% | 90.6% | 64.29% | 93.75% |
| % of all occurrences of Vx in this research | 80.82% | 97.3% | 91.3% | 100% | 100% | 90% | 100% |
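The two percentage rows of Table 4.6 answer different questions: how dominant a verb is within its category, and how much of the verb's overall use the category captures. A minimal sketch of both calculations follows. It assumes, for simplicity, that each verb token counts as one description, which may not hold where verbs are serialised; the function name and all counts are hypothetical, with ASCII stand-ins for the verbs.

```python
def category_profile(counts, member_scenes, verb):
    """Return (% of the category's verb tokens that are `verb`,
               % of all tokens of `verb` that fall inside the category)."""
    in_category = sum(counts[s].get(verb, 0) for s in member_scenes)
    category_total = sum(sum(counts[s].values()) for s in member_scenes)
    overall = sum(c.get(verb, 0) for c in counts.values())
    return (100 * in_category / category_total, 100 * in_category / overall)


# Hypothetical toy counts: scene label -> {verb: occurrences}.
counts = {
    "S1": {"kat": 6, "kap": 2},
    "S2": {"kat": 4, "han": 4},
    "S3": {"dam": 7, "kat": 2},
}

within, coverage = category_profile(counts, ["S1", "S2"], "kat")
print(round(within, 2), round(coverage, 2))
```

A verb can score high on one measure and low on the other, which is why the table reports both.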

The categorisation of the caused-separation event domain in Khmer correlates with the frequency analysis (see Table 4.2), as six of the seven categories reflect the usages of six of the ten frequent verbs, i.e., /kat/ ‘cut’, /dɑm/ ‘smash’, /kac/ ‘snap’, /puh/ ‘chop’, /haek/ ‘tear’, and /cak/ ‘stab’. The residual category was labelled by the relatively less frequent verb /haek/ ‘tear’ almost always used for it. The verb /haek/ ‘tear’ includes features discussed further in § 4.5.

Figure 4.2 also shows that there are other less used verbs of caused-separation involved in each category, besides the seven corresponding predominant verbs. These verbs attest lexical variation in each category. The next subsection presents the other infrequent verbs associated with the individual seven categories.

4.3.1.2 Lexical variation within caused-separation categories in Khmer

As pointed out above, the Khmer speakers also characterise most of the posited categories using other less frequent corresponding verbs.

In the /kat/-category, six other infrequent verbs were used to describe some of the member scenes. Specifically, /kap/ ‘hack’ was contained in 20.95% of the category’s descriptions, spanning five of the 14 scenes (35.71%). The verbs /ʔaa/ ‘saw’ and /han/ ‘slice’ account for 9.52% and 7.62% of the descriptions for the category, covering three (21.43%) and two (14.29%) of the member scenes, respectively. Of the category’s descriptions, 5.71% contain the sporadically used verbs, i.e., /cəɲcram/ ‘cut’ (3.81%), /cət/ ‘slit’ (0.95%), and /cie/ ‘prune’ (0.95%). The verb /cəɲcram/ is associated with two (14.29%) of the /kat/-category’s scenes, while /cət/ and /cie/ are each associated with one scene (7.14%).

The /dɑm/-category incorporates six other infrequently used verbs, besides the class-defining verb /dɑm/. Of the category’s descriptions, 15.28% contain /vay/ ‘strike’, 15.28% contain /kat/ ‘cut’, and 12.5% contain /pdac/ ‘separate’. These verbs cover seven (77.78%), three (33.33%), and four (44.44%) of the nine member scenes, respectively. Also, there are three other sporadic verbs. The verb /bɑmbaek/ ‘smash’ was contained in 4.17% of the category’s descriptions, covering two (22.22%) of the member scenes. The other two, /kap/ ‘hack’ and /puh/ ‘chop’, were each found in 1.39% of the descriptions for the category, and both were used for the same single scene (11.11% of the member scenes): Scene 32 KARATE-CHOP CARROT CROSSWAYS.


The /puh/-category, in which /puh/ ‘chop’ is predominant, involves six other verbs. The verb /kap/ ‘hack’ was used in 17.5% of the descriptions for the category, spanning three (60%) of the five member scenes. The verbs /criək/ ‘slit’ and /veah/ ‘slit’ are each found in 5% of the category’s descriptions. Each of them covers two (40%) of the member scenes. One of these is represented by both items: Scene 14 CUT MELON W/ KNIFE. The other three sporadically used verbs in this category are /cət/ ‘slit’, /cəɲcram/ ‘cut’, and /cie/ ‘prune’, each covering only one of the category’s scenes.

The /cak/- and /kac/-categories contain the same number of member scenes (= 4). They involve different numbers of verbs in the descriptions. Alongside /cak/ ‘stab’, nine other verbs were used less frequently to describe the /cak/-category. They are /bok/ ‘pound’, /pdac/ ‘separate’, /kat/ ‘cut’, /tumluh/ ‘puncture’, /crəp/ ‘cut’, /dɑm/ ‘smash’, /puh/ ‘chop’, /vay/ ‘strike’, and /bɑmbaek/ ‘smash’. The verbs /bok/ and /pdac/ were each contained in the same number of descriptions (11.76% of all for this category), covering four (100%) and one (25%) of the four member scenes, respectively. Likewise, the verbs /kat/ and /tumluh/ were used with the same frequency (8.82% of all descriptions for this category). These account for three (75%) and one (25%) of the category’s scenes, respectively. The verb /crəp/ was included in 5.88% of the category’s descriptions, structured around one scene. The other sporadic verbs /dɑm/, /puh/, /vay/, and /bɑmbaek/ were each contained in 2.94% of all the descriptions for the category, each built around one scene. The /kac/-category involves two less frequently used verbs, i.e., /bɑmbak/ ‘snap’ and /ʔok/ ‘hit.down’, besides /kac/ ‘snap’, the predominant verb for this category. Covering two (25%) of the four relevant member scenes, /bɑmbak/ was included in 6.25% of the category’s descriptions. Finally, the verb /ʔok/ was observed only for one scene: Scene 5 BREAK STICK OVER KNEE.

The last two categories incorporate other less frequently used verbs, beyond their predominant defining verbs: /tieɲ/ ‘tug’ and /haek/ ‘tear’. The /tieɲ/-category was also labelled using /pdac/ ‘separate’; 35.71% of the descriptions for the category contained this verb, used for both member scenes. The /haek/-category, in addition to /haek/, was named with /tieɲ/ for one scene: Scene 1 TEAR CLOTH BY HAND. I found only a single instance of this verb for the category.

Table 4.7 below outlines the lexical multiplicity discovered in the seven categories in the domain of caused-separation in Khmer.

Table 4.7

Khmer verb types and percentage of occurrences for seven categories of caused-separation; verbs underlined are those appearing in more than one category.

Caused-separation category in Khmer

| /kat/- | /dɑm/- | /puh/- | /cak/- | /kac/- | /tieɲ/- | /haek/- |
| --- | --- | --- | --- | --- | --- | --- |
| /kat/ 56.19% | /dɑm/ 50.0% | /puh/ 52.50% | /cak/ 41.18% | /kac/ 90.6% | /tieɲ/ 64.29% | /haek/ 93.75% |
| /kap/ 20.95% | /vay/ 15.28% | /kap/ 17.50% | /pdac/ 11.76% | /bɑmbak/ 6.25% | /pdac/ 35.71% | /tieɲ/ 6.25% |
| /ʔaa/ 9.52% | /kat/ 15.28% | /cət/ 12.50% | /bok/ 11.8% | /ʔok/ 3.13% | | |
| /han/ 7.62% | /pdac/ 12.50% | /cəɲcram/ 5.00% | /kat/ 8.82% | | | |
| /cəɲcram/ 3.81% | /bɑmbaek/ 4.17% | /criək/ 5.00% | /tumluh/ 8.82% | | | |
| /cət/ 0.95% | /puh/ 1.39% | /veah/ 5.00% | /crəp/ 5.88% | | | |
| /cie/ 0.95% | /kap/ 1.39% | /cie/ 5.00% | /dɑm/ 2.94% | | | |
| | | | /puh/ 2.94% | | | |
| | | | /vay/ 2.94% | | | |
| | | | /bɑmbaek/ 2.94% | | | |

The above table suggests some general trends for the Khmer data. For each category the predominant verb is far more frequent than its less-used counterparts. This indicates that the Khmer speakers generally named events of these categories using the compact set of seven different verbs. Other verbs were used less frequently. Also, some categories show a wider range of verbs, therefore suggesting that the speakers named these categories less consistently than others. For example, the /cak/-category incorporates up to ten different verbs, which is the highest number as compared to other categories. It is accordingly prone to the highest inconsistency; however, this requires further investigation and discussion (see § 4.4.2). Furthermore, there appear to be four of the predominant verbs and seven of the infrequent verbs involving more than one category, showing potential overlap between certain categories. I discuss this issue later in § 4.4.1.

Referring again to Figure 4.2, not only are there certain infrequently used verbs associated with each of the seven categories, but some verbs also appear to show distinct distributional patterns. This enables finer-grained categorisation in the domain. In the next subsection, I re-examine the categories based on verb frequency and assess the extent to which the Khmer semantic domain of caused-separation may be subclassified.

4.3.2 Partitioning of caused-separation categories into subcategories in Khmer

The previous subsection shows that certain verbs occurring less frequently in scene descriptions are involved in the caused-separation event categories. This is in addition to their corresponding predominant and class-defining verbs. Moreover, the distributional patterns of some of these less-frequently used verbs can serve to fine-grain the categories of caused-separation in Khmer into subcategories or even minor divisions. The following argument gives more detail on subcategorisation within each category, based on Figure 4.2. To facilitate the present discussion, the figure is reproduced here with modification. It is dissected into seven derived figures, as shown in Figures 4.3a-g, corresponding to the individual categories.

Before discussing fine-grained categorisation in this domain, I would like to make some points about /pdac/ ‘separate’. Table 4.2 above reveals that /pdac/ was serialised with a wide range of preceding verbs of caused-separation, which I regard as associated with different categories (see Table 4.7). I therefore view the verb as likely compatible with many types of caused-separation; it perhaps bears the generic meaning of ‘making separation’. Such a potential echoes the prescriptive definition of /pdac/, i.e., ‘separate (TR)’, given in the dictionaries (Headley et al., 1997; Nath, 1967). Not only was /pdac/ serialised with more specific caused-separation verbs, but it also occurred as the main verb in the single-verb construction. According to the clustering analysis (see Figure 4.2), the combined occurrences of /pdac/ display a distinctive distributional pattern within three of the seven categories, indicating subcategorisation. However, on inspection, there seem to be few or no characteristics in common between the scenes described by /pdac/ across those different categories. In this case, the verb’s distributions raise some doubts as to the subcategorisation methodology discussed below.


Figure 4.3a. Cluster tree and verb frequency for /kat/-category.

Figure 4.3b. Cluster tree and verb frequency for /dɑm/-category.


Figure 4.3a shows how the /kat/-category incorporates three subcategories. Relevant in this subdivision are distributional patterns of less-frequently used verbs, including /kap/ ‘hack’, /ʔaa/ ‘saw’, and /han/ ‘slice’. These verbs take up 70.98%, 100%, and 75% of all the respective occurrences in the descriptions. (The special scene, Scene 15 SAW STICK, requires extra consideration; see below.) Henceforth, for brevity, I refer to these subcategories by the pertinent defining verbs, i.e., the /kap/-, /ʔaa/-, and /han/-subcategories. The /kap/-subcategory then covers five scenes in the /kat/-category, while the /ʔaa/- and /han/-subcategories span four and two scenes, respectively (with Scene 15 included in the /ʔaa/-subcategory).

Furthermore, the /kap/-subcategory seems to contain a small subdivision, with /cəɲcram/ ‘cut’ further dividing the subcategory, as structured around two scenes: Scene 4 CHOP CLOTH W/ KNIFE and Scene 6 CHOP CARROTS W/ KNIFE. The verb /cəɲcram/ for this small division accounts for 66.67% of all its occurrences in the descriptions. This small division is referred to as the /cəɲcram/-subdivision, after its defining verb.

The /dɑm/-category in Figure 4.3b shows a single subcategory determined by the distributional pattern of /vay/ ‘strike’, called the /vay/-subcategory. The special scene, Scene 42 KARATE-CHOP STICK, is connected with the /vay/-subcategory, accounting for 93.75% of all its occurrences in the descriptions. The /vay/-subcategory is further subdivided by the occurrence pattern of /kat/ ‘cut’, which is related to three different scenes (at the top of the category’s figure). Additionally, in this category /pdac/ ‘separate’ occurred nine times (45% of all this verb’s occurrences), showing a pattern of occurrences around four scenes: Scene 34 KARATE-CHOP CLOTH, Scene 61 KARATE-CHOP ROPE, Scene 23 CHOP CLOTH W/ HAMMER, and Scene 50 CHOP ROPE W/ HAMMER. Three of the scenes (34, 61, and 23) are also shared with /kat/, while the remaining scene is shared with /vay/. The discussion of /pdac/ above indicates why /pdac/ should not be assigned to categorisation within the /dɑm/-category.

Figure 4.3c shows the /puh/-category’s hierarchical structure. This category clearly contains two subcategories. One is determined by the occurrence pattern of /kap/ ‘hack’, which takes up 22.58% of all this verb’s occurrences in this study. Though the number of occurrences for this category is small, /kap/ was distributed across the three scenes that were grouped relatively closely in the same subcluster within the /puh/-category’s cluster structure. The scenes include Scene 48 CHOP BRANCH W/ AXE, Scene 51 CHOP MELON W/ KNIFE, and CHOP CARROT W/ AXE. Since /kap/ defines this subcategory, it is called the /kap/-subcategory. The other subcategory is defined by the occurrences of /cət/ ‘slit’, structured around only one scene, i.e., Scene 14 CUT MELON W/ KNIFE. Despite the small size, I consider the existence of this subcategory to be firmly established, due to its defining verb. The verb /cət/, occurring six times in all, occurred five times (83.33%) for this subcategory’s scene. This subcategory is thus called the /cət/-subcategory.


Figure 4.3c. Cluster tree and verb frequency for /puh/-category.

Figure 4.3d. Cluster tree and verb frequency for /cak/-category.

Figure 4.3e. Cluster tree and verb frequency for /kac/-category.


Figure 4.3f. Cluster tree and verb frequency for /tieɲ/-category.

Figure 4.3g. Cluster tree and verb frequency for /haek/-category.


In Figure 4.3d, the /cak/-category is shown to possess one subcategory. This subcategory was defined by the distributional pattern of /tumluh/, based on a single scene, i.e., Scene 45 POKE HOLE IN CLOTH W/ TWIG. It is called the /tumluh/-subcategory. Though comprising only one scene, the subcategory’s status is certain as 100% of the /tumluh/ occurrences in this study were assigned to it, indicating that this one scene is a member of the subcategory which the verb adequately specifies. Note that in the /cak/-category, /pdac/ was applied four times (20% of all this verb’s occurrences) to describe Scene 2 CHOP ROPE W/ CHISEL. In line with the issue discussed earlier, /pdac/ is for now not regarded as involved in the subcategorisation of this category.

Differing from the above categories in the domain, the /kac/-category does not seem to possess any internal hierarchical structure, as presented in Figure 4.3e. Though there are other infrequent verbs involved with the category, namely /bɑmbak/ ‘snap’ and /ʔok/ ‘hit.down’, their use is too small-scale to allow for subcategorisation.

According to Figure 4.3f, the /tieɲ/-category involves no subcategory since the Khmer speakers patterned no other verb in the descriptions for this category, except for /pdac/ ‘separate’. Even though both scenes of this category were expressed by /pdac/, this verb is not regarded as forming any subcategory within the /tieɲ/-category.

Figure 4.3g depicts a similar distribution: like the /kac/- and /tieɲ/-categories, the /haek/-category does not show any subcategory since the only alternative verb /tieɲ/ ‘tug’, aside from predominant /haek/ ‘tear’, does not make up any occurrence pattern to represent subcategorisation. To conclude this section, four of the seven caused-separation event categories in Khmer can be split into subcategories as determined by the less frequently used verbs corresponding to each category. The /kat/-category displays the most hierarchical complexity, embodying three subcategories, one of which is even further subdivided. The /puh/-category consists of two subcategories, characterised by /kap/ and /cət/. The /dɑm/- and /cak/-categories each incorporate one subcategory: the /vay/- and the /tumluh/-subcategories, respectively. The /vay/-subcategory can be further subclassified into the /kat/-subdivision.

In contrast to the above categories, the /kac/-, /tieɲ/-, and /haek/-categories do not show any potential subcategorisation. The verb /pdac/ ‘separate’, as discussed at the beginning of this section, concerns three categories. Specifically, its distributional patterns are seen in the /dɑm/- and /tieɲ/-categories as covering four and two scenes, respectively. Also, /pdac/ was used four times by different Khmer speakers in descriptions of a scene in the /cak/-category. Subcategorisation has not been determined for this verb because its patterns are linked to divergent categories showing little or no identifiable commonality. Also, as given in the prescriptively positioned dictionaries (Headley et al., 1997; Nath, 1967), /pdac/ may not specify any relevant event type (i.e., of caused-separation) but refers merely to ‘separation’. The scenes characterised by the occurrence pattern of the verb (as in § 4.5) do not suggest a clear extensional meaning intended by the speakers.

In addition, the observed subcategories suggest that some caused-separation categories may overlap with one another in meaning. Certain overlaps between categories are further analysed in § 4.4. Before that, the following subsection explores potential asymmetry in lexical density and granular levels across the caused-separation categories in Khmer as identified by assigned vocabulary and the fine partitioning of the domain.


4.3.3 Asymmetric lexical resources and granularity in caused-separation categories in Khmer

In this subsection, I consider how Khmer lexical items and the hierarchical structure of each category help unveil the asymmetrical nature of lexical resources and degrees of fine-grained division within the domain of caused-separation events.

Figure 4.4 below shows the amount of lexical variation across different caused-separation event categories, reflecting distributions reported in Table 4.7. In the figure, the /cak/-category comprises the largest number of verbs (10), followed by the /kat/- (7), /dɑm/- (7), /puh/- (7), /kac/- (3), /tieɲ/- (2), and /haek/-categories (2). The different availability of lexical resources for each category reveals lexical asymmetry in the domain. (Note that the special scene associated with the /dɑm/-category, Scene 42 KARATE-CHOP STICK, is not counted in Figure 4.4; if it were, one more verb type, i.e., /bɑmbak/ ‘snap’, would come into play, enlarging the already packed category.)

Figure 4.4. Lexical density within seven categories of caused-separation in Khmer.
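The counts in Figure 4.4, and the earlier observation that four predominant and seven infrequent verbs recur across categories, can be recomputed from the verb inventories in Table 4.7. The sketch below is only a cross-check under that assumption: the dictionary is transcribed from the table, and the variable names are illustrative.

```python
from collections import Counter

# Verb inventories per category, transcribed from Table 4.7
# (keys are the predominant, class-defining verbs).
table_4_7 = {
    "kat": ["kat", "kap", "ʔaa", "han", "cəɲcram", "cət", "cie"],
    "dɑm": ["dɑm", "vay", "kat", "pdac", "bɑmbaek", "puh", "kap"],
    "puh": ["puh", "kap", "cət", "cəɲcram", "criək", "veah", "cie"],
    "cak": ["cak", "pdac", "bok", "kat", "tumluh", "crəp",
            "dɑm", "puh", "vay", "bɑmbaek"],
    "kac": ["kac", "bɑmbak", "ʔok"],
    "tieɲ": ["tieɲ", "pdac"],
    "haek": ["haek", "tieɲ"],
}

# Lexical density: number of distinct verb types per category.
density = {cat: len(set(verbs)) for cat, verbs in table_4_7.items()}

# Verbs attested in more than one category.
membership = Counter(v for verbs in table_4_7.values() for v in set(verbs))
shared = {v for v, n in membership.items() if n > 1}
shared_predominant = shared & set(table_4_7)
shared_infrequent = shared - set(table_4_7)

print(density)
print(sorted(shared_predominant), sorted(shared_infrequent))
```

The recomputed densities match the figures cited above (10, 7, 7, 7, 3, 2, 2), and the shared verbs divide into four predominant and seven infrequent items.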


In line with the discussion in §§ 4.3.1 and 4.3.2, beyond the predominant category-defining verbs, other verbs involved in each category have established subcategorisations in the caused-separation event domain. Figure 4.5 presents a simplified version of the architecture of the categories and subcategories in the domain for Khmer. Note that in this figure the domain branches into the categories, subcategories, and small divisions as denoted by solid lines. Also, the figure marks each class-defining verb with the dash symbol “-” to represent (sub)classifications. Furthermore, it links the occurrences of /pdac/ (indicated by the asterisk) to some categories in the hierarchical structures via dashed lines.

Figure 4.5. Simplified hierarchical structure of semantic domain “caused-separation” in Khmer.


The figure shows that some categories contain subclasses placed deeper in the hierarchical structure. In contrast, some others have no prominent smaller classes. The /kat/-category has a deeper hierarchy and finer-grained distinctions since it splits into three subcategories and one small division. As for the /dɑm/-category, it has one subcategory, which further consists of a subdivision. It is also pictured as deep hierarchically. The /puh/-category has a less deep hierarchy but has finer distinctions comprising two subcategories with no small divisions. The /cak/-category, displaying one subcategory without further subcategorisation, has a less deep hierarchical architecture. In contrast to the mentioned four categories, the figure presents the /kac/-, /tieɲ/-, and /haek/-categories as having the flattest structures: none of them involves subcategorisation.

The above two figures yield interesting information for comparison, raising methodological issues relating to interpretation. The /cak/-category contains a relatively higher number of verbs, albeit having only one established subcategory. In contrast, although the /puh/- and /kat/-categories can break down into two and three subcategories respectively, they each utilise a smaller number of verbs. Such reverse variation at this detailed level reflects the different ways of deriving verb lists for categories and subcategories within them. The lists of verbs have arisen out of all the extensional uses of verbs for the categories. The (sub)categorisation was by contrast defined by patterns of verb occurrence across the speakers and scenes. Specifically, the listing of verbs corresponding to each category is inclusive, incorporating all of the verbal items used for describing the member scenes of the given category. The verb list shows maximal capacity as far as data for this study indicate. A subcategory or smaller division is instead determined by the occurrence pattern(s) of a verb type. In effect, though there appear to be many verbs involved in a particular category, these verbs alone would not determine a subclass without one additional condition. The condition is that the verbs must show extension over the relevant scenes grouped under a (sub)cluster in the hierarchical dendrogram as indicated in Figure 4.2. As a result, a wide range of vocabulary in a category does not always produce a significant number of subcategories or small divisions.

Nevertheless, a larger picture emerges. Figure 4.5 illustrates four categories containing subcategories. The other three have no subcategory. The categories with subcategories involve seven to ten different verbs. The categories that show no subcategorisation by contrast involve far fewer different verbs. According to this trend, in general terms, a category with a higher number of verbs tends to host a more significant number of subcategories. That being the case, although dependence of the number of subcategories in a category upon the number of verbs involved in the category’s descriptions is not a rule, frequently these quantities correlate.

No matter whether the number of verbs would strongly predict the granularity, we can see one thing from the lexical density and hierarchical structures in the domain’s categories: the asymmetry in the number of verbs within the categories, and the fact that each of them is fine-grained.

4.3.4 Summary

Based on the scene descriptions elicited from the Khmer speakers, seven different categories in the caused-separation event domain can be determined. Each category has a different number of lexical descriptors (i.e., verbs of caused-separation). Apart from the predominant class-defining verbs, occurrence patterns show less frequently used verbs. Based on these less frequent items, some categories can break down into subcategories, and sometimes into even smaller divisions. The different numbers of verbs for each category and the different manifestations of granularity within them then accentuate the skewed nature of lexical distributions and subtle distinctions in the caused-separation event domain in Khmer.

4.4 Boundary locations of caused-separation categories in Khmer

This section aims at detecting semantic distinctions between the caused-separation categories in Khmer. Such semantic contrasts or nuances establish semantic boundaries of the categories and allow semantic relationships between them to be inferred. In turn, such boundaries may disclose information in response to particular issues raised in § 4.3.1.2 but left unresolved. The issues include whether some categories overlap one another in meaning and whether the higher number of applied verbs means higher inconsistency in the speakers’ descriptions. In § 4.4.1, I engage with the semantic groupings of caused-separation scenes, the core versus peripheral scenes of each category, and the potential overlapping regions of some different categories. Then, the Diversity Index of all the scenes is defined to measure heterogeneity in naming core and peripheral scenes in each category and to probe the categories with and without overlapping memberships (see § 4.4.2). A summary follows in § 4.4.3.

4.4.1 Placement of caused-separation category boundaries in Khmer

4.4.1.1 Grouping of caused-separation events in Khmer

Hierarchical clustering in the caused-separation domain in Khmer is visualised in Figure 4.2, which shows that speakers of the language grouped events of the domain partly in line with cross-linguistic trends seen in many languages (Majid et al., 2008, discussed below). Language-specific nuances are nevertheless detected in Khmer. Both language-specific and cross-linguistic characteristics are relevant for analysing this domain.

Let us first deal with how the Khmer speakers distinguished between events or scenes of caused-separation in ways that were in accordance with the cross-linguistic distinctions identified by Majid et al. (2008). As reported by Majid et al., the separation domain is carved up according to levels of the predictability of the location of separation (pp. 240-242). Khmer also observes this semantic distinction, splitting between high- and low-predictability events: for example, it has distinct verbs for cutting versus smashing. Intermediate-predictability events seem to merge into one of the poles, i.e., fine-precision events: for example, sawing and chopping are lumped together.

However, to understand the predictability distinction more precisely, we need to look at scenes containing the use of instruments and those with hand-use separately. Scenes in each split group appear to have distinct patterns. Among the scenes with instrument use, high-predictability events (e.g., Scenes 15 or 10) were named by different verbs from those with relatively low predictability (e.g., Scenes 31, 21, or 40). Specifically, Scene 15 SAW STICK and Scene 10 SLICE CARROT W/ KNIFE were described by /ʔaa/ ‘saw’, whereas Scene 31 SMASH STICK W/ HAMMER, Scene 21 SMASH CARROT W/ HAMMER, and Scene 40 SMASH PLATE W/ HAMMER were all characterised by /dɑm/ ‘smash’. Again, hand actions with low predictability were described with a different verb from intermediate-predictability counterparts. A case in point is Scene 25 SNAP TWIG versus Scene 38 BREAK YARN BY HAND. The former is an imprecise action, labelled by /kac/ ‘snap’. We can view the latter as having relatively intermediate predictability; accordingly, it was often described by /tieɲ/ ‘tug’.


Majid et al. (2008) also argue for counteractions between event features in gauging levels of predictability. For example, though scenes with the canonical use of a hammer for ballistically striking upon a (rigid) object like a pot would be prone to low predictability, if the object is instead tautly stretched cloth, some speakers may construe the event feature as having intermediate predictability instead, since the cloth will separate in only one place, not into multiple fragments. This is important because it means that different features may require different defining verbs. In Khmer, the speakers named Scene 39 SMASH POT W/ HAMMER with /dɑm/ ‘smash’, while they described Scene 23 CHOP STRETCHED CLOTH W/ HAMMER potentially using /dɑm/ ‘smash’ and sometimes /kat/ ‘cut’. The latter verb is common to scenes with relatively high/intermediate predictability, showing the different estimation of predictability.

Moving on to another cross-linguistic distinctive pattern, that of tearing, I consider Khmer as contributing to this pattern since in this study it distinguishes the event type from other caused-separation types in two scenes: Scene 1 TEAR CLOTH BY HAND and Scene 36 TEAR CLOTH ABOUT HALFWAY W/ HANDS. The Khmer speakers described both scenes almost exclusively with /haek/ ‘tear’, and 100% of the verb’s occurrences in this study were attributed to these scenes.

Khmer also follows the cross-linguistic pattern of snapping versus smashing distinction (cf. Dimension 3 in Majid et al., 2008). In this study, all four scenes pertaining to the snapping event type (Scene 25 SNAP TWIG, Scene 57 SNAP CARROT, Scene 5 BREAK STICK, and Scene 19 SNAP TWIG) were described differently from those of smashing (Scenes 31, 21, 40, and 39). The Khmer speakers referred to the snapping events with /kac/ ‘snap’ while characterising those of smashing by /dɑm/ ‘smash’, as discussed above.


The last cross-linguistically recurrent pattern observed in Khmer is the distinguishing of poking-a-hole (cf. Majid et al., 2008) from other similar or close separation types, e.g., some chopping events. Scene 45 POKE HOLE IN CLOTH was described with /cak/ ‘stab’ in Khmer, while the speakers of Khmer commonly named chopping scenes with /kat/ ‘cut’ and /puh/ ‘chop’. However, the verb for Scene 45 also applied to three other chopping scenes, i.e., Scene 43 CHOP CARROT W/ CHISEL, Scene 53 CHOP STICK W/ CHISEL, and Scene 2 CHOP ROPE W/ CHISEL. The three scenes are similar to Scene 45 in that the actor in each used a chisel in the canonical manner of a downward blow. One nuance is that Scene 45 concerns a partial separation, i.e., a hole in the cloth, whereas the others involve complete divisions.

The wide-ranging semantic inclusivity of /cak/ ‘stab’, which labels Scene 45 as well as other scenes involving a chisel, is not too unexpected. Some other languages have been reported with the same application pattern. For English, Majid et al. demonstrate that the verbs stab and bodge were used to refer to both Scene 45 and Scene 43 (p. 243).

As noted in the opening of this part, Khmer still exhibits some language-specific semantic distinctions in the domain. The Khmer speakers observed certain notions that may not hold from the cross-linguistic perspective. Outlined below are such notions relating to predictability of location of separation, as well as to other likely language-specific discriminations.

As remarked above, an important viewpoint concerns the predictability of locus of separation taken as a distinctive semantic notion. This is more clearly shown if we consider hand actions and events using instruments separately. The suggestion here is that a critical distinction is required prior to the predictability distinction: instrument versus manual manipulation. In this study, I argue that Khmer distinguishes between caused-separation events performed by hand and events performed with a tool. Evidence for this claim is that different types of verbs were often used to describe these two event types. For example, events of snapping, pulling apart and tearing were named by the Khmer speakers with /kac/ ‘snap’, /tieɲ/ ‘tug’, and /haek/ ‘tear’, respectively. In contrast, event types which concern the use of instruments like smashing, cutting, or chopping were most frequently described by /dɑm/ ‘smash’, /kat/ ‘cut’, or /puh/ ‘chop’, respectively.

There appear to be two possible divergences: one involving karate-chopping events and the other involving the use of /pdac/ ‘separate’. As for karate-chopping, even though events of this type were designed to be hand actions (cf. Majid et al., 2008, p. 240), the Khmer speakers lumped them together with events of smashing, naming both with /dɑm/ ‘smash’.

Superficially, events of karate-chopping may appear to infringe the distinction of instrument versus manual manipulation. Yet, if we were to look at those events in an interpretational way, we might speculate that perhaps the speakers would consider the knife hand as an instrument. Specifically, the speakers of Khmer would think of the hand edge as the blunt end (e.g., of a hammer) used in striking a blow. That being the case, karate-chopping actions could be re-read as those containing the use of instruments.

The other deviation concerns /pdac/ ‘separate’. I found the verb describing both hand actions (i.e., of tearing) and events involving use of instruments. In this study, 75% of /pdac/’s occurrences were contained in the descriptions of instrument-manipulated scenes: for example, Scene 34 KARATE-CHOP CLOTH and Scene 61 KARATE-CHOP ROPE. The other 25% of the verb’s occurrences were involved in the descriptions of two manually manipulated scenes: Scene 35 BREAK YARN INTO PIECES and Scene 38 BREAK SINGLE PIECE OFF YARN. The seeming infringement of /pdac/ is explicable in that it may appear with the generic meaning of ‘separation’. If so, this verb can characterise any caused-separation action, as also suggested by the dictionaries (Headley et al., 1997; Nath, 1967). This explanation is dealt with later in § 4.5. In any case, the instrument versus manual manipulation distinction still holds true, regardless of tools and except for the occasional use of the putative general verb of separation /pdac/ ‘separate’.

Among scenes concerned with instrument manipulation, those involving imprecision in caused-separation were distinguished from those with high/intermediate predictability, as determined by different use of verbs. However, events of the latter type were found to split into two sets that were described differently. This suggests that another important distinctive notion should be recognised: above the distinction of predictability of location of separation but below the notion of instrument (versus manual) manipulation. Consider the two sets of high/intermediate-predictability events. I found that the difference between these sets involves the direction or orientation of separation. One group includes Scene 9 SLICE CARROT W/ KNIFE, Scene 14 CUT MELON W/ KNIFE, and Scene 37 CHOP CARROT W/ AXE. These events show high/intermediate predictability characterised by lengthwise caused-separation along theme objects. The other group includes Scene 15 SAW STICK, Scene 28 CUT FISH W/ KNIFE, and Scene 39 SMASH POT W/ HAMMER. These provide examples of caused-separation that is non-lengthwise in orientation, i.e., crosswise or into multiple pieces. Note that hand actions would also follow the lengthwise versus non-lengthwise distinction. Events of tearing, which normally involve caused-separation in the lengthwise direction, were usually described by a different verb from the non-lengthwise actions of snapping and pulling apart, which yields the distinction. The stimuli displaying tearing events in this study are however limited and rather unclear regarding the distinction. Further study is needed to settle this point.

Correspondingly, as seen above, the Khmer speakers in this study did not consistently distinguish scenes with intermediate predictability: e.g., those of chopping or karate-chopping, from those with high predictability. In fact, events intermediate in predictability were lumped together with both high precision and low predictability caused-separation actions. In effect, the speakers grouped some of such intermediate predictability events with either polarity. Some chopping events like

Scene 13 CHOP ROPE W/ AXE or Scene 54 CHOP CARROT W/ AXE were associated with high precision cutting events since the speakers described them with the same verb

/kat/ ‘cut’. Some actions of chopping, such as Scene 37 CHOP CARROT W/ AXE or

Scene 48 CHOP BRANCH W/ AXE were named with the same verb /puh/ ‘chop’ as a slitting event with high predictability (Scene 14 CUT MELON W/ KNIFE). In contrast, the speakers grouped events of karate-chopping, such as Scene 32 KARATE-CHOP CARROT and Scene 34 KARATE-CHOP CLOTH along with scenes characterised by imprecision such as smashing actions: Scene 21 SMASH CARROT W/ HAMMER, thus being described alike with /dɑm/ ‘smash’.

In Khmer, further semantic distinctions appear to be associated with instrument type, for events featuring the use of instruments, and with object characteristics, for hand actions. Among scenes with high/intermediate predictability, those involving the use of sharp-bladed implements like a knife are expressed differently from those including the use of pointed tools like a chisel. For example, Scene 49 CUT ROPE W/ KNIFE was named with /kat/ ‘cut’, and Scene 51 CHOP MELON W/ AXE was described by /puh/ ‘chop’. In contrast, Scene 53 CHOP STICK W/ CHISEL was described with /cak/ ‘stab’. Scenes with blunt instruments are consistent with low/intermediate predictability, being described distinctly with /dɑm/ ‘smash’. As for scenes displaying hand actions, they follow the distinctions made by characteristics of objects. Those containing flexible objects differ in description from those with rigid objects. A case in point is Scene 1 TEAR CLOTH BY HAND versus Scene 25 SNAP TWIG W/ HANDS. The speakers used /haek/ ‘tear’ to describe the former and /kac/ ‘snap’ to describe the latter. Furthermore, scenes with flexible objects appear to split into two groups as distinguished by the notion of object subtype regarding number of dimensions. The speakers named scenes with one-dimensional flexible objects (e.g., Scene 38 BREAK SINGLE PIECE OFF YARN BY HAND) with /tieɲ/ ‘tug’. By contrast, /haek/ ‘tear’ was used to characterise scenes with two-dimensional flexible objects (e.g., Scene 36 TEAR CLOTH W/ HANDS).

Below is Table 4.8 summarising schematically the location of category boundaries in the caused-separation event domain in Khmer, based on the previous discussion. Boundary placements illustrate the hierarchical categorical organisation in the domain. The simplification here reduces details of subcategorisation and smaller division and category overlaps; the latter are discussed subsequently in the next section.


Table 4.8

Simplified placement of category boundaries in Khmer caused-separation domain.

First distinction: instrument versus manual manipulation
Instrument: sawing, cutting, chopping#1, piercing, karate-chopping, smashing, slitting, chopping#2
Manual: snapping, pulling-apart, tearing

Second distinction: length-oriented versus non-lengthwise action
Instrument, non-lengthwise: sawing, cutting, chopping#1, piercing, karate-chopping, smashing
Instrument, lengthwise: slitting, chopping#2
Manual, non-lengthwise: snapping, pulling-apart
Manual, ?: tearing

Third distinction: predictability of locus of separation
High/intermediate: sawing, cutting, chopping#1, piercing
Intermediate/low: karate-chopping, smashing
High/intermediate: slitting, chopping#2
Low: snapping
Intermediate: pulling-apart, tearing

Fourth distinction: instrument type or object type
Sharp-bladed: sawing, cutting, chopping#1
Pointed: piercing
Blunt: karate-chopping, smashing
Sharp-bladed: slitting, chopping#2
Rigid: snapping
Flexible: pulling-apart, tearing

Fifth distinction: object subtype (number of dimensions)
1-D: pulling-apart
2-D: tearing

Resulting categories
/kat/-: sawing, cutting, chopping#1
/cak/-: piercing
/dɑm/-: karate-chopping, smashing
/puh/-: slitting, chopping#2
/kac/-: snapping
/tieɲ/-: pulling-apart
/haek/-: tearing

4.4.1.2 Overlaps between caused-separation categories in Khmer

As noted in § 4.4.1.1, the verb distributional patterns in Khmer suggest category boundary overlaps in the caused-separation event domain. In what follows, I examine the semantic characteristics of scenes in the areas where category boundaries overlap and in those where they do not.

Extensions of category or subcategory defining verbs in Khmer (see Figures

4.3a-g) indicate that major-category boundaries overlap with one another in at least two areas. The first area lies between the /kat/- and /dɑm/-categories; the second lies between the /kat/- and /puh/-categories. Below is a discussion of how each pair of categories forms mutual overlapping portions.

The overlapping relations of some category partitions are characterised by certain verb distributional patterns that model categories and subcategories. Some category- and subcategory-defining verbs appear to stretch from one category to another. Specifically, /kap/ ‘hack’ shaped a subcategory in each of the /kat/- and /puh/-categories, while /kat/ ‘cut’ defines both the /kat/-category and a small division in the /dɑm/-category. Because of this distribution, the relevant categories are read as partially covering each other.
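The reading of shared verbs as a sign of partially overlapping categories can be illustrated with a short sketch. The following Python fragment is only an illustration under stated assumptions, not the procedure followed in this study: it pools the verbs listed for the peripheral scenes of each category in Table 4.9 and reports which pairs of categories have a verb in common. The dictionary layout and the function name shared_verbs are hypothetical choices made for the example.

```python
from itertools import combinations

# Verbs attested for the peripheral scenes of each category, transcribed from
# Table 4.9. Pooling them per category is an assumption of this sketch.
periphery_verbs = {
    "kat": {"kat", "kap", "cəɲcram", "ʔaa", "han", "cie", "cət"},
    "dɑm": {"dɑm", "kat", "pdac", "vay"},
    "puh": {"puh", "kap", "cəɲcram", "cət", "criək", "veah"},
    "kac": {"kac", "bɑmbak", "ʔok"},
}


def shared_verbs(verbs_by_category):
    """Return {(cat_a, cat_b): shared verbs} for each pair sharing a verb."""
    shared = {}
    for cat_a, cat_b in combinations(sorted(verbs_by_category), 2):
        common = verbs_by_category[cat_a] & verbs_by_category[cat_b]
        if common:
            shared[(cat_a, cat_b)] = common
    return shared


overlaps = shared_verbs(periphery_verbs)
for pair, verbs in overlaps.items():
    print(pair, sorted(verbs))
```

Under these assumptions, only the /dɑm/-/kat/ pair and the /kat/-/puh/ pair share verbs, while the /kac/-category shares none. The sketch also picks up sporadic verbs such as /cəɲcram/ and /cət/, whereas the argument in the text rests on /kap/ and /kat/.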

In each category, all the member scenes capture common distinguishing and representative features. Yet the Khmer descriptions indicate that various scenes, though organised in individual categories, have some features analogous to those of certain other scenes and other features that diverge from them. In this study, I argue that such divergent attributes are interpretive: they can trigger different negotiation or construal intra- and interpersonally. Given differing readings of the relevant scenes, some verbs customarily designating particular categories or subcategories were used for their construal effects. In this way, speakers sometimes applied different categories in describing the scenes, since the designating verbs may have been perceived as better characterising them.

In the subsequent paragraphs, I highlight the function of divergent characteristics in prompting or permitting overlaps between the categories. Initially, I analyse data using the notions of “core” and “periphery” as developed by Vulchanova et al. (2012). This analysis determines characteristics of “outer” scenes as divergent from “central” scenes in the /kat/-, /puh/-, and /dɑm/-categories, which directly involve the overlaps. Later, I discuss the extent to which these peripheral scenes lie in overlapping areas between categories, supporting feature divergence as the basic condition for category overlaps.


Beginning with the /kat/-category, four core scenes are determined. They are

Scene 49 CUT ROPE W/ KNIFE, Scene 56 CUT CLOTH W/ SCISSORS, Scene 27 CUT HAIR

W/ SCISSORS, and Scene 24 CUT ROPE W/ SCISSORS, as present in the same subcluster in the middle of the category’s dendrogram illustration (see Figure 4.3a). All these scenes were consistently characterised by /kat/ ‘cut’. With these exclusive descriptions by the category-defining verb, we can confidently regard the scenes as more prototypical than others in the category. Their shared characteristics would be accordingly representative for the category. The four scenes together express the actions of dividing objects into parts by sharp-bladed implements, i.e., scissors

(Scenes 56, 27, and 24) and a knife (Scene 49). Taking into account the instrument feature in relation to theme objects: ropes (Scenes 49 and 24), cloth (Scene 56), and hair (Scene 27), the precision of locus of separation is recognised as high.
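To make the notion of a core subcluster more concrete, the following Python sketch shows how Jaccard distances and average linkage could group scenes by the verbs used to describe them. It is a minimal illustration only: the assumption that each scene is represented by the set of verbs attested for it, the five scenes selected (with verb sets taken from the text and Table 4.9), and the function names are choices made for this example and are not claimed to reproduce the analysis behind Figure 4.3a.

```python
# Verb sets per scene: S49 and S56 are core scenes (described only with /kat/);
# S13, S4 and S20 are peripheral scenes as listed in Table 4.9.
scenes = {
    "S49": {"kat"},
    "S56": {"kat"},
    "S13": {"kat", "kap"},
    "S4": {"kat", "kap", "cəɲcram"},
    "S20": {"kat", "ʔaa"},
}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def average_linkage(names, dist):
    """Naive agglomerative clustering; returns the list of merges."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(dist(x, y) for x in clusters[i] for y in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], round(d, 4)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


merges = average_linkage(
    list(scenes), lambda x, y: jaccard_distance(scenes[x], scenes[y])
)
for left, right, height in merges:
    print(left, right, height)
```

In this toy example the two core scenes merge first at distance zero, and the two chopping scenes then form their own subcluster, which mirrors the way exclusive use of the category-defining verb places scenes on the shortest branches.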

The clustering analysis findings in Figure 4.3a show three sets of peripheral scenes in the /kat/-category. The figure illustrates how the clusters of these scenes are oriented with respect to the above-mentioned core scenes as three divergent subclusters. The first one incorporates five scenes of chopping (i.e., Scenes 13, 54, 3, 4, and 6), from the lower end of the /kat/-category’s cluster, as illustrated in Figure 4.3a. Taken together, these scenes express use of a bladed instrument as the core does, but they also convey the manner of striking a blow, consequently suggesting intermediate predictability of location of separation. In parallel to /kat/ ‘cut’, /kap/ ‘hack’ was used to describe the five scenes of the first peripheral set, three of which were also described by /cəɲcram/ ‘cut’.

The second peripheral scene set includes three scenes from the upper end of the category’s cluster. They are sawing events (i.e., Scenes 20, 28, and 12), each representing the use of a knife to cut an object, that is, a twig (Scene 20), fish (Scene 28), and stretched cloth (Scene 12), with a back-and-forth action or sawing motion. Therefore, they are regarded as involving high predictability. The scenes were characterised by /ʔaa/ ‘saw’, beside /kat/ ‘cut’. Note that the sporadic verbs /han/ ‘slice’ and /cie/ ‘prune’ were also used in the descriptions of Scenes 28 and 12, respectively.

The third scene set has two scenes of slicing (Scenes 10 and 26), from the upper portion of the category’s cluster next to that of the second set. The scenes express the (sometimes back-and-forth) actions of segmenting objects like carrots into slices, using sharp implements like knives. They thus show features close to those of the core (e.g., use of bladed tools) and also to those of the second set (e.g., sawing motion), but make further reference to certain intended results: for example, multiple pieces of carrot in Scene 10. Also, according to the features involved, all the scenes of the third set at least suggest high-precision actions like those of the second.

However, in looking for a commonality among the three periphery sets, an issue arises: they appear to demonstrate less precise manners of action than for the core, e.g., the violent blow of sharp-bladed implement and the to-and-fro sawing motion.

The ballistic manner further suggests compromised intermediate predictability for the periphery, as disparate from the core with high precision.

Next, the /puh/-category is considered. The dendrogram in Figure 4.2 suggests only one core scene for the category, Scene 9 SLICE CARROT W/ KNIFE, since no short branch for the scene is delineated. This core scene is based on the near exclusive use of the category-defining verb /puh/ ‘chop’. Note that far more sporadic alternatives

/criək/ ‘slit’ and /crəp/ ‘cut’ each occurred only once. This core scene displays a person employing a sharp-bladed implement, i.e., a knife, to cut an object, a carrot, in the lengthwise direction. Considering the relationship of all the features involved, the


scene mirrors a high precision cutting action. Two sets of peripheral scenes in the category can be pointed out. One set consists of only Scene 14 CUT MELON W/ KNIFE, described by /cət/ ‘slit’, as well as by /puh/. This scene displays most of its features parallel to the core, but with just the specific manner of placing the sharp blade on the surface of the melon before cutting or drilling. Accordingly, it conveys high predictability of location of separation. The other periphery set is bigger since it incorporates three different scenes: Scene 48 CHOP BRANCH W/ AXE, Scene 51 CHOP

MELON W/ KNIFE, and Scene 37 CHOP CARROT W/ AXE. Along with /puh/, /kap/ ‘hack’ named the scenes as well. They again have most features comparable to those of the core and the first periphery set, but with the specific manner of striking a blow of a sharp implement. As the predictability of locus of separation is compromised by the ballistic action, I regard it as intermediate. In general terms, the core and periphery share most of their features. Yet, the peripheral scenes each involve manner-of-action specificities unseen in the core.

The /dɑm/-category exhibits a clear demarcation between the core and periphery. I take the core to comprise the two scenes at the lower end of the category as represented in Figure 4.2. The peripheral scenes are the three scenes (i.e., 34, 61, and 23) from the most distant subcluster at the top end of the category. The two core scenes, described only with /dɑm/ ‘smash’, are Scene 21 SMASH CARROT W/ HAMMER and Scene 31 SMASH STICK W/ HAMMER. Both display the manner of striking a blow with a blunt-headed implement, e.g., a hammer, on objects such as a carrot (in Scene 21) and a stick (in Scene 31). The core scenes are characterised by imprecise separation actions. The three peripheral scenes are Scene 34 KARATE-CHOP STRETCHED CLOTH, Scene 61 KARATE-CHOP STRETCHED ROPE, and Scene 23 CHOP STRETCHED CLOTH W/ HAMMER, characterised by /kat/ ‘cut’ in addition to the category-defining verb /dɑm/ ‘smash’. These peripheral scenes express the manner of striking a blow with knife hands in Scenes 34 and 61 or with a hammer in Scene 23 to cut off the objects: cloth in Scenes 34 and 23 and a rope in Scene 61. The stark difference between the core and the periphery in the category lies in the intended resulting separations. We can expect the core events to result in multiple fragments, while the periphery can refer to intended resulting separation in one place.

Now let us discuss the overlapping cases of the /kat/-category with the /puh/-category and with the /dɑm/-category. The first case establishes that the /kap/-subcategories, found in both the /kat/- and /puh/-categories, create an overlapping area encompassing peripheral scenes of both categories. Likewise, the second case shows that, within the /dɑm/-category, the /kat/-subdivision that creates an overlap with the /kat/-category also accommodates the periphery. Since I argue that these peripheral scenes portray feature discrepancy, a tie is assumed between this divergence and the production of such overlapping areas. These divergent event attributes of the categories’ peripheries are claimed to induce category overlaps.

The characteristics of the peripheral scenes integrated in the /kap/-subcategory within the /kat/-category look divergent while converging toward those of the periphery within the /puh/-category. Together they share both the sharp tool blows and the intermediate predictability, as against those of each category’s core events.

Divergent features of the /dɑm/-category’s peripheral scenes in the /kat/-subdivision look analogous to some characteristics of the /kat/-category in that both sets of scenes involve intended separation in one place on the objects, as opposed to intended resulting multiple fragments expressed by the /dɑm/-category’s core scenes. Feature divergence might trigger speakers to have second thoughts about classification of such peripheral scenes. Such shifting categorisation might then have resulted in


placing the scenes in overlapping areas. This would be as mirrored in alternate descriptions substituting the representative descriptors of the categories.

Some peripheral scenes not only reveal feature discrepancy from core counterparts, thereby bringing about category overlaps, but were also described with a wider variety of verbs than the core. For example, while the Khmer speakers consistently named the core scenes of the /kat/-category (i.e., Scenes 49, 56, 27 and 24) with its predominantly occurring verb /kat/ ‘cut’, I found each of its periphery sets described by other, less frequently used verbs as well. The different use of verbs for peripheral scenes in a way suggests insecurity among the Khmer speakers when describing such scenes. As previously discussed in § 3.4.1.2, Vulchanova et al. (2012, p. 29) explain that insecurity in naming events is influenced by increased distance or divergence from default features and consequently results in intra- and inter-personally inconsistent use of event descriptors (i.e., verbs). On that account, we may think of the core scenes as associated with a higher consistency in description than the periphery in each category. Also, the overlapping regions containing peripheral scenes should be poorer in description consistency. The next part gauges degrees of consistency in scene characterisation through the Gini-Simpson index to examine these ideas.

4.4.2 Inconsistency in naming caused-separation events in Khmer

Verb choice was consistent in the descriptions of some caused-separation scenes, whereas for other scenes it varied both within and between the speakers of Khmer.

This suggests nuances in degrees of description consistency. Here, I work out such consistency levels using the Gini-Simpson index as previously introduced (see details in § 2.2.6.2a). What follows first considers degrees of lexical-description heterogeneity in the domain (§ 4.4.2.1). Later, I explore diversity scores of the core versus peripheral scenes within certain categories (§ 4.4.2.2) and of the categories involving an overlap versus those with no overlap (§ 4.4.2.3). These scores enable testing of conjectures concerning event-naming consistency proposed in § 4.4.1.2.
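For orientation, a per-scene diversity score of this kind can be sketched in a few lines of Python. The definition adopted in this thesis is the one given in § 2.2.6.2a and is not reproduced here; the sketch below assumes the finite-sample (sampling without replacement) form of the Gini-Simpson index, 1 − Σ n_i(n_i − 1) / (N(N − 1)), where n_i is the number of descriptions using verb i and N is the total number of descriptions of the scene. The function name and the response lists are hypothetical.

```python
from collections import Counter


def gini_simpson(responses):
    """Finite-sample Gini-Simpson index over a list of verb tokens.

    Assumed form: 1 - sum(n_i * (n_i - 1)) / (N * (N - 1)).
    Returns 0.0 when fewer than two responses are available.
    """
    counts = Counter(responses).values()
    total = sum(counts)
    if total < 2:
        return 0.0
    return 1.0 - sum(n * (n - 1) for n in counts) / (total * (total - 1))


# Hypothetical response sets, for illustration only.
uniform_scene = ["kat"] * 7             # every description uses the same verb
mixed_scene = ["kat"] * 6 + ["kap"]     # one description uses another verb

print(gini_simpson(uniform_scene))          # complete homogeneity
print(round(gini_simpson(mixed_scene), 4))  # slight heterogeneity
```

A scene described with a single verb throughout scores zero, and the score rises towards one as more verbs are used and as they are used more evenly. The domain-level figure is then simply the mean of the per-scene scores.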

4.4.2.1 Degrees of inconsistency in caused-separation descriptions in Khmer

I commence with measuring the diversity index for descriptions of each stimulus scene in Khmer, with all per-scene values listed in Appendix B.

The overall diversity index score for Khmer on the average of all per-scene index values is calculated to be 0.43. Since the greater the Gini-Simpson index value, the greater the diversity, the average index value suggests intermediate homogeneity or consistency in the verbal descriptions in the language. In addition, the diversity index reveals that for individual scenes, several verbs were rather evenly distributed.

4.4.2.2 Inconsistency in naming core versus peripheral scenes of caused-separation categories in Khmer

Among the categories in the domain, four categories show delimitation between the core and periphery. Three of them are the /kat/-, /dɑm/-, and /puh/-categories already defined and described above in § 4.4.1.2. The remaining one is the /kac/-category incorporating two core scenes: Scene 25 SNAP TWIG W/ HANDS and Scene 57 SNAP CARROT W/ HANDS. These are identified by the shortest branches of the cluster (see Figure 4.3g). I consider the category to have two peripheral scenes, from the subcluster at the lower part: Scene 5 BREAK STICK OVER KNEE and Scene 19 SNAP TWIG W/ HANDS.

The core scenes in each of the categories present the highest consistency in the event naming, except for that in the /puh/-category. In the /kat/-, /dɑm/-, and /kac/-categories, the core’s diversity index is calculated at zero, showing complete homogeneity within and between the speakers in their descriptions. In this core case, the Khmer speakers described scenes only using the predominant or representative verbs presented in respective category names. The single core scene in the /puh/-category shows higher diversity at 0.42, since it was referred to not only by the category-representative /puh/ ‘chop’, but also by other less frequently used verbs: /criək/ ‘slit’ and /veah/ ‘slit’. Given these readings of Gini-Simpson indices, I interpret the degree of lexical variation for the core /puh/-scene as low, pointing to intermediate or compromised consistency in naming.

The peripheral scenes in the four mentioned categories are each characterised by lexical variation in intra- and inter-personal description, resulting in relatively great diversity as illustrated in Table 4.9.

Table 4.9

Gini-Simpson’s diversity indices and verbs relating to periphery within the /kat/-, /dɑm/-, /puh/-, and /kac/-categories.

Category / scene    Diversity index    Verbs

/kat/-category, set 1
S13 CHOP ROPE W/ AXE    0.4762    /kat/; /kap/
S54 CHOP CARROT W/ AXE    0.5333    /kat/; /kap/
S3 CHOP BRANCH W/ MACHETE    0.2857    /kat/; /kap/
S4 CHOP CLOTH W/ KNIFE    0.6667    /kat/; /kap/; /cəɲcram/
S6 CHOP CARROT W/ KNIFE    0.6389    /kat/; /kap/; /cəɲcram/

/kat/-category, set 2
S20 SAW TWIG W/ KNIFE    0.4286    /kat/; /ʔaa/
S28 CUT FISH W/ KNIFE    0.6667    /kat/; /ʔaa/; /han/
S12 CUT CLOTH W/ KNIFE    0.5556    /kat/; /ʔaa/; /cie/

/kat/-category, set 3
S10 SLICE CARROT W/ KNIFE    0.5357    /kat/; /han/
S26 CUT CARROT W/ KNIFE    0.5238    /kat/; /han/; /cət/

/dɑm/-category
S34 KARATE-CHOP CLOTH    0.8214    /dɑm/; /kat/; /pdac/; /vay/
S61 KARATE-CHOP ROPE    0.7455    /dɑm/; /kat/; /pdac/; /vay/
S23 CHOP CLOTH W/ HAMMER    0.7778    /dɑm/; /kat/; /pdac/; /vay/

/puh/-category, set 1
S48 CHOP BRANCH W/ AXE    0.6071    /puh/; /kap/; /cəɲcram/
S51 CHOP MELON W/ KNIFE    0.2857    /puh/; /kap/
S37 CHOP CARROT W/ AXE    0.5238    /puh/; /kap/; /criək/*

/puh/-category, set 2
S14 CUT MELON W/ KNIFE    0.6944    /puh/; /cət/; /criək/*; /veah/*

/kac/-category
S5 BREAK STICK OVER KNEE    0.4167    /kac/; /bɑmbak/; /ʔok/
S19 SNAP TWIG W/ HANDS    0.6071    /kac/; /bɑmbak/

Note that the asterisk (*) indicates a verb that occurred in less than 1% of all the descriptions.

The above table has an average diversity index of 0.57 for all the peripheries across the four categories. This value is higher than for any of the core values in the same respective categories. The periphery in /dɑm/-category appears to show the most severe inconsistency in event naming, as demonstrated by its average diversity index of 0.78, which is comparatively the greatest. This means that the Khmer speakers opted to apply any of four verb choices: /dɑm/ ‘smash’; /kat/ ‘cut’; /pdac/ ‘separate’;

/vay/ ‘strike’—with each having a more or less equal chance. Of the descriptions for the /dɑm/-category’s periphery, 21.43%, 32.14%, 28.57%, and 17.86% were associated with the respective verbs. The table also shows that, in association with the same numbers of verbs, different index scores for the individual categories were not


only affected by the lexical density, but also influenced by the frequency of occurrences of those lexemes.
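The averages quoted above can be checked directly against the per-scene values in Table 4.9. The short Python sketch below does so; the values are transcribed from the table, while the variable names and the rounding to two decimal places are choices made for the illustration.

```python
from statistics import mean

# Per-scene diversity indices of the peripheral scenes, from Table 4.9.
periphery_index = {
    "kat": [0.4762, 0.5333, 0.2857, 0.6667, 0.6389,
            0.4286, 0.6667, 0.5556, 0.5357, 0.5238],
    "dɑm": [0.8214, 0.7455, 0.7778],
    "puh": [0.6071, 0.2857, 0.5238, 0.6944],
    "kac": [0.4167, 0.6071],
}

# Mean per category and mean over all nineteen peripheral scenes.
per_category = {cat: round(mean(vals), 2) for cat, vals in periphery_index.items()}
overall = round(mean(v for vals in periphery_index.values() for v in vals), 2)

print(per_category)
print(overall)
```

Computed this way, the overall mean over the nineteen peripheral scenes is 0.57 and the /dɑm/-category mean is 0.78, matching the figures reported in the text; the remaining categories fall between 0.51 and 0.53.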

Overall, the above-discussed diversity indices show that the core scenes are higher in description consistency, whereas the peripheral counterparts show lower consistency as measured by the diversity values approaching one.

4.4.2.3 Inconsistency levels in descriptions across caused-separation categories in Khmer

Given that the defined peripheral scenes in each category involve high levels of inconsistency in event naming and frequently belong to categories involving overlapping regions, we may infer that such categories should show higher average inconsistency in naming caused-separations than those dealing with no overlap. I propose to test this by working out an average diversity-index score for each of the seven categories in the domain, as shown in Figure 4.6. It is necessary to note here that, for clarity of presentation, Figure 4.6 excludes the two special scenes linked to the /kat/- and /dɑm/-categories from the diversity estimations, as they show deviation from the general trend.


Figure 4.6. Average of diversity values for seven categories in caused-separation in Khmer; categories involving overlapping regions are orange, while non-overlapping categories are grey.

All the categories dealing with overlapping regions have lower diversity scores than the /cak/-category, which does not contain any overlap. Still, they have the higher values of description heterogeneity than the remaining categories, except for the /tieɲ/-category. Despite that, further calculations of average Gini-Simpson index applied to the categories associated with overlaps and for those involving no overlap show that the former group is greater than the latter regarding lexical heterogeneity in description (0.48 > 0.39). We may read this in such a way that in each category, the periphery may not play an exclusive role in producing inconsistency in the caused-separation event descriptions. Heterogeneous preferences or perspectives within and between the speakers regarding the event type pertaining to each category may also contribute to the varying descriptions. In a narrow perspective, a category dealing with overlapping regions, accordingly, may not constantly have a higher level of inconsistency than a category with no overlap. Still, when looked at regarding overlapping areas in overall inspection, the two types of


categories with overlaps become prone to higher inconsistency. Peripheral scenes in the overlapping areas are concentrated, suggesting a more prominent role for feature divergence linked to such scenes in causing category overlaps.

4.4.3 Summary

Khmer shows both some cross-linguistic recurrent partitioning patterns of separation events (Majid et al., 2004; 2008) and certain potentially distinctive semantic discriminations. The placement of category boundaries in the domain is illustrated in

Table 4.8, reflecting how Khmer organises the relevant semantic distinctions. The relationship between the categories also shows indefinitely delineated categorisation, since three different categories, /kat/-, /puh/-, and /dɑm/-, engage in category overlaps. I have argued for feature divergence—especially as involved in the peripheral scenes of those categories—as accounting for intra- and inter-personal nuances in description, consequently causing overlapping regions between categories.

Also, overlapping areas provide evidence for a wide range of caused-separation vocabulary. I interpret this as showing relatively high inconsistency in the speakers’ event naming. I thus expect the peripheral scenes that are generally linked to the category overlaps and the categories dealing with overlapping areas to have high heterogeneity in lexical description. The Gini-Simpson diversity indices were used to support this conjecture.

4.5 Semantic organisation of caused-separation categories in Khmer

This last section turns to the critical remaining investigation in the caused-separation event domain in Khmer: semantic organisation of the categories. I start with a semantic investigation of the verb-defined categories and subcategories for


determination of meaning components appreciated or recognised as important for the individual categories/subcategories (see § 4.5.1). After specifying elements linked to such semantic components, generalisations can be made regarding combinations of semantic elements and similar factors in organisation crucial for the categories, subcategories or smaller divisions. Once again in this subsection, I explore the verb

/pdac/ ‘separate’, pointed out above as showing unusual distribution. What role does it play in subclassification? The issue is whether /pdac/ is like other occasional verbs of the established categories, calling for particular assigned subcategories in each related category, or whether it is only structured around generic meaning elements as suggested by dictionary definitions (Headley et al., 1997; Nath, 1967). If the latter, then the verb would be applicable for any, or almost any, category. In § 4.5.2, I propose certain semantic elements that are lexicalised or conflated into Khmer verbs of caused-separation, bringing to light the organisation of semantic categorisation in the language.

4.5.1 Semantic characteristics of caused-separation categories in Khmer

As formulated by the common distributional patterns of verbs, the categories or smaller subcategories are semantically meaningful in the sense that all scenes pertinent to each category would show the same feature values as conveyed by the relevant verbs. These feature values prompt the use of specific verb types, which account for scene configurations as worked out in § 4.3. Also, feature values discriminate among disparate scenes. Given this, we can examine feature commonalities across each category’s scenes to search for underlying semantic characteristics associated with the individual categories and subcategories.

Correspondingly, we can deem all characteristics linked to the


categories/subcategories as associated with pertinent class-defining verbs, since the verbs’ occurrence pattern corresponds to the establishment of such categories and subcategories (see § 4.3.1; cf. Vulchanova et al., 2012). In the following paragraphs, I generalise semantic properties pertaining to the individual categories and subcategories, assessing them as potential parameters in semantic categorisation. In doing so, I consider certain event characteristics: for example, instruments, manners, spatial properties of actions, material properties of objects, or intended results. To some extent, parameters like these have been discussed and defined in past studies on semantic categorisation in the domain (see § 2.1.3.3). We turn now to investigation of individual categories in the following order: /kat/-, /dɑm/-, /puh/-, /cak/-, /kac/-, /tieɲ/- and /haek/-categories.

In the /kat/-category, all 14 scenes depict persons performing actions with different kinds of sharp-bladed implements: knives (Scenes 20, 28, 12, 10, 26, 49, 4, and 6), scissors (Scenes 56, 27, and 24), axes (Scenes 13 and 54), and a machete (Scene 3). According to the evidence of these instruments, the category is sensitive to specific tools with single or double sharp blades, but not to other instrument kinds, such as blunt-headed tools, pointed implements, or the use of hands.

The scenes involve caused-separation across the object’s length. Also, they display complete resulting separation from the actions. A plausible suggestion would be that the category would always recognise this kind of result. Yet, this seems not to be the case. In the closing of this subsection, I argue that recognising complete resulting separation for this verb and for other verbs of caused-separation may be likely, but this is far from definitely true. Turning now to the manner of action, the member scenes of this category contain a variety of caused-separation manners. They are back-and-forth sawing motion, cleaving with a knife, clipping out with scissors, and


forceful blows of knives, axes, or a machete. The acted-upon objects were branches (Scenes 20 and 3), fish (Scene 28), stretched cloth (Scenes 12, 56, and 4), carrots (Scenes 10, 26, 54, and 6), ropes (Scenes 49, 24, and 13), and hair (Scene 27). This shows that the category is not sensitive to specific kinds of objects since they can be one-dimensional flexible, two-dimensional fabric-like or rigid, or three-dimensional lump-like objects. As for directions of separation, the category’s scenes specify crosswise cutting, i.e., across the object’s length. The above generalisation still holds even for the special scene, Scene 15 SAW STICK. Note that the implement in the scene is another type of sharp tool, a serrated saw.
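The reasoning in this paragraph (a feature shared by every scene is a candidate category parameter, while a feature that varies freely is not) can be checked mechanically. The sketch below transcribes the instrument and object assignments given in the prose for the 14 /kat/-category scenes; the variable and function names are hypothetical, and the treatment of all four tool kinds as sharp-bladed follows the characterisation in the text.

```python
# Instrument and object per /kat/-category scene, transcribed from the prose.
INSTRUMENT = {
    "knife": [20, 28, 12, 10, 26, 49, 4, 6],
    "scissors": [56, 27, 24],
    "axe": [13, 54],
    "machete": [3],
}
OBJECT = {
    "branch": [20, 3],
    "fish": [28],
    "stretched cloth": [12, 56, 4],
    "carrot": [10, 26, 54, 6],
    "rope": [49, 24, 13],
    "hair": [27],
}
# All four tool kinds are treated as sharp-bladed, as in the text.
SHARP_BLADED = {"knife", "scissors", "axe", "machete"}


def by_scene(table):
    """Invert a value -> scenes table into a scene -> value mapping."""
    return {scene: value for value, scenes in table.items() for scene in scenes}


instrument_of = by_scene(INSTRUMENT)
object_of = by_scene(OBJECT)

scenes = sorted(instrument_of)
all_sharp = all(instrument_of[s] in SHARP_BLADED for s in scenes)
distinct_objects = {object_of[s] for s in scenes}

print(len(scenes))            # 14 scenes in the category
print(all_sharp)              # True: the instrument type is constant
print(len(distinct_objects))  # 6: the object type varies freely
```

The constant value (sharp-bladed instrument) is what the category appears to be sensitive to, whereas the six different object kinds show that object type does not delimit the category.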

In keeping with the characterisation of the scenes in the /kat/-category, those within its subcategories concentrate more on certain idiosyncratic attributes of caused-separation. The subcategories are structured as if they were specific or intricate versions of the main category. Below is a discussion of the /ʔaa/-, /kap/-, and /han/-subcategories.

The three scenes in the /ʔaa/-subcategory are sensitive both to the category-representative feature of sharp-bladed tools and to the specific quality of to-and-fro motions. If the special scene for the category (Scene 15) is also included, it is perhaps aligned with characteristics of others in the subcategory: based on speakers’ responses, it was labelled solely with /ʔaa/ ‘saw’ and it displays the features reflecting the subcategory’s idiosyncrasy. The exclusive use of /ʔaa/ for Scene 15 suggests the scene itself as being the prototype for events specified by this verb. Let us look further at this briefly. I argue for two specific features as prompting speakers’ selection of the specific verb /ʔaa/. The first is the use of a serrated saw. In Khmer, the basic word for a saw is /rɔnaa/ ‘saw (N)’ which is morphologically derived from the verbal root /ʔaa/ ‘saw (TR)’ through affixation. Strictly literally, /rɔnaa/ is an implement specially used to saw (/ʔaa/). My supposition then is that the speakers considered this defining verb well suited to describing the cutting of a stick with such an implement. The second triggering feature is the back-and-forth sawing motion. This manner of action is consistent with the canonical way of applying the saw, being linked intimately to the implement. This particular manner thus ‘switched on’ the use of /ʔaa/ ‘saw (TR)’ in the descriptions, as it was similar to the saw’s typical application. Figure 4.7 below shows still images from Scene 15, illustrating the features of the serrated-saw use and the to-and-fro sawing motion.

Figure 4.7. Scene 15 SAW STICK, showing the back-and-forth action with use of saw to cause separation.

The scenes comprised in the /kap/-subcategory together exhibit the actions of caused-separation by sharp-edged instruments, also including actions of persons striking a blow or blows with an instrument upon some object. Again, none of this subcategory’s member scenes is seen as sensitive to different types of objects acted upon. In the /cəɲcram/-subdivision within this subcategory, the two member scenes (Scenes 4 and 6) share certain key specific elements. Not only do they display the striking thud of knives on the objects, such as on a stretched piece of cloth or on multiple carrots, but the scenes also involve a repeated number of blows, perhaps with the expectation of multiple resulting pieces. Figure 4.8 below shows the still images from Scene 4, illustrating repeated blows of the knife in cutting the stretched cloth.

Figure 4.8. Scene 4 CHOP STRETCHED CLOTH W/ KNIFE, showing the person repeatedly striking blows with small knife.

In the /han/-subcategory of the /kat/-category, the covered scenes appear to focus on specific motions of the sharp-bladed tools used: back-and-forth actions of caused-separation. This feature of a specific sawing motion is then at first glance similar to that of the /ʔaa/-subcategory. However, the caused-separation related to the /han/-subcategory contains another specific feature: use of supporting surfaces. Scenes 10 and 26 (see Figure 4.9a-b) for the subcategory each display a person slicing a carrot which was placed on a chopping board. This specific participation of a supporting surface in cutting to-and-fro into something is unnecessary for the /ʔaa/-subcategory, as in Scene 12 (see Figure 4.9c).


Figure 4.9a-c. /han/-subcategory (a-b) consistent with use of supporting surface, whereas /ʔaa/-subcategory (c) not requiring implementation of supplementary support.

The /dɑm/-category comprises nine different scenes. Six of them display persons using blunt-headed instruments, or more precisely hammers, in order to cause separation of objects. These objects include stretched cloth (Scene 23), a rope (Scene 50), a flowerpot (Scene 39), a plate (Scene 40), a carrot (Scene 21), and a stick (Scene 31) (see Figure 4.10). The other three scenes exhibit knife-hand strikes to the objects: stretched cloth (Scene 34), a rope (Scene 61), and a carrot (Scene 32) (see Figure 4.11). These objects point to the fact that the /dɑm/-category is used with different types of acted-upon themes: one-dimensional flexible (like ropes), one/two-dimensional rigid (like a stick or tapered carrots), two-dimensional basically flat (like flexible stretched cloth or a rigid plate), and three-dimensional (like a brittle flowerpot). Also, all the scenes show complete resulting separation, regardless of whether separating into parts or into small fragments. This points to the low/intermediate predictability notion (see § 4.4.1). In addition, as noted above (in § 4.4.1.1), the scenes of karate-chopping were classified into this category too because they were named by the same verb (/dɑm/), along with other scenes of smashing (by a blunt implement). A possible underlying reason for this classification is that a knife hand would resemble a hammer as the blunt hand edge is analogous to the hammerhead (see Figure 4.12).

Figure 4.10. Some scenes in /dɑm/-category involving use of (blunt-headed) hammer; from left to right, Scenes 23, 50, 39, 40, 21, and 31.

Figure 4.11. Some scenes in /dɑm/-category involving use of (blunt-edged) knife hand; from left to right, Scenes 34, 61, and 32.


Figure 4.12. Blunt edge of knife hand (as in Scene 32) comparable to blunt hammerhead (as in Scene 32).

The /dɑm/-category includes the /vay/-subcategory, incorporating seven scenes. These show persons performing a strike or blows of a knife hand or of a hammer upon an object. Also, all the scenes involve results of either one-location separation or breakage into multiple pieces, pointing to low/intermediate predictability of location of separation. In this study, the relevant scenes are inconclusive as to exactly how this subcategory participates in semantic subclassification of the /dɑm/-category. Previous literature (Headley et al., 1997) and examples from Google.com, used as a substitute concordance, provide useful information on the issue. According to Headley et al., events with the use of hands are more acceptably described by /vay/ than /dɑm/, as indicated by the prescriptive definitions of the two verbs. This explanation seems plausible since all the scenes of karate-chopping in the /dɑm/-category fall within this subcategory. To further support this conclusion, I rechecked the Google concordance using two word-groups for diagnostic tests. The word-groups are “យកដៃវាយ” /yɔɔk day vay/ (take hand strike) ‘use the hand to strike’ and “យកដៃដំ” /yɔɔk day dɑm/ (take hand smash) ‘use the hand to smash’. A search gave 1,330 examples of the first phrase and just four of the second one (Google.com, n.d.).47 From these counts, we may see that /vay/ is considerably more consistent with the use of hands than /dɑm/. Therefore, the /vay/-subcategory is characterised through a blow or blows of blunt or non-sharp instruments, especially knife hands (or just hands).
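The size of the asymmetry between the two search phrases can be made explicit with a short calculation. The sketch below uses only the two counts reported above; the variable names are hypothetical, and web hit counts are at best a rough proxy for corpus frequency.

```python
# Hit counts reported in the text for the two diagnostic phrases.
hits = {"yɔɔk day vay": 1330, "yɔɔk day dɑm": 4}

total = sum(hits.values())
share = {phrase: n / total for phrase, n in hits.items()}
ratio = hits["yɔɔk day vay"] / hits["yɔɔk day dɑm"]

print(f"{share['yɔɔk day vay']:.1%}")  # 99.7%
print(f"{share['yɔɔk day dɑm']:.1%}")  # 0.3%
print(ratio)                            # 332.5
```

On these figures, the /vay/ phrase accounts for nearly all of the hits, which is the basis for saying that /vay/ is far more consistent with the use of hands.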

In addition, within the /vay/-subcategory, the /kat/-subdivision is structured around three scenes, Scenes 34, 61, and 23. These member scenes show how /kat/ ‘cut’ tolerates the manner of striking with an instrument, regardless of whether it is sharp or blunt. However, this does not mean that /kat/ can apply to all events describable by /dɑm/ ‘smash’. The clear reason for the /kat/ distributional pattern subdividing the /vay/-subcategory is that all the involved scenes refer to resulting separation in just one place on the objects (i.e., stretched cloth in Scenes 34 and 23; a rope in Scene 61) as opposed to multiple resultant pieces. This circumstance suggests that the /kat/-subdivision in this case is specifically concerned with the relatively intermediate predictability of locus of separation, while the /dɑm/-category and the /vay/-subcategory can also show low predictability. For one location associated with the /kat/-subdivision, it is worthwhile to note that the division, with its resulting separation intermediately predictable, resembles to some extent expected results of actions classifiable into the /kat/-category and describable by its representative verb (/kat/). Thus, it seems that /kat/ ‘cut’ can apply to these two different levels of categorisation in the two different categories since the similar intended resultant effects would license its use. Figure 4.13a-c below illustrates the three example scenes relating to the /dɑm/-category, the /vay/-subcategory, and the /kat/-subdivision, respectively, showing that the first two involve the results of multiple fragments, whereas the last is consistent with the one-location separation.

47 Google.com. (n.d.). Retrieved October 1, 2020, from https://www.google.com/.

Figure 4.13a-c. Three scenes relating intended results of multiple-fragment versus one-location separation: Scene 31 in /dɑm/-category (a) and Scene 39 in /vay/-subcategory (b) each involve multiple resulting fragments, whereas Scene 23 in /kat/-subdivision (c) contains one-place resulting separation; note that all the scenes similarly display the blow(s) of a hammer.

Another verb occurrence pattern in the /dɑm/-category concerns /pdac/ ‘separate’. In this category, the descriptions of four different scenes, i.e., Scenes 34, 61, 23, and 50, contain this verb. All such relevant scenes together exhibit the consistent feature of the use of blunt instruments (Scenes 23 and 50) or knife hands (Scenes 34 and 61). Also, they involve the resulting effects of one-place separation on the objects (that is, stretched cloth in Scenes 34 and 23, and ropes in Scenes 61 and 50), pointing to the notion of relatively intermediate predictability of locus of separation. The /pdac/ distributional pattern thus looks close to the /kat/-subdivision in terms of semantic characterisation at this point.

Let us now deal with the /puh/-category. It covers five different scenes: Scenes 48, 51, 37, 9, and 14. The scenes are concerned with the handling of sharp-bladed instruments: an axe in Scene 48, and knives in Scenes 51, 37, 9, and 14. The acted-upon items displayed in these scenes are multi-dimensional rigid or non-flexible: a branch in Scene 48, melons in Scenes 51 and 14, and carrots in Scenes 37 and 9. Considering those specified features, the /puh/-category’s scenes look similar to those in the /kat/-category concerning the applied instrument types and the theme objects, except that the former seem more confined to inflexible theme objects. In addition, closer inspection of the five scenes shows that they all involve lengthwise division (i.e., the separation along the objects’ length), or long cuts (see Figure 4.14), in contrast to the /kat/-category’s scenes, which show the crossways direction of separation.

Figure 4.14. Scenes in /puh/-category consistent with lengthwise direction of separation or long cuts; from left to right, Scenes 48, 51, 37, 9, and 14.

Within the /puh/-category, there are two subcategories. First, the /kap/-subcategory covers three scenes: Scenes 48, 51, and 37. Together they show sensitivity towards the use of axes and a big knife, suggesting this as the specific feature for the subcategory (see Figure 4.15). This contrasts with the other two scenes (Scenes 9 and 14) within the /puh/-category that were not described by /kap/ ‘hack’ and which show the use of smaller knives (Figure 4.16). Second, the /cət/-subcategory includes only one scene in this study, showing most event features similar to other scenes in the /puh/-category. However, the scene displays a specific characteristic of a person placing a blade on the object’s surface (the melon’s) before pressing it into the melon’s flesh, but causing no full separation. I regard this feature as being highly distinctive of the subcategory since 83.33% of its characterising verb /cət/’s occurrences are contained in the descriptions for Scene 14 CUT MELON W/ KNIFE, which possesses such a feature. One of the Khmer speakers applied this verb only one time to Scene 26, where a person cut a carrot using a knife. The feature of blade placement on the object’s surface is not conspicuous in this scene. However, since the person moved the blade very close to the object before cutting it, the relevant speaker might either have misread the action or interpreted it with such a feature in mind.
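The 83.33% figure is a share of a verb's occurrences that falls on a single scene. The sketch below shows the calculation under an explicit assumption: six occurrences of /cət/ in total, five for Scene 14 and one for Scene 26. That split is consistent with the reported percentage, but it is not stated in the text, and the function name is hypothetical.

```python
from collections import Counter


def scene_shares(occurrences):
    """Map each scene number to its share of a verb's occurrences."""
    counts = Counter(occurrences)
    total = sum(counts.values())
    return {scene: n / total for scene, n in counts.items()}


# Assumed split (not given in the text): five uses for Scene 14, one for Scene 26.
cet_occurrences = [14] * 5 + [26]
shares = scene_shares(cet_occurrences)

print(f"{shares[14]:.2%}")  # 83.33%
print(f"{shares[26]:.2%}")  # 16.67%
```

The same function applies to the other concentration figures quoted in this section, such as the 100% of /tumluh/ occurrences found in the descriptions for Scene 45.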

Figure 4.15. Scenes in /kap/-subcategory involving use of big knives or axes; from left to right, Scenes 48, 51, and 37.


Figure 4.16. Some scenes in /puh/-category involving use of small knives, as opposed to those in /kap/-subcategory; from left to right, Scenes 9 and 14.

In the /cak/-category, four member scenes are involved: Scenes 43, 53, 2, and 45. They all display persons using pointed instruments: chisels in Scenes 43, 53, and 2, and a twig in Scene 45. The acted-upon objects in the separation events include a rope, a stick and a carrot (Scene 34). Resultant effects of the actions shown in the scenes are complete separation (Scenes 43, 53, and 2), or partial separation, as in Scene 45 (Figure 4.17). The /cak/-category contains one smaller subclass, the /tumluh/-subcategory. This subcategory consists of only a single member scene, Scene 45, pointing to the specific event characteristic of intended partial separation: a cut-through hole or opening in the object (i.e., cloth). We can view the feature of an expected hole as unique to the subcategory since /tumluh/ was specially patterned to specify it. Of /tumluh/’s occurrences in the present study, 100% were contained in the descriptions for Scene 45. A morphologically based account would take the /tumluh/-subcategory as being contingent on such a feature. The Khmer transitive verb /tumluh/ was derived by causativisation from the intransitive root /tluh/ ‘be.pierced.through’ (Headley, 1977; Headley et al., 1997). Nath (1967) even shows that /tluh/ can be serialised with the transitive verb /cak/ ‘stab’, being phrased as /cak tluh/ (stab be.pierced.through) ‘stab a hole’. In this construction /tluh/ represents the potential resultant effect triggered by the /cak/ action. Although /cak/ ‘stab’ can be applied to describe an event either with an intended full division or with a cut-through opening, the causativised verb /tumluh/ specifies the /cak/ event type as conveying definite expectation of a resulting hole.

Figure 4.17. Scenes in /cak/-category involving both (intended) full and partial separation; from left to right, Scenes 43, 53, 2, and 45.

Within the /cak/-category, I found the /pdac/ occurrence pattern structured around one scene. It is Scene 2, where a person has struck a blow with a chisel, causing a division of the stretched rope. This scene presents the features of intended full separation, along with intermediate predictability of location of separation. The verb /pdac/ occurred four times in this scene’s descriptions, accounting for one fifth of all the verb’s occurrences in this study. This frequency number is relatively high if we consider that it is dedicated just to one scene. This scene merits closer consideration. We may align most of its features with those of the scenes associated with /pdac/ in the /dɑm/-category: that is, instrument manipulation, intermediate predictability, and intended resultant one-location separation (across the object’s length). Yet, there is an extra slight nuance here differentiating Scene 2 from those /pdac/ scenes in the /dɑm/-category. The former involves the implementation of a pointed tool, a chisel, for causing separation, while the latter scenes feature the use of other blunt implements like hammers or knife hands. Such a difference therefore suggests that /pdac/ may not be restricted to the implementation of dull and blunt instruments.

We turn now to the /kac/-category with its four member scenes all expressing manual manipulation involved in caused-separation. Specifically, the actors carried out the crossways separation of one/two-dimensional rigid objects (twigs or a stick in Scenes 25, 5, and 19, and a carrot in Scene 57; see Figure 4.18) by manual pressure on the objects’ edges, sometimes involving another body part like a knee (as in Scene 5). The scenes also suggest that the category is not sensitive to whether resulting separation is full (Scene 25) or partial (Scenes 57, 5, and 19) or to whether the actions were forcefully undertaken. The verb /kac/ was consistently used to describe these scenes, suggesting that such above-mentioned features are particularly specific to the category and trigger the use of its defining verb. Note that the category’s descriptions also involve two other verbs, i.e., /bɑmbak/ ‘snap’ (for Scenes 5 and 19) and /ʔok/ ‘hit.down’ (for Scene 5). However, each of them is very sporadic in distribution (see Figure 4.2); I thus exclude them from consideration here.


Figure 4.18. One- and two-dimensional (or 1-D and 2-D) rigid objects as characteristic for /kac/-category: from left to right, Scene 25 with a 1-D twig, Scene 57 with a 2-D carrot, Scene 19 with a 1-D twig, and Scene 5 with a 1-D stick.

The two remaining categories, i.e., the /tieɲ/- and /haek/-categories, are smaller than the above categories, each involving only two scenes.

The /tieɲ/-category is based around Scene 35 BREAK YARN INTO PIECES BY HAND, and Scene 38 BREAK SINGLE PIECE OFF YARN BY HAND. The scenes together specify that the category characteristically involves the event features of use of hands and one-dimensional flexible theme objects. The category is not sensitive to the intensity of actions (Scene 35 with intensity versus Scene 38 without apparent intensity; Figure 4.19). The occurrence pattern of /pdac/ ‘separate’ is also observed in the /tieɲ/-category, covering both scenes. This confirms that /pdac/ extends its use over caused-separation events.


Figure 4.19. /tieɲ/-category insensitive to whether caused-separation was made with or without intensity; from left to right, Scene 35 shows the action with personal fury whereas Scene 38 shows no apparent intensity in action.

The other small category is the /haek/-category. It covers Scene 1 TEAR CLOTH BY HAND and Scene 36 TEAR CLOTH W/ HANDS. The scenes together help determine the category as conditioned by features of hand actions and of two-dimensional flexible objects, while not being sensitive to whether the resultant effect is full or partial (see Figure 4.20). The verb /haek/ ‘tear’ is completely consistent with the category, since 100% of its occurrences are contained in the category’s descriptions.


Figure 4.20. /haek/-category insensitive to whether full separation was expected; from left to right, Scene 1 shows the action with intention of causing full separation, whereas Scene 36 seems to expect no complete separation as it only went halfway.

Before proceeding to a summary of this section, let us address whether /pdac/ ‘separate’ is compartmentalised in the categories involved with its distributional pattern, or whether it does “another job”. Determination of its potential semantic characteristics has been discussed separately above in analyses of several categories. Table 4.10 below outlines all the scenes possibly affected by a /pdac/ occurrence pattern in the /dɑm/-, /cak/-, and /tieɲ/-categories. Certain characteristics identified as semantic features and commented on above are indicated in columns.


acting this way. Some native speakers suggest that /bɑmbak/ ‘snap’ may share the same semantic distribution, breaching the discrimination of instrument versus manual manipulation.

Overall, Figure 4.21 below illustrates the extent to which all the discussed semantic characteristics may be assigned to each of the categories. The “plus” symbol (+) marks the characteristics linked to the subcategories/smaller divisions to show specificities added to those of the larger categories. I emphasise here that I regard the given characteristic details as a matter of priority for the 43 scenes in this study. Subsidiary evidence is from the smaller number of instances from other sources like the Google concordance and dictionary entries. Also, the /pdac/ distributional patterns are placed outside the categories’ space to display their non-categorical quality for such categories, while retaining a notion of expected resulting separation.

Figure 4.21 displays generalised semantic attributes for categories relating to verb configurations shown on the left. Abstract distinctive components are: (1) instrument versus manual manipulation, (2) direction of separation in relation to the object’s grain, (3) predictability of location of separation, (4) differences in specific instrument types and theme object types, and (5) the object subtype (i.e., regarding numbers of dimensions), as recognised in the discussion on the category boundary location in § 4.4.1.1. Under closer investigation, other characteristic parameters also can relate to categorisation in the domain. They incorporate Manner of action (i.e., how the person used the implement to perform caused-separation), Supplementary tool, Blade placement, Hand preference, Pattern of action, and (altered) Predictability.


Figure 4.21. Mapping of semantic characteristics onto seven categories (as well as subcategories) of caused-separation in Khmer represented by class-defining verbs presented with hyphenation; note that the asterisk indicates the verb occurrence patterns not considered to perform subcategorisation.


Despite the different distinctive notions stated above, categorisation influenced by Instrument manipulation appears less helpful in discriminating between the categories at the level definable by the verb distributional patterns and is practically redundant since it is entailed by Instrument type. For this reason, the distinction is considered ineffective and does not receive further attention.

In addition, the above-discussed semantic components are argued to be semantic parameters since not only are they typically linked to each category, grouping different scenes together, but they also assist in distinguishing between them. Table 4.11 delineates the different organisations of semantic parameters in the classification of the caused-separation event domain in Khmer.


Table 4.11
Semantic organisation patterns in classification of caused-separation domain in Khmer.

Level of classification        Semantic organisation pattern

Categories
  /kat/-, /cak/-, /dɑm/-       Instrument type
  /puh/-                       Direction of separation + Instrument type
  /kac/-, /tieɲ/-, /haek/-     Manual manipulation + Object type + (Object subtype*)

Subcategories
  /kap/-, /ʔaa/-, /tumluh/-    Instrument type + Manner of action + (Direction of separation**)
  /han/-                       Instrument type + Manner of action + Supplementary tool
  /cət/-                       Direction of separation + Instrument type + Blade placement
  /vay/-                       Instrument type + Hand preference

Smaller divisions
  /cəɲcram/-                   Instrument type + Manner of action + Pattern of action
  /kat/-***                    Instrument type + Hand preference + Predictability

Note that the class-defining verbs with hyphenation present the categories, subcategories, and small divisions; the boldface shows semantic components which play a key role in each categorisation level. Additionally, the single asterisk denotes that this component applies only to the /tieɲ/- and /haek/-categories; the double asterisk means that this semantic component applies sometimes to the /kap/-subcategory; the triple asterisk indicates that the /kat/-subdivision does not have to be characteristically the same as the /kat/-category.

4.5.2 Lexicalisation of caused-separation in Khmer

Vulchanova et al. (2012) propose that if we are able to establish semantic (sub)categories in an experiential domain based on verb occurrence patterns in relation to event descriptions, then such verbs should semantically relate to those categories. In the previous section (see Figure 4.21), we have already discussed the semantic


characterisation of each category, as well as the pertinent subcategories and smaller divisions. What follows is about how certain semantic characteristics can be connected or lexicalised to relevant verbal lexemes that have shaped the categories in the Khmer semantic domain of caused-separation.

Tables 4.12a-c present lexicalisation patterns for the representative verbs that define the seven categories and the related subcategories and smaller divisions, in accordance with the semantic component organisation patterns summarised in Table 4.11.

In the tables below, square brackets are used to indicate the verb lexicalisation patterns. That is, [SEP] stands for [CAUSED-SEPARATION ACTION], [INSTR TYP] for [INSTRUMENT TYPE], [MANUAL] for [MANUAL MANIPULATION], [OBJ TYP] for [OBJECT TYPE], [OBJ SUBTYP] for [OBJECT SUBTYPE], [DIR] for [DIRECTION OF SEPARATION], [BLADE] for [BLADE PLACEMENT], [MANNER] for [MANNER OF ACTION], [SUP INSTR] for [SUPPLEMENTARY INSTRUMENT], [HAND PREF] for [HAND PREFERENCE], [PATT ACT] for [PATTERN OF ACTION], and [PREDICT] for [PREDICTABILITY].

Table 4.12a
Lexicalisation of caused-separation at category level in Khmer.

Conflation                                      Verbs

(1) [SEP + INSTR TYP]                           /kat/ – [SEP + SHARP-BLADED]
                                                /dɑm/ – [SEP + BLUNT-HEADED]
                                                /cak/ – [SEP + POINTED]

(2) [SEP + DIR + INSTR TYP]                     /puh/ – [SEP + LENGTHWISE + SHARP-BLADED]

(3) [SEP + MANUAL + OBJ TYP + (OBJ SUBTYP)]     /kac/ – [SEP + MANUAL + RIGID]
                                                /tieɲ/ – [SEP + MANUAL + FLEXIBLE + 1-D]
                                                /haek/ – [SEP + MANUAL + FLEXIBLE + 2-D]


Table 4.12b
Lexicalisation of caused-separation at subcategory level in Khmer.

Conflation                                      Verbs

(1) [SEP + INSTR TYP + MANNER + (DIR)]          /kap/ – [SEP + SHARP-BLADED + STRIKING + (LENGTHWISE)]
                                                /ʔaa/ – [SEP + SHARP-BLADED + SAWING]
                                                /tumluh/ – [SEP + POINTED + DOWNWARD BLOW]

(2) [SEP + INSTR TYP + MANNER + SUP INSTR]      /han/ – [SEP + SHARP-BLADED + SAWING + SUPPORTING SURFACE]

(3) [SEP + DIR + INSTR TYP + BLADE]             /cət/ – [SEP + LENGTHWISE + SHARP-BLADED + BLADE]

(4) [SEP + INSTR TYP + HAND PREF]               /vay/ – [SEP + BLUNT-HEADED + HAND PREF]

Table 4.12c
Lexicalisation of caused-separation at subdivision level in Khmer.

Conflation                                      Verbs

(1) [SEP + INSTR TYP + MANNER + PATT ACT]       /cəɲcram/ – [SEP + SHARP-BLADED + STRIKING + REPEATED]

(2) [SEP + INSTR TYP + HAND PREF + PREDICT]     /kat/ – [SEP + SHARP-BLADED + HAND PREF + INTERMEDIATE]

These lexicalisation patterns in the caused-separation verbs in Khmer echo to some extent how some scholars have prescribed usage involving features influencing selection of such verbs (Headley, 1977; Headley et al., 1997; Nath, 1967). The relevant features include different instruments and kinds of objects (see § 4.1). Nevertheless, differences in instruments used and in acted-upon objects are not relevant to the use of every verb in the domain. Instrument features become operative only for verbs involving implement use, such as /kat/ ‘cut’ using a sharp blade, or /cak/ ‘stab’ with a pointed tool. Likewise, nuances in different object types become activated when verbs are deployed representing a hand action of caused-separation: for

318

example, /kac/ ‘snap’ for a rigid object, or /haek/ ‘tear’ for a two-dimensional flexible object.

Therefore, all semantic characteristics engaged in the individual categories, subcategories, and smaller divisions and lexicalised in different verbs not only distinguish between event types of the individual categories but also affect speakers’ decisions in selecting verb descriptors of caused-separation in Khmer (cf. Andics, 2012). Furthermore, quantitative characteristics of the semantic combinations associated with lexicalisation need to be taken into account. Defining verbs (see Tables 4.12a-c) are considered as representing either of two levels of meaning generality: generic or specific. Verbs that define the most coarse-grained categories have the smaller semantic combinations and are seen as generic, since they express the usual implement types involved in caused-separation, and sometimes a direction of separation. Verbs associated with subcategorisation are more specific in meaning: not only do they express the same semantics as generic verbs, but they also express further meanings such as manner of action, pattern of action, or intended resulting effect.
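The contrast between generic and specific verbs can be made concrete with a small sketch. The component sets below are transcribed from Tables 4.12a-c, with the optional (parenthesised) components left out; the dictionary layout, the level labels, and the two helper functions are illustrative assumptions rather than part of the analysis itself.

```python
# Component sets transcribed from Tables 4.12a-c; optional components
# (those in parentheses in the tables) are omitted.
# The level labels and all helper names are illustrative assumptions.
LEXICALISATION = {
    "category": {
        "kat": {"SEP", "SHARP-BLADED"},
        "dɑm": {"SEP", "BLUNT-HEADED"},
        "cak": {"SEP", "POINTED"},
        "puh": {"SEP", "LENGTHWISE", "SHARP-BLADED"},
        "kac": {"SEP", "MANUAL", "RIGID"},
        "tieɲ": {"SEP", "MANUAL", "FLEXIBLE", "1-D"},
        "haek": {"SEP", "MANUAL", "FLEXIBLE", "2-D"},
    },
    "subcategory": {
        "kap": {"SEP", "SHARP-BLADED", "STRIKING"},
        "ʔaa": {"SEP", "SHARP-BLADED", "SAWING"},
        "tumluh": {"SEP", "POINTED", "DOWNWARD BLOW"},
        "han": {"SEP", "SHARP-BLADED", "SAWING", "SUPPORTING SURFACE"},
        "cət": {"SEP", "LENGTHWISE", "SHARP-BLADED", "BLADE"},
        "vay": {"SEP", "BLUNT-HEADED", "HAND PREF"},
    },
    "subdivision": {
        "cəɲcram": {"SEP", "SHARP-BLADED", "STRIKING", "REPEATED"},
        "kat": {"SEP", "SHARP-BLADED", "HAND PREF", "INTERMEDIATE"},
    },
}


def mean_component_count(verbs):
    """Average number of conflated components per verb at one level."""
    return sum(len(components) for components in verbs.values()) / len(verbs)


def extends(specific, generic):
    """True if `specific` has every component of `generic` plus at least one more."""
    return generic < specific


for level, verbs in LEXICALISATION.items():
    print(level, round(mean_component_count(verbs), 2))

cat = LEXICALISATION["category"]
sub = LEXICALISATION["subcategory"]
print(extends(sub["kap"], cat["kat"]))  # /kap/ adds STRIKING to /kat/
print(extends(sub["vay"], cat["dɑm"]))  # /vay/ adds HAND PREF to /dɑm/
```

On this encoding the average number of conflated components rises from the category level to the subdivision level, which is one way of reading the generic/specific distinction; individual verbs can still depart from the average.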

It is also important to consider the resultant effects of caused-separation in many stimulus scenes. In this study, the stimuli feature both full and partial separation. Those displaying complete separation could all be described with constructions in which no (intransitive) verb of resulting separation was serialised with the preceding verb of caused-separation. That being the case, the question arises as to whether a caused-separation verb also entails the relevant result. If so, a compatible resulting-separation verb serialised to it would simply be added for emphasis. However, my analysis of this issue is different: I conclude that verbs of caused-separation in Khmer do not unambiguously convey the semantics of resulting separation but merely imply it. Because they only imply such effects, they allow resultative phrases (or perhaps clauses) with a correlated intransitive verb of resulting separation to confirm whether separation was achieved.

There are at least two rationales that help us understand why we should not consider resulting separation to be entailed by verbs of caused-separation. One concerns the use of some caused-separation verbs for incomplete separation events; the other concerns the possible emergence of resultatives.

Certain verbs of caused-separation used to describe scenes involving partial separation could also label scenes involving complete separation. In the analysis, five scenes display incomplete separation effects: Scene 14 CUT MELON W/ KNIFE, Scene 25 SNAP TWIG W/ HANDS, Scene 36 TEAR CLOTH ABOUT HALFWAY W/ HANDS, Scene 45 POKE HOLE IN CLOTH W/ TWIG, and Scene 51 CHOP MELON W/ KNIFE. Note that Scene 18 CUT FINGER ACCIDENTALLY can also be considered as involving incomplete separation, since the finger shown in the scene was not severed by the cutting; however, this scene was excluded from the general analysis, as explained in § 4.3.1.1. The five scenes were assigned to four different categories, i.e., the /puh/-, /kac/-, /cak/-, and /haek/-categories, each describable by the category’s representative verb, which also applied to the scenes involving full separation in the same category. Therefore, I conclude that these representative verbs demonstrate how verbs of caused-separation can describe caused-separation events regardless of whether the action brings about full or incomplete separation. This means that they do not contain or require any specific kind of result; to be precise, they do not entail any resultant effect. However, one may doubt whether these verbs can represent all others of the same kind. Let us consider the other rationale below.

All verbs of caused-separation in this study can co-occur with resultative phrases or resultative serialisation that provide additional information about how resultant effects were reached (see § 4.2.2). In consultation with native speakers, I conclude that such intransitive resultatives can be negated to reject achievement of resulting separation.

Consider (4.20) and (4.21) below. Example (4.20) is from one of the descriptions of Scene 27, where a man cut a sitting woman’s hair off. Despite the complete resulting separation, no intransitive verb was serialised to the single main verb to represent the resultant effect, as if the main verb alone were enough to describe the scene. In general, events like cutting hair with scissors, as in Scene 27, are adequately portrayed using only the single main verb /kat/ ‘cut’, with the conventional reading that the hair came completely off as a result of the cutting.

(4.20) koat kat sɑk nierii
       3SG cut hair woman
       ‘He cut the woman’s hair.’ [CB-S27-CP]

(4.21) koat kat sɑk nierii mɨn dac
       3SG cut hair woman NEG separate
       ‘He (tried to) cut the woman’s hair, but it was not off.’ (constructed by a native speaker of Khmer)


However, when a negative resultative is serialised to (4.20), the derived example (4.21) is still grammatical and semantically well-formed. If the verb /kat/ ‘cut’ truly entailed a resulting effect (i.e., complete separation), the addition of a negative resultative would have created a semantic anomaly. Accordingly, it seems that verbs of caused-separation do not necessitate or formally entail resulting separation. Native speakers nevertheless regularly interpret verbs of caused-separation like /kat/ as conveying complete separation because, in most cases, verbs of this kind imply it even though they do not entail it.

I have analysed Khmer verbs of caused-separation as not lexicalising a separation result but as implying it, that is, as pointing to an expected result. Thus, unlike state-change verbs of material destruction in English, verbs of caused-separation in Khmer do not involve semantic entailment of resulting effects.⁴⁸

4.5.3 Summary

I have pointed out the semantic characteristics linked to the different levels of categorisation in the caused-separation domain in Khmer. Based on these meaning attributes, the pertinent semantic components were generalised and treated as parameters characterising the domain’s categorisation. The components identified are Instrument type, Manual manipulation, Object type, Object subtype, Direction of separation, Blade placement, Manner of action, Supplementary tool, Hand preference, Pattern of action, and Predictability. Speakers of Khmer organise these semantic components in different ways to partition the domain at different levels. Also, given this semantic characterisation associated with the verb-determined categorisation, I argue for the implied connection between the observed characteristics and the relevant defining verb descriptors. I close the section with an outline of semantic element conflation (including the fundamental element of caused-separation) in the lexicalisation patterns of caused-separation verbs in Khmer. It is also worth noting that the lexicalisation patterns used by Khmer speakers echo the domain’s semantic categorisation.

48 See the footnote at the end of § 3.5.2 about the nature of verbs of caused-separation in Khmer (as well as Thai) and why their implicature of caused result is cancellable.

------⁂ ------


CHAPTER 5

Comparison of semantic categorisation of caused-separation across Thai and Khmer

In this chapter, I present a comparison of categorisation of the caused-separation event domain across Thai and Khmer following from the analysis in Chapters 3 and 4. This comparison is an assessment of how Thai and Khmer are the same or different at appropriate levels of semantic organisation, including fine-grained distinctions. General similarities as revealed by the semantic domain’s categorisation may suggest the extent of convergence between Thai and Khmer, but carrying out in-depth comparison will help in assessing which semantic aspects have come about or developed because of “convergence” factors, a topic discussed further in Chapter 6. Other language-specific factors and those representing divergent conservations are also identified in this chapter.

The first section of this chapter reviews which verbs the speakers of Thai and Khmer used to describe the 43 caused-separation stimulus scenes and how they used them (§ 5.1). The second section (§ 5.2) describes how the domain is categorised both similarly and differently across the languages, based on the semantics of verbs. The approach of Majid et al. (2004, 2008) is used to probe how certain Thai-Khmer similarities relate to cross-linguistic partitioning patterns and to determine features that seem constant across the two languages or specific to each. The third section, § 5.3, discusses boundaries of the domain’s semantic categories in Thai and Khmer, including whether and how the languages are similar or different regarding their ways of categorising caused-separation events. This concerns the nature of the semantic distinctions made in each language. The last section, § 5.4, explores similar and dissimilar semantic component organisation patterns across Thai and Khmer, consequently making plain the semantic parameters that underlie the categorisation of the domain in these two languages.

5.1 Verb usage for caused-separation domain in Thai and Khmer: The data from fieldwork elicitation

5.1.1 Frequency analysis in caused-separation domain across Thai and Khmer

As shown in Table 5.1, the Thai speakers used 24 different caused-separation verbs. Likewise, the Khmer speakers employed the same total number of verb types. The verbs in Thai and Khmer occurred in 334 and 335 tokens, respectively, in the scene description data. Pooled across all the native consultants of each language, the numbers of verb types and of scene descriptions differ from scene to scene. For Thai, the average number of verb types is 2.28 per scene, while the average number of scene descriptions is 7.77 per scene. For Khmer, the average number of verb types is approximately 2.56 per scene, while that of descriptions is 7.79.


Table 5.1

Summary of frequency analysis for caused-separation verbs in Thai and Khmer.

Language   # of consultants   # of types of caused-separation verbs   Total # of caused-separation descriptions   x̅ of types of caused-separation verbs per scene   x̅ of caused-separation descriptions per scene
Thai       7                  24                                      334                                         2.28                                               7.77
Khmer      7                  24                                      335                                         2.56                                               7.79

Despite the higher average rates of verbs per scene and descriptions per scene for Khmer than those for Thai, the number of verb occurrences and descriptions for each scene in Khmer is not consistently larger than the individual counterparts in Thai. As illustrated in Figure 5.1, there are 17 scenes associated with a larger number of Khmer verb occurrences than the counterpart number for Thai. The speakers of both languages labelled 16 other scenes with equivalent numbers of Thai and Khmer verb occurrences. The remaining 10 scenes were described with smaller numbers of Khmer verbs. As for the number of scene descriptions per scene across the two languages, the Khmer speakers associated 15 scenes with the larger number of descriptions. Another 15 scenes were described with equivalent numbers of Thai and Khmer descriptions, while the remaining 13 scenes correspond to a smaller number of Khmer descriptions.


Figure 5.1. Numbers of verb types per scene and descriptions per scene across Thai and Khmer.

Therefore, despite the different average numbers of verbs and of descriptions per scene, it is premature to conclude that the Khmer speakers used larger numbers of verbs to describe a caused-separation event, or that they would deal with an event of a given kind using a larger number of descriptions, than the speakers of Thai in this study. In fact, a closer look reveals that the correspondences between Thai and Khmer regarding numbers of verbs used per scene and descriptions per scene are somewhat haphazard. It is thus difficult to chart a trend at this point. Given the small differences observed in average values, I argue that no important inferences across Thai and Khmer can be drawn from frequency analysis alone.
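The arithmetic behind this subsection can be checked with a minimal sketch. It uses only figures stated above (43 scenes, the description totals in Table 5.1, and the per-scene tallies read off Figure 5.1); the function and variable names are illustrative assumptions.

```python
N_SCENES = 43  # stimulus scenes, as stated in the text

# Description totals reported in Table 5.1.
descriptions = {"Thai": 334, "Khmer": 335}


def mean_per_scene(total, n_scenes=N_SCENES):
    """Average count per scene, rounded to two decimals as in Table 5.1."""
    return round(total / n_scenes, 2)


for language, total in descriptions.items():
    print(language, mean_per_scene(total))

# Per-scene comparison tallies reported for Figure 5.1
# (Khmer count larger than, equal to, or smaller than the Thai count).
verb_tally = {"Khmer larger": 17, "equal": 16, "Khmer smaller": 10}
description_tally = {"Khmer larger": 15, "equal": 15, "Khmer smaller": 13}

# Each tally should cover all 43 scenes.
assert sum(verb_tally.values()) == N_SCENES
assert sum(description_tally.values()) == N_SCENES


def share_not_larger(tally):
    """Share of scenes where the Khmer count does not exceed the Thai count."""
    not_larger = tally["equal"] + tally["Khmer smaller"]
    return round(not_larger / sum(tally.values()), 2)


print(share_not_larger(verb_tally))
print(share_not_larger(description_tally))
```

On these figures, the Khmer count fails to exceed the Thai count in well over half of the scenes on both measures, which is consistent with the caution expressed above about reading a trend into the averages.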

5.1.2 Verbs of caused-separation used by Thai and Khmer speakers

As already mentioned in the above subsection, Thai and Khmer scene descriptions each involved 24 different verbs of caused-separation. This subsection provides a comparative examination with more detail regarding the use and frequency of these verbs by speakers of the two languages.

Figure 5.2 shows the 24 different verbs of caused-separation in Thai, distributed in a descending-frequency manner. A limited group of nine highly frequent verbs is responsible for the majority of the descriptions, at a rate of 81.44%. They are /tàt/ ‘cut’ (23.65%), /tʰúp/ ‘smash’ (12.57%), /fan/ ‘hack’ (11.98%), /hàk/ ‘snap’ (8.68%), /hàn/ ‘slice’ (6.59%), /sàp/ ‘chop’ (6.29%), /tɕʰìːk/ ‘tear’ (4.79%), /pʰàː/ ‘hew’ (3.89%), and /dɯŋ/ ‘tug’ (2.99%).

Figure 5.2. Caused-separation verbs used by Thai speakers.

As illustrated in Figure 5.3, the Khmer speakers used an equivalent number of caused-separation verbs, with a distribution roughly analogous to that displayed by Thai. In the case of Khmer, 82.14% of the scene descriptions contain one of just ten caused-separation verbs. They are /kat/ ‘cut’ (21.79%), /dɑm/ ‘smash’ (11.04%), /kap/ ‘hack’ (9.25%), /kac/ ‘snap’ (8.66%), /puh/ ‘chop’ (6.87%), /pdac/ ‘separate’ (5.97%), /ʔaa/ ‘saw’ (5.07%), /vay/ ‘strike’ (4.78%), /haek/ ‘tear’ (4.48%), and /cak/ ‘stab’ (4.18%).

Figure 5.3. Caused-separation verbs used by Khmer speakers.

This overview of naming patterns therefore uncovers important specifics about the use of caused-separation verbs in the two languages. These patterns can be explicated as follows. First, Thai and Khmer both make use of small numbers of verbs accounting for most of the scene descriptions. The speakers attributed more than four fifths of the descriptions to no more than ten verbs. Thus, these verbs can be referred to as ‘predominant verbs’ or the frequent-verb descriptors for the domain.

Second, as discussed in Chapters 3 and 4, more than half of the verbs whose distributions partition the domain are among these predominantly used verbs. In Thai, four of the six verbs that define the partitioning semantic categories are among the nine frequent verbs: /tàt/ ‘cut’, /tʰúp/ ‘smash’, /hàk/ ‘snap’, and /tɕʰìːk/ ‘tear’. Likewise, six of the seven category-defining verbs in Khmer are among the ten predominantly used verbs: /kat/ ‘cut’, /dɑm/ ‘smash’, /kac/ ‘snap’, /puh/ ‘chop’, /haek/ ‘tear’, and /cak/ ‘stab’. The other verbs in each frequent-verb list (e.g., /fan/ ‘hack’ or /hàn/ ‘slice’ in Thai, and /kap/ ‘hack’ or /ʔaa/ ‘saw’ in Khmer) also play a big part in the domain’s subcategorisation.

Exceptional considerations apply to the Khmer verb /pdac/ ‘separate’.⁴⁹ As argued in § 4.5.1, this verb did not sort caused-separation events of the /dɑm/-, /cak/-, and /tieɲ/-categories into smaller classes or subcategories: it did not delineate a specific version of the event types linked to the categories. It was in fact used to describe intermediate-precision actions expecting a division in one place, regardless of the type of caused action.

Third, the Thai and Khmer distributions each show extremely heavy use of a single verb relative to the remaining verbs. Specifically, the Thai speakers employed /tàt/ ‘cut’ with far greater frequency than the others. In the Thai data shown in Figure 5.2, the frequency of the next caused-separation verb in sequence is much lower: the most frequent verb, /tàt/, has about double the frequency of the second most used verb, /tʰúp/ ‘smash’. In similar fashion, the most frequently occurring verb in Khmer, /kat/ ‘cut’, has a frequency approximately double that of /dɑm/ ‘smash’, which ranks second. The other verbs show much lower frequencies.

Fourth, the other, long-tailed verbs in each language, which were used less frequently than the established predominant verbs, reflect similar patterns of frequency distribution. Specifically, each of them accounts for no more than 3% of all the cases in both Thai and Khmer.

49 Is it the case that /pdac/ is selected for non-prototypical cases of separation, as conjectured by a reader? This may be one factor in lexical selection. However, the verb /pdac/ was also used for prototypical cases where the use of instruments was involved (e.g., CUT ROPE W/ KNIFE), so non-prototypicality may not be viable as a general account. See also discussion on p. 310.

Similarly across Thai and Khmer, speakers commonly used most of the long-tailed verbs as sporadic alternatives to the more frequently used verbs in the head of the distributions. Only a minority of the long-tailed verbs are associated with subcategorisations within the caused-separation event domain. For example, in Thai, /tɕɔ̀ʔ/ ‘puncture’ is involved in the subclassification of the /tàt/- and /tʰîm/-categories. In Khmer, /han/ ‘slice’ and /cət/ ‘slit’ each define a subcategory in the /kat/- and /puh/-categories, respectively. In addition, the long-tailed verb groups in Thai and Khmer introduce the use of /bàːt/ ‘cut’ and /mut/ ‘cut’, respectively, by which Scene 18 CUT FINGER ACCIDENTALLY was exclusively described. Since this particular scene was designed solely to feature an accidental action of separation (Bohnemeyer et al., 2001), this feature might be seen as recognised by the two dedicated verbs. Yet the scene’s descriptions in both Thai and Khmer did not form a pattern with those of other scenes. I excluded them from consideration since they gave no important details, at least for the present study. It is accordingly too early to judge here whether the use of these verbs really helps define a distinct event-type category in each language. Further research is needed to address this empirical issue.

In summary, the speakers of Thai and Khmer used an equivalent number of caused-separation verbs (24), contained in 334 and 335 descriptions in the respective languages, for the 43 stimulus scenes. The average rates of verb types per scene and descriptions per scene are greater for Khmer than for Thai, raising the question of whether this difference reflects something important about the domain’s designations across the languages. Closer examination shows that the differences in the numbers of verb types and scene descriptions for each scene across Thai and Khmer are too inconsistent to indicate a meaningful trend. Furthermore, a comparison of verb distributions in the Thai and Khmer descriptions captures some interesting aspects of lexical descriptor use for the domain. Specifically, the verbs of caused-separation in Thai and Khmer produce similar distributions: only nine to ten of the 24 available verbs in each language are responsible for more than 80% of the scene descriptions, and the remaining verbs account for the rest. Also, these predominant verb groups in both languages include most of the verbs that do the main work of the domain’s categorisation.
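The head-heavy shape of the two distributions can be illustrated with a short sketch built only from the percentages quoted in this section for the predominant verbs. The function names are illustrative assumptions, and because the inputs are rounded percentages, their sums may differ slightly from the totals reported above (81.44% and 82.14%).

```python
# Percentages of scene descriptions per verb, as quoted in the text
# for Figures 5.2 (Thai) and 5.3 (Khmer); the glosses are omitted.
THAI = {"tàt": 23.65, "tʰúp": 12.57, "fan": 11.98, "hàk": 8.68, "hàn": 6.59,
        "sàp": 6.29, "tɕʰìːk": 4.79, "pʰàː": 3.89, "dɯŋ": 2.99}
KHMER = {"kat": 21.79, "dɑm": 11.04, "kap": 9.25, "kac": 8.66, "puh": 6.87,
         "pdac": 5.97, "ʔaa": 5.07, "vay": 4.78, "haek": 4.48, "cak": 4.18}


def cumulative_coverage(shares):
    """Running total of shares, taking verbs in descending order of frequency."""
    running, out = 0.0, []
    for verb, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        running += share
        out.append((verb, round(running, 2)))
    return out


def top_ratio(shares):
    """Ratio of the most frequent verb's share to the second most frequent."""
    first, second = sorted(shares.values(), reverse=True)[:2]
    return round(first / second, 2)


for name, shares in (("Thai", THAI), ("Khmer", KHMER)):
    coverage = cumulative_coverage(shares)
    print(name, len(shares), "verbs cover", coverage[-1][1], "% of descriptions;",
          "top-two ratio:", top_ratio(shares))
```

The top-two ratio comes out just under 2 for both languages, which corresponds to the observation that the most frequent verb is about twice as frequent as the runner-up.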

5.2 Comparison of granularity of caused-separation domain across Thai and Khmer

Despite the equivalent numbers of verb types for the caused-separation event domain across Thai and Khmer, verbs in each language appeared with different distributions. Distinct occurrence patterns establish slightly different numbers of semantic categories in relation to the domain. This section first provides a comparison of how the domain is decomposed into its subdomains, i.e., categories and pertinent subcategories, and the extent to which such subdomains are internally structured, showing different hierarchical organisations (§ 5.2.1). In addition, the granular categorisations in Thai and Khmer seem to show linguistic asymmetries in lexical use in favour of richer encoding of some categories over others (§ 5.2.2). Such asymmetries in lexical resources also echo a variety of fine-grained potential distinctions within the controlled domain of caused-separation events, as was seen in both Thai and Khmer.

5.2.1 Granular categorisation in caused-separation domain across Thai and Khmer

In response to the 43 stimulus scenes of caused-separation, the event descriptions elicited from Thai and Khmer speakers provide input for cluster analyses to determine verb distributional patterns that separate categories from one another in the domain (see §§ 3.3.1 and 4.3.1). The resulting dendrograms derived from the analyses indicate how the two languages present minimally distinct pictures of categorisation concerning the structure of categories and internal organisation within the general domain of caused-separation. This section addresses these classifications and suggests certain patterns of differences and similarities in the domain’s characterisation.
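For readers less familiar with the technique, the following is a minimal sketch of agglomerative clustering with Jaccard’s index and average linkage, the two ingredients named for the cluster analyses in this study. Everything else is an assumption made for illustration: the scenes and verb sets are invented toy data (not the elicited data), each scene is represented simply as the set of verbs used for it, and the cutoff value is arbitrary.

```python
from itertools import combinations

# Hypothetical toy data: each scene is represented by the set of verbs that
# consultants used for it. Scene labels and verb sets are invented for
# illustration and are NOT the thesis data.
SCENES = {
    "cut rope w/ knife": {"kat", "pdac"},
    "cut cloth w/ scissors": {"kat"},
    "smash pot w/ hammer": {"dɑm", "vay"},
    "smash carrot w/ hammer": {"dɑm"},
    "tear cloth w/ hands": {"haek"},
}


def jaccard_distance(a, b):
    """1 minus the Jaccard index |a & b| / |a | b| of two non-empty verb sets."""
    return 1.0 - len(a & b) / len(a | b)


def average_linkage(c1, c2, data):
    """Mean pairwise Jaccard distance between the members of two clusters."""
    pairs = [(x, y) for x in c1 for y in c2]
    return sum(jaccard_distance(data[x], data[y]) for x, y in pairs) / len(pairs)


def cluster(data, cutoff):
    """Merge the closest pair of clusters until the closest pair exceeds cutoff."""
    clusters = [frozenset([name]) for name in data]
    merges = []
    while len(clusters) > 1:
        (c1, c2), dist = min(
            (((a, b), average_linkage(a, b, data))
             for a, b in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        if dist > cutoff:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1 | c2), round(dist, 2)))
    return clusters, merges


clusters, merges = cluster(SCENES, cutoff=0.9)
for members in clusters:
    print(sorted(members))
```

In this toy run, the two cutting scenes and the two smashing scenes each merge because they share a verb, while the tearing scene stays on its own; the sequence of merges and their distances is the information a dendrogram displays.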

5.2.1.1 Numbers of categories across Thai and Khmer

Figures 3.2 and 4.2 show that the domain of caused-separation events can be broken into multiple coarse-grained categories based on the cluster analysis and the patterns of verb occurrences. There is a comparable but not identical number of categories: Thai has six, and Khmer has seven. For Thai, the distributions of six defining verbs, i.e., /tàt/ ‘cut’, /tʰúp/ ‘smash’, /hàk/ ‘snap’, /tɕʰìːk/ ‘tear’, /krìːt/ ‘slit’, and /tʰîm/ ‘stab’, partition the semantic field into six respective categories, i.e., the /tàt/-, /krìːt/-, /tʰúp/-, /hàk/-, /tɕʰìːk/-, and /tʰîm/-categories (see § 3.3.1.1). For Khmer, the occurrence patterns of seven verbs partition the domain: /kat/ ‘cut’, /dɑm/ ‘smash’, /puh/ ‘chop’, /kac/ ‘snap’, /cak/ ‘stab’, /haek/ ‘tear’, and /tieɲ/ ‘tug’. These define seven corresponding categories: the /kat/-, /puh/-, /dɑm/-, /kac/-, /haek/-, /tieɲ/-, and /cak/-categories. The results of the cluster analysis thus reveal that the organisation of caused-separation categories as determined by the speakers of Thai is not congruent with that of the Khmer speakers. At this coarse level, Khmer appears to categorise the domain slightly more finely than Thai. Table 5.2 below provides a summary of the semantic categories of caused-separation found in Thai and Khmer.

Table 5.2

Summary of cluster analysis for caused-separation categories in Thai and Khmer.

Thai (6 categories)                     Khmer (7 categories)
/tàt/-category (/tàt/ ‘cut’)            /kat/-category (/kat/ ‘cut’)
/tʰúp/-category (/tʰúp/ ‘smash’)        /dɑm/-category (/dɑm/ ‘smash’)
/hàk/-category (/hàk/ ‘snap’)           /puh/-category (/puh/ ‘chop’)
/tɕʰìːk/-category (/tɕʰìːk/ ‘tear’)     /kac/-category (/kac/ ‘snap’)
/krìːt/-category (/krìːt/ ‘slit’)       /cak/-category (/cak/ ‘stab’)
/tʰîm/-category (/tʰîm/ ‘stab’)         /haek/-category (/haek/ ‘tear’)
                                        /tieɲ/-category (/tieɲ/ ‘tug’)

Note that all the categories in each language are ranked by extensional range.

Table 5.2 can be interpreted by referring to the frequency analysis in § 5.1.2. The categories described in the table can be understood as potentially constituting a first indication of where the domain’s categorisation occurs. A good case in point is the two most frequent verbs of caused-separation in both Thai (i.e., /tàt/ ‘cut’ (23.65% of all descriptions) and /tʰúp/ ‘smash’ (12.57%)) and Khmer (i.e., /kat/ ‘cut’ (21.79%) and /dɑm/ ‘smash’ (11.04%)). The use of these verbs specifically indicates the extent to which speakers of the two languages generally differentiated two main types of caused-separation events and assigned each of them into the separate categories describable by the individual verbs (i.e., the /tàt/- and /tʰúp/-categories in Thai and the /kat/- and /dɑm/-categories in Khmer).

Note too that the different categories for each language, as given in Table 5.2, do not carve up the domain into equal portions. Referring again to Figures 3.2 and 4.2, we can distinguish categories of different breadth in both Thai and Khmer. Some of them extend over a remarkably wide range of stimulus scenes. In contrast, the speakers of the languages restricted other categories to smaller numbers of scenes. The two pie charts below display the percentages of the different categories of caused-separation in the two languages. Note that the special scenes detailed in §§ 3.3.1.1 and 4.3.1.1 have been integrated into the relevant categories in both Thai and Khmer. Also, Scene 18 CUT FINGER ACCIDENTALLY was excluded, as it was found to be an outlier in the cluster analysis. Accordingly, the percentages were prorated based on the remaining 42 scenes.

Figure 5.4. Portions in percentage of different categories in caused-separation domain in Thai and Khmer.


In comparing Figure 5.4 with Figures 5.2 and 5.3 (in § 5.1.2), the portions of the categories in both languages may seem to reflect the frequency analysis, in that the larger the category, the more frequent the verb used to define it. This crude generalisation is well supported by the Thai descriptions, as the descending ranking of the categories by size is strongly aligned with the descending order of frequency of their category-defining verbs. The Khmer descriptions observe the generalisation too, but only partially. A notable exception is the inverted ranking of the /puh/- and /kac/-categories and of the /cak/- and /haek/-categories. Specifically, the verbs defining the /kac/- and /haek/-categories occurred more often in the descriptions than those defining the /puh/- and /cak/-categories, respectively, yet each of the former categories ranks lower in size than its counterpart in the latter group. Note, however, that the differences in frequency between the defining verbs in each pair are very small (less than 2%). These discrepancies seem too minor to undermine the generalisation about frequency of occurrence.

In brief, Thai and Khmer carve up the caused-separation event domain into different numbers of categories, as established by the particular defining verb distributional patterns. In Thai, there are inferred to be six categories in the domain, while the same domain turns out to break down into seven categories in Khmer. The individual categories in both languages are not equal in breadth since some of them have broad application whereas others show narrower patterns of use. Nevertheless, for both languages, the comparatively broader categories are unambiguously defined by the more frequent verbs.


5.2.1.2 Internal organisation of caused-separation categories in Thai and Khmer: Finer granularity and depth of hierarchy structure

As discussed in earlier chapters, cluster analysis and verb distributions reveal the internal organisation of the semantic categories of caused-separation in each language (see Figures 3.2 and 4.2). Specifically, both predominant and less frequently used verbs are involved in this organisation. Not only do the relevant predominant verbs help define the categories of the domain, but other verbs are also used occasionally to name some of their event members. Occurrence patterns of these infrequent verbs then result in the partitioning of certain categories into subcategories, in a manner similar to how the predominant verbs’ distribution patterns divide up the domain. Semantic subpartitioning in the two languages thus appears to show hierarchical architectures of finer-grained distinctions, with different shapes across the categories. In this part, I discuss the internal organisations of the domain’s categories in Thai and Khmer and compare them.

The cluster analysis results displayed in Figures 3.2 and 4.2 again reveal the internal organisation of the semantic categories of caused-separation in each language. As already mentioned, six different categories were defined for Thai. Based on the distributions of verbs other than the individual category-defining verbs, the /tàt/-category shows the finest-grained subcategorisation, containing four subcategories: /fan/-, /hàn/-, /pʰàː/-, and /tɕɔ̀ʔ/-. The /fan/-subcategory breaks down further into the /sàp/-, /tɕaːm/-, and /lɯ̂aj/-subdivisions. The four other categories, i.e., /krìːt/-, /tʰúp/-, /tʰîm/-, and /tɕʰìːk/-, are divided into different numbers of subcategories, ranging from one to three. The remaining /hàk/-category is hierarchically the flattest, with no subcategorisation in Thai. In Khmer, the seven categories are decomposed to different degrees of subcategorisation, according to the occurrence patterns of the relevant infrequent verbs. The /kat/- and /dɑm/-categories have the most taxonomic (hierarchical) depth in the language, since they are decomposable down to the level of subdivision. Yet the former has more fine distinctions of subcategorisation than the latter, since it exhibits the /kap/-, /ʔaa/-, and /han/-subcategories, the first of which incorporates a smaller division, the /cəɲcram/-subdivision. The /dɑm/-category contains the /vay/-subcategory, which in turn breaks down into the /kat/-subdivision. Two other categories are decomposed only down to the subcategory level: the /puh/-category, containing the /kap/- and /cət/-subcategories, and the /cak/-category, including only the /tumluh/-subcategory. The remaining three categories, the /kac/-, /tieɲ/-, and /haek/-categories, are hierarchically flat since they have no subcategorisation. Figures 5.5a-b below summarise the domain’s subcategorisation: fine-grained differentiation within certain categories, with the deeper hierarchical levels where subdivisions occur. Note the slight modification of the hierarchical structures: for Khmer, the /pdac/ occurrence pattern was removed from the hierarchical classification for reasons described in § 4.5.1. The verb-based labels for the individual subclasses are in short format: the representative verbs for categories, subcategories, and subdivisions are presented with hyphenation.


Figure 5.5a. Hierarchy of caused-separation categories with relevant subcategories and smaller divisions in Thai.

Figure 5.5b. Hierarchy of caused-separation categories with relevant subcategories and smaller divisions in Khmer.

Speakers of Thai and Khmer, through their scene descriptions, decompose the semantic space of caused-separation events along similar lines, distinguishing three hierarchical levels of granularity of categorisation. First, the domain is separated into coarse-grained categories. Second, based on how the speakers in the two language groups described events, some of the categories are divided into subcategories. Last, some of the subcategories are classified further into smaller divisions. Specifically, five of the six Thai categories were divided further into subcategories, only one of which (the /tàt/-category) contains a level of smaller divisions, the /sàp/-, /tɕaːm/-, and /lɯ̂aj/-subdivisions. This characterises the domain’s Thai taxonomic hierarchy.

The seven major Khmer categories are shown in Figure 5.5b. Of these, four are decomposed into further subcategories, two of which show still deeper levels of smaller divisions. Consequently, since Khmer contains more categories than Thai that continue partitioning to the level of smaller divisions, Khmer’s organisation of the caused-separation domain may be said, overall, to have slightly more taxonomic depth than that of Thai.
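The depth comparison can be sketched by encoding the hierarchies as nested mappings. The Khmer structure below follows the description given above (with /pdac/ left out, as in Figure 5.5b); for Thai, only the /tàt/-category is spelled out in the text, so only that branch is encoded. The nested-dictionary representation and the function names are illustrative assumptions.

```python
# Khmer hierarchy as described in the text (category -> subcategory ->
# subdivision); /pdac/ is left out, as in Figure 5.5b. The nested-dict
# encoding and the function names are illustrative assumptions.
KHMER = {
    "kat": {"kap": {"cəɲcram": {}}, "ʔaa": {}, "han": {}},
    "dɑm": {"vay": {"kat": {}}},
    "puh": {"kap": {}, "cət": {}},
    "cak": {"tumluh": {}},
    "kac": {},
    "tieɲ": {},
    "haek": {},
}

# Only the /tàt/-category is spelled out for Thai in the text, so only that
# branch is encoded here.
THAI_TAT = {"fan": {"sàp": {}, "tɕaːm": {}, "lɯ̂aj": {}},
            "hàn": {}, "pʰàː": {}, "tɕɔ̀ʔ": {}}


def depth(children):
    """Number of levels nested below a node (0 for a flat category)."""
    return 1 + max(map(depth, children.values())) if children else 0


def depth_profile(hierarchy):
    """Map each category to the number of levels nested beneath it."""
    return {category: depth(children) for category, children in hierarchy.items()}


profile = depth_profile(KHMER)
print(profile)
print("Khmer categories with subcategories:", sum(d >= 1 for d in profile.values()))
print("Khmer categories with subdivisions:", sum(d >= 2 for d in profile.values()))
print("Levels below Thai /tàt/:", depth(THAI_TAT))
```

Counting over this encoding reproduces the figures given above for Khmer: four categories have subcategories, and two of those reach the subdivision level.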

5.2.2 Asymmetry in encoding of caused-separation across Thai and Khmer: A look at lexical resources

In Chapters 3 and 4, I noted the extent of the Thai and Khmer speakers’ use of caused-separation verbal descriptors for the domain’s categories and how hierarchical structures are built within some of the categories. An important finding relates to the tendency toward asymmetry in encoding of events of the domain. This subsection refers again to quantitative aspects of verbs occurring for each category and the hierarchical partitioning of the domain. In this subsection, I develop further arguments and comparisons relating to phenomena of asymmetry in event naming with attention to lexical resources and hierarchy of semantic categories.


5.2.2.1 Asymmetry in lexical resources

Figures 3.4 for Thai (§ 3.3.1.2) and 4.4 for Khmer (§ 4.3.1.2) show that, based on descriptions of the same events, comparable categories in the two languages do not involve verbal lexicons of the same size. Specifically, the speakers of the two languages provide more distinct labels for certain caused-separation categories than others.

Furthermore, in Khmer, some categories include up to five times the number of verbs of other categories. In Thai, one category can have ten times as many verbs as another category that is described with a single verb. These divergent uses of verbs exhibit the asymmetrical nature of lexical encoding of caused-separation in the two languages.

A question then arises: what should we examine to measure these phenomena of lexical asymmetry in a better-controlled and systematic fashion? Out of the 43 stimulus scenes (Bohnemeyer et al., 2001), some occur with instruments: e.g., scissors, saw, machete, or axe. Some only occur with a particular type of object. For example, Scenes 48, 13, and 54 feature the use of an axe in separating a stick, a rope, and a carrot, respectively. However, the set of 43 scenes does not include the use of an axe in causing separation for a piece of cloth. Another case in point is with the fish as theme. There is only one scene featuring a fish as the theme of separation by a knife. To take proper account of instrument/manner and object feature variation, we should scope out a core set of scenes, as pointed out by Bohnemeyer et al.; these are referred to here as core caused-separation scenes. Taken together, these scenes consistently involve a set of instrument/manner features in relation to a group of object features. Shown below is a grid-like table of core scenes of this type, with the relevant scene number in parentheses.


Table 5.3

Grid-like design of core set of caused-separation scenes, modified from Bohnemeyer et al. (2001).

                      object
instrument/manner     piece of cloth    carrot             stick            rope/string
pointy tool           cpoint (S45)      carpoint (S43)     spoint (S53)     rpoint (S2)
hammer                chammer (S23)     carhammer (S21)    shammer (S31)    rhammer (S50)
karate-style          ckarate (S34)     carkarate (S32)    skarate (S42)    rkarate (S61)
knife                 cknife (S12)      carknife (S10)     sknife (S20)     rknife (S49)
hands                 chands (S1)       carhands (S57)     shands (S19)     rhands (S38)
furiously             cfury (S4)        carfury (S6)       sfury (S5)       rfury (S35)

Note that this table was reproduced from Bohnemeyer et al. (2001), with some modifications: (1) spontaneous manner was removed as this study focusses only on events of caused-separation, and (2) two subdifferentiations of ‘carknife’, i.e., cutting the carrot along the long axis (S9) vs. chopping it repeatedly along the short axis (S26), were excluded; I kept only that of cutting the carrot along the short axis, in parallel with the direction of separation displayed in ‘cknife’ (S12), ‘sknife’ (S20), and ‘rknife’ (S49). The original descriptive file names in Bohnemeyer et al. are maintained in each cell of the above table.
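The crossed design of Table 5.3 can be made explicit in a short sketch. The following Python fragment is illustrative only: the names OBJECTS, CORE_SCENES, and scene_for are hypothetical labels introduced here, while the scene numbers are those given in the table. It checks that each instrument/manner is paired with each object exactly once, which yields the 24 core scenes.

```python
# Core scene grid of Table 5.3: rows are instrument/manner, columns are objects.
# The variable and function names are illustrative, not part of the stimulus set.
OBJECTS = ["piece of cloth", "carrot", "stick", "rope/string"]

CORE_SCENES = {
    "pointy tool":  [45, 43, 53, 2],
    "hammer":       [23, 21, 31, 50],
    "karate-style": [34, 32, 42, 61],
    "knife":        [12, 10, 20, 49],
    "hands":        [1, 57, 19, 38],
    "furiously":    [4, 6, 5, 35],
}


def scene_for(manner: str, obj: str) -> int:
    """Return the scene number for one instrument/manner-by-object cell."""
    return CORE_SCENES[manner][OBJECTS.index(obj)]


all_scenes = [scene for row in CORE_SCENES.values() for scene in row]

# Every cell is filled and no scene is used twice: 6 manners x 4 objects.
assert len(all_scenes) == len(set(all_scenes)) == 6 * 4

print(scene_for("knife", "carrot"))  # 10
print(len(all_scenes))               # 24
```

Because every cell of the grid is filled, differences in verb choice across a row or a column can be attributed to the object or to the instrument/manner, respectively, rather than to gaps in the stimulus set.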

Hierarchical clustering organisation established for Thai (Figure 3.2) and for Khmer (Figure 4.2) provides a means of interpreting the above 24 core scenes of Table 5.3. I found the core caused-separation scenes to be associated with five (out of six) of the major hierarchical categories for Thai and with six (out of seven) categories for Khmer (see Figures 3.2 and 4.2). Considering only caused-separation verbs linked to this limited list of categories for each language, I summarise the lexical resources utilised in expressing them in Tables 5.4a-b.


Table 5.4a

Lexical resources available in each category linked to core caused-separation scenes in Thai.

/tàt/-category         /tʰúp/-category    /hàk/-category    /tɕʰìːk/-category    /tʰîm/-category
/tàt/ ‘cut’            /tʰúp/ ‘smash’     /hàk/ ‘snap’      /tɕʰìːk/ ‘tear’      /tʰîm/ ‘stab’
/fan/ ‘hack’           /fan/ ‘hack’                         /dɯŋ/ ‘tug’          /tɕɔ̀ʔ/ ‘puncture’
/hàn/ ‘slice’          /sàp/ ‘chop’                                              /tʰɛːŋ/ ‘stab’
/tɕɔ̀ʔ/ ‘puncture’
/tɕʰɯ̌an/ ‘slash’
/tʰɛːŋ/ ‘stab’

Note that verbs in each category are ranked by their total frequency in this study. Verbs occurring in less than 1% of all the descriptions are excluded.

Table 5.4b

Lexical resources available in each category linked to core caused-separation scenes in Khmer.

/kat/-category      /dɑm/-category    /kac/-category    /haek/-category    /tieɲ/-category    /cak/-category
/kat/ ‘cut’         /dɑm/ ‘smash’     /kac/ ‘snap’      /haek/ ‘tear’      /tieɲ/ ‘tug’       /cak/ ‘stab’
/kap/ ‘hack’        /kat/ ‘cut’                         /tieɲ/ ‘tug’                          /kat/ ‘cut’
/ʔaa/ ‘saw’         /kap/ ‘hack’                                                              /dɑm/ ‘smash’
/han/ ‘slice’       /puh/ ‘chop’                                                              /puh/ ‘chop’
/cəɲcram/ ‘cut’     /vay/ ‘strike’                                                            /vay/ ‘strike’
                                                                                              /bok/ ‘pound’

Note that the verbs in each category are ranked by their total frequency in this study. Verbs occurring in less than 1% of all the descriptions are not included. Also excluded from the table were transitive verbs /pdac/ ‘separate’ (TR), /bɑmbaek/ ‘smash’, /bɑmbak/ ‘snap’, and /tumluh/ ‘puncture’ derived from the respective intransitives /dac/ ‘separate’ (INTR), /baek/ ‘break’ (INTR), /bak/ ‘break’ (INTR), and /tluh/ ‘be.pierced’. These verbs make more reference to expected result effects of separation than do other typical verbs of caused-separation, which convey different caused-separation actions.


According to the two tables above, for Thai, the /tàt/-category displays the largest number of verbs (6), covering 11 scenes of the core set: Scenes 34, 61, 6, 20, 42, 49, 26, 12, 43, 53, and 2. In contrast, the /hàk/-category is limited to the use of /hàk/ ‘snap’. This category is associated with three different core scenes: Scenes 57, 19, and 5. For Khmer, the /cak/-category appears with the largest number of verbs. These involve four scenes: Scenes 43, 53, 2, and 45. In contrast, the /kac/-category involves only /kac/ ‘snap’, which covers three core scenes: Scenes 57, 5, and 19. Note that Figures 3.2 and 4.2, respectively, illustrate the correspondence between the listed verbs and the pertinent scenes.
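The size differences among these inventories can be tallied in a few lines. The sketch below is a minimal illustration, not part of the study's analysis: the dictionaries transcribe Tables 5.4a-b, and the assignment of each verb to a category column is an assumption that should be checked against the tables; the function names are hypothetical.

```python
# Verb inventories per category, transcribed from Tables 5.4a-b.
# Assumption: the verb-to-category assignments follow the tables as read here.
THAI = {
    "tàt": ["tàt", "fan", "hàn", "tɕɔ̀ʔ", "tɕʰɯ̌an", "tʰɛːŋ"],
    "tʰúp": ["tʰúp", "fan", "sàp"],
    "hàk": ["hàk"],
    "tɕʰìːk": ["tɕʰìːk", "dɯŋ"],
    "tʰîm": ["tʰîm", "tɕɔ̀ʔ", "tʰɛːŋ"],
}

KHMER = {
    "kat": ["kat", "kap", "ʔaa", "han", "cəɲcram"],
    "dɑm": ["dɑm", "kat", "kap", "puh", "vay"],
    "kac": ["kac"],
    "haek": ["haek", "tieɲ"],
    "tieɲ": ["tieɲ"],
    "cak": ["cak", "kat", "dɑm", "puh", "vay", "bok"],
}


def verb_counts(inventory):
    """Map each category label to the number of verbs listed for it."""
    return {category: len(verbs) for category, verbs in inventory.items()}


def richest(inventory):
    """Return the label of the category with the most verbs."""
    counts = verb_counts(inventory)
    return max(counts, key=counts.get)


for name, inventory in (("Thai", THAI), ("Khmer", KHMER)):
    counts = verb_counts(inventory)
    top = richest(inventory)
    ratio = counts[top] / min(counts.values())
    print(f"{name}: /{top}/-category has {counts[top]} verbs, "
          f"{ratio:.0f} times the smallest inventory")
```

On this tally, the largest and smallest inventories in each language differ by a comparable factor, which is the asymmetry at issue in this subsection.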

Core scenes relating to categories with high numbers of verbs can be contrasted with those with a relatively low number of verbs. Differences detected confirm a feature mentioned above: the asymmetrical nature of lexical encoding. Significantly, this asymmetry applies across Thai and Khmer in an analogous manner. Categories in each language which involve the use of high numbers of verbs appear to be structured around certain scenes. Take the /tàt/-category in Thai and the /cak/-category in Khmer, for example: all the core scenes covered by the latter are also associated with the former, except for Scene 45. Furthermore, all the scenes associated with one of the Khmer categories with the second largest number of verbs, i.e., the /kat/-category, are fully incorporated within the /tàt/-category in Thai as well. The relevant evidence is from Scenes 20, 12, 10, 49, 4, and 6. Also, categories involving limited numbers of verbs in Thai and Khmer are built around certain scenes. Let us consider the /hàk/-category in Thai and the /kac/-category in Khmer as examples. The two categories appear restricted to scene descriptions showing preference for single verbs, i.e., /hàk/ ‘snap’ in Thai and /kac/ ‘snap’ in Khmer. In the data sets for each language, these verbs were selected for the same subset of core scenes: Scenes 57, 5, and 19. Together, these phenomena suggest comparable asymmetry in the lexical descriptions of caused-separation: certain scenes of the domain are depicted by larger numbers of verbs, whereas other scenes are characterised by relatively smaller numbers of verbs with more restricted usage.
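The scene correspondences just described amount to simple set comparisons. The following sketch, offered as an illustration only, uses the core scene numbers listed above for the Thai /tàt/- and /hàk/-categories and the Khmer /cak/- and /kac/-categories; the variable and function names are hypothetical.

```python
# Core scenes per category, as listed in the text above.
THAI_TAT = {34, 61, 6, 20, 42, 49, 26, 12, 43, 53, 2}
THAI_HAK = {57, 19, 5}
KHMER_CAK = {43, 53, 2, 45}
KHMER_KAC = {57, 5, 19}


def uncovered(scenes, other):
    """Return the scenes of one category that the other category lacks."""
    return scenes - other


# The verb-rich categories share all Khmer /cak/ scenes but one.
print(uncovered(KHMER_CAK, THAI_TAT))  # {45}

# The single-verb categories cover exactly the same core scenes.
print(THAI_HAK == KHMER_KAC)           # True
```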

Thus, an important similarity across Thai and Khmer arises. Descriptions of some caused-separation categories display more variation regarding coding strategies than those of other categories of the semantic domain. Specifically, the speakers of the two languages provided more labels for the same event types of caused-separation as exhibited across constant sets of stimulus scenes. In contrast, speakers in the two language groups were comparable in applying only a very restricted number of verbs to other scenes. This observed asymmetry in the nature of lexical encoding has been documented more widely, e.g., by Kopecka (2012). More numerous lexical devices, i.e., verbs, used for describing certain semantic categories have been regarded as allowing for more semantic distinctions to be made in the expression of events of those categories (cf. Kopecka, 2012, p. 344). I address this issue in the following subsection.

5.2.2.2 Asymmetry in hierarchy of semantic categories

Kopecka (2012) notes that a greater number of lexical devices, such as verbs or particles, may in some ways permit language speakers to partition a semantic subdomain more finely: for example, Goal-oriented versus Source-oriented events in her study (p. 344). If Kopecka’s observation holds for other domains, this would mean that the perceived asymmetry of the lexical encoding of a domain would trigger asymmetry in decomposing categories into finer subcategorisation. In more practical terms for this study, the categories with considerable versus limited numbers of caused-separation verbs should correlate with those having finer versus coarser granularity, indicating asymmetry in granularities within the domain. Note that I interpret the notion of fine granularity in two ways. One is concerned with a semantic category potentially breaking down into multiple subcategories, showing its fine partitioning. The other relates to a category involving a deep hierarchical structure.

As described in § 3.3.1.1 and § 4.3.1.1, caused-separation categories in Thai and Khmer do not all have the same level of granular categorisation. Some of the categories are either finer in granularity or deeper in taxonomic hierarchy down to the level of subdivisions, whereas others are flatter, containing only one or more subcategories with no smaller division or even no subcategory at all. With respect to Figure 5.5a (see § 5.2.1.2), most of the domain’s categories in Thai can be subclassified into finer-grained subcategories, some of which are decomposable further into even smaller divisions. This illustrates different forms of hierarchical structure. The /tàt/-category has the finest granularity as it breaks down into four different subcategories. One of these, the /kap/-subcategory, is fragmented further into three smaller divisions, that is, into the /sàp/-, /tɕaːm/-, and /lɯ̂aj/-subdivisions. The category accordingly reflects the deepest hierarchy, consisting of three levels of classification, i.e., that of categorisation, that of subcategorisation, and that of subdivision. For Thai, the only category involving no subcategorisation, thus having the flattest hierarchical structure, is the /hàk/-category. In Khmer, four out of the seven categories in the domain, namely the /cak/-, /puh/-, /kat/-, and /dɑm/-categories, can be partitioned into subcategories. The latter two categories are divisible further into the subdivision level, adding more depth to their hierarchical organisation. The other three categories, i.e., the /kac/-, /tieɲ/-, and /haek/-categories, are flat, containing no subcategories.

Having fine distinctions, the /tàt/-category in Thai and the /cak/- and /kat/-categories in Khmer mostly involve the same core caused-separation scenes (see § 5.2.2.1). Likewise, the categories with no fine classification, like the /hàk/-category in Thai and the /kac/-category in Khmer, both cover the same range of scenes: Scenes 5, 19, and 57. Accordingly, I infer for both languages that different caused-separation categories may exhibit different degrees of granularity, consequently giving rise to rather similar asymmetry in fine differentiation and depth of hierarchical structures.

Recall the observations above regarding the perceived categories with high versus low numbers of caused-separation verbs (see again § 5.2.2.1). We may also determine potential correspondences between numbers of lexical devices and thresholds of granularity of semantic partitioning. Let us start from the categories with limited numbers of verbs used for the descriptions. In Thai, the /hàk/-category was confined to the use of a single verb as its sole descriptor. Likewise, the /kac/-category in Khmer was effectively restricted to /kac/ ‘snap’. From the viewpoint of subcategorisation, the two categories are constituted similarly in having no finer-grained classification, therefore showing themselves to be relatively flat in hierarchy. Next, we turn to the categories with multiple verbal labels. In Thai, the /tàt/-category involves the greatest number of verbs used for its descriptions. Also, it is the category which contains the comparatively finest-grained subcategorisation pattern in Thai, with four different subcategories. One of these subcategories even breaks down into three smaller divisions. In Khmer, the /cak/-category appears to have the highest number of verbal labels. It also contains one subcategory, showing depth in hierarchy. When attention turns to the categories with the second largest number of labels, the /kat/- and /dɑm/-categories, they show divisibility into several finer-grained classifications. The /kat/-category has three subcategories; one of these, the /kap/-subcategory, contains a smaller division: the /cəɲcram/-subdivision. The /dɑm/-category includes a single /vay/-subcategory, which is further decomposed into a subdivision. As the two categories can be disaggregated to the subdivision level, they each have a comparatively deep hierarchy.

The generalisation to be affirmed here is that the greater the number of verbs potentially used for particular caused-separation categories, the finer the granularity those categories are likely to have. This is clearly in line with Kopecka’s (2012) observation regarding the encoding of Goal-oriented versus Source-oriented events. Specifically, the categories showing significant numbers of verb labels in both Thai and Khmer display finer-grained distinctions and greater hierarchical depth. In contrast, the Thai and Khmer categories with smaller numbers of verb descriptors are those that lack a fine hierarchical partitioning. That being the case, the asymmetry in lexical encoding of caused-separation event categories seen in the Thai and Khmer cases needs to be assessed in a wider context. Relevant here are hierarchical distinctions with corresponding asymmetry that apply in like fashion across languages.
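The correspondence between verb numbers and hierarchical depth can be laid out as a small check. The sketch below is illustrative only. Depth values follow the description above (1 for a flat category, 2 where subcategories occur, 3 where subdivisions occur). The verb counts for the Thai /tàt/- and /hàk/-categories and the Khmer /kac/-category are stated in the text, whereas the counts for the Khmer /cak/-, /kat/-, and /dɑm/-categories are read off Table 5.4b and should be treated as an assumption. The class and function names are hypothetical.

```python
from typing import NamedTuple


class Category(NamedTuple):
    language: str
    label: str
    verbs: int   # number of verbs in the category's inventory
    depth: int   # 1 = flat, 2 = subcategories, 3 = subdivisions


CATEGORIES = [
    Category("Thai", "tàt", 6, 3),
    Category("Thai", "hàk", 1, 1),
    Category("Khmer", "cak", 6, 2),   # count assumed from Table 5.4b
    Category("Khmer", "kat", 5, 3),   # count assumed from Table 5.4b
    Category("Khmer", "dɑm", 5, 3),   # count assumed from Table 5.4b
    Category("Khmer", "kac", 1, 1),
]


def multi_verb_means_subdivided(categories):
    """True if exactly the multi-verb categories have internal structure."""
    return all((c.verbs > 1) == (c.depth > 1) for c in categories)


print(multi_verb_means_subdivided(CATEGORIES))  # True
```

The check is deliberately coarse: it confirms that multi-verb categories are the ones with internal structure, but it does not claim a strictly monotonic relation, since the Khmer /cak/-category has the most verbs without reaching the subdivision level.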

In brief, we can observe two kinds of asymmetry in the caused-separation descriptions provided by this study’s Thai and Khmer speakers. One is to do with the lexical coding strategies. Specifically, the asymmetric distribution of more numerous verb labels appears to correlate with certain semantic categories of the domain rather than with others. The other asymmetry relates to levels of subcategory granularity. Certain categories appear to have finer distinctions and more depth of hierarchical structure than others. I consider these asymmetrical phenomena as comparably aligned across Thai and Khmer, since the categories demonstrating each of the asymmetries in the two languages involve certain scenes in common. Also, the categories with high numbers of verb labels have been shown to be the same as those having finer distinctions and more depth of partitioning. In other words, for the two languages, the categories with fine granularity and taxonomic depth correlate with those having more distinct labels. Correspondence between asymmetry in lexical encoding strategies and granularity within the domain is thus taken as established.

5.3 Comparison of category boundaries in caused-separation domain across Thai and Khmer

Having compared granularity relating to caused-separation categories in Thai and Khmer, I now look at the placement of the boundaries of those categories in the two languages. This comparison contributes to a deeper understanding of Thai and Khmer lexico-semantic contrasts, especially as to whether relationships in the same semantic space are similarly or differently perceived for the relevant categories. This includes the degree to which such relationships are parallel or vary across the two languages. In § 5.3.1, I commence by dealing with how the speakers of Thai and Khmer grouped different event types of caused-separation into each different category. Based on the event groupings into the categories, in § 5.3.2 the potential semantic distinctions are again summarised and analytically compared across the languages. The last subsection (§ 5.3.3) broaches another aspect left undiscussed so far regarding the semantic categorisation of the domain: overlaps in some categories’ extension. The notions of core versus peripheral events and consistency in description are employed to examine these category intersections and to make a comparison of them across Thai and Khmer.

5.3.1 Event grouping for caused-separation categories across Thai and Khmer

According to the analyses in Chapters 3 and 4, speakers of Thai and Khmer were found to group event types of caused-separation both similarly and differently (see §§ 3.4.1.1 and 4.4.1.1).

For Thai, the speakers in this study produced an array of event groupings. First, they made a distinct split between events with and without the use of an instrument. Events associated with the /tàt/-, /tʰúp/-, /krìːt/-, and /tʰîm/-categories (instrument-using) are never clustered together with those linked to the /hàk/- and /tɕʰìːk/-categories (non-instrumental). Next, on closer inspection, Thai speakers in this study followed the cross-linguistic notion of predictability of separation location (Majid et al., 2004, 2008). They discriminated between events high in predictability (e.g., sawing or cutting) and those with low predictability (e.g., smashing) since they specified the individual event types with different verbs. Although exceptional cases were discovered, they are too sporadic to provide an argument against generalising differentiation in predictability. In an additional categorisation, the speakers lumped snapping events together and assigned smashing events to another group, thereby confirming a solid boundary between the snapping (/hàk/-) and smashing (/tʰúp/-) categories. In a similar demarcation, a piercing event occurred as standalone for the /tʰîm/-category, distinguishing it from other caused-separation types, despite its verb labels sometimes being applicable to events of other types. Furthermore, for the Thai speakers, a slitting event (i.e., /krìːt/-category) constituted a distinct category removed from general cutting events (i.e., /tàt/-category).

As for events of karate-chopping and chopping, which Majid et al. deem to have intermediate predictability of locus of separation, the speakers of Thai tended to lump them together with events whose predictability is high, such as events of cutting or sawing. Furthermore, the Thai speakers did not differentiate pulling-apart events from tearing events, sorting both into a single category, i.e., the /tɕʰìːk/-category. However, another event type featuring use of hands, snapping, was treated as having its own class, the /hàk/-category, which was split from categories of tearing and pulling-apart. This distinction appears to be based on differences in the theme object’s textural properties: rigid (for snapping) versus flexible (for tearing and pulling-apart).

The Khmer speakers also produced an array of event groupings. They differentiated between events involving tool use and those involving hand use, since these two event types were commonly labelled with two distinct sets of (category-defining) verbs: /dɑm/ ‘smash’, /kat/ ‘cut’, /puh/ ‘chop’, and /cak/ ‘stab’ versus /kac/ ‘snap’, /tieɲ/ ‘tug’, and /haek/ ‘tear’. There is also some departure from this generalisation. Events of karate-chopping, despite the use of hands, were classified with those of smashing, featuring the use of blunt tools and generally named by /dɑm/ ‘smash’. That said, a construal account can be given to reconcile this deviation. Unlike in Thai, karate-chopping events and those of chopping were not commonly grouped together, since Khmer labels the latter with different verbs, such as /kat/ ‘cut’ or /puh/ ‘chop’. Such events of chopping also appeared to split into two groups. One (chopping#1) was grouped with events of cutting and sawing, while the other (chopping#2) was grouped with those of slitting, which were not grouped with cutting and sawing either. This split simply points to semantic distinctions in the direction of separation: lengthwise versus non-lengthwise. The high-order grouping of cutting, sawing, and chopping#1 is not sensitive to caused-separation along acted-upon objects’ length, whereas that of slitting and chopping#2 is. Note that a single event of piercing was incorporated into the group of cutting, sawing, and chopping#1, verifying insensitivity to the lengthwise-separation feature. Also, the Khmer speakers followed the cross-linguistic trend in distinguishing events with markedly different levels of predictability (i.e., in location of separation). Among instrument-manipulated events, the speakers distinguished consistently between high-precision caused-separation actions (e.g., sawing or cutting) and those with low predictability (e.g., smashing). Likewise, low-precision manually-manipulated events of snapping were split away from pulling-apart events and tearing events, both of which have intermediate predictability. Note that the differentiation of snapping from pulling-apart and tearing demonstrated interpretation based on attention to objects’ physical properties in a manner similar to Thai. However, unlike Thai, Khmer was found to differentiate further between pulling-apart and tearing due to different numbers of object dimensions.

Table 5.5 below depicts the grouping of different caused-separation events, as outlined above, in relation to the established categories in Thai and Khmer. Note that individual categories in both languages are labelled by pertinent representative verbs with hyphenation.


Table 5.5 shows that Thai and Khmer structure events of caused-separation into groups in ways that are both similar and dissimilar. Each of the languages similarly categorises events of cutting and sawing together within a single category, the /tàt/-category in Thai and the /kat/-category in Khmer. In addition, the two languages cluster all snapping events apart from other caused-separation types, assigning them to a separate category, the /hàk/-category in Thai and the /kac/-category in Khmer.

The two languages also show a few key points of difference. First, events of chopping were sorted by Thai speakers into the same group as cutting and sawing events. In contrast, Khmer speakers distinguished some of the chopping events from others. The Khmer speakers were sensitive to the specific direction of separation, i.e., lengthwise. In the Khmer data for this study, all events with a lengthwise separation direction were assigned to the same /puh/-category. Second, karate-chopping events, as those with intermediate predictability, were generalised differently across the two languages. In Thai, they were classified with cutting events, assigned to the /tàt/-category, whereas in Khmer they were grouped with smashing events, being related together in the /dɑm/-category. Third, Khmer speakers discriminated between pulling-apart events and tearing events based on the difference of one-dimensional versus two-dimensional flexible theme objects. Thus, they sorted the two event types into the /tieɲ/- and /haek/-categories. In contrast, Thai speakers were indifferent to this distinction, assigning both event types to the single /tɕʰìːk/-category. Fourth, Thai was found to integrate all actions that cause complete resulting separation by sharp-bladed implements (slitting) or by pointed tools (chopping–piercing) into the same cluster under the /tàt/-category. However, the Thai speakers made a consistent distinction between the /tʰîm/-category for piercing by pointy tools versus the /krìːt/-category for slitting by bladed tools and with the prior placement of blades. This distinction seems to involve caused-separations by sharp/pointy instruments, but with division into separate categories referring to partial separation (making a hole or an opening) and to a specialised manner of action (slitting). In contrast, Khmer deals with these events differently. For the Khmer speakers, events of chopping–piercing were all related to the /cak/-category while events of slitting were all treated within the /puh/-category, both regardless of whether full or partial resulting separation obtained or was expected.
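The similarities and differences in event grouping can be restated as a comparison of which pairs of event types share a category in each language. The sketch below is a simplified illustration under stated assumptions: it covers only eight event types, omitting chopping and piercing because the text above describes them as split across groups, and the mappings and names are a hypothetical encoding of the groupings reported in this subsection.

```python
from itertools import combinations

# Category labels that recur are bound once so that they compare equal.
TAT, TCHIIK = "tàt", "tɕʰìːk"
KAT, DAM = "kat", "dɑm"

# Assumed simplification: chopping and piercing are left out.
THAI = {
    "cutting": TAT, "sawing": TAT, "karate-chopping": TAT,
    "slitting": "krìːt", "smashing": "tʰúp", "snapping": "hàk",
    "tearing": TCHIIK, "pulling-apart": TCHIIK,
}
KHMER = {
    "cutting": KAT, "sawing": KAT, "karate-chopping": DAM,
    "slitting": "puh", "smashing": DAM, "snapping": "kac",
    "tearing": "haek", "pulling-apart": "tieɲ",
}


def co_grouped(assignment):
    """Return the set of event-type pairs that share a category."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(assignment), 2)
        if assignment[a] == assignment[b]
    }


thai_pairs = co_grouped(THAI)
khmer_pairs = co_grouped(KHMER)

shared = thai_pairs & khmer_pairs
print(sorted(sorted(pair) for pair in shared))
# [['cutting', 'sawing']]
print(sorted(sorted(pair) for pair in khmer_pairs - thai_pairs))
# [['karate-chopping', 'smashing']]
```

Under this simplification, cutting and sawing form the only pair grouped together in both languages, while the pairing of karate-chopping with smashing is specific to Khmer and that of tearing with pulling-apart is specific to Thai.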

5.3.2 Semantic distinctions in caused-separation domain across Thai and Khmer

Both similarity and dissimilarity in event grouping patterns have been noted across the languages. This subsection examines and compares such variation, as overviewed in Table 5.6. This comparison indicates how semantic distinctions assigned by the speakers of Thai and Khmer work to structure the same domain of caused-separation, dividing it into separate categories.


Table 5.6

Semantic distinctions in structuring of caused-separation domain across Thai and Khmer.

Thai

Category           Manipulation   Predictability       Instrument/object type             Placement of blade
/tàt/ ‘cut’        Instrument     High/intermediate    Sharp-bladed/pointed/knife hand    Non-placement of blade
/krìːt/ ‘slit’     Instrument     High/intermediate    Sharp-bladed/pointed/knife hand    Placement of blade
/tʰîm/ ‘stab’      Instrument     Intermediate         Pointed
/tʰúp/ ‘smash’     Instrument     Intermediate/low     Knife hand/blunt-headed
/hàk/ ‘snap’       Manual         Low                  Rigid
/tɕʰìːk/ ‘tear’    Manual         Intermediate         Flexible

Khmer

Category           Manipulation   Direction         Predictability       Instrument/object type     Object dimensions
/kat/ ‘cut’        Instrument     Non-lengthwise    High/intermediate    Sharp-bladed
/cak/ ‘stab’       Instrument     Non-lengthwise    Intermediate         Pointed
/dɑm/ ‘smash’      Instrument     Non-lengthwise    Intermediate/low     Knife hand/blunt-headed
/puh/ ‘chop’       Instrument     Lengthwise        High/intermediate    Sharp-bladed
/kac/ ‘snap’       Manual         Non-lengthwise    Low                  Rigid
/tieɲ/ ‘tug’       Manual         ?                 Intermediate         Flexible                   1-D
/haek/ ‘tear’      Manual         ?                 Intermediate         Flexible                   2-D

Note that a question mark indicates that it was not readily discernible whether the direction of separation was lengthwise or not.

Table 5.6 provides an insight into underlying Thai and Khmer semantic distinctions as reflected in this study’s event descriptions. Distinctions are indicated as hierarchically partitioning the domain at multiple levels down to where semantic partitions are realised by linguistic representations. For each of the linguistically realised partitioned categories, only the predominant verbs are shown in Table 5.6.


The Thai system can be rated as simpler than the Khmer one in at least two respects. One factor concerns the Thai division of the domain into six categories, whereas Khmer decomposes it into seven. This suggests that, in the case of these decompositions, Thai is coarser in granularity than Khmer at this first linguistically-mediated level. The other simplification factor comes from the fact that Thai engages four-layered semantic distinctions before linguistically realising the semantics of the partitions. In contrast, for Khmer, there is a need to include up to five layers of distinctions prior to the level of the linguistic realisation.

On closer inspection, most of the kinds of semantic distinctions in Thai and Khmer are similar. Specifically, the speakers of the two different languages seemed to draw attention to many similar criterial values for partitioning the domain. Parallel distinctions include the notions of instrument versus manual manipulation, predictability of location of separation, and instrument/object types. For both languages, the extent of caused-separation actions that were coded depending on use of instrument or hands was found to be of greatest importance in the domain’s categorisation. Further, coinciding with a common distinction shared among languages (Majid et al., 2004, 2008), degree of predictability came into play. However, as Majid et al. suggested, there is cross-linguistic variability in the treatment of events with intermediate predictability like chopping and karate-chopping. Khmer can be seen as grouping events of chopping with high-precision action events (e.g., slitting or cutting) but karate-chopping events with events with low predictability (e.g., smashing). In contrast, Thai generally lumps together both chopping and karate-chopping with events of cutting, slitting, or the like. These phenomena point to differences in judgement along this continuous dimension across the two languages.


As for the instrument/object type distinction, Thai and Khmer are by and large similar in how different events are judged to resemble or differ from one another based on the kinds of instruments used or objects acted upon. Nevertheless, in this study, the speakers of Thai sometimes treated the use of pointy tools as being similar to that of bladed implements. These interpretations might be influenced by the effective usage range of certain instruments such as chisels: use of pointed tip versus short sharp edge (see more details in § 3.5.1). This interpretability was not of concern for the Khmer speakers.

Despite widespread commonalities, there are noticeable points of difference between Thai and Khmer. In Khmer, prior to the distinction by degrees of predictability of locus of separation, speakers in this study appeared to separate caused-separation events based on whether the agent made a division along the object’s length. Specifically, slitting events (i.e., Scenes 9 and 14) and some chopping events (i.e., Scenes 2, 43, and 53) which involve a long cut/opening were assigned to the /puh/-category. There appears to be no equivalent distinction in Thai. Additionally, Khmer allows the distinction of flexible objects with different numbers of dimensions: one-dimensional (1-D) versus two-dimensional (2-D). Consequently, events with the 1-D flexible object are assigned to the /tieɲ/-category whereas those with the 2-D flexible object are assigned to the /haek/-category. Thai also has a semantic distinction differing from Khmer: prior placement of a blade. That is, caused-separation events with the use of sharp/pointy implements are divided depending on manner: whether the tool blade was placed on the object’s surface before a causal action of separation.
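The layers of distinction discussed in this subsection can be compared directly. The sketch below is a minimal illustration: the layer names are shorthand labels chosen here for the distinctions discussed above and shown in Table 5.6, and their ordering is an assumption based on the description in the text.

```python
# Layers of semantic distinction per language (labels are shorthand).
THAI_LAYERS = [
    "manipulation",
    "predictability",
    "instrument/object type",
    "placement of blade",
]
KHMER_LAYERS = [
    "manipulation",
    "direction of separation",
    "predictability",
    "instrument/object type",
    "object dimensions",
]

shared = [layer for layer in THAI_LAYERS if layer in KHMER_LAYERS]
thai_only = [layer for layer in THAI_LAYERS if layer not in KHMER_LAYERS]
khmer_only = [layer for layer in KHMER_LAYERS if layer not in THAI_LAYERS]

print(len(THAI_LAYERS), len(KHMER_LAYERS))  # 4 5
print(shared)      # ['manipulation', 'predictability', 'instrument/object type']
print(thai_only)   # ['placement of blade']
print(khmer_only)  # ['direction of separation', 'object dimensions']
```

Three of the layers are shared, which reflects the observation that most kinds of distinction are parallel, while one layer is specific to Thai and two are specific to Khmer.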


5.3.3 Overlap in category extension: A view through characteristic divergence

As described above, the occurrence pattern of Thai and Khmer caused-separation verbs has enabled division of the semantic space into separate subdomains (§ 5.2.1), consequently uncovering the underlying semantic distinctions across the languages (§ 5.3.2). However, what is left undiscussed so far is the fact that the individual categories in both languages are not cleanly partitioned. This occurs because the applications of defining verbs show potential overlaps in the denotational ranges. What follows is a brief look at the overlapping areas in Thai and Khmer and comparison of overlapping across the languages.

According to the analyses in Chapters 3 and 4, areas of category overlapping are identified (see §§ 3.4.1.2 and 4.4.1.2). In Thai, overlap occurs between the /tàt/- and /tʰúp/-categories and between the /tàt/- and /tʰîm/-categories. For Khmer, similar overlapping occurs between the /kat/- and /dɑm/-categories and between the /kat/- and /puh/-categories. What enables us to discern these overlapping areas is the subdomain extensions corresponding to the mentioned categories.

For Thai, the /fan/-subcategory extends over both the /tàt/- and /tʰúp/-categories. Also, in this language, the /tàt/-category overlaps with the /tʰîm/-category, since the occurrence pattern of /tɕɔ̀ʔ/ ‘puncture’ helped delimit a pertinent subcategory in each of the two categories. In Khmer, /kap/ ‘hack’ shaped a separate subcategory in each of the /kat/- and /puh/-categories, pointing to the overlap between the respective categories. Also, /kat/ ‘cut’ helped define both the /kat/-category and a small division in the /dɑm/-category. I thus regard these two categories as overlapping.

In this study, characteristic divergence relating to some member scenes within the overlapping categories helps explain the incidences of overlap. The individual categories contain the member scenes, which capture certain shared features. On the other hand, some scenes not only present characteristics in common with others in the same category but also exhibit some characteristic divergence from the common features. I have argued that these divergent attributes are caused by intra- and inter-personal adoptions of different construals of these scenes. Given such variation in diverging scenes’ interpretations, some verbs in this study, despite customarily defining specific categories, were used flexibly in descriptions of certain scenes. This variation in interpretation consequently is seen as what has produced the category overlaps.

Category core versus periphery is the notion I have used to develop the argument that feature divergence plays a pronounced role in creating an overlapping area between categories (see § 3.3; cf. Vulchanova et al., 2012). The speakers of Thai and Khmer in this study might treat some scenes in overlapping categories as peripheral, in contrast to certain core scenes. These peripheral scenes were specified by a larger variety of verb labels, whereas the core scenes were expressed by predominantly-used category-representative verbs. It is these verbs which are considered prototypical for the categories.

According to the previous cluster analysis (see Figures 3.2 and 4.2), the peripheral scenes are associated with the core from distant subtrees in the dendrograms. Comparing the peripheral scenes to the core in the same categories shows two interesting types of results. First, the category’s peripheral scenes feature some attributive discrepancies from core characteristics. Second, the peripheral scenes are accommodated in the overlapping areas whereas the core scenes are not. Since the scenes with feature divergence and those involving overlaps are by and large the same, I argue for a tie between this kind of divergence and the production of such category overlaps: in effect, the former has given rise to the latter. In addition, as noted earlier, it was feature divergence that caused different interpretations for the relevant scenes. I found that for the scenes with such divergence, variability in interpretation was reflected in the degrees of consistency in the scenes’ descriptions by speakers within each language group.

Degrees of consistency of this type can now be compared across the groups of Thai and Khmer speakers. Figure 5.6 summarises the overlapping extensions of the /tàt/-, /tʰúp/-, and /tʰîm/-categories in Thai and the /kat/-, /dɑm/-, and /puh/-categories in Khmer. Note that the relevant core versus peripheral scenes are specified as well. As the /tʰîm/-category does not contain any apparent core scene, it was left blank. The Gini-Simpson diversity indices are given in parentheses.
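For readers unfamiliar with the measure, the following is a minimal sketch of how a Gini-Simpson diversity index can be computed for a single scene, assuming the standard definition (one minus the sum of squared relative frequencies of the verb labels). The function name and the response lists are hypothetical illustrations, not data from this study. A value of 0 indicates that all speakers used the same verb, as expected for core scenes; higher values indicate more varied labelling, as expected for peripheral scenes.

```python
from collections import Counter


def gini_simpson(labels):
    """Return 1 - sum(p_i ** 2) over the relative frequencies of the labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one label is required")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical responses from ten speakers (NOT data from the study).
core_scene = ["tàt"] * 10                                    # every speaker uses one verb
peripheral_scene = ["tàt"] * 5 + ["fan"] * 3 + ["tʰúp"] * 2  # three competing verbs

print(gini_simpson(core_scene))                  # 0.0
print(round(gini_simpson(peripheral_scene), 2))  # 0.62
```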

Figure 5.6. Overlapping ranges of categories in Thai and Khmer, with relevant core and peripheral scenes.

Figure 5.6 suggests that the speakers of Thai and Khmer produced scene descriptions similarly for scenes that determined the core of certain caused-separation event categories. By contrast, Thai and Khmer category boundaries were placed somewhat differently. A good case to compare is the /tàt/-category from Thai and the /kat/-category from Khmer. According to Table 5.5, these two categories from the different languages mostly cover the same range of the scenes. This range accounts for up to about 35% of all the stimulus scenes; they are quasi-parallel to each other. The /tàt/-category is a bit larger in breadth than its Khmer quasi-counterpart since it incorporates scenes which were assigned in this study to other categories in Khmer: the /cak/-, /dɑm/-, and /puh/-categories. Despite the difference in boundary location, the two categories appear to contain the same core scenes: Scenes 49, 59, 27, and 24. In addition, since these four scenes were exclusively labelled by /tàt/ ‘cut’ in Thai and /kat/ ‘cut’ in Khmer, they indicate that the two verbs closely correspond. They seem to present a close, if not the same, intensional definition or prototype. The two verbs are therefore commonly (and prescriptively) viewed as translations of one another (Nacaskul, 1983; Phanthumetha, 1974).

In interim summary, the ways Thai and Khmer group (or differentiate between) event types of caused-separation are broadly comparable, but with significant differences emerging at a finer grain. Some of the similarities comply with cross-linguistic distinctions established by Majid et al. (2004, 2008). These include discrimination between high- versus low-precision actions and the separation of smashing events from those of snapping. The two languages also share discriminations not widely reported in cross-linguistic research. These include the split between instrument-manipulated events and those performed with manual manipulation, and the differentiation of events by specific instrument or object types.

Thai and Khmer also each demonstrate certain language specificity in semantic distinctions, e.g., discrimination of blade placement on an object’s surface for Thai and that of the direction of separation actions and of the object’s number of dimensions for Khmer. I regard the existence of these differing distinctions as playing a role in partitioning the same semantic space into varying numbers of subdomains with different boundaries across both languages. In addition, some quasi-parallel categories across the languages—albeit with different boundaries—involve the same range of core member scenes. This shared range throws some light on the conventional translations of /tàt/ ‘cut’ in Thai ↔ /kat/ ‘cut’ in Khmer.

5.4 Comparison of semantic organisation of caused-separation domain across Thai and Khmer

A remaining aspect of Thai and Khmer semantic categorisation is the semantic organisation of categories, that is, how the two languages arrange the same semantic space of caused-separation. The aim is to compare underlying organisation, especially parametric mechanisms relating to the way the speakers build experiences of caused-separation into their languages via the semantic structure of lexical items used to convey experiential categories (cf. Talmy, 1985). In § 5.4.1, I explore how Thai and Khmer render different types of caused-separations through recognised semantic components. § 5.4.2 then presents a comparative summary across Thai and Khmer of how certain semantic elements are packed into verbs of caused-separation, based on contrasting semantic component patterns.

5.4.1 Semantic components in caused-separation domain across Thai and Khmer

As explained earlier (§§ 3.5.1 and 4.5.1), semantic components linked to the individual categories of Thai and Khmer illustrate the extent to which the two languages converge or differ in contouring experiences of caused-separation. Questions include how diverse, expressive and typologically distinct the semantic component patterns are for the domain as viewed across the two languages. Table 5.7 below displays the distinct patterns of semantic components for the Thai and Khmer categories.

Table 5.7

Semantic component patterns belonging to caused-separation categories across Thai and Khmer.

| Semantic component pattern | Thai | Khmer |
|---|---|---|
| I – Instrument type | /tàt/-category | /kat/-category |
| | /tʰúp/-category | /dɑm/-category |
| | /tʰîm/-category | /cak/-category |
| II – Manual manipulation + Object type | /hàk/-category | /kac/-category |
| | /tɕʰìːk/-category | |
| III – Instrument type + Blade placement | /krìːt/-category | |
| IV – Direction of separation + Instrument type | | /puh/-category |
| V – Manual manipulation + Object type + Object subtype | | /tieɲ/-category |
| | | /haek/-category |

As Table 5.7 shows, the speakers of Thai and Khmer in this study constructed event types organised by specific verb-based categories that were structured around only five patterns of semantic componential features in total (Patterns I – V). The pattern of Instrument type (Pattern I) appears to be predominant since the largest number of categories discovered in each language is connected to this pattern. Another pattern found across the two languages is that of Manual manipulation + Object type (Pattern II). The remaining patterns are language-specific and are seen as more expressive. This is because speakers closely adhered to components more specialised than those of the above two patterns (Patterns I – II). Patterns III – V characterise certain caused-separation types to which the individual languages are sensitive. Based on Pattern I with the addition of Blade placement, the Thai-specific pattern of Instrument type + Blade placement is used to identify the slitting type associated with the /krìːt/-category. For Khmer, this event type was collapsed into other types of cutting and chopping, linked to the /puh/-category.

Patterns IV and V are specific to Khmer, adding Direction of separation to Pattern I and Object subtype to Pattern II, respectively. The patterns spell out the Khmer-specific sensitivity to the discrimination of the events of the /puh/-category and of those of the /tieɲ/- versus /haek/-categories. This suggests that, notwithstanding the six or seven different categories available, Thai and Khmer do not deploy very diverse componential techniques to characterise event types of caused-separation. They still follow typologically attested organisation for some semantic components, including Instrument type, Manual manipulation, and Object type, in shaping experiential events of caused-separation. In addition, the existence of the language-specific patterns (Patterns III – V) in Thai and Khmer adds diversity. So, both more general and more specific types of semantic event characterisation contribute to the semantic componential patterns for the domain.
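The shared versus language-specific status of the five patterns can be checked mechanically. The sketch below encodes Table 5.7 as a plain dictionary (the variable names and the encoding are illustrative assumptions, not part of the study's method) and derives which patterns occur in both languages, which occur in only one, and how many categories each language has.

```python
# Table 5.7 encoded as: pattern -> categories per language (encoding is illustrative).
patterns = {
    "I: Instrument type": {
        "Thai": ["tàt", "tʰúp", "tʰîm"], "Khmer": ["kat", "dɑm", "cak"]},
    "II: Manual manipulation + Object type": {
        "Thai": ["hàk", "tɕʰìːk"], "Khmer": ["kac"]},
    "III: Instrument type + Blade placement": {
        "Thai": ["krìːt"], "Khmer": []},
    "IV: Direction of separation + Instrument type": {
        "Thai": [], "Khmer": ["puh"]},
    "V: Manual manipulation + Object type + Object subtype": {
        "Thai": [], "Khmer": ["tieɲ", "haek"]},
}

# A pattern is shared if at least one category in each language instantiates it.
shared = [p for p, c in patterns.items() if c["Thai"] and c["Khmer"]]
thai_only = [p for p, c in patterns.items() if c["Thai"] and not c["Khmer"]]
khmer_only = [p for p, c in patterns.items() if c["Khmer"] and not c["Thai"]]

# Total number of categories per language, summed over all patterns.
counts = {lang: sum(len(c[lang]) for c in patterns.values())
          for lang in ("Thai", "Khmer")}

print(shared)      # Patterns I and II
print(thai_only)   # Pattern III
print(khmer_only)  # Patterns IV and V
print(counts)      # {'Thai': 6, 'Khmer': 7}
```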

Thai and Khmer can also describe caused-separation events more finely, reflecting more detailed patterns of their semantic components. These complex patterns are associated with the relevant categories’ componential patterns, but with the augmentation of specialised semantic components for individual subcategories. Accordingly, it is desirable to determine which specially-added components are widely observed and utilised across Thai and Khmer. Table 5.8 reveals the extent of selected semantic component patterns at the category level. Extra added semantic components derive componential patterns at the subcategory level. Note that, to provide an overview, Table 5.8 excludes possible component patterns at the smaller division level. Subdivisions include very class-specific components not occurring in other distinct patterns (see details in Tables 3.13 and 4.11). Among semantic components related to the subdivisions, that of Specific instrument in Thai seems especially influential because it works in parcelling and fitting certain caused-separations into separate event types associated with either of the two smaller divisions, i.e., /tɕaːm/- and /lɯ̂aj/-subdivisions (see discussion in § 3.5.1).

Table 5.8

Additional semantic components for subcategories across Thai and Khmer, as added to semantic component patterns at category level.

Columns: Semantic component pattern at category level | Language | Category | Subcategory | Additional semantic component at subcategory level (+ Manner of action | + Object subtype | + Hand preference | + Direction of separation | + Supplementary tool | + Blade placement*)

Instrument type Thai /tàt/- /fan/- ✓ (✓)

/hàn/- ✓

/tɕɔ̀ʔ/- ✓

/pʰàː/- ✓

/tʰîm/- /tɕɔ̀ʔ/- ✓

/tʰɛːŋ/- ✓

/tɕîm/- ✓

Khmer /kat/- /kap/- ✓ (✓)

/ʔaa/- ✓

/han/- ✓ ✓

/cak/- /tumluh/- ✓ ✓

/dɑm/- /vay/-

Instrument type + Blade placement Thai /krìːt/- /tɕʰɯ̌an/- ✓

Manual manipulation + Object type Thai /tɕʰìːk/- /dɯŋ/- ✓

Direction of separation + Instrument type Khmer /puh/- /cət/- ✓

Note that the /tɕɔ̀ʔ/-subcategory appears twice in the table: once in the /tʰîm/-category and again in the /tàt/-category, as it occurs in both categories’ subcategorisation. Also, the component of Blade placement in the rightmost column refers to an extra semantic component added to the componential pattern for a category.

Table 5.8 indicates that subcategorisation in the domain can involve up to six different additional semantic components across Thai and Khmer. Manner of action participates in a broader range of subcategories within both languages. It accordingly seems to play a pronounced role in discerning between different event subtypes associated with relevant differential subcategories in the two languages. The component of Direction of separation, by contrast, appears to engage with a more limited number of subcategories: the /pʰàː/-subcategory in Thai and sometimes the /kap/-subcategory in Khmer. Other extra components at the subcategory level appear rather specific to either Thai or Khmer. That of Object subtype only influences the specification of subcategories in Thai while the others are involved in Khmer-specific subcategorisation.

We have seen so far that Thai and Khmer show a certain similarity in that one particular semantic componential pattern, Instrument type + (Manner of action), is predominantly involved in the coding of caused-separation experiences at the category and subcategory levels. However, such a componential pattern still does not seem to provide enough direct information for hypothesising about the extent of Thai and Khmer speakers’ interpretations: how they perceive relevant “semantic features” of the experiential events at issue and how they code such information content into language representation. For instance, although Instrument type is the most widely observed componential pattern in both languages, the speakers of Thai and Khmer may or may not activate the same semantic features under it. They may not allow the identical range of semantic features to be exposed with the pattern in the process of verbalisation. The next subsection discusses, in more detail, the lexicalisation of the domain.

5.4.2 Semantic characteristic conflation in lexicalisation of caused-separation verbs across Thai and Khmer

Since the distributional patterns of certain verbs have defined different Thai and Khmer categories in this study, we can infer that such verbs should correspond to the categories’ specific semantic features (Vulchanova et al., 2012; also see §§ 3.5.1 and 4.5.1). What follows concentrates on the ways particular configurations of semantic elements, through which the Thai and Khmer speakers represent experiences of caused-separation in their languages, are selectively lexicalised. In this way, the speakers express experiential categories of the domain, based on the relevant componential patterns described in § 5.4.1. Specifically, I examine a selected number of semantic characteristics observed from how events of categories are connected to verbal representations. That in turn can delineate such categories in both Thai and Khmer and can be used to assess whether there is any significant variation or trend in the way semantic characteristics operate across the two languages.

In this subsection, I narrow the analysis to the comparison of the lexicalisation (or more precisely verbalisation) patterns of some caused-separation event types. The event types were selected because they are widely associated with the categories and subcategories involving the semantic component patterns existent in both Thai and Khmer, thus facilitating further comparability. Specifically, the patterns mentioned here include Instrument type at the category level, and Instrument type + Manner of action at the subcategory level. Below, I summarise how certain semantic features align with the chosen category/subcategory componential patterns and how they are potentially conflated with the core semantic element of any verb of cutting, breaking, and the like: [CAUSED-SEPARATION], or [SEP], in lexicalisation of the experiential domain. Note that I represent other semantic characteristic elements in abbreviated forms: [INSTR TYP] for [INSTRUMENT TYPE], and [MANNER] for [MANNER OF ACTION].

Table 5.9

Comparison of Thai and Khmer semantic feature conflation patterns for some verbs of caused-separation, based on componential patterns at category and subcategory levels.

Category level: [SEP + INSTR TYP]

| Thai verb | Semantics | Khmer verb | Semantics |
|---|---|---|---|
| /tàt/ ‘cut’ | [SEP + SHARP-BLADED] | /kat/ ‘cut’ | [SEP + SHARP-BLADED] |
| /tʰîm/ ‘stab’ | [SEP + POINTED] | /cak/ ‘stab’ | [SEP + POINTED] |
| /tʰúp/ ‘smash’ | [SEP + BLUNT-HEADED] | /dɑm/ ‘smash’ | [SEP + BLUNT-HEADED] |

Subcategory level: [SEP + INSTR TYP + MANNER]

| Thai verb | Semantics | Khmer verb | Semantics |
|---|---|---|---|
| /fan/ ‘hack’ | [SEP + SHARP-BLADED + STRIKING] | /kap/ ‘hack’ | [SEP + SHARP-BLADED + STRIKING] |
| /hàn/ ‘slice’ | [SEP + SHARP-BLADED + SAWING] | /ʔaa/ ‘saw’ | [SEP + SHARP-BLADED + SAWING] |
| /tɕɔ̀ʔ/ ‘puncture’ | [SEP + POINTED + DOWNWARD BLOW] | /tumluh/ ‘puncture’ | [SEP + POINTED + DOWNWARD BLOW] |
| /tʰɛːŋ/ ‘stab’ | [SEP + POINTED + PURPOSIVE] | | |
| /tɕîm/ ‘jab’ | [SEP + POINTED + PRESSING-AGAINST] | | |

What could the lexicalisation patterns in Table 5.9 reveal about the nature of event representation by certain verbs? Clearly, there are some features of caused-separation events perceivable in lexicalisation across Thai and Khmer. Also, language-specific distinctive attributes interact with certain features in finer degrees of lexicalisation. Both Thai and Khmer are seen to lexicalise into verbs features of three types of instruments in conflation with the core feature of [CAUSED-SEPARATION]: sharp-bladed, pointy, and blunt-headed. Only the first two of these can be further differentiated with more finely grained features for verbalisation of the domain’s subcategorisation. To be explicit, the semantic element of [SHARP-BLADED INSTRUMENT] can interface with that of [STRIKING MANNER] and of [SAWING MANNER], while the feature of [POINTED INSTRUMENT] can combine with that of [DOWNWARD BLOW MANNER]. Yet, Thai turns out to incorporate a wider range of [MANNER OF ACTION] features than Khmer, as it language-specifically allows the feature of [POINTED INSTRUMENT] to interact with the features of [PURPOSIVE MANNER] (i.e., in lexicalising the verb /tʰɛːŋ/ ‘stab’) and [PRESSING-AGAINST MANNER] (i.e., in lexicalising the verb /tɕîm/ ‘jab’). These two specialised features are not evident, or at least not detectable, in the lexicalisation of Khmer verbs utilised to convey caused-separation events in the scenes described.
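The comparison drawn from Table 5.9 can be made explicit by treating each verb’s semantics as a bundle (set) of features conflated with [SEP] and comparing the bundles across the two languages. The following is a minimal sketch under that assumption; the set-based encoding and the helper names are illustrative and are not the procedure used in the analysis itself.

```python
# Feature bundles from Table 5.9; the core feature [SEP] is implied for every verb.
thai = {
    "tàt": {"SHARP-BLADED"},
    "tʰîm": {"POINTED"},
    "tʰúp": {"BLUNT-HEADED"},
    "fan": {"SHARP-BLADED", "STRIKING"},
    "hàn": {"SHARP-BLADED", "SAWING"},
    "tɕɔ̀ʔ": {"POINTED", "DOWNWARD BLOW"},
    "tʰɛːŋ": {"POINTED", "PURPOSIVE"},
    "tɕîm": {"POINTED", "PRESSING-AGAINST"},
}
khmer = {
    "kat": {"SHARP-BLADED"},
    "cak": {"POINTED"},
    "dɑm": {"BLUNT-HEADED"},
    "kap": {"SHARP-BLADED", "STRIKING"},
    "ʔaa": {"SHARP-BLADED", "SAWING"},
    "tumluh": {"POINTED", "DOWNWARD BLOW"},
}

INSTRUMENTS = {"SHARP-BLADED", "POINTED", "BLUNT-HEADED"}


def bundles(lexicon):
    """Return the set of distinct feature bundles lexicalised in a language."""
    return {frozenset(features) for features in lexicon.values()}


def manners(lexicon):
    """Return all non-instrument (manner) features used in a language."""
    return set().union(*lexicon.values()) - INSTRUMENTS


shared = bundles(thai) & bundles(khmer)      # bundles lexicalised in both languages
thai_only = bundles(thai) - bundles(khmer)   # bundles lexicalised only in Thai
khmer_only = bundles(khmer) - bundles(thai)  # bundles lexicalised only in Khmer

print(len(shared), len(thai_only), len(khmer_only))  # 6 2 0
print(sorted(manners(thai) - manners(khmer)))        # ['PRESSING-AGAINST', 'PURPOSIVE']
```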

To conclude this section, Thai and Khmer, to some degree, appear to implement similar underlying parametric mechanisms in constructing experiences of caused-separation. This is because they illustrate certain common configurations of semantic parameters or components in the domain’s event (sub)categorisation. Still, there exist points of difference across the two languages vis-à-vis the role of language-specific semantic components, such as the implementation of Hand preference for Khmer. Furthermore, considering some of the semantic component patterns emerging in both languages, I found a limited array of semantic features (in conflation with the core feature of caused-separation) lexicalised into different verbs. The regular association patterns of such semantic features with such verbs in Thai and Khmer are both similar and different. The parallel patterns involve the array of certain semantic elements interacting with others similarly across the two languages, while the different ones involve certain semantic feature interfaces observed only in Thai.

In conclusion, first, Thai and Khmer appeared to implement the same number of verb labels for the stimulus scenes of the caused-separation event domain. Also, the frequency analysis revealed that the verbs in the two languages occurred in similar distributions: a few verbs account for the majority of the overall verb occurrences.

Second, based on the Thai and Khmer verb distributional patterns as well as the clustering analysis, this same semantic field was found to break down into different numbers of categories. Khmer has seven categories, slightly more than Thai with six. Third, considering the placement of the perceived category boundaries relative to one another, Thai and Khmer rely on both broadly comparable semantic distinctions and dissimilar fine-grained differentiations in the lumping of such events into different event types associated with the individual categories. These similarities could be accounted for as following either the cross-linguistic trend (e.g., predictability of location of separation; cf. Majid et al., 2004, 2008) or the bilateral language-specificity (e.g., differentiation between events with instrument versus manual manipulation). The differences in distinction were evident in the discriminations applied in only one of the languages, for example, the notion of blade placement upon the object’s surface observed only in Thai; these are here held responsible for the different categorisation across the languages. Last, both similar and differentiating semantic componential parameters are observed in the articulation of the experiential event categories. Given the parametric patterns, Thai and Khmer lexicalise certain semantic elements in both similar and different ways in expressing the domain. The similarities depicted in the categorisation across the two languages may reflect “universal” cognitive or experiential factors and area-specific convergence, the latter being contingent on the bilaterally achieved parallelisms. The variations having causal effects on the categorisation, by contrast, may mirror other triggering possibilities, such as linguistic artifactuality, discernible perception and memory, or exposure to cultural environments (cf. Fabb, 2016). I give these aspects more emphasis in the concluding chapter.

------⁂ ------

CHAPTER 6

Areal lexico-semantics in Mainland Southeast Asian (MSEA) Sprachbund: A case of caused separation in Thai and Khmer

This last chapter summarises and discusses findings of the previous chapters in response to the research questions posed in Chapter 1. Semantic categorisation of caused-separation in Thai and Khmer is summarised and compared in § 6.1. In § 6.2, the broader areal distributions of lexical semantic patterns are discussed, through a comparison with Hindi and Tamil, two languages that fall outside the MSEA linguistic area. The last section (§ 6.3) discusses implications of the study and further research recommendations.

6.1 Research findings in summary

The aim of this section will be to answer the first research question posed in this study, summarising the lexical semantic categorisation of caused-separation events in Thai and Khmer (as was detailed in Chapters 3-5). In § 6.1.1, a summary of how the Thai and Khmer speakers in this study partitioned the domain of caused-separation is provided. § 6.1.2 then highlights the similarities and differences across the languages. This will constitute a basis for evaluating the case for area-specific convergence in lexical semantics.

6.1.1 Lexical semantic categorisation of caused-separation in Thai and Khmer

In this study, speakers of Thai and Khmer used an equal number of verbs (24) in their descriptions of 43 event scenes of caused-separation.

Verb distributional patterns together with clustering analysis from verb-by-scene matrices also reveal that the two languages have comparable but not identical numbers of semantic categories: Thai has six, while Khmer has seven. Most of the categories are linked to the more frequently used verbs in each language. These lexically-linked categories show that speakers of both languages tend to base their event descriptions on a small set of prototypical lexical choices.
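To illustrate the kind of clustering referred to here, the sketch below groups scenes by the verbs used to describe them, using Jaccard distance and average linkage. It is a simplified, self-contained illustration: the scene names, the verb sets, the stopping threshold, and the function names are all hypothetical, and the study's own analysis (Chapters 3 and 4) may differ in how the verb-by-scene matrices were coded.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of verb labels."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(items, threshold):
    """Agglomerate scenes until the closest pair of clusters exceeds threshold."""
    clusters = [[name] for name in items]

    def dist(c1, c2):
        # Average linkage: mean pairwise distance between members of two clusters.
        total = sum(jaccard_distance(items[x], items[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        if dist(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair (i < j)
        del clusters[j]
    return clusters


# Hypothetical scenes mapped to the set of verbs used for them (NOT study data).
scenes = {
    "scene_A": {"tàt"},
    "scene_B": {"tàt", "hàn"},
    "scene_C": {"tʰúp"},
    "scene_D": {"tʰúp", "fan"},
    "scene_E": {"tàt", "fan"},
}

print(average_linkage(scenes, threshold=0.6))
# [['scene_A', 'scene_B', 'scene_E'], ['scene_C', 'scene_D']]
```

In this toy example, scene_E joins the first cluster only at a relatively high distance because it shares one verb with each group, which is the kind of behaviour the thesis describes for peripheral scenes in overlapping categories.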

Also, the subcategorisation of the domain reveals finer distinctions and more hierarchical depth of some categories, which likely relate to more numerous verbs. The fine granularities and the great numbers of verb labels for certain categories thus mirror asymmetry in the semantic partitioning of caused-separation in Thai and Khmer. Interestingly, such asymmetric instances tend to occur in the same region of the domain across the two languages.

In addition, placement of category boundaries in Thai and Khmer shows both similarities and differences. In both languages, quasi-corresponding categories occupy the same or similar core events, whereas variability is seen only in the categories’ peripheries (see the details in § 5.3).

Furthermore, Thai and Khmer make use of similar underlying semantic parametric mechanisms by which caused-separation experiences are linguistically construed. Most of the categories in the languages yield the same prevalent configuration patterns of semantic components: (I) Instrument type and (II) Manual manipulation + Object type.

The pattern of Instrument type can be further combined with another semantic component to produce the more extensive arrangement: Instrument type + Manner of action. This configuration widely engages in interpreting many categories’ events into subcategories in Thai and Khmer. In Thai, it relates to the subclassification of the /tàt/-category into the /hàn/- and /tɕɔ̀ʔ/-subcategories and of the /tʰîm/-category into the /tɕɔ̀ʔ/-, /tʰɛːŋ/-, and /tɕîm/-subcategories. Likewise, the member events of the /kat/- and /cak/-categories in Khmer can be subclassified into the respective /ʔaa/- and /tumluh/-subcategories, as subject to the same componential pattern.

Yet, some points of difference can also be found. For example, the pattern of Instrument type + Blade placement is found to be distinctive for Thai while that of Manual manipulation + Object type + Object subtype is prominent in Khmer.

Again, some of the different componential structures across the languages can be conflated with other components, functioning in the semantic organisation of some categories’ subcategorisation.

The semantic domain of caused-separation is subject to the two languages’ semantic parametric mechanisms. Organisation of semantic parameters with the core element (i.e., Caused-separation) in each case is associated with (sub)category-defining verbs through lexicalisation of features. With similar underlying parametric structures, the lexicalisation patterns of caused-separation verbs in Thai and Khmer are to some extent analogous, especially in regard to the coarse category level.

Thai and Khmer observe both similar and different categorisation of caused-separation events, in numbers of presupposed categories, category boundaries, and semantic organisation of categories. The next subsection provides more discussion of these semantic distinctions, to specify which of them may be widely cross-linguistic, mutually shared across Thai and Khmer, or language-specific, based on the data of the current study and the findings from previous cross-linguistic studies (i.e., Majid et al., 2004, 2007a, 2008).

6.1.2 Similar and different lexical semantic distinctions of caused-separation across Thai and Khmer

The similar and dissimilar discriminations in Thai and Khmer reflect common and differential categorisation. Some of the distinctions reflect cross-linguistic trends (cf. Majid et al., 2004, 2007a, 2008) (§ 6.1.2.1) while others reveal patterns shared by the two languages but not widely reported cross-linguistically (§ 6.1.2.2). Some others may reflect language-specific factors (§ 6.1.2.3). Understanding the semantic discriminations of the domain in the two languages creates a context for further discussion about causes and motivations for area-specific lexical semantics.

6.1.2.1 Convergence to cross-linguistic distinction trend

Several semantic distinctions common to Thai and Khmer appear to echo the cross-linguistic trends revealed by Majid et al. (2004, 2007a, 2008).

Cross-linguistic distinctions for the caused-separation domain (Majid et al., 2004, 2007a, 2008) mainly involve four dimensions.

The most robust of these dimensions is the abstract notion of the predictability of the location of separation. This is considered highly significant across the languages surveyed. In particular, Majid et al. report that many languages apparently distinguish between events of caused-separation with high versus low predictability. Variations are still observed but only in caused-separation events with intermediate predictability: in some languages, speakers categorise them along with high-precision actions, while in some others with low-precision ones. Otherwise, some languages favour a compromise by integrating them into other events either with high or low predictability.

The second most frequent cross-linguistic dimension captures the distinguishing of tearing events from other caused-separations. Many languages have a dedicated verb to represent this kind of caused-separation events (Majid et al., 2008).

The third cross-linguistic dimension makes a further distinction among low-predictability events: snapping versus smashing events. According to Majid et al. (2008), snapping semantically refers to events where a one-dimensional rigid object is separated by hands. In contrast, smashing means events in which a rigid object is fragmented by a blow with an instrument.

The last cross-linguistic dimension concerns the unique differentiation of a piercing event (or poking a hole).

As detailed in Chapters 3 and 4, both Thai and Khmer respect predictability distinctions, as speakers of both languages labelled high- versus low-predictability events of caused-separation by using distinct verbs. That said, Thai and Khmer have quite different arrangements of events with intermediate predictability. For example, only one karate-chopping event was grouped with low-precision actions in Thai. In contrast, all karate-chopping events were grouped with low-precision actions in Khmer. The two languages agree in representing the distinction of snapping versus smashing. In this study, the two languages never labelled the two kinds of events with the same verbs.

Furthermore, it is worth noting that Thai and Khmer vary as to how they distinguish tearing events and piercing ones from other caused-separations (see § 6.1.2.3).

6.1.2.2 Mutual parallelism across Thai and Khmer

Thai and Khmer also show mutual semantic distinction parallelism not widely reported in the literature (e.g., Majid et al., 2004, 2007a, 2008). They include the discrimination of instrument versus manual manipulation, that of different instrument types, and that of theme-object textural properties (or object types).

In the current study, the distinction of instrument versus manual manipulation is regarded as the first coarse determinant for the domain in both Thai and Khmer. They similarly distinguish caused-separations involving an instrument from those in which an agent used only his or her hands. This distinction can be understood through the use of two distinct sets of verb labels for the semantically different caused-separation types. Note that there is a general discrimination of instrument versus manual manipulation in Thai and Khmer, with the condition that a knife-hand is recognised as being an instrument. The karate hand’s chopping⁵¹ is obviously an atypical way of causing separation in objects, and it is understandable that speakers would describe these events using different sub-kinds of verbs. Based on perceived similarity, “karate hand” could be interpreted either as a blade or as a hammer. Preferences for particular verbs will depend on typical ways of separating objects (using instruments) in the language communities and also on individual speakers’ interpretations of the events.

⁵¹ The extent to which Thai and Khmer speakers also use typical-separation lexical resources to describe “unnatural” events (e.g., where a karate chop causes separation) appears to be significant data. Speakers’ lexical choices as such may provide a different kind of view on lexical-semantic parallelism versus non-parallelism. This then would contribute to the study of MSEA linguistic convergence. The issue of lexical choices for non-prototypical separation like karate-chopping is worth exploring and analysing in a separate further study.

Still, the degree to which the +/- instrument manipulation distinction is observed may vary between the two MSEA languages, as reflected by /pdac/ ‘separate’ in Khmer. Although the available information is insufficient for a full account of the verb, two key points can be made. First, when /pdac/ was employed for instrument-manipulated actions, its distributional pattern can be interpreted as being in line with that of the /kat/-subdivision. That is, both were used consistently for actions involving an intended one-location separation caused by any instrument (e.g., sharp or dull). Second, the use of /pdac/ suggests that the feature of +/- instrument manipulation does not really restrict the events it can describe. In other words, this Khmer verb appears to act as a general verb covering both sides of the +/- instrument manipulation classification, whereas Thai does not do so. The “likely broad-based” coverage of /pdac/ seemingly compromises the semantic division of +/- instrument manipulation in Khmer. Khmer thus may not observe such a distinction as strongly as does Thai. These issues of /pdac/ deserve greater attention in further studies concerning the specific semantic motivation of the verb and its potential for illuminating the +/- instrument manipulation division. Such research might facilitate understanding of Thai and Khmer lexical semantic convergence.

Next, instrument types are distinguished among instrument-manipulated events in both Thai and Khmer. Namely, both languages discriminate among events with a sharp-bladed instrument, those with a pointed tool, and those with a blunt tool. In Khmer, the instrument-type demarcation is expressed by the use of different verbs for the different event subtypes. Thai also observes this differentiation. Still, Thai frequently collapses the distinction between pointed-tip tools and sharp devices, designating both collectively with cutting verbs. Likewise, among events with manual manipulation, a similar distinction concerning different theme-object textural properties is made in both languages. In particular, manually-manipulated caused-separations with a rigid object are differentiated from those with a non-rigid (flexible) object.

Additionally, I argue that the matches in mutual semantic distinctions play a role in the derivation of quasi-counterpart categories in the two languages (see § 6.1.1). Recall the quasi-analogous Thai /tàt/- and Khmer /kat/-categories discussed in Chapter 5, for example. The cores of these two event classes correspond, as variability occurs only in the peripheral regions. This could be attributable to the fact that Thai and Khmer adhere to mutually similar distinctions. By and large, the category-defining verbs /tàt/ ‘cut’ in Thai and /kat/ ‘cut’ in Khmer seem to have very similar applicability and to adopt the same or similar intensional semantics. The congruity in intensional meanings in turn assists in making easy translations between Thai and Khmer.
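The observation that the cores of the two categories correspond while only the peripheries vary can be made concrete with an overlap measure. The following is a minimal Python sketch that applies Jaccard's index, the similarity measure used for this study's cluster analyses, to the sets of stimulus scenes described by each category-defining verb. The scene numbers are hypothetical placeholders rather than the elicited data, and the names `jaccard`, `tat_scenes`, and `kat_scenes` are introduced here purely for illustration.

```python
def jaccard(a, b):
    """Jaccard's index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty categories are treated as identical by convention.
        return 1.0
    return len(a & b) / len(a | b)


# HYPOTHETICAL scene IDs, for illustration only (not the study's data).
tat_scenes = {1, 2, 3, 4, 5, 6}  # scenes labelled with Thai /tàt/
kat_scenes = {1, 2, 3, 4, 5, 7}  # scenes labelled with Khmer /kat/

core = tat_scenes & kat_scenes       # scenes shared by both categories
periphery = tat_scenes ^ kat_scenes  # scenes found in only one category

print("overlap:", round(jaccard(tat_scenes, kat_scenes), 3))
print("core scenes:", sorted(core))
print("peripheral scenes:", sorted(periphery))
```

Under these assumptions, a value close to 1 would indicate near-identical category extensions, and the scenes in the symmetric difference would correspond to the peripheral regions in which the two categories diverge.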

Despite the above parallelism, certain divergence in semantic distinctions is still seen in Thai and Khmer, as discussed in the next subsection.

6.1.2.3 Divergence across Thai and Khmer

Thai and Khmer each exhibit the differential development of semantic distinctions. Some of these differentiations, though found in only one of the two languages, are also compatible with the cross-linguistic trends (Majid et al., 2004, 2007a, 2008). Others are rather language-specific.

Two of the cross-linguistic semantic distinctions are reflected either in Thai or in Khmer, but not in both. Khmer vigorously upholds the cross-linguistic distinguishing of tearing events from other caused-separation types while Thai fails to lexicalise this distinction. Also, Thai follows the cross-linguistic distinction relating to piercing events whereas Khmer does not appear to do so (see § 5.3.1).

Other language-specific demarcations apart from those mentioned above are also explicit in one or the other of the languages. These point to further divergence in semantic differentiations across Thai and Khmer. In the present study, they include the discriminations concerning prior blade placement in Thai, and the lengthwise direction of separation and object subtypes in Khmer. Specifically, Thai makes a further distinction among events with the use of a sharp-bladed/pointed instrument. It is based on whether the blade had been placed on the object’s surface before an action of separation was performed (see § 3.4.1.1). Events with prior blade placement were differentiated into a dedicated /krìːt/-category in the language. Khmer however does not observe this discrimination, at least for the domain’s categorisation.

Likewise, Khmer draws a further language-specific differentiation among instrument-manipulated events. It concerns whether the direction of caused-separation was oriented along the object’s length. In this study, the events refer to “lengthwise” slitting~cutting and chopping. The language accordingly sorts them into a separate /puh/-category (see § 4.4.1.2). In addition, Khmer distinguishes among manually-manipulated caused-separations involving the destruction of flexible objects (e.g., pieces of cloth, or ropes). Specifically, they were differentiated with respect to subtypes defined by numbers of object dimensions. Thus, pulling-apart of a one-dimensional flexible object is semantically different from tearing of a two-dimensional flexible object in the language. The two caused-separation types are classified into two different categories, i.e., the /tieɲ/-category for pulling-apart and the /haek/-category for tearing. Again, these two semantic demarcations do not influence the carving-up of the domain into categories in Thai.

Note that some of the above language-specific discriminations do not play a prominent role in the domain’s high-order categorisation in either Thai or Khmer, but may be important in lower-order or finer subclassification. For instance, the notion of the lengthwise orientation primed the establishment of the /puh/-category in Khmer, while potentially determining the /pʰàː/-subcategory in Thai.

The above investigation has unearthed significant parallelism consisting in mutually occurring semantic differentiation, but some of the Thai-Khmer parallelism is cross-linguistically common as well. One widely-attested type of semantic differentiation is categorisation based on predictability of separation location. Another common distinction relates to events of snapping versus smashing. On the other hand, certain types of categorisation showing Thai-Khmer parallelism have not been so widely reported. This includes differentiation based on instrument versus manual manipulation. It is categorisation of this latter kind that could support a case for areal convergence. The next section considers this type of evidence.

6.2 Area-specific lexical semantics: A look at mutual parallelism across Thai and Khmer

This section will address the third research question raised for this study: whether evidence of areally shared lexical semantic distinctions can support the notion of the MSEA linguistic area. To advance this enquiry, the notion of areal lexical semantic traits revealed in Thai and Khmer is underpinned by triangulation with selected non-MSEA languages.

An important proposal relating to areal semantics as argued by Koptjevskaja-Tamm and Liljegren (2017) is that this field could be developed through analysis of the organisation of semantic domains shared across the languages of an area.

Methodologically, to determine such areal organisation, convergence across the area’s genetically unrelated languages and its contrast with other languages outside the area need to be examined. This type of contrast would help justify taking instances of parallelism as areally special or as unusual with respect to outsiders. Comparative outsiders should be of a geographical region immediately outside an area being studied (Enfield, 2005, p. 190).

6.2.1 Local lexical semantic convergence in Thai and Khmer against mutual parallelism in Hindi and Tamil (Narasimhan, 2007)

In this section, mutual correspondences established in this study for Thai and Khmer are compared to corresponding semantic features in unrelated and non-MSEA Hindi and Tamil52 (Narasimhan, 2007), in order to apply the triangulation analysis method (see Figure 1.1). The current study chose Hindi and Tamil as they are included in the large and well-established South Asian (SA) linguistic area (Abbi, 1994; Emeneau, 1980; Masica, 1976, 1994). This area is to the immediate west of MSEA through a zone covering north Myanmar, northeast India, and the Himalayas (Jenny, 2015; Post, 2015). Any aspects of the Khmero-Thai mutual parallelism discussed above that are unusual in Hindi and Tamil come to be potentially important for establishing areal/Sprachbund lexical semantic traits. Such features are thereby taken as evidence at the lexico-semantic level to substantiate MSEA as a linguistic area differentiated from the nearby SA linguistic area.

52 The Hindi and Tamil insights were used to fulfil the triangulation, precisely for establishing the notion of areal semantic characteristics for Thai and Khmer. An exhaustive discussion of the SA languages in comparison with the MSEA languages is not the centre of interest in the present research. Still, lists of verbs and numbers of categories and subcategories in the SA languages could be fully addressed in a future study. Also, Narasimhan (2007) has already given an account of the Hindi and Tamil facts, and it would not be appropriate simply to restate her analysis.

Let us start with how caused-separation verbs in Hindi and Tamil versus Thai and Khmer are distributed in the descriptions. Narasimhan (2007) studied Hindi and Tamil’s categorisation of the cutting and breaking (i.e., caused-separation) domain, using the responses of speakers of both languages to Bohnemeyer et al.’s (2001) ‘cut’ and ‘break’ videos, as was done in the current study. Narasimhan found that the speakers of Hindi and Tamil each resorted to only a small number of recurrently used verbs for most of the domain. Only three verbs in Hindi (i.e., kaaT ‘cut’, toD| ‘break’, and phaaD| ‘tear’) were found to cover the descriptions of 42 out of 43 caused-separation scenes; the exception was the piercing scene. In Tamil, three verbs (i.e., veTTU ‘cut’, oD|ai ‘break’, and kiZii ‘tear’) described 35 of 43 scenes; the other remaining scenes correspond to two other verbs (i.e., either narUkkU ‘cut’ or arU ‘saw, cut (thread/rope)’). Likewise, none of these Tamil verbs delineates the piercing event. In contrast (especially to Hindi), Thai and Khmer employ five and seven verbs, respectively, to name the same 42 scenes, even with the piercing event removed from consideration.

The verb distributions in Hindi and Tamil (Narasimhan, 2007) consequently display markedly different granularity for the domain as opposed to the distributions of Thai and Khmer. In particular, the former SA languages seem to evince a more limited number of categories than the latter MSEA languages in the same domain. These classifications accordingly indicate that Hindi and Tamil depend on certain semantic distinctions at odds with Thai and Khmer. With this, I do not go so far as to claim that Hindi and Tamil have nothing in common with Thai and Khmer with respect to the categorisation patterns: I regard Hindi and Tamil as comparable to Thai and Khmer to some degree. Three observations result from the triangulation of the MSEA languages with Hindi and Tamil.

First, the semantic discrimination of instrument versus manual manipulation, common to both Thai and Khmer, is unparalleled in the SA languages being compared. Specifically, Hindi and Tamil do not comply with the instrument versus manual manipulation differentiation. This is reflected through some Hindi and Tamil verbs fitting with both instrument- and manually-manipulated events (Narasimhan, 2007). For example, phaaD| in Hindi and kiZii in Tamil normally apply to tearing events, but they are still applicable to instrument-manipulated events, i.e., with a non-sharp and sharp tool, respectively.

Second, the semantic discrimination of different instrument types, shared by Thai and Khmer, appears to play a part in Hindi, but not in Tamil. Hindi discriminates caused-separations by a sharp blade from those involving the use of blunt instruments and pointed tools. This is evidenced by the distinct use of kaaT ‘cut’ for caused-separations by a sharp tool and toD| ‘break’ for those by a non-sharp tool (Narasimhan, 2007). Hindi thus differs slightly from Thai and Khmer with respect to the discrimination concerning a pointed tool: in Thai and Khmer, events involving pointed-tip chisels or twigs can be classified separately.

Last, the mutually related distinction of different object textural properties in Thai and Khmer also seems to influence classifications of caused-by-hand separations in Tamil, but not in Hindi. In particular, Tamil differentiates between manually-manipulated separations involving rigid versus non-rigid objects. This is mirrored by the separate use of oD|ai ‘break’ for rigid themes53 and either arU ‘separate [1-D object]’ or kiZii ‘tear [2-D object]’ for non-rigid/flexible ones.

Contingent on this study’s findings, the next subsection discusses how the above SA comparison sheds light on potential MSEA area-specific convergence at the lexical semantic level.

6.2.2 Thai and Khmer lexical semantic convergence as areal traits

Out of three diagnostic notions of convergence in lexical semantic distinctions in Thai and Khmer, two have parallels in either Hindi or Tamil. The remaining convergence type was found to be unusual in the two SA languages. This unusualness accordingly reflects a potential for establishing a degree of area-specific relatedness and for recognising an insider-outsider areal configuration.

According to § 6.2.1, the distinctions concerning different instrument types and different object textural properties are associated with Hindi and Tamil, respectively. Hindi observes the differentiation of sharp versus non-sharp instruments, thus distinguishing caused-separations by knives, axes, or machetes from those by hammers, chisels, or twigs. The other distinction, whether theme objects are rigid or flexible, is observed only by Tamil54. The two semantic discrimination notions are not peculiar or specific only to MSEA Thai and Khmer. Instead, semantic discrimination of this type is possible for other languages outside the area.

53 One may suggest that veTTU is also used only rarely for rigid objects.

54 The specific use of phaaD| ‘tear’ for flexible objects may seem to imply the +/- rigid-object division in Hindi. The use of Hindi toD| ‘break’ with both rigid and (one-dimensional) flexible objects makes a case that such a distinction is not pervasive in the language. In other words, Tamil +/- rigid object is a stronger factor for the set of relevant verbs than it is for Hindi. Therefore, as the +/- rigid-object distinction apparently operates in about the same way in Thai and Khmer, I conjecture that +/- rigid object might be a more plausible areal MSEA feature than it is for SA. The feature still needs testing, both in MSEA and in SA, but at least it would be a candidate feature to test.

Conversely, the notion of instrument versus manual manipulation is found to be unusual among these two SA languages. Regardless of caused-separations being manipulated by a tool or hands, Hindi and Tamil speakers can describe relevant events for them all with the same verbs. As a matter of fact, many languages are also not sensitive to this particular distinction. Take the three Germanic languages (Majid et al., 2007b): English, German, and Dutch for example. They are spoken in Europe, far from MSEA. Yet they do not discriminate among caused-separations by the use of an instrument versus that of hands. Such non-differentiation is shown by the generous use of break in English, brechen in German, breken in Dutch. They are applicable to both snapping and smashing events.

Consequently, it can be observed that only the distinction concerning instrument versus manual manipulation, shared by MSEA Thai and Khmer, is unusual to the comparative outsiders. This feature then is worth discussing and examining as a potential MSEA areal phenomenon.
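The triangulation reasoning of § 6.2.1 and the present subsection amounts to a simple set comparison: a distinction is a candidate areal trait if it is shared by the insider languages and attested in none of the comparative outsiders. The following is a minimal sketch in Python, assuming that each language can be summarised by the set of mutual-parallelism distinctions it observes. The feature labels and the function name `area_specific` are shorthand introduced for this illustration only. The table records only the three distinctions discussed in § 6.1.2.2 and simplifies the graded picture given in the text (for example, the weaker rigidity distinction in Hindi noted in footnote 54).

```python
# Distinctions observed per language, as summarised in the text.
# Labels are illustrative shorthand, not terms used in the analysis itself.
DISTINCTIONS = {
    "Thai": {"instrument_vs_manual", "instrument_type", "object_rigidity"},
    "Khmer": {"instrument_vs_manual", "instrument_type", "object_rigidity"},
    "Hindi": {"instrument_type"},   # sharp versus non-sharp instruments
    "Tamil": {"object_rigidity"},   # rigid versus flexible theme objects
}


def area_specific(insiders, outsiders, table):
    """Return distinctions shared by all insiders and attested in no outsider."""
    shared = set.intersection(*(table[lang] for lang in insiders))
    attested_outside = set().union(*(table[lang] for lang in outsiders))
    return shared - attested_outside


candidates = area_specific(["Thai", "Khmer"], ["Hindi", "Tamil"], DISTINCTIONS)
print(sorted(candidates))
```

Under this simplification, only the instrument versus manual manipulation distinction survives the comparison, which matches the conclusion drawn above. Adding further outsider languages to the table can only shrink the candidate set, which is why the result is treated as a potential rather than an established areal trait.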

Also, it should be noted that, in comparing MSEA Thai and Khmer to non-MSEA Hindi and Tamil in this study, I do not suggest that the latter two languages are subject to contact conditions exactly parallel to the Thai-Khmer situation. Hindi and Tamil are far removed both historically and geographically, occurring at the extremes of the Sprachbund. For these two languages, convergence would have been brought about by non-contiguous contacts, perhaps through intermediary languages, such as northern Dravidian substrates (M. W. Post, personal communication, August 7, 2020) or interactions between Dravidian languages and the predecessors of Hindi (Schokker and Menon, 1990). Conversely, Thai and Khmer have been in close direct contact for many centuries (Schiller, 1993).


6.2.3 Evidence of MSEA linguistic area from Thai and Khmer areal semantics

In the preceding section, the area-specific parallelism derived from Thai and Khmer has been developed and interpreted in the context of the triangulation analysis. This section takes up the third research question: whether this evidence reinforces the notion of the MSEA linguistic area at the lexical semantic level.

As already discussed, the distinction of instrument versus manual manipulation does not occur in Hindi or Tamil. Not only does it look unusual to both SA languages, but this distinction also can be considered as highly specific to MSEA Thai and Khmer. Moreover, since Thai and Khmer are from two different families (i.e., the Tai-Kradai and Austroasiatic languages), the idiosyncrasy in semantic distinction is conceived as areally shared rather than genetically inherited. Correspondingly, the study’s data point to this conspicuous distinction being induced in Thai and Khmer by contact.

The areal convergence also contributes to a deeper understanding of areality in general in the MSEA area. Many previous studies have assayed MSEA area-specific linguistic features mainly in phonology, morphology, and syntax (Enfield, 2005), thereby firmly establishing the region as a Sprachbund. The present discussion provides evidence from lexical semantics. Again, the established areal lexico-semantic conception helps give a clearer picture of the degrees of transfer in MSEA languages.

6.3 Research implications, limitations, and recommendations

Implications, limitations, and recommendations constitute a final set of observations.


6.3.1 Implications

The research results, from the analysis of the semantics of caused-separation and its (area-specific) convergence in Thai and Khmer, point to four implications.

First, semantic parallelism does not have to infiltrate thoroughly and completely into every classification feature of a semantic domain. In Thai and Khmer, similar semantic organisation of the caused-separation domain has been shown to be extensive. This can be taken to represent the extent of (areal) convergence. However, there are concurrent disagreements in categorisation (see § 6.1.2). Thus, the conspicuous parallelism in Thai and Khmer should not be allowed to conceal the fact that each of them is still able to harness specificities and to make independent changes. We have to be aware of the specific contact situation: the two languages are rooted in a complex history of multilingualism (Khanittanan, 2001), population transfer and mutual influence (Huffman, 1973) over centuries or even millennia. Specifically, despite the similarities, this study makes no attempt to conclude or generalise that the qualities of the semantic categorisation in Thai and Khmer are nearly the same, with only a few trivial differences.

Second, not all mutual parallelism found in genetically unrelated but geographically close languages can automatically be assumed to be areal features without further investigation. In this study, regarding semantic distinctions in the investigated domain’s categorisation, Thai and Khmer show at least three possible parallels. However, only one of them can be considered area-specific to MSEA, while the other two appear to be paralleled by data in languages outside but near MSEA (e.g., either Hindi or Tamil). Thus, specifying areal relatedness requires not only insights from originally distinct but sympatric languages, but also triangulation with unrelated and geographically distant ones (cf. Enfield, 2005; Koptjevskaja-Tamm & Liljegren, 2017).

Third, despite the recognition of the areally shared distinction of instrument versus manual manipulation (see § 6.2.2), the current study does not infer that such an areal pattern is completely distinctive to MSEA. Semantic convergence in contiguous geographical areas does not exclude “the possibility of similar parallelisms in other regions of the world” (Koptjevskaja-Tamm & Liljegren, 2017, p. 215). As far as we know, some languages like Tidore in the North Moluccas (van Staden, 2007) and Tzeltal in the highlands of Chiapas (Brown, 2007) seem to semantically differentiate as well between instrument- versus manually-manipulated caused-separations. As a result, further research is still required to determine the comprehensive areal status of this feature and whether the distinction feature spreads in the relevant areas.

Fourth, lexical semantic traits common to Thai and Khmer may involve translation matters. These may refer to general methodological issues in areal linguistic research. The current study found that semantic convergence produces similar categorisation of caused-separation. The comparable categorisation leads to several quasi-parallel categories. The category-defining verbs of these quasi-corresponding categories (e.g., /tàt/ ‘cut’ in Thai and /kat/ ‘cut’ in Khmer) may then possess similar intensional semantics (see § 6.1.2.2). Such similarity in meaning consequently helps in translation between Thai and Khmer as MSEA languages. Relatedly, all these facts echo Koptjevskaja-Tamm & Liljegren’s (2017) commentary on areally shared lexicalisation being difficult to translate into outsider languages; such shared semantic phenomena may translate more easily among languages within an area.


6.3.2 Limitations

As the current study specifically aims at determining the semantics of caused-separations in Thai and Khmer, based very much on Bohnemeyer et al.’s (2001) material, a couple of research limitations originate from the very nature of the stimuli used and issues concerning the causality of linguistic phenomena untouched by the research questions.

The fact that the data collection material might provide restricted information on a certain kind of caused-separation, especially for Thai, is perhaps one limitation.

In many languages (Majid et al., 2008), piercing situations form a separate category in the domain. They involve events where the pointed end of an instrument is used to drill or punch a hole into an object. However, as Bohnemeyer et al.’s (2001) stimuli contain only a single scene for this event kind, it is difficult to draw firm conclusions regarding the variety of manners of action, acted-upon themes, instruments, and relevant resulting separations potentially captured by individual languages. Therefore, how the piercing category extends in different languages is still left to be satisfactorily explained. In Thai, the piercing class (i.e., /tʰîm/-category) seems to incorporate at least three different subcategories. They specifically capture different construals of manners and theme-object textural properties for the single piercing event, as supported by evidence outside this experimental study. Still, we lack empirically derived information to draw inferences on whether and how nuanced details of piercing matter with respect to subcategorisation.

Another limitation concerns what mechanisms mediate the semantic categorisation similarities and differences in Thai and Khmer. This was left untouched by the research questions. That said, despite the concurrences of the areal and non-areal agreements and disagreements vouching for the potential complexity of their causes, the findings obtained in this study are not sufficient to address the issue of causality in a comprehensive way.

6.3.3 Recommendations

The following lines suggest central points that further studies may take into account, with regard to the foregoing limitations.

First, additional stimulus scenes/events ought to be added to the stimuli (Bohnemeyer et al., 2001). To illustrate, a variety of piercing events should be included to address the issue of delicate event nuances. Additional video clips may cover numerous manners (e.g., easy poking through like a knife through a soft object, versus stabbing through a difficult medium), different theme-objects varying in textural properties, or other possible aspects (e.g., partial versus complete poking), particularly pursuant to the features already discussed in § 3.5.1. In this way, we can clarify the measuring of information on piercing events, regarding different construals and an extension in the domain. Also, other accidental caused-separation scenes should be developed and added to the stimulus core set for testing different feature determinants (e.g., human-body theme, or different tools) which might steer the selection of specific /bàːt/ and /mut/ in Thai and Khmer, respectively.

Second, further research is needed to elucidate potential drivers for the Thai and Khmer semantic similarities and differences. In this study, the findings show that the semantic parallelism and differences still co-occur in Thai and Khmer (see § 6.1.2). These phenomena may be affected by complex intermingling factors. Arguably, a possible interpretation for the parallel elaboration (see § 6.2.2) may have to do with geographical proximity and language contact. Conversely, the non-areal mutual relatedness may be more suitably conceived as caused by other influential factors, e.g., shared environments (cf. Koptjevskaja-Tamm & Liljegren, 2017). Common surroundings as such could account for the prevalence of specific experiences common to both Thai and Khmer speakers. Also, they may facilitate the habit of speakers talking about particular actions and recurring contexts for such practices (cf. Enfield, 2011b). By contrast, the dissimilar traits may be introduced by means of independent innovation. Distinctive lexico-semantic innovation may arise within a limited network of speakers and then spread more widely in the speech community. The context of the original distinctive usage and details of spread may not be accessible to researchers, or hardly explicable in terms of contact-induced or extralinguistic influences. This study has already provided insights into how Thai and Khmer categorise features similarly and differently for further research on such possible causes. In this way, lexical semantics could be integrated into the broader research profile of areal semantics, and areal linguistics in general, which in turn feeds into questions concerning the relationships between genetic inheritance and contact-induced linguistic change, intertranslatability in the languages of the world, close communication, or sociolinguistic conversational contexts (e.g., on how ‘cut’ and ‘break’ terms are actually used by speakers interactively in daily-life contexts).

------⁂ ------


References

Abbi, A. (1994). Semantic universals in Indian languages. Shimla: All India Institute of Advanced Study. Adelaar, K. A., & Himmelmann, N. (2005). The Austronesian languages of Asia and Madagascar. New York: Routledge. Aldenderfer, M. S., & Blashfield, R. K. (1984). Cluster analysis. Newbury Park: SAGE. Ameka, F. K., & Wilkins, D. P. (1996). Semantics. In H. Goeble, P. H. Nelde, Z. Stary, & W. Wölck (Eds.), Contact linguistics. An international handbook of contemporary research (Vol. 1, pp. 130-138). Berlin: Walter de Gruyter. Ameka, F., & Essegbey, J. (2007). Cut and break verbs in Ewe and the causative alternation construction. Cognitive Linguistics, 18(2), 241-250. doi:10.1515/COG.2007.012 Andersen, E. S. (1978). Lexical universals of body-part terminology. In J. H. Greenberg (Ed.), Universals of human language (Vol. 3, pp. 335-368). Stanford: Stanford University Press. Andics, A. (2012). The semantic role of agentive control in Hungarian placement events. In A. Kopecka & B. Narasimhan (Eds.), Events of putting and taking: A crosslinguistic perspective (pp. 183-200). Amsterdam: John Benjamins. Annamalai, E., & Steever, S. B. (1998). Modern Tamil. In S. B. Steever (Ed.), Dravidian languages (pp. 100-128). London: Routledge. Ansaldo, U., & Matthews, S. J. (2001). Typical creoles and simple languages: The case of Sinitic. Linguistic Typology, 5(3), 311-326. Aroonmanakun, W. (2008). TNC: Thai National Corpus (Third Edition). Retrieved October 1, 2020, from http://www.arts.chula.ac.th/~ling/tnc3/ Bauer, R. S. (1996). Identifying the Tai substratum in Cantonese. In Pan-Asiatic linguistics: Proceedings of the fourth international symposium on languages and linguistics V (pp. 1806-1844). Bangkok: Mahidol University. Berlin, B., & Kay, P. (1969). Basic color terms: Their universality and evolution. Berkeley: University of California Press. Bisang, W. (1991). Verb serialization, and attractor positions in Chinese, Hmong, Vietnamese, Thai and Khmer. In H. Seiler & W. Premper

397

(Eds.), Partizipation: das sprachliche Erfassen von Sachverhalten (pp. 509-562). Tübingen: Gunter Narr Verlag.

Bisang, W. (1999). Classifiers in East and Southeast Asian languages: Counting and beyond. In J. Gvozdanovic (Ed.), Numeral types and changes worldwide (pp. 113-185). Berlin: Mouton de Gruyter.

Bisang, W. (2006). Southeast Asia as a linguistic area. In K. Brown (Ed.), Encyclopedia of language & linguistics (2nd ed., Vol. 11, pp. 587-595). Oxford: Elsevier.

Bloomfield, L. (1933). Language. New York: Holt.

Blust, R. (1994). The Austronesian settlement of Mainland Southeast Asia. In K. L. Adams & T. J. Hudak (Eds.), Papers from the second annual meeting of the Southeast Asian Linguistics Society (pp. 25–83). Tempe, AZ: Program for Southeast Asian Studies, Arizona State University.

Boenigk, J., Wodniok, S., & Glücksman, E. (2015). Biodiversity and earth history. Heidelberg: Springer.

Bohnemeyer, J., Bowerman, M., & Brown, P. (2001). Cut and break clips. In S. C. Levinson & N. J. Enfield (Eds.), Manual for the field season 2001. Nijmegen: Max Planck Institute for Psycholinguistics.

Bouveret, M., & Sweetser, E. (2009). Multi-frame semantics, metaphoric extensions and grammar. In I. Kwon, H. Pritchett, & J. Spence (Eds.), Proceedings of the thirty-fifth annual meeting of the Berkeley Linguistics Society (Vol. 35, pp. 49-59). Berkeley: Berkeley Linguistics Society.

Bowerman, M. (2005). Why can’t you “open” a nut or “break” a cooked noodle? Learning covert object categories in action word meanings. In L. Gershkoff-Stowe & D. H. Rakison (Eds.), Building object categories in developmental time (pp. 227-262). Mahwah, NJ: Erlbaum.

Bowerman, M., & Choi, S. (2001). Shaping meanings for language: Universal and language-specific in the acquisition of spatial semantic categories. In S. C. Levinson & M. Bowerman (Eds.), Language acquisition and conceptual development (pp. 475-511). Cambridge, UK: Cambridge University Press.

Bowerman, M., Gullberg, M., Majid, A., & Narasimhan, B. (2004). Put project: The cross-linguistic encoding of placement events. In A. Majid (Ed.), Field manual volume 9 (pp. 10-18). Nijmegen: Max Planck Institute for Psycholinguistics.


Bowerman, M., Majid, A., Erkelens, M., Narasimhan, B., & Chen, J. (2004). Learning how to encode events of ‘cutting and breaking’: A crosslinguistic study of semantic development. Poster presented at the 2004 Child Language Research Forum “Constructions and Acquisition”. Stanford, CA.

Bradley, D. (2003). Lisu. In G. Thurgood & R. J. LaPolla (Eds.), The Sino-Tibetan languages (pp. 222-235). London: Routledge.

Brenzinger, M., & Fehn, A.-M. (2013). From body to knowledge: Perception and cognition in Khwe-||ani and Ts’ixa. In A. Y. Aikhenvald & A. Storch (Eds.), Perception and cognition in language and culture (pp. 161-191). Leiden: Brill.

Brown, P. (2007). ‘She had just cut/broken off her head’: Cutting and breaking verbs in Tzeltal. Cognitive Linguistics, 18(2), 319-330.

Brown, R. W., & Lenneberg, E. H. (1954). A study in language and cognition. Journal of Abnormal and Social Psychology, 49(3), 454.

Budge, C. (1980). Southeast Asia as a linguistic area. (Master’s thesis), Monash University, Melbourne.

Capell, A. (1979). Further typological studies in Southeast Asian languages. In N. D. Liem (Ed.), South-East Asian linguistic studies (Vol. 3, pp. 1-42). Canberra: Pacific Linguistics.

Chappell, H. (2001). Language contact and areal diffusion in Sinitic languages. In A. Y. Aikhenvald & R. M. W. Dixon (Eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics (pp. 328–357). Oxford: Oxford University Press.

Chen, J. (2007). ‘He cut-break the rope’: Encoding and categorizing cutting and breaking events in Mandarin. Cognitive Linguistics, 18(2), 273-285. doi:10.1515/COG.2007.015

Choi, S., & Bowerman, M. (1991). Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns. Cognition, 41(1-3), 83-121. doi:10.1016/0010-0277(91)90033-z

Clancy, S. J. (2006). The topology of Slavic case: Semantic maps and multidimensional scaling. Glossos, 7(1), 1-28.

Clark, M. (1985). Asking questions in Hmong and other Southeast Asian languages. Linguistics of the Tibeto-Burman Area, 8(2), 60-67.


Clark, M. (1989). Hmong and areal Southeast Asia. In D. Bradley (Ed.), South-East Asian syntax (pp. 175-230). Canberra: Department of Linguistics, Research School of Pacific Studies, Australian National University.

Clark, M. (1996). Where do you feel?: Stative verbs and body-part terms in Mainland Southeast Asia. In H. Chappell & W. McGregor (Eds.), The grammar of inalienability. A typological perspective on body part terms and the part-whole relation (pp. 529-564). Berlin: Mouton de Gruyter.

Clark, M., & Prasithrathsint, A. (1985). Synchronic lexical derivation in Southeast Asian languages. In S. Ratanakul & S. Premsrirat (Eds.), Southeast Asian linguistic studies presented to André-G. Haudricourt (pp. 34-81). Bangkok: Mahidol University.

Comrie, B. (2007). Areal typology of Mainland Southeast Asia: What we learn from the WALS maps. Manusya, 13, 18-47. doi:10.1163/26659077-01003002

Cysouw, M. (2007). Building semantic maps: The case of person marking. In M. Miestamo & B. Wälchli (Eds.), New challenges in typology (1st ed., pp. 225-248). Berlin: Mouton de Gruyter.

Cysouw, M. (2010). Semantic maps as metrics on meaning. Linguistic Discovery, 8(1), 70-95.

Dahl, Ö. (2008). An exercise in a posteriori language sampling. STUF - Language Typology and Universals, 61(3), 208-220. doi:10.1524/stuf.2008.0021

de Sousa, H. (2015). The far Southern Sinitic languages as part of Mainland Southeast Asia. In N. J. Enfield & B. Comrie (Eds.), Languages of Mainland Southeast Asia: The state of the art (pp. 356–440). Berlin: Mouton de Gruyter.

DeLancey, S. (1995). Verbal case frames in English and Tibetan. Department of Linguistics, University of Oregon. Eugene, OR.

DeLancey, S. (2000). The universal basis of case. Logos and Language, 1(2), 1-15.

Devylder, S. (2017). Cutting and breaking the embodied self. Cognitextes, 16(1).

Devylder, S., & Zlatev, J. (2020). Cutting and breaking metaphors of the self and the motivation and sedimentation model. In A. Baicchi & G. Radden (Eds.), Figurative meaning construction in thought and language. Amsterdam: John Benjamins.

Diffloth, G. (1994). The lexical evidence for Austric, so far. Oceanic Linguistics, 33(2), 309-322. doi:10.2307/3623131


Diller, A. V. N. (2012). Introduction. In A. V. N. Diller, J. A. Edmondson, & Y. Luo (Eds.), The Tai-Kadai languages (pp. 3-8). London: Routledge.

Donegan, P. J., & Stampe, D. (1983). Rhythm and the holistic organization of language structure. In J. H. Richardson, M. Marks, & A. Chukerman (Eds.), Papers from the parasession on the interplay of phonology, morphology and syntax (pp. 337-353). Chicago: Chicago Linguistic Society.

Draper, J. (2019). Language education policy in Thailand. In A. Kirkpatrick & A. J. Liddicoat (Eds.), The Routledge international handbook of language education policy in Asia (pp. 229-242). New York: Routledge.

ELAN. (n.d.). The language archive, Max Planck Institute for Psycholinguistics [Computer programme]. Retrieved June 1, 2017, from https://tla.mpi.nl/tools/tla-tools/elan/

Emeneau, M. B. (1980). India as a linguistic area. In M. B. Emeneau & A. S. Dil (Eds.), Language and linguistic area. Essays by Murray B. Emeneau. Stanford, CA: Stanford University Press.

Enfield, N. J. (2001). On genetic and areal linguistics in Mainland South-East Asia: Parallel polyfunctionality of ‘acquire’. In A. Y. Aikhenvald & R. M. W. Dixon (Eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics (pp. 255-290). Oxford: Oxford University Press.

Enfield, N. J. (2003). Linguistic epidemiology: Semantics and grammar of language contact in Mainland Southeast Asia. London: Routledge.

Enfield, N. J. (2004). Nominal classification in Lao: A sketch. STUF - Language Typology and Universals, 57(2/3), 117-143.

Enfield, N. J. (2005). Areal linguistics and Mainland Southeast Asia. Annual Review of Anthropology, 34, 181-206.

Enfield, N. J. (2007). Lao separation verbs and the logic of linguistic event categorization. Cognitive Linguistics, 18(2), 287-296.

Enfield, N. J. (2008). A grammar of Lao (Reprint 2010 ed.). Berlin: Mouton de Gruyter.

Enfield, N. J. (2011a). Linguistic diversity in Mainland Southeast Asia. In N. J. Enfield (Ed.), Dynamics of human diversity: The case of Mainland Southeast Asia (pp. 63-80). Canberra: Pacific Linguistics.


Enfield, N. J. (2011b). Taste in two tongues: A Southeast Asian study of semantic convergence. The Senses and Society, 6(1), 30-37. doi:10.2752/174589311X12893982233632

Enfield, N. J. (2019). Mainland Southeast Asian languages: A concise typological introduction. Cambridge, UK: Cambridge University Press.

Enfield, N. J., & Comrie, B. (Eds.). (2015). Languages of Mainland Southeast Asia: The state of the art. Berlin: Mouton de Gruyter.

Enfield, N. J., Majid, A., & van Staden, M. (2006). Cross-linguistic categorisation of the body: Introduction. Language Sciences, 28(2), 137-147. doi:10.1016/j.langsci.2005.11.001

Evans, N. (2010). Semantic typology. In J. J. Song (Ed.), The Oxford handbook of linguistic typology (pp. 504-533). Oxford: Oxford University Press.

Fabb, N. (2016). Linguistic theory, linguistic diversity and Whorfian economics. In V. Ginsburgh & S. Weber (Eds.), The Palgrave handbook of economics and language. London: Palgrave Macmillan.

Fillmore, C. J. (1970). The grammar of hitting and breaking. In R. A. Jacobs & P. S. Rosenbaum (Eds.), Readings in English transformational grammar (pp. 120-133). Waltham, MA: Ginn.

Fletcher, P. (1985). A child’s learning of English. Oxford: Basil Blackwell.

Fodor, J. A. (1975). The language of thought. New York: Crowell.

Fodor, J. A. (1981). The present status of the innateness controversy. In J. A. Fodor (Ed.), Representations (pp. 257-316). Cambridge, MA: MIT Press.

François, A. (2008). Semantic maps and the typology of colexification: Intertwining polysemous networks across languages. In M. Vanhove (Ed.), From polysemy to semantic change (Vol. 106, pp. 163-215). Amsterdam: John Benjamins.

Frewer, T. (2014). Diversity and development: The challenges of education in Cambodia. In P. Sercombe & R. Tupas (Eds.), Language, education and nation-building: Assimilation and shift in Southeast Asia (pp. 45-67). Basingstoke: Palgrave Macmillan.

Frye, M. (2011). Metaphors of being a Φ. In C. Witt (Ed.), Feminist metaphysics: Explorations in the ontology of sex, gender, and the self (pp. 85-95). Dordrecht: Springer.


Gast, V., & Koptjevskaja-Tamm, M. (2018). The areal factor in lexical typology. In F. Brisard, T. Mortelmans, & D. Olmen (Eds.), Aspects of linguistic variation (pp. 43-81). Berlin: Mouton de Gruyter.

Gast, V., König, E., & Moyse-Faurie, C. (2014). Comparative lexicology and the typology of event descriptions: A programmatic study. In D. Gerland, C. Horn, A. Latrouite, & A. Ortmann (Eds.), Meaning and grammar of nouns and verbs (pp. 145-183). Düsseldorf: Düsseldorf University Press.

Gedney, W. J. (1989 [1979]). Selected papers on comparative Tai studies. Ann Arbor: University of Michigan.

Gil, D. (2015). The Mekong-Mamberamo linguistic area. In N. J. Enfield & B. Comrie (Eds.), Languages of Mainland Southeast Asia: The state of the art (pp. 266-355). Berlin: Mouton de Gruyter.

Gleason, H. A. (1961). An introduction to descriptive linguistics. New York: Holt, Rinehart and Winston.

Goddard, C. (2005). The languages of East and Southeast Asia. Oxford: Oxford University Press.

Goodenough, W. H. (1965). Yankee kinship terminology: A problem in componential analysis. American Anthropologist, 67(5), 259-287.

Google.com. (n.d.). Retrieved October 1, 2020, from https://www.google.com/

Gorgoniev, Y. A. (1966). The Khmer language. Moscow: Nauka.

Gregerson, K. J. (1976). Tongue-root and register in Mon-Khmer. In P. N. Jenner, L. C. Thompson, & S. Starosta (Eds.), Austroasiatic studies (pp. 323-360). Honolulu: University of Hawai’i Press.

Guastavino, C. (2018). Everyday sound categorization. In T. Virtanen, M. D. Plumbley, & D. Ellis (Eds.), Computational analysis of sound scenes and events (pp. 183-213). Cham: Springer.

Guerssel, M., Hale, K., Laughren, M., Levin, B., & Eagle, J. W. (1985). A cross-linguistic study of transitivity alternations. In W. H. Eilfort, P. D. Kroeber, & K. L. Peterson (Eds.), Papers from the parasession on causatives and agentivity at the twenty-first regional meeting (Vol. 21, pp. 48-63). Chicago, IL: Chicago Linguistic Society.

Haas, M. R. (1964). Thai-English student’s dictionary. Kuala Lumpur: Oxford University Press.

Haiman, J. (2011). Cambodian: Khmer. Amsterdam: John Benjamins.


Hale, K. L., & Keyser, S. J. (1987). A view from the middle. Cambridge, MA: Center for Cognitive Science, MIT.

Haspelmath, M. (1997). Indefinite pronouns. Oxford: Oxford University Press.

Haspelmath, M. (2000). The European linguistic area: Standard average European. In M. Haspelmath, E. König, W. Österreicher, & W. Raible (Eds.), Language typology and language universals. Berlin: Mouton de Gruyter.

Haspelmath, M. (2003). The geometry of grammatical meaning: Semantic maps and cross-linguistic comparison. In M. Tomasello (Ed.), The new psychology of language (Vol. 2, pp. 211-242). Mahwah, NJ: Erlbaum.

Haspelmath, M. (2004). On directionality in language change with particular reference to grammaticalization. In O. Fischer, M. Norde, & H. Perridon (Eds.), Up and down the cline: The nature of grammaticalization (pp. 17-44). Amsterdam: John Benjamins.

Headley, R. K. (1977). Cambodian-English dictionary (Vol. 3). Washington: Catholic University of America Press.

Headley, R. K., Chim, R., & Soeum, O. (1997). Modern Cambodian-English dictionary. Kensington, MD: Dunwoody Press.

Heine, B., & Kuteva, T. (2002). World lexicon of grammaticalization. Cambridge, UK: Cambridge University Press.

Hood, D. C., & Finkelstein, M. A. (1983). A case for the revision of textbook models of color vision: The detection and appearance of small brief lights. London: Academic Press.

Huffman, F. E. (1973). Thai and Cambodian - A case of syntactic borrowing? Journal of the American Oriental Society, 93(4), 488-509. doi:10.2307/600168

Huffman, F. E. (1976). The register problem in fifteen Mon-Khmer languages. In P. N. Jenner, L. C. Thompson, & S. Starosta (Eds.), Austroasiatic studies (pp. 575-589). Honolulu: University of Hawai’i Press.

Hyslop, G., Morey, S., & Post, M. W. (Eds.). (2011). North East Indian linguistics (Vol. 3). New Delhi: Cambridge University Press India.

Hyslop, G., Morey, S., & Post, M. W. (Eds.). (2012). North East Indian linguistics (Vol. 4). New Delhi: Cambridge University Press India.

Hyslop, G., Morey, S., & Post, M. W. (Eds.). (2013). North East Indian linguistics (Vol. 5). New Delhi: Cambridge University Press India.


Iwasaki, S., & Ingkaphirom, P. (2005). A reference grammar of Thai. Cambridge, UK: Cambridge University Press.

Jakobson, R., Fant, G. M., & Halle, M. (1951). Preliminaries to speech analysis: The distinctive features and their correlates. Cambridge, MA: MIT Press.

Jenny, M. (2015). The far West of Southeast Asia: ‘give’ and ‘get’ in the languages of Myanmar. In N. J. Enfield & B. Comrie (Eds.), Languages of Mainland Southeast Asia: The state of the art (Vol. 649, pp. 155-208). Berlin: Mouton de Gruyter.

Jessen, M. (2013). Semantic categories in the domain of motion verbs by adult speakers of Danish, German, and Turkish. Linguistik Online, 61(4), 57-78.

Jessen, M., & Cadierno, T. (2013). Variation in the categorization of motion events by Danish, German, Turkish, and L2 Danish speakers. In J. Goschler & A. Stefanowitsch (Eds.), Variation and change in the encoding of motion events (pp. 133-160). Amsterdam: John Benjamins.

Kader, A. A. (1992). Postharvest technology of horticultural crops (2nd ed.). Oakland, CA: University of California, Division of Agriculture and Natural Resources.

Kay, P., Berlin, B., & Merrifield, W. (1991). Biocultural implications of systems of color naming. Journal of Linguistic Anthropology, 1(1), 12-25. doi:10.1525/jlin.1991.1.1.12

Kelly, A. (2017). Developing metrics for equity, diversity and competition: New measures for schools and universities. London: Routledge.

Keyser, S. J., & Roeper, T. (1984). On the middle and ergative constructions in English. Linguistic Inquiry, 15(3), 381-416.

Khanittanan, W. (2001). Khmero-Thai: The great change in the history of the Thai language of the Chao Phraya Basin. Phasa lae phasasat [Journal of Language and Linguistics], 19(2), 35-50.

Klee, T. (1992). Developmental and diagnostic characteristics of quantitative measures of children’s language production. Topics in Language Disorders, 12(2), 28-41. doi:10.1097/00011363-199202000-00005

Kopecka, A. (2012). Semantic granularity of placement and removal expressions in Polish. In A. Kopecka & B. Narasimhan (Eds.), Events of putting and taking: A crosslinguistic perspective (pp. 327-346). Amsterdam: John Benjamins.

Kopecka, A., & Narasimhan, B. (Eds.). (2012). Events of putting and taking: A crosslinguistic perspective. Amsterdam: John Benjamins.


Koptjevskaja-Tamm, M. (2011). “It’s boiling hot!” On the structure of the linguistic temperature domain across languages. In S. D. Schmid, U. Detges, P. Gévaudan, W. Mihatsch, & R. Waltereit (Eds.), Rahmen des Sprechens. Beiträge zu Valenztheorie, Varietätenlinguistik, Kreolistik, Kognitiver und Historischer Semantik. Peter Koch zum 60. Geburtstag (pp. 393-410). Tübingen: Narr.

Koptjevskaja-Tamm, M. (2016). The lexical typology of semantic shifts: An introduction. In P. Juvonen & M. Koptjevskaja-Tamm (Eds.), The lexical typology of semantic shifts. Berlin: Mouton de Gruyter.

Koptjevskaja-Tamm, M., & Liljegren, H. (2017). Semantic patterns from an areal perspective. In R. Hickey (Ed.), The Cambridge handbook of areal linguistics (pp. 204–236). Cambridge, UK: Cambridge University Press.

Koptjevskaja-Tamm, M., Rakhilina, E., & Vanhove, M. (2016). The semantics of lexical typology. In N. Riemer (Ed.), The Routledge handbook of semantics (pp. 434-454). Abingdon: Routledge.

Koptjevskaja-Tamm, M., Vanhove, M., & Koch, P. (2007). Typological approaches to lexical semantics. Linguistic Typology, 11(1), 159-185.

Kosonen, K., & Person, K. R. (2014). Languages, identities and education in Thailand. In P. Sercombe & R. Tupas (Eds.), Language, education and nation-building: Assimilation and shift in Southeast Asia (pp. 200-231). Basingstoke: Palgrave Macmillan.

Kroeger, P. R. (2010). The grammar of hitting, breaking, and cutting in Kimaragang Dusun. Oceanic Linguistics, 49, 2-20. doi:10.1353/ol.0.0071

Levin, B. (1993). English verb classes and alternations: A preliminary investigation. Chicago: University of Chicago Press.

Levin, B., & Hovav, M. R. (1995). Unaccusativity: At the syntax-lexical semantics interface. Cambridge, MA: MIT Press.

Levinson, S., Meira, S., & the Language and Cognition Group. (2003). ‘Natural concepts’ in the spatial topological domain - adpositional meanings in crosslinguistic perspective: An exercise in semantic typology. Language, 79(3), 485-516.

Lewis, M. P. (Ed.). (2009). Ethnologue: Languages of the world (16th ed.). Dallas, TX: SIL International.


Li, C. N., & Thompson, S. A. (1981). Mandarin Chinese: A functional reference grammar. Berkeley, CA: University of California Press.

Li, F. K. (1977). A handbook of comparative Tai. Honolulu: University of Hawai’i Press.

Lin, J. (2019). Encoding motion events in Mandarin Chinese: A cognitive functional study. Amsterdam: John Benjamins.

Lüpke, F. (2007). ‘Smash it again, Sam’: Verbs of cutting and breaking in Jalonke. Cognitive Linguistics, 18(2), 251-261. doi:10.1515/COG.2007.013

MacWhinney, B. (1994). The CHILDES project: Computational tools for analyzing talk (2nd ed.). Hillsdale, NJ: Erlbaum.

Majid, A., & Burenhult, N. (2014). Odors are expressible in language, as long as you speak the right language. Cognition, 130(2), 266-270. doi:10.1016/j.cognition.2013.11.004

Majid, A., Boster, J. S., & Bowerman, M. (2008). The cross-linguistic categorization of everyday events: A study of cutting and breaking. Cognition, 109(2), 235-250. doi:10.1016/j.cognition.2008.08.009

Majid, A., Bowerman, M., van Staden, M., & Boster, J. (2007). The semantic categories of cutting and breaking events: A crosslinguistic perspective. Cognitive Linguistics, 18(2), 133-152. doi:10.1515/COG.2007.005

Majid, A., Evans, N., Gaby, A., & Levinson, S. C. (2011). The semantics of reciprocal constructions across languages. In N. Evans, A. Gaby, S. C. Levinson, & A. Majid (Eds.), Reciprocals and semantic typology (Vol. 98, pp. 29-59). Amsterdam: John Benjamins.

Majid, A., Gullberg, M., van Staden, M., & Bowerman, M. (2007). How similar are semantic categories in closely related languages? A comparison of cutting and breaking in four Germanic languages. Cognitive Linguistics, 18(2), 179-194. doi:10.1515/COG.2007.007

Majid, A., Roberts, S. G., Cilissen, L., Emmorey, K., Nicodemus, B., O’Grady, L., . . . Levinson, S. C. (2018). Differential coding of perception in the world’s languages. Proceedings of the National Academy of Sciences - PNAS, 115(45), 11369-11376. doi:10.1073/pnas.1720419115

Majid, A., van Staden, M., Boster, J. S., & Bowerman, M. (2004). Event categorization: A cross-linguistic perspective. In K. Forbus, D. Gentner, & T. Regier (Eds.), Proceedings of the annual meeting of the Cognitive Science Society (Vol. 26, pp. 885-890). Mahwah, NJ: Erlbaum.

Malt, B. C., & Majid, A. (2013). How thought is mapped into words. Wiley Interdisciplinary Reviews: Cognitive Science, 4(6), 583-597. doi:10.1002/wcs.1251

Malt, B. C., Gennari, S., Imai, M., Ameel, E., Tsuda, N., & Majid, A. (2008). Talking about walking: Biomechanics and the language of locomotion. Psychological Science, 19, 232-240.

Malt, B. C., Sloman, S. A., Gennari, S., Shi, M., & Wang, Y. (1999). Knowing versus naming: Similarity and the linguistic categorization of artifacts. Journal of Memory and Language, 40(2), 230-262. doi:10.1006/jmla.1998.2593

Malvern, D. D., Richards, B. J., Chipere, N., & Durán, P. (2004). Lexical diversity and language development: Quantification and assessment. Basingstoke: Palgrave Macmillan.

Marston, J. M. (2014). Ratios and simple statistics in paleoethnobotanical analysis: Data exploration and hypothesis testing. In J. M. Marston, J. d. A. Guedes, & C. Warinner (Eds.), Method and theory in paleoethnobotany (pp. 163-179). Boulder: University of Colorado Press.

Martini, F. (1956). Les expressions de ‘être’ en siamois et en cambodgien. Bulletin de la Société de Linguistique de Paris, 52, 289-306.

Martini, F. (1957). La distinction du prédicat de qualité et de l’épithète en cambodgien et en siamois. Bulletin de la Société de Linguistique de Paris, 53, 295-305.

Masica, C. P. (1976). Defining a linguistic area: South Asia. Chicago: University of Chicago Press.

Masica, C. P. (1994). Some new perspectives on South Asia as a linguistic area. In A. Davison & F. M. Smith (Eds.), Papers from the fifteenth South Asian Language Analysis Roundtable Conference 1993 (pp. 187-200). Iowa City: University of Iowa Press.

Matisoff, J. (1986). Hearts and minds in Southeast Asian languages and English: An essay in the comparative lexical semantics of psycho-collocations. Cahiers de Linguistique Asie Orientale, 15(1), 5-57. doi:10.1163/19606028-90000013

Matisoff, J. A. (1973). The grammar of Lahu. Berkeley: University of California Press.


Matisoff, J. A. (1978). Variational semantics in Tibeto-Burman: The ‘organic’ approach to linguistic comparison. Philadelphia: Institute for the Study of Human Issues.

Matisoff, J. A. (1991). Areal and universal dimensions of grammatization in Lahu. In E. C. Traugott & B. Heine (Eds.), Approaches to grammaticalization (Vol. 2, pp. 383-453). Amsterdam: John Benjamins.

Matisoff, J. A. (2001). Genetic versus contact relationship: Prosodic diffusibility in South-East Asian languages. In A. Y. Aikhenvald & R. M. W. Dixon (Eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics (pp. 291-327). Oxford: Oxford University Press.

Matisoff, J. A. (2004). Areal semantics: Is there such a thing? In A. Saxena (Ed.), Himalayan languages, past and present (pp. 347-395). New York: Mouton de Gruyter.

McFarland, G. B. (1944). Thai-English dictionary. Stanford: Stanford University Press.

Migliazza, B. (1996). Mainland Southeast Asia: A unique linguistic area. Notes on Linguistics, 75, 17-25.

Miller, D. (1991). A logic programming language with lambda-abstraction, function variables, and simple unification. Journal of Logic and Computation, 1(4), 497-536. doi:10.1093/logcom/1.4.497

Minegishi, M. (2004). Southeast-Asian languages: A case for the caseless? In B. Peri & K. V. Subbarao (Eds.), Non-nominative subjects (Vol. 1, pp. 301-317). Amsterdam: John Benjamins.

Moore, R., Donelson, K., Eggleston, A., & Bohnemeyer, J. (2015). Semantic typology: New approaches to crosslinguistic variation in language and cognition. Linguistics Vanguard, 1(1), 189-200.

Morey, S., & Post, M. W. (Eds.). (2008). North East Indian linguistics. New Delhi: Cambridge University Press India.

Morey, S., & Post, M. W. (Eds.). (2010). North East Indian linguistics (Vol. 2). New Delhi: Cambridge University Press India.

Muansuwan, N. (2001). Directional serial verb constructions in Thai. In D. Flickinger & A. Kathol (Eds.), Proceedings of the seventh International HPSG Conference, University of California, Berkeley (22-23 July 2000) (pp. 143-147). Stanford: CSLI Publications.


Murphy, G. L. (2002). The big book of concepts. Cambridge, MA: MIT Press.

Nacaskul, K. (1962). Cognate words in Thai and Cambodian. (Master’s thesis), University of London, London.

Nacaskul, K. (1983). Phochananukrom thai–khamen [Thai–Khmer dictionary]. Bangkok: Faculty of Arts, Chulalongkorn University.

Narasimhan, B. (2007). Cutting, breaking, and tearing verbs in Hindi and Tamil. Cognitive Linguistics, 18(2), 195-205. doi:10.1515/COG.2007.008

Narasimhan, B., Kopecka, A., Bowerman, M., Gullberg, M., & Majid, A. (2012). Putting and taking events: A crosslinguistic perspective. In A. Kopecka & B. Narasimhan (Eds.), Events of putting and taking: A crosslinguistic perspective (pp. 1-18). Amsterdam: John Benjamins.

Narrog, H., & Ito, S. (2007). Reconstructing semantic maps: The comitative-instrumental area. STUF - Language Typology and Universals, 60(4), 273-292.

Narrog, H., & van der Auwera, J. (2011). Grammaticalization and semantic maps. In H. Narrog & B. Heine (Eds.), The Oxford handbook of grammaticalization (pp. 318-327). Oxford: Oxford University Press.

Nath, C. (1967). Vacənaːnùkrɔm khmae [Dictionnaire cambodgien] (5ème éd.). Phnom Penh: Institut bouddhique.

Nettle, D. (1999). Linguistic diversity. New York: Oxford University Press.

Nida, E. A. (1975). Componential analysis of meaning: An introduction to semantic structure. The Hague: Mouton.

Pamachae. (2008, November 4). hàn hǔːahɔ̌ːm mâj tɕʰáj kʰǐːaŋ mâj sɛ̀ːp taː dûaj náʔ [Cutting onions without a cutting board; not irritating your eyes as well]. YouTube. Retrieved August 1, 2020, from https://youtu.be/vfJgAWcFRYI

Pan, B. A. (1994). Basic measures of child language. In J. L. Sokolov & C. E. Snow (Eds.), Handbook of research in language development using CHILDES (pp. 26-49). Hillsdale, NJ: Erlbaum.

Pederson, E., Danziger, E., Wilkins, D. P., Levinson, S. C., Kitae, S., & Senft, G. (1998). Semantic typology and spatial conceptualization. Language, 74, 557-589.

Phanthumetha, B. (Ed.). (1974). Photchananukrom thai-khamen [Thai-Khmer dictionary]. Bangkok: Phraya Anuman Rajathon Foundation.


Phanthumetha, N. (2016). Khlang Kham [Thesaurus]. Bangkok: Ammarin Printing and Publishing.

Post, M. W. (2015). Morphosyntactic reconstruction in an areal-historical context: A pre-historical relationship between North East India and Mainland Southeast Asia? In N. J. Enfield & B. Comrie (Eds.), Languages of Mainland Southeast Asia: The state of the art (pp. 209-265). Berlin: Mouton de Gruyter.

Pothipath, V. (2018). Phasathai nai mummong baeplakphasa [Thai language in typological perspective]. Bangkok: Academic Publications Project, Faculty of Arts, Chulalongkorn University.

Premsrirat, S. (1987). Khmu: A minority language of Thailand (A Khmu grammar and a study of Thai and Khmu cutting words). In Papers in South-East Asian Linguistics No. 10 (pp. 145-190). Canberra: The Australian National University.

Pye, C. (1994). Breaking concepts: Constraining predicate argument structure. Unpublished manuscript. Department of Linguistics, University of Kansas.

Pye, C. (1996). K’iche’ Maya verbs of breaking and cutting. Kansas Working Papers in Linguistics, 21, 87-98.

Pye, C., Loeb, D. F., & Pao, Y.-Y. (1995). The acquisition of breaking and cutting. In E. V. Clark (Ed.), Proceedings of the twenty-seventh annual child language research forum (pp. 227-236). Stanford: Center for the Study of Language and Information Stanford.

Rogers, T. T., & McClelland, J. L. (2004). Semantic cognition: A parallel distributed processing approach. Cambridge, MA: MIT Press.

Rosch, E. (1975). Cognitive representations of semantic categories. Journal of Experimental Psychology: General, 104, 193-233.

Rosch, E. (1978). Principles of categorization. Hillsdale, NJ: Erlbaum.

Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7(4), 573-605. doi:10.1016/0010-0285(75)90024-9

Rounti, V. L. (2018). Separation events in Modern Greek. (Master’s thesis), San José State University, San José, CA.

Royal Institute of Thailand. (1950). Royal Institute Dictionary. Bangkok: Royal Institute.


Royal Institute of Thailand. (1982). Royal Institute Dictionary. Bangkok: Royal Institute.

Royal Institute of Thailand. (2011). Royal Institute Dictionary. Bangkok: Royal Institute.

Schiller, E. (1993). Why serial verb constructions? Neither bioprogram nor substrate! Amsterdam: John Benjamins.

Schokker, G. H., & Menon, A. G. (1990). Linguistic convergence: The Tamil-Hindi auxiliaries. Bulletin of the School of Oriental and African Studies, 53(2), 266-282. doi:10.1017/S0041977X00026070

SEAlang Library Khmer Text Corpus. (2007). SEAlang Library Khmer. Retrieved November 1, 2020, from http://sealang.net/khmer/corpus.htm

Sercombe, P., & Tupas, T. (2014). Language, education and nation-building: Assimilation and shift in Southeast Asia. Basingstoke: Palgrave Macmillan.

Shapiro, M. C. (1989). A primer of modern standard Hindi (1st ed.). Delhi: Motilal Banarsidass.

Shapiro, M. C., & Schiffman, H. F. (2019). Language and society in South Asia. Berlin: Mouton de Gruyter.

Shorto, H. L. (2006). A Mon-Khmer comparative dictionary. Canberra: Pacific Linguistics.

Siebenhütter, S. (2019). Conceptual transfer as an areal factor: Spatial conceptualizations in Mainland Southeast Asia. Berlin: Mouton de Gruyter.

Simpson, E. H. (1949). Measurement of diversity. Nature, 163, 688.

Slobin, D. (1996). From “thought and language” to “thinking for speaking”. In J. J. Gumperz & S. C. Levinson (Eds.), Rethinking linguistic relativity (pp. 70-96). Cambridge, UK: Cambridge University Press.

Smalley, W. A. (1994). Linguistic diversity and national unity: Language ecology in Thailand. Chicago: University of Chicago Press.

Spencer, N. H. (2013). Essentials of multivariate data analysis. Boca Raton: Chapman & Hall and CRC Press.

Stickler, K. R. (1987). Guide to analysis of language transcripts (3rd ed.). Eau Claire, WI: Thinking Publications.

Talmy, L. (1985). Lexicalization patterns: Semantic structure in lexical forms. In T. Shopen (Ed.), Language typology and syntactic description (Vol. 3, pp. 36-149).


Talmy, L. (2003). Toward a cognitive semantics (Vol. 1). Cambridge, MA: MIT Press.

Templin, M. C. (1957). Certain language skills in children. Minneapolis: University of Minnesota Press.

Thepchuaysuk, K. (2016). Syntactic and semantic properties of verbs of separation in Thai. (Doctoral dissertation), Chulalongkorn University, Bangkok.

Thepchuaysuk, K., & Thepkanjana, K. (2017). Syntactic and semantic properties of verbs of separation in Thai. Humanities Journal, 24(2), 278-317.

Thepkanjana, K., & Uehara, S. (2009). Resultative constructions with “implied-result” and “entailed-result” verbs in Thai and English: A contrastive study. Linguistics, 47(3), 589-618. doi:10.1515/LING.2009.020

Thomason, S. G., & Kaufman, T. (1988). Language contact, creolization, and genetic linguistics. Berkeley, CA: University of California Press.

Tingsabadh, K., & Abramson, A. S. (1993). Thai. Journal of the International Phonetic Association, 23, 25-28.

Tiwary, S. S., & Kumar, R. (2009). Encyclopaedia of Southeast Asia and its tribes (Vol. 1). New Delhi: Anmol Publications.

Treis, Y. (2010). Perception verbs and taste adjectives in Kambaata and beyond. In A. Storch (Ed.), Perception of the invisible (1st ed., pp. 313-346). Cologne: Köppe.

Urban, M. (2009). ‘Sun’ and ‘Moon’ in the Circum-Pacific language area. Anthropological Linguistics, 51(3/4), 328-346. doi:10.1353/anl.2009.0004

Urban, M. (2010). ‘Sun’ = ‘Eye of the Day’: A linguistic pattern of Southeast Asia and Oceania. Oceanic Linguistics, 49(2), 568-579.

Urban, M. (2012). Analyzability and semantic associations in referring expressions: A study in comparative lexicology. (Doctoral dissertation), Leiden University, Leiden.

van Staden, M. (2007). ‘Please open the fish’: Verbs of separation in Tidore, a Papuan language of Eastern Indonesia. Cognitive Linguistics, 18(2), 297-306. doi:10.1515/COG.2007.017

Vanhove, M. (2008). Semantic associations between sensory modalities, prehension and mental perceptions. In M. Vanhove (Ed.), From polysemy to semantic change: Towards a typology of lexical semantic associations (pp. 163-215). Amsterdam: John Benjamins.


Vittrant, A. (2015). Expressing motion: The contribution of Southeast Asian languages with reference to East Asian languages. In N. J. Enfield & B. Comrie (Eds.), Languages of Mainland Southeast Asia: The state of the art (pp. 586-632). Berlin: Mouton de Gruyter.

Vogel, A. R. (2003). Jarawara verb classes. (Doctoral dissertation), University of Pittsburgh, Ann Arbor.

Vulchanova, M., Martinez, L., & Vulchanov, V. (2012). Distinctions in the linguistic encoding of motion: Evidence from a free naming task. In M. Dimitrova-Vulchanova & E. van der Zee (Eds.), Motion encoding in language and space. Oxford: Oxford University Press.

Wälchli, B. (2010). Similarity semantics and building probabilistic semantic maps from parallel texts. Linguistic Discovery, 8(1), 331-371. doi:10.1349/PS1.1537-0852.A.356

Wälchli, B., & Cysouw, M. (2012). Lexical typology through similarity semantics: Toward a semantic map of motion verbs. Linguistics, 50(3), 671-710. doi:10.1515/ling-2012-0021

Wallace, A. F. C., & Atkins, J. (1960). The meaning of kinship terms. American Anthropologist, 62, 58-80.

Watkins, R. V., Kelly, D. J., Harbers, H. M., & Hollis, W. (1995). Measuring children’s lexical diversity: Differentiating typical and impaired language learners. Journal of Speech and Hearing Research, 38(6), 1349-1355.

Wilks, D. S. (2011). Statistical methods in the atmospheric sciences (3rd ed.). Amsterdam: Elsevier.


Appendix B. Gini-Simpson’s Diversity-Index Scores for Thai and Khmer.

Scene no. and description  1 - D in Thai  1 - D in Khmer
S1  Tear cloth into two pieces by hand  0.0000  0.2222
S2  Cut rope stretched between two tables with single downward blow of chisel  0.6389  0.7857
S3  Hack branch off tree with machete  0.4667  0.2857
S4  Chop cloth stretched between two tables with repeated intense knife blows  0.0000  0.6667
S5  Break stick over knee several times with intensity  0.0000  0.4167
S6  Chop multiple carrots crossways with big knife with intensity  0.6667  0.6389
S9  Slice carrot lengthwise with knife into two pieces  0.0000  0.4167
S10  Slice carrot across into multiple pieces with knife  0.2500  0.5357
S12  Cut strip of cloth stretched between two people’s hands in two  0.4762  0.5556
S13  Cut rope stretched between two tables with blow of axe  0.7500  0.4762
S14  Make single incision in melon with knife  0.6071  0.6944
S15  Saw stick propped between two tables in half  0.2222  0.0000
S18  Cut finger accidentally while cutting orange  0.0000  0.0000
S19  Snap twig with two hands  0.0000  0.2500
S20  Cut single branch off twig with sawing motion of knife  0.3556  0.4286
S21  Smash carrot into several fragments with hammer  0.2500  0.0000
S23  Chop cloth stretched between two tables into two pieces with two blows of hammer  0.2500  0.7778
S24  Cut rope in two with scissors  0.0000  0.0000
S25  Snap twig with two hands, but it doesn’t come apart  0.0000  0.0000
S26  Cut carrot crossways into two pieces with a couple of sawing motions with knife  0.5357  0.5238
S27  Cut hair with scissors  0.0000  0.0000
S28  Cut fish into three pieces with sawing motion of knife  0.4643  0.6667
S31  Smash a stick into several fragments with single blow of hammer  0.0000  0.0000
S32  Cut carrot in half crossways with single karate-chop of hand  0.6667  0.9167
S34  Chop cloth stretched between two tables with single karate-chop of hand  0.7619  0.8214
S35  Break yarn into many pieces with fury  0.6429  0.5714
S36  Tear cloth about half-way through with two hands  0.0000  0.0000
S37  Cut carrot in half lengthwise with single blow of axe  0.8095  0.5238
S38  Break single piece off yarn by hand  0.5238  0.2857
S39  Smash flowerpot with single blow of hammer  0.0000  0.6667
S40  Smash plate with single blow of hammer  0.0000  0.4762
S42  Break vertically-held stick with single karate-chop of hand  0.5714  0.7500
S43  Cut carrot crossways into two pieces with single blow of chisel  0.7857  0.8571
S45  Poke hole in cloth stretched between two tables with a twig  0.7636  0.6000
S48  Chop branch repeatedly with axe, both lengthwise and crosswise, until a piece comes off  0.7778  0.6071
S49  Cut rope in two with knife  0.0000  0.0000
S50  Chop rope stretched between two tables in two with repeated blows of hammer  0.2857  0.5238
S51  Split melon in two with single knife blow, followed by pushing halves apart by hand  0.6667  0.2857
S53  Break stick in two with single downward blow of chisel  0.7857  0.7857
S54  Cut carrot in half crosswise with single blow of axe  0.8056  0.5333
S56  Cut cloth stretched between two tables in two with scissors  0.0000  0.0000
S57  Snap carrot with two hands  0.0000  0.0000
S61  Break rope stretched between two tables with single karate-chop of hand  0.6667  0.7455
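
For readers who wish to reproduce scores of this kind, the following is a minimal sketch of how a Gini-Simpson value (1 - D) might be computed from the verbs elicited for a single scene. It assumes the finite-sample form of Simpson's index, D = Σ n_i(n_i - 1) / (N(N - 1)), where n_i is the number of speakers who used verb i and N is the total number of responses; whether this exact variant underlies the scores in this appendix is an assumption. The function name `gini_simpson` and the verb labels and counts in the example are hypothetical and are not taken from the elicitation data.

```python
from collections import Counter


def gini_simpson(responses):
    """Return 1 - D for a list of verb responses to one scene.

    Assumes the finite-sample Simpson index:
    D = sum(n_i * (n_i - 1)) / (N * (N - 1)).
    """
    counts = Counter(responses)
    total = sum(counts.values())
    if total < 2:
        # Diversity is undefined for fewer than two responses; report 0.0.
        return 0.0
    d = sum(c * (c - 1) for c in counts.values()) / (total * (total - 1))
    return 1.0 - d


# Hypothetical scene where all nine speakers choose the same verb.
uniform = ["verb_a"] * 9
# Hypothetical scene where eight speakers choose one verb and one speaker another.
mixed = ["verb_a"] * 8 + ["verb_b"]

print(round(gini_simpson(uniform), 4))  # 0.0
print(round(gini_simpson(mixed), 4))    # 0.2222
```

Under this form, a score of 0.0000 means that every speaker used the same verb for the scene, and the score approaches 1 as responses spread more evenly over a larger number of verbs.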


Appendix C. Ethics Approval Letter
