languages

Article

Focus Varies by Phrase-Initial Tones in Seoul Korean: Production, Perception, and Automatic Classification

Yong-cheol Lee 1,* and Sunghye Cho 2,* 1 Department of and Literature, Cheongju University, Cheongju 28503, Korea 2 Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA 19104, USA * Correspondence: [email protected] (Y.-c.L.); [email protected] (S.C.)

 Received: 22 September 2020; Accepted: 11 November 2020; Published: 18 November 2020 

Abstract: Production and perception experiments were conducted to examine whether focus prosody varies by phrase-initial tones in Seoul Korean. We also trained an automatic classifier to locate prosodic focus within a sentence. Overall, focus prosody in Seoul Korean was weak and confusing in production, and poorly identified in perception. However, Seoul Korean’s focus prosody differed between phrase-initial low and high tones. The low tone group induced a smaller pitch increase by focus than the high tone group. The low tone group was also subject to a greater degree of confusion, although both tone groups showed some degree of confusion spanning the entire phrase as a focus effect. The identification rate was, therefore, approximately half in the low tone group (23.5%) compared to the high tone group (40%). In machine classification, the high tone group was also more accurately identified (high: 86% vs. low: 68%) when trained separately, and the machine’s general performance when the two tone groups were trained together was much superior to the human’s (machine: 65% vs. human: 32%). Although the focus prosody in Seoul Korean was weak and confusing, the identification rate of focus was higher under certain circumstances, which suggests that focus prosody can vary within a single language.

Keywords: focus prosody; Seoul Korean; tonal contrast; production; human perception; machine perception

1. Introduction

Focus highlights a particular element in a sentence (Bolinger 1972; Xu and Xu 2005). It is normally modulated by prosodic prominence to emphasize its importance in communication. However, prosodic prominence marking focus varies according to each language’s prosodic system. For example, in English, prominence is marked by a post-lexical pitch accent on the head (i.e., a stressed syllable) of a word (Beckman and Pierrehumbert 1986; Cohan 2000; Jun 2011; Ladd 1996). Additionally, prosodic focus in American English is well identified by a machine classifier with a high accuracy (92%) (Cho et al. 2019). In Mandarin Chinese, it is also cued by the head of a word, but is distinctively characterized by the tone shape of individual lexical tones (Lee et al. 2016; Liu 2009; Yuan 2004). However, in Seoul Korean (SK), focus is expressed through prosodic phrasing in which a new phrase boundary is inserted before the focused word and prominence occurs throughout the entire focused word (Jun 2011; Lee 2012). Based on Jun’s (Jun 2005, 2014) prosodic typology, English and Mandarin Chinese belong to head-prominence languages. However, SK is labeled as an edge-prominence language because prominence is realized by marking the edge of a prosodic constituent. In order to understand the details of the current study, we first briefly overview SK’s prosodic system. We then provide a detailed description of how prosodic focus is manifested in SK. Finally, we present our research goals.

Languages 2020, 5, 64; doi:10.3390/languages5040064 www.mdpi.com/journal/languages

SK has neither lexical stress nor lexical pitch accents (Ueyama and Jun 1998; Song 2005). Instead, in default prosodic phrasing, each content word usually forms a small prosodic unit called an Accentual Phrase (AP) (Jun 1998) that is tonally marked at the post-lexical level. The typical melody of an AP is either LH-LH or HH-LH, depending on an AP-initial onset segment. The AP-initial tone is high when the initial onset has the feature [+aspirated/tense], while it is low when the initial onset has the feature [−aspirated/tense]. Intermediate H-L tones (either one or both) are omitted from APs that contain less than four syllables, resulting in several tonal patterns: L-H, L-LH, LH-H, H-H, H-LH, HH-H. Regarding the AP melody of SK, recent studies (Cho and Lee 2016; Cho 2017) showed that AP’s pitch target is much higher when it begins with a high-tone inducing segment and the high pitch is retained toward the final syllable of an AP. The tonal pattern of an AP with a high-inducing initial consonant even shows a high plateau, particularly when the number of syllables within an AP is two or three, as demonstrated in Figure 1. This seems to be why the AP with a high-inducing initial segment does not have a clear HH-LH pattern in Figure 1, unlike the basic melody of TH-LH in SK. Please see Cho and Lee (2016) for further discussion of the results.


Figure 1. Mean pitch values of all speakers by the length of the target words. The x-axis shows syllable positions in the target sentences, and the y-axis shows the mean pitch values. The first three syllables are for /i.tɕe.nɯn/ ‘now-TOPIC’, and the last four syllables are for /mal.ha.se.jo/ ‘say’ in the carrier sentence /i.dʑe.nɯn ______ mal.ha.se.jo/ ‘Now say _____.’, and the syllables in the middle (inside the dashed lines) are the target words.1 Modified from Cho and Lee (2016).

1 In this study, we used the International Phonetic Alphabet (IPA) symbols for Korean examples.

Figure 2 depicts SK’s focus realizations when a focused word starts with a low-tone inducing segment. Focus conditions were produced in an experimental setting in which six native speakers of SK read stimuli in isolation for broad focus and produced the same stimuli in a Q&A dialogue for discourse-new focus. Each stimulus was repeated six times in each condition by each of the six speakers. The differences in prosodic realizations between the two focus conditions reveal three noteworthy features. First, discourse-new focus produces a higher pitch as compared to broad focus, and the pitch increase clearly spans the entire focused word. Second, the pitch increase induced by focus is fairly small: just 1.18 extra pitch in semitones (st). The value of 1.18 st is similar to the interval between the C and C# pitches of an equal-tempered musical scale. Based on empirical findings, a difference of 1.5 st is needed in order to recognize a change in prominence of perception (Rietveld and Gussenhoven 1985). This suggests that when a focused element begins with a low tone, prosodic modulation by focus in SK is weak and is thus not perceptually salient for listeners. Lastly, focus does not change the AP tonal pattern; that is, the same pitch contours are observed whether a focused element is present or not.
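The semitone values in this paragraph can be related to frequency ratios with the standard equal-tempered conversion, st = 12 · log2(f2/f1), under which C to C# is exactly 1 st. The sketch below, in Python, is only an illustration: the 200 Hz reference pitch and the function names are assumptions introduced here, not values or code from the cited studies.

```python
import math


def hz_to_semitones(f_ref_hz: float, f_hz: float) -> float:
    """Interval from f_ref_hz up to f_hz, in equal-tempered semitones."""
    return 12.0 * math.log2(f_hz / f_ref_hz)


def semitones_to_ratio(st: float) -> float:
    """Frequency ratio that corresponds to an interval of `st` semitones."""
    return 2.0 ** (st / 12.0)


REPORTED_INCREASE_ST = 1.18  # focus-induced increase reported in the text
THRESHOLD_ST = 1.5           # difference cited as needed to perceive a prominence change

if __name__ == "__main__":
    ref_hz = 200.0  # hypothetical broad-focus pitch, for illustration only
    focused_hz = ref_hz * semitones_to_ratio(REPORTED_INCREASE_ST)
    needed_hz = ref_hz * semitones_to_ratio(THRESHOLD_ST)
    print(f"1.18 st above {ref_hz:.0f} Hz: {focused_hz:.1f} Hz")
    print(f"1.50 st above {ref_hz:.0f} Hz: {needed_hz:.1f} Hz")
    print("reported increase reaches threshold:", REPORTED_INCREASE_ST >= THRESHOLD_ST)
```

With any reference pitch, a 1.18 st increase is a frequency ratio of roughly 1.07, which falls short of the ratio of roughly 1.09 implied by the 1.5 st threshold.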
Figure 2. Time-normalized pitch contours sampled at ten equidistant points. The shaded area indicates the phrase containing the focus element minsu. The sentence is minsu-ga mandu-ɾɯl mʌk-nɯnda ‘Minsu is eating dumplings.’ Raw data taken from Lee and Xu (2010).

Prosodic focus in SK has received considerable attention in the literature (Jun and Lee 1998; Jeon and Nolan 2017; Lee and Xu 2010; Oh and Byrd 2019), yet our understanding of its exact nature is still incomplete. More importantly, no research has been conducted examining how prosodic marking of focus interacts with the tonal contrast derived from different laryngeal features of SK’s AP-initial onset segment. Stimuli designed in previous studies lacked aspirated/tense segments in order to avoid pitch perturbation (Jo et al. 2006; Jun and Kim 2007; Jun and Lee 1998; Lee and Xu 2010; Lee 2009, 2012). If these previous studies had included aspirated/tense segments, potential differences caused by tonal contrast may have been observed in marking prosodic focus. We hypothesize that there may be differences in marking prosodic focus depending on an AP-initial tone. The reason for this hypothesis will be discussed in detail in the next paragraph.

The AP-initial tonal contrast (low vs. high) is an excellent case for testing whether prosodic marking of focus varies in different pitch-scaling conditions. Liberman and Pierrehumbert (1984) suggested that pitch appears to show downstepping patterns over the course of a sentence toward a speaker’s baseline (B). This suggests that the scaling of F0 values should be evaluated as “baseline units above the baseline”, where “baseline” represents the bottom of the speaker’s range at a given level of vocal effort. The scaled intonational value (Int) is represented by the following equation:

$$\mathrm{Int} = (\mathrm{F0} - B)/B \tag{1}$$

or in the other direction

$$\mathrm{F0} = \mathrm{Int} \times B + B \tag{2}$$

Based on Equation (2), if B is 100 Hz and Int is 1, we have 200 Hz (F0 = 1 × 100 + 100). If focus scaling is applied to the Int value, then a 30% increase in an Int value of 1 will turn out to be 230 Hz (F0 = 1.30 × 100 + 100), which is 15% higher in pitch (230/200). If the focus scaling is down to 10%, we may only have 210 Hz (F0 = 1.1 × 100 + 100), which is merely a 5% increase in pitch (210/200). Because focus scaling has a multiplicative effect on the Int value, higher pitches will have a greater impact than lower pitches, in terms of F0 ratios. This leads us to hypothesize that higher pitches more effectively mark prosodic focus as compared to lower pitches, in both production and perception. Thus, prosodic marking of focus for a syllable with [+aspirated/tense] segments will be more effective in production and also more identifiable in perception than a syllable with [−aspirated/tense] segments.
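The arithmetic behind Equations (1) and (2) can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the 100 Hz baseline comes from the worked example in the text, and the lower starting value (Int = 0.5, i.e., 150 Hz) is an illustrative choice added here to show the contrast with a higher starting pitch.

```python
def int_from_f0(f0_hz: float, baseline_hz: float) -> float:
    """Equation (1): scaled intonational value, in baseline units above the baseline."""
    return (f0_hz - baseline_hz) / baseline_hz


def f0_from_int(int_value: float, baseline_hz: float) -> float:
    """Equation (2): F0 recovered from the scaled intonational value."""
    return int_value * baseline_hz + baseline_hz


def focus_f0_ratio(int_value: float, scaling: float, baseline_hz: float = 100.0) -> float:
    """F0 ratio (after / before) when focus multiplies Int by (1 + scaling)."""
    before = f0_from_int(int_value, baseline_hz)
    after = f0_from_int(int_value * (1.0 + scaling), baseline_hz)
    return after / before


if __name__ == "__main__":
    # Worked example from the text: B = 100 Hz, Int = 1 (200 Hz).
    print(focus_f0_ratio(1.0, 0.30))  # 230 / 200
    print(focus_f0_ratio(1.0, 0.10))  # 210 / 200
    # Illustrative lower pitch (assumption): Int = 0.5 (150 Hz), same 30% scaling.
    print(focus_f0_ratio(0.5, 0.30))  # 165 / 150
```

Because the ratio simplifies to (Int × (1 + scaling) + 1) / (Int + 1), it grows with Int for a fixed scaling factor, which is the sense in which the same focus scaling yields a larger F0 ratio at a higher starting pitch.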

Although previous research on focus prosody in SK has been based primarily on regular sentences, the current study examines the focus prosody of telephone numbers. We believe that phone number strings provide an appropriate experimental setting for assessing the interplay between the AP-initial tonal contrast and prosodic focus, because the individual digits of a phone number string do not behave like separate monosyllabic words, even though each digit is morphologically one word. That is, in a phone number string, ten digits tend to be produced in three phrases, (NNN)-(NNN)-(NNNN), grouping three or four monomorphemic words together. Since the subgrouping of digits in a phone number string forms a tight prosodic unit and each digit does not behave like one word, focusing a phrase-internal digit does not generate a separate AP. Thus, as observed in Figure 1, when an L-toned digit is focused in the initial position of a digit string in non-sentence-final positions, the AP begins with a low tone and typically ends with a high tone, since prosodic realization of focus would differ by an AP-initial tone (low or high). This means that when a focused element begins with a low tone in sentence-medial positions, the initial focused digit will not be higher in pitch than the subsequent digit(s) because it bears a low tone. In this case, the initial low tone would not be high enough to be perceptually recognized, compared to the subsequent digit(s). We expect that this will make the focused L-toned digit confusing in perception within an AP, that is, native listeners will have difficulty in identifying which digit is focused.
When an initial H-toned digit is in focus, the AP will begin with a high tone and the following syllable will also show a high tone because the tonal melody is HH-LH when an initial segment has the feature [+aspirated/tense]. This indicates that the focused H-toned digit will be as high in pitch as the subsequent digit(s) within an AP, still rendering the focused H-toned digit confusing, but to a lesser extent compared to the focused L-toned digit. Therefore, although both focused L-toned and H-toned digits produce a confusing prosodic marking of focus, we hypothesize that the focused L-toned digit will be even more confusing, and listeners will have difficulty identifying it.

Before formulating the hypotheses, it should be noted that previous studies revealed noticeably distinct trends in focus prosody across languages. Focus prosody was clearly marked in production and accurately recognized in perception (over 90%) in English (Lee 2015) and Mandarin Chinese (Lee et al. 2016), but neither clearly marked in production nor accurately recognized in perception (below 50%) in Tokyo Japanese (Lee et al. 2018) and South Kyungsang Korean (Lee et al. 2019). The results of this study are expected to indicate an important typological difference between languages with and without clear prominence of focus prosody, which certainly merits broader study.

Based on all of the considerations discussed above, we propose three hypotheses predicting how SK focus prosody is manifested. First, prosodic marking of focus is weak overall in production, given the small increase in pitch induced by prosodic focus as shown in Figure 2, but a focused H-toned digit will show a (relatively) greater impact than its L-toned focused counterpart, in terms of F0 ratios (Liberman and Pierrehumbert 1984).
Second, the prosodic marking of focus in the AP-initial position is confusing overall in perception because of the AP’s tonal melodies, but a focused L-toned digit will be associated with a greater degree of confusion than a focused H-toned digit. Third, a machine classifier will have difficulties in correctly identifying prosodic focus in SK due to weak and confusing prosodic cues, but a focused H-toned digit will be better identified than a focused L-toned digit. To test these hypotheses, we conducted production and perception experiments with 10-digit phone number strings and trained a machine learning classifier with prosodic cues that were obtained from the production experiment.


2. Production

2.1. Method

2.1.1. Stimuli

Using a Python script, 100 10-digit phone number strings were created in which each digit (i.e., 0–9) occurred equally in each position, and each pair of adjoining digits (e.g., 01, 87) occurred equally across each pair of sequences. The target strings were presented to participants in two conditions. In one condition, for broad focus, the target strings in a carrier sentence were shown in isolation. In the other condition, for corrective focus, the same strings were embedded in a Q&A form, as in the A/B exchange given below.
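The authors' script is not reproduced in the article. As a rough illustration of how the two balance constraints can be met at once, the following Python sketch builds one string per (start, step) pair, with the digit at position p equal to (start + p × step) mod 10. This construction, the function names, and the grouping helper are assumptions made here, not the authors' method; it also yields degenerate strings such as 0000000000, which a real stimulus list would presumably avoid.

```python
from collections import Counter
from itertools import product
from typing import List


def make_balanced_strings(n_digits: int = 10) -> List[str]:
    """Build 100 digit strings, one per (start, step) pair (illustrative only)."""
    strings = []
    for start, step in product(range(10), repeat=2):
        digits = [(start + pos * step) % 10 for pos in range(n_digits)]
        strings.append("".join(str(d) for d in digits))
    return strings


def is_balanced(strings: List[str]) -> bool:
    """Check that every digit occurs 10 times per position and every
    adjoining-digit pair occurs exactly once per pair of adjacent positions."""
    n = len(strings[0])
    for pos in range(n):
        counts = Counter(s[pos] for s in strings)
        if len(counts) != 10 or set(counts.values()) != {10}:
            return False
    for pos in range(n - 1):
        pairs = Counter(s[pos:pos + 2] for s in strings)
        if len(pairs) != 100 or set(pairs.values()) != {1}:
            return False
    return True


def group_phone_number(digits: str) -> str:
    """Format ten digits in the (NNN)-(NNN)-(NNNN) grouping used in the study."""
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


if __name__ == "__main__":
    stimuli = make_balanced_strings()
    print(len(stimuli), is_balanced(stimuli))
    print(group_phone_number(stimuli[1]))
```

The construction works because, for any two adjacent positions, the difference between the two digits recovers the step and the first digit then recovers the start, so each of the 100 digit pairs appears exactly once at each pair of adjacent positions.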
UPSILON \textniupsilon\textniiota \texthtb In the026D Q&A dialogue,ɭ \textrethookbelow{l} person A asked if a phone number wasLATIN correct, SMALL LETTER and L WITH person RETROFLEX HOOK B answered the \textrtaill028B 026Aʋ \m{u}ɪ \textsci \textbhookLATIN SMALLLATIN LETTER LETTER V WITH SMALL HOOK CAPITAL I question with the intended target string by correcting one inaccurate digit. The answer with corrective LATIN SMALL LETTER OPEN O 026E ɮ \textlyoghlig026B \m{v}ɫ \textltilde LATIN SMALL0254 LETTER LEZHɔ \m{o} LATIN SMALL LETTER L WITH MIDDLE TILDE \textOlyoghlig \textscriptv \textopeno focus was used for analysis. 026C ɬ \textbeltl LATIN SMALL LETTER L WITH BELT 026F ɯ \textturnm \textvhook LATIN SMALL LETTER TURNED M \textoopen 026D ɭ \textrethookbelow{l} LATIN SMALL LETTER L WITH RETROFLEX HOOK A:0270 mina-ɰi\textturnmrleg028C bʌnho-ga\textturnv 887-412-4699-ja.LATIN SMALL0255 LETTER mat TURNEDɕi? M WITH LONG\textctc LEGLATIN SMALL LETTER TURNED V LATIN SMALL LETTER C WITH CURL \textrtaill 0271Mina-POSSɱ \m{m}028D number-NOMʍ \textturnw 887-412-4699-DECLATIN SMALL0256 LETTER right M WITHɖ HOOK \M{d}LATIN SMALL LETTER TURNED W LATIN SMALL LETTER D WITH TAIL 026E ɮ \textlyoghlig LATIN SMALL LETTER LEZH \textltailm \textrtaild ‘Mina’s number028E is 887-412-4699.ʎ \textturny Right?’ \textOlyoghlig LATIN SMALL LETTER TURNED Y 0272 ɲ \m{j} LATIN SMALL LETTER N WITH LEFT HOOK\textdtail B: anija, mina- \textltailni028F 026F bʏnho-nɯ\textscyn\textturnm 787-412-4699-ja. 
LATIN LETTERLATIN SMALL SMALL CAPITAL LETTER Y TURNED M 0257 ɗ \m{d} LATIN SMALL LETTER D WITH HOOK \textnhookleft no0290 0270 Mina-POSSʐ ɰ\textrethookbelow{z}\textturnmrleg number-TOP 787-412-4699-DEC\texthtdLATIN SMALLLATIN LETTER SMALL Z WITH LETTER RETROFLEX TURNED HOOK M WITH LONG LEG 0273 ɳ \m{n} \textrtailz LATIN SMALL LETTER N WITH RETROFLEX HOOK ‘No, Mina’s number\textrtailn is0271 787-412-4699.’ɱ \m{m} \textdhookLATIN SMALL LETTER M WITH HOOK 0291 ʑ \textctz LATIN SMALL LETTER Z WITH CURL \textltailm LATIN SMALL LETTER REVERSED E 0274 ɴ \textscn LATIN LETTER0258 SMALL CAPITALɘ N \textreve Table1 displays SK0292 numerical0272ʒ digits\m{z}ɲ from\m{j} 0 to 9 and the onset consonant typeLATIN of each SMALLLATIN LETTER digit. SMALL EZH LETTER N WITH LEFT HOOK 0275 ɵ \textbaro LATIN SMALL0259 LETTER BARREDə O \schwa LATIN SMALL LETTER SCHWA \ezh \textltailn \textschwa The digits0276 areɶ classified\textscoelig into two groups—high and low—dependingLATIN LETTER on SMALL their CAPITAL AP-initial OE tonal contrast. \textezh\textnhookleft h h 025Ah ɚ h\m{\schwa} LATIN SMALL LETTER SCHWA WITH HOOK The high0277 toneɷ group\textcloseomega includes the following\textyogh digits: 3 [s am], 4 [s LATINa], SMALL 7 [t LETTERil], CLOSED and OMEGA 8 [p al], whose onset 0273 ɳ \m{n} \texthookabove{\schwa}LATIN SMALL LETTER N WITH RETROFLEX HOOK 0278 ɸ \textphi LATIN SMALL LETTER PHI consonants are associated0293 with aspirationʓ \textctyogh/tenseness,\textrtailn as well as 1 [il], which is reported toLATIN be SMALL produced LETTER EZH WITH CURL \textniphi \textrhookschwa 0294 0274ʔ \textglotstopɴ \textscn LATIN LETTERLATIN GLOTTAL LETTER STOP SMALL CAPITAL N with a0279 lexicallyɹ specified\textturnr H tone (Cho 2018; Jun and Cha 2011,LATIN 2015 SMALL025B). 
LETTER The TURNED otherɛ R digits\m{e} (0, 2, 5, 6, 9) LATIN SMALL LETTER OPEN E 0295 0275ʕ \textrevglotstopɵ \textbaro \textepsilonLATIN LETTERLATIN PHARYNGEAL SMALL LETTER VOICED BARRED FRICATIVE O belong027A to the lowɺ tone\textturnlonglegr group. Since tonal contrast appears AP-initiallyLATIN SMALL LETTER in SK, TURNED we R WITH examined LONG LEG AP-initial \texteopen 027B ɻ \textrethookbelow{r}0296 0276ʖ \textinvglotstopɶ \textscoelig LATIN SMALL LETTER TURNED R WITH HOOK LATIN LETTERLATIN INVERTED LETTER GLOTTAL SMALL CAPITAL STOP OE digits only for analysis. The bolded AP-initial digits are target digits tested in this study\textniepsilon (i.e., NNN), \textturnrrtail0297 0277ʗ \textstretchcɷ \textcloseomega LATIN LETTERLATIN STRETCHED SMALL LETTER C CLOSED OMEGA which027C is furtherɼ explained\textlonglegr in detail below.\textstretchcvar LATIN SMALL025C LETTER R WITHɜ LONG LEG \textrevepsilon LATIN SMALL LETTER REVERSED OPEN E 0278 ɸ \textphi LATIN SMALL LETTER PHI LATIN SMALL025D LETTER R WITH TAIL \texthookabove{\textrevepsilon} LATIN SMALL LETTER REVERSED OPEN E WITH HOOK 027D ɽ \textrethookbelow{r}0298 ʘ \textbullseye\textniphi ɝ LATIN LETTER BILABIAL CLICK \textrtailr \textObullseye \textrhookrevepsilon Table 1. The onset consonant0279 type ofɹ each digit\textturnr and the tone group depending on the tonal contrastLATIN SMALL LETTER TURNED R 027E ɾ \textfishhookr LATIN SMALL LETTER R WITH FISHHOOK 0299 ʙ \textscb 025E ɞ \textcloserevepsilonLATIN LETTER SMALL CAPITAL B LATIN SMALL LETTER CLOSED REVERSED OPEN E that each digit shows. 
027A ɺ \textturnlonglegr LATIN SMALL LETTER TURNED R WITH LONG LEG 027F ɿ \textlhti LATIN SMALL LETTER REVERSED R WITH FISHHOOK 029A ʚ \textcloseepsilon 025F ɟ \B{j}LATIN SMALL LETTER CLOSED OPEN E LATIN SMALL LETTER DOTLESS J WITH STROKE \textlhtlongi027B ɻ \textrethookbelow{r} \textbardotlessjLATIN SMALL LETTER TURNED R WITH HOOK Digit (IPA)029B ʛ Onset\texthtscg Consonant Type Post-Lexical Tone GroupLATIN LETTER SMALL CAPITAL G WITH HOOK 0280 ʀ \textscr \textturnrrtail LATIN LETTER SMALL CAPITAL R \textObardotlessj 0281 ʁ \textinvscr029C 027Cʜ \textschɼ \textlonglegr LATIN LETTER SMALL CAPITAL INVERTED R LATIN LETTERLATIN SMALL SMALL CAPITAL LETTER H R WITH LONG LEG 0 (/koŋ/) lenis0260 Lowɠ \texthookabove{g} LATIN SMALL LETTER G WITH HOOK 0282 ʂ 1 (/il/\textrethookbelow{s})029D 027Dʝ \textctjɽ vowel-initial\textrethookbelow{r}LATIN SMALL LETTER S WITH High HOOK \texthtgLATIN SMALLLATIN LETTER SMALL J WITH LETTER CROSSED-TAIL R WITH TAIL \textrtails \textctjvar\textrtailr 2 (/i/) vowel-initial0261 Lowɡ \textscriptg LATIN SMALL LETTER SCRIPT G 0283 ʃ \m{s} LATIN SMALL LETTER ESH 029E 027Eʞ \textturnkɾ \textfishhookr LATIN SMALLLATIN LETTER SMALL TURNED LETTER K R WITH FISHHOOK 3 (/sam\esh/) aspirated0262 Highɢ \textscg LATIN LETTER SMALL CAPITAL G 029F 027Fʟ \textsclɿ \textlhti LATIN LETTERLATIN SMALL SMALL CAPITAL LETTER L REVERSED R WITH FISHHOOK 4 (/sa\textesh/) aspirated0263 Highɣ \m{g} LATIN SMALL LETTER GAMMA \textlhtlongi 0284 ʄ 5 (/o/\texthtbardotlessj)02A0 ʠ \m{q} vowel-initialLATIN SMALL LETTER DOTLESS Low J WITH STROKE\textbabygammaLATIN AND HOOK SMALL LETTER Q WITH HOOK \texthtObardotlessj0280 \texthtqʀ \textscr \textgammalatinsmallLATIN LETTER SMALL CAPITAL R 6 (/juk\texthtbardotlessjvar/) glide Low h 02A1 0281ʡ \textbarglotstopʁ \textinvscr \textipagammaLATIN LETTERLATIN GLOTTAL LETTER STOP SMALL WITH CAPITAL STROKE INVERTED R 0285 ʅ7 (/t il\m{S}/) aspiratedLATIN SMALL LETTER SQUAT High REVERSED ESH 02A2 0282ʢ \textbarrevglotstopʂ 
\textrethookbelow{s} 0264 ɤ \textramshornsLATIN LETTERLATIN REVERSED SMALL GLOTTAL LETTER S STOP WITH WITH HOOK STROKE LATIN SMALL LETTER RAMS HORN 8 (/phal\textvibyi/) aspirated High 02A3 ʣ \textdzlig\textrtails 0265 ɥ \textturnhLATIN SMALL LETTER DZ DIGRAPH LATIN SMALL LETTER TURNED H 0286 ʆ9 (/ku\textctesh/) lenisLATIN SMALL LETTER ESH WITH Low CURL 0283 ʃ \m{s} LATIN SMALL LETTER ESH 0287 ʇ \textturnt02A4 ʤ \textdyoghlig LATIN SMALL LETTER TURNED T LATIN SMALL LETTER DEZH DIGRAPH \esh 0288 ʈ \M{t}02A5 ʥ \textdctzlig LATIN SMALL LETTER T WITH RETROFLEX HOOKLATIN SMALL LETTER DZ DIGRAPH WITH CURL \textrtailt \textesh 2.1.2. Subjects 02A6 ʦ \texttslig LATIN SMALL LETTER TS DIGRAPH \texttretroflexhook0284 ʄ \texthtbardotlessj LATIN SMALL LETTER DOTLESS J WITH STROKE AND HOOK Five SK native speakers02A7 (twoʧ males\textteshlig and\texthtObardotlessj three females) ranging in age from 23 to 32LATIN years SMALL LETTER (mean TESH DIGRAPH 13 \texttesh\texthtbardotlessjvar age: 29.4 years, SD: 3.8) participated in the production experiment. The participants were recruited 02A8 0285ʨ \texttctcligʅ \m{S} LATIN SMALLLATIN LETTER SMALL TC DIGRAPH LETTER SQUATWITH CURL REVERSED ESH at a university in the US, and they reported\textvibyi that they had been in the US for less thanLATIN a yearSMALL LETTER at the FENG DIGRAPH 02A9 ʩ \textfenglig 14 time of recording. The02AA participants0286ʪ signed\textlsligʆ a consent\textctesh form and received 10 dollars as compensationLATIN SMALLLATIN LETTER SMALL for LS DIGRAPH LETTER ESH WITH CURL 0287 ʇ \textturnt LATIN SMALL LETTER TURNED T their participation. None02AB of the participantsʫ \textlzlig reported any speech or hearing disorder. AllLATIN subjects SMALL LETTER gave LZ DIGRAPH 02AE 0288ʮ \textlhtlongyʈ \M{t} LATIN SMALLLATIN LETTER SMALL TURNED LETTER H WITH T WITH FISHHOOK RETROFLEX HOOK their informed consent for inclusion before they\textrtailt participated in the study. 
The study was conducted in 02AF ʯ \textvibyy LATIN SMALL LETTER TURNED H WITH FISHHOOK AND TAIL accordance with the Declaration of Helsinki, and\texttretroflexhook the protocol #818905 was approved by the University 02B0 ʰ \textsuph MODIFIER LETTER SMALL H of Pennsylvania IRB #8. \textsuperscript{h} 02B1 ʱ \textsuphth MODIFIER LETTER SMALL H WITH HOOK \textsuperscript{\texthth} 02B2 ʲ \textsupj 14 MODIFIER LETTER SMALL J \textsuperscript{j} 02B3 ʳ \textsupr MODIFIER LETTER SMALL R \textsuperscript{r}
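Returning to the stimulus design in Section 2.1.1, the two balance constraints (each digit equally often in each position, each pair of adjoining digits equally often) can be illustrated with a short sketch. The authors' script is not available, so the construction below is only one possible way to satisfy the constraints and is an assumption of this example; the function names `build_strings` and `format_number` are hypothetical. It also produces degenerate strings such as a single repeated digit, which a real stimulus list might avoid.

```python
from collections import Counter

N_DIGITS = 10  # digits 0-9
LENGTH = 10    # digits per phone number string


def build_strings():
    """Return 100 strings; the string for (start, step) has digit
    (start + pos * step) % 10 at position pos (assumed construction)."""
    strings = []
    for start in range(N_DIGITS):
        for step in range(N_DIGITS):
            digits = [(start + pos * step) % N_DIGITS for pos in range(LENGTH)]
            strings.append("".join(str(d) for d in digits))
    return strings


def format_number(s):
    """Insert hyphens in the 3-3-4 grouping used in the dialogue."""
    return f"{s[:3]}-{s[3:6]}-{s[6:]}"


strings = build_strings()

# Constraint 1: every digit occurs 10 times in every position.
for pos in range(LENGTH):
    counts = Counter(s[pos] for s in strings)
    assert all(counts[str(d)] == 10 for d in range(N_DIGITS))

# Constraint 2: every ordered pair of adjoining digits occurs exactly
# once at each pair of adjacent positions.
for pos in range(LENGTH - 1):
    pairs = Counter(s[pos:pos + 2] for s in strings)
    assert len(pairs) == 100 and set(pairs.values()) == {1}

print(len(strings), format_number(strings[17]))
```

The modular construction works because, for a fixed position, the map from (start, step) to the pair of adjoining digits is one-to-one, so all 100 ordered pairs appear exactly once.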


2.1.3. Recording Procedure
Recordings were made in a sound-proof booth at the same university in the US using a Plantronics headset microphone, and were saved directly onto a laptop as 16-bit wave files at a sampling rate of 44.1 kHz. Target stimuli were presented to speakers through PowerPoint slides. Speakers were seated in front of a laptop monitor, wearing the headset microphone. Before recording each of the broad- and corrective-focus conditions, speakers practiced three sample trials to introduce them to the recording procedure and to enable them to be aware of the discourse context, in which a certain digit was wrong and needed to be corrected in the production. As shown in Figure 3, in the broad-focus condition, speakers read target stimuli in a carrier sentence, and in the corrective-focus condition, they first listened to pre-recorded prompt questions (i.e., person A in the above Q&A dialogue) through headphones and then produced target strings as answers. Broad-focus readings were recorded first, followed by corrective-focus readings. Recording times were approximately 15 and 25 min for broad focus and corrective focus, respectively, and there was a short intermission between the focus conditions.

Figure 3. Screenshots of the production experiment. The left panel corresponds to a broad-focus condition in which the target sentence means ‘Mina’s number is 787-412-4699’ in English. The right panel corresponds to a corrective-focus condition in which the upper sentence means ‘Mina’s number is 887-412-4699. Right?’, and the lower sentence means ‘No, Mina’s number is 787-412-4699.’

2.1.4. A Sketch of Pitch Contours
We first describe sample pitch contours that enable us to capture the prosodic differences between the broad focus and the corrective focus of each tone group. In this study, each digit in each phone number string was labeled, and pitch contours sampled at ten equidistant points of the labeled digit were obtained, using ProsodyPro (Xu 2013). When a creak (often occurring at the vowel onset) was found in labeling each digit, we included the portion showing the creak and used it in the acoustic analysis. When a coda consonant was followed by an onset consonant with a single closure (for example, kku in /juk.ku/ ‘six nine’), the closure was divided into two halves, each one included as a closure of each consonant. Furthermore, when pitch halving or doubling errors were detected during the inspection of pitch contours, they were manually corrected using a TD-PSOLA (Time Domain Pitch Synchronous Overlap-Add) technique in Praat. Pitch contours in Hertz were then converted to semitones (st) using the following equation (Lee et al. 2016; Xu and Wang 2009): $\mathrm{st} = 12 \log_{2}(\mathrm{Hz})$. This is because pitch in semitones increases linearly, contrary to the hertz scale that increases non-linearly (Nolan 2003).

Figure 4 displays the time-normalized pitch contours of broad focus and corrective focus. Since only one digit is produced with corrective focus, only the phrase containing the corrected digit is shown here, while the other phrases are omitted for simplicity’s sake. In Figure 4, the numerical digits in each panel refer to the digits used for the first phrase, and the shaded area represents a focus position. The digits ‘1’ and ‘8’ in the top panels belong to the high tone group, and the digits ‘5’ and ‘6’ in the bottom panels belong to the low tone group. Two noticeable features can be seen in Figure 4. First, prosodic marking of focus differs strikingly by tonal contrast. It seems that only the digits in the high tone group are realized with a clearly increased pitch range in the focus position, relative to those in the low tone group, suggesting that prosodic marking of focus is more salient in the high tone group than in the low tone group. Second, it appears that focus does not modulate one single digit for both tone groups. All plots in Figure 4 show that the pitch level of positions 2 and/or 3 is also increased, although only position 1 was in focus. This indicates that both tone groups produced a confusing prosodic marking of focus.
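The semitone conversion and the ten-point sampling described above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the linear interpolation in `sample_equidistant` is a simple stand-in assumed for this example, not a description of how ProsodyPro computes its time-normalized contours.

```python
import math


def hz_to_semitones(f0_hz):
    """Convert F0 in Hz to semitones relative to 1 Hz: st = 12 * log2(Hz)."""
    if f0_hz <= 0:
        raise ValueError("F0 must be positive")
    return 12.0 * math.log2(f0_hz)


def semitone_gap_to_ratio(delta_st):
    """Frequency ratio that corresponds to a gap of delta_st semitones."""
    return 2.0 ** (delta_st / 12.0)


def sample_equidistant(times, values, n_points=10):
    """Sample a contour at n_points equally spaced times (endpoints included)
    by linear interpolation. Assumes strictly increasing times, at least two
    samples, and n_points >= 2."""
    if len(times) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    start, end = times[0], times[-1]
    sampled = []
    for i in range(n_points):
        t = start + (end - start) * i / (n_points - 1)
        j = 1
        while j < len(times) - 1 and times[j] < t:
            j += 1  # advance to the segment [times[j-1], times[j]] holding t
        weight = (t - times[j - 1]) / (times[j] - times[j - 1])
        sampled.append(values[j - 1] + weight * (values[j] - values[j - 1]))
    return sampled


# Invented F0 track (Hz) over one labeled digit, for illustration only.
track_times = [0.00, 0.05, 0.10, 0.15, 0.20]
track_hz = [180.0, 195.0, 210.0, 200.0, 185.0]
contour_st = [hz_to_semitones(f) for f in sample_equidistant(track_times, track_hz)]
print(len(contour_st))
```

On this scale an octave (a doubling in Hz) is always 12 st regardless of the speaker's register, which is what makes differences between speakers and tone groups comparable.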


Figure 4. Sample pitch contours for the low and high tone groups in the two focus conditions. In panels (a,b), the digits highlighted in gray belong to the high tone groups, and in panels (c,d), the digits in the first area refer to the low tone groups. BF and CF are abbreviations for broad focus and corrective focus, respectively.

2.2. Analyses
To evaluate how prosodic marking of focus varies by tone group, we directly compared the low and high tone groups in AP-initial focused positions (i.e., NNN) in two focus conditions (broad and corrective). We first examined 1000 digit strings by spectrogram reading and listening to determine whether speakers actually phrased at the hyphens in phone number strings. That is, our examination was focused on identifying whether each digit group demarcated by hyphens was produced as a separate phrase (here, an AP). Unlike the 3-digit groups that were always produced as one phrase in both broad- and corrective-focus conditions, the 4-digit group was often produced as one phrase (i.e., NNNN), but was sometimes divided into two phrases (i.e., NN-NN). This inconsistency was due to the declarative morpheme (-ja or -ija) attached to the 4-digit group at the end. The choice of the morpheme depends on the presence of the coda consonant before it. When the coda consonant is present, -ija is attached; otherwise, -ja is used as a morpheme. To achieve consistency, 4-digit groups were excluded from further analysis, and only the AP-initial focused position in 3-digit groups was analyzed. Furthermore, in contrast to the assertion that a new ip boundary is inserted before a focused element (Jun and Cha 2015), we did not consider the ip boundary cue because the existence of the ip has not yet been generally accepted among researchers (e.g., Jeon and Nolan 2017).

In this production experiment, a total of 1000 digit strings (5 speakers × 2 focus conditions × 10 digits × 10 string positions) were obtained. However, because only 3-digit AP groups were included, the total number of target digit strings was reduced to 200 (5 speakers × 2 focus conditions × 10 digits × 2 string positions). Classifying the 200 digits by tone group yielded 100 target digits in the low tone group and 100 target digits in the high tone group. This division is based on the fact that, as previously mentioned, there are five digits (0, 2, 5, 6, 9) in the low tone group and five (1, 3, 4, 7, 8) in the high tone group (see footnote 2).

For acoustic measurements, measures of duration in milliseconds (ms), mean intensity in decibels (dB), and maximum pitch (st) were aggregated in order to observe whether each acoustic cue plays an important role in marking prosodic focus in SK. Among several pitch-related cues, maximum pitch was selected because the overall behavior for both tone groups showed an increase in pitch range to signal prominence for focus. Therefore, we found that maximum pitch best captures the difference in pitch range between broad focus and corrective focus. The measures of duration, mean intensity, and maximum pitch were taken from all the target digits and averaged in each focus condition and each tone group. For the sake of simplicity, we will henceforth refer to duration, mean intensity, and maximum pitch as duration, intensity, and pitch, respectively.
In the present study, we used three different analyses to determine how corrective focus was realized differently from broad focus in both tone groups. The first analysis (Section 2.3.1) focused on AP-initial positions to determine whether corrective focus produced more prominent acoustic cues than broad focus. Next, based on observations from Figure 4, the second analysis (Section 2.3.2) addressed whether prosodic modulation by focus spanned the entire phrase as a focus effect in both tone groups. To do so, focus position was divided into two positions: on-focus and post-focus. Since position 1 was focused in each AP of a digit string, it was labeled as “on-focus,” and the other positions were labeled as “post-focus.” Based on this method, the aggregate measures of duration (ms), intensity (dB), and pitch (st) in each of the two focus positions were calculated by subtracting broad focus from corrective focus. Finally, in the third analysis (Section 2.3.3), the pitch contours of each position (i.e., NNN) were compared at the AP level to identify whether an AP-initial position in the low tone group was tonally more constrained in marking prosodic focus as compared to the high tone group.

A series of linear mixed-effects model analyses were conducted to statistically test the above three analyses, using the lmerTest package (Kuznetsova et al. 2017) in R (R Core Team 2020). Fixed effects were focus (broad, corrective) and tone group (low, high) for the first analysis, focus position (on-focus, post-focus) for the second analysis, and AP position (initial, medial, final) for the third analysis. In all three analyses, random effects included speaker (5 speakers), position (1, 4), and digits (0–9), and dependent variables were the three acoustic cues. The Anova function of the lmerTest package was used to obtain the significance level of the fixed effects. Furthermore, a series of multiple comparisons with Bonferroni correction were conducted using the multcomp package (Hothorn et al. 2008) in R.
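The bookkeeping behind these analyses (tone-group membership, the token counts, the on-focus versus post-focus labels, and the corrective-minus-broad difference) can be sketched as follows. The sketch is illustrative only: the function names and the toy pitch values are invented for this example, and `bonferroni` is the textbook adjustment (p multiplied by the number of comparisons, capped at 1), not a reproduction of the multcomp output. The mixed-effects models themselves are not reproduced here.

```python
from itertools import product
from statistics import mean

HIGH_TONE = {1, 3, 4, 7, 8}  # tone groups as listed in Table 1
LOW_TONE = {0, 2, 5, 6, 9}


def tone_group(digit):
    """Return 'high' or 'low' for a digit 0-9."""
    if digit in HIGH_TONE:
        return "high"
    if digit in LOW_TONE:
        return "low"
    raise ValueError(f"not a digit 0-9: {digit!r}")


def focus_position(position_in_ap):
    """Position 1 of an AP carries focus; positions 2 and 3 follow it."""
    return "on-focus" if position_in_ap == 1 else "post-focus"


def cf_minus_bf(tokens):
    """Mean corrective-minus-broad difference per focus position.

    tokens: iterable of (focus, position_in_ap, value), focus in {'BF', 'CF'}.
    """
    cells = {}
    for focus, pos, value in tokens:
        cells.setdefault((focus_position(pos), focus), []).append(value)
    labels = sorted({label for label, _ in cells})
    return {
        label: mean(cells[(label, "CF")]) - mean(cells[(label, "BF")])
        for label in labels
    }


def bonferroni(p_values):
    """Textbook Bonferroni adjustment of a family of p-values."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Token counts: 5 speakers x 2 focus conditions x 10 digits x 2 string positions.
design = list(product(range(5), ("BF", "CF"), range(10), (1, 4)))
n_low = sum(tone_group(digit) == "low" for _, _, digit, _ in design)
n_high = len(design) - n_low
print(len(design), n_low, n_high)  # 200 100 100

# Invented pitch values (st) for one AP, for illustration only.
toy = [("BF", 1, 90.0), ("CF", 1, 91.5),
       ("BF", 2, 89.0), ("CF", 2, 89.5),
       ("BF", 3, 88.0), ("CF", 3, 88.5)]
diffs = cf_minus_bf(toy)
```

A positive difference means the corrective-focus value exceeds its broad-focus counterpart; comparing the on-focus and post-focus differences is what the second analysis tests.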

2.3. Results

2.3.1. On-Focus Effects
Figure 5 shows the mean values of duration (ms), intensity (dB), and pitch (st) in two focus conditions, separated by tone group. Target digits under corrective focus were produced with increased duration, intensity, and pitch as compared to target digits under broad focus in both tone groups, except duration in the high tone group. However, different tone groups showed different trends in marking prosodic focus. Duration and intensity played important roles in the low tone group; however, in the high tone group, intensity and pitch were crucial in the production of corrective focus. Furthermore, among all acoustic variables, pitch in the high tone group was most salient in distinguishing corrective focus from broad focus.

Footnote 2. A sample size of 200 may not be sufficient to statistically determine the differences in focus prosody between the two tone groups: low and high tone. Nevertheless, since digit and position were treated as random samples, we had 50 tokens for each focus type (broad and corrective) in each tone group. According to Hair et al. (2010), the general rule is to have a minimum of five observations per variable (5:1), and an acceptable sample size would have ten observations per variable (10:1). Since we had 50 tokens for each focus type, it should be noted that the ratio in our study was just above the rule of thumb.

Figure 5. Duration, intensity, and pitch of each tone group in two focus conditions (BF: Broad focus, CF: Corrective focus; Low: Low tone group, High: High tone group). Points refer to mean values, and error bars indicate the 95% confidence interval.

Based on Figure 5, we posit that two-way interaction effects between tone group and focus exist for all three acoustic cues, and therefore it is necessary to examine these interaction effects. Results of the linear mixed-effects model show that the interaction effect between tone group and focus was highly significant for all acoustic parameters (duration: χ² = 12.88, df = 3, p < 0.01; intensity: χ² = 19.55, df = 1, p < 0.001; pitch: χ² = 64.22, df = 3, p < 0.001). The results confirm that the two tone groups differ from each other in marking prosodic focus, which prompted us to evaluate the focus effect of the acoustic variables in each tone group.

In the low tone group, the main effect of focus was significant for all acoustic cues (duration: χ² = 9.33, df = 1, p < 0.01; intensity: χ² = 7.65, df = 1, p = 0.01; pitch: χ² = 7.51, df = 1, p = 0.01). In the high tone group, the main effect of focus was clearly significant for pitch (χ² = 91.72, df = 1, p < 0.001), but focus had no significant effect on duration (χ² = 0.83, df = 1, p = 0.77) and intensity (χ² = 1.36, df = 1, p = 0.24). The results show clear differences in the trends of marking prosodic focus in both tone groups. The two groups differ in that duration, intensity, and pitch played an important role in marking focus in the low tone group, whereas only pitch was an important cue in the high tone group.

Among the three acoustic variables, pitch and duration deserve further attention, since the focus effect by pitch and duration differed markedly in the two tone groups. In the low tone group, the estimated pitch value was 88.65 st for broad focus and 89.24 st for corrective focus, which is an increase of just 0.59 st. However, in the high tone group, the estimated pitch value was 93.05 st for broad focus and 94.66 st for corrective focus, for a difference of 1.61 st between the two focus conditions. The difference indicates that the pitch increase induced by focus was roughly three times greater in the high tone group than in the low tone group (i.e., 1.61 st vs. 0.59 st). The opposite pattern appears in duration between the two tone groups. Duration served as a more prominent cue for focus in the low tone group, but lacked the same significance in the high tone group.
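The pitch increases quoted above can be recomputed from the four estimated values, and the semitone gaps can be expressed as approximate percentage increases in Hz by inverting the conversion st = 12 log2(Hz). The snippet below is a small worked check under that assumption; the function names are hypothetical.

```python
# Estimated maximum pitch (st) per tone group and focus condition,
# as reported in Section 2.3.1.
ESTIMATES = {
    "low": {"BF": 88.65, "CF": 89.24},
    "high": {"BF": 93.05, "CF": 94.66},
}


def focus_increase_st(group):
    """Corrective-minus-broad pitch difference in semitones."""
    return round(ESTIMATES[group]["CF"] - ESTIMATES[group]["BF"], 2)


def st_to_percent(delta_st):
    """Percentage increase in Hz that corresponds to a semitone gap."""
    return (2.0 ** (delta_st / 12.0) - 1.0) * 100.0


low_gain = focus_increase_st("low")
high_gain = focus_increase_st("high")
print(low_gain, high_gain)              # 0.59 1.61
print(round(high_gain / low_gain, 2))   # 2.73
print(st_to_percent(low_gain), st_to_percent(high_gain))
```

The ratio of about 2.7 is what the text rounds to "roughly three times"; in Hz terms the low tone group raises maximum pitch by about 3.5% under corrective focus, against just under 10% for the high tone group.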
2.3.2. Focus Effects within APs
Figure 6 displays the mean differences in duration (ms), intensity (dB), and pitch (st) of the focus positions between corrective focus and broad focus. Overall, both the low and high tone groups show no clear indication of focus effects in the on-focus positions, when compared to the post-focus positions. In the low tone group, only the duration cue for on-focus positions was greater than both 0 (i.e., corrective > broad) and the post-focus positions. Although the intensity and pitch cues for on-focus positions were greater than 0, the post-focus positions also showed intensity and pitch values greater than 0, and notably the post-focus positions even produced greater pitch values than the on-focus positions. This suggests that prosodic marking of focus by intensity and pitch was certainly confusing in the low tone group within APs. The high tone group also showed a similar level of confusion. The on-focus position of the high tone group did not have duration values greater than both 0 and the post-focus positions. The intensity and pitch cues for on-focus positions were greater than both 0 and the post-focus positions, but the post-focus positions also produced positive intensity and pitch values. This indicates that, similar to the low tone group, the high tone group produced a confusing prosodic marking of focus. Nevertheless, two crucial differences were apparent between the two tone groups: in the low tone group, the on-focus and post-focus positions exhibited salient differences in duration, whereas the high tone group exhibited a higher level of pitch cues for on-focus positions, far greater than 0.

Figure 6. Mean differences in duration (ms), intensity (dB), and pitch (st) of the two focus positions between corrective focus and broad focus. The horizontal dotted line at 0 on the y-axis indicates that the difference between corrective and broad focus conditions equals zero, and values above 0 indicate that the value of the corrective-focus condition exceeds that of the broad-focus counterpart.

The results of the linear mixed-effects model analysis indicate that, for the low tone group, the main effect of focus position was significant for duration (χ² = 38.45, df = 1, p < 0.001) and pitch (χ² = 5.31, df = 1, p = 0.05), but not for intensity (χ² = 1.40, df = 1, p = 0.24). The high tone group showed a significant effect of focus position on duration (χ² = 9.89, df = 1, p < 0.01) and pitch (χ² = 8.97, df = 1, p < 0.01), but not on intensity (χ² = 0.28, df = 1, p = 0.60). Note that the direction of significance for pitch is not the same between the two tone groups; the post-focus positions produced a greater mean difference in pitch than the on-focus position for the low tone group. The statistical outcomes of both tone groups indicate that the on-focus positions were not clearly distinguished from the post-focus positions in both tone groups within APs. In the low tone group, pitch and intensity were the cues that produced a confusing prosodic marking of focus within APs. In the high tone group, intensity was such a cue.


2.3.3. SK’s AP Tonal Constraints on Focus Prosody
As can be seen in Figure 7, three positions (initial, medial, final) within an AP are shown on the x-axis and the aggregated mean of pitch (st) in the two focus positions (broad vs. corrective) is shown on the y-axis. Figure 7 shows that prosodic marking of focus in SK is affected by AP tonal patterns. The high tone group in the AP-initial position produced a higher pitch contour in the corrective-focus condition than in the broad-focus condition. Although higher pitch spanned the entire AP as a focus effect in the high tone group, the pitch level in the AP-initial position (94.7 st) was higher than that of the AP-medial (93.3 st) and AP-final positions (93.4 st). In the low tone group, the prosodic marking of focus in the initial position was marginal and the increase in pitch was also stretched to the entire AP position as a focus effect, albeit minimally. What is noteworthy in the low tone group is that the pitch level in the AP-initial position (89.1 st) was significantly lower than that of the AP-medial (91.2 st) and AP-final positions (92.7 st). Note that focus (broad vs. corrective) was not included as a focus effect in the model because our goal was to determine whether pitch values were affected by AP tonal patterns in the same digit string.

Figure 7. AP (Accentual Phrase) tonal patterns with three digits in two focus conditions (broad vs. corrective), separated by tone group. BF is shown here for reference only.

AP position had a significant effect on pitch in the low tone group (χ² = 139.22, df = 2, p < 0.001). The results of the multiple comparison analysis are detailed in Table 2, which only includes the comparisons between AP-initial and AP-medial positions and between AP-initial and AP-final positions. As shown in Table 2, the AP-initial focused position showed a significantly lower pitch value than the AP-medial and AP-final positions (2.12 st and 3.64 st lower than the AP-medial and AP-final positions, respectively) because the AP has a tonal pattern of LHH. This means that prosodic focus cannot deviate from the tonal pattern and it is realized, conforming to a language’s existing prosodic system. In the high tone group, the effect of AP position was also significant for pitch (χ² = 25.65, df = 2, p < 0.001). As shown in Table 2, the AP-initial focused position produced a significantly higher pitch than the subsequent positions within APs (1.28 st and 1.36 st higher than the AP-medial and AP-final positions, respectively). This salient focus prosody by pitch clearly distinguishes the high tone group from the low tone group.

Table 2. Results of the multiple comparison analysis for AP position in the low and high tone groups (Estimate: Coefficient estimates, SE: Standard error).

Tone group   Comparison                 Estimate   SE     z-Value   p-Value
Low tone     AP-initial vs. AP-medial   −2.12      0.24   −8.94     <0.001
Low tone     AP-initial vs. AP-final    −3.64      0.24   −15.37    <0.001
High tone    AP-initial vs. AP-medial   1.28       0.29   4.46      <0.001
High tone    AP-initial vs. AP-final    1.36       0.29   4.73      <0.001

3. Human Perception


3.1. Data Collection
We randomly selected 100 phone number strings produced with corrective focus from the production data of the five speakers (20 strings per speaker). The selected strings were designed to include 10 numerical digits ranging between 0–9 at every string position, and each digit was equally focused in every string position (i.e., 10 focused digits × 10 string positions). This design enabled an equal number of corrected digits to be included for the low and high tone groups. Since only the initial positions of 3-digit groups (i.e., NNN) were included, 20 corrected digits in AP-initial positions were employed as target stimuli.

We set up the experiment using a web-browser (Qualtrics) in order to recruit listeners online with ease of access. Figure 8 shows a screenshot of a part of the survey in Qualtrics. For each question during the test, participants heard one phone number string with one corrected digit, which was produced by one of the five speakers in the production experiment, by pressing a play button. It should be noted that in this experiment, neither fillers with broad focus nor any discourse context was provided. Listeners heard only the answer part produced with corrective focus with no context. This process enabled listeners to rely solely on prosodic cues in recognizing the corrected digits. After listening, participants were asked to select which digit seemed to be corrected among the 10 digits in the phone number string they heard. The digits in the phone number string were shown as a task with ten choices in the order they appeared in the phone number so that the participants could make a selection. In total, 52 native SK speakers participated in the perception experiment (40 females, 12 males; mean age: 24.2, SD: 3.8).

Figure 8. Screenshot of a part of the survey in Qualtrics.

3.2. Analyses
Our basic strategy was to analyze the perception data in a confusion matrix, a table that summarizes classification performance, to gauge the accuracy of each tone group’s identification of each focus position. For statistical analysis, we conducted a binary logistic regression model using the lmerTest package (Kuznetsova et al. 2017) in R (R Core Team 2020). Within the model, tone group was included as a fixed effect, listeners and individual digits as random effects, and identification, coded as 0 (incorrect) or 1 (correct), as a dependent variable. The Anova function of the lmerTest package was used to determine the significance level of the fixed effect.

3.3. Results
Table 3 shows a confusion matrix for the identification of corrected digits in each tone group by the positions within phone number strings. The overall identification rate was 40.0% for the high tone group, but only 23.5% for the low tone group. These results indicate that tone group and prosodic focus interact asymmetrically in SK: The identification rate of focus positions in the high tone group was roughly 1.7 times that of the low tone group. The results of the logistic regression model confirmed that the high tone group had a significantly better identification rate than the low tone group (χ² = 5.07, df = 1, p < 0.05).


Table 3. Confusion matrix of corrective focus perception (percentage values) by position in the digit strings. Numbers highlighted in gray indicate correct identification rates. (Top panel: Low tone group, bottom panel: High tone group).

Target       Perceived position
             1st    2nd    3rd    4th    5th    6th    7th    8th    9th    10th
Low   1st    21.2   20.4   31.5   3.8    1.2    3.8    6.9    1.2    6.2    3.8
Low   4th    6.9    11.5   13.8   25.8   8.1    18.5   8.5    1.5    4.2    1.2
High  1st    35.4   19.2   12.3   12.3   7.3    4.6    5.4    2.3    1.2    0.0
High  4th    5.4    2.7    10.4   44.6   4.6    8.5    12.7   0.8    10.0   0.4

Table 3 shows that incorrect answers in the first digit phrase usually appeared within each phrase for both tone groups. For example, in the low tone group, when position 1 was focused, listeners selected position 2 at a rate of 20.4% and position 3 at a rate of 31.5%. The high tone group also exhibited a similar confusion rate, but to a lesser extent. When position 1 was focused, listeners chose positions 2 and 3 at a rate of 19.2% and 12.3%, respectively. However, the location of incorrect answers in the second digit phrase varied by tone group. When position 4 was focused in the low tone group, listeners selected position 5 at a rate of 8.1% and position 6 at a rate of 18.5%. When position 4 was focused in the high tone group, the confusion did not spread significantly to the following positions within the same phrase (4.6% and 8.5%, respectively). Given that listeners chose position 3 at a rate of 10.4%, the last high tone in the previous digit phrase seemed to confuse the listeners. The identification results shown in Table 3 suggest that the confusion of prosodic modulation hindered the identification of corrected digits in both tone groups.
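The rates quoted above can be recomputed directly from Table 3. The following is a minimal sketch, not the analysis script used in this study: the function names are hypothetical, and averaging the two AP-initial target rows with equal weight assumes that both targets contributed the same number of trials.

```python
# Row percentages transcribed from Table 3 (perceived positions 1st to 10th).
TABLE_3 = {
    ("low", 1): [21.2, 20.4, 31.5, 3.8, 1.2, 3.8, 6.9, 1.2, 6.2, 3.8],
    ("low", 4): [6.9, 11.5, 13.8, 25.8, 8.1, 18.5, 8.5, 1.5, 4.2, 1.2],
    ("high", 1): [35.4, 19.2, 12.3, 12.3, 7.3, 4.6, 5.4, 2.3, 1.2, 0.0],
    ("high", 4): [5.4, 2.7, 10.4, 44.6, 4.6, 8.5, 12.7, 0.8, 10.0, 0.4],
}


def correct_rate(group):
    """Mean of the diagonal cells (perceived == target) for one tone group.

    Assumption: both target positions carry equal weight.
    """
    hits = [row[target - 1] for (g, target), row in TABLE_3.items() if g == group]
    return sum(hits) / len(hits)


def within_phrase_confusion(group, target):
    """Share of responses on the two digits following the target in its 3-digit phrase."""
    row = TABLE_3[(group, target)]
    # Positions are 1-based in the table, so index `target` is the next digit.
    return row[target] + row[target + 1]


if __name__ == "__main__":
    for group in ("low", "high"):
        print(group, "correct:", round(correct_rate(group), 1))
        for target in (1, 4):
            spread = round(within_phrase_confusion(group, target), 1)
            print(group, "target", target, "within-phrase errors:", spread)
```

Under the equal-weight assumption, the diagonal means reproduce the reported identification rates of 23.5% (low) and 40.0% (high), and the within-phrase sums make the asymmetry explicit: more than half of the responses to a focused L-toned first digit fell on the two digits that follow it.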

4. Automatic Classification
Automatic detection of prosodic focus is particularly important in that it can promote human-machine interaction. The results of the production and the perception experiments suggest that the phonetic cues of prosodic focus in SK are weak and confusing. These results raised the question of how a machine would perform in identifying prosodic focus in phone number strings in SK. Given that automatic detection of focus yielded promising results in American English (Cho et al. 2019), which has strong and salient prosodic focus, it was questionable whether a machine would be able to detect prosodic focus well in a language like SK, where phonetic cues are not strong and native speakers also perform poorly.

4.1. Features and Model Training
We used the production data of the corrective condition for three classification tasks. The first task was to identify the position of a focused digit using all training data; the second used only digit strings with H-toned focused digits as training data; and the third used only digit strings with L-toned focused digits (n = 250 digit strings for each tone group). We trained Support Vector Machine classifiers in all tasks. We used duration, mean intensity, and maximum pitch of each digit position from a digit string that we obtained from the production experiment as features for training. Because there were 10 digits in each digit string, the number of features for each target was 30 (= 3 features × 10 digit positions). We standardized all features within each digit string using a z-score scale for better model performance. For example, we grouped all intensity values from one digit string and z-scored the values. This standardization preserved the relative differences between digit positions, yet minimized the problem of having different scales in pitch (st), intensity (dB), and duration (ms), which could hinder effective learning.

Since there were relatively many features (n = 30) compared to the number of tokens, we experimented with feature selection techniques. To avoid collinearity affecting the model performance, we tested correlations among features and dropped the ones that showed a high correlation, varying the cutoff correlation coefficient (r) from 0.5, 0.6, 0.7, 0.8, 0.9 to 1 (no feature dropping). We report the best performance for each classification task after feature selection and hyperparameter tuning in the section below. In all models, we performed leave-one-group-out cross-validation, where one group was defined as all tokens produced by one speaker. All training and testing were done with the scikit-learn package (Pedregosa et al. 2011) in Python.
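The two preprocessing steps described above, per-string standardization and speaker-wise cross-validation, can be sketched without any third-party dependency. This is an illustrative sketch rather than the code used in this study (which relied on scikit-learn): the function names and the toy values are hypothetical, the use of the population standard deviation is an assumption, and the figure of 100 strings per speaker is inferred from the 500 strings and five speakers mentioned in the text.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one cue within a single digit string (population SD assumed)."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)  # flat contour: no relative differences to keep
    return [(v - mu) / sd for v in values]


def feature_vector(durations, intensities, pitches):
    """Concatenate the three z-scored cues: 3 features x 10 positions = 30 values."""
    return zscore(durations) + zscore(intensities) + zscore(pitches)


def leave_one_speaker_out(speakers):
    """Yield (held_out, train_idx, test_idx) with one speaker held out per fold."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield held_out, train, test


if __name__ == "__main__":
    # Made-up values for one 10-digit string (ms, dB, st), for illustration only.
    durations = [210, 180, 190, 260, 185, 195, 200, 180, 190, 230]
    intensities = [68, 66, 65, 70, 66, 65, 67, 65, 64, 63]
    pitches = [89.1, 91.2, 92.7, 94.7, 93.3, 93.4, 90.0, 91.0, 91.5, 90.5]
    print(len(feature_vector(durations, intensities, pitches)))

    # Hypothetical speaker labels: 5 speakers x 100 strings (inferred, see lead-in).
    speakers = [f"S{k}" for k in range(1, 6) for _ in range(100)]
    for held_out, train, test in leave_one_speaker_out(speakers):
        print(held_out, len(train), len(test))
```

Standardizing within a string rather than across the corpus removes speaker-level differences in overall pitch range, loudness, and speech rate, so the classifier sees only which positions stand out relative to their neighbors. Holding out a whole speaker per fold keeps the test tokens unseen at the speaker level.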

4.2. Model Performance
Table 4 shows the confusion matrix of the corrective focus of the AP-initial positions in our models. For all three tasks, the models trained with features selected at r < 0.5 (n = 16) performed best. The best-performing model (C = 2.5, gamma = 0.0625) that was trained with all tone groups showed 65% accuracy for AP-initial positions, which is much higher than that of human perception (about 32%). The accuracy was 64.8% for all positions (macro average precision = 0.66, recall = 0.65, F1-score = 0.65, AUC = 0.94). The model’s performance was very impressive, given that we included only three prosodic features and trained each CV fold with a relatively small number of tokens (400 digit strings).

The accuracy increased when the tone groups were trained separately. The model trained with H-toned focused digits (C = 3, gamma = 0.065) correctly predicted focus positions 86% of the time for the AP-initial positions. The accuracy was 84% for all positions (macro average precision = 0.85, recall = 0.84, F1-score = 0.84, AUC = 0.98). However, the accuracy for the model trained with L-toned focused digits (C = 3, gamma = 0.0625) was 68% for the AP-initial positions, 18 percentage points lower than the H-toned model. The overall accuracy for all positions in the L-toned model was 71.2% (macro average precision = 0.71, recall = 0.71, F1-score = 0.71, AUC = 0.96). Consistent with human perception, the H-toned model had a higher accuracy than the L-toned model.

Table 4. Confusion matrix of corrective focus identification performed by the models. In this table, we only show the accuracies of the AP-initial positions for the sake of simplicity. Numbers highlighted in gray indicate correct identification rates. The top panel shows the results of the model trained with digit strings containing an L-toned focused digit, and the bottom represents the performance of the model trained with an H-toned focused digit. We calculated the prediction rates from the sum of all CVs.

Target       Predicted position
             1st    2nd    3rd    4th    5th    6th    7th    8th    9th    10th
Low   1st    52     0      8      8      8      0      16     0      0      8
Low   4th    0      0      0      84     0      4      8      0      0      4
High  1st    80     8      0      4      0      4      0      0      0      4
High  4th    4      4      0      92     0      0      0      0      0      0
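The AP-initial accuracies reported in Section 4.2 can be checked against Table 4. The sketch below is only an illustration: the function names are hypothetical, and the equal weighting of the two target rows assumes equal numbers of tokens per target position.

```python
# Row percentages transcribed from Table 4 (predicted positions 1st to 10th).
TABLE_4 = {
    ("low", 1): [52, 0, 8, 8, 8, 0, 16, 0, 0, 8],
    ("low", 4): [0, 0, 0, 84, 0, 4, 8, 0, 0, 4],
    ("high", 1): [80, 8, 0, 4, 0, 4, 0, 0, 0, 4],
    ("high", 4): [4, 4, 0, 92, 0, 0, 0, 0, 0, 0],
}


def ap_initial_accuracy(group):
    """Mean of the diagonal cells for one tone group (equal weighting assumed)."""
    hits = [row[target - 1] for (g, target), row in TABLE_4.items() if g == group]
    return sum(hits) / len(hits)


def top_confusion(group, target):
    """Return (position, percent) of the most frequent wrong prediction."""
    row = TABLE_4[(group, target)]
    errors = [(pct, pos) for pos, pct in enumerate(row, start=1) if pos != target]
    # Ties are broken in favor of the earliest position.
    pct, pos = max(errors, key=lambda e: (e[0], -e[1]))
    return pos, pct


if __name__ == "__main__":
    for group in ("low", "high"):
        print(group, ap_initial_accuracy(group), top_confusion(group, 1))
```

The diagonal means give 68% for the L-toned model and 86% for the H-toned model, in line with the text. Unlike the human listeners in Table 3, whose errors clustered on the digits following the target, the L-toned model's most frequent error for a focused first digit was position 7, the initial digit of a later phrase.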

5. Discussion and Conclusions
In this study, we examined the focus prosody of SK using telephone numbers by conducting production and human perception experiments and building machine learning classifiers to test our three working hypotheses: (a) SK’s prosodic marking of focus is weak, (b) prosodic marking of focus is confusing because its focus effect is spread over the entire phrase, (c) a machine classifier has difficulty correctly identifying prosodic focus in SK due to weak and confusing prosodic cues, but a focused H-toned digit is better identified than a focused L-toned digit. The methods employed were effective and provided interesting, novel findings regarding the focus prosody of SK. The production and human perception results were congruent overall, and the first two working hypotheses were both verified by the two experiments. However, we found a striking difference between human and machine in the detection of prosodic focus; the machine’s performance exceeded the human’s, partly rejecting our third hypothesis. What follows is a brief summary of the results and an in-depth discussion of key findings.

The results of the production data demonstrated that the low and high tone groups exhibited similarities and differences in marking prosodic focus. In the focus positions for the low tone group, duration, intensity, and pitch played important roles, but pitch was an important cue in the high tone group. However, the two tone groups produced a confusing prosodic marking of focus because prominence mainly by intensity and pitch spanned the entire phrase as a focus effect. Furthermore, while the low tone group was subject to a greater degree of confusion, both tone groups were affected by SK’s basic tonal melodies within an AP.

One of the key findings of this work was that prosodic marking of focus varied in different pitch-scaling conditions.
As discussed earlier in the Introduction, higher pitches can have a greater impact than lower pitches in terms of F0 ratios, based on the formula F0 = Int × B + B. In the current study, the high tone group induced higher pitches in marking prosodic focus, whereas the low tone group induced relatively lower pitches. The higher pitches showed an increase of 1.61 st in marking prosodic focus, whereas the lower pitches showed an increase of only 0.59 st. We posit that the greater increase in pitch is one of the main factors that enable listeners to better identify focus in the high tone group. Furthermore, the claim by Rietveld and Gussenhoven (1985) that a difference of less than 1.5 st is not sufficient to recognize a change in prominence in human perception was confirmed in this study. Accordingly, our work affirmed that higher pitches mark prosodic focus more successfully than lower pitches.

Another key finding was that the focus prosody of SK was found to be weak and confusing for both tone groups in production, leading to a poor identification rate in human perception. Reasons for this weak and confusing prosodic marking of focus are discussed below. As discussed earlier, Jun (2005) distinguished two types of language: Head-prominence and edge-prominence languages. In English, which is a head-prominence language, prominence of a word is marked by a post-lexical pitch accent on the head (that is, a stressed syllable) of the word. A focused word is thus realized by expressing the stressed syllable with longer duration, greater intensity, and higher pitch. In contrast, SK, an edge-prominence language, does not have such a head (no lexical stress, lexical tone, or lexical pitch accent) within a word; no specific syllable carries prominence for prosodic marking of focus. This prosodic characteristic results in a weak prosodic marking of focus, which leads to the question as to why prosodic marking of focus was confusing.
Prominence of a word in SK is marked by initiating a new prosodic unit before the focused word, and the entire AP becomes prominent by raising the H-toned syllables in its tonal pattern. In this study, we found a strong constraint to group the 10 phone numbers into three or four prosodic units. Although a single digit in the phrase-initial position was narrowly focused, the digit did not itself form one separate AP; instead, the whole prosodic unit containing the focused digit became prominent as a focus effect. The prominence spanning the entire AP yields a confusing prosodic marking of focus.

Based on the human perception data, an obvious question to ask is why the high tone group was perceived better than the low tone group in the perception of corrective focus. Three answers are possible. First, as stated above, the high tone group produced much higher pitch than the low tone group, which effectively marked prosodic focus. Second, in the low tone group, the acoustic cues of prominence were contradictory between duration and pitch; that is, the focused L-toned digit was longer in duration, but lower in pitch than the subsequent “unfocused” digits, and these conflicting acoustic cues of prominence were confusing to the listeners. In contrast, the focused H-toned digit was longer in duration and higher in pitch than the subsequent digit, which enhanced the prominence of the focused digit. Third, unlike Mandarin Chinese (Lee et al. 2016) and English (Beckman and Pierrehumbert 1986), where a low tone syllable can be salient by lowering its pitch target, the pitch target of the L-toned digit in SK was not lowered in marking prosodic focus.

Considering that prosodic marking of focus was ineffective in SK, our next question is whether the current results can be generalized to a broader range of languages.
According to the prosodic typology presented in Jun (2005, 2014), SK is an edge-prominence language; that is, it does not have any head on the word or phrase, and prominence is realized only by the edge of a word or phrase. Therefore, focus prosody of telephone numbers was indeed confusing. We expect any edge-prominence language to behave like SK. In contrast, a head-prominence language, where prominence is marked by the head of a word or phrase, produces a clear prosodic marking of focus (see Lee (2015) for more information). In addition, in head/edge-prominence languages (e.g., Tokyo Japanese, South Kyungsang Korean), prominence is marked by both the head and edge of a word or phrase. Recent studies of Tokyo Japanese (Lee et al. 2018) and South Kyungsang Korean (Lee et al. 2019) demonstrated that the manner in which prosodic focus is encoded in this language group is confusing because this language group also marks prominence at the edge of a word or phrase. Similar future studies of more diverse language groups will provide more clarity to understanding focus prosody in particular and prosodic typology in general.

What is striking about the results is that the identification rate of our model was remarkably high relative to that of human perception. As stated above, the on-focus positions of the two tone groups were not more salient than the post-focus positions, leading to native listeners’ poor identification in perception. The high identification rate of the machine performance therefore remains particularly surprising because corrective focus was neither clearly marked in production nor accurately recognized by human perception. This makes an interesting contrast to American English, where native speakers’ perception was about 97%, whereas the machine performance was 92% (Cho et al. 2019).
Our initial speculation was that Korean listeners who are already familiar with Korean intonational patterns for focus prosody were indeed distracted by the two tone groups’ confusing prosodic marking of focus. For example, as stated above, the focused L-tone digit was longer in duration, but lower in pitch than the following digit in our production, and these conflicting acoustic prominence cues may have confused listeners. In contrast, our model, which is naïve to the intonational system of Korean, seemed to find the optimal cues of corrective focus by analyzing prosodic features without being misled as much as native listeners. It then identified the most plausible position within a 10-digit string to contain such focus cues, resulting in a higher identification rate than human perception. Since the discrepancy in the recognition of prosodic focus between human and machine was observed in only a single study, more languages should be analyzed to determine the universality of this case.
Although the method employed in this study effectively tested our goals, it is not clear whether our findings obtained from digit strings can be generalized to regular SK sentences. We expect prosodic marking of focus would be more effective for regular sentences as compared to telephone numbers. Since each regular word tends to form an AP by itself, whereas digits in a phone number string tend to group together to form an AP, a focused word in regular sentences is free from tonal constraints at the AP level. For example, as shown in Figure 2, the focused word minsu-ga begins with a low-tone inducing segment. Because the entire focused word instead of a single syllable becomes prominent at the AP level, its prosodic marking of focus will not be confusing, contrary to the AP-initial focused digit in a digit string.
Therefore, prosodic marking of focus in regular sentences will behave differently and will be more effective than prosodic marking of focus in telephone numbers. On the other hand, if focus falls on the initial syllable in kim.tɕʌŋ.guk-ɯl (to correct the family name of a person, for example), the focused word in regular sentences will also be bound by tonal constraints at the AP level,³ and prosodic marking of focus would not be effective for this case compared to telephone numbers. Additional research is, therefore, necessary to compare telephone numbers to regular sentences, taking into account two cases: first, where focus is placed on the entire word like minsu-ga and, second, where it is placed on the initial syllable as in kim.tɕʌŋ.guk-ɯl.
Furthermore, more speakers must be recruited for future research in order to ensure more robust data and validate the results of the present experiment. Finally, we should create a classifier to detect prosodic focus in other languages to compare the perception performance between human and machine and improve human-machine interactions.

In summary, the results of this study revealed both similarities and differences between the two tone groups in marking prosodic focus. The similarities included: (i) The low and high tone groups exhibited a degree of confusion within a phrase as a focus effect, and (ii) the two tone groups were constrained by AP tonal melodies. The differences included the importance of duration, intensity, and pitch in the low tone group, in contrast to the importance of pitch only in the high tone group.
Moreover, the focus prosody of the two tone groups was not accurately recognized in human perception, but the identification rate was approximately twice as high for the high tone group as it was for the low tone group. In machine perception, misidentifications were also observed more often in the low tone group than in the high tone group. When comparing human and machine performance, the machine’s performance was found to be significantly superior to the human’s, contrary to our expectation. In this study, although the focus prosody of telephone numbers was weak and confusing, we observed a high identification rate of prosodic focus under certain circumstances.
This study generated empirical evidence that prosodic marking of focus can vary according to the specific prosodic system of a language.

Author Contributions: Conceptualization, Y.-c.L. and S.C.; methodology, Y.-c.L. and S.C.; validation, Y.-c.L. and S.C.; formal analysis, Y.-c.L. and S.C.; investigation, Y.-c.L.; resources, Y.-c.L.; writing – original draft preparation, Y.-c.L.; writing – review and editing, Y.-c.L. and S.C.; visualization, Y.-c.L. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Acknowledgments: We gratefully acknowledge Mark Liberman and Sun-Ah Jun for their expert advice on this study’s earlier version. We also thank the three anonymous reviewers for their valuable comments that helped improve this manuscript’s quality.

Conflicts of Interest: The authors declare no conflict of interest.

References

Beckman, Mary E., and Janet B. Pierrehumbert. 1986. Intonational structure in Japanese and English. Phonology Yearbook 3: 255–309.
Bolinger, Dwight. 1972. Accent is predictable (if you’re a mind-reader). Language 48: 633–44.
Cho, Sunghye, and Yong-cheol Lee. 2016. The effect of the consonant-induced pitch on Seoul Korean . Linguistic Research 33: 299–317.
Cho, Sunghye, Mark Liberman, and Yong-cheol Lee. 2019. Automatic detection of prosodic focus in American English. Proceedings of Interspeech 2019: 3470–74.
Cho, Sunghye. 2017. Development of Pitch Contrast and Seoul Korean Intonation. Ph.D. dissertation, University of Pennsylvania, Philadelphia, PA, USA.
Cho, Sunghye. 2018. The production and perception of High-toned [il] by young speakers of Seoul Korean. Linguistic Research 35: 533–65.
Cohan, J. Ballantyne. 2000. The Realization and Function of Focus in Spoken English. Ph.D. dissertation, The University of Texas at Austin, Austin, TX, USA.
Hair, Joseph, Rolph Anderson, Barry Babin, and William Black. 2010. Multivariate Data Analysis: A Global Perspective. Boston: Pearson.

³ We would like to thank the reviewer for bringing this issue to our attention.

Hothorn, Torsten, Frank Bretz, and Peter Westfall. 2008. Simultaneous inference in general parametric models. Biometrical Journal 50: 346–63.
Jeon, Hae-sung, and Francis Nolan. 2017. Prosodic marking of narrow focus in Seoul Korean. Laboratory Phonology 8: 1–30.
Jo, Jung-Min, Seok-Keun Kang, and Taejin Yoon. 2006. A rendezvous of focus and topic in Korean: Morpho-syntactic, semantic, and acoustic evidence. The Linguistics Association of Korean Journal 14: 167–96.
Jun, Sun-Ah, and Hee-Sun Kim. 2007. VP focus and narrow focus in Korean. Paper presented at the 16th International Congress of Phonetic Sciences, Saarbrücken, Germany, August 6–10; pp. 1277–80.
Jun, Sun-Ah, and Hyuck-Joon Lee. 1998. Phonetic and phonological markers of contrastive focus in Korean. Paper presented at the 5th International Conference on Spoken Language Processing, Sydney, Australia, November 30–December 4; pp. 1295–98.
Jun, Sun-Ah, and Jihyeon Cha. 2011. High-toned [il] in Seoul Korean Intonation. Paper presented at the 17th International Congress of Phonetic Sciences, Hong Kong, China, August 17–21; pp. 990–93.
Jun, Sun-Ah, and Jihyeon Cha. 2015. High-toned [il] in Korean: Phonetics, intonational phonology, and sound change. Journal of Phonetics 51: 93–108.
Jun, Sun-Ah. 1998. The Accentual Phrase in the Korean prosodic hierarchy. Phonology 15: 189–226.
Jun, Sun-Ah. 2005. Korean intonational phonology and prosodic transcription. In Prosodic Typology: The Phonology of Intonation and Phrasing. Edited by Sun-Ah Jun. New York: Oxford University Press, pp. 201–29.
Jun, Sun-Ah. 2011. Prosodic markings of complex NP focus, syntax, and the pre-/post-focus string. Paper presented at the 28th West Coast Conference on Formal Linguistics (WCCFL 28), Los Angeles, CA, USA, February 12–14; pp. 214–30.
Jun, Sun-Ah. 2014. Prosodic typology: By prominence type, word prosody, and macro-. In Prosodic Typology II: The Phonology of Intonation and Phrasing. Edited by Sun-Ah Jun. Oxford: Oxford University Press, pp. 520–40.
Kuznetsova, Alexandra, Per B. Brockhoff, and Rune H. B. Christensen. 2017. lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software 82: 1–26.
Ladd, D. Robert. 1996. Intonational Phonology. Cambridge: Cambridge University Press.
Lee, Yong-cheol, and Yi Xu. 2010. Phonetic realization of contrastive focus in Korean. Paper presented at Speech Prosody 2010, Chicago, IL, USA, May 10–14; p. 100033.
Lee, Yong-cheol, Dongyoung Kim, and Sunghye Cho. 2019. The effect of prosodic focus varies by phrasal tones: The case of South Kyungsang Korean. Linguistics Vanguard 5: 20190010.
Lee, Yong-cheol, Satoshi Nambu, and Sunghye Cho. 2018. Focus prosody of telephone numbers in Tokyo Japanese. Journal of the Acoustical Society of America 143: EL340–46.
Lee, Yong-cheol, Ting Wang, and Mark Liberman. 2016. Production and perception of tone 3 focus in Mandarin Chinese. Frontiers in Psychology 7: 1058.
Lee, Yong-cheol. 2009. The Phonetic Realization of Contrastive Focus and its Neighbors in Korean and English: A Cross-Language Study. Master’s thesis, Hannam University, Daejeon, Korea.
Lee, Yong-cheol. 2012. Prosodic correlation between the focusing adverb ozik ‘only’ and focus/givenness in Korean. Journal of Speech Sciences 2: 85–111.
Lee, Yong-cheol. 2015. Prosodic Focus within and across Languages. Ph.D. dissertation, University of Pennsylvania, Philadelphia, PA, USA.
Liberman, Mark, and Janet Pierrehumbert. 1984. Intonational invariance under changes in pitch range and length. In Language Sound Structure. Edited by Mark Aronoff and Richard Oehrle. Cambridge: MIT Press, pp. 157–233.
Liu, Fang. 2009. Intonation Systems of Mandarin and English: A Functional Approach. Ph.D. dissertation, University of Chicago, Chicago, IL, USA.
Nolan, Francis. 2003. Intonational equivalence: An experimental evaluation of pitch scales. Paper presented at the 15th International Congress of Phonetic Sciences, Barcelona, Spain, August 3–9; pp. 771–74.
Oh, Miran, and Dani Byrd. 2019. Syllable-internal corrective focus in Korean. Journal of Phonetics 77: 100933.
Pedregosa, Fabian, Gael Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12: 2825–30.
R Core Team. 2020. R: A Language and Environment for Statistical Computing. Version 4.0.2. Available online: http://www.r-project.org (accessed on 1 July 2020).
Rietveld, Toni, and Carlos Gussenhoven. 1985. On the relation between pitch excursion size and prominence. Journal of Phonetics 13: 299–308.
Song, Jae Jung. 2005. The Korean Language: Structure, Use and Context. Abingdon: Routledge.
Ueyama, Motoko, and Sun-Ah Jun. 1998. Focus realization in Japanese English and Korean English intonation. Japanese and Korean Linguistics 7: 629–45.
Xu, Yi, and Ching X. Xu. 2005. Phonetic realization of focus in English declarative intonation. Journal of Phonetics 33: 159–97.
Xu, Yi, and Maolin Wang. 2009. Organizing syllables into groups: Evidence from F0 and duration patterns in Mandarin. Journal of Phonetics 37: 502–20.
Xu, Yi. 2013. ProsodyPro: A Tool for large-scale systematic prosody analysis. Paper presented at Tools and Resources for the Analysis of Speech Prosody, Aix-en-Provence, France, August 30; pp. 7–10.
Yuan, Jiahong. 2004. Intonation in Mandarin Chinese: Acoustics, Perception, and Computational Modeling. Ph.D. dissertation, Cornell University, Ithaca, NY, USA.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).