When Being Unseen from Mbert Is Just the Beginning: Handling New Languages with Multilingual Language Models

Total Page:16

File Type:pdf, Size:1020Kb

When Being Unseen from Mbert Is Just the Beginning: Handling New Languages with Multilingual Language Models When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models Benjamin Mullery Antonis Anastasopoulosz Benoît Sagoty Djamé Seddahy yInria, Paris, France zDepartment of Computer Science, George Mason University, USA [email protected] [email protected] Abstract East) or Bambara (spoken by around 5 million peo- ple in Mali and neighboring countries) are not cov- Transfer learning based on pretraining lan- ered by any available language models. guage models on a large amount of raw data Even if training multilingual models that cover has become a new norm to reach state-of-the- art performance in NLP. Still, it remains un- more languages and language varieties is tempt- clear how this approach should be applied for ing, the curse of multilinguality described by Con- unseen languages that are not covered by any neau et al.(2020) makes it an impractical solu- available large-scale multilingual language tion, as it would require to train ever larger mod- model and for which only a small amount els. Furthermore, as shown by Wu and Dredze of raw data is generally available. In this (2020), large-scale multilingual language models work, by comparing multilingual and mono- reach sub-optimal performance for languages that lingual models, we show that such models only account for a small portion of the pretraining behave in multiple ways on unseen languages. Some languages greatly benefit from trans- data. fer learning and behave similarly to closely In this paper, we describe and analyze task related high resource languages whereas oth- and language adaptation experiments to get usable ers apparently do not. Focusing on the lat- language model-based representations for under- ter, we show that this failure to transfer is studied low resource languages. We run exper- largely related to the impact of the script used iments on 16 typologically diverse unseen lan- to write such languages. Transliterating those languages improves very significantly the abil- guages on three NLP tasks with different char- ity of large-scale multilingual language mod- acteristics: part-of-speech (POS) tagging, depen- els on downstream tasks. dency parsing (DEP) and named entity recognition (NER). Our results bring forth a great diversity of be- 1 Introduction haviors that we classify in three categories re- Language models are now a new standard to flecting the abilities of pretrained multilingual lan- build state-of-the-art Natural Language Processing guage models to be used for low-resource lan- (NLP) systems. In the past year, monolingual lan- guages. Some languages, the “Easy” ones, largely guage models have been released for more than 20 behave like high resource languages. Fine-tuning languages including Arabic, French, German, Ital- large-scale multilingual language models in a task- ian, Polish, Russian, Spanish, Swedish, and Viet- specific way leads to state-of-the-art performance. namese (Antoun et al., 2020; Martin et al., 2020b; The “Intermediate” languages are harder to pro- de Vries et al., 2019; Cañete et al., 2020; Kura- cess as large-scale multilingual language models tov and Arkhipov, 2019; Schweter, 2020, et alia). lead to sub-optimal performance as such. How- Additionally, large-scale multilingual models cov- ever, adapting them using unsupervised fine-tuning ering more than 100 languages are now available on available raw data in the target language leads (XLM-R by Conneau et al.(2020) and mBERT to a significant boost in performance, reaching or by Devlin et al.(2019)). Still, most of the 7000+ extending the state of the art. Finally, the “Hard” spoken languages in the world are not covered— languages are those for which large-scale multi- remaining unseen—by those models. Even lan- lingual models fail to provide decent downstream guages with millions of native speakers like Sorani performance even after unsupervised adaptation. Kurdish (about 7 million speakers in the Middle “Hard" languages include both stable and en- dangered languages, but they predominantly are 2020) enabled transfer from high to low-resource languages of communities that are majorly under- languages, leading to significant improvements in served by modern NLP. Hence, we direct our at- downstream task performance (Rahimi et al., 2019; tention to these “Hard" languages. For those lan- Kondratyuk and Straka, 2019). Furthermore, in guages, we show that the script they are written their most recent forms, multilingual models, such in is a critical element in the transfer abilities as mBERT, process tokens at the sub-word level us- of pretrained multilingual language models. We ing SentencePiece tokenization (Kudo and Richard- show that transliterating them into the script of a son, 2018). This means that they work in an open possibly-related high resource language leads to vocabulary setting.1 This flexibility enables such large gains in performance leading to outperform- models to process any language, even those that ing non-contextual strong baselines. are not part of their pretraining data. To sum up, our main contributions are the fol- When it comes to low-resource languages, one lowing: direction is simply to train such contextualized • Based on our empirical results, we propose a embedding models on whatever data is available. new categorization of low-resource languages Another option is to adapt/finetune a multilingual that are currently not covered by any available pretrained model to the language of interest. We language models: the Hard, the Intermediate briefly discuss these two options. and the Easy languages. • We show that Hard languages can be better Pretraining language models on a small addressed by transliterating them into a better- amount of raw data Even though the amount handled script (typically Latin), providing a of pre-training data seems to correlate with down- promising direction for rendering multilingual stream task performance (e.g. compare BERT and language models useful for a new set of unseen RoBERTa), several attempts have shown that train- languages. ing a new model from scratch can be efficient even if the amount of data in that language is limited. 2 Background and Motivations Indeed, Suárez et al.(2020) showed that pretrain- ing ELMo models (Peters et al., 2018) on less than As Joshi et al.(2020) vividly illustrate, there is 1GB of Wikipedia text leads to state-of-the-art per- a great divergence in the coverage of languages formance while Martin et al.(2020a) showed for by NLP technologies. The majority of the 7000+ French that pretraining a BERT model on as few of the world’s languages are not studied by the as 4GB of diverse enough data results in state-of- NLP community. Some languages have very few the-art performance. This was further confirmed by or no annotated datasets, making the development Micheli et al.(2020) who demonstrated that decent of systems challenging. performance was achievable with as low as 100BM The development of such models is a matter of of raw text data. first importance for the inclusion of communities, the preservation of endangered languages and more Adapting large-scale models for low-resource generally to support the rise of tailored NLP ecosys- languages Multilingual language models can be tems for such languages (Schmidt and Wiegand, used directly on unseen languages, or they can 2017; Stecklow, 2018; Seddah et al., 2020) In that also be adapted using unsupervised methods. For regard, the advent of the Universal Dependency example, Han and Eisenstein(2019) successfully project (Nivre et al., 2016) and the WikiAnn dataset used unsupervised model adaptation of the English (Pan et al., 2017) have greatly opened the num- BERT model to Early Modern English for sequence ber of covered languages by providing annotated labeling. Instead of finetuning the whole model, datasets for respectively 90 languages for depen- Pfeiffer et al.(2020) recently showed that adapter dency parsing and 282 languages for Named Entity layers (Houlsby et al., 2019) can be injected into Recognition. multilingual language models to provide parameter efficient task and language transfer. Regarding modeling approaches, the emergence of multilingual representation models, first with Still, as of today, the availability of monolin- static word embeddings (Ammar et al., 2016) and gual or multilingual language models is limited to then with language model-based contextual rep- 1As long as the input text is written in a script that is used resentations (Devlin et al., 2019; Conneau et al., in the training languages. approximately 120 languages, leaving many lan- Language (iso) Script Family #sents source guages without access to valuable NLP technology, Bambara (bm) Latin Niger-Congo 1k OSCAR Wolof (wo) Latin Niger-Congo 10k OSCAR although some are spoken by millions of people, Swiss* (gsw) Latin West Germanic 250k OSCAR including Bambara, Maltese and Sorani Kurdish. Naija (pcm) Latin Pidgin (En) 237k Other Faroese (fao) Latin North Germanic 297k Leipzig What can be done for unseen languages? Un- Maltese (mlt) Latin Semitic 50k OSCAR Narabizi (nrz) Latin Semitic** 87k Other seen languages vary greatly in the amount of avail- Sorani (ckb) Arabic Indo-Iranian 380k OSCAR able data, in their script (many languages use non- Uyghur (ug) Arabic Turkic 105k OSCAR Sindhi (sd) Arabic Indo-Aryan 375k OSCAR Latin scripts such as Sorani Kurdish and Mingre- Mingrelian (xmf) Georg. Kartvelian 29k Wiki lian), and in their morphological or syntactical Buryat (bxu) Cyrillic Mongolic 7k Wiki Mari (mhr) Cyrillic
Recommended publications
  • Pulley / Descender / Belay Device
    Multi-Purpose Device 11 mm EN FR NO SE Model No. 333010-CE Pulley / Descender / Belay Device EN 12278:2007 EN 341:2011/2A 1019 EN 12841:2006/C Patented WARNING Activities involving the use of this device are potentially dangerous. You are responsible for your own actions and decisions. Before using this device, you must: • Read and understand these user instructions and • Familiarize yourself with its capabilities warnings; and limitations; • Get specific training in its proper use; • Understand and accept the risks involved. FAILURE TO HEED ANY OF THESE WARNINGS MAY RESULT IN SEVERE INJURY OR DEATH. 0 Traceability and Markings D E B F A G C 1019 A. Body controlling production of this PPE E. Individual........................... number No. 1019 00 000 M 0000 VVUÚ, a.s. Unit serial number Pikartská 1337/7 716 07 Ostrava - Radvanice Control Czech Republic Day of manufacture Year of manufacture B. Standards F. Anchor/load end of rope C. Carefully read the instructions for use G. Free end of rope D. Model identification 2 1 Field of Application 4 Inspection, Points to Verify See Text See Text 2 Breaking Strength 5 Compatibility 44 kN O 11 mm (EN) Rope (core + sheath) static, semi-static (EN 1891) type A 22 kN 22 kN 3 Nomenclature of Parts See Text 2 7 8 1 6 10 9 3 5 4 3 6 Installing the Rope LOAD SIDE 2 4 1 3 7 Function Test 8 Securing Function Test STOP! 4 9a EN 341:2011/2A Rescue Descender Lowering From Anchor— Maximum Descent Height: 200 m One Person Minimum/Maximum Working Load: 30–240 kg Maximum Descent Rate: 2 m/s Descent—Two People
    [Show full text]
  • Da Heng Building
    fabric&glass Sefar AG Hinterbissaustrasse 12 9410 Heiden Switzerland Da Heng Building Phone +41 71 898 51 04 Taichung (TW) [email protected] www.sefararchitecture.com [FactBox] Within a spacious green park is the Da Additionally, independent studies have Heng building, which was finished in 2016. also shown that the risk of bird strikes Project/Location The company chairman specifically chose on glass facades with integrated SEFAR® Da Heng, Taichung (TW) this fabric for its progressive appearance Architecture VISION Fabric is significantly www.yojicon.com.tw/list.php and unique attributes. SEFAR® Architecture reduced. The special finish using Sentry- Architect VISION PR 260/50 met. grey fabric was Glas® Foil for facade windows meets the C.Y.Lee & Partners Architects/Planners, Taipei (TW) laminated in the lower curtain wall to re- very highest safety standards. www.cylee.com/team1_tw.php duce glare and provide priavacy at the street The glass lites are fixed on four sides in Designer level for the buidlings occupants. an attached profile. The lites are 1.60 m x + Ray Chen International, Taipei (TW) SEFAR® Architecture VISION is a range of 3 and 4 m high for a total of over 1000 sqm, www.raychen-intl.com.tw high-precision fabrics made from synthet- using 10 mm low iron glass. Glass Manufacturer ic black fibers. In a complex process, fab- Stanley Glass Co. Ltd, Keelung (TW) ric is metal-coated on one side. By means www.stanleyglass.com of digital printing, the coated fabric sides Foil can be further individualized. The reflec- SentryGlas® supplied by Kuraray/DuPont tive properties reduces heat input consid- www.sentryglas.com erably and makes a correspondingly valu- Fabric able contribution to the environment and SEFAR® Architecture VISION PR 260/50 met.
    [Show full text]
  • Reply to Heng: Inborn Aneuploidy and Chromosomal Instability
    LETTER LETTER Reply to Heng: Inborn aneuploidy and chromosomal instability It is with great interest that we read the letter chromosome gains (3). Furthermore, our re- Heng for bringing up the interesting topic of by Heng (1), who kindly brought up the port included cells from patients with con- NCCAs in cancer, but we still feel that the interesting topic of nonclonal-chromosome stitutional trisomy 8, a type of aneuploidy title of our report is justified by its results. aberrations (NCCAs) in the context of our commoninhematologicalaswellassolid recent report on the relationship between neoplasms. We also included cells with Anders Valind1 and David Gisselsson inborn aneuploidy and chromosome insta- multiple trisomies. Neither trisomy 8 nor Department of Clinical Genetics, Lund bility (CIN). We fully agree with Heng that such double trisomies are compatible with University, Lund 22184, Sweden NCCAs are better proxies for CIN than live birth. clonal-chromosome aberrations (CCAs). In In summary, we showed that none of fact, we used the frequency of NCCA as a fairly large spectrum of whole chromosome 1 Heng HH (2014) Distinguishing constitutional and acquired proxy to CIN in our study (2). We also agree gains could automatically destabilize the nonclonal aneuploidy. Proc Natl Acad Sci USA 111:E972. 2 Valind A, Jin Y, Baldetorp B, Gisselsson D (2013) Whole that it might well be that aneuploidy causes chromosome number when present in be- chromosome gain does not in itself confer cancer-like chromosomal other forms of genomic instability, but our nign human cells. Our findings thereby infer instability. Proc Natl Acad Sci USA 110(52):21119–21123.
    [Show full text]
  • Unicode Alphabets for L ATEX
    Unicode Alphabets for LATEX Specimen Mikkel Eide Eriksen March 11, 2020 2 Contents MUFI 5 SIL 21 TITUS 29 UNZ 117 3 4 CONTENTS MUFI Using the font PalemonasMUFI(0) from http://mufi.info/. Code MUFI Point Glyph Entity Name Unicode Name E262 � OEligogon LATIN CAPITAL LIGATURE OE WITH OGONEK E268 � Pdblac LATIN CAPITAL LETTER P WITH DOUBLE ACUTE E34E � Vvertline LATIN CAPITAL LETTER V WITH VERTICAL LINE ABOVE E662 � oeligogon LATIN SMALL LIGATURE OE WITH OGONEK E668 � pdblac LATIN SMALL LETTER P WITH DOUBLE ACUTE E74F � vvertline LATIN SMALL LETTER V WITH VERTICAL LINE ABOVE E8A1 � idblstrok LATIN SMALL LETTER I WITH TWO STROKES E8A2 � jdblstrok LATIN SMALL LETTER J WITH TWO STROKES E8A3 � autem LATIN ABBREVIATION SIGN AUTEM E8BB � vslashura LATIN SMALL LETTER V WITH SHORT SLASH ABOVE RIGHT E8BC � vslashuradbl LATIN SMALL LETTER V WITH TWO SHORT SLASHES ABOVE RIGHT E8C1 � thornrarmlig LATIN SMALL LETTER THORN LIGATED WITH ARM OF LATIN SMALL LETTER R E8C2 � Hrarmlig LATIN CAPITAL LETTER H LIGATED WITH ARM OF LATIN SMALL LETTER R E8C3 � hrarmlig LATIN SMALL LETTER H LIGATED WITH ARM OF LATIN SMALL LETTER R E8C5 � krarmlig LATIN SMALL LETTER K LIGATED WITH ARM OF LATIN SMALL LETTER R E8C6 UU UUlig LATIN CAPITAL LIGATURE UU E8C7 uu uulig LATIN SMALL LIGATURE UU E8C8 UE UElig LATIN CAPITAL LIGATURE UE E8C9 ue uelig LATIN SMALL LIGATURE UE E8CE � xslashlradbl LATIN SMALL LETTER X WITH TWO SHORT SLASHES BELOW RIGHT E8D1 æ̊ aeligring LATIN SMALL LETTER AE WITH RING ABOVE E8D3 ǽ̨ aeligogonacute LATIN SMALL LETTER AE WITH OGONEK AND ACUTE 5 6 CONTENTS
    [Show full text]
  • Heng Xian and the Problem of Studying Looted Artifacts
    Dao (2013) 12:153–160 DOI 10.1007/s11712-013-9323-4 Heng Xian and the Problem of Studying Looted Artifacts Paul R. Goldin Published online: 10 April 2013 # Springer Science+Business Media Dordrecht 2013 Abstract Heng Xian is a previously unknown text reconstructed by Chinese scholars out of a group of more than 1,200 inscribed bamboo strips purchased by the Shanghai Museum on the Hong Kong antiquities market in 1994. The strips have all been assigned an approximate date of 300 B.C.E., and Heng Xian allegedly consists of thirteen of them, but each proposed arrangement of the strips is marred by unlikely textual transitions. The most plausible hypothesis is one that Chinese scholars do not appear to take seriously: that we are missing one or more strips. The paper concludes with a discussion of the hazards of studying unprovenanced artifacts that have appeared during China’s recent looting spree. I believe the time has come for scholars to ask themselves whether their work indirectly abets this destruction of knowledge. Keywords Heng Xian . Chinese philosophy . Shanghai Museum . Looting Heng Xian 恆先 (In the Primordial State of Constancy) is a previously unknown text reconstructed by Chinese scholars out of a group of more than 1,200 inscribed bamboo strips purchased by the Shanghai Museum on the Hong Kong antiquities market in 1994 (MA Chengyuan 2001: 1). The strips have all been assigned an approximate date of 300 B.C.E., and Heng Xian consists of thirteen of them. The first published version was edited by the veteran palaeographer LI Ling 李零 (LI Ling 2003).
    [Show full text]
  • Highway Dharma Letters: Two Buddhist Pilgrims Write to Their Teacher
    Highway Dharma Letters: Two Buddhist Pilgrims Write to Their Teacher The Second Edition of News from True Cultivators by Heng Sure and Heng Chau With a Foreword by Huston Smith Foreword to the Second Edition I have written forewords to more than forty-five books, but I say with- out hesitation that I have never been as honored—and humbled—as I am to have been invited to write this one. This is the most neglected book of the twentieth century—I say this categorically and with com- plete confidence. Come to think of it, though, this stands to reason, for humility requires that authentic spirituality and public relations stand in inverse ration to each other. (“When you pray, go into your closet.”) Humility shows itself also in the fact that the “News” in the title takes the form of daily letters written to the Venerable Master Hsuan Hua, by his two disciples who made the eight-hundred-mile journey from Gold Wheel Temple in South Pasadena to the City of Ten Thousand Buddhas near Ukiah. They covered the distance in the traditional penitential way: three steps followed by a full prostration. One took a vow of complete silence for the duration of the pilgrimage and three years beyond; the other managed practical affairs, did all the talking, driving, cooking, and dealing with occasionally hostile visitors, while still finding time to bow. Having said that this is the most overlooked book of the twenti- eth century, I want only to add that I find it one of the most inspir- ing.
    [Show full text]
  • Quantum Integrable Systems from Conformal Blocks Quantum
    Quantum Integrable Systems from Conformal Blocks (arXiv: 1605.05105 with Joshua D. Qualls) Heng-Yu Chen National Taiwan University Strings 2016 @ Beijing Heng-Yu Chen National Taiwan University Quantum Integrable Systems from Conformal Blocks This talk attempts to discuss the following questions: I Are there more systematic or even more efficient ways to obtain conformal blocks in various dimensions and set up? I Can one extend the correspondence between the degenerate correlation functions and quantum integrable systems in d = 2 dim. CFTs to d > 2 dim. CFTs? I If so, how general such a correspondence is? Heng-Yu Chen National Taiwan University Quantum Integrable Systems from Conformal Blocks Let us begin with the four point function of scalar conformal primary operator φi (x) with scale dimension ∆i in d-dim. CFTs: 2 a 2 b x14 x14 F (u; v) < φ1(x1)φ2(x2)φ3(x3)φ4(x4) >= x 2 x 2 2 (∆1+∆2) 2 (∆3+∆4) 24 13 (x12) 2 (x34) 2 (1) ∆2−∆1 ∆3−∆4 where xij = xi − xj a = 2 , b = 2 and F (u; v) is a function of conformally invariant cross ratios: 2 2 2 2 x12x34 x14x23 u = 2 2 = zz¯; v = 2 2 = (1 − z)(1 − z¯): (2) x13x24 x13x24 Decomposing the four point function further into contributions from the individual exchanged primary operators O∆;l and its descendants: X < φ1(x1)φ2(x2)φ3(x3)φ4(x4) >= λ12O∆;l λ34O∆;l WO∆;l (xi ) (3) fO∆;l g WO∆;l (xi ) is \conformal partial wave" and λijO∆;l are \OPE coefficients.".
    [Show full text]
  • 1 Symbols (2286)
    1 Symbols (2286) USV Symbol Macro(s) Description 0009 \textHT <control> 000A \textLF <control> 000D \textCR <control> 0022 ” \textquotedbl QUOTATION MARK 0023 # \texthash NUMBER SIGN \textnumbersign 0024 $ \textdollar DOLLAR SIGN 0025 % \textpercent PERCENT SIGN 0026 & \textampersand AMPERSAND 0027 ’ \textquotesingle APOSTROPHE 0028 ( \textparenleft LEFT PARENTHESIS 0029 ) \textparenright RIGHT PARENTHESIS 002A * \textasteriskcentered ASTERISK 002B + \textMVPlus PLUS SIGN 002C , \textMVComma COMMA 002D - \textMVMinus HYPHEN-MINUS 002E . \textMVPeriod FULL STOP 002F / \textMVDivision SOLIDUS 0030 0 \textMVZero DIGIT ZERO 0031 1 \textMVOne DIGIT ONE 0032 2 \textMVTwo DIGIT TWO 0033 3 \textMVThree DIGIT THREE 0034 4 \textMVFour DIGIT FOUR 0035 5 \textMVFive DIGIT FIVE 0036 6 \textMVSix DIGIT SIX 0037 7 \textMVSeven DIGIT SEVEN 0038 8 \textMVEight DIGIT EIGHT 0039 9 \textMVNine DIGIT NINE 003C < \textless LESS-THAN SIGN 003D = \textequals EQUALS SIGN 003E > \textgreater GREATER-THAN SIGN 0040 @ \textMVAt COMMERCIAL AT 005C \ \textbackslash REVERSE SOLIDUS 005E ^ \textasciicircum CIRCUMFLEX ACCENT 005F _ \textunderscore LOW LINE 0060 ‘ \textasciigrave GRAVE ACCENT 0067 g \textg LATIN SMALL LETTER G 007B { \textbraceleft LEFT CURLY BRACKET 007C | \textbar VERTICAL LINE 007D } \textbraceright RIGHT CURLY BRACKET 007E ~ \textasciitilde TILDE 00A0 \nobreakspace NO-BREAK SPACE 00A1 ¡ \textexclamdown INVERTED EXCLAMATION MARK 00A2 ¢ \textcent CENT SIGN 00A3 £ \textsterling POUND SIGN 00A4 ¤ \textcurrency CURRENCY SIGN 00A5 ¥ \textyen YEN SIGN 00A6
    [Show full text]
  • The Brill Typeface User Guide & Complete List of Characters
    The Brill Typeface User Guide & Complete List of Characters Version 2.06, October 31, 2014 Pim Rietbroek Preamble Few typefaces – if any – allow the user to access every Latin character, every IPA character, every diacritic, and to have these combine in a typographically satisfactory manner, in a range of styles (roman, italic, and more); even fewer add full support for Greek, both modern and ancient, with specialised characters that papyrologists and epigraphers need; not to mention coverage of the Slavic languages in the Cyrillic range. The Brill typeface aims to do just that, and to be a tool for all scholars in the humanities; for Brill’s authors and editors; for Brill’s staff and service providers; and finally, for anyone in need of this tool, as long as it is not used for any commercial gain.* There are several fonts in different styles, each of which has the same set of characters as all the others. The Unicode Standard is rigorously adhered to: there is no dependence on the Private Use Area (PUA), as it happens frequently in other fonts with regard to characters carrying rare diacritics or combinations of diacritics. Instead, all alphabetic characters can carry any diacritic or combination of diacritics, even stacked, with automatic correct positioning. This is made possible by the inclusion of all of Unicode’s combining characters and by the application of extensive OpenType Glyph Positioning programming. Credits The Brill fonts are an original design by John Hudson of Tiro Typeworks. Alice Savoie contributed to Brill bold and bold italic. The black-letter (‘Fraktur’) range of characters was made by Karsten Lücke.
    [Show full text]
  • Yu Tse Heng [email protected] • Yutseheng.Com • 4I2–5I7–294I
    Yu Tse Heng [email protected] • yutseheng.com • 4I2–5I7–294I EDUCATION Ph.D. Organizational Behavior, University of Washington (UW) 2021 (expected) Minors: Research Methods, Social Statistics Dissertation: The Grief-Work Interface: How Employees Navigate Grief and Work Following the Loss of a Loved One Chair: Ryan Fehr; Committee Members: Crystal Farh, Kira Schabram, and Sapna Cheryan (Psychology) Stage: Proposed April 2020 * Selected as Finalist for the Organization Science/INFORMS Dissertation Proposal Competition (2020) B.Soc.Sc. Psychology (First Class Honors) 2015 National University of Singapore (NUS) RESEARCH INTERESTS • Humanization and Dehumanization • Suffering in Organizations • Positive Organizational Scholarship REFEREED PUBLICATIONS Heng, Y. T., Wagner, D., Barnes, C. M., & Guarana, C. (2018). Archival research: Expanding the methodological toolkit in social psychology. Journal of Experimental Social Psychology, 78, 14-22. WORK UNDER REVISION AND REVIEW (Note: Manuscript names redacted to protect blind peer-review.) Schabram, K. & Heng, Y. T. (under 3rd review). Compassion and burnout. Academy of Management Journal. Heng, Y. T. & Fehr, R., (1st R&R). Self-compassion and helping. Organizational Behavior and Human Decision Processes. Heng, Y. T., Barnes, C. M., & Yam, K. C. (1st R&R). Cannabis and creativity. Journal of Applied Psychology. Dodson, S. & Heng, Y. T. (1st R&R). Self-compassion in organizations. Journal of Organizational Behavior. Heng, Y. T. – September 2020 – P. 1 Heng, Y. T., Chi, N. W., Farh, C. I. C, & Wang, A-C. (under 1st review). Abusive supervision and forgiveness. Academy of Management Journal. Fulmer, C. A., Heng, Y. T., Song, Q., & Chen, Y. (under 1st review). Dyadic trust asymmetry. Organizational Behavior and Human Decision Processes.
    [Show full text]
  • Rubber Bands Thailand AD Prelim
    A-549-835 Investigation Public Document E&C/Office III: LRL/SB DATE: August 29, 2018 MEMORANDUM TO: Christian Marsh Deputy Assistant Secretary for Enforcement and Compliance FROM: James Maeder Associate Deputy Assistant Secretary for Antidumping and Countervailing Duty Operations performing the duties of Deputy Assistant Secretary for Antidumping and Countervailing Duty Operations SUBJECT: Decision Memorandum for the Preliminary Determination in the Less-Than-Fair-Value Investigation of Rubber Bands from Thailand I. SUMMARY The Department of Commerce (Commerce) preliminarily determines that rubber bands from Thailand are being, or are likely to be, sold in the United States at less than fair value (LTFV), as provided in section 733 of the Tariff Act of 1930, as amended (the Act). The estimated weighted-average dumping margins are shown in the “Preliminary Determination” section of the accompanying Federal Register notice. II. BACKGROUND On January 30, 2018, we received an antidumping duty (AD) petition covering imports of rubber bands from Thailand,1 filed in proper form on behalf of Alliance Rubber Company (the petitioner). We initiated this investigation on February 20, 2018.2 In the Initiation Notice, we stated that we intended to select respondents based on U.S. Customs and Border Protection (CBP) data for certain of the Harmonized Tariff Schedule of the United States (HTSUS) subheadings listed in the scope of the investigation.3 Accordingly, on February 1 See the petitioner’s letter, “Petition for the Imposition of Antidumping and Countervailing Duties on Rubber Bands from Thailand, China and Sri Lanka,” dated January 30, 2018 (the Petition). 2 See Rubber Bands from the People’s Republic of China, Sri Lanka, and Thailand: Initiation of Less-Than-Fair- Value Investigations, 83 FR 8424 (February 27, 2018) (Initiation Notice).
    [Show full text]
  • Heng Zhang EDUCATION PUBLICATION WORKING PAPER
    Heng Zhang 3600 Chestnut St. SPE., Rm. 1401, Philadelphia, PA 19104 Tel: 215-410-9769 ▪ Email: [email protected] EDUCATION University of Pennsylvania Jan 2012 – Dec 2013 M.S. in Systems Engineering Thesis: “Optimization of Operations in the Cross-Docking Facility” GPA: 4.0/4.0 (Major GPA 4.0/4.0) Beijing University of Posts and Telecommunications Sep 2009 – Dec 2011 M.S. in Management Science and Engineering Thesis: “Decision Making in Public Transitions Based on Smart Card Data” GPA: 91.2/100 (Major GPA 92.9/100) Beijing University of Posts and Telecommunications Sep 2005 – Jul 2009 B.S. in Mathematics and Applied Mathematics GPA: 89.6/100 (Major GPA 91.8/100, Ranking 1/56) PUBLICATION Jiantong Cao, Heng Zhang, Qiushi Zheng, “Keeping Customers by Data Mining: A Telecommunication Carrier’s Case Study in China”, 2010 IEEE Conference on E-business and E-Commerce, 2010, Guangzhou, China WORKING PAPER Heng Zhang, “Demand Pooling in the Cross-Docking Systems with Information Updating”, 2013, in progress Monique Guignard-Spielberg, Peter Hahn, Heng Zhang, “Practical Cross-Docking Optimization”, to be submitted to Special Issue of Transportation Research Part C: Emerging Technologies Heng Zhang, Peter Hahn, Monique Guignard-Spielberg, “Real-World Data Based Stochastic Cross-Docking Scheduling”, abstract submitted to 2014 TSL Workshop on Handling Uncertainty in Planning Logistics and Transportation Systems, Chicago, IL RESEARCH Research Assistant/Data Analyst (Advisor: Prof. Barry G. Silverman) May 2013 – Aug 2013 Heart Failure Patient Data Analysis, ACASA Laboratory, University of Pennsylvania, PA Performed Exploratory Data Analysis of patient flow in a health care network.
    [Show full text]