Semi-Supervised Learning for Maximizing the Partial AUC

The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Tomoharu Iwata, Akinori Fujino, Naonori Ueda
NTT Communication Science Laboratories, Kyoto, Japan
{tomoharu.iwata.gy, akinori.fujino.yh, naonori.ueda.fr}@hco.ntt.co.jp

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

The partial area under a receiver operating characteristic curve (pAUC) is a performance measurement for binary classification problems that summarizes the true positive rate over a specific range of the false positive rate. Obtaining classifiers that achieve a high pAUC is important in a wide variety of applications, such as cancer screening and spam filtering. Although many methods have been proposed for maximizing the pAUC, existing methods require a large amount of labeled data for training. In this paper, we propose a semi-supervised learning method for maximizing the pAUC, which trains a classifier with a small amount of labeled data and a large amount of unlabeled data. To exploit the unlabeled data, we derive two approximations of the pAUC: the first is calculated from positive and unlabeled data, and the second is calculated from negative and unlabeled data. A classifier is trained by maximizing the weighted sum of the two approximations of the pAUC and the pAUC that is calculated from positive and negative data. With experiments using various datasets, we demonstrate that the proposed method achieves higher test pAUCs than existing methods.

1 Introduction

The area under a receiver operating characteristic (ROC) curve (AUC) is widely used for performance measurement with binary classification problems (Bradley 1997; Huang and Ling 2005). The ROC curve is the true positive rate (TPR) as a function of the false positive rate (FPR). By maximizing the AUC, we can obtain a classifier that achieves high average TPRs over all FPR values from zero to one (Brefeld and Scheffer 2005; Wang and Tang 2009; Ding et al. 2015; Ying, Wen, and Lyu 2016; Han and Zhao 2010; Zhou, Lai, and Yen 2009; Zhao et al. 2011).

In many applications, we would like to achieve a high TPR with a specific FPR range. For example, in cancer screening applications, maintaining a low FPR is important if we are to eliminate unnecessary and costly biopsies (Baker and Pinsky 2001). In a spam detection system, we can accept only a low FPR if we are to prevent legitimate emails from being identified as spam. In such applications, a partial AUC (pAUC) is more appropriate than an AUC. The pAUC is the partial area under the ROC curve with a specific FPR range as shown in Figure 1(a). Many methods for maximizing the pAUC have been proposed (Komori and Eguchi 2010; Ricamato and Tortorella 2011; Narasimhan and Agarwal 2013a; 2013b; Ueda and Fujino 2018; Wang and Chang 2010). However, existing pAUC maximization methods require a large amount of labeled data for training, which are expensive to prepare.

In this paper, we propose a semi-supervised learning method for maximizing the pAUC to achieve a high pAUC with a small amount of labeled data and a large amount of unlabeled data. Unlabeled data are usually easier to prepare than labeled data. To exploit unlabeled data, we define the unlabeled positive rate (UPR), which is the probability that the decision function score of unlabeled data is higher than a threshold. We then derive two approximations of the pAUC from the partial area under the curves of the UPR and the TPR or FPR: the first is calculated from positive and unlabeled data, and the second is calculated from negative and unlabeled data.

A classifier is trained by maximizing the weighted sum of the two approximations of the pAUC and the pAUC that is calculated from positive and negative data. For classifiers, the proposed method can use any differentiable functions, such as logistic regression and neural networks. Although several semi-supervised methods for AUC maximization have been proposed (Fujino and Ueda 2016; Sakai, Niu, and Sugiyama 2018; Kiryo et al. 2017), they are inapplicable to pAUC maximization. The main contributions of this paper are as follows:

1. We derive two approximations of the pAUC that are calculated using unlabeled data (Section 4).
2. We propose a semi-supervised learning method for maximizing the pAUC based on the approximations of the pAUC (Section 5). To our knowledge, our work is the first attempt at semi-supervised pAUC maximization.
3. We empirically demonstrate that the proposed method performs better than existing supervised and semi-supervised methods using various datasets for anomaly detection (Section 6).

2 Preliminaries

Let $x \in \mathbb{R}^D$ be a $D$-dimensional feature vector, and $y \in \{\pm 1\}$ be a binary class label. Decision function $s : \mathbb{R}^D \to \mathbb{R}$ outputs a score for classification, where the score is used to estimate the class label by $\hat{y} = \mathrm{sign}(s(x) - h)$, where $h$ is a threshold and $\mathrm{sign}(\cdot)$ is the sign function; $\mathrm{sign}(A) = +1$ if $A \geq 0$ and $\mathrm{sign}(A) = -1$ otherwise. The true positive rate of threshold $h$ is defined by the probability that the score of positive data is higher than threshold $h$,

$$\mathrm{TPR}(h) = \int_{-\infty}^{\infty} f_{\mathrm{P}}(s)\, I(s > h)\, ds,$$

where $f_{\mathrm{P}}(s)$ is the score distribution of positive data, and $I(\cdot)$ is the indicator function; $I(A) = 1$ if $A$ is true and $I(A) = 0$ otherwise. Similarly, the false positive rate of threshold $h$ is defined by the probability that the score of negative data is higher than threshold $h$,

$$\mathrm{FPR}(h) = \int_{-\infty}^{\infty} f_{\mathrm{N}}(s)\, I(s > h)\, ds,$$

where $f_{\mathrm{N}}(s)$ is the score distribution of negative data. The AUC is the area under the curve of $\mathrm{TPR}(h)$ against $\mathrm{FPR}(h)$ with varying threshold $h$, and it is calculated by

$$\mathrm{AUC} = \int_0^1 \mathrm{TPR}(\mathrm{FPR}^{-1}(u))\, du = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{\mathrm{P}}(s)\, f_{\mathrm{N}}(s')\, I(s > s')\, ds\, ds', \tag{1}$$

where $\mathrm{FPR}^{-1}(u) = \inf\{h \in \mathbb{R} \mid \mathrm{FPR}(h) \leq u\}$ (Cortes and Mohri 2004). The AUC is the probability that scores sampled from the positive distribution are higher than those from the negative distribution.

The partial AUC (pAUC) between $\alpha$ and $\beta$, where $0 \leq \alpha < \beta \leq 1$, is the normalized partial area under the curve of TPR against FPR where the FPR is between $\alpha$ and $\beta$ as follows,

$$\begin{aligned}
\mathrm{pAUC}(\alpha, \beta) &= \frac{1}{\beta - \alpha} \int_{\alpha}^{\beta} \mathrm{TPR}(\mathrm{FPR}^{-1}(u))\, du \\
&= \frac{1}{\beta - \alpha} \int_{\mathrm{FPR}^{-1}(\beta)}^{\mathrm{FPR}^{-1}(\alpha)} \int_{-\infty}^{\infty} f_{\mathrm{P}}(s)\, f_{\mathrm{N}}(s')\, I(s > s')\, ds\, ds' \\
&= \mathbb{E}\left[ I(s(x^{\mathrm{P}}) > s(x^{\mathrm{N}})) \right],
\end{aligned} \tag{2}$$

where $\mathbb{E}[\cdot]$ represents the expectation, $x^{\mathrm{P}}$ is a sample from the positive data distribution, and $x^{\mathrm{N}}$ is a sample from the negative data distribution with the FPR between $\alpha$ and $\beta$. Figure 1(b) illustrates how the pAUC is calculated using positive and negative data.

Figure 1: (a) pAUC($\alpha$, $\beta$): Partial area under the curve of the true positive rate against the false positive rate between $\alpha$ and $\beta$. (b) pAUC($\alpha$, $\beta$) is calculated by the probability that the score of a positive sample is higher than that of a negative sample that lies between $\alpha$ and $\beta$ when sorted by its score.

Given a set of positive samples $\mathcal{P} = \{x^{\mathrm{P}}_m\}_{m=1}^{M_{\mathrm{P}}}$ and a set of negative samples $\mathcal{N} = \{x^{\mathrm{N}}_m\}_{m=1}^{M_{\mathrm{N}}}$, an empirical pAUC is calculated by

$$\begin{aligned}
\widehat{\mathrm{pAUC}}(\alpha, \beta) = \frac{1}{(\beta - \alpha) M_{\mathrm{P}} M_{\mathrm{N}}} \sum_{x^{\mathrm{P}}_m \in \mathcal{P}} \Bigg[ &(j_\alpha - \alpha M_{\mathrm{N}})\, I\big(s(x^{\mathrm{P}}_m) > s(x^{\mathrm{N}}_{(j_\alpha)})\big) \\
&+ \sum_{j = j_\alpha + 1}^{j_\beta} I\big(s(x^{\mathrm{P}}_m) > s(x^{\mathrm{N}}_{(j)})\big) \\
&+ (\beta M_{\mathrm{N}} - j_\beta)\, I\big(s(x^{\mathrm{P}}_m) > s(x^{\mathrm{N}}_{(j_\beta + 1)})\big) \Bigg],
\end{aligned} \tag{3}$$

where $j_\alpha = \lceil \alpha M_{\mathrm{N}} \rceil$, $j_\beta = \lfloor \beta M_{\mathrm{N}} \rfloor$, and $x^{\mathrm{N}}_{(j)}$ denotes the negative sample in $\mathcal{N}$ ranked in the $j$th position among negatives in a descending order of scores $s(x)$ (Dodd and Pepe 2003; Narasimhan and Agarwal 2013a). The empirical pAUC is the empirical probability that positive samples have higher scores than negative samples that are ranked between $j_\alpha$ and $j_\beta$. The computational complexity of calculating Eq. (3) is $O(M_{\mathrm{N}} \log M_{\mathrm{N}} + (\beta - \alpha) M_{\mathrm{P}} M_{\mathrm{N}})$, where the first term is for sorting $M_{\mathrm{N}}$ negative samples, and the second term is for comparing $M_{\mathrm{P}}$ positive samples and $(\beta - \alpha) M_{\mathrm{N}}$ negative samples.

3 Problem formulation

Assume that we are given a set of positive samples $\mathcal{P} = \{x^{\mathrm{P}}_m\}_{m=1}^{M_{\mathrm{P}}}$, a set of negative samples $\mathcal{N} = \{x^{\mathrm{N}}_m\}_{m=1}^{M_{\mathrm{N}}}$, and a set of unlabeled samples $\mathcal{U} = \{x^{\mathrm{U}}_m\}_{m=1}^{M_{\mathrm{U}}}$. We would like to obtain a decision function that has a high pAUC with given $\alpha$ and $\beta$ for unseen samples.

We assume that the positive ratio of unlabeled samples $\theta_{\mathrm{P}}$ is known. When labeled and unlabeled data are generated from the same distribution $p(x, y)$, $\theta_{\mathrm{P}}$ can be easily estimated by using the empirical positive probability with the labeled data. When labeled and unlabeled data are generated from different distributions, $\theta_{\mathrm{P}}$ can be estimated by using methods described in (Saerens, Latinne, and Decaestecker 2002; Iyer, Nath, and Sarawagi 2014; Du Plessis and Sugiyama 2014).

4 Partial AUC with unlabeled data

In this section, we derive two approximations of the pAUC that are calculated using unlabeled data. Intuitively speaking, we sort unlabeled data by their scores, consider unlabeled data with specific ranges as positive or negative, and approximate the pAUC using the estimated positive or negative data. Figure 2 shows an overview of the calculation of the two approximated pAUCs that use positive, negative and unlabeled data, which is explained in detail in this section ...

Figure 2: (a) Approximated pAUC with positive and unlabeled ...

Figure 3: Top: Unlabeled score distribution $f_{\mathrm{U}}(s)$.
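
The empirical pAUC of Eq. (3) in Section 2 can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the function names `empirical_auc` and `empirical_pauc` and the `tol` parameter are assumptions introduced here, and the code assumes the usual rounding convention, with j_alpha taken as the ceiling of alpha * M_N and j_beta as the floor of beta * M_N. As written, Eq. (3) also needs j_alpha <= j_beta, so the sketch rejects FPR ranges that fall strictly inside the slot of a single negative sample.

```python
import math

import numpy as np


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs with s(x_P) > s(x_N)."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    return float(np.mean(pos[:, None] > neg[None, :]))


def empirical_pauc(pos_scores, neg_scores, alpha, beta, tol=1e-9):
    """Empirical pAUC following Eq. (3), for 0 <= alpha < beta <= 1.

    `tol` is a hypothetical guard against floating-point round-off
    when alpha * M_N or beta * M_N is (nearly) an integer.
    """
    if not 0.0 <= alpha < beta <= 1.0:
        raise ValueError("need 0 <= alpha < beta <= 1")
    pos = np.asarray(pos_scores, dtype=float)
    # Negatives in descending score order: neg[j - 1] is x^N_(j).
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]
    m_p, m_n = len(pos), len(neg)

    j_alpha = math.ceil(alpha * m_n - tol)
    j_beta = math.floor(beta * m_n + tol)
    if j_alpha > j_beta:
        raise ValueError("Eq. (3) assumes ceil(alpha*M_N) <= floor(beta*M_N)")

    def wins(j):
        # Number of positives scoring strictly above the j-th ranked negative.
        return float(np.sum(pos > neg[j - 1]))

    total = 0.0
    if j_alpha >= 1:
        # Fractional weight on the boundary negative at rank j_alpha.
        total += (j_alpha - alpha * m_n) * wins(j_alpha)
    for j in range(j_alpha + 1, j_beta + 1):
        # Negatives that lie fully inside the FPR range.
        total += wins(j)
    if j_beta < m_n:
        # Fractional weight on the boundary negative at rank j_beta + 1.
        total += (beta * m_n - j_beta) * wins(j_beta + 1)
    return total / ((beta - alpha) * m_p * m_n)
```

With `alpha=0.0` and `beta=1.0` the boundary terms vanish and `empirical_pauc` reduces to `empirical_auc`, which is a convenient sanity check. The explicit loop mirrors the sum in Eq. (3) for readability; after the sort, its cost matches the (beta - alpha) * M_P * M_N comparisons stated in the text.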