City University of New York (CUNY)
CUNY Academic Works
Dissertations, Theses, and Capstone Projects
6-2016

Infixer: A Method for Segmenting Non-Concatenative Morphology in Tagalog
Steven R. Butler
Graduate Center, City University of New York

More information about this work at: https://academicworks.cuny.edu/gc_etds/1308
This work is made publicly available by the City University of New York (CUNY).

INFIXER: A METHOD FOR SEGMENTING NON-CONCATENATIVE MORPHOLOGY IN TAGALOG

by

STEVEN BUTLER

A master's thesis submitted to the Graduate Faculty in Linguistics in partial fulfillment of the requirements for the degree of Master of Arts, The City University of New York

2016

© 2016 STEVEN BUTLER
Some Rights Reserved
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
https://creativecommons.org/licenses/by-nc-sa/4.0/

This manuscript has been read and accepted by the Graduate Faculty in Linguistics in satisfaction of the thesis requirement for the degree of Master of Arts.

Professor William Sakas, Thesis Advisor
Professor Gita Martohardjono, Executive Officer

THE CITY UNIVERSITY OF NEW YORK

Abstract

INFIXER: A METHOD FOR SEGMENTING NON-CONCATENATIVE MORPHOLOGY IN TAGALOG

by

STEVEN BUTLER

Adviser: Professor William Sakas

In this paper, I present a method for coercing a widely used morphological segmentation algorithm, Morfessor (Creutz and Lagus 2005a), into accurately segmenting non-concatenative morphological patterns. The non-concatenative patterns targeted, infixation and partial reduplication, present problems for many segmentation algorithms. Tools that can successfully identify and segment those patterns can improve a number of downstream natural language processing tasks, including keyword search and machine translation.
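To make the two target patterns concrete, the following is a minimal illustrative sketch in Python. It is not the Infixer method and does not use Morfessor; the function names `split_infix` and `split_reduplication`, the five-vowel inventory, and the regular expressions are assumptions made only for this example. It shows why a segmenter that cuts words into contiguous pieces struggles: in sumulat (from sulat, 'write') the affix um sits inside the stem, and in susulat the prefix is a copy of the stem's first syllable rather than a fixed string.

```python
import re

# Assumption for this sketch: a plain five-vowel inventory, no digraphs such as "ng".
VOWELS = "aeiou"


def split_infix(word, infix="um"):
    """Return (infix, stem) if `infix` follows the word-initial consonant(s), else None."""
    pattern = rf"^([^{VOWELS}]+)({re.escape(infix)})([{VOWELS}].*)$"
    match = re.match(pattern, word)
    if match is None:
        return None
    onset, found, rest = match.groups()
    # The stem is discontinuous in the surface form: onset + material after the infix.
    return found, onset + rest


def split_reduplication(word):
    """Return (copy, stem) if the word begins with a doubled (C)V syllable, else None."""
    match = re.match(rf"^([^{VOWELS}]*[{VOWELS}])\1(.*)$", word)
    if match is None:
        return None
    syllable, rest = match.groups()
    # The first copy is the reduplicant; the second copy belongs to the stem.
    return syllable, syllable + rest


print(split_infix("sumulat"))          # ('um', 'sulat')
print(split_reduplication("susulat"))  # ('su', 'sulat')
print(split_infix("sulat"))            # None
```

These toy patterns overgenerate on real text (any word that happens to begin with a repeated syllable, or with a consonant followed by um, would match), which is part of why a data-driven segmenter is attractive for this problem.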