The Kleene Language for Weighted Finite-State Programming: User Documentation, Version 0.9.4.0

Total Page:16

File Type:pdf, Size:1020Kb

The Kleene Language for Weighted Finite-State Programming: User Documentation, Version 0.9.4.0 The Kleene Language for Weighted Finite-State Programming: User Documentation, Version 0.9.4.0 This Document is Work in Progress Corrections and Suggestions Are Welcome Kenneth R. Beesley SAP Labs, LLC P.O. Box 540475 North Salt Lake Utah 84054, USA [email protected] 19 May 2014 Copyright © 2014 SAP AG. Released under the Apache License, Version 2. http://www.apache.org/licenses/LICENSE-2.0.html Beesley i Preface What is Kleene? Kleene is a programming language that can be used to create many useful and efficient linguistic applications based on finite-state machines (FSMs). These applications include tokenizers, spelling checkers, spelling correc- tors, morphological analyzer/generators and shallow parsers. FSMs are also widely used in speech synthesis and speech recognition. Kleene allows programmers to define, build, manipulate and test finite- state machines using regular expressions and right-linear phrase-structure grammars; and Kleene supports variables, rule-like expressions, user-defined functions and familiar programming-language control syntax. The FSMs can include acceptors and two-projection transducers, either weighted un- der the Tropical Semiring or unweighted. If this makes no sense, the pur- pose of the book is to explain it. Pre-edited Kleene scripts can be run from the command line, and a graphical user interface is provided for interactive learning, programming and testing. Operating Systems Kleene runs on OS X and Linux, requiring Java version 1.5 or higher. Prerequisites This book assumes only superficial familiarity with regular languages, reg- ular relations and finite-state machines. While readers need not have any experience with finite-state program- ming, those who have no programming experience at all, e.g. in a language like Java, C++, Perl, Python, Ruby, etc., will find it difficult. ii The Kleene Language Implementation The Kleene parser is implemented in JavaCC/JJTree* and Java,† and the interpreter calls functions in the OpenFst library‡ via the Java Native In- terface (JNI).§ Kleene has its own syntax, based on familiar regular expres- sions, and it borrows much of its semantics from the Xerox/parc finite- state toolkit.¶ Language-restriction expressions and alternation rules are implemented using algorithms supplied by Måns Huldén.‖ Acknowledgments Thanks are due to the following people for advice and encouragement over the years. I have not always followed the advice I received, so I take full responsibility for any problems, failures or infelicities. Lauri Karttunen, Måns Huldén, Paola Nieddu, Phil Sours, Cyril Allauzen, André Kempe, Mike Wilkens, Kemal Oflazer, Natasha Lloyd Kleene is named after American mathematician Stephen Cole Kleene (1909–1994), who investigated the properties of regular sets and invented the metalan- guage of regular expressions. *http://java.net/projects/javacc †http://www.java.com/ ‡Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut and Mehryar Mohri, “OpenFst: A General and Efficient Weighted Finite-State Transducer Library,” Proceedings of the Ninth International Conference on Implementation and Application of Automata, (CIAA 2007), volume 4783 of Lecture Notes in Computer Science, pages 11-23. Springer, 2007. http://www.openfst.org. §http://www.ibm.com/developerworks/java/tutorials/j-jni/ ¶http://www.fsmbook.com ‖Måns Huldén, 2009 Finite-State Machine Construction Methods and Algorithms for Phonology and Morphology, Ph.D. dissertation, University of Arizona. Beesley iii License Kleene is free and open-source, released 4 May 2012 by SAP AG†† under the Apache License, Version 2.** The OpenFst library is available under the same license. ††http://www.sap.com/ **http://www.apache.org/licenses/LICENSE-2.0 Contents 1 Introduction 1 1.1 What is Kleene? . 1 1.2 The Kleene Name . 3 1.3 Possible Applications . 3 1.4 Design Criteria . 3 1.5 Terminology . 5 2 Getting Started with Kleene 7 2.1 Installation and Prerequisites . 7 2.1.1 Pre-compiled Binaries . 7 2.1.2 Compiling Kleene from Source Code . 8 2.2 Launching the Kleene gui ................... 9 2.3 OK, What can you do in Kleene? . 11 2.3.1 First, is it working? . 11 2.3.2 Visualizing Finite-State Machines . 11 2.3.3 Compile regular expressions into FSMs . 13 2.3.4 Optimization of FSMs . 23 2.3.5 Using Defined Variables . 26 2.3.6 Characters . 27 2.3.7 Regular Languages and Regular Relations . 31 2.4 Looking at FSMs . 41 2.5 Applications . 42 v vi The Kleene Language 3 Regular Expression Syntax 45 3.1 Regular Expressions . 45 3.1.1 Primary Regular Expressions . 46 3.1.2 Inherently Delimited Regular Expressions . 47 3.1.3 Regular-Expression Operators . 49 3.1.4 Precedence Issues . 49 3.1.5 Weights . 52 3.1.6 Whitespace in Regular Expressions . 54 3.1.7 Denoting the Empty String . 55 3.1.8 Denoting Any Symbol . 56 3.2 Abstraction Mechanisms . 57 3.2.1 Variables . 57 3.2.2 Pre-defined Functions . 58 3.2.3 User-defined Function Syntax . 60 3.2.4 Function Call Semantics . 64 3.2.5 Function Parameters with Default Values . 66 3.3 Right-linear Phrase-structure Grammars . 68 3.3.1 Right-linear Syntax . 68 3.3.2 Right-linear Semantics . 70 3.4 Scope . 72 3.4.1 Assignments, Declarations and Local Scope . 72 3.4.2 Stand-alone Code Blocks . 73 3.4.3 Function Blocks . 73 3.4.4 External Variables . 74 3.4.5 Free Variables . 77 3.4.6 Combining Free and Local Usage . 78 3.4.7 Export Statements . 79 3.4.8 Practical Use of Code Blocks . 80 3.5 Language Restriction Expressions . 83 4 Alternation Rules 87 4.1 What are Alternation Rules? . 87 Beesley vii 4.2 Mindtuning for Alternations . 88 4.2.1 Underlying and Surface, Upper and Lower . 88 4.2.2 Writing and Testing Alternation Rules . 92 4.2.3 Derivations or Cascades of Rules . 94 4.2.4 The Richness of Alternation-Rule Types . 99 4.3 Basic Mapping Rules . 100 4.3.1 Right-Arrow Rules . 100 4.3.2 Left-arrow Rules . 105 4.4 Optional Alternation Rules . 106 4.5 Maximum and Minimum Matching . 107 4.6 Deletion Rules . 108 4.7 Epenthesis Rules . 108 4.8 Markup Rules . 109 4.9 Two-Level Rule Contexts . 110 4.10 Parallel Rules . 112 4.10.1 Vertical Derivations vs. Parallel Rules . 112 4.10.2 The Built-In Parallel Rule-Compilation Function . 115 4.10.3 Rules with Where-Clauses . 119 4.10.4 Rules where the Input can Match the Empty String . 124 4.11 Transducer-Style Rules . 125 4.11.1 Traditional Rules vs. Transducer-Style Rules . 125 4.11.2 Simple Mapping Rules in the Transducer Style . 127 4.11.3 Transducer-Style Rules vs. Where-Clause Rules . 133 4.11.4 Transducer-Style Rules vs. Traditional Markup Rules 134 4.11.5 Transducer-Style Rules Not Expressible as Traditional Rules . 135 5 Examples without Weights 137 5.1 Introduction . 137 5.2 Spanish Morphology . 137 5.2.1 Spanish Verbs . 137 5.2.2 Spanish Nouns . 150 viii The Kleene Language 5.3 Latin Morphology . 150 5.4 Aymara Morphology . 150 6 Examples with Weights 151 6.1 Introduction . 151 6.2 Weighted Edit Distance for Spelling Correction . 151 7 Challenges in Morphology 163 7.1 Overview . 163 7.2 Infixation . 164 7.3 Reduplication . 164 7.3.1 Range of Reduplicative Phenomena . 164 7.3.2 Contiguous Reduplication . 165 7.3.3 Discontiguous Reduplication . 174 7.4 Semitic Interdigitation . 174 7.5 Constraining Long-distance Dependencies . 174 8 Arithmetic Expressions 175 8.1 Mindtuning for Arithmetic Expressions . 175 8.2 Primary Arithmetic Expressions . 176 8.3 Arithmetic Expression Operators . 177 8.4 Arithmetic Functions . 177 8.5 Boolean Functions . 179 8.5.1 Special Case of Arithmetic Functions . 179 8.5.2 Assert and Require Statements . 181 9 Lists 183 9.1 Lists of FSMs and Numbers . 183 9.2 List Literals, Identifiers, and Assignment . .184 9.3 Pre-Defined Functions Operating on Lists . .185 9.3.1 Functions Joining the Elements of a List . 188 9.3.2 Functions Returning the Map of a List . 190 9.3.3 Functions Returning the Alphabet (Sigma) of an FSM 191 Beesley ix 9.3.4 Functions Returning the Size of a List . 192 9.4 Iteration through Members of a List . 192 9.5 User-Defined Functions and Lists . .193 9.6 Traditional Array-indexing Syntax . 194 10 Other Syntax 197 10.1 Void Functions with Names . 197 10.2 Anonymous Void Functions . 197 10.3 Control Syntax . 198 10.4 Input/Output . 198 10.4.1 Scripts . 198 10.4.2 XML Input/Output . 201 10.4.3 dot Output . 202 10.4.4 Interactive Testing in the gui . 203 10.5 Memory Management . 205 10.5.1 Garbage Collection . 205 10.5.2 Java Memory Usage . 206 10.5.3 Java Fst Objects . 206 10.5.4 Contents of the Main Symbol Table . 207 10.5.5 C++ Memory Space . 207 11 Code Generation and Runtime Code 209 11.1 Using Kleene FSMs in Real Projects . 209 11.2 The fst2java Experiment . 210 11.2.1 Stand-Alone Java Code Generation . 210 11.2.2 How it Works . 211 11.2.3 How to Create the Java Class Files for an FSM . 213 11.2.4 FSM to Java Examples . 221 11.2.5 Generated Java Code API . 231 11.3 Runtime Code . 234 11.3.1 Status of Runtime Code . 234 11.3.2 Basic Functionality of Runtime Code . 235 11.3.3 Kinds of Runtime Code . 236 x The Kleene Language 12 rtn Support 239 12.1 OpenFst Recursive Transition Networks (rtns) . 239 12.1.1 Status .
Recommended publications
  • Singing in English in the 21St Century: a Study Comparing
    SINGING IN ENGLISH IN THE 21ST CENTURY: A STUDY COMPARING AND APPLYING THE TENETS OF MADELEINE MARSHALL AND KATHRYN LABOUFF Helen Dewey Reikofski Dissertation Prepared for the Degree of DOCTOR OF MUSICAL ARTS UNIVERSITY OF NORTH TEXAS August 2015 APPROVED:….……………….. Jeffrey Snider, Major Professor Stephen Dubberly, Committee Member Benjamin Brand, Committee Member Stephen Austin, Committee Member and Chair of the Department of Vocal Studies … James C. Scott, Dean of the College of Music Costas Tsatsoulis, Interim Dean of the Toulouse Graduate School Reikofski, Helen Dewey. Singing in English in the 21st Century: A Study Comparing and Applying the Tenets of Madeleine Marshall and Kathryn LaBouff. Doctor of Musical Arts (Performance), August 2015, 171 pp., 6 tables, 21 figures, bibliography, 141 titles. The English diction texts by Madeleine Marshall and Kathryn LaBouff are two of the most acclaimed manuals on singing in this language. Differences in style between the two have separated proponents to be primarily devoted to one or the other. An in- depth study, comparing the precepts of both authors, and applying their principles, has resulted in an understanding of their common ground, as well as the need for the more comprehensive information, included by LaBouff, on singing in the dialect of American Standard, and changes in current Received Pronunciation, for British works, and Mid- Atlantic dialect, for English language works not specifically North American or British. Chapter 1 introduces Marshall and The Singer’s Manual of English Diction, and LaBouff and Singing and Communicating in English. An overview of selected works from Opera America’s resources exemplifies the need for three dialects in standardized English training.
    [Show full text]
  • Metathesis Is Real, and It Is a Regular Relation A
    METATHESIS IS REAL, AND IT IS A REGULAR RELATION A Dissertation submitted to the Faculty of the Graduate School of Arts and Sciences of Georgetown University in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Linguistics By Tracy A. Canfield, M. S. Washington, DC November 17 , 2015 Copyright 2015 by Tracy A. Canfield All Rights Reserved ii METATHESIS IS REAL, AND IT IS A REGULAR RELATION Tracy A. Canfield, M.S. Thesis Advisor: Elizabeth C. Zsiga , Ph.D. ABSTRACT Regular relations are mathematical models that are widely used in computational linguistics to generate, recognize, and learn various features of natural languages. While certain natural language phenomena – such as syntactic scrambling, which requires a re-ordering of input elements – cannot be modeled as regular relations, it has been argued that all of the phonological constraints that have been described in the context of Optimality Theory can be, and, thus, that the phonological grammars of all human languages are regular relations; as Ellison (1994) states, "All constraints are regular." Re-ordering of input segments, or metathesis, does occur at a phonological level. Historically, this phenomenon has been dismissed as simple speaker error (Montreuil, 1981; Hume, 2001), but more recent research has shown that metathesis occurs as a synchronic, predictable phonological process in numerous human languages (Hume, 1998; Hume, 2001). This calls the generalization that all phonological processes are regular relations into doubt, and raises other
    [Show full text]
  • Typesetting Catalan Texts with TEX
    Typesetting Catalan Texts with TEX Gabriel Valiente Feruglio Universitat de les Illes Balears Departament de Ciencies Matematiques i Inforrnatica E-07071 Palma de Mallorca (Spain) Internet: dmi gvaO@ps. ui b . es Robert Fuster Universitat Politecnica de Valencia Departament de Matematica Aplicada Cami de Vera, 14. E-46071 Valencia (Spain) Internet: mat5rf c@cci. upv .es Abstract As with other non-American English languages, typesetting Catalan texts imposes some special requirements on TEX. These include a particular set of hyphenation patterns and support for a special ligature: unlike other Romanic languages, Catalan incorporates the middle point in the 11 digraph. Hyphenation rules for Catalan are reviewed in ths paper, after a short introduction to hyphenation by TEX. A minimal set of hyphenation patterns covering all Catalan accents and diacritics is also presented. A discussion about the 1'1 ligature concludes the paper. This work represents a first step towards the Catalan TLP (TEX Language Package), under development within the TWGMLC (Technical Working Group on Multiple Language Coordination), where the first author is chairing the Catalan linguistic subgroup. Resum Aixi com en altres llengiies, la composicio de textos escrits en catala demana requeriments especials a1 TEX. Aquests inclouen un conjunt particular de patrons de guionat, aixi com suport per a un lligam especial ja que, a diferencia d'altres llengiies romaniques, el catala incorpora el punt volat a1 digraf I'l. En aquest paper es fa una introduccio a1 guionat arnb TEX i es revisen les regles de guionat per a1 catala. Tanmateix, es presenta un conjunt minim de patrons de guionat que cobreix tots els accents i marques diacritiques del catala.
    [Show full text]
  • Department of English Linguistics School of English and American Studies Eötvös Loránd University Budapest, Hungary Table of Contents
    The Odd Yearbook 2014 Department of English Linguistics School of English and American Studies Eötvös Loránd University Budapest, Hungary Table of Contents Foreword ................................................................................................................................i Fodor, Brigitta: Scottish Vowel Length: Regular Vowel Length Alternations and the Raising of /ae/ in Scottish Standard English ........................................................................................ 1 Pándi, Julianna Sarolta: Flapping in American English: A Theoretical Approach .............. 14 Szalay, Tünde: L Vocalisation in Three English Dialects .................................................... 35 Szabó, Petra Florina: Social and Regional Variation and Intrusive /r/ ................................. 49 Szabó, Lilla Petronella: “You’re a good friend, bro!”—A Corpus-based Investigation of the Meanings of ‘bro’ ................................................................................................................. 81 Kiss, Angelika: A remark on the Individual/Stage-Level Predicate Distinction in English ... 98 Biczók, Bálint: Approaches to Antecedent-Contained Deletion ......................................... 110 Tomacsek, Vivien: Approaches to the Structure of English Small Clauses ......................... 128 Kucsera, Márton: Restrictive relative clauses in Alignment Syntax .................................. 155 Foreword This is the ninth volume in the series ODD Yearbook, which is the collection of undergraduate
    [Show full text]
  • Archisegment-Based Letter-To-Phone Conversion for Concatenative Speech Synthesis in Portuguese
    ARCHISEGMENT-BASED LETTER-TO-PHONE CONVERSION FOR CONCATENATIVE SPEECH SYNTHESIS IN PORTUGUESE Eleonora Cavalcante Albano and Agnaldo Antonio Moreira LAFAPE-IEL-UNICAMP, Campinas, SP, Brazil [email protected] ABSTRACT sive. Interestingly, such exceptions occur precisely where complexity at both the phonetic and the morphological level leads A letter-to-phone conversion scheme is proposed for Portuguese to ambiguity in phonological analysis. which excludes representation of allophonic detail. Phonetically unstable segments are treated as archisegments, their articulatory Take, for example, the representation of rhyme nasality, which weakness being analyzed in terms of feature underspecification. motivates the controversy whether Portuguese has distinctive Besides solving classical problems of allophony and allomorphy, nasal vowels [4,5]. Such nasality, which tends to be rather heavy this analysis provides an efficient principle for building a unit phonetically, is orthographically represented as 'm' or 'n' before a inventory for concatenative speech synthesis. word internal consonant (e.g., samba, santa, sanca) and as a tilde diacritic on the vowel before another vowel or word finally (e.g., 1. PHONOLOGY AND LETTER-TO-PHONE são, sã). This cannot be said to be a reasonable phonemic representation because /m/ and /n/ do not contrast in rhymes, a CONVERSION fact that orthography acknowledges by making the choice of one or the other letter dependent on the following consonant ('m' Concatenative speech synthesis depends crucially on the adequacy before 'p,b' and 'n' elsewhere). Nor can it be said to be a of the phonological analysis underlying its unit list. In the same reasonable allophonic representation because the phonetic way as intelligibility requires concatenative units to be based on realization of nasal rhymes shows no correspondence with the a consistent minimal set of allophones, quality requires enriching diacritic/digraph distinction.
    [Show full text]
  • 1 Stress and Syllable Structure in English
    1 Stress and Syllable Structure in English: Approaches to Phonological Variations* San Duanmu, Hyo-Young Kim, and Nathan Stiennon University of Michigan Abstract 1. What is phonological variation? We use phonological variation to refer to alternative forms that can be used for more or less similar purposes. For example, in English a word made of CVCVCV can have stress on the first syllable, as in Canada, or on the second syllable, as in banana. There is no reason why the stress pattern could not have been the other way round, i.e. for Canada to have stress on the second syllable and for banana to have stress on the first. Nor is there any reason why stress in such words cannot be all on the first syllable, or all on the second. English just happens to use both forms. Similarly, an English word can be VC, such as Ann, CVC, such as sit, or CCCVC, such as split. There is no reason why a word must use one or another form and English just happens to use all those forms. Besides variations within a language, there are also variations across different languages. For example, before the nuclear vowel Standard Chinese allows CG- but not CC-, whereas English allows both CG- and CC-. Similarly, Standard Chinese only allows [–n] and [–ŋ] after the nuclear vowel, whereas English allows many more consonants. * Portions of this work were presented at University of Michigan in 2002, the Second North American Phonology Conference in 2002, Wayne State University in 2004, Peking University in 2004, the 10th Mid- Continental Workshop on Phonology in 2004, and National Chengchi University, Taipei, in 2005.
    [Show full text]
  • The Phonetics and Phonology of Some Syllabic Consonants in Southern British English
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Hochschulschriftenserver - Universität Frankfurt am Main The Phonetics and Phonology of some Syllabic Consonants in Southern British English Zoë Toft School of Oriental and African Studies University of London 1 Introduction This article presents new experimental data on the phonetics of syllabic /l/ and syllabic /n/ in Southern British English and then proposes a new phonological account of their behaviour. Previous analyses (Chomsky and Halle 1968:354, Gimson 1989, Gussmann 1991 and Wells 1995) have proposed that syllabic /l/ and syllabic /n/ should be analysed in a uniform manner. Data presented here, however, shows that syllabic /l/ and syllabic /n/ behave in very different ways, and in light of this, a unitary analysis is not justified. Instead, a proposal is made that syllabic /l/ and syllabic /n/ have different phonological structures, and that these different phonological structures explain their different phonetic behaviours. This article is organised as follows: First a general background is given to the phenomenon of syllabic consonants both cross linguistically and specifically in Southern British English. In §3 a set of experiments designed to elicit syllabic consonants are described and in §4 the results of these experiments are presented. §5 contains a discussion on data published by earlier authors concerning syllabic consonants in English. In §6 a theoretical phonological framework is set out, and in §7 the results of the experiments are analysed in the light of this framework. In the concluding section, some outstanding issues are addressed and several areas for further research are suggested.
    [Show full text]
  • Unicode Line Breaking Algorithm Technical Reports
    4/26/2018 UAX #14: Unicode Line Breaking Algorithm Technical Reports Proposed Update Unicode® Standard Annex #14 UNICODE LINE BREAKING ALGORITHM Version Unicode 11.0.0 (draft 1) Editors Andy Heninger ([email protected]) Date 2018-04-10 This Version http://www.unicode.org/reports/tr14/tr14-40.html Previous http://www.unicode.org/reports/tr14/tr14-39.html Version Latest http://www.unicode.org/reports/tr14/ Version Latest http://www.unicode.org/reports/tr14/proposed.html Proposed Update Revision 40 Summary This annex presents the Unicode line breaking algorithm along with detailed descriptions of each of the character classes established by the Unicode line breaking property. The line breaking algorithm produces a set of "break opportunities", or positions that would be suitable for wrapping lines when preparing text for display. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Standard Annex (UAX) forms an integral part of the Unicode Standard, but is published online as a separate document. The Unicode Standard may require conformance to normative content in a Unicode Standard Annex, if so specified in the Conformance chapter of that version of the Unicode Standard. The version number of a UAX document corresponds to the version of the Unicode Standard of which it forms a part. https://www.unicode.org/reports/tr14/tr14-40.html 1/59 4/26/2018 UAX #14: Unicode Line Breaking Algorithm Please submit corrigenda and other comments with the online reporting form [Feedback].
    [Show full text]
  • What Is a Written Word? and If So, How Many? Martin Evertz-Rittich | University of Cologne
    What is a written word? And if so, how many? Martin Evertz-Rittich | University of Cologne /gʁafematik/ Grapholinguistics in the 21st century | 17.06.2020 Outline 1. Defining the written word in alphabetical writing systems 2. Properties of written words 3. Correspondence to elements in spoken language 4. Typological considerations 5. Summary /gʁafematik/ | Martin Evertz-Rittich | 17.06.2020 Defining the written word in alphabetical writing systems Part I Definition by spaces (e.g. Coulmas 1999, 550; Jacobs 2005, 22; Fuhrhop 2008, 193f.) (1) A graphematic word is a string of graphemes that is bordered by spaces and may not be interrupted by spaces. Problems: . <you.>, <you?>, <you!> . <Smiths’> (e.g. in the Smiths’ house), <mother-in-law> /gʁafematik/ | Martin Evertz-Rittich | 17.06.2020 Definition by spaces (Zifonun et al. 1997, 259; my translation) (1) A graphematic word is a string of graphemes that is bordered by spaces and may not be interrupted by spaces. (2) A graphematic word is a string of graphemes that is preceded by a space and may not be interrupted by spaces. Problems: . <you.>, <you?>, <you!> . <Smiths’> (e.g. in the Smiths’ house), <mother-in-law> . <“you”>, <(you)> /gʁafematik/ | Martin Evertz-Rittich | 17.06.2020 Towards a typographic definition: fillers and clitics . Characters and punctation marks can be divided into two classes (Bredel 2009) . Fillers . They can independently fill a segmental slot . Letters, numbers, apostrophes, hyphens . Clitics . They need the support of a filler . periods, colons, semi-colons, commas, brackets, question marks, quotation marks, exclamation marks /gʁafematik/ | Martin Evertz-Rittich | 17.06.2020 A typographic definition Evertz (2016a, 391-392 based on works of Bredel; my translation) (3) A graphematic word is a sequence of slot-filler-pairs surrounded by empty slots in which at least one filler must be a letter.
    [Show full text]
  • Resume Template
    SHUBHAM PATEL +91-7622086123 [email protected] http://shubhampatel.in/ EXPERIENCE AMAZON DEVELOPMENT CENTRE (INDIA) PVT. LTD. Bangalore, IN Software Development Engineer Aug ‘16-Present • Working with Seller Flex team on Warehouse Management System for Indian Sellers; launching worldwide in 2018. • Developed server-side multi-threaded RESTful APIs using Java. • Developed client-side web interface supporting Internationalization and Localization in Indic Languages. BERKMAN KLEIN CENTER, HARVARD UNIVERSITY Remote Open Source Software Developer (Google Summer of Code 2016) April ‘16-Aug ‘16 • Worked on easing reservations approval process and report generation in Book-a-nook. [code] • Developed 20 features including data visualization tool and notification system. [report][video][proposal] PRACTO TECHNOLOGIES Bangalore, IN Software Engineer – Intern Jan ‘16-April ‘16 • Created a scalable architecture for one-click production infrastructure replica. [term paper][slides] • Enabled automated software provisioning, code deployment and configuration management. AMAZON DEVELOPMENT CENTRE (INDIA) PVT. LTD. Bangalore, IN Software Development Engineer – Intern May ‘15-July ‘15 • Developed an automation tool for internal teams which facilitates periodic and on-demand notifications. • Successfully completed the project till beta production in 50% of the estimated time. BOOKLY WEB LLP. Gandhinagar, IN Technical Lead, Backend Developer Mar ‘14-Dec ‘14 • Led the team of 8 in building a college-oriented portal to buy/sell used books. [site][code] • Developed the server-side code in 2 months using Ruby on Rails. EDUCATION DHIRUBHAI AMBANI INSTITUTE OF INFORMATION AND COMMUNICATION TECHNOLOGY (DA-IICT) Gandhinagar, IN Bachelor of Technology, Information and Communication Technology, CPI: 8.57/10 2012-2016 • Student Advisor, Data Structures & Algorithms course, Autumn 2015-16 • Secondary Teaching Assistant, Data Mining and Warehousing course, Autumn 2015-16 • Head Manager of Google Developer Group, DA-IICT Chapter.
    [Show full text]
  • Automatic Syllabification in European Languages: a Comparison of Data-Driven Methods
    AUTOMATIC SYLLABIFICATION IN EUROPEAN LANGUAGES: A COMPARISON OF DATA-DRIVEN METHODS by Connie R. Adsett Submitted in partial fulfillment of the requirements for the degree of Master of Computer Science at Dalhousie University Halifax, Nova Scotia June 2008 c Copyright by Connie R. Adsett, 2008 DALHOUSIE UNIVERSITY FACULTY OF COMPUTER SCIENCE The undersigned hereby certify that they have read and recommend to the Faculty of Graduate Studies for acceptance a thesis entitled \AUTOMATIC SYLLABIFICATION IN EUROPEAN LANGUAGES: A COMPARISON OF DATA-DRIVEN METHODS" by Connie R. Adsett in partial fulfillment of the requirements for the degree of Master of Computer Science. Dated: June 24, 2008 Supervisors: Dr. Yannick Marchand Dr. Vlado Keselj Reader: Dr. Qigang Gao ii DALHOUSIE UNIVERSITY DATE: June 24, 2008 AUTHOR: Connie R. Adsett TITLE: AUTOMATIC SYLLABIFICATION IN EUROPEAN LANGUAGES: A COMPARISON OF DATA-DRIVEN METHODS DEPARTMENT OR SCHOOL: Faculty of Computer Science DEGREE: MCSc CONVOCATION: October YEAR: 2008 Permission is herewith granted to Dalhousie University to circulate and to have copied for non-commercial purposes, at its discretion, the above title upon the request of individuals or institutions. Signature of Author The author reserves other publication rights, and neither the thesis nor extensive extracts from it may be printed or otherwise reproduced without the author's written permission. The author attests that permission has been obtained for the use of any copyrighted material appearing in the thesis (other than brief excerpts requiring only proper acknowledgement in scholarly writing) and that all such use is clearly acknowledged. iii Syllables govern the world. - John Selden (1584{1654) iv Table of Contents List of Tables ..................................
    [Show full text]
  • Ancient Latin Grammar
    Synopsis of Ancient Latin Grammar J. Matthew Harrington Ph.D. Tufts University – Department of Classics © August 2016 - Version 3.0 Section I: Writing Systems and Pronunciation 1.1 Writing Systems 1 1.2 Vocalic Phonemes 2 1.3 Consonantal Phonemes 2 1.4 Chart of Phonemes and Symbols in Classical Latin 3 1.5 Aspiration 3 1.6 Conditioned Sound Change 3 1.7 Syllabification 4 1.8 Quantity 4 1.9 Accent 5 1.10 Elision 5 1.11 Proto-Indo-European (PIE) 5 1.12 Latin Dialects 5 1.13 Educated Latin Pronunciation in the Early 1st Century CE 8 1.14 Vulgar and Late Latin Pronunciation 9 Section II: Case and Syntactic Function 2.1 Syntax 10 2.2 Semantic Word Order 10 2.3 Case and Morpho-Syntax 11 2.4 Adjectival Modification 12 2.5 Nominative 12 2.6 Genitive 12 2.7 Dative 14 2.8 Accusative 16 2.9 Ablative 19 2.10 Vocative 21 2.11 Locative 22 2.12 The Four Forms of Agreement 23 Section III: Mood and Subordination 3.1 Coordination/Subordination 24 3.2 Syntax of Verbal Forms 24 3.3 Independent Subjunctive Usages 25 A. Hortatory Subjunctive 25 B. Deliberative Subjunctive 25 C. Potential Subjunctive 25 D. Optative Subjunctive 25 3.4 Dependent Usages 25 I. ADJECTIVAL CLAUSES 26 A. Relative Clause 26 B. Relative Clause of Characteristic 26 C. Relative Clause in Indirect Statement 26 II. ADVERBIAL CLAUSES 26 A. Purpose Clause 27 B. Result Clause 27 i C. Consecutive Clause 28 D. Conditional Constructions 28 E.
    [Show full text]