Typesetting Bengali in TeX
Anshuman Pandey

1 Introduction