Pāli Lookup System User Manual

Pāli Lookup version 2.0
Copyright © Aukana Trust, Bradford on Avon, United Kingdom

This program is offered for free distribution.

System Requirements

Monitor: minimum 800x600 resolution
Operating System: Windows 95 or later
CPU: It runs fine on a slow Pentium. It is believed that it would also work on a 486, but this hasn't been checked.

Pāli Lookup System: Installation

It's as easy as this:

1. Place the CD-ROM in your CD-ROM drive.
2. Double-click on the Windows 95 (or later) 'My Computer' icon. Then double-click on your CD-ROM drive.
3. Double-click on the file 'setup.exe'. This will run the setup program. It's only a matter of following the dialogue boxes contained in it.
4. When it's finished, there will be a 'Pali Lookup' option in the Programs menu under the Start menu. Click on it, and enjoy!

Using the Pāli Lookup System

The Main Dialogue Box

When the user first enters the system, he or she is met with this dialogue box. The screen is divided into two parts. The top half of the dialogue box is the area where searches are specified and initiated. The bottom half of the dialogue box, below the black line, is where saved words can be viewed and manipulated. The Saved-Words Area will be dealt with later. For now we are going to look at conducting a search.

Running a Search

Using the system is as simple as this: type the word you are looking for (such as the user typing 'gacc' in the figure above) and press the Search button. In a moment, the system will show all the words that match your query. The system does, however, offer a few options to make your search more effective.

Search Types (Wildcard options)

The system offers some different search types.

• Look for exact match: The system tries to find an exact match for the word you entered in the dictionary. If you entered 'gacc', as above, you would get no result, because there is no such word in Pāli. If you wanted to look up a word like gacchati, you'd have to know and type out the entire word. This is great if you know the exact word you are looking for. Know, though, that the dictionary only contains root forms of words, and the words you will encounter in texts will be inflected.
• Entry is at front of Pāli word: This will pick up all words that start with the string you have entered. It is equivalent to using a wildcard character at the end of the string (string*). The user who entered 'gacc' above will get words like gacchati, gaccha, gacchi, and so on. This option is the default of the system, because most words one comes across in texts are inflected, i.e., they have an ending. Unless you know all the endings and can accurately tell what the root form would be, it is probably easier to chop off the last few characters and run a query on what remains. Usually, the word you are looking for will be in the search-results table. If not, then perhaps you chopped off a few characters too many, and it's time to try again. You can change the default setting in the wildcards box at any time. Whatever you choose will remain in effect until you change it again, or the session ends.
• Entry is at end of Pāli word: This is like the previous option, except that the wildcard character is at the beginning of the string (*string). Entering 'gacc' wouldn't return anything here, because there are no Pāli words that end this way. But if you entered something like jānāti, you'd pick up jānāti itself, and also anujānāti, abhijānāti, ājānāti and so on.
• Entry is anywhere in Pāli word: With this option selected, the system looks up all words that contain your string, whether it is at the beginning, end or middle of the word. Entering 'gacc' here would produce a list that included all the words we saw earlier with 'Entry is at front of Pāli word', but it would also return words like adhigacchi, paṭigacca, saṅgacchati and so on.

Accents options

This option was added simply because it is sometimes difficult to remember whether the word you are looking for started with a long a or a short a, or whether it had a plain 't' or a 't' with a dot underneath. If you think it could be either, then it is easiest to click the 'Ignore accents' radio button and let the system find what it can.

• Use accents: This is the default and assumes that you want words whose diacritics match the word or string as you entered it. If you typed in 'anna' (and you are looking only for exact matches), no words would be returned, because there is no such word in the dictionary. If you typed 'añña', you would get this one word, and only this word, returned.
• Ignore accents: What if you were looking for añña, but weren't sure about the diacritics? Easy: just type in 'anna', select the 'Ignore accents' radio button and press the 'Search' button. Up will pop aṇṇa, añña, and aññā.

Include words options

The Include words option restricts your search to whatever grammatical group you specify. Only words that both resemble the search word you entered and belong to the grammatical group you specified will be returned in the search-results dialogue box.

• All words: This is the default, and when you submit your search, every word in the dictionary will be considered.
• Nouns & Adjectives: Press this and when you submit your search, only nouns and adjectives will be looked at. Note that nouns include pronouns and adjectives include numerals.
• Verbs: Press this and only finite verbs will be considered. This includes all tenses (present, past, optative, etc.) and excludes all participles. They're in the next group.
• Participles/Verbals: Press this and only participles will be considered. There are five types of participles included in the dictionary: past participles, present participles, future passive participles, gerunds (also known as absolutives), and infinitives.
• Indeclinables: Press this and only indeclinable words will be considered. This is a grammatical group in Pāli that does not take inflections and includes adverbs, interjections, conjunctions and assorted other particles.

With all this information you are fully set up to conduct a first search. Press the Search button and let the program run. It only takes a second or two and up pops the search-results dialogue box, which shows all the words in the dictionary that match the specifications you made.

There is a Help/About button, but mostly what it does is direct you to this User Manual. It also gives the copyright notice and some credit to data sources used by the system.

The search-results dialogue box will be described shortly, but first we will have a look at the lower half of the main dialogue box, the saved-words area.

The Saved-Words Area

At the bottom of the main [...] one, such as the search-results dialogue box, permit the user to [...]

After you've saved a few words, the main dialogue box might look something like the figure above.

This saved-words list is useful for at least two reasons.

First, if you are anything like the creator of the system, you might find yourself reading a sutta. You look up a word and read its meaning. You understand the sentence and move on to the next one. The word appears in that sentence as well, but by this point you've already forgotten what it means and you have to look it up again. The saved-words list helps in that you can keep track of words you've already looked up, because often a sutta will use a word over and over again.

A second benefit is provided through the Export button. You can work your way through a sutta, saving all the words you don't know, and then export the list. You can print off the list using spreadsheet or word-processing software. Presto: you have a vocabulary for that particular sutta, and should you want to read it again, that list will make going through the sutta a lot quicker (unless you memorised every last word at the first [...]

Buttons for the saved-words list

The saved-words area has three buttons, and their meanings are these:

• Export List: This permits you to export the list so that you can work with it somewhere else. A Windows 'Save as' dialogue box will pop up, which will allow you to specify the file name. The data will be saved in comma-delimited format.
• Delete Row: If you click on a row to select it (such as karī in the figure of the saved-words list), then press this button, the row will be eliminated from the list.
• Delete List: Press this and the entire list will be deleted. Note that there is no way to get it back.

One other note about the saved-words area: if the information in the Meaning field is wider than the field width, you can widen the column. If you move the mouse cursor to the right edge of the field, the shape of the cursor will change and permit you to widen the column.
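The four search types and the 'Ignore accents' option described earlier can be pictured with a short sketch. This is only an illustration in Python, not the program's actual code: the word list, the function names `fold` and `search`, and the mode names are all hypothetical, and it assumes the Pāli words are stored as Unicode text (ā, ñ, ṇ and so on) rather than in a legacy font encoding.

```python
import unicodedata

# Hypothetical miniature word list; the real dictionary data is not part of this manual.
WORDS = ["gacchati", "adhigacchi", "jānāti", "anujānāti", "añña", "aṇṇa", "aññā"]


def fold(text):
    """Strip diacritics: decompose to NFD, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search(entry, mode="front", ignore_accents=False, words=WORDS):
    """Return the words that match `entry` under one of the four search types."""

    def key(s):
        # Compare in a single normal form; fold accents away only when asked to.
        s = unicodedata.normalize("NFC", s)
        return fold(s) if ignore_accents else s

    needle = key(entry)
    tests = {
        "exact": lambda w: w == needle,
        "front": lambda w: w.startswith(needle),  # like string*
        "end": lambda w: w.endswith(needle),      # like *string
        "anywhere": lambda w: needle in w,        # like *string*
    }
    matches = tests[mode]
    return [w for w in words if matches(key(w))]


print(search("gacc"))                             # ['gacchati']
print(search("gacc", mode="exact"))               # []
print(search("gacc", mode="anywhere"))            # ['gacchati', 'adhigacchi']
print(len(search("anna", "exact", True)))         # 3
```

With accents ignored, both the entry and each dictionary word are folded before comparison, which is why a plain 'anna' can match three differently accented words in this toy list.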