Python Module Index 9


indic_transliteration Documentation
Release 0.0.1
sanskrit-programmers
Mar 28, 2021

Contents

1 Submodules
  1.1 indic_transliteration.sanscript
    1.1.1 Submodules
      1.1.1.1 indic_transliteration.sanscript.schemes
        1.1.1.1.1 Submodules
  1.2 indic_transliteration.xsanscript
  1.3 indic_transliteration.detect
    1.3.1 Supported schemes
  1.4 indic_transliteration.deduplication
2 Indices and tables
Python Module Index
Index

sanscript is the most popular submodule here.

CHAPTER 1
Submodules

1.1 indic_transliteration.sanscript

1.1.1 Submodules

1.1.1.1 indic_transliteration.sanscript.schemes

1.1.1.1.1 Submodules

indic_transliteration.sanscript.schemes.roman
indic_transliteration.sanscript.schemes.brahmi

1.2 indic_transliteration.xsanscript

1.3 indic_transliteration.detect

Example usage:

    from indic_transliteration import detect

    # Scheme is documented below as a member of the detect module.
    detect.detect('pitRRIn') == detect.Scheme.ITRANS
    detect.detect('pitRRn') == detect.Scheme.HK

When handling a Sanskrit string, it’s almost always best to explicitly state its transliteration scheme. This avoids embarrassing errors with words like pitRRIn. But most of the time, it’s possible to infer the encoding from the text itself.

detect.py automatically detects a string’s transliteration scheme:

    detect('pitRRIn') == Scheme.ITRANS
    detect('pitRRn') == Scheme.HK
    detect('pitFn') == Scheme.SLP1
    detect('पितॄन्') == Scheme.Devanagari
    detect('পিতৄন্') == Scheme.Bengali

1.3.1 Supported schemes

All schemes are attributes on the Scheme class.
You can also just use the scheme name:

    Scheme.IAST == 'IAST'
    Scheme.Devanagari == 'Devanagari'

Scripts:
• Bengali ('Bengali')
• Devanagari ('Devanagari')
• Gujarati ('Gujarati')
• Gurmukhi ('Gurmukhi')
• Kannada ('Kannada')
• Malayalam ('Malayalam')
• Oriya ('Oriya')
• Tamil ('Tamil')
• Telugu ('Telugu')

Romanizations:
• Harvard-Kyoto ('HK')
• IAST ('IAST')
• ITRANS ('ITRANS')
• Kolkata ('Kolkata')
• SLP1 ('SLP1')
• Velthuis ('Velthuis')

indic_transliteration.detect.BLOCKS = [('Malayalam', 3328), ('Kannada', 3200), ('Telugu', 3072), ('Tamil', 2944), ('Oriya', 2816), ('Gujarati', 2688), ('Gurmukhi', 2560), ('Bengali', 2432), ('Devanagari', 2304)]
    Schemes sorted by Unicode code point. Ignore schemes with none defined.

indic_transliteration.detect.BRAHMIC_FIRST_CODE_POINT = 2304
    Start of the Devanagari block.

indic_transliteration.detect.BRAHMIC_LAST_CODE_POINT = 3455
    End of the Malayalam block.

class indic_transliteration.detect.Regex

    IAST_OR_KOLKATA_ONLY = <_sre.SRE_Pattern object>
        Match on special Roman characters

    ITRANS_ONLY = <_sre.SRE_Pattern object>
        Match on ITRANS-only

    ITRANS_OR_VELTHUIS_ONLY = <_sre.SRE_Pattern object>
        Match on chars shared by ITRANS and Velthuis

    KOLKATA_ONLY = <_sre.SRE_Pattern object>
        Match on Kolkata-specific Roman characters

    SLP1_ONLY = <_sre.SRE_Pattern object>
        Match on SLP1-only characters and bigrams

    VELTHUIS_ONLY = <_sre.SRE_Pattern object>
        Match on Velthuis-only characters

indic_transliteration.detect.Scheme
    Enum for Sanskrit schemes.
    alias of indic_transliteration.detect.Enum

indic_transliteration.detect.detect(text)
    Detect the input’s transliteration scheme.
    Parameters: text – some text data, either a unicode or a str encoded in UTF-8.

1.4 indic_transliteration.deduplication
CHAPTER 2
Indices and tables

• genindex
• modindex
• search

Python Module Index

i
  indic_transliteration

Index

B
  BLOCKS (in module indic_transliteration.detect)
  BRAHMIC_FIRST_CODE_POINT (in module indic_transliteration.detect)
  BRAHMIC_LAST_CODE_POINT (in module indic_transliteration.detect)

D
  detect() (in module indic_transliteration.detect)

I
  IAST_OR_KOLKATA_ONLY (indic_transliteration.detect.Regex attribute)
  indic_transliteration (module)
  ITRANS_ONLY (indic_transliteration.detect.Regex attribute)
  ITRANS_OR_VELTHUIS_ONLY (indic_transliteration.detect.Regex attribute)

K
  KOLKATA_ONLY (indic_transliteration.detect.Regex attribute)

R
  Regex (class in indic_transliteration.detect)

S
  Scheme (in module indic_transliteration.detect)
  SLP1_ONLY (indic_transliteration.detect.Regex attribute)

V
  VELTHUIS_ONLY (indic_transliteration.detect.Regex attribute)
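
Supplementary sketch for section 1.3.1. The BLOCKS, BRAHMIC_FIRST_CODE_POINT and BRAHMIC_LAST_CODE_POINT constants suggest that native-script input can be classified from code points alone. The following is a minimal illustration of that idea, not the library's actual implementation: the constant values are copied from the listing in section 1.3.1, while the function guess_brahmic_script is a hypothetical helper invented here, and stopping at the first Brahmic character is an assumption made for brevity.

```python
# Block starts copied from the BLOCKS listing in section 1.3.1.
# The list runs from the highest block start to the lowest, so the
# first entry whose start is <= the code point is the matching block.
BLOCKS = [
    ("Malayalam", 3328),
    ("Kannada", 3200),
    ("Telugu", 3072),
    ("Tamil", 2944),
    ("Oriya", 2816),
    ("Gujarati", 2688),
    ("Gurmukhi", 2560),
    ("Bengali", 2432),
    ("Devanagari", 2304),
]
BRAHMIC_FIRST_CODE_POINT = 2304  # start of the Devanagari block
BRAHMIC_LAST_CODE_POINT = 3455   # end of the Malayalam block


def guess_brahmic_script(text):
    """Hypothetical helper: name the script of the first Brahmic character.

    Returns None when the text has no character inside the Brahmic range,
    for example when it is a romanization such as ITRANS or HK.
    """
    for char in text:
        code = ord(char)
        if BRAHMIC_FIRST_CODE_POINT <= code <= BRAHMIC_LAST_CODE_POINT:
            for name, start in BLOCKS:
                if code >= start:
                    return name
    return None


print(guess_brahmic_script("पितॄन्"))   # Devanagari
print(guess_brahmic_script("pitRRIn"))  # None
```

A romanized string falls through to None in this sketch. Telling ITRANS, HK, SLP1 and the other romanizations apart would need character-level patterns such as those listed under the Regex class, which are not reproduced here.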