Finding Critical Points of Handwritten Persian/Arabic Character

Total Page:16

File Type:pdf, Size:1020Kb

Finding Critical Points of Handwritten Persian/Arabic Character International Journal of Machine Learning and Computing, Vol. 2, No. 5, October 2012 Finding Critical Points of Handwritten Persian/Arabic Character Majid Harouni, Dzulkifli Mohamad, Mohd Shafry Mohd Rahim, and Sami M. Halawani and recognizing the output of feature extraction as its inputs; Abstract—On-line handwritten character recognition system there exists some useful techniques for classification stage is a special line of research in image processing and pattern such as: neural networks, fuzzy logic, etc. So, the significant recognition field, it can also considered a special process of in part is to achieve a most desirable data set form the academic researches and production fields in the past decade. hand-drawn letters, i.e. in the pre-processing. The The general steps in a recognition system are pre-processing, segmentation, feature extraction and classifications. The polynomial approximation was used by singular value techniques of pre-processing stage play an excessive role in the decomposition of on-line handwriting coordinates [1]. In [2], system and directly affect the system performance. The focus of the pre-processing was to obtain a data acquisition by this article is on doing pre-processing stage and providing a smoothing, size normalization and resampling. The decision most desirable data set form the raw data using in feature tree was proposed from the list of a string of the X and Y extraction and then to increase the rate of recognition system. coordinates for data collection set [3], and smoothing For this reason, the novel algorithm is presented for finding a critical points set of hand-drawn letters. These critical points averages of these coordinates done with neighboring pixels, help extracting the correct structural and statistical features of then filtering by eliminating very close points as a minimum on-line character handwriting in any given style writing. distance between consecutive points [4]. In this case, we present a novel algorithm to find these critical points of each Index Terms—Persian and Arabic script, critical points, isolated letter using feature extraction; this algorithm based on-line character recognition, pre-processing. on the ratio of some point set in either both sides of the critical points, which abstains from using of smoothing or resampling and normalization techniques. I. INTRODUCTION This paper is presented as follows: section 2 exhibits the Pre-processing of the on-line handwriting recognition overview of the Persian and Arabic script. The system has always been challenging and worthwhile subject, pre-processing stage uses the proposed algorithm in section 3. and also gained much attention in recent years. In fact, the Section 4 reviews the common feature set of on-line Persian on-line handwriting recognition is one of the most successful and Arabic handwriting. Section 5 is devoted to the and scientific applications of image processing and pattern experimental and test results of the algorithm. The paper recognition. The common architecture of this system consists terminates in the concluding remarks in section 6. mainly of four stages: pre-processing, segmentation, feature extraction and recognition or classification. Pre-processing promotes the hand-drawn letter or raw data into a desired II. PERSIAN AND ARABIC SCRIPTS data set using in all other stages; the procedures of image Persian and Arabic are written from right to left. The processing may be used for this stage, such as interpolation, cursive natures of the scripts, styles of writing are their smoothing, normalization, etc. The stage of segmentation is inherent problems. The Persian letters are four letters more designed to divide the words/sub-words, and letters into their ;and 32 letters ,’پ‘ and ’گ’, ‘چ’, ‘ژ‘ ,than Arabic script characters and some particular parts respectively. Based on generally, some Arabic vowel sings (hamza, shadda, etc) do the first two stage (or one of them in terms of designing not use in Persian writing, The Persian/Arabic letters have up system), the structural and statistical features will be to four different shapes, depending on their preceding and extracted. Finally, a classifier is appointed for categorizing position in the word; for example character “Heh”: isolated In sum, Table 1 .”ه“ and final ,”ھ“ middle ,”ھ“ initial ,”ه“ Manuscript received May 25, 2012; revised September 18, 2012. This illustrates that all of Persian/Arabic alphabets can just be ا’, ‘ research is supported by the Ministry of Higher Education (MOHE) and connected from both sides; but only these seven letters collaboration with Research Management Center (RMC) Universiti can be connected to the ’و‘ and ’‘د’, ‘ذ’, ‘ر’, ‘ز’, ‘ژ Teknologi Malaysia (UTM). This paper is financial supported by GUP Grant (NO. VOT: Q.J130000.7128.01J18). former-letter from the right side. Thus, these letters Majid Harouni is with UTMViCube Lab, Faculty of Computer Science proximate cause discontinuity within the same word as its and Information Systems, Universiti Technologi Malaysia, Johor, Malaysia and with the Department of Computer Science, Islamic Azad University, sub words. Several of Persian/Arabic alphabets divide by the Dolatabad branch, Isfahan, Iran (e-mail: majid.harouni@ gmail.com). same main body and can be different only by a small Dzulkifli Mohamad and Mohd Shafry Mohd Rahim are with the completing part such as some dots, and some slanted lines. UTMViCube Lab, Faculty of Computer Science and Information Systems, Moreover, Fig. 1 illustrates apart from dotting in alphabet, Universiti Technologi Malaysia, Johor, Malaysia (e-mail: [email protected], and [email protected]). most letters are written in one stroke, only five Persian letters Sami M. Halawani is with the Faculty of Computing and Information require write two or more strokes. Technology, King Abdul Aziz University, Saudi Arabia (e-mail: As Table I is shown that there exist 18 of the 32 Persian [email protected]). 10.7763/IJMLC.2012.V2.192 573 International Journal of Machine Learning and Computing, Vol. 2, No. 5, October 2012 letters have dots that appear on above or below of them. TABLE II: NINE-GROUP OF BODY CHARACTER IN PERSIAN ALPHABET Exactly, there are 10 letters of them have one dot, 3 letters G. 1 G. 2 G. 3 G. 4 G. 5 G. 6 G. 7 G. 8 G. 9 د have two dots and 5 of them have three dots. The number of س ج ب dots of in each letters does not change with its four different ذ ف ک ع ط ش چ پ ا that has no dot in isolated’ی‘ forms, excepting the letter of ر ق گ غ ظ ص ح ت آ and final shapes, and instead two dots in its initial and middle ز ض خ ث shapes; hence, any addition or deletion of the dots or slanted ژ .lines can product a misinterpretation of the letters Correspondingly, there are many different styles of writing in Persian/Arabic which introduce different allograph for letters or some letter combinations. As Table 2 shows, there exists III. PRE-PROCESSING AND FINDING CRITICAL POINTS the nine-group of similar body character in Persian alphabet; The primary purpose of the pre-processing stage is to some letters only have same body in initial and middle shapes provide desired data input collection from the raw data via in which these on-line hand-drawn letters for other stages. The ,’ن‘ or the letter of ’ف‘ such as the letter of shapes can be considered same as second group. pre-processing stage includes sub stages such as interpolating or smoothing, resembling and etc. We make a decision based TABLE I: THE PERSIAN ALPHABET IN THE FOUR DIFFERENT SHAPES on which sub stages of pre-processing are useful and most Character Transliteration Isolated Initial Middle Final likely to utilize. So, it is clear that the time of recognition of .letters diminish by decreasing and limiting these sub stages ا ,ا ا ,ا آ, ا آ, ا Alef a We present the proposed pre-processing stage without any ب ب ب ب Beh b ,smoothing and resembling the hand-drawn letters پ پ پ پ Peh p segmentation of each hand-drawn letters into some strokes ت ت ت ت Teh t and dividing each of these strokes into sub-stroke as several ث ث ث ث Theh Th tokens (each sub-stroke is called one token) by the following ج ج ج ج Jeem j unique algorithm in coming sub section. چ چ چ چ Cheh ch A. Finding Critical Points and Dividing Stroke ح ح ح ح Heh h This sub section is based on our unique algorithm to divide خ خ خ خ Kheh kh one stroke into several different parts, which are called د د د د Dal d tokens. Utilizing this algorithm can aid us to find the corner ذ ذ ذ ذ Thal th points that mean local maximum and minimum points of each ر ر ر ر Reh r stroke. These points use for starting and ending point of each token. Our proposed algorithm is separated into four parts ز ز ز ز Zeh z (based on the hand-drawn letter is sequenced to a list of (x,y ژ ژ ژ ژ Zheh zh coordinate points as an input data set, where x and y س س س س Seen s represents the position of the selected points in terms of the ش ش ش ش Sheen sh ratio of the user’s speedwriting to whole length of the ص ص ص ص Sad s hand-drawn letter on a surface writing or tablet pc. ض ض ض ض Zad th Firstly, as noted already, the maximum number of strokes ط ط ط ط Tah t of Persian letters is up to 3 strokes. On the other hand, as ,Table 1 shows if each dot of letter is considered as one stroke ظ ظ ظ ظ Zah z then the maximum number of stroke is up to 4 strokes for ع ع ع ع Ain a every letter.
Recommended publications
  • Usage of Urdu As the Language of Elitism Among the Muslims of the Northern and the Deccan Parts of India: a Socio-Cultural Review
    Middle Eastern Journal of Research in Education and Social Sciences (MEJRESS) Website: http://bcsdjournals.com/index.php/mejrhss ISSN 2709-0140 (Print) and ISSN 2709-152X (Online) Vol.1, Issue 2, 2020 DOI: https://doi.org/10.47631/mejress.v1i2.28 Usage of Urdu as the Language of Elitism among the Muslims of the Northern and the Deccan parts of India: A Socio-Cultural Review Arshi Siddiqui, 1 Ismail Siddiqui 2 1 PhD, Barkatullah University, Bhopal (M.P), India. 2 Integrated Masters, Development Studies, IIT Madras, Chennai, (T.N), India Abstract Article Info Purpose: The paper examines how Urdu evolved from the language of the Article history: rulers to the lingua franca of Muslims in the modern times. The paper Received: 02 September 2020 attempts to highlight how Urdu is still being used as an identity marker for Revised: 08 October 2020 Muslims with respect to the other communities and is a source of Accepted: 18 October 2020 ascendancy, an achieved elitist status within the Muslims of the North and Deccan. Keywords: Approach/Methodology/Design: Socio-cultural analysis. Findings: The usage of Urdu as a political instrument by the Muslim Sociolinguistics, League and the cultural influence the language has exerted on the Muslim Urdu, community led to its usage as a source of elitism within the community in the South Asia, modern times. The analysis indicates that there is harking back to the highly Indian Muslims, Persianised, nastaliq form of Urdu, which was manifested in its literature in Elitism the twentieth century as the pure, hegemonic and the aspired language, true to the identity of the community.
    [Show full text]
  • Intelligent System for Persian Calligraphy Learning a Thesis
    INTELLIGENT SYSTEM FOR PERSIAN CALLIGRAPHY LEARNING A THESIS SUBMITTED TO THE GRADUATE SCHOOL OF APPLIED SCIENCES OF NEAR EAST UNIVERSITY By BEHZAD SOROUSH BARHAGHI In Partial Fulfillment of the Requirements for the Degree of Master. of Science Ill Information Systems Engineering NICOSIA, 2016 Behzad Soroush Barhaghi: Intelligent System for Persian Calligraphy Learn We certify this thesis is satisfactory for the award of the Degree of Master of Science in Information Systems Engineering Examining Committee in charge: Prof. Dr. Fahreddin Sa~rvisor, NEU . -- ~ Assist.Prof.~ Şekeroğlu,Departmentoflnf. SystemsEng.,NEU Assist.Prof. Dr. Y"sal Ever, SoftwareEng.Dept.,NEU I hereby declare that all information in this document has been obtained and presented in accordance with academic rules and ethical conduct. I also declare that, as required by these rules and conduct, I have fully cited and referenced all material and results that are not original to this work. Name,Lastname: ~u.J. ~JrdvJt /3eır~· Signature: L Date: 19.r,ıl ACKNOWLEDGEMENTS It is not possible to thank everybody who has had an involvement with me during the course of my studies for the degree of MSc. However, there are some people who must be thanked. Firstly, I would like to thank my supervisor Prof. Dr. Fahreddin Sadikoglu for his guidance and encouragement throughout the thesis. He has walked me through all the stages of the writing of my thesis. Without his consistent and illuminating instruction, this thesis could not have reached its present form. I would like to thank especially to my lovely wife whose encouragement and support has helped me achieve beyond my greatest expectations.
    [Show full text]
  • Reforms of Uzbek and Turkish Writing at the Beginning of the Xx Century
    CENTRAL ASIAN JOURNAL OF SOCIAL SCIENCES AND HISTORY VOLUME: 02 ISSUE: 02 | FEB 2021 (ISSN: 2660-6836) Available online at www.cajssh.centralasianstudies.org CENTRAL ASIAN JOURNAL OF SOCIAL SCIENCES AND HISTORY Journal homepage: www. http://cajssh.centralasianstudies.org/index.php/CAJSSH Reforms of uzbek and turkish writing at the beginning of the xx century Rustamov Zokir Toirovich Head of Department, Institute of Forecasting and Macroeconomic Research, Uzbekistan [email protected] A B S T R A C T A R T I C L E I N F O The article describes the spelling reforms in the Republics of Article history: Uzbekistan and Turkey in the 20-the 30s of the XX century. The Received 6 Nov 2020 Received in revised form 15 Dec 2020 implemented alphabets, the processes of regulating their grammar, Accepted 30 Jan 2020 documents related to fundamental changes in the socio-political life Available online 09 Feb 2021 of Turkestan, these changes were analyzed. Keywords: Spelling, Alphabet, writing, Alphabet, dictionary, language, textbook, literacy, Gazeta, article, lecture. Introduction The process of a radical change of the Uzbek spelling in the XX century is closely connected with the revolutionary movements in Turkestan. By the 1940s, this record had changed twice. Such changes in spelling have rarely occurred in the history of nations. Based on the analysis of documents related to the radical changes in the socio-political life of Turkestan, these changes can be divided into the following periods: 1st period. Years (1870-1929) based on Arabic script. This period is characterized by the emergence of the movement for the reform of the Arabic script as a result of the conquest of the country by the tsarist empire, the penetration of the alphabet based on Russian graphics and the attempts of the national intelligentsia, under the influence of Russian linguistics, to create a grammar of the Turkic language, similar to Western standards.
    [Show full text]
  • Preliminary Proposal to Encode the Book Pahlavi Script in the Unicode Standard
    UTC Document Register L2/13-141R Preliminary proposal to encode the Book Pahlavi script in the Unicode Standard Roozbeh Pournader, Google Inc. July 24, 2013 1. Background This is a proposal to encode the Book Pahlavi script in the Unicode Standard. Book Pahlavi was last proposed in the “Preliminary proposal to encode the Book Pahlavi script in the BMP of the UCS” (L2/07-234 = JTC1/SC2/WG2 N3294). Here is a short summary of the major differences in the character repertoire between that document and this proposal: • Characters only proposed here: an alternate form of d and a second alternate form of l (see section 7), one atomic symbol and two atomic ligatures (section 8), combining dot below (sections 3 and 6); • Characters proposed in L2/07-234 but not proposed here: ABBREVIATION TAA (representable as a character sequence, section 9), three archigraphemes (ambiguity, implementability, and multiple representation concerns, section 5), combining three dots below (no evidence of usage found), KASHIDA (unified with TATWEEL, sections 9 and 10), numbers (more information needed, section 10). 2. Introduction Book Pahlavi was the most important script used in writing the Middle Persian language.1 It probably started to be commonly used near the end of the Sassanian era (sixth century CE), evolving from the non-cursive Inscriptional Pahlavi. While the term “Book Pahlavi” refers to the script mostly surviving in books, Book Pahlavi has also been used in inscriptions, coins, pottery, and seals. Together with the alphabetic and phonetic Avestan script, the Book Pahlavi abjad is of religious importance to the Zoroastrian community, as most of their surviving religious texts have been written in one or both of the scripts.
    [Show full text]
  • Language Use in the Ottoman Empire and Its Problems
    LANGUAGE USE IN THE OTTOMAN EMPIRE AND ITS PROBLEMS (1299-1923) by Yelda Saydam Dissertation submitted in fulfilment of the requirements for the degree M. Phil. in the Faculty of Humanities (Department of Greek and Centre for Islamic Studies) at the University of Johannesburg Supervisor: Prof. B. Hendrickx Co-supervisor: Dr A. Dockrat Johannesburg 2006/7 Abstract The Ottoman Empire, an imperial power that existed from 1299 to 1923, was one of the largest empires to rule the borders of the Mediterranean Sea. Ottoman Turkish was used especially between the 16th and 19th centuries during the Ottoman Empire. This ornamented, artificial language separated the general population from intellectual and palace elite and a communication problem followed. Although the minorities of the Ottoman Empire were free to use their language amongst themselves, if they needed to communicate with the government they had to use Ottoman Turkish. This thesis explains these language differences and the resulting problems they created during the Empire. Examples of original correspondence are used to highlight the communication differences and the difficulties that ensured. From this study, the author concludes that Ottoman Turkish was not a separate language from Turkish; instead, it was a variation of Turkish in inexistence for approximately 600 years. I Preface My family and I came to South Africa from Turkey during August 2002 for my husband’s sabbatical as a post-doctoral fellow at University of The Witwatersrand. We both took a years leave from our jobs when we came to South Africa. I was working for Havva Özişbakan High School in İzmir, Turkey as a Turkish Language and Literature teacher.
    [Show full text]
  • Leila's Alphabet Journey Text Book 2019 -Reduced
    Leila’ s Alphabet Journey A Practical Guide to the Persian Alphabet By Parastoo Danaee Beginner Level !1 Contents To The Students 4 Introduction | Facts about Persian Language 6 Unit 1 | Persian Alphabet 14 Letter Forms 15 Persian Vowel Forms 17 Practice 1 18 Unit 2 | Basic Features of the Persian Alphabet 20 Practice 2 22 Unit 3 | Letter Forms 23 Non-Connecting Letter Forms 23 Letter Forms 24 Persian Vowel Forms 26 Practice 3 27 Unit 4| Features of the Persian Vowels 29 Short Vowels 29 Long Vowels 30 Diphthongs 30 Practice 4 31 Unit 5| Persian Letters Alef, Be, Pe, Te, Se 32 Practice 5 34 Unit 6| Persian Letters Dâl, Zâl, Re, Ze, Zhe 36 Practice 6 38 Unit 7| Persian Letters Jim, Che, He, Khe 40 Practice 7 42 Unit 8| Persian Letters Sin, Shin, Sât, Zât, Tâ, Zâ 44 Practice 8 46 Unit 9 | Persian Letters ‘Ain, Ghain, Fe, Gh"f 48 Practice 9 50 Unit 10| Persian Letters K"f, Gh"f, L"m,Mim 52 !2 Practice 10 54 Unit 11| Persian Letters Nun, V"v, He, Ye 56 Practice 11 58 Unit 12 | Short Vowels 60 Practice 12 61 Unit 13 | Long Vowels 63 Practice 13 64 Unit 14 | Additional Signs 66 Practice 14 67 !3 To The Students Welcome to Persian! Leila’s Alphabet Journey represents the first in a series of textbooks aimed at teaching Persian to foreign students and is followed by Leila Goes to Iran . Leila, the leading character is a generation 1.5 young lady who grow up in Los Angeles in a home in which Persian language is spoken.
    [Show full text]
  • Romanization of Arabic 1 Romanization of Arabic
    Romanization of Arabic 1 Romanization of Arabic Arabic alphabet ﺍ ﺏ ﺕ ﺙ ﺝ ﺡ ﺥ ﺩ ﺫ ﺭ ﺯ ﺱ ﺵ ﺹ ﺽ ﻁ ﻅ ﻉ ﻍ ﻑ ﻕ ﻙ ﻝ ﻡ ﻥ ﻩ ﻭ ﻱ • History • Transliteration • Diacritics (ء) Hamza • • Numerals • Numeration Different approaches and methods for the romanization of Arabic exist. They vary in the way that they address the inherent problems of rendering written and spoken Arabic in the Latin script. Examples of such problems are the symbols for Arabic phonemes that do not exist in English or other European languages; the means of representing the Arabic definite article, which is always spelled the same way in written Arabic but has numerous pronunciations in the spoken language depending on context; and the representation of short vowels (usually i u or e o, accounting for variations such as Muslim / Moslem or Mohammed / Muhammad / Mohamed ). Method Romanization is often termed "transliteration", but this is not technically correct. Transliteration is the direct representation of foreign letters using Latin symbols, while most systems for romanizing Arabic are actually transcription systems, which represent the sound of the language. As an example, the above rendering is a transcription, indicating the pronunciation; an ﺍﻟﻌﺮﺑﻴﺔ ﺍﻟﺤﺮﻭﻑ ﻣﻨﺎﻇﺮﺓ :munāẓarat al-ḥurūf al-ʻarabīyah of the Arabic example transliteration would be mnaẓrḧ alḥrwf alʻrbyḧ. Romanization standards and systems This list is sorted chronologically. Bold face indicates column headlines as they appear in the table below. • IPA: International Phonetic Alphabet (1886) • Deutsche Morgenländische Gesellschaft (1936): Adopted by the International Convention of Orientalist Scholars in Rome. It is the basis for the very influential Hans Wehr dictionary (ISBN 0-87950-003-4).
    [Show full text]
  • Alphabet and Phonetics
    Sim-Arabic Alphabet & Phonetics page 1 Alphabet and Phonetics The table below shows the Sim-Arabic characters. Each character of this alphabet corresponds to a character, or other orthographic mark, in the Arabic (or Persian) alphabet. Moreover, for those characters that have multiple pronunciations in Arabic (such as the semi-vowels), separate Sim- Arabic characters are used to represent this. The phonetic pronunciations (in both Arabic and Persian) using the International Phonetic Alphabet (IPA) are also shown. Arabic/ IPA IPA Arabic/ IPA IPA Sim-Arabic Persian (Arabic) (Persian) Sim-Arabic Persian (Arabic) (Persian) k k ك aː/ɑː ɒː k ا A l l ل ʔ ʔ l ء * m m م b b m ب b n n/ŋ ن t t n ت t h h ه θ s h ث x w و dʒ dʒ w ج j w و ħ h W ح H uː o و x x U خ K v و d d v د d j j ي ð z y ذ X iː iː ي r ɾ I ر r aː/ɑː iː/ɒː ى z z Y ز z a/at ة s s e س s a/æ æ فتحة ʃ ʃ a ش c i e كسرة sˤ s i ص S u o ضمة dˤ z u ض D n n تنوين tˤ t N ط T p پ ðˤ z p ظ Z tʃ چ ʕ ʔ C ع ^ ʒ ژ ɣ ɣ J غ R g گ f f g ف f q ɢ ق q The Sim-Arabic alphabet is a transliteration scheme, not a transphonation scheme.
    [Show full text]
  • Arabic Alphabet 1 Arabic Alphabet
    Arabic alphabet 1 Arabic alphabet Arabic abjad Type Abjad Languages Arabic Time period 400 to the present Parent systems Proto-Sinaitic • Phoenician • Aramaic • Syriac • Nabataean • Arabic abjad Child systems N'Ko alphabet ISO 15924 Arab, 160 Direction Right-to-left Unicode alias Arabic Unicode range [1] U+0600 to U+06FF [2] U+0750 to U+077F [3] U+08A0 to U+08FF [4] U+FB50 to U+FDFF [5] U+FE70 to U+FEFF [6] U+1EE00 to U+1EEFF the Arabic alphabet of the Arabic script ﻍ ﻉ ﻅ ﻁ ﺽ ﺹ ﺵ ﺱ ﺯ ﺭ ﺫ ﺩ ﺥ ﺡ ﺝ ﺙ ﺕ ﺏ ﺍ ﻱ ﻭ ﻩ ﻥ ﻡ ﻝ ﻙ ﻕ ﻑ • history • diacritics • hamza • numerals • numeration abjadiyyah ‘arabiyyah) or Arabic abjad is the Arabic script as it is’ ﺃَﺑْﺠَﺪِﻳَّﺔ ﻋَﺮَﺑِﻴَّﺔ :The Arabic alphabet (Arabic codified for writing the Arabic language. It is written from right to left, in a cursive style, and includes 28 letters. Because letters usually[7] stand for consonants, it is classified as an abjad. Arabic alphabet 2 Consonants The basic Arabic alphabet contains 28 letters. Adaptations of the Arabic script for other languages added and removed some letters, such as Persian, Ottoman, Sindhi, Urdu, Malay, Pashto, and Arabi Malayalam have additional letters, shown below. There are no distinct upper and lower case letter forms. Many letters look similar but are distinguished from one another by dots (’i‘jām) above or below their central part, called rasm. These dots are an integral part of a letter, since they distinguish between letters that represent different sounds.
    [Show full text]
  • Persian Orthography and Its Relation to Literacy
    P1: JDW View metadata,LE100-22.tex citation and similar LE100/Joshi-v1.cls papers at core.ac.uk June 20, 2005brought 20:14 to you by CORE provided by Middlesex University Research Repository 22 Persian Orthography and Its Relation to Literacy Bahman Baluch Middlesex University Persian orthography, which is a modified version of the Arabic script, is used for transcribing the Persian (Farsi) language, which is the major language spoken in Iran. Persian is also one of the two (Farsi and Urdu) major languages spoken in Afghanistan, and the main language in Tajikestan, a former central Asian republic of the former Soviet Union. However, the Persian spoken in these countries and the script used to transcribe the spoken language, particularly in Tajikistan, have been influenced by local factors and borrowed words. The focus of this chapter is on the Persian spoken in the present-day Iran and its relationship with the orthography, henceforth referred to as Persian orthography. In particular, the emphasis is on how literacy acquisition by Persian beginner (and skilled) readers may be affected by peculiarities of Persian orthography. Arguably, very little systematic research has been conducted on cognitive processes involved in the reading of Persian. It is hoped that this article will stimulate such research. After presenting factors that influence literacy acquisition in Persian, we take up the question of whether there should be changes to Persian orthography. INTRODUCTION This chapter is divided into two sections: The first section deals with the Persian orthography and the nature of its phonological and morphological structure. These features are reviewed from a historical perspective followed by a discussion of how these historical changes have affected modern Persian.
    [Show full text]
  • Linguistic Influence of Persian on South Asian Languages: Special Reference to Urdu
    Billah, A. SACS Special Issue 2018 Linguistic influence of Persian on South Asian languages: Special reference to Urdu Abu Musa Mohammad Arif Billah1 Abstract Iran has a rich linguistic and cultural heritage. The history of the development of Iranian languages from Old Persian to Avesta, Pahlavi and finally Persian goes back to time immemorial. Over this vast period, Iranian languages contributed a significant influence on South Asian languages. Lexicographic affinities and thematic likeness between the Avesta and the Rig-Veda suggest a prolonged cultural and literary relationship between Iran and the sub-continent. The constant traveling of the port cities in South Asia by the Iranian merchants and Sufis through sea-routes paved the way for development of the Persian language in the land since millennium BC. After the establishment of Muslim rule in Delhi, the Persian language begun to flourish in the region and gradually spread throughout the subcontinent. Urdu language developed mainly based on Persian. Bengal fell to Muslims in 1203 AD, which resulted in the replacement of the court language from Sanskrit to Persian. Muslim courts became a meeting place for both local and Iranian poets and intelligentsias during this period. The practice of Persian, Bengali, as well as Arabic as a religious language, begun to thrive promptly in the land. Many Sufis such as Nur Qutb-e Alam, Shaikh Sharafuddin Abu Tawwama contributed a lot in this process of development: the Bengali language and literature now received the full assistance of the Muslim rulers. The presence of more than 10000 Persian words in Bengali suggests a clear linguistic influence of Persian on the language.
    [Show full text]
  • Evolutionary Changes in Persian and Arabic Scripts to Accommodate the Printing Press, Typewriting, and Computerized Word Processing
    parh19-tug-evolution-persian-arabic-scripts.docx/pdf Evolutionary Changes in Persian and Arabic Scripts to Accommodate the Printing Press, Typewriting, and Computerized Word Processing Behrooz Parhami Department of Electrical and Computer Engineering University of California Santa Barbara, CA 93106-9560, USA [email protected] 1. Introduction 2. The Persian Script I have been involved in Iran’s computing scene for five Throughout this paper, my use of the term “Persian decades, first as an engineering student and instructor for script” is a shorthand for scripts of a variety of Persian five years, then as a faculty member at Tehran’s Sharif forms (Farsi/Parsi, Dari, Pashto, Urdu), as well of Arabic, (formerly Arya-Mehr) University of Technology for 14 which shares much of its alphabet with Persian. Work on years (1974-1988), and finally, as an interested observer adapting the Arabic script to modern technology has and occasional consultant since joining University of progressed in parallel with the work on Persian script, California, Santa Barbara, in 1988. Recently, I put with little interaction between the two R&D communities, together a personal history of efforts to adapt computer until fairly recently, thanks to the Internet. technology to the demands and peculiarities of the Persian language, in English [1] and Persian [2], in an effort to The Persian language has a 2600-year history, but the update my earlier surveys and histories [3-6] for posterity, current Persian script was adapted from Arabic some archiving, and educational purposes. 1200 years ago [7]. For much of this period, texts were handwritten and books were copied manually, or In this paper, I focus on a subset of topics from the just- reproduced via primitive printing techniques involving cited publications, that is, the three key transition periods etching of the text on stone or wood, covering it with a in the interaction of Persian script with new technology.
    [Show full text]