Language of Text Messages: A Corpus Based Linguistic Analysis of SMS in Pakistan


By Malik Naseer Hussain
Reg. No. 19-FLL/PhDEng/F-07

Supervisor: Dr. Raja Nasim Akhtar

A dissertation submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in English (Linguistics)

DEPARTMENT OF ENGLISH
FACULTY OF LANGUAGES AND LITERATURE
INTERNATIONAL ISLAMIC UNIVERSITY ISLAMABAD
2013

IN THE NAME OF ALLAH, THE MOST GRACIOUS, THE MOST MERCIFUL

Say: Truly, my worship, and my sacrifice, and my living, and my dying are for Allah, Lord of the Worlds. (Al-Qur’an, 6, 162)

Acceptance by the Viva Voce Committee

Language of Text Messages: A Corpus Based Linguistic Analysis of SMS in Pakistan

Name of Student: Malik Naseer Hussain
Registration No: 19-FLL/PhDEng/F-07

Accepted by the Department of English, Faculty of Languages & Literature, International Islamic University Islamabad, in partial fulfilment of the requirements for the award of the Doctor of Philosophy degree in English.

Viva Voce Committee

Dr. Munawar Iqbal Ahmed, Acting Dean, Faculty of Languages & Literature, International Islamic University, Islamabad
External Examiner: Dr. Riaz Hassan, Dean, Faculty of Social Sciences, Air University, E-9 Islamabad
Dr. Munawar Iqbal Ahmed, Chairman, Department of English, International Islamic University, Islamabad
External Examiner: Dr. Wasima Shehzad, Professor (Linguistics), Department of Humanities, Air University, E-9 Islamabad
Internal Examiner: Dr. Ayaz Afsar, Associate Professor, Department of English, International Islamic University, Islamabad
Supervisor: Dr. Raja Nasim Akhtar, Professor, Department of English, University of AJ&K, Muzaffarabad

07 October, 2013

FOREIGN EVALUATION

As per HEC policy, the thesis has been evaluated by the following foreign experts from technologically/academically advanced countries.

1. Prof. Dr. Marina Bondi, Professor, Department of Linguistics and Cultural Studies; Dean, Faculty of Humanities, University of Modena & Reggio Emilia, Italy. Email: [email protected]
2. Dr. Paul Thompson, Director, Centre for Corpus Research, Arts Building, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK. Email: [email protected]

DECLARATION

I, Malik Naseer Hussain, Registration No. 19-FLL/PhDEng/F-07, a student of PhD in English Linguistics at International Islamic University Islamabad, do hereby solemnly declare that the dissertation submitted by me in partial fulfilment of the requirements for the degree of Doctor of Philosophy in English (Linguistics) is my original work, except where otherwise acknowledged in the dissertation, and has not been submitted earlier, and shall not be submitted by me in future for obtaining any degree from this or any other university.

Malik Naseer Hussain
Dated: 7th October 2013

ACKNOWLEDGEMENTS

First and foremost, all thanks and gratitude to Allah Almighty for giving me the strength and opportunities to complete my PhD degree. After Allah, I pay homage to the last Prophet of Allah, Muhammad (PBUH), whose Sunnah is to thank people. He says, “لَا يَشْكُرُ اللَّهَ مَنْ لَا يَشْكُرُ النَّاسَ” / “He does not thank Allah who does not thank people” (Sunan Abu-Dawud, 4811).

In this regard, I owe thanks to a large number of people and institutions. The first among them is the Higher Education Commission, for offering me a merit based scholarship for PhD studies. Next, I am indebted to my research supervisors Prof. Dr. Raja Nasim Akhtar (my PhD supervisor) and Prof. Dr. Ayaz Afsar (my MPhil supervisor) for their support and guidance. I owe a great debt to the Chairman, Department of English, IIUI, Prof. Dr. Munawar Iqbal Ahmed, for his unflinching support in every administrative matter.

There are many teachers whom I thank for their guidance at different phases of my academic career. My special thanks go to my IIUI teachers, Prof. Dr. Ahsan Wahga, Prof. Dr. Qabil Khan, Prof. Dr. Maqsood Alam Bukhari, Prof. Dr. Muhammad Rasheed, Prof. Dr. Nabi Baksh Jumani, Prof. Dr. Safeer Akhtar, Prof. Dr. Safeer Awan; my teachers at AIOU, Prof. Dr. Rehana Masrur, Prof. Dr. Naveed Sultana; and my teachers at Govt. Postgraduate College Attock, Prof. Ashiq Hussain, Prof. Altaf Baig, Prof. Zaheer Ahmed, Prof. Mahmood Khattak, Prof. Shakeel Ahmed, and Prof. Sharif Zaid.

Without good friends, one cannot easily accomplish a long journey of studies. So I wish to thank all my colleagues and friends whose support and good wishes remained with me during my educational career. I especially thank my PhD class fellows Ibrar Anver Sheikh, Khalid Mahmood, Tassadaq Hussain, Muhammad Nawaz, Abid Qureshi, Aleem Shakir, Riaz Dasti; my university fellows Muhammad Zamurd, M. Azhar Khan, Mudassar Nazir, Abdul Qayyum, Hafiz Zahid, Muslim Khan, Muhammad Saleem, Ameer Sultan; my friends Abdul Qayyum Sahar, Umer Farooq, Tariq Mehmood Sheikh, Najam-us-Saqib, Yasir Mahmmood, Atif Haseeb, Muhammad Younas Khan, Arshad Mahmood Nashad, Shuhaib Anwer, Masood Anwer, Faheem Qureshi, Amanat Khan, Muhammad Farooq, Abdul Qadoos; my colleagues Abdul Majid Sajjad, Kamran Jamil, M. Khizer, Qaiser Baig, Khurram Baig, Khawar Mehmmood Khokhar, Ashiq Abbasi, Zuhair Wasti, Kamal Pasha, Sajeela Shabir, and Munawar Kashif.

Within my family, my greatest unpaid debt is to my father (may his soul rest in peace). What I am today is entirely because of him. People have many dreams in their lives, but I was the only dream of his life, so I can never repay that debt. In addition to him, there is a whole chain of relatives to whom I owe thanks and apologies. I apologize to my family, as I could not give them much time. I offer my heartfelt thanks and recognition to my parents, my wife, my sons Bilal Haider, Awais Haider, my daughters Arooj Fatima, Tehreem Fatima, my kin Malik Ibrar Hussain, Malik Zaheer Hussain, and Malik Sajjad Hussain.

I also thank all those who provided data for this study, or helped in collecting the data. Thanks to all whose works I used or consulted. My special thanks go to Prof. Laurence Anthony for providing AntConc, the corpus linguistics software, free of cost.

While acknowledging people, I cannot fail to mention some places and institutes which have played an unseen role in my growth. I especially mention my native town Padhrar in District Khushab, F. G. Boys Jamshaid High School Attock Cantt, Govt. Postgraduate College Attock, and International Islamic University Islamabad. They have had a great impact on my personality.

In brief, the list of thanks and gratitude is long but the conventional space available for acknowledgements is short, so without naming further, I thank all unnamed and unspecified friends, colleagues, relatives, students, and institutes for the part they have played in my life and career.

ABSTRACT

The thesis presents a corpus based empirical analysis of the language used in SMS text messaging in Pakistan. The study is descriptive in nature and examines the types, causes, and patterns/principles of various linguistic adaptations made in text messages. It also provides historical insights, and discusses the linguistic-cum-educational impacts and implications of these adaptations.

Primarily, the study is based on the linguistic analysis of an SMS corpus of 5000 interpersonal text messages collected from Pakistani texters. For triangulation purposes, it also examines the metalinguistic perceptions of 500 texters who also provided their personal text messages for the study. The study explores linguistic adaptations in six major categories, i.e. lexical, syntactic, punctuation, space, code, and script adaptations in text messages.

The study found that linguistic adaptations in text messages are mostly made under certain principles/patterns. Most intentional adaptations are caused by three major factors: the wish to be economical in the use of time and effort, to be creative/innovative in developing new language patterns, and to be rapid in SMS communication. Some unintentional adaptations are caused by the careless attitude or poor language command of texters. In addition, many punctuation adaptations are specifically made for paralinguistic purposes. Code alterations are made in the bilingual settings of texters. Lastly, the Roman script is preferred because most Pakistani texters are not adept in typing the Arabic/Urdu script, and so they use the Roman script for both Urdu and English.

In the historical perspective, these adaptations are not completely new in nature because their traces are found in history or in other modes of communication. Moreover, these adaptations have various linguistic/educational impacts, and these impacts lead to certain implications for the conventional standards of languages and their teaching.

CONTENTS

CHAPTER 1: INTRODUCTION
1.1 Background of the Study
1.2 Statement of the Problem
1.3 Objectives and Research Questions
1.4 Significance of the Study
1.5 Limitations and Delimitations
1.6 An Overview of Research Methodology
1.7