
Pattern matching in meter detection of Arabic classical poetry

Abdelmalek Berkani, Adrian Holzer, Kilian Stoffel
Information Management Institute, University of Neuchâtel, Neuchâtel, Switzerland

Abstract: Arabic classical poetry meter is a sequence of patterns. A poetry verse is characterized by a meter and consists of two parts. Detecting classical poetry meter is important for teaching purposes, for poetry and prose categorization, for authorship recognition and for computational aesthetics. Automatically detecting the meter of any single verse written as a normal sentence is challenging. We need a global approach that handles the phonological preparation of the verse, distinguishes the first part of the verse from the second, disambiguates meters and covers the concordance of the verse parts. To tackle this challenge, we introduce a novel solution called the Arabic Meters Identification System (AMIS) that combines an exhaustive pattern data set, pattern matching and similarity. We evaluate our system on a vocalized poetry corpus and reach a precision of 99.3%.

Keywords: Arabic classical poetry, Arabic Arud, Arabic prosody, Arabic poetry patterns, Arabic poetry meters

I. INTRODUCTION

Meter is an essential feature in classical poetry. It characterizes verse arrangement and harmony. Meter is used in teaching [1], in poem classification [2] and in authorship recognition. Besides other poetic features, meter is also used in computational aesthetics [3]. Thus, extracting the meter from a verse is important for both education and research purposes.

Automatic meter detection is a challenging task that needs a complete approach in terms of data collection and processing. In this article, we introduce AMIS, the Arabic Meters Identification System, a new global approach based on an exhaustive data set of pattern combinations, on pattern matching and on similarity. Since each meter may have many pattern alterations that lead to sequence redundancies and conflicts between meters [4], we need to disambiguate meter variants while keeping concordance between the two parts of the verse.

In classical poetry, even if the end of a verse part is written as a short vowel, it has to be pronounced as a long vowel. The last phoneme of each verse part has to be adapted accordingly [5]. In the context of a single poetry verse written as a normal sentence, the main issue is distinguishing between the verse parts. We have to adapt the data set to manage both cases, when the first part ends with a short vowel and when it ends with a long vowel.

To the best of our knowledge, the best accuracy of meter detection for a single poetry verse is 75% [6]. AMIS improves on existing work by focusing on both data and processing aspects. The data aspect consists of building an exhaustive data set of pattern combinations for all meter variants and pruning it by removing meter conflicts and non-concordant patterns between verse parts. The processing aspect covers the phonological preparation of the verse, syllable segmentation, exact pattern matching, and similarity to mitigate text preparation imperfections.

The next sections present background on Arabic poetry, review previous work on meter detection and motivate the approach we used. We then introduce our results and evaluation.

II. BACKGROUND

Classical poetry is the earliest form of Arabic literature [7]. A poem is a set of verses. Every verse (“bayt”) is composed of two parts: the first part (“Sadr”) and the second part (“Ajoz”). In some rare exceptions, the verse does not have a second part. The verse is characterized by a meter. Meters are identified by unique names and aim to define the rhythm of the verse.

In Arabic, the field of study of meters is called “Arud”, founded by Al-Khalil ibn Ahmad Al Farahidi (718–786). Arud is the prosody used for classical poetry in Arabic, Ottoman, Persian, Urdu and other eastern languages [1]. Table I shows the original sixteen classical meters with their respective pattern sequences.

A meter is an ordered set of patterns. A pattern is a named group of syllables (long: L, short: S, absence: A). The number, the type and the order of patterns make the difference between meters. The original sixteen meters of Table I are built on ten distinct patterns [5], shown in Table II.

TABLE I
ORIGINAL POETRY METERS

Meter        Part     Pattern sequence                                 Patterns
Tawil        First    fa’oolon mafaa’iilon fa’oolon mafaa’iilon        8
             Second   fa’oolon mafaa’iilon fa’oolon mafaa’iilon
Madid        First    faa’ilaaton faa’ilon faa’ilaaton                 6
             Second   faa’ilaaton faa’ilon faa’ilaaton
Bassit       First    mostaf’ilon faa’ilon mostaf’ilon faa’ilon        8
             Second   mostaf’ilon faa’ilon mostaf’ilon faa’ilon
Wafir        First    mofaa’alaton mofaa’alaton fa’oolon               6
             Second   mofaa’alaton mofaa’alaton fa’oolon
Kamil        First    motafaa’ilon motafaa’ilon motafaa’ilon           6
             Second   motafaa’ilon motafaa’ilon motafaa’ilon
Hazaj        First    mafaa’iilon mafaa’iilon                          4
             Second   mafaa’iilon mafaa’iilon
Rajaz        First    mostaf’ilon mostaf’ilon mostaf’ilon              6
             Second   mostaf’ilon mostaf’ilon mostaf’ilon
Ramal        First    faa’ilaaton faa’ilaaton faa’ilaaton              6
             Second   faa’ilaaton faa’ilaaton faa’ilaaton
Sarii        First    mostaf’ilon mostaf’ilon faa’ilon                 6
             Second   mostaf’ilon mostaf’ilon faa’ilon
Monsarih     First    mostaf’ilon maf’oolaato mostaf’ilon              6
             Second   mostaf’ilon maf’oolaato mostaf’ilon
Khafif       First    faa’ilaaton mostaf’i lon faa’ilaaton             6
             Second   faa’ilaaton mostaf’i lon faa’ilaaton
Modarii      First    mafaa’iilon faa’i laaton                         4
             Second   mafaa’iilon faa’i laaton
Moqtadab     First    maf’oolaato mostaf’ilon                          4
             Second   maf’oolaato mostaf’ilon
Mojtath      First    mostaf’i lon faa’ilaaton                         4
             Second   mostaf’i lon faa’ilaaton
Motaqaarib   First    fa’oolon fa’oolon fa’oolon fa’oolon              8
             Second   fa’oolon fa’oolon fa’oolon fa’oolon
Motadaarak   First    faa’ilon faa’ilon faa’ilon faa’ilon              8
             Second   faa’ilon faa’ilon faa’ilon faa’ilon

TABLE II
ORIGINAL POETRY PATTERNS

Pattern        Syllables
faa’ilon       (LSL)
fa’oolon       (SLL)
mafaa’iilon    (SLLL)
mostaf’ilon    (LLSL)
mostaf’i lon   (LLS L)
mofaa’alaton   (SLSSL)
motafaa’ilon   (SSLSL)
maf’oolaato    (LLLS)
faa’ilaaton    (LSLL)
faa’i laaton   (LS LL)

The syllable sequence determines the pattern names and the meter. The following example shows an original occurrence of the meter “Kamil” with its six patterns.

{ Six groups of syllables: three in each part }
[(SSLSL) (SSLSL) (SSLSL)] [(SSLSL) (SSLSL) (SSLSL)]
{ Six corresponding patterns: three in each part }
[(motafaa’ilon) (motafaa’ilon) (motafaa’ilon)] [(motafaa’ilon) (motafaa’ilon) (motafaa’ilon)]

In terms of original pattern usage, [4] notes that most poetry verses do not conform to the original form of the meters. Poets use variants instead. A pattern variant occurs when its syllable sequence changes. A meter variant is an alteration of the original form of a meter due to a change in its patterns or in their number. A meter variant that keeps the same number of patterns as the original form is called “Complete”. Any variant that uses fewer patterns than the original form is called “Partial”. For example, “Kamil” Complete is a variant with six patterns, like the original one, and “Kamil” Partial is a version that has four patterns.

III. RELATED WORK

Researchers have proposed meter detection methods based on the sixteen theoretical meters and patterns in three steps [6]. The first step is text conversion, which keeps only the spoken letters. The second step is the segmentation phase, where the text is converted to syllables. The meter is detected in the last step by comparing the syllable sequence with a previously stored grammar. The data set used for evaluation consists of 128 verses from different Arabic poems, and the reported success rate is 75%. Others have used a recurrent neural network (RNN) to detect sixteen Arabic poetry meters and four English meters with an overall accuracy of 96.38% and 82.31%, respectively [8].

Further research goes beyond the theoretical meters and considers variants of the original meters [9]–[12]. For instance, [9] proposes a detection method based on editing, consultation and knowledge base modules. The authors evaluate the system on 20 poems and report good results without giving figures on detection accuracy. Further, in [10], researchers encode the prosody of each input text using Khashan’s method called “numerical prosody”. The authors report an overall accuracy of 98.6% based on whole-poem evaluation.

The rule-based approach presented in [11] describes an algorithm that detects the correct meter in five steps. The algorithm is based on predefined rules for converting text into prosody form. It uses only the first part of the verse. The algorithm was evaluated on a sample of classical Arabic poems and achieves an accuracy of 82%.

In terms of meter usage and value, the authors of [12] show that linguistic features based on the Arabic poetry meters are good attributes for authorship attribution. They argue that meter-based features outperform the linguistic features commonly used in authorship studies, such as word frequencies. They have also shown that features of Arabic classical poetry meters are suitable for distinguishing authors in English as well as in Arabic.

Other researchers have also used the Arabic poetry meter as a feature to distinguish authors in authorship attribution [13]–[15].

The main difference between the approaches is the detection phase. Some of them use only the first part of the verse [11], with the risk of detecting a prose sentence or free poetry as a classical poetry verse. Some methods use the whole poem or more than one verse [1], or rely on the writing style [16], special characters or spaces as criteria to separate the first part from the second part. This approach is challenging because different styles may occur in the same poem.

Other methods use only theoretical set of patterns that
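To make the Background definitions concrete, the following minimal Python sketch expands a meter into its syllable string using the pattern codes of Table II and then looks for an exact match, as in the “Kamil” example. It is only an illustration under stated assumptions, not the AMIS implementation: the names `PATTERN_SYLLABLES`, `ORIGINAL_METERS`, `part_syllables` and `match_original_meters` are hypothetical, only a subset of the meters of Table I is included, and meter variants, the split patterns and similarity matching are left out.

```python
# Syllable codes of the eight unsplit patterns listed in Table II
# (L = long syllable, S = short syllable).
PATTERN_SYLLABLES = {
    "faa'ilon": "LSL",
    "fa'oolon": "SLL",
    "mafaa'iilon": "SLLL",
    "mostaf'ilon": "LLSL",
    "mofaa'alaton": "SLSSL",
    "motafaa'ilon": "SSLSL",
    "maf'oolaato": "LLLS",
    "faa'ilaaton": "LSLL",
}

# Pattern sequence of one verse part for a few original meters of Table I.
# In the original forms, the second part repeats the first.
# (Hypothetical subset chosen for illustration.)
ORIGINAL_METERS = {
    "Wafir": ["mofaa'alaton", "mofaa'alaton", "fa'oolon"],
    "Kamil": ["motafaa'ilon"] * 3,
    "Hazaj": ["mafaa'iilon"] * 2,
    "Rajaz": ["mostaf'ilon"] * 3,
    "Ramal": ["faa'ilaaton"] * 3,
    "Motaqaarib": ["fa'oolon"] * 4,
}


def part_syllables(meter):
    """Concatenate the syllable codes of one verse part of `meter`."""
    return "".join(PATTERN_SYLLABLES[p] for p in ORIGINAL_METERS[meter])


def match_original_meters(first_part, second_part):
    """Return the meters whose original form matches both verse parts exactly."""
    return [
        meter
        for meter in ORIGINAL_METERS
        if part_syllables(meter) == first_part
        and part_syllables(meter) == second_part
    ]


kamil_part = "SSLSL" * 3
print(match_original_meters(kamil_part, kamil_part))  # ['Kamil']
```

Requiring both parts to match reflects the concordance between verse parts; a verse whose syllable string differs from every original form returns an empty list, which is where variant data and similarity would be needed.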