Multilingual scientific e-document on the Web

Azzeddine LAZREK

University Cadi Ayyad, Faculty of Sciences, Department of Computer Science, Marrakesh, Morocco

[email protected] http://www.ucam.ac.ma/fssm/rydarab

2008

Plan

1 MathML & Arabic notation
  MathML
  Adaptation to the Arabic notation

2 Dadzilla – Arabic MathML Navigator
  Dadzilla
  Versions
  ArMathmlEd – Mathematical editor
  Extension

3 CtoArP – CMathML to PMathML transformation
  Motivations
  CtoArP
  ArSelector – Notation selector
  Results

MathML

Mathematical Markup Language, W3C standard - 1999

Aim
Allowing mathematics to be transmitted, processed and published on the Web

MathML 1 and 2
The first two versions of MathML describe only mathematics in the European languages

Separation
Presentation MathML
Content MathML

Content MathML is generally language-neutral. By contrast, presentation MathML necessarily targets a specific language and its notational conventions.

Content MathML

Semantically, an Arabic mathematical expression has the same functionality as its Latin equivalent: the content MathML tree has the same skeleton, and only the content of the token elements differs.
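The claim about identical skeletons can be checked mechanically. The sketch below is only an illustration: the two expressions (square root of an identifier) and the choice of the Arabic letter seen (U+0633) as the counterpart of x are assumptions, and the helper names skeleton and tokens are hypothetical. The MathML namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Content MathML for "square root of x" with a Latin identifier.
LATIN = "<apply><root/><ci>x</ci></apply>"
# The same expression with an Arabic identifier (letter seen, U+0633).
ARABIC = "<apply><root/><ci>\u0633</ci></apply>"


def skeleton(element):
    """Return the tree as nested tuples of tag names, ignoring all text."""
    return (element.tag, tuple(skeleton(child) for child in element))


def tokens(element):
    """Return the text content of every element that carries text."""
    return [e.text for e in element.iter() if e.text and e.text.strip()]


latin_tree = ET.fromstring(LATIN)
arabic_tree = ET.fromstring(ARABIC)

# Same skeleton, different token content.
print(skeleton(latin_tree) == skeleton(arabic_tree))
print(tokens(latin_tree), tokens(arabic_tree))
```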

Presentation MathML

Here the rendering aspects must be taken into account: keeping the same presentation MathML tree skeleton and changing only the content of the token elements is insufficient.

Direction

[Figure: example formula illustrating writing direction]

Bidi Unicode algorithm

[Figure: Arabic text mixed with the French phrase "Le nombre zéro" ("The number zero")]

Additional variants

mathvariant attribute

Mirrored symbols

The attributes lspace/rspace and lquote/rquote are to be interpreted as opening/closing rather than left/right, respectively.
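As a rough illustration of why "opening/closing" is the safer reading, the Unicode character database already records which characters are drawn mirrored in right-to-left context and which bidirectional class each character belongs to. The sketch below only queries Python's standard unicodedata module; the sample characters are an arbitrary choice and the helper name describe is hypothetical.

```python
import unicodedata


def describe(char):
    """Return (bidi class, mirrored flag) for one character."""
    return unicodedata.bidirectional(char), bool(unicodedata.mirrored(char))


# A parenthesis, a relation sign, an operator, a Latin letter,
# the Arabic letter seen, a European digit and an Arabic-Indic digit.
for char in ["(", "<", "+", "x", "\u0633", "1", "\u0663"]:
    bidi_class, mirrored = describe(char)
    print(f"U+{ord(char):04X} bidi={bidi_class} mirrored={mirrored}")
```

Characters flagged as mirrored, such as the parenthesis, keep their role (opening or closing) while their glyph is flipped, which is the behaviour the attribute names need to follow.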

Kashida

1 Using CSS: text-justify: kashida
2 Using new character entities: &asum;

Additional constructions

MathML extension

After examining all notational conventions in current use with Arabic, the next step is to clarify the MathML specification, proposing extensions where needed, so that MathML has the broadest possible coverage. Proposals:

Direction: the overall mathematical directionality should be determined by a dir attribute on the outermost math and mrow elements, which takes one of the values ltr or rtl. The text content of each token element should be treated as a separate directional segment, and the bidirectional algorithm should be applied to each independently.

Additional values for mathvariant: isolated, initial, tailed, looped, stretched and double-struck.

Mirroring: code points for Arabic mathematical symbols are not available yet, but symbols are appropriately marked for mirroring.

Arabic-specific notation: an additional allowed value, madruwb, for the notation attribute of menclose, used for the factorial symbol.
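The direction proposal can be pictured with a small sketch that builds the same presentation tree twice and changes only the dir attribute and the token content. The expression (an identifier plus one), the Arabic identifier (letter seen, U+0633) and the function name math_with_direction are assumptions made for illustration; the MathML namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET


def math_with_direction(direction, identifier, number):
    """Build <math dir=...><mrow><mi/><mo>+</mo><mn/></mrow></math>."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("dir must be 'ltr' or 'rtl'")
    math = ET.Element("math", {"dir": direction})
    row = ET.SubElement(math, "mrow")
    ET.SubElement(row, "mi").text = identifier
    ET.SubElement(row, "mo").text = "+"
    ET.SubElement(row, "mn").text = number
    return math


latin = math_with_direction("ltr", "x", "1")
arabic = math_with_direction("rtl", "\u0633", "1")

print(ET.tostring(latin, encoding="unicode"))
print(ET.tostring(arabic, encoding="unicode"))
```

The children stay in logical order in both trees; laying them out from right to left is left to the renderer, which is what a dir attribute on the outermost element is meant to request.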

Extension adopted

The proposals for adapting MathML to Arabic mathematical notation, published in the W3C Note, will be included in the new version of MathML (MathML3).

Navigator choice

Free & Open-source

Architecture

[Diagram: components XUL, CSS, JavaScript, XBL, RDF, Web services, SVG, XPConnect and Gecko]

Fonts

To display MathML documents correctly in Arabic presentation, some fonts must be installed:
- ACM: mirrored BaKoMa Computer Modern fonts
- RamzArab: adapted Arabic mathematical symbols font

Problems

[Figure: renderings compared under the labels "Existent fonts", "ACM" and "ACM &"]

Dadzilla 1.1

[Figure: formula rendered in Dadzilla 1.1]

Dadzilla 1.2

[Figure: formula rendered in Dadzilla 1.2]

MathML verbosity

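TeX versus MathML for the same formula, the square root of (a+b)/(c+d):

TeX: $\sqrt{\frac{a+b}{c+d}}$

MathML: the same formula requires nested msqrt, mfrac, mrow, mi and mo elements.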
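To make the difference in size concrete, the sketch below builds one possible Presentation MathML encoding of the same formula and compares its length with the TeX source. The exact markup on the original slide is not known; the element layout here is a plausible minimal encoding, and the helper names sum_row and sqrt_of_fraction are hypothetical.

```python
import xml.etree.ElementTree as ET

TEX = r"\sqrt{\frac{a+b}{c+d}}"


def sum_row(left, right):
    """Return an <mrow> holding the sum of two identifiers."""
    row = ET.Element("mrow")
    ET.SubElement(row, "mi").text = left
    ET.SubElement(row, "mo").text = "+"
    ET.SubElement(row, "mi").text = right
    return row


def sqrt_of_fraction():
    """Return Presentation MathML for the square root of (a+b)/(c+d)."""
    math = ET.Element("math")
    frac = ET.SubElement(ET.SubElement(math, "msqrt"), "mfrac")
    frac.append(sum_row("a", "b"))  # numerator
    frac.append(sum_row("c", "d"))  # denominator
    return ET.tostring(math, encoding="unicode")


mathml = sqrt_of_fraction()
print(mathml)
print(len(TEX), len(mathml))
```

Even without namespace declarations or indentation, the MathML string is several times longer than the TeX one, which is one reason a dedicated editor such as ArMathmlEd is useful.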


http://xulfr.org

Dadzilla
- Added the factorial function in Arabic notation
- Support for extensible alphabetic symbols
- Use of the dir attribute with mrow and mstyle
- Arabized menus in Dadzilla and ArMathmlEd

[Figure: pipeline from CMathML to PMathML, rendered with Dadzilla]

I18n

Why XSLT?
- The (C→P)MathML transformation is internal to MathML
- XSLT is an XML application
- XSLT is widely adopted
- XSLT is portable

CtoArP
- An XSLT stylesheet
- By default, it calls the rules defined in ctop.xsl
- The result is displayed using Dadzilla

ctop.xsl is used in the cases where the encoding is the same in Arabic and Latin notation.

[Diagram: the content element root is rendered as msqrt in both notations, with dir="rtl" for Arabic notation and dir="ltr" for Latin notation]

Conflict resolution rules ⟹ the template rules of ctoarp.xsl take priority over those of ctop.xsl.
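The priority rule can be mimicked outside XSLT with a layered lookup: Arabic-specific rules are consulted first, and anything they do not define falls back to the default rules. The sketch below is an analogy in Python, not the actual stylesheets: the rule tables CTOP_RULES and CTOARP_RULES, the tiny supported subset (apply, ci, cn, root, plus) and the choice to carry dir="rtl" on a wrapping mrow are all assumptions made for illustration.

```python
from collections import ChainMap
import xml.etree.ElementTree as ET


def latin_root(operands):
    """Default rule (stand-in for ctop.xsl): root becomes msqrt."""
    node = ET.Element("msqrt")
    node.extend(operands)
    return node


def arabic_root(operands):
    """Override (stand-in for ctoarp.xsl): same msqrt, wrapped right to left."""
    row = ET.Element("mrow", {"dir": "rtl"})
    row.append(latin_root(operands))
    return row


def plus(operands):
    """Shared rule: operands joined by + inside an mrow."""
    row = ET.Element("mrow")
    for index, operand in enumerate(operands):
        if index:
            ET.SubElement(row, "mo").text = "+"
        row.append(operand)
    return row


CTOP_RULES = {"root": latin_root, "plus": plus}
CTOARP_RULES = {"root": arabic_root}
# Lookups try CTOARP_RULES first, then fall back to CTOP_RULES.
ARABIC_RULES = ChainMap(CTOARP_RULES, CTOP_RULES)


def to_presentation(element, rules):
    """Convert a tiny subset of Content MathML to Presentation MathML."""
    if element.tag in ("ci", "cn"):
        node = ET.Element("mi" if element.tag == "ci" else "mn")
        node.text = element.text
        return node
    if element.tag != "apply":
        raise ValueError(f"unsupported element: {element.tag}")
    operator, *operands = list(element)
    rendered = [to_presentation(child, rules) for child in operands]
    return rules[operator.tag](rendered)


content = ET.fromstring(
    "<apply><root/><apply><plus/><ci>x</ci><cn>1</cn></apply></apply>"
)
latin_out = ET.tostring(to_presentation(content, CTOP_RULES), encoding="unicode")
arabic_out = ET.tostring(to_presentation(content, ARABIC_RULES), encoding="unicode")
print(latin_out)
print(arabic_out)
```

The plus rule is defined only once and is reused by both notations, while the root rule is overridden, which mirrors the division of labour between ctop.xsl and ctoarp.xsl described above.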

Motivations

Some concepts can be presented in several notational varieties, to be chosen among:
- mirrored symbols or alphabetic symbols
- the Arabic numeral system (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) or the Arabic-Indic one (٠, ١, ٢, ٣, ٤, ٥, ٦, ٧, ٨, ٩)
- symbols with or without diacritical dots

[Figures: the same expressions shown in each notational variety]
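The numeral choice is the simplest of these options to illustrate. The sketch below switches a string between the two digit sets with a translation table; it is an assumption about how such a selection could be done, not a description of ArSelector, and the function name select_numerals and the option names "arabic" and "arabic-indic" are hypothetical.

```python
# Digits 0-9 and the Arabic-Indic digits U+0660 to U+0669.
ARABIC = "0123456789"
ARABIC_INDIC = "".join(chr(0x0660 + value) for value in range(10))

TO_ARABIC_INDIC = str.maketrans(ARABIC, ARABIC_INDIC)
TO_ARABIC = str.maketrans(ARABIC_INDIC, ARABIC)


def select_numerals(text, system):
    """Rewrite every digit of text in the requested numeral system."""
    if system == "arabic-indic":
        return text.translate(TO_ARABIC_INDIC)
    if system == "arabic":
        return text.translate(TO_ARABIC)
    raise ValueError(f"unknown numeral system: {system}")


converted = select_numerals("2008", "arabic-indic")
print(converted)
print(select_numerals(converted, "arabic"))
```

In a MathML setting the same mapping would be applied only to the text of mn token elements, leaving identifiers and operators untouched.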

Notation selection

Our aims were to explore avenues to ensure:
- High-quality typography → adoption of the TeX typographical rules
- Compliance with the rules of Arabic calligraphy → use of the kashida and of style variations
- Structured expressions that allow searching, copying, indexing, ... → extension of MathML and development of Dadzilla
- A suitable look and feel for content-coded expressions → implementation of CtoArP
- Easier composition of Arabic mathematical e-documents → development of DadTeX and ArMathmlEd

The End

Thank you!

RamzArab table

Dadzilla example