Arabic Transliteration Rules in Document 9303

Mike ELLIS ISO, Australia

V3.4.1 10/2011

Main points:

1. Identification: the name in Arabic script is the only reliable basis

2. Countries that use the Arabic script should have the benefits of the MRZ

The Arabic name must appear in Latin characters in VIZ

8.3 Languages and characters. When the mandatory elements of Zones I, II and III are in a national language that does not use the Latin alphabet, a transliteration shall also be provided.

Status quo: name in VIZ copied to MRZ

Many phonetic transcriptions of the same Arabic name

and possibly over 9,000 more...

MRZ same as VIZ

Much variation in (Latin) name

IDENTIFICATION: the Arabic name is unique

ﻣﺤﻤﻮد ﻋﺒﺪاﻟﺮﺣﻴﻢ


So the solution is to base the MRZ on the Arabic Name. But the MRZ can only contain OCR-B A-Z and <
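The character-set constraint can be made concrete with a small check. This is a minimal sketch, not part of Doc 9303: the function name `fits_mrz_name_field` is hypothetical, and it assumes only the name field is being tested (so digits are left out, as on the slide).

```python
import re

# Assumption: only the MRZ name field is considered, so the permitted
# characters are the capitals A-Z plus the filler character '<'.
MRZ_NAME_CHARS = re.compile(r"[A-Z<]+")


def fits_mrz_name_field(text: str) -> bool:
    """Return True if text consists only of A-Z and '<' (hypothetical helper)."""
    return MRZ_NAME_CHARS.fullmatch(text) is not None


print(fits_mrz_name_field("MAHMOUD<ABD<AL<RAHIIM"))            # True
print(fits_mrz_name_field("MAHMOUD ABD-AL-RAHIIM"))            # False: space and hyphen
print(fits_mrz_name_field("\u0645\u062D\u0645\u0648\u062F"))   # False: Arabic letters
```

The last call shows why the Arabic name cannot be placed in the MRZ as it stands and has to be mapped onto Latin letters first.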

ﻣﺤﻤﻮد ﻋﺒﺪاﻟﺮﺣﻴﻢ


9.4.1 Names in the MRZ are represented differently from those in the VIZ. National characters must be transliterated using only the allowed OCR character set [A..Z].

Solution: Transliteration of Arabic name into Latin

Transliteration table based on closest match + ‘escape’ (Technical Report – Appendix 1)

Unicode  Arabic letter  Name                                  MRZ
0621     ء              hamza                                 XE
0622     آ              alef with madda above                 XAA
0623     أ              alef with hamza above                 XAE
0624     ؤ              waw with hamza above                  XW
0625     إ              alef with hamza below                 I
0626     ئ              yeh with hamza above                  XI
0627     ا              alef                                  A
0628     ب              beh                                   B
0629     ة              teh marbuta                           XTA/XAH [1]
062A     ت              teh                                   T
062B     ث              theh                                  XTH
062C     ج              jeem                                  J
062D     ح              hah                                   XH
062E     خ              khah                                  XKH
062F     د              dal                                   D
0630     ذ              thal                                  XDH
0631     ر              reh                                   R
0632     ز              zain                                  Z
0633     س              seen                                  S
0634     ش              sheen                                 XSH
0635     ص              sad                                   XSS
0636     ض              dad                                   XDZ
0637     ط              tah                                   XTT
0638     ظ              zah                                   XZZ
0639     ع              ain                                   E
063A     غ              ghain                                 G
0641     ف              feh                                   F
0642     ق              qaf                                   Q
0643     ك              kaf                                   K
0644     ل              lam                                   L
0645     م              meem                                  M
0646     ن              noon                                  N
0647     ه              heh                                   H
0648     و              waw                                   W
0649     ى              alef maksura                          XAY
064A     ي              yeh                                   Y
0651      ّ              shadda                                [DOUBLE] [2]
0671     ٱ              alef wasla                            XXA
0679     ٹ              tteh                                  XXT
067E     پ              peh                                   P
067C     ټ              teh with ring                         XRT
0681     ځ              hah with hamza above                  XKE
0685     څ              hah with 3 dots above                 XXH
0686     چ              tcheh                                 XC
0688     ڈ              ddal                                  XXD
0689     ډ              dal with ring                         XDR
0691     ڑ              rreh                                  XXR
0693     ړ              reh with ring                         XRR
0696     ږ              reh with dot below and dot above      XRX
0698     ژ              jeh                                   XJ
069A     ښ              seen with dot below and dot above     XXS
06A9     ک              keheh                                 XKK
06AB     ګ              kaf with ring                         XXK
06AD     ڭ              ng                                    XNG
06AF     گ              gaf                                   XGG
06BA     ں              noon ghunna                           XNN
06BC     ڼ              noon with ring                        XXN
06BE     ھ              heh doachashmee                       XDO
06C0     ۀ              heh with yeh above                    XYH
06C1     ہ              heh goal                              XXG
06C2     ۂ              heh goal with hamza above             XGE
06C3     ۃ              teh marbuta goal                      XTG
06CC     ی              farsi yeh                             XYA
06CD     ۍ              yeh with tail                         XXY
06D0     ې              yeh                                   Y
06D2     ے              yeh barree                            XYB
06D3     ۓ              yeh barree with hamza above           XBE

[1] XTA is used generally except if teh marbuta occurs at the end of the name component, in which case XAH is used.

[2] Shadda denotes doubling: the Latin character or sequence is repeated, e.g. عبّاس becomes EBBAS; فضّة becomes FXDZXDZXAH.

Some Arabic characters have near matches

محمود, letter by letter:

Meem م -> M

‘H’ is already assigned to “Heh”, so use ‘X’ as escape

Hah ح -> XH

Heh -> ‘H’, Hah -> ‘XH’

Meem م -> M

Waw و -> W

Dal د -> D

عبدالرحيم, letter by letter:

“Ain” has no exact Latin equivalent, so assign ‘E’

Ain ع -> E

Beh ب -> B

Dal د -> D

Alef ا -> A

Lam ل -> L

Reh ر -> R

Hah ح -> XH

Yeh ي -> Y

Meem م -> M
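The letter-by-letter walk-through can be sketched as a table lookup. This is a minimal illustration under stated assumptions, not a reference implementation: `ARABIC_TO_MRZ` holds only the subset of the table needed for the names on these slides, the function name `transliterate_component` is hypothetical, and input is assumed to be one name component in standard (non-presentation-form) Unicode with no vowel marks other than shadda.

```python
# Subset of the transliteration table (code point -> MRZ sequence).
# Assumption: only the letters needed for the slide examples are included.
ARABIC_TO_MRZ = {
    "\u0627": "A",    # alef
    "\u0628": "B",    # beh
    "\u062D": "XH",   # hah ('H' is taken by heh, so 'X' escapes)
    "\u062F": "D",    # dal
    "\u0631": "R",    # reh
    "\u0633": "S",    # seen
    "\u0636": "XDZ",  # dad
    "\u0639": "E",    # ain (no exact Latin equivalent)
    "\u0641": "F",    # feh
    "\u0644": "L",    # lam
    "\u0645": "M",    # meem
    "\u0647": "H",    # heh
    "\u0648": "W",    # waw
    "\u064A": "Y",    # yeh
}
TEH_MARBUTA = "\u0629"
SHADDA = "\u0651"


def transliterate_component(component: str) -> str:
    """Transliterate one Arabic name component into MRZ letters (sketch)."""
    out = []
    for index, char in enumerate(component):
        if char == SHADDA:
            # Shadda doubles the preceding Latin character or sequence.
            if not out:
                raise ValueError("shadda with no preceding letter")
            out.append(out[-1])
        elif char == TEH_MARBUTA:
            # XAH at the end of the name component, XTA elsewhere.
            is_last = index == len(component) - 1
            out.append("XAH" if is_last else "XTA")
        elif char in ARABIC_TO_MRZ:
            out.append(ARABIC_TO_MRZ[char])
        else:
            raise ValueError(f"U+{ord(char):04X} is not in this subset")
    return "".join(out)


print(transliterate_component("\u0645\u062D\u0645\u0648\u062F"))  # MXHMWD
print(transliterate_component(
    "\u0639\u0628\u062F\u0627\u0644\u0631\u062D\u064A\u0645"))    # EBDALRXHYM
print(transliterate_component("\u0639\u0628\u0651\u0627\u0633"))  # EBBAS
print(transliterate_component("\u0641\u0636\u0651\u0629"))        # FXDZXDZXAH
```

The first two calls reproduce the walk-through; the last two reproduce the shadda examples given with the table.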

Reiterate: MRZ is a different form of the name from the VIZ

9.4.1 Names in the MRZ are represented differently from those in the VIZ. National characters must be transliterated using only the allowed OCR character set [A..Z].

9.1.3 The data in the MRZ are formatted in such a way as to be readable by machines with standard capability worldwide. It must be stressed that the MRZ is reserved for data intended for international use in conformance with international Standards for MRPs. The MRZ is a different representation of the data than is found in the VIZ.

Transliteration - advantage

Name in MRZ is unique (= Arabic name)

Arabic name direct from MRZ

محمود عبدالرحيم

Chip holds name in DG11 (Unicode), can be compared
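Recovering the Arabic name from the MRZ and comparing it with the chip can be sketched as follows. The sketch rests on several assumptions: `MRZ_TO_ARABIC` is only the subset of the table needed for this name, `decode_component` and `code_points` are hypothetical helper names, the DG11 name is assumed to have been read from the chip already as a Unicode string, and doubled sequences produced by shadda are not handled.

```python
# Reverse of a subset of the transliteration table (MRZ sequence -> letter).
# Assumption: only the entries needed for this example are included.
MRZ_TO_ARABIC = {
    "A": "\u0627", "B": "\u0628", "D": "\u062F", "E": "\u0639",
    "L": "\u0644", "M": "\u0645", "R": "\u0631", "W": "\u0648",
    "Y": "\u064A", "XH": "\u062D",
}
MAX_SEQUENCE = max(len(key) for key in MRZ_TO_ARABIC)


def decode_component(mrz_text: str) -> str:
    """Map MRZ letters back to Arabic, trying the longest sequence first."""
    out = []
    position = 0
    while position < len(mrz_text):
        for length in range(MAX_SEQUENCE, 0, -1):
            chunk = mrz_text[position:position + length]
            if chunk in MRZ_TO_ARABIC:
                out.append(MRZ_TO_ARABIC[chunk])
                position += len(chunk)
                break
        else:
            raise ValueError(f"cannot decode {mrz_text[position:]!r}")
    return "".join(out)


def code_points(text: str) -> list:
    """List the Unicode code points of text as four-digit hex strings."""
    return [f"{ord(char):04X}" for char in text]


first = decode_component("MXHMWD")
second = decode_component("EBDALRXHYM")

# Hypothetical DG11 value, assumed to be already extracted from the chip.
dg11_name = ("\u0645\u062D\u0645\u0648\u062F "
             "\u0639\u0628\u062F\u0627\u0644\u0631\u062D\u064A\u0645")
matches_chip = f"{first} {second}" == dg11_name

print(code_points(first))   # ['0645', '062D', '0645', '0648', '062F']
print(matches_chip)         # True
```

Trying the longest sequence first is what lets the escaped `XH` be read as hah rather than failing on a lone `X`.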

Unicode:

محمود = 0645, 062D, 0645, 0648, 062F

عبدالرحيم = 0639, 0628, 062F, 0627, 0644, 0631, 062D, 064A, 0645

IMPLEMENTATION ISSUES

IMPLEMENTATION – Legacy Database and Border Control

MEHMOOD ABD AL RAHEEM

MAHMOOD ABDUL RAHIM

MAHMUT ABD AR-RAHEEM

MAHMUD ABDALRAHEEM

MAHMOUD ABD-AL-RAHIIM

MAHMUT ABDUL RAHIIM

Other variations can be derived.

Original Arabic form: محمود عبدالرحيم

IMPLEMENTATION – Name in VIZ

1. Use Civil Register (transcription)

or 2. Status quo transcription

or 3. Use MRZ transliteration

IMPLEMENTATION - PNR

Only concerned when PNR is used for API, not with PNR for airline use.

IATA/CAWG “API Statement of Principles”

presented to

FACILITATION (FAL) DIVISION — TWELFTH SESSION Cairo, Egypt, 2004-3-22/4-2

stated that:

“Required API data should be limited to the data contained in the machine-readable zone of travel documents or obtainable from existing government databases, such as those containing visa issuance information.”

IMPLEMENTATION – AIRLINE CHECKING

1. PNR = MRZ

2. PNR = both VIZ and MRZ

CONCLUSION

1. Essential: solves the identity management issue; required for Interpol, aviation security, etc.

2. Using the original name in Arabic is the only way

3. Arabic name is now machine readable

4. There will be transitional implementation problems; not impossible, but a worthwhile goal

Transliteration Rules - Arabic (WP17)

Mike Ellis ISO (JTC1 SC17/WG3 TF3)

on behalf of New Technologies Working Group (NTWG)

presented to ICAO TAG/MRTD 20 – Montreal, September 2011

20th Meeting of the Technical Advisory Group on Machine Readable Travel Documents

V3.4 09/2011