UNESCO WSIS Thematic Meeting Multilingualism for Cultural Diversity & Participation of all in Cyberspace

“Internationalized Domain Names - IDN”

Mouhamet Diop

NEXT SA, Sénégal
ICANN Board Director
Bamako, May 06th, 2005

UNESCO WSIS Thematic Meeting “Cultural Diversity and Multilingualism”
Internationalized Domain Names

华人.公司.cn 華人.商業.tw


삼성.회사.kr 三星.회사.kr اﻻهﺮام.م viagénie.qc.ca ישראל.קום آﺄﺮﻩ . ا ﻧ ﺗﺮ ﻧ ﺖ . ا ﻧ ﺗ ﺎﺮا ﺑ ﻌ ﺴﺎ


現代.com ヤフー.com

Table of Contents

• DNS
• IDN within the DNS
• Methods for content access
• Stability and security
• IDNA: how does it work?
• ICANN coordination on IDN

Human values about multilingualism and diversity

“Everyone has the right ... to seek, receive and impart information and ideas through any media regardless of frontiers”
-- Universal Declaration of Human Rights

“There is no universal ‘right to language’. But there are human rights with an implicit linguistic content that multilingual states must acknowledge in order to comply with their international obligations under such instruments as the International Covenant on Civil and Political Rights.”
-- UNDP, Human Development Report 2004: Cultural liberty in today’s diverse world, New York: United Nations Development Programme, 2004, page 60.

The Vision and Objectives of IDN

• Vision
-« Enabling the development of multilingualism in the Domain Name System without breaking the DNS or causing significant harm to stability and to the Internet community »
• Objectives
-Expand the DNS from LDH (letters, digits, hyphen) to other scripts
-Facilitate access for non-English-speaking communities
-Facilitate continuity of the content effort
-Natural use (cultural identity): use the local language for a local message. Example: write an address in Japanese or Hausa when you write in Japanese or Hausa.
-Provide content, and facilitate access to it, for elementary school children and people with less education
-Provide guidance for IDN registration and avoid side effects

The Internet

• The Internet is all about (technical point of view):
-Names or identifiers (domain names)
-Numbers (IP addresses and AS numbers)
-Protocols (ports and protocols)

• The Internet is (user or non-technical perspective):
-Content: technical and scientific content, cultural content, social content
-Tools to access the content (Web, search engines, etc.)
-An economy (stakeholders: actors, coordinators, users)

Understanding the rationale …

• Understand the technical limitations
-Script vs. language
-Name vs. identifier
-Internationalization vs. localization
-On a per-label basis
• Understand what users want
-Script vs. language
-Name vs. identifier
-Internationalization vs. localization
-On a per-“FQDN” (fully-qualified domain name) basis

The Domain Name System

• The Internet started to grow after 1983
• The HOSTS.TXT table was unwieldy and hard to keep up to date on all hosts
• In 1984/5, a distributed database system called the Domain Name System was developed to accommodate a much larger scale
• See for more details

DNS History (cont.)

• "Host Names On-line," Request for Comments (RFC) 606, L. Peter Deutsch, December 1973.
• "Hostnames Server," RFC 811, Ken Harrenstien, Vic White, and Elizabeth Feinler, March 1982.
• "DOD Internet Host Table Specification," RFC 952, Ken Harrenstien, M. Stahl, and Elizabeth Feinler, October 1985.
• "Role of the Domain Name System," RFC 3467, John C. Klensin, February 2003.

IDN Evolution: References and Additional Reading

• The IDNA Standard: Faltstrom, P., Hoffman, P. and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003.
• Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", RFC 3491, March 2003.
• Costello, A., "Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)", RFC 3492, March 2003.

Where is IDN?


[Figure: snapshot of the network layers used to provide an Internet service. Labels: HTTP, URI, SMTP, Mail Format, DNS, Internet Protocol (IP), with an “IDN is here” marker.]

Confusion

• Language and script
-A language is a way in which humans interact
-A script is the written form of a language
-Many written languages share the same script
-Some written languages use more than one script
• Example: 現代.com
-Is this in Chinese, Japanese or Korean?

Characters and Character Sets

• In a “character set” coded for information processing use, fairly abstract characters are assigned “code points”
-Essentially, characters are grouped, ordered, and then numbered
-“Glyphs” (the forms of the characters) are rarely standardized
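As a minimal sketch of the idea that characters are grouped, ordered, and numbered, the Python snippet below lists the code point and Unicode name of each character in a label. The helper name `describe` is hypothetical and exists only for this illustration; `unicodedata` is part of the Python standard library.

```python
import unicodedata

def describe(label):
    """Return (character, code point, Unicode name) for each character.

    `describe` is a hypothetical helper used only for illustration.
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<no name>"))
        for ch in label
    ]

# The label from the example domain 現代.com used in this talk.
for ch, code_point, name in describe("現代"):
    print(ch, code_point, name)
```

The names reported for these two characters mark them as CJK unified ideographs. The code point identifies an abstract character; it says nothing about the language being written or about the glyph a particular font draws.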

Scripts and Languages

• A “script” is an (often poorly-defined) collection of related characters
-It is common for several languages to share most, but not all, characters from a given script
-Scripts are often given the same name as one of the languages that uses them, creating much confusion
-Cyrillic script, but Russian, Ukrainian, … languages
-Arabic script, but Arabic, Farsi, Urdu, … languages
• The Unicode Consortium gives script names and language bindings (UTR 24), but precision is very low

Languages and Countries

• People migrate and take languages with them
• Most languages are used in many countries, not just those where they are dominant or “official”
• Over enough time, most languages evolve differently in different locations

Confusion

• Name and identifier
-A name is a word or phrase that constitutes the distinctive designation of a person or thing
-An identifier is a string of characters that uniquely identifies a person or thing
• Example
-“Mouhamet Diop” is a name
-“mdiop” is an identifier
-Is mdiop.com a name or an identifier?
• A domain name is an identifier, not a name!

Confusion

• Internationalization, localization and multilingualism
-Internationalization makes the protocol able to handle more scripts
-Localization involves tailoring interaction with users to the languages they know
-Multilingualism makes the protocol able to handle multiple languages
• The “I” in IDN refers to internationalization, NOT to “international”!
• IDN is NOT about the content
• IDN is only related to the identifiers
-The path to the content, using non-ASCII scripts (non-LDH only)

Problems

• Encoding
-Use one universal character set or multiple character sets?
-Which encoding to use? UTF-8? UTF-16?
• Matching
-yahoo.com = YAHOO.com
-華人.com = 华人.com ?
• Local issues
-Language-specific considerations
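The encoding and matching questions above can be made concrete with a short Python sketch. It assumes Unicode as the single character set. The helper names are hypothetical, and `naive_match` (NFKC normalization plus case folding) is only an illustration, not the matching rule of any registry or standard.

```python
import unicodedata

def encoded_sizes(label):
    """Return the byte length of a label in UTF-8 and in UTF-16 (big-endian, no BOM)."""
    return len(label.encode("utf-8")), len(label.encode("utf-16-be"))

def naive_match(a, b):
    """Hypothetical matcher: compare after NFKC normalization and case folding."""
    def fold(s):
        return unicodedata.normalize("NFKC", s).casefold()
    return fold(a) == fold(b)

print(encoded_sizes("華人"))                  # same characters, different byte lengths
print(naive_match("yahoo.com", "YAHOO.com"))  # ASCII case differences fold away
print(naive_match("華人.com", "华人.com"))      # traditional vs. simplified do not
```

Normalization and case folding do not relate traditional and simplified Chinese characters, so treating 華人 and 华人 as the same name takes a policy decision on top of the encoding.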

Confusion and Fraud

• Most of the problems are with us already with ASCII, weak software, and bad habits
• “Do no harm” may be another important principle

Dispute Resolution or Conflict Prevention

• Key principles
• Character variants and other evolving systems: prevention of conflicting/confusing registrations
• Dispute resolution policies and mechanisms: “register first, then straighten it out”
-ICANN-WIPO UDRP assumes: homogeneous scripts and language characters; conflicts about rights to identical names
-But not: labels constructed from line or box-drawing characters; look-alike characters and strings from different scripts, unless they meet trademark-like criteria for “confusingly similar”; translations, transcriptions, transcodings

Punycode RFC 3492

• Designed for use with Internationalized Domain Names

• It uniquely and reversibly transforms a Unicode string into an LDH string (Letter, Digit, Hyphen)

• Applies some compression to produce a shorter string

• e.g. 新加坡 → xn--yfro4i67o
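The reversible transformation can be tried with Python's built-in `punycode` codec. This is a minimal sketch: `to_ace` and `from_ace` are hypothetical helper names, and they deliberately skip Nameprep, so they are not a complete IDNA ToASCII/ToUnicode.

```python
ACE_PREFIX = "xn--"  # prefix that IDNA (RFC 3490) puts in front of a Punycode label

def to_ace(label):
    """Encode one Unicode label with the Punycode codec and add the prefix.

    Hypothetical helper: it skips Nameprep, so it is not a full ToASCII.
    """
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

def from_ace(ace_label):
    """Reverse `to_ace`: strip the prefix and decode the Punycode part."""
    if not ace_label.startswith(ACE_PREFIX):
        raise ValueError("not an ACE label")
    return ace_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

ace = to_ace("新加坡")
print(ace)            # xn--yfro4i67o
print(from_ace(ace))  # 新加坡
```

The encoded form contains only letters, digits, and hyphens, so it can travel through software that expects LDH labels.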

Stringprep/Nameprep RFC 3491

• Prepare internationalized domain name labels in order to increase the likelihood that name input and name comparison work in ways that make sense for typical users throughout the world.

• Based on UTR#15 (Normalization) & UTR#21 (Case Mapping)

• Stringprep is the generic processing

• Nameprep is a profile of stringprep for Internationalized Domain Names
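A few effects of Nameprep can be observed with the `nameprep` function that ships in Python's standard `encodings.idna` module. This is only a sketch of typical inputs, not a full test of the profile; the sample strings are chosen for illustration.

```python
from encodings.idna import nameprep  # standard-library Nameprep (RFC 3491)

# Case mapping: upper-case letters are folded to lower case.
print(nameprep("YAHOO"))

# Normalization (NFKC): full-width compatibility letters become plain ASCII.
# The escapes below spell "yahoo" in full-width small letters.
print(nameprep("\uff59\uff41\uff48\uff4f\uff4f"))

# Characters "mapped to nothing", such as the soft hyphen U+00AD, are dropped.
print(nameprep("ya\u00adhoo"))

# Nameprep does not unify traditional and simplified Chinese characters.
print(nameprep("華人") == nameprep("华人"))
```

The first three calls all yield the same prepared label, which is what lets name input and name comparison behave as a typical user expects. The last comparison shows a case that Nameprep leaves to other mechanisms.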

IDNA RFC 3490

• IDNA requires upgrades only in applications to handle IDN
• Takes legacy encodings and interoperability into consideration
• Enforces Nameprep in applications
• Sends Nameprep’ed, ACE-encoded IDNs over the wire
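The application-side conversion can be sketched with Python's built-in `idna` codec, which implements the RFC 3490 operations label by label. The helper names `to_wire` and `to_display` are hypothetical; the example domain is one of the sample names shown at the start of this talk.

```python
def to_wire(domain):
    """Apply IDNA ToASCII to every label of a domain name (hypothetical helper)."""
    return domain.encode("idna").decode("ascii")

def to_display(wire_name):
    """Apply IDNA ToUnicode to every label of an ASCII-compatible name."""
    return wire_name.encode("ascii").decode("idna")

# "\u00e9" is the letter e with acute accent, as in viagenie with an accent.
wire = to_wire("viag\u00e9nie.qc.ca")
print(wire)              # only the non-ASCII label gets the xn-- form
print(to_display(wire))  # the Unicode form is recovered for display

# Nameprep runs inside ToASCII, so case variants of a non-ASCII label
# produce the same name on the wire.
print(to_wire("Viag\u00e9nie.qc.ca") == wire)
```

Because the conversion happens in the application, resolvers and name servers only ever see LDH labels and need no change.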

References and Additional Reading

• The IDNA Standard:
-Faltstrom, P., Hoffman, P. and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003.
-Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", RFC 3491, March 2003.
-Costello, A., "Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)", RFC 3492, March 2003.
• Variant names and registry restrictions:
-Konishi, K., Huang, K., Qian, H. and Y. Ko, "Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean", RFC 3743, April 2004.
-Klensin, J., "Registration of Internationalized Domain Names: Overview and Method", work in progress, July 2004 version is draft-klensin-reg-guidelines-04.txt.
• Uses and abuses of the DNS:
-Klensin, J., "Role of the Domain Name System (DNS)", RFC 3467, February 2003.
• A different view of the non-ASCII TLD issue:
-Klensin, J., "National and Local Characters in DNS TLD Names", work in progress, May 2004 version is draft-klensin-idn-tld-03.txt.

ICANN coordination role in IDN

• IDN committee (2001)
• IDN RIC-Committee (2002)
• Presidential IDN Committee in 2004
• IDN gTLD guidance and registry policies
• gTLD registry implementation with ICANN guidelines:
-VeriSign (.com/.net), Afilias (.info), Musedoma (.museum), PIR (.org) as Afilias, NeuLevel (.biz)
• IANA registry function for IDN tables
• GAC/ccTLD consultation on IDN

Experimentation with IDN in Africa

• Experimentation with IDN
• Implementation at the second level with ccTLDs
-Maghreb
-West Africa (one or two)
-Or another ccTLD to be chosen by the Steering Committee
• This is a parallel process for selected languages if they are already in the Unicode table
• Financial partners: ISOC/Afilias
• Technical partners:
-Joint team with the CJK team (Chinese, Japanese and Korean)
-Michael Everson, James Seng, John Klensin

« UNICODE & IDN in Africa »

• An African initiative
• A global challenge and a historical responsibility
• A coordination team (Mouhamet Diop, Pierre Dandjinou, Pierre Ouédraogo)
• A team of volunteers (AFRILANG members (Alex, Lishou, Maxime, etc.), Mr Samassekou (ACALAN), UNECA, etc.)
• A sponsorship bootstrap process: ISOC / Afilias / PIR / UNDP / UNECA (?)
• A reference study for Africa (pilot for 6 months)
• Fundraising for the scaling up
• Institutional partners (ACALAN, AU, UNECA, etc.)

Questions
