Computer Science and Software Engineering University of Wisconsin - Platteville

Note 8. Internationalization

Yan Shi SE 3730 / CS 5730 Lecture Notes

Parts of the content are from Ibrahim Meru’s presentation slides: http://jan.newmarch.name/i18n/david.tuffley/IBM_i18n.pdf

Terminology

. Internationalization (I18N)
  - The process of designing a software application so that it can be adapted to various languages and regions without engineering changes.
  - Makes an application independent of any particular language or culture.
. Localization (L10N)
  - The process of adapting internationalized software for a specific region or language by adding locale-specific components and translating text.
. Globalization (G11N)
  - G11N = I18N + L10N + multilingual support.
  - The application can handle users from multiple countries/regions and languages (simultaneously).
. A good reference: https://msdn.microsoft.com/en-US/globalization/mt642951

Scope of I18N

Example

Character Sets

. ASCII:
  - The most popular character standard.
  - Uses only 7 bits, so it can represent at most 128 characters.
  - Adequate for English.
. Code pages:
  - A table of values describing the character set for a particular language.
  - One per language or set of languages.
  - There are hundreds of code pages.
  - Different vendors may use different code page numbering.
. Unicode:
  - An effort to include all characters from previous code pages in a single character enumeration.
  - Originally used 2 bytes (16 bits) per character; current encodings (UTF-8, UTF-16, UTF-32) use 1 to 4 bytes.

. Standard in the U.S.
. Works for English and German

. 8-bit
  - 0-127: ASCII
  - 128-175: international text characters

Unicode
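A minimal sketch of how one character is stored under two legacy code pages and under Unicode encodings. The sample characters are chosen here for illustration and are not from the slides. The codec names (`cp437`, `cp1252`, `utf-8`, `utf-16-le`) are standard Python codecs.

```python
# Encode the same character under two code pages and under UTF-8.
text = "ä"

cp437_bytes = text.encode("cp437")      # original IBM PC code page
cp1252_bytes = text.encode("cp1252")    # Windows Western European code page
utf8_bytes = text.encode("utf-8")       # Unicode, variable-length encoding

# The same byte value means a different character under a different code page,
# which is why the loaded code page must match the data.
misread = cp437_bytes.decode("cp1252")

# Unicode assigns one code point per character, independent of any code page.
code_point = ord(text)

# Characters beyond the first 65,536 code points need more than 2 bytes.
emoji = "\U0001F600"
utf16_length = len(emoji.encode("utf-16-le"))
utf8_length = len(emoji.encode("utf-8"))

print(cp437_bytes, cp1252_bytes, utf8_bytes, repr(misread), code_point)
print(utf16_length, utf8_length)
```

The code point printed for "ä" is the same number used in the Windows Alt code for that letter (Alt + 0228).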

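. One 16-bit “code page” can represent up to 65,536 characters.
  - Unicode has since grown beyond 16 bits: a 32-bit code unit could address over 4 billion values, although the standard limits the code space to 1,114,112 code points.
  - Easier for I18N: you don’t have to worry about which code page to use.

Keyboard Test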

. Languages and cultures have different characters and special characters.

. Keyboards differ from country to country to support their character sets and usage patterns.

. These keyboards generate interrupts that must match the loaded code page.

German Keyboard

Arabic Keyboard

Traditional Chinese Keyboard

Interesting to Know (Alt code): How to type German on a US keyboard?

PART 1 - For this German character, type...

These codes work with most fonts; some fonts may vary. For the PC codes, always use the numeric (extended) keypad on the right of your keyboard, not the row of numbers at the top. (On a laptop you may have to use "num lock" and the special number keys.)

German letter/symbol   PC code: Alt +   Mac code: option +
ä                      0228             u, then a
Ä                      0196             u, then A
é                      0233             e
ö                      0246             u, then o
Ö                      0214             u, then O
ü                      0252             u, then u
Ü                      0220             u, then U
ß                      0223             s

Hot Key Test

. We may want hot keys and shortcuts to differ by language because the words on the menus are different.
  - “Copy” is Ctrl-C; what should it be for “kopieren”?

. Hot key conventions differ: sometimes applications just stick with the English hot key or shortcut regardless of what the local command starts with.

Text Filter and Special Character Test

. Sometimes software blocks character codes other than ASCII, yet these codes may be needed to support non-English languages.
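A minimal sketch of this kind of defect and one possible fix. Both validators and the sample names are hypothetical test data written for this note. They only illustrate what a tester might probe for.

```python
import re

# Hypothetical ASCII-only validator of the kind that causes I18N defects.
ASCII_NAME = re.compile(r"[A-Za-z ]+")

def ascii_only_ok(name: str) -> bool:
    """Accept only unaccented Latin letters and spaces."""
    return ASCII_NAME.fullmatch(name) is not None

def unicode_name_ok(name: str) -> bool:
    """More tolerant sketch: any Unicode letter plus a few separators."""
    allowed_punctuation = {" ", "'", "\u2019", "-", "."}
    return bool(name) and all(
        ch.isalpha() or ch in allowed_punctuation for ch in name
    )

# Sample names containing an apostrophe and non-ASCII letters.
names = ["Smith", "O'Kelly", "Muñoz", "Weiß", "Ünal"]

rejected_by_ascii = [n for n in names if not ascii_only_ok(n)]
rejected_by_unicode = [n for n in names if not unicode_name_ok(n)]

print(rejected_by_ascii)
print(rejected_by_unicode)
```

A test suite for text filters would typically feed such names through every input field, not only the obvious name fields.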

. Special characters in the middle of names may cause problems.
  - For example: “O’Kelly”, ñ, ß, Ü.

Size of Text Messages
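As a rough planning aid, the rule-of-thumb expansion percentages given in this section (French about 15% longer, German about 25% longer) can be turned into a width estimate for UI fields. The function name and the language keys below are assumptions made for this sketch. Actual translations can be longer or shorter than the estimate.

```python
import math

# Rule-of-thumb expansion factors from these notes (English is the baseline).
EXPANSION = {"en": 1.00, "fr": 1.15, "de": 1.25}

def required_width(english_text: str, language: str) -> int:
    """Estimate the character positions a translated message may need."""
    return math.ceil(len(english_text) * EXPANSION[language])

message = "Please enter your name"
widths = {lang: required_width(message, lang) for lang in EXPANSION}

print(widths)
```

An estimate like this only sizes the layout. The actual translated strings still have to be checked on screen.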

. English requires fewer characters than most other western languages. As a rule of thumb:
  - French is 15% longer.
  - German is 25% longer.
  - Eastern languages (traditional or simplified Chinese, Japanese, and Korean) require far fewer characters (2-3 character positions per word).
. Special consideration must be given in UI design and functionality to handling the different message lengths of the supported languages.
. Message length also greatly complicates the design of business forms and reports.
. E.g.: “Please enter your name”

Translation Test
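One common way to keep messages independent of word order is to store whole sentences with named variables and let each translation place them where its grammar requires. The catalog, the `render` helper, and the sample sentences below are hypothetical and written only for illustration.

```python
# Hypothetical message catalog: each translation positions the same named
# variables wherever its own sentence structure needs them.
MESSAGES = {
    "en": "{count} files were copied to {folder}.",
    "de": "Nach {folder} wurden {count} Dateien kopiert.",
}

def render(language: str, **values) -> str:
    """Fill the named variables into the sentence for one language."""
    return MESSAGES[language].format(**values)

english = render("en", count=3, folder="Backup")
german = render("de", count=3, folder="Backup")

print(english)
print(german)
```

Building the sentence by concatenating fragments in code would instead freeze the English word order into every translation.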

. Typical English sentence structure is S-V-O (subject-verb-object).
. Sentence structure may differ from language to language, so the software must be language sensitive with respect to sentence structure.
  - Use variables in messages so that translators can place them in any order.

Sorting Rules

. Where do the characters of a specific language need to fall in a collating sequence?
. This needs to be localized for people to use lists naturally.
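A small sketch of why plain code-point order is not a usable collating sequence. The word list and the `rough_key` function are assumptions made for this example. The key is only a crude stand-in that strips accents and ignores case, since real collation rules are locale-specific and more involved.

```python
import unicodedata

words = ["Zucker", "apple", "Ärger", "Apfel", "zebra"]

# Plain code-point order: uppercase sorts before lowercase, and "Ä" sorts
# after every unaccented letter.
by_code_point = sorted(words)

def rough_key(word: str) -> str:
    """Crude approximation of a collation key: strip accents, ignore case."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

by_rough_key = sorted(words, key=rough_key)

print(by_code_point)
print(by_rough_key)
```

Even this approximation is wrong for some locales, which is why sorted lists need to be reviewed by someone who knows the target language.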

. English sorts by normal ASCII value sequence.
. How to sort Chinese names?

Other Peripherals

. Printer:
  - Some printers do not support certain languages.
  - Testers must be aware of these non-I18N printers and test for compatibility.
  - Paper sizes may also cause issues: A4 or Letter?
. Mouse with non-standard drivers
. Wireless support: GSM, CDMA, 3G, LTE
. Data storage: DVD, flash drive…

OS Localization Test

. There is not just one Windows 10; there are Windows 10 German, French, Chinese, etc. The application needs to be tested completely on all supported OS localizations.

https://geekupwithyourdevices.blogspot.com/2015/05/windows-10-build-10123-screenshots.html

Data Format
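To see why a string such as "01/02/03" is ambiguous, the sketch below parses it under three common date conventions and formats one integer with three digit-group separators. The separator table and its locale-style keys are hypothetical. Real locale data is richer than a single character per region.

```python
from datetime import datetime

raw = "01/02/03"

# Three conventions read the same string as three different dates.
as_month_first = datetime.strptime(raw, "%m/%d/%y").date()  # month/day/year
as_day_first = datetime.strptime(raw, "%d/%m/%y").date()    # day/month/year
as_year_first = datetime.strptime(raw, "%y/%m/%d").date()   # year/month/day

# Hypothetical table of digit-group separators, keyed by locale-style names.
GROUP_SEPARATOR = {"en_US": ",", "de_DE": ".", "fr_FR": " "}

def format_integer(value: int, locale_name: str) -> str:
    """Group digits in threes using the separator for the given key."""
    return f"{value:,}".replace(",", GROUP_SEPARATOR[locale_name])

formatted = {name: format_integer(240125, name) for name in GROUP_SEPARATOR}

print(as_month_first, as_day_first, as_year_first)
print(formatted)
```

Storing dates and numbers in a neutral internal form and formatting them only at the user interface avoids most of these ambiguities.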

. "01/02/03"?
. Time zones and daylight saving time?
. 240.125 vs. 240,125 vs. 240 125
. Money symbols vary:
  - $125,000 ---> £125,000?
. Address formats
. Phone number formats
. Calendar formats
. Measurement units! (Mars Climate Orbiter)

Colors

. Colors are interpreted differently among regions.

Icon Design

. Avoid humor, puns, slang, special, mythological, and religious symbols in icons.
. Do not require the user to understand subtleties of the originating language or culture.
. Ensure your icons are not offensive.
  - Thumbs up: insulting in Turkey
  - “Ok” sign: insulting in Brazil and other countries

Summary: Special Attention for G11N

. Design and Implementation

. Testing
  - Must have testers who recognize language and cultural defects.
. Deployment and Sales
  - Must follow the business rules and regulations of the countries in which you sell.
  - Copyrights and anti-piracy practices.
. Installation
  - The installer must be multi-language to direct users to their native language.
. Support and maintenance
  - Must be able to communicate in the users’ language and during their regular business hours.
  - All documentation must be kept synchronized with the product in multiple languages.