
Internationalization 360° Testing
Mahipalsinh Rana
Member of Technical Staff, Sun Microsystems

Agenda
• Introduction of I18n - 40 minutes
• Internationalization (I18n) 360° testing - 60 minutes
• Testing standalone applications - 15 minutes
• Quiz - 10 minutes
• Testing web applications - 15 minutes
• Quiz - 10 minutes
• I18n testing automation - 30 minutes
• Advanced I18n testing, references - 15 minutes
• Q/A - 15 minutes

Introduction
• Understanding of Internationalization (I18n)
• Why I18n testing
• Myths about I18n testing
• Scope of I18n testing
• Terminologies in I18n technology
  • Character set/character repertoire
  • Character code/code point, coded character
  • Encoding, Unicode, UTF-8, UTF-16, UTF-32
  • Glyph, fonts, Input Method Engine (IME)
  • Locale

"Everyone has the right... to seek, receive and impart information and ideas through any media regardless of frontiers"
-- Universal Declaration of Human Rights

Why Globalization
[Screenshot: Sun Portal server in Chinese]
[Screenshot: Yahoo.com in Kannada]
• "Visitors linger twice as long as they do at English-only URL's.
• Business users are 3 times more likely to buy when addressed in their language.
• Customer service costs drop when instructions are displayed in the user's native language."
  -- 'Strategies for Global Sites', Donald DePalma, Forrester Research Inc.
• "One large IT company discovered that a significant percentage of inquiries were coming from South Korea - they created a Korean website and revenues rose by 8 percent."
  -- 'Global eCommerce', Donald J. Plumley, Bowne Global Solutions

What's with the acronyms?
• Internationalization ====> I18n. How? There are 18 characters between the first "i" and the last "n".
• With that logic:
  Localization ====> L10n
  Globalization ====> G11n
  Translation ====> T9n
  and you can call me M5l ====> Mahipal

Don't they all look the same?
• Localization
• Internationalization
• Globalization
• Translation

How do they differ and relate?
• Globalization encompasses I18n and L10n.
• Internationalization enables localization.
• An expert in I18n may not be an expert in L10n.

LISA* Definitions
• Globalization (G11n)
  "Globalization addresses the business issues associated with taking a product global. In the globalization of high-tech products this involves integrating localization throughout a company, after proper internationalization and product design, as well as marketing, sales, and support in the world market."
• Internationalization (I18n)
  "Internationalization is the process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for re-design. Internationalization takes place at the level of program design and document development."
• Localization (L10n)
  "Localization involves taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold."
*Localization Industry Standards Association

Why I18n testing?
• I18n testing is required to enable product localization in multiple languages.
• Removing barriers to localization
• Enabling Unicode
• Independence from UI strings in code
• Handling legacy character encodings
• Separating localizable elements from source
• Enabling code to support local, regional, language, or culturally related preferences

Myths about I18n Testing
• It is mistaken for translation testing
• Only a language expert can perform I18n testing
• It is done after the product is released
• It is confused with product localization
• It is only about string messages

Terminologies of I18n
• What is a character set/character repertoire?
• What is a character code/code point, coded character?
• What is Unicode?
• What is meant by encoding?
• UTF-8, UTF-16, UTF-32
• What is a glyph?
• What is a font?
• What is an Input Method Engine (IME)?
• What is a locale?

What is Character, Character Set?
• A character is just an abstract minimal unit of text.
  It doesn't have a fixed shape (that would be a glyph), and it doesn't have a value.
• "A" is a character, and so is "$", a currency symbol.
• A character set/repertoire is a collection of characters.
• Examples: "Making the World Wide Web world wide!" [the same sentence in several non-Latin scripts; sample text not recoverable]

What is Character Code/Code Point, Coded Character Set?
• Character code: a mapping that defines a one-to-one correspondence between the characters in a character repertoire and a set of non-negative integers.
• Examples of character codes: ASCII; ISO Latin 1 (alias ISO 8859-1); ISO 10646; the Windows character set, which exists in several variants or "code pages" (CP), such as Windows code page 1252.
• A code point is the unique non-negative integer assigned to a character in a character code.
• A coded character set is a character set in which each character has been assigned a unique code point.
• [Image: ASCII character set, one of the early character sets]
• [Image: Ex. ASCII character set] An 8-bit character set covers most of the characters needed by Europeans, but what about the eastern part of the world?

Unicode
• The answer is Unicode.
• It has characters from almost every written script in the world:
  • European alphabetic scripts: Latin, Greek, Cyrillic, Armenian, Georgian, Runic, Ogham, Modifier letters
  • Middle East scripts: Hebrew, Arabic, Syriac, Thaana
  • South and South East Asian scripts: Devanagari, Bengali, Gujarati, Panjabi, Oriya, Tamil, Telugu, Kannada, Malayalam
  • East Asian scripts: Han, Hiragana, Katakana, Hangul, Bopomofo, Yi
  • Symbols: currency symbols, letter-like symbols, mathematical operators, numeric forms, technical symbols, geometrical symbols
  • Additional scripts: Ethiopic, Cherokee, Canadian Aboriginal Syllabics, Mongolian

What is Character Encoding?
• A mapping from a set of non-negative integers that are elements of a coded character set to a set of sequences of code units of some specified width, such as 8-bit, 16-bit, or 32-bit integers.
• The most commonly used code units are bytes, but 16-bit or 32-bit integers can also be used for internal processing.
• Examples are UTF-8, UTF-16, and UTF-32.

UTF-8, UTF-16, UTF-32
• UTF-32 simply represents each Unicode code point as the 32-bit integer of the same value.
• UTF-16 uses sequences of one or two unsigned 16-bit code units to encode Unicode code points. [Values U+0000 to U+FFFF are encoded in one 16-bit unit with the same value. Supplementary characters (U+10000 to U+10FFFF) are encoded in two code units, a surrogate pair.]
• UTF-8 uses sequences of one to four bytes to encode Unicode code points. [U+0000 to U+007F are encoded in one byte, U+0080 to U+07FF in two bytes, U+0800 to U+FFFF in three bytes, and U+10000 to U+10FFFF in four bytes.]

Relation between Character set and Encoding
(byte values in hexadecimal; UTF-16 and UTF-32 shown in big-endian order)

  Code point   Character   UTF-8      UTF-16   UTF-32
  U+0041       A           41         00 41    00 00 00 41
  U+05D0       א           D7 90      05 D0    00 00 05 D0
  U+597D       好          E5 A5 BD   59 7D    00 00 59 7D

• Different encodings yield different byte sequences for the same character in a character set.

Unicode Character set, code set, encodings
• Universal character set/repertoire: every character set is a subset of this huge character repertoire (ASCII set, French, Japanese, Korean, Devanagari).
• Unicode code points: each Unicode character is assigned a Unicode code point. The range is U+0000 to U+10FFFF.
• UTF encodings: UTF-8, UTF-16, and UTF-32 are the encoding formats for internal processing.

What is Glyph?
• A glyph is a visual appearance.
• It is important to distinguish the character concept from the glyph concept. A glyph is a presentation of a particular shape which a character may have when rendered or displayed.
• Example: a letter and different glyphs for it: LATIN CAPITAL LETTER Z (U+005A), shown as "Z" in four different typefaces. [A second glyph sample is not recoverable.]

What is Font?
• A repertoire of glyphs comprises a font.
• A font is a numbered set of glyphs.
• The numbers correspond to the code positions of the characters (presented by the glyphs).
• A font that includes the characters of a language must be available for an application to display text in that language.

What is Input Method Engine (IME)?
• Input methods capture a sequence of keystrokes and form a character or characters as input for a language.
• An Input Method Engine (IME) is a program or operating system component that allows computer users to enter complex characters and symbols using a standard Western keyboard. It is also referred to as an Input Method Environment.

What is Locale?
• A locale is a set of parameters that defines the user's language, country, and any special variant preferences that the user wants to see in their user interface.
• The locale naming convention is usually: language[_territory][.encoding][@modifier].
• Example for Hindi with UTF-8 encoding: hi_IN.UTF8
• Encoding: native encodings (ISO 8859-*, Shift_JIS, GB18030, Big5, ISO 2022) or Unicode encodings (UTF-8, UTF-16)
• Behavior affected by locale:
  • Language culture data
    • Sorting, searching, text boundary, text conversion
    • Indexing
  • Country culture data
    • Calendar, date/time/number/currency format
    • People name/mailing address layout

I18n 360° Testing Approach
• What is the traditional approach
• What is the 360° approach
• Case study
  • Requirement phase
  • Design phase
  • Implementation phase
  • QA phase
  • Documentation

Traditional Approach of I18n Testing
• Generally starts after a build is released by the development team.
• In some cases it starts even after the product release, when a separate international release is shipped.
• Major focus on functionality testing
• I18n testing done on
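
As a small illustration of the locale naming convention language[_territory][.encoding][@modifier] described earlier, the following Python sketch splits a locale name into its parts. The names `LocaleName` and `parse_locale_name` are hypothetical helpers introduced only for this example, and the regular expression is a simplifying assumption about which characters each part may contain; it is not taken from any standard library or from the original slides.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Parts of a locale name; optional parts are None when absent."""
    language: str
    territory: Optional[str]
    encoding: Optional[str]
    modifier: Optional[str]


# Assumed pattern for language[_territory][.encoding][@modifier].
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<encoding>[A-Za-z0-9_-]+))?"
    r"(?:@(?P<modifier>[A-Za-z0-9]+))?$"
)


def parse_locale_name(name: str) -> LocaleName:
    """Split a locale name such as 'hi_IN.UTF8' into its components."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognizable locale name: {name!r}")
    return LocaleName(**match.groupdict())


if __name__ == "__main__":
    # Prints: LocaleName(language='hi', territory='IN', encoding='UTF8', modifier=None)
    print(parse_locale_name("hi_IN.UTF8"))
```

A tester could use a helper like this to check that a product reports the expected language, territory, and encoding for each test locale. Special names such as "C" or "POSIX" do not follow the convention and are deliberately rejected by this sketch.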