Internationalization 360˚ Testing

Mahipalsinh Rana
Member of Technical Staff
Sun Microsystems

Agenda

  • Introduction to I18n - 40 minutes
  • Internationalization (I18n) 360˚ testing - 60 minutes
  • Testing standalone applications - 15 minutes
  • Quiz - 10 minutes
  • Testing web applications - 15 minutes
  • Quiz - 10 minutes
  • I18n testing automation - 30 minutes
  • Advanced I18n testing, references - 15 minutes
  • Q/A - 15 minutes

Introduction

  • Understanding of Internationalization (I18n)
  • Why I18n testing
  • Myths about I18n testing
  • Scope of I18n testing
  • Terminologies in I18n technology
      • Character set/Character repertoire
      • Character Code/Code Point, Coded Character
      • Encoding, Unicode, UTF-8, UTF-16, UTF-32
      • Glyph, Fonts, Input Method Engine (IME)
      • Locale

"Everyone has the right... to seek, receive and impart information and ideas through any media regardless of frontiers"
-- Universal Declaration of Human Rights

Why Globalization

[Screenshot: Sun Portal Server in Chinese]

[Screenshot: Yahoo.com in Kannada]

  • "Visitors linger twice as long as they do at English-only URLs.
  • Business users are 3 times more likely to buy when addressed in their language.
  • Customer service costs drop when instructions are displayed in the user's native language."

-- 'Strategies for Global Sites', Donald DePalma, Forrester Research Inc.

"One large IT company discovered that a significant percentage of inquiries were coming from South Korea - they created a Korean website and revenues rose by 8 percent."

-- 'Global eCommerce', Donald J. Plumley, Bowne Global Solutions

What's with the acronyms?

Internationalization ====> I18n. How? There are 18 characters between "i" and "n".
With that logic:

  Localization  ====> L10n
  Globalization ====> G11n
  Translation   ====> T9n

and you can call me M5l ==> Mahipal.

Don't they all look the same?

  • Localization
  • Internationalization
  • Globalization
  • Translation

How do they differ and relate?

  • Globalization encompasses I18n and L10n.
  • Internationalization enables localization.
  • An expert in I18n may not be an expert in L10n.

LISA* Definitions

  • Globalization (G11n)
      "Globalization addresses the business issues associated with taking a product global. In the globalization of high-tech products this involves integrating localization throughout a company, after proper internationalization and product design, as well as marketing, sales, and support in the world market."
  • Internationalization (I18n)
      "Internationalization is the process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for re-design. Internationalization takes place at the level of program design and document development."
  • Localization (L10n)
      "Localization involves taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold."

*Localization Industry Standards Association

Why I18n testing?

  • I18n testing is required to enable product localization in multiple languages.
  • Removing barriers to localization
  • Enabling Unicode
  • Keeping code independent of UI strings
  • Handling legacy character encodings
  • Separating localizable elements from source
  • Enabling code to support local, regional, language, or culturally related preferences

Myths about I18n Testing

  • It is misunderstood as translation testing.
  • Only a language expert can perform I18n testing.
  • It is done after the product is released.
  • It is confused with product localization.
  • It is only about string messages.

Terminologies of I18n

  • What is a character set/character repertoire?
  • What is a character code/code point, coded character?
  • What is Unicode?
  • What is meant by encoding?
  • UTF-8, UTF-16, UTF-32
  • What is a glyph?
  • What is a font?
  • What is an Input Method Engine (IME)?
  • What is a locale?

What is Character, Character Set?

  • A character is just an abstract minimal unit of text.
    It doesn't have a fixed shape (that would be a glyph), and it doesn't have a value.
  • "A" is a character, and so is "$", the dollar currency symbol.
  • A character set (or character repertoire) is a collection of characters.
  • Examples:
      Making the World Wide Web world wide!
      ワールド・ワイド・ウェッブを世界中に広げましょう
      전세계의 월드 와이드 웹으로 만들기!

What is Character Code/Code Point, Coded Character Set?

  • Character code: a mapping that defines a one-to-one correspondence between the characters in a character repertoire and a set of non-negative integers. Examples of character codes:
      • ASCII, ISO Latin 1 (alias ISO 8859-1), ISO 10646, and the Windows character set, which exists in different variations or "code pages" (CP), such as Windows code page 1252
  • A code point is the unique non-negative integer assigned to a character in a character code.
  • A coded character set is a character set in which each character has been assigned a unique code point.

[Image: ASCII character set]

  • The ASCII character set is one of the earliest character sets.

[Image: example ASCII character set]

  • 8-bit character sets built on ASCII (for example ISO 8859-1) cover most of the characters needed by Europeans, but what about the eastern part of the world?

Unicode

  • The answer is Unicode.
  • It has characters from almost every written script in the world:
      • European alphabetic scripts: Latin, Greek, Cyrillic, Armenian, Georgian, Runic, Ogham, modifier letters
      • Middle Eastern scripts: Hebrew, Arabic, Syriac, Thaana
      • South and South East Asian scripts: Devanagari, Bengali, Gujarati, Panjabi, Oriya, Tamil, Telugu, Kannada, Malayalam
      • East Asian scripts: Han, Hiragana, Katakana, Hangul, Bopomofo, Yi
      • Symbols: currency symbols, letterlike symbols, mathematical operators, numeric forms, technical symbols, geometrical symbols
      • Additional scripts: Ethiopic, Cherokee, Canadian Aboriginal Syllabics, Mongolian

What is Character Encoding?

  • A character encoding is a mapping from a set of non-negative integers that are elements of a coded character set to a set of sequences of code units of some specified width, such as 8-bit, 16-bit, or 32-bit integers.
  • The most commonly used code units are bytes, but 16-bit or 32-bit integers can also be used for internal processing.
  • Examples are UTF-8, UTF-16, and UTF-32.

UTF-8, UTF-16, UTF-32

  • UTF-32 simply represents each Unicode code point as the 32-bit integer of the same value.
  • UTF-16 uses sequences of one or two unsigned 16-bit code units to encode Unicode code points. Values U+0000 to U+FFFF are encoded in one 16-bit unit with the same value. Supplementary characters are encoded in two code units.
  • UTF-8 uses sequences of one to four bytes to encode Unicode code points. U+0000 to U+007F are encoded in one byte, U+0080 to U+07FF in two bytes, U+0800 to U+FFFF in three bytes, and U+10000 to U+10FFFF in four bytes.

Relation between Character set and Encoding

  Character     A             א             好
  Code point    41            5D0           597D
  UTF-8         41            D7 90         E5 A5 BD
  UTF-16        00 41         05 D0         59 7D
  UTF-32        00 00 00 41   00 00 05 D0   00 00 59 7D

  • Different encodings yield different byte sequences for the same character in a character set.

Unicode Character set, code set, encodings

  • Universal character set/repertoire: every character set is a subset of this huge character repertoire (the ASCII set, French, Japanese, Korean, Devanagari).
  • Unicode code points: each Unicode character is assigned a Unicode code point. The range is U+0000 to U+10FFFF.
  • UTF encodings: UTF-8, UTF-16, and UTF-32 are the encoding formats for internal processing.

What is Glyph?

  • A glyph is a visual appearance.
  • It is important to distinguish the character concept from the glyph concept. A glyph is a presentation of a particular shape which a character may have when rendered or displayed.
  • Example: a letter and different glyphs for it: Latin capital letter Z (U+005A)
      Z Z Z Z

What is Font?

  • A repertoire of glyphs comprises a font.
  • A font is a numbered set of glyphs.
  • The numbers correspond to the code positions of the characters (presented by the glyphs).
  • For an application to display text in a language, a font that includes the characters of that language must be available.

What is Input Method Engine (IME)

  • Input methods capture a sequence of keystrokes and form a character or characters as input for languages.
  • An Input Method Engine (IME) is a program or operating system component that allows computer users to enter complex characters and symbols using a standard Western keyboard. It is also referred to as an Input Method Environment.

What is Locale?

  • A locale is a set of parameters that defines the user's language, country, and any special variant preferences that the user wants to see in their user interface.
  • The locale naming convention is usually: language[_territory][.encoding][@modifier].
  • Example for Hindi with UTF-8 encoding: hi_IN.UTF-8
  • Encoding: native encodings (ISO 8859-*, Shift_JIS, GB18030, BIG5, ISO 2022) or Unicode encodings (UTF-8, UTF-16)
  • Behavior affected by locale:
      • Language culture data
          • Sorting, searching, text boundary, text conversion
          • Indexing
      • Country culture data
          • Calendar, date/time/number/currency format
          • People name/mailing address layout

I18n 360˚ Testing Approach

  • What is the traditional approach
  • What is the 360˚ approach
  • Case study
  • Requirement phase
  • Design phase
  • Implementation phase
  • QA phase
  • Documentation

Traditional Approach of I18n Testing

  • Generally starts after a build is released by the development team.
  • In some cases it starts even after product release, when a separate international release is made.
  • Major focus on functionality testing
  • I18n testing done on
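
The byte values in the "Relation between Character set and Encoding" table can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function names `hex_bytes` and `encoding_row` are made up for illustration, and the big-endian codecs (`utf-16-be`, `utf-32-be`) are chosen on the assumption that the table shows big-endian bytes without a byte order mark.

```python
# Characters from the table: A (U+0041), Hebrew alef (U+05D0), Han 好 (U+597D).
SAMPLES = ["A", "\u05d0", "\u597d"]


def hex_bytes(text: str, encoding: str) -> str:
    """Encode `text` and return its bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


def encoding_row(ch: str) -> dict:
    """Return the code point and the UTF-8/16/32 byte sequences for one character.

    The "-be" codecs are used so that no byte order mark is emitted and the
    bytes appear most-significant first (an assumption about the table).
    """
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf-8": hex_bytes(ch, "utf-8"),
        "utf-16": hex_bytes(ch, "utf-16-be"),
        "utf-32": hex_bytes(ch, "utf-32-be"),
    }


if __name__ == "__main__":
    for ch in SAMPLES:
        row = encoding_row(ch)
        print(row["code_point"], row["utf-8"], row["utf-16"], row["utf-32"], sep=" | ")
```

The same character yields a different byte sequence under each encoding, which is why an I18n test has to know which encoding a file, database column, or network protocol is expected to use.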

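The locale naming convention language[_territory][.encoding][@modifier] from the "What is Locale?" slide can be split into its parts with a regular expression. This is a hedged sketch rather than a reference parser: `LocaleName` and `parse_locale_name` are hypothetical names, the accepted character classes are an assumption, and special names such as "C" or "POSIX" are deliberately not handled.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    encoding: Optional[str]
    modifier: Optional[str]


# language[_territory][.encoding][@modifier]
# Assumption: the language is 2-3 letters; the other parts are alphanumeric,
# with "-" and "_" also allowed inside the encoding (e.g. UTF-8, Shift_JIS).
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<encoding>[A-Za-z0-9_-]+))?"
    r"(?:@(?P<modifier>[A-Za-z0-9]+))?$"
)


def parse_locale_name(name: str) -> LocaleName:
    """Split a locale name into its components; absent optional parts are None."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized locale name: {name!r}")
    return LocaleName(**match.groupdict())


if __name__ == "__main__":
    for name in ["hi_IN.UTF-8", "ja_JP.eucJP", "de_DE@euro", "fr"]:
        print(name, "->", parse_locale_name(name))
```

A test harness could use such a helper to generate the matrix of language, territory, and encoding combinations that a product claims to support.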