GETTING STARTED GUIDE: INTERNATIONALIZATION

MultiLingual Computing & Technology

This guide is an introduction to internationalization: what it is, why we do it, and how it is done.

When I talk to people encountering this term for the first time, I tell them about my cookie recipe. I will be the first to tell you that I am not a cook, but I do have a cookie recipe that is a very nice combination of butter and sugar and flour. What does this have to do with internationalization? Well, I can make this cookie batter into a wide variety of cookies. I can add oatmeal and raisins or chocolate chips or cinnamon and nutmeg. The results are (almost) always tasty cookies, but they are tailored to the preferences of the recipients. I think it is because I started with a good quality item that has been carefully designed to allow for many “localizations.”

Are you hungry for more? Here is what we’ve included in this guide to help you get started.

Many people think of software when they think of internationalization. But Tracy Russell takes us beyond that to give us a description of important internationalization principles that apply to content and design.

To someone new to internationalization, the subject of Unicode can easily be misunderstood. And for good reason: the word is misused in many ways. Richard Gillam, who wrote Unicode Demystified, has written an introduction to the topic that explains exactly what Unicode is and why its misuses are incorrect.

In addition to Unicode, some misunderstandings about internationalization in general persist. Andrea S. Vine serves up a dozen of these misconceptions and explains just what is wrong with them and why it is wrong.

Most programmers have probably written a “Hello, World” program to learn a new programming environment. Donald A. DePalma takes a delightful look at the classic beginners’ program using a short Java fragment and shows us just how many ways it can fail the internationalization test.

Bill Hall, author of Globalization Handbook for the Microsoft .NET Platform (available at http://www.multilingual.com/eBooks), outlines various questions to be considered when designing software for a global audience. He then provides valuable information for project managers and programmers alike. His sidebar “Some Principles for Internationalization” is a worthwhile resource for beginners and experienced programmers alike.

Donna Parrish, Publisher

Editor-in-Chief, Publisher: Donna Parrish
Managing Editor: Laurel Wagers
Translation Department Editor: Jim Healey
Copy Editor: Cecilia Spence
Research: David Shadbolt
News: Kendra Gray, Becky Bennett
Illustrator: Doug Jones
Production: Sandy Compton
Cover photograph courtesy of Seattle Public Library
Editorial Board: Jeff Allen, Henri Broekmate, Bill Hall, Andres Heuberger, Chris Langewis, John O’Conner, Mandy Pet, Reinhard Schäler
Advertising Director: Jennifer Del Carlo
Advertising: Kevin Watson, Bonnie Merrell
Webmaster: Aric Spence
Assistants: Shannon Abromeit, Zabrielle Dillon

Advertising: [email protected], http://www.multilingual.com/advertising, 208-263-8178
Subscriptions, customer service, back issues: [email protected], http://www.multilingual.com/subscribe
Submissions: [email protected]. Editorial guidelines are available at http://www.multilingual.com/editorialWriter
Reprints: [email protected]

This guide is published as a supplement to MultiLingual Computing & Technology, the magazine about language technology, localization, web globalization and international software development.

Authors

DONALD A. DEPALMA is cofounder and president of Common Sense Advisory and author of Business Without Borders: A Strategic Guide to Global Marketing. He can be reached at don@commonsenseadvisory.com

RICHARD GILLAM is a senior software developer at Language Analysis Systems and author of Unicode Demystified: A Practical Programmer’s Guide to the Encoding Standard. He can be reached at [email protected]

BILL HALL is a writer, teacher and consultant in internationalization, currently at Adobe Systems, and author of Globalization Handbook for the Microsoft .NET Platform. He can be reached at [email protected]

TRACY RUSSELL is publishing services manager at the localization firm Wordbank. She can be reached at [email protected]

ANDREA S. VINE is an internationalization architect at Sun Microsystems and writes a blog at http://blogs.sun.com/i18ngal. She can be reached at [email protected]

MultiLingual Computing, Inc.
319 North First Avenue, Suite 2
Sandpoint, Idaho 83864-1495 USA
208-263-8178 • Fax: 208-263-6310
[email protected]
http://www.multilingual.com

Get Ready to Go International

Tracy Russell

There is an old joke in which a traveler stops to ask for directions. The old man scratches his head and says, “Well, if I were you, I wouldn’t start from here.” Unfortunately, we often feel like saying this to some of our clients who present us with projects for localization that have clearly not been conceived with any understanding of the concept of internationalization.

So what exactly is internationalization in the context of localization? It is the process of engineering a product or developing a service so that it can be easily and efficiently localized without having to be rewritten, redesigned or reengineered to cope with different languages and regions.

In this introductory guide, we offer some guidelines for clients whose products and services will be marketed beyond their domestic market, and, as marketing communications specialists, we focus on the key elements of international communication: content and design. You will notice that a common thread running through all our advice is the need to consider internationalization earlier rather than later in the development of marketing materials to avoid unnecessary costs and delays.

How to say what you mean and mean what you say in any language. The golden rule of creating source text that will work effectively in any language and market is to keep it clear and simple and to avoid as many cultural references as possible. The source text should be well written, unambiguous and grammatically correct. It should conform to any in-house corporate guidelines for terminology and style to reinforce corporate branding but should also be acceptable to local markets from an idiomatic perspective.

The internationalization guidelines below for the creation of content are not mandatory, but they will help to ensure that the source text can be used internationally, will minimize localization cost and time, and allow the user to read and understand the text easily.

• Keep copy short and succinct.
• Write clearly and unambiguously.
• Decide upon an appropriate tone of voice and register for the target audience and stick to it.
• Develop and approve key messages and terminology first.
• Avoid clichés, cultural references and jargon because they are difficult to translate effectively.
• Do not use “street” language or words and phrases that will only be used by a minority of your target audience.
• Either avoid abbreviations and acronyms or write the terms out in full before using the abbreviations and acronyms.
• Avoid names based on abbreviations. Even when abbreviations are universally recognized, they can present pronunciation problems for different cultures.
• Avoid metaphors or names based on images. A bull market or a groundhog day will be meaningless to many cultures.
• Be aware that humor often does not travel beyond its culture of origin and can be very expensive to adapt.

Think International Before You Get Creative

Since the globalization process is often based on the adaptation of copy and design from an original marketing tool such as an English language website or an advertising campaign, the way in which the original design is created has a substantial impact on localization. When designing for an international marketplace, you have to consider both the cultural and the technical implications: the suitability of the design for local markets and the suitability of the design for the localization process.

Designing for local markets is about considering how the message will be received. Is there any danger that the design could be regarded as culturally sensitive in any current or future international markets? Do the visual elements create a positive impression in these markets? Does the design communicate the intended meaning?

Designing for the localization process is about understanding the technicalities of design and how they can either promote or hinder the localization process. The agency responsible for localizing the design will be strongly reliant on the technical and visual design of the original in order to produce a consistent set of localized versions. The speed, efficiency and cost of design localization will also depend on whether the design has been fully internationalized and, therefore, does not require time-consuming language-specific manipulation.

Let us look at the two main cultural issues related to designing for international markets: color and imagery.

The color purple: death or royalty? Color can have a strong positive or negative representation in all cultures. Understanding the impact of color will help with the design, enabling you to emphasize or de-emphasize corporate colors for a global audience.

The color black, for example, signifies death in the West, but in China the color of death is white. Purple signifies bravery and royalty in the West, but is the color of mourning in Brazil. Red is commonly associated with danger in the West but is associated with weddings in China. Green and light blue are regarded as sacred colors in the Middle East, and saffron yellow is a sacred color for Buddhists.

This is not to say that sensitive colors cannot be used in designs for a global audience. It is useful, however, to consider the impact of color choice in the context of a multicultural target market at the earliest stages of the design process.


Beware of the dog. You also need to be aware of the suitability of images for a global audience and be prepared to offer different images depending on the target market. Examples of the types of images that can cause difficulties include people, animals, flags and icons.

People. Many cultures are extremely sensitive to ethnicity, dress and poses, particularly relating to women. A recent poster campaign for Lux, featuring Sarah Jessica Parker in a sleeveless dress, had to be hastily airbrushed to cover her arms for the Israeli market.

Animals. They conjure up different images in different cultures. Dogs are generally considered to be man’s best friend in the West, but Arab cultures find them “unclean” and offensive.

Flags. Flags are always best avoided because they are more political than cultural and do not clearly represent a specific language. Which language is represented, for instance, by the Swiss or Belgian flag?

Icons. Common cultural references such as mailboxes, rubbish bins and phone boxes are often used in website designs but are unlikely to be universally understood as each country has a different design.

Designing for the Localization Process

Once the text and design are culturally suitable, the next challenge is to ensure that the design is internationalized from a technical perspective. There are many issues to consider here, and, again, a key piece of advice is to consider all of these before the localization process begins, particularly if the design is for a multilingual website.

Leave plenty of room for text expansion. No two languages take up the same amount of space when laid out in a design. Individual words can expand by up to 300%, and design elements such as a text box can sometimes take up twice as much space as the English source. But one paragraph in a document might expand by 30%, and the next may not expand at all. Some languages such as Russian can expand up to 70%. Others such as Hebrew and Asia-Pacific languages may contract and take up less space.

The behavior of localized text thus provides a significant challenge in producing a design that can be internationalized and support a wide range of target languages.

In order to accommodate text expansion, the layout and design of the original must either allow space for such expansion to occur or for elements to be moved. Generally, text expansion is handled by expanding into empty areas of the page or by reducing the size, leading and tracking of the text (by as little as possible). In some cases, however, more deliberate action must be taken.

For example, headlines often do not translate easily, and five words can become eight or ten. Point size reduction is often the only option on a design where text expansion has not been taken into consideration. Interesting alignments and typographical emphasis on the different elements of a headline can, as a result, be difficult to reproduce.

Captions should not be crammed too tightly on a page, either to a graphic or to each other. A heavily labelled diagram needs plenty of space for text expansion.

Tables and forms are difficult to handle because of the use of unalterable colored backgrounds and lines that make expansion impossible.

Select your languages before you choose your fonts. Non-Western languages require different typefaces in order to accommodate the extra characters not supported by standard fonts. It is, therefore, a good idea to consider what languages will be required at the earliest possible stage of design before you decide on which font you are going to use. If you want to use a particular font across all your communications, remember that the font should be widely available in all mediums. For website localization, Arial and Times New Roman are probably the safest bets. If a browser cannot display the correct font, the result will be the nearest the browser can find on the machine, which may compromise the design.

Beware the use of corporate fonts. Unless they have been developed by large organizations with large budgets, they do not tend to support non-English characters. Font design and creation are highly specialized and expensive processes.

Where a font is used that does not support localized characters, the only option is to find the closest available match for the target character set, which means that your agencies and localization companies must also possess a copy of the chosen font. It is possible to produce customized versions of Western European fonts for non-Western languages, but it means that extra time will have to be built into the project. In short, failing to carefully consider font selection at the design stage can add time and cost to a project.

Designing for bidirectional and double-byte languages. Bidirectional languages such as Arabic and Hebrew may require a full re-working, as they must read from right to left. This means the production of reversed artwork and possibly a change of graphics. Consultation at the earliest possible stage of design will assist with speedy delivery of localized Hebrew and Arabic versions.

While some publishing packages readily support the direct input of Asia-Pacific character sets such as Chinese, Japanese and Korean, localized design can be produced so that the file can be run through the normal printing process without the need for specialist software. Again, our advice is to consult with the experts at the earliest possible stage of design to ensure the speedy delivery of localized Asia-Pacific versions.

Ask your printer for advice. It is always worth consulting your printer before finalizing the design, particularly when applying the same design across multiple language outputs. Printing costs can be greatly reduced by the use of a “fifth plate,” a technique that is used extensively in the packaging world, where an initial large print run containing all the color elements (photos/graphics) is produced. This “blank” printed output is then overprinted with the text elements in a smaller, language-specific print run, which allows for more print-on-demand flexibility. This technique only works where the language-specific elements can be printed in a separate color (or colors) without affecting the rest of the design. This typically means that the text is black.


Avoid turning text into graphics. The guiding principle is to avoid putting translatable text elements in separate graphics files unnecessarily, either for online or offline communication. Once embedded into a graphics file, it is a manual process to extract the text, and this has to be done separately from the main text extraction. A better approach when designing for print is to place text in frames laid over the graphic. This means that text can be extracted in one simple process. Less switching between programs is needed during the typesetting process, and there are fewer files to deliver once localization is completed. It is easier to accommodate different graphics for different language versions as appropriate without involving translators and typesetters in the process.

Burning text into graphics for online communication means that it is displayed in a certain way and cannot be adjusted by user screen resolution preferences or the re-sizing of a window. This might be desirable from a design perspective, but it makes the localization process longer and more complex because there is no efficient way of automating text extraction and re-insertion into graphics, and you will probably need to involve a DTP/web graphics specialist. The same look and feel can often be achieved using plain HTML, particularly with the use of Cascading Style Sheets, and this results in a far more localization-friendly design.

Embedding text in graphics also locks your design into one that is only suitable for the English language. If text is already embedded, ensure that the design allows for language expansion. Otherwise, the only option for localized versions is to reduce the point size to make the text fit, and this can affect the legibility of the content.

Ready for Takeoff?

Preparing for international departures is not difficult, but it does involve getting a grip on a number of cultural issues before you even brief your creative agency. You need to work with a partner who understands internationalization and localization, otherwise you could make costly mistakes. Never underestimate the potential sensitivity of any ethnic, religious or cultural group to what you say and to how you present your business visually.

As any business knows, a reputation can take a lifetime to earn and a moment to destroy. So why jeopardize your chances of international success just because your agency didn’t know that a beautiful young Chinese woman dressed in white is more likely to be on her way to her funeral than her wedding?

‘Hello, World’ as an Internationalization Wake-up Call

Donald A. DePalma

While eXtreme programming makes the headlines, the reality is that developers code by example, precedent or plagiarism. I have long contended that only one COBOL application was ever written from scratch, and every COBOL programmer since Admiral Grace Hopper merely cut and pasted her code. What happens when a new programmer fires up his or her favorite interactive development environment to build some code for international deployment? He or she will likely copy some code that he or she wrote before and adapt it to the project at hand.

Let’s leave the ancien régime of COBOL behind with an easy example, using the first code written by C-savvy programmers new to a language or development tool. The code most likely to be written displays the text “Hello, world.” Simple as they are, these few lines do not travel well beyond English. Consider the following Java fragment:

    package mine;
    public class Greetings
    {
        public static void main(String argv[]) throws Exception
        {
            MyDebug.trace("main");
            Date today = new Date();
            System.out.println("Hello, world! Today's date is " + today.toString());
        }
    }

Experts will tell you that Java is internationalized right out of the box. While I’m not an expert on Java, I used to play one back in the 1990s when, as an analyst, I scored lots of coffee, coffee cups and other coffee paraphernalia from Java development tool vendors anxious to convince analysts that their Java was higher octane than their competitors’. I wear my coffee beans on my sleeve.

This short Java example demonstrates a number of problems. Some are merely bad hygiene, but others could have a costly impact on a development project. All the errors demonstrate that code which should be a personal pact between coder and computer can frequently appear before an end user’s wondering eyes, showcasing what we call the “code-content interdependence.” Now I’ll put on my dusty programmer hat and take a trip through my code sample.

Multiple Errors

Undeclared variable. Whoops! This is a simple matter of coder hygiene. I did not declare MyDebug. My bad. I learned Pascal from Andy van Dam. It’s no wonder he gave me an incomplete in that course way back in 1979. I do remember him saying something about programming not being the path that a talented linguist such as I should take. Did he diss me or what? Well, Brown University can forget about this alumnus donating any money when they come begging for their annual fund this year!

Locale-specific strings and code. Both “Hello, world” and the code in which it is embedded are specific to English-speaking locales. I know that best practices dictate that I should isolate locale-specific items such as text, icons and formats in external, localizable resource files. Oh well, it’s not as if anyone will use this code in any deployed application, so what’s the harm of doing it just this once? I can always fix it later when I have more time, or whoever borrows this code can do it in his or her copious spare time.

Internal use only. The routine "main" is for me to debug this code, not something that my end users should see. Without any context, some translator somewhere downstream in the process might decide to translate this perfectly good English word into whatever the target language is. If my company uses an external agency, their translators will probably attempt to translate it when they localize the code. Either way, somehow I think that this one word could cost my company hundreds of dollars of time to research and resolve, especially as it propagates into code for different international markets. Here’s what I’ll do: I’ll leave it as it is and buy a tool such as LingoPort’s Globalyzer or bring in some specialist service provider such as Basis or Symbio to extract all the quoted strings in an application and do all the right things with them. Or I could worry about them now. Oh well, let me stick a little Post-It note on my monitor so that I do it later. Maybe after I get back from coffee. That cuppa Sumatran sounds pretty good right about now.

Mixed message. The output string “Hello, world! Today’s date is April 9, 2005” is concatenated from text plus a date string function. This string isn’t too bad, especially since it’s my birthday. It’s a good thing that it’s a short sentence. Talented linguist that I am (note to self: “don’t donate anything to Brown this year or next!”), I know that more complex messages might get wrapped around a syntax tree. All us savvy linguists know that English favors a subject-verb-object (SVO) word order, while languages such as Japanese and Russian are more tolerant of structures such as SOV or OVS. Maybe some conditional code for locale would work here. Another sticky note to self: “Research sentence order and locales.” Maybe next week.

English methods. The “Date.toString” function produces an acceptable US date string that might not look good in Senegal. Where is Senegal? Hmm, Google “Senegal” and check out the results. Geez, 47 listings on my computer for Senegal. Weirdness personified. Why do I have listings for Senegal on my PC? Oh yeah, I was checking locale-sensitive methods, wasn’t I? Let’s search for time and date formats in Senegal. Wow, 609,000 hits, but thankfully none of them on my computer. That would really be wack. I better get a bit more specific with these search terms. You know what? I’m tired of this. Even if those French speakers prefer “day/month/year,” they’re not going to have a cow if they see “month/day/year,” are they?

Unusable documentation. Sooner or later someone will read my notes about how my little Java program violates internationalization best practices. What if the next programmer is not a teenager or someone else who grew up with American television? Would that programmer know that there was an actor who played a doctor and then appeared in commercials for a medicine saying “I’m not a doctor, but I used to play one on TV”? That to “diss” is to show disrespect? Or that being “wack” is bad? Or that Bart from the animated TV series The Simpsons is always telling concerned adults, “Don’t have a cow!” Whatever. Oh yeah, “whatever” means “I don’t really care what the answer is.”

Lessons Learned

I now have seven lines of code with six major bugs. I know that Dr. Van Dam would find more and would certainly object to my programming style. Hmm, if I had written this in a less internationally savvy language such as Pascal, SNOBOL or even C or C++, there would have been more problems.

Internationalization sure isn’t a walk in the park.

Because application code and user-visible content are so intertwined, you can cause a lot of damage just coding the way you do every day.

Get help.

I gotta get some more coffee.
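For readers who want to see what fixing the “locale-specific strings,” “mixed message” and “English methods” problems might look like, here is a minimal sketch using only classes from the Java standard library. It is not the author’s code. The class name LocalizedGreetings, the in-memory PATTERNS map (a stand-in for the external, localizable resource files the article mentions) and the French wording are all assumptions made for illustration.

```java
import java.text.MessageFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class LocalizedGreetings {

    // Hypothetical stand-in for external resource files, keyed by language code.
    // Each entry is a whole sentence with a date placeholder, so a translator
    // can move the date to wherever the target language needs it.
    private static final Map<String, String> PATTERNS = new HashMap<>();
    static {
        PATTERNS.put("en", "Hello, world! Today''s date is {0,date,long}.");
        PATTERNS.put("fr", "Bonjour le monde ! Nous sommes le {0,date,long}.");
    }

    /** Builds the greeting for a locale, falling back to English if no pattern exists. */
    public static String greeting(Locale locale, Date when) {
        String pattern = PATTERNS.getOrDefault(locale.getLanguage(), PATTERNS.get("en"));
        // MessageFormat formats the date according to the locale's conventions
        // instead of relying on Date.toString().
        return new MessageFormat(pattern, locale).format(new Object[] { when });
    }

    public static void main(String[] args) {
        Date sample = new GregorianCalendar(2005, Calendar.APRIL, 9).getTime();
        System.out.println(greeting(Locale.US, sample));
        System.out.println(greeting(Locale.forLanguageTag("fr-SN"), sample)); // French as used in Senegal
    }
}
```

The exact date text printed for each locale depends on the locale data shipped with the Java runtime, so no output is shown here. The design point is that no sentence is assembled by concatenation and no user-visible string lives in the code path itself; debug-only strings such as "main" would stay out of the pattern table so that translators never see them.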

Unicode From 50,000 Feet

Richard Gillam

The computer industry has a strong tendency to latch onto buzzwords. I can remember when XML first started to get the attention of the industry. Before you knew it, everybody was trying to find some way to say his or her application “supported XML.” It didn’t matter whether there was anything about XML that was especially useful in the application’s problem domain. You needed a way to tie it to XML just the same.

Unicode has been another one of those perennials in the buzzword sweepstakes. Many developers are looking to find a way to say their product “supports Unicode” or “is based on Unicode,” but they often don’t really know or care what that means. Many developers seem to feel that “my program supports Unicode” and “my program is internationalized” are equivalent statements. This is not only wrong, but scary. They’re two very different things. It’s possible both to support Unicode and still not be internationalized, and it’s equally possible to write an internationalized program that doesn’t have anything to do with Unicode.

Other authors in this guide will talk about what it means for a program to be internationalized, so let’s instead take a quick look at just what Unicode is and the problems it does solve. We’ll also skim lightly over the surface of Unicode’s main features.

Unicode is a character encoding standard. What does this mean? Computers don’t have any innate knowledge of text or characters or anything like that; all computers really understand at all are numbers (actually, it’s bit patterns, but let’s not go too deep here). If you want to represent text in software, you adopt a convention where each character you need to represent is assigned to a number. You decide that in your application, anytime you see, say, the number 1 in a memory location you know is supposed to hold text, you interpret it as the letter A. When you see the number 2, it’s B and so on.

Of course, text is so common that rather than having each developer adopt his or her own convention for representing text with numbers, the industry issues standards, official documents that define conventions for assigning numbers to characters. If two applications follow the same standard for representing text, they can pass text back and forth between each other, and they’ll both be able to do things with it properly.

The problem, of course, is that there are so many different standards. Most modern computing systems use the ASCII standard or something based on it. ASCII was published in the 1960s by what is now the American National Standards Institute (ANSI) and uses the values from 32 to 126 to represent the 26 uppercase and lowercase letters of the English alphabet, the 10 digits, and various punctuation marks and symbols. The values from 0 to 31 and the value 127 were reserved for various signals that controlled the communication protocol, and byte values from 128 to 255 weren’t used.

The problem is that ASCII only includes codes for the letters in the English alphabet. Speakers of other languages didn’t have codes for the letters of their alphabets. Since the byte values from 128 to 255 weren’t standardized by ASCII, various other standards sprung up that used these code values for the letters of other alphabets. Standards were put forth by computer vendors, national governments and so on.

Now there’s a plethora of character encoding standards out there, each of which defines code values for a single language or a small group of related languages. There are several problems with this. First, the standards are mutually incompatible. While you can usually count on the value 65 representing the capital letter A, the value 215 can represent many different characters, depending on the encoding standard. Second, encoded text often travels across media without any external indication of the encoding standard it follows. Software receiving a message of unknown encoding has to guess or simply assume, thus leading in many cases to mangled characters. The sending software intends for a particular numeric value to represent some character, and the receiving software interprets it as something totally different, leading to garbage. Third, mixing languages in a single document often requires changing

E D

I GETTING U

from one encoding standard to another in the middle of the document, and there are often no mechanisms in the software for doing that.

Unicode was designed to solve these problems. The idea was to use a larger data type than a byte for each character and then give every character in every language its own unique numeric representation. This means you can mix languages freely in a document without the software being written to support mixing encodings, and you can send text from one system to another without worrying about it getting mangled on the other end (as long as the sending and receiving systems both support Unicode).

But it’s possible to write an internationalized application without using Unicode. You just have to keep track of which encoding the system is using to represent text in all the places where text appears and make it possible to use different encodings when necessary to represent different languages. In other words, you can do it, but it requires the application to do much more bookkeeping than is necessary with Unicode. Unicode allows you to process data and present a user interface in any language without having to switch from one encoding standard to another. It doesn’t make internationalization possible, but it does make it easier.

It should also be clear that Unicode doesn’t solve your internationalization problems. You still have to translate the text. You still have to remember to call number- and date-formatting routines that can produce different output for users of different languages. All Unicode makes possible is representing text in many different languages without keeping track of the encoding.

Although Unicode is unique among character encoding standards, it’s not just because it assigns numbers to more characters (more than 95,000 in the most recent version, including many that have no other standardized representations). Unicode is also unique in that it approaches the business of assigning numbers to characters with far more rigor than any other encoding standard has attempted. For many writing systems other than the Latin alphabet used by English, questions as to how to use numbers to represent it aren’t at all clear-cut. In many writing systems, the letters don’t march in a nice orderly fashion from the left-hand side of the page to the right. In some, they go from right to left. In some, they knot together in very complex ways. In some, they’re adorned with various accent, tone or vowel marks that attach to the letters in many different places. Straightening this out into a one-dimensional sequence of numbers is complex, and Unicode goes to much trouble to explain how this is done for various complex writing systems.

It’s also not always clear just when two different doodads are the same character and when they’re different. In many writing systems, the shape of a letter can change dramatically depending upon the letters around it. Unicode places much rigor around the process of deciding whether two different written squiggles get different numbers or the same number.

Unicode goes to more trouble to nail down the semantics of each character. The standard contains not just a big pile of code charts, but also a huge database of properties that define how software should treat different characters. Is the character a letter, a digit or a punctuation mark? If it’s a letter, is it uppercase or lowercase? Which character is its partner in the opposite case? If the character is a number, which numeric value does it represent? If it’s a diacritical mark, how does it attach to its base character? Is the character part of a right-to-left writing system? Does it join cursively to other characters? And so on and so on.

Unicode also includes many rules on how to do different things with encoded text. There are rules and guidelines for determining where line and word boundaries occur. There are rules and guidelines for converting to other encoding systems, for doing language-sensitive string comparison, for displaying various things on the screen, for implementing Unicode-based regular expressions or programming-language identifiers. And much, much more.

So, Unicode not only gives you codes for practically every character in practically every writing system used to write languages today, but it also provides you with a wealth of implementation know-how, and Unicode support libraries provide you with facilities for doing all kinds of text-related things. The Unicode standard sprawls across not only a huge 1,500-page book, but also a CD full of database files and many ancillary addenda and technical reports. It’s not all because it contains 95,000 character assignments; it’s because tremendous blood, sweat and tears have gone into just how to use those 95,000 assigned code values to do what you want to do in the language you want to do it in. Unicode is not just the largest collection of characters ever encoded in a single standard; it’s the most comprehensive collection of rules, guidelines and best practices for handling text in computer software ever compiled.

You could write an internationalized application without using Unicode, but why would you?
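Two points from this article are easy to try out: that the value 215 means different things under different legacy encodings, and that Unicode carries a database of properties for each character. The sketch below is only an illustration in Python. The three encodings chosen (Latin-1, Windows Cyrillic and ISO Greek) and the helper names `decode_everywhere` and `describe` are assumptions made for this example, and Python's standard `unicodedata` module exposes only part of the Unicode property database.

```python
import unicodedata

# Three legacy single-byte encodings picked for illustration (an assumption
# of this sketch, not a list taken from the article).
LEGACY_ENCODINGS = ["latin-1", "cp1251", "iso8859_7"]


def decode_everywhere(byte_value):
    """Return {encoding: character} for one byte value under each encoding."""
    raw = bytes([byte_value])
    return {name: raw.decode(name) for name in LEGACY_ENCODINGS}


def describe(ch):
    """Look up a few properties of ch that the Unicode database records."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),   # letter, digit, mark, ...
        "bidi": unicodedata.bidirectional(ch),  # "R" means right-to-left
        "numeric": unicodedata.numeric(ch, None),
        "other_case": ch.swapcase(),
    }


# 65 is the letter A everywhere; 215 is a different character each time.
print(decode_everywhere(65))
print(decode_everywhere(215))

# A Latin letter, an Arabic-Indic digit, a Hebrew letter, a combining accent.
for ch in ["A", "\u0667", "\u05d0", "\u0301"]:
    print(describe(ch))
```

The same byte prints as a multiplication sign, a Cyrillic letter and a Greek letter, which is exactly the guessing game a receiver of unlabeled text has to play.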

Developing Software With Internationalization in Mind

Bill Hall

What are internationalization, localization and globalization?

Internationalization is the engineering and design aspect of creating a world-ready product. Internationalization work properly starts in the design phase and lasts until the product has been localized and released. Localization is the term most often used for the task of adapting a product to one particular target market. Localization includes translation of user-interface strings, adjusting culturally sensitive elements and any other task required to make the product usable in a particular world region or locale. A locale is typically identified by language and region identifiers, such as US English, Austrian German and so on. A product is globalized if it is both internationalized and localized. If we write G11N for globalization, then G11N = I18N + L10N.

Internationalization, in simplistic terms, is a job for programmers, and localization falls to the linguists. Internationalization and localization are only loosely related; a product can be fully internationalized without having been translated into another language.

Why is internationalization important?

If the software has been properly internationalized, localization (translation) can proceed quickly, efficiently and at a reasonable cost, and the product can be sold in other world regions. But if the product is not internationalized, the translation step can be hampered substantially as program bugs are detected, reported back to the development staff, fixed and returned to the translation and quality assurance (QA) teams. The results are delayed releases and missed sales opportunities that are seldom recouped.

Where does internationalization belong in the software development cycle?

Internationalization is a component of software engineering that should be applied to a product throughout the development cycle. Many development organizations naively believe that internationalization can be added at the last minute. Unfortunately, internationalization is not a coat of paint that is applied to the surface of the product, as is localization. It is much more like oil that lubricates the whole system. Internationalization errors tend to be found throughout the system and at every level. They can be as varied as incorrect uses of library and system calls, improper pointer arithmetic, embedded user interface items, inattention to rendering locale-sensitive data correctly and erroneous assumptions about character encoding. Internationalization omissions and oversights can cause substantial rewrites of large blocks of code and brisk renegotiation with third-party suppliers of key modules.

What does internationalization cost?

Large companies that routinely develop with internationalization in mind find that development costs can increase by 10% to 20% or more since developers must be educated, internationalization phase checks must be added, and QA plans must be modified for the additional testing required. Fortunately, those costs can be amortized over multiple language releases, reducing the effective unit cost. But costs can easily double if internationalization is left to the last minute, if many errors are not found, and if awkward compromises are often necessary to meet release schedules.

What is a typical first-time internationalization experience?

It goes something like this. Let’s suppose the setting is the United States. An idea for a product is conceived; development begins apace; version 1.0 is released in English; and work immediately starts on a bug fix release simultaneously with the next version. Throughout, no thought is given to internationalization, mainly because no one on the staff really knows what it means. One day a Japanese company calls up to say that it wants the product. The product manager asks if English will be OK, and the Japanese company replies, “Absolutely not!” At this point, the entire development cycle becomes completely disrupted as company management tries to decide how to handle the situation.

Any number of paths can be followed, but usually the worst possible decision is made: a separate thread of development begins with the code branching to get the Japanese release out while US development proceeds toward its next version. Unless company behavior is modified, the cycle continues of an English release followed much later by releases in other languages one by one. In the meantime, the main development group continues to make the same internationalization mistakes release after release. The whole process is very expensive, maintenance and patches become difficult to provide, and the substantial benefits of simultaneous release are consistently missed.

How can a company avoid repeating this sad experience?

Many companies have recognized the futility of the approach I have just described and will take the time to merge the internationalized code into a single, worldwide base. From that point they follow a strategy
of developing code that is independent of language and locale along with supporting modules that manage language and locale specific issues. Locale-neutral escape mechanisms are developed that provide means for accessing supporting modules handling internationalization issues. As a simple example, a module might contain the user interface for a particular language, and the escape mechanism would be a call to get a string indexed by a number or a hash value. Another example could be the need to render display of data in a way suitable for a given world region (locale). In this case the escape might be to use the information provided by the operating or runtime system itself. Taking advantage of such support is one of the most powerful ways to internationalize a program with a minimum of effort. It is all a question of modular development and design.

What kinds of internationalization models are used, and what are the merits of each?

This is a complex topic that depends entirely on the platforms on which the application runs. However, all systems provide such guidelines, and these must be thoroughly studied. First-timers are often overwhelmed by what has to be learned, which is another reason why you start the internationalization effort early in the development cycle.

How does a developer learn about internationalization?

Unfortunately, it is nearly impossible to find schools or universities anywhere that are either interested in or qualified to teach internationalization as a part of a normal computer science education. It is a serious failure by those who otherwise do an excellent job of teaching the art and science of programming and programming languages. Companies are therefore forced into educating their staffs or drawing upon outside expertise. Localization companies and independent internationalization engineers often provide such services.

As a developer, do I have to be able to speak four languages in order to learn internationalization?

Internationalization is an engineering effort. An appreciation of the fact that cultural and linguistic differences exist and that software needs to compensate is enough. It also helps if the developer gives some thought as code is being written as to its possible effect on internationalization. Coding for internationalization is more a matter of attitude and mindset rather than linguistic expertise. The most important step that a developer can take is to learn about the National Language Support available on the platform on which he or she works. Such knowledge can be acquired through reading, training and learning to write small sample applications that exercise one or more of these functions.

Is there some kind of checklist for internationalization?

The problem with checklists is that one can find important exceptions for each rule. As stated above, internationalization is more a matter of attitude and programming skills. Ask yourself if you write C++, Perl, VB, Java or C# using a checklist. Most likely you don’t. But there are some principles. The accompanying chart shows some general rules culled from many sources. The statements are rather broad and need expert interpretation
and illustrations but should give you an idea. The base development language is assumed to be English.

Some Principles for Internationalization

The program design team considers internationalization from the beginning of the project.
Icons, cursors and bitmaps are generic, are culturally acceptable and do not contain text.
If ethnocentric graphics, colors or fonts are used, they can be replaced dynamically using locale-sensitive switch statements.
Menus, dialogs and web layouts can tolerate text expansion.
Development language strings are reviewed for meaning and spelling to reduce user confusion and lessen translation errors.
Strings are documented using comments to provide context for translators.
Strings or characters that should not be localized are clearly marked.
Shortcut-key combinations are accessible on all international keyboards.
International laws affecting design and operation are considered.
Third-party software used in the product is examined for internationalization support.
Consistent terminology is used in messages.
The product runs properly in its base language in all target locales.
Strings are not assembled by concatenation of fragments.
Source code does not contain hard-coded character constants, numeric constants, screen positions, filenames or pathnames that assume a particular language.
String buffers are large enough to handle translated words and phrases.
Program handles input of international data.
All language editions can deal with one another’s documents.
Program contains support for locale-specific hardware if required.
Program depends on operating or runtime system functions for sorting, character typing and string mapping.
Program takes advantage of generic text layout functions when available.
Program responds to changes in the user’s choice of international settings.
Program handles user keyboard layout changes.
Far East editions support input method editors (IME), vertical text and line-breaking rules.
All international editions of the program are compiled from one set of source files.
Localizable items are stored in resource files, message tables or message catalogues.
All language editions share a common file format.
Program’s internal character encoding is Unicode.
Program properly handles all characters in the program’s character set.
Program handles non-homogeneous network environments where machines are operating with different encodings.
Code processes all character sets correctly regardless of character widths.
Code supports Unicode and conversion between Unicode and any local code pages.
No assumptions are made that one character storage element represents one linguistic character.
Code uses generic data types and generic function prototypes if available in compiler.
Code does not use embedded font names or make assumptions about particular fonts being available.
Program displays and prints text using appropriate fonts.
Program meets international testing standards.
Text is translated and meets the standards of native speakers.
Dialog and forms are resized, and text is hyphenated appropriately.
Translated dialog boxes, toolbars, status bars and menus fit on the screen at different resolutions.
Menu and dialog-box keyboard assignments are unique.
User can type all supported characters into documents, dialog boxes and filenames.
User can successfully cut, paste, save and print text regardless of language.
Sorting and case conversion are culturally correct.
Application works correctly on localized editions of the target operating system.

These considerations apply specifically for web internationalization:

Make sure your presentation tier is ready to manage multiple character encodings correctly in a variety of browsers.
Check all your forms and other input fields for encoding compatibility.
Follow all the rules for internationally portable design as listed earlier.
Check your middle-tier components for internationalization compliance. Ensure that information about encoding and locale of data is passed correctly between presentation and backend tiers.
Validate databases to make sure that schemas, data types and table design are ready for a multi-locale environment.
Check database client calls for use of built-in National Language Support to return properly encoded and sorted record sets.

We did not pay attention to internationalization during development. What do we do?

Get an audit by an experienced company or individual who knows about this topic. An audit can identify the issues and help you prioritize what needs to be done and how long it will take. The ideal audit will usually take three to five days or more and should involve several members of the development team. Be prepared to brief the auditing team on the main features of your product and have someone who knows the system very well show the auditors how the product works. Expect the auditors to do things you probably never thought about such as changing the system, user and input locales to see how well the program handles different keyboards, fonts, date and time formats, numbers, currency, calendars, sorting and the like. Have a machine ready with source code and build tools for viewing. If you have already isolated the user interface, make sure the auditors can read through strings, view dialogs and menus, and check clip art for everything from text in bitmaps to strings assembled by concatenation. Allow direct access to developers; it is especially useful for an auditor to sit with a programmer and view his or her particular contribution at runtime as well as examine the underlying
code. Keep in mind that the auditors are applying their experiences in internationalization to mentally run through the items listed above to see how well your product meets these criteria. Of course, the auditors will tailor their actions based on your program.

What are the deliverables from the internationalization audit?

The auditors will present you with a report detailing problems found during runtime, possible problems in the source code they examined (down to the procedure level with suggestions for repair), general recommendations about what needs to be done in the short and long term (if you are going to Western Europe first and the Far East or bidirectional languages later, you may be able to delay some work for a while), and an estimate of how long the internationalization task might take. Keep in mind that such estimates are often very difficult to quantify, as are all such determinations about software. It can happen that errors are buried so deep and so tangled that major architectural changes are required. In other instances, if developers have used good coding practices even without having considered internationalization, the work may be no more complex than getting strings and other user interface items out of code and fixing up problems with incorrect API use. It should be no surprise that well-written code is often very easily fixed to meet internationalization requirements and badly written code can be pure hell to rewrite. Internationalization engineers call this the good-code-bad-code phenomenon.

What do we do after reading the report?

Decide as soon as possible whether you will perform the work in-house, outsource it or do a combination. The best is to do the work in-house since the developers will learn the steps to coding internationalization as a natural part of coding. But even this approach will require some mentoring by experts, and for a while you may want to have an internationalization consultant on hand to identify, explain and help correct the code. You may also want to get two or three days of seminar training for all of the development staff in order to acquaint everyone with the basics. Seminars rarely teach technique, but they do raise awareness. If you decide to outsource the job, try to work out a method whereby the internationalization work can be done on the main development line. Otherwise, the organization will have to do one or more possibly large merges, and in the meantime, the development staff will continue to introduce internationalization errors that will have to be corrected again and again.

What do we do after release?

Resolve never to overlook internationalization when writing code. Add internationalization phase checks and QA to normal development and train any new engineers. Reinforce good programmer behavior and do the opposite with bad. Most of all, expect that internationalization will cost you something in extra development and allow for the time and expense. Remember that this cost will be amortized over several localizations and that the internationalization costs per release will be much lower. Enjoy the revenue received from abroad.

This article is a revised and updated version of one that first appeared in the Getting Started Guide “Internationalization” in MultiLingual Computing & Technology #47 Volume 13 Issue 3.
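The “escape mechanism” described earlier in this article (a call that returns a string indexed by a key) and the principle that strings are not assembled by concatenation can be sketched together. The following is a minimal Python illustration, not a description of any particular platform's National Language Support. The catalog contents, the keys and the function name `get_string` are all hypothetical, and plural handling is deliberately left out.

```python
# Hypothetical message catalogs, one per language. In a real product these
# would live in resource files or message catalogues outside the source code.
CATALOGS = {
    "en": {
        "files.deleted": "{user} deleted {count} files.",
        "app.quit": "Quit",
    },
    "de": {
        # A whole sentence is translated as one unit, so the translator is
        # free to put the placeholders in a different order.
        "files.deleted": "{count} Dateien wurden von {user} gelöscht.",
    },
}

BASE_LANGUAGE = "en"  # assumed development language, as in the article


def get_string(key, language, **values):
    """Fetch the template for key in language and fill in its placeholders.

    Falls back to the base language when the language or key is missing.
    """
    catalog = CATALOGS.get(language, {})
    template = catalog.get(key) or CATALOGS[BASE_LANGUAGE][key]
    return template.format(**values)


print(get_string("files.deleted", "en", user="Anna", count=3))
print(get_string("files.deleted", "de", user="Anna", count=3))
print(get_string("app.quit", "de"))  # not translated yet, falls back
```

Because the calling code only names a key and supplies values, adding a language means adding a catalog, not touching the program logic. Gluing fragments such as `user + " deleted " + count` together in code would freeze English word order into every language edition.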

12 Myths and Misconceptions About Internationalization

Andrea S. Vine

Myth #1: Making user interface elements localizable is enough.

If this were true, it would mean that the product is modified for every country where it is sold: Canada, the United Kingdom, Brazil, New Zealand, Greece and dozens more. Obviously no company localizes products for every single country it sells into, or localization groups would be much, much larger and their budgets would be significantly bigger. Instead, companies sell the English product all over the world, with the exception of a few large markets where localized product is sold. Even the localized products are sold into multiple countries.

For this reason, the locale of the user should be detected or determined in some way, even if the user must be asked explicitly. Numeric formats, text formats, dates and any other formatted data should appear in a style that is used in that locale.

Note that values must be handled carefully. For example, if someone in Germany asks for a price and the price is stored in US dollars, then there are two possible methods of conveying the value to that user:

The currency unit displayed is US dollars, but the numeric format of the actual value is that of Germany: USD 250 467,10
The value is converted from US dollars into German marks, and the value is displayed with the German mark currency symbol, in a German numeric format: DEM 528 450,47

Even with the value expressed in US dollars, the thousands separator is a blank, and the decimal is a comma. If the US thousands separator, the comma, is used, a German user might well be confused about the amount.

Formats should be locale sensitive, but value units should only change if there is a conversion.

Graphics are part of the universal product approach. They are so expensive to localize that no one usually bothers unless there’s embedded text (which should be avoided). Graphical images should be universally appropriate.

Myth #2: Translators choose the best phrase in the target language.

Many folks assume the people translating the product will always choose the best word for the context. The truth is that localizations run on tight schedules and low budgets. Translators usually translate text directly in message catalogs, rather than as they appear on the screen. They are not well versed in product functionality, and there is little time and expertise to perform thorough linguistic checks of the text in the context of the running software. They are usually paid by the word, so volume is their watchword. Imagine what happens to the translation in this situation.

Myth #3: The code is in Java, and therefore it’s internationalized.

Long before the advent of Java, there was internationalized code. How on earth did programmers manage this? The answer is that internationalization was always possible; it just took more effort. Java is written to make internationalization much easier. It is not impossible, however, to write Java code that is not internationalized. In fact, it’s pretty easy to write code that only supports English in the United States in Java. So, even Java must be carefully coded to support international data.

Myth #4: The product has full Unicode support and therefore is internationalized.

Like the Java myth, so goes the Unicode myth. Like Java, Unicode support can make handling international data much easier. But once again, code must be written to manage data in different languages, in different locales and, for the time being, in different charsets. Ha-ha (did that translate well?).

Myth #5: Administrative interfaces and log messages don’t need to be internationalized.

Admins are people too. In some markets, the admin interface must be localized. What was done in the past in localization is not necessarily what will be done in the future. Whether or not a product gets translated is a business decision, not a technical decision. Engineers need to enable business folks to make the decisions necessary to sell as much product as possible. This, in turn, makes the company more profitable, which raises the stock price (well, sometimes), and everyone benefits. Localization needs to be enabled throughout the product.

Log messages fall into a special category of messages. They are usually not localized directly, but may, in fact, be indirectly localized via a log viewer. When this is available, log messages need to be in a separate resource file in order to be localized. For this reason, log messages need to be localizable, but they need to be separated from other messages so that localization knows whether to translate them or not. If a message goes to both a log and the user interface and if the log messages are restricted to English, then the message going to the user interface should be retrieved from the localized resource file, and the message going to the log should come from the English resource file. English files should be shipped with all localized products.

Myth #6: The product uses open source, and so internationalization doesn’t apply.

A lot of folks use the excuse that they have no control over the open source, and so they can’t deal with internationalization issues in that code. Yet a product’s internationalization is only as good as its weakest component. The decision to use open source in the first place must take into account all the requirements for those components. If the open-source pieces don’t fulfill the customer requirements, then the decision to use them must include the coding effort required to internationalize them. Otherwise, customers will not get a product which works the way they expect it to, which in turn means they’re not going to buy it.

Myth #7: ISO-8859-1 is the standard encoding for HTML.

The HTML specification states that ISO-8859-1 is the default encoding, but it is not
the standard one. Using ISO-8859-1 means that only a limited number of languages can be represented on the web page. But it doesn’t have to be that way.

In the HTML 4.0 specification, Unicode was made the reference character set. This means that all numeric character references are Unicode values, and that means that processors supporting HTML 4.0 also support Unicode. This allows the use of UTF-8 as the encoding for HTML pages with the confidence that browsers and web servers are able to handle it. And UTF-8 covers most of the world’s languages.

Myth #8: All company employees speak English, so only English needs to be supported for internal tools.

Within a given company, there’s usually a primary language chosen for business transactions within the company. For companies based in the United States, this is usually English. Internal tools, however, handle data that is beyond internal business. For example, they may include a customer database, a support log or a bug database. Data from customers is often in another language, using other locale formats.

Bugs in the product often relate to data that is not English. Trying to record a bug with Japanese data in an English system is a frustrating exercise. Often the person logging the bug gives up, and the bug isn’t logged. If the person persists, then there may not be enough data to replicate the bug. Quality suffers, and the product team may never know why sales are slipping in non-English markets.

Internal tool teams should gather requirements just like product teams (assuming product teams are gathering requirements internationally).

Myth #9: This product has never been localized, so it doesn’t need internationalization.

Internationalization is about data processing, not just user interface. English product is sold all over the world, so all data must be processed correctly. Even within the English-speaking markets, non-English data

Myth #10: Internationalization was added in the last release, so nothing more needs to be done.

Imagine if this were said about any program architecture or functionality: say, security, performance or the ability to print a page in a word processing program. It’s ludicrous to assume anything in program code is complete as long as the code keeps changing. Internationalization is inherent in the program code; every time a new line is added, it must be taken into account. New requirements come in, making it necessary to add functionality, or possibly to change the architecture. Until all work on the product is discontinued, the internationalization is not done.

Myth #11: The product works in Japanese, so it’s internationalized.

It’s great that the product has been tested in Japanese; it uncovers a great many problems. But not all of them. Other languages, such as Arabic, Hindi and French, have different issues from Japanese. Even Chinese is different. There are issues with other locales as well. And the product may not work in a multilingual environment. So keep testing!

Myth #12: Internationalization is done by other engineers after the base product code is completed.

Just thinking through the engineering process, does it make sense for the same code to be worked on by more than one engineer? It may make sense for a second engineer to review the code, but rewrite it? Most companies would consider that a huge cost issue. After all, engineers are expensive. But if a company process is set up so that a separate group of engineers, or third-party vendor, internationalizes their product, that’s exactly what happens. And it will have to happen for each and every release. Internationalization is an architecture and coding methodology, not an add-on functionality. Even if a company could afford to have post-release internationalization done, the quality is significantly lower.

KNOWLEDGE FROM THE CORE

Historically speaking, opportunities for localization professionals to update their knowledge by meeting with peers to exchange ideas, experiences and information have been rare. The Localization Institute, with MultiLingual Computing, has filled this gap with events such as Localization World conferences. Other forums include the Institute’s roundtables. The Management Roundtable has been held yearly for nine years, and the Project Managers’ Roundtable for seven.

Internationalization as a standalone topic has received less attention. Generally considered the province of core development in sophisticated development enterprises and poorly understood in less-sophisticated endeavors, internationalization has suffered from a shortage of organized information exchange and standard-bearers duly unified by shared knowledge. With localization methodologies becoming increasingly well understood, the time has come to refocus attention on internationalization. Programming technology has moved from C and C++ into new language paradigms that contain new mysteries and methods. Emerging markets such as China have flexed their muscles by imposing new mandatory standards. Information has become fragmented, and it needs to be brought back together again in one knowledgebase.

From June 29 to July 1, 2005, The Localization Institute will offer an Internationalization Roundtable at the Granlibakken Conference Center near Lake Tahoe. Discussions will be advanced and technical in nature and will focus on issues that need to be considered by lead developers, code architects, VPs of development, experienced internationalization engineers, technically oriented localization managers and subject matter experts in any of the fields.

Topics include the state of internationalization technology in today’s programming languages; architectural issues; assessment; GB 18030-2000 issues and certification; tools; and making the case for the value of internationalization to benefit a company.
The internationalization engineers For further information, see is processed (see Myths #1 and #8). don’t usually know the product as well as http://www.localizationinstitute.com What was done in the past in localiza- the product engineers, so they introduce tion is not necessarily what will be done in bugs. In addition, the product itself is not the future. Whether or not a product gets architected for international use, and so translated is a business decision, not a may fail in providing useful functionality technical decision (see Myth #5). Engineers for markets around the world. The Localization Institute need to enable business folks to make the Internationalization is not something 4513 Vernon Boulevard, Suite 11 decisions necessary to sell as much product that someone else does. It’s something that Madison,WI 53705 USA as possible. everyone should do. Ω Phone 608.233.1790 Fax 608.441.6124
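
Returning to the encoding point made before Myth #8, the difference between ISO-8859-1 and UTF-8 can be seen in a few lines of Python. This is only an illustrative sketch, not something taken from the article: the helper name can_encode and the sample strings are assumptions chosen for the demonstration. It checks which sample texts each encoding can represent, and then decodes HTML numeric character references, which name Unicode code points regardless of the encoding the page is served in.

```python
import html


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text is representable in encoding.

    Hypothetical helper written for this illustration.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample strings chosen for the demonstration (an assumption, not from the article).
samples = {
    "French": "caf\u00e9",
    "Japanese": "\u65e5\u672c\u8a9e",
    "Arabic": "\u0627\u0644\u0639\u0631\u0628\u064a\u0629",
    "Hindi": "\u0939\u093f\u0928\u094d\u0926\u0940",
}

for language, text in samples.items():
    latin1_ok = can_encode(text, "iso-8859-1")
    utf8_ok = can_encode(text, "utf-8")
    print(f"{language}: ISO-8859-1={latin1_ok}, UTF-8={utf8_ok}")

# Numeric character references are Unicode values: 26085, 26412 and 35486
# are the decimal forms of U+65E5, U+672C and U+8A9E.
decoded = html.unescape("&#26085;&#26412;&#35486;")
```

Of the four samples, only the French one fits in ISO-8859-1, while UTF-8 handles all of them. The same check could be run against real customer or bug-report data before choosing an encoding for an internal tool.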


Subscription Offer

This supplement introduces you to the magazine MultiLingual Computing & Technology. Published nine times a year, filled with news, technical developments and language information, it is widely recognized as a useful and informative publication for people who are interested in the role of language, technology and translation in our twenty-first-century world.

Translation

How are translation tools changing the art and science of communicating ideas and information between speakers of different languages? Translators are vital to the development of international and localized software. Those who specialize in technical documents, such as manuals for computer hardware and software, industrial equipment and medical products, use sophisticated tools along with professional expertise to translate complex text clearly and precisely. Translators and people who use translation services track new developments through articles and news items in MultiLingual Computing & Technology.

Language Technology

From multiple keyboard layouts and input methods to Unicode-enabled operating systems, language-specific encodings, and systems that recognize your handwriting or your speech in any language, language technology is changing day by day. And this technology is also changing the way in which people communicate on a personal level; changing the requirements for international software; and changing how business is done all over the world. MultiLingual Computing & Technology is your source for the best information and insight into these developments and how they will affect you and your business.

Global Web

Every Web site is a global Web site, and even a site designed for one country may require several languages to be effective. Experienced Web professionals explain how to create a site that works for users everywhere, how to attract those users to your site and how to keep it current. Whether you use the Internet and World Wide Web for e-mail, for purchasing services, for promoting your business or for conducting fully international e-commerce, you'll benefit from the information and ideas in each issue of MultiLingual Computing & Technology.

Managing Content

How do you track all the words and the changes that occur in a multilingual Web site? How do you know who's doing what and where? How do you respond to customers and vendors in a prompt manner and in their own languages? The growing and changing field of content management and global management systems (CMS and GMS), customer relations management (CRM) and other management disciplines is increasingly important as systems become more complex. Leaders in the development of these systems explain how they work and how they work together.

Internationalization

Making software ready for the international market requires more than just a good idea. How does an international developer prepare a product for multiple locales? Will the pictures and colors you select for a user interface in France be suitable for users in Brazil? Elements such as date and currency formats sound like simple components, but developers who ignore the many international variants find that their products may be unusable. You'll find sound ideas and practical help in every issue.

Localization

How can you make your product look and feel as if it were built in another country for users of that language and culture? How do you choose a localization service vendor? Developers and localizers offer their ideas and relate their experiences with practical advice that will save you time and money in your localization projects.

And There's Much More

Authors with in-depth knowledge summarize changes in the language industry and explain its financial side, describe the challenges of computing in various languages, explain and update encoding schemes and evaluate software and systems. Other articles focus on particular countries or regions; translation and localization training programs; the uses of language technology in specific industries: a wide array of current topics from the world of multilingual computing.

Nine times a year, readers of MultiLingual Computing & Technology explore language technology and its applications, project management, basic elements and advanced ideas with the people and companies who are building the future.

To subscribe, use our secure on-line form at www.multilingual.com/subscribe

Be sure to enter this on-line registration code: sup71