Second Language Linux Is International

PROGRAMMING: Bilingual Programming

Second language

Linux is international. It was started by a programmer from Finland who speaks Swedish. Aided by a Welsh-speaking lieutenant. Supplemented with a kernel maintainer from Brazil. So why is all our software written in English? This month: multi-lingual development and the gettext package.

BY STEVEN GOODWIN

June 2004 www.linux-magazine.com

The English language holds the same power in today’s society that Latin did many hundreds of years ago. It’s not the most expressive language, nor is it the most popular. It certainly isn’t the easiest to learn. It is, however, the most widespread. With the remnants of the old British Empire still present, and the continued growth of America, people are required to use English in order to compete on the world stage.

Computers and the Internet have increased this linguistic strangle-hold. More web pages exist in English than any other language. More programming languages use English words like if and while, regardless of the designer’s nationality. Most software uses prompts and error messages that are written in English.

However, with Linux taking control of many different systems across the globe, it would appear to be xenophobic of us to continue developing ‘English-only’ software. Adding the ability to change the language (or locale) of your software is not a difficult task to achieve, but it shows a wider commitment to your project, and the open source community in general. Even if you can not translate the text yourself, you can make it easier for someone else to do so by following the guidelines in this article.

Turning Japanese

GNU/Linux uses a technique known as locales to determine many things: the appropriate translations for text, the character set required to represent the alphabet, and cultural specifics like the expression of numbers, or the date. Each area is considered in the box: Locale Categories, although the focus of this article will be on text translation.

So let us start with the simplest program we know, Hello World. We shall be coding in C, although the same techniques can be applied regardless of language. You’ll be able to test the PHP equivalent by using the code in box: In PHP.

    #include <stdio.h>
    int main(int argc, char *argv[])
    {
    printf("Hello World!\n");
    return 0;
    }

It’s fairly obvious to us where the translation string will need to go. At compile time, however, we do not know what the replacement string will be, or what languages it will need to be in. This prevents us from including any translation data directly into the program. Instead, we must build up catalogs of each word and phrase used by our program, and employ the gettext package to act like a dictionary. This will replace our (English) words with the correct foreign version at run time. What is ‘correct’ will be determined by the user’s specific locale.

We are required to do two things,

1. Mark the source code to say ‘get me the correct words for the phrase XYZ’
2. Build a translation dictionary for each language we need to support

Marking the source is a simple process. We, as programmers, must work through each line of the code and indicate which lines of text will need translating. We can do this by calling a special function (called, not surprisingly, gettext) that will consult the dictionary and convert our string to something suitably foreign.

    printf(gettext("Hello World!\n"));

This function can be found in the libintl header file, so we must,

    #include <libintl.h>

Compiling under GNU/Linux requires no extra link libraries for the code to work. The word GNU is essential here. That is because the internationalization features are included directly in glibc. Users of other Unix-like systems may not be so lucky. However, without a language catalog, no translations will be made. That doesn’t matter at the moment, since the English text will be output in all cases where a translation can not be found.

C programmers will also note that this method is not all-encompassing, because there is more than one way to declare a string. However, we’ve only learnt one way to mark strings for translation. So we will need to use another method, to cope with those cases where a function call to gettext would result in a syntax error. For example,

    char *pHello = "Hello, World!\n";

To circumvent this problem, we need to create a macro that includes a marker, but has no adverse effect on the syntax.

    #define gettext_noop(String) String
    ...
    char *pHello = gettext_noop("Hello, World!\n");

We then need to invoke the translation module in the usual way, before we output the string. Like so,

    printf (gettext (pHello) );

These markers not only perform the translation when the program is running, but indicate to us what text needs to be translated. We shall shortly see a tool that makes use of these markers itself to help build the dictionary of translations. If we were to build the dictionary manually (but why would we?!), the gettext_noop marker would be unnecessary.

Some programmers prefer to replace this nine character marker with a single character macro, such as the underscore. This is because the word gettext (and both brackets) can cause many lines to break the 80 character limit. This is simply,

    #define _(str) gettext (str)
    #define N_(str) gettext_noop (str)

The GNU standard prefers a space between function name and bracket, but this is often omitted.

We can now move on and build our foreign language dictionary.

Box: In PHP

Writing multi-lingual software in PHP is no different from using C. The functions even have the same name! However, when run as part of a web page, it might be more suitable to specify the locale explicitly, perhaps coming from a session variable, or a cookie on the user’s machine.

    <?php
    setlocale(LC_ALL, "fr_FR");
    textdomain("lm");
    echo gettext("Hello World!\n");
    ?>

The effect of setlocale can also be achieved by using the putenv function.

    putenv ("LANG=fr");

Box: Locale Categories

A category defines a set of data, and every supported language has its own set of data. The category might define the way to impart particular information: numbers over 1000 might be separated by commas or dots, for example, or the date might be written day-month-year or month-day-year. This information is not related to the language as such, which is why the term ‘locale’ is used, constituting both language and cultural specifics. A directory is created for each category.

There are standard functions to format these locale strings. For example, strfmon and strftime format the text for money and time data, respectively.

    Category      Meaning
    LC_COLLATE    Order of string-collation
    LC_CTYPE      How to define characters. Echoes of ctype.h as this
                  also performs upper/lower case conversion
    LC_MESSAGES   The translated text. The focus of this article
    LC_MONETARY   Format and symbols for money
    LC_NUMERIC    Format and symbols for numbers
    LC_TIME       Format and symbols for time and date

Vienna Calling

Building a file that contains all the strings in a program is not as time-consuming as you might think. Naturally, it is a very common task, and can be achieved by using a tool named xgettext. This is one of the few instances where the ‘x’ does not stand for an X Window program. Instead, it is short for ‘extract’. This program will search the source file for any string used in conjunction with the function call gettext (or gettext_noop) and place the text into a catalog file (ending with the suffix .PO) ready to be translated. The program understands enough about C, and about other languages (see box: xgettext: Supported Languages), to understand the syntax of a function call, and differentiate it from variables and comments.

    $ xgettext -d lm helloworld.c
    $ tail -n 3 lm.po
    #: helloworld.c:5
    msgid "Hello World!\n"
    msgstr ""

As you can see, each piece of text has a marker ID and an equivalent string, ready for translating. This string can only hold a translation for one specific language, so this file becomes a template. Each translator takes a copy of it, and translates the text within it to his or her native tongue. Sometimes, this PO file is renamed to POT to differentiate between the template and the language-specific catalog files.

Note that xgettext will search for the function name gettext. It does not understand enough of the C syntax (or that of any language) to understand techniques like #define _(str), given above. This doesn’t preclude the use of such tricks however. There are two popular solutions. One is to specify the underscore as an additional keyword that will act in the same manner as if it were gettext.

    $ xgettext -d lm -k_ helloworld.c

Alternatively, you could pre-process your C file (causing the macro to be expanded) before running xgettext.

    $ xgettext -C -d lm <(gcc -E helloworld.c)

In this example we specify the -C flag, to indicate that the piped result is a C [...]

Box: xgettext: Supported Languages

    C, C++, ObjectiveC
    awk
    PO
    YCP
    Python
    Tcl
    Lisp, EmacsLisp
    RST
    librep
    Glade
    Java

[...] which highlights the deliberate mistake above. Did you spot it? See Listing 1. The first warning simply reminds us that we haven’t changed the header information yet.

[...] localedirectory, be careful not to change directory, as this path would then become unreachable. While in our local root directory, we [...]
