<<

Unicode SAP Systems

Unicode@

NW AS Internationalization

SupportedlanguagesinUnicode.doc 09.05.2007

© Copyright 2006 SAP AG. All rights reserved. part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. , Windows, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation. IBM, DB2, DB2 Universal Database, OS/2, Parallel Sysplex, MVS/ESA, AIX, S/390, AS/400, OS/390, OS/400, iSeries, pSeries, xSeries, zSeries, z/OS, AFP, Intelligent Miner, WebSphere, Netfinity, Tivoli, and Informix are trademarks or registered trademarks of IBM Corporation in the and/or other countries. Oracle is registered trademark of Oracle Corporation. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc. HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C®, Consortium, Massachusetts Institute of Technology. Java is a registered trademark of Sun Microsystems, Inc. JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. MaxDB is a trademark of MySQL AB, Sweden. SAP, R/3, mySAP, mySAP.com, xApps, xApp, SAP NetWeaver and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary. The information in this document is proprietary SAP. No part of this document may be reproduced, copied, or transmitted in any form or for any purpose without the express prior written permission of SAP AG. This document is a preliminary version and not subject to your license agreement or any other agreement with SAP. This document contains only intended strategies, developments, and functionalities of the SAP® product and is not intended to be binding upon SAP to any particular course of business, product strategy, and/or development. Please note that this document is subject to change and may be changed by SAP at any time without notice. SAP assumes no responsibility for errors or omissions in this document. SAP does not warrant the accuracy or completeness of the information, text, graphics, links, or other items contained within this material. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or non- infringement. SAP shall have no liability for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials. This limitation shall not apply in cases of intent or gross negligence. The statutory liability for personal injury and defective products is not affected. SAP has no control over the information that you may access through the use of hot links contained in these materials and does not endorse your use of third-party Web pages nor provide any warranty whatsoever relating to third-party Web pages.

SAP AG 2 SupportedlanguagesinUnicode.doc 09.05.2007

Icon Meaning Caution

Background

Function

Example

Tip

Recommendation

Syntax

Typographic Conventions

Format Beschreibung Beispieltext Words or characters that appear on the screen. These include field names, screen titles, pushbuttons as well as menu names, paths and options. Cross-references to other documentation Beispieltext Emphasized words or phrases in body text, titles of graphics and tables BEISPIELTEXT Names of elements in the system. These include report names, program names, transaction codes, table names, and individual key words of a programming language, when surrounded by body text, for example, SELECT and INCLUDE. Beispieltext Screen output. This includes file and directory names and their paths, messages, names of variables and parameters, source code as well as names of installation, upgrade and database tools. Beispieltext Exact user entry. These are words or characters that you enter in the system exactly as they appear in the documentation. Variable user entry. Pointed brackets indicate that you replace these words and characters with appropriate entries. BEISPIELTEXT Keys on the keyboard, for example, function keys (such as F2 ) or the ENTER key A the keyboard key A the keystroke A /A/ the character A [A] the glyph A 0x41 the byte sequence in hexadecimal notation i18n Internationally used abbreviation of the term 'Internationalization'

SAP AG 3 SupportedlanguagesinUnicode.doc 09.05.2007

Introduction ...... 5 What is Unicode?...... 5 Language Configurations in SAP Systems...... 2 Availability of Unicode mySAP Solutions...... 3 Basic Information...... 3 Programming Languages ...... 3 ABAP Programs...... 4 C and C++ Programs...... 5 XML...... 5 Java ...... 5 Unicode Character Encodings and Byte Length...... 6 Hardware Requirements ...... 7 Database and Platform Support...... 8 Frontend Settings in the Unicode System ...... 9 SAP GUI Support...... 9 Frontend Requirements...... 10 Code Pages ...... 10 Font Selection...... 10 Locales ...... 11 International Components for Unicode (ICU) ...... 11 Unicode SAP Systems: After the System Installation ...... 11 Language Installation...... 12 Logon Language...... 14 Translation Import...... 14 Printing in Unicode SAP Systems ...... 14 Supported Device Types ...... 15 Communication within Multilingual System Landscapes ...... 15 Data Transfer in a Unicode/non-Unicode System Landscape ...... 16 Transport between Unicode and non-Unicode SAP Systems...... 17 RFC Library ...... 18 From non-Unicode SAP System to Unicode ...... 19 Appendix ...... 19 Documentation ...... 19 New Installation of Unicode SAP Systems...... 19 Conversion of SAP Single Systems and SAP MDMP Systems to Unicode...... 19 Further Information...... 20 Contacts ...... 20

SAP AG 4 SupportedlanguagesinUnicode.doc 09.05.2007

Introduction With SAP NetWeaver™, Unicode is the solution for multilingual SAP systems. This document is intended for managers and consultants who require detailed information about technical requirements, language availability and import, the integration process of Unicode Systems in an existing SAP System landscapes and ways to your Unicode SAP System.

This document is constantly being revised. Please make sure that you always use the most current version which can be downloaded from SAP Note 73606.

What is Unicode? Unicode (and the parallel ISO 10646 standard) defines the character set necessary for efficiently processing text in any language and for maintaining text data integrity. In addition to global character coverage, the Unicode standard is unique among character set standards because it also defines data and algorithms for efficient and consistent text processing. This enables high-level processing and ensures that all conformant software produces the same results. The widespread adoption of Unicode over the last decade made text data truly portable and formed a cornerstone of the Internet. What is the Business Value? Globalized software, based on Unicode, maximizes market reach and minimizes cost. Globalized software is built and installed once and yet handles text for and from users worldwide and accommodates their cultural conventions. It minimizes cost by eliminating per-language builds, installations, and maintenance updates.

Who needs Unicode? Computer users who deal with multilingual text -- business people, linguists, researchers, scientists, and others - will find that the Unicode Standard greatly simplifies their work. Mathematicians and technicians, who regularly use mathematical symbols and other technical characters, will also find the Unicode Standard valuable. Global business processes, for example global HR system or global Master Data Management, Web Services offering customers to enter their contact data (Global Master Data containing multiple local language characters!), in short: Global Business requires the support of a Global Character Set!

What are the benefits of a Unicode-based SAP System? Internet and Web Services  The Internet (including the World Wide Web) – and therefore collaborative scenarios – are based on Unicode.  Unicode SAP systems take full advantage of XML and Java (both of which require Unicode).  Unicode is required for cross-application data exchange without loss of data caused by incompatible character sets. One way to present documents on the World Wide Web, for example, is XML.  Unicode compliant ABAP paves the way for more efficient and effective integration of ABAP and Java applications. Business Value  Unicode SAP systems can be more tightly integrated with non-SAP products and offer a superior platform for collaborative, cross-system business applications.  Unicode enables SAP customers to install global systems that cover their business processes worldwide.

SAP AG 5 SupportedlanguagesinUnicode.doc 09.05.2007

 Companies using different distributed systems frequently want to aggregate their worldwide corporate data. Without Unicode, their ability to do this is limited. Language Support  Unicode SAP systems provide unrestricted use of any language or language combination in the world.  With Unicode, you can use multiple languages simultaneously on a single frontend.  In Unicode SAP systems you can display and maintain data which has been entered in ANY language with ANY logon language - provided the language installation has been performed correctly as described in chap. Language Installation . The logon language only determines the display language for menus, dynpros, system messages etc….. the language the user is working in.

Fig.1: Example collection of languages which are supported in Unicode SAP systems

The languages are sorted by their 2-letter ISO 639 codes.

Language support • As of R/3 1I non-Unicode : 41 languages which have a 2-letter language key according to ISO 639-1 (= Default Set of Languages: see Fig. 2 ) • As of Web Application Server 6.20 Unicode : a) Default Set of Languages • 41 language codes (see Table below) b) New Unicode Languages: • 433 additional languages which have a 3-letter language key according to ISO 639-2 (see example 1. below) • 86 languages which have no separate ISO 639 language key but are assigned to countries or scripts (see examples 2a) and b) below)

SAP AG 6 SupportedlanguagesinUnicode.doc 09.05.2007

All languages from ISO 639-1and ISO 639-2 standard and 86 languages with no ISO-language key (560 languages in total) can be entered and displayed in Unicode SAP systems as of: Web AS 6.20 Unicode Support Package 54 onwards Web AS 6.40 Unicode Support Package 14 onwards Web AS 7.00 Unicode onwards SAP Note 895560 provides an overview of printing and display restrictions.

Note: Language support in non-Unicode systems as of Web AS 6.20 onwards is limited to the Default Set of 41 languages!

Fig. 2: Default Set SAP/ISO Lang. Key Language SAP/ISO Lang. Key Language AF Afrikaans JA Japanese AR* Arabic* Korean BG Bulgarian LV Latvian CA Catalan LT Lithuanian ZH Chinese MS Malayan ZF Chinese trad. NO Norwegian HR Croatian PL Polish CS Czech PT Portuguese DA Danish Z1 Reserved- cust. NL Dutch Romanian EN English Russian ET Estonian SR Serbian FI Finnish SH Serbian (Latin) FR French SK Slovakian DE German SL Slovene EL Greek ES Spanish SV Swedish HU Hungarian TH IS Icelandic TR Turkish ID Indonesian UK Ukrainian IT Italian *in Unicode systems only

Examples: New Unicode Languages

1. Languages with 3-letter language key according to ISO 639-2: • Hindi = ISO ‘HIN’ = SAP ’’ • Iranian = ISO ‘IRA’ = SAP ’IR’

SAP AG 7 SupportedlanguagesinUnicode.doc 09.05.2007

• Sanskrit = ISO ‘SAN’ = SAP ‘

2. Languages with no separate language key according to ISO 639 a) assigned to countries: • English Australia = ISO ‘ENG’ = SAP '1E' • English Canada = ISO ‘ENG’ = SAP '3E' • English Ireland = ISO ‘ENG’ = SAP '8E' • English New Zealand = ISO ‘ENG’ = SAP '1N' b) assigned to scripts: • Azerbaijani () = ISO ‘AZE’ = SAP ‘5R’ • Azerbaijani (Latin) = ISO ‘AZE’ = SAP ‘AZ’

For an overview of all supported languages (including technical details, used scripts, and countries) see the Excel sheet "Supported Languages and Code Pages.xls". You can download this document from SAP Note 73606 and from www.service.sap.com/i18n → i18n Media Library.

Translation Status SAP delivers the mySAP ERP editions in 30 of the languages listed in Fig.2. For an overview see www.service.sap.com/languages . Here you will find availability figures per release, translation level and delivery status of each language. However, if you need additional languages with 2 or 3-letter language codes, you can install them in your Unicode SAP system as described in chap. Language Installation , but remember that translation is not delivered by SAP.

Language Configurations in SAP Systems One of the important considerations when preparing to install an SAP System is the choice of the system language(s). A multinational company operating in different countries all over the world usually needs several languages consisting of many different characters. Characters are encoded on a per basis. , for example, there is only one set of Latin characters defined, despite the fact that the Latin script is used for the alphabets of thousands of different languages. The same principle applies for any other script (Cyrillic, Arabic, , , etc.) which is used for writing many different languages.

Range of the Unicode Standard The Unicode standard and ISO/IEC 10646 support three encoding forms (UTF-8, UTF-16 and UTF-32) that use a common repertoire of characters and allow for encoding as many as 1.1 million characters. This is sufficient for all known requirements, including full coverage of all historic scripts of the world, as well as common notational systems. The Unicode Standard defines codes for characters used in all the major languages written today. Scripts include the European alphabetic scripts, Middle Eastern right-to-left scripts, and many scripts of Asia. It covers the GB 18030-2000 Standard as well. The Unicode Standard further includes marks, , mathematical symbols, technical symbols, arrows, , etc. It provides codes for diacritics, which are modifying character marks such as the tilde (~), that are used in conjunction with base characters to represent accented letters (ñ, for example). In all, the Unicode Standard, Version 4.0 provides codes for 95,221 characters from the world's alphabets, ideograph sets, and collections."

SAP AG 2 SupportedlanguagesinUnicode.doc 09.05.2007

Fig. 3

ASCII General Scripts Symbols

CJK Ideographs

65,000 characters

Hangul

Compatibility

Additional Surrogate Area 1,000,000 characters

Availability of Unicode mySAP Solutions Unicode-based mySAP solutions deploy the SAP Web Application Server 6.20, 6.40 and 7.00 which support the Unicode Standard and provide essential new functionality, syntax improvements and extended semantics. The first complete Unicode version of the Web Application Server is available as of Web AS Release 6.20 Support Package 38, Web AS Release 6.40 Support Package 04 and Web AS Release 7.00 respectively. The general availability of the Unicode version of mySAP ERP, mySAP Business Suite and SAP NetWeaver components is scheduled for the end of the ramp-up phase of the particular component. See SAP Note 79991 for current status.

Basic Information Programming Languages Prior to Web AS 6.20, SAP used different encodings to process characters from different scripts - such as ASCII, EBCDIC, or double-byte code pages. These character sets covered every language used in SAP. However, problems occurred when you tried to mix texts from different incompatible character sets in a central system. Exchanging data between systems with incompatible character sets also lead to contingent situations.

SAP AG 3 SupportedlanguagesinUnicode.doc 09.05.2007

The solution to this problem is to use a code consisting of all the characters used throughout the world, i.e. Unicode (ISO/IEC 10646), which consists of at least 16 bit = 2 bytes, alternatively of 32 bit = 4 bytes per character.

ABAP Programs Most programs should work without any modification, but you need to ensure that all programs comply with the stricter Unicode 6.10 syntax and semantics, which improve program efficiency and enable Unicode support. Note that all programs must be 6.10 compliant to run in a Unicode system and 6.10 compliant programs will also run in a non-Unicode system as well. In a non-Unicode system, programs do not have to be 6.10 compliant. To check your program, use the transaction UCCHECK to determine if your programs are ABAP 6.10 compliant; In addition, programs should be tested to catch non-static errors that appear at run-time. Use the transaction SCOV to monitor the testing. The language adjustments made as part of the conversion to Unicode provide an excellent opportunity for all ABAP developers to clean up their source code. The new programming statements also work in non-Unicode programs. SAP strongly recommends using them there in order to improve the readability and minimize ambiguity in source code. UCCHECK Run UCCHECK and enter the programs you want to check. After you have completed the check, and modified any code that was not ABAP 6.10 compliant, you should check the runtime behavior of your programs. UCCHECK issues errors for static detectable syntax errors, or warnings where runtime errors are possible, that cannot be detected by the static syntax check. Coverage Analyzer With the Coverage Analyzer you can montor the code coverage of program executions in your SAP systems. The Coverage Analyzer enables you to check the completeness of runtime tests and to display the results. It shows the collected data in several different customizable hierarchies.You can drill down to the programs and to their respective modularization units. The information that can be displayed includes for example: • Degree of utilization • Percentage of program units that have been tested • Percentage of programs whose Unicode check flag is active • Number of processing blocks

Unicode Enhanced Syntax Check The system profile parameter abap/unicode_check=on can be used to enforce the enhanced syntax check for all objects in non-Unicode systems. When setting this parameter, only Unicode- enabled objects (objects with the Unicode flag) are executable. Note that after setting the Unicode flag, automatically generated programs might need to be regenerated. The mentioned parameter should be set to the value "on" only, if all customer programs have been enabled according to transaction UCCHECK. The same ABAP source code is used for both Unicode and non-Unicode installations (see Fig. 3). Therefore you can freely combine Unicode and non-Unicode mySAP components and you take advantage of new releases with or without Unicode. Enhancements have also been made to RFC to guarantee smooth data communication between Unicode and non-Unicode systems and with non- SAP products as well. Because most mySAP solutions consist of cross-component integration scenarios, thorough integration testing is also planned, in particular within a combined Unicode / non- Unicode system environment.

SAP AG 4 SupportedlanguagesinUnicode.doc 09.05.2007

There are separate Unicode and non-Unicode versions of R/3: Fig. 3 Character expansion model

• No explicit Unicode data type in ABAP • Single ABAP source for Unicode and non-Unicode systems • Automatic conversion of character data for communication between Unicode and non- Unicode systems

C and C++ Programs Both the non-Unicode (single byte or multibyte) SAP kernel and the Unicode SAP kernel are compiled from a single set of C program sources -- when porting the SAP system to Unicode, a huge amount of C programming source were enhanced, but this does not affect the non-Unicode SAP kernel. Your C/C++ programs must also be enhanced to run in a Unicode system (see: www.service.sap.com/unicode@sap -> Unicode Library -> ABAP and Unicode -> Ext. Unicode Interfaces: Unicode Interface for C-Programming.ppt. XML Unicode SAP systems take full advantage of XML. SAP Exchange Infrastructure (SAP XI), SAP's platform for process integration based on the exchange of XML messages is only available as of Web AS 6.20 Unicode. It provides a technical infrastructure for XML-based message exchange in order to connect SAP components with each other, as well as with non-SAP components. For detailed information about SAP XI, see www.service.sap.com/xi . For information about SAP Web Services see http://service.sap.com/uddi For information about SAP Internet Business Framework see http://www.service.sap.com/netweaver . For more information about XML-development at SAP, see www.service.sap.com/xml . Java Unicode SAP systems take full advantage of Java. For information about Java development at SAP see www.service.sap.com/j2ee .

SAP AG 5 SupportedlanguagesinUnicode.doc 09.05.2007

Unicode Character Encodings and Byte Length Each Unicode character has a unique Unicode scalar value, and there are three "Unicode Transformation Formats" (UTF), which are mathematical permutations of the Unicode scalar value. There are currently 8-, 16- and 32- bit encodings. UTF-8, UTF-16, UTF-32, as well as other encoding schemes for Unicode characters, such as CESU-8. Although there are multiple encoding schemes, this is not the same as the problem of multiple code pages. All UTF formats, as well as CESU-8, contain exactly the same character set. The transformations between Unicode encodings are done algorithmically, and therefore no conversion tables are needed; this improves performance considerable. To provide the most efficient balance between memory requirements, performance, and compatibility with existing non-Unicode systems, SAP uses different encoding schemes.

Fig. 4: Encoding schemes used in SAP systems UTF- 8 (8 bit encoding) UTF-16 (16 bit encoding) variable length; 1 character = 1-4 bytes fixed length; 1 character = 2 bytes (surrogate pairs = 2+2 bytes) communication Internal (application server) server) (application platform independent, byte order platform-dependent byte order independent (Little/Big Endian) no alignment restriction 2 byte alignment restriction UTF-8 is the character encoding used for best compromise between memory XML usage and algorithmic complexity. (Frontend;GUI) all 7 bit ASCII characters have the same fits to Java and Microsoft

Externalcommunication code points and byte length environment → this ensures compatibility with non- best way to migrate existing ABAP Unicode systems and C programs

Fig. 5: Example: Representation of Unicode Characters in SAP systems Character Unicode Scalar UTF-8 / CESU- UTF-16 UTF-16 Value 8 [Little Endian] [Big Endian] A +0041 41 41 00 00 41 Ä U+00C4 C3 84 C4 00 00 C4 U+03B1 CE B1 B1 03 03 B1 α

U+05D0 D7 90 D0 05 05 D0 א 晓 U+6653 E6 99 93 53 66 66 53

Fig. 6: Database Format: varies depending on the manufacturer

UTF-8 CESU-8 UTF-16 DB/2 (AIX) Oracle SQL Server Max DB (8.0) DB/2 (AS400) SAP DB (7.0)

SAP AG 6 SupportedlanguagesinUnicode.doc 09.05.2007

What is Little Endian/Big Endian? The byte order of UTF-16 depends on the processor architecture (i.e. is byte order/"endian" - dependent).

Little Endian (LE) The least significant byte of the number is stored in memory at the lowest address, and the most significant byte at the highest address. (The little end comes first.) As an analogy, say "fourteen" in English; the less significant number, four, comes first. This is also called least significant byte (LSB) ordering.

Big Endian (BE) The most significant byte is stored in memory at the lowest address, and the least significant byte at the highest address. (The big end comes first.) As an analogy, we say "twenty-four" in English; the more significant number, twenty, comes first. This is also called most significant byte (MSB) ordering. When converting to Unicode, the export code page should correspond to the endianness of the target system.

Hardware Requirements Non-Unicode system - Unicode system compared The Unicode encoding determines the length of a character. A character in one of the Unicode encodings can be more than 1 byte, and therefore Unicode characters can be longer than characters defined in other standard code pages. This leads to larger hardware demands. The CPU/RAM figures below are measured average numbers on SAP Application Servers. They will be different for different transactions. Additional CPU/RAM hardware resource requirements on standalone servers must be provided by DB vendors.

Fig. 7 *

*The ABAP statement SET_LOCALE which is required for the processing of language-dependent data in MDMP systems is CPU expensive. In a single code page/Unicode system it is exclusively used for sorting and therefore not CPU expensive. SAPWide AGcharacter handling is expensive for double-byte code pages. 7 SupportedlanguagesinUnicode.doc 09.05.2007

The size of additional hardware which is required for a Unicode database depends on: • Database Unicode encoding format (e.g. CESU-8 vs. UTF-16) • Database settings (page size, extent size) • Hardware compression • Languages in use: e.g. for double-byte characters UTF-16 requires less storage than UTF-8 which means that processing Japanese text data demands more than processing English text data which means (see Fig. 8 ) • Application modules in use (ratios: tables/indices, text/binary data) • Reorganization frequency : Unicode conversion includes a DB reorganization o DB growth is often compensated by shrinking due to reorganizationi (especially the indices)

Fig. 8

AÄ 晒

1100 8000 CESU-8 UTF-16 1100 8000 CESU-8 UTF-16 1100 8000 CESU-8 UTF-16

Fig. 9: Examples DB Size before DB Size after Change DB Manufacturer Unicode Conversion (in Conversion (in Encoding GB) GB) 54 48 -11,1% Oracle CESU-8 528 461 - 12,7% Oracle CESU-8 772 666 - 13,7% Oracle CESU-8 112 93 -17% Oracle CESU-8 880 674 - 23,4% Oracle CESU-8 240 270 + 12,5% Oracle CESU-8 460 650 + 41,3% SAP DB 7.3 UTF-16 22 36 + 63,6% SQL UTF-16

Database and Platform Support The following /database combinations are supported in Unicode SAP systems.

Informix and Reliant Unix support is not planned. Fig. 10

SAP AG 8 SupportedlanguagesinUnicode.doc 09.05.2007

Database Operating System W2K Linux³ Solaris 1 HP 1 Tru64 1 AIX 1 OS/400 OS/390 Oracle ! ! ! ! ! ! - - MS SQL ! ------Server SAP ! ! ! ! ! ! - - DB/Max DB DB2 ! ! ! ! ! ! -²

1 64bit versions only.

Fig. 11 Additional Storage Database (Platform) Encoding Requirements

DB2 for AS/400 UTFUTF----16161616 10…20% ***

DB2 for z/OS UTFUTF----16161616 ---20…10%-20…10% ****** DB2 /Universal Database for UTFUTF----8888 ---10%-10% Unix/NT MaxDB UTFUTF----1616 40…60% MS SQL UTFUTF----1616 40…60% Oracle CESUCESU----88 ---10%-10%

*Small growth as biggest part of the ASCII based database is already Unicode * *With hardware compression, which is always used for SAP Unicode installations

Average database growth measured in customer systems (sum of all sizes): ■ UTF-8 and CESU-8: -13% (more than 90% of the databases have shrunk ) ■ UTF-16: +30...40% If you run a database which is not supported by Unicode, you can perform a database change with simultaneous Unicode conversion. Choose the heterogeneous system copy procedure for the database export/import instead of the homogeneous system copy method. For more information about system copy methods see SAP Service Marketplace Quick Link /systemcopy.

Frontend Settings in the Unicode System SAP GUI Support All SAP GUIs (HTML, Java, Windows) support Unicode alongside all the non-Unicode code pages already supported. Because SAP GUI is backward compatible, a single SAP GUI can be used to access both Unicode and non-Unicode systems, and therefore only one GUI is needed per frontend.

SAP AG 9 SupportedlanguagesinUnicode.doc 09.05.2007

Frontend Requirements Requirements : minimum SAP GUI Patch Level 56 (SAP recommends to install the newest SAP GUI Patch Level) Documentation : “SAPGUI for Windows: I18N User Guide” : You can download this documentation from SAP Service Marketplace at www.service.sap.com/i18n → I18N Media Library or from SAP Note 508854 SAP Note 710720 (SAPGUI for Windows 6.40)

For full support of languages with multi-byte system locales (Japanese, Traditional Chinese, Simplified Chinese and Korean) SAPGUI 6.40 is required!

Application-specific restrictions As of SAP NetWeaver 2004s all BEx tools are Unicode-enabled. For details have a look at www.service.sap.com/bi → SAP NetWeaver 2004s BI → BI Capabilities.

Prerequisites 1. Unicode version of SAP Web Application Server (at least kernel patch level 1078) 2. SAPGUI 620 (at least patch level 33) 3. Windows 2000, XP and the succeeding versions of Windows 4. I18N mode of SAP Frontend must be ON

Code Pages Unicode SAP systems use one system code page and one frontend code page (GUI code page). The system code page depends on the platform byte order: • 4103 (UTF-16 LE) • 4102 (UTF-16 BE) The frontend code page is 4110 (UTF-8; Unicode Character Set). In Unicode systems the application server sends the information about the frontend code page, and the frontend code page is automatically set to 4110 (UTF-8). Do not change these settings!

Font Selection It is recommended to choose a Microsoft TrueType font, such as "Courier New" or "Andale Mono" (included in the Internet Explorer) as fixed font. In this case, "Tahoma" is automatically selected as proportional font. This combination covers a large area of the Unicode Character Set. Note: "Arial Monospaced for SAP" is not suitable as it covers only Latin-1 characters.

To display and change the frontend settings in the Unicode System, select from the System Function Bar and choose:

SAP AG 10 SupportedlanguagesinUnicode.doc 09.05.2007

1. Font (I18N)…: On the first screen the fixed font is displayed (for example "Courier New".. If you choose the OK button, a second, identical, screen is opened in which the proportional font is displayed (for example "Tahoma"). 2. Options (I18N)… Read the “SAPGUI for Windows: I18N User Guide” before maintaining the frontend settings!

Locales ABAP programs are written to be language-neutral and all language-specific data is derived from the system locales. Unlike in non-Unicode SAP systems, the Unicode system locales are platform independent. To provide Unicode Locales, SAP uses the International Components for Unicode (ICU). The ICU is a C/C++ and Java library that provides many internationalization functions including locales, transliteration and language-sensitive collation.

Input Methods in Unicode Systems SAP does not support input methods which make use of characters in the Unicode Private Use Area (PUA), for example Hong Kong Chinese (HKSCS) characters. For more information, see SAP Note 845233 .

International Components for Unicode (ICU) Background and History of ICU ICU is a set of C/C++ and Java libraries for Unicode support, software internationalization and globalization (i18n/g11n). It grew out of the JDK 1.1 internationalization APIs, which the ICU team contributed, and the project continues to be developed for the most advanced Unicode/i18n support. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software. ICU in the Unicode Development Project SAP Unicode development uses ICU for those functions that were defined by the platform-dependent locales in non-Unicode systems: 1. CTYPE functions: toupperU(), tolowerU(), isupperU(), islowerU(), isspaceU(), isprintU() etc. In the Unicode system, these functions are no longer platform- or locale-dependent. Every Unicode character has well defined properties, and these properties are accessible via ICU. 2. Collation: The ABAP statements SORT ... AS TEXT and CONVERT ... INTO SORTABLE CODE provide the programmer with locale-dependent, cultural sorting. ICU has collations for all languages that are supported by SAP. The bidirectional layout of Hebrew and Arabic texts will be implemented with ICU both in the Unicode and non-Unicode system.

Unicode SAP Systems: After the System Installation Requirements : Web AS 6.20 onwards Programs : report RSCPINST; transaction SMLT Further Information : SAP Notes 73606, 544623, 551344; Excel list "Supported Languages and Code Pages.xls".

SAP AG 11 SupportedlanguagesinUnicode.doc 09.05.2007

In a Unicode SAP system you can use almost every script and therefore almost every language in the world . An overview of the major languages, their scripts and the countries in which they are spoken is available in the Excel file "Supported Languages and Code Pages" which can be downloaded from SAP Note 73606 and from SAP Service Marketplace → Quick Link /i18n → I18N Media Library. Before installing a Unicode system you must consider the following topics: 1. Will you need additional hardware? See chapter Hardware requirements . 2. Do your ABAP programs comply with ABAP 6.10 syntax? Use transaction UCCHECK to find the lines that must be modified. Modify any program that is not compliant. For all programs set the attribute "Unicode enabled". See chapter Programming Languages . 3. If you have any C/C++ programs, are they Unicode-compliant? 4. Which languages will be required? Consider all of the users who will be working in the system and determine which users need to work in their respective languages. Also consider languages that are necessary for you to conduct business.

Language Installation After determining which languages to install run report RSCPINST. RSCPINST is a setup and diagnostic tool for configuring languages in SAP systems. The report automatically determines the settings required for a consistent i18n configuration, based on the set of languages selected. RSCPINST checks (i) all important i18n configuration tables (ii) all important i18n application server profile parameters. The report also updates the necessary database tables, but modifications to the application profile parameters have to be carried out manually. In transaction SE38 start RSCPINST. You will get a message that this is a Unicode system. Confirm the message, then select and follow the instructions described there.

Web AS 6.20 and Web AS 6.40 : You can activate up to 50 languages per system. Web AS 7.00 and higher : You can activate up to 240 languages per system.

You can check your current language configuration anytime by using pushbutton Current NLS config . After you have installed or changed your configuration always use Simulate to check the configuration and possible inconsistencies. You can simulate RSCPINST as often as required. Do not activate before all inconsistencies have been checked and adjusted!

Fig. 12: Language Installation Tool

SAP AG 12 SupportedlanguagesinUnicode.doc 09.05.2007

1. Enter languages The tool suggests the set of languages which is entered in database table TCP0I. If TCP0I is empty, RSCPINST suggests language EN (English) only. If you want to add languages, use Add. You can add all 41 languages from the Default Set delivered by SAP (see Fig. 2 ). If you want to install more languages or the F4 help in the Key field does not include the language key you require, you can add the language key(s) to the F4 help list. Select Extend Language List . This function extends the Default Language Set (as described in chapter 'Technical Language Support', Fig. 2) with any of the new Unicode Languages (chapter 'Technical Language Support', examples 1 and 2). Proceed the configuration with the extended list as usual. Note: Read SAP Note 42305. It is also recommended to contact SAP to evaluate your system landscape and language configuration before adding new languages. 2. Enter country If there is an existing entry, it displays the country in the fields next to Choose and also all possible countries in the list. If it is a new installation or no previous entry is found, it just displays 'Unicode' in both the field and the list. If there is a need to select the country, you can make the country list available by selecting Goto → Select Country (Unicode) from the menubar. 3. Select Simulate from the toolbar. 4. Now RSCPINST checks the consistency of new configurations and generates a list of new setting information and problematic areas, such as obsolete parameters, etc. which must be adjusted

SAP AG 13 SupportedlanguagesinUnicode.doc 09.05.2007

manually before the installation can be activated. No database updates are carried out within this mode. 5. Review the output and make the necessary changes to the profile parameters if indicated any. 6. Select Activate to complete the installation. Manually update the profile parameter zcsa/installed_languages so that it will include all of the languages installed. In the RSCPINST simulation (Fig. 13) you see an example of new parameter values after the installation of two additional languages. Copy the new values and go to transaction RZ 10 or RZ11. Replace the values of zcsa/installed_languages by pasting the copied values onto the current values of this parameter. Fig. 13: Language Installation Simulation

Logon Language In Unicode SAP systems you can display and maintain data from ANY language with ANY logon language - provided the languages and locales have been installed and profile parameters have been maintained correctly as described in chap. Language Installation (see also: SAP Note 42305). The logon language determines the display language for menus, dynpros, system messages etc…i.e. the language the user is working in. Translation Import After having finished all code page related configurations, run transaction SMLT to configure, import or supplement the translations for all required languages. Call transaction SMLT and read the documentation which is available via pushbutton Documentation in the toolbar.

Printing in Unicode SAP Systems Local printing with Single Code Page printers is still possible. To print multilingual data, Unicode printers are required.

SAP AG 14 SupportedlanguagesinUnicode.doc 09.05.2007

Fig. 14

Supported Device Types LEXUTF8 HPUTF8

Communication within Multilingual System Landscapes In this chapter you will see how Unicode SAP systems integrate with other Unicode and with non- Unicode systems. You will also learn about the advantages of a homogeneous Unicode system landscape. The example shows that not each non-Unicode system can correctly deal with the text data it receives - whilst each Unicode system can. So, to be truly able to communicate without any code page incompatibilities, the only solution is a system landscape which is completely Unicode-based . Important SAP Notes SAP Note Description 745030 MDMP - Unicode Interfaces: Solution Overview Provides a detailed overview of data exchange solutions between MDMP systems and Unicode. 656350 Master Data Transfer UNICODE <==> MDMP Systems with ALE Details the description and solution of a special development project which allows transferring language dependent master data between a UNICODE and MDMP system using ALE Technology. 820419 IDoc adapter: Incorrect field values with MDMP systems Information about using the Exchange Infrastructure (XI3.0) for sending IDocs with the IDOC adapter to an MDMP system.

SAP AG 15 SupportedlanguagesinUnicode.doc 09.05.2007

A multinational company has a Unicode SAP system at its headquarters in the US, a single code page SAP system in Japan (Shift-JIS or SJIS) and an MDMP SAP system in Australia (Latin-1/SJIS/Thai). Japanese and English (7Bit ASCII) data can be sent and received by all offices, but the Japanese office cannot receive all data with Thai characters from the Australian office, because SJIS does not contain those characters.

Data Transfer in a Unicode/non-Unicode System Landscape Data transfer between two Unicode systems is always unproblematic, no matter if text data contain language information or not. Communication between two systems will be problematic in the following cases: 1. Sender and receiver system deploy different code pages. If you logon to a Unicode system with language EN and maintain Japanese data which are then transferred (for example via RFC) or transported into a non-Unicode System which has no Japanese code page, the Japanese data will be corrupted in the non-Unicode System. 2. JAVA applications communicate with non-Unicode SAP components: as JAVA is using Unicode for text procssing and Unicode is a superset to all old non-Unicode code pages there is always danger of data loss in the communication between JAVA and non-Unicode software. This applies for the communication between the ABAP stack and the Java stack within a NetWeaver application as well as for the communication of the ABAP stack with external JAVA applications. 3. A Unicode system communicates with an Asian non-Unicode system (double-byte code page).

Fig. 15 100% data transfer 7Bit ASCII data transfer; solution for 100% data transfer being implemented in some applications 7Bit ASCII and some additional data transfer 100% 7Bit ASCII data transfer only No solution for data transfer yet - under investigation!

Smooth data transfer can only be guaranteed between Unicode systems and between Unicode system and JAVA application (see: Fig. 15 ). As MDMP is completely unknown in the JAVA world, there is no communication possible between MDMP system and JAVA applications.

Fig. 16 System Communication Sender / Single Code Page MDMP JAVA Application Unicode Receiver Single Code Page MDMP JAVA Application Unicode

SAP AG 16 SupportedlanguagesinUnicode.doc 09.05.2007

Fig. 17 Example: Communication between Single Code Page SAP Systems Sender* / ISO-1 ISO-2 ISO-3 ISO-5 ISO-6 ISO-7 ISO-9 ISO-11 SJIS Big 5 KSC GB Receiver 5601 1324 ISO- 1 ISO- 2 ISO- 3 ISO- 5 ISO- 6 ISO- 7 ISO- 9 ISO- 11 SJIS Big 5 KSC5601 GB1324

*ISO-X = ISO 8859-X

Fig. 18 Common errors: ■ communication of Unicode system with non-Unicode system (reason: wrong language key) ■ file upload/download (reason: wrong code page)

Transport between Unicode and non-Unicode SAP Systems Transporting objects between UC and non-UC systems is technically supported. There are some restrictions, however, which are described in SAP Notes 638357 and 80727 .

SAP AG 17 SupportedlanguagesinUnicode.doc 09.05.2007

RFC Library As of Web AS 6.10 RFC enhancements guarantee smooth data communication between Unicode and non-Unicode systems and with non-SAP products as well. All necessary data conversions occur in the Unicode system; therefore you do not need to make any changes to your existing non-Unicode systems. When configuring destinations in the Unicode systems, you simply have to declare the RFC destination as a non-Unicode system, and then the data will be converted with the appropriate code pages and language keys of the non-Unicode destination system. The RFC Library exists in a Unicode and non-Unicode version. Thus, the Unicode RFC Library is forward and backward compatible, i. e. a current Unicode RFC Application can communicate with any non-Unicode RFC Application independently of its release and vice versa. The Unicode Library is able to communicate with any RFC partner, regardless if the partner is Unicode or non-Unicode. There are two approaches of the RFC Library: 1. Both RFC partner and RFC client use a Unicode (or non-Unicode) system. The data will be sent to the RFC server as it is, i.e there is no conversion for character-like data at sender side. The receiver converts the data into its own internal format. Note: This RFC-connection does only work 100% if sender and receiver system do not deploy different code pages; i.e. if they are both either Unicode systems (see Fig. 15 ) or they use the same non-Unicode code page (see Fig. 16 )! 2. Only one RFC partner uses a Unicode system. In this case the Unicode system must convert the data into a suitable ASCII data format before sending it. When the RFC converts text data between Unicode and MDMP systems it converts from/to the code page in which the text data are encoded in the MDMP system. The encoding code page depends on the text language which is taken from the language field of the table. If the table has no language field the matching code page will be determined according to the logon language. For example if the logon language is Japanese, the Unicode partner will convert the character-like data into a 8000 code page before sending it. This code page is called communication code page . For more information about RFC-connections between Unicode and non-Unicode systems read the following SAP Notes: 547444 (RFC Enhancement for Unicode ./. MDMP Connections) 480671 (The Text Language Flag of LANG Fields) 722193 (RFC legacy MDMP callers and Unicode callees) 647495 (RFC for Unicode ./. MDMP Connections 790485 (RFC Problem Single Code Page, non-Unicode to Unicode System)

For information about how to use the RFC Library see the RFC-Documentation on SAP Service Marketplace. Go to service.sap.com/rfc-library . Select Media Library → RFC Library Guide .

SAP AG 18 SupportedlanguagesinUnicode.doc 09.05.2007

From non-Unicode SAP System to Unicode There are four ways to your Unicode SAP system: 1. New installation of a Unicode SAP system (as of mySAP ERP 2005/SAP NetWeaver 2004s all new installations are Unicode systems) 2. Conversion of non-Unicode SAP system to Unicode (supported as of R/3 Enterprise 4.70 Ext. 2.00/Web AS 6.20) 3. Combined Upgrade & Unicode Conversion to target release SAP ERP 2005 • source release SAP_BASIS 4.6C (Status: Unrestricted shipment). • source releases SAP_BASIS 6.20 and 6.40 (pilot project, see SAP Note 928729 ) 4. Twin Upgrade & Unicode Conversion • Source releases lower than R/3 4.6C/SAP_BASIS 4.6C (pilot project, see SAP Note 959698 ) • Source release Web AS 6.10 (pilot project, see SAP Note 959698 ) A system conversion from MDMP to Unicode implies some additional consideration and more preparation and conversion steps than a Single Code Page system conversion. Make sure you deploy Unicode-based mySAP components (listed in SAP Note 79991) and a Unicode-enabled database (for current information of databases supported by Unicode, see section Database and Platform Support in this document and SAP Note 379940). Make also sure you read the applicable documentation listed in the Appendix.

Appendix Documentation New Installation of Unicode SAP Systems If you have decided to install a new Unicode SAP system, you need the Installation Guide for your database/platform combination, and SAP Note 544623.

Conversion of non-Unicode SAP to Unicode If you have decided to convert existing SAP systems to Unicode, the following documentation is required: 1. "Unicode Conversion Guide". Go to SAP Service Marketplace → Quick Link /unicode@sap. Select Unicode Library → Unicode Conversion Library .

2. "Homogeneous or Heterogeneous System Copy for SAP Systems Based on SAP Web AS 6.xx". Go to SAP Service Marketplace → Quick Link /instguides.

Combined Upgrade & Unicode Conversion Download the Combined Upgrade & Unicode Conversion Guide 4.6C from SAP Note 928729 . Download the Component Upgrade Guide and the Installation Guide for your database/platform combination from SAP Service Marketplace Quick Link/instguides. Download the System Copy Guide from SAP Service Marketplace Quick Link/systemcopy.

SAP AG 19 SupportedlanguagesinUnicode.doc 09.05.2007

Further Information Visit the websites of the SAP Internationalization team for detailed technical information: http://service.sap.com/i18n http://service.sap.com/unicode@sap Visit the website of the SAP Globalization team for detailed customer and project information: www.service.sap.com/unicode For general information about Unicode visit the public website of the : http://www.unicode.org Overview of pre-Unicode code pages: http://czyborra.com/charsets

Contacts For technical information: mailto:[email protected] For information about Unicode Conversion project status: mailto:[email protected]

SAP AG 20