
SAP Migrations to

Sebastian Buhlinger, SAP Consultant, HP-SAP EMEA CC

© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Agenda

1. Introduction to Unicode

2. Unicode & SAP in General

3. Technology in Depth

3/31/2004 2 Introduction to Unicode

1. Introduction to Unicode

• History of encoding
• Problem of character encoding
• From ASCII to Unicode
• What is Unicode exactly?
• Unicode Encodings

History of Character Encoding

• Historically, computers were slow, had little memory, and were very expensive
• Up to the 1960s, I/O meant punching holes into paper tape
• Most character sets date back to the punch-card age and were designed with these cards in mind
• In the early days of computing, every hardware manufacturer used proprietary technology (and encodings)
• International data interchange was not an issue, so nothing needed to fit together

Problem of character encoding

• Which number is assigned to which character?
• When an ‘A’ is typed on the keyboard, the computer uses the character code to look up the shape of ‘A’ under the same binary number in a font file, and displays or prints it
• The character ‘A’ may have different integer values in different programs or data files (the code for ‘A’ might select a different glyph in a font file)
• In some instances no number is available for certain characters (e.g. “ä” → Ä)
• All data is encoded in the form of binary numerical codes

Character repertoire

• English alphabet with digits and a few more symbols: ~60 characters
• Western European standard: ~300 characters for several languages
• Korean: ~12,000 syllables
• Chinese dictionaries: ~50,000 characters
• Hundreds of other characters in common use, such as math and currency symbols

From ASCII to Unicode

• Most character sets and encodings of the 1970s and 1980s were modifications or extensions of ASCII
• Many of them used 8 bits per character and included the 94 printable ASCII characters as a subset
• Most common encodings today use a single byte per character (SBCS)
• They are all limited to 256 characters
• Because of that, none of them can cover even the letters needed for all Western European languages

What is Unicode exactly?

• Unicode = universally encoded character set to store information from any language
• Unicode
  • defines properties for each character
  • standardizes script behavior
  • provides a standard algorithm for bidirectional text
  • defines cross-mappings to other standards
• Unicode defines a unique code value for every character, regardless of platform, program, or programming language

What is Unicode exactly?

• The Unicode standard primarily encodes scripts rather than languages
• A script comprises several languages that historically share the same set of symbols
• In many cases one script serves to write dozens of languages (e.g. the Latin script)
• In other cases one script corresponds to a single language

Unicode Encodings

• UTF = Unicode Transformation Format
• UCS = Universal Character Set
• CESU = Compatibility Encoding Scheme for UTF-16
• Conversion between the different encodings is a simple, bit-wise operation (defined in the standard)
• No performance-intensive conversion tables are necessary

Unicode Encodings

• UTF-8: Unicode Transformation Format based on an 8-bit representation
• CESU-8: Compatibility Encoding Scheme for UTF-16 on an 8-bit basis
• UTF-16: Unicode Transformation Format based on a 16-bit representation
• UCS-2: Universal Character Set, 2-byte variant (16-bit)
• UTF-32: Unicode Transformation Format based on a 32-bit representation
• UCS-4: Universal Character Set, 4-byte variant (32-bit)

Unicode Encodings

• Not all Unicode characters are 2 bytes long → hardware requirements do not simply double
• The Unicode encoding determines the length of a character
• A character in a Unicode encoding can be longer than 1 byte; Unicode characters can therefore be longer than the characters defined in a standard code page
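The point that the encoding, not the character, determines the length can be checked directly. The following is a minimal sketch using Python's built-in codecs; the sample characters and the helper name `encoded_lengths` are illustrative assumptions, not part of the original slides.

```python
# Byte length of single characters in three Unicode encodings.
# The sample characters are illustrative choices, not taken from the slides.
SAMPLES = ["A", "ä", "가", "𝄞"]  # U+0041, U+00E4, U+AC00, U+1D11E


def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes `ch` occupies in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # "-be" suppresses the byte-order mark
        "utf-32": len(ch.encode("utf-32-be")),
    }


for ch in SAMPLES:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

The supplementary character U+1D11E needs 4 bytes in UTF-16 (a surrogate pair), which is why it has no UCS-2 representation.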

Example #1

Character  UTF-8        UCS-2  UTF-16
A          41           0041   0041
c          63           0063   0063
Æ          C3 86        00C6   00C6
ö          C3 B6        00F6   00F6
٤          D9 A4        0664   0664
页          E9 A1 B5     9875   9875
𝄞          F0 9D 84 9E  N/A    D834 DD1E

Example #2 – character “가” U+AC00

UTF-8   HEX   E    A    B    0    8    0
        BIN   1110 1010 1011 0000 1000 0000

Remove the lead byte indicator (1110) and the trailing byte indicators (10):
              1110 1010 10 110000 10 000000
                   1010    110000    000000

Regroup bits: 1010 1100 0000 0000

UTF-16  BIN   1010 1100 0000 0000
        HEX   A    C    0    0
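The bit manipulation in Example #2 can be written out as a short sketch. The function name `decode_utf8_3byte` is a hypothetical helper; it handles only 3-byte sequences and does not reject overlong forms or surrogates, so it is an illustration of the regrouping step rather than a complete decoder.

```python
def decode_utf8_3byte(data: bytes) -> int:
    """Decode one 3-byte UTF-8 sequence into its code point."""
    b0, b1, b2 = data
    # Indicators: 1110xxxx for the lead byte, 10xxxxxx for each trailing byte.
    if b0 >> 4 != 0b1110 or b1 >> 6 != 0b10 or b2 >> 6 != 0b10:
        raise ValueError("not a 3-byte UTF-8 sequence")
    # Strip the indicators and regroup the remaining 4 + 6 + 6 payload bits.
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


code_point = decode_utf8_3byte(bytes([0xEA, 0xB0, 0x80]))
utf16_be = code_point.to_bytes(2, "big")  # a BMP character fits into one 16-bit unit
print(f"U+{code_point:04X}", utf16_be.hex().upper())
```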

Unicode & SAP in General

2. Unicode & SAP in General

• Code Pages
• SAP & Code Pages
• Language Combinations before Unicode
• Recommendations from SAP (w/o Unicode)
• When/why do customers need Unicode?

Code Pages

• The code page determines which characters you can see and enter

[Figure: Characters on Disk/Memory]

Code Pages

• Different code pages map different characters to the same byte sequence
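A small sketch of this effect, using Python's built-in codecs as stand-ins for code pages; the chosen byte value and the three code pages are illustrative assumptions.

```python
# One stored byte, read under three different single-byte code pages.
RAW = bytes([0xE4])
CODE_PAGES = ["latin-1", "iso8859-7", "cp1251"]  # Western European, Greek, Cyrillic

decoded = {cp: RAW.decode(cp) for cp in CODE_PAGES}
for cp, ch in decoded.items():
    print(f"{cp:10} 0xE4 -> {ch} (U+{ord(ch):04X})")
```

The byte on disk never changes; only the code page used to interpret it does.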

[Figure: Single Byte and Double Byte code pages; Characters on Disk/Memory]

SAP & Code Pages

Language Combinations before Unicode

• Single Standard Code Pages
  • support specific sets of languages
  • the number and combination of supported languages cannot be altered

• Standard code pages and R/3 languages (w/o EBCDIC)

Double-Byte Code Pages

Language Combinations before Unicode

• It is also possible to specify a customer-specific language; this language must use one of the code pages that SAP supports (see Note 0112065)

Language Combinations before Unicode

• Blended Code Pages (≥ Rel. 3.1D)
  • SAP-proprietary code pages that contain characters from one or more standard code pages
  • increase the combinations of languages that can be used
  • functionally, a Blended Code Page behaves like a single code page
  • a system with a Blended Code Page is a single code page system
  • users can see and enter all characters contained in the code page, regardless of their log-on language

Language Combinations before Unicode

SAP Code Page Supported Languages

Language Combinations before Unicode

• The availability of SAP blended code pages is platform-dependent, because SAP blended locales need to be created for each platform
• Blended Locale Status (x = available, -- = not available)

Language Combinations before Unicode

• MDMP (≥ Rel. 3.1I): Multi-Display / Multi-Processing
  • allows dynamic code page switching on the application server
  • therefore permits any combination of standard code pages on one system
  • the log-on language determines the code page that is active for each user
• An MDMP system is recommended if:
  1. one or more additional code pages are required to add languages to your existing installation
  2. a blended code page cannot support the combination of languages you need for a new installation
• For example, an MDMP system with the code pages 1100 and 8000 allows German and Japanese users to log on to the same R/3 system in their respective languages

Language Combinations before Unicode: Example

[Figure: front ends in Japan (code page 8000, SJIS) and Germany (code page 1100, ISO-1) connected to one application server and DB server]

• Each user can access only one code page at a time: a user who logs on as a Japanese user cannot enter German characters, and German characters in the database will not be displayed correctly

Language Combinations before Unicode: Example

[Figure: Japanese user and German user]

Language Combinations before Unicode

Please Note:
• A user can log on in German and then manipulate the character set and font settings so that what he or she enters appears to be Japanese; these characters will not be stored correctly in the database, and the data will be corrupt
• A user who wants to enter, for example, Japanese must log on in Japanese

Language Combinations before Unicode

Please Note:
• To ensure that no data corruption occurs, the following restrictions must be followed:
  • Global data must contain only 7-bit ASCII characters, which are present in all code pages
  • Users may use only the characters of their log-on language or 7-bit ASCII
  • Batch processes must be assigned the correct user ID and language
  • EBCDIC code pages are not supported
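Why 7-bit ASCII is safe as global data while other characters are not can be sketched as follows. The Python codecs `shift_jis` and `latin-1` stand in for code pages 8000 and 1100; the helper names `is_global_safe` and `reread` are hypothetical and not SAP functions.

```python
def is_global_safe(text: str) -> bool:
    """True if `text` contains only 7-bit ASCII characters."""
    return all(ord(ch) < 0x80 for ch in text)


def reread(text: str, written_as: str, read_as: str) -> str:
    """Store `text` under one code page and read it back under another."""
    return text.encode(written_as).decode(read_as, errors="replace")


name_ascii = "Yamamoto"
name_jp = "山本"

# Written by a Japanese user (SJIS), read by a German user (Latin-1).
print(is_global_safe(name_ascii), reread(name_ascii, "shift_jis", "latin-1"))
print(is_global_safe(name_jp), repr(reread(name_jp, "shift_jis", "latin-1")))
```

The ASCII name survives the code page switch unchanged; the Japanese name comes back as unrelated Latin-1 characters.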

Recommendations from SAP (w/o Unicode)

• In general, using a single standard code page for new installations and upgrades is the optimal decision
• If additional languages or language combinations are needed, SAP recommends Unambiguous Blended Code Pages for new installations and MDMP for existing installations
• Unambiguous Blended Code Pages support only certain language combinations; an MDMP setup may therefore be the only possibility for new installations as well

Unicode-compliant SAP products

• Currently, Unicode installations require written permission from SAP and are carried out as customer projects together with SAP; the exception is new installations of R/3 Enterprise Extension Set 2.0

When/why do customers need Unicode?

• Global businesses that require IT systems to support multilingual data without any restrictions → e.g. customers with one worldwide central SAP system
• Web interfaces open the door to a global customer base, and IT systems must consequently be able to support multiple local languages simultaneously
• With J2EE integration, mySAP components fully support web standards, and with Unicode they can take full advantage of XML and Java
• Only Unicode makes it possible to integrate heterogeneous SAP and non-SAP system landscapes seamlessly → NetWeaver

Technology in Depth

3. Technology in Depth

• Little vs. Big Endian
• Unicode & Databases
• SAP Unicode-based Code Pages
• How to Unicode-enable a program
• Unicode-enabled ABAP
• Migrating to Unicode enabled ABAP
• Unicode Conversion, IMIG Lab Test

Little vs. Big Endian

• UCS and Unicode are first of all just code tables that assign integer numbers to characters
• There are several alternatives for representing a sequence of such characters, or their integer values, as a sequence of bytes
• The two most obvious encodings store Unicode text as sequences of either 2-byte or 4-byte units
• Unless otherwise specified, the most significant byte comes first in these encodings (Big Endian convention)
• An ASCII or Latin-1 file can be transformed into a UCS-2 file by simply inserting a 0x00 byte in front of every byte
• For a UCS-4 file, three 0x00 bytes have to be inserted before every byte instead
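The zero-byte insertion can be sketched in a few lines. The function names are hypothetical; the result is compared against Python's big-endian UTF-16 and UTF-32 codecs, which coincide with UCS-2 and UCS-4 for these characters.

```python
def latin1_to_ucs2_be(data: bytes) -> bytes:
    """Insert one 0x00 byte in front of every ASCII/Latin-1 byte."""
    out = bytearray()
    for b in data:
        out += bytes([0x00, b])
    return bytes(out)


def latin1_to_ucs4_be(data: bytes) -> bytes:
    """Insert three 0x00 bytes in front of every ASCII/Latin-1 byte."""
    out = bytearray()
    for b in data:
        out += bytes([0x00, 0x00, 0x00, b])
    return bytes(out)


print(latin1_to_ucs2_be(b"SAP").hex(" "))
print(latin1_to_ucs4_be(b"SAP").hex(" "))
```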

Little vs. Big Endian

Character  Unicode Scalar Value  UTF-8 / CESU-8  UTF-16 [Little Endian]  UTF-16 [Big Endian]
A          U+0041                41              41 00                   00 41
Ä          U+00C4                C3 84           C4 00                   00 C4
α          U+03B1                CE B1           B1 03                   03 B1
א          U+05D0                D7 90           D0 05                   05 D0
晓          U+6653                E6 99 93        53 66                   66 53
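The byte orders in the table can be reproduced with Python's codecs. This is a sketch for checking the values; `hex_bytes` is a hypothetical formatting helper.

```python
ROWS = ["A", "Ä", "α", "א", "晓"]


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


for ch in ROWS:
    print(
        f"U+{ord(ch):04X}",
        "| UTF-8:", hex_bytes(ch.encode("utf-8")),
        "| UTF-16LE:", hex_bytes(ch.encode("utf-16-le")),
        "| UTF-16BE:", hex_bytes(ch.encode("utf-16-be")),
    )
```

The explicit `-le` and `-be` codec names are used here because Python's plain `utf-16` codec prepends a byte-order mark.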

Unicode & Databases

Databases supported by SAP (WAS 6.20)

P = available, ? = currently not available, -- = unsupported in general

            Win2K  HP-UX  Solaris  AIX  OS/400  OS/390  Linux
SQL Server  P      --     --       --   --      --      --
Oracle      P      P      P        P    --      --      P
DB2         P      P      P        P    P       ?       P
SAP DB      P      P      P        P    --      --      P

Unicode & Databases

Manufacturer  Version  Encodings
SQL Server    2000     UTF-16
Oracle        7.2      UTF-8
              8        UTF-8
              9i       UTF-8 / UTF-16
              10g      UTF-8 / UTF-16
DB2           AIX      CESU-8
              AS400    UTF-16
SAP DB        7.0      UTF-16
              8.0      UTF-8

System structure with Unicode

• Base codepage is Unicode

[Figure: Front-end (Windows 2000/XP, SAPgui 620 or higher, codepage 4110, UTF-8), SAP Application Server (SAP Kernel 620, UTF-16), Database (Oracle 9i, UTF-8 (CESU8))]

System structure with Unicode

SAP R/3 Application Server (SAP Kernel 620)
• All data is handled in UTF-16 format
• ABAP programs have to be Unicode-enabled
• Processing UTF-16 requires more CPU power and memory because of the data size and the conversion; CPU and memory requirements increase by about 30%
• SAP recommends 64-bit processing

SAP Unicode Kernel 620
• This is a special kernel for Unicode R/3 and not the regular (non-Unicode R/3) release

System structure with Unicode

Database Server: Oracle 9i in Unicode
• Oracle is set up with UTF-8 (CESU8)
• Now the question is: SAP says “DB grows +35% to 60%”, because UTF-8 is a variable-length encoding
  • ASCII (alphanumeric) characters are stored as a single byte
  • Latin characters with European accents are stored as two bytes
  • Double-byte text is stored as three bytes per character

Double byte? Need more storage? How about the network?

• Will the database really grow that dramatically? Probably not, because the majority of our data will be ASCII characters; a linguistic factor also helps to explain why
• Example with UTF-8 (CESU8) in Oracle 9i:

  Word        In Latin letters  In Japanese characters
  “Thunder”   7 bytes           3 bytes
  “Yamamoto”  8 bytes           6 bytes

• An English word uses more letters than the same word written in an Asian language
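The byte counts on this slide can be checked with a short sketch. The Japanese spellings below (雷 for "thunder", 山本 for "Yamamoto") are assumptions chosen to match the 3-byte and 6-byte figures; the original characters did not survive in the slide text.

```python
# Latin spelling -> assumed Japanese spelling (not preserved in the slides).
WORDS = {"Thunder": "雷", "Yamamoto": "山本"}


def utf8_size(text: str) -> int:
    """Number of bytes `text` occupies in UTF-8."""
    return len(text.encode("utf-8"))


for latin, japanese in WORDS.items():
    print(f"{latin}: {utf8_size(latin)} bytes, {japanese}: {utf8_size(japanese)} bytes")
```

Each Japanese character costs 3 bytes in UTF-8, but the word needs far fewer characters than its Latin spelling.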

System structure with codepage

Front-end (SAPgui), Windows 2000/XP
• A Unicode-enabled SAP system requires SAPgui 620 or later, although 620 has some restrictions
• SAPgui 620 enables Unicode data handling at the client PC level
• Users have to be aware of which language data they use
• Codepage 4110 (Unicode, UTF-8) is used; non-Unicode codepage settings (Latin-1, S-JIS, etc.) will not work

What is different in Unicode R/3

• For users
• For developers
• For system administration (Basis)

What is different in Unicode R/3

• For users
  • The SAPgui settings have changed in order to handle multilingual data correctly
  • A Unicode-enabled SAP system requires SAPgui 620 or higher
  • SAPgui enables Unicode data handling at the client PC level
  • Users have to be aware of which language data they use and choose
  • Use codepage 4110
  • Non-Unicode codepage settings (Latin-1, S-JIS, etc.) no longer work with Unicode R/3

What is different in Unicode R/3

• For developers
  • All developers have to be aware of Unicode-enabling of ABAP/4
  • 1 character <> 1 byte
  • Many ABAP/4 statements and data types have changed

What is different in Unicode R/3

• For system administration (Basis)
  • RFC: SM59 system interface
    • Need to know whether the target system is Unicode or non-Unicode
    • Which language data is sent and received?
    • Codepage data conversion from/to the non-Unicode system
  • ALE/IDoc

How to Unicode-enable a program

• Separate Unicode and non-Unicode versions of R/3
• Non-Unicode R/3
  • 1 character = 1 byte (types C, N, D, T, STRING)
  • Non-Unicode kernel
  • Non-Unicode database
• Unicode R/3
  • 1 character = 2 bytes → UTF-16 (types C, N, D, T, STRING)
  • Unicode kernel
  • Unicode database
• No explicit Unicode data type in ABAP
• Single ABAP source for Unicode and non-Unicode systems

How to Unicode-enable a program

• The major part of ABAP coding is ready for Unicode without any changes
• A minor part of ABAP coding has to be adapted to comply with Unicode restrictions (e.g. syntactical restrictions)
• Program attribute “Unicode checks active”

Unicode Enabled ABAP

Design Goals
• Platform independence → identical behavior on Unicode and non-Unicode systems
• Highest level of compatibility with the pre-Unicode world → minimize costs for Unicode-enabling of ABAP programs

Main Features
• Clear distinction between character and byte processing: 1 character <> 1 byte
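The distinction between character and byte processing is illustrated below in Python rather than ABAP, as a language-neutral sketch. The sample text and the helper `cut_bytes_is_safe` are illustrative assumptions; the point is only that a byte offset is not a character offset once 1 character <> 1 byte.

```python
text = "Grüße 山本"


def cut_bytes_is_safe(value: str, n_bytes: int, encoding: str = "utf-8") -> bool:
    """Check whether cutting the encoded form at a byte offset keeps it decodable."""
    try:
        value.encode(encoding)[:n_bytes].decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print("characters:", len(text))
print("bytes in UTF-8:", len(text.encode("utf-8")))
print("bytes in UTF-16:", len(text.encode("utf-16-be")))
print("cut after 2 bytes ok:", cut_bytes_is_safe(text, 2))
print("cut after 3 bytes ok:", cut_bytes_is_safe(text, 3))  # splits the 2-byte "ü"
```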

Migrating to Unicode enabled ABAP

Step 1
• In the non-Unicode system
  • Adapt all ABAP programs to the Unicode syntax and runtime restrictions
  • Set the attribute "Unicode enabled" for all programs

Migrating to Unicode enabled ABAP

Step 2
• Set up a Unicode system
  • Unicode kernel + Unicode database
  • Only ABAP programs with the Unicode attribute are executable
• Do runtime tests in the Unicode system
  • Check for runtime errors
  • Look for semantic errors
  • Check the ABAP list layout with former double-byte characters

Migrating to Unicode enabled ABAP

Use UCCHECK to analyze your applications:
• Remove errors
• Inspect places that cannot be analyzed statically (optional)
  • Untyped field symbols
  • Offsets with variable length
  • Generic access to database tables
• Set the Unicode program attribute using UCCHECK or SE38 / SE24 / ...
• Do additional checks with SLIN (e.g. matching of actual and formal parameters in function modules)

Upgrade to Unicode

• With Unicode, there are no limitations on users, and all languages in the ISO639 standard can be used

• Unicode is technically supported as of Basis Release 6.20, see Note 0379940 for more information

• A single code page system (standard or Unambiguous Blended Code Page) can be upgraded to Unicode using the normal upgrade method

Unicode Conversion Roadmap

Preparation

• During preparation, topics such as

• additional hardware requirements, • downtime issues, • Unicode-enabling of customer developments, • and the special treatment of MDMP systems

have to be taken into consideration

Unicode Conversion Roadmap

Conversion

• The Unicode conversion process is based on a system copy; during this process, the database conversion and the system shutdown/restart are automated as far as possible
• For small to mid-size databases (< 1 TB), the conversion is based on an SAP unload/reload of the complete database; minimum-downtime tools will be used for larger databases

Unicode Conversion Roadmap

Post-Conversion

• Once the Unicode system is up and running, you need to • verify data consistency on a scenario basis, • as well as carry out general integration testing

• For systems that support multiple languages, special emphasis needs to be placed on cross-language handling during the test phase.

• Correction tools are provided by SAP, which can be used in the case that conversion did not run properly.

Unicode Conversion Roadmap

Post-Conversion

• Additional tool: SAP Data Management, for reducing database size and growth
• To keep your database costs in check, the SAP Data Management service frees up valuable database resources by showing you how to reduce the size and growth of your database, typically by 25%

Unicode Conversion at a Glance

Preparation
• Set up the Unicode conversion project
• Check prerequisites
• Analysis for downtime minimization
• Special MDMP treatment
• Enabling of customer developments

Conversion
• Highly automated
• System will be down during database conversion
• Unload/reload process for small databases
• Minimum downtime tool for large databases

Post-Conversion
• Unicode system is up and running
• Verification of data consistency
• Integration testing focused on language handling

Upgrade Paths to Unicode (R/3 Enterprise)

Source systems: R/3 3.1i, R/3 4.0b, R/3 4.5b, R/3 4.6b, R/3 4.6c
Path: source system → direct upgrade → R/3 Enterprise non-Unicode → conversion → R/3 Enterprise Unicode (target system)

• First upgrade, then conversion to Unicode
• R/3 Enterprise Ramp-Up started 2002-07
• Unicode availability follows a phase of restricted shipment with pilot customers

Upgrade Paths to Unicode (BW 3.1)

Source systems: BW 2.0B, BW 2.1C, BW 3.0
Path: source system → BW 3.1 non-Unicode → conversion → BW 3.1 Unicode (target system)

• Interfacing R/3 MDMP on a project basis only
• Unicode BEXGUI restrictions apply
• First upgrade, then conversion to Unicode
• BW 3.1 Ramp-Up starting 2002-12
• Unicode availability follows a phase of restricted shipment with pilot customers

Upgrade Paths to Unicode (CRM 3.1)

Source systems: CRM 2.0B, CRM 2.0C, CRM 3.0
Path: source system → CRM 3.1 non-Unicode → conversion → CRM 3.1 Unicode (target system)

• Selected scenarios only ↔ cooperation with SAP GBU CRM required
• First upgrade, then conversion to Unicode
• CRM 3.1 Ramp-Up starting 2002-12
• Unicode availability follows a phase of restricted shipment with pilot customers

Prerequisites, special MDMP treatment

• OSS Note 548016
  • Conversion from Unicode to non-Unicode is not possible
  • The Unicode conversion of MDMP systems and of Ambiguous Code Page systems (code page numbers 6100, 6200, and 6500) is supported only on a project basis with SAP involvement
• OSS Note 543715
  • The Unicode conversion of a BW 3.1 system requires additional steps regarding the system copy
• OSS Note 573044
  • If you are using HR functionality within R/3 Enterprise, additional steps are also mandatory

6.30 Unicode & MCOD

• With SAP WebAS 6.30, a database abstraction layer for the Java stack was introduced: OpenSQL for Java
• Tables of the Java stack are stored in the same database instance as the tables of the ABAP stack, in two different schemas (except on Informix)
• The concept of MCOD installations is fully supported by the combined stack of ABAP and Java

System QA1: ABAP stack (non-Unicode/Unicode) in schema SAPQA1, Java stack (Unicode) in schema SAPQA1DB
System TC2: ABAP stack (non-Unicode/Unicode) in schema SAPTC2, Java stack (Unicode) in schema SAPTC2DB


Experience on a 280 GB database (220 GB used)

• Runtime of export and import: 36 hours
• Expected downtime (with backup, control reports, user test): 56 hours
• Database growth: 40 GB
  • PSAPBTABD: 18 GB (on 115 GB)
  • PSAPES620D: 13 GB (on 4 GB)


BACKUP SLIDES

Unicode Encodings

UTF-8

• UTF-8 is the 8-bit encoding of Unicode
• It is a variable-width encoding and also a strict superset of 7-bit ASCII
• “Strict superset” means that every character in 7-bit ASCII is available in UTF-8 with the same value
• 1 character = 1 to 4 bytes in the encoding
• Characters from European scripts: either 1 or 2 bytes
• Asian scripts: 3 or 4 bytes

UTF-8

• UTF-8 is used for UNIX platforms, HTML, and most Internet browsers
• Main benefits of UTF-8:
  • compact storage requirements for European scripts
  • in general, European scripts will occupy less storage on disk and in memory
  • ease of migration → since 7-bit ASCII data remains the same in UTF-8, the data conversion effort between ASCII-based character sets and UTF-8 is reduced significantly

UTF-8 / CESU-8 (8-bit encodings)

• 8-bit encodings are well suited for data transfer, since all 7-bit ASCII and 8-bit ISO characters retain the same code points
• Easier communication with legacy and non-Unicode systems
• Downside: variable character length

UCS-2

• UCS-2 has a fixed width of 16 bits (2 bytes)
• UCS-2 is the Unicode encoding for Java and Windows NT 4.0
• Main benefits of UCS-2:
  • More compact storage requirements for Asian scripts (each character is represented with only 2 bytes)
  • String processing is faster because all characters have the same width
  • Good compatibility with Java and clients
• Downside:
  • UCS-2 can support only the Unicode characters defined up to Unicode 3.0 (max. 65,536)

UTF-16

• UTF-16 is the 16-bit encoding of Unicode
• Basically an extension of UCS-2
• One Unicode character can be 2 or 4 bytes in the encoding
• Characters from European and most Asian scripts are represented in 2 bytes
• Supplementary characters are represented in 4 bytes
• UTF-16 is the main Unicode encoding as of Windows 2000

UTF-16

• Main benefits of UTF-16:
  • More compact storage requirements for Asian scripts (2 bytes for commonly used characters)
  • Ideal if European and Asian scripts are used together → UTF-16 will occupy less storage on disk and in memory than UTF-8 (3 bytes for the Asian part)
  • Balance of efficient access to characters and economical use of storage
• The points above are the reason for the use of UTF-16 in the SAP Web Application Server

UCS-2 / UTF-16 (16-bit encodings)

• 16-bit encodings offer a compromise between the pros and cons of the 8-bit and the 32-bit encodings
• They do not need as much memory as 32-bit encodings, but offer a quasi-fixed character length
• UCS-2 has a fixed character length, but it cannot define more than 2^16 (65,536) characters

UTF-32

• 32-bit encoding
• Popular when memory space is no concern
• Fixed width (4 bytes)

UCS-4 / UTF-32 (32-bit encodings)

• All 32-bit encodings have a fixed length
• This advantage is outweighed by the extensive memory and storage requirements

SAP System-to-System Communication

SAP System-to-System communication

• SAP Web Application Server (≥ 6.20)
  • Only one source code exists for Unicode-based and non-Unicode-based systems → new developments can be exchanged smoothly
  • The interfaces (e.g. RFC) have been extended so that communication with other Unicode-based systems or with non-Unicode-based systems is possible
  • Furthermore, SAP provides standard tools for the installation of (and conversion to) Unicode-based systems; these can also be used for checking and Unicode-enabling customer developments

SAP System-to-System communication

[Figure: Unicode R/3, MDMP R/3 (Latin-1, SJIS), non-Unicode R/3 (Latin-1, SJIS), and the WWW connected via http/RFC]

• Solid lines: the receiver can receive all characters
• Dotted lines: the receiver cannot receive characters that are not in its own code page. But as long as you restrict the character set, data can be sent from everywhere to everywhere.

SAP System-to-System communication

RFC
• Unicode <-> Unicode
  • no problem
• non-Unicode <-> non-Unicode
  • existing behavior: the receiver converts the code page if possible
• Unicode <-> non-Unicode
  • the Unicode side converts from/to the code page of the non-Unicode side
  • MDMP data is converted using a language key
  • system settings allow the configuration of error handling
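The conversion on the Unicode side, with configurable error handling, can be sketched as follows. This is not the SAP RFC API: the function `send_to_non_unicode` and its `on_error` parameter are hypothetical, and Python's codec error modes stand in for the system settings mentioned above.

```python
def send_to_non_unicode(text: str, target_code_page: str, on_error: str = "strict") -> bytes:
    """Convert Unicode text to the receiver's code page before sending.

    on_error="strict" raises if a character is missing in the target
    code page; on_error="replace" substitutes a placeholder instead.
    """
    return text.encode(target_code_page, errors=on_error)


print(send_to_non_unicode("Müller", "latin-1"))                      # fits into Latin-1
print(send_to_non_unicode("山本", "latin-1", on_error="replace"))     # placeholders

try:
    send_to_non_unicode("山本", "latin-1")
except UnicodeEncodeError as exc:
    print("conversion error:", exc.reason)
```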

SAP System-to-System communication

RFC (SM59) – Unicode <–> non-Unicode

Sizing Information for Unicode-based SAP Systems

Sizing Info - General

• The space requirements for encoding a text, compared to the encodings currently in use (8 bits per character for European languages, more for Chinese/Japanese/Korean), are as follows:
  • UTF-8: no change for US-ASCII, just a few percent more for ISO-8859-1, 50% more for Chinese/Japanese/Korean, 100% more for Greek and Cyrillic
  • UCS-2 and UTF-16: no change for Chinese/Japanese/Korean; 100% more for US-ASCII, ISO-8859-1, Greek, and Cyrillic
  • UCS-4: 100% more for Chinese/Japanese/Korean; 300% more for US-ASCII, ISO-8859-1, Greek, and Cyrillic
• This affects disk storage space and network download speed (when no form of compression is used)
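These percentages follow directly from the bytes per character in each encoding. The sketch below recomputes them for short sample strings; the samples and the helper `growth_percent` are illustrative assumptions, and Python's `shift_jis` and `iso8859-7` codecs stand in for the legacy encodings.

```python
def growth_percent(text: str, legacy: str, unicode_encoding: str) -> float:
    """Size increase in percent when moving `text` from a legacy encoding to Unicode."""
    old = len(text.encode(legacy))
    new = len(text.encode(unicode_encoding))
    return (new - old) / old * 100


SAMPLES = [
    ("SAP R/3", "ascii"),     # US-ASCII
    ("αβγδ", "iso8859-7"),    # Greek
    ("山本", "shift_jis"),     # Japanese, double-byte
]

for text, legacy in SAMPLES:
    for target in ("utf-8", "utf-16-be", "utf-32-be"):
        print(f"{legacy:10} -> {target:10} {growth_percent(text, legacy, target):+.0f}%")
```

Real data mixes scripts, so actual growth lies between these per-script extremes.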

Expected Hardware Requirements

• Increase of CPU requirements, depending on the existing solution:
  • ISO-LATIN1 (ASCII) → Unicode: +30%
  • Double-Byte/MDMP → Unicode: + <5%
• Increase of memory requirements
  • depends on the underlying DB (+ ~50%)
  • the Application Server is internally based on UTF-16; the DB uses either UTF-8, CESU-8, or UTF-16

Expected Hardware Requirements

• Database growth depends on
  • the DB Unicode encoding scheme (e.g. CESU-8, UTF-16)
  • the languages in use

[Figure: bytes per character for sample characters under code pages 1100 and 8000, CESU-8, and UTF-16]

Encoding  Manufacturers                            Additional storage requirements
UTF-8     Oracle, SAP DB (8.0)                     35%
CESU-8    DB/2 (AIX)
UTF-16    SQL Server, DB/2 (AS400), SAP DB (7.0)   60-70%

• Network load (draft results): <7% for Latin-1, about 15% for Japanese, 25% for other Asian languages

Expected Hardware Requirements

Non-Unicode

R/3 Release  4.0  4.5   4.6c                  4.7 (6.20) non-Unicode
CPU          1    +20%  +15%                  +5%
Memory       1    +20%  DB: +20%; App: +10%   +5%
Disk         1    +10%  +10%                  +10%

Expected Hardware Requirements

Unicode

R/3 Release  4.7 (6.20) non-Unicode  4.7 with Unicode
CPU          1                       +30% to 35%
Memory       1                       +50%
Disk         1                       +~35% (UTF-8), +60-70% (UTF-16)

IMIG lab project

Tru64 Transition program

Unicode Conversion - IMIG

• Problem description
  • R3load cannot be used for very large databases (VLDB, > 800 GB); a minimized-downtime tool is needed
• Solution: Incremental Migration (IMIG)
  • The few tables that are very large are converted while the source system is still up
  • The remaining data can be copied during a greatly reduced downtime by means of a standard procedure

IMIG – Prerequisites

• Database triggers
  • All insert, update, and delete activities executed on the IMIG tables during and after their initial copy are logged
• RFC = Remote Function Call
  • SAP function for connecting SAP systems
• Target system must exist
  • The activities logged by the triggers in the source system are passed on to the corresponding tables in the target system by means of RFCs
• SAP Basis System
  • Reduced SAP system that has basic functions (RFC)
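The trigger-and-replay idea can be illustrated with a small, self-contained sketch. It uses SQLite in memory instead of Oracle, a local function call instead of RFC, and hypothetical names (`big_table`, `change_log`, `replay_log`); it shows the principle only and is not the IMIG tool itself. Changes to the primary key are not handled.

```python
import sqlite3

# Log table plus triggers that record every insert, update, and delete.
TRIGGERS = """
CREATE TABLE change_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                         op TEXT, id INTEGER, payload TEXT);
CREATE TRIGGER log_insert AFTER INSERT ON big_table BEGIN
    INSERT INTO change_log (op, id, payload) VALUES ('I', NEW.id, NEW.payload);
END;
CREATE TRIGGER log_update AFTER UPDATE ON big_table BEGIN
    INSERT INTO change_log (op, id, payload) VALUES ('U', NEW.id, NEW.payload);
END;
CREATE TRIGGER log_delete AFTER DELETE ON big_table BEGIN
    INSERT INTO change_log (op, id, payload) VALUES ('D', OLD.id, NULL);
END;
"""


def new_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, payload TEXT)")
    return db


def initial_copy(src: sqlite3.Connection, tgt: sqlite3.Connection) -> None:
    """Copy the table while the source system stays in use."""
    rows = src.execute("SELECT id, payload FROM big_table").fetchall()
    tgt.executemany("INSERT INTO big_table VALUES (?, ?)", rows)


def replay_log(src: sqlite3.Connection, tgt: sqlite3.Connection) -> int:
    """Apply the logged activities to the target in their original order."""
    entries = src.execute(
        "SELECT op, id, payload FROM change_log ORDER BY seq").fetchall()
    for op, row_id, payload in entries:
        if op == "D":
            tgt.execute("DELETE FROM big_table WHERE id = ?", (row_id,))
        else:  # inserts and updates are both applied as an upsert
            tgt.execute("INSERT OR REPLACE INTO big_table VALUES (?, ?)",
                        (row_id, payload))
    src.execute("DELETE FROM change_log")
    return len(entries)


source, target = new_db(), new_db()
source.executemany("INSERT INTO big_table VALUES (?, ?)",
                   [(1, "a"), (2, "b"), (3, "c")])
source.executescript(TRIGGERS)   # logging starts before the initial copy
initial_copy(source, target)

# Activity on the source system after the initial copy.
source.execute("UPDATE big_table SET payload = 'B' WHERE id = 2")
source.execute("DELETE FROM big_table WHERE id = 3")
source.execute("INSERT INTO big_table VALUES (4, 'd')")

applied = replay_log(source, target)
print(applied, target.execute("SELECT * FROM big_table ORDER BY id").fetchall())
```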

IMIG process (uptime)

[Figure: IMIG process during uptime, between the source system and the temporary target system. Steps shown: Preparations (installation of the SAP Web AS temporary target system, import of the IMIG package); Initialization (creation of the IMIG trigger and log table); Export of the IMIG tables and import into the temporary target system (SAPinst); Update of the IMIG tables from the log table via RFC]

Major steps of the IMIG procedure

• Build up the target SAP basis system
• Choose the IMIG tables
• Installation of the IMIG package
• Initial copy of the IMIG tables
• Transfer of the recorded activities
• Shutdown of the source system
• Backup of the old system
• Reduced DROP of the target basis system
• Migration of the “rest”
• Backup of the new system
• Startup of the target SAP system

Hardware environment IMIG tests

Network between source and target: 100 Mbit

                  Source System        Target System
SAP release       SAP R/3 4.6c         SAP R/3 4.6c
Database          Oracle 8.1.7         Oracle 9iV2.02
Operating system  Tru64 UNIX 5.1b      HP-UX 11iV2
Server            GS160                RX5670
CPU               8x EV67-730MHz       4x Itanium2-1GHz
Memory            32 GB RAM            32 GB RAM
Storage           2x HSG80, 600 GB     1x VA7410, 2100 GB

Storage attachment labels: F/W FC direct SCSI attached

Disk Layout “Source System”

• Domain 1 (Disk 48/49): Executables, Redolog, Export target, …
• Domain 2 (Disk 32-38): Data, Index, Temp, Rollback, …

Overview db02 – database size

Selected tables for IMIG

ACCTIT 9.017.680 KB

LIPS 8.449.280 KB

VBAP 6.419.720 KB

VBFA 6.067.352 KB

VBRP 5.251.848 KB

MSEG 5.245.552 KB

BSIS 4.102.120 KB

VBPA 2.611.760 KB

VBEP 2.504.160 KB

ACCTCR 1.949.944 KB

VBUP 1.949.944 KB

BSID 1.356.552 KB

Export step

• Comparison of runtimes with and without parallel load
• Load generation with 50 SD benchmark users, 70 loops; the basic load was roughly 15% CPU utilization
• The largest table was the determining factor
• The export step ran with 12 processes
• The R3load processes consumed 40-50% CPU utilization
• ORACLE “snapshot too old” → tuning of rollback segments, undo TS

Export step

Comparison of export without parallel load and export with parallel load

Owner  Object  Type   KBytes     Runtime without load  Runtime with load  Difference (min:sec)
SAPR3  ACCTIT  TABLE  9.017.680  2:43:44               2:49:31            05:47
SAPR3  LIPS    TABLE  8.449.280  2:25:05               2:31:50            06:45
SAPR3  VBAP    TABLE  6.419.720  2:18:06               2:20:26            02:20
SAPR3  VBFA    TABLE  6.067.352  1:48:55               1:55:48            06:53
SAPR3  VBRP    TABLE  5.251.848  2:09:34               1:57:51            11:43
SAPR3  MSEG    TABLE  5.245.552  1:57:52               1:59:05            01:13
SAPR3  BSIS    TABLE  4.102.120  2:28:51               2:32:28            03:37
SAPR3  VBPA    TABLE  2.611.760  2:16:27               2:19:02            02:35
SAPR3  VBEP    TABLE  2.504.160  0:54:13               0:55:01            00:48
SAPR3  ACCTCR  TABLE  1.949.944  0:58:58               1:02:09            03:11
SAPR3  VBUP    TABLE  1.949.944  1:10:44               1:11:53            01:09
SAPR3  BSID    TABLE  1.356.552  1:24:52               1:22:50            02:02

Sum of all tables in KBytes: 54.925.912
Total export without load: 2:43:44 hours
Total export with load: 2:49:31 hours
Difference: 05:47 minutes, only about 4%
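The "about 4%" figure can be recomputed from the two total runtimes; the totals equal the runtimes of the largest table, ACCTIT, since the tables were exported in parallel. A small sketch with a hypothetical helper `to_seconds`:

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss runtime into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


without_load = to_seconds("2:43:44")
with_load = to_seconds("2:49:31")

delta = with_load - without_load
percent = delta / without_load * 100
print(f"difference: {delta // 60}:{delta % 60:02d} min, {percent:.1f}% longer")
```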

Import to the new system

• Copy the export files to the target system
• 12 processes with 60% CPU utilization for the fastest run

FS/RAW implementation tests with the data upload of the incremental migration from Tru64 to HP-UX

12 tables in parallel

Implementation                                                         Start     End       Runtime
Filesystem: bs=1K, mount options=default                               11:34:35  13:30:46  1:56:11
Filesystem: bs=8K, mount options=default                               16:40:34  18:35:35  1:55:01
Filesystem: bs=8K, mount options=default, IO slaves=8                  14:05:11  15:40:46  1:35:35
Filesystem: bs=8K, mount options=rw,suid,largefiles,nolog,nodatainlog,
  mincache=direct,convosync=direct, IO slaves=8                        16:44:41  18:36:22  1:51:41
RAW device: async IO                                                   11:45:53  13:00:34  1:14:41
RAW device: no async IO; 8 IO slaves                                   16:50:37  18:00:50  1:10:13
RAW device: async IO; 2 DB writer                                      13:50:12  15:05:11  1:14:59

Largest table ACCTIT

Implementation                                                         Start     End       Runtime
Filesystem: bs=8K, mount options=default, IO slaves=8                  16:08:16  16:37:24  0:29:08
RAW device: async IO                                                   13:16:35  13:44:21  0:27:46
RAW device: no async IO; 8 IO slaves                                   9:32:39   9:59:56   0:27:17
RAW device: async IO; 2 DB writer                                      15:15:32  15:43:51  0:28:19

Delta transfer - 1

• The SD benchmark generated 3 GB of delta data

Delta transfer summary runtimes

Without parallel inserts

# of batch processes  Start     Runtime (seconds)  Runtime (minutes)
4                     13:52:56  2.290              38,17
6                     13:32:07  1.873              31,22
7                     11:52:10  1.640              27,33
8                     17:14:06  1.692              28,20
12                    9:32:00   1.797              29,95

With parallel inserts (pi) through the SAP SD benchmark

# of batch processes     Start     Runtime (seconds)  Runtime (minutes)
4 with pi (50 SD user)   13:56:31  2.403              40,05
6 with pi (50 SD user)   16:18:59  1.946              32,43
7 with pi (50 SD user)   12:17:18  1.932              32,20
7 with pi (100 SD user)  12:57:59  2.227              37,12
8 with pi (50 SD user)   16:51:36  1.961              32,68

Delta transfer - 2

CPU            Without parallel load  With parallel load through    With parallel load through
                                      50 SD benchmark users         100 SD benchmark users
User + System  ca. 63%                ca. 75%                       ca. 88%
Oracle         ca. 42                 ca. 42                        ca. 42
SAP disp+work  ca. 17-22%             ca. 28-36%                    ca. 40-55%

Major findings

• The largest table is the determining factor
• Storage hardware is the limiting factor; I/O tuning might be necessary
• Export step: concurrent activities on the exported tables do not necessarily imply less throughput
• Delta transfer: additional app-server hardware for the conversion could help