Administration: Globalization Table of Contents

PUBLIC SAP IQ 16.0 SP08 Document Version: 2.0 - 2014-06-20 Administration: Globalization Table of Contents 1 About International Language Data.................................................4 1.1 International Languages and Character Sets............................................4 1.1.1 What is ICU, and when is it needed?........................................... 4 1.2 Character sets................................................................. 6 1.2.1 Code Pages in Windows....................................................6 1.2.2 How the Collation Sequence Sorts Characters....................................9 1.3 Collations....................................................................10 1.3.1 SQL Anywhere Collation Algorithm (SACA)......................................11 1.3.2 Unicode Collation Algorithm (UCA)...........................................12 1.3.3 Collations in a database...................................................14 1.3.4 Alternate collations...................................................... 15 1.3.5 Turkish character sets and collations..........................................18 1.4 Locales......................................................................21 1.4.1 Locale language.........................................................21 1.4.2 Locale character set......................................................21 1.5 Understanding Character Set Translation............................................. 22 1.5.1 Character Translation for Database Messages...................................22 1.5.2 Connection Strings and Character Sets........................................23 1.5.3 Avoiding Character Set Translation...........................................23 2 ANSI and OEM Code Pages......................................................25 2.1 ANSI ISO_1 Collation............................................................25 2.2 ANSI 1252LATIN1 Collation........................................................25 2.3 ANSI ISO1LATIN1 Collation........................................................25 2.4 ANSI ISO9LATIN1 Collation....................................................... 26 3 Displaying Collations...........................................................27 4 Multibyte Collations...........................................................28 4.1 Japanese Language Support...................................................... 28 4.2 Thai Language Support..........................................................28 5 Locale information............................................................ 29 6 Selecting a Collation to Support a Specific Locale.................................... 30 7 Choosing Character Sets and Collation Sequences to Maximize Performance................31 8 Setting a Locale.............................................................. 32 8.1 List of Language Label Values......................................................32 8.2 Setting the Locale for an INSERT...LOCATION Statement..................................33 PUBLIC Administration: Globalization 2 © 2014 SAP AG or an SAP affiliate company. All rights reserved. Table of Contents 9 Disabling Character Set Translation on a Database Server..............................35 10 Creating a Database with the Default Collation.......................................36 11 Changing the database collation..................................................37 12 Important Disclaimers on Legal Aspects............................................39 Administration: Globalization PUBLIC Table of Contents © 2014 SAP AG or an SAP affiliate company. All rights reserved. 3 1 About International Language Data When you create a database, you specify a collating sequence, or collation, to be used by the database. A collation is a combination of a character set and a sort order for characters in the database. 1.1 International Languages and Character Sets Configure your SAP® IQ installation to handle international language issues. The database collation sorts and compares all character data types in the database, including object names, such as table and column names. SAP IQ takes advantage of the space efficiency and speed of the SAP® SQL Anywhere® Collation Algorithm. ● The database option SORT_COLLATION allows implicit use of the SORTKEY function on ORDER BY expressions. When the value of this option is set to a valid collation name or collation ID, any string expression in the ORDER BY clause is automatically treated as if the SORTKEY function has been invoked. ● The SORTKEY function uses the International Components for Unicode (ICU) library, instead of the Sybase ® Unicode Infrastructure Library (Unilib ). Sort-key values created using a version of SAP IQ earlier than 15 do not contain the same values created using later versions. Regenerate any sort key values in your database that were generated using a version of SAP IQ earlier than 15. ● The CREATE DATABASE parameter COLLATION supports specification of a collation for a database. The collation of the database must match the collation used by the operating system. The default character set for CREATE DATABASE is ISO_BINENG. ● The CP874toUTF8 utility converts data in the CP874 character set into UTF8 collation, supported by SAP IQ for the Thai language. The CP874toUTF8 utility calls the ICU library to perform data conversion. You can also use this utility to load data in the CP874 character set without converting the data to UTF8. SAP IQ no longer supports custom collations. If you rebuild a database with a custom collation, rebuild in a single step to preserve the custom collation. If you unload the database and then load the schema and data into a database you create, you must use one of the supplied collations. For more information about changes to database collations, and a list of collations deprecated in SAP IQ 15, see New Features in SAP IQ 15.0. Use the iqunload utility to migrate to the current version of SAP IQ from an existing 12.7 database that was created with a deprecated collation. 1.1.1 What is ICU, and when is it needed? ICU, or International Components for Unicode, is an open source library developed and maintained by IBM. ICU facilitates software internationalization by providing Unicode support. Certain character set conversions and collation operations are implemented using ICU. PUBLIC Administration: Globalization 4 © 2014 SAP AG or an SAP affiliate company. All rights reserved. About International Language Data When is ICU needed on the database server? Ideally, ICU should always be available for use by the database server. The following table specifies when and why ICU is needed: ICU is needed when... Notes Unicode Collation Algorithm (UCA) is used as the colla UCA requires ICU. tion for the NCHAR or CHAR character set. The database character set is not UTF-8, but is a multi- For password conversion from the database character byte character set. set to UTF-8 (database passwords are stored in UTF-8, internally). The client and database character sets are different, Proper conversion to and from a multi-byte character and when either of them is multi-byte (including set requires ICU. UTF-8). This includes Unicode ODBC, OLE DB, ADO.NET, and SQL Anywhere JDBC applications, re gardless of the database character set where at least one of these clients does not have ICU. The database character set is not UTF-8 and conver The database server requires ICU to convert UTF-8 to sion between CHAR and NCHAR values is required. another character set. An Embedded SQL client uses an NCHAR character set The database server requires ICU to convert UTF-8 to other than UTF-8. another character set. The default Embedded SQL cli ent NCHAR character set is the same as the initial cli ent CHAR character set. This can be changed using the db_change_nchar_charset function. The CSCONVERT or SORTKEY functions are used. The Character set conversion o and from a multi-byte char CSCONVERT function is called to convert between acter set requires ICU. Sortkey generation for many character sets that conform to the requirements of the sortkey labels requires UCA, which, in turn, requires third point above. ICU. When can I get correct character set conversion on the database server without ICU? You can get correct character set conversion without ICU when both the database character set and client character set are single-byte and sqlany.cvf is available (all platforms), or if the operating system supports the conversion (Windows only). This is because single-byte to single-byte conversions can be processed without ICU if the sqlany.cvf file is available, or the host operating system has the appropriate converters installed. When is ICU needed on the client? For Unicode client applications, you are likely to get better combined client and database server performance when all clients have ICU installed, regardless of the database character set. This is because some of the required Administration: Globalization PUBLIC About International Language Data © 2014 SAP AG or an SAP affiliate company. All rights reserved. 5 conversion activity may be offloaded from the database server to the client, and because fewer conversions are required. Also, if you are using ODBC on Windows platforms, you must have ICU installed on the client, even for ANSI applications. This is because the driver manager converts ANSI ODBC calls to Unicode ODBC calls. 1.2 Character sets A character set is a set of symbols, including letters, digits, spaces, and other symbols. Each piece of software

Load more