Using International Character Sets with SAS® and Teradata Greg Otto, Teradata Corporation Austin Swift, Salman Maher, SAS Institute Inc
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Unicode and Code Page Support
Natural for Mainframes Unicode and Code Page Support Version 4.2.6 for Mainframes October 2009 This document applies to Natural Version 4.2.6 for Mainframes and to all subsequent releases. Specifications contained herein are subject to change and these changes will be reported in subsequent release notes or new editions. Copyright © Software AG 1979-2009. All rights reserved. The name Software AG, webMethods and all Software AG product names are either trademarks or registered trademarks of Software AG and/or Software AG USA, Inc. Other company and product names mentioned herein may be trademarks of their respective owners. Table of Contents 1 Unicode and Code Page Support .................................................................................... 1 2 Introduction ..................................................................................................................... 3 About Code Pages and Unicode ................................................................................ 4 About Unicode and Code Page Support in Natural .................................................. 5 ICU on Mainframe Platforms ..................................................................................... 6 3 Unicode and Code Page Support in the Natural Programming Language .................... 7 Natural Data Format U for Unicode-Based Data ....................................................... 8 Statements .................................................................................................................. 9 Logical -
Proposal for Generation Panel for Sinhala Script Label
Proposal for the Generation Panel for the Sinhala Script Label Generation Ruleset for the Root Zone 1 General information Sinhala belongs to the Indo-European language family with its roots deeply associated with Indo-Aryan sub family to which the languages such as Persian and Hindi belong. Although it is not very clear whether people in Sri Lanka spoke a dialect of Prakrit at the time of arrival of Buddhism in the island, there is enough evidence that Sinhala evolved from mixing of Sanskrit, Magadi (the language which was spoken in Magada Province of India where Lord Buddha was born) and local language which was spoken by people of Sri Lanka prior to the arrival of Vijaya, the founder of Sinhala Kingdom. It is also surmised that Sinhala had evolved from an ancient variant of Apabramsa (middle Indic) which is known as ‘Elu’. When tracing history of Elu, it was preceded by Hela or Pali Sihala. Sinhala though has close relationships with Indo Aryan languages which are spoken primarily in the north, north eastern and central India, was very much influenced by Tamil which belongs to the Dravidian family of languages. Though Sinhala is related closely to Indic languages, it also has its own unique characteristics: Sinhala has symbols for two vowels which are not found in any other Indic languages in India: ‘æ’ (ඇ) and ‘æ:’ (ඈ). Sinhala script had evolved from Southern Brahmi script from which almost all the Southern Indic Scripts such as Telugu and Oriya had evolved. Later Sinhala was influenced by Pallava Grantha writing of Southern India. -
Surviving the TEX Font Encoding Mess Understanding The
Surviving the TEX font encoding mess Understanding the world of TEX fonts and mastering the basics of fontinst Ulrik Vieth Taco Hoekwater · EuroT X ’99 Heidelberg E · FAMOUS QUOTE: English is useful because it is a mess. Since English is a mess, it maps well onto the problem space, which is also a mess, which we call reality. Similary, Perl was designed to be a mess, though in the nicests of all possible ways. | LARRY WALL COROLLARY: TEX fonts are mess, as they are a product of reality. Similary, fontinst is a mess, not necessarily by design, but because it has to cope with the mess we call reality. Contents I Overview of TEX font technology II Installation TEX fonts with fontinst III Overview of math fonts EuroT X ’99 Heidelberg 24. September 1999 3 E · · I Overview of TEX font technology What is a font? What is a virtual font? • Font file formats and conversion utilities • Font attributes and classifications • Font selection schemes • Font naming schemes • Font encodings • What’s in a standard font? What’s in an expert font? • Font installation considerations • Why the need for reencoding? • Which raw font encoding to use? • What’s needed to set up fonts for use with T X? • E EuroT X ’99 Heidelberg 24. September 1999 4 E · · What is a font? in technical terms: • – fonts have many different representations depending on the point of view – TEX typesetter: fonts metrics (TFM) and nothing else – DVI driver: virtual fonts (VF), bitmaps fonts(PK), outline fonts (PFA/PFB or TTF) – PostScript: Type 1 (outlines), Type 3 (anything), Type 42 fonts (embedded TTF) in general terms: • – fonts are collections of glyphs (characters, symbols) of a particular design – fonts are organized into families, series and individual shapes – glyphs may be accessed either by character code or by symbolic names – encoding of glyphs may be fixed or controllable by encoding vectors font information consists of: • – metric information (glyph metrics and global parameters) – some representation of glyph shapes (bitmaps or outlines) EuroT X ’99 Heidelberg 24. -
Database Globalization Support Guide
Oracle® Database Database Globalization Support Guide 19c E96349-05 May 2021 Oracle Database Database Globalization Support Guide, 19c E96349-05 Copyright © 2007, 2021, Oracle and/or its affiliates. Primary Author: Rajesh Bhatiya Contributors: Dan Chiba, Winson Chu, Claire Ho, Gary Hua, Simon Law, Geoff Lee, Peter Linsley, Qianrong Ma, Keni Matsuda, Meghna Mehta, Valarie Moore, Cathy Shea, Shige Takeda, Linus Tanaka, Makoto Tozawa, Barry Trute, Ying Wu, Peter Wallack, Chao Wang, Huaqing Wang, Sergiusz Wolicki, Simon Wong, Michael Yau, Jianping Yang, Qin Yu, Tim Yu, Weiran Zhang, Yan Zhu This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. -
Introduction to Unicode
Introduction to Unicode Presented by developerWorks, your source for great tutorials ibm.com/developerWorks Table of Contents If you're viewing this document online, you can click any of the topics below to link directly to that section. 1. Tutorial basics 2 2. Introduction 3 3. Unicode-based multilingual form development 5 4. Going interactive: Adding the Perl CGI script 10 5. Conclusions and resources 13 6. Feedback 15 Introduction to Unicode Page 1 Presented by developerWorks, your source for great tutorials ibm.com/developerWorks Section 1. Tutorial basics Is this tutorial right for you? This tutorial is for anyone who wants to understand the basics of Unicode-based multilingual Web page development. It is an introduction to how multilingual characters, the Unicode encoding standard, XML, and Perl CGI scripts can be integrated to produce a truly multilingual Web page. The tutorial explains the concepts behind this process and lays the groundwork for future development. Navigation Navigating through the tutorial is easy: * Use the Next and Previous buttons to move forward and backward. * Use the Menu button to return to the tutorial menu. * If you'd like to tell us what you think, use the Feedback button. * If you need help with the tutorial, use the Help button. What is required: Hardware and software needed No additional hardware is needed for this tutorial; however, in order to view some of the files online, Unicode fonts must be loaded. See Resources for information on Unicode fonts if you don't already have one loaded. (Note: Many Unicode fonts are still incomplete or in development, so one particular font may not contain every character; however, for the purposes of this tutorial, the Unicode font downloaded by the user will ideally contain all or most of the language characters used in our examples. -
Using International Characters in Bartender How to Read Data and Print Characters from Almost Every Language and Writing System in the World
Using International Characters in BarTender How to Read Data and Print Characters from Almost Every Language and Writing System in the World Supports the following BarTender software versions: BarTender 2016, BarTender 2019 WHITE PAPER Contents Overview 3 BarTender's Unicode Support 3 Understanding International Character Support 4 Fonts 4 Scripts 4 Asian Characters 5 Writing Systems that Read Right to Left 5 Entering International Text Into Text Objects 7 Using the Insert Symbols or Special Characters Dialog 7 Using the Windows Character Map 7 Using the Keyboard 8 Reading International Characters from a Database 9 Encoding International Characters in Barcodes 10 Using Unicode Data 10 ECI Protocol 11 Formatting According to Locale 12 About Data Types 12 Example 12 Appendix A: Languages, Encodings and Scripts 13 Appendix B: Configuring Windows 14 Installing Windows Input Support for International Languages 14 Windows Support for Additional Display Languages 15 Related Documentation 16 Overview With the built-in international character support that BarTender provides, you can add international characters to your design and encode them into barcodes or RFID tags on your template. When you design items for a single locale, you can configure the format (or appearance) of information that may vary between regions, such as date and time, currency, numbering, and so on. By using BarTender, you can also globalize your design and print international materials by inserting text in multiple languages. For example, the following item illustrates how BarTender can be used with many locales with its built-in Unicode support. BarTender's Unicode Support Unicode is a universal character encoding standard that defines the basis for the processing, storage and interchange of text data in any language. -
Unicode and Code Page Support
Natural Unicode and Code Page Support Version 8.2.4 November 2016 This document applies to Natural Version 8.2.4. Specifications contained herein are subject to change and these changes will be reported in subsequent release notes or new editions. Copyright © 1979-2016 Software AG, Darmstadt, Germany and/or Software AG USA, Inc., Reston, VA, USA, and/or its subsidiaries and/or its affiliates and/or their licensors. The name Software AG and all Software AG product names are either trademarks or registered trademarks of Software AG and/or Software AG USA, Inc. and/or its subsidiaries and/or its affiliates and/or their licensors. Other company and product names mentioned herein may be trademarks of their respective owners. Detailed information on trademarks and patents owned by Software AG and/or its subsidiaries is located at http://softwareag.com/licenses. Use of this software is subject to adherence to Software AG's licensing conditions and terms. These terms are part of the product documentation, located at http://softwareag.com/licenses/ and/or in the root installation directory of the licensed product(s). This software may include portions of third-party products. For third-party copyright notices, license terms, additional rights or re- strictions, please refer to "License Texts, Copyright Notices and Disclaimers of Third-Party Products". For certain specific third-party license restrictions, please refer to section E of the Legal Notices available under "License Terms and Conditions for Use of Software AG Products / Copyright and Trademark Notices of Software AG Products". These documents are part of the product documentation, located at http://softwareag.com/licenses and/or in the root installation directory of the licensed product(s). -
Windows NLS Considerations Version 2.1
Windows NLS Considerations version 2.1 Radoslav Rusinov [email protected] Windows NLS Considerations Contents 1. Introduction ............................................................................................................................................... 3 1.1. Windows and Code Pages .................................................................................................................... 3 1.2. CharacterSet ........................................................................................................................................ 3 1.3. Encoding Scheme ................................................................................................................................ 3 1.4. Fonts ................................................................................................................................................... 4 1.5. So Why Are There Different Charactersets? ........................................................................................ 4 1.6. What are the Difference Between 7 bit, 8 bit and Unicode Charactersets? ........................................... 4 2. NLS_LANG .............................................................................................................................................. 4 2.1. Setting the Character Set in NLS_LANG ............................................................................................ 4 2.2. Where is the Character Conversion Done? ......................................................................................... -
Character Encoding
$7.00 U.S. InsIde: Developing User Interface Standards INTERNATIONAL Plus! Who Owns the ® Data? SSpecpecTHE MULTIVALUE TEttCHNOLOGYrr umMumAGAZINE I MAR/APR 2011 Character Encoding Getting Ready for the World Stage intl-spectrum.com Advanced 6 Spec_Layout 1 2/14/11 2:22 PM Page 1 Advanced database technology for breakthrough applications This makes applications fly. Embed our post-relational database if you Caché eliminates the need for object-relational want your next application to have breakthrough mapping. Which can reduce your development features, run withC abclahzéing speed, be massively cycle by as much as 40%. scalable and require mi®nimal administration. Caché is available for all major platforms – InterSystems has advanced object and it supports MultiValue development. Caché is technology that makes it easier to build applica- deployed on more than 100,000 systems world- tions with XML, Web services, AJAX, Java, and .NET. wide, ranging from two to over 50,000 users. And Caché can run SQL up to 5 times faster than For over 30 years, we’ve provided advanced relational databases. ™ software technologies for breakthrough With its unique Unified Data Architecture , applications. Visit us at the International Spectrum Conference, April 4-7, 2011, West Palm Beach, Florida. InterSystems.com/Advanced6WW Download a free, fully functio©n 20a11l I,n tnerSoys-tetmism Corpeor-altiomn. Alil rtig hctso repseryve do. Inft eCrSyastecmhs Céach, éo is ra r ergiesteqredu treadsemta rikt o f oIntnerS yDsteVmsD Cor,p oaratti on. 2-11 Adv6Spec INTERNATIONAL ® SSpecpecTHE MULTIVALUE tt TErrCHNOLOGYumum MAGAZINE FEATURES I MARCH/APRIL 2011 Character Encoding It’s a small world and getting smaller, Business Tech: User Ownership of Data Gone are the days especially6 thanks to the Internet and 10 when the Data Processing department was both keeper and defender web-enabled applications. -
Using C-Kermit 2Nd Edition
<< Previous file Chapter 1 Introduction An ever-increasing amount of communication is electronic and digital: computers talking to computers Ð directly, over the telephone system, through networks. When you want two computers to communicate, it is usually for one of two reasons: to interact directly with another computer or to transfer data between the two computers. Kermit software gives you both these capabilities, and a lot more too. C-Kermit is a communications software program written in the C language. It is available for many different kinds of computers and operating systems, including literally hundreds of UNIX varieties (HP-UX, AIX, Solaris, IRIX, SCO, Linux, ...), Digital Equipment Corporation (Open)VMS, Microsoft Windows NT and 95, IBM OS/2, Stratus VOS, Data General AOS/VS, Microware OS-9, the Apple Macintosh, the Commodore Amiga, and the Atari ST. On all these platforms, C-Kermit's services include: • Connection establishment. This means making dialup modem connections or (in most cases) network connections, including TCP/IP Telnet or Rlogin, X.25, LAT, NET- BIOS, or other types of networks. For dialup connections, C-Kermit supports a wide range of modems and an extremely sophisticated yet easy-to-use dialing directory. And C-Kermit accepts incoming connections from other computers too. • Terminal sessions. An interactive terminal connection can be made to another com- puter via modem or network. The Windows 95, Windows NT, and OS/2 versions of C-Kermit also emulate specific types of terminals, such as the Digital Equipment Cor- poration VT320, the Wyse 60, or the ANSI terminal types used for accessing BBSs or PC UNIX consoles, with lots of extras such as scrollback, key mapping, printer con- trol, colors, and mouse shortcuts. -
[MS-VUVP-Diff]: VT-UTF8 and VT100+ Protocols
[MS-VUVP-Diff]: VT-UTF8 and VT100+ Protocols Intellectual Property Rights Notice for Open Specifications Documentation . Technical Documentation. Microsoft publishes Open Specifications documentation (“this documentation”) for protocols, file formats, data portability, computer languages, and standards as well as overviews of the interaction among each of these technologiessupport. Additionally, overview documents cover inter-protocol relationships and interactions. Copyrights. This documentation is covered by Microsoft copyrights. Regardless of any other terms that are contained in the terms of use for the Microsoft website that hosts this documentation, you maycan make copies of it in order to develop implementations of the technologies that are described in the Open Specifications this documentation and maycan distribute portions of it in your implementations usingthat use these technologies or in your documentation as necessary to properly document the implementation. You maycan also distribute in your implementation, with or without modification, any schema, IDL'sschemas, IDLs, or code samples that are included in the documentation. This permission also applies to any documents that are referenced in the Open Specifications. documentation. No Trade Secrets. Microsoft does not claim any trade secret rights in this documentation. Patents. Microsoft has patents that maymight cover your implementations of the technologies described in the Open Specifications. documentation. Neither this notice nor Microsoft's delivery of thethis documentation grants any licenses under those patents or any other Microsoft patents. However, a given Open Specification maySpecifications document might be covered by the Microsoft Open Specifications Promise or the Microsoft Community Promise. If you would prefer a written license, or if the technologies described in the Open Specificationsthis documentation are not covered by the Open Specifications Promise or Community Promise, as applicable, patent licenses are available by contacting [email protected]. -
Unicode Cree Syllabics for Windows and Macintosh
Unicode Cree Syllabics for Windows and Macintosh 37th Algonquian Conference, Ottawa 2005 Bill Jancewicz SIL International and Naskapi Development Corporation ABSTRACT Submitted as an update to a presentation made at the 34th Algonquian Conference (Kingston). The ongoing development of the operating systems has included increased support for cross-platform use of Unicode syllabic script. Key improvements that were included in Macintosh's OS X.3 (Panther) operating system now allow direct keyboarding of Unicode characters by means of a user-defined input method. A summary and comparison of the available tools for handling Unicode on both Windows and Macintosh will be discussed. INTRODUCTION Since 1988 the author has been working in the Naskapi language at Kawawachikamach with the primary purpose of linguistic analysis and Bible translation, sponsored by Wycliffe Bible Translators and SIL International. Related work includes mother tongue translator training, Naskapi literature and curriculum development. Along with the language work the author also developed methods of production for Naskapi language materials in syllabic script by means of the computer. With the advent of high quality publishing capabilities in newer computers, procedures were updated to keep pace with the improving technology. With resources available from SIL International computer services department, a very satisfactory system of keyboarding syllabic texts in Naskapi was developed. At the urging of colleagues working in related languages, the system was expanded to include the wider inventory of standard Eastern and Western Cree syllabics. While this has pushed the limit of what is possible with the current technology, Unicode makes this practical. Note that the system originally developed for keyboarding Naskapi syllabics is similar but not identical to the Cree system, because of the unique local orthography in use at Kawawachikamach.