Character Data Representation Architecture (CDRA)

Total Page:16

File Type:pdf, Size:1020Kb

Character Data Representation Architecture (CDRA) Character Data Representation Architecture Character Data Representation Architecture (CDRA) is an IBM architecture which defines a set of identifiers, resources and APIs for identifying character data and maintaining data integrity during data conversion processes. The reference publication provides the reader with the framework for the identifiers, descriptions of the supporting resources and APIs, as well as methods for data conversions. CDRA is applicable to country specific data encodings such as the ISO-8859 series as well as large character sets such as Unicode. Edition Notice: Updated in 2018, this document is a revised, on-line version of the original IBM publication. This version of the document contains many typographical and editorial corrections. A section on the Unicode Encoding Structure has been added in Appendix A. Also, Appendix K has been added to document the CDRA identifiers defined in support of Unicode. Likewise, Appendix J, has been updated with information on how to obtain CDRA conversion table resources on-line. Appendix L was added containing the EBCDIC control code definitions. Overview • Chapter 1. Introduction • Chapter 2. Architecture Strategy Architecture Definition • Chapter 3. CDRA Identifiers • Chapter 4. Services • Chapter 5. CDRA Interface Definitions • Chapter 6. Difference Management • Chapter 7. CDRA Resources and Their Management Appendices • Appendix A. Encoding Schemes • Appendix B. Conversion Methods • Appendix C. CCSID Repository • Appendix D. Platform Support of CDRA • Appendix E. Graphic Character Identification Copyright IBM Corp. 1990, 1995, 2018 • Appendix F. Character Sets and Code Pages • Appendix G. Control Character Mappings • Appendix H. CDRA and IBM i (Formerly OS/400) • Appendix I. DFSMS/MVS Considerations • Appendix J. CDRA Conversion Resources • Appendix K. CDRA and Unicode • Appendix L. EBCDIC control character definitions Resources • Glossary Copyright IBM Corp. 1990, 1995, 2018 Chapter 1. Introduction This chapter introduces readers to the Character Data Representation Architecture. The architecture objectives, challenges, coverage and concepts are presented to form the basis for understanding the following chapters. Definition of Character Data Representation Architecture Character Data Representation Architecture (CDRA) is an IBM* architecture that defines a set of identifiers, services, supporting resources, and conventions to achieve consistent representation, processing, and interchange of graphic character data in data processing environments. Objectives The overall objective of CDRA is to define a method of assigning and preserving the meaning and rendering of coded graphic characters through various stages of processing and interchange. The Objectives intro paragraph is followed by these paragraphs that you have already: • Define the necessary supporting services (such as tagging) to allow consistent support of various coded graphic character sets, both within and across environments • Select a minimum number of coded graphic character sets that satisfies most of today's text and data-processing applications and provide the necessary definitions for them so that they can be consistently supported in all applications, services, and devices within and across different environments Data integrity challenges Character Data Representation Architecture is used by products and systems to address the following data integrity challenges: • Proliferation of Different Character Codes The primary problem in handling coded character sets is the variety of sets of characters and encoding schemes used to represent them. Technology has increased the variety of applications using computers, and the coded character support for these applications has been provided without an overall strategy. • Graphic Character Data Recognition The abilities to properly distinguish graphic character data in a universal manner and to attach a tag to the data are available only in some specific architected environments. The available architected methods are often inconsistent, have lagged behind worldwide requirements, and are constrained by the supporting products. • Inconsistent or Incomplete Set of Identifiers Few applications have consistent character support. Operating system environments rarely provide the necessary services to identify coded character data, leaving this responsibility to the applications. • Use of Absolute Values (Hard-coding) Many applications were designed to operate in specific environments with specific terminal characteristics. The internal representations of frequently used characters have been coded using absolute values. Character data misinterpretation occurs when environment changes are made and the initial character processing functions are no longer valid. • Non-tagged Data Traditional data processing environments were closed systems, and the coded character support was primarily governed by the device character handling capabilities. Data was rarely tagged. The most visible end-user impact of all these concerns is in data interchange within and across system environments. For example: • The set of graphic characters supported on the IBM Personal Computer (PC) cannot be fully processed in non-PC environments. (Character set mismatch.) • When a "dollar symbol" is sent in text from a U.S. mainframe computer to a U.K. mainframe computer, it often appears as a "pound sterling symbol" to the U.K. user. (Conversion based on byte integrity.) • When a "lowercase a" is sent to a Katakana terminal in Japan, it appears as a "Katakana character" or gets irreversibly converted to an "uppercase A". • Application and device code page differences may lead to users entering characters from the keyboard that are different from those specified by the application. For example, in Switzerland, the programmer must key "y-acute capital" (Ý) and "dieresis" ( ¨ ) instead of "left square bracket" ( [ ) and "right square bracket" ( ] ), even though the bracket symbols are supported and are engraved on the Swiss keyboard. • The variety of existing code point conversion tables produces inconsistent and often unpredictable results between different environments. See Figure 1. Figure 1. Character data platform domains A character data domain may be described as an environment in which all character data has the same coded representation. This can be shown in a broad sense with respect to each category of system; midrange, mainframe, workstation and personal. Data domains may be the same within a system, or may differ within a system, but typically the view is one of a character data domain per system category. There are well known examples of the problems encountered as character data moves between data domains, leading to character data misrepresentation. Businesses may spend a significant portion of their information technology budgets circumventing, repairing, and educating to resolve the data integrity problems. Coverage of CDRA Character Data Representation Architecture defines: • An identification or tagging system to uniquely and reliably identify the representation of graphic character data • A set of portable Application Programming Interfaces (APIs) • A set of resources in support of the tags and services • A set of conventions on the use of the tags and services • A strategy for coded character set convergence. This coverage is depicted in Figure 2. Figure 2. Components of CDRA. CDRA components are categorized as the data identification mechanism, functions, resources and processing guidelines. Character data focus Data can be classified in many ways, such as character data, byte strings, integer numbers, or floating-point numbers. Character data is further classifiable into control character data and graphic character data. Control characters include, for example, Horizontal Tab and Line Feed, which perform specific functions. Graphic characters include uppercase and lowercase letters (with and without accent marks), numeric digits 0 to 9, ideographs, and other symbols. Graphic character data streams can include embedded code extension controls, such as Shift-Out or Shift-In, used in the interpretation of the data following the controls. Figure 3 shows an example of these classifications. CDRA deals with character data; primarily with graphic character data, and to a nominal extent with control character data. Figure 3. Types of data in a string. Various types of data may be contained in a data string. CDRA focuses on the coded graphic character data. Support for control functions Control functions, as defined by either single control characters or sequences of code points, can appear intermixed with graphic character data. From the CDRA point of view, there are two categories of control functions: • Code extension These functions modify the interpretation of subsequent code points representing graphic characters. Examples are: Shift-Out (SO), Shift-In (SI), Single-Shift 2 (SS2). • All other control functions Applications or architectures are responsible for handling specific control functions. CDRA provides an interface to query a set of control character encodings and uses the code point assignments for SPACE and SUB in its difference management functions. CDRA conversion methods and functions support the concept of string types to handle space-padded and null-terminated strings. All other aspects of control functions are outside the scope of CDRA. Architecture concepts Tagging Tagging is the primary method to identify the meaning and rendering of coded graphic characters.
Recommended publications
  • Cobol/Cobol Database Interface.Htm Copyright © Tutorialspoint.Com
    CCOOBBOOLL -- DDAATTAABBAASSEE IINNTTEERRFFAACCEE http://www.tutorialspoint.com/cobol/cobol_database_interface.htm Copyright © tutorialspoint.com As of now, we have learnt the use of files in COBOL. Now, we will discuss how a COBOL program interacts with DB2. It involves the following terms: Embedded SQL DB2 Application Programming Host Variables SQLCA SQL Queries Cursors Embedded SQL Embedded SQL statements are used in COBOL programs to perform standard SQL operations. Embedded SQL statements are preprocessed by SQL processor before the application program is compiled. COBOL is known as the Host Language. COBOL-DB2 applications are those applications that include both COBOL and DB2. Embedded SQL statements work like normal SQL statements with some minor changes. For example, that output of a query is directed to a predefined set of variables which are referred as Host Variables. An additional INTO clause is placed in the SELECT statement. DB2 Application Programming Following are rules to be followed while coding a COBOL-DB2 program: All the SQL statements must be delimited between EXEC SQL and END-EXEC. SQL statements must be coded in Area B. All the tables that are used in a program must be declared in the Working-Storage Section. This is done by using the INCLUDE statement. All SQL statements other than INCLUDE and DECLARE TABLE must appear in the Procedure Division. Host Variables Host variables are used for receiving data from a table or inserting data in a table. Host variables must be declared for all values that are to be passed between the program and the DB2. They are declared in the Working-Storage Section.
    [Show full text]
  • Package 'Pinsplus'
    Package ‘PINSPlus’ August 6, 2020 Encoding UTF-8 Type Package Title Clustering Algorithm for Data Integration and Disease Subtyping Version 2.0.5 Date 2020-08-06 Author Hung Nguyen, Bang Tran, Duc Tran and Tin Nguyen Maintainer Hung Nguyen <[email protected]> Description Provides a robust approach for omics data integration and disease subtyping. PIN- SPlus is fast and supports the analysis of large datasets with hundreds of thousands of sam- ples and features. The software automatically determines the optimal number of clus- ters and then partitions the samples in a way such that the results are ro- bust against noise and data perturbation (Nguyen et.al. (2019) <DOI: 10.1093/bioinformat- ics/bty1049>, Nguyen et.al. (2017)<DOI: 10.1101/gr.215129.116>). License LGPL Depends R (>= 2.10) Imports foreach, entropy , doParallel, matrixStats, Rcpp, RcppParallel, FNN, cluster, irlba, mclust RoxygenNote 7.1.0 Suggests knitr, rmarkdown, survival, markdown LinkingTo Rcpp, RcppArmadillo, RcppParallel VignetteBuilder knitr NeedsCompilation yes Repository CRAN Date/Publication 2020-08-06 21:20:02 UTC R topics documented: PINSPlus-package . .2 AML2004 . .2 KIRC ............................................3 PerturbationClustering . .4 SubtypingOmicsData . .9 1 2 AML2004 Index 13 PINSPlus-package Perturbation Clustering for data INtegration and disease Subtyping Description This package implements clustering algorithms proposed by Nguyen et al. (2017, 2019). Pertur- bation Clustering for data INtegration and disease Subtyping (PINS) is an approach for integraton of data and classification of diseases into various subtypes. PINS+ provides algorithms support- ing both single data type clustering and multi-omics data type. PINSPlus is an improved version of PINS by allowing users to customize the based clustering algorithm and perturbation methods.
    [Show full text]
  • Base64 Character Encoding and Decoding Modeling
    Base64 Character Encoding and Decoding Modeling Isnar Sumartono1, Andysah Putera Utama Siahaan2, Arpan3 Faculty of Computer Science,Universitas Pembangunan Panca Budi Jl. Jend. Gatot Subroto Km. 4,5 Sei Sikambing, 20122, Medan, Sumatera Utara, Indonesia Abstract: Security is crucial to maintaining the confidentiality of the information. Secure information is the information should not be known to the unreliable person, especially information concerning the state and the government. This information is often transmitted using a public network. If the data is not secured in advance, would be easily intercepted and the contents of the information known by the people who stole it. The method used to secure data is to use a cryptographic system by changing plaintext into ciphertext. Base64 algorithm is one of the encryption processes that is ideal for use in data transmission. Ciphertext obtained is the arrangement of the characters that have been tabulated. These tables have been designed to facilitate the delivery of data during transmission. By applying this algorithm, errors would be avoided, and security would also be ensured. Keywords: Base64, Security, Cryptography, Encoding I. INTRODUCTION Security and confidentiality is one important aspect of an information system [9][10]. The information sent is expected to be well received only by those who have the right. Information will be useless if at the time of transmission intercepted or hijacked by an unauthorized person [7]. The public network is one that is prone to be intercepted or hijacked [1][2]. From time to time the data transmission technology has developed so rapidly. Security is necessary for an organization or company as to maintain the integrity of the data and information on the company.
    [Show full text]
  • About:Config .Init About:Me About:Presentation Web 2.0 User
    about:config .init about:me about:presentation Web 2.0 User View Technical View Simple Overview Picture Complex Overview Picture Main Problems Statistics I Statistics II .next Targets Of Attack Targets Methods Kinds Of Session Hijacking SQL Injection Introduction Examples SQL Injection Picture Analysis SQL Escaping SQL Escaping #2 SQL Parameter Binding XSS Introduction What Can It Do? Main Problem Types Of XSS Components Involved In XSS Reflected XSS Picture Reflected XSS Analysis (Server Side) Stored XSS Picture Server Side Stored XSS (Local) DOM XSS Example Picture local DOM XSS CSRF Introduction Example Picture CSRF Session Riding Analysis Complex Example Hijack Via DNS + XSS Picture DNS+XSS Combo Cookie Policy Analysis Variant Components .next Misplaced Trust 3rd Party Script Picture Trust 3rd Party Script Analysis Misplaced Trust In Middleware Misplaced Trust In Server­Local Data Picture Local Scripts Analysis Same Origin Policy Frame Policy UI Redressing Introduction Clickjacking Picture Clickjacking Analysis BREAK .next Summary of Defense Strategies "Best Effort" vs. "Best Security" Protection against Hijacking Session Theft Riding, Fixation, Prediction Separate by Trust Validation Why Input Validation at Server Check Origin and Target of Request Validation of Form Fields Validation of File Upload Validation Before Forwarding Validation of Server Output Validation of Target in Client Validation of Origin in Client Validation of Input in Client Normalization What's That? Normalizing HTML Normalizing XHTML Normalizing Image, Audio,
    [Show full text]
  • Modern Programming Languages CS508 Virtual University of Pakistan
    Modern Programming Languages (CS508) VU Modern Programming Languages CS508 Virtual University of Pakistan Leaders in Education Technology 1 © Copyright Virtual University of Pakistan Modern Programming Languages (CS508) VU TABLE of CONTENTS Course Objectives...........................................................................................................................4 Introduction and Historical Background (Lecture 1-8)..............................................................5 Language Evaluation Criterion.....................................................................................................6 Language Evaluation Criterion...................................................................................................15 An Introduction to SNOBOL (Lecture 9-12).............................................................................32 Ada Programming Language: An Introduction (Lecture 13-17).............................................45 LISP Programming Language: An Introduction (Lecture 18-21)...........................................63 PROLOG - Programming in Logic (Lecture 22-26) .................................................................77 Java Programming Language (Lecture 27-30)..........................................................................92 C# Programming Language (Lecture 31-34) ...........................................................................111 PHP – Personal Home Page PHP: Hypertext Preprocessor (Lecture 35-37)........................129 Modern Programming Languages-JavaScript
    [Show full text]
  • Consonant Characters and Inherent Vowels
    Global Design: Characters, Language, and More Richard Ishida W3C Internationalization Activity Lead Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 1 Getting more information W3C Internationalization Activity http://www.w3.org/International/ Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 2 Outline Character encoding: What's that all about? Characters: What do I need to do? Characters: Using escapes Language: Two types of declaration Language: The new language tag values Text size Navigating to localized pages Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 3 Character encoding Character encoding: What's that all about? Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 4 Character encoding The Enigma Photo by David Blaikie Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 5 Character encoding Berber 4,000 BC Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 6 Character encoding Tifinagh http://www.dailymotion.com/video/x1rh6m_tifinagh_creation Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 7 Character encoding Character set Character set ⴰ ⴱ ⴲ ⴳ ⴴ ⴵ ⴶ ⴷ ⴸ ⴹ ⴺ ⴻ ⴼ ⴽ ⴾ ⴿ ⵀ ⵁ ⵂ ⵃ ⵄ ⵅ ⵆ ⵇ ⵈ ⵉ ⵊ ⵋ ⵌ ⵍ ⵎ ⵏ ⵐ ⵑ ⵒ ⵓ ⵔ ⵕ ⵖ ⵗ ⵘ ⵙ ⵚ ⵛ ⵜ ⵝ ⵞ ⵟ ⵠ ⵢ ⵣ ⵤ ⵥ ⵯ Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 8 Character encoding Coded character set 0 1 2 3 0 1 Coded character set 2 3 4 5 6 7 8 9 33 (hexadecimal) A B 52 (decimal) C D E F Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 9 Character encoding Code pages ASCII Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 10 Character encoding Code pages ISO 8859-1 (Latin 1) Western Europe ç (E7) Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 11 Character encoding Code pages ISO 8859-7 Greek η (E7) Copyright © 2005 W3C (MIT, ERCIM, Keio) slide 12 Character encoding Double-byte characters Standard Country No.
    [Show full text]
  • Unicode Ate My Brain
    UNICODE ATE MY BRAIN John Cowan Reuters Health Information Copyright 2001-04 John Cowan under GNU GPL 1 Copyright • Copyright © 2001 John Cowan • Licensed under the GNU General Public License • ABSOLUTELY NO WARRANTIES; USE AT YOUR OWN RISK • Portions written by Tim Bray; used by permission • Title devised by Smarasderagd; used by permission • Black and white for readability Copyright 2001-04 John Cowan under GNU GPL 2 Abstract Unicode, the universal character set, is one of the foundation technologies of XML. However, it is not as widely understood as it should be, because of the unavoidable complexity of handling all of the world's writing systems, even in a fairly uniform way. This tutorial will provide the basics about using Unicode and XML to save lots of money and achieve world domination at the same time. Copyright 2001-04 John Cowan under GNU GPL 3 Roadmap • Brief introduction (4 slides) • Before Unicode (16 slides) • The Unicode Standard (25 slides) • Encodings (11 slides) • XML (10 slides) • The Programmer's View (27 slides) • Points to Remember (1 slide) Copyright 2001-04 John Cowan under GNU GPL 4 How Many Different Characters? a A à á â ã ä å ā ă ą a a a a a a a a a a a Copyright 2001-04 John Cowan under GNU GPL 5 How Computers Do Text • Characters in computer storage are represented by “small” numbers • The numbers use a small number of bits: from 6 (BCD) to 21 (Unicode) to 32 (wchar_t on some Unix boxes) • Design choices: – Which numbers encode which characters – How to pack the numbers into bytes Copyright 2001-04 John Cowan under GNU GPL 6 Where Does XML Come In? • XML is a textual data format • XML software is required to handle all commercially important characters in the world; a promise to “handle XML” implies a promise to be international • Applications can do what they want; monolingual applications can mostly ignore internationalization Copyright 2001-04 John Cowan under GNU GPL 7 $$$ £££ ¥¥¥ • Extra cost of building-in internationalization to a new computer application: about 20% (assuming XML and Unicode).
    [Show full text]
  • SUPPORTING the CHINESE, JAPANESE, and KOREAN LANGUAGES in the OPENVMS OPERATING SYSTEM by Michael M. T. Yau ABSTRACT the Asian L
    SUPPORTING THE CHINESE, JAPANESE, AND KOREAN LANGUAGES IN THE OPENVMS OPERATING SYSTEM By Michael M. T. Yau ABSTRACT The Asian language versions of the OpenVMS operating system allow Asian-speaking users to interact with the OpenVMS system in their native languages and provide a platform for developing Asian applications. Since the OpenVMS variants must be able to handle multibyte character sets, the requirements for the internal representation, input, and output differ considerably from those for the standard English version. A review of the Japanese, Chinese, and Korean writing systems and character set standards provides the context for a discussion of the features of the Asian OpenVMS variants. The localization approach adopted in developing these Asian variants was shaped by business and engineering constraints; issues related to this approach are presented. INTRODUCTION The OpenVMS operating system was designed in an era when English was the only language supported in computer systems. The Digital Command Language (DCL) commands and utilities, system help and message texts, run-time libraries and system services, and names of system objects such as file names and user names all assume English text encoded in the 7-bit American Standard Code for Information Interchange (ASCII) character set. As Digital's business began to expand into markets where common end users are non-English speaking, the requirement for the OpenVMS system to support languages other than English became inevitable. In contrast to the migration to support single-byte, 8-bit European characters, OpenVMS localization efforts to support the Asian languages, namely Japanese, Chinese, and Korean, must deal with a more complex issue, i.e., the handling of multibyte character sets.
    [Show full text]
  • ISO Basic Latin Alphabet
    ISO basic Latin alphabet The ISO basic Latin alphabet is a Latin-script alphabet and consists of two sets of 26 letters, codified in[1] various national and international standards and used widely in international communication. The two sets contain the following 26 letters each:[1][2] ISO basic Latin alphabet Uppercase Latin A B C D E F G H I J K L M N O P Q R S T U V W X Y Z alphabet Lowercase Latin a b c d e f g h i j k l m n o p q r s t u v w x y z alphabet Contents History Terminology Name for Unicode block that contains all letters Names for the two subsets Names for the letters Timeline for encoding standards Timeline for widely used computer codes supporting the alphabet Representation Usage Alphabets containing the same set of letters Column numbering See also References History By the 1960s it became apparent to thecomputer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin script in their (ISO/IEC 646) 7-bit character-encoding standard. To achieve widespread acceptance, this encapsulation was based on popular usage. The standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, which included in the character set the 26 × 2 letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 8859 (8-bit character encoding) and ISO/IEC 10646 (Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin script with extensions to handle other letters in other languages.[1] Terminology Name for Unicode block that contains all letters The Unicode block that contains the alphabet is called "C0 Controls and Basic Latin".
    [Show full text]
  • PCL PC-8, Code Page 437 Page 1 of 5 PCL PC-8, Code Page 437
    PCL PC-8, Code Page 437 Page 1 of 5 PCL PC-8, Code Page 437 PCL Symbol Set: 10U Unicode glyph correspondence tables. Contact:[email protected] http://pcl.to -- -- -- -- $90 U00C9 Ê Uppercase e acute $21 U0021 Ë Exclamation $91 U00E6 Ì Lowercase ae diphthong $22 U0022 Í Neutral double quote $92 U00C6 Î Uppercase ae diphthong $23 U0023 Ï Number $93 U00F4 & Lowercase o circumflex $24 U0024 ' Dollar $94 U00F6 ( Lowercase o dieresis $25 U0025 ) Per cent $95 U00F2 * Lowercase o grave $26 U0026 + Ampersand $96 U00FB , Lowercase u circumflex $27 U0027 - Neutral single quote $97 U00F9 . Lowercase u grave $28 U0028 / Left parenthesis $98 U00FF 0 Lowercase y dieresis $29 U0029 1 Right parenthesis $99 U00D6 2 Uppercase o dieresis $2A U002A 3 Asterisk $9A U00DC 4 Uppercase u dieresis $2B U002B 5 Plus $9B U00A2 6 Cent sign $2C U002C 7 Comma, decimal separator $9C U00A3 8 Pound sterling $2D U002D 9 Hyphen $9D U00A5 : Yen sign $2E U002E ; Period, full stop $9E U20A7 < Pesetas $2F U002F = Solidus, slash $9F U0192 > Florin sign $30 U0030 ? Numeral zero $A0 U00E1 ê Lowercase a acute $31 U0031 A Numeral one $A1 U00ED B Lowercase i acute $32 U0032 C Numeral two $A2 U00F3 D Lowercase o acute $33 U0033 E Numeral three $A3 U00FA F Lowercase u acute $34 U0034 G Numeral four $A4 U00F1 H Lowercase n tilde $35 U0035 I Numeral five $A5 U00D1 J Uppercase n tilde $36 U0036 K Numeral six $A6 U00AA L Female ordinal (a) http://www.pclviewer.com (c) RedTitan Technology 2005 PCL PC-8, Code Page 437 Page 2 of 5 $37 U0037 M Numeral seven $A7 U00BA N Male ordinal (o) $38 U0038
    [Show full text]
  • Unicode and Code Page Support
    Natural for Mainframes Unicode and Code Page Support Version 4.2.6 for Mainframes October 2009 This document applies to Natural Version 4.2.6 for Mainframes and to all subsequent releases. Specifications contained herein are subject to change and these changes will be reported in subsequent release notes or new editions. Copyright © Software AG 1979-2009. All rights reserved. The name Software AG, webMethods and all Software AG product names are either trademarks or registered trademarks of Software AG and/or Software AG USA, Inc. Other company and product names mentioned herein may be trademarks of their respective owners. Table of Contents 1 Unicode and Code Page Support .................................................................................... 1 2 Introduction ..................................................................................................................... 3 About Code Pages and Unicode ................................................................................ 4 About Unicode and Code Page Support in Natural .................................................. 5 ICU on Mainframe Platforms ..................................................................................... 6 3 Unicode and Code Page Support in the Natural Programming Language .................... 7 Natural Data Format U for Unicode-Based Data ....................................................... 8 Statements .................................................................................................................. 9 Logical
    [Show full text]
  • Lecture 2: Variables and Primitive Data Types
    Lecture 2: Variables and Primitive Data Types MIT-AITI Kenya 2005 1 In this lecture, you will learn… • What a variable is – Types of variables – Naming of variables – Variable assignment • What a primitive data type is • Other data types (ex. String) MIT-Africa Internet Technology Initiative 2 ©2005 What is a Variable? • In basic algebra, variables are symbols that can represent values in formulas. • For example the variable x in the formula f(x)=x2+2 can represent any number value. • Similarly, variables in computer program are symbols for arbitrary data. MIT-Africa Internet Technology Initiative 3 ©2005 A Variable Analogy • Think of variables as an empty box that you can put values in. • We can label the box with a name like “Box X” and re-use it many times. • Can perform tasks on the box without caring about what’s inside: – “Move Box X to Shelf A” – “Put item Z in box” – “Open Box X” – “Remove contents from Box X” MIT-Africa Internet Technology Initiative 4 ©2005 Variables Types in Java • Variables in Java have a type. • The type defines what kinds of values a variable is allowed to store. • Think of a variable’s type as the size or shape of the empty box. • The variable x in f(x)=x2+2 is implicitly a number. • If x is a symbol representing the word “Fish”, the formula doesn’t make sense. MIT-Africa Internet Technology Initiative 5 ©2005 Java Types • Integer Types: – int: Most numbers you’ll deal with. – long: Big integers; science, finance, computing. – short: Small integers.
    [Show full text]