Character Sets Trigraph Sequences
Subject: Information Technology
Paper: Object Oriented Concepts & Programming
Module: Overview of C++ (Module 3)

Every programming language, like any natural language, has a basic character set on which the entire language is built. C++ has the same character set as C.

Character sets

The basic source character set consists of 96 characters: the space character, the control characters representing horizontal tab, vertical tab, form feed, and new-line, plus the following 91 graphical characters:

    a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    0 1 2 3 4 5 6 7 8 9
    _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '

Trigraph Sequences

Some characters from the C and C++ character set are not available in all environments. We can enter these characters into a C or C++ source program using a sequence of three characters called a trigraph. The trigraph sequences are:

    Trigraph   Character   Name
    ??=        #           pound sign
    ??(        [           left bracket
    ??)        ]           right bracket
    ??<        {           left brace
    ??>        }           right brace
    ??/        \           backslash
    ??'        ^           caret
    ??!        |           vertical bar
    ??-        ~           tilde

Each trigraph sequence is replaced by the corresponding single character early in translation, before preprocessing directives are processed. Trigraphs were removed from C++ in the C++17 standard, so they apply only to C and to earlier C++ standards.

Escape Sequences

Nonprintable characters can be represented by an escape sequence. Escape sequences are primarily used to put nonprintable characters in character and string literals. For example, you can use escape sequences to put such characters as tab, carriage return, and backspace into an output stream. The escape sequences and the characters they represent are:

    Escape Sequence   Character Represented
    \a                Alert (bell, alarm)
    \b                Backspace
    \f                Form feed (new page)
    \n                New-line
    \r                Carriage return
    \t                Horizontal tab
    \v                Vertical tab
    \'                Single quotation mark
    \"                Double quotation mark
    \?                Question mark
    \\                Backslash

Note: The line continuation sequence (\ followed by a new-line character) is not an escape sequence.
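The escape sequences in the table above, and the line continuation sequence mentioned in the note, can be sketched as follows. This is a minimal illustration only; the function names backslashText, tabbedRow and continuedText are hypothetical and do not come from the module.

```cpp
#include <string>

// "\\" is the escape sequence for a single backslash, so the returned
// text contains a backslash followed by the letter n, not a new-line.
std::string backslashText()
{
    return "The escape sequence \\n.";
}

// "\t" is a horizontal tab and "\n" is a new-line: two fields on one row.
std::string tabbedRow()
{
    return "name\tvalue\n";
}

// The backslash that ends the first source line of the literal is a line
// continuation, not an escape sequence: both source lines form one literal.
std::string continuedText()
{
    return "one long \
string";
}
```

Calling continuedText() yields the single string "one long string", because the second source line is joined to the first before the literal is read. The last function is the only one that uses the line continuation sequence.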
The line continuation sequence is used in string literals to indicate that the current line continues on the next line.

We can use escape sequences only in character constants or in string literals. A diagnostic message is typically issued if an escape sequence is not recognized.

In string and character literals, when you want the backslash to represent itself (rather than the beginning of an escape sequence), you must use the \\ backslash escape sequence. For example:

    cout << "The escape sequence \\n." << endl;

This statement results in the following output:

    The escape sequence \n.

Alternative representations of operators and punctuators

In addition to the reserved language keywords, the following alternative representations of operators and punctuators are also reserved in C++. In C, the symbol forms (such as <% and %:) are part of the language, while the word forms (such as and and bitor) are provided as macros by the header <iso646.h>.

    Alternative      Operator /    Alternative      Operator /    Alternative      Operator /
    Representation   Punctuator    Representation   Punctuator    Representation   Punctuator
    <%               {             and              &&            and_eq           &=
    %>               }             bitor            |             or_eq            |=
    <:               [             or               ||            xor_eq           ^=
    :>               ]             xor              ^             not              !
    %:               #             compl            ~             not_eq           !=
    %:%:             ##            bitand           &

Dr. Jyoti Pareek, Object Oriented Concepts & Programming, Module 3 - Overview of C++

Wide Characters

The C and C++ standard libraries include a number of facilities for dealing with wide characters and strings composed of them. Wide characters are defined using the data type wchar_t, which in the original C90 standard was defined as "an integral type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales" (ISO 9899:1990 §4.1.5).

Both C and C++ introduced the fixed-size character types char16_t and char32_t in the 2011 revisions of their respective standards to provide unambiguous representation of 16-bit and 32-bit Unicode transformation formats, leaving wchar_t implementation-defined.

The ISO/IEC 10646:2003 Unicode standard 4.0 says that: "The width of wchar_t is compiler-specific and can be as small as 8 bits.
Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers."

Identifiers

An identifier is the user-defined name of a program element. In C++, identifiers provide names for the following language elements:

    Functions
    Objects
    Labels
    Function parameters
    Macros and macro parameters
    Typedefs
    Enumerated types and enumerators
    Struct and union names
    C++ Classes and class members
    C++ Templates
    C++ Template parameters
    C++ Namespaces

An identifier starts with a letter (A to Z or a to z) or an underscore (_), followed by zero or more letters, underscores, and digits (0 to 9). In C++ there is no limit on the length of an identifier, but implementers often impose one.

Keywords

Keywords are identifiers reserved by the language for special use. The following keywords are common to both C and C++:

    auto       do        goto       short     typedef
    break      double    if         signed    union
    case       else      inline     sizeof    unsigned
    char       enum      int        static    void
    const      extern    long       struct    volatile
    continue   float     register   switch    while
    default    for       return

Only the exact spelling of keywords is reserved. For example, do is reserved but DO is not.

The C++ language also reserves the following keywords:

    asm            friend             template
    bool           mutable            this
    catch          namespace          throw
    class          new                true
    const_cast     operator           try
    delete         private            typeid
    dynamic_cast   protected          typename
    explicit       public             using
    export         reinterpret_cast   virtual
    false          static_cast        wchar_t

We will see their use in successive modules.

Data Types

A data type defines how the values used and created by a program are stored. Defining a data type means defining what values (generally in terms of range) a variable of that data type can store and what operations can be performed on those values.
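The two aspects of a data type described above, the range of values and the permitted operations, can be sketched with the built-in types int and double. This is an illustrative sketch; the function names are hypothetical, and the actual range of int depends on the implementation.

```cpp
#include <limits>

// Range: the smallest and largest values an int can store
// on the implementation that compiles this code.
int smallestInt() { return std::numeric_limits<int>::min(); }
int largestInt()  { return std::numeric_limits<int>::max(); }

// Operations: the same operator behaves differently per type.
// Integer division discards the fractional part ...
int divideAsInt(int a, int b) { return a / b; }

// ... while floating-point division keeps it.
double divideAsDouble(double a, double b) { return a / b; }
```

For example, divideAsInt(7, 2) gives 3, whereas divideAsDouble(7.0, 2.0) gives 3.5.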
A data type can be built-in or abstract. A built-in data type is one that the compiler intrinsically understands. It is predefined and known to the compiler, so we do not need to define it; it is available to us for use. As C++ is largely a superset of C, most of the built-in data types are inherited unchanged in C++, and they can be used as we use them in C. The basic scalar data types, namely int, float and double, along with their modifiers signed, unsigned, short and long, can be used as we use them in C. Pointers in C++ can also be used the way we use them in C, with a few additions that we will discuss in the next module.

Following is the complete list of fundamental data types in C++:

    Group                      Type                      Size / precision
    Character types            char                      Exactly one byte in size. At least 8 bits.
                               char16_t                  Not smaller than char. At least 16 bits.
                               char32_t                  Not smaller than char16_t. At least 32 bits.
                               wchar_t                   Can represent the largest supported character set.
    Integer types (signed)     signed char               Same size as char. At least 8 bits.
                               signed short int          Not smaller than char. At least 16 bits.
                               signed int                Not smaller than short. At least 16 bits.
                               signed long int           Not smaller than int. At least 32 bits.
                               signed long long int      Not smaller than long. At least 64 bits.
    Integer types (unsigned)   unsigned char             (same size as their signed counterparts)
                               unsigned short int
                               unsigned int
                               unsigned long int
                               unsigned long long int
    Floating-point types       float
                               double                    Precision not less than float
                               long double               Precision not less than double
    Void type                  void                      no storage
    Null pointer               decltype(nullptr)

New Data types in C++

In addition to the built-in data types inherited from C, another data type, Boolean, was added to standard C++. The keyword used for this data type is bool. Variables of the bool data type can have two possible values: the built-in constants true (which converts to integer 1) and false (which converts to integer 0).
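The conversions between bool and integer values described above can be sketched as follows. The function names asInteger and asBool are hypothetical and are used only for illustration.

```cpp
// Converting a bool to an integer: true becomes 1 and false becomes 0.
int asInteger(bool flag)
{
    return flag;   // implicit bool-to-int conversion
}

// Converting an integer to a bool: zero becomes false,
// any non-zero value becomes true.
bool asBool(int value)
{
    return value;  // implicit int-to-bool conversion
}
```

For instance, asInteger(true) gives 1, and asBool(42) gives true even though 42 is not 1.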
Example:

    bool found;

It can be assigned two values:

    found = true;

or

    found = false;

Before bool became part of standard C++, programmers used different techniques to produce bool-like behaviour, such as using int to mimic Boolean behaviour, but this could introduce subtle errors. Because a lot of existing code used int in place of bool, the C++ compiler was made to implicitly convert integers to the bool constants true and false, though ideally the compiler should give a warning in this case.

Note: The use of integer values as Booleans is poor programming style in C++.

Advantage of using Boolean Data Type

Using bool conveys intent: a bool value is unambiguously true or false, while an integer value can take on many more states. This ambiguity could contribute to errors when code is maintained.

Abstract Data type

C++ also allows us to create our own data types, which with careful crafting can be made to look and behave much like the standard data types. These are also called abstract data types. We will discuss abstract data types in detail in module 5.

Variable Declaration in C++

C and C++ are block-structured programming languages. The scope and lifetime of a variable are determined by the block in which it is defined. The only difference is that in C (as defined by C90) variables can only be defined at the beginning of a block, whereas in C++ variables can be defined anywhere in the block, with the one restriction that they must be defined before their first use. C99 and later C standards relax this rule as well. For example:

    #include <iostream>
    using namespace std;

    int main()
    {
        int value1;
        value1 = 10;
        // ... some executable code here ...
        int value2;      // this will not work in C (C90) but will work in C++
        // ...
        {
            int value3;  // this will work in both C and C++
            // ...
        }
        return 0;
    }

Storage Classes

Like C, C++ also has four storage classes, namely auto, extern, register and static. In C++11 and later, the auto keyword is used for type deduction instead of as a storage class specifier, and register was removed as a storage class specifier in C++17.
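A minimal sketch of three of these storage classes follows: automatic storage for an ordinary local variable, static for a local variable that keeps its value between calls, and extern for declaring a variable that is defined elsewhere. The names sharedTotal, nextTicket and freshLocal are hypothetical. The auto and register keywords are deliberately not written out, since their meaning differs between C++ standards.

```cpp
// 'extern' declares a variable without defining it. The definition is
// placed at the end of this same file only to keep the sketch self-contained;
// it would normally live in another source file.
extern int sharedTotal;

// A static local variable is initialised once and keeps its value
// from one call to the next.
int nextTicket()
{
    static int counter = 0;
    ++counter;
    sharedTotal += counter;
    return counter;
}

// An ordinary local variable has automatic storage: a new object is
// created on every call and destroyed when the block ends.
int freshLocal()
{
    int local = 0;
    ++local;
    return local;
}

// The definition that matches the extern declaration above.
int sharedTotal = 0;
```

Two successive calls to nextTicket() return 1 and then 2, while every call to freshLocal() returns 1.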