IA010: Principles of Programming Languages

3. Types
Jan Obdržálek, obdrzalek@fi.muni.cz
Faculty of Informatics, Masaryk University, Brno

Data types

A data type is a collection of data values and a set of predefined operations on these values.

Why use types
• error detection (improves reliability): "IQ" + 160
• implicit context for many operations (improves writability): a + b, new p
• code documentation (improves readability)

A type system consists of
1 a mechanism to define types and associate them with certain language constructs
2 a set of rules for type equivalence, type compatibility and type inference

Outline
• Primitive data types
• Type checking
• Composite data types
  • Array types
  • Record types
  • Union types
  • List types
  • Pointer and reference types
• Type inference

Basic type taxonomy

Primitive types
• boolean type
• numeric types
• character type
• character string types
• enumeration types, subrange types

Composite types
• record types
• union types
• array types
• list types
• set types
• pointer and reference types

What about primitive and composite data types?

Primitive data types

Primitive and composite types

primitive data types: two meanings, which may coincide
1 types with support built into the programming language (also called built-in types)
2 building blocks for composite types (also called basic types)

composite data types: created by
• applying a type constructor (record, array, set, ...)
• to one or more simpler types (either primitive or composite)

The distinction is not always clear and may depend on the language.
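As a rough illustration of the type-constructor idea above, the following Python sketch builds composite types from simpler ones. The names `Point`, `Polyline` and `path_length` are invented for this example and do not come from the slides.

```python
from dataclasses import dataclass
from typing import List
import math


@dataclass(frozen=True)
class Point:
    """Record constructor applied to two primitive floats."""
    x: float
    y: float


# List constructor applied to a composite type (hypothetical alias).
Polyline = List[Point]


def path_length(path: Polyline) -> float:
    """Sum of the distances between consecutive points."""
    return sum(
        math.dist((a.x, a.y), (b.x, b.y))
        for a, b in zip(path, path[1:])
    )


path = [Point(0.0, 0.0), Point(3.0, 4.0), Point(3.0, 0.0)]
print(path_length(path))  # 9.0
```

Here `float` plays the role of a primitive type, while `Point` and `Polyline` are composite: each is obtained by applying a constructor to types that already exist.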
Numeric types
• historically the oldest, typically reflect the hardware
• integers and floats, complex numbers
• range can be implementation-dependent (problem: portability)

Integer types
• different lengths: C99 signed char, short, int, long, long long (using at least 1, 2, 2, 4 and 8 bytes, respectively)
• may be signed or unsigned (typical implementation: two's complement)
• arbitrary precision: string representation "12354654231654L" (Python 2 long integer; in Python 3 every int has arbitrary precision)
  • typical for scripting languages
  • performance penalty

Numeric types II

floating-point types
• model real numbers
• single (float/real) or double (double) precision
• standard: IEEE 754
• cannot precisely express all real numbers:
  1 irrational numbers: e, π, ...
  2 even numbers that are short in decimal: 0.1 (base 10) = 0.0001100110011... (base 2)

decimal types
• fixed number of decimal digits, each digit stored in the same number of bits (usually 4 or 8)
• use: business applications, precise representation of decimal numbers (0.1)
• especially useful if available in hardware (BCD)
• examples: COBOL, C#, F#

Non-numeric primitive types

boolean type
• just two values (true, false)
• missing in C89, where an arbitrary numeric type can be used instead (zero/non-zero)
• usually implemented using more than a single bit

character type
• stores a single character
• size of the representation depends on the encoding used (ASCII; modern languages: Unicode, UTF-xx) and may vary between characters, e.g. in UTF-8
• sometimes missing from the language (Python: strings of length 1)
• may even be handled like a numeric type (C)
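A few of the points about numeric and character types above can be checked directly. The snippet below is only a small sketch in Python (chosen for brevity, it is not one of the languages the slides list for decimal types); the variable names are made up for illustration.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly, so the sum is slightly off.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)                # False

# A decimal type stores base-10 digits, so the same sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))   # True

# Python integers have arbitrary precision: no overflow, but slower arithmetic.
big = 2 ** 100
print(big.bit_length())                # 101

# In UTF-8 the size of a character depends on the character itself.
utf8_sizes = {ch: len(ch.encode("utf-8")) for ch in ("a", "é", "€")}
print(utf8_sizes)                      # {'a': 1, 'é': 2, '€': 3}
```

Note that `Decimal` is constructed from strings here: `Decimal(0.1)` would inherit the rounding error of the binary float literal.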
Character string types
• string: a sequence of characters
• strings according to length
  • static length: Python, Java (String), C#
  • limited dynamic length: C (upper bound on the length)
  • dynamic length: JavaScript, Perl, standard C++ library
• strings according to implementation
  • special kind of an array of characters: C, C++ (terminated by the null character \0)
  • primitive data type: Python
  • class: Java, F#
• supported operations
  • concatenation, comparison, substring selection
  • pattern matching: Perl, JavaScript, Ruby, PHP

Ordinal (discrete) types

An ordinal type is a type whose values can be mapped to a range of integers.
1 primitive ordinal types
  • provided by the language
  • e.g. Java: integer, char, boolean
2 user-defined ordinal types
  • enumeration types
  • subranges

User-defined ordinal types: enumeration types
• the values, called enumeration constants, are enumerated in the definition

    enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun};

• typical implementation: implicit numerical values
• the value can often be given explicitly (Fri=2)
• advantage over named constants: type checking!

Important aspects
• can one enumeration constant name be used in multiple types?
• are enumeration constants coerced to integers? (C: yes; Java 5.0, C#, F#: no)

User-defined ordinal types: subrange types
• a contiguous subsequence of an ordinal type
• Pascal, Ada

Example (Ada)

    type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
    subtype Weekdays is Days range Mon..Fri;
    subtype Index is Integer range 1..100;

• operations of the parent type are preserved (as long as the result stays in range)
• require run-time type checking
• advantages: readability, range checks
• can be simulated by asserts

Type checking

Type checking topics
• type equivalence
• type compatibility
• type conversion (cast)
• type coercion
• nonconverting type cast
• type inference (later)
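The enumeration and subrange slides above raise two questions: are enumeration constants coerced to integers, and how can a subrange be simulated with a run-time check? The Python sketch below illustrates both under stated assumptions: `Day`, `CStyleDay` and `to_weekday` are hypothetical names, and an explicit `ValueError` stands in for the assert mentioned on the slide.

```python
from enum import Enum, IntEnum


class Day(Enum):
    """Enumeration whose constants are not coerced to integers."""
    MON = 1
    TUE = 2
    WED = 3
    THU = 4
    FRI = 5
    SAT = 6
    SUN = 7


class CStyleDay(IntEnum):
    """Enumeration whose constants behave like integers, as in C."""
    MON = 0
    TUE = 1


def to_weekday(day: Day) -> Day:
    """Simulate the subrange Mon..Fri with a run-time range check."""
    if not Day.MON.value <= day.value <= Day.FRI.value:
        raise ValueError(f"{day.name} is outside Mon..Fri")
    return day


print(CStyleDay.MON + 1)        # 1: integer arithmetic is allowed
print(Day.MON == 1)             # False: no coercion to int
print(to_weekday(Day.WED))      # Day.WED
try:
    to_weekday(Day.SAT)
except ValueError as err:
    print(err)                  # SAT is outside Mon..Fri
```

With the plain `Enum`, an expression such as `Day.MON + 1` raises a `TypeError`, which is the type-checking advantage over named integer constants.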
Type checking

Ensuring that the operands of an operation are of compatible types.

A language can be
• strongly typed: an operation cannot be applied to any object which does not support the operation
• statically typed: checking can be performed at compile time
• dynamically typed: checking is performed at run time (a form of late binding; common in languages with dynamic scoping)

Examples
• Ada, Java, C#: strongly typed (except for the explicit cast)
• Pascal: almost strongly statically typed (except for the untagged variant records)
• C89: weak typing (unions, pointers, arrays, ...)
• Scheme, Lisp, ML, F#: strongly typed
• Python, Ruby: strongly dynamically typed

Type equivalence

When are two types equivalent? The question is nontrivial in a language which allows new types to be defined (records, arrays, ...).

Two variables have the same type if
• they were defined in the same declaration, or in declarations using the same type name
  name equivalence: Pascal, Ada, Java, C#
• their types are identical as structures
  structural equivalence: Algol, Modula, C, ML
• a combination of both approaches applies, e.g. C:
  • name equivalence for struct, union, enum
  • structural equivalence otherwise

Structural equivalence issues

Are the following types the same?

    type T1 = record
      a, b : integer
    end;

    type T2 = record
      a : integer;
      b : integer;
    end;

    type T3 = record
      b : integer;
      a : integer;
    end;

    type s = array [1..10] of char;
    type t = array [0..9] of char;

• T1 and T2: yes
• T2 and T3: no (most languages), yes (ML)
• s and t: no (most languages), yes (Fortran, Ada)

Deciding type equivalence

Structural equivalence
• type names are (recursively) replaced by their definitions
• the resulting strings are simply compared
• obstacles (surmountable): recursive types, pointers

Name equivalence
• straightforward name comparison
• assumption: if the programmer gave two definitions of the "same" type, they probably had a particular use in mind for each
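The structural-equivalence procedure above (replace type names by their definitions, then compare) can be sketched in a few lines of Python. This is a simplified illustration, not a description of any real compiler: the tuple encoding of types, the `DEFS` table, and the extra entries `int_alias` and `T4` are assumptions made for the example, and recursive types are deliberately not handled.

```python
# Type definitions encoded as tuples. A string is either a primitive type
# or a name defined in DEFS. Recursive types are not supported here.
DEFS = {
    "int_alias": "integer",  # hypothetical alias, not from the slides
    "T1": ("record", (("a", "integer"), ("b", "integer"))),
    "T2": ("record", (("a", "integer"), ("b", "integer"))),
    "T3": ("record", (("b", "integer"), ("a", "integer"))),
    "T4": ("record", (("a", "int_alias"), ("b", "int_alias"))),
    "s": ("array", 1, 10, "char"),
    "t": ("array", 0, 9, "char"),
}


def expand(ty):
    """Recursively replace type names by their definitions."""
    if isinstance(ty, str):
        return expand(DEFS[ty]) if ty in DEFS else ty
    kind = ty[0]
    if kind == "record":
        # Field names are kept as-is; only field types are expanded.
        return ("record", tuple((name, expand(fty)) for name, fty in ty[1]))
    if kind == "array":
        _, low, high, elem = ty
        return ("array", low, high, expand(elem))
    raise ValueError(f"unknown type constructor: {kind}")


def structurally_equivalent(a, b):
    """Compare the fully expanded definitions."""
    return expand(a) == expand(b)


def name_equivalent(a, b):
    """Strict name equivalence: only identical names match."""
    return a == b


print(structurally_equivalent("T1", "T2"))  # True
print(structurally_equivalent("T1", "T4"))  # True, after expanding int_alias
print(structurally_equivalent("T2", "T3"))  # False: field order matters here
print(structurally_equivalent("s", "t"))    # False: array bounds differ
print(name_equivalent("T1", "T2"))          # False
```

The sketch follows the "most languages" answers from the previous slide. Treating T2 and T3 as equal (as in ML) would mean sorting the fields before comparison, and treating s and t as equal would mean comparing array lengths instead of bounds.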
Name equivalence

    TYPE new_type = old_type; (* Modula-2 *)

• old_type and new_type are aliases
• two types, or two names for the same type?
  • two types: strict name equivalence
  • same type: loose name equivalence (Pascal)

Example: strict name equivalence

    TYPE imperial_distance = REAL;
         metric_distance = REAL;
    VAR i : imperial_distance;
        m : metric_distance;
    ...
    m := i; (* this should probably be an error *)

Name equivalence in Ada
• a restrictive version of name equivalence
• subtype: a type equivalent to the parent type

    subtype new_int is integer;
    subtype small_int is integer range 1..100;

• derived type: a new type

    type imperial_distance is new float;
    type metric_distance is new float;

• note the difference:

    type derived_small_int is new integer range 1..100;
    subtype subrange_small_int is integer range 1..100;

Type conversion (cast)

A change of type that is explicitly stated in the program code.

Implementation: three principal cases
1 structurally equivalent types (same internal representation)
  (no code executed, the conversion is "for free")
2 different types, same representation (e.g. subtypes)
  (run-time check, the value can be used if the check succeeds)
3 different types with related values (e.g. int vs float)
  (the specified conversion is performed)

Nonconverting type cast
• no conversion is performed, the stored value is only interpreted as a value of the new type
• uses: systems programming, significand/exponent extraction, ...

Type conversion examples

    type test_score is new integer range 0..100;
    type celsius_temp is new integer;
    ...
    n : integer;       -- assume 32 bits
    r : real;          -- assume IEEE double-precision
    t : test_score;
    c : celsius_temp;
    ...
    t := test_score(n);    -- run-time semantic check required
    n := integer(t);       -- no check req.; every test_score is an int
    r := real(n);          -- requires run-time conversion
    n := integer(r);       -- requires run-time conversion and check
    n := integer(c);       -- no run-time code required
    c := celsius_temp(n);  -- no run-time code required

Type compatibility
• of prime importance to the programmer
• full type equivalence is not always needed
• we often need only compatible types:
  • addition: two operands of numeric types
  • assignment: target type compatible with the source type
  • subroutine call: formal parameters compatible with the arguments
• the definition of compatibility differs significantly among languages

Coercion
•
