Chapter 7:: Data Types
Total Page:16
File Type:pdf, Size:1020Kb
Chapter 7:: Data Types 1 2 Data Types • most programming languages include a notion of types for expressions, variables, and values • two main purposes of types – imply a context for operations Programming Language Pragmatics • programmer does not have to state explicitly Michael L. Scott • example: a + b means mathematical addition if a and b are integers – limit the set of operations that can be performed • prevents the programmer from making mistakes • type checking cannot prevent all meaningless operations • it catches enough of them to be useful Copyright © 2005 Elsevier Copyright © 2005 Elsevier 3 4 Type Systems Type Systems • bits in memory can be interpreted in different • a type system consists of ways – a way to define types and associate them with language – instructions constructs • constructs that must have values include constants, variables, – addresses record fields, parameters, literal constants, subroutines, and – characters complex expressions containing these – integers, floats, etc. – rules for type equivalence, type compatibility, and type inference • bits themselves are untyped, but high-level • type equivalence: when the types of two values are the same languages associate types with values • type compatibility: when a value of a given type can be used in a – useful for operator context particular context • type inference: type of an expression based on the types of its parts – allows error checking Copyright © 2005 Elsevier Copyright © 2005 Elsevier 5 6 Type Checking Type Checking • type checking ensures a program obeys the • examples language’s type compatibility rules – Common Lisp is strongly typed, but not – type clash: violation of type rules statically typed • a language is strongly typed if it prohibits an – Ada is statically typed operation from performing on an object it does – Pascal is almost statically typed not support – Java is strongly typed, with a non-trivial • a language is statically typed if it is strongly mix of things that can be checked typed and type checking can be performed at statically and things that have to be compile time checked dynamically – few languages fall completely in this category – questions about its value Copyright © 2005 Elsevier Copyright © 2005 Elsevier 1 7 8 Type Checking Polymorphism • a language is dynamically typed if type • polymorphism results when the compiler checking is performed at run time finds that it doesn't need to know certain – late binding things – List and Smalltalk – allows a single body of code to work with objects of – scripting languages, such as Python and Ruby multiple types – with dynamic typing, arbitrary operations can be applied to arbitrary objects • implicit parametric polymorphism: types can be thought of to be implied unspecified parameters • incurs significant run-time costs • for example, ML’s type inference/unification scheme Copyright © 2005 Elsevier Copyright © 2005 Elsevier 9 10 Polymorphism Data Types • subtype polymorphism in object-oriented • we all have developed an intuitive notion of languages what types are; what's behind the intuition? – collection of values from a domain (the – a variable of the base type can refer to an object of the denotational approach) derived type – internal structure of data, described down to the – explicit parametric polymorphism, or generics level of a small set of fundamental types (the • C++, Eiffel, Java structural approach) • useful for container classes – collection of well-defined operations that can be List<T> where T is left unspecified applied to objects of that type (the abstraction approach) Copyright © 2005 Elsevier Copyright © 2005 Elsevier 11 12 Data Types Classification of Types • specified in different ways • most languages provide a set of built-in – Fortran: spelling of variable names types – Lisp: infer types at compile time – integer – List, Smalltalk: track at run time – boolean – C, C++: declared at compile time • often single character with 1 as true and 0 as false • C: no explicit boolean; 0 is false, anything else is true – char • traditionally one byte • ASCII, Unicode – real Copyright © 2005 Elsevier Copyright © 2005 Elsevier 2 13 14 Classification of Types Classification of Types • numeric types • enumeration types – C and Fortran distinguish between different – facilitate readability lengths of integers – allow compiler to catch errors – C, C++, C#, Modula-2: signed and unsigned • Pascal example integers – differences in real precision cause unwanted behavior across machine platforms – orders, so comparisons OK – some languages provide complex, rational, or decimal types Copyright © 2005 Elsevier Copyright © 2005 Elsevier 15 16 Classification of Types Classification of Types • alternatively, can be declared as constants • subrange types – contiguous subset of discrete base type – same as C – helps to document code – most store the actual values rather than ordinal locations • needs two bytes Copyright © 2005 Elsevier Copyright © 2005 Elsevier 17 18 Classification of Types Classification of Types • composite types • composite types (cont.) – records – sets • Cobol • discrete • fields of possibly different type • like enumerations and subranges • similar to tuples – pointers – variant records (unions) • good for recursive data types • multiple field types, but only one valid at a given – lists time • head element and following elements • variable length, no indexing – arrays – files • aggregate of same type • mass storage, outside program memory • strings • linked to physical devices (e.g., sequential access) Copyright © 2005 Elsevier Copyright © 2005 Elsevier 3 19 20 Orthogonality Orthogonality • orthogonality is a useful goal in the design • for example of a language, particularly its type system – Pascal is more orthogonal than Fortran, – a collection of features is orthogonal if there are (because it allows arrays of anything, for no restrictions on the ways in which the instance), but it does not permit variant records features can be combined (analogy as arbitrary fields of other records (for instance) to vectors) • orthogonality is nice primarily because it makes a language easy to understand, easy to use, and easy to reason about Copyright © 2005 Elsevier Copyright © 2005 Elsevier 21 22 Type Checking Type Checking • a type system has rules for • type compatibility / type equivalence – type equivalence (when are the types of two – compatibility is the more useful concept, values the same?) because it tells you what you can do – type compatibility (when can a value of type A – terms are often used interchangeably (incorrectly) be used in a context that expects type B?) – type inference (what is the type of an expression, given the types of the operands?) Copyright © 2005 Elsevier Copyright © 2005 Elsevier 23 24 Type Checking Type Checking • format does not matter • two major approaches: structural struct { int a, b; } is the same as equivalence and name equivalence struct { – name equivalence is based on declarations int a, b; – structural equivalence is based on some notion } of meaning behind those declarations • want them to be the same as – name equivalence is more fashionable these struct { days int a; int b; Copyright © 2005 Elsevier} Copyright © 2005 Elsevier 4 25 26 Type Checking Type Checking • at least two common variants on name • structural equivalence depends on simple equivalence comparison of type descriptions – differences between all these approaches boils down to where the line is drawn between – substitute out all names important and unimportant differences between – expand all the way to built-in types type descriptions • original types are equivalent if the expanded – in all three schemes, we begin by putting every type description in a standard form that takes type descriptions are the same care of "obviously unimportant" distinctions like those above Copyright © 2005 Elsevier Copyright © 2005 Elsevier 27 28 Type Checking Type Checking • example in Ada illustrating type conversion • coercion – when an expression of one type is used in a (1..100) (integer) context where a different type is expected, one normally gets a type error – but what about var a : integer; b, c : real; ... c := a + b; Copyright © 2005 Elsevier Copyright © 2005 Elsevier 29 30 Type Checking Type Checking • coercion • C has lots of coercion, too, but with simpler – many languages allow such statements, and rules: coerce an expression to be of the proper type – all floats in expressions become doubles – coercion can be based just on types of – short int and char become int in operands, or can take into account expected expressions type from surrounding context as well – if necessary, precision is removed when – Fortran has lots of coercion, all based on assigning into LHS operand type Copyright © 2005 Elsevier Copyright © 2005 Elsevier 5 31 32 Type Checking Type Checking • example in C, which does a bit of coercion • in effect, coercion rules are a relaxation of type checking – recent thought is that this is probably a bad idea – languages such as Modula-2 and Ada do not permit coercions – C++, however, goes hog-wild with them – they're one of the hardest parts of the language to understand Copyright © 2005 Elsevier Copyright © 2005 Elsevier 33 34 Type Checking Type Checking • make sure you understand the difference • universal reference types between – a reference that can point to anything – type conversions (explicit) • C, C+: void • Clu: any – type coercions (implicit) • Modula2: address – sometimes the word 'cast' is used for • Java: Object conversions