
SE3E03, 2003

1.1 Data Types

[Diagram: classification of data types. Node labels: Data Types; Elementary; Structured; Composite; Tuples; Alternatives; Scalar; Arrays; Variant Types; Records; Numbers; Pointers; Strings; Files; Bool, Char; References; Enumerations.]

1.3 Abstract Data Types

A signature $\Sigma = (S, C, F)$ consists of
– a set $S$ of sorts,
– a set $C$ of constant symbols, with sorts ($c : s$), and
– a set $F$ of function symbols, with argument and result sorts ($f : s_1 \times \dots \times s_n \to s$).

An abstract data type (ADT) is a signature $\Sigma$ together with a set of formulae over $\Sigma$, called axioms of that data type.

    signature             module interface
    sort                  type name
    function symbol       function name
    abstract data type    module specification
    axiom                 abstract module property
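The correspondence between signature and module interface can be made concrete with a small sketch. The C fragment below treats a group of declarations as the signature of a hypothetical integer stack, gives one array-based model, and checks two axioms on sample values. The sort name `Stack`, all function names, and the capacity `STACK_CAPACITY` are assumptions made for this illustration; they do not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Signature: one sort (Stack), one constant symbol, three function symbols. */
#define STACK_CAPACITY 16            /* artificial bound, assumed for this sketch */

typedef struct {
    int items[STACK_CAPACITY];
    int size;
} Stack;

Stack stack_empty(void);             /* empty : Stack                     */
Stack stack_push(Stack s, int x);    /* push  : Stack x int -> Stack      */
Stack stack_pop(Stack s);            /* pop   : Stack -> Stack  (partial) */
int   stack_top(Stack s);            /* top   : Stack -> int    (partial) */

/* One possible model: stacks as values holding a bounded array. */
Stack stack_empty(void) {
    Stack s = { {0}, 0 };
    return s;
}

Stack stack_push(Stack s, int x) {
    assert(s.size < STACK_CAPACITY); /* limitation of this model */
    s.items[s.size++] = x;
    return s;
}

Stack stack_pop(Stack s) {
    assert(s.size > 0);              /* pop is undefined on the empty stack */
    s.size--;
    return s;
}

int stack_top(Stack s) {
    assert(s.size > 0);              /* top is undefined on the empty stack */
    return s.items[s.size - 1];
}

/* Helper (not part of the signature): equality of the abstract values. */
bool stack_equal(Stack a, Stack b) {
    if (a.size != b.size) return false;
    for (int i = 0; i < a.size; i++)
        if (a.items[i] != b.items[i]) return false;
    return true;
}

/* Axioms, checked for one sample (s, x):
     top(push(s, x)) = x
     pop(push(s, x)) = s                    */
bool stack_axioms_hold(Stack s, int x) {
    Stack p = stack_push(s, x);
    return stack_top(p) == x && stack_equal(stack_pop(p), s);
}
```

Calling `stack_axioms_hold(stack_push(stack_empty(), 1), 7)` tests both axioms on one sample. Passing such tests does not prove that the implementation is a model of the ADT; it only fails to refute it.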

1.2 Data Types

A data type is a class of objects together with a set of operations for creating and manipulating them.

Specification of a data type involves:
– Attributes, e.g. dimension for arrays
– Values
– Operations

Abstract data types: consider only signature (typing) of operations, and axioms specifying them.

Concrete data types: Make value and attribute sets explicit, and explicitly describe operations.

1.4 Concrete Data Types

$M$ is a concrete data type over $\Sigma$, also called a $\Sigma$-algebra, if it provides
– a carrier set (i.e., set of values) $s^M$ for every sort $s$ in $S$,
– a value $c^M : s^M$ for every constant symbol $c : s$ in $C$,
– a partial function $f^M : s_1^M \times \dots \times s_n^M \rightharpoonup s^M$ for every function symbol $f : s_1 \times \dots \times s_n \to s$.

$M$ is a model of an abstract data type if it satisfies all its axioms.

    signature             module interface
    abstract data type    module specification
    structure             module
    carrier set           data type / class implementation
    function              function / method implementation
    model of ADT          correct implementation

1.5 Data Types in Programs

• Variables are bound to values from data types.
• Variables are declared for holding values from data types.
• Some functionality is polymorphic, i.e., does not depend on the detailed data types involved.
• Some data types may be interchangeable to some extent: subtypes, type conversion (casts)
• Type checking helps error detection:
  – Static typing: checked at compile time
  – Dynamic typing: checked at run-time
• Data types are good candidates for modularisation

1.7 Enumerations — “Small” Variant Types

How can we represent values taken from small finite sets, such as Suit = {Diamonds, Hearts, Spades, Clubs}?

Oberon:
    BOOLEAN
    CHAR — 8 bit
    TYPE Suit = CHAR;
    CONST Diamonds = 00X;
          Hearts = 01X; …

C:
    char // 8 bit, (un)signed
    typedef enum {
      Diamonds, Hearts, Spades, Clubs
    } Suit;

Java:
    boolean
    byte // — signed 8 bit!
    char // — 16 bit!
    class Suit {
      public static final byte
        Diamonds = 0, Hearts = 1, …
    }

Haskell:
    Bool
    Char -- Unicode, ca. 21 bit!
    data Suit = Diamonds | Hearts
              | Spades | Clubs
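As a small illustration of the C variant of `Suit`, the sketch below adds a validity check and two operations on the enumeration. The helper names `suit_is_valid`, `suit_name` and `suit_is_red` are invented for this example. Because a C `enum` is an integer type, values outside the four constants are representable, so the range check has to be written by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { Diamonds, Hearts, Spades, Clubs } Suit;

/* C does not stop an arbitrary int from being converted to Suit,
   so membership in the four-element set is checked explicitly. */
bool suit_is_valid(int raw) {
    return raw >= Diamonds && raw <= Clubs;
}

/* Maps each suit to its name; returns NULL for values outside the set. */
const char *suit_name(Suit s) {
    switch (s) {
    case Diamonds: return "Diamonds";
    case Hearts:   return "Hearts";
    case Spades:   return "Spades";
    case Clubs:    return "Clubs";
    }
    return NULL;
}

/* An operation defined by case distinction over the finite set. */
bool suit_is_red(Suit s) {
    return s == Diamonds || s == Hearts;
}
```

In the Haskell version of `Suit` no such run-time check is needed, since every value of the type is one of the four constructors.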

1.6 Record and Tuple Types

How do we implement points in the cartesian plane $\mathbb{R} \times \mathbb{R}$, assuming that REAL, double, Double all “implement” $\mathbb{R}$?

– As records, i.e., new types:

Oberon:
    TYPE Point = RECORD
      x, y : REAL;
    END;

C:
    typedef struct {
      double x, y;
    } Point;

Java:
    class Point { public double x, y; }

Haskell:
    data Point = Pt {x, y :: Double}

– As tuples, i.e., reusing a generic data type construction:

Haskell: type Point = (Double, Double)
Java 1.5: Pair

1.8 Optional Values — How to Use NULL

With $\mathbf{1} := \{\mathit{none}\}$, how can we represent values of the type $\mathbf{1} + T$?

Oberon:
    TYPE MaybeInt = POINTER TO ;
    VAR opt: MaybeInt;
    …
    IF opt <> NIL THEN … opt^ …
    ELSE … END;

C:
    typedef int * MaybeInt;
    MaybeInt opt;
    ...
    if (opt) { ... *opt ... }
    else { ... };

Java:
    Integer opt;
    ...
    if (opt != null) { ... opt.intValue() ... }
    else { ... };

Haskell:
    data Maybe a = Nothing | Just a
    …
    case opt of
      Just i -> … i …
      Nothing -> …

In *ML, references cannot contain a NULL value!

1.9 Subclassing as Datatype Construction

    class Point2 {
      int x, y;
    }
    class Point3 extends Point2 {
      int z;
    }

Which values correspond to objects of class Point2?
Which values correspond to objects of class Point3?
Which values can a variable of type Point3 hold?
Which values can a variable of type Point2 hold?

1.11 Subclassing as Datatype Construction — 2

    class Tea {
      boolean green;
    }
    class MilkTea extends Tea {
      double milk;
    }
    class LemonTea extends Tea {
      int slices;
    }

Which values correspond to objects of class Tea?
Which values correspond to objects of class MilkTea?
Which values correspond to objects of class LemonTea?
Which values can a variable of type Tea hold?

1.13 Subclassing as Datatype Construction — 3

    abstract class Snack {}
    class Coffee extends Snack {
      double volume;
    }
    class Cookies extends Snack {
      int number;
    }

Which values correspond to objects of class Snack?
Which values correspond to objects of class Coffee?
Which values correspond to objects of class Cookies?
Which values can a variable of type Snack hold?

1.15 Alternatives / Variant Datatypes

In Haskell and *ML part of the type definition mechanism:
– Generic:
    data Either a b = Left a | Right b
– Special-purpose:
    data Snack = Coffee Double | Cookies Int

In Java and Oberon: Different subclasses of (abstract) classes

In C and Pascal: Discriminated unions, “variant records”

1.16 Arithmetic Expressions (simplified Jay) in Java

[Diagrams: class hierarchy with Expression above Variable, Value and Binary; an expression tree with root *, inner nodes − and +, and leaves 8, 2, 3, 6.]

    class Expression { // Expression = Variable | Value | Binary
    }
    class Variable extends Expression { // Variable = String id
      String id;
    }
    class Value extends Expression { // Value = int intValue
      int intValue;
    }
    class Binary extends Expression {
      Operator op;
      Expression term1, term2;
    }

1.18 Arrays

Arrays (vectors) can be seen as implementing
• total functions from a restricted index type (e.g., subrange types in Ada)
• a restricted class of partial functions from the unrestricted index type (C)

Languages can offer different kinds of support for arrays:
• Attributes not accessible: C
• Attributes like dimensions and ranges accessible: Java
• Automatic index checking: Java, Ada, OCaml, Haskell
• Dynamic extensibility
• Explicit multiple dimensions, or nested arrays
• Subarrays:
• Slices: PL/I, FORTRAN 90
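The “discriminated unions” that the slides name as the C counterpart of variant datatypes can be sketched for the alternatives Expression = Variable | Value | Binary. The following is one possible encoding, not the course's own: the operator set, the tag names, the constructor and evaluator names, and the callback type `Env` with its `sample_env` instance are all assumptions. The environment reports an unbound variable through its `bool` result, which plays the role of an optional value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef enum { OP_ADD, OP_SUB, OP_MUL } Operator;          /* assumed operator set */
typedef enum { EXPR_VARIABLE, EXPR_VALUE, EXPR_BINARY } ExprTag;

typedef struct Expression {
    ExprTag tag;                        /* discriminant: selects the valid union member */
    union {
        const char *id;                 /* EXPR_VARIABLE */
        int intValue;                   /* EXPR_VALUE */
        struct {                        /* EXPR_BINARY */
            Operator op;
            struct Expression *term1, *term2;
        } binary;
    } u;
} Expression;

static Expression *expr_new(ExprTag tag) {
    Expression *e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->tag = tag;
    return e;
}

/* The id string is not copied; the caller must keep it alive. */
Expression *mk_variable(const char *id) {
    Expression *e = expr_new(EXPR_VARIABLE);
    e->u.id = id;
    return e;
}

Expression *mk_value(int n) {
    Expression *e = expr_new(EXPR_VALUE);
    e->u.intValue = n;
    return e;
}

Expression *mk_binary(Operator op, Expression *t1, Expression *t2) {
    Expression *e = expr_new(EXPR_BINARY);
    e->u.binary.op = op;
    e->u.binary.term1 = t1;
    e->u.binary.term2 = t2;
    return e;
}

/* Environment: returns true and stores the value if id is bound. */
typedef bool (*Env)(const char *id, int *out);

/* Sample environment for experiments: binds only x, to 8. */
bool sample_env(const char *id, int *out) {
    if (strcmp(id, "x") == 0) { *out = 8; return true; }
    return false;
}

/* Evaluates e; returns false if some variable is unbound. */
bool eval(const Expression *e, Env env, int *out) {
    int a, b;
    switch (e->tag) {
    case EXPR_VARIABLE:
        return env(e->u.id, out);
    case EXPR_VALUE:
        *out = e->u.intValue;
        return true;
    case EXPR_BINARY:
        if (!eval(e->u.binary.term1, env, &a) || !eval(e->u.binary.term2, env, &b))
            return false;
        switch (e->u.binary.op) {
        case OP_ADD: *out = a + b; return true;
        case OP_SUB: *out = a - b; return true;
        case OP_MUL: *out = a * b; return true;
        }
    }
    return false;                       /* invalid tag or operator */
}

/* Releases a whole expression tree. */
void expr_free(Expression *e) {
    if (e->tag == EXPR_BINARY) {
        expr_free(e->u.binary.term1);
        expr_free(e->u.binary.term2);
    }
    free(e);
}
```

Unlike the subclass encoding in Java, nothing prevents code from reading the wrong union member; keeping `tag` and `u` consistent is the programmer's obligation.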

1.17 Arithmetic Expressions in Haskell

    data Expr = Variable String
              | Value Integer
              | Binary Op Expr Expr

    -- direct translation of denotational semantics definition:
    eval :: (String -> Maybe Integer) -> Expr -> Integer
    eval env (Variable v) = case env v of
        Nothing -> error ("undefined variable " ++ v)
        Just n -> n
    eval env (Value n) = n
    eval env (Binary op e1 e2) = opsem op (eval env e1) (eval env e2)

    listEnv :: [(String, Integer)] -> (String -> Maybe Integer)
    listEnv ps = \v -> lookup v ps

    fmEnv :: FiniteMap String Integer -> (String -> Maybe Integer)
    fmEnv = lookupFM

1.19 Partial Function Types

$f : A \rightharpoonup B$

• “” in , table in SNOBOL4, Icon
• If A is a linearly ordered type, then partial functions can be implemented as balanced binary trees. (Logarithmic overhead)
  Such implementations are typically found in standard libraries, and called Map, FiniteMap, Dictionary.
• If A is a “hashable” type, then partial functions can be implemented as hashtables. (Overhead possibly constant, but usually large)
• If A is sufficiently small, then we can use arrays of optional values, using:

$$A \rightharpoonup B \;\cong\; A \to (\mathbf{1} + B)$$

1.20 Set Types

• Small sets can be implemented as bit vectors: Pascal, Oberon
• Sets can be considered as total or partial functions:

$$\mathbb{P} A \;\cong\; A \to \mathit{Bool} \;\cong\; A \rightharpoonup \mathbf{1}$$

Implementations of sets are therefore frequently based on implementations of partial functions, e.g., in Haskell:

    newtype Set a = MkSet (FiniteMap a ())

1.35 Designing Data Types

General rule: Start from a mathematical characterization of the data type:

1. Specification: Extract from the requirements an abstract data type: signature (= interface) and axioms (= laws). (MIS)
2. Validation: Derive facts from the laws and relate these with reality. Check whether important non-axiom properties of reality can be derived from the axioms.
3. Modelling: Design a concrete set-theoretic model (= implementation), usually using a (recursive) domain equation. (MIS/MID)
4. Model verification: Establish that this model satisfies the laws.
5. Design: Decide on implementations of the set-theoretic constructions, defining an abstraction relation from the implementation to the model. (MID)
   Document limitations (e.g., artificial upper bounds on stack depth). Verify that limitations are admissible in system context. (Adapt MIS.)
6. Implementation: Write functions implementing the data type operations.
7. Verification: Prove that the implemented data type operations are refinements of the abstract operations in the set-theoretic model.
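The bit-vector representation of small sets also gives a compact example of the “Document limitations” part of the design step. In the C sketch below, a set over the index type 0..31 is a single 32-bit word whose bit x is the value of the characteristic function at x. The type name `SmallSet`, the function names, and the universe size of 32 are choices made for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Documented limitation: only elements 0..31 can be represented. */
enum { SET_UNIVERSE = 32 };

/* Abstraction relation: bit x of the word is 1 exactly when x is in the set. */
typedef uint32_t SmallSet;

bool in_universe(unsigned x) { return x < SET_UNIVERSE; }

SmallSet set_empty(void) { return 0u; }

SmallSet set_insert(SmallSet s, unsigned x) {
    assert(in_universe(x));             /* operation is partial outside the universe */
    return s | (UINT32_C(1) << x);
}

SmallSet set_remove(SmallSet s, unsigned x) {
    assert(in_universe(x));
    return s & ~(UINT32_C(1) << x);
}

/* Characteristic function A -> Bool; elements outside the universe are never members. */
bool set_member(SmallSet s, unsigned x) {
    return in_universe(x) && ((s >> x) & 1u) != 0;
}

SmallSet set_union(SmallSet a, SmallSet b)        { return a | b; }
SmallSet set_intersection(SmallSet a, SmallSet b) { return a & b; }

/* Cardinality: counts the 1 bits. */
unsigned set_size(SmallSet s) {
    unsigned n = 0;
    for (; s != 0; s >>= 1)
        n += s & 1u;
    return n;
}
```

Laws such as `set_member(set_insert(s, x), x)` and `!set_member(set_remove(s, x), x)` hold for every x in the universe, which is the kind of fact the model verification step is meant to establish.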

1.21 Relation Types

Relations as data are not typically supported by programming languages.
• Database interfaces can be used to manage relations.
• Relations can be represented as sets of pairs or as set-valued functions.

$$A \leftrightarrow B \;\cong\; \mathbb{P}(A \times B) \;\cong\; A \to \mathbb{P} B \;\cong\; A \rightharpoonup ((\mathbb{P} B) \setminus \{\emptyset\})$$

1.54 Designing Data Types — Example I

Binary Trees: $\mathit{Tree}\ a \cong \mathbf{1} + (\mathit{Tree}\ a \times a \times \mathit{Tree}\ a)$

Haskell: data Tree a = Empty | Branch (Tree a) a (Tree a)

C: Recursion needs pointers; these pointers directly represent the alternative “($\mathbf{1}$ + _)”, so we define an auxiliary type for the right side of the alternative:

$$\mathit{Node}\ a \cong \mathit{Tree}\ a \times a \times \mathit{Tree}\ a$$

Since there are no parameterized types in C, we decide $a = \mathit{char}$, and will define

$$\mathit{node} \cong \mathit{Node}\ \mathit{char}$$

Therefore:

$$\mathit{tree} \cong \mathit{Tree}\ \mathit{char} \cong \mathbf{1} + \mathit{Node}\ \mathit{char} \cong \mathbf{1} + \mathit{node} \cong \mathit{node}\,{*},$$

and now we only need to design a data structure for node:

$$\mathit{node} \cong \mathit{Node}\ \mathit{char} \cong \mathit{Tree}\ \mathit{char} \times \mathit{char} \times \mathit{Tree}\ \mathit{char} \cong \mathit{tree} \times \mathit{char} \times \mathit{tree}$$

Such a triple is naturally implemented as a struct:

    typedef struct node {
      struct node *left, *right;
      char contents;
    } * tree;

1.55 Designing Data Types — Example II

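Huffman-Trees: $\mathit{HTree}\ a \cong a + (\mathit{HTree}\ a \times \mathit{HTree}\ a)$

Haskell: data HTree a = Leaf a | Branch (HTree a) (HTree a)

C: need pointers again; absence of empty alternative means that NULL does not represent an HTree! Therefore the C type htree is more than we need:

$$\mathit{htree} \cong \mathbf{1} + \mathit{HTree}\ \mathit{char}$$

We can use this for a space-efficient implementation:

$$
\begin{array}{rcll}
\mathit{HTree}\ \mathit{char} & \cong & \mathit{char} + (\mathit{HTree}\ \mathit{char} \times \mathit{HTree}\ \mathit{char}) & \text{Def. } \mathit{HTree} \\
 & \cong & (\mathbf{1} \times \mathit{char}) + (\mathit{HTree}\ \mathit{char} \times \mathit{HTree}\ \mathit{char}) & X \cong \mathbf{1} \times X \\
 & \subseteq & (\mathbf{1} + \mathit{HTree}\ \mathit{char}) \times (\mathit{char} \cup (\mathit{HTree}\ \mathit{char})) & (A \times X) + (B \times Y) \subseteq (A + B) \times (X \cup Y) \\
 & \cong & \mathit{htree} \times (\mathit{char} \cup (\mathit{HTree}\ \mathit{char})) & \text{Def. } \mathit{htree} \\
 & \subseteq & \mathit{htree} \times (\mathit{char} \cup \mathit{htree}) & \text{Def. } \mathit{htree}
\end{array}
$$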

1.56 Designing Data Types — Example II — C typedef

Huffman-Trees: $\mathit{HTree}\ a \cong a + (\mathit{HTree}\ a \times \mathit{HTree}\ a)$, $\quad \mathit{htree} \cong \mathbf{1} + \mathit{HTree}\ \mathit{char}$

The derived space-efficient implementation is a record containing a union:

$$\mathit{HTree}\ \mathit{char} \;\subseteq\; \mathit{htree} \times (\mathit{char} \cup \mathit{htree}) \;=:\; \mathit{hnode}$$

The inclusion now reads: “every HTree char can be represented by an hnode”

    typedef struct hstruct {
      struct hstruct * _left;
      union {
        struct hstruct * _right; // if _left != NULL
        char _leaf;              // if _left == NULL
      } _u;
    } hnode, * htree;

    HTree Leaf (char c); // Constructors
    HTree Branch(HTree l, HTree r);
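One way the two constructors might be implemented on top of this typedef is sketched below. The sketch uses `htree` as the return and argument type, since no C type named `HTree` is defined in the typedef itself. The observer functions `isLeaf`, `leafValue`, `leftChild`, `rightChild`, `countLeaves` and `freeTree` are additions for this illustration and are not taken from the slides. `Branch` rejects NULL arguments, because NULL is an `htree` value that represents no Huffman tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct hstruct {
    struct hstruct *_left;
    union {
        struct hstruct *_right;   /* valid if _left != NULL */
        char _leaf;               /* valid if _left == NULL */
    } _u;
} hnode, *htree;

/* Constructor for the first alternative: _left == NULL marks a leaf. */
htree Leaf(char c) {
    htree t = malloc(sizeof *t);
    if (t == NULL) abort();
    t->_left = NULL;
    t->_u._leaf = c;
    return t;
}

/* Constructor for the second alternative: both subtrees must be proper trees. */
htree Branch(htree l, htree r) {
    assert(l != NULL && r != NULL);
    htree t = malloc(sizeof *t);
    if (t == NULL) abort();
    t->_left = l;
    t->_u._right = r;
    return t;
}

/* The _left field doubles as the discriminant of the union. */
bool isLeaf(htree t) { return t->_left == NULL; }

char leafValue(htree t)   { assert(isLeaf(t));  return t->_u._leaf; }
htree leftChild(htree t)  { assert(!isLeaf(t)); return t->_left; }
htree rightChild(htree t) { assert(!isLeaf(t)); return t->_u._right; }

size_t countLeaves(htree t) {
    if (isLeaf(t)) return 1;
    return countLeaves(leftChild(t)) + countLeaves(rightChild(t));
}

void freeTree(htree t) {
    if (!isLeaf(t)) {
        freeTree(leftChild(t));
        freeTree(rightChild(t));
    }
    free(t);
}
```

Compared with a separate tag field, this layout saves space per node because the `_left` pointer serves as both data and discriminant, which is the point of the derivation of hnode.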