Programming Language Theory 1

Lecture #6
Programming Language Theory ICS313
Nancy E. Reed
[email protected]

Data Types
• Primitive Data Types
• Character String Types
• User-Defined Ordinal Types
• Arrays and Associative Arrays
• Record Types
• Union Types
• Pointer Types
• Type Checking
Ref: Chapter 6 in Sebesta

Data Types
Data type - defines
• a collection of data objects and
• a set of predefined operations on those objects
Evolution of data types:
• FORTRAN I (1957) - INTEGER, REAL, arrays
• Ada (1983) - User can create unique types and system enforces the types
Descriptor - collection of the attributes of a variable
Design issues for all data types:
1. Syntax of references to variables
2. Operations defined and how to specify
What is the mapping to computer representation?

Primitive Data Types
Not defined in terms of other data types
1. Integer
• Almost always an exact reflection of the hardware, so the mapping is trivial
• There may be as many as eight different integer types in a language (different sizes, signed/unsigned)
2. Floating Point
• Model real numbers, but only as approximations
• Languages for scientific use support at least two floating-point types; sometimes more
• Usually exactly like the hardware, but not always

IEEE Floating Point Format Standards
[Figure: single-precision and double-precision formats]

Primitive Data Types
3. Decimal – base 10
• For business applications (usually money)
• Store a fixed number of decimal digits - coded, not as floating point
• Advantage: accuracy – no round-off error, no exponent; binary representation can't do this
• Disadvantages: limited range, takes more memory
• Example: binary coded decimal (BCD) – use 4 bits per decimal digit – takes as much space as hexadecimal!
4. Boolean (true/false)
• Could be implemented as bits, but often as bytes or words
• Advantage: readability

Character & String Types
5. Characters & Strings
Stored as numeric codes (e.g., ASCII, EBCDIC, Unicode)
String = Sequences of characters?
Design issues:
1. Is it a primitive type or just an array of characters?
2. Is the length of strings static or dynamic?
Operations:
• Assignment
• Comparison (=, >, etc.)
• Concatenation
• Substring reference
• Pattern matching

Character & String Types
Ada, FORTRAN 90, and BASIC
• Somewhat primitive
• Assignment, comparison, concatenation, substring reference
• FORTRAN has an intrinsic for pattern matching
Example in Ada
N := N1 & N2 (concatenation)
N(2..4) (substring reference)
C and C++
• Not primitive
• Use char arrays and a library of functions that provide operations

Character String Type Examples
Java
• String class (not arrays of char)
• Objects cannot be changed (immutable)
• StringBuffer is a class for changeable string objects
Perl and JavaScript
• Patterns are defined in terms of regular expressions
• A very powerful facility
• e.g., /[A-Za-z][A-Za-z\d]+/
SNOBOL4 (string manipulation language)
• Primitive string type
• Many operations, including elaborate pattern matching

Character String Length Options
1. Static - FORTRAN 77, Ada, COBOL
e.g. (FORTRAN 90)
CHARACTER (LEN = 15) NAME;
2. Limited Dynamic Length - C and C++
actual length is indicated by a null character
3. Dynamic - SNOBOL4, Perl, JavaScript, Common Lisp
Dynamic storage, thus 'no limit' on predefined length

Character & String Type Evaluation
Strings aid writability and readability
Static length primitive type
• Inexpensive to provide, why not include them? (in most languages)
Dynamic length
• Weigh flexibility vs. cost to provide – flexibility over length of strings (don't need to know at compile time), but need dynamic storage

Character & String Types
Implementation:
• Static length - compile-time descriptor
• Limited dynamic length - may need a run-time descriptor for length (but not in C and C++)
• Dynamic length - need run-time descriptor; allocation/de-allocation is the biggest implementation task
[Figures: compile-time descriptor for static strings; run-time descriptor for limited dynamic strings]

User-Defined Ordinal Types
Ordinal type –
• has a range of possible values, e.g. the set of positive integers
• or can be easily associated with integers, e.g. fruit
Enumeration Types - user enumerates all possible values, which are symbolic constants
• Represented with ordinal numbers
Design Issues
• Can symbolic constants be in more than one type definition?
- hair = {red,brown,blonde}
- cat = {brown,striped,black}
• Can OTs be read/written as symbols?
• Are they allowed as array indices, subranges?

User-Defined Ordinal Type Examples
Ada
• cannot reuse values
• can be used for array subscripts, for variables, case selectors
• can be compared
• constants can be reused (overloaded literals)
• can be input and output
C and C++ -
• cannot reuse values
• can be used for array subscripts, for variables, case selectors
• can be compared
• can be input and output as integers
Java (before version 5.0) does not include an enumeration type,
• but provides the Enumeration interface

Evaluation of User-Defined Ordinal Types
1. Aid to readability
e.g. no need to code a color as a number
2. Aid to reliability
e.g. compiler can check
3. Operations specified
e.g. don't allow colors to be added
4. Ranges of values can be checked
e.g. if you have 7 colors and code them as integers (1..7), then without enforcement 9 is a legal integer and thus a 'legal color'!

Subrange Types
An ordered contiguous subsequence of an ordinal type
Design Issue: How can they be used? (statements)
Pascal
• Sub-range types behave as their parent types; can be used for variables and array indices
e.g. type pos = 0 .. MAXINT;
Ada
• Subtypes are not new types, just constrained existing types (so they are compatible); can be used as in Pascal, plus case constants
e.g. subtype POS_TYPE is INTEGER range 0..INTEGER'LAST;

Evaluation and Implementation of Sub-range Types
Aid to readability
Reliability - restricted ranges add error detection
Enumeration types are implemented as integers
Sub-range types are the parent types with code inserted (by the compiler) to restrict assignments to sub-range variables

Multiple Concurrent Jobs
Operating System: Control program and OS resource manager
Job 1: Data, Code, Stack & Heap
Job 2: ditto
Job 3: ditto
Job 4: ditto
Etc.
[Figure: memory from address 0 to FFFF holding the operating system followed by Job 1 through Job 4]

Each Job in Memory
Text: code, constant data
Data:
• initialized global & static variables
• global & static variables – 0 initialized or un-initialized (blank)
Heap: dynamic memory
Stack: dynamic - local variables, state of program
[Figure: memory layout from address 0 to FFFF: Text, Data, Heap, Stack]

Arrays
An aggregate of homogeneous data elements
• individual elements are identified by their position relative to the first element (index)
Design issues
1. What types are legal for subscripts?
2. Range checked on subscript expressions in references?
3. When does binding of subscript ranges happen?
4. When does allocation take place?
5. What is the maximum number of subscripts?
6. Can array objects be initialized?
7. Are any kinds of slices allowed?

Array Indexing
Mapping from indices to elements
map (array_name, index_value_list) → an element
Index Syntax
• FORTRAN, PL/I, Ada use parentheses
• Most other languages use brackets
Subscript types
• FORTRAN, C - integer only
• Pascal - any ordinal type (integer, boolean, char, enum)
• Ada - integer or enum (includes boolean and char)
• Java - integer types only

Static and Fixed Stack Dynamic Arrays
Type based on subscript binding and binding to storage
1. Static - range of subscripts and storage bindings are static
• e.g. FORTRAN 77, some arrays in Ada
• Advantage: execution efficiency (no allocation or de-allocation)
2. Fixed stack dynamic - range of subscripts is statically bound, but storage is bound at elaboration time
• e.g. Most Java locals, and C locals that are not static
• Advantage: space efficiency

Dynamic Arrays
3. Stack-dynamic - range and storage are dynamic, but fixed from then on for the variable's lifetime
• Advantage: flexibility - size need not be known until the array is about to be used
4. Heap-dynamic - subscript range and storage bindings are dynamic and not fixed
• e.g. (FORTRAN 90)
INTEGER, ALLOCATABLE, ARRAY (:,:) :: MAT
(Declares MAT to be a dynamic 2-dimensional array)
• In APL, Perl, and JavaScript, arrays grow and shrink as needed
• In Java, all arrays are objects (heap-dynamic)

Array Subscripts and Initialization
Number of subscripts
• FORTRAN I allowed up to three
• FORTRAN 77 allows up to seven
• Most languages have no limit
Array Initialization
• Usually just a list of values that are put in the array in the order in which the array elements are stored in memory
Example Initialization
1. FORTRAN - uses the DATA statement, or in / ... /
2. C and C++ - put the values in braces;
• int stuff [] = {2, 4, 6, 8};
3. Ada - positions for the values can be specified
• SCORE : array (1..14, 1..2) := (1 => (24, 10), 2 => (10, 7), 3 => (12, 30), others => (0, 0));
4. Pascal does not allow array initialization

Built-in Array Operations
1. APL - many; see text, 7th ed. p. 240-241, 9th ed. p. 270-272
2. Ada
• Assignment; RHS can be an aggregate constant or an array name

Array Slices
A slice is some substructure of an array; nothing more than a referencing mechanism
Slices are only useful in languages that have array operations
1. Example:
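
Returning to the mapping map (array_name, index_value_list) → an element from the Array Indexing slide: the following is a minimal sketch of one way such a mapping can be computed, and it is not taken from the slides. It assumes row-major storage and explicit inclusive bounds per dimension; the names row_major_offset, bounds, and flat are illustrative only. The range check corresponds to design issue 2 on the Arrays slide.

```python
def row_major_offset(indices, bounds):
    """Map a list of subscripts to a 0-based offset in flat storage.

    bounds is a list of (low, high) pairs, one per dimension (inclusive).
    Row-major order is an assumption of this sketch; FORTRAN, for
    example, stores arrays in column-major order instead.
    """
    if len(indices) != len(bounds):
        raise IndexError("wrong number of subscripts")
    offset = 0
    for index, (low, high) in zip(indices, bounds):
        # Range check on each subscript expression.
        if not low <= index <= high:
            raise IndexError(f"subscript {index} outside {low}..{high}")
        extent = high - low + 1
        offset = offset * extent + (index - low)
    return offset


# A 3 x 4 array with subscripts 1..3 and 1..4, stored in one flat list.
bounds = [(1, 3), (1, 4)]
flat = list(range(12))  # each element's value equals its storage offset

print(row_major_offset([1, 1], bounds))        # 0
print(row_major_offset([2, 3], bounds))        # 6
print(flat[row_major_offset([3, 4], bounds)])  # 11
```

With static arrays the bounds are known at compile time, so a compiler can fold much of this arithmetic into constants; with dynamic arrays the bounds would have to be kept in a run-time descriptor.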

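The trade-off stated for the decimal type on the Primitive Data Types slide (exact digits, but limited range and more memory) can be illustrated with a small sketch. The assumptions are loud ones: packed BCD with two digits per byte, non-negative integers only, and Python's standard decimal module standing in for a language-level decimal type. The helper names to_bcd and from_bcd are hypothetical and do not come from the lecture.

```python
from decimal import Decimal


def to_bcd(number):
    """Pack a non-negative integer into BCD, 4 bits per decimal digit."""
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))


def from_bcd(packed):
    """Unpack BCD bytes back into an integer."""
    value = 0
    for byte in packed:
        # High nibble is the tens digit, low nibble is the ones digit.
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    return value


packed = to_bcd(1957)
print(packed.hex())      # 1957: each hex digit holds one decimal digit
print(from_bcd(packed))  # 1957

# Binary floating point cannot represent 0.1 exactly; a decimal type can.
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```

Two bytes of packed BCD hold at most 9999, whereas two bytes of pure binary hold values up to 65535, which is the range and memory cost the slide refers to.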