Introduction

CA341 - Comparative Programming Languages Introduction

Dr. David Sinclair

Overview

This module will examine the essential concepts on which modern programming languages are based, to understand the design decisions in current languages and the design features that may be introduced into future languages. In this module we will cover:

• Data Types and Scope
• Pointers and Memory Management
• Abstraction
• Control
• Data Abstraction
• Functions, Parameter Matching and Parameter Passing
• Assertions and Exceptions
• Object-Oriented Programming Paradigm
• Memory Models
• Logic Programming Paradigm
• Functional Programming Paradigm

Texts

Supplementary:

• Robert W. Sebesta 2015, Concepts of Programming Languages, 11th Ed., Pearson [ISBN: 9780133943023]
• Michael L. Scott 2015, Programming Language Pragmatics, 4th Ed., San Diego; Morgan Kaufmann [ISBN: 9780124104099]
• David A. Watt 2004, Programming Language Design Concepts, John Wiley & Sons [ISBN: 978-047085320]
• John C. Mitchell 2003, Concepts in Programming Languages, Cambridge University Press, New York [ISBN: 978-052178098]
• M. Ben-Ari 1996, Understanding Programming Languages, John Wiley & Sons [ISBN: 978-047195846]

Contact Details

Lecturer: Dr. David Sinclair
Office: L253
Phone: 5510
Email: [email protected]
WWW: https://www.computing.dcu.ie/~davids

Course web page: https://www.computing.dcu.ie/~davids/courses/CA341/CA341.html

How do I successfully complete this module?

The module mark is a straight weighted average of:

• 30% continuous assessment
  • 2 assignments
    • first assignment: Object-Oriented Programming (15%)
    • second assignment: Language Comparison (15%)
• 70% end-of-semester examination
  • Do 5 out of 6 questions.

A Very Brief History of Programming Languages

[Figure: timeline chart of programming languages and their years of introduction, from Plankalkul ('48) to Swift ('14).]

Significant Features

Autocode was developed for the Manchester Mark One and was the first language with a compiler.

Fortran was the first widely adopted high-level language. It introduced arrays, symbolic expressions and procedures with parameters.

Algol60 introduced the concept of block structure, where variables and procedures can be defined anywhere. It also introduced recursive procedures.

Cobol was developed primarily for business data processing applications. It introduced the concept of data descriptions which, in later languages, evolved into data types.

Significant Features (2)

Children of Algol60:

BASIC popularised programming with its simple structure and efficient implementations, which made it the first programming language distributed with the initial PCs.

Simula67 was designed for discrete-event simulation problems. It introduced a simple form of parallel execution and classes with hierarchies.

Pascal was designed as a language to teach structured programming. The original language is characterised by simplicity. It has since been extended to include modularisation and object-orientation. It also featured an intermediate code, P-code.

Algol68 is the successor to Algol60 and its design was driven by the principle of orthogonality (language features can be composed freely and uniformly with predictable effects).

Significant Features (3)

The “alternatives”:

Lisp, in its pure form, was the first functional language. It is based on the lambda calculus and the theory of recursive functions. It has only one data structure, the list, and Lisp programs are themselves written as lists.

ML is a functional language that demonstrated that a language could be computationally powerful while still retaining the ability to prove some properties about the program without executing it. It has a type inferencing system that guarantees that a well-typed ML program will not cause runtime type errors.

Haskell is a “pure” functional programming language with lazy evaluation.

Prolog is a declarative language in which the programmer describes the properties of a solution and the runtime system searches for the answer (a binding of values to variables).

Significant Features (4)

The “modern” “imperative” languages:

Ada was developed to replace over 450 different languages used by the U.S. Department of Defense, covering the range from information systems to embedded systems. It includes support for concurrency and object-orientation.

C was developed as part of the development of UNIX. It is very efficient for systems programming.

Smalltalk was the first major implementation of object-orientation and reflective programming. It is dynamically typed.

C++ is a general-purpose programming language that adds object-oriented and generic programming features to C, while “retaining” low-level memory manipulation. It is biased toward system programming for both resource-constrained and large systems, with performance, efficiency and flexibility of use as design priorities.

Significant Features (5)

Objective-C adds object-orientation and Smalltalk message passing to C. Version 2 added garbage collection.

Java is a portable general-purpose language that supports concurrency and object-orientation. It revived the use of intermediate code with bytecode and the JVM.

C# is a multi-paradigm programming language that includes strong typing and supports imperative, declarative, functional, generic, object-oriented, and component-oriented programming.

Swift is a general-purpose, compiled programming language based on Objective-C with functional aspects. Swift addresses common programming errors like null pointers, and promotes protocol-oriented programming.

The Perfect Programming Language

There are 2 types of programming languages:

• those that people complain about; and
• those that people do not use.

Bjarne Stroustrup (C++ inventor)

Syntax

The syntax of a language defines how statements are formed in that language. It is made up of:

• lexical rules that define the alphabet of the language and how these characters are combined to form valid words; and
• syntax rules that define how valid words are combined to form valid statements.

Typically syntax is defined using Extended Backus-Naur Form (EBNF). EBNF is a notation for context-free grammars that has:

Terminals: keywords, symbols and characters.
Nonterminals: denoted by enclosing “<” and “>”, and composed of terminals and nonterminals.
Sequences: of terminals and nonterminals, e.g. <name>[<index>].

Syntax (2)

Choice: denoted by “|”. <A> | b represents either the nonterminal A or the terminal b.
Repetition: denoted by “*” or “+”. A* represents zero or more occurrences of A. B+ represents one or more occurrences of B.
Recursion: where a rule is defined in terms of itself, e.g. <expr> ::= <expr> <operator> <expr>.

The syntax of a simple programming language could be defined as follows.

Lexical rules:

    <operator>   ::= + | - | * | / | == | != | < | > | <= | >=
    <identifier> ::= <letter> <Id>*
    <Id>         ::= <letter> | <digit>
    <number>     ::= <digit>+
    <letter>     ::= a | b | ... | z
    <digit>      ::= 0 | 1 | ... | 9

Syntax (3)

Syntax rules:

    <program>     ::= { <statement>* }
    <statement>   ::= <assignment> | <conditional> | <loop>
    <assignment>  ::= <identifier> = <expr>
    <conditional> ::= if <expr> { <statement>+ }
                    | if <expr> { <statement>+ } else { <statement>+ }
    <loop>        ::= while <expr> { <statement>+ }
    <expr>        ::= <identifier> | <number> | ( <expr> ) | <expr> <operator> <expr>
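To make the split between lexical rules and syntax rules concrete, here is a minimal Python sketch of a recogniser for this small language. It is an illustration under stated assumptions, not a reference implementation: the names `tokenize`, `Parser` and `accepts` are hypothetical, `if`, `else` and `while` are treated as reserved words, and the left-recursive `<expr>` rule is handled by iteration as "operand (operator operand)*" over identifiers and numbers only.

```python
import re

KEYWORDS = {"if", "else", "while"}

# Lexical rules: two-character operators are tried before one-character ones.
TOKEN_RE = re.compile(r"""
    (?P<op>==|!=|<=|>=|[-+*/<>])
  | (?P<word>[a-z][a-z0-9]*)
  | (?P<num>[0-9]+)
  | (?P<punct>[{}=])
""", re.VERBOSE)


def tokenize(text):
    """Turn the character stream into (kind, value) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        kind, value = match.lastgroup, match.group()
        if kind == "word" and value in KEYWORDS:
            kind = "kw"
        tokens.append((kind, value))
        pos = match.end()
    return tokens


class Parser:
    """Syntax rules: one method per nonterminal."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else ("eof", "")

    def expect(self, kind, value=None):
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}")
        self.i += 1

    def program(self):                      # { <statement>* }
        self.expect("punct", "{")
        while self.peek() != ("punct", "}"):
            self.statement()
        self.expect("punct", "}")
        self.expect("eof")

    def block(self):                        # { <statement>+ }
        self.expect("punct", "{")
        self.statement()
        while self.peek() != ("punct", "}"):
            self.statement()
        self.expect("punct", "}")

    def statement(self):
        tok = self.peek()
        if tok == ("kw", "if"):
            self.i += 1
            self.expr()
            self.block()
            if self.peek() == ("kw", "else"):
                self.i += 1
                self.block()
        elif tok == ("kw", "while"):
            self.i += 1
            self.expr()
            self.block()
        else:                               # <identifier> = <expr>
            self.expect("word")
            self.expect("punct", "=")
            self.expr()

    def expr(self):                         # operand (operator operand)*
        self.operand()
        while self.peek()[0] == "op":
            self.i += 1
            self.operand()

    def operand(self):
        if self.peek()[0] not in ("word", "num"):
            raise SyntaxError("expected identifier or number")
        self.i += 1


def accepts(text):
    """Return True if text is a syntactically valid program."""
    try:
        Parser(tokenize(text)).program()
        return True
    except SyntaxError:
        return False


print(accepts("{ if a > b { max = a } else { max = b } }"))
print(accepts("{ while i < 10 { } }"))
```

The second call is rejected because a loop body requires at least one statement (`<statement>+`), whereas the program body may be empty (`<statement>*`).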

Semantics

Semantics defines the meaning of a program. Not all syntactically correct programs have a valid meaning. For example, the C statement

    if (a > b) max = a; else max = b;

has a valid meaning if a and b are ints. However, if a is the identifier of an int and b is the identifier of a function, then this syntactically correct statement has no valid meaning.

Rules about meaning that can be checked before the program is executed define the static semantics of the language. We will focus on dynamic semantics, which describes the effects of executing the constructs of the programming language. A program can only be executed if it is correct with respect to both its syntax and its static semantics.

Semantics (2)

A formal semantics defines a language in terms of mathematical concepts and provides a rigorous and unambiguous meaning of each element of the language. Two ways of formally specifying the semantics of a language are axiomatic semantics and denotational semantics.

Axiomatic Semantics

Axiomatic semantics defines the execution of a program in terms of a state machine. The state of a program is described by a set of first-order predicates that define the property of each value of the program’s variables in a state. The axiomatic semantics defines the meaning of each program construct by relating the state of the program before the execution of the construct to the state of the program after executing the construct.

Semantics (3)

A predicate P that is required to hold after the execution of a statement S is called a postcondition. A predicate Q that, if it holds before the execution of S, guarantees that S terminates and that the postcondition P holds afterwards is called a precondition. For a given postcondition, several preconditions may exist. The most general precondition W, with the fewest constraints, is called the weakest precondition. For example, given the statement

    x = y + 1;

and the postcondition x > 0, the weakest precondition (for integer y) is y ≥ 0.

For each statement in a language, the axiomatic semantics specifies a function asem, called a predicate transformer, that calculates the weakest precondition for S and any postcondition P.

Semantics (4)

Let the notation P_{x→expr} represent the predicate P with every occurrence of x replaced by expr.

For

    x = expr;

then

    asem(x = expr;, P) = P_{x→expr}

If we know that asem(S1;, P) = Q and asem(S2;, Q) = R, then

    asem(S2; S1;, P) = R

For

    if B then L1 else L2;

then

    asem(if-stat, P) = (B ⇒ asem(L1, P)) ∧ (¬B ⇒ asem(L2, P))
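These predicate transformers can be sketched in Python by treating a predicate as a function from a state (a dictionary of variable values) to a boolean. This is an illustrative sketch only: the names `assign`, `seq` and `if_else` are hypothetical, and substitution of expr for x is modelled by evaluating the postcondition in a state where x holds the value of expr.

```python
def assign(var, expr):
    """asem(var = expr;, P): P evaluated with var replaced by expr's value."""
    return lambda post: (lambda s: post({**s, var: expr(s)}))


def seq(first, second):
    """Precondition of 'first; second': transform P through second, then first."""
    return lambda post: first(second(post))


def if_else(cond, then_stmt, else_stmt):
    """(B => asem(L1, P)) and (not B => asem(L2, P))."""
    def transform(post):
        pre_then, pre_else = then_stmt(post), else_stmt(post)
        return lambda s: (not cond(s) or pre_then(s)) and (cond(s) or pre_else(s))
    return transform


# x = y + 1; with postcondition x > 0
stmt = assign("x", lambda s: s["y"] + 1)
pre = stmt(lambda s: s["x"] > 0)
# Over a range of integers the computed precondition agrees with y >= 0.
agrees = all(pre({"y": y}) == (y >= 0) for y in range(-5, 6))
print(agrees)

# if a > b then max = a else max = b; with postcondition max >= a and max >= b
max_stmt = if_else(
    lambda s: s["a"] > s["b"],
    assign("max", lambda s: s["a"]),
    assign("max", lambda s: s["b"]),
)
max_pre = max_stmt(lambda s: s["max"] >= s["a"] and s["max"] >= s["b"])
always = all(max_pre({"a": a, "b": b}) for a in range(-3, 4) for b in range(-3, 4))
print(always)
```

Checking over a finite range is only a sanity test, not a proof; the point is that each construct maps a postcondition to a precondition without running the program on real input.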

Semantics (5)

The semantics of loops is more complicated. Consider

    while B do L;

This results in a sequence of executions of L whose length is unknown in general. Instead of calculating the weakest precondition, we approximate it by calculating a precondition Q for the postcondition P such that

• the loop terminates, and
• on termination P holds.

The approximation is a predicate I, called the loop invariant, that holds before and after each iteration of the loop and satisfies

1. I ∧ B ⇒ asem(L, I)
2. I ∧ ¬B ⇒ P

Semantics (6)

Denotational Semantics

Denotational semantics specifies each language statement in terms of mathematical objects, in this case a function dsem from the state of the program before execution to the state of the program after execution. The state is defined by a function mem from the set of variable identifiers to values.

    dsem(x = expr, mem) = error, if mem(v) is undefined for some variable v in expr.
    dsem(x = expr, mem) = mem′ otherwise, where mem′(y) = mem(y) ∀y ≠ x, and mem′(x) = mem(expr).

If dsem(S1, mem) = mem1 and dsem(S2, mem1) = mem2, then

    dsem(S1; S2;, mem) = mem2

Semantics (7)

    dsem(if B then L1 else L2, mem) = U, where
        U = dsem(L1, mem), if mem(B) = true
        U = dsem(L2, mem), otherwise

    dsem(while B do L, mem) = U, where
        U = mem, if mem(B) = false
        U = dsem(while B do L, dsem(L, mem)), if mem(B) = true.
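The dsem rules translate almost line for line into a recursive function. The following Python sketch is one possible rendering under stated assumptions: statements are encoded as tuples (a hypothetical encoding), expressions and conditions are Python functions over mem, an undefined variable in an assignment yields a sentinel `ERROR`, and `ERROR` is propagated through later statements (the propagation is an assumption; the rules above do not say). Conditions are assumed to mention only defined variables.

```python
ERROR = object()  # stands for the "error" result in the rules


def dsem(stmt, mem):
    """Map the memory before executing stmt to the memory after it."""
    if mem is ERROR:                       # assumption: errors propagate
        return ERROR
    kind = stmt[0]
    if kind == "assign":
        _, var, expr = stmt
        try:
            value = expr(mem)
        except KeyError:                   # some variable in expr is undefined
            return ERROR
        return {**mem, var: value}         # mem' differs from mem only at var
    if kind == "seq":
        _, first, second = stmt
        return dsem(second, dsem(first, mem))
    if kind == "if":
        _, cond, then_stmt, else_stmt = stmt
        return dsem(then_stmt if cond(mem) else else_stmt, mem)
    if kind == "while":
        _, cond, body = stmt
        if not cond(mem):
            return mem
        return dsem(stmt, dsem(body, mem))  # recursion mirrors the rule
    raise ValueError(f"unknown statement kind: {kind}")


# s = 0; i = 1; while i <= n do (s = s + i; i = i + 1)
program = ("seq",
           ("assign", "s", lambda m: 0),
           ("seq",
            ("assign", "i", lambda m: 1),
            ("while", lambda m: m["i"] <= m["n"],
             ("seq",
              ("assign", "s", lambda m: m["s"] + m["i"]),
              ("assign", "i", lambda m: m["i"] + 1)))))

final = dsem(program, {"n": 4})
print(final)
```

For n = 4 the final memory binds s to 10 and i to 5. The recursive treatment of `while` follows the rule directly; a practical interpreter would use iteration to avoid deep recursion on long loops.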

By applying the denotational rules to the program it is possible to compute the value of mem for the entire program.

Compilation vs. Interpretation

There are two approaches to executing a high-level programming language: interpretation and compilation. An interpreter repeatedly executes the following sequence:

1. Get next statement.
2. Decompose the statement into a series of actions.
3. Perform the actions.

Decomposing the statement into a series of actions is a complicated process. Statements inside a loop will be decomposed repeatedly, once per iteration, and this will affect program efficiency.
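The cost of repeated decomposition can be seen in a small Python sketch. It is a toy under stated assumptions: statements are restricted to the hypothetical form `target = left op right`, and the functions `interpret` and `translate_then_run` are illustrative names. The first follows the get/decompose/perform cycle for every execution of a statement; the second decomposes each statement once before the loop runs.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def decompose(statement):
    """Break 'target = left op right' into the parts needed to perform it."""
    target, equals, left, op, right = statement.split()
    if equals != "=" or op not in OPS:
        raise SyntaxError(statement)
    return target, left, OPS[op], right


def operand(token, mem):
    return int(token) if token.isdigit() else mem[token]


def perform(action, mem):
    target, left, apply_op, right = action
    mem[target] = apply_op(operand(left, mem), operand(right, mem))


def interpret(body, times, mem):
    """Decompose every statement each time it is executed."""
    decompositions = 0
    for _ in range(times):
        for statement in body:              # 1. get next statement
            action = decompose(statement)   # 2. decompose it
            decompositions += 1
            perform(action, mem)            # 3. perform the actions
    return mem, decompositions


def translate_then_run(body, times, mem):
    """Decompose each statement once, then only perform."""
    actions = [decompose(statement) for statement in body]
    for _ in range(times):
        for action in actions:
            perform(action, mem)
    return mem, len(actions)


body = ["total = total + i", "i = i + 1"]
print(interpret(body, 100, {"total": 0, "i": 0}))
print(translate_then_run(body, 100, {"total": 0, "i": 0}))
```

Both runs leave the same memory, but the first performs 200 decompositions for a two-statement loop body executed 100 times, while the second performs 2.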

Compilation vs. Interpretation (2)

A compiler will translate a high-level program into an equivalent machine language or intermediate code. Typically program modules are separately translated into relocatable code, and then all the relocatable codes are linked together by a linker into a single relocatable unit. Then the program is loaded into the computer’s memory as executable code by a loader. Because each high-level statement is translated only once, compiled programs are usually much more efficient than interpreted programs. Modern compilers can also optimise the resulting relocatable code.

Bindings

A program will have several entities (variables, functions, statements) that have properties called attributes. A variable will have an identifier, type and storage size; a function will have an identifier, a number of parameters of different types and a return type. The value of an attribute must be set before it is used. Setting the value of an attribute is known as binding. Languages differ in which entities have attributes, when they can be bound, and whether a binding is fixed or can be changed.

Language definition time bindings. The type “integer” is bound at language definition time to a mathematical concept.

Language implementation time bindings. The set of values that a type can take is fixed at language implementation time, when each type is bound to a memory representation.

Bindings (2)

Compile-time bindings. Also known as early binding. An example from C# is

    System.IO.FileStream FS;
    FS = new System.IO.FileStream("C:\\temp.txt", System.IO.FileMode.Open);

Execution-time bindings. Also known as late binding. In most languages variables can be bound to a value at execution time and the binding can be modified repeatedly during the program’s execution.

    object FS = null;
    FS = CreateObject("Scripting.FileSystemObject");

The first three are examples of static binding and the final one is an example of dynamic binding.
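Execution-time binding is easy to observe in a dynamically typed language. The Python sketch below is illustrative only; the classes `Circle` and `Square` and the function `total_area` are hypothetical names. It shows a name whose binding changes while the program runs, and a method call whose target is looked up on each object at the moment the call executes.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def total_area(shapes):
    # Which area() runs is decided per object, when the call is executed.
    return sum(shape.area() for shape in shapes)


value = 42                            # name bound to an int at execution time
first_type = type(value).__name__
value = "forty-two"                   # same name, rebound to a str
second_type = type(value).__name__

print(first_type, second_type)
print(total_area([Square(2), Square(3)]))
```

In a statically typed language the type of `value` would be bound at compile time and the second assignment would be rejected before the program ran.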