Foundations of Computer Science I
Total Page:16
File Type:pdf, Size:1020Kb
Foundations of Computer Science I Dan R. Ghica 2014 Contents 1 Introduction 2 1.1 Basic concepts . .2 1.2 Programming the machine, a historical outlook . .2 1.3 Abstraction . .3 1.4 Functional programming: a historical outlook . .4 2 \Formal" programming: rewrite systems 6 2.0.1 \Theorems" . .7 2.1 Further reading . .8 3 Starting with OCaml/F# 9 3.1 Predefined types . .9 3.2 Toplevel vs. local definition . 11 3.3 Type errors . 11 3.4 Defined types: variants . 12 4 Functions 13 4.1 Pattern-matching and if . 14 5 Multiple arguments. Polymorphism. Tuples. 18 5.1 Multiple arguments . 18 5.2 Polymorphism . 18 5.3 Notation . 19 5.4 Tuples . 20 5.5 More on pattern-matching . 21 5.6 Some comments on type . 22 6 Isomorphism of types 23 6.1 Quick recap of OCaml syntax . 24 6.2 Further reading . 24 7 Lists 25 7.1 Arrays versus lists . 25 7.2 Getting data from a list. 26 8 Recursion. 28 8.0.1 Sums . 28 8.0.2 Count . 29 8.1 Creating lists . 30 8.2 Further reading . 32 8.3 More on patterns . 32 1 1 Introduction 1.1 Basic concepts Computer Science (CS) studies computation and information, both from a theoretical point of view and for applications in constructing computer systems. It is perhaps not very helpful to try and define these two basic, and deeply connected, notions of \computation" and \information", but it is perhaps helpful to talk about some properties they enjoy. Information is what is said to be exchanged in the course of communication. Information is said to be used to represent data, knowledge or ideas. A wonderful example of information is the DNA: it is a pattern of molecules that defines the formation and development of an organism without any need for a conscious or insightful mind. This lack of insight is essential for the understanding of computation, which is the deliberate, automated, mechanical processing or transformation of information. So the questions that CS answers are commonly about what can be computed and how it can be computed. The answers to these questions can be both qualitative, indicating that something can/cannot be computed, and quantitative, indicating how much of a certain resource (time, space, energy) must be consumed in order to perform a computation. The engineering and design side of CS is concerned with how to improve quantitative measures of computation. Computation is carried out by a mindless, uninsightful program executed by a machine. The machines can be either physical (case in which they are called computers) or virtual (a machine which is in some logical sense running on top of another machine) or abstract (a theoretical con- traption which embodies certain ideals of what can be computed). Programs are always written in a language which needs to be well-defined in at least two ways: Syntax. Just like English, or any other natural language, it is important to distinguish between well-formed phrases and gibberish. Note that to make this distinction it is not important to understand the meaning, just the rules of what a well-formed phrase looks like. Consider the opening stanza of the poem Jabberwocky by Lewis Caroll: 'Twas brillig, and the slithy toves Did gyre and gimble in the wabe; All mimsy were the borogoves, And the mome raths outgrabe. It has the syntactic form of English despite using nonsensical made-up words. Programming languages, unlike natural languages, have much stricter and clearly defined syntax. A good programming language has no ambiguity as to whether a phrase is legal or not, and if it is legal as to what its syntactic structure is. In a program a single-character typo can usually render a program illegal. Semantics. If syntax is concerned with what a meaningful program should look like, in semantics the concern is what the program actually means. Just like the syntax of a programming language must have a complete, unambiguous specification so its semantics must define com- pletely and unambiguously what each program means. Defining, or even just understanding, the syntax and semantics of languages is an advanced topic with which we will not deal in this module. Instead we will focus on the activity of writing programs, i.e. programming. 1.2 Programming the machine, a historical outlook Programming is actually a very old activity. The oldest recognisable device used to perform calculations, the abacus, is known since 2400 BCE, although all its calculations are manual. The first mechanical calculators were used to perform astrological calculations, for example the astrolabe, which was invented around 150 BCE. More modern efforts are Pascal's 1642 calculating machine and Leibniz's Stepped Reckoner from 1672. The latter was made into a commercial product, the Arithmometer by C.X. Thomas in 1820. Perhaps the most famous was Babbage's analytical engine, a design continuously improved from 2 1833 until 1871, the year of Babbage's death. Notably, its programs were stored on punchcards and it had control features similar to modern computers. Ada Lovelace is officially the world's first computer programmer, having written a program for the analytical engine to compute a sequence of numbers known as Bernoulli numbers. The first electronic computers appeared before and during World War II. Replacing mechanical components with electronic circuits allowed higher speeds and smaller sizes. Variously claiming to be the first computer are Zuse's Z3 (Germany, 1941), the Atanasoff-Berry computer (USA, 1937) or the Colossus (UK, 1943). ENIAC (US, 1946) was the first computer to resemble in internal structure today's computers. It executed a few thousands instructions per second, it weighed 27 tons and it consumed 150 kW of power. For comparison, a smartphone contains a computer which executes billions of instructions per second, weighs 150 g and consumes a few Watts. The first modern programming language was Fortran, created in 1956 by J. Backus. It was designed especially to program the IBM 704. 1.3 Abstraction Abstraction is a process by which essential concepts are separated or derived from detailed prop- erties. A good way of seeing abstraction at work is in tree-like directories, where more abstract concepts are closer to the root. The shirt of footballer Craig Bellamy is found in the following hierarchy of abstractions: 1. Sports memorabilia 2. Football shirts 3. English League shirts 4. Liverpool shirts. The detailed description is something like Liverpool England Football XL Extra Large Mens Craig Bellamy Wales 2007 Home. An abstraction is not unique, the same shirt being possibly abstracted as red football shirt or shirt of famous Welshmen. In programming, abstraction is tremendously important. Because programs can be very com- plicated, the more detail we need to manage manually the more likely we are to make mistakes. Consider the following two ways of giving directions. Low level: 1. Head northwest for 276 ft 2. turn left, continue for 0.5 mi 3. take 2nd exit at roundabout, continue for 0.2 mi 4. slight left, continue for 0.1 mi 5. turn left, continue for 0.2 mi 6. turn right, continue for 482 ft. High level: 1. Go to Selly Oak train station via Harborne Ln If correctly applied by the driver of a car stationed in front of University Station they will have the same effect: driving to Selly Oak train station. However, the second formulation leaves out many details and, because of that, it is more likely to be carried out correctly. A computer, like a map, can be seen at many different levels of abstraction. At the lowest level we have, like in any physical object, elementary particles, their physical and chemical properties, and their energetic states. Higher up we have the geometrical structures of the processor and the memory etched onto wafers of silicone. Higher still, we have logical gates and connectors. 3 Higher still we have the architecture of the computer, consisting of abstract blocks with distinct functionalities. Programming can happen at this level, but it is incredibly detailed and tedious. Administering the computer is one important program called an operating system (OS), and it is a program written at this level of abstraction, or at least its key components are. It gives a far more abstract view of the computer, quite similar to a virtual machine. Because the same OS can run on quite different computers this higher level of abstraction makes programming far easier, and also portable, because we program the OS and not the actual physical computer. On top of the OS we can have further programming language infrastructure, which gives an even more abstract view of the system. This makes programs even more concise and even more portable, because we do not program an OS, but just a highly abstract programming language. These languages have been called, conventionally, functional programming languages. 1.4 Functional programming: a historical outlook Functional programming (FP) languages are utterly top-down. They abstract away as much as possible from the physical computer and its OS. The focus of FP is not to reflect the capabilities of a certain machine model, but to give a direct way of representing information and computation. As a result, FP is concise, elegant and less error-prone than machine-oriented programming. The roots of FP are not in engineering, but in mathematics, and especially in logic. The father of logic is Aristotle, a Greek philosopher who lived in 3rd Century BCE. His aim was to distinguish between what it means to be true as compared to what is simply widely believed or popular. The study of logic was an important aspect of philosophy throughout history, but the idea of calculation in logic was famously emphasised by Leibniz (17th Century) who proposed lingua characteristica universalis, a universal diagrammatic language for all human knowledge, and an associated calculus ratiocinator, calculational rules to be carried out by machines in order to derive all propositions: If controversies were to arise, there would be no more need of disputation between two philosophers than between two accountants.