
06-05934 Models of Computation The University of Birmingham Spring Semester 2008 School of Science Volker Sorge February 18, 2008 Handout 7 Summary of this handout: Multitape Machines — Multidimensional Machines — Universal Turing Machines — Church’s Thesis

IV.5 Extending Turing Machines

In this handout we will study a few alterations to the basic Turing machine model, which make programming with Turing machines simpler. At the same time, we will show that these alterations are for convenience only; they add nothing to the expressive power of the model.

61. Extending the tape alphabet. One extension that we have already seen in the previous handout is the extension of the tape alphabet, such that it contains more auxiliary symbols than just the blank. For example, for every input symbol x it can contain an auxiliary symbol x′. By overwriting a symbol with the corresponding primed version, we are able to mark fields on the tape without destroying their contents. However, extended alphabets do not really add anything to the power of Turing machines. If we only have the basic alphabets Σ = {a, b} and T = {a, b, ␣} (where ␣ denotes the blank), then we can simulate a larger alphabet by grouping tape cells together, and by coding symbols from the larger alphabet as strings over the smaller. For example, if we group three cells together, then we have 3 · 3 · 3 = 27 possibilities for coding symbols. Given a Turing machine transition table for the large alphabet, we can easily design an equivalent table for the original alphabet. Whenever the first machine makes one move, the simulating machine makes at least three moves, scanning the three cells used for coding individual symbols and remembering their contents in the state. It will then probably need some more moves to execute the operation that the original machine wanted to perform. All in all, the simulating machine will be slower than the original machine by some constant factor, but will be able to perform the same computation.

62. Multitape Turing machines. In the next variant we provide several tapes, each with its own read/write head.
The transition table for a multitape machine is more complicated than that of a single tape machine, because in each step a decision is taken on the basis of the current state of the control unit and the contents of all the cells under the heads of the tapes. The action performed consists of an action for each head individually; in particular, each head can move independently of the others; they do not all have to move in the same direction. Even a single extra tape is quite useful, because we can use it as a kind of “scratch pad”. This will allow the machine to count steps: each time the head on the main tape moves, the head on the scratch pad tape marks down a symbol, e.g., x.

Example: Let us build a two-tape Turing machine M that consists of one input tape T1 and one output tape T2, which is initially empty. The machine’s task is to read an input string, duplicate each symbol in the string and write it as output. So, for example, if the input is abab the output should be aabbaabb. We define the input alphabet for M as Σ = {a, b} and the tape alphabet as T = {a, b, ␣}. The transition function δ : Q × T × T → Q × T × T for M is then:

δ        | (a,a)     (b,b)     (a,␣)     (b,␣)     (␣,␣)
---------+--------------------------------------------------
q0 = 0   |    –         –      (1,a,a)   (1,b,b)   (0,r,␣)
1        | (1,a,r)   (1,b,r)   (2,a,a)   (2,b,b)      –
2        | (2,r,r)   (2,r,r)   (1,a,a)   (1,b,b)    stop
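To check such a transition table, it helps to execute it mechanically. The following is a minimal sketch (not part of the handout; the dictionary representation and helper names are my own) of running M, assuming head 1 starts on the first input symbol and both tapes use the space character as blank:

```python
BLANK = ' '

# delta[(state, sym1, sym2)] = (new_state, action1, action2); an action is
# either a symbol to write or a move 'r'/'l'; 'stop' halts the machine.
delta = {
    (0, 'a', BLANK): (1, 'a', 'a'),
    (0, 'b', BLANK): (1, 'b', 'b'),
    (0, BLANK, BLANK): (0, 'r', BLANK),   # skip right until input is found
    (1, 'a', 'a'): (1, 'a', 'r'),
    (1, 'b', 'b'): (1, 'b', 'r'),
    (1, 'a', BLANK): (2, 'a', 'a'),
    (1, 'b', BLANK): (2, 'b', 'b'),
    (2, 'a', 'a'): (2, 'r', 'r'),
    (2, 'b', 'b'): (2, 'r', 'r'),
    (2, 'a', BLANK): (1, 'a', 'a'),
    (2, 'b', BLANK): (1, 'b', 'b'),
    (2, BLANK, BLANK): 'stop',
}

def apply_action(tape, head, action):
    """Apply one action to a tape: move the head, or write a symbol in place."""
    if action == 'r':
        return head + 1
    if action == 'l':
        return head - 1
    tape[head] = action
    return head

def run(word):
    t1 = dict(enumerate(word))   # tape 1 holds the input
    t2 = {}                      # tape 2 is initially empty
    h1 = h2 = 0
    state = 0
    while True:
        entry = delta[(state, t1.get(h1, BLANK), t2.get(h2, BLANK))]
        if entry == 'stop':
            break
        state, a1, a2 = entry
        h1 = apply_action(t1, h1, a1)
        h2 = apply_action(t2, h2, a2)
    return ''.join(t2[i] for i in sorted(t2)).strip()

print(run('abab'))  # prints aabbaabb
```

Running this on abab produces aabbaabb on tape 2, as the example in the text demands.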

Observe that the above table only contains those transitions that can actually occur. In particular, we have omitted all the input combinations that should never happen under the assumption that tape T2 is empty; that is, we save 4 columns. We could compress the table even further by exploiting the observation that the columns for (a,a) and (b,b), as well as those for (a,␣) and (b,␣), are similar. By defining α ∈ {a, b} we can replace the a’s and b’s by α, thus compressing the table to three columns only. This is often quite helpful when dealing with machines consisting of several tapes and larger alphabets.

Nevertheless, multitape machines are not more powerful than single tape machines, as their computations can be simulated by single tape machines. We outline the construction for a two-tape machine. The idea is to interleave the contents of the two tapes on a single tape, as in the following picture:

[Figure: a two-tape machine, consisting of a control unit attached to tape 1 (containing a a b a) and tape 2 (containing b b a), each with its own read/write head.]

[Picture: the cells of tape 1 and tape 2 interleaved on a single tape, the cell labels 1 and 2 alternating; each data cell is preceded by a marker cell that can hold H1 or H2 to record the position of the corresponding head.]

In addition, before each data field we leave a blank, which can hold a marker to indicate the position of a head (I have written H for this, but it could be any symbol different from blank). Whenever the two-tape machine makes one step, the simulating machine has to do many steps. This works as follows. At the beginning of a simulated step, the single head of the simulating machine is at the left hand side of the current working region. It first scans across the whole working region to read the data that is currently scanned by the various heads. It can find out about this because we have included markers to indicate the positions of the heads. This information (which is finite and of fixed size!) is stored in the state. The simulating machine will then make all changes that the multitape machine would do in its single step. To this purpose, the simulating machine will visit each head marker and either change the field to the right of it, or move the marker to an adjacent valid position (four fields away in the picture above). Note that the simulation of one single step of a multitape machine not only takes many steps on the single-tape machine, but that these take more and more time as the tape fills up with data. However, we are currently interested in computing power and not in computing speed.

Example: Suppose we want to simulate the two-tape machine M from the previous example on a single tape Turing machine M′ using this construction. First we extend the tape alphabet by two symbols H1 and H2. We then have a look at what the tape will look like for the example input abab. Reading the cells in groups of four (head 1 marker, tape 1 cell, head 2 marker, tape 2 cell; the vertical bars only delimit the groups and are not on the tape), before the computation the tape contains

H1 a H2 ␣ | ␣ b ␣ ␣ | ␣ a ␣ ␣ | ␣ b ␣ ␣

and after the computation

␣ a ␣ a | ␣ b ␣ a | ␣ a ␣ b | ␣ b ␣ b | H1 ␣ ␣ a | ␣ ␣ ␣ a | ␣ ␣ ␣ b | ␣ ␣ ␣ b | ␣ ␣ H2 ␣

We design M′ with a high-level diagram. Before doing so, however, we shall first define a couple of component machines to search for and move the head markers H1 and H2, and thus ease the design of the overall machine. We assume that we have the machines r, l, R, L as defined in the previous handout.
In addition we assume that for each symbol in the tape alphabet we have a machine that writes that symbol to the tape, i.e., we have machines a, b, ␣, H1, H2. Given these basic machines, we now define some larger machines as follows:

RH1 : repeat r until H1 is read, then stop    (move right until H1 is found)
LH1 : repeat l until H1 is read, then stop    (move left until H1 is found)
RH2 : repeat r until H2 is read, then stop    (move right until H2 is found)
LH2 : repeat l until H2 is read, then stop    (move left until H2 is found)
4r  : r r r r, then stop                      (go four cells right)
4l  : l l l l, then stop                      (go four cells left)

MRH1 : 4r H1 → stop    (move the head 1 marker one position, i.e., four cells, to the right)

MRH2 : 4r H2 → stop    (move the head 2 marker one position, i.e., four cells, to the right)

Finally we can put together the entire machine M′. Its main idea is to first find the start of the input string (i.e., H1), then write the first character twice on the “second tape”, marked with H2, thereby moving H2 twice, and finally return left to H1 and move it one position right. We continue with this until the input string is processed.

[High-level diagram for M′: start → RH1 → r, then branch on the symbol read:
  on a: RH2 r a l MRH2  r a l MRH2  LH1 MRH1 r, then branch again;
  on b: RH2 r b l MRH2  r b l MRH2  LH1 MRH1 r, then branch again;
  on ␣: stop.]
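To make the interleaved layout concrete, here is a small sketch (my own representation, not part of the handout) that builds the single-tape encoding from two tape contents and their head positions:

```python
BLANK = ' '

def interleave(tape1, h1, tape2, h2):
    """Encode two tapes on one: for each position i the single tape gets
    four cells [marker1, tape1[i], marker2, tape2[i]]; a marker cell
    holds H1/H2 exactly where the corresponding head stands."""
    n = max(len(tape1), len(tape2), h1 + 1, h2 + 1)
    cells = []
    for i in range(n):
        cells.append('H1' if i == h1 else BLANK)
        cells.append(tape1[i] if i < len(tape1) else BLANK)
        cells.append('H2' if i == h2 else BLANK)
        cells.append(tape2[i] if i < len(tape2) else BLANK)
    return cells

# before the computation: input abab on tape 1, tape 2 empty, both heads at 0
print(interleave('abab', 0, '', 0)[:4])  # prints ['H1', 'a', 'H2', ' ']
```

Encoding the situation after the computation, interleave('abab', 4, 'aabbaabb', 8), reproduces the “after” tape shown in the example above.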

63. Multidimensional Tapes. In this variant the machine has a multidimensional tape at its disposal, and in each step it can move the head in any direction. For example, on a two-dimensional storage medium, the head can move up, down, left and right. The following picture illustrates such a machine and its simulation.

[Figure: a two-dimensional tape whose working region contains the rows  a a,  b a,  b b b b.]

On a one-dimensional tape this region is stored row by row, separated by the auxiliary symbol ∗:

∗ a a ∗ b a ∗ b b b b ∗

This variant offers even more than a multitape machine, because on a two-dimensional tape we can consider each row as constituting a tape by itself. What we then have is a machine which has an unbounded number of infinite tapes at its disposal. However, we again do not gain any computing power, only convenience. We indicate how a two-dimensional tape machine can be simulated by a machine with two one-dimensional tapes. This in turn can be simulated on a single tape, as we saw in the previous section. The simulation works as follows: At each stage in the computation, the two-dimensional machine will have filled only finitely many cells with non-blank symbols. We can draw an imaginary rectangle around this region, as in the picture above. The rows of this rectangle can be placed on a one-dimensional tape, separated by some special auxiliary symbol (∗, say). The positions of the heads correspond to each other, i.e., they scan corresponding cells. If the two-dimensional machine moves its head left or right, then so does the simulating machine. If it moves it up or down, then the simulating machine first walks to the ∗ immediately to the left of the current position, counting its steps on the second one-dimensional tape; it then goes to the next ∗ on its left (for up) or right (for down) and moves as many steps right as the count on the scratch pad indicates. If the two-dimensional machine moves its head vertically out of the imaginary rectangle, then we have to create another line of the same length either to the left or to the right of the current stretch of lines. If it moves its head horizontally out of the working rectangle, then we have to add one cell to each line, either on the left or on the right. This amounts to a lot of shifting, but is not difficult in principle.
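The index arithmetic behind a vertical move can be sketched as follows (the function names are my own; positions count cells on the encoded one-dimensional tape, and all rows of the rectangle are assumed to have equal length):

```python
def encode_grid(rows):
    """Lay out the rows of the working rectangle on one tape,
    separated (and enclosed) by the auxiliary symbol '*'."""
    return '*' + '*'.join(rows) + '*'

def move_vertically(pos, width, down):
    """The cell directly below (or above) the current one sits exactly
    width + 1 cells away on the encoded tape: one full row plus one
    separator. This is the count the scratch-pad tape keeps track of."""
    return pos + (width + 1) * (1 if down else -1)

tape = encode_grid(['ab', 'cd'])
print(tape)                                # prints *ab*cd*
print(tape[move_vertically(1, 2, True)])   # below 'a' lies 'c'; prints c
```

On the actual Turing machine the same effect is achieved without arithmetic, by counting unary marks on the second tape, as described above.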

IV.6 Church's Thesis

In this section we will finally show that the Turing machine model is capable of simulating any computer architecture, including itself. This will establish that the Turing machine is not only a suitable model of computation for modern computers, but also that it is the most powerful model known to date, which will be summarised in Church’s Thesis.

64. Simulating a von Neumann computer. The von Neumann architecture is the standard computer design model for contemporary computers. It uses a central processing unit (CPU) and a single separate storage structure to hold both instructions and data. The CPU accesses the memory via a data bus that performs read/write actions in memory cells. Which memory locations are accessed is controlled by the address bus that transmits the corresponding instructions from the memory to the CPU. On the right is a very crude overview of the basic model.

This is actually much simpler than the simulations we have designed so far. The reason is that in a typical von Neumann computer all memory is divided into cells of a fixed size. We therefore do not have to invent clever shifting strategies for increasing data sizes. Furthermore, all machine instructions of a real computer take one or two arguments of a fixed size and return a value of a fixed size. In principle, such operations can be completely tabulated, because there are only finitely many cases to consider. Consequently, we can simulate such operations by state transitions (in a very large but still finite state space). If you find this a bit daring, then you can also consider how the standard integer and floating point operations can be simulated by Turing machine programs. Standard computer memory has an enormous advantage over a Turing machine’s tape, because data can be read and written in a random access fashion. However, a simulation of this is quite easy if we give ourselves a further tape.
On this tape we can write the address we are interested in, and then move down the tape to fetch the data, using the second tape as a counter. Finally, a von Neumann computer stores its machine instructions in the memory itself. Again, this is no problem for the Turing machine model, as we can design microcode for each machine instruction. It follows that a single tape Turing machine can be used to simulate the behaviour of any digital computer. 65. The Universal Turing Machine The strategy we used to simulate a von Neumann architecture on a Turing machine can also be used to simulate Turing machines on Turing machines. This means that we can build a Turing machine which takes as its input an arbitrary transition table and some data and which will then simulate the behaviour of the machine described by this table when started on the given data. The consequence of this possibility is that we can build one single machine — called the Universal Turing machine — which is capable of doing what any other machine could do. Turing, apparently, was the first to realise the practical importance of this idea: Instead of building new machines each time a new task is being tackled, we can instead program the universal Turing machine to do the job at hand. In other words, Turing’s universal machine embodies the idea of the stored program, now commonly associated with von Neumann. Apart from the fact that real computers have limited memory, they are all universal in the precise sense of this section: they can be programmed (in software) to do whatever can be done by machines at all. 66. Turing Completeness We have indicated how to simulate a real computer on a Turing machine. 
The converse is much simpler, of course: We can easily write a Turing machine emulation in any programming language, by using a doubly linked list to represent (a finite stretch of) the tape, a pointer into the linked list as the read/write head, and a long nested switch-statement to simulate the control unit. Of course, if the Turing machine we intend to simulate uses more and more cells on its tape then we may get into trouble with the fact

that real machines have only a finite number of memory cells. However, we said at the beginning of this part of the course that it would be artificial and pointless to try and impose a fixed limit on the amount of memory available. Many more models of computation have been proposed over the last 70 years. In each case, it turned out that they were equivalent in computing power to Turing machines. If a model of computation is equivalent to the Turing machine model, we say that it is Turing-complete. Showing Turing-completeness for a computational model involves proving that the two models can mutually simulate each other. In more detail, one has to show that the other model is as powerful as the Turing machine model by simulating a Turing machine's

(i) control unit and its state changes;

(ii) unlimited tape;

(iii) read/write head and its actions, i.e., writing, reading and moving left or right on the tape.
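The emulation described above (a linked list for the tape, a pointer for the head, a switch-statement for the control unit) can be written down in a few lines. Here is a minimal sketch in Python; the representation is my own choice (a dictionary rather than a linked list for the tape, an integer for the head position, and a transition table standing in for the nested switch-statement):

```python
BLANK = ' '

def run_tm(delta, tape_str, state=0):
    """Emulate a single-tape Turing machine.
    delta[(state, symbol)] -> (new_state, write, move), move in {'l', 'r', None};
    the machine halts on any (state, symbol) pair without an entry."""
    tape = dict(enumerate(tape_str))              # (ii) the (unbounded) tape
    head = 0                                      # (iii) the read/write head
    while (state, tape.get(head, BLANK)) in delta:
        # (i) the control unit and its state changes
        state, write, move = delta[(state, tape.get(head, BLANK))]
        tape[head] = write
        if move == 'r':
            head += 1
        elif move == 'l':
            head -= 1
    if not tape:
        return ''
    lo, hi = min(tape), max(tape)
    return ''.join(tape.get(i, BLANK) for i in range(lo, hi + 1)).strip()

# example: a machine that replaces every a by b, moving right until a blank
delta = {(0, 'a'): (0, 'b', 'r'), (0, 'b'): (0, 'b', 'r')}
print(run_tm(delta, 'aba'))  # prints bbb
```

The dictionary grows on demand, so the tape is unlimited in effect; the finiteness of real memory only surfaces once the dictionary no longer fits, exactly as the text observes.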

To ensure that the other model is not more powerful than a Turing machine, one has to demonstrate how the components of the other model can be simulated on a Turing machine. What we have to do there depends, of course, on the model in question.

Example: Suppose a Turing machine has a tape which is infinite in one direction only, i.e., at one end (say the left one) there is a fixed beginning marked with a special symbol that the machine recognises and which stops it from “falling off the tape”. We can show that this machine is nevertheless Turing-complete. Since our machine already has a control unit and a read/write head, we do not have to simulate them. All we have to worry about is the limitation on the left of the tape. Suppose the read/write head reaches this end of the tape but wants to go one step further left; then what the machine can do is shift the entire content (which is finite!) of the tape by one cell to the right to make more space. This can be achieved by adding microcode that performs this operation whenever the read/write head sees the end of tape symbol. Thus the Turing machine with the partially limited tape is not less powerful than the one with an unlimited tape. Conversely, it is trivial to see how we can simulate the semi-limited Turing machine on the unlimited tape.

67. Church's Thesis. The fact that for decades no-one came up with a more powerful computational model led the American logician Alonzo Church to postulate that “computation” per se is fully captured by any of the known computational models, in particular, by the Turing machine model. He claimed that there is no other way to compute.

Church's Thesis: The intuitive concept of an “effective procedure” is fully captured by the Turing machine model.

If we accept Church’s Thesis then it follows that a problem which is unsolvable by a Java program is unsolvable by any computational device whatsoever. Church’s Thesis cannot be proved.
It is entirely possible that someone invents a way of computation which truly extends the capabilities of programs as we know them. In fact, one may argue that the human brain is exactly such a device. On the other hand, it is a striking fact that nobody has been able to come up with an alternative to Turing machines. The more time passes, the more plausible Church’s Thesis becomes.

Exercise Sheet 7

Quickies (I suggest that you try to do these before the Exercise Class. Hand in your solutions to the tutor at the end of the class.)

34. Design a Turing machine M over input alphabet Σ = {a, b} that, when started directly left of the input string, writes twice as many a’s as are contained in the input string to the right of the input string. In other words, given w ∈ {a, b}∗ as input, the resulting tape should contain w a^(2i), where i is the number of a’s in w. Give the high-level diagram for the computation of M. 1

35. Design a two-tape Turing machine with input alphabet Σ = {a, b, c} that reverses an input string. Assume that tape one is the input tape with the given string and the read/write head is left at the start of the string. Assume also that tape two, the output tape, is initially empty. Give the transition table of the machine. 1

Classroom exercises (Hand in to your tutor at the end of the exercise class.)

36. Consider a machine that consists of a control unit together with two unlimited stacks. In each transition the machine can pop one symbol from either stack, push one symbol onto either stack, and change the state of the control unit. Explain why this machine is as powerful as a Turing machine, i.e., show how a TM can simulate the two-stack machine and vice versa. 2

37. Consider the two-tape Turing machine M over the input alphabet Σ = {a, b} given by the transition table for δ : Q × T × T → Q × T × T below.

δ        | (a,a)    (a,b)    (b,b)    (b,a)    (a,␣)    (b,␣)    (␣,a)    (␣,b)    (␣,␣)
---------+------------------------------------------------------------------------------
q0 = 0   | (0,a,r)  (0,a,r)  (0,b,l)  (0,b,l)  (0,r,a)  (0,r,b)   stop     stop   (0,r,␣)

What is the computation the machine M performs if tape one contains an arbitrary input string from {a, b}∗ somewhere to the right of the read/write head, and if tape two is initially empty? 2

Homework exercises (Submit via the correct pigeon hole before next Monday, 9am.)

38. Design a Turing machine M which, when started on the empty tape, will print all strings of the form a^i, i ≥ 1, in ascending order, separated by blanks. In other words, the machine will run forever and write all elements of the language {a}∗ \ {ε} on the tape. Give the transition function δ for M. [Hint: You can reuse some of the component machines we have defined in the previous handout.] 2

39. Build a three-tape Turing machine over Σ = {a} and T = {a, ␣}. Assume that tapes one and two are input tapes containing exactly one string each, and the read/write heads point to a cell somewhere left of those strings. The third tape is the output tape, which is initially empty. Given input a^n and a^m on tapes one and two, the machine computes a^(m·n) on tape three. 2

Stretchers (Problems in this section go beyond what we expect of you in the May exam. Please submit your solution through the special pigeon hole dedicated for these exercises. The deadline is the same as for the other homework.)

S9. We define a new type of machine as follows: Our machine has an arbitrary but finite number n of memory slots. Each memory slot can hold a non-negative integer of arbitrary size. In a single computational step the machine can either increment or decrement a memory slot by 1. We denote incrementing the i-th memory slot by Ai and decrementing the i-th memory slot by Si. A memory slot can never be decremented below 0. We can program our machine with the following programs:

(i) Ai and Si for 1 ≤ i ≤ n are programs. (ii) For two programs M1 and M2 their concatenation M1M2 is also a program for our machine (i.e., first M1 is executed and afterwards M2 is executed).

(iii) For a program M, the iteration (M)i, 1 ≤ i ≤ n, is also a program, with the effect that M is executed as long as the value of the i-th memory slot is different from 0.

The configuration of, for example, a machine with three memory slots is denoted by x, y, 0, mean- ing that memory slot 1 contains x, slot 2 contains y, and slot 3 contains 0. Examples of programs for the machine are:

x, y, 0        x, y, 0        x, y, 0          x, y, 0
  ↓ A1           ↓ S2           ↓ A1S2A3         ↓ (S1)1
x+1, y, 0      x, y−1, 0      x+1, y−1, 1      0, y, 0
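The semantics of these programs can be pinned down with a small interpreter (a sketch; the tuple representation of programs is my own, not part of the handout):

```python
def run_program(program, slots):
    """Run a program on a list of memory slots (slot i of the handout is
    slots[i-1]). Instructions: ('A', i) increments slot i, ('S', i)
    decrements it but never below 0, and ('loop', i, body) is the
    iteration (body)_i, repeating body while slot i is non-zero."""
    slots = list(slots)
    for instr in program:
        if instr[0] == 'A':
            slots[instr[1] - 1] += 1
        elif instr[0] == 'S':
            slots[instr[1] - 1] = max(0, slots[instr[1] - 1] - 1)
        else:
            _, i, body = instr
            while slots[i - 1] != 0:
                slots = run_program(body, slots)
    return slots

# the third example above, A1S2A3, started on 2, 3, 0:
print(run_program([('A', 1), ('S', 2), ('A', 3)], [2, 3, 0]))  # prints [3, 2, 1]
```

Concatenation is simply list concatenation here, matching rule (ii) above.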

(a) Write a program P for the above machine type that, given input x, y, 0, computes x, y, x·y. You can take as many auxiliary memory slots as you like. (b) Argue that machines of this type are Turing-complete. 10 bonus points

Models of Computation Glossary 7

Address Bus    A channel to communicate the memory addresses the CPU should access in a von Neumann architecture.

Central Processing Unit    The computational control unit in a von Neumann architecture.

Church's Thesis    The claim that all effectively computable problems can be computed by a Turing machine.

CPU    Short for Central Processing Unit.

Data Bus    A channel enabling the CPU to have read/write access to memory cells in a von Neumann architecture.

Turing-complete    A model of computation is Turing-complete if it has the same expressive power as a Turing machine.

Universal Turing Machine    A Turing machine that can simulate all other Turing machines, which are encoded as input on its tape.

Von Neumann Architecture    The standard computer design model for contemporary computers, consisting of a CPU and separate memory for data and program.