6.3 The Fundamental Questions

6.3.1 Name Binding and

Names in Prolog are used to represent predicates and terms. Predicates are used to express a relationship between terms. For example, the statement below could represent the relationship that Jay is the father of Baron. (It could just as well represent the relationship that Baron is the father of Jay, since the father predicate has no intrinsic meaning in Prolog.) This is the simplest type of statement and is known as a fact. The name father is the predicate used in this fact.

    father(jay, baron).

Predicates are also used in rules, which define new relationships in terms of existing relationships. For example, the statement below defines the sibling relationship: X and Y are siblings if they have the same father and X is not equal to Y. The predicates in this example are sibling, father, \+ (not) and = (equal). The left side of the :- (pronounced “if”) operator is called the head of the rule; the right side is called the body. The comma-separated terms on the right side are called the conditions of the rule.

    sibling(X, Y) :- father(Z, X), father(Z, Y), \+(X = Y).

The Prolog term is the single data structure in the language. Terms are either atoms, numbers, variables or compound terms. An atom is a name with no intrinsic meaning that is used to represent a constant value in the domain. An atom is a sequence of alphanumeric characters beginning with a lower case letter or a sequence of characters enclosed in single quotes. Examples of atoms include:

    jay
    'Jay Fenwick'
    x
    'some $'

Notice that enclosing a sequence of characters in single quotes allows us to have constants in our domain that start with an upper case letter and/or include non-alphanumeric characters.

Numbers can also be the terms of predicates. For example, a multiplication table can be expressed with a series of statements of the following form:

    multiply(1, 1, 1).
    multiply(1, 2, 2).
    multiply(1, 3, 3).

Variables are represented by a sequence of letters, numbers and underscores, starting with either an upper case letter or an underscore. A single underscore is a special type of variable, known as the anonymous variable. As seen in the previous section, variables are used in rules or queries and are instantiated during the inferencing process, except for the anonymous variable, which is used in pattern matching but is never instantiated. A variable can be instantiated to any term, including atoms, numbers, compound terms and other variables. Examples of variables include:

    X
    Thefather
    _father
    _
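The role of the anonymous variable can be seen in a small sketch. The is_father predicate below is a hypothetical helper, not part of the earlier examples; it asks only whether someone is a father and ignores who the child is.

```prolog
% Facts in the style of the father example earlier in this section.
father(jay, baron).
father(jay, grayson).

% is_father(X) holds if X is the father of anyone at all.
% The anonymous variable _ matches the child, but no binding for it
% is kept or reported. (is_father is a hypothetical name.)
is_father(X) :- father(X, _).
```

The query is_father(jay) succeeds and is_father(baron) fails; in neither case does Prolog report a value for the child.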

A compound term is either a list or a functor term. A Prolog list consists of elements separated by commas enclosed in square brackets. Examples of Prolog lists can be seen below; the first list is the empty list or nil.

    []
    [a, b, ]
    [a, [b]]
    [1, a, ['hello']]

A string, enclosed in double quotes rather than single quotes, is actually equivalent to a list of the encoded values of the characters within the string. This can be seen in the example below. The = operator attempts to unify the term on the left with the term on the right and does so by instantiating X, Y and Z to the ASCII codes for a, b, and c, respectively.

    | ?- "abc" = [X, Y, Z].

    X = 97
    Y = 98
    Z = 99
    yes
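How double-quoted text is represented depends on the implementation; for instance, recent versions of SWI-Prolog treat it as a separate string type by default, so the query above may fail there. A minimal sketch that avoids the question uses the ISO built-in atom_codes to obtain the same list of codes from an atom. The wrapper name codes_of is an assumption made for illustration.

```prolog
% codes_of(Atom, Codes): Codes is the list of character codes of Atom.
% atom_codes/2 is an ISO built-in; codes_of is a hypothetical wrapper.
codes_of(Atom, Codes) :- atom_codes(Atom, Codes).
```

The query codes_of(abc, [X, Y, Z]) instantiates X, Y and Z to 97, 98 and 99.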

A functor term is composed of an atom, known as a functor, and a number of argument terms enclosed within parentheses. A functor term looks like a function call; however, no transfer of control occurs. Rather, a functor term, like an atom, is used to represent some value in the domain. Consider the example below. The first three facts represent father-son relationships between the first argument of the predicate and the second. fatherof('Baron') does not evaluate to another value; it is simply a value in the domain, much like 'Baron', 'Grayson' and 'Ethan' are values in the domain. Note that the single quotes are used with 'Baron', 'Grayson', and 'Ethan' to prevent these atoms from being treated as variables. The rule can be used to identify sibling relationships, for example, determining that 'Baron' and 'Grayson' are siblings.

    father('Baron', fatherof('Baron')).
    father('Grayson', fatherof('Baron')).
    father('Ethan', fatherof('Baron')).
    sibling(X, Y) :- father(X, Z), father(Y, Z), \+(X = Y).

A procedure in Prolog is a set of statements whose head begins with the same predicate. For example, the set of statements below is a Prolog procedure.

    myappend([], L, L).
    myappend([H|T], L, [H|R]) :- myappend(T, L, R).

Prolog performs static scoping and the scope of a Prolog variable is the statement in which the variable appears. For example, the scope of the variable L in the first myappend statement is only that myappend statement. The variable L in the second statement is unrelated to the variable L in the first statement. (If you want to relate this to the block scoping of imperative languages, each statement would be equivalent to a block.) As noted earlier, each call to the procedure will result in a set of variables allocated for that procedure call. The second myappend rule may be used multiple times in the attempt to solve a myappend query. Each use of the rule will result in the creation of space for the rule variables. Thus, the instantiation of L in one use of the rule will be different than the instantiation of L in a second use of the rule.
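Because every use of the second myappend rule gets its own H, T, L and R, the same two statements can be applied repeatedly within one query without the instantiations interfering. The sketch below collects every way of splitting a list; splits is a hypothetical helper, and findall is a standard built-in that gathers all solutions of a goal into a list.

```prolog
myappend([], L, L).
myappend([H|T], L, [H|R]) :- myappend(T, L, R).

% splits(List, Pairs): Pairs holds every Front-Back pair such that
% appending Front and Back gives List. Each solution uses the second
% myappend rule a different number of times, each with fresh variables.
splits(List, Pairs) :-
    findall(Front-Back, myappend(Front, Back, List), Pairs).
```

The query splits([a, b], P) gives P = [[]-[a,b], [a]-[b], [a,b]-[]].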

Prolog is dynamically typed, which means that type errors are not detected until runtime. If a function is applied to a term with an invalid type, then Prolog will report the error when the function is applied. Unlike most languages, Prolog does not detect the use of a predicate with the wrong number of parameters as a type error. Instead, Prolog searches for a predicate with the given arity and reports that a predicate with that arity is undefined.
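The runtime nature of type checking can be observed with a small sketch. It uses the catch predicate, which is covered in Section 6.3.6, to turn a type error raised by is into an ordinary term. The name safe_eval and the value and type_error wrappers in its result are assumptions made for this illustration.

```prolog
% safe_eval(Expr, Result): evaluate Expr with is/2.
% On success Result is value(V); if evaluation raises a type error,
% Result is type_error(Type, Culprit) instead of aborting the query.
% (safe_eval is a hypothetical helper name.)
safe_eval(Expr, Result) :-
    catch(( Value is Expr, Result = value(Value) ),
          error(type_error(Type, Culprit), _),
          Result = type_error(Type, Culprit)).
```

The query safe_eval(1 + 2, R) gives R = value(3). The query safe_eval(foo + 1, R) is accepted by the compiler and fails only when is tries to evaluate the atom foo, at which point R is bound to a type_error term.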

Current implementations of Prolog compilers typically target the Warren Abstract Machine (WAM). Similar to the implementation of imperative languages, the WAM uses one or more stacks to store variables and control information, a heap area for data values and an area called the trail to hold information for restoring values when backtracking occurs. These data areas grow while the resolution process creates more subgoals to prove and shrink during backtracking. Space is automatically allocated as needed, and garbage collection reclaims space that is no longer needed.

6.3.2 Expressions

Prolog statements, or expressions, are the equivalent of Horn clauses. A Horn clause is a clause in one of the following two forms:

    (P1 ^ P2 ^ … ^ Pn) -> Q
    (R1 ^ R2 ^ … ^ Rn)

These would be written in Prolog in the form below. The Q predicate expression is known as a rule. The Ri predicates are called facts.

    Q :- P1, P2, …, Pn.
    R1.
    R2.
    . . .
    Rn.

Thus the Prolog :- operator is logically equivalent to implication and the comma is equivalent to logical and.

Prolog also defines the ; operator to be equivalent to logical or. For example, the expression below is equivalent to Q if P1 or P2.

    Q :- P1; P2.

The statement above is equivalent to the two below.

    Q :- P1.
    Q :- P2.

The , operator has higher precedence than the ; operator, thus the statement below:

    Q :- P1, P2; P3, P4.

is equivalent to:

    Q :- P1, P2.
    Q :- P3, P4.

Parentheses can be used to change the precedence. For example,

    Q :- P1, (P2 ; P3), P4.

is logically equivalent to

    Q :- P1, P2, P4.
    Q :- P1, P3, P4.
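The equivalence can be checked on a small set of facts. In the sketch below the predicates p1 through p4 and their facts are invented for illustration; q_or uses the parenthesized form and q_split uses two separate rules.

```prolog
% Hypothetical facts chosen so that both a and b are solutions.
p1(a).  p1(b).
p2(a).
p3(b).
p4(a).  p4(b).

% One rule with a parenthesized disjunction.
q_or(X) :- p1(X), ( p2(X) ; p3(X) ), p4(X).

% The same relation written as two rules without ;.
q_split(X) :- p1(X), p2(X), p4(X).
q_split(X) :- p1(X), p3(X), p4(X).
```

Both q_or(X) and q_split(X) have the solutions X = a and X = b. With these facts the solutions also appear in the same order, although in general the two forms are only guaranteed to agree on the set of solutions, not on the order in which they are found.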

Prolog expressions can also consist of arithmetic operators and comparison operators that have the expected precedence (for example, arithmetic operators have a higher precedence than comparison operators, and / and * have higher precedence than + and -). In addition, Prolog provides a set of equality and inequality operators and an is operator that evaluates its right operand and attempts to unify the value with the left operand. The is operator can be used to instantiate variables, somewhat similar to assignment statements in imperative languages. Descriptions and example uses of these operators can be seen in the table below.

Operator: is
Description: Evaluates the expression on the right and determines whether it can be unified with the expression on the left. The expression on the left is not evaluated, thus the last example evaluates to no.
Examples:
    | ?- X is 3.
    X = 3
    yes
    | ?- 3 is 3.
    yes
    | ?- 3 + 4 is 3 + 4.
    no

Operator: =, \=
Description: Determines whether the left operand and the right operand can or can not be unified.
Examples:
    | ?- [X, Y] = [a, b].
    X = a
    Y = b
    yes
    | ?- X = 3.
    X = 3
    yes
    | ?- X = Y.
    Y = X
    yes

Operator: ==, \==
Description: Determines whether the left and right operands are identical terms; neither operand is evaluated. In the first example, X is unbound and thus is not identical to 3.
Examples:
    | ?- X == 3.
    no
    | ?- 3 * 4 == 3 * 4.
    yes
    | ?- X is 3, X == 3.
    X = 3
    yes

Operator: <, >, =<, >=
Description: Comparison operators. Notice the less than or equal to operator is =<.
Examples:
    | ?- 5 >= 4.
    yes
    | ?- X is 4, 3 =< X.
    X = 4
    yes

Operator: /, *
Description: Operators for division and multiplication.
Examples:
    | ?- X is 2 * 3, 10 > X.
    X = 6
    yes
    | ?- X is 3, Y is 10/X.
    X = 3
    Y = 3.3333333333333335
    yes

Operator: +, -
Description: Operators for addition and subtraction.
Examples:
    | ?- X is 3, Y is 4 - X + 7.
    X = 3
    Y = 8
    yes
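The difference between comparing terms with == and comparing values obtained with is can be made concrete with a short sketch. The predicate names same_term and same_value are hypothetical.

```prolog
% same_term(A, B): A and B are identical terms; nothing is evaluated.
same_term(A, B) :- A == B.

% same_value(A, B): A is evaluated first with is/2, and the resulting
% number is then compared with B.
same_value(A, B) :- V is A, V == B.
```

The query same_term(3 * 4, 3 * 4) succeeds because the two terms are written identically, while same_term(3 * 4, 12) fails because the compound term 3 * 4 is not the number 12. The query same_value(3 * 4, 12) succeeds.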

Prolog also supplies a number of built-in predicates. For example, the atom predicate evaluates to true if its argument is an atom.

    | ?- atom(hello).
    yes
    | ?- atom('hello').
    yes
    | ?- atom("hello").
    no
    | ?- atom(3).
    no

In fact, the operators described earlier are simply predicates that have been invoked using infix, rather than prefix, notation. These operators can also be called using prefix notation like the examples below.

    | ?- =(X, 3).

    X = 3
    yes
    | ?- >(4, 3).
    yes

A comprehensive look at all Prolog predicates is beyond the scope of this book; however, the concept of negation as failure is unusual and especially worth visiting. Negation as failure is the inferencing rule that derives not P from the failure to derive P. In other words, P is not determined to be false; rather, P can not be proven to be true and this is used to derive that not P is true. Thus a failure is equivalent to false. Negation as failure is implemented in Prolog with the \+ predicate. (In many Prolog implementations, this is also named not.) Understanding this implementation can help you avoid some surprising results. For example, take another look at this program that defines some father and sibling relationships. Unlike the previous version, the \+ condition is first, rather than last, in the rule.

    father('Baron', 'Jay').
    father('Grayson', 'Jay').
    sibling(X, Y) :- \+(X = Y), father(X, Z), father(Y, Z).

The program handles sibling queries properly when provided a possible solution. For example,

    | ?- sibling('Baron', 'Grayson').
    yes
    | ?- sibling('Grayson', 'Jay').
    no

However, when Prolog is asked to come up with instantiations for X and Y, the interpreter responds with no.

    | ?- sibling(X, Y).
    no

Because X and Y are both unbound when the first condition is tried, X = Y succeeds by unifying the two variables. Since X = Y does not fail, the condition \+(X = Y) fails and so does the query. Note also that \+ never instantiates variables, so the first condition could not have supplied values for X and Y in any case. The fix is to move the \+ condition to the end of the rule, which ensures that X and Y are instantiated before the negation as failure predicate is used.

    sibling(X, Y) :- father(X, Z), father(Y, Z), \+(X = Y).

| ?- sibling(X, Y).

X = 'Baron' Y = 'Grayson' ? ;

X = 'Grayson' Y = 'Baron' ? ;

(1 ms) no
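That negation as failure leaves no bindings behind can be checked with a one-line sketch. The predicate can_be_one is hypothetical; it applies \+ twice, so it succeeds exactly when the unification inside would succeed, yet the variable is still unbound afterwards.

```prolog
% can_be_one(X) succeeds if X can be unified with 1.
% The inner \+ fails when the unification succeeds, the outer \+ then
% succeeds, and the binding made inside is discarded along the way.
can_be_one(X) :- \+ \+ (X = 1).
```

The query can_be_one(X), var(X) succeeds, which shows that X is still uninstantiated after the call, and can_be_one(2) fails.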

6.3.3 Managing Flow Control

Prolog is similar to pure functional languages in that repetition is via recursion, rather than iteration. Prolog functions, in fact, look similar to functions in Haskell or ML that utilize pattern matching. For example, consider the Haskell myMember function that was defined in Chapter 5. If the parameters match the first pattern, then the second argument is the empty list and the function returns False. Otherwise, the second pattern is used to obtain the first element of the second parameter, which must be a list. The function returns True if the first argument, x, is equal to the first element of the list or, if this is not true, the value of the recursive call to myMember.

    myMember _ [] = False
    myMember x (y:z) = (x == y) || myMember x z

Here is the same function in Prolog.

    myMember(X, [X|_]).
    myMember(X, [_|Y]) :- myMember(X, Y).

One difference between a Prolog function and a Haskell function is that in Haskell, the programmer would express the pattern that results in a return value of False. For example, in the myMember function, the return value is False if the second parameter is the empty list as specified by the first statement. False results are not specified in Prolog because Prolog makes a closed world assumption. The closed world assumption is that nothing exists outside of the facts and rules specified in the Prolog program. Making this assumption means that if a query can not be proven to be true, then it must be false.

A second difference between Prolog functions and pattern matching functions in Haskell and other functional languages is that the Prolog inferencing engine performs backtracking. In Haskell, the first pattern that matches the function call will cause the evaluation of the expression on the right side of the = and the return of that value. In Prolog, matching a goal to a rule will provide one or more subgoals to be proven. If a subgoal can not be proven, the backtracking process will try to prove an earlier subgoal in a different way. If the subgoals can not all be proven, an attempt to prove the goal using another rule will be made. For example, using the Prolog statements below, suppose the goal to be proven is p(X). The Prolog inferencing engine will first attempt to prove q(X, Y), determining instantiations for X and Y. Next, using the instantiations for X and Y, Prolog will attempt to prove the subgoal r(X, Y). If this fails, Prolog will attempt to re-prove q(X, Y), finding new instantiations for X and/or Y. If none can be found, the inferencing engine will attempt to prove s(X). Notice that the rules and facts are used from top to bottom and the subgoals in the body of each rule are attempted in order from left to right. Thus, the left to right ordering of the conditions expresses the sequential order in which these subgoals are to be solved.

    p(X) :- q(X, Y), r(X, Y).
    p(X) :- s(X).
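The search order just described can be traced on concrete data. In the sketch below the facts for q, r and s are invented for illustration, and all_p is a hypothetical helper that uses the standard findall built-in to collect the solutions in the order Prolog finds them.

```prolog
% Hypothetical facts: the first q fact has no matching r fact.
q(1, a).
q(2, b).
r(2, b).
s(3).

p(X) :- q(X, Y), r(X, Y).
p(X) :- s(X).

% all_p(L): L lists every solution of p(X) in the order found.
all_p(L) :- findall(X, p(X), L).
```

For the goal p(X), Prolog first proves q(1, a), fails on r(1, a), backtracks to prove q(2, b), and then proves r(2, b), giving X = 2. When asked for more solutions it finds no further q facts and moves on to the second rule, which gives X = 3. The query all_p(L) therefore gives L = [2, 3].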

6.3.4 Defining Data Types

As noted earlier, the term is the single data structure in Prolog, encompassing atoms, numbers, variables, functors and lists. User defined data types are easily built by defining new predicates and new terms. For example, below is a user defined list of family members. Specifically, family is a predicate and its argument is a list of terms; the predicate defines a family relation. The first line of the program is a comment.

    % family is a list containing father, mother, and children
    family([jay, cindy, [baron, grayson, ethan]]).
    family([charlie, tricia, [brenden, bradley, thomas]]).
    family([matt, gini, []]).
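A user defined structure like this is taken apart by pattern matching in the head or body of a rule. The sketch below adds two hypothetical predicates, child_of and childless, on top of the family facts; in_list is a small list membership helper defined here so that the example is self-contained.

```prolog
% family is a list containing father, mother, and children
family([jay, cindy, [baron, grayson, ethan]]).
family([charlie, tricia, [brenden, bradley, thomas]]).
family([matt, gini, []]).

% in_list(X, L): X is an element of the list L.
in_list(X, [X|_]).
in_list(X, [_|T]) :- in_list(X, T).

% child_of(Child, Father): Child appears in the children list of the
% family whose first element is Father. (hypothetical helper)
child_of(Child, Father) :-
    family([Father, _, Children]),
    in_list(Child, Children).

% childless(Father): the family of Father has an empty children list.
childless(Father) :- family([Father, _, []]).
```

The query child_of(grayson, F) gives F = jay, and childless(F) gives F = matt.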

6.3.5 Subroutines and Parameter Passing

Each attempt to prove a goal causes a procedure call in Prolog with the matching head of a rule or fact representing the procedure being called. Prior to making the procedure call, a unification must occur between the goal and the head of the rule or fact. Informally, the constraints of the unification process are:
• Two predicates unify if they have the same name and arity and their arguments unify. For example, p(X, Y) does not unify with q(X, Y) because p and q are two different names.
• Two functor terms unify if they have the same name and arity and their terms unify. For example, f(a, b, X) unifies with f(a, Y, c) because the functors have the same name and arity and their terms unify.
• Two different constants do not unify. For example, fruit(apple) does not unify with fruit(banana).
• A constant unifies with a variable. For example, fruit(apple) unifies with fruit(X).
• Two variables unify. For example, fruit(X) unifies with fruit(Y).
• Unification requires that all instances of the same variable in a rule must get the same value. For example, p(X, b, X) does not unify with p(Y, Y, c) because Y must be b to make the second arguments equal and X must be c to make the third arguments equal. This makes the instantiation of the first predicate p(c, b, c) and the instantiation of the second predicate p(b, b, c), which are not the same term.
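Several of the cases in the list above can be tried directly with the = operator, which performs the same unification that is used when a goal is matched against a head. In this sketch the predicate names beginning with example_ are hypothetical; each body is one unification taken from the list.

```prolog
% Functor terms with the same name and arity: succeeds, binding
% X to c and Y to b.
example_functor(X, Y) :- f(a, b, X) = f(a, Y, c).

% Two different constants inside the same functor: always fails.
example_constants :- fruit(apple) = fruit(banana).

% The same variable would need two different values: always fails.
example_shared(X, Y) :- p(X, b, X) = p(Y, Y, c).
```

The query example_functor(X, Y) gives X = c and Y = b, while example_constants and example_shared(X, Y) both fail.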

The parameter passing is by matching rather than by a list of arguments. For example, the goal p([a, b, c], X) matches the head of this rule:

    p([X|Y], Z) :- . . .

causing space to be allocated for the rule variables X, Y and Z, with X given the value a, Y given the value [b, c] and Z uninitialized, although unified with the variable X of the goal.

Although Prolog functions look similar to functions in languages that support patterns in parameters, the parameter passing is very different. Take another look at the Haskell and Prolog myMember functions.

    -- Haskell myMember
    myMember _ [] = False
    myMember x (y:z) = (x == y) || myMember x z

    % Prolog myMember
    myMember(X, [X|_]).
    myMember(X, [_|Y]) :- myMember(X, Y).

There are two key differences in the way Haskell and Prolog functions are called. First, a Prolog function can be called with uninstantiated variables. For example, the Prolog myMember function can be called like this:

    | ?- myMember(X, [a, b, c]).
    X = a

and the Prolog inferencing engine will determine values for the parameter X. Second, a variable can appear only once in a Haskell pattern. In Prolog, a variable can appear multiple times in a statement and each occurrence of the variable would be instantiated to the same value. For example, the first Prolog fact uses X as the first parameter and the first element of the second parameter. This fact would match queries like this:

    | ?- myMember(a, [a, b, c]).
    yes
    | ?- myMember(a, [a]).
    yes

In Haskell, these values are named differently (x and y) and a check must be provided to determine whether x and y are equal.
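Calling myMember with an uninstantiated first argument yields one solution at a time; typing ; at the prompt asks for the next one. The sketch below gathers all of them at once with the standard findall built-in. The helper name members is an assumption made for illustration.

```prolog
% Prolog myMember
myMember(X, [X|_]).
myMember(X, [_|Y]) :- myMember(X, Y).

% members(List, Xs): Xs lists every value of X for which
% myMember(X, List) can be proven, in the order found.
members(List, Xs) :- findall(X, myMember(X, List), Xs).
```

The query members([a, b, c], Xs) gives Xs = [a, b, c], showing that the same two statements that test membership also enumerate the elements.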

6.3.6 Exception Handling and Event Handling

Exception handling in Prolog is provided with throw and catch functions. The catch function takes three arguments. The first argument is the goal to be performed (comparable to the body of a try in Java). The second argument is the exception to be caught while solving the goal and may be a variable to allow any type of exception to be caught. The third argument is the function to be executed in the event of an exception. The example below shows the use of a catch function to catch an exception thrown by Prolog. The causeError statement performs a deliberate divide by zero. The handleError function simply outputs an error message.

    divide :- catch(causeError, X, handleError(X)).
    handleError(X) :- write('Error' : X), nl.
    causeError :- _Result is 10 / 0.

The output below shows what is displayed by Prolog when the divide function is invoked.

    | ?- divide.
    Error:error(evaluation_error(zero_divisor),(is)/2)
    yes
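The second argument of catch does not have to be a plain variable. A minimal sketch, assuming the standard error(Formal, Context) shape of built-in errors seen in the output above: the pattern below catches only a division by zero and lets any other exception propagate. The name safe_divide and the atom undefined are hypothetical choices.

```prolog
% safe_divide(X, Y, Q): Q is X / Y, or the atom undefined when the
% division raises a zero_divisor evaluation error. Exceptions that do
% not match the pattern are not caught here.
safe_divide(X, Y, Q) :-
    catch(Q is X / Y,
          error(evaluation_error(zero_divisor), _),
          Q = undefined).
```

The query safe_divide(10, 4, Q) gives Q = 2.5, and safe_divide(10, 0, Q) gives Q = undefined instead of an error message.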

You can also add a throw to the goal specified by a catch to create a user-defined exception. The program below displays the Nth Fibonacci number. The value for N is provided by the user. If N is not an integer, then the user defined notInt exception is thrown. The catch function catches the exception, causing X to be instantiated with notInt, and the handleError function is called, which displays an error message and restarts the process. If another type of error occurs while solving the doWork goal, that exception will also be caught, an error message will be displayed and processing will stop.

    % fibonacci(N, [H|T]) - list of first N fibonacci numbers
    fibonacci(1, [0]).
    fibonacci(2, [1, 0]).
    fibonacci(N, [L1, L2, L3|R]) :-
        N > 2,
        M is N - 1,
        fibonacci(M, [L2, L3|R]),
        L1 is L2 + L3.

    % fiboNth(N, F) - F is the Nth Fibonacci number
    fiboNth(N, F) :- N >= 1, fibonacci(N, [F|_]).

    % getFibo - use catch to invoke the doWork rule and catch
    %           an exception if one occurs
    getFibo :- catch(doWork, X, handleError(X)).

    % handleError(notInt) - handle the notInt exception
    handleError(notInt) :-
        write('Not an integer'),     % error message
        nl,                          % newline
        getFibo.                     % try again
    % handleError(X) - handle other exceptions
    handleError(X) :-
        write('Unknown Error' : X),  % error message
        nl,                          % newline
        halt.                        % quit

    % doWork - prompt for an integer N and find
    %          Nth fibonacci number
    doWork :-
        write('Enter an integer: '),
        read(N),
        checkInt(N),
        fiboNth(N, F),
        % display Nth fibonacci number
        write('Fibonacci number '),
        write(N),
        write(' is '),
        write(F).

    % checkInt(N) - if N is not an integer throw exception notInt
    checkInt(N) :- integer(N).
    checkInt(_) :- throw(notInt).

6.3.7 Concurrency

Due to their high-level nature, logic and declarative languages expose opportunities for parallelism not seen in conventional languages. Prolog programs consist of a collection of facts and rules that can be accessed by threads running concurrently. Identification of goals that can be solved simultaneously is generally easily done by experienced developers. Although not specified by ISO Prolog, multithreading is becoming increasingly available in Prolog implementations, making the language more useful for programming on modern multiprocessor computers.

Most multithreaded Prolog implementations are based upon POSIX threads. POSIX or Portable Operating System Interface for Unix is a set of standards developed by IEEE that define the API (Application Programming Interface) for invoking operating system services. A Prolog thread is directly mapped to a POSIX thread, or pthread, that runs the Prolog engine. Modern processors can create these threads very quickly.

The SWI-Prolog example below shows the use of threads to compute the Nth fibonacci number. The thread_create function takes three parameters. The first parameter specifies the goal to be solved by the thread. The second parameter is instantiated by the thread_create function to a thread identifier value. The third parameter is a list of options, in this case empty, for the thread_create function. The fibonacci function can not be invoked directly by thread_create because the instantiated result will not be available in the main thread (threads do not share the stack and thus do not share instantiations of variables). Instead, a wrapper function named callFibo has been defined. The callFibo function invokes the fibonacci function and the result is made available to the main thread by invoking the thread_exit function, which also causes the thread to terminate. The thread_join function invoked by the main thread waits for the execution of the thread indicated by the id and retrieves the value passed into thread_exit by unifying its second argument with the term exited(Value).

    % fibonacci(N, Result) - Result is Nth fibonacci number
    fibonacci(1, 0).
    fibonacci(2, 1).
    fibonacci(N, Result) :-
        N > 2,
        Nminus1 is N - 1,
        Nminus2 is N - 2,
        thread_create(callFibo(Nminus1), Id1, []),
        thread_create(callFibo(Nminus2), Id2, []),
        thread_join(Id1, exited(N1)),
        thread_join(Id2, exited(N2)),
        Result is N1 + N2.

    callFibo(N) :-
        fibonacci(N, Result),
        thread_exit(Result).

6.4 Case Study – A Lexical Analyzer for Wren

A lexical analyzer (sometimes called a scanner) is designed to open a plain text file that contains the source code for a Wren program and convert it into a list of tokens in a standard format. If you are unfamiliar with the Wren language, please see the Case Study at the start of Chapter 2 before proceeding. For example, given the source file:

    program gcd is
      var m,n : integer;
    begin
      read m; read n;
      while m <> n do
        if m < n then n := n - m
                 else m := m - n
        end if
      end while;
      write m
    end

The list of tokens produced by the scanner is:

    [program,ide(gcd),is,var,ide(m),comma,ide(n),colon,integer,
    semicolon,begin,read,ide(m),semicolon,read,ide(n),semicolon,
    while,ide(m),neq,ide(n),do,if,ide(m),less,ide(n),then,ide(n),
    assign,ide(n),minus,ide(m),else,ide(m),assign,ide(m),minus,
    ide(n),end,if,end,while,semicolon,write,ide(m),end,eop]

This list of tokens will be used by the parser (case study in the next chapter) to generate Wren Intermediate Code (WIC). In a few cases the token is identical to the keyword in the source code, such as read, if, then, else, while, end, and so forth. In other cases a sequence of one or two characters is expanded to a full word token. For example, <> becomes neq (for not equal), < becomes less, := becomes assign, - becomes minus, and so forth. Character sequences in the source code that are not keywords or terminals in the BNF are either identifiers or numeric literals. Identifiers are tokenized as ide() as in ide(m) above. Although not illustrated in the program above, numeric literals become num() as in num(0). The eop (end of program) token is inserted by the scanner as the final token.

We present some of the code for the scanner here and ask you to complete the code as a sequence of exercises. Many built-in rules for Prolog are introduced and presented with brief explanations.

    % Wren Lexical Analyzer
    %------
    go :- nl, write('>>> Lexical Analyzer for Wren <<<'), nl,
          write('Enter name of source file: '), nl,
          getfilename(FileName), nl,
          see(FileName), scan(Tokens), seen,
          write('Scan successful'), nl, !,
          write(Tokens), nl.

The go rule starts the program running by prompting the user for the name of the external file that contains the Wren source code. After opening the file it is scanned to produce a sequence of tokens that is then displayed. Some new features of Prolog are:
• write, which displays the specified object on the console
• nl, which forces a new line on the console
• see, which opens the file with the path specified by the user
• seen, which closes the file

    getfilename(F) :- get0(C), restfilename(C,Cs), name(F,Cs).
    restfilename(C,[C|Cs]) :- filechar(C), get0(D), restfilename(D,Cs).
    restfilename(C,[]).

The get0 rule fetches a single character interactively; characters are represented by their ASCII values. The getfilename rule may appear to be overly complex, but it illustrates a common problem encountered in Prolog and how that problem can be solved. The problem is code that is left recursive; this means the recursive goal appears leftmost in the list of goals. This will cause a problem because Prolog will keep making recursive calls and exhaust memory before any useful work is done. In the code above we want to fetch the characters in the file name one at a time from left to right, but using a single rule would result in the left recursive problem. The solution is to write a separate rule, used after the first step is completed (the first get0), that will process the file name in a right recursive fashion. The restfilename rule is right recursive. The name rule converts a list of characters given as ASCII values into an atom. The filechar rule allows letters, digits, periods, and slashes in the file name.

Next we specify sequences of characters or single characters that have special meaning. The following words are reserved words in Wren: program, is, begin, end, var, integer, boolean, read, write, while, do, if, then, else, skip, or, and, true, false, not. Here are the facts for the first four reserved words:

    reswd(program).
    reswd(is).
    reswd(begin).
    reswd(end).

Exercise 6.4a: Start a file called scanner.pl by entering the rules for go, getfilename and restfilename. Enter the four reserved words shown above then enter rules for all the remaining reserved words.

When we create a list of tokens we want to replace single ASCII characters with tokens that have easily recognizable names. For example, a left parenthesis, (, has an ASCII value of 40, so we write the following fact called single that converts this to the token lparen:

    single(40,lparen).

Exercise 6.4b: Using an ASCII table, write additional single facts for the following characters: rparen, times, plus, comma, minus, divides, semicolon, equal.

Some Prolog interpreters, including gProlog, require that all facts or rules that start with the same name and have the same arity (number of parameters) be placed contiguously in the Prolog source code file. So in this case all the single facts must be grouped together.
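If no ASCII table is at hand, the interpreter itself can be asked for a character's code. The sketch below wraps the ISO built-in char_code; the wrapper name ascii_of is an assumption made for illustration and is not part of the scanner.

```prolog
% ascii_of(Char, Code): Code is the character code of the
% one-character atom Char. char_code/2 is an ISO built-in;
% ascii_of is a hypothetical wrapper.
ascii_of(Char, Code) :- char_code(Char, Code).
```

The query ascii_of('(', C) gives C = 40, matching the single fact for lparen above. Characters that are not letters need to be enclosed in single quotes.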

Some individual characters may have meaning as a single character or may be the first character in a sequence of two characters that represent a single token. Here are three such characters represented by a double rule.

    double(58,colon).
    double(60,less).
    double(62,grtr).

Here is one pair of characters, :=, that forms a single token associated with the assignment operator.

    pair(58,61,assign).    % :=

Exercise 6.4c: Add additional pair rules for <> (neq), <= (lteq), and >= (gteq).

Here is a rule that tests for a lower case letter.

    lower(C) :- 97=<C, C=<122.

Exercise 6.4d: Add two additional rules to test if a character is upper case, name it upper, or a digit, name it digit.

The rules you have written so far have been simple but necessary. To avoid boring you any longer with these simple rules we give you the rest that you will need.

    space(32).
    tabch(9).
    period(46).
    slash(47).
    endline(10).
    endfile(26).
    endfile(-1).
    whitespace(C) :- space(C) ; tabch(C) ; endline(C).
    idchar(C) :- lower(C) ; digit(C).
    filechar(C) :- lower(C) ; upper(C) ; digit(C) ; period(C) ; slash(C).

Here is the scan rule.

    scan([T|Lt]) :- tab(4), getch(C), gettoken(C, T, D), restprog(T, D, Lt).

Most of the work is done by gettoken and restprog. First consider the utility rule to get a single character and handle white space correctly (remember ; means 'or'). The restprog rule is recursive and halts when the end of program (eop) is encountered.

    getch(C) :- get0(C), (endline(C), nl, tab(4) ; endfile(C), nl ; put(C)).
    restprog(eop, C, []).    % end of file in previous character
    restprog(T, C, [U|Lt]) :- gettoken(C, U, D), restprog(U, D, Lt).

The gettoken rule has many alternatives depending on the token that is eventually generated. The general form of the rule is:

    gettoken(current character, token, next character)

Here is the first example when end of file is encountered.

    gettoken(C, eop, 0) :- endfile(C).

Exercise 6.4e: Write the gettoken rule for the case when a single character unambiguously corresponds to a token, such as ( corresponding to lparen.

The next rule handles the case when the current character is one of the three characters that correspond either to a single character token or to a two character token. For example, < can result in the single character token less or the double character token lteq.

    gettoken(C, T, E) :- double(C,U), getch(D),
                         (pair(C,D,T), getch(E) ; T=U, E=D).

A lower case letter indicates the start of an identifier or a reserved word; the following rule takes care of both cases. Notice that user defined identifiers are indicated by ide().

    gettoken(C, T, E) :- lower(C), getch(D), restid(D, Lc, E),
                         name(I, [C|Lc]), (reswd(I), T=I ; T=ide(I)).

Your task in the following exercise is to complete the restid rules.

Exercise 6.4f: The general form of these rules is:

    restid(current character, seq of ASCII values, next character)

The first rule is the general recursive case where you keep fetching characters until the idchar goal fails. The second rule handles the case where the current character is not an idchar; it matches the first argument to the third argument and returns the empty list of characters as the second argument. In order to group all facts or rules with the same name together, the restid rules you are writing here will have to be placed after all the gettoken rules.

The next rule for gettoken handles the case of a literal integer value.

Exercise 6.4g: Write the gettoken rule for integer literals. Suppose the literal is the sequence of digits 123 then the token returned is num(123). You will need a restnum rule that is very similar to the restid in the previous exercise. To complete this exercise you are writing a single gettoken rule and two restnum rules, the second handling the base case. Place the restnum rules after all the gettoken rules.

Here are the final two gettoken rules. The first removes any white space and the second detects any illegal character. Notice the use of the built-in abort goal to stop program execution.

    gettoken(C, T, E) :- whitespace(C), getch(D), gettoken(D,T,E).
    gettoken(C, T, E) :- write('Illegal character: '), put(C), nl, abort.

If you have completed all the exercises in this case study you should be ready to test your scanner. Use the gcd program shown at the start of this section to see if you produce the correct sequence of tokens. When writing a complete Prolog program such as the scanner you may not get it completely correct the first time. There are two useful debugging tools that you should become familiar with. The first is called spy where you specify the name of the predicate to be examined. For example, you might suspect you have an error in your gettoken rules, so you can type in: ?- spy(gettoken). Every call to gettoken will be shown on the console along with the result of invoking the rule. The rule nospy turns off the spy point.

The other useful option is to single step through a Prolog program. If you enter: ?- trace. then the entire program is executed in single step mode. For a program such as the scanner this could become very laborious, so you are allowed to trace individual rules, such as trace(gettoken). When you are in single-step mode there are a variety of options on how to proceed; pressing return will continue the single-step mode. There is a notrace rule if you want to turn off tracing.