
Characters, Tokens, Statements, and Steps: What the SAS® Supervisor Sees in Your Program
Rick Aster, Valley Forge, PA

Abstract
To understand and solve some kinds of programming errors, it can be helpful to consider what the SAS supervisor does after you submit a program. How does SAS understand and carry out the actions you have written in your program? To put it another way, how does it know what the program means? The answers can be found in the kinds of objects the SAS supervisor looks for in a SAS program and the specific rules and processes it follows to identify these objects. When you can see a SAS program from this point of view, it is easier to avoid many common programming errors and easier to understand the error messages that the SAS supervisor writes when a program is not written correctly.

Finding the Parts of a Program
In order to execute a SAS program, the SAS supervisor has to slice the program into parts it can work with. The most important of these component parts are the program lines, characters, tokens, statements, and steps of the program.

Conceptually, these five kinds of components can be seen as five stages of the SAS supervisor's actions. Using an iterative process, the SAS supervisor works its way through the program in small increments. At each stage, the SAS supervisor processes just enough of the program to take it to the next stage.

1. Program Lines
The SAS supervisor starts by reading program lines from the program. In batch mode, the primary source for program lines is a program file. In interactive mode, it is a text editor window from which the user submits program lines. Several other sources can also pass program lines to the SAS supervisor, including the CALL EXECUTE routine and the SCL programs of AF applications. The program lines form an execution queue; the SAS supervisor keeps the lines in order and works with one line at a time.

Finding the program lines can involve more than merely reading each line that comes from the program file or text editor window. Some SAS statements tell the SAS supervisor that lines that follow are not program lines. The CARDS statement indicates data lines. The ENDSAS statement is the end of the program.

When it has processed the last program line and there are no more lines to read, the SAS supervisor is done executing the SAS program. In batch mode, this means the SAS session is over, and the SAS supervisor cleans up session objects and ends the SAS session. If lines were submitted from a window in an interactive session, the SAS supervisor returns control to the user, so that the user can take other actions in the session.

2. Characters
The SAS supervisor takes one program line and considers it as a string of text characters. At this stage, it does preprocessing with the text, which means that when some character sequences appear in the text, it replaces them with other characters. This preprocessing falls into three separate categories: character code substitution, secondary program files, and macro language. Also at this stage, the SAS supervisor identifies any comments in the program text. The comments are not treated as part of the program and are not passed along to the next stage.

Character code substitution is designed to help users whose computers do not have a complete character set. You can use a sequence of two characters to substitute for a missing character. For example, if you cannot type the character { you can write the character code ?( in its place.

The %INCLUDE statement identifies a secondary program file. The text of that file is inserted in the program in place of the %INCLUDE statement.

There are various kinds of macro language references. The easiest kind to understand is a macro variable reference. Suppose the macro variable N is defined with the value 12. Then, if the characters &N or &N. appear in the program, those characters are removed and the characters 12 are substituted for them. Some macro objects do not resolve to any program text; instead, they take various actions of their own.

Macro objects and secondary program files can generate any number of program lines. These program lines are processed one at a time, in much the same manner that program lines from the primary program are processed. However, they are not added to the end of the queue of program lines. These generated program lines are processed in their entirety at the point where they are generated, even if this is in the middle of a program line of the primary program file.

The SAS supervisor looks for specific character patterns that require preprocessing. Each of these patterns indicates a preprocessing object:
• Specific two-character combinations consisting of a question mark followed by certain other special characters.
• An ampersand or percent sign followed by a letter or underscore.
• Consecutive ampersands.
• A percent sign followed by an asterisk.

When an ampersand or a percent sign is followed by a letter or an underscore, it indicates only the beginning of a preprocessor object. It is the macro processor that parses the object and determines how far it extends, and then processes it to create the appropriate substitute text.

After a preprocessing substitution, the SAS supervisor checks the resulting text again to see if more preprocessing is necessary. The SAS supervisor does only enough preprocessing at one time to generate one token. Preprocessing stops and the SAS supervisor moves on to the next stage of execution when it reaches a space, a special character, or the end of the program line and the program text up to that point does not contain a preprocessor object.

The SAS supervisor has to be aware of quoted strings and comments in order to do its preprocessing correctly. When a quoted string is enclosed in single quotes, the SAS supervisor treats an ampersand or percent sign in the text of the string as simply part of the data of the string. By contrast, when a quoted string is enclosed in double quotes, the SAS supervisor looks inside the string text for preprocessor objects that start with an ampersand or percent sign.

Comments are identified in this stage of processing. The SAS supervisor does not look for preprocessor objects or quoted strings inside comments, and it excludes comments from the subsequent stages of SAS execution. The SAS supervisor looks for the two kinds of comments that the SAS language has. A comment statement starts when an asterisk is found as the first nonblank character after the end of the previous statement. The comment statement ends at the first semicolon. A delimited comment starts with the character sequence /* anywhere in the program, except within a quoted string or comment statement. The comment ends when the characters */ are found.

These are examples of the SAS supervisor's actions on program text:
• %SQUARE(CORNER1, LENGTH)  The macro processor is called to substitute text for this macro object.
• %INCLUDE CENSOR;  The text of the file identified by the CENSOR fileref is inserted at this point in the program.
• TITLE1 "&TITLETEXT";  The quoted string is inside double quote marks, so the macro processor is called to substitute text for the macro object.
• /* "- - - - -" */  The comment is excluded from execution. The quote marks do not indicate a quoted string.

In this stage, the SAS supervisor identifies the actual text of the SAS code, that is, the characters it will execute. It next has to determine what the characters mean.

3. Tokens
The SAS supervisor groups the characters of the program into the units that identify the objects and actions of the program. These meaningful units of a program are called tokens.

The tokens in a program line are usually not hard to pick out. For example, this line divides into seven tokens:

   IF _N_ >= 5 THEN STOP;

Each token is written here on a separate line:

   IF
   _N_
   >=
   5
   THEN
   STOP
   ;

The SAS supervisor starts looking for the first token at the first ...

... SAS supervisor finds a code such as N (name), T (SAS time), or X (character hexadecimal) after a quoted string, it treats that code as part of the same token.
• Multilevel name. A period between two words indicates that the words are part of a multilevel name, even if there are spaces before or after the period. A name literal, written as a quoted string followed by the letter N, can also be a part of a multilevel name.
• Sign. A numeric constant can be preceded by a - or + sign to indicate a negative or positive number. In most places, but not in an expression, the sign is treated as part of the constant. In expressions, the sign is kept as a separate token and treated as an operator.
• Signed exponent. A numeric constant in scientific notation can have a signed exponent. When this occurs, the letter E (or D) at the end of the constant value is followed by a - or + sign and one or more digits. The sign and digits are part of the constant value.
• Special characters in names. The dollar sign ($) is the first character of the name of a character informat or format, or it can serve as the entire name of the standard character format. When a dollar sign is followed by a token that could be an informat or format reference, the dollar sign is treated as part of that token. In some operating systems, filerefs can contain certain special characters, and those characters are treated as part of those names.
• Compound keywords.
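
The stage 2 rules for macro variable references, quoted strings, and comments can be tried out with a short step like the one below. This is only a minimal sketch: it assumes the macro variable N has the value 12, as in the example in the text, and the DATA _NULL_ step and the message text are illustrative choices, not taken from the paper.

```sas
/* Assumption for illustration: N is defined with the value 12. */
%let N = 12;

data _null_;
   /* Double quotes: the macro reference inside the string is resolved. */
   put "Double quotes: N is &N";     /* log: Double quotes: N is 12 */

   /* Single quotes: the ampersand is ordinary data in the string.      */
   put 'Single quotes: N is &N';     /* log: Single quotes: N is &N */

   /* The period after &N. ends the reference and is removed with it.   */
   put "With delimiter: &N.th";      /* log: With delimiter: 12th   */

   * A comment statement runs to the first semicolon;

   /* In a delimited comment, "&N" is neither resolved nor treated
      as a quoted string. */
run;
```

Comparing the three lines written to the log shows where the preprocessing substitution took place and where the characters were passed through unchanged.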