
NESUG 18 Ins & Outs

Generating Code for Existing HTML Forms

Don Boudreaux, PhD, SAS® Institute Inc., Austin, TX
Keith Cranford, Office of the Attorney General, Austin, TX

ABSTRACT

SAS® Web Applications typically involve HTML forms that provide a front-end for SAS programming code. These forms may be created with third-party development tools (e.g., FrontPage or Dreamweaver) or by hand-coding HTML form tags. In either case, the forms generate name/value pair output that must be handled by SAS code. This paper presents a SAS macro that helps automate the development of that code. Using a document containing an HTML form as input, the macro generates a code template of SAS macros for handling any potential name/value pair output within either a SAS/IntrNet® or a SAS9 Stored Process Web Application.

INTRODUCTION

The two main coding components of a typical SAS Web Application are the HTML form tags that constitute the application’s front-end and the macro-level SAS processing code that uses the output from that form. This paper assumes that a “well-formed” HTML document containing a form already exists and that the application developer is ready to begin writing the macro-level SAS code that handles the form’s output. It presents a SAS macro named %code that assists in creating that code. The macro parses HTML form elements out of an HTML document, checks whether each can produce single or multiple name/value output, and generates a file containing the macro-level SAS code needed to process that output. The code generated by %code is only an initial “start-point” template; nevertheless, it reduces the time and effort needed to begin developing either a SAS/IntrNet or a SAS9 Stored Process Web Application.

%CODE MACRO DEFINITION

The %code macro is defined with two named parameters, htmlFile and codeFile. The htmlFile parameter holds the name of the input HTML document and defaults to form.html. This text document is assumed to exist, to contain HTML form tags, and to follow HTML v4 or XHTML syntax conventions. The codeFile parameter holds the name of the text file that will receive the SAS macro code needed to check the output of the form found in the HTML document; it defaults to code.sas. The following subsections list the rest of the macro definition, broken down into functional code segments.

%macro code(htmlFile=form.html, codeFile=code.sas) ;

SECTION 1 - LOAD AND PARSE THE FORM TAGS

The first section of code reads the form tags from the htmlFile source file and creates a SAS data set named FORM_TAGS. The code loads each line of the source file, examines each character on that line, and extracts complete HTML tags. When the final character of a tag is detected, the tag is written to the log and its first word is examined to determine whether a “select” tag, a “textarea” tag, or an “input” tag has been found. These tags are the form elements that generate output, and only they are written to the FORM_TAGS data set.
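One detail of the Section 1 listing is easy to miss: each tag is built one character at a time with trim(tag)!!c, so a blank appended in one iteration is trimmed away in the next. The listing works around this with a !_blank_ placeholder that is translated back to a blank once the tag is complete. The following is a minimal standalone sketch of that effect; the data set name BLANK_DEMO and the variables lost and kept are illustrative only and are not part of %code.

```sas
* Sketch only: why a placeholder is needed when concatenating with TRIM ;
data BLANK_DEMO ;
   length lost kept $ 80 ;

   lost = "<input" ;
   lost = trim(lost) !! " " ;                 * blank is appended here ... ;
   lost = trim(lost) !! "type" ;              * ... and trimmed away here ;

   kept = "<input" ;
   kept = trim(kept) !! "!_blank_" ;          * placeholder survives TRIM ;
   kept = trim(kept) !! "type" ;
   kept = tranwrd(kept, "!_blank_", " ") ;    * set the blank back ;

   put lost= kept= ;                          * compare results in the log ;
run ;
```

In the log, lost shows the two words run together while kept retains the blank between them.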

*** Section 1 ********************** ;
*** Load and Parse the Form Tags *** ;
data FORM_TAGS(keep = tag type name tag_n) ;
   length tag $ 80 ;
   length c $ 1 ;
   length type name tag_1 $ 9 ;
   retain tag " " addToTag 0 ;

   infile "&htmlFile" truncover ;
   input line $ 1-200 ;                            * for each input line ;

   name_pt = '/[Nn][Aa][Mm][Ee] *[=] *["' !! ']([^"' !! ']*)["' !! ']/' ;
   type_pt = '/[Tt][Yy][Pp][Ee] *[=] *["' !! ']([^"' !! ']*)["' !! ']/' ;

   name_id = prxparse(name_pt) ;                   * id to find name value ;
   type_id = prxparse(type_pt) ;                   * id to find type value ;

   do i = 1 to length(line)+1 ;                    * examine each char/line ;
      c = substr(line,i,1) ;                       * extract a single char ;
      select ;
         when (c EQ '<') do ;                      * if @ start of html tag ;
            tag = "<" ;                            * reset var tag ;
            addToTag = 1 ;                         * set flag to concat to tag ;
         end ;
         when (c EQ '>') do ;                      * if @ end of html tag ;
            tag = trim(tag)!!">" ;                 * end tag with > ;
            tag = tranwrd(tag,"'",'"') ;           * single to double quotes ;
            tag = tranwrd(tag,"!_blank_"," ") ;    * set blanks back ;
            put "***** " tag ;                     * write tag to SAS log ;
            tag_1 = lowcase(scan(tag,1,"< >")) ;   * find 1st word in tag ;
            select (tag_1) ;                       * select on 1st word ;
               when ("select","textarea") do ;     * if select or textarea ;
                  type = tag_1 ;                   * 1st word is type value ;
                  mtch = prxmatch(name_id,tag) ;   * find the name value ;
                  name = prxposn(name_id,1,tag) ;
                  tag_n + 1 ;                      * +1 for form tag count ;
                  output ;                         * output to SAS dataset ;
               end ;
               when ("input") do ;                 * if input tags ;
                  mtch = prxmatch(type_id,tag) ;   * find the type value ;
                  type = lowcase(prxposn(type_id,1,tag)) ;
                  mtch = prxmatch(name_id,tag) ;   * find the name value ;
                  name = prxposn(name_id,1,tag) ;
                  if type in ("radio","checkbox","","text","hidden") then do ;
                     tag_n + 1 ;                   * +1 for form tag count ;
                     output ;                      * output to SAS dataset ;
                  end ;
               end ;
               otherwise ;
            end ;
            addToTag = 0 ;                         * set flag to stop concat ;
         end ;
         when (addToTag EQ 1) do ;
            if c in (' ', '\n')
               then tag = trim(tag)!!"!_blank_" ;  * concat !_blank_ on tag or ;
               else tag = trim(tag)!!c ;           * concat c on tag ;
         end ;
         otherwise ;                               * c is not part of a tag of ;
      end ;                                        * interest - do not use it ;
   end ;
run ;
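To see what the two regular expressions extract, they can be tried on a single tag outside of the macro. The sketch below is not part of %code: the sample tag, the data set name TAG_DEMO, and the attribute values are made up for illustration, and the patterns are written as single literals that are assumed to be equivalent to the concatenated name_pt and type_pt strings in the listing above.

```sas
* Sketch only: apply the name/type patterns to one hand-written tag ;
data TAG_DEMO(keep = tag name type) ;
   length tag $ 80 name type $ 9 ;

   * hypothetical sample tag, already converted to double quotes ;
   tag = '<INPUT Type="radio" NAME="exp" value="lt5">' ;

   name_id = prxparse('/[Nn][Aa][Mm][Ee] *[=] *["]([^"]*)["]/') ;
   type_id = prxparse('/[Tt][Yy][Pp][Ee] *[=] *["]([^"]*)["]/') ;

   * PRXMATCH locates the attribute, PRXPOSN returns capture buffer 1 ;
   if prxmatch(name_id, tag) then name = prxposn(name_id, 1, tag) ;
   if prxmatch(type_id, tag) then type = lowcase(prxposn(type_id, 1, tag)) ;

   put name= type= ;                      * show the extracted values ;
run ;
```

Because the patterns accept only double-quoted values, the Section 1 code converts single quotes to double quotes before matching; an unquoted attribute value would not be found.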

SECTION 2 - CHECKING FOR SINGLE OR MULTIPLE NAME/VALUE OUTPUT

When writing the macro code that checks the output from a form, it is necessary to determine whether a form element can generate a single name/value pair or multiple name/value pairs. This process starts by counting the number of values associated with each combination of name and element type. These counts are saved in a variable named count within a SAS data set called FORM_OUTPUT. This data set is then reread in a DATA step that looks for multiple values within a given element name by examining count together with the type of form element involved. If count is greater than one but the element is a radio group, only a single name/value pair is output. Selection lists, on the other hand, are assumed by default to generate multiple name/value pairs. An indicator variable named mult_value holds the result of this check.

This DATA step also looks for name reuse between elements. Two elements that by themselves would each produce a single name/value pair can generate multiple name/value pairs if they share the same element name; consider a form where a textarea and a hidden field are both named note. Name reuse is detected by testing whether each name appears in only one name and type combination. An indicator variable named name_reuse holds the result. Both indicator variables are then combined into a single composite variable named check. A non-zero value in check indicates that multiple values can be generated for a given element name, that the element name is reused by several form elements, or both.

Finally, FORM_OUTPUT is used to create another SAS data set called FORM_MACROS that contains a unique set of element names, a final indicator named multiple that flags multiple name/value output for each name, and the variable tag_n, which preserves the order of the elements in the original input source file.

*** Section 2 ******************* ;
*** Multiple Name/Value Check *** ;
proc sql ;                                   * count values within ;
                                             * name, type combinations ;
   create table FORM_OUTPUT as
      select name, type, count(*) as count, min(tag_n) as tag_n
      from FORM_TAGS
      group by name, type
      order by name ;
quit ;

data FORM_OUTPUT ;
   set FORM_OUTPUT ;                         * check within names ;
   by name ;                                 * for multiple values ;
   select (type) ;
      when ("select") mult_value=1 ;         * default - multiple ;
      when ("radio") mult_value=0 ;          * default - single ;
      otherwise do ;
         if count GT 1
            then mult_value=1 ;              * count GT 1 - multiple ;
            else mult_value=0 ;              * count LE 1 - single ;
      end ;
   end ;
                                             * also check on name reuse ;
                                             * indicates - multiple ;
   name_reuse = NOT(first.name=1 AND last.name=1) ;
   check = (mult_value OR name_reuse) ;      * either condition ;
run ;                                        * indicates - multiple ;

proc sort data=FORM_OUTPUT ;
   by tag_n ;
run ;

proc sql ;                                   * check between names ;
                                             * and create macro data ;
   create table FORM_MACROS as
      select name, min(tag_n) as tag_n,
             case when sum(check) = 0 then 0 else 1 end as multiple
      from FORM_OUTPUT
      group by name
      order by tag_n ;
quit ;
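The effect of these checks can be traced with a small hand-built FORM_TAGS data set. The sketch below is run outside of the %code macro and repeats the Section 2 logic in condensed form; the five rows, including the element named city, are hypothetical and were chosen to cover a radio group, the textarea and hidden field that share the name note, and an element whose name is used only once.

```sas
* Sketch only: hypothetical FORM_TAGS rows, not output from a real form ;
data FORM_TAGS ;
   length type name $ 9 ;
   input tag_n type $ name $ ;
   datalines ;
1 radio exp
2 radio exp
3 textarea note
4 hidden note
5 text city
;
run ;

proc sql ;                                 * count per name, type ;
   create table FORM_OUTPUT as
      select name, type, count(*) as count, min(tag_n) as tag_n
      from FORM_TAGS
      group by name, type
      order by name ;
quit ;

data FORM_OUTPUT ;
   set FORM_OUTPUT ;
   by name ;
   select (type) ;
      when ("select") mult_value = 1 ;     * default - multiple ;
      when ("radio")  mult_value = 0 ;     * default - single ;
      otherwise mult_value = (count GT 1) ;
   end ;
   name_reuse = NOT(first.name AND last.name) ;
   check = (mult_value OR name_reuse) ;
run ;

proc sql ;                                 * one row per element name ;
   create table FORM_MACROS as
      select name, min(tag_n) as tag_n,
             case when sum(check) = 0 then 0 else 1 end as multiple
      from FORM_OUTPUT
      group by name
      order by tag_n ;
quit ;

proc print data=FORM_MACROS noobs ;
run ;
```

With these rows, exp and city are flagged as single (multiple=0), while note is flagged as multiple (multiple=1) purely because of name reuse, even though neither of its elements produces more than one value on its own.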

SECTION 3 - WRITE THE MACRO CODE

The code in this section begins with a DATA _NULL_ step that uses the variable multiple, from the SAS data set FORM_MACROS, to create the SAS macro code that expects no value, a single value, or multiple values for each of the names in the HTML form. A second DATA _NULL_ step then appends code that generates a simple output message, and a third writes the contents of the macro definition code file into the SAS log.

*** Section 3 *************** ;
*** Write Macro Code File *** ;
data _null_ ;
   file "&codeFile" ;
   set FORM_MACROS ;
   if _n_=1 then do ;                              * default comments ;
      put "%* HTML File: &htmlFile ;" ;            * html doc file name ;
      put "%* Code File: &codeFile ;" ;            * SAS code file name ;
      put ;
      put "TITLE ;" ;
      put "OPTIONS PS=800 NODATE ;" ;
      put ;
   end ;
   mp = '_'!!name ;
   mv_name = '&'!!name ;
   select ;
      when (multiple EQ 0) do ;                    * single name/value ;
         put '%MACRO ' mp ';' ;                    * macro defn generator ;
         put '   %PUT ;' ;
         put '   %IF %SYMGLOBL(' name ') EQ 0 %THEN %DO ;' ;
         put '      %PUT NOTE: .. NO ' name 'PARAMETER PASSED ;' ;
         put '   %END ;' ;
         put '   %ELSE %DO ;' ;
         put '      %PUT NOTE: .. ONE NAME/VALUE PASSED ; ' ;
         put '      %PUT NOTE: .. ' name '= ' mv_name ';' ;
         put '   %END ;' ;
         put '   %PUT ;' ;
         put '%MEND ;' ;
         put '%' mp ';' / ;
      end ;
      when (multiple EQ 1) do ;                    * multiple name/value ;
         name0 = trim(name)!!'0' ;                 * macro defn generator ;
         mv_name0 = '&'!!trim(name0) ;
         m1_name = trim(name)!!'&i' ;
         m3_name = '&&'!!trim(name)!!'&i' ;
         put '%MACRO ' mp ';' ;
         put '   %PUT ;' ;
         put '   %IF %SYMGLOBL(' name ') EQ 0 %THEN %DO ;' ;
         put '      %PUT NOTE: .. NO ' name 'PARAMETER PASSED ;' ;
         put '   %END ;' ;
         put '   %ELSE %IF %SYMGLOBL(' name0 ') EQ 0 %THEN %DO ;' ;
         put '      %PUT NOTE: .. ONE NAME/VALUE PASSED ; ' ;
         put '      %PUT NOTE: .. ' name '= ' mv_name ';' ;
         put '   %END ;' ;
         put '   %ELSE %DO ;' ;
         put '      %PUT NOTE: .. MULTIPLE NAME/VALUEs PASSED ;' ;
         put '      %DO i = 1 %TO ' mv_name0 ';' ;
         put '         %PUT NOTE: ' m1_name '= ' m3_name ';' ;
         put '      %END ;' ;
         put '   %END ;' ;
         put '   %PUT ;' ;
         put '%MEND ;' ;
         put '%' mp ';' / ;
      end ;
      otherwise ;
   end ;
run ;

data _null_ ;
   file "&codeFile" mod ;                          * default code ;
   put 'DATA _NULL_ ; ' ;                          * to provide for an html ;
   put '   FILE _webout ; ' ;                      * document template and ;
   put '   PUT "Content-type: text/html" ; ' ;     * custom output report ;
   put '   PUT ; ' ;
   put '   PUT "" ; ' ;
   put '   PUT "" ; ' ;
   put '   PUT " *_title_ " ; ' ;
   put '   PUT "" ; ' ;
   put '   PUT "" ; ' ;
   put '   PUT " *_output_ " ; ' ;
   put '   PUT "" ; ' ;
   put '   PUT "" ; ' ;
   put 'RUN ; ' ;
run ;

data _null_ ;                                      * write code to SAS log ;
   infile "&codeFile" ;
   input ;
   put "***** " _infile_ ;
run ;

It is important to note that the macro code generated by %code provides, within individual macro programs, macro-level conditional checking for each form output element name. However, it does NOT take any significant action upon finding no parameter, a single name/value pair, or multiple values. Look carefully at the macro statements written into the code file: in every case, the only result of the macro-level checking is a message written to the SAS log with %PUT statements. These macros provide only an initial template for a SAS/IntrNet Dispatcher Application or a SAS9 Stored Process Web Application. The application developer is responsible for modifying the code to perform any further processing.
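As an illustration of the kind of modification a developer might make, the sketch below extends a generated macro for a multi-valued element named web so that it collects the passed values into a quoted list. This is only one possible direction, not something %code produces: the macro variable web_list is hypothetical, and the %LET statements at the end merely simulate the web, web0, web1, and web2 parameters that the web application is assumed to supply when two values are passed.

```sas
* Sketch only: one way a generated template could be extended ;
* web_list is a hypothetical macro variable, not created by %code ;
%MACRO _web ;
   %GLOBAL web_list ;
   %LOCAL i ;
   %LET web_list = ;
   %IF %SYMGLOBL(web) EQ 0 %THEN %DO ;
      %PUT NOTE: .. NO web PARAMETER PASSED ;
   %END ;
   %ELSE %IF %SYMGLOBL(web0) EQ 0 %THEN %DO ;
      %LET web_list = "&web" ;              %* one value passed ;
   %END ;
   %ELSE %DO ;
      %DO i = 1 %TO &web0 ;                 %* several values passed ;
         %LET web_list = &web_list "&&web&i" ;
      %END ;
   %END ;
   %PUT NOTE: .. web_list = &web_list ;
%MEND ;

* simulate two values arriving from the form ;
%LET web  = html ;
%LET web0 = 2 ;
%LET web1 = html ;
%LET web2 = Java ;
%_web ;
```

The resulting list could then be used in an IN operator within a WHERE clause. Values arriving from a web form should be validated before they are placed into generated code; that step is omitted here.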

SECTION 4 - PROCESSING SPECIFICATION REPORTS

The code in this section runs three PROC PRINT reports that give the %code macro user a look at the specifications used to build the application macro code. The first report shows the form tags that were parsed from the input HTML document. The second report displays information about the output that the form tags can generate. The last report displays the data used directly to create the macro programs. At the end of this section, a %mend statement ends the definition of the %code macro program.

*** Section 4 ********** ;
*** Reports ************ ;
footnote ;
title ;
title3 "HTML Form Tags Found" ;
title4 "SAS data set - WORK.FORM_TAGS" ;
proc print data=FORM_TAGS noobs split='+' ;
   var type name tag tag_n ;
   label type='Type of + Form Element'
         name='Name of + Form Element'
         tag='HTML Tag'
         tag_n='Order of + Element in + HTML Form' ;
run ;

title3 "Element Name and Value Usage" ;
title4 "SAS data set - WORK.FORM_OUTPUT" ;
proc print data=FORM_OUTPUT noobs split='+' ;
   var type name count mult_value name_reuse ;
   label type='Type of + Form Element'
         name='Name of + Form Element'
         count='Count of + Values per + Element'
         mult_value='Possible + Multiple + Values?'
         name_reuse='Name Reuse + Between + Elements?' ;
run ;

title3 "Macro Definition Input" ;
title4 "SAS data set - WORK.FORM_MACROS" ;
proc print data=FORM_MACROS noobs split='+' ;
   var name multiple ;
   label name='Name of + Form Element'
         multiple='Multiple + name/value + Indicator' ;
run ;
title ;

%mend code ;

AN EXAMPLE

Consider the example HTML document shown in Table 1; it contains a simple HTML form that uses a radio button group, two checkboxes, and a selection list. Assuming that the document is located at c:\test\htmlTestCode.html, the following macro call would generate the application SAS code for the output of the HTML form it contains:

%code(htmlFile=c:\test\htmlTestCode.html, codeFile=c:\test\code.txt);

Table 1. An Example HTML Document

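[HTML source listing; only its visible text survives: the page heading "NESUG 2005", the prompt "SAS Experience:" with the choices "less than 5 years" and "5 or more years", the prompt "Web Tools:" with the choices "html" and "Java", and the prompt "Education:".]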

The resulting code would be able to handle the single name/value output from the radio button group named exp, the possible multiple name/value output from the two checkboxes that share the name web, and the selection list named educ. Note that by default, all macros associated with selection lists are coded to handle multiple name/value output. The code, written to c:\test\code.txt, is listed below:

%* HTML File: c:\test\htmlTestCode.html ;
%* Code File: c:\test\code.txt ;

TITLE ;
OPTIONS PS=800 NODATE ;

%MACRO _exp ;
   %PUT ;
   %IF %SYMGLOBL(exp ) EQ 0 %THEN %DO ;
      %PUT NOTE: .. NO exp PARAMETER PASSED ;
   %END ;
   %ELSE %DO ;
      %PUT NOTE: .. ONE NAME/VALUE PASSED ;
      %PUT NOTE: .. exp = &exp ;
   %END ;
   %PUT ;
%MEND ;
%_exp ;

%MACRO _web ;
   %PUT ;
   %IF %SYMGLOBL(web ) EQ 0 %THEN %DO ;
      %PUT NOTE: .. NO web PARAMETER PASSED ;
   %END ;
   %ELSE %IF %SYMGLOBL(web0 ) EQ 0 %THEN %DO ;
      %PUT NOTE: .. ONE NAME/VALUE PASSED ;
      %PUT NOTE: .. web = &web ;
   %END ;
   %ELSE %DO ;
      %PUT NOTE: .. MULTIPLE NAME/VALUEs PASSED ;
      %DO i = 1 %TO &web0 ;
         %PUT NOTE: web&i = &&web&i ;
      %END ;
   %END ;
   %PUT ;
%MEND ;
%_web ;

%MACRO _educ ;
   %PUT ;
   %IF %SYMGLOBL(educ ) EQ 0 %THEN %DO ;
      %PUT NOTE: .. NO educ PARAMETER PASSED ;
   %END ;
   %ELSE %IF %SYMGLOBL(educ0 ) EQ 0 %THEN %DO ;
      %PUT NOTE: .. ONE NAME/VALUE PASSED ;
      %PUT NOTE: .. educ = &educ ;
   %END ;
   %ELSE %DO ;
      %PUT NOTE: .. MULTIPLE NAME/VALUEs PASSED ;
      %DO i = 1 %TO &educ0 ;
         %PUT NOTE: educ&i = &&educ&i ;
      %END ;
   %END ;
   %PUT ;
%MEND ;
%_educ ;

DATA _NULL_ ;
   FILE _webout ;
   PUT "Content-type: text/html" ;
   PUT ;
   PUT "" ;
   PUT "" ;
   PUT " *_title_ " ;
   PUT "" ;
   PUT "" ;
   PUT " *_output_ " ;
   PUT "" ;
   PUT "" ;
RUN ;

CONCLUSION

As shown, this macro helps automate the work involved in creating SAS code for HTML front-end interfaces. The macro code generated by %code gives the developer of either a SAS/IntrNet or a SAS9 Stored Process Web Application a good “start point” that can be readily modified to produce a complete application.

DEVELOPMENT NOTES

The %code macro was written using SAS9 under Windows XP Professional, and SAS9 is required to run it: the macro uses Perl regular expressions and several associated functions that are not available in older versions of SAS. It is also important to emphasize that the HTML document loaded into the macro must follow the HTML v4 or XHTML syntax specifications. In particular, the parsing code assumes that tag definitions do not overlap and that tag attribute values are quoted. In testing, several documents did not meet this requirement and did not produce the desired results. However, modifying these documents, either by hand or by filtering them with HTML Tidy (see http://www.w3.org/), allowed them to be used successfully. Also be aware that the %code definition shown in this paper is not intended to represent production-level code. Production-level code would be subject to more extensive code review and testing than was performed on this macro.

RESOURCES

SAS Institute Inc. 2001. SAS Web Tools: Advanced Dynamic Solutions Using SAS/IntrNet Software. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 2004. New Features in SAS System 9. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 2004. SAS Macro Language. Cary, NC: SAS Institute Inc.

Web Technologies Community: SAS/IntrNet Software, SAS Institute Inc. http://support.sas.com/rnd/web/intrnet/.

World Wide Web Consortium. http://www.w3.org/

Murdock, Kelly. 2000. Master VISUALLY HTML 4 and XHTML 1. Foster City, CA: IDG Books Worldwide, Inc.

CONTACT INFORMATION

Please forward comments and questions to:

Don Boudreaux, PhD

Keith Cranford

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are trademarks of their respective companies.
