<<

Behavior Research Methods. Instruments. & Computers 1996,28 (2), 165-169

CGI scripts: Gateways to World-Wide Web power

JAMES M. KIELEY Miyazaki International CoUege, Miyazaki, Japan

The power of the hypertext-based information presentation system known as the World-Wide Web can be enhanced by scripts based on the common gateway interface (CG!) protocol. CG! scripts re­ siding on a Webserver permit the execution of computer programs that can perform a wide variety of functions that maybe useful to psychologists. Example applications are presented here, along with ref­ erence information for potential script developers.

The majority ofinformation that people access via the permit users to input data by clicking on checkboxes, hypertext-based information presentation system known radio buttons, menus, reset buttons, and submit buttons, as the World-Wide Web (WWW) is actually stored in the and also by typing into text fields (Lemay, 1995). of static files-that is, text and graphics files that COl was developed by the original programmers ofthe appear a certain way when viewed from a , -based CERN and NCSA HTTP Web servers to such as or , because ofa command lan­ supersede a prior scripting environment called HTBIN. guage known as HTML. HTML, by its original design, is Other Web servers that support scripting, including those a simple command set used to present multimedia infor­ based on other operating systems, mayor may not use mation that can be accessed asynchronously. The capa­ the COl protocol. Early applications of COl scripts in­ bilities ofHTML, and, therefore, the WWW, can be im­ cluded using them to serve information to a browser that proved with scripts conforming to the common gateway is in a format that is otherwise unreadable, such as an SQL interface (COl) protocol. COl, in effect, extends the database. WWW's presentation language (HTML) by permitting the An introduction to COl is available from the National execution ofexternal programs from a WWW server. In Center for Supercomputing Applications (see NCSA, addition to providing an overview ofbasic CGI concepts, 1995a). Netscape Communications (1995) publishes Web this paper will present example applications that may be server documentation in hard copy form that may be valu­ ofinterest to psychologists as well as pointers on how to able to COl script writers. A general reference with a dis­ get started in developing custom COl scripts to fit a spe­ cussion of COl worth obtaining is the book Managing cific application. Internet Information Resources (Liu, Peek, Jones, Buus, Among the types of added WWW features that COl & Nyre, 1994). Individuals interested in using COl in con­ scripts support are clickable area-sensitive graphics, known junction with a Windows are referred to Run­ as imagemaps, that can be used in lieu of standard text ning a Perfect Web Site (Chandler, 1995). menus, HTML FORMS that permit user-inputted data to HTML and COl can work together by any ofthree al­ be stored and/or analyzed, rotatable 3-D imaging, access ternative means. The ISINDEX tag provides almost any to WAIS and SQL databases, and the building ofindexes Web browser with the ability to use scripts to perform ofselected WWW-based information. In theory, COl gate­ simple searches. Both the GET and the POST methods can way scripting permits programmers to add virtually any be used to pass data input through HTML FORMS. The functionality that the original Web server software au­ GET method passes user input using environmental vari­ thors omitted. ables, but GET messages have a 256-character limit. The POST method passes keystrokes using standard input. Essential Concepts and Vocabulary The POST method does not have the limitations associ­ HTML stands for hypertext markup language, which ated with ISINDEX and GET, and so its use is most pre­ represents the essential command set for formatting the ferred (see Liu et aI., 1994, for a further discussion). presentation of Web documents. HTML commands are implemented as tags and are usually found in files ending Applications with a . extension (e.g., homepage.html). The original Since CGI provides WWW developers with an open HTML 1.0 feature set has been superseded by HTML 2.0, development environment, there are few limits to poten­ which includes support for FORMS, and HTML 3.0, which tial applications. Presented here are five different uses that supports tables, mathematical equations, and more flexi­ both are illustrative ofwhat COl scripts can do and may ble text formatting. HTML 2.0 and later FORMS in effect be of immediate benefit to psychologists in teaching or research. Exploration of these examples requires that Correspondence should be addressed to J. M. Kieley, The Claremont the reader have full Internet access and a Web browser Graduate School, Academic Computing Building, 130 E. 9th St., Clare­ that supports FORMS. Examples can be tested interac­ mont, CA 91711 (e-mail:[email protected]). tively using the URL http://acb2.cgs.edulpsychpage.html.

165 Copyright 1996 Psychonomic Society, Inc. 166 KIELEY

Source code for the CGI scripts is available at the same archive and search functions are potentially important address. for -dependent researchers and teachers with 1. Datacollection. HTML FORMS provide compliant focused technical interests and with limited free time for Web browsers with the user interface necessary to dis­ browsing newsgroups. Usenet-Webusers can search news­ play newly inputted data as well as that which can be re­ group archives at irregular intervals without worrying trieved from a file. FORM capability alone is not enough about having missed a key posting critical to answering a to send and retrieve information. A program, usually im­ particular question. plemented as a CGI script, is responsible for executing the Successful implementation of Usenet-Web requires actual data exchange. When data are sent by a gateway the skills of someone familiar with both WWW server script, they can be written directly to a file or forwarded and UNIX system administration. It also requires that a to another application, such as an e-mail program. WWW server have the ability to mount the necessary An example application ofa gateway script that may be news-related volumes from a local news server. Normally, ofinterest to psychologists for research purposes is called file sharing is done using NFS (Network File System). psysurvey.cgi. This script supports the collection ofsur­ Although NFS is normally thought ofas a UNIX file shar­ vey data over the Internet. It has been adapted from scripts ing protocol, other operating systems can access NFS developed at the Georgia Institute of Technology GVU volumes with the proper utility software. Thus, it may be Center (1995). It works in conjunction with an HTML possible to implement Usenet-Web on non-UNIX Web file called psy_survey.html. The latter file utilizes FORM systems that also support CGI scripts written in . tags that produce the fields necessary to input responses. One factor that needs to be considered before setting The most critical line ofcode from the psy_survey.html up Usenet-Web is the disk space requirements for the long­ file is

. FORM METHOD stallations may need to be restricted to a few newsgroups. ="POST" specifies the method for submitting informa­ The amount of activity within the selected groups will, tion to the gateway script. The command inPUT TYPE ofcourse, affect total storage requirements. A moderately ="submit" then creates a button that, when selected, active group may generate on the order of 1-2 Mb of passes data on to the CGI script specified with the FORM data per month. METHOD command. The psysurvey.cgi script parses the Usenet-Web can be accessed for demonstration pur­ data sent by the submit command and then forwards it, poses at http://acb2.cgs.edu/cgi-bin/fetch.This site in­ along with designated variable labels, to a specified cludes an archive ofcomp.infosystems.www.authoring. e-mail address. Each variable name-value pairing is sent cgi, which is a forum that focuses on many topics dis­ as a separate line in an e-mail message. This script can be cussed in this paper. A keyword search can be restricted modified to pass data to a file or a structured database in by year, month, or day of posting. A search returns the a format that can directly be read by a statistical analysis titles of all postings that meet specified criteria. A user program (for related materials, see NCSA, 1995b). Data can then click on a given title to view a posting. collection methods similar to those discussed here could Usenet-Web can be accessed from virtually any WWW also be used to collect accuracy responses to visual or browser that supports FORMS. Usenet-Web can also be auditory experimental stimuli presented using standard extended to act as an archive for listservs, such as SCiP­ HTMLtags. L (Ransdell & Anderson, 1995), using a utility program 2. Sharing expertise. Recently, it has been estimated called mail2news (Salz, 1992). Configuration informa­ that there may be as many as 10,000 different special in­ tion for Usenet-Web is included with the distribution terest forums that can be accessed via USENET news (for (see Franz, 1995). a description ofthis service, see Kieley, 1993). Some of 3. Electronic testing. The Internet in general and the these forums are relatively local and inactive, whereas WWW in particular hold much promise for the direction others generate hundreds ofmessages per day that are ac­ in which education will move over the next decade. In­ cessible across the Internet. As work habits become more creasing demand for classes coupled with decreased fund­ technologically based, many people find themselves be­ ing, which impact on numbers ofboth faculty and facili­ coming increasingly dependent on USENET news for ties, has many rethinking the education process. Distance obtaining technical expertise and for other types offree education is a term that has been used to capture any teach­ information sharing. There are two fundamental short­ ing that takes place outside of a traditional classroom comings associated with USENET news the way it is nor­ context. mally accessed: (1) the information is transient, with There are now a number ofdistance education projects posted articles typically expiring and being inaccessible that use the WWW to distribute course materials. Typi­ after only a few days, and (2) the search mechanisms of cally, students sitting in a remote classroom or at home most news readers are limited. interactively point and click their way through informa­ Usenet-Web addresses the two aforementioned limita­ tion that a professor has made network accessible. This tions ofUSENET news. One script included in the distri­ may include assignments, lab times, lecture notes, and bution is responsible for archiving postings from selected even full-text readings. newsgroups. The archived information is easily search­ CGI scripts permit the extension ofthe WWW for other able from a Web browser using another script. These classroom needs. For example, Cunningham (1995) has CGI SCRIPTS: GATEWAYS TO WWWPOWER 167 provided an example ofhow the Web can be used for ob­ 5. Electronic indexing. At the time this was written, jective testing. The original distribution includes two by some estimates the amount ofinformation accessible scripts written in standard : (1) qform, used to automati­ via the WWW was growing at the rate of 10% per week. cally generate a form-based HTML file from a simple The number of WWW servers, while unknown, is now ASCII text file the test maker writes, and (2) qscore, used likely in the hundreds of thousands, and the number of to score and send the test result back to the test taker along individual documents is likely in the tens ofmillions. A with a corrected list of inappropriate responses. Kieley web of information encompassing the entire planet is a (1995) has provided a modification to the original qscore great thing, but it is fully usable only ifthe information program that automatically e-mails results to a desig­ on it can be selectively and efficiently retrieved. The ob­ nated address. vious tools for finding information are indexes; on the The URL http://acb2.cgs.edulqform/qformA.htmlpro­ WWW, they essentially come in three varieties. Spiders vides an example ofthe type ofsimple ASCII format that (also known as robots) are computer programs that tra­ can be used to generate a test. The URL http://acb2.cgs. verse the Web using various algorithms that automatically edu/qform/qformRhtml shows portions of the HTML build searchable indexes. Trees are hierarchically orga­ file that result after processing the ASCII file with the nized indexes that require human intervention and deci­ qform program. The URL http://acb2.cgs.edu/qform/ sion making. A third class ofindex is represented by au­ qformC.html shows how this document would appear in tomatic indexing programs that are intended to make a browser that supports FORMS. The qscore script is re­ searchable only the information located within a given sponsible for grading a test when the submit button is directory structure on a local server. pushed. The URL http://acb2.cgs.edulqform/qformD.html The amount of information on a given server or in a shows the type of scored result that can be sent back to given directory structure can vary a great deal. For one the test taker or forwarded to the instructor for recording. recent class, I decided to abandon traditional textbooks 4. Electronic conferencing. The term chat seen in iso­ and make all class-related information WWW-accessible. lation is usually interpreted as an abbreviation for the in­ Thus, all reading materials, handouts, assignments, class ternet relay chat (lRC) service that was first introduced exercises, and quizzes were distributed via a Web server. several years ago. IRC is a client-server technology that All told, the materials consisting ofmany different kinds permits a group of people located anywhere with Inter­ ofdocuments took up over 7 Mb ofdisk storage. net access to have typed conversations in real time. In Rather than require students to manually search through essence, this is an extension ofthe UNIX talk facility that the information hierarchy each time they wanted to look supports typed real-time exchanges between individuals something up, I provided two indexes now accessible at located at two workstations. http://acb2.cgs.edulindexes.html. One index is simply a Historically, IRC has been a popular activity for col­ tree diagram of the directory structure. Documents can lege students sitting at university-owned workstations be individually selected from the tree and then searched idling away the late-night hours, having conversations on using a browser's built-in search function. This method any ofa number oftopics. The client software for early is tedious when numerous documents must be searched. IRC implementations required knowledge ofa fairly cryp­ The other index is automatically generated from all di­ tic command set. rectory documents. Titles ofall documents containing at There are now a number ofWeb-based chat programs least one occurrence of a given search string can be re­ that function in a manner fairly similar to the original trieved at once using a CGI script. Returned titles can then UNIX-based IRC programs. They vary in terms oftheir be individually selected for full display of the file. The complexity and features. Psychat accessible at http://acb2. indexing and search scripts are collectively referred to cgs.edu/webchat/psychat.html is an implementation ofa as ICE indexing (Neuss, 1995). program called Webchat offered by the Internet Round­ Automatically generated Web indexes have the advan­ table Society (1995). tages ofbeing up-to-date and ofbeing thoroughly search­ Psychat allows multiple individuals to have a text-based able in a single pass. Sometimes, the amount of information virtual conference in real time using a simple Webbrowser generated with such a search method can be overwhelm­ interface. The typed messages ofparticipants can be ac­ ing, and not all available indexing scripts provide the user companied by to a graphics image in gif format. with the ability to use Boolean search strategies. Typically,participants will type a URL (uniform resource locater) that is a link to pictures ofthemselves. Psychat Writing CGI Scripts also permits participants to include HTML links to other The easiest way to write CGI scripts is to find an ex­ sites in their messages. isting script that closely approximates a desired function. In addition to real-time propagation and accompany­ A script that can be used unaltered is always preferred; ing images, Webchat has other technical advantages over however, in many cases, minor changes may be necessary. listservs and Usenet news. Any posting needs only to re­ There are a number ofexample scripts in the public do­ side on a single server to be world-wide accessible. This main that can be retrieved over the Internet. Several can conserves disk space. Also, all postings are generally per­ be retrieved from NCSA (1995c). A list oflocations with manently archived. additional examples is included in Table 1. The most cur- 168 KIELEY rent version ofthis list can be accessed interactively from a novice, it is not as difficult to learn as C. Two O'Reilly http://acb2.cgs.edu/cgi.html. books, Learning Perl (Schwartz, 1993) and Programming Choice of the most suitable programming language Perl (Wall & Schwartz, 1990) and a book by Till (1994) for writing new scripts from scratch depends on a num­ can be used to learn the elements of Perl. ber offactors, including the platform a The majority of existing CGI scripts were written for server is running under, whether or not performance is a Perl version 4. Some scripts will also run under the re­ critical issue, and, perhaps most importantly, preexisting cently released Perl 5, which is object-oriented in nature. programming expertise. To date, UNIX-based servers The Perl language is freely distributed. A thorough elec­ are still the overwhelming choice for WWW sites serving tronic index that provides Internet access to Perl bina­ more than a few simultaneous users. Gateway scripts can ries, scripts, documentation, and help for a number of be written in any ofa number ofpopular languages found platforms, including UNIX, VMS, Mac OS, Windows, on UNIX systems, including C, Perl (short for practical and Windows NT, is available (Seymour, 1995). extraction and report language), and TCL. UNIX shell scripts can also be used as CGI scripts, although execu­ The Future ofCGI tion speed may be less than desired using this method. CGI The continuing widespread adoption ofthe WWW as scripts can also be implemented on OS, Win­ an information collection and distribution mechanism dic­ dows NT, and Windows WWW servers. Scripting infor­ tates that both current protocols and current development mation for non-UNIX-based servers is available on the methods must be either significantly enhanced or replaced. WWw. Table 1 lists several relevant locations. This trend will ultimately affect the current CGI protocol. As personal and small departmental Web servers con­ Recent developments at university and corporate re­ tinue to proliferate, a variety ofmore user-friendly tools search labs provide some indication ofwhat to expect in to support added functionality will emerge. Recent an­ the future. For example, Java is a new language environ­ nouncements suggest that these tools will be object-based ment developed by SUN Computer that has now been and require minimal levels ofprogramming proficiency. adopted by Netscape Communications, perhaps the most Whether the majority ofnew tools developed to support dominant commercial WWW software company. Java client-initiated program execution will conform to the cur­ has been designed as a simple-to-use, object-oriented, rent CGI protocol is open to question. Internet-friendly, multiplatform, development system. As it stands now, CGI scripting is the predominant VRML (virtual reality markup language) has many fea­ method for executing programs from a Web browser, and tures that are similar to Java and also holds promise as a Perl is the language ofchoice for authoring CGI scripts. tool to further accelerate the development ofnetwork in­ Perl generally has speed advantages over TCL, and, for formation services.

Table 1 Internet-Accessible CGI Scripting Resources URL Location Description http://cy-mac.welc.cam.ac.uk/cgi.html Applescript/Frontier car Tour http://www.bio.cam.ac.uk/web/form.html car Form Handling In Perl http://www.best.coml-hedlund/cgi-faq/ CGI Programmer's Reference http://www.boutell.comlcgic/ cgic: an ANSI C library for car Programming http://www.webedge.comlCGIinC CGl's in C (Mac) http://www.clark.net/pub/jrinker/cgi.htm CGl. .. Common Gateway Interface Index http://cy-mac.welc.cam.ac.uk/cgi.html car Tour http://www.intergalact.com/hp/hp.html car Tutorials On-Line http://wsk.eit.comlwsk/dist/doc/libcgi/libcgi.html ElI's car Library http://macsolutions.interstate.net/fmprocgi.html File Maker Pro car gateway http://www.comvista.comlnet/www/WWWDirectory.html Invented Worlds Software http://java.sun.coml Java Home Page http://www.uwtc.washington.edu/computing/ www/mac/cgi.html MacWWW-eGI Applications http://www.aftemet.coml-michaelmlold_public/cgi/ Michael's Guide to car Programming http://www.halcyon.comlmooncrow/cgi.htm Mooncrow's CGIIPERL Source Page http://hoohoo.ncsa.uiuc.edu/cgi/overview.html NCSA car Overview http://marquis.tiac.net/software/parse-cgi.html PARSE car OSAX (Macintosh) http://www.uni-frankfurt.de/-fp/tools/proccgiinput.html Processing Form Data in Shell CGI Scripts http://ruulst.let.ruu.nl:2000/tcl-cgi.html TCL-car home page http://www.hyperion.coml-koreth/uncgi.html Un-car version 1.5 http.z/www.umr.edu/r-cgiwrap/ Using CGI at UMR (CarWrap) http://www.creation.comlcode.html Virtual Creation's CGI Programmers Page http://Web.Actwin.Com:80/Newtype/vr/vrml/index.htm VRML Gateway http://www.charm.net/-webNlib/Providers/CGI.html Web Developer's Virtual Library: car http://www.Stars.coml Web Developer's Virtual Library http://www.achilles.net/-john/cgi-dos/ Win-httpd Car-DOS http://akebono.stanford.edu/yahoo/Computers/ World_Wide Web/CGCCommon Gateway_Interface Yahoo car Listings ------'" CGI SCRIPTS: GATEWAYS TO WWW POWER 169

Java, VRML, or similar language products may at some CUNNINGHAM, R. (1995). The qform package [On-line]. Available point replace the current methods as the preferred way to URL: http://www.satlab.hawaii.edu/space/hawaii/qform.html initiate program execution from a Web browser, but this FRANZ, B. (1995). Usenet-Web 1.0.2 [On-line]. Available URL: http: //www.netimages.com/-snowhare/utilities/usenet-web/ is unlikely to occur soon. Official are evolv­ GEORGIA INSTITUTE OF TECHNOLOGY GVU CENTER (1995). GVU's ing slowly in part because ofthe number ofinternational WWW user survey home page [On-line]. AvailableURL: http://www. institutions and companies involved in Web development. cc.gatech.edu/gvu/user_surveys/ While the Internet continues to evolve, CGI will con­ INTERNET ROUNDTABLE SOCIETY (1995). Webchat. [On-line].Available URL: http://www.irsociety.com/webchat.html tinue to find its place and allow Web enhancements to KIELEY, J. M. (1993). Integrating remotely accessible information ser­ meet specific application needs. vices into psychology instruction and research. Behavior Research Methods, Instruments, & Computers, 25, 287-294. Conclusions KIELEY, J. M. (1995). Qscore modifications [On-line]. AvailableURL: The current WWW represents a significant first step http://acb2.cgs.edu/qscoremod.html LEMAY, L. (1995). TeachyourselfWebpublishing with HTML in a week. toward integrating a wide variety ofelectronic informa­ Indianapolis, IN: Sams. tion resources. The standard Web programming language, LID, c., PEEK, J., JONES, R., Buus, B., & NYRE, A. (1994). Managing HTML, does not work this magic alone. HTML is a sim­ internet information resources. Sebastopol, CA: O'Reilly. ple presentation language. HTML must rely on CGI gate­ NCSA (I 995a). The generic SQL gateway to the World-Wide Web [On­ way scripts to achieve its full potential at present. line]. AvailableURL:http://hoohoo.ncsa.uiuc.edu/CGI/overview.html NCSA (I 995b). The generic SQL gateway to the World-Wide Web [On­ With the currently available tools, CGI script writing is line]. AvailableURL: http://fiaker.ncsa.uiuc.edu:7999/gsqlbof/ not something everyone will want to attempt. It is impor­ NCSA (l995c). The generic SQL gateway to the World-Wide Web tant to recognize some of the ways in which standard [On-line]. Available URL: ftp://ftp.ncsa.uiuc.edu/Web/httpdlUnix! WWW technology can be enhanced by programs ascrib­ ncsa_httpdlcgi/ NETSCAPE COMMUNICATIONS (1995). Netscape communications home ing to the CGI protocol. Existing CGI programs may be page [On-line]. AvailableURL: http://home.netscape.com/ available to fit a given application unaltered. In other in­ NEUSS, C. (1995). Ice indexingserverextensionfor WWWservers [On­ stances, minor modifications may be able to adapt a script line]. AvailableURL: http://www.ham.muohio.edul./ICE for a specific use. Under some circumstances, scripts may RANSDELL, S. E., & ANDERSON, M. D. (1995). Establishing a SCiP list for year-round discussion: Keeping the information ball rolling. Be­ have to be developed from scratch. havior Research Methods, Instruments, & Computers, 27,116-119. As commercial developers become more invested in the SALZ, R. (1992). mai/2news [On-line]. AvailableURL: ftp://ftpmail/ftp. WWW, custom Web extensions will become easier to de­ elelab.nsc.co.jp/pub/net/news/maiI2news.tar.gz velop for nonprogrammers. It is too early to tell which SCHWARTZ, R. L. (1993). Learning Perl. Sebastopol, CA: O'Reilly. methods will eventually become dominant, but it is clear SEYMOUR, R. (1995). Usenet-Web 1.0.2 [On-line]. Available URL: http://www.ee.pdx.edu/-rseymour/perl/ that Web technology will evolve in the direction of ex­ TILL, D. (1994). TeachyourselfPerl in 21 days. Indianapolis, IN: Sams. tended functionality and more user-friendly control. WALL, L., & SCHWARTZ, R. L. (1990). Programming Perl. Sebastopol, CA: O'Reilly. REFERENCES

CHANDLER. D. M. (1995). Running a perfect . Indianapolis, IN: (Manuscnpt received November 13, 1995; Que Corporation. revision accepted for publication February 12, 1996.)