<<

Today

• Finish introduction to Perl Lecture 10 • CGI scripting with Perl • about next assignment

Perl (cont.) • development tools UNIX Development Tools

RE (cont.) Awk Influence

string using RE (whitespace by default) • $, field separator OFS @fields = split /:/, “::ab:cde:f”; # gets (“”,””,”ab”,”cde”,”f”) • $/ input record separator (line delimiter) RS • into one undef $/; $str = join “-”, @fields; # gets “--ab-cde-f” $str = ; # str gets whole something from a list • $\ output record separator (for print) ORS – Similar to UNIX grep, but not limited to using regular expressions @selected = grep(!/^#/, @code);

Subroutines Lexical Variables

• Defined with sub: sub myfunc { ... } • Local (lexical) variables can be declared with my • Subroutines calls are prefaced with &. e.g. &foo my $val = 10; my($param1, $param2) = @_; • Any of the three principal types may be passed as parameters or used as a return value • Can also be used in any block • $_ passed by default • Use the use strict pragma to enforce good programming practice • Parameters are received by the subroutines in the special #!/usr/bin/perl -w array @_: $_[0], $_[1], ... use strict; • The scalars in @_ are implicit aliases for the ones passed $foo = “bar”; # not localized by my • By default, value of last expression evaluated is returned. print “$foo\n”; Override with return statement

1 Subroutine Examples Another Subroutine Example

&foo; # call subroutine foo with $_ @nums = (1, 2, 3); &foo(@list); # call foo passing an array $num = 4; $x = &foo(‘red’, 3, @arr); @res = dec_by_one(@nums, $num); # @res=(0, 1, 2, 3) # (@nums,$num)=(1, 2, 3, 4) @list = &foo; # foo returns a list dec_by_1(@nums, $num); # (@nums,$num)=(0, 1, 2, 3)

sub dec_by_one { sub simple { my @ret = @_; # a copy my $sum; for my $n (@ret) { $n-- } foreach $_ (@_) { return @ret; $sum += $_; } } return $sum; sub dec_by_1 { } for (@_) { $_-- } } $foo = ‘myroutine’ &$foo(@list); # call subroutine indirectly

Pass-by-reference String Functions • Several string manipulation functions: @x = (1, 2); – crypt, index, rindex, length, substr, sprintf @y = (3, 4); • Adds others: @z = &vector_add( \@x, \@y ); – chop removes the last character from a string – chomp removes trailing string corresponding to $/, normally sub vector_add { newline my ($u, $v) = @_; – work with scalars or arrays # @$u and @$v refer to @x and @y @stuff = (“hello\n”, “hi\n”, “ola\n”); my @sum; chomp(@stuff); $sum[0] = $$u[0] + $$v[0]; $sum[1] = $$u[1] + $$v[1]; return @sum; }

Other Built-in Functions Good Way to Learn Perl

• Numeric functions • a2p – abs, sin, cos, log, sqrt, rand – Translates an program to Perl • Functions for processes and process groups • s2p – , , system, waitpid – Translates a script to Perl • functions – localtime, time $time_str = localtime; print $time_str; # Wed Nov 10 19:24:30 2004

2 Modules Example — Mail::Mailer

• Modules are reusable code with specific use Mail::Mailer;

functionality $mailer = Mail::Mailer->new("sendmail"); • Standard modules are distributed with Perl, $mailer->open({ From => “elee\@cims.nyu.edu”, To => “kornj\@cs.nyu.edu”, others can be obtained from Subject => “Lecture 10” }) or die "Can't open mailer: $!\n"; • Include modules in your program with use, print $mailer “Let’s cancel today’s lecture\n”; e.g. use CGI; $mailer->close( );

CGI Review CGI Review (cont.)

• Allow web server to communicate with other GET: programs GET /cgi-bin/myscript.pl?name=Bill%20Gates& • Web pages created dynamically according to user title=Chairman HTTP/1.1 input POST: • 2 methods for sending form data: – GET - variable=value pairs in the POST /cgi-bin/myscript.pl HTTP/1.1 QUERY_STRING (from %ENV in Perl) … headers… – POST - variable=value pairs in STDIN

name=Bill%20Gates&title=Chairman

A (rather ugly) CGI Script CGI.pm

#!/usr/local/bin/perl $size_of_form_info = $ENV{'CONTENT_LENGTH'}; • Interface for parsing and interpreting query read ($STDIN, $form_info, $size_of_form_info);

# Split up each pair of key/value pairs strings passed to CGI scripts foreach $pair (split (/&/, $form_info)) { # For each pair, split into $key and $value variables ($key, $value) = split (/=/, $pair); • Methods for creating generating HTML # Get rid of the pesky %xx encodings $key =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg; $value =~ s/%([\dA-Fa-f][\dA-Fa-f])/pack ("C", hex ($1))/eg; • Methods to handle errors in CGI scripts # Use $key as index for $parameters hash, $value as value $parameters{$key} = $value; }

# Print out the obligatory content line print "Content-type: text/plain\n\n";

# Tell the user what they said print "Your birthday is on " . $parameters{birthday} . ".\n";

3 Script Using CGI.pm The Mailer Form

**Formatting tags have been stripped #!/usr/local/bin/perl -w

From: my $query = CGI->new(); To: Subject: my $bday = $query->param("birthday"); Message: print "Your birthday is $bday.";

The Mailer CGI Script mod_perl use CGI; use Mail::Mailer; my $query = CGI->new(); • Perl embedded in Apache web my $from_user = $query->param(”from"); my $to_user = $query->param(“ rcpt”); server my $subject = $query->param(“ subject”); my $msg = $query->param(“ msg”); • No need to start new process for every CGI $mailer = Mail::Mailer->new("sendmail"); $mailer->open({ From => $from_user, request To => $to_user, Subject => $subject }) or die "Can't open mailer: $!\n"; print $mailer $msg; $mailer->close( ); print $query->header(-type => “text/html”); print $query->start_html(-title => “Message Sent”); print $query->p(”Your message to $to_user has been sent!\n”); print $query->end_html();

Further Reading

Software Development Tools

4 Types of Development Tools Make

• Compilation and building: make • make: A program for building and • Managing files: RCS, SCCS, CVS maintaining computer programs • Editors: , emacs – developed Bell Labs around 1978 by S. Feldman (now at IBM) • Archiving: tar, cpio, , RPM • Instructions stored in a special format file • Configuration: autoconf called a “makefile”. • Debugging: gdb, dbx, prof, strace, purify • Programming tools: , , lint, indent

Make Features Compilation Phases

• Contains the build instructions for a project Component Input Output – Automatically updates files based on a series of pre-processed dependency rules preprocessor source code source code – Supports multiple configurations for a project pre-processed assembly • Only re-compiles necessary files after a change (conditional compilation) source code source code – Major time-saver for large projects assembly assembler – Uses timestamps of the intermediate files source code • Typical usage: is updated from object files which are in turn compiled from source files linker object files executable file

Dependency Graph Makefile Format

• Rule Syntax: myprog : link

foo.o bar.o baz.o – The is a list of files that the command will generate compile – The may be files and/or other targets, and will be used to create the target foo.c bar.c baz.c – It must be a tab before , or it won’t work – The first rule is the default for make

original generated

5 Examples of Invoking Make Make: Sequence of Execution

• make -f makefile • Make executes all commands associated • make target with target in makefile if one of these conditions is satisfied: • make – file target does not exist – looks for file makefile or Makefile in current directory, picks first target listed in the – file target exists but one of the source files in makefile the dependency list has been modified more recently than target

Example Makefile Make Power Features

# Example Makefile CC=g++ • Many built-in rules CFLAGS=-g –Wall -DDEBUG – e.g. C compilation foobar: foo.o bar.o $(CC) $(CFLAGS) –o foobar foo.o bar.o • “Fake” targets – Targets that are not actually files foo.o: foo.cpp foo.h – Can do just about anything, not just compile $(CC) $(CFLAGS) –c foo.cpp – Like the “clean” target bar.o: bar.cpp bar.h $(CC) $(CFLAGS) –c bar.cpp • Forcing re-compiles the required files clean: – foo.o bar.o foobar – touch the Makefile to rebuild everything

Make Patterns and Variables Version Control

• Variables (macros): • Provide the ability to store/access and protect all – VAR = Set a variable of the versions of source code files – $(VAR) Use a variable • Provides the following benefits: • Suffix Rules – .c.o: specifies a rule to build x.o from x.c – If program has multiple versions, it keeps track only of – Default: differences between multiple versions. .c.o: – Multi-user support. Allows only one person at the time $(CC) $(CFLAGS) -c $< to do the editing. • Special: – Provides a way to look at the history of program – $@: target development. – $<: dependency list – $*: target with suffix deleted

6 Version Control Systems RCS Basic Operations • Set up a directory for RCS: • SCCS: UNIX Source Code Control System – RCS – Rochkind, Bell Labs, 1972. • Check in a new file into the repository – ci filename • RCS: Revision Control System • Check out a file from the repository for reading – Tichy, Purdue, 1980s. – co filename • Check out a file from the repository for writing • CVS: Concurrent Versions System – co –l filename – Grune, 1986, Berliner, 1989. – Acquires lock • Compare local copy of file to version in repository – rcsdiff [–] filename

SCCS Equivalents RCS Keywords Function RCS SCCS

Setup mkdir RCS mkdir SCCS • Keywords in source files are expanded to contain RCS info at checkout Check in new foo.c ci foo.c sccs create foo.c – $keyword$ → $keyword: value $ – Use ident to extract RCS keyword info Check in update to foo.c ci foo.c sccs delta foo.c • $Author$ Username of person checked in the revision Get read-only foo.c co foo.c sccs get foo.c • $Date$ Date and time of check-in • $Id$ A title that includes the RCS filename, Get writeable foo.c co -l foo.c sccs edit foo.c revision number, date, author, state, and (if locked) the person locked the file Version history of foo.c rlog foo.c sccs print foo.c • $Revision$ The revision number assigned Compare foo.c to v1.1 rcsdiff sccs diffs –r1.1 foo.c -r1.1 foo.c

CVS Major Features CVS Repositories

• No exclusive locks like RCS • All revisions of a file in the project are in – No waiting around for other developers the repository (using RCS) – No hurrying to make changes while others • Work is done on the checkout (working – Avoid the “lost update” problem copy) • Client/Server model • Top-level directories are modules; checkout – Distributed software development operates on modules • Front-end tool for RCS with more functions • Different ways to connect

7 CVSROOT Getting Started

• Environment Variable • cvs [basic-options] [cmd-options] [files] • Basic options: • Location of Repository – -d Specifies CVSROOT – -H Help on command • Can take different forms: – -n Dry run – Local file system: /usr/local/cvsroot • Commands – import, checkout – Remote Shell: – update, commit user@server:/usr/local/cvsroot – add, remove – Client/Server: – status, , log – tag... :pserver:user@server:/usr/local/cvsroot

Setting up CVS Managing files

• Importing source • Add files: add (cvs add ) • Remove files: remove (cvs remove ) – Generates a new module • Get latest version from repository: update – into source directory – If out of sync, merges changes. Conflict resolution is manual. – cvs –d import • Put changed version into repository: commit – Fails if repository has newer version (need update first) – cvs –d checkout • Can handle binary files (no merging or diffs) • Specify a symbolic tag for files in the repository: tag

tar: Tape ARchiver tar: archiving files options

• tar: general purpose archive utility – c creates a tar-format file (not just for tapes) – f filename specify filename for tar-format file, – Usage: tar [options] [files] • Default is /dev/rmt0. – Originally designed for maintaining an archive of files • If - is used for filename, standard input or standard on a magnetic tape. output is used as appropriate – Now often used for packaging files for distribution – v verbose output – If any files are subdirectories, tar acts on the entire subtree. – x allows to extract named files

8 tar: archiving files (continued) cpio: copying files

– t generates table of contents • cpio: copy file archives in from or out of – r unconditionally appends the tape or disk or to another location on the listed files to the archive files local machine – u appends only files that are more recent than those already archived • Similar to tar – L follow symbolic links • Examples: – m do not restore file modification times – Extract: cpio -idtu [patterns] – l print error messages about links it – Create: cpio -ov cannot – Pass-thru: cpio -pl directory

cpio (continued) cpio (continued)

• cpio -i [dtum] [patterns] • cpio -ov – Copy in (extract) files whose names match • Copy out a list of files whose names are given on the selected patterns. standard input. -v lists files processed. – If no pattern is used, all files are extracted • cpio -p [options] directory – During extraction, older files are not extracted • Copy files to another directory on the same system. (unless -u option is used) Destination pathnames are relative to the named – Directories are not created unless –d is used directory – Modification times not preserved with -m • Example: To copy a directory tree: – find . -depth -print | cpio -pdumv /mydir – Print the table of contents: -t

pax: replacement for cpio and tar Distributing Software

• Portable Archive eXchange format • Pieces typically distributed: • Part of POSIX – Binaries • Reads/writes cpio and tar formats – Required runtime libraries • Union of cpio and tar functionality – Data files • Files can come from standard input or command line – Man pages • Sensible defaults – pax –wf archive *.c – Documentation – pax –r < archive – Header files • Typically packaged in an archive: – e.g., perl-solaris.tgz or perl-5.8.5-9.i386.rpm

9 Packaging Source: autoconf Installing Software From Tarballs • Produces shell scripts that automatically configure software to adapt to UNIX-like systems. tar xzf – Generates configuration script (configure) cd • The configure script checks for: – programs ./configure – libraries – header files make – typedefs make install – structures – compiler characteristics – functions – system services and generates makefiles

Debugging Debuggers

• The ideal: • Advantages over the “old fashioned” way: do it right the first time – you can step through code as it runs – you don’t have to modify your code • The reality: – you can examine the entire state of the program bugs happen • , variable values, scope, etc. – you can modify values in the running program • The goal: – you can view the state of a crash using core files exterminate, quickly and efficiently

Debuggers Using the Debugger

• The GDB or DBX debuggers let you examine the • Two ways to use a debugger: internal workings of your code while the program 1. Run the debugger on your program, executing the runs. program from within the debugger and see what happens – Debuggers allow you to set breakpoints to stop the 2. Post-mortem mode: program has crashed and core program's execution at a particular point of interest and dumped examine variables. • You often won't be able to find out exactly what happened, – To work with a debugger, you first have to recompile but you usually get a stack trace. the program with the proper debugging options. • A stack trace shows the chain of function calls where the program exited ungracefully – Use the -g command line parameter to cc, gcc, or CC • Does not always pinpoint what caused the problem. • Example: cc -g -c foo.c

10 GDB, the GNU Debugger Basic GDB Commands

• General Commands: • Text-based, invoked with: file [] selects as the program to debug gdb [ [|]] run [] runs selected program with arguments • Argument descriptions: attach attach gdb to a running process executable program file kill kills the process being debugged core dump of program quit quits the gdb program process id of already running program help [] accesses the internal help documentation • Example: • Stepping and Continuing: c[ontinue] continue execution (after a stop) gdb ./hello s[tep] step one line, entering called functions • Compile with –g for debug info n[ext] step one line, without entering functions finish finish the function and print the return value

GDB Breakpoints Playing with Data in GDB

• Useful breakpoint commands: • Commands for looking around: b[reak] [] sets breakpoints. can be list [] prints out source code at a number of things, including a hex search searches source code for address, a function name, a line backtrace [] prints a backtrace levels deep number, or a relative line offset info [] prints out info on (like [r]watch sets a watchpoint, which will break local variables or function args) when is written to [or read] p[rint] [] prints out the evaluation of prints out a listing of all breakpoints info break[points] • Commands for altering data and control path: clear [] clears a breakpoint at set sets variables or arguments d[elete] [] deletes breakpoints by number return [] returns from current function jump jumps execution to

Tracing System Calls • Most operating systems contain a utility to monitor system calls: – : strace, Solaris: truss, SGI: par

27mS[ 1] : close(0) OK 27mS[ 1] : open("try.in", O_RDONLY, 017777627464) 29mS[ 1] : END-open() = 0 29mS[ 1] : read(0, "1\n2\n|/bin/date\n3\n|/bin/sleep 2", 2048) = 31 29mS[ 1] : read(0, 0x7fff26ef, 2017) = 0 29mS[ 1] : getpagesize() = 16384 29mS[ 1] : brk(0x1001c000) OK 29mS[ 1] : time() = 1003207028 29mS[ 1] : fork() 31mS[ 1] : END-fork() = 1880277 41mS[ 1] (1864078): was sent signal SIGCLD 31mS[ 2] : waitsys(P_ALL, 0, 0x7fff2590, WTRAPPED|WEXITED, 0) 42mS[ 2] : END-waitsys(P_ALL, 0, {signo=SIGCLD, errno=0, code=CLD_EXITED, pid=1880277, status=0}, WTRAPPED|WEXITED, 0) = 0 42mS[ 2] : time() = 1003207028

11