
Scripting Languages
COMS W4115
Prof. Stephen A. Edwards
Spring 2002
Columbia University
Department of Computer Science

Scripting Languages
Higher-level languages.
More compact for small programs.
Often not suitable for large systems.
The more popular ones:
    Awk
    Perl
    Python
    Tcl
    Bourne shell

Awk
Named for its three developers:
    Alfred Aho
    Peter Weinberger
    Brian Kernighan
Good for file manipulation. I use it for grades and other simple tasks.

Simple Awk Program
Input file. Each line a record. Space-separated fields:
employee, pay rate, hours worked

    Beth 10.0 0
    Dan 9.75 0
    Kathy 14.0 10
    Mark 10.0 20
    Susie 8.25 18

Run on the awk program

    $3 > 0 { print $1, $2 * $3 }

produces

    Kathy 140
    Mark 200
    Susie 148.5

Simple Awk Program
    Beth 10.0 0
    Kathy 14.0 10        (the 3rd field is 0 for Beth, 10 for Kathy)

    $3 > 0 { print $1, $2 * $3 }
    pattern: $3 > 0      action: { print $1, $2 * $3 }

    Kathy 140

General structure of an Awk program
    pattern { action }
    pattern { action }
    ...
Awk scans an input file one line at a time and, in order, runs each action whose pattern matches.
Patterns:
    BEGIN, END              True before and after the file.
    expression              Condition
    /regexp/                String pattern match
    pattern && pattern
    pattern || pattern      Boolean operators
    ! pattern
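The pattern-action loop described above can be sketched in Python. This is only an illustration of the control flow, not how Awk is implemented; `run_awk` and the rule format (a list of pattern/action function pairs) are names invented for this sketch.

```python
# Sketch of Awk's "pattern { action }" loop. run_awk and the rule format are
# hypothetical names for illustration; they are not part of Awk or any library.

def run_awk(rules, lines):
    """Apply each (pattern, action) rule, in order, to every input record."""
    output = []
    for line in lines:
        fields = line.split()            # whitespace-separated fields
        for pattern, action in rules:
            if pattern(fields):          # run the action only when the pattern matches
                output.append(action(fields))
    return output

# Equivalent of:  $3 > 0 { print $1, $2 * $3 }
# Awk numbers fields from 1, so $1 is fields[0] and $3 is fields[2].
rules = [
    (lambda f: float(f[2]) > 0,
     lambda f: "%s %g" % (f[0], float(f[1]) * float(f[2]))),
]

records = [
    "Beth 10.0 0",
    "Dan 9.75 0",
    "Kathy 14.0 10",
    "Mark 10.0 20",
    "Susie 8.25 18",
]

for row in run_awk(rules, records):
    print(row)
```

On the employee file from the slide this prints the same three lines as the Awk program: Kathy 140, Mark 200, Susie 148.5.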

Awk One-Liners
Print every line
    { print }
Print the first and third fields of each line
    { print $1, $3 }
Print every line with three fields
    NF == 3 { print }
Print a line number before every line
    { print NR, $0 }

Statistics in Awk
    #!/bin/awk -f
    BEGIN { n = 0; s = 0; ss = 0;}
    NF == 1 { n++; s += $1; ss += $1 * $1; }
    END {
      print n " data points"
      m = (s+0.0) / n; print m " average"
      sd = sqrt( (ss - n * m * m) / ( n - 1.0))
      print sd " standard deviation"
    }
Run on
    1
    5
    10
    3
    7
    11
gives
    6 data points
    6.16667 average
    3.92003 standard deviation

Associative Arrays: Word Counter
    {
      gsub(/[.,:;!(){}]/, "") # remove punctuation
      for ( i = 1 ; i <= NF ; i++ )
        count[$i]++
    }
    END {
      for (w in count)
        print count[w], w | "sort -rn"
    }
Run on the Tiger reference manual produces
    103 the
    58 of
    51 is
    49 and
    49 a
    35 expression
    32 The
    29 =

Perl
Larry Wall's
Practical Extraction and Report Language
or
Pathologically Eclectic Rubbish Lister
Larger, more flexible language than Awk. Good for text processing and other tasks.
Strange semantics. Heinous syntax.
Excellent regular-expression support.
More complicated data structures possible (even classes).

Wordcount in Perl
    #!/usr/bin/perl
    while(<>) {
      chop;
      s/[.,:;!(){}]//g;
      @words = split;
      foreach $word (@words) {
        $count{$word}++;
      }
    }
    open(SORTER, "| sort -nr");
    foreach $word (keys %count) {
      print SORTER
        $count{$word}, " ", $word,"\n";
    }

Understandable wordcount in Perl
    #!/usr/bin/perl
    while($line = <>) {
      chop($line);
      $line =~ s/[.,:;!(){}]//g;
      @words = split(/\s+/, $line);
      foreach $word (@words) {
        $count{$word}++;
      }
    }
    open(SORTER, "| sort -nr");
    foreach $word (keys %count) {
      print SORTER
        $count{$word}, " ", $word,"\n";
    }
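The Awk statistics script earlier in this section keeps only three running values (the count n, the sum s, and the sum of squares ss) and derives the mean and sample standard deviation from them at the end. A minimal Python sketch of the same arithmetic follows; `running_stats` is a name chosen for this illustration, and it assumes at least two data points so that n - 1 is nonzero.

```python
import math

def running_stats(values):
    """One pass over the data, mirroring the n, s, ss variables of the Awk script."""
    n = 0
    s = 0.0     # running sum
    ss = 0.0    # running sum of squares
    for x in values:
        n += 1
        s += x
        ss += x * x
    mean = s / n
    # Sample standard deviation: sqrt((ss - n * mean^2) / (n - 1)); needs n >= 2.
    sd = math.sqrt((ss - n * mean * mean) / (n - 1))
    return n, mean, sd

n, mean, sd = running_stats([1, 5, 10, 3, 7, 11])
print("%d data points" % n)
print("%g average" % mean)
print("%g standard deviation" % sd)
```

For the six values on the slide this reproduces 6 data points, 6.16667 average, and 3.92003 standard deviation. The one-pass formula is compact, but it subtracts two large, nearly equal numbers when the data have a large mean and small spread, so it can lose precision on such inputs.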

“There’s More Than One Way To Do It”
Perhaps too many. Equivalent ways to print STDIN:
    while (<STDIN>) { print; }
    print while <STDIN>;
    while (defined($_ = <STDIN>)) { print $_; }
    for (;<STDIN>;) { print; }
    print $_ while defined($_ = <STDIN>);
Many Perl statements come in prefix and postfix forms
    while (...) ...
    ... while ...
    if (...) ...
    ... if ...
    ... unless ...

So Why Perl?
Perhaps the most popular scripting language.
Despite its flaws, it’s very powerful.
Almost has a good [...] system.
Very few things can't be done in Perl.
Fast, flexible [...].
Ability to [...] virtually every [...]. Binary data manipulation.
Ported everywhere.
Very, very extensive collection of libraries. Database access. CGI/HTML for the web. Math. IPC. Time.

Python
Perl designed by a sane man.
Very clean syntax and semantics.
Large collection of libraries (but not as big as Perl’s).
Regular expression support (but not as integrated as Perl’s).

Wordcount in Python
    #!/usr/bin/env python
    import fileinput, re, string, os

    count = {}
    for line in fileinput.input():
        line = re.sub(r'[.,:;!(){}]',"",line)
        for word in string.split(line):
            if not count.has_key(word):
                count[word] = 1
            else:
                count[word] = count[word] + 1
    f = os.popen("sort -nr",'w')
    for word in count.keys():
        f.write('%d %s\n' % (count[word], word) )

Python Classes
    class Complex:
        def __init__(self, realpart, imagpart):
            self.r = realpart
            self.i = imagpart
        def add(self, a):
            self.r = self.r + a.r
            self.i = self.i + a.i
        def p(self):
            print "%g + %gi" % (self.r,self.i)
    x = Complex(1,2)
    y = Complex(2,3)
    x.p()
    x.add(y)
    x.p()
Prints
    1 + 2i
    3 + 5i

Python’s Merits
Good support for programming-in-the-large:
    Packages with separate namespaces; Exceptions; Classes
Persistent data structures (pickling)
High-level: lists, [...], associative arrays, [...]
Good collection of libraries:
    Operating-system access (files, directories, etc.); String manipulation; Curses; Databases; Networking (CGI, HTTP, URL, mail/MIME, HTML); [...]; Cryptography; System-specific (Windows, Mac, SGI, POSIX)

Python vs. Perl
Python can be the more verbose language, but Perl can be cryptic.
Regular expression support more integrated with language in Perl.
Perl better-known.
Probably comparable execution speeds.
More “tricks” possible in Perl; Python more disciplined.
Python has the much cleaner syntax and semantics; I know which language’s programs I’d rather maintain.

Tcl
John Ousterhout’s Tool Command Language was originally intended to be grafted on to an application to make it controllable.
It has since become a general-purpose scripting language. Its syntax is quite simple, although rather atypical for a programming language.
Tk, a Tcl package, provides graphical user-interface widgets. Tcl/Tk may be the easiest way to write a GUI.
Tk has been connected to Perl and Python as well.

Tcl Syntax
Shell-like command syntax:
    command argument argument . . .
All data is strings (incl. numbers and lists)
[...]-like variable substitution:
    set foo "123 abc"
    bar 1 $foo 3
Command substitution:
    set foo 1
    set bar 2
    puts [expr $foo + $bar]; # Print 3
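Returning to the Python word counter earlier in this section: it is written for the Python of 2002 and relies on `dict.has_key` and `string.split`, neither of which exists in Python 3. A rough Python 3 sketch of the same idea is shown below. It swaps the pipe to `sort -nr` for in-process sorting, and the names `count_words` and `report`, along with the alphabetical tie-breaking rule, are choices made for this illustration rather than anything from the slides.

```python
import re
from collections import Counter

# Same punctuation set that the slide's scripts strip before splitting.
PUNCTUATION = re.compile(r"[.,:;!(){}]")

def count_words(lines):
    """Count whitespace-separated words after removing punctuation."""
    counts = Counter()
    for line in lines:
        counts.update(PUNCTUATION.sub("", line).split())
    return counts

def report(counts):
    """Format 'count word' lines, highest count first (ties alphabetical)."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ["%d %s" % (n, word) for word, n in ordered]

sample = ["The cat, the hat.", "the end!"]
for line in report(count_words(sample)):
    print(line)
```

To count words from files or standard input as the slide's script does, `count_words(fileinput.input())` should work after `import fileinput`.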

Wordcount in Tcl
    #!/usr/bin/env tclsh
    while {[gets stdin line] >= 0} {
      regsub -all {[.,:;!(){}]} $line "" line
      foreach word $line {
        if {![info exists count($word)]} {
          set count($word) 1
        } else {
          incr count($word)
        }
      }
    }
    set f [open "| sort -rn" w]
    foreach word [array names count] {
      puts $f "$count($word) $word"
    }

Nifty Tcl Features
Associative arrays
    set count(Stephen) 1
Lists
    lappend foo 1
    lappend foo 2
    foreach i $foo { puts $i } ; # print 1 then 2
Procedures
    proc sum3 {a b c} {
      return [expr $a + $b + $c]
    }

Tk
“Hello World” in Tk.
    button .b -text "Hello World" -command ""
    pack .b

An Editable Graph
    # Set up the main window
    set w .plot
    catch {destroy $w}
    toplevel $w
    wm title $w "Plot Demonstration"
    wm iconname $w "Plot"
    positionWindow $w
    set c $w.c

    # Text description at top
    label $w.msg -font $font -wraplength 4i -justify left \
      -text "This window displays a canvas widget containing
    a simple 2-dimensional plot. You can doctor the data
    by dragging any of the points with mouse button 1."
    pack $w.msg -side top

    # Set up bottom control buttons
    frame $w.buttons
    pack $w.buttons -side bottom -fill x -pady 2m
    button $w.buttons.dismiss -text Dismiss -command "destroy $w"
    button $w.buttons.code -text "See Code" -command "showCode $w"
    pack $w.buttons.dismiss $w.buttons.code -side left -expand 1

    # Set up graph itself
    canvas $c -relief raised -width 450 -height 300
    pack $w.c -side top -fill x

    # Draw axes
    set plotFont {Helvetica 18}
    $c create line 100 250 400 250 -width 2
    $c create line 100 250 100 50 -width 2
    $c create text 225 20 -text "A Simple Plot" -font $plotFont \
      -fill brown

    # Draw axis labels
    for {set i 0} {$i <= 10} {incr i} {
      set x [expr {100 + ($i*30)}]
      $c create line $x 250 $x 245 -width 2
      $c create text $x 254 -text [expr 10*$i] \
        -anchor n -font $plotFont
    }
    for {set i 0} {$i <= 5} {incr i} {
      set y [expr {250 - ($i*40)}]
      $c create line 100 $y 105 $y -width 2
      $c create text 96 $y -text [expr $i*50].0 \
        -anchor e -font $plotFont
    }

    # Draw points
    foreach point {{12 56} {20 94} {33 98} {32 120} {61 180}
        {75 160} {98 223}} {
      set x [expr {100 + (3*[lindex $point 0])}]
      set y [expr {250 - (4*[lindex $point 1])/5}]
      set item [$c create oval [expr $x-6] [expr $y-6] \
        [expr $x+6] [expr $y+6] -width 1 -outline black \
        -fill SkyBlue2]
      $c addtag point withtag $item
    }

    # Bind actions to events
    $c bind point <Any-Enter> "$c itemconfig current -fill red"
    $c bind point <Any-Leave> "$c itemconfig current -fill SkyBlue2"
    $c bind point <1> "plotDown $c %x %y"
    $c bind point <ButtonRelease-1> "$c dtag selected"
    bind $c <B1-Motion> "plotMove $c %x %y"
    set plot(lastX) 0
    set plot(lastY) 0

    proc plotDown {w x y} { # Called when point clicked
      global plot
      $w dtag selected
      $w addtag selected withtag current
      $w raise current
      set plot(lastX) $x
      set plot(lastY) $y
    }

    proc plotMove {w x y} { # Called when point dragged
      global plot
      $w move selected [expr $x-$plot(lastX)] \
        [expr $y-$plot(lastY)]
      set plot(lastX) $x
      set plot(lastY) $y
    }

Bourne Shell
Default shell on most Unix systems (sh or bash).
Good for writing “shell scripts:” parsing command-line arguments, invoking and controlling other commands, etc.
Example: The cc command.
Most C compilers are built from four pieces:
    Preprocessor (cpp)
    Actual compiler (cc1)
    Assembler (as)
    Linker (ld)
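How the four pieces fit together can be sketched as a small planning function. This is a sketch only: it builds the command strings a cc-style driver would run without executing anything. The tool names cpp, cc1, as, and ld come from the slide, and the default crt1.o startup object and a.out output name follow the slides' own cc script; `plan_build` and its suffix handling are assumptions made for this example.

```python
def plan_build(files, outfile="a.out"):
    """Return, in order, the shell commands a cc-style driver would run.

    Hypothetical helper for illustration; nothing is executed.
    """
    # Sort inputs by suffix: C sources, assembly sources, objects/archives.
    cfiles = [f for f in files if f.endswith(".c")]
    sfiles = [f for f in files if f.endswith(".s")]
    ofiles = ["crt1.o"] + [f for f in files if f.endswith((".o", ".a"))]

    commands = []

    # Stages 1 and 2: preprocess each C file and compile it to assembly.
    for c in cfiles:
        asm = c[:-2] + ".s"
        commands.append(f"cpp {c} | cc1 > {asm}")
        sfiles.append(asm)

    # Stage 3: assemble every assembly file into an object file.
    for s in sfiles:
        obj = s[:-2] + ".o"
        commands.append(f"as -o {obj} {s}")
        ofiles.append(obj)

    # Stage 4: link all object files into the output file.
    commands.append(f"ld -o {outfile} {' '.join(ofiles)}")
    return commands

for command in plan_build(["main.c", "util.s", "lib.o"]):
    print(command)
```

Each stage consumes the previous stage's output files, which is why a driver can stop early (after preprocessing, compiling, or assembling) simply by exiting between loops.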

cc in sh
    #!/bin/sh

    # Set up command names
    root=/usr/lib
    cpp=$root/cpp
    cc1=$root/cc1
    as=/usr/bin/as
    ld=/usr/bin/ld

    # Complaint function
    usage() {
      echo "usage: $0 [options] files ..." 1>&2
      exit 1
    }

    # Default output filename
    outfile="a.out"

    # Parse command-line options
    while [ ! -z "$1" ];
    do case x"$1" in
        x-v) echo "Stephen's cc 1.0"; exit 0 ;;
        x-o) shift; outfile=$1 ;;
        x-c) stopafterassemble=1 ;;
        x-S) stopaftercompile=1 ;;
        x-E) stopafterpreprocess=1 ;;
        x-*) echo "Unknown option $1" 1>&2; usage ;;
        *) break ;;
      esac
      shift
    done

    # Initialize lists of files to process
    cfiles=""
    sfiles=""
    ofiles="crt1.o"

    if [ $# = 0 ]; then
      echo "$0: No input files" 1>&2; exit 1
    fi

    # Parse filenames
    while [ ! -z "$1" ]; do
      case x"$1" in
        x*.c) cfiles="$cfiles $1" ;;
        x*.s) sfiles="$sfiles $1" ;;
        x*.o | x*.a) ofiles="$ofiles $1" ;;
        *) echo "Unrecognized type $1" 1>&2; exit 1 ;;
      esac
      shift
    done

    # Run preprocessor standalone
    if [ "$stopafterpreprocess" ]; then
      for file in $cfiles; do
        $cpp $file
      done
      exit 0
    fi

    # Preprocess and compile to assembly
    for file in $cfiles; do
      asmfile=`echo $file | sed 's/\.c$/.s/'`
      $cpp $file | $cc1 > $asmfile
      sfiles="$sfiles $asmfile"
    done
    if [ "$stopaftercompile" ]; then exit 0; fi

    # Assemble object files
    for file in $sfiles; do
      objfile=`echo $file | sed 's/\.s$/.o/'`
      $as -o $objfile $file
      ofiles="$ofiles $objfile"
    done
    if [ "$stopafterassemble" ]; then exit 0; fi

    # Link to build the executable
    $ld -o $outfile $ofiles
    exit 0

Scripting Languages Compared
                 awk   Perl   Python   Tcl   sh
    Shell-like    N     N       N       Y    Y
    Reg. Exp.     B     A       C       C    D
    Types         C     B       A       B    D
    Structure     C     B       A       B    C
    Syntax        B     F       A       B    C
    Semantics     A     C       A       B    B
    Speed         B     A       A       B    C
    Libraries     C     A       A       B    C
    Power         B     A       A       B    C
    Verbosity     B     A       C       C    B

What To Use When
awk: Best for simple text-processing (file of fields)
Perl: Best for legacy things, things requiring regexps
Python: Best all-around, especially for large programs
Tcl: Best for command languages, GUIs
sh: Best for portable “invoking” scripts