CSE 303 Lecture 6 10/9/2006

CSE 303: Concepts and Tools for Software Development
Richard C. Davis
UW CSE – 10/9/2006
Lecture 6 – String Processing

Administrivia

• Assignment 2 is due in one week
• Office hours
  – Mine are after class today
  – I’ll have extra office hours later this week
  – TA office hours will be posted by next lecture
• Follow the turn-in instructions!
  – username-hw2
  – Plain text in readme.txt

Tip of the Day

• Replacing in Emacs
  – M-x query-replace
    • Enter the string to search for
    • Enter the replacement string
    • Enter “y” or “n” for each match found
  – M-x replace-string
    • Same, but does not prompt for each match

Last Time

• Powerful Tools

Today

• String Processing
  – sed: single lines
  – Complex
    • awk: more complex processing
    • perl, python, ruby: general programming
• Shell Wrap-Up

Automating Editing

• We’ve learned to automate simple tasks
  – Move around files
  – Start/Stop processes
  – Change user environment/permissions
• But what about…
  – Changing strings
  – Repetitive edits to multiple files
• sed can help (used in HW2)

Sed: A Stream EDitor

• sed
  – Non-interactive editor
  – Performs editing actions
  – Actions defined in a “script”
  – Stream-oriented
    • Input from file or stdin
    • Script processes each line
    • Output goes to stdout

How Sed Works

• Each line is copied to the “pattern space”
• All editing commands are applied
  – To the data in the pattern space
  – In sequence
• Original input does not change
• Possible to restrict edits to a subset of lines
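
The flow above can be sketched in a few lines of shell. This is only an illustration, assuming a POSIX shell with sed and mktemp available; the sample file and its contents are made up for the demonstration.

```shell
# Hypothetical sample file, created only for this demonstration.
tmp=$(mktemp)
printf 'apple pie\nbanana split\n' > "$tmp"

# sed copies each line into the pattern space, applies the script,
# and writes the result to stdout.
edited=$(sed 's/a/A/' "$tmp")
echo "$edited"

# The same script works when the input arrives on stdin.
from_stdin=$(printf 'apple pie\n' | sed 's/a/A/')
echo "$from_stdin"

# The input file itself is untouched.
original=$(cat "$tmp")
echo "$original"
rm -f "$tmp"
```

Because the output goes to stdout, saving the edited text means redirecting it to a different file.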

Command-Line Syntax

• Method 1: One-line syntax
  – sed [options] 'command' file(s)
  – sed -e 'cmd1' -e 'cmd2' file(s)
• Method 2: Script file holds commands
  – sed [options] -f script file(s)
  – sed -n : suppress normal output
  – sed -r : use “extended” regular expressions (GNU sed; -E is the more portable spelling)

Search and Replace with Sed

• Most common use
  – sed 's/pattern/replacement/g' file
  – Means “replace every (longest) substring that matches pattern with replacement”
• Common variations
  – Omit g at end: replace only the first match on each line
  – Put num at end: replace only the num-th match on each line
  – Put p at end: print lines where a replacement was made
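
To see how the flags differ, here is a small sketch that runs each variation on the same made-up input (the word "banana" and the second line "kiwi" are assumptions chosen only because they make the differences easy to spot).

```shell
# Hypothetical input; any text works.
line='banana'

all=$(echo "$line" | sed 's/a/b/g')      # every match:       bbnbnb
first=$(echo "$line" | sed 's/a/b/')     # first match only:  bbnana
second=$(echo "$line" | sed 's/a/b/2')   # second match only: banbna

# -n suppresses the normal output; the p flag then prints only the
# lines on which the substitution actually happened.
changed=$(printf 'banana\nkiwi\n' | sed -n 's/a/b/2p')

# Two commands given with -e are applied in order to each line.
chained=$(echo "$line" | sed -e 's/a/b/' -e 's/n/m/')

printf '%s\n' "$all" "$first" "$second" "$changed" "$chained"
```

Note that "kiwi" produces no output in the -n example because it contains no match to replace.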

Using the Pattern Space

• Can replace with all or part of a match
• Special characters in replacement
  – & : the entire matched text
  – \1 : string that matched the 1st set of parentheses
  – \2 : string that matched the 2nd set of parentheses
  – …
• Newline Note
  – The \n is not in the text matched against and is (re)-added when printed

Examples

• Not so useful
  – sed 's/a/b/g' ex1.txt
  – sed 's/a/b/' ex1.txt
  – sed 's/a/b/2' ex1.txt
  – sed -n 's/a/b/2p' ex1.txt
• More useful
  – sed 's/.*Linux \(.*\) .*/\1:/' ex2.txt
  – sed 's/.*Linux.*/&:/' ex2.txt
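
The course files ex1.txt and ex2.txt are not reproduced here, so the sketch below uses made-up input lines to show what & and \1 do. The sample strings are assumptions, not the contents of the course files.

```shell
# Hypothetical input line standing in for a line of ex2.txt.
line='I run Linux 2.6 daily'

# \1 keeps only the text captured by the first \( ... \) group.
version=$(echo "$line" | sed 's/.*Linux \(.*\) .*/\1:/')

# & stands for everything the pattern matched (here, the whole line).
tagged=$(echo "$line" | sed 's/.*Linux.*/&:/')

# \1 and \2 can be reordered in the replacement.
swapped=$(echo 'Davis Richard' | sed 's/\(.*\) \(.*\)/\2 \1/')

# Lines that do not match pass through unchanged.
untouched=$(echo 'no match here' | sed 's/.*Linux.*/&:/')

printf '%s\n' "$version" "$tagged" "$swapped" "$untouched"
```

The basic regular expression syntax used here needs backslashes on the grouping parentheses; with extended regular expressions the groups are written as plain ( ).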

Sed Command Details

• General syntax of sed commands
  – [address[,address]][!]command[args]
• Address specifies which lines the command applies to
  – Address types
    • Line with a particular number, e.g.: 3
    • Lines matching a pattern, e.g.: /SAVE/
  – Using two addresses specifies a range of lines
  – Using ! means “use lines not specified in address”
• Other Commands
  – d : delete lines

More Sed Examples

• Delete lines 3-5
  – sed '3,5 d' ex3.c
• Delete lines that don’t contain SAVE
  – sed '/SAVE/! d' ex3.c
• Delete lines that start with //
  – sed '/^\/\// d' ex3.c
• Delete lines from /* through */ (inclusive)
  – sed '/\/\*/,/\*\// d' ex3.c
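
The four deletions can be tried on a small stand-in for ex3.c. The six-line sample below is invented for illustration; it is not the course file.

```shell
# Hypothetical six-line stand-in for ex3.c.
sample() {
  printf '%s\n' \
    '// scratch note' \
    'int total = 0; // SAVE' \
    '/* old code' \
    'total = 1;' \
    '*/' \
    'return total;'
}

by_number=$(sample | sed '3,5d')             # line-number range
keep_save=$(sample | sed '/SAVE/!d')         # ! inverts the address
no_leading=$(sample | sed '/^\/\//d')        # ^ anchors // to line start
no_block=$(sample | sed '/\/\*/,/\*\//d')    # pattern-to-pattern range

printf '%s\n---\n' "$by_number" "$keep_save" "$no_leading" "$no_block"
```

Without the ^ anchor, the third command would also delete line 2, since that line contains // further along. The pattern range is line-based: it removes every line from the one containing /* through the next later line containing */, so a comment that opens and closes on the same line would not end the range there.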

Advanced Sed Features

• Commands so far: substitute, print, delete
• Other commands (not used in class)
  – Append, replace with block, insert, translate
  – Branch to label
  – Multi-line patterns
  – The hold space for fancy editing
    • E.g., copy and paste of lines
• Need these? Use a more powerful language

Awk

• Processes text files
  – File contains records
    • Separated by newline (default)
  – Records contain fields
    • Separated by whitespace (default)
• Why use awk?
  – Generate reports from logs
  – Process results of an experiment

(Named after its authors: Aho, Weinberger, and Kernighan)
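
A minimal sketch of records and fields, assuming a made-up three-line log (the names, numbers, and status words are hypothetical). Each line is one record, and each whitespace-separated word is one field.

```shell
# Hypothetical log: name, score, status. The extra spaces after "bob"
# show that a run of whitespace counts as a single separator.
log() { printf 'alice 3 ok\nbob   5 fail\ncarol 4 ok\n'; }

# $1 is the first field of each record.
names=$(log | awk '{ print $1 }')

# Fields can be printed in any order; the comma inserts a space.
swapped=$(log | awk '{ print $3, $1 }')

printf '%s\n' "$names" "$swapped"
```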

Running Awk

• One-line syntax
  – awk [options] 'script' file(s)
• Script file
  – awk [options] -f scriptFile file(s)

Awk Functionality

• Script structure
  – pattern { procedure }
• Records processed one at a time
  – Pattern restricts to matching records
• Fields accessed with $1, … $n
• BEGIN and END patterns
  – For procedures before/after processing file
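
The pattern { procedure } structure can be sketched as a tiny report generator. The log contents and the report wording are assumptions made up for this example.

```shell
# Hypothetical log: name, score, status.
log() { printf 'alice 3 ok\nbob 5 fail\ncarol 4 ok\n'; }

report=$(log | awk '
BEGIN      { print "passing runs:" }        # runs before any record is read
$3 == "ok" { sum += $2; print "  " $1 }     # runs only for matching records
END        { print "total:", sum }          # runs after the last record
')

echo "$report"
```

The middle rule skips the "fail" record entirely, so only the scores 3 and 4 are added to the total.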

Advanced Awk Features

• awk is a very powerful language
  – Looping constructs
  – Arrays
  – Functions
  – Fancy printing
  – Powerful math functions
• Need these? Use Perl, Python, or Ruby

More Powerful Script Languages

• Perl, Python, and Ruby
• Interpreted
• scripts like bash
  – Prefix script with #!
  – executable with
• Pre-compiled (fast!)

Perl

• Practical Extraction and Report Language
  – Or “Pathologically Eclectic Rubbish Lister”
• Language properties
  – Excellent pattern matching
  – “Kitchen Sink” syntax
  – No objects in original version

Python

• Fully Object Oriented
• Simpler Syntax
• Allows different styles
  – Procedural
  – Functional

Ruby

• Fully Object Oriented
• Syntax more similar to Smalltalk
• Many different ways to do the same things
  – Harder to debug

Summary

• String Processing
  – sed: quick mods to single lines
  – awk: more complex record processing
  – perl, python, ruby: learn one
• That’s all for the shell!

Note: We don’t require you to know how to use any scripting tools other than sed in this class, but we do require you to know when you should consider learning to use one of these tools.

Next Time

• Introduction to C!
