Wildcards and Regular Expressions
Hour 9
• Objectives

Copyright © 1998-2002 Delroy A. Brinkerhoff. All Rights Reserved.
Hour 9 Unix Slide 1 of 12

Regular Expressions
A formal language
• Formal (computer) languages are categorized by their strength (i.e., by the complexity of the grammar they accept)
• Regular expressions describe the simplest of these languages
• Regular expressions are formed with metacharacters

Wildcards
File name shortcuts
• Wildcard characters

Wildcard Metacharacters
Selecting related files
• * (asterisk, or “splat” in Unix) matches zero or more occurrences of any character
    – ls *  lists all non-dot files (much like plain ls)
    – cp c*s bak  copies all files whose name begins with ‘c’ and ends with ‘s’ to directory bak
• ? (question mark) matches any one character
    – cp *.? bak  copies all files whose name ends with a “dot” (period) and a single character
• [xyz] matches one character from x, y, or z
    – cp [ABC]* /bak  copies all files whose name begins with ‘A’, ‘B’, or ‘C’ to the /bak directory
• [a-z] matches one character from the range a to z
    – ls [A-Z]*.c  lists all files whose name begins with a capital letter and ends with a .c extension

Wildcard Examples
    % ls -a
    .        .login    DOCS       Mail         prog.c   wc.c
    ..       .profile  DOCS.bak   Mandel.java  sprio.z  wc.java
    .TRASH   Count.p   Hypo.java  pgp.Z        wc.asm   wc.p
    % ls *
    Count.p   Hypo.java    pgp.Z    wc.asm   wc.p
    DOCS      Mail         prog.c   wc.c
    DOCS.bak  Mandel.java  sprio.z  wc.java
    % ls *.?
    Count.p  pgp.Z  prog.c  sprio.z  wc.c  wc.p
    % ls [A-Z]*
    Count.p  DOCS  DOCS.bak  Hypo.java  Mail  Mandel.java
    % ls *.c
    prog.c  wc.c
    % ls *.[zZ]
    pgp.Z  sprio.z
    % ls wc.*
    wc.asm  wc.c  wc.java  wc.p
    % ls w*a
    wc.java

Other Wildcards
Not supported by all (especially older) shells
• Supported by Bourne and Korn shells
• [!xyz] matches any one character except x, y, or z
    % ksh
    $ ls [!Dw]*
    Count.p    Mail         pgp.Z   sprio.z
    Hypo.java  Mandel.java  prog.c

The grep Command
Pattern matching with a regular expression processor
• grep [ -ivlnw ] re-pattern [file ...]
    – -i  case insensitive
    – -v  invert test (print lines that don't match)
    – -l  list files but don't print matched lines
    – -n  print line number and matched line
    – -w  match whole words only
• Input is line-oriented text
• Searches lines for a specified regular expression pattern

grep Examples
Simple regular expressions
• grep BUFSIZ *.c
• C  matches character C exactly
• \C  escapes C (treats C as a “normal” character)
• .  matches any single character except new-line
• ^R  matches reg exp R when at the beginning of the line
• R$  matches reg exp R when at the end of the line
• R*  repeats regular expression R zero or more times
• [xyz]  matches one of x, y, or z
• [^xyz]  matches any one character except x, y, or z
• [a-z]  matches any one character from a to z
• [^a-z]  matches any one character not in the range a to z

Advanced grep Examples
Filtering out non-matching lines
• Regular expression pattern must be hidden from the shell (quoted)
    – grep '^Count' prog.c
    – “Count” must be at the beginning of the line
• grep 'Count$' prog.c
    – “Count” must be at the end of a line
• grep '^Count\$' prog.c
    – “Count$” must be at the beginning of the line; $ is matched exactly
• make | grep '[Ee]rror'
• fgrep [ -ivlnw ] [ -f pattern-file ] string [file ...]
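
The anchored grep patterns and the fixed-string search of fgrep can be tried side by side. The following is only a sketch: the contents of prog.c are invented for illustration, and it uses grep -F, the spelling POSIX specifies for fgrep, on the assumption that a POSIX grep is installed.

```shell
#!/bin/sh
# Hypothetical sample file; its contents are invented for this sketch.
dir=$(mktemp -d)
cat > "$dir/prog.c" <<'EOF'
Count = 0;
total = Count
Count$ is a literal dollar sign
int Count2;
EOF

# Single quotes keep the shell from interpreting ^, $ and \.
at_start=$(grep -c '^Count' "$dir/prog.c")   # number of lines that begin with Count
at_end=$(grep 'Count$' "$dir/prog.c")        # lines that end with Count
literal=$(grep '^Count\$' "$dir/prog.c")     # "Count$" at the start; \$ is a plain dollar sign
fixed=$(grep -F 'Count$' "$dir/prog.c")      # -F treats the whole pattern as a plain string

printf 'lines starting with Count: %s\n' "$at_start"
printf 'line ending with Count:    %s\n' "$at_end"
printf 'line starting with Count$: %s\n' "$literal"
printf 'fixed-string match:        %s\n' "$fixed"

rm -rf "$dir"
```

The same pattern text, Count$, selects different lines under grep and under grep -F: the first reads $ as an end-of-line anchor, the second reads it as an ordinary character.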
– R1|R2  matches regular expression R1 or R2
– (R)  groups regular expression R for other RE operators

egrep Examples
Great for crossword puzzles
• egrep -i 'bomb|explosive|terrorist' mail/*
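
Alternation and grouping can be combined in one pattern. The sketch below is a minimal illustration, not part of the original slides: the word list is invented, and it uses grep -E, the spelling POSIX specifies for egrep, on the assumption that a POSIX grep is installed.

```shell
#!/bin/sh
# Hypothetical word list, invented for this sketch.
words=$(printf '%s\n' gray grey grizzly Bomb explosive)

# (a|e) groups the alternation so it applies to one letter only;
# ^ and $ require the whole line to be "gray" or "grey".
grouped=$(printf '%s\n' "$words" | grep -E '^gr(a|e)y$')

# Without parentheses, | separates two complete alternatives;
# -i makes the match case insensitive.
either=$(printf '%s\n' "$words" | grep -E -i 'bomb|explosive')

printf 'grouped:\n%s\n' "$grouped"
printf 'either:\n%s\n' "$either"
```

The pattern must be quoted here as well, because the shell would otherwise treat | as a pipe and the parentheses as a subshell.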