
(NeXT #2) The 'tr' command

Christopher Lane (lane[]sumex-aim.stanford.edu) Mon, 19 Oct 1992 10:04:34 -0700 (PDT)

To: KSL-NeXT[at]sumex-aim.stanford.edu
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

(This is actually a generic BSD Unix tip.) The 'tr' command in Unix is used to transliterate (substitute and delete) on an individual character basis. For example, you can change the case of a file (or any data stream) using 'tr':

    alias lower "tr 'A-Z' 'a-z'"
    alias raise "tr 'a-z' 'A-Z'"

Now you can do 'lower < file1 > file2' to change a file to lower case. Note that you cannot do translations like 'substitute "sheep" for "goat"'; the 'tr' command only deals in single character mappings. A useful example:

    alias crtolf "tr '\015' '\012'"
    alias lftocr "tr '\012' '\015'"

This lets you do 'crtolf < file1 > file2' to change the end of line convention from carriage return to line feed (the Unix convention). The use of 'alias' here is a convenience; you can of course type the 'tr' commands directly to the shell.
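For readers using a Bourne-style shell (sh, bash) rather than 'csh', the same mappings can be written as shell functions. This is only a sketch: the function names simply mirror the aliases above, and the sample input is made up for illustration.

```shell
# Bourne-shell equivalents of the csh aliases (a sketch, not part of the original tip).
lower()  { tr 'A-Z' 'a-z'; }
raise()  { tr 'a-z' 'A-Z'; }
crtolf() { tr '\015' '\012'; }
lftocr() { tr '\012' '\015'; }

# Hypothetical sample input.
printf 'Hello, World\n' | lower    # prints "hello, world"
printf 'one\rtwo\r' | crtolf       # prints "one" and "two" on separate lines
```

As with the aliases, each function reads standard input and writes standard output, so 'lower < file1 > file2' works the same way.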

Also note the '\012' notation: this octal notation is used for characters you can't type directly. To easily get the octal notation for a character, you can do 'man ascii', which will print out an ASCII character table with the octal (and hexadecimal -- and sometimes decimal) equivalents.
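If 'man ascii' is not at hand, the octal code can also be looked up from the shell. The following is a small sketch that assumes a POSIX 'printf' and 'od', neither of which is mentioned in the original tip; the helper name 'octal_of' is made up for illustration.

```shell
# Hypothetical helper: print the three-digit octal code of a single character.
# A leading quote in the argument ("'A") asks printf for the character's numeric value.
octal_of() { printf '%03o\n' "'$1"; }

octal_of A                      # prints 101
octal_of "'"                    # prints 047

# Dump every byte of a stream in octal, handy for spotting
# carriage returns (015) and line feeds (012).
printf 'a\r\n' | od -An -to1    # prints 141 015 012 (spacing varies by system)
```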

The 'tr' command takes three 'switches':

    -c  'complements' the set of characters in the first string argument with respect to the set of ASCII character codes;
    -d  'deletes' all input characters in the first string argument (and doesn't use a second string argument);
    -s  'squeezes' sequences of repeated output characters from the second string into a single character.

For example:

    > echo 'rabbit' | tr -s 'bit' 'dar'
    radar
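The switches can be tried out one at a time. The sketch below assumes a modern POSIX-style 'tr' (which accepts '-s' with a single string) and a Bourne-style shell; the function names and sample strings are invented for illustration.

```shell
# -d: delete every character listed in the first string.
drop_l_and_o()   { tr -d 'lo'; }
# -s: squeeze runs of a repeated character into one.
squeeze_spaces() { tr -s ' '; }
# -c with -d: delete everything that is NOT a digit or a newline.
digits_only()    { tr -cd '0-9\012'; }

printf 'hello world\n' | drop_l_and_o               # prints "he wrd"
printf 'too    many   spaces\n' | squeeze_spaces    # prints "too many spaces"
printf 'tel: 555-1234\n' | digits_only              # prints "5551234"
printf 'rabbit\n' | tr -s 'bit' 'dar'               # prints "radar"
```

In the last line the translation happens first ('rabbit' becomes 'raddar') and the squeeze is then applied to the output, which is why the doubled 'd' collapses.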

A real use of 'tr' that NeXT users might run into is that the NeXT, like the Macintosh and other newer systems, uses 'unbroken' text in its editor, where each paragraph is terminated by a Unix 'newline'. This creates lines that are too long for some older Unix programs like 'spell' and 'diction', which were built for 80 character terminal lines and haven't been fixed. To facilitate using these programs, we can use an example from the 'tr' man page:

    alias breakup "tr -cs 'A-Za-z' '\012'"

This will convert all characters that are NOT alphabetic to newlines, and 'squeeze' multiple newlines in a row into a single one. You can then safely pipe the results to 'spell', as there will now be one word per line. Note that this example does not handle the following case correctly:

    > echo "don't" | breakup
    don
    t
    >

This is easy to fix, however, and is left as an exercise for the reader. (Hint: you'll probably need to use 'man ascii'.) Other examples of using 'tr':

Although USENET news readers have this built in, here's a way to unscramble 'rot13' (rotated 13) offensive jokes on rec.humor.*:

    alias rot13 "tr 'A-Za-z' 'N-ZA-Mn-za-m'"

Or, get rid of blank lines in a file:

    tr -s '\012' < file1 > file2
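Both of these examples can be checked quickly at a Bourne-style shell prompt. This sketch assumes a modern POSIX-style 'tr'; the function names and sample text are made up for illustration.

```shell
# Bourne-shell version of the rot13 alias (the name mirrors the alias).
rot13() { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }

printf 'Hello\n' | rot13            # prints "Uryyb"
printf 'Hello\n' | rot13 | rot13    # prints "Hello" (rot13 undoes itself)

# Hypothetical helper: squeeze runs of newlines so empty lines disappear.
squeeze_newlines() { tr -s '\012'; }

printf 'first\n\n\nsecond\n' | squeeze_newlines    # prints "first" then "second"
```

Two caveats on the squeeze: it only removes lines that are truly empty (a line holding spaces or tabs survives), and a run of newlines at the very start of the input is reduced to one newline, which still shows up as a single empty first line.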

Note that 'tr' on System V Unix machines (like our HP 720 'Snakes') is slightly different. The command and switches are the same but the arguments are more complex, handling character and equivalence classes, multicharacter elements, etc. See 'man tr' for exact details on any Unix system.
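As an illustration of the character-class arguments, here is a short sketch using the POSIX class names that modern 'tr' implementations accept. It is an assumption that a given System V machine supports exactly this syntax, so check 'man tr' first; the function names and sample text are invented.

```shell
# Case conversion with character classes instead of explicit ranges.
lower_class() { tr '[:upper:]' '[:lower:]'; }

# Keep only letters, digits and newlines; delete everything else.
alnum_only()  { tr -cd '[:alnum:]\012'; }

printf 'Hello, World\n' | lower_class    # prints "hello, world"
printf 'a-b_c 1!\n' | alnum_only         # prints "abc1"
```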

The 'tr' command is useful in 'csh' scripts and, on the NeXT, in .pipedict and .commanddict files which provide user defined extensions to the 'Edit' application. (Perhaps the topic of a future NeXT tip.)

- Christopher