Tips.Mbox: (Next Tip #2) the Unix 'Tr' Command


(NeXT Tip #2) The Unix 'tr' command

Christopher Lane (lane[at]sumex-aim.stanford.edu)
Mon, 19 Oct 1992 10:04:34 -0700 (PDT)

To: KSL-NeXT[at]sumex-aim.stanford.edu
Message-ID: <MailManager.719514274.3278.lane[at]ssrg-next-1>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

(This is actually a generic BSD Unix tip.)

The 'tr' command in Unix is used to transliterate (substitute and delete) on an individual character basis. For example, you can change the case of a file (or any data stream) using 'tr':

    alias lower "tr 'A-Z' 'a-z'"
    alias raise "tr 'a-z' 'A-Z'"

Now you can do 'lower < file1 > file2' to change a file to lower case. Note that you cannot do translations like 'substitute "sheep" for "goat"'; the 'tr' command only deals in single character mappings.

A more useful example:

    alias crtolf "tr '\015' '\012'"
    alias lftocr "tr '\012' '\015'"

This lets you do 'crtolf < file1 > file2' to change the end of line convention from carriage return to line feed (Unix newline).

The use of 'alias' here is a convenience; you can of course type the 'tr' commands directly to the shell. Also note the '\012' notation: this is octal notation, which is used for characters you can't type directly. To easily get the octal notation for a character, you can do 'man ascii', which will print out an ASCII character table with the octal (and hexadecimal -- and sometimes decimal) equivalents.

The 'tr' command takes three 'switches':

    -c  'complements' the set of characters in the first argument string
        with respect to the set of ASCII character codes;

    -d  'deletes' all input characters that appear in the first string
        argument (and doesn't use a second string argument);

    -s  'squeezes' sequences of repeated output characters from the
        second string into a single character. E.g.

            > echo 'rabbit' | tr -s 'bit' 'dar'
            radar

A real use of 'tr' that NeXT users might run into is that the NeXT, like the Macintosh and other newer systems, uses 'unbroken' text in its editor, where each paragraph is terminated by a Unix 'newline'. This creates lines that are too long for some older Unix programs like 'spell' and 'diction', which were built for 80 character terminal lines and haven't been fixed. To facilitate using these programs, we can use an example from the 'tr' man page:

    alias breakup "tr -cs 'A-Za-z' '\012'"

This will convert all characters that are NOT alphabet characters to newlines, and 'squeeze' multiple newlines in a row to a single one. You can then safely pipe the results to 'spell', as there will now be one word per line. Note that this example does not handle the following case correctly:

    > echo "don't" | breakup
    don
    t
    >

This is easy to fix, however, and left as an exercise for the reader. (Hint, you'll probably need to use 'man ascii'.)

Other examples of using 'tr':

Although most USENET news readers have this built in, here's a way to unscramble 'rot13' (rotated 13) offensive jokes on rec.humor.*:

    alias rot13 "tr 'A-Za-z' 'N-ZA-Mn-za-m'"

Or, get rid of blank lines in a file:

    tr -s '\012' < file1 > file2

Note that 'tr' on System V Unix machines (like our HP 720 'Snakes') is slightly different. The command and switches are the same but the arguments are more complex, handling character and equivalence classes, multicharacter elements, etc. See 'man tr' for exact details on any Unix system.

The 'tr' command is useful in 'csh' scripts and, on the NeXT, in .pipedict and .commanddict files, which provide user defined extensions to the 'Edit' application. (Perhaps the topic of a future NeXT tip.)

- Christopher.
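For readers without 'csh', the aliases in this tip can be tried as Bourne shell functions. The following is only a sketch under that assumption, not part of the original tip: the function names mirror the aliases, while 'breakup_apos' is a hypothetical name for one possible answer to the "don't" exercise. It adds the apostrophe, octal 047 in 'man ascii', to the set of word characters. Behaviour may differ on older or System V 'tr' implementations.

```shell
#!/bin/sh
# Sketch: the tip's csh aliases rewritten as POSIX sh functions.

lower()   { tr 'A-Z' 'a-z'; }               # upper case to lower case
crtolf()  { tr '\015' '\012'; }             # carriage return to line feed
rot13()   { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }   # rotate letters by 13 places
breakup() { tr -cs 'A-Za-z' '\012'; }       # one word per line (splits "don't")

# Hypothetical fix for the exercise: treat the apostrophe (octal 047)
# as part of a word, so it is not complemented into a newline.
breakup_apos() { tr -cs 'A-Za-z\047' '\012'; }

echo 'Hello World' | lower           # prints: hello world
echo 'Uryyb' | rot13                 # prints: Hello
echo "don't" | breakup_apos          # prints: don't
echo 'rabbit' | tr -s 'bit' 'dar'    # prints: radar
```

Writing the apostrophe as '\047' avoids having to nest quote characters inside the single-quoted argument.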