GAWK: Effective AWK Programming: A User's Guide for GNU Awk, Edition 3, July 2009

GAWK: Effective AWK Programming
A User's Guide for GNU Awk
Edition 3
July, 2009

Arnold D. Robbins

"To boldly go where no man has gone before" is a Registered Trademark of Paramount Pictures Corporation.

Published by:
Free Software Foundation
51 Franklin Street, Fifth Floor
Boston, MA 02110-1301 USA
Phone: +1-617-542-5942
Fax: +1-617-542-2652
Email: [email protected]
URL: http://www.gnu.org/

ISBN 1-882114-28-0

Copyright © 1989, 1991, 1992, 1993, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007, 2009 Free Software Foundation, Inc.

This is Edition 3 of GAWK: Effective AWK Programming: A User's Guide for GNU Awk, for the 3.1.7 (or later) version of the GNU implementation of AWK.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being "GNU General Public License", the Front-Cover texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in the section entitled "GNU Free Documentation License".

a. "A GNU Manual"
b. "You have the freedom to copy and modify this GNU manual. Buying copies from the FSF supports it in developing GNU and promoting software freedom."

Cover art by Etienne Suvasa.

To Miriam, for making me complete.
To Chana, for the joy you bring us.
To Rivka, for the exponential increase.
To Nachum, for the added dimension.
To Malka, for the new beginning.
Short Contents

Foreword
Preface
1 Getting Started with awk
2 Regular Expressions
3 Reading Input Files
4 Printing Output
5 Expressions
6 Patterns, Actions, and Variables
7 Arrays in awk
8 Functions
9 Internationalization with gawk
10 Advanced Features of gawk
11 Running awk and gawk
12 A Library of awk Functions
13 Practical awk Programs
A The Evolution of the awk Language
B Installing gawk
C Implementation Notes
D Basic Programming Concepts
Glossary
GNU General Public License
GNU Free Documentation License
Index

Table of Contents

Foreword

Preface
    History of awk and gawk
    A Rose by Any Other Name
    Using This Book
    Typographical Conventions
    The GNU Project and This Book
    How to Contribute
    Acknowledgments

1 Getting Started with awk
    1.1 How to Run awk Programs
        1.1.1 One-Shot Throwaway awk Programs
        1.1.2 Running awk Without Input Files
        1.1.3 Running Long Programs
        1.1.4 Executable awk Programs
        1.1.5 Comments in awk Programs
        1.1.6 Shell-Quoting Issues
            1.1.6.1 Quoting in MS-DOS Batch Files
    1.2 Data Files for the Examples
    1.3 Some Simple Examples
    1.4 An Example with Two Rules
    1.5 A More Complex Example
    1.6 awk Statements Versus Lines
    1.7 Other Features of awk
    1.8 When to Use awk

2 Regular Expressions
    2.1 How to Use Regular Expressions
    2.2 Escape Sequences
    2.3 Regular Expression Operators
    2.4 Using Character Lists
    2.5 gawk-Specific Regexp Operators
    2.6 Case Sensitivity in Matching
    2.7 How Much Text Matches?
    2.8 Using Dynamic Regexps
    2.9 Where You Are Makes A Difference

3 Reading Input Files
    3.1 How Input Is Split into Records
    3.2 Examining Fields
    3.3 Nonconstant Field Numbers
    3.4 Changing the Contents of a Field
    3.5 Specifying How Fields Are Separated
        3.5.1 Using Regular Expressions to Separate Fields
        3.5.2 Making Each Character a Separate Field
        3.5.3 Setting FS from the Command Line
        3.5.4 Field-Splitting Summary
    3.6 Reading Fixed-Width Data
    3.7 Multiple-Line Records
    3.8 Explicit Input with getline
        3.8.1 Using getline with No Arguments
        3.8.2 Using getline into a Variable
        3.8.3 Using getline from a File
        3.8.4 Using getline into a Variable from a File
        3.8.5 Using getline from a Pipe
        3.8.6 Using getline into a Variable from a Pipe
        3.8.7 Using getline from a Coprocess
        3.8.8 Using getline into a Variable from a Coprocess
        3.8.9 Points to Remember About getline
        3.8.10 Summary of getline Variants

4 Printing Output
    4.1 The print Statement
    4.2 Examples of print Statements
    4.3 Output Separators
    4.4 Controlling Numeric Output with print
    4.5 Using printf Statements for Fancier Printing
        4.5.1 Introduction to the printf Statement
        4.5.2 Format-Control Letters
        4.5.3 Modifiers for printf Formats
        4.5.4 Examples Using printf
    4.6 Redirecting Output of print and printf
    4.7 Special File Names in gawk
        4.7.1 Special Files for Standard Descriptors
        4.7.2 Special Files for Process-Related Information
        4.7.3 Special Files for Network Communications
        4.7.4 Special File Name Caveats
    4.8 Closing Input and Output Redirections

5 Expressions
    5.1 Constant Expressions
        5.1.1 Numeric and String Constants
        5.1.2 Octal and Hexadecimal Numbers
        5.1.3 Regular Expression Constants
    5.2 Using Regular Expression Constants
    5.3 Variables
        5.3.1 Using Variables in a Program
        5.3.2 Assigning Variables on the Command Line
    5.4 Conversion of Strings and Numbers
    5.5 Arithmetic Operators
    5.6 String Concatenation
    5.7 Assignment Expressions
    5.8 Increment and Decrement Operators
    5.9 True and False in awk
    5.10 Variable Typing and Comparison Expressions
        5.10.1 String Type Versus Numeric Type
        5.10.2 Comparison Operators
    5.11 Boolean Expressions
    5.12 Conditional Expressions
    5.13 Function Calls
    5.14 Operator Precedence (How Operators Nest)

6 Patterns, Actions, and Variables
    6.1

