The A-Z of Programming Languages: AWK

CS@CU: Newsletter of the Department of Computer Science at Columbia University, Vol. 5 No. 1, Winter 2008

Professor Alfred V. Aho talks about the history and continuing popularity of his pattern matching language AWK.

The following article is reprinted with the permission of Computerworld Australia (www.computerworld.com.au). The interview was conducted by Naomi Hamilton.

Lawrence Gussman Professor of Computer Science Alfred V. Aho is a man at the forefront of computer science research. Formerly the Vice President of the Computing Sciences Research Center at Bell Labs, Professor Aho is well known for co-authoring the ‘Dragon’ book series and for being one of the three developers of the AWK pattern matching language in the mid-1970’s, along with Brian Kernighan and Peter Weinberger. In the following interview, Professor Aho explains more about the development of AWK.

How did the idea/concept of the AWK language develop and come into practice?

As with a number of languages, it was born from the necessity to meet a need. As a researcher at Bell Labs in the early 1970s, I found myself keeping track of budgets, and keeping track of editorial correspondence. I was also teaching at a nearby university at the time, so I had to keep track of student grades as well.

I wanted to have a simple little language in which I could write one- or two-line programs to do these tasks. Brian Kernighan, a researcher next door to me at the Labs, also wanted to create a similar language. We had daily conversations which culminated in a desire to create a pattern-matching language suitable for simple data-processing tasks.

We were heavily influenced by GREP, a popular string-matching utility on UNIX, which had been created in our research center. GREP would search a file of text looking for lines matching a pattern consisting of a limited form of regular expressions, and then print all lines in the file that matched that regular expression.

We thought that we’d like to generalize the class of patterns to deal with numbers as well as strings. We also thought that we’d like to have more computational capability than just printing the line that matched the pattern.

So out of this grew AWK, a language based on the principle of pattern-action processing. It was built to do simple data processing: the ordinary data processing that we routinely did on a day-to-day basis. We just wanted to have a very simple scripting language that would allow us, and people who weren’t very computer savvy, to be able to write throw-away programs for routine data processing.

Were there any programs or languages that already had these functions at the time you developed AWK?

Our original model was GREP. But GREP had a very limited form of pattern action processing, so we generalized the capabilities of GREP considerably. I was also interested at that time in string pattern matching algorithms and context-free grammar parsing algorithms for compiler applications. This means that you can see a certain similarity between what AWK does and what the compiler construction tools LEX and YACC do.

LEX and YACC were tools that were built around string pattern matching algorithms that I was working on: LEX was designed to do lexical analysis and YACC syntax analysis. These tools were compiler construction utilities which were widely used in Bell Labs, and later elsewhere, to create all sorts of little languages. Brian Kernighan was using them to make languages for typesetting mathematics and picture processing.

LEX is a tool that looks for lexemes in input text. Lexemes are sequences of characters that make up logical units. For example, a keyword like ‘then’ in a programming language is a lexeme. The character ‘t’ by itself isn’t interesting, ‘h’ by itself isn’t interesting, but the combination ‘then’ is interesting. One of the first tasks a compiler has to do is read the source program and group its characters into lexemes.

AWK was influenced by this kind of textual processing, but AWK was aimed at data-processing tasks and it assumed very little background on the part of the user in terms of programming sophistication.

Can you provide our readers with a brief summary in your own words of AWK as a language?

AWK is a language for processing files of text. A file is treated as a sequence of records, and by default each line is a record. Each line is broken up into a sequence of fields, so we can [...] scanned for each pattern in the program, and for each pattern that matches, the associated action is executed.

A simple example should make this clear. Suppose we have a file in which each line is a name followed by a phone number. Let’s say the file contains the line ‘Naomi 1234’. In the AWK program the first field is referred to as $1, the second field as $2, and so on. Thus, we can create an AWK program to retrieve Naomi’s phone number by simply writing $1=="Naomi" {print $2} which means if the first field matches Naomi, then print the second field. Now you’re an AWK programmer! If you typed that program into AWK and presented it with a file that had names and phone numbers, then it would print 1234 as Naomi’s phone number.

A typical AWK program would have several pattern-action statements. The patterns can be Boolean combinations of strings and numbers; the actions can be statements in a C-like programming language. AWK became popular since it was one of the standard programs that came with every UNIX system.

What are you most proud of in the development of AWK?

AWK was developed by three people: me, Brian Kernighan and Peter Weinberger. Peter Weinberger was interested in what Brian and I were doing right from the start. We had created a grammatical specification for AWK but hadn’t yet created the full run-time environment. Weinberger came along and said ‘hey, this looks like a language I could use myself’, and within a week he created a working run time for AWK. This initial form of AWK was very useful [...] was that I got to know how Kernighan and Weinberger thought about language design: it was a really enlightening process! With the flexible compiler construction tools we had at our disposal, we very quickly evolved the language to adopt new useful syntactic and semantic constructs. We spent a whole year intensely debating what constructs should and shouldn’t be in the language.

Language design is a very personal activity and each person brings to a language the classes of problems that they’d like to solve, and the manner in which they’d like them to be solved. I had a lot of fun creating AWK, and working with Kernighan and Weinberger was one of the most stimulating experiences of my career. I also learned I would not want to get into a programming contest with either of them however! Their programming abilities are formidable.

Interestingly, we did not intend the language to be used except by the three of us. But very quickly we discovered lots of other people had the need for the routine kind of data processing that AWK was good for. People didn’t want to write hundred-line C programs to do data processing that could be done with a few lines of AWK, so lots of people started using AWK.

For many years AWK was one of the most popular commands on UNIX, and today, even though a number of other similar languages have come on the scene, AWK still ranks among the top 25 or 30 most popular programming languages in the world. And it all began as a little exercise to create a utility that the three of us would find useful for our own use.

[...] language designers later used it as a model for developing more powerful languages. About 10 years after AWK was created, Larry Wall created a language called PERL, which was patterned after AWK and some other UNIX commands. PERL is now one of the most popular programming languages in the world. So not only was AWK popular when it was introduced but it also stimulated the creation of other popular languages.

AWK has inspired many other languages as you’ve already mentioned: why do you think this is?

What made AWK popular initially was its simplicity and the kinds of tasks it was built to do. It has a very simple programming model. The idea of pattern-action programming is very natural for people. We also made the language compatible with pipes in UNIX. The actions in AWK are really simple forms of C programs. You can write a simple action like {print $2} or you can write a much more complex C-like program as an action associated with a pattern. Some Wall Street financial houses used AWK when it first came out to balance their books because it was so easy to write data-processing programs in AWK.

AWK turned a number of people into programmers because the learning curve for the language was very shallow. Even today a large number of people continue to use AWK, saying languages such as PERL have become too complicated. Some say PERL has become such a complex language that it’s become almost impossible to understand the programs once they’ve been written.
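
The phone-number example from the interview can be tried out with a few lines of shell. This is only a minimal sketch: the file name phones.txt and the records for Alice and Bob are made up for illustration and do not come from the interview, and it assumes a POSIX-style awk is installed. The first program is the one quoted in the interview; the second is an assumed illustration of a pattern that combines a string test with a numeric test, paired with a slightly longer C-like action.

```shell
# Hypothetical sample data: one "name phone" record per line.
# Only the 'Naomi 1234' line appears in the interview; the rest is invented.
printf 'Naomi 1234\nAlice 5678\nBob 9012\n' > phones.txt

# Pattern: the first field equals "Naomi". Action: print the second field.
awk '$1 == "Naomi" { print $2 }' phones.txt

# Pattern: a Boolean combination of a string test and a numeric test.
# Action: count the matching records and print name and number.
# The END block runs once after all input has been read.
awk '$1 != "Naomi" && $2 > 6000 { count++; print $1, $2 }
     END { print count " matching record(s)" }' phones.txt
```

With the sample file above, the first command prints 1234, as described in the interview. Because AWK reads standard input when no file is named, the same programs can also sit in the middle of a UNIX pipeline.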
