Chapter 11 The Perl Scripting Language
Dr. Marjan Trutschl
Louisiana State University, Shreveport, LA 71115

Chapter 11 The Perl Scripting Language
¤ Intro to Perl
¤ Help
¤ Terminology
¤ Running a Perl Program
¤ Syntax
¤ Variables
¤ Control Structures
¤ Working with Files
¤ Sort
¤ Subroutines
¤ Regular Expressions
¤ CPAN Modules
¤ Examples

Intro to Perl
¤ The Perl motto is "There's more than one way to do it." Divining how many more is left as an exercise to the reader.
¤ Originally created by Larry Wall in 1987, while he was working at Unisys, as a tool to process reports; the name is usually expanded as the Practical Extraction and Report Language
¤ Perl is easy, flexible and processes text information quickly
¤ Since web pages are text at some level, Perl has been widely adopted for CGI (Common Gateway Interface) scripting
¤ It’s also great at regular expressions and works very well with Linux: a Perl script can run shell commands, and a bash script can call Perl
Yep. This guy.

Help
¤ Help
  ¤ There are numerous resources for Perl, so please don’t stop with just the text
  ¤ Try reading the documentation. It’s actually very well written and provides sound advice
  ¤ Also, if you’re in a hurry, try this tutorial:
  ¤ http://www.perl.com/pub/2008/04/23/a-beginners-introduction-to-perl-510.html
Perl is already installed on the Sun servers or any OSX system, but some Linux distributions (like Raspbian) don’t ship with it. Simply install it with ‘apt-get’ or your favorite package manager.

Help
¤ Documentation
  ¤ Because Perl’s restrictions can be turned off, it’s easy to use it for simple tasks with sloppy code
  ¤ Turn on ‘strict’ and ‘warnings’ to tighten up your code
  ¤ Perl supports object oriented code, though there is no special syntax for it. Feel free to write good code
  ¤ Constructors are ordinary subroutines that use the ‘bless’ function to turn a reference into an object
  ¤ Use the ‘perldoc’ utility to read the documentation
Larry is very religious, and has a good sense of humor about it.
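
The points above about ‘strict’, ‘warnings’ and ‘bless’ can be sketched in a few lines. This is only an illustration, not code from the slides: the ‘Counter’ class, its ‘start’ argument and its method names are made up for the example.

```perl
#!/usr/bin/env perl
use strict;      # undeclared variables become compile-time errors
use warnings;    # e.g. warns when an uninitialized value is used

# Hypothetical class for illustration; "Counter" is not from the slides.
package Counter;

# Constructor: bless ties a plain hash reference to the class name,
# which is what turns the reference into an object.
sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} // 0 };   # '//' needs Perl 5.10+
    return bless $self, $class;
}

# Adds one and returns the object so calls can be chained.
sub increment {
    my ($self) = @_;
    $self->{count}++;
    return $self;
}

# Returns the current count.
sub value {
    my ($self) = @_;
    return $self->{count};
}

package main;

my $counter = Counter->new(start => 5);
$counter->increment->increment;
print "Counter is now ", $counter->value, "\n";   # prints "Counter is now 7"
```

Note that there is no ‘class’ keyword here: a package supplies the namespace, a hash reference holds the data, and ‘bless’ connects the two.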
Terminology
¤ Terminology
  ¤ Module: Similar to an API, it’s a piece of code containing several functions that work together
  ¤ Distribution: A set of modules that perform a task
  ¤ Package: The Perl term for a namespace. It qualifies names in relation to a distribution or module to prevent collisions
  ¤ Block: Statements within curly braces that define a scope
  ¤ Package variable: A global variable that lives in a package’s namespace; other packages can reach it by its fully qualified name
  ¤ Lexical variables: Variables declared with the ‘my’ keyword and visible only inside the enclosing block (the ‘local’ keyword was already taken)

Terminology
¤ Terminology (cont)
  ¤ List: An ordered sequence of comma separated values, usually in parentheses
  ¤ Array: A variable holding an ordered list of scalars, indexed by number
  ¤ Hash: Unordered collection of scalar values indexed by their associated string key
  ¤ Scalar: A single value, in the form of a number, string or reference
  ¤ Compound Statement: A collection of statements, within one or more blocks

Running a Perl Program
¤ Running Perl
  ¤ Perl can be run from the command line or a script
  ¤ Use a ‘.pl’ file extension when writing scripts, and start the script with a shebang line that points at the interpreter (the ‘which’ utility shows its path)
  ¤ Terminate statements in a script with a semicolon
  ¤ Just like bash, use ‘chmod’ to give execute permission to the script and run it with ‘./’ from within the same directory
The ‘use strict’ and ‘use warnings’ pragmas enforce better coding practices, but are optional

Syntax
¤ Perl Syntax
  ¤ Perl’s syntax borrows from C and from shell languages such as bash
  ¤ Quotation Mark: These act like bash, where double quotes allow expansion of variables and single quotes do not.
  ¤ Statement: Each statement is terminated by a semicolon, unless it’s the final statement in a block
  ¤ Slash: The lead guitarist for Guns N’ Roses, or a delimiter for Regular Expressions
  ¤ Backslash: Escape character
  ¤ Hashmark: #comments #sorrynotsorry

Variables
¤ Three Types of Variables
  ¤ $scalar: Scalar variables contain a single value
  ¤ @array: Arrays contain multiple values, indexed by number and starting at 0
  ¤ %hash: Hashes (associative arrays) contain values indexed by string keys
¤ Syntax
  ¤ Variable names are case sensitive and can include letters, digits and the underscore, but cannot start with a digit
  ¤ Package Qualifiers are designated by ‘::’ or, in legacy code, a single quote

Variables
¤ Usage of Variables
  ¤ If a lexical (local) variable has the same name as a package variable, Perl uses the lexical variable within the block
  ¤ Use ‘my’ to keep all variables local, particularly in larger software packages, to prevent confusion
  ¤ Use ‘our’ to use a package variable of the same name
  ¤ Variables do not need to be declared before usage, which can lead to bugs
  ¤ Turn on ‘use strict’ to prevent typos from turning into bugs
  ¤ Turn on ‘use warnings’ to get a notification that a variable is used without initialization

Variables
¤ Scalar Variables
  ¤ Scalar, or singular, variables are prefixed by ‘$’ and hold a single value
  ¤ Using ‘my’ will identify them as Lexical (or local) variables
¤ Scalars and RegExp
  ¤ When a RegExp matches, Perl stores the text captured by each set of parentheses in the variables $1, $2, $3 and so on
  ¤ These variables are overwritten by the next successful match and do not survive outside the enclosing block, so keep a captured value by assigning it to another variable

Variables
¤ Array Variables
  ¤ Arrays use zero based indexing, so they start at 0
  ¤ There is no set limit to the size of an array, but it will return ‘undef’ when an array index is accessed out of bounds
  ¤ Put quotes around strings stored in arrays
  ¤ There are many different ways to deal with arrays, but a lot fewer when ‘strict’ is enabled
  ¤ Access several array items at once using an array slice
  ¤ ‘..’ selects continuous entries, while ‘,’ selects specific entries
  ¤ Print an array inside double quotes to get spaces between entries

Variables
¤ Array Sample
  ¤ Here are some simple array samples to try out for practice
Why would you ever need an array that’s not a linked list? Perl dispenses with the unnecessary.

Variables
¤ Hash Variables
  ¤ A hash, unlike an array, is unordered and values are accessed by a string value called a key
  ¤ Hashing will speed up finding data in large data sets. As in most hash table implementations, a lookup in Perl averages O(1) with a worst case of O(n)
  ¤ This makes the normal use case a hash whose key links to a scalar value, improving the search performance
Hashes in Perl are a little different. Since each key maps to only one value (which can be a reference to an array), they are ideally suited as dictionaries. Once the search key is placed in a hash table, the search is basically free. The value in the dictionary can easily be stored in a format for an address book or other kind of table

Control Structures
¤ Perl Control Structures
  ¤ Perl operates like most other C based languages with some minor differences
  ¤ if/unless: ‘unless’ is the negated form of ‘if’, which simplifies some nested if/else statements when a negation is required
  ¤ foreach: There is an alternative ‘for’ syntax that might look familiar, but the two keywords are synonymous
  ¤ last and next: Like skips or jumps, ‘next’ will return directly to the conditional while ‘last’ exits from the loop
  ¤ while and until: These are identical, except that ‘while’ continues to loop as long as the conditional evaluates to true and ‘until’ continues to loop until it evaluates to true

Working with Files
¤ File Handles
  ¤ Refers to a file or process that is open for reading or writing
  ¤ Similar to bash, Perl opens handles for input, output and error
  ¤ STDIN, STDOUT and STDERR
  ¤ Other handles must be manually opened
  ¤ This sounds hard, but really, they’re not very heavy
  ¤ Give the file handle any name you like
  ¤ Once opened, simply write to it using the ‘print’ function
  ¤ Or use ‘select’ to make the handle the default instead of STDOUT, so all output defaults to it
  ¤ The “magic” file handle (the empty ‘<>’ operator) reads the files named in the command line arguments, or STDIN when there are none

Working with Files
¤ File Handles
  ¤ A brief note about file handles: The convention is to use all caps when writing ‘bareword’ file handles, because they are global to the package. Because of this, they can cause collisions. Separately, the two argument form of ‘open’ interprets mode characters embedded in the filename, so a filename taken directly from input can do something other than what the program intended, and hackers can exploit that. Get into the habit of using three argument opens. Many reasonable people disagree on this topic, but the safest route is a three argument open with the file handle stored in a lexical scalar.
  ¤ Optionally, this is also where you would specify character encoding, as part of the second argument (for example ‘<:encoding(UTF-8)’)

Working with Files
¤ Further Notes
  ¤ ‘chomp’ removes a trailing newline, while ‘chop’ removes the last character whatever it is
  ¤ ‘$!’ holds the last system error, if you’d like to print a full error message with some verbosity
  ¤ ‘@ARGV’ holds the arguments from the command line, useful if your program reads in a filename; ‘$ARGV’ holds the name of the file currently being read through ‘<>’

Sort
¤ Sort
  ¤ The Perl ‘sort’ is a little more refined than bash
  ¤ By default it sorts values as strings; it sorts numerically when given a comparison block, and string comparison follows the ‘locale’ setting only when ‘use locale’ is in effect
  ¤ ‘reverse’ doesn’t sort, it simply returns the values in the opposite order
This sort might not work as expected. It sorts based on character sequence in the ASCII table. There are several ways to deal with this.

Sort
¤ Sort
  ¤ If you’d like to sort numbers as integers, you’ll need to get it to compare things differently
This sort works as expected with integers, but uses a couple of odd things.
The ‘<=>’ “spaceship operator” is a three way comparison operator, which returns -1, 0 or 1. So you quickly get less than, equal to and greater than results from one comparison.
The ‘$a’ and ‘$b’ variables should only be used for sorting (although they work outside of it) because they are specialized to the task. Don’t lexicalize them with ‘my’.

Subroutines
¤ Locality and Subroutines
  ¤ In Perl, unless ‘use strict’ is on, all undeclared variables default to package level
  ¤ Local variables are declared with the ‘my’ keyword and package variables are declared with ‘our’
  ¤ When creating subroutines, sometimes we want variables to be accessible outside of their block, so a package variable is a good fit
  ¤ The @_ array variable is local to the subroutine, but its elements alias the values in the calling argument list, which is how parameters are passed
  ¤ So you can pass in as many parameters as you’d like.
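
The sorting and parameter passing described above can be tried together in one short script. This is a minimal sketch rather than code from the slides: the subroutine names ‘sort_numeric’ and ‘double_in_place’ and the sample data are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper names, chosen for illustration only.

# Returns its arguments sorted as numbers. Any number of values can be
# passed in; they all arrive in @_.
sub sort_numeric {
    my @numbers = @_;                          # copy, so the caller's data is untouched
    my @sorted  = sort { $a <=> $b } @numbers; # $a and $b are set by sort itself
    return @sorted;
}

# Doubles each argument in place. The elements of @_ alias the caller's
# variables, so assigning to them changes the originals.
sub double_in_place {
    $_ *= 2 for @_;
    return;
}

my @values = (10, 9, 100, 1);

my @as_strings = sort @values;            # default: string comparison
my @as_numbers = sort_numeric(@values);   # numeric comparison with <=>

print "string sort:  @as_strings\n";      # prints "string sort:  1 10 100 9"
print "numeric sort: @as_numbers\n";      # prints "numeric sort: 1 9 10 100"

double_in_place(@values);
print "doubled:      @values\n";          # prints "doubled:      20 18 200 2"
```

Copying @_ into ‘my’ variables, as ‘sort_numeric’ does, is the usual way to avoid changing the caller’s data by accident; ‘double_in_place’ skips the copy on purpose to show the aliasing.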