What Is Object-Oriented Perl?
This is a series of extracts from Object Oriented Perl, a new book from Manning Publications that will be available in August 1999. For more information on this book, see http://www.manning.com/Conway/.

What is object-oriented Perl?

Object-oriented Perl is a small amount of additional syntax and semantics, added to the existing imperative features of the Perl programming language. Those extras allow regular Perl packages, variables, and subroutines to behave like classes, objects, and methods.

It's also a small number of special variables, packages and modules, and a large number of new techniques, that together provide inheritance, data encapsulation, operator overloading, automated definition of commonly used methods, generic programming, multiply-dispatched polymorphism, and persistence.

It's an idiosyncratic, no-nonsense, demystified approach to object-oriented programming, with a typically Perlish disregard for accepted rules and conventions. It draws inspiration (and sometimes syntax) from many different object-oriented predecessors, adapting their ideas to its own needs. It reuses and extends the functionality of existing Perl features, and in the process throws an entirely new slant on what they mean.

In other words, it's everything that regular Perl is, only object-oriented.

Using Perl makes object-oriented programming more enjoyable, and using object-oriented programming makes Perl more enjoyable too. Life is too short to endure the cultured bondage-and-discipline of Eiffel programming, or to wrestle the alligators that lurk in the muddy semantics of C++. Object-oriented Perl gives you all the power of those languages, with very few of their tribulations. And best of all, like regular Perl, it's fun!

Before we look at object orientation in Perl, let's talk about what object orientation is in general...

The essentials of object orientation

You really need to remember only five things to understand 90% of the theory of object orientation:

• an object is anything that provides a way to locate, access, modify, and secure data;
• a class is a description of what data is accessible through a particular kind of object, and how that data may be accessed;
• a method is the means by which an object's data is accessed, modified, or processed;
• inheritance is the way in which existing classes of object can be upgraded to provide additional data or methods;
• polymorphism is the way that distinct objects can respond differently to the same message, depending on the class they belong to.

This section discusses each of these ideas.

Objects

An object is an access mechanism for data. In most object-oriented languages that means that objects act as containers for data (or at least, containers for pointers to data). But in the more general sense, anything that provides access to data (a variable, a subroutine, a file handle) may be thought of as an object.

The various data to which an object provides access are known as its attribute values, and the containers storing those attribute values are called attributes. Attributes are (usually) nothing more than variables that have somehow been exclusively associated with a given object.

Objects are more than just collections of variables, however. In most languages, objects have an extra property called encapsulation. Encapsulation [1] means that the attributes of an object are not directly accessible to the entire program. Instead, they can only be accessed through certain subroutines that are associated with the object. Those subroutines are called methods, and they are (usually) universally accessible. This layer of indirection means that methods can be used to limit the ways in which an object's attribute values may be accessed or changed.
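
As a rough illustration of encapsulation, the following Perl sketch hides one attribute behind two subroutines. It is only one possible arrangement (a closure over a lexical variable), and the names make_tally, increment, and value are invented for this example rather than taken from the book.

```perl
use strict;
use warnings;

# Hypothetical example: a "tally" object whose single attribute ($count)
# is a lexical variable that only the two subroutines below can see.
sub make_tally {
    my $count = 0;    # the attribute, hidden from the rest of the program
    return {
        increment => sub { return ++$count },    # method: controlled modification
        value     => sub { return $count },      # method: read-only access
    };
}

my $tally = make_tally();
$tally->{increment}->() for 1 .. 3;
print $tally->{value}->(), "\n";    # prints 3
```

Nothing outside make_tally can reach $count directly, so the count can only go up by one at a time: the two subroutines define every permitted way of touching the data.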
In other words, an object's attribute values can only be retrieved or modified in the ways permitted by that object's methods.

Let's take a real-world example of an object: an automated teller machine. An ATM is an object because it provides (controlled) access to certain attribute values, such as your account balance, or the bank's supply of cash. Some of those attribute values are stored in attributes within the machine itself (i.e. its cash trays), whilst others are stored elsewhere (i.e. in the bank's central accounts computer). From the client's point of view, it doesn't matter where the attribute values actually are, so long as they're accessible via the ATM object.

Access to the ATM's various attributes is restricted by the interface of the machine. That is, the various buttons, screens, and slots of the ATM control how encapsulated attribute values (cash, information, etc.) may be accessed. Those restrictions are designed to ensure that the object maintains a consistent internal state and that any external interactions with its attributes are valid and appropriate.

For example, most banks don't use ATMs consisting of a big basket of loose cash and a note pad on which you record exactly how much you took. Even if the bank could assume that everyone was honest, it couldn't assume that everyone was infallible. People would inevitably end up taking (or recording) the wrong amount by mistake, even if no-one did so deliberately.

The restrictions on access are in the client's interest too. The machine can provide access to attribute values that are private to a particular client (e.g. their account balance) and it shouldn't make that information available to just anyone. Even if we are pretending that all the ATM's clients are entirely honest, the account information shouldn't be universally available, because eventually someone will access and modify the wrong account data by accident.

In object-oriented programming, an object's methods provide the same kinds of protection for data. The question is: how does an object know which methods to trust?

[1] Encapsulation is an awkward term, because it has two distinct meanings: "bundling things together" and "isolating things from the outside world". In the literature of object orientation both senses of the word have been used at different times. Originally, encapsulation was used in the "bundling" sense, as a synonym for aggregation. More recently, encapsulation has increasingly been used in the "isolation" sense, as a synonym for data hiding. It's in that more modern sense that the term is used hereafter.

Classes

Setting up an association between a particular kind of object and a set of trusted subroutines (i.e. methods) is the job of the object's class. A class is a formal specification of the attributes of a particular kind of object, and of the methods that may be called to access those attributes. In other words, a class is a blueprint for a given kind of object. Every object belonging to a class has an identical interface (a common set of methods that may be called) and implementation (the actual code defining those methods and the attributes they access). Objects are said to be instances of the class.

When a program is asked to create an object of a particular kind, it consults the appropriate class definition (blueprint) to determine how to build such an object. Typically the class definition will specify what attributes the class's objects possess and where those attributes are stored (i.e. inside the object, or remotely through a pointer or reference).

When a particular method is called on an object, the program again consults the object's class definition to ensure that the method is "legal" for that object (i.e. the method is part of the object's blueprint), and that the method has been called correctly (i.e. in line with the definition in the class blueprint).
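
To make the blueprint idea concrete, here is a minimal Perl sketch of a class, an instance built from it, and a check of which methods are legal for that instance. It assumes the common convention of a package acting as the class and a blessed hash acting as the object; the constructor arguments and method bodies are illustrative guesses, not the book's code.

```perl
use strict;
use warnings;

package ATM;    # the class: a blueprint for ATM objects

# Constructor: builds an object holding the attributes the class specifies.
sub new {
    my ($class, %args) = @_;
    my $self = { cash_remaining => defined $args{cash} ? $args{cash} : 0 };
    return bless $self, $class;
}

# Object method: the sanctioned way to reduce cash_remaining.
sub withdraw_cash {
    my ($self, $amount) = @_;
    die "Insufficient cash\n" if $amount > $self->{cash_remaining};
    $self->{cash_remaining} -= $amount;
    return $amount;
}

# Object method: read-only access to the attribute.
sub cash_remaining { return $_[0]{cash_remaining} }

package main;

my $atm = ATM->new(cash => 500);    # an instance of the class
$atm->withdraw_cash(120);
print $atm->cash_remaining, "\n";    # prints 380

# The class determines which method calls are legal for its objects.
for my $name ('withdraw_cash', 'withdraw_cash_without_debiting_account') {
    print "$name: ", ($atm->can($name) ? "legal" : "not legal"), "\n";
}
```

Note that a blessed hash does not itself prevent outside code from reaching into the hash directly; in this sketch the restriction to methods is a matter of convention.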
For example, in software controlling a bank's automated teller network there might be a class called ATM that describes the structure and behaviour of objects that represent individual ATMs. The ATM class might specify that each ATM object has the attributes cash_remaining, transaction_list, cards_swallowed, etc., and methods such as start_up(), withdraw_cash(), list_transactions(), restrict_withdrawal(), chew_cards(), close_down().

Thereafter, when an ATM object receives a request to invoke a method called withdraw_cash_without_debiting_account(), it can check the ATM class blueprint and ascertain that the method cannot be called. Alternatively, if the (valid) method close_down() is defined to increment a (non-existent) attribute called downtime, then this coding error can be detected.

Class attributes and methods

So far, we've only considered attributes that are accessed through (i.e. "belong to") an individual object. Such attributes are more formally known as object attributes. Likewise, we've only talked about methods that were called on a particular object to manipulate its object attributes. No prizes for guessing that such methods are called object methods.

Unfortunately, object attributes and methods don't always provide an appropriate mechanism for controlling the data associated with the objects of a particular class. In particular, the attributes of an individual object of a class are not usually suitable for encapsulating data that belongs, collectively, to the complete set of objects of that class.

Let's go back to the ATM example for a moment. At the end of each day, the bank will want to know how much money in total has been dispensed from all its ATM machines. Each of those machines will have a record of how much it has dispensed individually, but no machine will have a record of how much all the bank's machines have dispensed collectively. That information is not a property of a particular ATM.
Rather, it's a collective property of the entire set of ATMs. The most obvious solution is to design another kind of machine—an ATM coordinator—that gathers and stores the collective data of the set of ATMs (i.e.