The Evolution of Python

Juha Helminen 10.11.2009

ABC

language and interactive programming environment

● Developed in the late 70s and early 80s – initially under the name B (not to be confused with C's predecessor)

● Leo Geurts, Lambert Meertens, and Steven Pemberton, National Research Institute for Mathematics and Computer Science in the Netherlands (CWI, Centrum voor Wiskunde en Informatica – also Algol 68)

● Focus on creating a language suitable for non-programmers (e.g. physicists, linguists)

● Intended as a replacement for BASIC as a teaching language and environment

ABC Features

● Statement nesting with indentation - however, no nested scoping, only top/single-level functions

● No variable declarations, static typing, type inference [18]

● No file I/O - persistent global variables that were automatically stored to files

● 5 data types
  – Numbers - floating point + infinite precision integers and fractions
  – Texts (strings) - unbounded size
  – Compounds (tuples)
  – Lists - sorted collections of any one type of items
  – Tables - associative arrays with any one type of keys, any one type of items (sorted by key)

ABC Example

Function words finds the set of words in a document.

HOW TO RETURN words document:
   PUT {} IN collection
   FOR line IN document:
      FOR word IN split line:
         IF word not.in collection:
            INSERT word IN collection
   RETURN collection

Problems with ABC

● Funding withdrawn, few users because of [4]
  – unconventional terminology that threw off more experienced users, being more “newbie-friendly” (procedure ↔ how to, variable ↔ location)
  – a monolithic implementation that made it hard to add new features
  – too much emphasis on theoretical performance: tree-based data structure algorithms, optimal for asymptotically large collections but not for small ones [14]
  – not enough flexibility in interaction with other software due to oversimplified I/O

● Very monolithic: no concept of a standard library, built-in commands known by the parser and functions integrated in the runtime → closed system, no user participation

Guido van Rossum

● Fresh out of university, worked 1982 - 1986 on implementing ABC at the National Research Institute for Mathematics and Computer Science in the Netherlands (CWI)

● 1986 moved on to work on Amoeba, a distributed operating system

● 1989 started implementation of a simple scripting language for the needs of the Amoeba project, based on experiences with ABC

● Python's principal author – Other CWI employees with major involvement: Sjoerd Mullender and Jack Jansen [16]

"I needed something that would run first and foremost on Amoeba ... available on that platform was a port of the V7 shell; but it was hard to teach it about Amoeba's (at the time) exciting new features."

- Guido van Rossum [4]

“my initial goal for Python was to serve as a second language for people who were C or C++ programmers, but who had work where writing a C program was just not effective … Bridge the gap between the shell and C.”

- Guido van Rossum [5]

"We found that we needed to write a lot of applications to support users, and that writing these in C our productivity was atrocious. This made me want to use something like ABC ... ABC had been designed more as a teaching and data manipulation language, and its capabilities for interacting with the operating system (which we needed) were limited to non-existent by design. … So I set out to come up with a language that made programmers more productive, and if that meant that the programs would run a bit slower, well, that was an acceptable trade-off."

- Guido van Rossum [11]

"One, I was working at Centrum voor Wiskunde en Informatica (CWI) as a programmer on the Amoeba distributed operating system, and I was thinking that an interpreted language would be useful for writing and testing administration scripts.

Second, I had previously worked on the ABC project, which had developed a programming language intended for non- technical users. I still had some interesting ideas left over from ABC, and wanted to use them."

- Guido van Rossum [12]

"I had a number of gripes about the ABC language, but also liked many of its features. It was impossible to extend the ABC language (or its implementation) to remedy my complaints -- in fact its lack of extensibility was one of its biggest problems”

- Guido van Rossum [13]

"ABC was designed as a diamond - perfect from the start, but impossible to change. I realized that this had accidentally closed off many possible uses, such as interacting directly with the operating system: ABC's authors had a very low opinion of operating systems, and wanted to shield their users completely from all their many bizarre features"

- Guido van Rossum [11]

One-Person Project [14]

● No official budget → needed results quickly

● Borrow ideas whenever it makes sense

● “Things should be as simple as possible, but no simpler”

● Do one thing well (“The UNIX Philosophy”)

● Plan to optimize later

● Don't fight the environment and go with the flow

● Don't try for perfection because “good enough” is often just that

● It's okay to cut corners sometimes, especially if you can do it right later

Design Guidelines [14]

● Not tied to a particular platform – it's okay if some functionality is not always available but the core should work everywhere

● Don't bother users with detail the machine can handle

● Encourage platform-independent code but do not cut off access to platform capabilities (c.f. Java)

● Multiple levels of extensibility

● Errors should not be fatal as long as the virtual machine is still functional – but errors should not pass silently either

● A bug in the user’s Python code should not be allowed to lead to undefined behavior of the Python interpreter; a core dump is never the user’s fault.

Birth of Python

● First public release 0.9.0 February 20, 1991 (alt.sources)
  – an interpreted, interactive object-oriented programming language with dynamic typing
  – virtual machine, parser, and runtime written in C

● Features
  – indentation used for statement grouping (instead of begin-end or braces)
  – powerful built-in data types: hash table (dictionary), list (variable-length array), string, number
  – classes with inheritance
  – exception handling
  – extensible - Python and C modules
  – C modules could make new types available

Python Implementation [19]

● In C as a stack-based byte code interpreter with a collection of primitive types also implemented in C

● The underlying architecture uses “objects” throughout implemented using structures and function pointers

● User-defined objects
  – objects are represented by a new kind of built-in object that stores a class reference pointing to a "class object" shared by all instances of the same class, and a dictionary, dubbed the "instance dictionary", that contains the instance variables
  – the set of methods of a class is stored in a dictionary whose keys are the method names
  – classes are “first-class objects”, which are easily introspected at run time; this also makes it possible to modify a class dynamically

"We'll provide a bunch of built-in object types, such as dictionaries, lists, the various kinds of numbers, and strings, to the language. But we'll also make it easy for third-party programmers to add their own object types to the system."

- Guido van Rossum
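The class and instance dictionaries described above are still visible in current Python. The following is a minimal sketch in Python 3 syntax (not the 0.9.0 implementation itself); Spam, double and triple are made-up names used only for illustration.

```python
class Spam:
    def __init__(self, x):
        self.x = x          # stored in the instance dictionary

    def double(self):
        return 2 * self.x   # the function is stored in the class dictionary


inst = Spam(21)

# Instance variables live in the per-instance dictionary.
instance_vars = inst.__dict__            # {'x': 21}

# Every instance holds a reference to its shared class object.
shares_class = inst.__class__ is Spam    # True

# Methods are entries in the class dictionary, keyed by name.
has_method = "double" in Spam.__dict__   # True


# Classes are first-class objects, so they can be modified at run time.
def triple(self):
    return 3 * self.x


Spam.triple = triple
tripled = inst.triple()                  # 63, existing instances see the new method
```

Because method lookup goes through the shared class object, adding triple to the class after inst was created is enough to make inst.triple() work.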

Python Example

- indentation, no declarations, colon starts a block

>>> def select_max(a,b):
...     if a < b:
...         max=b
...     else:
...         max = a
...     return max
...
>>> max_value = select_max(1,2)
>>> print 'Max is', max_value
Max is 2
>>>

Python Example

- class definition, instantiation and attribute access – explicit receiver/passed implicitly instead of a new keyword

>>> class A:
...     def __init__(self, x):
...         self.x = x
...     def spam(self, y):
...         print self.x, y
...
>>> inst = A(5)
>>> inst.x
5
>>> inst.spam(6)
5 6

Python Example

- built-in data types, initialization with literal values, tuple unpacking, slicing

>>> tuple = (1,2,3)
>>> list = [1,2,3]
>>> hash = {1:'a', 2:'b', 3:'c'}
>>> a,b,c = tuple
>>> a
1
>>> a,b = b,a
>>> a
2
>>> b
1
>>> list[0:1]
[1]
>>> hash[2]
'b'

Zen of Python by Tim Peters [15]

● Beautiful is better than ugly.

● Explicit is better than implicit.

● Simple is better than complex.

● Complex is better than complicated.

● Flat is better than nested.

● Sparse is better than dense.

● Readability counts.

● Special cases aren't special enough to break the rules.

● Although practicality beats purity.

Zen of Python by Tim Peters [15]

● Errors should never pass silently.

● Unless explicitly silenced.

● In the face of ambiguity, refuse the temptation to guess.

● There should be one-- and preferably only one --obvious way to do it.

● Although that way may not be obvious at first unless you're Dutch.

● Now is better than never.

● Although never is often better than *right* now.

● If the implementation is hard to explain, it's a bad idea.

● If the implementation is easy to explain, it may be a good idea.

● Namespaces are one honking great idea - let's do more of those!

Language Design Influences

● Module system, exceptions and explicit self in methods' parameter lists are borrowed from Modula-3 [13]

● ABC is the origin for indentation (and the colon), the idea of built-in high-level data types and tuple packing / unpacking – and overall the source for the ideas of elegance, simplicity and readability [13,14,17] – punctuation characters should be used conservatively

● Most keywords, such as if, else, while, break, continue and others, are identical to C's, as are operator priorities and identifier naming rules [12]

● The Bourne shell was the model for the Python shell's behaviour [12]

● Slicing came from Algol-68 and Icon [12]

"After early user testing [of ABC] without the colon, it was discovered that the meaning of the indentation was unclear to beginners being taught the first steps of programming. The addition of the colon clarified it significantly: the colon somehow draws attention to what follows and ties the phrases before and after it together in just the right way."

- Guido van Rossum [17]

About using indentation for grouping:

"It's the right thing to do from a code readability point of view, and hence from a maintenance point of view. And maintainability of code is what counts most: no program is perfect from the start, and if it is successful, it will be extended. So maintenance is a fact of life, not a necessary evil."

- Guido van Rossum [11]

Dynamic Typing

● Type associated with value not variable - variable is simply a label for a value, name is bound to a value

● Duck typing - if it quacks like a duck, it is a duck – enables proxies etc.

● "In Python, you have an argument passed to a method. You don't know what your argument is. You're assuming that it supports the readline method, so you call readline. Now, it could be that the object doesn't support the readline method." → exception [6]

● "On the other hand, when you find out, you find out in a very good way. The interpreted language tells you exactly this is the type here, that's the type there, and this is where it happened." [8]
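The readline scenario from the quotes can be sketched as follows, in Python 3 syntax. The names first_line and FakeFile are hypothetical and exist only for this illustration.

```python
import io


def first_line(source):
    # No type check: any object with a readline() method will do.
    return source.readline().rstrip("\n")


class FakeFile:
    """A stand-in that merely 'quacks' like a file."""

    def readline(self):
        return "quack\n"


from_stream = first_line(io.StringIO("hello\nworld\n"))  # 'hello'
from_fake = first_line(FakeFile())                        # 'quack'

# An object without readline() fails at the point of the call,
# and the message names both the type and the missing attribute.
error_text = ""
try:
    first_line(42)
except AttributeError as exc:
    error_text = str(exc)
```

The type is attached to the value, so the same function accepts a real text stream and an unrelated class, and the failure for an integer only shows up when readline is actually looked up.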

Python 1.0 (Jan 26, 1994)

● Semicolon separator, long integer (0.9.2)

● Global statement to assign to global variables (0.9.3)

● Built-in function apply(function, tuple) ↔ function(tuple[0], tuple[1], ..., tuple[len(tuple)-1]) (0.9.4)

● For sequences x[-i] ↔ x[len(x)-i] (0.9.4)

● Cleaner class syntax class M(B, D): ↔ class M() = B(), D(): (0.9.4)

● User-defined classes can now implement operations invoked through special syntax, such as x[i] or `x` by defining methods named __getitem__(self, i) or __repr__(self), etc. (0.9.7)

● Continuation lines without a backslash: if the continuation is contained within nesting (), [] or {} brackets the \ may be omitted (0.9.9)

Python 1.0 (Jan 26, 1994)

● Functional programming tools lambda, map, filter and reduce (1.0.0)

● Built-in function xrange() creates a "range object" (1.0.0)
  – Arguments the same as those of range()
  – Behaves the same in a for loop
  – Representation is much more compact
  – cannot be used to initialize a list using idiom: [RED, GREEN, BLUE] = range(3)

● Function argument default values, def f(a, b=1) (1.0.2) – Evaluated at function definition time

● The try-except statement's else clause, executed when no exception occurs in the try clause (1.0.2)

Python Example

- global statement for assignment to global variables, not needed for referring to global variables and consequently assignment to “parts” of variables - semicolon

extra = 1
total = 0.0; count = 0

def add_to_total(amount):
    global total, count
    total = total + amount + extra
    count = count + 1
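Two of the 1.0.2 additions listed earlier, default argument values evaluated at definition time and the else clause of try-except, can be sketched like this in Python 3 syntax. All function names here are invented for the illustration.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when 'def' runs, and then reused.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Common workaround: create the list at call time instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = list(append_shared(1))    # [1]
second = list(append_shared(2))   # [1, 2], the same default object as before
fresh = append_fresh(2)           # [2]


def parse_and_double(text):
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        # Runs only when the try clause raised no exception.
        return value * 2


doubled = parse_and_double("21")      # 42
rejected = parse_and_double("spam")   # None
```

Keeping the doubling in the else clause means an exception raised there would not be mistaken for a parsing failure.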

Python Example

- An example of functional features of Python with an anonymous lambda function - Contributed code

>>> vals = [1, 2, 3, 4]
>>> newvals = map(lambda x: x*x, vals)
>>> print newvals
[1, 4, 9, 16]

Python 1.4 (Oct 25, 1996)

● Classes can define methods named __getattr__, __setattr__, __delattr__ to trap attribute accesses (1.1)

● Classes can define method __call__ so instances can be called directly – obj(arg) invokes obj.__call__(arg) (1.1)

● Documentation strings, accessible through the __doc__ attribute - Modules, classes and functions support special syntax to initialize the __doc__ attribute: if the first statement consists of just a string literal, that string literal becomes the value of the __doc__ attribute (1.2)

● Modula-3 inspired keyword arguments (1.3)

● Name mangling to implement a simple form of class-private variables - __spam can't easily be used outside the class (1.4)
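The attribute hooks, __call__, documentation strings and name mangling listed above can be combined in one small sketch (Python 3 syntax). Recorder is a hypothetical class written for this illustration.

```python
class Recorder:
    """Records calls and attribute lookups."""   # becomes Recorder.__doc__

    def __init__(self):
        self.__log = []          # mangled to _Recorder__log

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        return "no attribute named " + name

    def __call__(self, arg):
        # Makes instances callable: rec(arg) invokes rec.__call__(arg).
        self.__log.append(arg)
        return len(self.__log)


rec = Recorder()
count = rec("first")             # 1
missing = rec.spam               # 'no attribute named spam'
doc = Recorder.__doc__           # 'Records calls and attribute lookups.'
mangled = rec._Recorder__log     # ['first']
```

The mangled name shows why the privacy is only a simple form: the attribute is still reachable as _Recorder__log, it is just hard to hit by accident.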

Python Example

- Keyword arguments

def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    print "-- This parrot wouldn't", action,
    print "if you put", voltage, "volts through it."
    print "-- Lovely plumage, the", type
    print "-- It's", state, "!"

parrot(1000)
parrot(action = 'VOOOOOM', voltage = 1000000)
parrot('a thousand', state = 'pushing up the daisies')
parrot('a million', 'bereft of life', 'jump')

Python 1.5 (Jan 3, 1998)

● (built-in) Packages - import spam.ham.eggs

● Perl style regular expressions (replaced another module) – raw strings for re literals

● Class exceptions instead of strings
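A brief sketch of two of these 1.5 features, written in Python 3 syntax: a raw string used as a regular expression literal, and exceptions defined as classes so that a handler can catch a whole family through the base class. ParseError, MissingVersion and find_version are hypothetical names.

```python
import re


class ParseError(Exception):
    """Base class for the errors raised in this illustration."""


class MissingVersion(ParseError):
    pass


def find_version(text):
    # A raw string keeps the backslashes intact for the regex engine.
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise MissingVersion(text)
    return int(match.group(1)), int(match.group(2))


version = find_version("Python 1.5 (Jan 3, 1998)")   # (1, 5)

# Class exceptions can be caught through a base class.
caught = None
try:
    find_version("no digits here")
except ParseError as exc:
    caught = type(exc).__name__    # 'MissingVersion'
```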

Python 2.0 (Oct 16, 2000)

● Shift to a more transparent and community-backed process at Sourceforge – Formal process to write Python Enhancement Proposals (PEPs), modelled on the RFC process

● Unicode support

● List comprehensions

● Augmented assignment - +=, -=, *=, ...

● Enhanced garbage collection capable of collecting reference cycles – earlier simple reference counting leaked memory in case of cycles
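The difference between plain reference counting and the cycle collector can be observed with the gc and weakref modules. This is a sketch in Python 3 syntax that assumes the CPython implementation; Node is a made-up class.

```python
import gc
import weakref


class Node:
    """Used only to build a reference cycle."""

    def __init__(self):
        self.partner = None


gc.disable()                         # leave only reference counting active
a, b = Node(), Node()
a.partner, b.partner = b, a          # a and b now refer to each other
probe = weakref.ref(a)               # observes whether the object still exists

del a, b
alive_after_del = probe() is not None       # True: the counts never reach zero

gc.collect()                         # the cycle detector finds and frees the pair
alive_after_collect = probe() is not None   # False
gc.enable()
```

With the collector disabled, the two objects keep each other alive after the last outside reference is gone, which is the leak that simple reference counting could not avoid.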

PEP 1

● “PEP stands for Python Enhancement Proposal. A PEP is a design document providing information to the Python community, or describing a new feature for Python. The PEP should provide a concise technical specification of the feature and a rationale for the feature.”

● “We intend PEPs to be the primary mechanisms for proposing new features, for collecting community input on an issue, and for documenting the design decisions that have gone into Python. The PEP author is responsible for building consensus within the community and documenting dissenting opinions.”

Python Example

- list comprehensions

# Given the list L, make a list of all
# strings containing the substring S.
sublist = [ s for s in L if string.find(s, S) != -1 ]

[ expression for expr in sequence1
             for expr2 in sequence2 ...
             for exprN in sequenceN
             if condition ]

Python 2.1 (Apr 17, 2001)

● Python Software Foundation License - all code, documentation and specifications, from 2.1 alpha on, are owned by the Python Software Foundation (PSF), a non-profit organization formed in 2001, modeled after the Apache Software Foundation

● Statically nested scopes (PEP 227)

● __future__ for gradually introducing new features: from __future__ import nested_scopes (PEP 236)

Python Example

- nested scopes – previously 3 namespaces: built-in, global (module), local (function)

def f():
    ...
    def g(value):
        ...
        return g(value-1) + 1
    ...

Python Example

- without nested scopes – with only 3 namespaces (built-in, global, local), the enclosing function's variable had to be passed to the lambda as a default argument (name=name)

def find(self, name):
    "Return list of any entries equal to 'name'"
    L = filter(lambda x, name=name: x == name, self.list_attribute)
    return L
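With statically nested scopes the default-argument trick is no longer needed, because an inner function or lambda can read names from the enclosing function. A minimal sketch in Python 3 syntax; make_finder is a hypothetical helper, not code from the slides.

```python
def make_finder(entries):
    def find(name):
        # 'entries' and 'name' are resolved in the enclosing scopes,
        # so the lambda needs no name=name default argument.
        return list(filter(lambda x: x == name, entries))
    return find


find = make_finder(["spam", "eggs", "spam"])
hits = find("spam")      # ['spam', 'spam']
misses = find("ham")     # []
```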

Python 2.2 (Dec 21, 2001)

● Unification of Python's types (written in C) and classes (written in Python), which allows subclassing types implemented in C, e.g. subclassing a built-in type to add a method (PEP 252, 253)

● Static and class methods

● Properties

● Iterators (PEP 234)

● Generators inspired by Icon (PEP 255)

● Non-truncating division (PEP 238)
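Two of the 2.2 changes that have no example of their own in these slides, subclassing a built-in type and non-truncating division, can be sketched as follows. The code uses Python 3, where true division is the default (in 2.2 it required from __future__ import division); ZeroDict is a made-up class.

```python
class ZeroDict(dict):
    """A dict subclass that adds one method."""

    def get_or_zero(self, key):
        return self.get(key, 0)


counts = ZeroDict(spam=3)
known = counts.get_or_zero("spam")        # 3
unknown = counts.get_or_zero("eggs")      # 0
still_a_dict = isinstance(counts, dict)   # True

true_div = 7 / 2      # 3.5, the result is not truncated
floor_div = 7 // 2    # 3, explicit floor division
```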

Python Example

- Static methods aren’t passed the instance, and therefore resemble regular functions
- Class methods are passed the class of the object, but not the object itself

class C(object):
    def f(arg1, arg2):
        ...
    f = staticmethod(f)

    def g(cls, arg1, arg2):
        ...
    g = classmethod(g)

Python Example

- Properties provide a simpler way to trap attribute references

class C(object):
    def get_size (self):
        result = ... computation ...
        return result
    def set_size (self, size):
        ... compute something based on the size
        and set internal state appropriately ...

    # Define a property. The 'delete this attribute'
    # method is defined as None, so the attribute
    # can't be deleted.
    size = property(get_size, set_size,
                    None,
                    "Storage size of this instance")

Python Example

- Wherever the Python interpreter loops over a sequence, it now uses the iterator protocol

>>> L = [1,2,3]
>>> i = iter(L)
>>> print i
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
>>> i = iter(L)
>>> a,b,c = i
>>> a,b,c
(1, 2, 3)

Python Example

- When you call a generator function, it doesn’t return a single value; instead it returns a generator object that supports the iterator protocol
- Difference between yield and a return statement is that on reaching a yield the generator’s state of execution is suspended and local variables are preserved

# A recursive generator that generates Tree leaves
# in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x

Python 2.3 (Jul 29, 2003)

● No major language changes

● A Boolean type, built-in constants True/False (PEP 285) – alternative ways to spell the integer values 1 and 0, with the single difference that str() and repr() return the strings 'True' and 'False' instead of '1' and '0'

● Source code encodings (PEP 263) – declared by including a specially formatted comment in the first or second line of the source file # -*- coding: UTF-8 -*-

● Importing modules from zip archives (PEP 273)

● Implemented extended slicing syntax in basic types
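The description of True and False as alternative spellings of 1 and 0 can be checked directly. A short sketch in Python 3 syntax, where the behaviour is unchanged:

```python
is_int_subclass = issubclass(bool, int)   # True
equal_to_one = (True == 1)                # True
total = True + True                       # 2
as_text = (str(True), repr(False))        # ('True', 'False')
picked = ["no", "yes"][True]              # 'yes', True works as the index 1
```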

Python Example

- an optional third “step” argument (syntax supported since 1.4, used by Numerical Python)

>>> L = range(10)
>>> L[::2]
[0, 2, 4, 6, 8]

>>> L[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

>>> s='abcd'
>>> s[::2]
'ac'
>>> s[::-1]
'dcba'

Python 2.4 (Nov 30, 2004)

● Generator expressions (PEP 289)
  – Generator expressions work similarly to list comprehensions but don’t materialize the entire list; instead they create a generator that will return elements one by one
  – max(len(line) for line in file if line.strip()), cf. max([len(line) for line in file if line.strip()])

● Function decorators (PEP 318) – A decorator is just a function that takes the function to be decorated as an argument and returns either the same function or some new object

Python Example

- the following decorator checks that the supplied argument is an integer

def require_int(func):
    def wrapper(arg):
        assert isinstance(arg, int)
        return func(arg)

    return wrapper

# same as: p1 = require_int(p1)
@require_int
def p1(arg):
    print arg

@require_int
def p2(arg):
    print arg*2

Python 2.5 (Sep 16, 2006)

● Conditional expressions (PEP 308) – x = true_value if condition else false_value – contents = ((doc + '\n') if doc else '')

● Partial function application (PEP 309)

● Unified try/except/finally (PEP 341) – previously you couldn’t combine except blocks and a finally block in one statement

● with statement (PEP 343) – a new control-flow structure built on the context management protocol – clarifies code that previously would use try...finally blocks to ensure that clean-up code is executed

● ctypes module for calling C functions

Python Example

- Partial function application: construct variants of existing functions that have some of the parameters filled in

import functools

def log (message, subsystem):
    "Write the contents of 'message' to the specified subsystem."
    print '%s: %s' % (subsystem, message)
    ...

server_log = functools.partial(log, subsystem='server')
server_log('Unable to open socket')

Python Example

- Unified try/except/finally

try:
    block-1 ...
except Exception1:
    handler-1 ...
except Exception2:
    handler-2 ...
else:
    else-block
finally:
    final-block

Python Example

- The file object in f will have been automatically closed, even if the for loop raised an exception
- The lock is acquired before the block is executed and always released once the block is complete

with open('/etc/passwd', 'r') as f:
    for line in f:
        print line
        ... more processing code ...

lock = threading.Lock()
with lock:
    # Critical section of code
    ...
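Files and locks work with the with statement because they implement the context management protocol, that is, the methods __enter__ and __exit__. A minimal sketch of a user-defined context manager in Python 3 syntax; Tracked is a hypothetical class that only records the order of events.

```python
class Tracked:
    """Context manager that records what happens to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self            # bound to the name after 'as'

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False           # do not swallow the exception


tracker = Tracked()
try:
    with tracker as t:
        t.events.append("body")
        raise ValueError("boom")
except ValueError:
    pass

events = tracker.events    # ['enter', 'body', 'exit']
```

The "exit" entry is recorded even though the body raised, which is the guarantee that previously required an explicit try...finally block.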

Python 2.6 (Oct 1, 2008)

● preparing the migration path to Python 3.0

● Print as a function (PEP 3105) – makes it possible to replace the function easily

● Abstract Base Classes (PEP 3119) – Python's implementation of interfaces – isinstance() and issubclass() to check

● Class decorators (PEP 3129)
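Abstract base classes and class decorators can be sketched together as follows. The code uses Python 3 syntax and the abc.ABC helper class, which is newer than 2.6 (there the metaclass abc.ABCMeta was set directly); Shape, Square, register and registry are invented for the illustration.

```python
from abc import ABC, abstractmethod

registry = []


def register(cls):
    # A class decorator receives the class and returns it (or a replacement).
    registry.append(cls.__name__)
    return cls


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


@register
class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


square = Square(3)
checks = (isinstance(square, Shape), issubclass(Square, Shape))   # (True, True)

abstract_blocked = False
try:
    Shape()                    # a class with abstract methods cannot be instantiated
except TypeError:
    abstract_blocked = True
```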

Python 2.7

● In development

● Some Python 3.1 features backported to 2.7

Python 3.0 (Dec 3, 2008)

● Breaks backwards compatibility with 2.x series in order to repair flaws in the language - "There should be one— and preferably only one —obvious way to do it."

● Print as a function – Old: print "The answer is", 2*2 – New: print("The answer is", 2*2)

● Unicode for all text strings - unifying the str/unicode types, and introducing a separate immutable bytes type – All text is Unicode; however encoded Unicode is represented as binary data

● Function annotations that can be used for informal type declarations or other purposes (PEP 3107)
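The split between text and binary data, and print as an ordinary function, can be sketched in a few lines of Python 3. The sample string is arbitrary.

```python
import io

text = "caf\u00e9"                    # four characters of Unicode text
data = text.encode("utf-8")           # str -> bytes
round_trip = data.decode("utf-8")     # bytes -> str

kinds = (type(text).__name__, type(data).__name__)   # ('str', 'bytes')
lengths = (len(text), len(data))                     # (4, 5)

# print is an ordinary function, so its output can be redirected.
buffer = io.StringIO()
print("The answer is", 2 * 2, file=buffer)
printed = buffer.getvalue()           # 'The answer is 4\n'
```

The differing lengths show that the encoded form is binary data rather than text: the last character takes two bytes in UTF-8.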

Python 3.0 (Dec 3, 2008)

● Nonlocal - assign directly to a variable in an outer (but non-global) scope (PEP 3104)

● Extended Iterable Unpacking (PEP 3132) – a, b, *rest = some_sequence – *rest, a = stuff

● Dictionary comprehensions: {k: v for k, v in stuff} (PEP 274)

● Set literals, e.g. {1, 2}, and set comprehensions, e.g. {x for x in stuff}
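The following short sketch exercises the features listed on this slide: nonlocal, extended iterable unpacking, and dictionary and set comprehensions. make_counter is a made-up example function.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count      # rebind the variable in the enclosing function
        count += 1
        return count

    return increment


counter = make_counter()
counter()
second_call = counter()                  # 2

first, *rest = [1, 2, 3, 4]              # first == 1, rest == [2, 3, 4]
*head, last = "abc"                      # head == ['a', 'b'], last == 'c'

squares = {n: n * n for n in range(3)}   # {0: 0, 1: 1, 2: 4}
unique = {c for c in "hello"}            # {'h', 'e', 'l', 'o'}
```

Without nonlocal, the assignment inside increment would create a new local variable instead of updating the one in make_counter.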

Python Example

- a syntax for adding arbitrary metadata annotations to Python functions
- a function's annotations are available via the function's __annotations__ attribute

def foo(a: 'x', b: 5 + 6, c: list) -> max(2, 9):
    ...

# The resulting contents of __annotations__
{'a': 'x', 'b': 11, 'c': list, 'return': 9}

Python 3.1

● Current version 3.1.1 released Aug 09, 2009

● No major language changes

The Power of Python

● In Python, all objects are said to be "first class." This means that functions, classes, methods, modules, and all other named objects can be freely passed around, inspected, and placed in various data structures (e.g., lists or dictionaries) at run-time – Also, classes are objects; they are instances of so-called metaclasses

● Extensive standard library (with simple APIs) – Performance-critical code can be implemented in C

● Special methods allow implementing operator functions for user-defined classes (e.g. + ↔ __add__)

● Clean syntax with an emphasis on readability
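The points above about first-class objects, metaclasses and special methods can be illustrated with one small sketch in Python 3 syntax. Vector and table are hypothetical names used only here.

```python
import math


class Vector:
    """A 2-D vector used only for illustration."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Invoked for the expression: self + other
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)


total = Vector(1, 2) + Vector(3, 4)      # equal to Vector(4, 6)

# Functions and classes can be stored in data structures like any other value.
table = {"abs": abs, "sqrt": math.sqrt, "vec": Vector}
results = (table["abs"](-3), table["sqrt"](16.0), table["vec"](0, 1).y)   # (3, 4.0, 1)

# A class is itself an object, an instance of its metaclass.
metaclass = type(Vector)                 # type
```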

References

[1] ABC (programming language), http://en.wikipedia.org/wiki/ABC_(programming_language)
[2] A Short Introduction to the ABC Language, http://homepages.cwi.nl/~steven/abc/
[3] The Making of Python - A Conversation with Guido van Rossum, http://www.artima.com/intv/python.html
[4] An Interview with Guido van Rossum, http://onlamp.com/pub/a/python/2002/06/04/guido.html
[5] Python's Design Goals - A Conversation with Guido van Rossum, http://www.artima.com/intv/pyscale.html
[6] Programming at Python Speed - A Conversation with Guido van Rossum, http://www.artima.com/intv/speed.html
[7] Contracts in Python - A Conversation with Guido van Rossum, http://www.artima.com/intv/pycontract.html
[8] Strong versus Weak Typing - A Conversation with Guido van Rossum, http://www.artima.com/intv/strongweak.html
[9] Designing with the Python Community - A Conversation with Guido van Rossum, http://www.artima.com/intv/pycomm.html
[10] Guido van Rossum, A Brief Timeline of Python, http://python-history.blogspot.com/2009/01/brief-timeline-of-python.html
[11] The A-Z of Programming Languages: Python, http://www.computerworld.com.au/article/255835/- z_programming_languages_python
[12] Interview with Guido van Rossum, http://www.amk.ca/python/writing/gvr-interview
[13] General Python FAQ, http://www.python.org/doc/faq/general/
[14] Guido van Rossum, The History of Python: Python's Design Philosophy, http://python-history.blogspot.com/2009/01/pythons-design-philosophy.html
[15] Tim Peters, PEP 20 - The Zen of Python, http://www.python.org/dev/peps/pep-0020/
[16] The History of Python - Personal History - part 1, CWI, http://python-history.blogspot.com/2009/01/personal-history-part-1-cwi.html
[17] The History of Python - Early Language Design and Development, http://python-history.blogspot.com/2009/02/early-language-design-and-development.html
[18] The History of Python - Python's Use of Dynamic Typing, http://python-history.blogspot.com/2009/02/pythons-use-of-dynamic-typing.html
[19] Guido van Rossum, The History of Python: Adding Support for User-defined Classes, http://python-history.blogspot.com/2009/02/adding-support-for-user-defined-classes.html