IPython: A System for Interactive Scientific Computing
Python: Batteries Included

Fernando Pérez, University of Colorado at Boulder
Brian E. Granger, Tech-X Corporation

Computing in Science & Engineering, May/June 2007. 1521-9615/07/$25.00 © 2007 IEEE. Copublished by the IEEE CS and the AIP. This article has been peer-reviewed.

Python offers basic facilities for interactive work and a comprehensive library on top of which more sophisticated systems can be built. The IPython project provides an enhanced interactive environment that includes, among other features, support for data visualization and facilities for distributed and parallel computation.

The backbone of scientific computing is mostly a collection of high-performance code written in Fortran, C, and C++ that typically runs in batch mode on large systems, clusters, and supercomputers. However, over the past decade, high-level environments that integrate easy-to-use interpreted languages, comprehensive numerical libraries, and visualization facilities have become extremely popular in this field. As hardware becomes faster, the critical bottleneck in scientific computing isn't always the computer's processing time; the scientist's time is also a consideration. For this reason, systems that allow rapid algorithmic exploration, data analysis, and visualization have become a staple of daily scientific work. The Interactive Data Language (IDL) and Matlab (for numerical work), and Mathematica and Maple (for work that includes symbolic manipulation) are well-known commercial environments of this kind. GNU Data Language, Octave, Maxima, and Sage provide their open source counterparts.

All these systems offer an interactive command line in which code can be run immediately, without having to go through the traditional edit/compile/execute cycle. This flexible style matches the spirit of computing in a scientific context, in which determining what computations must be performed next often requires significant work. An interactive environment lets scientists look at data, test new ideas, combine algorithmic approaches, and evaluate their outcome directly. This process might lead to a final result, or it might clarify how they need to build a more static, large-scale production code.

As this article shows, Python (www.python.org) is an excellent tool for such a workflow [1]. The IPython project (http://ipython.scipy.org) aims to provide not only a greatly enhanced Python shell but also facilities for interactive distributed and parallel computing, as well as a comprehensive set of tools for building special-purpose interactive environments for scientific computing.

Python: An Open and General-Purpose Environment

The fragment in Figure 1 shows the default interactive Python shell, including a computation with long integers (whose size is limited only by the available memory) and one using the built-in complex numbers, where the literal 1j represents i = √−1.

    $ python   # $ represents the system prompt
    Python 2.4.3 (Apr 27 2006, 14:43:58)
    [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print "This is the Python shell."
    This is the Python shell.
    >>> 2**45+1   # long integers are built-in
    35184372088833L
    >>> import cmath   # default complex math library
    >>> cmath.exp(-1j*cmath.pi)
    (-1-1.2246063538223773e-16j)

Figure 1. Default interactive Python shell. In the two computations shown (one with long integers and one using the built-in complex numbers), the literal 1j represents i = √−1.

This shell allows for some customization and access to help and documentation, but overall it's a fairly basic environment.

However, what Python lacks in the sophistication of its default shell, it makes up for by being a general-purpose programming language with access to a large set of libraries with additional capabilities. Python's standard library includes modules for regular expression processing, low-level networking, XML parsing, Web services, object serialization, and more. In addition, hundreds of third-party Python modules let users do everything from working with Hierarchical Data Format 5 (HDF5) files to writing graphical applications. These diverse libraries make it possible to build sophisticated interactive environments in Python without having to implement everything from scratch.

IPython

Since late 2001, the IPython project has provided tools to extend Python's interactive capabilities beyond those shipped by default with the language, and it continues to be developed as a base layer for new interactive environments. IPython is freely available under the terms of the BSD license and runs under Linux and other Unix-type operating systems, Apple OS X, and Microsoft Windows.

We won't discuss IPython's features in detail here; it ships with a comprehensive user manual (also accessible on its Web site). Instead, we highlight some of the basic ideas behind its design and how they enable efficient interactive scientific computing. We encourage interested readers to visit the Web site and participate on the project's mailing lists.

One of us (Fernando Pérez) started IPython as a merger of some personal enhancements to the basic interactive Python shell with two existing open source projects (both now defunct and subsumed into IPython):

• LazyPython, developed by Nathan Gray at Caltech, and
• Interactive Python Prompt (IPP) by Janko Hauser at the University of Kiel's Institute of Marine Research.

After an initial development period as a mostly single-author project, IPython has attracted a growing group of contributors. Today, Ville Vainio and other collaborators maintain the stable official branch, while we're developing a next-generation system.

Since IPython's beginning, we've tried to provide the best possible interactive environment for everyday computing tasks, whether the actual work was scientific or not. With this goal in mind, we've freely mixed new ideas with existing ones from Unix system shells and environments such as Mathematica and IDL.

Features of a Good Interactive Computing Environment

In addition to providing direct access to the underlying language (in our case, Python), we consider a few basic principles to be the minimum requirements for a productive interactive computing system.

Access to all session state. When working interactively, scientists commonly perform hundreds of computations in sequence and often need to reuse a previous result. The standard Python shell remembers the very last output and stores it in a variable named "_" (a single underscore), but each new result overwrites this variable. IPython stores a session's inputs and outputs in a pair of numbered tables called In and Out. All outputs are also accessible as _N, where N is the number of the result (you can also save a session's inputs and outputs to a log file). Figure 2 shows the use of previous results in an IPython session. Because keeping a very large set of previous results can potentially lead to memory exhaustion, IPython lets users limit how many results are kept. Users can also manually delete individual references using the standard Python del keyword.

    $ ipython
    Python 2.4.3 (Apr 27 2006, 14:43:58)
    Type "copyright", "credits" or "license" for more information.

    IPython 0.7.3 -- An enhanced Interactive Python.
    ?       -> Introduction to IPython features.
    %magic  -> Information about IPython magic % functions.
    help    -> Python help system.
    object? -> Details about object. ?object also works, ?? prints more.

    In [1]: 2**45+1
    Out[1]: 35184372088833L
    In [2]: import cmath
    In [3]: cmath.exp(-1j*cmath.pi)
    Out[3]: (-1-1.2246063538223773e-16j)
    # The last result is always stored as '_'
    In [4]: _ ** 2
    Out[4]: (1+2.4492127076447545e-16j)
    # And all results are stored as _N, where N is their number:
    In [5]: _3+_4
    Out[5]: 1.2246063538223773e-16j

Figure 2. The use of previous results in an IPython session. In IPython, all outputs are also accessible as _N, where N is the number of the result.

A control system. It's important to have a secondary control mechanism that is reasonably orthogonal to the underlying language being executed (and independent of any variables or keywords in the language). Even programming languages as compact as Python have a syntax that requires parentheses, brackets, and so on, and thus aren't the most convenient for an interactive control system.

IPython offers a set of control commands (or magic commands, as inherited from IPP) designed to improve Python's usability in an interactive context. The traditional Unix shell largely inspires the syntax for these magic commands, with white space used as a separator and dashes indicating options. This system is accessible to the user, who can extend it with new commands as desired.

The fragment in Figure 3 shows how to activate IPython's logging system to save the session to a named file, requesting that the output is logged and every entry is time stamped. IPython automatically interprets the logstart name as a call to a magic command because no Python variable with that name currently exists. If there were such a variable, typing %logstart would disambiguate the names.

    In [2]: logstart -o -t ipsession.log
    Activating auto-logging. Current session state

Operating system access. Many computing tasks involve working with the underlying operating system (reading data, looking for code to execute, loading other programs, and so on). IPython lets users create their own aliases for common system tasks, navigate the file system with familiar commands such as cd and ls, and prefix any command with ! for direct execution by the underlying OS. Although these are fairly simple features, in practice they help maintain a fluid work experience: they let users type standard Python code for programming tasks and perform common OS-level actions with a familiar Unix-like syntax.
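To make the three principles above concrete, the following is a minimal sketch of how numbered history, a magic-style control layer, and a ! escape to the operating system might fit together in one input loop. It is not IPython's implementation: the class MiniShell, its cache_size parameter, and the stand-in fake_logstart command are hypothetical names introduced only for illustration, aliases are left out, and the code uses modern Python 3 rather than the Python 2.4 shown in the figures.

```python
import shlex
import subprocess


class MiniShell:
    """Toy shell (hypothetical): numbered history, magics, '!' escapes."""

    def __init__(self, cache_size=1000):
        self.ns = {}                  # user namespace in which code runs
        self.In = [""]                # inputs; index 0 is unused padding
        self.Out = {}                 # outputs keyed by prompt number
        self.cache_size = cache_size  # maximum number of cached outputs
        self.magics = {}              # magic name -> callable(shell, argv)
        self.ns.update(In=self.In, Out=self.Out)

    def run(self, line):
        """Execute one input line and return its result (or None)."""
        self.In.append(line)
        n = len(self.In) - 1
        if line.startswith("!"):
            # Shell escape: hand the rest of the line to the OS.
            proc = subprocess.run(line[1:], shell=True,
                                  capture_output=True, text=True)
            return proc.stdout
        name, _, rest = line.lstrip("%").partition(" ")
        explicit = line.startswith("%")
        if name in self.magics and (explicit or name not in self.ns):
            # Magic call: whitespace separates words, dashes mark options.
            return self.magics[name](self, shlex.split(rest))
        try:
            result = eval(line, self.ns)   # expressions yield a value
        except SyntaxError:
            exec(line, self.ns)            # statements do not
            return None
        if result is not None:
            self._store(n, result)
        return result

    def _store(self, n, result):
        """Record a result as Out[n], '_' and '_n', then trim the cache."""
        self.Out[n] = result
        self.ns["_"] = result
        self.ns[f"_{n}"] = result
        while len(self.Out) > self.cache_size:
            oldest = min(self.Out)
            del self.Out[oldest]
            self.ns.pop(f"_{oldest}", None)


def fake_logstart(shell, argv):
    """Stand-in magic (hypothetical): report what would be logged."""
    return "would log to " + argv[-1]


shell = MiniShell(cache_size=2)
shell.magics["logstart"] = fake_logstart
print(shell.run("2**45 + 1"))                     # 35184372088833
shell.run("import cmath")                         # a statement, no output
print(shell.run("_1 - 1"))                        # 35184372088832
print(shell.run("logstart -o -t ipsession.log"))  # would log to ipsession.log
```

The dispatch order is the design choice worth noting. A bare name is treated as a magic only when no variable of that name exists in the user namespace, and a leading % forces the magic interpretation, which mirrors the disambiguation rule described for logstart. A real shell needs a more careful parser: this sketch would, for example, mistake the assignment `logstart = 3` for a magic call.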