Python Scientific Lecture Notes
Python Scientific lecture notes, Release 2013.1
EuroScipy tutorial team
Editors: Valentin Haenel, Emmanuelle Gouillart, Gaël Varoquaux
http://scipy-lectures.github.com
February 10, 2013 (2013.1-2-g9a33667)

Contents

I Getting started with Python for science

1 Scientific computing with tools and workflow
  1.1 Why Python?
  1.2 Scientific Python building blocks
  1.3 The interactive workflow: IPython and a text editor

2 The Python language
  2.1 First steps
  2.2 Basic types
  2.3 Control Flow
  2.4 Defining functions
  2.5 Reusing code: scripts and modules
  2.6 Input and Output
  2.7 Standard Library
  2.8 Exception handling in Python
  2.9 Object-oriented programming (OOP)

3 NumPy: creating and manipulating numerical data
  3.1 The numpy array object
  3.2 Numerical operations on arrays
  3.3 More elaborate arrays
  3.4 Advanced operations

4 Matplotlib: plotting
  4.1 Introduction
  4.2 Simple plot
  4.3 Figures, Subplots, Axes and Ticks
  4.4 Other Types of Plots: examples and exercises
  4.5 Beyond this tutorial
  4.6 Quick references

5 Scipy : high-level scientific computing
  5.1 File input/output: scipy.io
  5.2 Special functions: scipy.special
  5.3 Linear algebra operations: scipy.linalg
  5.4 Fast Fourier transforms: scipy.fftpack
  5.5 Optimization and fit: scipy.optimize
  5.6 Statistics and random numbers: scipy.stats
  5.7 Interpolation: scipy.interpolate
  5.8 Numerical integration: scipy.integrate
  5.9 Signal processing: scipy.signal
  5.10 Image processing: scipy.ndimage
  5.11 Summary exercises on scientific computing

6 Getting help and finding documentation

II Advanced topics

7 Advanced Python Constructs
  7.1 Iterators, generator expressions and generators
  7.2 Decorators
  7.3 Context managers

8 Advanced Numpy
  8.1 Life of ndarray
  8.2 Universal functions
  8.3 Interoperability features
  8.4 Array siblings: chararray, maskedarray, matrix
  8.5 Summary
  8.6 Contributing to Numpy/Scipy

9 Debugging code
  9.1 Avoiding bugs
  9.2 Debugging workflow
  9.3 Using the Python debugger
  9.4 Debugging segmentation faults using gdb

10 Optimizing code
  10.1 Optimization workflow
  10.2 Profiling Python code
  10.3 Making code go faster
  10.4 Writing faster numerical code

11 Sparse Matrices in SciPy
  11.1 Introduction
  11.2 Storage Schemes
  11.3 Linear System Solvers
  11.4 Other Interesting Packages

12 Image manipulation and processing using Numpy and Scipy
  12.1 Opening and writing to image files
  12.2 Displaying images
  12.3 Basic manipulations
  12.4 Image filtering
  12.5 Feature extraction
  12.6 Measuring objects properties: ndimage.measurements

13 Mathematical optimization: finding minima of functions
  13.1 Knowing your problem
  13.2 A review of the different optimizers
  13.3 Practical guide to optimization with scipy
  13.4 Special case: non-linear least-squares
  13.5 Optimization with constraints

14 Traits
  14.1 Introduction
  14.2 Example
  14.3 What are Traits
  14.4 References

15 3D plotting with Mayavi
  15.1 Mlab: the scripting interface
  15.2 Interactive work

16 Sympy : Symbolic Mathematics in Python
  16.1 First Steps with SymPy
  16.2 Algebraic manipulations
  16.3 Calculus
  16.4 Equation solving
  16.5 Linear Algebra

17 scikit-learn: machine learning in Python
  17.1 Loading an example dataset
  17.2 Classification
  17.3 Clustering: grouping observations together
  17.4 Dimension Reduction with Principal Component Analysis
  17.5 Putting it all together: face recognition
  17.6 Linear model: from regression to sparsity
  17.7 Model selection: choosing estimators and their parameters

18 Interfacing with C
  18.1 Introduction
  18.2 Python-C-Api
  18.3 Ctypes
  18.4 SWIG
  18.5 Cython
  18.6 Summary
  18.7 Further Reading and References
  18.8 Exercises

Index

Part I
Getting started with Python for science

CHAPTER 1
Scientific computing with tools and workflow

authors: Fernando Perez, Emmanuelle Gouillart, Gaël Varoquaux, Valentin Haenel

1.1 Why Python?

1.1.1 The scientist’s needs

• Get data (simulation, experiment control)
• Manipulate and process data.
• Visualize results... to understand what we are doing!
• Communicate results: produce figures for reports or publications, write presentations.

1.1.2 Specifications

• A rich collection of existing bricks corresponding to classical numerical methods or basic actions: we don’t want to re-program the plotting of a curve, a Fourier transform or a fitting algorithm. Don’t reinvent the wheel!
• Easy to learn: computer science is neither our job nor our education. We want to be able to draw a curve, smooth a signal, or do a Fourier transform in a few minutes.
• Easy communication with collaborators, students and customers, so that the code lives on within a lab or a company: the code should be as readable as a book. Thus, the language should contain as few syntax symbols or unneeded routines as possible, since these divert the reader from the mathematical or scientific understanding of the code.
• Efficient code that executes quickly... but, needless to say, very fast code becomes useless if we spend too much time writing it. So we need both a quick development time and a quick execution time.
• A single environment/language for everything, if possible, to avoid learning new software for each new problem.

1.1.3 Existing solutions

Which solutions do scientists use to work?

Compiled languages: C, C++, Fortran, etc.

• Advantages:
  – Very fast. Highly optimized compilers. For heavy computations, it’s difficult to outperform these languages.
  – Some highly optimized scientific libraries have been written for these languages. Example: BLAS (vector/matrix operations)
• Drawbacks:
  – Painful usage: no interactivity during development, mandatory compilation steps, verbose syntax (&, ::, }}, ; etc.), manual memory management