The Future of Python on the Web
My data journey

Lean Data Practices
https://www.mozilla.org/en-US/about/policy/lean-data/

(potentially) universal   vs.   specific
18GB / day                vs.   2TB / day

Communicating about Data Science

The lifecycle of data science
● Exploration
● Collaboration
● Explanation
Exploration and Explanation in Computational Notebooks

Architecture
Jupyter-like model vs. Iodide model
[Diagram: boxes labeled Browser, Server, UI, Kernel, Data, and Remote Compute (optional), arranged once for each model.]
Adapted from: https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html#notebooks

iomd
● Human readable and editable
● Easy for programs to support
● Diffable with standard tools
● See MATLAB cell mode, R Markdown, Jupytext (and many others)

    %% md
    # This is a markdown header
    %% js
    el = document.getElementById("foo")
    %% py
    from js import el
    el.text = "Hello World!"

JavaScript
Pros:
● FAST: Some of the best compiler technology of any dynamic language
● Familiar to many programmers
● Large selection of user interface and visualization tools
Cons:
● Legacy "rough edges"
● Not familiar to many data scientists
● Lacks a mature data science ecosystem

What if we could bring Python to the browser?

Transpiling
Convert Python to JavaScript

Python:

    def fib(n):
        if n == 1:
            return 0
        elif n == 2:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

JavaScript:

    export var fib = function(n) {
        if (n == 1) return 0;
        else if (n == 2) return 1;
        else return fib(n - 1) + fib(n - 2)
    };

Projects: transcrypt, pyjs

Transpiling
Convert Python to JavaScript
Pros:
● Small
● Fast
Cons:
● Server-side "ahead of time"
● Subtly different semantics
● Covering all of CPython's functionality is a lot of work
● Keeping up with CPython's progress is a lot of work
● No support for C extensions (NumPy, SciPy, etc.)
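To make the "subtly different semantics" point concrete, here is a minimal sketch. It assumes a hypothetical transpiler that maps Python integers directly onto JavaScript Numbers (IEEE-754 doubles); the slides do not say which projects behave this way. The JavaScript side is emulated with Python floats, and `fib_exact` and `fib_double` are illustrative names. An iterative loop replaces the slide's recursive `fib` only so that large inputs finish quickly; the indexing is unchanged.

```python
def fib_exact(n: int) -> int:
    """Slide indexing (fib(1) == 0, fib(2) == 1) with Python's arbitrary-precision ints."""
    a, b = 0, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def fib_double(n: int) -> float:
    """Same loop, but every value is a 64-bit float, emulating a JavaScript Number.

    Assumption: this models a transpiler that does not wrap integers in a
    big-integer type. Real tools may choose differently.
    """
    a, b = 0.0, 1.0
    for _ in range(n - 1):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    for n in (10, 79, 80):
        exact = fib_exact(n)
        approx = int(fib_double(n))
        status = "same" if exact == approx else "DIFFERENT"
        print(f"fib({n}): exact={exact} double={approx} -> {status}")
```

The two versions agree until the result exceeds 2**53, the largest range in which a double represents every integer exactly. With the slide's indexing that first happens at `fib(80)`, so a program that is correct under CPython can silently lose precision after transpiling.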

Interpreter Porting
Rewrite the Python interpreter and VM in JavaScript

C (excerpt):

    static int
    set_add_entry(
        PySetObject *so,
        PyObject *key,
        Py_hash_t hash
    )
    {
        while (1) {
            if (entry->hash == hash) {
                PyObject *startkey = entry->key;
                assert(startkey != dummy);
                if (startkey == key)
                    goto found_active;

JavaScript (excerpt):

    function $add(self, item){
        self.$items.push(item)
        var value = item.valueOf()
        if(typeof value == "number"){
            self.$numbers.push(value)
        }

Projects: brython, skulpt, batavia

Interpreter Porting
Rewrite the Python interpreter and VM in JavaScript
Pros:
● Can compile and run Python entirely in the browser
● Can embed a transpiler in the browser for a hybrid approach
Cons:
● Larger download and slower startup than transpiling
● Subtly different semantics
● Covering all of CPython's functionality is a lot of work
● Keeping up with CPython's progress is a lot of work
● No support for C extensions

WebAssembly

Compile to WebAssembly
Recompile the Python interpreter to WebAssembly

C (excerpt):

    static int
    set_add_entry(
        PySetObject *so,
        PyObject *key,
        Py_hash_t hash
    )
    {
        while (1) {
            if (entry->hash == hash) {
                PyObject *startkey = entry->key;
                assert(startkey != dummy);
                if (startkey == key)
                    goto found_active;

WebAssembly (excerpt):

    (func (;1839;) (type 4) (param i32 i32 i32) (result i32)
      (local i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
      ...
      if ;; label = @1
        block ;; label = @2
          block ;; label = @3
            block ;; label = @4
              loop ;; label = @5
                block ;; label = @6
                  block (result i32) ;; label = @7
                    block ;; label = @8

Projects: PyPy.js, cpython-wasm, Pyodide

Compile to WebAssembly
Recompile the Python interpreter to WebAssembly
Pros:
● It's the same as upstream CPython
● Everything that can work does work
● Supports C extensions (NumPy, SciPy, etc.)
● Performance on par with native code
Cons:
● Very large download sizes
● High memory usage

Tradeoffs

                                 Transpiling   Porting interpreter   Recompiling
    Download size                Small         Medium                Large
    Memory usage                 Small         Medium                Large
    Similarity to upstream       Low           Medium                High
    Easily track upstream changes                                    ✅
    Supports C extensions                                            ✅

Pyodide
The scientific Python stack, compiled to WebAssembly

Pyodide
● Upstream CPython
● numpy, pandas, matplotlib, scipy
● "pip install" pure Python wheels

[Stack diagram; labels in extracted order: Your Python Code, Pyodide, Python, CPython Extension, Emscripten system abstraction, JavaScript interpreter, DOM APIs.]

Accelerating Python
[Diagram: Input, Process, Output. Labels: C extension, Cython, Numba, Conversion, JavaScript, Conversion.]

Sharing arrays with zero copying
Future?

The Web API
● DOM
● Graphics: Canvas, WebGL
● Audio: WebAudio, WebRTC
● Video: HTMLMediaElement
● Device: Notifications, WebBluetooth
● Storage: Client-side storage

Pyodide Demo

Performance
https://github.com/serge-sans-paille/numpy-benchmarks

Ways to get more performance
● Cython
● Numba
● PyPy
● Apache Arrow
● General purpose GPU
● Distributed computing

What doesn't work
Probably never:
● Raw network sockets
● Subprocesses
● Access to the host filesystem
Someday:
● threads
● async
● SIMD
● General Purpose GPU computing

Monolithic Libraries

    Package      Total size   Loaded at import
    SciPy        65MB         11MB
    pandas       50MB         43MB
    Matplotlib   20MB         13MB
    NumPy        20MB         11MB

    * values are for native x86_64 Python

Future Directions
conda-forge infrastructure for package building

Future directions
Language interoperability
[Diagram: Python at the center, connected to Julia, Lua, OCaml, JavaScript, JSX, R, TypeScript, Ruby, and Rust, with Apache Arrow / libndtype. Legend: Works today, Planned, Text in/out only.]

We're open source on GitHub
http://github.com/iodide-project/
Come build with us!
We need:
● Experimenters
● Designers
● Programmers
● Writers
● Bug hunters

Our team
Roman Yurchak, Brendan Colloran, Devin Bayly, Kirill Smelkov, Hamilton Ulmer, William Lachance, Michael Droettboom, Teon Brooks, Madhur Tandon, John Karahalis, Rob Miller, Dhiraj Barnwal, Jannis Leidel
...and many other community contributors

iodide.io
github.com/iodide-project
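
Returning to the "Monolithic Libraries" slide: the arithmetic behind it can be sketched as below. The sizes are copied from that slide (native x86_64 Python, in MB). The helper name `summarize` and the rounding to whole percentages are illustrative choices, not part of the talk.

```python
# (total size in MB, size loaded at import in MB), as listed on the slide.
LIBRARIES = {
    "SciPy": (65, 11),
    "pandas": (50, 43),
    "Matplotlib": (20, 13),
    "NumPy": (20, 11),
}


def summarize(libraries):
    """Return (name, total_mb, loaded_mb, percent_loaded) for each package."""
    rows = []
    for name, (total_mb, loaded_mb) in libraries.items():
        percent = round(100 * loaded_mb / total_mb)
        rows.append((name, total_mb, loaded_mb, percent))
    return rows


if __name__ == "__main__":
    for name, total_mb, loaded_mb, percent in summarize(LIBRARIES):
        print(f"{name:<11} {total_mb:>3}MB total, {loaded_mb:>3}MB at import ({percent}%)")
```

Read this way, the table shows how much of each download is actually needed for a bare import. For SciPy that is roughly a sixth of the package, which matters more in the browser, where the whole package has to be fetched before anything can be imported.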