The future of Python on the Web

My data journey

Lean Data Practices

https://www.mozilla.org/en-US/about/policy/lean-data/

(potentially) universal vs. specific

18GB / day vs. 2TB / day

Communicating about Data Science

Mozilla Confidential

The lifecycle of data science

Exploration
Collaboration
Explanation

Exploration and Explanation in Computational Notebooks

Architecture

[Diagram: Jupyter-like model vs. Iodide model. Components shown: Browser, UI, Kernel, Server, Data, Remote Compute (optional).]

Adapted from: https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html#notebooks

iomd

    %% md
    # This is a markdown header

    %% js
    el = document.getElementById("foo")

    %% py
    from js import el
    el.text = "Hello World!"

● Human readable and editable
● Easy for programs to support
● Diffable with standard tools
● See Matlab cell mode, R Markdown, Jupytext (and many others)
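The chunk layout above is simple enough that a few lines of code can split it apart, which is part of what makes the format easy for programs to support. The following is a minimal sketch, not Iodide's actual parser: it assumes every chunk starts with a line consisting of `%%` followed by a single type word, and `split_iomd` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Assumption: a chunk delimiter is a line of the form "%% <type>".
CHUNK_START = re.compile(r"^%%\s*(\w+)\s*$")


def split_iomd(text: str) -> List[Tuple[str, str]]:
    """Split iomd-style text into (chunk_type, chunk_body) pairs."""
    chunks: List[Tuple[str, str]] = []
    current_type = None
    current_lines: List[str] = []
    for line in text.splitlines():
        match = CHUNK_START.match(line)
        if match:
            # A new delimiter closes the chunk collected so far.
            if current_type is not None:
                chunks.append((current_type, "\n".join(current_lines).strip()))
            current_type = match.group(1)
            current_lines = []
        elif current_type is not None:
            current_lines.append(line)
        # Lines before the first delimiter are ignored in this sketch.
    if current_type is not None:
        chunks.append((current_type, "\n".join(current_lines).strip()))
    return chunks


EXAMPLE = '''%% md
# This is a markdown header

%% js
el = document.getElementById("foo")

%% py
from js import el
el.text = "Hello World!"
'''

for chunk_type, body in split_iomd(EXAMPLE):
    print(chunk_type, "->", repr(body))
```

Because the delimiters are ordinary text lines, the same file also works with line-based tools such as `diff`.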

JavaScript

PROS
● FAST: Some of the best technology of any dynamic language
● Familiar to many
● Large selection of user interface and visualization tools

CONS
● Legacy "rough edges"
● Not familiar to many data scientists
● Lacks a mature data science ecosystem

What if we could bring Python to the browser?

Transpiling
Convert Python to JavaScript

Python

    def fib(n):
        if n == 1:
            return 0
        elif n == 2:
            return 1
        else:
            return fib(n - 1) + fib(n - 2)

JavaScript

    export var fib = function(n) {
        if (n == 1) return 0;
        else if (n == 2) return 1;
        else return fib(n - 1) + fib(n - 2)
    };

transcrypt, pyjs

Transpiling
Convert Python to JavaScript

Pros
● Small
● Fast

Cons
● Compilation happens server-side, "ahead of time"
● Subtly different semantics
● Covering all of CPython's functionality is a lot of work
● Keeping up with CPython's progress is a lot of work
● No support for extensions (Numpy, Scipy, etc.)
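One concrete source of "subtly different semantics" is number handling: Python integers have arbitrary precision, while a JavaScript Number is an IEEE 754 double. The sketch below emulates, in Python, what happens if a transpiler naively maps Python `int` to a JavaScript Number. It is an illustration under that assumption, not a description of how transcrypt or pyjs actually behave, and the helper names are hypothetical.

```python
def add_like_python(a: int, b: int) -> int:
    """Python integer addition: exact at any magnitude."""
    return a + b


def add_like_js_number(a: int, b: int) -> int:
    """Emulate adding two JavaScript Numbers by going through doubles."""
    return int(float(a) + float(b))


# 2**53 is the first point where doubles can no longer represent
# every consecutive integer.
big = 2**53

exact = add_like_python(big, 1)
rounded = add_like_js_number(big, 1)

print(exact)    # 9007199254740993
print(rounded)  # 9007199254740992
print(exact == rounded)  # False
```

Small values behave identically in both models, which is why differences of this kind tend to surface late.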

Interpreter Porting
Rewrite the Python interpreter and VM in JavaScript

C

    static int
    set_add_entry(
        PySetObject *so,
        PyObject *key,
        Py_hash_t hash
    )
    {
        while (1) {
            if (entry->hash == hash) {
                PyObject *startkey = entry->key;
                assert(startkey != dummy);
                if (startkey == key)
                    goto found_active;

JavaScript

    function $add(self, item){
        self.$items.push(item)
        var value = item.valueOf()
        if(typeof value == "number"){
            self.$numbers.push(value)
        }

brython, skulpt, batavia

Interpreter Porting
Rewrite the Python interpreter and VM in JavaScript

Pros
● Can compile and run Python entirely in the browser
● Can embed a transpiler in the browser for a hybrid approach

Cons
● Larger download and slower startup than transpiling
● Subtly different semantics
● Covering all of CPython's functionality is a lot of work
● Keeping up with CPython's progress is a lot of work
● No support for C extensions

WebAssembly

Compile to WebAssembly
Recompile the Python interpreter to WebAssembly

C

    static int
    set_add_entry(
        PySetObject *so,
        PyObject *key,
        Py_hash_t hash
    )
    {
        while (1) {
            if (entry->hash == hash) {
                PyObject *startkey = entry->key;
                assert(startkey != dummy);
                if (startkey == key)
                    goto found_active;

WebAssembly

    (func (;1839;) (type 4) (param i32 i32 i32) (result i32)
      (local i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)
      ...
      if ;; label = @1
        block ;; label = @2
          block ;; label = @3
            block ;; label = @4
              loop ;; label = @5
                block ;; label = @6
                  block (result i32) ;; label = @7
                    block ;; label = @8

PyPy.js, cpython-wasm, Pyodide

Compile to WebAssembly
Recompile the Python interpreter to WebAssembly

Pros
● It's the same as upstream CPython
● Everything that can work does work
● Supports C extensions (Numpy, Scipy etc.)
● Performance on par with native code

Cons
● Very large download sizes
● High memory usage

Tradeoffs

                                 Transpiling   Porting   Recompiling interpreter
Download size                    Small         Medium    Large
Memory usage                     Small         Medium    Large
Similarity to upstream           Low           Medium    High
Easily track upstream changes                            ✅
Supports C extensions                                    ✅

Pyodide
The scientific Python stack, compiled to WebAssembly

Pyodide

● Upstream CPython

● numpy, pandas, matplotlib, scipy

● "pip install" pure Python wheels
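Whether a wheel is "pure Python" can be read off its filename, which follows the wheel naming convention of name, version, optional build tag, Python tag, ABI tag, and platform tag. The sketch below checks for the `none` ABI tag and `any` platform tag. `is_pure_python_wheel` is a hypothetical helper, the filenames are illustrative, and this is not a description of how Pyodide itself selects or installs packages.

```python
def is_pure_python_wheel(filename: str) -> bool:
    """Return True if a wheel filename carries the 'none-any' tags.

    Wheel filenames look like:
        {name}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        return False
    parts = filename[: -len(suffix)].split("-")
    # Five parts without a build tag, six with one.
    if len(parts) not in (5, 6):
        return False
    _python_tag, abi_tag, platform_tag = parts[-3:]
    return abi_tag == "none" and platform_tag == "any"


# Illustrative filenames (assumptions, not taken from the talk).
print(is_pure_python_wheel("six-1.16.0-py2.py3-none-any.whl"))
print(is_pure_python_wheel(
    "numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
))
```

Wheels with a compiled ABI or platform tag contain native code, so they would need to be rebuilt for WebAssembly rather than installed as-is.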

Your Python Code

Pyodide Python CPython Extension

Emscripten system abstraction

JavaScript interpreter

DOM APIs

Accelerating Python

[Diagram: Input, Process, Output. Process options shown: C extension, Cython, Numba, and JavaScript (with a Conversion step on either side).]

Sharing arrays with zero copying

Future?
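As a point of reference for what "zero copying" means, the sketch below uses only the Python standard library: a `memoryview` exposes the same underlying buffer as the array it wraps, so a write through the view is visible in the original. This illustrates the general idea via CPython's buffer protocol. It is an assumption-laden analogy and does not show how arrays would be shared between Python and JavaScript.

```python
from array import array

# Four 8-byte floats stored contiguously.
samples = array("d", [1.0, 2.0, 3.0, 4.0])

# A memoryview refers to the same memory; nothing is copied.
view = memoryview(samples)

# Slicing a memoryview also yields a view, not a copy.
window = view[1:3]
window[0] = 20.0

# By contrast, tolist() builds a new, independent list.
copied = samples.tolist()
copied[0] = -1.0

print(samples[1])   # changed through the view
print(samples[0])   # unaffected by the edit to the copy
print(view.nbytes)  # size of the shared buffer in bytes
```

With a copy, every conversion step costs time and memory proportional to the array size; with a shared view, that cost disappears.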

The Web API

● DOM
● Graphics: Canvas, WebGL
● Audio: WebAudio, WebRTC
● Video: HTMLMediaElement
● Device: Notifications, WebBluetooth
● Storage: Client-side storage

Pyodide Demo

Performance

https://github.com/serge-sans-paille/numpy-benchmarks

Ways to get more performance

● Cython
● Numba
● PyPy
● Apache Arrow
● General purpose GPU
● Distributed computing

What doesn't work

Probably never
● Raw network sockets
● Subprocesses
● Access to the host filesystem

Someday
● threads
● async
● SIMD
● General Purpose GPU computing

Monolithic Libraries

package      Total size   Loaded at import
Scipy        65MB         11MB
Pandas       50MB         43MB
Matplotlib   20MB         13MB
Numpy        20MB         11MB

* values are for native x86_64 Python
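The table is easier to compare as a ratio: how much of each package is actually loaded at import. The short calculation below uses only the figures from the table; `loaded_fraction` is a hypothetical helper name.

```python
# (total size in MB, size loaded at import in MB), from the table above.
SIZES_MB = {
    "Scipy": (65, 11),
    "Pandas": (50, 43),
    "Matplotlib": (20, 13),
    "Numpy": (20, 11),
}


def loaded_fraction(total_mb: float, loaded_mb: float) -> float:
    """Fraction of the package's total size that is loaded at import."""
    return loaded_mb / total_mb


for package, (total, loaded) in SIZES_MB.items():
    share = loaded_fraction(total, loaded)
    print(f"{package:<10} {loaded:>2}MB of {total:>2}MB loaded at import ({share:.0%})")
```

On these numbers Scipy loads roughly a sixth of its total size at import, while Pandas loads most of it.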

Future directions
conda forge infrastructure for package building

Future directions
Language interoperability

[Diagram: language interoperability. Legend: Works today; Planned; Text in/out only. Languages shown: Python, JavaScript, JSX, TypeScript, Julia, Lua, OCaml, R, Ruby, Rust. Also shown: Apache Arrow / libndtype.]

We're open source on GitHub

http://github.com/iodide-project/

Come build with us!

We need:
● Experimenters
● Designers
● Programmers
● Writers
● Bug hunters

Our team

Roman Yurchak Brendan Colloran Devin Bayly Kirill Smelkov Hamilton Ulmer

William Lachance

Michael Droettboom

Teon Brooks Madhur Tandon John Karahalis

Rob Miller Dhiraj Barnwal

Jannis Leidel

... and many other community contributors

iodide.io
github.com/iodide-project

[email protected]
