Introduction Jupyter Prof Hans Fangohr, 5 October 2019 1

09:00 - 09:30 Jupyter Notebook and Ecosystem 30', Hans Fangohr (European XFEL GmbH)
09:30 - 10:00 Status updates from facilities
► Jupyter at Synchrotron Soleil (Alain Buteau & Gwenaelle Abeille) 5'
► Jupyter at CERN (Manuel Gonzalez & Jakub Wozniak) 5'
► Jupyter at European Southern Observatory (Gianluca Chiozzi) 5'
► Jupyter at J-PARC MLF (Kentaro Moriyama) 5'
► Jupyter at MAX IV Synchrotron (Vincent Hardion) 5'
10:00 - 10:30 Coffee break
10:30 - 12:30 JupyterLab Tutorial 2h0', Saul Shanabrook (Quansight)
12:30 - 14:00 Lunch and networking
14:00 - 14:30 Jupyter for processing neutron event data at ESS 30', Dr. Jonathan Taylor (European Spallation Source)
14:30 - 15:00 Jupyter at Brookhaven National Laboratory 30', Dr. Daniel B. Allan (Brookhaven National Lab)
15:00 - 15:20 Jupyter for Accelerator Physics 20', Dr. Jonathan Edelen (RadiaSoft LLC)
15:30 - 16:00 Coffee break
16:00 - 17:00 Questions and answers, discussion, Hans Fangohr (European XFEL GmbH)

Introduction: Jupyter notebook and ecosystem

Hans Fangohr
Data Analysis, European X-ray Free Electron Laser (EuXFEL), Germany
Professor of Computational Modelling, University of Southampton, United Kingdom
[email protected] @ProfCompMod

Jupyter Workshop, ICALEPCS 2019, New York, US, Saturday 5 October 2019
https://indico.desy.de/indico/event/23354

Outline

Quick introduction Jupyter Notebook

Introduce a number of use cases (most from European XFEL)

Introduce a number of tools from

Format (for the day)

Informal
► Ask questions to make it informal
► To guide the presenter what is useful

Slides will be made available on Workshop site (speakers please send to Hans Fangohr)

Jupyter Notebook

Document hosted in web browser (demo)

Combines
► text (markdown with LaTeX support)
► Computer code (Python)
► Output from code

Saved in one document
► *.ipynb [IPYthon NoteBook]
► Combines input and output cells
► JSON format

Demo notebook:

In [1]: # Import Python and libraries we need later
        %matplotlib inline
        from numpy import exp, cos, linspace
        import pylab
        from ipywidgets import interact

Mathematical model: We would like to understand $f(t, \alpha, \omega) = \exp(-\alpha t) \cos(\omega t)$

Code: Here is an implementation:

In [2]: def f(t, alpha, omega):
            """Computes and returns exp(-alpha*t) * cos(omega*t)"""
            return exp(-alpha * t) * cos(omega * t)

Interactive exploration: We can execute the function for values of $t$, $\alpha$ and $\omega$:

In [3]: f(t=0.1, alpha=1, omega=10)
Out[3]: 0.48888574340060287

Or produce a plot (in a function plot_f so it can be re-used for different parameters):

In [4]: def plot_f(alpha, omega):
            ts = linspace(0, 5, 500)   # 500 points in interval [0, 5]
            ys = f(ts, alpha, omega)
            pylab.plot(ts, ys, '-')

In [5]: plot_f(alpha=0.1, omega=10)    # call function and create plot

Using interaction widgets, we can re-trigger execution of the plot_f function via GUI elements such as sliders.

In [6]: interact(plot_f, alpha=(0, 2, 0.1), omega=(0, 50, 0.5));

[Sliders: alpha 1.50, omega 45.50]

Conclusion: We observe that parameter $\alpha$ is responsible for damping, and $\omega$ for the frequency.

[Figure: annotated notebook screenshot. Labels: Toolbar, Notebook name, Kernel name, Markdown cell, Currently selected cell, Code cell, Code output]
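The *.ipynb layout named on the Jupyter Notebook slide (input and output cells saved together as JSON) can be illustrated with Python's standard json module. This is a minimal hand-written sketch, assuming the nbformat version 4 layout; the cell contents are borrowed from the demo, and real notebooks carry more metadata than shown here.

```python
import json

# A hand-written, minimal notebook in nbformat version 4 layout.
# The structure is a sketch, not a complete description of the format.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["Code: Here is an implementation:"],
        },
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": ["f(t=0.1, alpha=1, omega=10)"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "metadata": {},
                    "data": {"text/plain": ["0.48888574340060287"]},
                }
            ],
        },
    ],
}

# Serialise to the text that would be stored in a *.ipynb file ...
text = json.dumps(notebook, indent=1)

# ... and read it back, listing each cell's type and first source line.
loaded = json.loads(text)
summary = [(cell["cell_type"], cell["source"][0]) for cell in loaded["cells"]]
for cell_type, first_line in summary:
    print(f"{cell_type:8s} | {first_line}")
```

Because input and stored output live in the same file, tools that read the JSON can compare, convert or re-execute a notebook without a browser.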

History of Jupyter

IPython → IPYthon NoteBook → *.ipynb

IPython notebooks were language agnostic, as they run over open network protocols

The community began adding other languages to the notebooks, starting with Julia and R

As the project expanded away from just Python, the notebooks had to be renamed

JUlia, PYThon, and R → Jupyter

“Jupyter” is also a homage to Galileo’s notebooks which recorded the discovery of Jupiter’s moons

How does it work?

Starting jupyter-notebook starts the notebook server

Opening a notebook starts up the kernel

The kernel communicates to the notebook server via ZMQ

The notebook server shows content via the browser

JavaScript used in Browser
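To make the exchange between notebook server and kernel more concrete, here is a sketch of what an execution request can look like before it is put on a ZMQ socket. It assumes the dictionary layout of the Jupyter messaging protocol (header, parent_header, metadata, content) and HMAC-SHA256 signing; the key and the helper names are hypothetical, and nothing is actually sent to a kernel.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session, username="user"):
    """Build the four dictionaries of an 'execute_request' message."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": username,
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",
    }
    content = {
        "code": code,
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    return {"header": header, "parent_header": {}, "metadata": {}, "content": content}


def sign(message, key):
    """HMAC-SHA256 over the serialised header, parent_header, metadata, content."""
    digest = hmac.new(key, digestmod=hashlib.sha256)
    for part in ("header", "parent_header", "metadata", "content"):
        digest.update(json.dumps(message[part]).encode("utf-8"))
    return digest.hexdigest()


# Hypothetical session id and key; a real key comes from the kernel's connection file.
msg = make_execute_request("f(t=0.1, alpha=1, omega=10)", session=uuid.uuid4().hex)
signature = sign(msg, key=b"hypothetical-connection-key")
print(json.dumps(msg["content"], indent=2))
```

In practice the jupyter_client library builds, signs and transports such messages, so user code rarely handles them directly.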

Supported Kernels

50+ languages supported.
More complete list at https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

Use case 1: data analysis in notebook

Explorative data analysis

Convenient combination of processing, results and interpretation

Complete capture of all computational steps: a good record for reproducibility and re-use
► FAIR data

Through export to HTML, easy to share with collaborators & supervisors

Scientists are confident drivers of this

Example on the right from SCS instrument

Sharing and exporting notebooks

Can share *.ipynb files
► Can be displayed using jupyter-notebook
► Code can be re-executed on collaborator's machine
► Only if software is available, and data is available, and collaborators know how to start notebook

"Static sharing" through html and email (for example)
► Often sufficient: does not require installation or additional skills
► Effective to communicate with supervisors, line managers etc
► Convert using menu "File -> Download as -> html"
► Or using nbconvert:

$ jupyter-nbconvert --to html 2-widgets.ipynb
[NbConvertApp] Converting notebook 2-widgets.ipynb to html
[NbConvertApp] Writing 382609 bytes to 2-widgets.html

Use case 2: notebooks as recipes

Pre-populate notebook with cells to carry out a particular type of data analysis
► Provide a directory full of such recipes to users
► Users execute cells during beamtime and later

Convenient compromise between
► Static recipe (=script)
► Interactive exploration

Experience
► Keep code in notebook cells short and move functionality into library (here "ToolBox")
► Archive directory of modified recipes with data

Use case 4: notebooks as a script

Use Jupyter Notebook as a script
► Can execute using nbconvert to take commands in notebook, execute them, save resulting notebook.
► Can create data files and plots in process.

Use case: detector calibration pipeline
► Use of nbparametrize (or papermill) to insert run parameters into notebook before execution
► Automatic execution and creation of pdf
► Error messages embedded in output
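The idea of inserting run parameters into a notebook before execution can be sketched with the standard library alone. The sketch below follows the convention used by papermill, where a cell tagged "parameters" holds defaults and a new cell with overriding values is placed after it; it is a simplified illustration, not the implementation of nbparametrize or papermill, and the function name, parameter names and values are hypothetical.

```python
import copy
import json


def inject_parameters(notebook, parameters):
    """Return a copy of `notebook` with a new code cell that assigns `parameters`.

    The new cell is placed right after the first cell tagged "parameters",
    or at the top of the notebook if no cell carries that tag.
    """
    nb = copy.deepcopy(notebook)
    source = [f"{name} = {value!r}\n" for name, value in parameters.items()]
    injected = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": source,
    }
    position = 0
    for index, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            position = index + 1
            break
    nb["cells"].insert(position, injected)
    return nb


# Hypothetical template notebook: a defaults cell followed by the analysis.
template = {
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [
        {"cell_type": "code", "execution_count": None,
         "metadata": {"tags": ["parameters"]}, "outputs": [],
         "source": ["run_number = 0\n", "detector = 'none'\n"]},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": ["print(run_number, detector)"]},
    ],
}

prepared = inject_parameters(template, {"run_number": 42, "detector": "detector_a"})
print(json.dumps(prepared["cells"][1], indent=1))
```

The prepared notebook could then be written to disk and executed from the command line, so that each run of a pipeline leaves behind a notebook recording both its parameters and its output.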

Executing a notebook from the command line (nbconvert)

Convert from ipynb to html:
$ jupyter-nbconvert --to html 2-widgets.ipynb

Can optionally set output name:
$ jupyter-nbconvert --to html --output myout.html 2-widgets.ipynb

Can execute notebook before conversion:
$ jupyter-nbconvert --execute --to html --output myout.html 2-widgets.ipynb

Execute notebook and save results as notebook:
$ jupyter-nbconvert --execute --to ipynb --output myout.ipynb 2-widgets.ipynb
[NbConvertApp] Converting notebook 2-widgets.ipynb to ipynb
[NbConvertApp] Executing notebook with kernel: python3
[NbConvertApp] Writing 103398 bytes to myout.ipynb

nbconvert

nbconvert is a tool which can convert notebooks to various other formats such as: HTML, LaTeX, PDF, Markdown, ReStructuredText, Python, …

Static HTML pages can be created as documentation or tutorial
► nbsphinx -> Jupyter notebook provides section in sphinx documentation

Homepage: https://github.com/jupyter/nbconvert

Notebook as a script approach
► Can change parameters inside notebook before execution: nbparametrize or papermill
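When nbconvert is driven from a pipeline rather than typed by hand, the command lines shown earlier can be assembled in Python. This is a small sketch using only the flags that appear on the slides (--to, --output, --execute); the helper name is hypothetical, and the conversion is only attempted if both the tool and the example notebook are present.

```python
import os
import shutil
import subprocess


def nbconvert_command(notebook, to="html", output=None, execute=False):
    """Assemble a jupyter-nbconvert command line as a list of arguments."""
    command = ["jupyter-nbconvert"]
    if execute:
        command.append("--execute")
    command += ["--to", to]
    if output is not None:
        command += ["--output", output]
    command.append(notebook)
    return command


command = nbconvert_command(
    "2-widgets.ipynb", to="ipynb", output="myout.ipynb", execute=True
)
print(" ".join(command))

# Only attempt the conversion if the tool and the notebook are available.
if shutil.which(command[0]) and os.path.exists("2-widgets.ipynb"):
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when notebook names contain spaces.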

Use case 5: (remote) data analysis environment (JupyterHub)

JupyterHub allows
► users to connect through browser and https
► use of existing authentication systems
► serving notebooks on facility hardware
► connecting to user's file storage

Example: JupyterHub at EuXFEL & DESY
► Uses Maxwell HPC cluster

Popular with users:
► no software installation & browser of choice
► works locally and remotely the same

Using remote resources: JupyterHub

In principle user can ssh to HPC resource, use port forwarding to connect local machine with HPC computer running notebook server

JupyterHub helps orchestrate and manage individual Jupyter instances for multiple users

It provides an interface allowing users to easily spawn and connect their own Jupyter Server

Different options for resource allocation
► Integrated with HPC scheduler
► . . .

JupyterHub – manage individual Jupyter instances for multiple users


Use case 6: blending GUI and script

JupyterWidgets provide graphical control elements in notebook
► Buttons, sliders etc trigger code execution and update of plot

Useful for
► Data analysis of fixed type
► Data exploration of data sets

Discussion
► Less powerful than, for example, Qt GUI
► Popular with users due to being embedded in notebook and needing no software installation (via JupyterHub)

Binder project

Given a (github) repository with
► Jupyter notebooks
► Software requirements (Dockerfile, requirements.txt, environment.yml)

Binder service
► builds a container with the required software
► starts Jupyter notebook server in that container offering the notebooks

Binder project provides free pilot at https://mybinder.org

Institutional Binder instances are being deployed

Example: https://github.com/fangohr/jupyter-demo
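A repository is usually handed to a Binder service through a launch URL. The sketch below builds such a URL for the example repository, assuming the /v2/gh/<owner>/<repo>/<ref> scheme used by mybinder.org and its optional filepath query parameter; the branch name "master", the notebook name and the helper function are assumptions made for illustration.

```python
from urllib.parse import quote


def binder_url(owner, repo, ref="master", filepath=None, host="https://mybinder.org"):
    """Build a launch URL for a GitHub repository on a Binder service."""
    url = f"{host}/v2/gh/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    if filepath is not None:
        # Ask the service to open one particular notebook after launch.
        url += "?filepath=" + quote(filepath, safe="")
    return url


# Assumed branch and notebook name for the example repository.
url = binder_url("fangohr", "jupyter-demo", filepath="2-widgets.ipynb")
print(url)
```

An institutional Binder instance would be addressed by passing its own base address as host.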


BinderHub – mybinder.org

Use case 7: documenting software library

Use notebook as chapter in documentation
► Supported by sphinx → html, pdf as usual

Documentation easy to create:
► enter commands in notebook
► output is produced automatically
► updating docs means re-running notebook

Can run regression test on documentation notebooks using NoteBook VALidate (nbval)
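One way to let Sphinx treat notebooks as documentation chapters is the nbsphinx extension mentioned on the nbconvert slide. The fragment below is a sketch of the relevant part of a Sphinx conf.py, assuming nbsphinx is installed; the chosen values are illustrative and other settings of a real project are omitted.

```python
# conf.py (fragment): Sphinx configuration, assuming nbsphinx is installed
extensions = [
    "nbsphinx",  # lets Sphinx read *.ipynb files as documentation sources
]

# Skip Jupyter's autosave copies when collecting source files.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]

# "auto" executes only notebooks that have no stored outputs,
# "always" re-runs every notebook on each build, "never" uses stored outputs.
nbsphinx_execute = "auto"
```

Notebooks listed in a toctree are then rendered like any other source file, and re-running them refreshes the outputs shown in the documentation.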


With Binder, can make documentation executable (for example DiscretisedField Tutorials)

nbval

py.test is a popular Python testing framework

nbval is a py.test plugin which lets py.test recognise and collect Jupyter notebooks

In each notebook, each cell is a test:
► The test passes if execution of the input creates the stored output
► The test fails otherwise
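The pass/fail rule can be illustrated without a kernel: execute each cell's source, capture what it prints, and compare that with the stored output. This is a deliberately simplified sketch of the idea, not nbval's implementation; it only looks at printed text, runs the code in the current Python process, and the helper names and example cells are hypothetical.

```python
import contextlib
import io


def run_cell(source, namespace):
    """Execute one cell's source and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


def check_cells(cells):
    """Yield (index, passed) for each cell, sharing one namespace like a kernel."""
    namespace = {}
    for index, (source, stored_output) in enumerate(cells):
        fresh_output = run_cell(source, namespace)
        yield index, fresh_output == stored_output


# Each entry is (cell source, output stored in the notebook).
cells = [
    ("x = 6 * 7", ""),
    ("print(x)", "42\n"),
    ("print(x + 1)", "42\n"),  # stored output is stale, so this cell fails
]

results = list(check_cells(cells))
for index, passed in results:
    print(f"Cell {index} {'PASSED' if passed else 'FAILED'}")
```

A stale stored output, as in the last cell, is exactly the kind of regression such a check is meant to surface in documentation notebooks.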

There is a variety of configuration parameters

Home page: https://github.com/computationalmodelling/nbval

nbval example

$ py.test --verbose --nbval 2-widgets.ipynb
====== test session starts ======
platform darwin -- Python 3.6.8, pytest-3.10.0, py-1.7.0, pluggy-0.8.0 -- /Users/fangohr/anaconda3/bin/python
rootdir: /Users/fangohr/Desktop/jupyter-demo, inifile:
plugins: remotedata-0.3.1, openfiles-0.3.0, doctestplus-0.1.3, arraydiff-0.2, nbval-0.9.1
collected 9 items

2-widgets::ipynb::Cell 0 PASSED [ 11%]
2-widgets::ipynb::Cell 1 PASSED [ 22%]
2-widgets::ipynb::Cell 2 PASSED [ 33%]
2-widgets::ipynb::Cell 3 PASSED [ 44%]
2-widgets::ipynb::Cell 4 PASSED [ 55%]
2-widgets::ipynb::Cell 5 PASSED [ 66%]
2-widgets::ipynb::Cell 6 PASSED [ 77%]
2-widgets::ipynb::Cell 7 PASSED [ 88%]
2-widgets::ipynb::Cell 8 PASSED [100%]

====== 9 passed, 1 warning in 2.11 seconds ======
(base) [20:49:09] fangohr:jupyter-demo git:(master*) $

Use case 8: reproducible publication

Create github repository to complement publication
► Create one notebook per figure / main result
► Define software environment in github repository using Binder syntax

Close to reproducible publication:
► fully specified software environment
► fully specified data analysis
► Data access → Andy Götz talk, PaNOSC

Zenodo for long term preservation
► Create Zenodo deposit for repository
► Cite Zenodo DOI in publication

Example: https://github.com/maxalbert/paper-supplement-nanoparticle-sensing

Jupyter Lab

Next generation Jupyter notebook interface

Window manager embedded in browser

“classic notebook“ in one window

Additional features in other windows
► File browser
► Extensions
► CSV viewer

[Figure: JupyterLab screenshots. Labels: Multiple tabs, Multiple panels, Explore files (e.g. CSV), Table of Contents extension, Dark theme!]

Jupyter Notebook vs. JupyterLab

JupyterLab is the newer interface to the notebooks

It provides a more flexible interface closer to a modern IDE
► Flexible layouts and panes
► More flexible console/text editor
► Drag/Drop/Expand/Collapse cells
► Themes
► Extensions

More on this later in the day

Nbdime – NoteBook DIff and MErge

Jupyter notebooks are rich media documents, stored as plain text JSON files

Basic diff and merge tools (such as that used by git) do not handle this format well
► Small changes to text/plots result in unreadable diffs

nbdime provides multiple tools to help with “content-aware” diffing and merging for Jupyter notebook files
► nbdiff: compare notebooks in a terminal-friendly way
► nbmerge: three-way merge of notebooks with automatic conflict resolution
► nbdiff-web: shows you a rich rendered diff of notebooks
► nbmerge-web: gives you a web-based three-way merge tool for notebooks
► nbshow: present a single notebook in a terminal-friendly way

Homepage: https://github.com/jupyter/nbdime
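What "content-aware" means can be sketched with the standard library: compare the source of corresponding cells and ignore execution counts and stored outputs, which are what make a plain-text diff of the JSON unreadable. This is an illustration of the idea only, not nbdime's algorithm; it assumes both notebooks have the same cells in the same order, and the helper names and example notebooks are hypothetical.

```python
import difflib


def source_lines(cell):
    """Return a cell's source as a list of lines (stored as str or list of str)."""
    source = cell["source"]
    text = source if isinstance(source, str) else "".join(source)
    return text.splitlines()


def diff_cell_sources(old_nb, new_nb):
    """Compare two notebooks cell by cell, looking at the source only."""
    changes = []
    for index, (old, new) in enumerate(zip(old_nb["cells"], new_nb["cells"])):
        delta = list(difflib.unified_diff(
            source_lines(old), source_lines(new),
            fromfile=f"cell {index} (old)", tofile=f"cell {index} (new)",
            lineterm="",
        ))
        if delta:
            changes.append((index, delta))
    return changes


def code_cell(source, count, output):
    """Hypothetical helper building a code cell with one stored output."""
    return {"cell_type": "code", "execution_count": count, "metadata": {},
            "source": [source], "outputs": [output]}


old_nb = {"cells": [
    code_cell("import pylab", 1, {"output_type": "stream", "text": []}),
    code_cell("plot_f(alpha=0.1, omega=10)", 5, {"output_type": "display_data", "data": "iVBOR...old"}),
]}
new_nb = {"cells": [
    code_cell("import pylab", 7, {"output_type": "stream", "text": []}),
    code_cell("plot_f(alpha=0.1, omega=20)", 12, {"output_type": "display_data", "data": "iVBOR...new"}),
]}

changes = diff_cell_sources(old_nb, new_nb)
for index, delta in changes:
    print("\n".join(delta))
```

Only the second cell is reported, although execution counts and the embedded plot data differ in both cells; nbdime additionally handles inserted, deleted and moved cells as well as changes in outputs.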

Summary – tools

Jupyter Notebook
Jupyter Lab – next generation Jupyter user interface
Jupyter Hub – serving notebook from compute facility for multiple users
NBDIME – DIffing and MErging tools
NBVAL – VALidation tool; use each cell as a test
NBCONVERT – conversion of notebooks to other formats & execution
NBParametrize and Papermill – inject parameters into notebook files
IPyWidgets – GUI like elements in notebook
Binder – Cloud hosted execution of notebooks from github repositories

There is much more.

Summary – use cases

Use cases
► Data analysis
► Provision of recipes
► Notebook-as-a-script
► Remote data analysis (JupyterHub)
► Mixing GUI and script-driven analysis
► Documentation of Software
► Reproducibility
► . . .

Summary

Jupyter notebook and ecosystem provides many options

Acknowledgements
► Authors and co-authors of ICALEPCS contribution TUCPR02 (Tuesday 14:30)
► Robert Rosca and Thomas Kluyver
► OpenDreamKit Horizon 2020, European Research Infrastructures project (#676541), http://opendreamkit.org
► PaNOSC: Photon and Neutron Open Science Cloud, European Union’s Horizon 2020 research and innovation programme under grant agreement No 654220
► The Gordon and Betty Moore Foundation through Grant GBMF #4856, by the Alfred P. Sloan Foundation and by the Helmsley Trust.
► EPSRC's Centre for Doctoral Training in Next Generation Computational Modelling, http://ngcm.soton.ac.uk (#EP/L015382/1)
