Data science for Physics and Astronomy
Lightning talks

Catarina Alves
Classification of transients

PhD @

Supervisors: ● Prof Hiranya Peiris ● Dr Jason McEwen

[email protected]

Supervised Learning
● Classification
● Unbalanced data
● Non representative data

Unsupervised Learning
● Real time classification
● Anomaly detection
● Large datasets

catarina-alves
Catarina-Alves

Adrian Bevan
Particle physics experimentalist with a keen interest in DS & AI

Parameter estimation:
● 2 decades of likelihood fitting experience spanning NA48, BaBar and ATLAS experiments; currently using ML on the ATLAS and MoEDAL experiments @ LHC.
● Latest relevant work: https://royalsocietypublishing.org/doi/10.1098/rsta.2019.0392

Machine learning:
● Broad interest in algorithms and their domains of validity: understanding what to use when, and why, in order to optimise the use of algorithms for a given problem domain. Have used SVMs, BDTs, MLPs, CNNs, Deep Networks, KNN for analysis work.

● Working on understanding why algorithms make a given inference or decision based on input feature space (explainability and interpretability).

● Want to explore unsupervised learning and Bayesian networks when an appropriate problem presents itself.

Have taught or currently teach data science and machine learning undergrad and grad courses.
[email protected]

David Berman

Richard Bielby

Nachiketa Chakraborty

Elena Cuoco

Guy Davies
[email protected] @CitationWarrior

Expertise:
Astrophysics - Sun, Stars, Planets, & Galaxies
Instrumentation - Resonant Scattering Spectrometers
Data - Statistics, Bayesian, Generative models

Problem: Over/under fitting and/or systematics

Got my eye on:
Massively Hierarchical Bayesian Models
Gaussian Processes for everything
TensorFlow Probability

Want to know more about:
Everything Hierarchical Neural Nets

Joe Davies

What do I hope to gain from today:
- Knowledge of how to understand ‘black box’ algorithms
- Understand the kinds of systematic uncertainties that can arise from using these types of models
- Any useful techniques for writing more efficient code

ann: artificial neural network
dt: decision tree
lr: logistic regression
rf: random forest
svm: support vector machine

Damien Di Mijolla

Caterina Doglioni
Senior Lecturer
[email protected] @CatDogLund

My expertise is: Dark matter, measurements and searches with hadronic jets, real-time analysis in ATLAS

A real-time problem I’m grappling with: Synergies between particle, astroparticle and nuclear physics in terms of common tools and physics questions

My research: I am a researcher at Lund University and a member of the ATLAS Collaboration at the LHC. I search for new physics phenomena that can be produced in proton-proton collisions, motivated by the presence of dark matter in our universe. I work on the DARKJETS ERC project together with a post-doctoral researcher and students, looking for the particles that mediate the interaction between known particles and Dark Matter particles. I am a PI in the INSIGHTS ITN on statistics and machine learning, and I am a member/WG convenor of the HEP Software Foundation.

I’ve got my eyes on: Ways to discover weird jets from dark sector particle cascades, axion-like particles

I want to know more about: Reproducible and understandable machine learning algorithms, real-time analysis in astrophysics

Conor Fitzpatrick
Particle Physicist, LHCb Experiment @CERN
UKRI Future Leaders Fellow, Uni. Manchester
[email protected] @fitzparticle

My expertise is: Real-time analysis with high throughput data

A problem I’m grappling with: How to optimise expenditure on hardware between buffer/processing in an evolving and correlated trigger & DAQ system

My research:
Matter-antimatter asymmetries at LHCb
Software Trigger and Real-Time Analysis
Optimisation problems for HEP experiments
Beginning involvement in the SKA radiotelescope

I’ve got my eyes on:
GPGPU for trigger & DAQ
Price/performance trends in processing

I want to know more about:
Training useful ML algorithms on finite and highly imbalanced data
Transients and how to find them
Anomaly detection for data quality monitoring

Sotiria Fotopoulou

Vice Chancellor Fellow

Astrophysics:
- Black-hole galaxy coevolution
- Large multiwavelength surveys
Machine-learning:
- Classification
- Outlier detection
- Unsupervised learning

Random Forest Application, Fotopoulou & Paltani 2018, arXiv:1808.04977v1
HDBSCAN Application, Logan & Fotopoulou 2019, arXiv:1911.05107


Sarah Gibson

My PhD…
● Studied Gamma-Ray Bursts with NASA’s Swift satellite
● Used Monte Carlo methods to optimise models to data

Currently…
● Research Data Scientist/Software Engineer at the Turing
● Help promote reproducible and shareable research with Project Binder: https://mybinder.org
● Also launched The Turing Way: A Handbook for Data Science: https://the-turing-way.netlify.com/

Email… ● [email protected]

Imogen Gingell


Peter Hatfield - Postdoc, Oxford Physics
[email protected]

Physics interests:
- Large galaxy surveys
- Laboratory astrophysics

Machine-learning projects:
- Machine learning for photometric redshifts
- ICF Experiment Design
- Classification of particle tracks in radiation detectors (project with the Institute for Research in Schools)

Jonathan Holdship
[email protected] jonathan-holdship jonholdship.github.io

Who I am
An STFC/DiRAC funded researcher at NHS Guy’s and St Thomas Trust

My current work
I am using classification algorithms to predict hospital attendance and adverse patient outcomes

My Astrophysics research
My usual research is focused on modelling and observing chemistry in star forming regions

Machine Learning
My immediate goal is to train a neural network to emulate my numerical chemical model so that it can be used in radiative-hydrodynamic codes at little computational cost

Omar Jahangir
WIMP to find Dark Matter

Who am I:
I am a 3rd year PhD Student at UCL, and am part of the Centre of Doctoral Training in Data Intensive Science.

How does it work?
We cool ~10T of Liquid Xenon to -170C, and place it ~1 mile in an unused Gold Mine. We then wait for a Weakly Interacting Massive Particle (WIMP) to recoil with the Xe.

My Research:
My research involves using Deep Learning to find Dark Matter. My experiment is based at the LZ Experiment in South Dakota, US.

What ML do I use?
Use a variety of Neural Networks to analyze data to find the unique WIMP signal within all our background data.

Currently doing an Internship at Babylon Health, working on Causal Discovery for Medical Diagnosis and Disease prediction

Supervisors:
Dr Chamkaur Ghag - HEP
Dr Ingo Waldmann - Astrophysics
Dr Tim Scanlon - HEP

Omar Jahangir [email protected]
Data Science Workshop - ATI 02/12/2019

Benjamin Joachimi

Key data science interests:
● Principled Bayesian inference
● Physical interpretability of machine learning
● Robustness of machine learning

The challenge: ~100PB raw data, 3.5x10^11 sources, ~20 model parameters, highly nonlinear

Andreas Korn

Particle Physicist at the Large Hadron Collider

● Interest in Particle Tracking
● Track vertexing to identify decay signatures
● Interest in new Neural Net Methodologies for situations with variable inputs (RNN, Graph Nets etc)
● Interest in multidisciplinary projects (Dark Matter)

Ofer Lahav
Cosmology with AI

● Perren Professor of Astronomy and Co-Director of the CDT in Data Intensive Science (UCL)
● Research in Observational Cosmology: Dark Matter and Dark Energy
● Galaxy surveys: DES, DESI, Euclid, LSST
● AI/ML applications: object classification, photometric redshifts, map reconstruction, gravitational dynamics, multi-messenger
● AI/ML topics: interpretability of deep learning, augmentation
● Recent and current PhD students in DIS-topics: Antonella Palmese, Davide Gualdi, Michael McLeod, John Soo, Lucy Clerkin, Niall Jeffrey, Krishna Naidoo, Ben Henghes, Constantina Nicolaou, Sunil Mucesh

Minimum Spanning Tree MiSTree (Naidoo et al. 1907.00989)
DeepMass applied to DES (Jeffrey et al. 1908.005543)

Sam Lawrence
PhD Student
Institute of Cosmology & Gravitation, Portsmouth
[email protected]

Interests:
● Relativistic effects in large scale structure
● Computational cosmology (second order Boltzmann codes)
● Neural networks
● Blob detection

Michaela Lawrence
MichaelaGLawrence MichaelaLawrence MichaelaLawrenceONS
M.G.Lawrence at sussex.ac.uk

I work on:
● Testing alternative theories of gravity using gravitational waves
● Challenging the cosmological constant problem. Quantum field theory
● Markov chain Monte Carlo methods for testing theories of gravity

I also work on:
● Creating synthetic versions of the annual business survey using SMOTE, GANs and VAEs.
● Analysis on very large datasets: k-means clustering for migration datasets.
● Natural language processing: Sentiment analysis, topic modeling, and summarisation.

Christopher Lovell
Postdoc at the University of Hertfordshire
Recently finished my PhD at the University of Sussex
www.christopherlovell.co.uk github.com/christopherlovell twitter.com/chrisclovell

● Cosmological simulations of galaxy evolution
○ SIMBA, EAGLE, Illustris...
● ‘Forward modelling’ Epoch of Reionisation observables using radiative transfer techniques and self-consistent dust models

● Combining simulations with machine learning techniques
○ Learning the relationship between galaxy spectra and their histories using convolutional neural networks and cosmological simulations, arXiv:1903.10457

● Sengi, a small, fast, interactive viewer for SPS spectra (out today!)
○ christopherlovell.github.io/sengi
○ arXiv:1911.12713

● Interested in REPRODUCIBILITY! (shout out to The Turing Way)

Gleb Lukicov
I am a postgraduate researcher in experimental particle physics at UCL
The title of my thesis is “Measurement of the anomalous magnetic moment of the muon to 140 ppb using the Fermilab muon g-2 experiment”

Expertise: Alignment (arXiv:1909.12900), tracking, DAQ, grid-based data production

I want to learn about:
How to more effectively apply ML in my research
Best practices, and modern tools

http://www.hep.ucl.ac.uk/~lukicov/ [email protected]
https://github.com/glukicov

William McCorkindale
PhD student, Cavendish Laboratory, Cambridge
Supervisor: Dr Alpha Lee
[email protected]

Working on predicting:
- Molecular Properties from Molecular Structure (GPs, VAEs)
- Chemical Reaction Products from Reactants/Reagents (Transformer)

ML challenges:
- Dealing with non-uniform data
- Choosing appropriate feature representation
- Quantifying model uncertainty

In silico Design of Electrolytes, Superconductors, Drugs ...

Jason McEwen

Stephen Menary: experimental particle physics (and beyond…)


Nikolaos Nikolaou

Davide Piras
PhD student in Data Intensive Science
Supervisor: Dr Benjamin Joachimi

Using machine learning to generate virtual universes

● Generative methods
● N-body simulations
● Likelihood-free inference

[email protected]

davide-piras

Andrew Patterson
● PhD Student, UCL CDT in Developing Quantum Technologies
● Working on quantum algorithms for near-term quantum devices
● Using hybrid algorithms, where a classical minimiser is trained on the output of a quantum device.
● Applications in simulation and optimisation
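
The hybrid loop described in the bullets above can be sketched in a few lines. This is only an illustration under stated assumptions, not the speaker's method: the "quantum device" is replaced by a hypothetical classical stand-in, device_expectation, returning the expectation value <Z> = cos(theta) of one qubit after an RY(theta) rotation, and the classical minimiser is plain gradient descent using the parameter-shift rule. All function names and the learning-rate and step-count values are invented for the example.

```python
import math


def device_expectation(theta):
    """Hypothetical stand-in for a quantum device.

    Returns <Z> for a single qubit prepared as RY(theta)|0>, which is cos(theta).
    On real hardware this number would be estimated from repeated measurements.
    """
    return math.cos(theta)


def parameter_shift_gradient(f, theta):
    """Gradient of f at theta from two extra device evaluations (parameter-shift rule)."""
    return 0.5 * (f(theta + math.pi / 2) - f(theta - math.pi / 2))


def minimise(f, theta0, learning_rate=0.4, steps=100):
    """Classical outer loop: gradient descent on the device output."""
    theta = theta0
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(f, theta)
    return theta, f(theta)


# Start away from the optimum and let the classical loop tune the device parameter.
theta_opt, energy = minimise(device_expectation, theta0=0.3)
print(f"theta = {theta_opt:.4f}, <Z> = {energy:.6f}")
```

In this toy setting the loop drives theta towards pi, where <Z> reaches its minimum of -1. A real near-term device would return noisy estimates, so the minimiser would typically need to tolerate shot noise.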

Quantum State Discrimination using a Noisy Device:
DMFT algorithm and experiment:

Alkistis Pourtsidou
[email protected]

UKRI Future Leaders Fellow @QMUL
Large Scale Structure group working on:
● Cosmology with Radio and Optical galaxy surveys
● Surveys: MeerKAT, SKA, Euclid
● Simulations, nonlinear power spectrum modelling, Bayesian parameter estimation, foreground removal, Epoch of Reionization

Neutral Hydrogen Intensity Mapping (MeerKAT, SKA): simulations, foreground removal, parameter estimation, pathfinder data analysis

Nonlinear modelling for galaxy clustering (Euclid): Figure of Merit vs Figure of Bias, model complexity, nuisance parameters problem

Key data science interests:
● Fast and reliable parameter estimation
● ML for foreground removal, RFI removal, identification of nasty radio systematics,...
● Hybrid approaches: ML + “traditional” statistics
● Emulators

Darren Price
Experimental particle physicist at
STFC Ernest Rutherford Fellow | Turing Fellow
Director of CDT in Data Intensive Science (Manchester–Sheffield–Lancaster)
Research primarily on the Large Hadron Collider at CERN and direct dark matter detection experiments

● Real-time data analysis and data acquisition algorithms
Selecting rare, unexpected but physically-viable data signatures from large data streams in real time.
● Exploratory large scale data analysis w/ significant class imbalance
How do we design data analyses probing PB-scale datasets to find rare (~10-20) new phenomena when we don’t (necessarily) know what we are looking for?
● Simulation of particle production / interactions
Monte Carlo modelling techniques and deep learning for high fidelity and efficient detector and particle interaction simulation.
● Real time detector calibration
Exploiting deep learning techniques for calibration of detectors with in situ data.
● Reinterpretability/longevity of data mined with ML
How do we best ensure the outputs of ML-mined data are reinterpretable, analysable, combinable with other data long-term?

@darrenprice http://cern.ch/dprice [email protected]


Anna Scaife

I am a radio astronomer working at Jodrell Bank Centre for Astrophysics and I’m funded by the Alan Turing Institute as an AI Fellow.

I work on #AI4Astro mainly around image analysis using CNNs, particularly looking at transfer learning, bias, uncertainty quantification and preserving discovery in automated analysis of very large datasets.

Heather Russell
Postdoctoral researcher @ McGill University
I am an experimental particle physicist working on the ATLAS experiment at CERN
✉ [email protected] cern.ch/hrussell @heather_russe

My current work
Measuring very rare standard model processes
Searching for beyond the standard model particles with unconventional signatures

My (relevant) interests
Brainstorming new ways to identify striking, non-standard model signatures

What I hope to learn
What are the creative ways that others are using advanced techniques to isolate rare events
Creative ways of visualizing datasets to enhance our ability to communicate results

Sohan Seth
AI for Science

Model diagnostic, criticism, interpretation, visualization

Difu Shi (Steve)
Innovation Fellow
ICC, Durham University
[email protected]

My Expertise:

Studying the nature of dark energy through the large-scale structure of the Universe and galaxy formation, using numerical simulations.

My current role:

Data intensive science translation fellow. To be the bridge between academia and industry, and transfer knowledge and skills in both directions.

Ongoing projects:

Collaboration with northeast local industrial partners: Kromek and Atom bank. Take part in their commercial projects.

key words: machine learning, model sensitivity analysis, big data statistics


emma-slade1994 [email protected]

Dr. Maritza Soto
Postdoc at Queen Mary University of London

Interests:
● Automatic determination of stellar parameters
● Exoplanet detection

Exoplanet K2-237b, detected using radial velocity data from the HARPS and CORALIE spectrographs, and transit data from the K2 mission. The planet was found to be more massive than Jupiter and highly inflated

We need to understand the stars before we can deduce anything about the planets that might orbit them

Parameters determined for a sample of 130 giant stars with and without planets

Tom Stevenson
PDRA @ University of Sussex
Working on ATLAS experiment

● Interests:
○ ML for reconstruction and unfolding for top physics
○ Uncertainties on ML methods
○ GPU and accelerator card based learning
● Previously worked on:
○ Search for di-higgs production at the LHC using BDTs, SVMs and committee methods
● Spare-time ML hobbies:
○ Asynchronous actor-critic agents playing computer games
○ Playing with RNN based text generation with papers from the arXiv

Jeyan Thiyagalingam

Roberto Trotta
Imperial College London
[email protected] @R_Trotta

Questions/themes for this workshop:

● Explainable AI/ML
● Model-based vs Data-based modelling
● Incorporating physical symmetries in DL

Katie Tucker

Serena Viti (Dept of Physics and Astronomy)
[email protected]

Field of research: Astrochemical modelling and observations of star forming regions and galaxies

Examples of Data Science problems I am dealing with:

Inferring physics from chemistry: we need to maximize the number of chemical models we can run, for the accuracy and validity of statistical inferences. This means performing simulations over a very large parameter space, generating a combinatorial explosion of model runs and large, high-dimensional data sets.

Optimization and derivation of chemical networks for molecules in space: need to simultaneously investigate the paths and efficiencies of formation and destruction of chemical reactions over a large physical and parameter space: Development of Statistical and Machine Learning technologies for Reactions Databases

Iacopo Vivarelli

● You find me at http://www.sussex.ac.uk/profiles/324419
● Work on one of the Large Hadron Collider experiments (ATLAS)
● Main activity: look for specific deviations from the predictions of the baseline particle physics model.

● Interested in the use of machine learning for detector simulation, model inter- and extrapolation, etc.

Catherine Watkinson
Constraining the properties of the first stars and galaxies using 21cm data (with e.g. SKA, HERA)

Challenging observation as dealing with e.g.:
● Foregrounds orders of magnitude greater than the signal
● Interferometers whose point-spread function changes with observational frequency
● Time and direction-dependent ionospheric effects
How do we then robustly perform parameter estimation & model selection?
To date I have focused on how we can use statistics like the bispectrum to exploit non-Gaussianity in the data.

Jeremy Yates

Miha Zgubic

Jeyan Thiyagalingam
Scientific Machine Learning Group - Rutherford Appleton Laboratory

We are interested in applying machine learning and signal processing techniques to advance science - including astronomy & particle physics

Ongoing projects:

● Benchmarking of photometric estimation techniques
● Tracking and estimation of post-collision trajectories (TrackML)

What we are hoping for:
● To build collaborations with the community on scientific problems where ML or signal processing techniques can play a role
● Build a suite of large-scale, scientific benchmarks covering different problems in astronomy / particle physics

Contact:
● Jeyan Thiyagalingam : [email protected]

Sarah Jaffa - University of Hertfordshire
Postdoctoral Research Fellow
[email protected] Sjaffa.github.io

What I do: -Simulations of massive clusters of stars (Smooth Particle Hydrodynamics)

-Interested in classification of chaotic systems; image segmentation; quantifying chaotic, noisy, badly defined structures.

What I want:
-Better (formal) training in Data Science/ML for mathematically literate graduates.
-Comp Sci review of Astro ML papers?
-More resources (storage, computing time, etc.)


Bona
