<<

Artificial proto-modelling with simplified-model results from the LHC (or: statistically learning dispersed signals at the LHC)

Wolfgang Waltenberger, Andre Lessa, Sabine Kraml https://smodels.github.io/protomodels/ arXiv:2012.12246 (JHEP 2021, 207)

Moriond QCD • March 27 - April 3, 2021 • @home Motivation We all believe some BSM is out there (LHC, ILC, CLIC, …)

We want a global view of what all the experimental data tell us about new . Illustration from Jon Butterworth, A Map of the Invisible Illustration from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 2 Motivation We all believe some BSM is out there (LHC, ILC, CLIC, …)

We want a global view of what all the experimental data tell us about new physics. of new physics; nonetheless there The LHC currently has no clear sign * hiding in the slew of data may be dispersed signals * effects of new particles which are spread out over several search regions or final states Illustration from Jon Butterworth, A Map of the Invisible Illustration from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 2 Motivation We all believe some BSM is out there (LHC, ILC, CLIC, …)

We want a global view of what all the experimental data tell us about new physics. of new physics; nonetheless there The LHC currently has no clear sign * hiding in the slew of data may be dispersed signals * effects of new particles which are spread out over several search regions or final states which missed by channel-by-channel analyses Dispersed signals can be each look at only part of the data Illustration from Jon Butterworth, A Map of the Invisible Illustration from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 2 Motivation We all believe some BSM is out there (LHC, ILC, CLIC, …)

We want a global view of what all the experimental data tell us about new physics. of new physics; nonetheless there The LHC currently has no clear sign * hiding in the slew of data may be dispersed signals * effects of new particles which are spread out over several search regions or final states which missed by channel-by-channel analyses Dispersed signals can be each look at only part of the data global exploration of the Change of perspective in the quest for BSM; LHC data to complement the analyses in individual final states Illustration from Jon Butterworth, A Map of the Invisible Illustration from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 2 Motivation We all believe some BSM is out there (LHC, ILC, CLIC, …)

We want a global view of what all the experimental data tell us about new physics. of new physics; nonetheless there The LHC currently has no clear sign * hiding in the slew of data may be dispersed signals * effects of new particles which are spread out over several search regions or final states which missed by channel-by-channel analyses Dispersed signals can be each look at only part of the data global exploration of the Change of perspective in the quest for BSM; LHC data to complement the analyses in individual final states

Even in the event of a discovery, given the (like plethora for the of Higgs) possible may BSM not incarnations, classical hypothesis testing be enough to determine the underlying theory. Illustration from Jon Butterworth, A Map of the Invisible Illustration from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 2 Top-down or bottom-up

Construct a BSM Lagrangian Infer the BSM Lagrangian

Compute masses, couplings, cross Fit (learn) particle content, masses, sections, branching ratios cross sections, branching ratios

Compare with data Take the data from Hitoshi Murayama from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 3 Top-down or bottom-up We want to try this

Construct a BSM Lagrangian Infer the BSM Lagrangian

Compute masses, couplings, cross Fit (learn) particle content, masses, sections, branching ratios cross sections, branching ratios

Compare with data Take the data from Hitoshi Murayama from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 3 Top-down or bottom-up We want to try this

Construct a BSM Lagrangian Infer the BSM Lagrangian

Precursor theories

Compute masses, couplings, cross Fit (learn) particle content, masses, sections, branching ratios cross sections, branching ratios

Compare with data Take the data from Hitoshi Murayama from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 3 Our approach https://smodels.github.io/protomodels/

• Statistical learning algorithm to - identify potential dispersed signals in the LHC data

- fit “proto-models” (new particles, decay modes, signal strengths) to them while remaining compatible with the entirety of LHC results

• Based on simplified model results → exploit SModelS functionality and database

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 4 In practice

MCMC-like algorithm (dubbed the walker) to comb through proto-model parameter space in order to identify the models that best fit the data.

Composed of three blocks, or machines which interact with each other:

- Starting with the , the builder creates proto-models, randomly adding or removing new particles and changing any of the proto-model parameters.

- The proto-model is then passed on to the critic, which checks the model against the database of simplified-model results to determine an upper bound on an overall signal strength (μmax).

- The combiner identifies all possible combinations of results and constructs a combined likelihood for each subset.

Builder Combiner Critic

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 5 In practice

MCMC-like algorithm (dubbed the walker) to comb through proto-model parameter space in order to identify the models that best fit the data.

Composed of three blocks, or machines which interact with each other:

- Starting with the Standard Model, the builder creates proto-models, randomly adding or removing new particles and changing any of the proto-model parameters.

The proto-model is then passed on to the critic, which checks the - Proto-models are defined by their: model against the database of simplified-model results to determine an upper bound on an overall signal strength (μmax). - Particle content* - Masses - The combiner identifies all possible combinations of results and constructs a combined likelihood for each subset. - Decay modes - Signal strengths

NB this gives a parameter space of varying dimensionality !

Builder Combiner Critic * BSM particles are assumed odd under a Z2-type symmetry, so they are pair produced and cascade decay to the lightest state S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 5 In practice

MCMC-like algorithm (dubbed the walker) to comb through proto-model parameter space in order to identify the models that best fit the data.

Composed of three blocks, or machines which interact with each other: XW - Starting with the Standard Model, the builder creates proto-models, randomly adding or removing new particles and changing any of the proto-model parameters.

The proto-model is then passed on to the critic, which checks the - Proto-models are defined by their: model against the database of simplified-model results to determine an upper bound on an overall signal strength (μmax). - Particle content* - Masses - The combiner identifies all possible combinations of results and constructs a combined likelihood for each subset. - Decay modes - Signal strengths

NB this gives a parameter space of varying dimensionality !

Builder Combiner Critic * BSM particles are assumed odd under a Z2-type symmetry, so they are pair produced and cascade decay to the lightest state S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 5 In practice

MCMC-like algorithm (dubbed the walker) to comb through proto-model parameter space in order to identify the models that best fit the data.

Composed of three blocks, or machines which interact with each other:

- Starting with the Standard Model, the builder creates proto-models, randomly adding or removing new particles and changing any of the proto-model parameters.

The proto-model is then passed on to the critic, which checks the - Proto-models are defined by their: model against the database of simplified-model results to determine an upper bound on an overall signal strength (μmax). - Particle content* - Masses - The combiner identifies all possible combinations of results and constructs a combined likelihood for each subset. - Decay modes - Signal strengths

NB this gives a parameter space of varying dimensionality !

Builder Combiner Critic * BSM particles are assumed odd under a Z2-type symmetry, so they are pair produced and cascade decay to the lightest state S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 5 In practice

MCMC-like algorithm (dubbed the walker) to comb through proto-model parameter space in order to identify the models that best fit the data.

Composed of three blocks, or machines which interact with each other: XW - Starting with the Standard Model, the builder creates proto-models, randomly adding or removing new particles and changing any of the proto-model parameters.

The proto-model is then passed on to the critic, which checks the - Proto-models are defined by their: model against the database of simplified-model results to determine an upper bound on an overall signal strength (μmax). - Particle content* - Masses - The combiner identifies all possible combinations of results and constructs a combined likelihood for each subset. - Decay modes - Signal strengths

NB this gives a parameter space of varying dimensionality !

Builder Combiner Critic * BSM particles are assumed odd under a Z2-type symmetry, so they are pair produced and cascade decay to the lightest state S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 5 In practice

MCMC-like algorithm (dubbed the walker) to comb through proto-model parameter space in order to identify the models that best fit the data.

Composed of three blocks, or machines which interact with each other: XW - Starting with the Standard Model, the builder creates proto-models, randomly adding or removing new particles and changing any of the proto-model parameters.

- The proto-model is then passed on to the critic, which checks the model against the database of simplified-model results to determine an upper bound on an overall signal strength (μmax).

- The combiner identifies all possible combinations of results and constructs a combined likelihood for each subset.

Overall signal strength μ^ chosen to maximise the combined likelihood ^ under the condition μ < μmax .

Builder Combiner Critic

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 6 Simplified-model results from SModelS smodels.github.io

SModelS is a public tool for interpreting simplified-model results from the LHC

• Based on a general procedure to decompose BSM collider signatures featuring a Z2-like symmetry into simplified-model topologies.

• Large database of simplified-model results (currently 40 ATLAS & 46 CMS searches).

• Very fast b/c no need for MC simulation.

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 7 Simplified-model results from SModelS smodels.github.io

SModelS is a public tool for interpreting simplified-model results from the LHC

• Based on a general procedure to decompose BSM collider signatures featuring a Z2-like symmetry into simplified-model topologies.

• Large database of simplified-model results (currently 40 ATLAS & 46 CMS searches).

• Very fast b/c no need for MC simulation.

We are exploiting this part of SModelS

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 7 Simplified-model results from SModelS smodels.github.io

Two types of experimental results: upper limit maps and efficiency (A×ε) maps

Efficiency map results allow us to sum different contributions to the same signal region, and to compute a likelihood.

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 8

AAACPXicbVBNSyNBFOxx/dqsunE97uVhWEgQwoy46EWQNYc9uKBoVMiE0NN5kzR2zwzdb4Qw5o/txf/gzZuXPSiyV692PmDXj4IHRdV7dFdFmZKWfP/Wm/kwOze/sPix9GlpeeVzefXLqU1zI7ApUpWa84hbVDLBJklSeJ4Z5DpSeBZd7I/8s0s0VqbJCQ0ybGveS2QsBScndconoebUN7o4GHaK0Gj4cfxrWA11ftWowS7APztUGFO1ceU82IDITUh9JA6hkb0+1SCrToQadMoVv+6PAW9JMCUVNsVhp3wTdlORa0xIKG5tK/AzahfckBQKh6Uwt5hxccF72HI04RptuxinH8I3p3QhTo2bhGCs/n9RcG3tQEducxTGvvZG4nteK6d4p13IJMsJEzF5KM4VUAqjKqErDQpSA0e4MNL9FUSfGy7IFV5yJQSvI78lp5v1YKv+/WirsteY1rHIvrJ1VmUB22Z77Cc7ZE0m2G92x+7Zg3ft/fEevb+T1RlverPGXsB7egZhDazc AAACjHicbVHbbhMxEPUutzZASeGRF0OElAg17EZBLUKVKqgQDyAVQdpKcRJ5ndnEqr27sr2IyPhr+CPe+Bu8l4fQMtJIZ86c0dhnkkJwbaLoTxDeun3n7r2d3c79Bw/3HnX3H5/rvFQMJiwXubpMqAbBM5gYbgRcFgqoTARcJFfvq/7Fd1Ca59k3sylgJukq4yln1Hhq0f1FJDVrJe0nt7BESfzu62fXJ7L8eTrAx5ikijJb1Ri/xIlPYtZg6GBus0afJ9o5DHN7sKXqtCrntmXPHK4x/CgcJgJS08cHzYZGPx85OyJLEDXERPHV2gzwotuLhlEd+CaIW9BDbZwtur/JMmelhMwwQbWexlFhZpYqw5kA1yGlhoKyK7qCqYcZlaBntjbT4ReeWeI0Vz4zg2t2e8JSqfVGJl5ZWaev9yryf71padKjmeVZURrIWLMoLQU2Oa4ug5dcATNi4wFlivu3Yram3h3j79fxJsTXv3wTnI+G8Xj4+su4d3La2rGDnqLnqI9idIhO0Ed0hiaIBbvBq+AoeBPuhePwbXjcSMOgnXmC/onww1/kk8LB AAACB3icbZDLSsNAFIZP6q3WW9SlIINFEISSlIpuhIIuXFawF2hjmEym7dDJhZmJUEJ3bnwVNy4UcesruPNtnLZRtPWHgY//nMOZ83sxZ1JZ1qeRW1hcWl7JrxbW1jc2t8ztnYaMEkFonUQ8Ei0PS8pZSOuKKU5bsaA48DhteoOLcb15R4VkUXijhjF1AtwLWZcRrLTlmvsdn3KFb8vn3+DK4x/0XLNolayJ0DzYGRQhU801Pzp+RJKAhopwLGXbtmLlpFgoRjgdFTqJpDEmA9yjbY0hDqh00skdI3SoHR91I6FfqNDE/T2R4kDKYeDpzgCrvpytjc3/au1Edc+clIVxomhIpou6CUcqQuNQkM8EJYoPNWAimP4rIn0sMFE6uoIOwZ49eR4a5ZJdKZ1cV4rVyyyOPOzBARyBDadQhSuoQR0I3MMjPMOL8WA8Ga/G27Q1Z2Qzu/BHxvsX9USYvg== Likelihoods for individual analyses

Key to the statistical learning procedure is the construction of a likelihood for the hypothesised signal (the proto-model)

μ … signal LBSM(µ D)=L(D µ + b + ✓) p(✓) b … background | | p(�) … pdf of nuisances

• For efficiency-map (EM) results, we can compute a simplified likelihood, assuming p(θ) to follow a Gaussian 2 distribution centred around zero and with a variance of δ :

n (µ+b+✓) 2 2 2 2 (µ + b + ✓) obs e ✓ = s + b signal+background LBSM(µ D)= exp 2 | n ! 2 uncertainties obs ✓ ◆

- CMS sometimes provides a covariance matrix, which allows the combination of signal regions. - ATLAS has started to provide full likelihoods (JSON format), making the above approximation unnecessary.

• For upper limit (UL) maps, if observed+expected ULs are available: LBSM ~ truncated Gaussian (nb: crude approx!!) ⚠ If only the observed UL is available → constraint in the form of a step function at the observed 95% CL limit. Not useful for constructing LBSM per se; only used to determine the maximal allowed signal strength μmax (in the critic).

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 9 AAACFnicbVDLSsNAFJ34rPUVdelmsAguakmkohuhoAsXLirYBzQhTCaTduhkEmYmQgn9Cjf+ihsXirgVd/6NkzaL2npg4HDOucy9x08Ylcqyfoyl5ZXVtfXSRnlza3tn19zbb8s4FZi0cMxi0fWRJIxy0lJUMdJNBEGRz0jHH17nfueRCElj/qBGCXEj1Oc0pBgpLXnmqRMhNRBRdjf2MLyCTiLiwMuoU3Uod6p4DGcC1DMrVs2aAC4SuyAVUKDpmd9OEOM0IlxhhqTs2Vai3AwJRTEj47KTSpIgPER90tOUo4hIN5ucNYbHWglgGAv9uIITdXYiQ5GUo8jXyXxHOe/l4n9eL1XhpZtRnqSKcDz9KEwZVDHMO4IBFQQrNtIEYUH1rhAPkEBY6SbLugR7/uRF0j6r2fXa+X290rgp6iiBQ3AEToANLkAD3IImaAEMnsALeAPvxrPxanwYn9PoklHMHIA/ML5+AS2Kn2c= Building a global likelihood (combining analyses)

• Global likelihood = product of likelihoods of approximately uncorrelated analyses: Combiner - from different LHC runs (8 or 13 TeV) and/or - from different experiments (ATLAS or CMS) and/or

- for clearly different signals (e.g., purely hadronic or leptonic)

• A combination “c” of analyses is “legal” if all results are mutually uncorrelated (= combinable) • If a result can be added, it has to be added (any subset of a legal combination is not itself legal)

Approx. uncorrelated Lc = Li i c Correlated Y2 Can’t construct a likelihood

NB: easy to combine with LC results as well

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 10

AAACoXicbVHbbhMxEPUut3a5BXhEQiMipPQl2q2KQEhIFe0DKBSFQtpKcRp5vd7Eqi+L7QUia/+L7+CNv8Gb7gOkHcnS8Tkz4/GZvBLcujT9E8U3bt66fWdrO7l77/6Dh71Hj0+srg1lE6qFNmc5sUxwxSaOO8HOKsOIzAU7zS8OWv30OzOWa/XVrSo2k2SheMkpcYGa936Nzj1t4M1b2AUsFODSEOqTbSyJWxrpPzatPvfYSHj35agZ4CVxHsu62QFMC+0AV3zQqTuNv64ulG3ktqmQ4G81KfAxXywdMUb/WN+TUTtMaPMTCy65s6FHqQ0RAihgruCggdE5hXmvnw7TdcBVkHWgj7oYz3u/caFpLZlyVBBrp1lauZknxnEqWJPg2rKK0AuyYNMAFZHMzvza4QZeBKaAMEc4ysGa/bfCE2ntSuYhszXAbmoteZ02rV35eua5qmrHFL18qKwFOA3tuqDghlEnVgEQaniYFeiShB25sNQkmJBtfvkqONkdZnvDl5/3+vuHnR1b6Cl6jgYoQ6/QPnqPxmiCaPQsOoyOok9xP/4Qj+Pjy9Q46mqeoP8inv4FOcrL9w== AAACFXicbVDLSgMxFM34rPU16tJNsAgVpcyUii6LdiGIUNE+oFNLJk3b0EwmJBmxDP0JN/6KGxeKuBXc+Tem7Sy09cCFwzn3cu89vmBUacf5tubmFxaXllMr6dW19Y1Ne2u7qsJIYlLBIQtl3UeKMMpJRVPNSF1IggKfkZrfPx/5tXsiFQ35rR4I0gxQl9MOxUgbqWUfXUIPCSHDB+iVCNPIwz16l4eHMA89xqEnaNaTATy7uTqALTvj5Jwx4CxxE5IBCcot+8trhzgKCNeYIaUariN0M0ZSU8zIMO1FigiE+6hLGoZyFBDVjMdfDeG+UdqwE0pTXMOx+nsiRoFSg8A3nQHSPTXtjcT/vEakO6fNmHIRacLxZFEnYlCHcBQRbFNJsGYDQxCW1NwKcQ9JhLUJMm1CcKdfniXVfM4t5I6vC5liKYkjBXbBHsgCF5yAIrgAZVABGDyCZ/AK3qwn68V6tz4mrXNWMrMD/sD6/AEouJxK AAACmnicbVFdb9MwFHXC1yhfHTzwAA9XVEidGFUyNsHLpKkgPoSQBqzbpDqKHMdprdmxZzsTVZQfxV/hjX+D0wYJul3J8tE599r3nptpwa2Lot9BeO36jZu3Nm737ty9d/9Bf/PhsVWVoWxClVDmNCOWCV6yieNOsFNtGJGZYCfZ2dtWP7lgxnJVHrmFZokks5IXnBLnqbT/E2s+xEbC+PuXLdgHzH5oLFjhpi9heQ9xYQityxRL4uZG1poYx6lgtmlqksYNvID1lPG3lbhzlaiNykdS5d0Drxps+GzutmB1J4C3t/F5RXLo/e2tbQ2z84pfQAxpfxCNomXAZRB3YIC6OEz7v3CuaCVZ6agg1k7jSLuk7sZoeriyTBN6RmZs6mFJJLNJvbS2geeeyaFQxp/SwZL9t6Im0tqFzHxmO59d11ryKm1aueJNUvNSV46VdPVRUQlwCto9Qc4No04sPCDUcN8r0DnxVjq/zZ43IV4f+TI43hnFu6O9r7uDg3edHRvoCXqGhihGr9EB+ogO0QTR4HGwH7wPPoRPw3H4Kfy8Sg2DruYR+i/Coz/yacrR Test statistic

Finally, we need a test statistic that increases for models which better satisfy all the constraints (which includes better fitting potential dispersed signals). Define:

Lc (ˆµ) ⇡(BSM) Kc := 2 ln BSM · K := max Kc Lc ⇡(SM) ) c C SM · 8 2 � is a prior, which punishes the proto-model for unneeded complexity (principle of parsimony)

n n n ⇡(BSM) = exp particles + BRs + prod.modes , ⇡(SM) 1 a a a ⌘  ✓ 1 2 3 ◆

where we take a1 = 2, a2 = 4, a3 = 8.

The test statistic thus roughly corresponds to a ∆χ2 of the proto-model with respect to the SM, with a penalty for the new degrees of freedom:

K 2 +2ln⇡(BSM) NB: K>0 (K<0) can be interpreted as preference ⇡ of the proto-model over the SM (and vice-versa)

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 11 AAACFXicbVDLSgNBEJz1bXxFPXoZDIKCCbui6FHQg5BLBBOF7LLMTnqTwdkHM71iWPYnvPgrXjwo4lXw5t84eRzUWNBQVHXT3RWkUmi07S9ranpmdm5+YbG0tLyyulZe32jpJFMcmjyRiboJmAYpYmiiQAk3qQIWBRKug9uzgX99B0qLJL7CfgpexLqxCAVnaCS/vO/CfepKCLFN3VAxnjtFflDs1n1Bq7Tu56LqFHuuEt0een65YtfsIegkccakQsZo+OVPt5PwLIIYuWRatx07RS9nCgWXUJTcTEPK+C3rQtvQmEWgvXz4VUF3jNKhYaJMxUiH6s+JnEVa96PAdEYMe/qvNxD/89oZhideLuI0Q4j5aFGYSYoJHUREO0IBR9k3hHElzK2U95jJBk2QJROC8/flSdI6qDmHtaPLw8rp+TiOBbJFtskuccgxOSUXpEGahJMH8kReyKv1aD1bb9b7qHXKGs9skl+wPr4ByOKd9A== The Walker

• MCMC (Markov chain) like walk through proto-model parameter space, randomly - adding or removing particles - changing particle masses - changing signal strength multipliers (for production modes) - changing decay channels and their branching ratios

• At each step, compute test statistic K • Step i is reverted with probability

1 exp (Ki Ki 1) 2  https://smodels.github.io/protomodels/videos/

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 12 AAACFXicbVDLSgNBEJz1bXxFPXoZDIKCCbui6FHQg5BLBBOF7LLMTnqTwdkHM71iWPYnvPgrXjwo4lXw5t84eRzUWNBQVHXT3RWkUmi07S9ranpmdm5+YbG0tLyyulZe32jpJFMcmjyRiboJmAYpYmiiQAk3qQIWBRKug9uzgX99B0qLJL7CfgpexLqxCAVnaCS/vO/CfepKCLFN3VAxnjtFflDs1n1Bq7Tu56LqFHuuEt0een65YtfsIegkccakQsZo+OVPt5PwLIIYuWRatx07RS9nCgWXUJTcTEPK+C3rQtvQmEWgvXz4VUF3jNKhYaJMxUiH6s+JnEVa96PAdEYMe/qvNxD/89oZhideLuI0Q4j5aFGYSYoJHUREO0IBR9k3hHElzK2U95jJBk2QJROC8/flSdI6qDmHtaPLw8rp+TiOBbJFtskuccgxOSUXpEGahJMH8kReyKv1aD1bb9b7qHXKGs9skl+wPr4ByOKd9A== The Walker

• MCMC (Markov chain) like walk through proto-model parameter space, randomly - adding or removing particles - changing particle masses - changing signal strength multipliers (for production modes) - changing decay channels and their branching ratios

• At each step, compute test statistic K • Step i is reverted with probability

1 exp (Ki Ki 1) 2  https://smodels.github.io/protomodels/videos/

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 12 Results (walking over SModelS v1.2.4 database)

10 “runs”, each consisting of 50 parallel walkers with 1000 steps

Analyses contributing to the K value of the highest score proto-model

List of the most constraining results for the highest score proto-model

In each run, the high-score model has a top partner and a partner with SUSY-like cross sections. Driven by ~2σ excesses in ATLAS-SUSY-2016-07 (gluino, squark), ATLAS-SUSY-2016-16 (1-lept. stop) and CMS-SUS-16-050 (0-lept. stop) searches.

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 13 AAACxHicbVFbixMxFM6Ml13HW9VHX4JFaEHKjKy4CgsrigiFZQW7u9DpDpk0Mw3NTGJyRi0h/kjfxD9jph2r7u6BA1++851bTq4ENxDHP4Pw2vUbN3d2b0W379y9d7/34OGJkY2mbEKlkPosJ4YJXrMJcBDsTGlGqlyw03z5to2ffmHacFl/gpVis4qUNS84JeCprPdLZWlFYKErWwqZE+Hw6wMcpYJXrXMwmT1KNS8XQLSWX1NeF7ByXlFoQm3i7FH7MM1fOT9I3LmnoySz0zQn2o7dtonMjXu2KTJ0g/E530YKsmRuiFOilJbfIq+Bbcmrqvgef4aZj3GqF3IwHkZdOo5Hyaus149H8drwZZB0oI86O856P9K5pE3FaqCCGDNNYgUzSzRwKpiL0sYwReiSlGzqYU0qZmZ2fQSHn3pmjgupvdeA1+y/GZZUxqyq3CvbLczFWEteFZs2UOzPLK9VA6ymm0ZFIzBI3F4Uz7lmFMTKA0I197NiuiD+OuDvHvlPSC6ufBmcPB8le6MXH/f6h++679hFj9ETNEAJeokO0Qd0jCaIBm+CMlDB5/B9KEITNhtpGHQ5j9B/Fn7/DUmO31g= Testing the SM hypothesis

To mimic the case that no new physics is in the data, we produce “fake” SModelS databases by sampling background models (i.e. setting #observed = #expected, sampled within BG uncertainties)

➡ Determine density of K under SM-only hypothesis (Kfake) from 50 fake SModelS databases; 50×50 walkers in total

K obtained from the 10 runs over the real SModelS database

N 1 1 i ➡ Determine global p-value of SM-only hypothesis pglobal := lim 1[K¯ , )(Kfake) dK⇢(K) 0.19 (N=50) N N obs 1 ⇡ ⇡ !1 i=1 ¯Z X Kobs S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 14 Conclusions

• Null results (so far) in the numerous channel-by-channel searches for new particles make it increasingly relevant to change perspective and attempt a more global approach to find out where BSM physics may hide. • Presented a novel statistical learning algorithm, that identifies potential dispersed signals in the slew of published LHC analyses; the aim is to build candidate proto-models from small excesses in the data, while remaining consistent with all other constraints. • Procedure may also be relevant for global interpretations including data from a future e+e- LC; ultimate goal is a bottom-up approach to inferring the BSM Lagrangian from the data.

arXiv:2012.12246 is but a proof-of-principle; lots of interesting further developments to be done! Illustration from Jon Butterworth, A Map of the Invisible Illustration from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 15 Conclusions

• Null results (so far) in the numerous channel-by-channel searches for new particles make it increasingly relevant To go further to change perspective and attempt a more global approach to find out where BSM physics may hide. • Current realisation is based on simplified-model results in the • Presented a novel statistical learning algorithm, that SModelS database identifies potential dispersed signals in the slew of published LHC analyses; the aim is to build candidate • limited by the variety and type of proto-models from small excesses in the data, while simplified-model results available remaining consistent with all other constraints. need more (and larger) Procedure may also be relevant for global interpretations ‣ • efficiency-map results including data from a future e+e- LC; ultimate goal is a bottom-up approach to inferring the BSM Lagrangian ‣ need to go beyond SUSY-like from the data. simplified models

arXiv:2012.12246 is but a proof-of-principle; lots of interesting further developments to be done! Illustration from Jon Butterworth, A Map of the Invisible Illustration from

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 15 Moriond@home Run2 results in the SModelS database (v1.2.4)

Can’t construct a likelihood

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 17 Run1 results in the SModelS database (v1.2.4)

Can’t construct a likelihood

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 18 Proto-model construction rules for the time being …

• Proto-models are defined by their BSM particle content, the particle masses, production cross sections (signal strengths) and decay branching ratios • Current assumptions:

Possible particle content (spin remains undefined) - BSM particles are odd under a Z2-type symmetry → always produced in pairs and always cascade - Light quark partners Xq (q = u, d, c, s) decay to the lightest state; - Heavy quark partners Xb, Xt (2 of each) - the lightest BSM particle (LBP) is stable and electrically and color neutral (cosmological - partner Xg (1) constraints, dark matter); - Electroweak partners XW (2), XZ (3) - except for the LBP, all particles are assumed to - Lepton partners Xℓ, X�ℓ (ℓ = e, µ, �) decay promptly; - . Cross sections: start from MSSM cross sections, - only particles with masses within LHC reach are then varied freely via signal strength multipliers considered part of a specific proto-model.

• NB proto-models are just a toolkit, they are not intended to be consistent theoretical models

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 19 Closure tests: inject a signal

• Replace the observed data with the expected background plus signal yields for the highest-score proto-model with K=6.9; re-run and see whether the injected signal is “rediscovered” • We do this with and without sampling from the backgrounds

Technical closure test (without sampling) Full closure test (with sampling)

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 20 Posterior distributions

68% Bayesian credibility regions (coloured areas, solid contours) in the mass vs. mass planes of the corresponding simplified model. The signal strength multipliers are fixed at the values of the highest-score proto-model with K=6.9. The coloured stars show the preferred points for each individual analysis, while the black stars display the proto-model parameters. The dashed lines indicate the size of the UL and EM maps provided by the experiments, thus defining the 100\% credibility regions.

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 21 SM backgrounds

• p-values computed for all signal regions of the EM results contained in the SModelS database. • If the observed data is well described by the estimated backgrounds and the uncertainties thereon, one would expect a flat distribution of p-values. • For the fake databases generated under the SM hypothesis, this is the case (red histogram). • With the real data (blue histogram) the distribution is not flat — this is expected if the background uncertainties are overestimated.

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 22 ATLAS’ full likelihoods ATL-PHYS-PUB-2019-029 (05 Aug 2019)

• Plain-text serialisation of HistFactory workspaces, JSON format

- Provides background estimates, changes under systematic variations, and observed data counts at the same fidelity as used in the experiment.

Rate modifications defined in HistFactory for bin b, sample s, channel c.

- Usage: RooFit, pyhf - Target: long-term data/analysis preservation, reinterpretation purposes

S. Kraml - Artificial Proto-Modelling - LCWS 2021 - Global Interpretations - 18 March 2021 23 Why public likelihoods

• The statistical model of an experimental Les Houches Recommandations (2012) analysis provides the complete mathematical description of that analysis 3b: When feasible, provide a mathematical description of the final likelihood function in p(o|�) relating the observed quantities o to the parameters � which experimental data and parameters are clearly distinguished, either in the publication • Given the likelihood, all the standard or the auxiliary information. Limits of validity should always be clearly specified. statistical approaches are available for extracting information from it 3c: Additionally provide a digitized implementation of the likelihood that is consistent with the mathematical description. • Essential information for any detailed interpretation of experimental results arXiv:1203.2489 . = determining the compatibility of the observations with theoretical predictions

S. Kraml - Artificial Proto-Modelling - Moriond QCD - 31 March 2021 24 O±δO plus correlations

• Simplified likelihood, Gaussian approximation - CMS SUSY group: covariance matrices for combination of signal regions

CMS-SUS-16-039 (EWino search)

Contribution 15, LH 2019 BSM WG report arXiv:2002.12220

«Correlation data […] has proven excellent for stabilising and ensuring better statistical definition in global fits as well as avoiding either overly conservative or over-enthusiastic interpretations.»

Reinterpretation Forum Report, 2003.07868

S. Kraml - Artificial Proto-Modelling - LCWS 2021 - Global Interpretations - 18 March 2021 25 Moriond QCD • March 27 - April 3, 2021 • @home

https://smodels.github.io/protomodels/

Artificial proto-modelling with simplified-model results from the LHC (or: statistically learning dispersed signals at the LHC)

Wolfgang Waltenberger Andre Lessa Sabine Kraml arXiv:2012.12246 (JHEP 2021, 207) Backup