
Beyond the Cabello-Severini-Winter framework: Making sense of contextuality without sharpness of measurements

Ravi Kunjwal

Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, Ontario, Canada, N2L 2Y5

September 4, 2019

We develop a hypergraph-theoretic framework for Spekkens contextuality applied to Kochen-Specker (KS) type scenarios that goes beyond the Cabello-Severini-Winter (CSW) framework. To do this, we add new hypergraph-theoretic ingredients to the CSW framework. We then obtain noise-robust noncontextuality inequalities in this generalized framework by applying the assumption of (Spekkens) noncontextuality to both preparations and measurements. The resulting framework goes beyond the CSW framework in both senses, conceptual and technical. On the conceptual level: 1) as in any treatment based on the generalized notion of noncontextuality à la Spekkens, we relax the assumption of outcome determinism inherent to the Kochen-Specker theorem but retain measurement noncontextuality, besides introducing preparation noncontextuality, 2) we do not require the exclusivity principle – that pairwise exclusive measurement events must all be mutually exclusive – as a fundamental constraint on measurement events of interest in an experimental test of contextuality, given that this property is not true of general quantum measurements, and 3) as a result, we do not need to presume that measurement events of interest are "sharp" (for any definition of sharpness), where this notion of sharpness is meant to imply the exclusivity principle. On the technical level, we go beyond the CSW framework in the following senses: 1) we introduce a source events hypergraph – besides the measurement events hypergraph usually considered – and define a new operational quantity Corr that appears in our inequalities, 2) we define a new hypergraph invariant – the weighted max-predictability – that is necessary for our analysis and appears in our inequalities, and 3) our noise-robust noncontextuality inequalities quantify tradeoff relations between three operational quantities – Corr, R, and $p_0$ – only one of which (namely, R) corresponds to the Bell-Kochen-Specker functionals appearing in the CSW framework; when Corr = 1, the inequalities formally reduce to CSW type bounds on R. Along the way, we also consider in detail the scope of our framework vis-à-vis the CSW framework, particularly the role of Specker's principle in the CSW framework, i.e., what the principle means for an operational theory satisfying it and why we don't impose it in our framework.

arXiv:1709.01098v4 [quant-ph] 3 Sep 2019

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0.

Contents

1 Introduction
2 Spekkens framework
  2.1 Operational theory
  2.2 Ontological model
  2.3 Representation of coarse-graining
    2.3.1 Coarse-graining of measurements
    2.3.2 Coarse-graining of preparations
  2.4 Joint measurability (or compatibility)
  2.5 Noncontextuality
  2.6 An example of Spekkens contextuality: the fair coin flip inequality
  2.7 Connection to Bell scenarios
3 Hypergraph approach to Kochen-Specker scenarios in the Spekkens framework
  3.1 Measurements
    3.1.1 Classification of probabilistic models
    3.1.2 Distinguishing two consequences of Specker's principle: Structural Specker's principle vs. Statistical Specker's principle
    3.1.3 What does it mean for an operational theory to satisfy structural/statistical Specker's principle?
    3.1.4 Remark on the classification of probabilistic models: why we haven't defined "quantum models" as those obtained from projective measurements
    3.1.5 Scope of this framework
  3.2 Sources
4 A key hypergraph invariant: the weighted max-predictability
5 Noise-robust noncontextuality inequalities
  5.1 Key notions from CSW
  5.2 Key notion not from CSW: source-measurement correlation, Corr
  5.3 Obtaining the noise-robust noncontextuality inequalities
    5.3.1 Expressing operational quantities in ontological terms
    5.3.2 Derivation of the noncontextual tradeoff for any graph G
    5.3.3 When is the noncontextual tradeoff violated?
  5.4 Example: KCBS scenario
6 Discussion
  6.1 Measurement-measurement correlations vs. source-measurement correlations
  6.2 Can our noise-robust noncontextuality inequalities be saturated by a noncontextual ontological model?
    6.2.1 The special case of facet-defining Bell-KS inequalities: Corr = 1
    6.2.2 The general case: Corr < 1
  6.3 Can trivial POVMs ever violate these noncontextuality inequalities?
    6.3.1 The case $p \in C(\Gamma_G)$
    6.3.2 The case $p \in \mathrm{ConvHull}(G(\Gamma_G)|_{\mathrm{ind}})$
    6.3.3 The general case $p \in G(\Gamma_G)$
7 Conclusions
Acknowledgments
A Status of KS-contextuality as an experimentally testable notion of nonclassicality for POVMs in quantum theory
  A.1 Limitations of KS-contextuality vis-à-vis POVMs
    A.1.1 KS-contextuality for POVMs in the literature
    A.1.2 Classifying probabilistic models: restriction of quantum models to PVMs
  A.2 Robustness of Bell nonlocality vis-à-vis POVMs
B Ontological models without respecting coarse-graining relations
  B.1 How to construct a "KS-noncontextual" ontological model of the KCBS experiment [47] without coarse-graining relations
  B.2 How to construct a "preparation and measurement noncontextual" ontological model without coarse-graining relations
C Trivial POVMs
  C.1 Bell-CHSH scenario
  C.2 CHSH-type contextuality scenario: 4-cycle
D The KS-uncolourable hypergraph $\Gamma_{18}$
References

1 Introduction

To say that quantum theory is counterintuitive, or that it requires a revision of our classical intuitions, requires us to be mathematically precise in our definition of these classical intuitions. Once we have a precise formulation of such classicality, we can begin to investigate those features of quantum theory that power its nonclassicality, i.e., its departure from our classical intuitions, and thus prove theorems about such nonclassicality. To the extent that a physical theory is provisional, likely to be replaced by a better theory in the future, it also makes sense to articulate such notions of classicality in as operational a manner as possible. By 'operational', we refer to a formulation of the theory that takes the operations – preparations, measurements, transformations – that can be carried out in an experiment as primitives and which specifies the manner in which these operations combine to produce the data in the experiment. Such an operational formulation often suggests generalizations of the theory that can then be used to better understand its axiomatics [1–3]. At the same time, an operational formulation also lets us articulate our notions of nonclassicality in a manner that is experimentally testable and thus allows us to leverage this nonclassicality in applications of the theory. Indeed, a key area of research in quantum foundations and quantum information is the development of methods to assess nonclassicality in an experiment under minimal assumptions on the operational theory describing it. The paradigmatic example of this is the case of Bell's theorem and Bell experiments [4–11], where any operational theory that is non-signalling between the different spacelike separated wings of the experiment

is allowed. The notion of classicality at play in Bell's theorem is the assumption of local causality: any non-signalling theory that violates the assumption of local causality is said to exhibit nonclassicality by the lights of Bell's theorem.

More recently, much work [12–17] has been devoted to obtaining constraints on operational statistics that follow from a generalized notion of noncontextuality proposed by Spekkens [18]. This notion of classicality [18] has its roots in the Kochen-Specker (KS) theorem [19], a no-go theorem that rules out the possibility that a deterministic underlying ontological model [20] could reproduce the operational statistics of (projective) quantum measurements in a manner that satisfies the assumption of KS-noncontextuality. KS-noncontextuality is the notion of classicality at play in the Kochen-Specker theorem. The Spekkens framework abandons the assumption of outcome determinism [18] – the idea that the ontic state of a system fixes the outcome of any measurement deterministically – that is intrinsic to KS-noncontextuality. It also applies to general operational theories and extends the notion of noncontextuality to general experimental procedures – preparations, transformations, and measurements – rather than measurements alone.

Parallel to work along the lines of Spekkens [18], work seeking to directly operationalize the Kochen-Specker theorem (rather than revising the notion of noncontextuality at play) culminated in two recent approaches that classify theories by the degree to which they violate the assumption of KS-noncontextuality: the graph-theoretic framework of Cabello, Severini, and Winter (CSW) [21, 22], where a general approach to obtaining graph-theoretic bounds on linear Bell-KS functionals was proposed, and the related hypergraph framework of Acín, Fritz, Leverrier, and Sainz (AFLS) [23], where an approach to characterizing sets of correlations was proposed. The CSW framework relates well-known graph invariants to: 1) upper bounds on Bell-KS inequalities that follow from KS-noncontextuality, 2) upper bounds on maximum quantum violations of these inequalities that can be obtained from projective measurements, and 3) upper bounds on their violation in general probabilistic theories [24] – denoted $\mathcal{E}^1$ – which satisfy the "exclusivity principle" [22]. Complementary to this, the AFLS framework uses graph invariants in the service of deciding whether a given assignment of probabilities to measurement outcomes in a KS-contextuality experiment belongs to a particular set of correlations; they showed that membership in the quantum set of correlations (defined only for projective measurements in quantum theory) cannot be witnessed by a graph invariant, cf. Theorem 5.1.3 of Ref. [23]. Another recent approach due to Abramsky and Brandenburger [25] employs sheaf-theoretic ideas to formulate KS-contextuality.

A key achievement of the frameworks of Refs. [22, 23, 25] is a formal unification of Bell scenarios with KS-contextuality scenarios, treating them on the same footing. Indeed, the perspective there is to consider Bell scenarios as a special case of KS-contextuality scenarios. What is lost in this mathematical unification, however, is the fact that Bell-locality and KS-noncontextuality have physically distinct, if related, motivations. The physical situation that Bell's theorem refers to requires (at least) two spacelike separated labs (where local measurements are carried out) so that the assumption of local causality (or Bell-locality) can be applied.¹ On the other hand, the physical situation that the Kochen-Specker theorem refers to does not require spacelike separation as a necessary ingredient and one can therefore consider experiments in a single lab. However, the assumption of KS-noncontextuality entails outcome determinism [18], something not required by local causality in Bell scenarios.² This difference in the physical situation for the two kinds of experiments is one of the reasons for generalizing KS-noncontextuality to the notion of noncontextuality in the Spekkens framework [18] (so that outcome determinism is not assumed) while leaving Bell's notion of local causality untouched.

¹ What do we mean by whether an assumption "can be applied"? Of course, mathematically, one can "apply" any assumption one wants in the service of proving a theorem. But insofar as the mathematics here is trying to model a real experiment, the consistency of those assumptions with some essential facts of the experiment is the minimal requirement for any no-go theorem derived from such assumptions to be physically interesting. Hence, in the presence of signalling (implying the absence of spacelike separation), it makes no sense to assume local causality in a Bell experiment and derive the resulting Bell inequalities: such an assumption on the ontological model is already in conflict with the fact of signalling across the labs and no Bell inequalities are needed to witness this fact. Bell inequalities only become physically interesting when the theories being compared relative to them are all non-signalling: if the experiment itself is signalling, any non-signalling description – locally causal, quantum, or in a general probabilistic theory (GPT) – is ipso facto ruled out.

² Note that this assumption of outcome determinism doesn't affect the conclusions in a Bell scenario even if one adopted it, because of Fine's theorem [26]: a locally deterministic ontological model entails the same set of (Bell-local) correlations as a locally causal ontological model. Relaxing outcome determinism, however, doesn't mean the same thing for the kinds of experiments envisaged by the Kochen-Specker theorem – in particular, it doesn't mean that models satisfying factorizability à la Ref. [25] are the most general outcome-indeterministic models – and thus considerations parallel to Fine's theorem [26] do not apply, cf. [27, 28].

In the present paper we build a bridge from the CSW approach, where KS-noncontextual correlations are bounded by Bell-KS inequalities, to noise-robust noncontextuality inequalities in the Spekkens framework [18]. That is, we show how the constraints from KS-noncontextuality in the framework of Ref. [22] translate to constraints from generalized noncontextuality in the framework of Ref. [18]. The resulting operational criteria for contextuality à la Spekkens

are noise-robust and therefore applicable to arbitrary positive operator-valued measures (POVMs) and mixed states in quantum theory. Note that the insights gleaned from frameworks such as those of Refs. [22, 23, 25] regarding Bell nonlocality require no revision in our approach. It is only in the application of such frameworks (in particular, the CSW framework) to the question of contextuality that we seek to propose an alternative hypergraph framework (formalizing Spekkens contextuality [18]) that is more operationally motivated for experimental situations where one cannot appeal to spacelike separation to justify locality of the measurements.³ For Kochen-Specker type experimental scenarios, we will consider the twin notions of preparation noncontextuality and measurement noncontextuality – taken together as a notion of classicality – to obtain noise-robust noncontextuality inequalities that generalize the KS-noncontextuality inequalities of CSW. These inequalities witness nonclassicality even when quantum correlations arising from arbitrary (i.e., possibly nonprojective) quantum measurements on any quantum state are allowed. A key innovation of this approach is that it treats all measurements in an operational theory on an equal footing. No definition of "sharpness" [29–31] is needed to justify or derive noncontextuality inequalities in this approach. Furthermore, if certain idealizations are presumed about the operational statistics, then these inequalities formally recover the usual Bell-KS inequalities à la CSW. The Bell-KS inequalities can be viewed as an instance of the classical marginal problem [25–27, 32, 33], i.e., as constraints on the (marginal) probability distributions over subsets of a set of observables that follow from requiring the existence of a global joint probability distribution over the set of all observables. Since the Bell-KS inequalities are only recovered under certain idealizations, but not otherwise, the noise-robust noncontextuality inequalities we obtain cannot in general be viewed as arising from a classical marginal problem. Hence, they cannot be understood within existing frameworks that rely on this property (reduction to the classical marginal problem) to formally unify the treatment of Bell-nonlocality and KS-contextuality [22, 23, 25]. This is a crucial distinction relative to the usual Bell-KS inequality type witnesses of KS-contextuality.

³ Nor the sharpness of the measurements to justify outcome determinism. We discuss these issues in detail – in particular, the physical basis of KS-noncontextuality vis-à-vis Bell-locality and how that influences our framework – in Appendix A for the interested reader.

This paper is based on a previous contribution [16] that laid the conceptual groundwork for the progress we make here. Besides the noise-robust noncontextuality inequalities that generalize constraints from KS-noncontextuality in the CSW framework leveraging the graph invariants of CSW [22] (cf. Section 5), the contributions of this paper also include:

• An exposition of Specker's principle and how different implications of it (e.g., consistent exclusivity [23]) for a given operational theory arise in the hypergraph framework (cf. Sections 3.1.2 and 3.1.3), in particular the results in Theorems 1, 2, and Corollary 1.

• Introduction of a hypergraph invariant – the weighted max-predictability – that is key to our noise-robust noncontextuality inequalities, cf. Section 4. This invariant is also key to the hypergraph framework of Ref. [34] which is complementary to the present framework.

• A detailed discussion of how KS-noncontextuality for POVMs has been previously treated in the literature and the limitations of those treatments, cf. Appendices A and C. Also, unlike for the case of KS-noncontextuality inequalities, we show that trivial POVMs can never violate our noise-robust noncontextuality inequalities, cf. Section 6.3.

• A discussion of coarse-graining relations in Section 2.3 and their importance for contextuality no-go theorems, in particular a discussion of ontological models that do not respect coarse-graining relations in Appendix B. We show, in Appendix B, how relaxing the constraint from coarse-graining relations on an ontological model renders either notion of noncontextuality – whether Kochen-Specker [19] or Spekkens [18] – vacuous.

• A discussion, by example, of why our generalization of the CSW framework cannot accommodate contextuality scenarios that are KS-uncolourable in Appendix D and why one needs a distinct framework, i.e., the framework of Ref. [34], to treat KS-uncolourable scenarios.

The structure of this paper is as follows: Section 2 reviews the Spekkens framework for generalized noncontextuality [18]. Section 3 introduces a hypergraph framework that shares features of traditional frameworks for KS-contextuality [22, 23] but is also augmented (relative to these traditional frameworks) with the ingredients necessary for obtaining noise-robust noncontextuality inequalities. In particular, its subsections 3.1.2 and 3.1.3 discuss Specker's principle [35] and define its different implications for contextuality scenarios à la Ref. [23]. Section 4 defines a new hypergraph invariant – the weighted max-predictability – that we need later on as a crucial new ingredient in our inequalities. Section 5 obtains noise-robust noncontextuality inequalities within the framework defined in Section 3 and using the hypergraph invariant of Section 4 in addition to two graph invariants from the CSW framework [22]. These inequalities can be

seen as special cases of the general approach outlined in Ref. [16]. In Section 6, we include discussions on various features of our noise-robust noncontextuality inequalities, in particular the fact that trivial POVMs can never violate them. Section 7 concludes with some open questions and directions for future research.

2 Spekkens framework

We concern ourselves with prepare-and-measure experiments. A schematic of such an experiment is shown in Figure 1 where, for the sake of simplicity, we imagine a single source device that can perform any preparation procedure of interest (rather than a collection of source devices, each implementing a particular preparation procedure) and a single measurement device that can perform any measurement procedure of interest (rather than a collection of measurement devices, each implementing a particular measurement procedure). Note that this is just a conceptual abstraction: in particular, the various possible measurement settings on the measurement device may, for example, correspond to incompatible measurement procedures in quantum theory. The fact that we represent the different measurement settings by choices of knob settings $M \in \mathbb{M}$ on a single measurement device does not mean that it's physically possible to implement all the measurement procedures represented by $\mathbb{M}$ jointly; it only means that the experimenter can choose to implement any of the measurements in the set $\mathbb{M}$ in a particular prepare-and-measure experiment. The same is true for our abstraction of preparation procedures to knob settings ($S \in \mathbb{S}$) and outcomes ($s \in V_S$) of a single source device: it's not that the same device can physically implement all possible preparation procedures; it's just that an experimenter can choose to implement any procedure in the set $\mathbb{S}$ in a particular prepare-and-measure experiment.

Figure 1: A prepare-and-measure experiment.

We will consider two levels of description of prepare-and-measure experiments represented by Fig. 1: operational and ontological. The operational description will be specified by an operational theory that takes source and measurement devices as primitives and describes the experiment solely in terms of the probabilities associated to their input/output behaviour. The ontological description will be specified by an ontological model that takes the system that passes between the source and measurement devices as primitive and describes the experiment in terms of probabilities associated to properties of this system, deriving the operational description as a consequence of coarse-graining over these properties. Let us look at each description in turn.

2.1 Operational theory

We now describe the various components of Fig. 1 in more detail. The source device has a source setting labelled by $S$ that can be chosen from a set $\mathbb{S}$. The set $\mathbb{S}$ represents, in general, some subset of the set of all source settings, $\mathcal{S}$, that are admissible in the operational theory, i.e., $\mathbb{S} \subseteq \mathcal{S}$. In a particular prepare-and-measure experiment, $\mathbb{S}$ will typically be a finite set of source settings. Choosing the setting $S$ prepares a system according to an ensemble of preparation procedures, denoted $\{(p(s|S), P_{[s|S]})\}_{s \in V_S}$, where $\{p(s|S)\}_{s \in V_S}$ is a probability distribution over the preparation procedures $\{P_{[s|S]}\}_{s \in V_S}$ in the ensemble. This means that the source device has one classical input $S$ and two outputs: one output is a classical label $s \in V_S$ identifying the preparation procedure (in the ensemble $\{(p(s|S), P_{[s|S]})\}_{s \in V_S}$) that is carried out when source outcome $s$ is observed for source setting $S$ (this source event is denoted $[s|S]$), and the other output is a system prepared according to the source event $[s|S]$, i.e., preparation procedure $P_{[s|S]}$, with probability $p(s|S)$. Thus, the assemblage of possible ensembles that the source device can prepare can be denoted by $\{\{(p(s|S), P_{[s|S]})\}_{s \in V_S}\}_{S \in \mathbb{S}}$.

On the other hand, the measurement device has two inputs, one a classical input $M \in \mathbb{M}$ specifying the choice of measurement setting to be implemented, and the other input receives the system prepared according to preparation procedure $P_{[s|S]}$ and on which this measurement $M$ is carried out. The measurement device has one classical output $m \in V_M$ denoting the outcome of the measurement $M$ implemented on a system prepared according to $P_{[s|S]}$, and which occurs with probability $p(m|M, S, s)$. The set $\mathbb{M}$ represents,

in general, some subset of the set of all measurement settings, $\mathcal{M}$, that are admissible in the operational theory, i.e., $\mathbb{M} \subseteq \mathcal{M}$. In a particular prepare-and-measure experiment, $\mathbb{M}$ will typically be a finite set of measurement settings.

We will be interested in the operational joint probability $p(m, s|M, S) \equiv p(m|M, S, s)\,p(s|S)$ for this prepare-and-measure experiment for various choices of $M \in \mathbb{M}$, $S \in \mathbb{S}$. Note how this operational description takes as primitive the operations carried out in the lab and restricts itself to specifying the probabilities of classical outcomes (i.e., $m, s$) given some interventions (i.e., classical inputs, $M, S$). So far, we haven't assumed any structure on the operational theory describing the schematic of Fig. 1 beyond the fact that it is a catalogue of input/output probabilities

$$\{\{p(m, s|M, S) \in [0, 1]\}_{m \in V_M, s \in V_S}\}_{M \in \mathbb{M}, S \in \mathbb{S}}$$

for various interventions $S \in \mathbb{S}$ and $M \in \mathbb{M}$ that we will consider in a prepare-and-measure experiment. We now require more structure in the operational theory underlying this experiment, beyond a mere specification of these probabilities.

We require that the operational theory admits equivalence relations that partition experimental procedures of any type, whether preparations or measurements, into equivalence classes of that type. These equivalence relations are defined relative to the operational probabilities (not necessarily restricted to a particular prepare-and-measure experiment) that are admissible in the theory. We will call these equivalence relations "operational equivalences", in keeping with standard terminology [18]. This means that any distinctions of labels between procedures in an equivalence class of procedures do not affect the operational probabilities associated with the procedures. We specify these equivalence relations for measurement and preparation procedures below.

Two measurement events $[m|M]$ and $[m'|M']$ are said to be operationally equivalent, denoted $[m|M] \simeq [m'|M']$, if there exists no source event in the operational theory that can distinguish them, i.e.,

$$p(m, s|M, S) = p(m', s|M', S) \quad \forall [s|S],\ s \in V_S,\ S \in \mathcal{S}. \tag{1}$$

Note that the statistical indistinguishability of $[m|M]$ and $[m'|M']$ must hold for all possible source settings $\mathcal{S}$ in the operational theory, not merely the source settings $\mathbb{S}$ that are of direct interest in a particular prepare-and-measure experiment. Hence, the "distinction of labels", $[m|M]$ or $[m'|M']$, is empirically inconsequential since the two procedures are, in principle, indistinguishable by the lights of the operational theory.

Similarly, two source events $[s|S]$ and $[s'|S']$ are said to be operationally equivalent, denoted $[s|S] \simeq [s'|S']$, if there exists no measurement event in the operational theory that can distinguish them, i.e.,

$$p(m, s|M, S) = p(m, s'|M, S') \quad \forall [m|M],\ m \in V_M,\ M \in \mathcal{M}. \tag{2}$$

Again, the statistical indistinguishability of $[s|S]$ and $[s'|S']$ must hold for all possible measurement settings $\mathcal{M}$, not merely those (i.e., $\mathbb{M}$) that are of direct interest in a particular prepare-and-measure experiment. Similar to measurement events, the "distinction of labels", $[s|S]$ or $[s'|S']$, is empirically inconsequential since the two procedures are, in principle, indistinguishable by the lights of the operational theory.

Given this equivalence structure for preparation and measurement procedures in the operational theory, we can now formalize the notion of a context:

Definition 1. A context is any distinction of labels between operationally equivalent procedures in the operational theory.

To see concrete examples of the kinds of contexts that will be of interest to us in this paper, consider quantum theory. Any mixed quantum state admits multiple convex decompositions in terms of other quantum states, i.e., it can be prepared by coarse-graining over distinct ensembles of quantum states, each ensemble denoted by a different label. In this case, the "distinction of labels" between different decompositions denotes a distinction of preparation ensembles, which instantiates our notion of a preparation context. Similarly, a given positive operator can be implemented by different positive operator-valued measures (POVMs), and the distinction of labels denoting these different POVMs instantiates our notion of a measurement context.

2.2 Ontological model

Given the operational description of the experiment in terms of probabilities $p(m, s|M, S)$, we want to explore the properties of any underlying ontological model for this operational description. Any such ontological model, defined within the ontological models framework [20], takes as primitive the physical system (rather than operations on it) that passes between the source and measurement devices, i.e., its basic objects are ontic states of the system, denoted $\lambda \in \Lambda$, that represent intrinsic properties of the physical system. When a preparation procedure $[s|S]$ is carried out, the source device samples from the space of ontic states $\Lambda$ according to a probability distribution $\{\mu(\lambda|S, s) \in [0, 1]\}_{\lambda \in \Lambda}$, where $\sum_{\lambda \in \Lambda} \mu(\lambda|S, s) = 1$, and the joint distribution over $s$ and $\lambda$ given $S$, i.e., $\{\mu(\lambda, s|S)\}_{\lambda \in \Lambda}$, is given by $\mu(\lambda, s|S) \equiv \mu(\lambda|S, s)\,p(s|S)$. On the other hand, when a system in ontic state $\lambda$ is input to the measurement device with measurement setting $M \in \mathbb{M}$, the probability distribution over the measurement outcomes is given by $\{\xi(m|M, \lambda) \in [0, 1]\}_{m \in V_M}$, where $\sum_{m \in V_M} \xi(m|M, \lambda) = 1$.
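To fix ideas, the ingredients of an ontological model introduced so far can be written down as plain lookup tables. The following is a minimal sketch, assuming a finite ontic state space with two ontic states, one source setting with two outcomes, and one two-outcome measurement. All names (`p_source`, `mu_cond`, `xi`, `joint_mu`) and all numerical values are hypothetical choices made for illustration and do not come from the paper.

```python
from fractions import Fraction as F

# Hypothetical toy model: ontic states {0, 1}, one source setting "S" with
# outcomes s in {0, 1}, one measurement setting "M" with outcomes m in {0, 1}.
ONTIC_STATES = [0, 1]

# p(s|S): distribution over source outcomes for the setting "S".
p_source = {("S", 0): F(1, 2), ("S", 1): F(1, 2)}

# mu(lambda|S, s): distribution over ontic states sampled by the source event [s|S].
mu_cond = {
    ("S", 0): {0: F(3, 4), 1: F(1, 4)},
    ("S", 1): {0: F(1, 4), 1: F(3, 4)},
}

# xi(m|M, lambda): response function, stored as xi[(M, lambda)][m].
xi = {
    ("M", 0): {0: F(1), 1: F(0)},        # lambda = 0 always yields m = 0
    ("M", 1): {0: F(1, 2), 1: F(1, 2)},  # lambda = 1 yields either outcome
}

def joint_mu(lam, s, S):
    """Joint distribution mu(lambda, s|S) = mu(lambda|S, s) * p(s|S)."""
    return mu_cond[(S, s)][lam] * p_source[(S, s)]

def is_normalized(dist):
    """Check that every entry lies in [0, 1] and the entries sum to 1."""
    return all(0 <= q <= 1 for q in dist.values()) and sum(dist.values()) == 1

source_ok = all(is_normalized(d) for d in mu_cond.values())
response_ok = all(is_normalized(d) for d in xi.values())
joint_total = sum(joint_mu(lam, s, "S") for lam in ONTIC_STATES for s in (0, 1))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the normalization conditions can be tested with strict equality rather than a floating-point tolerance.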

The operational statistics

$$\{\{p(m, s|M, S) \in [0, 1]\}_{m \in V_M, s \in V_S}\}_{M \in \mathbb{M}, S \in \mathbb{S}}$$

result from a coarse-graining over $\lambda$, i.e.,

$$p(m, s|M, S) = \sum_{\lambda \in \Lambda} \xi(m|M, \lambda)\,\mu(\lambda, s|S), \tag{3}$$

for all $m \in V_M$, $s \in V_S$, $M \in \mathbb{M}$, $S \in \mathbb{S}$.

Note that the definition of an ontological model above extends to the definition of an ontological model of the operational theory (as opposed to a particular fragment of the theory representing the experiment) when we take $\mathbb{M} = \mathcal{M}$ and $\mathbb{S} = \mathcal{S}$.

2.3 Representation of coarse-graining

We will now specify how coarse-graining of procedures in a prepare-and-measure experiment is represented in its description, whether operational or ontological. Namely, if a procedure is defined as a coarse-graining of other procedures, then we require that the representation of such a procedure is defined by the same coarse-graining of the representation of the other procedures.⁴ Implicit in this discussion is the assumption that the operational theory allows one to define new procedures in the set $\mathcal{M}$ or $\mathcal{S}$ by coarse-graining other procedures in these sets, i.e., both $\mathcal{M}$ and $\mathcal{S}$ are closed under coarse-grainings. In particular, one can consider coarse-graining measurement and source settings (belonging to sets $\mathbb{M}$ and $\mathbb{S}$, respectively) actually implemented in the lab to define new measurement and source settings that belong to $\mathcal{M} \setminus \mathbb{M}$ and $\mathcal{S} \setminus \mathbb{S}$, respectively.⁵

⁴ Quantum theory is an example of an operational theory that satisfies this requirement because of the linearity of the Born rule with respect to both preparations and measurements. The same is true, more generally, of general probabilistic theories (GPTs) [1, 24]. We require this feature in any ontological model as well, regardless of its (non)contextuality.

⁵ Similarly, we also allow probabilistic mixtures of (preparation or measurement) procedures in the operational theory to define new procedures, i.e., the theory is convex. See the last paragraph of Section 2.5 for the role of this convexity in experimental tests of contextuality and Section 2.6 for an example where a probabilistic mixture of measurement settings is required in a proof of contextuality.

2.3.1 Coarse-graining of measurements

Let us see how this works for the case of measurement procedures: if a measurement procedure $M$ with measurement events $\{[m|M]\}_{m \in V_M}$ is defined as a coarse-graining of another measurement procedure $\tilde{M}$ with measurement events $\{[\tilde{m}|\tilde{M}]\}_{\tilde{m} \in V_{\tilde{M}}}$, symbolically denoted by

$$[m|M] \equiv \sum_{\tilde{m}} p(m|\tilde{m})\,[\tilde{m}|\tilde{M}], \quad \text{where } \forall m, \tilde{m}:\ p(m|\tilde{m}) \in \{0, 1\},\ \sum_{m} p(m|\tilde{m}) = 1, \tag{4}$$

then its representation in the operational description as well as in the ontological description satisfies this coarse-graining relation.⁶ More explicitly, the coarse-graining relation of Eq. (4) denotes the following post-processing of $\tilde{M}$: for each $m \in V_M$, relabel each outcome $\tilde{m} \in V_{\tilde{M}}$ to outcome $m$ with probability $p(m|\tilde{m}) \in \{0, 1\}$; the logical disjunction of those $\tilde{m}$ which are relabelled to $m$ with probability 1 then defines the measurement event $[m|M]$. Now, in the operational theory, this post-processing is represented by

$$\forall [s|S], \text{ where } s \in V_S,\ S \in \mathcal{S}: \quad p(m, s|M, S) \equiv \sum_{\tilde{m}} p(m|\tilde{m})\,p(\tilde{m}, s|\tilde{M}, S), \tag{5}$$

and in the ontological model it is represented by

$$\forall \lambda \in \Lambda: \quad \xi(m|M, \lambda) \equiv \sum_{\tilde{m}} p(m|\tilde{m})\,\xi(\tilde{m}|\tilde{M}, \lambda). \tag{6}$$

⁶ Note that Eq. (4) is not an operational equivalence between independent procedures. It is a definition of a new procedure obtained by coarse-graining another procedure.

As an example, consider a three-outcome measurement $\tilde{M}$ with outcomes $\tilde{m} \in \{1, 2, 3\}$, which can be classically post-processed to obtain a two-outcome measurement $M$ with outcomes $m \in \{0, 1\}$, such that $p(m = 0|\tilde{m} = 1) = p(m = 0|\tilde{m} = 2) = 1$ and $p(m = 1|\tilde{m} = 3) = 1$. The measurement events of $M$ are then just

$$[m = 0|M] \equiv [\tilde{m} = 1|\tilde{M}] + [\tilde{m} = 2|\tilde{M}], \tag{7}$$

$$[m = 1|M] \equiv [\tilde{m} = 3|\tilde{M}], \tag{8}$$

where the "+" sign denotes (just as the summation sign in the definition of $[m|M]$ in Eq. (4) did) logical disjunction, i.e., measurement event $[m = 0|M]$ is said to occur when $[\tilde{m} = 1|\tilde{M}]$ or $[\tilde{m} = 2|\tilde{M}]$ occurs. The operational and ontological representations of these measurement events are then given by

$$\forall [s|S], \text{ where } s \in V_S,\ S \in \mathcal{S}:$$

$$p(m = 0, s|M, S) \equiv \sum_{\tilde{m} = 1}^{2} p(\tilde{m}, s|\tilde{M}, S), \tag{9}$$

$$p(m = 1, s|M, S) \equiv p(\tilde{m} = 3, s|\tilde{M}, S), \tag{10}$$

$$\forall \lambda \in \Lambda:$$

$$\xi(m = 0|M, \lambda) \equiv \sum_{\tilde{m} = 1}^{2} \xi(\tilde{m}|\tilde{M}, \lambda), \tag{11}$$

$$\xi(m = 1|M, \lambda) \equiv \xi(\tilde{m} = 3|\tilde{M}, \lambda). \tag{12}$$

(or joint measurability), as in the case of KS-contextuality, where one needs to consider coarse-grainings of distinct measurements. For example, consider a measurement setting $M_{12}$ with outcomes $(m_1,m_2)\in V_1\times V_2$ that is coarse-grained over $m_2$ to define an effective measurement setting $M_1^{(2)}$ with measurement events $\{[m_1|M_1^{(2)}]\}_{m_1\in V_1}$. Symbolically, $[m_1|M_1^{(2)}]\equiv\sum_{m_2}[(m_1,m_2)|M_{12}]$, which is represented in the operational theory as $\forall[s|S]$: $p(m_1,s|M_1^{(2)},S)\equiv\sum_{m_2}p((m_1,m_2),s|M_{12},S)$ and in the ontological model as $\forall\lambda$: $\xi(m_1|M_1^{(2)},\lambda)\equiv\sum_{m_2}\xi((m_1,m_2)|M_{12},\lambda)$. Similarly, consider another measurement setting $M_{13}$ with outcomes $(m_1,m_3)\in V_1\times V_3$ that is coarse-grained over $m_3$ to define an effective measurement setting $M_1^{(3)}$ with measurement events $\{[m_1|M_1^{(3)}]\}_{m_1\in V_1}$. Symbolically, $[m_1|M_1^{(3)}]\equiv\sum_{m_3}[(m_1,m_3)|M_{13}]$, which is represented in the operational theory as $\forall[s|S]$: $p(m_1,s|M_1^{(3)},S)\equiv\sum_{m_3}p((m_1,m_3),s|M_{13},S)$ and in the ontological model as $\forall\lambda$: $\xi(m_1|M_1^{(3)},\lambda)\equiv\sum_{m_3}\xi((m_1,m_3)|M_{13},\lambda)$.

Now, imagine that the following operational equivalence holds at the operational level: $[m_1|M_1^{(2)}]\simeq[m_1|M_1^{(3)}]$. KS-noncontextuality is then the assumption that $\sum_{m_2}\xi((m_1,m_2)|M_{12},\lambda)=\sum_{m_3}\xi((m_1,m_3)|M_{13},\lambda)$ (i.e., $\xi(m_1|M_1^{(2)},\lambda)=\xi(m_1|M_1^{(3)},\lambda)$) for all $\lambda$ and that $\xi((m_1,m_2)|M_{12},\lambda),\ \xi((m_1,m_3)|M_{13},\lambda)\in\{0,1\}$ for all $\lambda$. This assumption applied to multiple (compatible) subsets of a set of carefully chosen measurements can then provide a proof of the KS theorem, i.e., there exist sets of measurements in quantum theory such that their operational statistics cannot be emulated by a KS-noncontextual ontological model.

The key point here is this: the requirement that coarse-graining relations between measurements be respected by their representations in the ontological model is independent of the KS-(non)contextuality of the ontological model.⁷ However, this requirement is necessary for the assumption of KS-noncontextuality to produce a contradiction with quantum theory; on the other hand, a KS-contextual ontological model (while respecting the coarse-graining relations) can always emulate quantum theory. In this sense, the representation of coarse-grainings is baked into an ontological model from the beginning (just as it is baked into an operational description), before any claims about its (non)contextuality.⁸

⁷ In our example, this requirement has to do with the definitions of $\xi(m_1|M_1^{(2)},\lambda)$ and $\xi(m_1|M_1^{(3)},\lambda)$, not their ontological equivalence. The ontological equivalence only comes into play when invoking KS-noncontextuality.

⁸ One could, of course, choose to not respect the coarse-graining relations and define a notion of an ontological model without them. In such a model, one could treat every measurement obtained by coarse-graining another (parent) measurement as a fundamentally new measurement with response functions not respecting the coarse-graining relations with the parent measurement's response functions, even if such coarse-graining relations are respected in the operational description. Such an ontological model, however, will not be able to articulate the ingredients needed for a proof of the KS theorem and we will not consider it here. Indeed, in the absence of the requirement that coarse-graining relations be respected in an ontological model, one can easily construct an ontological model that is "KS-noncontextual" for any operational theory. The interested reader may look at Appendix B for more details, perhaps after looking at Section 2.5 for the relevant definitions of noncontextuality.

2.3.2 Coarse-graining of preparations

Let us now consider the representation of coarse-grainings for preparation procedures. This works in a way similar to the case of measurement procedures which we have already outlined. If an ensemble of source events $\{[s|S]\}_{s\in V_S}$ is defined as a coarse-graining of another ensemble, $\{[\tilde{s}|\tilde{S}]\}_{\tilde{s}\in V_{\tilde{S}}}$, symbolically denoted as

$$[s|S]\equiv\sum_{\tilde{s}}p(s|\tilde{s})\,[\tilde{s}|\tilde{S}],\quad\text{where }\forall s,\tilde{s}:\ p(s|\tilde{s})\in\{0,1\},\ \sum_{s}p(s|\tilde{s})=1, \tag{13}$$

then its representation should satisfy the same coarse-graining relation in any description, operational or ontological. More explicitly, this coarse-graining denotes the following post-processing: for any $s\in V_S$, relabel each outcome $\tilde{s}\in V_{\tilde{S}}$ to outcome $s$ with probability $p(s|\tilde{s})\in\{0,1\}$; the logical disjunction of those $\tilde{s}$ which are relabelled to $s$ with probability 1 then defines the source event $[s|S]$. Now, in the operational theory, this coarse-graining is represented by

$$\forall[m|M],\text{ where }m\in V_M,\ M\in\mathcal{M}:\quad p(m,s|M,S)\equiv\sum_{\tilde{s}}p(s|\tilde{s})\,p(m,\tilde{s}|M,\tilde{S}), \tag{14}$$

and in the ontological model it is represented by

$$\forall\lambda\in\Lambda:\quad\mu(\lambda,s|S)\equiv\sum_{\tilde{s}}p(s|\tilde{s})\,\mu(\lambda,\tilde{s}|\tilde{S}). \tag{15}$$

In this paper, we will focus on a specific type of coarse-graining: namely, completely coarse-graining over the outcomes of a source setting, say $\{[\tilde{s}|\tilde{S}]\}_{\tilde{s}\in V_{\tilde{S}}}$, to yield an effective one-outcome source-setting, denoted $\tilde{S}_\top$, associated with a single source event $\{[\top|\tilde{S}_\top]\}$, where $[\top|\tilde{S}_\top]\equiv\sum_{\tilde{s}}[\tilde{s}|\tilde{S}]$. In the operational theory, this coarse-graining is represented by

$$\forall[m|M],\text{ where }m\in V_M,\ M\in\mathcal{M}:\quad p(m,\top|M,\tilde{S}_\top)\equiv\sum_{\tilde{s}}p(m,\tilde{s}|M,\tilde{S}), \tag{16}$$

and in the ontological model it is represented by

$$\forall\lambda\in\Lambda:\quad\mu(\lambda,\top|\tilde{S}_\top)\equiv\sum_{\tilde{s}}\mu(\lambda,\tilde{s}|\tilde{S}). \tag{17}$$
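The coarse-graining rules of Section 2.3 amount to summing table entries, which the following minimal sketch illustrates. The probability table `p_fine` is made up for illustration and the function names are hypothetical; only the relabelling (outcomes 1 and 2 go to 0, outcome 3 goes to 1) is taken from the three-outcome example of Eqs. (7) and (8).

```python
# Hypothetical joint statistics p(m~, s | M~, S) for a three-outcome measurement M~
# (m~ in {1, 2, 3}) and a two-outcome source S (s in {0, 1}); the numbers are made up.
p_fine = {(1, 0): 0.10, (1, 1): 0.20,
          (2, 0): 0.15, (2, 1): 0.05,
          (3, 0): 0.25, (3, 1): 0.25}

# Deterministic relabelling p(m | m~) of Eqs. (7)-(8): 1 -> 0, 2 -> 0, 3 -> 1.
relabel = {1: 0, 2: 0, 3: 1}

def coarse_grain_measurement(p_joint, relabel):
    """Eq. (5) with p(m|m~) in {0, 1}: add up the fine-grained entries relabelled to each m."""
    out = {}
    for (m_fine, s), prob in p_joint.items():
        key = (relabel[m_fine], s)
        out[key] = out.get(key, 0.0) + prob
    return out

def coarse_grain_source_completely(p_joint):
    """Eq. (16): sum over every source outcome s, leaving the single outcome 'top'."""
    out = {}
    for (m, _s), prob in p_joint.items():
        out[m] = out.get(m, 0.0) + prob
    return out

p_coarse = coarse_grain_measurement(p_fine, relabel)   # p(m, s | M, S)
p_top = coarse_grain_source_completely(p_coarse)       # p(m, top | M, S_top)
```

The same two functions can be applied, for each fixed ontic state, to tables of response functions or of source distributions, which is what Eqs. (6), (15) and (17) require of an ontological model.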

Hence, we use the notation $[\top|\tilde{S}_\top]$ to denote the source event that "at least one of the source outcomes in the set $V_{\tilde{S}}$ occurs for source setting $\tilde{S}$" (i.e., the logical disjunction of $\tilde{s}\in V_{\tilde{S}}$), formally denoting the choice of $\tilde{S}$ and the subsequent coarse-graining over $\tilde{s}$ by the "source setting" $\tilde{S}_\top$ and the definite outcome of this source setting by "$\top$". This source event always occurs, i.e., $p(\top|\tilde{S}_\top)=1$, so $p(m,\top|M,\tilde{S}_\top)=p(m|M,\tilde{S}_\top,\top)$ and $\mu(\lambda,\top|\tilde{S}_\top)=\mu(\lambda|\tilde{S}_\top,\top)$.

This notion of coarse-graining over all the outcomes of a source setting allows us to define a notion of operational equivalence between the source settings themselves. More precisely, two source settings $S$ and $S'$ are said to be operationally equivalent, denoted $[\top|S_\top]\simeq[\top|S'_\top]$, if no measurement event can distinguish them once all their outcomes are coarse-grained over, i.e.,

$$\sum_{s\in V_S}p(m,s|M,S)=\sum_{s'\in V_{S'}}p(m,s'|M,S')\quad\forall[m|M],\ m\in V_M,\ M\in\mathcal{M}. \tag{18}$$

In quantum theory, this would correspond to the operational equivalence $\sum_{s}p(s|S)\,\rho_{[s|S]}=\sum_{s'}p(s'|S')\,\rho_{[s'|S']}$ for the density operator obtained by completely coarse-graining over two distinct ensembles of quantum states, $\{(p(s|S),\rho_{[s|S]})\}_{s\in V_S}$ and $\{(p(s'|S'),\rho_{[s'|S']})\}_{s'\in V_{S'}}$, on some Hilbert space $\mathcal{H}$.

2.4 Joint measurability (or compatibility)

A given measurement procedure, $\{[m|M]\}_{m\in V_M}$ for some $M\in\mathcal{M}$, in the operational description can be coarse-grained in many different ways to define new effective measurement procedures. The coarse-grained measurement procedures thus obtained from $\{[m|M]\}_{m\in V_M}$ are then said to be jointly measurable (or compatible), i.e., they can be jointly implemented by the same measurement procedure $\{[m|M]\}_{m\in V_M}$, which we refer to as their parent or joint measurement. Formally, a set $C$ of measurement procedures

$$C\equiv\big\{\{[m_i|M_i]\}_{m_i\in V_{M_i}}\ \big|\ i\in\{1,2,3,\dots,|C|\}\big\}$$

is said to be jointly measurable (or compatible) if it arises from coarse-grainings of a single measurement procedure $M\in\mathcal{M}$, i.e., for all $\{[m_i|M_i]\}_{m_i\in V_{M_i}}\in C$

$$[m_i|M_i]\equiv\sum_{m\in V_M}p(m_i|m)\,[m|M], \tag{19}$$

where for all $i,m,m_i$: $p(m_i|m)\in\{0,1\}$ and $\sum_{m_i\in V_{M_i}}p(m_i|m)=1$. In terms of the operational probabilities, this means that

$$\forall[s|S],\ s\in V_S,\ S\in\mathcal{S}\ \text{ and }\ \forall\{[m_i|M_i]\}_{m_i\in V_{M_i}}\in C:\quad p(m_i,s|M_i,S)\equiv\sum_{m\in V_M}p(m_i|m)\,p(m,s|M,S). \tag{20}$$

If, on the other hand, a set of measurement procedures cannot arise from coarse-grainings of any single measurement procedure, then the measurement procedures in the set are said to be incompatible, i.e., they cannot be jointly implemented.

Note that we will also often refer to a measurement procedure $\{[m_i|M_i]\}_{m_i\in V_{M_i}}$ by just its measurement setting, $M_i$, and thus speak of the (in)compatibility of measurement settings. Another notion that we will need to refer to is the joint measurability of measurement events: a set of measurement events that arise as outcomes of a single measurement setting are said to be jointly measurable, e.g., all the measurement events in $\{[m|M]\}_{m\in V_M}$ are jointly measurable since they arise as outcomes of a single measurement setting $M$.

As a quantum example, consider a commuting pair of projective measurements, say $\{\Pi_1, I-\Pi_1\}$ and $\{\Pi_2, I-\Pi_2\}$, where $\Pi_1$ and $\Pi_2$ are projectors on some Hilbert space $\mathcal{H}$ such that $\Pi_1\Pi_2=\Pi_2\Pi_1$ and $I$ is the identity operator on $\mathcal{H}$. This pair is jointly implementable since they can be obtained by coarse-graining the outcomes of the joint projective measurement given by $\{\Pi_1\Pi_2,\ \Pi_1(I-\Pi_2),\ (I-\Pi_1)\Pi_2,\ (I-\Pi_1)(I-\Pi_2)\}$.

2.5 Noncontextuality

It is always possible to build an ontological model reproducing the predictions of any operational theory, while respecting the coarse-graining relations.⁹ A trivial example of such an ontological model is one where ontic states $\lambda$ are identified with the preparation procedures $P_{[s|S]}$ (where $s\in V_S$ and $S\in\mathcal{S}$) and we have $\mu(\lambda,s|S)\equiv\delta_{\lambda,\lambda_{[s|S]}}\,p(s|S)$, where ontic state $\lambda_{[s|S]}$ is the one deterministically sampled by the preparation procedure $P_{[s|S]}$. Further, the response functions are identified with operational probabilities as $\xi(m|M,\lambda_{[s|S]})\equiv p(m|M,S,s)$. Then we have $\sum_{\lambda\in\Lambda}\xi(m|M,\lambda)\,\mu(\lambda,s|S)=\xi(m|M,\lambda_{[s|S]})\,p(s|S)=p(m,s|M,S)$. Also, coarse-graining relations of the type $[\tilde{m}|\tilde{M}]\equiv\sum_{m}p(\tilde{m}|m)[m|M]$ and $[\tilde{s}|\tilde{S}]\equiv\sum_{s}p(\tilde{s}|s)[s|S]$ that are respected in the operational description are also respected in this ontological description: that is, we have $\forall\lambda\in\Lambda$: $\xi(\tilde{m}|\tilde{M},\lambda)\equiv\sum_{m}p(\tilde{m}|m)\,\xi(m|M,\lambda)$ and $\forall\lambda\in\Lambda$: $\mu(\lambda,\tilde{s}|\tilde{S})\equiv\sum_{s}p(\tilde{s}|s)\,\mu(\lambda,s|S)$.

⁹ Note that we will always assume coarse-graining relations are respected in any ontological model. The exception is (some of) the discussion in Section 2.3 and Appendix B where we consider the alternative possibility.

Hence, it is only when additional assumptions are imposed on an ontological model that deciding its existence becomes a nontrivial problem. Such additional assumptions must, of course, play an explanatory role to be worth investigating. The assumption we are interested in is noncontextuality, applied to both preparation and measurement procedures. Motivated by the methodological principle of the identity of indiscernibles [18], noncontextuality is an inference from
the operational description to the ontological description of an experiment. It posits that the equivalence structure in the operational description is preserved in the ontological description, i.e., the reason one cannot distinguish two operationally equivalent representations of procedures based on their operational statistics is that there is, ontologically, no difference in their representations. We now formally define the notion of noncontextuality in its generalized form due to Spekkens [18].

Mathematically, the assumption of measurement noncontextuality entails that

$$[m|M]\simeq[m'|M']\ \Rightarrow\ \xi(m|M,\lambda)=\xi(m'|M',\lambda),\quad\forall\lambda\in\Lambda, \tag{21}$$

while the assumption of preparation noncontextuality entails that

$$[s|S]\simeq[s'|S']\ \Rightarrow\ \mu(\lambda,s|S)=\mu(\lambda,s'|S')\quad\forall\lambda\in\Lambda,$$
$$[\top|S_\top]\simeq[\top|S'_\top]\ \Rightarrow\ \mu(\lambda|S)=\mu(\lambda|S')\quad\forall\lambda\in\Lambda. \tag{22}$$

Here we denote $\mu(\lambda|S)\equiv\sum_{s\in V_S}\mu(\lambda,s|S)$, etc., for simplicity of notation, rather than use the notation $\mu(\lambda,\top|S_\top)$, etc., for these coarse-grained probability distributions. Note that since coarse-grainings are respected in any ontological model we consider, we indeed have that $\mu(\lambda,\top|S_\top)\equiv\sum_{s\in V_S}\mu(\lambda,s|S)$.

These are the assumptions of noncontextuality, termed universal noncontextuality, that form the basis of our approach to noise-robust noncontextuality inequalities [12–17, 45]. Note that the traditional notion of KS-noncontextuality entails, besides measurement noncontextuality above, the assumption of outcome-determinism, i.e., for any measurement event $[m|M]$, $\xi(m|M,\lambda)\in\{0,1\}$ for all $\lambda\in\Lambda$.

It is important to note that in order for our notion of operational equivalence to be experimentally testable, we need that each of the sets $\mathbb{M}$ and $\mathbb{S}$ includes a tomographically complete set of measurements and preparations, respectively. That is, the prepare-and-measure experiment testing contextuality can probe a tomographically complete set of preparations and measurements. Of course, the set of all possible measurements in a theory ($\mathcal{M}$) is (by definition) tomographically complete for any preparation in the theory and, similarly, the set of all possible preparations ($\mathcal{S}$) in a theory is tomographically complete for any measurement in the theory. However, there may exist smaller (finite) sets of preparations and measurements in the theory that are tomographically complete and in that case we require that $\mathbb{S}$ and $\mathbb{M}$ include such tomographically complete sets, even if they don't include all possible preparations and measurements in the theory. For example, when the operational theory is quantum theory for a qubit, the three spin measurements $\{\sigma_x,\sigma_y,\sigma_z\}$ are tomographically complete for any qubit preparation, so we require that $\mathbb{M}$ includes these three measurements even if it doesn't include every other possible measurement on a qubit. While the requirement that $\mathbb{S}$ and $\mathbb{M}$ include tomographically complete sets doesn't directly reflect in our theoretical derivation of the noise-robust noncontextuality inequalities later, it is crucial for experimentally verifying the operational equivalences (cf. Eqs. (1), (2), (18)) we need to even invoke the assumption of noncontextuality (cf. Eqs. (21), (22)). Further, this assumption on $\mathbb{M}$ and $\mathbb{S}$ has so far been necessary to be able to implement an actual noise-robust contextuality experiment [13], besides the requirement that the operational theory be convex, i.e., probabilistic mixtures of procedures in the theory (whether preparations or measurements) are also valid procedures in the theory. We refer the reader to Refs. [13, 36, 37] for a discussion of what tomographic completeness entails for (convex) operational theories formalized as general probabilistic theories (GPTs). Although we will not discuss it in this paper, see Ref. [38] for some recent work towards relaxing the tomographic completeness requirement for the set of measurement settings.

2.6 An example of Spekkens contextuality: the fair coin flip inequality

We recap here an example of Spekkens contextuality that has been experimentally demonstrated [13] to give the reader a flavour of the general approach we are going to adopt in the rest of this paper with regard to Kochen-Specker type scenarios. We call the inequality tested in Ref. [13] the "fair coin flip" inequality.

Consider a prepare-and-measure scenario with three source settings, denoted $\mathbb{S}\equiv\{S_1,S_2,S_3\}$, such that $V_{S_i}\equiv\{0,1\}$ and we have $p(s_i=0|S_i)=p(s_i=1|S_i)=1/2$ for all $i\in\{1,2,3\}$. Each $S_i$ thus corresponds to the ensemble of preparation procedures $\{(p(s_i|S_i),P_{[s_i|S_i]})\}_{s_i\in V_{S_i}}$ and we have the following operational equivalence among the source settings after coarse-graining:

$$[\top|S_{1\top}]\simeq[\top|S_{2\top}]\simeq[\top|S_{3\top}]. \tag{23}$$

There are four measurement settings in this scenario, denoted $\mathbb{M}\equiv\{M_1,M_2,M_3,M_{\rm fcf}\}$, such that $V_{M_i}\equiv\{0,1\}$ for all $i\in\{1,2,3,{\rm fcf}\}$. The measurement setting $M_{\rm fcf}$ is a fair coin flip, i.e., it is insensitive to the preparation procedure preceding it and yields the outcome $m_{\rm fcf}=0$ or $1$ with equal probability for any preparation procedure $P_{[s|S]}$, i.e., $p(m_{\rm fcf}=0|M_{\rm fcf},S,s)=p(m_{\rm fcf}=1|M_{\rm fcf},S,s)=1/2$ for all $[s|S]$.

We also define a measurement procedure $M_{\rm mix}$ as a classical post-processing of $M_1,M_2,M_3$, i.e., its measurement events $\{[m_{\rm mix}|M_{\rm mix}]\}_{m_{\rm mix}=0}^{1}$ are defined by the classical post-processing relation

$$[m_{\rm mix}|M_{\rm mix}]\equiv\sum_{i=1}^{3}p(i)\sum_{m_i=0}^{1}p(m_{\rm mix}|m_i)\,[m_i|M_i], \tag{24}$$
which symbolically denotes the following post-processing of measurements $M_1,M_2,M_3$: consider a uniform probability distribution $\{p(i)=\tfrac{1}{3}\}_{i=1}^{3}$ over the measurement settings $\{M_i\}_{i=1}^{3}$ and relabel the respective measurement outcomes, i.e., $\{m_i\in\{0,1\}\}_{i=1}^{3}$, to a measurement outcome $m_{\rm mix}\in\{0,1\}$ according to the probability distributions

$$\big\{\{p(m_{\rm mix}|m_i)=\delta_{m_{\rm mix},m_i}\}_{m_{\rm mix}\in\{0,1\}}\big\}_{i=1}^{3};$$

coarse-graining over $m_i$ and $i$ then yields the effective measurement setting $M_{\rm mix}$ with outcomes labelled by $m_{\rm mix}\in\{0,1\}$. In contrast to the kinds of coarse-graining (over measurement outcomes) that appear in KS-noncontextuality (which we discussed in Section 2.3), the (probabilistic) coarse-graining here is over the measurement settings themselves while retaining the outcome labels.¹⁰ We require that this coarse-graining relation be respected in the operational as well as the ontological description. In the operational description, this coarse-graining is represented by

$$\forall[s|S],\ b\in\{0,1\}:\quad p(m_{\rm mix}=b,s|M_{\rm mix},S)\equiv\frac{1}{3}\sum_{i=1}^{3}p(m_i=b,s|M_i,S). \tag{25}$$

¹⁰ We did not discuss these more general types of classical post-processing in Section 2.3 because they are not relevant to the treatment of Kochen-Specker type scenarios in the Spekkens framework. The example we present here is from Ref. [13], which is not of Kochen-Specker type. The general principle underlying the representation of such classical post-processings is, however, the same: they should be respected in the operational as well as the ontological description.

We require the following operational equivalence between measurement events of $M_{\rm mix}$ and $M_{\rm fcf}$ with respect to which we invoke the assumption of measurement noncontextuality:

$$\forall b\in\{0,1\}:\quad[m_{\rm mix}=b|M_{\rm mix}]\simeq[m_{\rm fcf}=b|M_{\rm fcf}]. \tag{26}$$

If we then look at an operational quantity quantifying source-measurement correlations, namely,

$$\mathrm{Corr}_{\rm fcf}\equiv\sum_{i=1}^{3}\frac{1}{3}\sum_{m_i,s_i}\delta_{m_i,s_i}\,p(m_i,s_i|M_i,S_i), \tag{27}$$

then the assumption of preparation noncontextuality applied to the operational equivalence in Eq. (23) (so that $\mu(\lambda|S_1)=\mu(\lambda|S_2)=\mu(\lambda|S_3)$ for all $\lambda\in\Lambda$) and the assumption of measurement noncontextuality applied to the operational equivalence in Eq. (26) (so that $\tfrac{1}{3}\xi(0|M_1,\lambda)+\tfrac{1}{3}\xi(0|M_2,\lambda)+\tfrac{1}{3}\xi(0|M_3,\lambda)=\tfrac{1}{2}$ for all $\lambda\in\Lambda$) lead to the following constraint:

$$\mathrm{Corr}_{\rm fcf}\le\frac{5}{6}. \tag{28}$$

To see how this is obtained, note that

$$\begin{aligned}
&\sum_{i=1}^{3}\frac{1}{3}\sum_{m_i,s_i}\delta_{m_i,s_i}\,p(m_i,s_i|M_i,S_i)\\
&=\sum_{i=1}^{3}\frac{1}{3}\sum_{m_i,s_i}\delta_{m_i,s_i}\sum_{\lambda\in\Lambda}\xi(m_i|M_i,\lambda)\,\mu(\lambda,s_i|S_i)\\
&\le\sum_{i=1}^{3}\frac{1}{3}\sum_{\lambda\in\Lambda}\max_{m_i}\xi(m_i|M_i,\lambda)\sum_{m_i,s_i}\delta_{m_i,s_i}\,\mu(\lambda,s_i|S_i)\\
&=\sum_{i=1}^{3}\frac{1}{3}\sum_{\lambda\in\Lambda}\zeta(M_i,\lambda)\sum_{s_i}\mu(\lambda,s_i|S_i)\\
&=\sum_{\lambda\in\Lambda}\sum_{i=1}^{3}\frac{1}{3}\,\zeta(M_i,\lambda)\,\nu(\lambda),
\end{aligned} \tag{29}$$

where we have that $\zeta(M_i,\lambda)\equiv\max_{m_i}\xi(m_i|M_i,\lambda)$ and that $\nu(\lambda)\equiv\mu(\lambda|S_i)$ for all $i\in\{1,2,3\}$. This allows us to put the upper bound

$$\mathrm{Corr}_{\rm fcf}\le\max_{\lambda\in\Lambda}\frac{1}{3}\sum_{i=1}^{3}\zeta(M_i,\lambda), \tag{30}$$

which, subject to the constraint (from measurement noncontextuality) that $\tfrac{1}{3}\xi(0|M_1,\lambda)+\tfrac{1}{3}\xi(0|M_2,\lambda)+\tfrac{1}{3}\xi(0|M_3,\lambda)=\tfrac{1}{2}$, yields Eq. (28).¹¹

¹¹ The reader may look at Appendix B.1 of Ref. [13] to convince themselves that the maximum is achieved for an assignment of response functions of the type $\xi(0|M_1,\lambda)=1$, $\xi(0|M_2,\lambda)=\tfrac{1}{2}$ and $\xi(0|M_3,\lambda)=0$ for some $\lambda$.

It turns out that in quantum theory the sources and measurements required for this scenario can be realized on a qubit and they can, in principle, achieve the value $\mathrm{Corr}_{\rm fcf}=1$. This can be achieved by taking the three preparations to be the trine preparations on an equatorial plane (say, the Z-X plane) of the Bloch sphere and the measurements $\{M_i\}_{i=1}^{3}$ to be the trine measurements, i.e.,

$$\begin{aligned}
\rho_{[s_i=0|S_i]}&\equiv\tfrac{1}{2}(I+\vec{\sigma}\cdot\vec{n}_i)\equiv\Pi_i^{0},\\
\rho_{[s_i=1|S_i]}&\equiv\tfrac{1}{2}(I-\vec{\sigma}\cdot\vec{n}_i)\equiv\Pi_i^{1},\\
E_{[m_i=0|M_i]}&\equiv\Pi_i^{0},\\
E_{[m_i=1|M_i]}&\equiv\Pi_i^{1},
\end{aligned} \tag{31}$$

where $\vec{n}_1\equiv(0,0,1)$, $\vec{n}_2\equiv(\tfrac{\sqrt{3}}{2},0,-\tfrac{1}{2})$, $\vec{n}_3\equiv(-\tfrac{\sqrt{3}}{2},0,-\tfrac{1}{2})$, and $\vec{\sigma}\equiv(\sigma_x,\sigma_y,\sigma_z)$ denotes the three Pauli matrices $\sigma_x=\begin{pmatrix}0&1\\1&0\end{pmatrix}$, $\sigma_y=\begin{pmatrix}0&-i\\i&0\end{pmatrix}$, and $\sigma_z=\begin{pmatrix}1&0\\0&-1\end{pmatrix}$. The operational equivalences are then easy to verify:

$$\rho_{[\top|S_{i\top}]}=\frac{I}{2},\ \forall i\in\{1,2,3\},\qquad\frac{1}{3}\sum_{i=1}^{3}\Pi_i^{0}=\frac{I}{2}. \tag{32}$$
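The numbers in this example can be checked with a short script. The sketch below is an illustration under stated assumptions, not the derivation of Ref. [13]: it builds the trine projectors of Eq. (31) as real 2x2 matrices, evaluates Eq. (27) with $p(m_i,s_i|M_i,S_i)=\tfrac{1}{2}\mathrm{Tr}(\Pi_i^{m_i}\Pi_i^{s_i})$, and runs a coarse grid search (a hypothetical stand-in for the analytic maximization) over response functions satisfying the measurement noncontextuality constraint. All function names are made up for this sketch.

```python
import math
from fractions import Fraction

def projector(nx, nz):
    """(I + nx*sigma_x + nz*sigma_z)/2 as a real 2x2 matrix (unit vector in the Z-X plane)."""
    return [[(1 + nz) / 2, nx / 2], [nx / 2, (1 - nz) / 2]]

def complement(p):
    """I - p for a 2x2 matrix p."""
    return [[1 - p[0][0], -p[0][1]], [-p[1][0], 1 - p[1][1]]]

def trace_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

# Trine directions (n_x, n_z) of Eq. (31); the y-component vanishes.
DIRECTIONS = [(0.0, 1.0), (math.sqrt(3) / 2, -0.5), (-math.sqrt(3) / 2, -0.5)]
PI = [(projector(nx, nz), complement(projector(nx, nz))) for nx, nz in DIRECTIONS]

def corr_fcf():
    """Eq. (27), using p(m_i, s_i|M_i, S_i) = (1/2) Tr(Pi_i^{m_i} Pi_i^{s_i})."""
    total = 0.0
    for projectors in PI:
        for m in (0, 1):
            for s in (0, 1):
                if m == s:  # Kronecker delta in Eq. (27)
                    total += (1 / 3) * 0.5 * trace_product(projectors[m], projectors[s])
    return total

def average_effect():
    """(1/3) sum_i Pi_i^0, to be compared with I/2 in Eq. (32)."""
    return [[sum(p[0][r][c] for p in PI) / 3 for c in range(2)] for r in range(2)]

def noncontextual_bound(steps=20):
    """Grid search for max (1/3) sum_i max(x_i, 1 - x_i), where x_i = xi(0|M_i, lambda)
    lies in [0, 1] and (x_1 + x_2 + x_3)/3 = 1/2; this is the right-hand side of Eq. (30)."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    best = Fraction(0)
    for x1 in grid:
        for x2 in grid:
            x3 = Fraction(3, 2) - x1 - x2
            if 0 <= x3 <= 1:
                best = max(best, sum(max(x, 1 - x) for x in (x1, x2, x3)) / 3)
    return best
```

On this grid the search returns 5/6, attained at assignments such as (1, 1/2, 0), while `corr_fcf()` evaluates to 1 up to floating-point rounding, so the quantum realization exceeds the noncontextual bound of Eq. (28).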

The quantity $\mathrm{Corr}_{\rm fcf}$ takes the value 1 for this quantum realization. The experimental violation of the noise-robust noncontextuality inequality, Eq. (28), was demonstrated in Ref. [13], where more details may be found.

Note that the fair coin flip inequality, Eq. (28), is not inspired by the kinds of operational equivalences that are relevant in a proof of the Kochen-Specker theorem, but employs other kinds of operational equivalences allowed in the Spekkens framework [18], i.e., the operational equivalences in Eqs. (23) and (26) do not arise from the same measurement outcome being shared by different measurements.

Our goal in the present paper is to provide a framework for noise-robust noncontextuality inequalities obtained from statistical proofs of the KS theorem, in particular those that are covered by the CSW framework [22], so that such inequalities can be put to an experimental test along the lines of Ref. [13] within the Spekkens framework. Hence, the operational equivalences between measurement events that will be of interest to us in this paper are precisely those which allow for a proof of the KS theorem, i.e., those which correspond to the same measurement outcome (e.g., a projector) being shared by different measurements (e.g., projective measurements).

2.7 Connection to Bell scenarios

As further motivation to study the questions we are posing, note that one can also view the general prepare-and-measure scenario we are considering in this paper (Fig. 1) as arising on one wing of a two-party Bell experiment: that is, given two parties, Alice and Bob, sharing an entangled state and performing local measurements in a Bell experiment, one can view each choice of measurement setting on Alice's side as preparing an ensemble of states on Bob's side; on account of no-signalling, the reduced state on Bob's side will be the same regardless of Alice's choice of measurement setting, i.e., all the ensembles corresponding to Alice's measurement settings (hence, Bob's source settings) will be operationally equivalent.

For example, consider a Bell experiment where Alice has two choices of measurement settings, $M_x^A\equiv\sigma_x$ or $M_z^A\equiv\sigma_z$, and she shares a Bell state with Bob: $|\psi\rangle=\tfrac{1}{\sqrt{2}}(|00\rangle+|11\rangle)$. Bob has access to some set of measurement settings $\mathbb{M}^B\equiv\{M_j^B\}_j$ on his system. When Alice measures $M_x^A$, she prepares the ensemble of states $S_x^A\equiv\{(1/2,\ \rho_{[s_x^A=0|S_x^A]}\equiv|+\rangle\langle+|),\ (1/2,\ \rho_{[s_x^A=1|S_x^A]}\equiv|-\rangle\langle-|)\}$ on Bob's side and when she measures $M_z^A$ she prepares the ensemble of states $S_z^A\equiv\{(1/2,\ \rho_{[s_z^A=0|S_z^A]}\equiv|0\rangle\langle0|),\ (1/2,\ \rho_{[s_z^A=1|S_z^A]}\equiv|1\rangle\langle1|)\}$. These ensembles are operationally equivalent, yielding the maximally mixed state on coarse-graining, i.e.,

$$\frac{1}{2}|0\rangle\langle0|+\frac{1}{2}|1\rangle\langle1|=\frac{1}{2}|+\rangle\langle+|+\frac{1}{2}|-\rangle\langle-|=\frac{I}{2}. \tag{33}$$

The quantity of interest in a Bell experiment, $p(m_i^A,m_j^B|M_i^A,M_j^B)$ ($i\in\{x,z\}$), is then formally the same as the quantity $p(s_i^A,m_j^B|S_i^A,M_j^B)$ that we are interested in for our prepare-and-measure scenario. In the ontological model describing the effective prepare-and-measure experiment on Bob's system, we have the following:

$$\begin{aligned}
p(s_i^A,m_j^B|S_i^A,M_j^B)&=\sum_{\lambda}\Pr(m_j^B|M_j^B,\lambda)\,\Pr(\lambda,s_i^A|S_i^A)\\
&=\sum_{\lambda}\Pr(m_j^B|M_j^B,\lambda)\,\Pr(s_i^A|S_i^A,\lambda)\,\Pr(\lambda|S_i^A).
\end{aligned} \tag{34}$$

Assuming preparation noncontextuality relative to the operational equivalence $[\top|S_x^A]\simeq[\top|S_z^A]$, we have $\Pr(\lambda|S_x^A)=\Pr(\lambda|S_z^A)\equiv\Pr(\lambda)$, so that

$$p(s_i^A,m_j^B|S_i^A,M_j^B)=\sum_{\lambda}\Pr(s_i^A|S_i^A,\lambda)\,\Pr(m_j^B|M_j^B,\lambda)\,\Pr(\lambda), \tag{35}$$

which formally resembles the expression for local causality when applied to the corresponding two-party Bell experiment:

$$p(m_i^A,m_j^B|M_i^A,M_j^B)=\sum_{\lambda}\Pr(m_i^A|M_i^A,\lambda)\,\Pr(m_j^B|M_j^B,\lambda)\,\Pr(\lambda). \tag{36}$$

If no other assumption of noncontextuality is invoked besides the one applied to the operational equivalence of source settings on Bob's system, then the constraints on $p(s_i^A,m_j^B|S_i^A,M_j^B)$ will be the same as the constraints on $p(m_i^A,m_j^B|M_i^A,M_j^B)$ from Bell inequalities.

Note, however, that the response functions $\Pr(m_j^B|M_j^B,\lambda)$ and $\Pr(m_i^A|M_i^A,\lambda)$ can be completely arbitrary in a locally causal ontological model for the Bell experiment and the same applies to the distributions $\Pr(s_i^A|S_i^A,\lambda)$ and $\Pr(m_j^B|M_j^B,\lambda)$ in a preparation noncontextual model of the corresponding prepare-and-measure scenario on Bob's side. We will be interested in imposing additional constraints on the response functions $\Pr(m_j^B|M_j^B,\lambda)$ of the prepare-and-measure scenario (on Bob's side) that follow from the assumption of measurement noncontextuality applied to operational equivalences between measurement events on Bob's side. In particular, we are interested in those operational equivalences between measurement events that are required by any statistical proof of the Kochen-Specker theorem [16, 47]. We develop this approach more carefully in the following sections.

3 Hypergraph approach to Kochen-Specker scenarios in the Spekkens framework

Having set up the framework needed to articulate the relevant notions in Section 2, we now proceed to consider Kochen-Specker type experimental scenarios in this framework. To do this, we will use the language of hypergraphs and their subgraphs to represent the operational equivalences between measurement events that are required in a Kochen-Specker argument as well as the operational equivalences between source settings that we will invoke in our generalization. The (hyper)graph-theoretic ingredients of our approach will represent those aspects of the general framework of Section 2 that are necessary to go from the CSW framework for KS-contextuality to a hypergraph framework for Spekkens contextuality applied to Kochen-Specker type experimental scenarios.

Our presentation will be a hybrid one, discussing features of the CSW framework [21, 22] in the notation of the AFLS framework [23], but extending both in ways appropriate for the purpose of this paper. Our goal is to demonstrate how the graph-theoretic invariants of CSW [22] can be repurposed towards obtaining noise-robust noncontextuality inequalities.

We do this in two parts: first, we define a representation of measurement events in the manner of Refs. [22, 23], and then we define a representation of source events in the spirit of Ref. [12].

3.1 Measurements

The basic object for representing measurements is a hypergraph, $\Gamma$, with a finite set of vertices $V(\Gamma)$ such that each vertex $v\in V(\Gamma)$ denotes a measurement outcome, and a set of hyperedges $E(\Gamma)$ such that each hyperedge $e\in E(\Gamma)$ is a subset of $V(\Gamma)$ and denotes a measurement consisting of outcomes in $e$. Here, $E(\Gamma)\subseteq 2^{V(\Gamma)}$ and $\bigcup_{e\in E(\Gamma)}e=V(\Gamma)$. Such a hypergraph satisfies the definition of a contextuality scenario à la AFLS [23]. We will further assume, unless specified otherwise, that the hypergraph is simple: that is, for all $e_1,e_2\in E(\Gamma)$, $e_1\subseteq e_2\Rightarrow e_1=e_2$, or that no hyperedge is a strict subset of another. Such hypergraphs are also called Sperner families [46]. Two measurement events are said to be (mutually) exclusive if the vertices denoting them appear in a common hyperedge, i.e., if they can be realized as outcomes of a single measurement setting.

The structure of a contextuality scenario $\Gamma$ represents the operational equivalences between measurement events that are of interest in a Kochen-Specker argument. We emphasize here that we take the operational theory to be fundamental and the contextuality scenario for a particular Kochen-Specker argument to be derived from (and as a graphical representation of) the operational equivalences in the operational theory (cf. Section 2). In particular, depending on the operational equivalences that an operational theory can exhibit (by virtue of (in)compatibility relations between measurements), it may or may not allow some contextuality scenario to be realized by measurement events in the theory. The fact that a given vertex, say $v\in V(\Gamma)$, appears in multiple hyperedges, say $E'\equiv\{e\in E(\Gamma)\,|\,v\in e\}$, means that the measurement events corresponding to this vertex, i.e., $\{[v|e]\}_{e\in E'}$, are operationally equivalent, and the equivalence class of these measurement events is denoted by the vertex $v$ itself. In the case of quantum theory, for example, $v$ can represent a positive operator that appears in different positive operator-valued measures (POVMs) represented by the hyperedges.

A probabilistic model on $\Gamma$ is an assignment of probabilities to the vertices $v\in V(\Gamma)$ such that $p(v)\ge 0$ for all $v\in V(\Gamma)$ and $\sum_{v\in e}p(v)=1$ for all $e\in E(\Gamma)$. As we have noted, every vertex $v$ represents an equivalence class of measurement events, denoted $[m|M]$, and every hyperedge $e$ represents an equivalence class of measurement procedures, denoted $M$.¹² The fact that each $v$ represents an equivalence class of measurement events means that

1. any probabilistic model $p$ on $\Gamma$, realized by operational probabilities for a given source event (that is, where for all $v\in V(\Gamma)$ and a given $[s|S]$, $p(v)\equiv p(v|S,s)\equiv p(m|M,S,s)$), is consistent with the operational equivalences represented by $\Gamma$, and

2. any probabilistic model on $\Gamma$, realized by ontological probabilities for a given ontic state (that is, where for all $v\in V(\Gamma)$ and a given ontic state $\lambda$, $p(v)\equiv p(v|\lambda)\equiv\xi(m|M,\lambda)$), respects (by definition) the assumption of measurement noncontextuality with respect to the presumed operational equivalences between measurement events.

¹² Note that two measurement procedures with measurement settings $M$ and $M'$ are operationally equivalent if every measurement event of one is operationally equivalent to a distinct measurement event of the other. That is, there is a bijective correspondence (of operational equivalence) between the two sets of measurement events. In quantum theory, for example, a given POVM (which is what a hyperedge would represent), say $\{E_k\}_k$, can be implemented in many possible ways, each such measurement procedure corresponding to a different quantum instrument. Mathematically, these different procedures can be represented by different sets of operators $\{O_k\}_k$ such that $E_k=O_k^\dagger O_k$ for all $k$ and $\sum_k O_k^\dagger O_k=I$.

We will therefore often write $p(m,s|M,S)$ as $p(v,s|S)$ and $p(m|M,S,s)$ as $p(v|S,s)$, where $[s|S]$ is a source event. Similarly, we will also write $\xi(m|M,\lambda)$ as $p(v|\lambda)$, where $\lambda$ is an ontic state.

Orthogonality graph of $\Gamma$, $O(\Gamma)$: Given the hypergraph $\Gamma$, we construct its orthogonality graph $O(\Gamma)$: that is, the vertices of $O(\Gamma)$ are given by $V(O(\Gamma))\equiv V(\Gamma)$, and the edges of $O(\Gamma)$ are given by $E(O(\Gamma))\equiv\{\{v,v'\}\,|\,v,v'\in e\text{ for some }e\in E(\Gamma)\}$.
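The definitions of this subsection translate directly into set operations. The sketch below uses a toy contextuality scenario (two three-outcome measurements sharing one vertex), which is a made-up example rather than a scenario from the paper; the function names are likewise hypothetical. It checks the covering and simplicity conditions, tests whether an assignment is a probabilistic model, and builds the orthogonality graph.

```python
from fractions import Fraction
from itertools import combinations

# Toy scenario (hypothetical): two three-outcome measurements sharing the vertex "v3".
vertices = {"v1", "v2", "v3", "v4", "v5"}
hyperedges = [frozenset({"v1", "v2", "v3"}), frozenset({"v3", "v4", "v5"})]

def covers(vertices, hyperedges):
    """Every vertex lies in at least one hyperedge: the union of E(Gamma) equals V(Gamma)."""
    return set().union(*hyperedges) == set(vertices)

def is_simple(hyperedges):
    """No hyperedge is a strict subset of another (Sperner family)."""
    return not any(e1 < e2 for e1 in hyperedges for e2 in hyperedges)

def is_probabilistic_model(p, vertices, hyperedges):
    """p(v) >= 0 for all vertices and the probabilities in each hyperedge sum to 1."""
    nonnegative = all(p[v] >= 0 for v in vertices)
    normalized = all(sum(p[v] for v in e) == 1 for e in hyperedges)
    return nonnegative and normalized

def orthogonality_graph(hyperedges):
    """Edges {v, v'} for every pair of distinct vertices sharing a hyperedge."""
    return {frozenset(pair) for e in hyperedges for pair in combinations(sorted(e), 2)}

# Exact arithmetic avoids floating-point issues in the normalization check.
p = {"v1": Fraction(1, 2), "v2": Fraction(1, 4), "v3": Fraction(1, 4),
     "v4": Fraction(1, 4), "v5": Fraction(1, 2)}
edges = orthogonality_graph(hyperedges)
```

In this toy scenario the shared vertex "v3" plays the role of a measurement event common to two measurements, and the orthogonality graph has six edges, three from each hyperedge.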

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 13

Figure 2: The KCBS scenario with 4-outcome joint measurements, visualized as a hypergraph Γ [16, 22, 47].

Figure 3: A subgraph of KCBS hypergraph Γ, representing orthogonality relations of the events of interest in the KCBS inequality [22, 47].

Each edge of O(Γ) denotes the exclusivity of the two measurement events it connects, i.e., the fact that they can occur as outcomes of a single measurement.

For any Bell-KS inequality constraining correlations between measurement events from O(Γ) (when all measurements are implemented on a given source event), we construct a subgraph G of O(Γ) such that the vertices of G, i.e., V(G), correspond to measurement events that appear in the inequality with nonzero coefficients, and two vertices share an edge in G if and only if they share an edge in O(Γ). More explicitly, consider a Bell-KS expression

$$R([s|S]) \equiv \sum_{v \in V(G)} w_v\, p(v|S, s), \qquad (37)$$

where w_v > 0 for all v ∈ V(G). A Bell-KS inequality imposes a constraint of the form R([s|S]) ≤ R_KS, where R_KS is the upper bound on the expression in any operational theory that admits a KS-noncontextual ontological model. Often, but not always, these inequalities take the simple form where w_v = 1 for all v ∈ V(G). In keeping with the CSW notation [22], we will denote the general situation by a weighted graph (G, w), where w is a function that maps vertices v ∈ V(G) to weights w_v > 0. See Figures 2 and 3 for an example from the Klyachko-Can-Binicioğlu-Shumovsky (KCBS) scenario [22, 47].

Below, we make some remarks clarifying the scope of the framework described above before we move to the case of sources.

3.1.1 Classification of probabilistic models

We classify the probabilistic models on a hypergraph Γ as follows:

• KS-noncontextual probabilistic models, C(Γ): a probabilistic model which is a convex combination of deterministic assignments p : V(Γ) → {0, 1}, where $\sum_{v \in e} p(v) = 1$ for all e ∈ E(Γ). In Ref. [23], this is referred to as a “classical model”.¹³

¹³ We use a different term because we are advocating a revision of the notion of classicality from KS-noncontextuality to generalized noncontextuality à la Spekkens.

Note that we call Γ KS-colourable if C(Γ) ≠ ∅ and we call it KS-uncolourable if C(Γ) = ∅. Our terminology here is inspired by the traditional usage of the term “Kochen-Specker colouring” to refer to an assignment of two colours to vectors satisfying some orthogonality relations under the colouring constraints of the KS theorem [48].

• Consistent exclusivity satisfying probabilistic models, CE¹(Γ): a probabilistic model on Γ, p : V(Γ) → [0, 1], such that (in addition to satisfying the definition of a probabilistic model), $\sum_{v \in c} p(v) \leq 1$ for all cliques c in the orthogonality graph O(Γ). This is the same as the set of E¹ probabilistic models of Ref. [22].

Note that a clique in the orthogonality graph O(Γ) is a set of vertices that are pairwise exclusive (i.e., every vertex in this set shares an edge with every other vertex).

• General probabilistic models, G(Γ): Any p that satisfies the definition of a probabilistic model is a general probabilistic model, i.e., it can arise from measurements in some general probabilistic theory [1] that isn’t necessarily quantum.

The set of all probabilistic models G(Γ) (for any Γ) forms a polytope since it is defined by just the positivity and normalization constraints on the probabilities. The extremal points (or vertices) of this polytope fall into two categories that will interest us: deterministic and indeterministic. The deterministic extremal points are

the p : V(Γ) → {0, 1} such that $\sum_{v \in e} p(v) = 1$ for all e ∈ E(Γ) and we denote the set of these points by G(Γ)|_det. The indeterministic extremal points are the p ∈ G(Γ) which are not deterministic and which, furthermore, cannot be expressed as a convex mixture of other points in G(Γ). We denote the set of indeterministic extremal points by G(Γ)|_ind. Clearly, G(Γ)|_det ⊊ C(Γ) and G(Γ)|_ind ⊆ G(Γ)\C(Γ).

Overall, we have

$$C(\Gamma) \subseteq CE^1(\Gamma) \subseteq G(\Gamma) \qquad (38)$$

for any hypergraph Γ.

3.1.2 Distinguishing two consequences of Specker’s principle: Structural Specker’s principle vs. Statistical Specker’s principle

The CSW framework [22] restricts the scope of probabilistic models on a hypergraph to those satisfying consistent exclusivity (the E¹ probabilistic models), motivated by what is sometimes called Specker’s principle [35]: that is,

“if you have several questions and you can answer any two of them, then you can also answer all of them”

If by “questions” we understand measurement settings, then the principle says that a set of pairwise jointly implementable measurement settings is itself jointly implementable. Note that when we say a set of measurement settings is “jointly implementable”, “jointly measurable”, or “compatible”, we mean that there exists another choice of a single measurement setting in the theory such that this measurement setting can reproduce the statistics of all the measurement settings in the set by coarse-graining.¹⁴ As such, in its application to measurement settings, Specker’s principle is a constraint on the measurements allowed in a physical theory that respects it, e.g., measurement settings that correspond to PVMs (projection valued measures) in quantum theory. This is, for example, the reading adopted in Ref. [49], where the failure of Specker’s principle in any almost quantum theory was demonstrated. On the other hand, we will often also refer to the “joint measurability” of a set of measurement events, by which we mean that this set of measurement events is a subset of the set of measurement outcomes for some choice of measurement setting. At the level of measurement events,¹⁵ then, there are two distinct ways to read Specker’s principle that one needs to keep in mind, which we distinguish as structural Specker’s principle vs. statistical Specker’s principle. We define these two readings below:

¹⁴ The reader may recall from Section 2.4 the general definition of compatibility. Also, see Ref. [44] for an overview of joint measurability in quantum theory.

¹⁵ Recall that a measurement event is a measurement outcome given a choice of measurement setting, e.g., a projector that appears in a particular PVM in quantum theory.

• Structural Specker’s principle imposes a structural constraint on a contextuality scenario Γ.

This (strong) reading of Specker’s principle applies to any set of measurement events, say M ⊆ V(Γ), where every pair of measurement events can arise as outcomes of a single measurement: that is, for each pair {v, v′} ⊆ M, there exists some e ∈ E(Γ) such that {v, v′} ⊆ e. The principle then states:

Given a set M of pairwise jointly measurable measurement events in some contextuality scenario Γ, all the measurement events in M are jointly measurable, i.e., all the measurement events in the set can arise as outcomes of a single measurement: M ⊆ e for some e ∈ E(Γ).

Alternatively, the constraint of structural Specker’s principle can be restated as:

Every clique in the orthogonality graph of Γ, O(Γ), is a subset of some hyperedge in Γ.

Note that we haven’t said anything directly about probabilities here: any Γ satisfying the above property is said to satisfy structural Specker’s principle.

• Statistical Specker’s principle (or consistent exclusivity) imposes a statistical constraint on probabilistic models on any contextuality scenario Γ representing measurement events in an operational theory.

This (weak) reading of Specker’s principle imposes an additional constraint on a probabilistic model p ∈ G(Γ) (thus defining CE¹(Γ) ⊆ G(Γ)), namely:

Given a set M of pairwise jointly measurable measurement events, p satisfies $\sum_{v \in M} p(v) \leq 1$.

This can also be expressed as:

A probabilistic model p ∈ G(Γ) is said to satisfy statistical Specker’s principle if the sum of probabilities it assigns to the vertices of every clique in the orthogonality graph of Γ, O(Γ), does not exceed 1, i.e., $\sum_{v \in c} p(v) \leq 1$ for all cliques c in O(Γ).

All probabilistic models that satisfy this constraint define the set of probabilistic models CE¹(Γ) (or E¹) for any contextuality scenario Γ regardless of whether Γ satisfies structural Specker’s principle. Clearly, CE¹(Γ) ⊆ G(Γ).

Any probabilistic model p on Γ such that p ∈ CE¹(Γ) is said to satisfy statistical Specker’s principle or, equivalently, consistent exclusivity [23].
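The two readings of Specker’s principle can be checked mechanically on small scenarios. The sketch below is a brute-force illustration only: the function names, the tolerance parameter `tol`, and the toy triangle scenario are assumptions introduced here, and the clique enumeration is exponential in the number of vertices.

```python
from itertools import combinations

def cliques(hyperedges):
    """All non-empty cliques of O(Gamma), found by brute force."""
    hs = [frozenset(e) for e in hyperedges]
    vertices = sorted(set().union(*hs))

    def exclusive(u, v):
        # u and v are exclusive iff they share a hyperedge
        return any(u in e and v in e for e in hs)

    found = []
    for r in range(1, len(vertices) + 1):
        for subset in combinations(vertices, r):
            if all(exclusive(u, v) for u, v in combinations(subset, 2)):
                found.append(frozenset(subset))
    return found

def satisfies_structural(hyperedges):
    """Structural reading: every clique of O(Gamma) lies inside some hyperedge."""
    hs = [frozenset(e) for e in hyperedges]
    return all(any(c <= e for e in hs) for c in cliques(hs))

def is_probabilistic_model(p, hyperedges, tol=1e-9):
    """Membership in G(Gamma): positivity plus normalization on each hyperedge."""
    in_range = all(-tol <= x <= 1 + tol for x in p.values())
    normalized = all(abs(sum(p[v] for v in e) - 1) <= tol for e in hyperedges)
    return in_range and normalized

def satisfies_statistical(p, hyperedges, tol=1e-9):
    """Statistical reading: p is in G(Gamma) and sums to at most 1 on every clique."""
    return is_probabilistic_model(p, hyperedges, tol) and all(
        sum(p[v] for v in c) <= 1 + tol for c in cliques(hyperedges)
    )

# Hypothetical toy scenario: three pairwise exclusive events, no common hyperedge.
triangle = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
p_half = {"a": 0.5, "b": 0.5, "c": 0.5}
```

On this toy triangle the clique {a, b, c} is not contained in any hyperedge, so the structural reading fails, and the model assigning 1/2 to every vertex lies in G(Γ) but sums to 3/2 on that clique, so it falls outside CE¹(Γ).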

Probabilistic models on any hypergraph Γ which satisfies the (strong) structural Specker’s principle obviously satisfy the (weak) statistical Specker’s principle. This holds simply on account of the structure of such Γ: that is, for all Γ satisfying structural Specker’s principle, we have CE¹(Γ) = G(Γ). To see this, note that every clique c in O(Γ) is a subset of some hyperedge in Γ, hence for every clique c, $\sum_{v \in c} p(v) \leq 1$ for all p ∈ G(Γ), i.e., p ∈ CE¹(Γ).¹⁶ On the other hand, it remains an open question whether the converse is true:

That is, given that CE¹(Γ) = G(Γ) for some Γ, is it the case that Γ must then necessarily satisfy structural Specker’s principle, namely, that every clique in O(Γ) is a subset of some hyperedge in Γ?

A positive answer to this question would answer Problem 7.2.3 of Ref. [23] asking for a characterization of Γ for which CE¹(Γ) = G(Γ).

¹⁶ This partially answers the open Problem 7.2.3 of Ref. [23].

3.1.3 What does it mean for an operational theory to satisfy structural/statistical Specker’s principle?

We have so far defined structural Specker’s principle as a constraint on Γ and statistical Specker’s principle as a constraint on a probabilistic model on any Γ. Any operational theory would typically allow many possible Γ to be realized by its measurement events as well as many possible probabilistic models to be realized on any Γ representing its measurement events. Note that when we say that a particular Γ is “realizable” or “allowed” by an operational theory, we mean that there exist measurement events in the operational theory that satisfy the operational equivalences required by Γ.¹⁷ Further, given such a Γ, the realizability of a probabilistic model on it by the operational theory means that there exists a source event in the operational theory that assigns probabilities to the measurement events in Γ according to the probabilistic model. It will be useful for our discussion to define what it means for an operational theory, say T, to satisfy structural or statistical Specker’s principle. But before we do that, let us formally specify what it means for T to satisfy Specker’s principle:

T satisfies Specker’s principle: An operational theory T is said to satisfy Specker’s principle if, for any set of measurement settings in T that are pairwise jointly implementable, it follows that they are all jointly implementable in T.¹⁸

¹⁷ Realizability of a particular Γ in an operational theory depends on the (in)compatibility relations that the operational theory allows between its measurements (cf. Section 2.4). Recall that incompatibility of measurements is necessary for KS-contextuality to be witnessed and the structure of Γ depends on this incompatibility.

¹⁸ Recall from Section 2.4 the definition of joint implementability (or joint measurability) of some set of measurement settings.

We denote by T(Γ) the set of probabilistic models achievable on Γ by an operational theory T, i.e., for any p ∈ T(Γ), we have that ∀v ∈ V(Γ) : p(v) = p(v|S, s) for some source event [s|S] possible in the operational theory T.¹⁹ Since an operational theory can only put further constraints on probabilistic models in G(Γ), we obviously have: T(Γ) ⊆ G(Γ).

¹⁹ Note that if the operational theory does not admit measurement events (represented by vertices) exhibiting the operational equivalences represented by Γ (that is, T does not allow Γ), then we have that T(Γ) is an empty set.

1. T satisfies statistical Specker’s principle:
We say an operational theory T satisfies statistical Specker’s principle if T(Γ) ⊆ CE¹(Γ) ⊆ G(Γ) for all Γ.²⁰

²⁰ That is, instead of considering only a particular probabilistic model on a particular Γ, we now consider the satisfaction of statistical Specker’s principle by a whole set of probabilistic models, namely, T(Γ), for all Γ.

Since the satisfaction of statistical Specker’s principle is a constraint on the statistical predictions of T, there must be some fact about the structure of theory T that leads to this constraint. This fact enforcing statistical Specker’s principle could be some restriction arising from the structure of allowed measurement events and/or even the structure of allowed preparations in the operational theory T. For instance, this is the case for quantum theory when one only considers projective measurements implemented on an arbitrary quantum state, i.e., Q(Γ) ⊆ CE¹(Γ) ⊆ G(Γ), where Q(Γ) denotes the set of probabilistic models that can be obtained in this way. More generally, one could relax the no-restriction hypothesis [3] in some particular way in T so that not all probabilistic models in G(Γ) are allowed in T(Γ). In the case of quantum theory, restricting attention to only projective measurements (as we just pointed out) rather than the more general case allowing arbitrary POVMs is one way of restricting the set of possible probabilistic models realizable with quantum states and measurements to a strict subset of G(Γ). Allowing arbitrary POVMs would lead to a violation of statistical Specker’s principle by probabilistic models arising from quantum theory.²¹

²¹ See Appendices A (specifically A.1.2) and C for other consequences of allowing arbitrary POVMs, in particular the trivial ‘classical’ ones.

Let us now define what it means for an operational theory T to satisfy structural Specker’s principle.

2. T satisfies structural Specker’s principle:
An operational theory T is said to satisfy structural Specker’s principle if for any set of measurement events that are pairwise jointly measurable, i.e., measurement events in each pair arise

as outcomes of some measurement in the theory, it is the case that all the measurement events in the set are jointly measurable, i.e., all the measurement events in the set arise as outcomes of a single measurement in the theory.

We now show that a theory T that satisfies Specker’s principle also satisfies structural Specker’s principle.

Theorem 1. If an operational theory T satisfies Specker’s principle, then it also satisfies structural Specker’s principle.

Proof. The argument here relies on the fact that the operational theory T is such that measurement settings can be coarse-grained to yield new measurement settings with fewer outcomes. Operationally, this just corresponds to binning some subsets of outcomes together in a measurement procedure. The operational theories we consider in this paper satisfy this property, as outlined in Section 2.3 on coarse-graining. The argument proceeds, for any Γ realizable in T, by constructing a set of binary-outcome measurement settings for any given set of pairwise jointly measurable vertices in Γ. These measurement settings are, by construction, pairwise jointly measurable, so Specker’s principle applied to them implies that they are all jointly measurable. This in turn means that the pairwise jointly measurable vertices in the given set are also all realizable as outcomes of a single measurement setting. Hence, the theory T satisfies structural Specker’s principle. We detail the argument below.

Consider a contextuality scenario Γ realizable in T. To each vertex v ∈ V(Γ), we can associate a measurement setting M_v with two possible outcomes labelled {0, 1} such that [1|M_v] denotes the occurrence of v and [0|M_v] denotes the non-occurrence of v, i.e., p(v|S, s) = p(1|M_v, S, s) and 1 − p(v|S, s) = p(0|M_v, S, s) for any probabilistic model on Γ induced by some source event [s|S]. The measurement setting M_v can be obtained in various (operationally equivalent) ways from the hyperedges that v ∈ V(Γ) appears in: for each hyperedge e ∈ E(Γ) such that v ∈ e, we have that the binary-outcome measurement setting consisting of the vertices {v, e\v} (where e\v denotes a coarse-graining over all the measurement outcomes of e except v) is operationally equivalent to M_v.

Now, for any pair of vertices {v, v′} that appear in a common hyperedge of Γ, consider the two corresponding measurement settings {M_v, M_v′} such that they are jointly measurable and their outcomes are mutually exclusive. The measurement events that can possibly occur in their joint measurement, denoted M_vv′, are [10|M_vv′], [01|M_vv′] and [00|M_vv′]. The probability of [11|M_vv′] is always zero, reflecting the fact that v and v′ are mutually exclusive. Here, the coarse-graining relations are: [1|M_v] ≡ [10|M_vv′], [1|M_v′] ≡ [01|M_vv′], [0|M_v] ≡ [00|M_vv′] + [01|M_vv′], [0|M_v′] ≡ [00|M_vv′] + [10|M_vv′].

The joint measurement M_vv′ can be constructed from any hyperedge that v and v′ appear in: for any e ∈ E(Γ) such that {v, v′} ⊆ e, we have that [10|M_vv′] is a measurement event corresponding²² to v, [01|M_vv′] corresponds to v′, [00|M_vv′] corresponds to e\{v, v′} (the coarse-graining of all measurement outcomes in e except v and v′), and [11|M_vv′] denotes the null event ∅ ⊆ e. This means p(10|M_vv′, S, s) + p(01|M_vv′, S, s) + p(00|M_vv′, S, s) = p(v|S, s) + p(v′|S, s) + p(e\{v, v′}|S, s) = 1 and p(11|M_vv′, S, s) = 0 for any probabilistic model (induced by some source event [s|S]) on Γ.

²² Recall that every vertex v ∈ V(Γ) is an equivalence class of measurement events [v|e] ≃ [v|e′] for all e, e′ such that v ∈ e and v ∈ e′.

Consider now any set of vertices in Γ that is pairwise jointly measurable, denoted V_2JM ⊆ V(Γ). We need to show that any such set of vertices V_2JM is jointly measurable, i.e., the theory T realizing Γ admits a single measurement such that all the vertices in V_2JM arise as outcomes of this measurement.

Now, the two-outcome measurement settings {M_v | v ∈ V_2JM} we have defined are pairwise jointly measurable and as such, following Specker’s principle, they should all be jointly measurable in theory T. The joint measurement corresponding to them can be defined as

$$M_{V_{2JM}} \equiv \left\{ [\vec{b}\,|M_{V_{2JM}}] \;\middle|\; \vec{b} \in \{0, 1\}^{V_{2JM}} \right\}, \qquad (39)$$

where each event $[\vec{b}\,|M_{V_{2JM}}]$ in the joint measurement $M_{V_{2JM}}$ represents a particular set of outcomes for measurements in the set {M_v | v ∈ V_2JM}. Denoting $V_{2JM} \equiv \{v_1, v_2, \ldots, v_{|V_{2JM}|}\}$, we have that

$$[(10 \ldots 0)|M_{V_{2JM}}] \equiv [1|M_{v_1}],$$
$$[(01 \ldots 0)|M_{V_{2JM}}] \equiv [1|M_{v_2}],$$
$$\vdots$$
$$[(00 \ldots 1)|M_{V_{2JM}}] \equiv [1|M_{v_{|V_{2JM}|}}],$$
$$[(00 \ldots 0)|M_{V_{2JM}}] \equiv [0|M_{v_1}] + [0|M_{v_2}] + \cdots + [0|M_{v_{|V_{2JM}|}}], \qquad (40)$$

where $[0|M_{v_1}] + [0|M_{v_2}] + \cdots + [0|M_{v_{|V_{2JM}|}}]$ denotes the measurement event obtained by coarse-graining the measurement events in {[0|M_v] | v ∈ V_2JM}. All the other measurement events of $M_{V_{2JM}}$ are null events that never occur, i.e., they are assigned probability zero by every source event. Thus, using Specker’s principle applied to the binary-outcome measurement settings defined for the vertices in V_2JM, we have that the pairwise jointly measurable vertices in V_2JM are all jointly measurable, appearing as outcomes of a single measurement $M_{V_{2JM}}$.
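To make the bookkeeping in Eqs. (39) and (40) concrete, the following Python sketch tabulates the probabilities that a given probabilistic model would assign to the events of the joint measurement. It is only an illustration under stated assumptions: the function name `joint_distribution` and the example probabilities are hypothetical, and the weight of the all-zero string is taken to be one minus the sum of the p(v), which follows from normalization once all other strings are null events.

```python
from itertools import product

def joint_distribution(p, v2jm):
    """Probabilities of the events [b | M_V2JM] for all bit strings b.

    p    : dict mapping each vertex to its probability p(v|S, s)
    v2jm : ordered list of pairwise jointly measurable vertices
    """
    n = len(v2jm)
    total = sum(p[v] for v in v2jm)
    if total > 1 + 1e-9:
        # Outcomes of a single measurement cannot have total probability above 1.
        raise ValueError("probabilities of mutually exclusive events exceed 1")
    dist = {}
    for bits in product((0, 1), repeat=n):
        if sum(bits) == 1:
            # String with a single 1 in position i corresponds to [1 | M_{v_i}].
            dist[bits] = p[v2jm[bits.index(1)]]
        elif sum(bits) == 0:
            # None of the events occurred (assumed weight fixed by normalization).
            dist[bits] = 1 - total
        else:
            # Two or more exclusive events occurring together: a null event.
            dist[bits] = 0.0
    return dist

# Hypothetical probabilities for three mutually exclusive events.
p_example = {"v1": 0.2, "v2": 0.3, "v3": 0.1}
dist = joint_distribution(p_example, ["v1", "v2", "v3"])
```

Marginalizing `dist` over all bits except the i-th recovers the binary-outcome statistics of the corresponding setting, which is the sense in which the joint measurement reproduces each of them by coarse-graining.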

Having established Theorem 1, we now proceed to show that a theory which satisfies structural Specker’s principle also satisfies statistical Specker’s principle. To do this, we consider a contextuality scenario Γ which may not satisfy structural Specker’s principle and from it construct a contextuality scenario Γ′ which does satisfy the principle. The construction proceeds as follows:

1. Construct O(Γ).

2. Turn each clique in O(Γ) that is a hyperedge in Γ to a hyperedge in a new hypergraph Γ′. That is, Γ′ is such that V(Γ) ⊆ V(Γ′) and E(Γ) ⊆ E(Γ′).

3. Turn each maximal clique c in O(Γ) that is not a hyperedge in Γ to a hyperedge in Γ′ and include an additional vertex v_c in this hyperedge. Here, a maximal clique in a graph is a clique that is not a strict subset of another clique, i.e., there is no vertex outside the clique that shares an edge with each vertex in the clique.

We then have for the hyperedges of Γ′,

$$E(\Gamma') = E(\Gamma) \cup \{c \cup \{v_c\}\}_{c \in C}, \qquad (41)$$

where C is the set of maximal cliques in O(Γ) that are not hyperedges in Γ.

Note that as long as a theory T satisfies structural Specker’s principle, converting maximal cliques in O(Γ) that are not hyperedges in Γ to hyperedges in Γ′ is a valid move within the theory since the resulting hyperedge would indeed constitute a valid measurement in the theory.

If C = ∅ (i.e., Γ satisfies structural Specker’s principle), then we just have E(Γ′) = E(Γ).

4. The resulting contextuality scenario Γ′ is thus given by: $V(\Gamma') = V(\Gamma) \cup \{v_c\}_{c \in C}$ and $E(\Gamma') = E(\Gamma) \cup \{c \cup \{v_c\}\}_{c \in C}$.

If C = ∅ we just have V(Γ′) = V(Γ) and E(Γ′) = E(Γ) so that Γ′ = Γ (i.e., the two hypergraphs are isomorphic).

Our construction of Γ′ leads to the following properties:

• Γ′ satisfies structural Specker’s principle (by construction) since every clique in O(Γ′) is a subset of some hyperedge in Γ′. Hence, it’s also the case that statistical Specker’s principle holds for probabilistic models on Γ′ as CE¹(Γ′) = G(Γ′).

Note that the construction of Γ′ relied on the fact that the theory we are considering satisfies structural Specker’s principle. If the theory doesn’t satisfy this principle, but one goes ahead with the construction of Γ′, then the new hyperedges in Γ′ may not constitute valid measurements in the theory.

• Probabilistic models in G(Γ′) are in bijective correspondence with probabilistic models in CE¹(Γ): for any probabilistic model p_Γ ∈ CE¹(Γ), there exists a unique probabilistic model p_Γ′ ≡ f(p_Γ) ∈ G(Γ′), where the function f is given by p_Γ′(v) ≡ f(p_Γ)(v) = p_Γ(v) for all v ∈ V(Γ) and $p_{\Gamma'}(v_c) \equiv f(p_\Gamma)(v_c) = 1 - \sum_{v \in c} p_\Gamma(v)$ for all c ∈ C.²³ Similarly, for any p_Γ′ ∈ G(Γ′), there exists a unique probabilistic model p_Γ ≡ g(p_Γ′) ∈ CE¹(Γ) given by p_Γ(v) ≡ g(p_Γ′)(v) = p_Γ′(v) for all v ∈ V(Γ), i.e., we simply ignore the probabilities assigned to the vertices v_c ∈ V(Γ′)\V(Γ) which do not appear in Γ. Now note that the functions f and g are inverses of each other: g(f(p_Γ)) = g(p_Γ′) = p_Γ and f(g(p_Γ′)) = f(p_Γ) = p_Γ′. Hence, there is a bijective correspondence between G(Γ′) and CE¹(Γ).

²³ Recall that $\{v_c\}_{c \in C} = V(\Gamma') \setminus V(\Gamma)$.

• Hence, the set of probabilistic models on Γ that satisfy statistical Specker’s principle, i.e., CE¹(Γ), are in one-to-one correspondence with the set of probabilistic models on Γ′ which (by construction) satisfies structural Specker’s principle so that CE¹(Γ′) = G(Γ′).

We therefore have that $CE^1(\Gamma) = CE^1(\Gamma')|_{V(\Gamma)}$, where $CE^1(\Gamma')|_{V(\Gamma)}$ denotes the probabilistic models induced on Γ by those on Γ′ (ignoring the probabilities assigned to vertices in V(Γ′)\V(Γ)).

It is conceivable that a particular Γ may not admit probabilistic models from an operational theory T, i.e., T(Γ) = ∅. On the other hand, if Γ admits a representation in terms of measurement events admissible in T, so that T(Γ) ≠ ∅, then two possibilities arise: Γ satisfies structural Specker’s principle or it doesn’t. If Γ satisfies structural Specker’s principle then any probabilistic model in T(Γ) will satisfy statistical Specker’s principle and we have Γ′ = Γ. If Γ does not satisfy structural Specker’s principle, we consider its relation with the contextuality scenario Γ′ constructed from it that does satisfy structural Specker’s principle. Such a Γ′ admits a representation in a theory T satisfying structural Specker’s principle (that is, T(Γ′) ≠ ∅) as long as Γ admits such a representation (that is, T(Γ) ≠ ∅). Indeed, it’s the satisfaction of structural Specker’s principle in T that renders the construction of Γ′ from Γ physically allowed in T.

Thus, in a theory T that satisfies structural Specker’s principle, the following holds: for every probabilistic model p_Γ ∈ T(Γ) (⊆ CE¹(Γ)), Γ′ admits a corresponding probabilistic model p_Γ′ ∈ T(Γ′) satisfying p_Γ′(v) = p_Γ(v) for all v ∈ V(Γ) and $p_{\Gamma'}(v_c) = 1 - \sum_{v \in c} p_\Gamma(v)$ for all c ∈ C, where C is the set of maximal cliques in O(Γ) such that none of them is a hyperedge in Γ. Similarly, given p_Γ′ ∈ T(Γ′)

(⊆ CE¹(Γ′)), p_Γ ∈ T(Γ) is uniquely fixed: it’s obtained by just neglecting the probabilities assigned by p_Γ′ to the vertices in V(Γ′)\V(Γ).

We must therefore have $T(\Gamma) = T(\Gamma')|_{V(\Gamma)}$ for any Γ, where $T(\Gamma')|_{V(\Gamma)}$ denotes the set of probabilistic models induced on Γ by the set of probabilistic models in T(Γ′) under the correspondence we have already established above. We can now state and prove the following theorem:

Theorem 2. If an operational theory T satisfies structural Specker’s principle, then it also satisfies statistical Specker’s principle.

Proof. For any Γ that does not admit a probabilistic model in T, i.e., T(Γ) = ∅, statistical Specker’s principle is trivially satisfied since T(Γ) = ∅ ⊆ CE¹(Γ) ⊆ G(Γ).

For any Γ that does admit a probabilistic model in T, i.e., T(Γ) ≠ ∅, we can have one of two possibilities: either it satisfies structural Specker’s principle, in which case T(Γ) ⊆ CE¹(Γ) = G(Γ), or it doesn’t, in which case we consider the Γ′ constructed from it following the recipe we have already outlined so that we have:

$$T(\Gamma')|_{V(\Gamma)} \subseteq G(\Gamma')|_{V(\Gamma)} = CE^1(\Gamma')|_{V(\Gamma)} = CE^1(\Gamma).$$

Since T satisfies structural Specker’s principle, we have $T(\Gamma) = T(\Gamma')|_{V(\Gamma)}$, which immediately implies that T(Γ) ⊆ CE¹(Γ). That is, the theory T satisfies statistical Specker’s principle on Γ: T(Γ) ⊆ CE¹(Γ) ⊆ G(Γ).

Overall, we have the desired result: T satisfies structural Specker’s principle ⇒ T(Γ) ⊆ CE¹(Γ) ⊆ G(Γ) for all Γ, i.e., T satisfies statistical Specker’s principle.

Thus, one way of enforcing that a particular operational theory T satisfies statistical Specker’s principle (that is, T(Γ) ⊆ CE¹(Γ) ⊆ G(Γ) for all Γ) is to require that it satisfies structural Specker’s principle, a constraint on the structure of measurement events in T. This is, for example, what is achieved in Ref. [30] by invoking a notion of “sharpness” for measurement events in an operational theory such that any set of sharp measurement events that are pairwise jointly measurable are all jointly measurable. That is, structural Specker’s principle is satisfied in a theory with such sharp measurement events and, consequently, statistical Specker’s principle, or what is more conventionally called consistent exclusivity [23], is also satisfied. But it’s conceivable that there may be other ways to ensure that only a subset of CE¹(Γ) probabilistic models are allowed in T(Γ) for any Γ. What we wish to emphasize here is that it is by no means obvious (or at least, it needs to be proven) that the only way to restrict the set of probabilistic models T(Γ) to a subset of CE¹(Γ) for any Γ is to require that the theory T satisfy structural Specker’s principle.²⁴

²⁴ Indeed, any putative theory yielding the set of almost quantum correlations (which satisfy statistical Specker’s principle) [50] cannot satisfy Specker’s principle (that pairwise jointly implementable measurement settings are all jointly implementable) for any notion of sharp measurements [49]. Whether structural Specker’s principle, which is defined at the level of measurement events, can be upheld for an almost quantum theory, so that it falls in the category of operational theories with sharp measurements envisaged in Ref. [30], remains an open question.

Corollary 1. For any operational theory T, the following implications hold:

T satisfies Specker’s principle
⇒ T satisfies structural Specker’s principle (42)
⇒ T satisfies statistical Specker’s principle, i.e., consistent exclusivity. (43)

Proof. This follows from combining Theorems 1 and 2.

Note that statistical Specker’s principle (or consistent exclusivity) is so intrinsic to the CSW approach [22] that they do not consider probabilistic models that do not satisfy this principle.²⁵ This will become important when we consider the fact that nonprojective measurements in quantum theory do not satisfy Specker’s principle, structural or statistical (at the level of measurement events), and thus also fail to satisfy the stronger statement of Specker’s principle for measurement settings (cf. Ref. [49]). Indeed, such measurements admit contextuality scenarios Γ that are not possible with projective measurements, such as the one from three binary-outcome POVMs that are pairwise jointly measurable but not triplewise so [39–41], and the probabilistic models they give rise to can only be accommodated in the most general set of probabilistic models, G(Γ), since trivial POVMs can realize any probabilistic model. Specker’s principle, structural Specker’s principle, and statistical Specker’s principle were all motivated by the fact that projective measurements in quantum theory satisfy them. In particular, consistent exclusivity (or statistical Specker’s principle) would be obeyed in any theory where measurement events satisfy structural Specker’s principle, and indeed, the more recent approach [29] is to restrict attention to “sharp” measurements in such theories [30, 31], where the definition of “sharp” ensures the property of pairwise jointly measurable events being globally jointly measurable. This property forms the motivational basis

²⁵ As we have already noted, a noise-robust noncontextuality inequality of the type in Ref. [12] that is based on a logical proof of the KS theorem is not even obtainable if one restricted attention to probabilistic models satisfying CE¹. The upper bound on that inequality comes from a probabilistic model that does not satisfy CE¹.

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0.

(and is sufficient) for statistical Specker’s principle to hold (cf. Theorem 2). That is, this approach [29, 30] regards statistical Specker’s principle as grounded in (and physically justified by) structural Specker’s principle. Theorem 2 is a precise statement of this intuition in the hypergraph formalism à la AFLS [23]. The work of Refs. [30, 31] can be understood as bridging the gap between structural Specker’s principle and statistical Specker’s principle by formally defining a notion of sharp measurements in an operational theory such that structural Specker’s principle holds for these sharp measurements.

On the other hand, and this is the key point for our purposes, if one wants to make no commitment about the representation of measurements in the operational theory (in particular, not requiring a notion of “sharpness”), then Specker’s principle is not a natural constraint to impose on probabilistic models and, indeed, one must deal with the full set of probabilistic models $\mathcal{G}(\Gamma)$ on any contextuality scenario $\Gamma$ rather than restrict oneself to the set of probabilistic models $\mathcal{CE}^1(\Gamma)$. It is for this reason that we are translating the notions from CSW [22] to the notational conventions of AFLS [23], the latter being a more natural choice for our purposes, allowing the language needed to articulate the difference between $\mathcal{CE}^1(\Gamma)$ and $\mathcal{G}(\Gamma)$ rather than excluding the latter by fiat or, perhaps, by an appeal to structural Specker’s principle holding for sharp measurements in the landscape of operational theories under consideration (cf. Theorem 2). It is for all these reasons that the “exclusivity principle” à la CSW [22] is not enough to make sense of Spekkens contextuality applied to Kochen-Specker type scenarios. The framework we propose in this paper addresses this gap between the notions that Spekkens contextuality (which applies to arbitrary measurements) requires in a hypergraph framework and those that the CSW framework [22] (which applies to “sharp” measurements) can provide in its graph-theoretic formulation.

3.1.4 Remark on the classification of probabilistic models: why we haven’t defined “quantum models” as those obtained from projective measurements

The reader may note that we haven’t tried to define any notion of a “quantum model” so far, having only adopted the definitions of Ref. [23] for KS-noncontextual models ($\mathcal{C}(\Gamma)$), for models satisfying consistent exclusivity ($\mathcal{CE}^1(\Gamma)$), and for general probabilistic models ($\mathcal{G}(\Gamma)$). The reason for this is that we do not wish to restrict ourselves to projective measurements in defining a “quantum model”, unlike the traditional Kochen-Specker approaches [22, 23]. In Ref. [23], a quantum model is defined as a probabilistic model that can be realized in the following manner: assign projectors $\{\Pi_v\}_{v \in V(\Gamma)}$ (defined on any Hilbert space) to all the vertices of $\Gamma$ such that $\sum_{v \in e} \Pi_v = I$ for all $e \in E(\Gamma)$, and we have $p(v) = {\rm Tr}(\rho \Pi_v)$, for some density operator $\rho$ on the Hilbert space, $I$ being the identity operator.

On the other hand, allowing arbitrary positive operator-valued measures (POVMs) in a definition of a quantum model (as we would rather prefer) means that, in fact, quantum models on a hypergraph $\Gamma$ are as general as the general probabilistic models $\mathcal{G}(\Gamma)$, rendering such a definition redundant. This can be seen by noting that for any probabilistic model $p \in \mathcal{G}(\Gamma)$, one can associate positive operators to the vertices of $\Gamma$ given by $p(v) I$ such that for any quantum state $\rho$ on some Hilbert space, we have $p(v) = {\rm Tr}(\rho\, p(v) I)$, where $I$ is the identity operator.

Our focus in this paper is not on quantum theory, in particular, even though the need to be able to handle noisy measurements and preparations (particularly, trivial POVMs) in quantum theory can be taken as a motivation for this work. Rather, our focus is on delineating the boundary between operational theories that admit noncontextual ontological models (for Kochen-Specker type experiments, suitably augmented with multiple preparation procedures, as outlined in this paper) and those that don’t by obtaining noise-robust noncontextuality inequalities. In particular, we want these inequalities to indicate the noise thresholds beyond which an experiment cannot rule out the existence of a noncontextual ontological model with respect to the quantities of interest. This also means that making sense of quantum correlations in this approach requires one to pay attention not only to the measurements involved in an experiment but also the preparations; indeed, this shift of focus from measurements alone, to include multiple preparations (or source settings), is a fundamental conceptual difference between our approach and that of traditional Kochen-Specker contextuality frameworks [22, 23, 25].

3.1.5 Scope of this framework

Note that whenever we refer to the “CSW framework”, we mean the framework of Ref. [22], which often differs from the framework of Ref. [21] in some respects, e.g., the normalization of probabilities in a given hyperedge, assumed in [22], but not in [21]. In Ref. [21], the authors write:

    Notice that in all of the above we never require that any particular context should be associated to a complete measurement: the conditions only make sure that each context is a subset of outcomes of a measurement and that they are mutually exclusive. Thus, unlike the original KS theorem, it is clear that every context hypergraph Γ has always a classical noncontextual model, besides possibly quantum and generalized models.

On the other hand, in Ref. [22], they write:

    The fact that the sum of probabilities of outcomes of a test is 1 can be used to

    express these correlations as a positive linear combination of probabilities of events, $S = \sum_i w_i P(e_i)$, with $w_i > 0$.

The latter presentation [22] is more in line with the “original KS theorem” [19], as well as the presentation in Ref. [23]. Since normalization of probabilities is thus presumed in Ref. [22], in keeping with the definition of a probabilistic model we have presented (following [23]), the graph invariants of CSW [22] refer, specifically, to subgraphs $G$ of those hypergraphs $\Gamma$ on which the set of KS-noncontextual probabilistic models is non-empty. In particular, our generalization of the CSW framework [22] in this paper says nothing about noise-robust noncontextuality inequalities from logical proofs of the Kochen-Specker theorem [19], which rely on hypergraphs $\Gamma$ that admit no KS-noncontextual probabilistic models, i.e., KS-uncolourable hypergraphs. It also says nothing for the hypergraphs $\Gamma$ that do not satisfy the property $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. An example of such a hypergraph, which is not covered by our generalization of the CSW framework on both counts, is the 18 ray hypergraph first presented in Ref. [51], denoted $\Gamma_{18}$ (see Fig. 4 and Appendix D). Indeed, the study of noise-robust noncontextuality inequalities from such KS-uncolourable hypergraphs was initiated in Ref. [12], and a more exhaustive hypergraph-theoretic treatment of it is presented in Ref. [34]. In this paper, we will restrict ourselves to KS-colourable hypergraphs, the study of which was initiated in Ref. [16], and, of these, only those KS-colourable hypergraphs $\Gamma$ which satisfy $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. Note that this is not a limitation of our general approach, which is based on Ref. [16] and applies to any KS-colourable hypergraph, but rather a limitation we inherit from the CSW framework [22]$^{26}$ since we want to leverage their graph invariants in obtaining our noise-robust noncontextuality inequalities. The study of other KS-colourable hypergraphs, in particular those which arise only with nonprojective measurements in quantum theory [39–41] and are outside the scope of traditional frameworks [22, 23, 25], will be taken up in future work.

Footnote 26: Ref. [22] takes Specker’s principle to be fundamental and identifies $\mathcal{CE}^1(\Gamma_{18})$ as the most general set of probabilistic models, which is not the case for $\Gamma_{18}$ (for example). See Appendix D for a detailed discussion of this point.

Figure 4: The KS-uncolourable hypergraph from Ref. [51] that is not covered by our generalization of the CSW framework. We denote this hypergraph as $\Gamma_{18}$.

To summarize, the measurement events hypergraphs $\Gamma$ where the present framework (and the CSW framework [22]) applies must satisfy two properties: $\mathcal{C}(\Gamma) \neq \emptyset$ (that is, KS-colourability) and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$.$^{27}$

Footnote 27: As we have shown, when the operational theory $\mathbb{T}$ under consideration satisfies structural Specker’s principle, we can always turn a hypergraph $\Gamma$ that doesn’t satisfy structural Specker’s principle into a hypergraph $\Gamma'$ that satisfies it and for which, therefore, $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$ holds. This can be seen as justification for restricting oneself to probabilistic models satisfying consistent exclusivity in the CSW framework [22]: such a restriction is not really a restriction if the theory satisfies structural Specker’s principle. On the other hand, we restrict ourselves to hypergraphs for which $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ without assuming that $\mathbb{T}$ satisfies structural Specker’s principle. The justification for this seemingly ad hoc restriction is simply that it is necessary in order to meaningfully leverage the graph invariants of CSW [22] (in particular, the fractional packing number) in our noise-robust noncontextuality inequalities. This will become clear when we obtain our noise-robust noncontextuality inequalities.

In the next subsection, we define additional notions necessary to obtain noise-robust noncontextuality inequalities that make use of graph invariants from the CSW framework. These notions correspond to source events that are an integral part of our framework.

3.2 Sources

Having introduced the (hyper)graph-theoretic elements that we need to talk about measurement events, we are now in a position to introduce features of source events that are relevant in the Spekkens framework. This part of our framework has no precedent in the literature on KS-noncontextuality, in particular the CSW framework [22]. We introduce these source events in order to benchmark the measurement events against them, i.e., for every measurement event, we seek to identify in the operational theory a corresponding source event that makes this measurement event as likely as possible. This helps us deal with cases where a measurement device may be implementing very noisy measurements by explicitly accounting for this noise in our noise-robust noncontextuality inequalities. Further, while we do not assume outcome determinism (which is essential to KS-noncontextuality), we will invoke preparation noncontextuality with respect to these source events in the Spekkens framework [18]. As an example of what we mean by “benchmarking” a measurement event against a source event, consider the case of quantum

theory, where any measurement event represented by a projector occurs with probability 1 for any source event that is represented by an eigenstate of this projector; on the other hand, a positive operator that isn’t projective cannot occur with a probability greater than its largest eigenvalue ($< 1$) for any source event. We now proceed to describe the necessary hypergraph-theoretic ingredients we need to accommodate source events in our framework.

As we have argued previously, we require the measurement events hypergraph $\Gamma$ to be such that $\mathcal{C}(\Gamma) \neq \emptyset$ and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ to be able to obtain noise-robust noncontextuality inequalities that use graph invariants from the CSW framework [22]. Hence, we will restrict ourselves to experiments that realize the operational equivalences represented by this class of $\Gamma$. Now, in the CSW framework [22], every Bell-KS expression picks out a particular subgraph $G$ of the orthogonality graph $O(\Gamma)$ of the contextuality scenario $\Gamma$ of interest. This amounts to focussing on a restricted set of probabilities (for the vertices of $G$) rather than probabilities for all the measurement events (represented by vertices of $\Gamma$) in the experiment. Hence, the vertices of $G$ denote the measurement events of interest in a given Bell-KS expression and we have the following:

• A general probabilistic model $p \in \mathcal{G}(\Gamma)$ will assign probabilities to vertices in $G$ such that: $p(v) \geq 0$ for all $v \in V(G)$ and $p(v) + p(v') \leq 1$ for every edge $\{v, v'\} \in E(G)$.

• A probabilistic model $p \in \mathcal{CE}^1(\Gamma)$ will assign probabilities to vertices in $G$ such that: $p(v) \geq 0$ for all $v \in V(G)$ and
$$\sum_{v \in c} p(v) \leq 1, \tag{44}$$
for every clique $c \subseteq V(G)$.

• A probabilistic model $p \in \mathcal{C}(\Gamma)$ will assign probabilities to vertices in $G$ such that: $p(v) = \sum_k \Pr(k)\, p_k(v)$, where $\Pr(k) \geq 0$, $\sum_k \Pr(k) = 1$, and for each $k$, $p_k$ is a deterministic assignment $p_k(v) \in \{0, 1\}$ for all $v \in V(G)$, and $p_k(v) + p_k(v') \leq 1$ for every edge $\{v, v'\} \in E(G)$.

Since $\Gamma$ is such that $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$, the condition $\sum_{v \in c} p(v) \leq 1$ for every clique $c \subseteq V(G)$ on the probabilities assigned to vertices in $G$ is redundant. We now obtain a simplified hypergraph, $\Gamma_G$, from $G$ as follows: convert all maximal cliques in $G$ to hyperedges and add an extra (no-detection) vertex to each such hyperedge.$^{28}$

Footnote 28: Physically, a “no-detection” vertex denotes the case when none of the measurement events of interest (here, the events in $G$) for a given measurement setting occur.

Figure 5: The hypergraph $\Gamma_G$ obtained from $G$ by adding a no-detection vertex (represented by a hollow circle) to every maximal clique in $G$.

This $\Gamma_G$, for any $G$, will satisfy the property that $\mathcal{CE}^1(\Gamma_G) = \mathcal{G}(\Gamma_G)$ and any probabilistic model on $\Gamma$ assigning probabilities to measurement events in $G$ will correspond to a probabilistic model on $\Gamma_G$ which also assigns the same probabilities to measurement events in $G$. Formally: $V(\Gamma_G) \equiv V(G) \sqcup \{v_c \,|\, c \text{ is a maximal clique in } G\}$, and $E(\Gamma_G) \equiv \{c \sqcup \{v_c\} \,|\, c \text{ is a maximal clique in } G\}$, where $v_c$ is the extra no-detection vertex added to the hyperedge corresponding to maximal clique $c$ in $G$.

We have the following probabilistic model on $\Gamma_G$, given a probabilistic model $p \in \mathcal{G}(\Gamma)$: the probabilities assigned to the vertices in $V(G) \subseteq V(\Gamma_G)$ are the same as specified by $p \in \mathcal{G}(\Gamma)$ and the probabilities assigned to the remaining vertices in $V(\Gamma_G) \setminus V(G)$ are given by $p(v_c) = 1 - \sum_{v \in c} p(v)$, for every maximal clique $c$ in $G$. Consider, for example, the KCBS scenario [16, 22, 47]: the 20-vertex $\Gamma$ representing measurement events from five 4-outcome joint measurements (Fig. 2), its 5 vertices $G$ involved in the KCBS inequality (Fig. 3), and 10-vertex hypergraph $\Gamma_G$ constructed from $G$ (Fig. 5).

Given $\Gamma_G$, constructed from $G$, we now require that the operational theory that realizes measurement events in $\Gamma_G$ also admits preparations that can be represented by a hypergraph $\Sigma_G$ of source events as follows: for every hyperedge $e \in E(\Gamma_G)$, corresponding to the choice of measurement setting $M_e$, we define a hyperedge $e \in E(\Sigma_G)$ denoting a corresponding choice of source setting $S_e$.$^{29}$ And for every vertex $v \in e$ ($\in E(\Gamma_G)$), we define a vertex $v_e \in e$ ($\in E(\Sigma_G)$). Hence, every measurement event $[v|e]$ in $\Gamma_G$ corresponds to a vertex $v_e$ of $\Sigma_G$, and the number of such vertices in $V(\Sigma_G)$ is $\sum_{e \in E(\Gamma_G)} |e|$, one for each pair of a hyperedge $e$ and a vertex $v \in e$. This means that the operational equivalences between the measurement events that are implicit in $\Gamma_G$ (such as $[v|e]$ being operationally equivalent to $[v|e']$, where $e, e' \in E(\Gamma_G)$ are distinct hyperedges that share the vertex $v \in V(\Gamma_G)$, which represents an equivalence class of measurement events) are not carried over to the source events, where none is presumed to be operationally equivalent to any other, hence $v_e \in V(\Sigma_G)$ is a different vertex from $v_{e'} \in V(\Sigma_G)$. Here $v_e$ ($v_{e'}$) represents a source event $[s_e|S_e]$ ($[s_{e'}|S_{e'}]$), rather than an equivalence class of source events.

Footnote 29: Recall from the discussion at the beginning of Section 3.2 that we seek to benchmark the measurement events against those source events in the operational theory that (ideally) make them as predictable as possible. The source setting against which the predictability of a particular measurement setting is tested (that is, the predictability of each measurement event, e.g., $v \in e$ ($\in E(\Gamma_G)$), for this measurement setting, e.g., $M_e$, is benchmarked against some source event, e.g., $v_e \in e$ ($\in E(\Sigma_G)$), for the source setting, e.g., $S_e$) is the “corresponding choice of source setting $S_e$”. In Section 5.2 we will see how these pairs of source and measurement settings are used to compute an operational quantity relevant for our noise-robust noncontextuality inequalities.

Besides these $\sum_{e \in E(\Gamma_G)} |e|$ vertices in $V(\Sigma_G)$ and the associated hyperedges $e \in E(\Sigma_G)$, we require that the operational theory admits an additional hyperedge $e_* \in E(\Sigma_G)$, representing a source setting $S_{e_*}$, containing two new vertices $v^0_{e_*}, v^1_{e_*} \in V(\Sigma_G)$. Here $v^0_{e_*}$ represents the source event $[s_{e_*} = 0|S_{e_*}]$ and $v^1_{e_*}$ represents the source event $[s_{e_*} = 1|S_{e_*}]$. Hence, we have $|V(\Sigma_G)| = \sum_{e \in E(\Gamma_G)} |e| + 2$ and $|E(\Sigma_G)| = |E(\Gamma_G)| + 1$.

The operational equivalence we do require for $\Sigma_G$ (in any operational theory that admits source events represented by $\Sigma_G$) applies to the source settings: all source settings, each represented by coarse-graining the source events in a hyperedge $e \in E(\Sigma_G)$, are operationally equivalent, i.e., $[\top|S_{e_\top}] \simeq [\top|S_{e'_\top}]$ for all $e, e' \in E(\Sigma_G)$, i.e.,
$$\forall [m|M]: \quad \sum_{s_e} p(m, s_e|M, S_e) = \sum_{s_{e'}} p(m, s_{e'}|M, S_{e'}), \quad \text{for all } e, e' \in E(\Sigma_G).$$

An example of such a source events hypergraph was considered in Ref. [12], albeit without the additional source labelled by $e_*$ here [16]. We illustrate it here in Fig. 6 for the KCBS scenario.

Figure 6: The source events hypergraph with the operational equivalences between the source settings separately specified.

4 A key hypergraph invariant: the weighted max-predictability

We now define a hypergraph invariant that will be relevant for our noise-robust noncontextuality inequalities:
$$\beta(\Gamma_G, q) \equiv \max_{p \in \mathcal{G}(\Gamma_G)|_{\rm ind}} \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, p), \tag{45}$$
where $q_e \geq 0$ for all $e \in E(\Gamma_G)$, $\sum_{e \in E(\Gamma_G)} q_e = 1$, and
$$\zeta(M_e, p) \equiv \max_{v \in e} p(v)$$
is the maximum probability assigned to a vertex in $e \in E(\Gamma_G)$ by an extremal indeterministic probabilistic model $p \in \mathcal{G}(\Gamma_G)|_{\rm ind}$.$^{30}$ We call $\beta(\Gamma_G, q)$ the weighted max-predictability of the measurement settings (i.e., hyperedges) in $\Gamma_G$, where the hyperedges $e \in E(\Gamma_G)$ are weighted according to the probability distribution $q \equiv \{q_e\}_{e \in E(\Gamma_G)}$.

Footnote 30: An extremal indeterministic probabilistic model refers to those extremal $p \in \mathcal{G}(\Gamma_G)$ for which $\zeta(M_e, p) < 1$ for some $e \in E(\Gamma_G)$.

Footnote 31: This will always be the case for any operational theory we consider: the assumption of measurement noncontextuality on its own can always be satisfied by a trivial ontological model of the type we outlined in Section 2.5. Indeed, quantum theory satisfies it, the Beltrametti-Bugajski model [52] that was discussed in Ref. [18] being an example of a measurement noncontextual ontological model of quantum theory. It is only when this assumption is supplemented with something else (outcome determinism in the case of KS-noncontextuality and preparation noncontextuality in the case of generalized noncontextuality [18]) that it can produce a contradiction with the predictions of an operational theory.

We now outline how this quantity is related to properties of an operational theory $\mathbb{T}$ admitting a measurement noncontextual ontological model. $\Gamma_G$ represents a particular configuration of operational equivalences that a set of measurement events in $\mathbb{T}$ may realize. The probabilistic models on $\Gamma_G$ that can be realized by $\mathbb{T}$ are, as earlier, denoted by $\mathbb{T}(\Gamma_G)$. Since $\mathbb{T}$ admits a measurement noncontextual ontological model,$^{31}$ its predictions for the specific case of $\Gamma_G$ can be reproduced by such a model. But since, in keeping with the CSW approach [22], we will look at witnesses of contextuality tailored to particular experiments ($\Gamma_G$ representing features of one such experiment), we do not need an ontological model for the full theory $\mathbb{T}$ to reproduce its predictions for a particular experiment. Indeed, to construct a measurement noncontextual ontological model for the set of probabilistic models $\mathbb{T}(\Gamma_G)$, it suffices to assume (without loss of generality) that the extremal probabilistic models on $\Gamma_G$, given by $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$, are in bijective correspondence with the ontic states ($\Lambda$) of the physical system on which the measurements are carried out. This is because, firstly, any probabilistic model in $\mathcal{G}(\Gamma_G)$ can be expressed as a convex mixture of extremal probabilistic models in $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$, and, secondly, associating each ontic state in the ontological model with an extremal probabilistic model$^{32}$ in $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$ means that any probabilistic model in $\mathcal{G}(\Gamma_G)$ corresponding to predictions of an operational theory (in particular, any $p \in \mathbb{T}(\Gamma_G) \subseteq \mathcal{G}(\Gamma_G)$) can be obtained by an appropriate probability distribution over this set of ontic states. Denoting the set of ontic states corresponding to $\mathcal{G}(\Gamma_G)|_{\rm det}$ by $\Lambda_{\rm det}$ and the set of ontic states corresponding to $\mathcal{G}(\Gamma_G)|_{\rm ind}$ by $\Lambda_{\rm ind}$, we have that the measurement noncontextual ontological model given by $\Lambda \equiv \Lambda_{\rm det} \sqcup \Lambda_{\rm ind}$ reproduces the predictions $\mathbb{T}(\Gamma_G)$ of any operational theory $\mathbb{T}$ that admits a measurement noncontextual ontological model: that is, for every $p \in \mathcal{G}(\Gamma_G)$ (and therefore also $p \in \mathbb{T}(\Gamma_G)$),
$$p(v) = \sum_{\lambda \in \Lambda} \xi(v|\lambda)\, \mu(\lambda)$$
for all $v \in V(\Gamma_G)$, for some probability distribution $\mu: \Lambda \to [0, 1]$ such that $\sum_{\lambda \in \Lambda} \mu(\lambda) = 1$.$^{33}$ We can also then rewrite $\beta(\Gamma_G, q)$ as
$$\beta(\Gamma_G, q) = \max_{\lambda \in \Lambda_{\rm ind}} \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda), \tag{46}$$
where $\zeta(M_e, \lambda) \equiv \max_{m_e} \xi(m_e|M_e, \lambda)$.

Footnote 32: Representing response functions for the ontic state, i.e., $p(v) = \xi(v|\lambda)$, $\forall v \in V(\Gamma_G)$.

Footnote 33: As a corollary, note that as long as the polytope $\mathcal{G}(\Gamma_G)$ has a finite number of extreme points, we can take the ontic state space to consist of a finite number of ontic states (as we have done) without any loss of generality. The hypergraphs $\Gamma_G$ we study, representing the measurement events of interest in a contextuality experiment, have this property because of their finiteness.

5 Noise-robust noncontextuality inequalities

We will now proceed to obtain our noise-robust noncontextuality inequalities following the ideas outlined in Ref. [16].

5.1 Key notions from CSW

We first recall some key notions from the CSW framework [22] before obtaining our inequalities.

Consider the positive linear combination of the probabilities of measurement events,
$$R([s|S]) \equiv \sum_{v \in V(G)} w_v\, p(v|S, s), \tag{47}$$
where $w_v > 0$ for all $v \in V(G)$.

The fundamental result of CSW is that this quantity is bounded for different sets of correlations (KS-noncontextual, those realizable by projective quantum measurements, and those satisfying consistent exclusivity) by graph-theoretic invariants as follows:
$$\forall [s|S]: \quad R([s|S]) \overset{\rm KS}{\leq} \alpha(G, w) \overset{\rm Q}{\leq} \theta(G, w) \overset{\mathcal{CE}^1}{\leq} \alpha^*(G, w), \tag{48}$$
where KS denotes operational theories that admit KS-noncontextual ontological models and thus realize probabilistic models on $\Gamma_G$ that fall in the set $\mathcal{C}(\Gamma_G)$, Q denotes quantum theory with projective measurements which assigns probabilistic models on $\Gamma_G$ denoted by $\mathcal{Q}(\Gamma_G)$, and $\mathcal{CE}^1$ denotes operational theories satisfying consistent exclusivity and thus realizing the set of probabilistic models $\mathcal{CE}^1(\Gamma_G)$ on $\Gamma_G$. The graph invariants of the weighted graph $(G, w)$, namely, $\alpha(G, w)$, $\theta(G, w)$, and $\alpha^*(G, w)$ are defined as follows:

1. Independence number $\alpha(G, w)$:
$$\alpha(G, w) \equiv \max_{I} \sum_{v \in I} w_v, \tag{49}$$
where $I \subseteq V(G)$ is an independent set of vertices of $G$, i.e., a set of nonadjacent vertices of $G$, so that none of the vertices in this set shares an edge with any other vertex in the set.

2. Lovász theta number $\theta(G, w)$:
$$\theta(G, w) \equiv \max_{\{|u_v\rangle\}_{v \in V(G)},\, |\psi\rangle} \sum_{v \in V(G)} w_v\, |\langle \psi | u_v \rangle|^2, \tag{50}$$
where $\{|u_v\rangle\}_{v \in V(G)} = \{|u_v\rangle\}_{v \in V(\bar{G})}$ (each $|u_v\rangle$ a unit vector in $\mathbb{R}^d$) is an orthonormal representation (OR) of the complement of $G$, namely, $\bar{G}$, and the unit vector $|\psi\rangle \in \mathbb{R}^d$ is called a handle. Here $V(\bar{G}) \equiv V(G)$ and $E(\bar{G}) \equiv \{(v, v') \,|\, v, v' \in V(G), (v, v') \notin E(G)\}$, and we have in an orthonormal representation that $\langle u_{v''} | u_{v'''} \rangle = 0$ for all pairs of nonadjacent vertices, $(v'', v''')$, in $\bar{G}$, or equivalently, for all $(v'', v''') \in E(G)$.

3. Fractional packing number $\alpha^*(G, w)$:
$$\alpha^*(G, w) \equiv \max_{\{p_v\}_{v \in V(G)}} \sum_{v \in V(G)} w_v\, p_v, \tag{51}$$
where $\{p_v\}_{v \in V(G)}$ is such that $p_v \geq 0$ for all $v \in V(G)$ and $\sum_{v \in c} p_v \leq 1$ for all cliques $c$ in $G$.

Note that since we are always considering $\Gamma_G$ such that $\mathcal{CE}^1(\Gamma_G) = \mathcal{G}(\Gamma_G)$, we, in fact, have the bounds
$$\forall [s|S]: \quad R([s|S]) \overset{\rm KS}{\leq} \alpha(G, w) \overset{\rm Q}{\leq} \theta(G, w) \overset{\rm GPT}{\leq} \alpha^*(G, w), \tag{52}$$

where “GPT” denotes the full set of probabilistic models on $\Gamma_G$, i.e., $\mathcal{G}(\Gamma_G)$.

In terms of the notation we have already introduced, where $R([s|S]) \leq R_{\rm KS}$ was a Bell-KS inequality, we now have (from CSW [22]) that $R_{\rm KS} = \alpha(G, w)$.

5.2 Key notion not from CSW: source-measurement correlation, Corr

We need to define a new quantity not in the CSW framework, namely,
$${\rm Corr} \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, p(m_e, s_e|M_e, S_e), \tag{53}$$
where $\{q_e\}_{e \in E(\Gamma_G)}$ is a probability distribution, i.e., $q_e \geq 0$ for all $e \in E(\Gamma_G)$ and $\sum_{e \in E(\Gamma_G)} q_e = 1$, such that $\beta(\Gamma_G, q) < 1$ holds.$^{34}$ In previous work [12, 16], we have taken $q$ to be the uniform distribution $q_e = \frac{1}{|E(\Gamma_G)|}$, but the derivation of the noncontextuality inequalities is independent of that choice (as we’ll see here). Also, note that we have chosen the following labelling convention for outcomes of source setting $S_e$ (namely, $s_e$) and measurement setting $M_e$ (namely, $m_e$): the source outcomes $s_e$ for source setting $S_e$ take values in the same set as measurement outcomes $m_e$ for measurement setting $M_e$, i.e., $V_{S_e} = V_{M_e}$ (recalling notation from Section 2). In particular, outcomes corresponding to the measurement event $[v|e]$ (representing $[m_e|M_e]$) and its corresponding source event $v_e$ (representing $[s_e|S_e]$) are both denoted by the same label, so that $m_e = s_e$ for them. An example of this from Figs. 5 and 6 would be to, say, denote the outcomes of a particular $e \in E(\Gamma_G)$ (measurement setting $M_e$) by $m_e \in V_{M_e} \equiv \{0, 1, 2\}$ and corresponding outcomes of $e \in E(\Sigma_G)$ (source setting $S_e$) by $s_e \in V_{S_e} \equiv \{0, 1, 2\}$; so if $[v|e]$ denotes $[m_e = 0|M_e]$, then $v_e$ will denote $[s_e = 0|S_e]$, etc.

Footnote 34: Indeed, for the strongest possible constraint on Corr, one must pick $q$ such that $\beta(\Gamma_G, q)$ is minimized.

5.3 Obtaining the noise-robust noncontextuality inequalities

5.3.1 Expressing operational quantities in ontological terms

We begin with expressing the operational quantities of interest in terms of a noncontextual ontological model. In an ontological model, $R([s|S])$ is given by
$$R([s|S]) = \sum_{\lambda \in \Lambda} \sum_{v \in V(G)} w_v\, p(v|\lambda)\, \mu(\lambda|S, s). \tag{54}$$
Defining $R(\lambda) \equiv \sum_{v \in V(G)} w_v\, p(v|\lambda)$, we have that
$$R([s|S]) = \sum_{\lambda \in \Lambda} R(\lambda)\, \mu(\lambda|S, s). \tag{55}$$
Similarly, Corr is given by
$$\begin{aligned} {\rm Corr} &= \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(\lambda, s_e|S_e) \\ &= \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda)\, \mu(\lambda|S_e). \end{aligned} \tag{56}$$
Here, we have used the fact that
$$\mu(\lambda, s_e|S_e) = \mu(s_e|S_e, \lambda)\, \mu(\lambda|S_e)$$
to express Corr in a way that treats sources and measurements similarly.

Using preparation noncontextuality (cf. Eq. (22)), we have that
$$\forall e, e' \in E(\Sigma_G): \quad [\top|S_{e_\top}] \simeq [\top|S_{e'_\top}] \;\Rightarrow\; \mu(\lambda|S_e) = \mu(\lambda|S_{e'}) \equiv \nu(\lambda), \quad \forall \lambda \in \Lambda. \tag{57}$$
Then we can rewrite Corr as
$${\rm Corr} = \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda)\, \nu(\lambda). \tag{58}$$
Note that the only $\lambda$ that contribute to Corr are those for which $\nu(\lambda) > 0$. Also, $\mu(s_e|S_e, \lambda)$ and $\mu(\lambda|S_e, s_e)$ satisfy the condition $\mu(s_e|S_e, \lambda)\, \nu(\lambda) = \mu(\lambda|S_e, s_e)\, p(s_e|S_e)$, so that $\mu(s_e|S_e, \lambda)$ is well-defined whenever $\nu(\lambda) > 0$.

Defining
$${\rm Corr}(\lambda) \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda), \tag{59}$$
we have that
$${\rm Corr} = \sum_{\lambda \in \Lambda} {\rm Corr}(\lambda)\, \nu(\lambda). \tag{60}$$
Recalling that $\zeta(M_e, \lambda) = \max_{m_e} \xi(m_e|M_e, \lambda)$, note that ${\rm Corr}(\lambda)$ is upper bounded as follows (for any $\lambda \in \Lambda$):
$$\begin{aligned} {\rm Corr}(\lambda) &\equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda) \\ &\leq \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda) \sum_{s_e} \mu(s_e|S_e, \lambda) \\ &= \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda). \end{aligned} \tag{61}$$
If $\lambda \in \Lambda_{\rm det}$, then this upper bound is trivial, i.e., ${\rm Corr}(\lambda) \leq 1$, since every measurement has deterministic response functions. On the other hand, for all $\lambda \in \Lambda_{\rm ind}$, we have (from Eq. (46))
$${\rm Corr}(\lambda) \leq \beta(\Gamma_G, q). \tag{62}$$
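The bound of Eq. (61) can be checked numerically. The following is a minimal sketch, not taken from the paper: for a single ontic state $\lambda$ it draws random response functions $\xi(m_e|M_e, \lambda)$ and random conditional source distributions $\mu(s_e|S_e, \lambda)$ and confirms that ${\rm Corr}(\lambda)$ never exceeds $\sum_e q_e\, \zeta(M_e, \lambda)$. The function names, the list-of-lists data layout, and the choice of 5 settings with 3 outcomes each are illustrative assumptions; for simplicity the sketch also ignores the operational equivalences between hyperedges, which the bound does not rely on.

```python
import random


def corr_lambda(q, xi, mu):
    """Corr(lambda) of Eq. (59): the Kronecker delta keeps only terms with m_e = s_e.

    q[e]     : weight of setting e
    xi[e][m] : response function xi(m | M_e, lambda)   (hypothetical layout)
    mu[e][s] : source distribution mu(s | S_e, lambda) (hypothetical layout)
    """
    return sum(
        q_e * sum(x * m for x, m in zip(xi_e, mu_e))
        for q_e, xi_e, mu_e in zip(q, xi, mu)
    )


def max_predictability_bound(q, xi):
    """Right-hand side of Eq. (61): sum_e q_e * zeta(M_e, lambda)."""
    return sum(q_e * max(xi_e) for q_e, xi_e in zip(q, xi))


def random_distribution(n, rng):
    """A random probability vector of length n (illustrative helper)."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
n_settings, n_outcomes = 5, 3  # assumed sizes, matching the KCBS example
for _ in range(1000):
    q = random_distribution(n_settings, rng)
    xi = [random_distribution(n_outcomes, rng) for _ in range(n_settings)]
    mu = [random_distribution(n_outcomes, rng) for _ in range(n_settings)]
    # Small tolerance for floating-point rounding.
    assert corr_lambda(q, xi, mu) <= max_predictability_bound(q, xi) + 1e-12
```

The bound is saturated when each source distribution puts all its weight on an outcome that maximizes the corresponding response function, which is the sense in which source events are used to benchmark measurement events.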

Similarly, for $\lambda \in \Lambda_{\rm det}$ we have $R(\lambda) \leq \alpha(G, w)$, while for $\lambda \in \Lambda_{\rm ind}$ we have $R(\lambda) \leq \alpha^*(G, w)$.

Using the fact that
$$\nu(\lambda) = \mu(\lambda|S) = \sum_{s} \mu(\lambda|S, s)\, p(s|S),$$
for any $S \equiv S_e$, $e \in E(\Sigma_G)$, we have
$$\begin{aligned} {\rm Corr} &= \sum_{s} \left( \sum_{\lambda} {\rm Corr}(\lambda)\, \mu(\lambda|S, s) \right) p(s|S) \\ &= \sum_{s} {\rm Corr}_s\, p(s|S), \end{aligned} \tag{63}$$
where we have defined ${\rm Corr}_s \equiv \sum_{\lambda} {\rm Corr}(\lambda)\, \mu(\lambda|S, s)$.

5.3.2 Derivation of the noncontextual tradeoff for any graph G

We are now in a position to express our general noise-robust noncontextuality inequality as a tradeoff between three operational quantities: Corr, $R([s_{e_*} = 0|S_{e_*}])$, and $p(s_{e_*} = 0|S_{e_*})$.

First, note that KS-contextuality is witnessed when for some choice of $[s|S]$, here given by $[s_{e_*} = 0|S_{e_*}]$, we have
$$R([s_{e_*} = 0|S_{e_*}]) > \alpha(G, w).$$
This means that for some set of ontic states in the support of $[s_{e_*} = 0|S_{e_*}]$, i.e.,
$$\lambda \in {\rm Supp}\{\mu(\cdot|S_{e_*}, s_{e_*} = 0)\} \equiv \{\lambda \in \Lambda : \mu(\lambda|S_{e_*}, s_{e_*} = 0) > 0\}, \tag{64}$$
we have $R(\lambda) > \alpha(G, w)$. For such a set of ontic states one must then have ${\rm Corr}(\lambda) < 1$ (because these $\lambda \in \Lambda_{\rm ind}$ and we have Eq. (62)), which in turn implies that ${\rm Corr}_{s_{e_*} = 0} < 1$. On the other hand, for $s_{e_*} = 1$, we have no constraints: ${\rm Corr}_{s_{e_*} = 1} \leq 1$. Thus,
$$\begin{aligned} {\rm Corr} &= {\rm Corr}_{s_{e_*} = 0}\, p(s_{e_*} = 0|S_{e_*}) + {\rm Corr}_{s_{e_*} = 1}\, p(s_{e_*} = 1|S_{e_*}) \\ &\leq p_0\, {\rm Corr}_{s_{e_*} = 0} + 1 - p_0, \end{aligned} \tag{65}$$
where $p_0 \equiv p(s_{e_*} = 0|S_{e_*})$.

Defining $\mu_{\rm det} \equiv \sum_{\lambda \in \Lambda_{\rm det}} \mu(\lambda|S_{e_*}, s_{e_*} = 0)$ and $\mu_{\rm ind} \equiv \sum_{\lambda \in \Lambda_{\rm ind}} \mu(\lambda|S_{e_*}, s_{e_*} = 0)$, we now have
$$\mu_{\rm det} + \mu_{\rm ind} = 1, \tag{66}$$
$${\rm Corr}_{s_{e_*} = 0} \leq \mu_{\rm det} + \beta(\Gamma_G, q)\, \mu_{\rm ind}, \tag{67}$$
$$R \leq \alpha(G, w)\, \mu_{\rm det} + \alpha^*(G, w)\, \mu_{\rm ind}. \tag{68}$$
Note that assuming $\mu_{\rm det} = 1$ would reduce these constraints to a standard Bell-KS inequality, $R \leq \alpha(G, w)$. However, since we are not assuming this, simply eliminating $\mu_{\rm det}$ and $\mu_{\rm ind}$ from these constraints leads us to$^{35}$
$${\rm Corr}_{s_{e_*} = 0} \leq 1 - (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{69}$$
where the upper bound is nontrivial if and only if $\beta(\Gamma_G, q) < 1$ and $R - \alpha(G, w) > 0$.

Footnote 35: To see this explicitly, just use Eq. (66) to make the substitution $\mu_{\rm ind} = 1 - \mu_{\rm det}$ in Eqs. (67) and (68), then eliminate $\mu_{\rm det}$ from Eq. (67) by using the upper bound on it from Eq. (68).

If we are given that $\beta(\Gamma_G, q) < 1$, then we have a trivial upper bound on ${\rm Corr}_{s_{e_*} = 0}$ for the remaining cases: the upper bound is 1 for $R = \alpha(G, w)$ and greater than 1 for $R < \alpha(G, w)$.

Thus, our noise-robust noncontextuality inequality now reads:
$${\rm Corr} \leq 1 - p_0\, (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{70}$$
which can be rewritten as
$$R \leq \alpha(G, w) + \frac{\alpha^*(G, w) - \alpha(G, w)}{p_0}\, \frac{1 - {\rm Corr}}{1 - \beta(\Gamma_G, q)}. \tag{71}$$
Note that Eq. (70) expresses the constraint from noncontextuality as an upper bound on the source-measurement correlations Corr, reminiscent of the noise-robust noncontextuality inequality first derived in Ref. [12] (and later treated in hypergraph-theoretic terms in Ref. [34]), except here the upper bound on Corr depends not only on the hypergraph invariant $\beta(\Gamma_G, q)$ but also two of the graph invariants from the CSW framework [22], namely, $\alpha(G, w)$ and $\alpha^*(G, w)$, besides also the operational quantity $R$, which is the figure-of-merit for KS-contextuality ($R > \alpha(G, w)$ witnesses KS-contextuality) in the CSW framework. Eq. (70) indicates that the source-measurement correlations would fail to be perfect (i.e., ${\rm Corr} < 1$) in an operational theory admitting a noncontextual ontological model if and only if $R > \alpha(G, w)$ and $\beta(\Gamma_G, q) < 1$. Contextuality would be witnessed when the source-measurement correlations are stronger than the constraint from Eq. (70). For $R \leq \alpha(G, w)$, in particular, there is no constraint from noncontextuality on Corr.

On the other hand, rewriting the constraint from noncontextuality as Eq. (71), one is reminded of the CSW framework [22], where $R$ is taken to be the quantity that is upper bounded by KS-noncontextuality. Here, instead, we have that $R$ is upper bounded by a term that includes the source-measurement correlations Corr that can be achieved for the measurements and thus penalizes for measurements that cannot be made highly predictable with respect to some preparations, i.e., ${\rm Corr} < 1$ makes it harder to violate the

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0.

upper bound on $R$. When the upper bound reaches $\alpha^*(G, w)$, it becomes trivial and $R$ is no longer constrained by noncontextuality on account of noise in the measurements. Indeed, trivial POVMs (cf. Appendices A.1.2 and C) never violate such a noncontextuality inequality because of the penalty incurred via $\mathrm{Corr}$, as we later show in Section 6.3.

5.3.3 When is the noncontextual tradeoff violated?

The inequality of Eq. (71) can be rewritten as the following tradeoff between $\mathrm{Corr}$, $p_0$, and $R$:

$$\mathrm{Corr} + p_0\,(1-\beta(\Gamma_G, q))\,\frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)} \leq 1. \tag{72}$$

Writing the constraint from noncontextuality in the form of Eq. (72) (in contrast to Eqs. (70) and (71)) makes it more even-handed in its treatment of the two operational quantities $R$ (which is key in the CSW framework [22]) and $\mathrm{Corr}$ (which is key in noise-robust noncontextuality inequalities inspired by logical proofs of the KS theorem [12, 34]) and emphasizes that noise-robust noncontextuality inequalities inspired by statistical proofs of the KS theorem [16] are tradeoffs between $R$ (which is about the strength of correlations between measurements) and $\mathrm{Corr}$ (which is about the predictability of measurements) that must be satisfied by any operational theory admitting a noncontextual ontological model. Roughly speaking, a high degree of predictability for measurements (e.g., $\mathrm{Corr} = 1$) cannot coexist with very strong correlations between the measurements (e.g., $R = \alpha^*(G, w)$) when the operational theory admits a noncontextual ontological model.

For a nontrivial constraint, and hence the possibility of witnessing contextuality via violation of this inequality (Eq. (72)), the upper bound on $\mathrm{Corr}$ (the right-hand side of Eq. (70)) should be strictly bounded above by 1, and the upper bound on $R$ (the right-hand side of Eq. (71)) should be strictly bounded above by $\alpha^*(G, w)$ (the algebraic upper bound on $R$), that is

$$p_0 > 0 \ \text{ and } \ \beta(\Gamma_G, q) < 1, \qquad R > \alpha(G, w), \quad \mathrm{Corr} > 1 - p_0\,(1 - \beta(\Gamma_G, q)). \tag{73}$$

These are the minimal benchmarks necessary (besides the requirement of tomographic completeness of a finite set of procedures and the possibility of inferring secondary procedures with exact operational equivalences using convexity of the operational theory [13]) to witness contextuality in a Kochen-Specker type experiment adapted to our framework following Spekkens [18].

Suppose one achieves, by some means, a value of $R = \theta(G, w)$, the upper bound on the quantum value with projective measurements. When would this value be an evidence of contextuality? For this to be the case, we must have:

$$\mathrm{Corr} > 1 - p_0\,(1-\beta(\Gamma_G, q))\,\frac{\theta(G, w) - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}. \tag{74}$$

Now, for the ideal quantum realization where measurement events are projectors, and the corresponding source events are eigenstates, it is always the case that $\mathrm{Corr} = 1$, hence contextuality is witnessed. However, it's possible to witness contextuality even if $\mathrm{Corr} < 1$, as long as it exceeds the lower bound specified above. In a sense, for quantum theory, this allows for a quantitative accounting of the effect of nonprojectiveness in the measurements (or mixedness in preparations) on the possibility of witnessing contextuality, a feature that is absent in traditional Kochen-Specker approaches [21–23, 25]. Indeed, as long as one achieves any value of $R > \alpha(G, w)$, it is possible to witness contextuality for a sufficiently high value of $\mathrm{Corr}$ (see Eq. (70)).

5.4 Example: KCBS scenario

We will now illustrate our hypergraph framework by applying it to the KCBS scenario to make differences with respect to the CSW graph-theoretic framework [22] explicit.

The graph $G$ for the KCBS scenario is given in Fig. 3, the measurement events hypergraph $\Gamma_G$ is given in Fig. 5, and the source events hypergraph $\Sigma_G$ is given in Fig. 6. We then have

$$R([s|S]) = \sum_{v\in V(G)} p(v|S, s), \tag{75}$$

where the (vertex) weights $w_v = 1$ for all $v \in V(G)$, i.e., it's an unweighted graph and we will use $\alpha(G)$ and $\alpha^*(G)$ to denote its independence number and the fractional packing number, respectively. These are given by

$$\alpha(G) = 2 \ \text{ and } \ \alpha^*(G) = 5/2. \tag{76}$$

The source-measurement correlation term is given by

$$\mathrm{Corr} = \sum_{e\in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, p(m_e, s_e|M_e, S_e) \tag{77}$$

for any choice of probability distribution $q \equiv \{q_e\}_{e\in E(\Gamma_G)}$. For simplicity, we will just take this probability distribution to be uniform, i.e., $q_e = \frac{1}{5}$ for all $e \in E(\Gamma_G)$. Note that the only extremal probabilistic model on $\Gamma_G$ corresponding to an indeterministic assignment (in $\Lambda_{\rm ind}$) assigns $\xi(v|\lambda) = \frac{1}{2}$ for all $v \in V(G)$. This means

$$\beta(\Gamma_G, q) = \frac{1}{2} \quad \forall q. \tag{78}$$
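As a sanity check on the values quoted in Eq. (76), the following minimal sketch brute-forces the independence number of the KCBS graph and confirms that the uniform assignment of 1/2 to every vertex is a feasible fractional packing of value 5/2. It assumes that $G$ is the 5-cycle with unit weights and that the packing constraints are the edge constraints (the maximal cliques of a 5-cycle are its edges); all function and variable names are illustrative and do not come from the paper.

```python
from fractions import Fraction
from itertools import combinations

# Assumed KCBS graph: the 5-cycle, vertex i adjacent to i+1 (mod 5).
N = 5
EDGES = {frozenset((i, (i + 1) % N)) for i in range(N)}


def is_independent(vertices):
    """True if no two of the given vertices share an edge."""
    return all(frozenset(pair) not in EDGES
               for pair in combinations(vertices, 2))


def independence_number():
    """Brute-force the size of the largest independent set."""
    best = 0
    for k in range(N + 1):
        if any(is_independent(s) for s in combinations(range(N), k)):
            best = k
    return best


def packing_value(p):
    """Return sum_v p_v if p is a feasible fractional packing, else None.

    Feasibility here means p_v >= 0 and p_u + p_v <= 1 on every edge.
    """
    if any(x < 0 for x in p):
        return None
    if any(sum(p[v] for v in edge) > 1 for edge in EDGES):
        return None
    return sum(p)


alpha = independence_number()
alpha_star_lower = packing_value([Fraction(1, 2)] * N)
# Upper bound: summing the five edge constraints gives 2 * sum_v p_v <= 5.
alpha_star_upper = Fraction(len(EDGES), 2)

print(alpha, alpha_star_lower, alpha_star_upper)  # prints: 2 5/2 5/2
```

The lower and upper bounds on the fractional packing number coincide, so no linear-programming solver is needed for this particular graph; a general weighted graph would require one.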

Figure 7: Geometric configuration of the vectors appearing in the KCBS construction [47].

The noncontextuality inequality of Eq. (71)

$$R \leq \alpha(G, w) + \frac{\alpha^*(G, w) - \alpha(G, w)}{p_0}\,\frac{1 - \mathrm{Corr}}{1 - \beta(\Gamma_G, q)} \tag{79}$$

then becomes (in the KCBS scenario)

$$R \leq 2 + \frac{1/2}{p_0}\,\frac{1 - \mathrm{Corr}}{1/2}, \tag{80}$$

or

$$R \leq 2 + \frac{1 - \mathrm{Corr}}{p_0}. \tag{81}$$

Recall that the KCBS inequality [22, 47] reads $R \leq 2$ and it would be a valid noncontextuality inequality in our framework if and only if one can find measurements and preparations such that $\mathrm{Corr} = 1$.

In the standard KCBS construction [47] that violates the inequality $R \leq 2$, we have the five vertices in $G$ (say $v_i$, $i \in \{1, 2, 3, 4, 5\}$, labelled cyclically) associated with five projectors $\Pi_i = |l_i\rangle\langle l_i|$, $i \in \{1, 2, 3, 4, 5\}$, on a qutrit Hilbert space, given by the vectors $|l_i\rangle = (\sin\theta\cos\phi_i, \sin\theta\sin\phi_i, \cos\theta)$, $\phi_i = \frac{4\pi i}{5}$, and $\cos\theta = \frac{1}{\sqrt[4]{5}}$. The special source event $[s_{e_*} = 0|S_{e_*}]$ is associated with the quantum state $|\psi\rangle = (0, 0, 1)$, so that

$$R = \sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 = \sqrt{5} > 2. \tag{82}$$

See Fig. 7 for a depiction of the geometric configuration of these vectors.

To turn this KCBS construction into an argument against noncontextuality in our approach, we need additional ingredients beyond the graph $G$. Firstly, for both the measurement events hypergraph $\Gamma_G$ and the source events hypergraph $\Sigma_G$, we denote the hyperedges by $e_i$, $i \in \{1, 2, 3, 4, 5\}$. In $\Gamma_G$, the measurement events for the setting $M_{e_i}$ are given by $\{[m_{e_i} = 0|M_{e_i}] = |l_i\rangle\langle l_i|,\ [m_{e_i} = 1|M_{e_i}] = I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|,\ [m_{e_i} = 2|M_{e_i}] = |l_{i+1}\rangle\langle l_{i+1}|\}$, where for $i = 5$, $i + 1 = 1$ (addition modulo 5). Similarly, in $\Sigma_G$, the source events corresponding to source setting $S_{e_i}$ are given by $\{[s_{e_i} = 0|S_{e_i}] = |l_i\rangle\langle l_i|,\ [s_{e_i} = 2|S_{e_i}] = |l_{i+1}\rangle\langle l_{i+1}|$, and $[s_{e_i} = 1|S_{e_i}] = I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|\}$, where $p(s_{e_i} = b|S_{e_i}) = \frac{1}{3}$ for all $b \in \{0, 1, 2\}$. The special source setting $S_{e_*}$ consists of source events $\{[s_{e_*} = 0|S_{e_*}] = |\psi\rangle\langle\psi|,\ [s_{e_*} = 1|S_{e_*}] = \frac{I - |\psi\rangle\langle\psi|}{2}\}$, where $p(s_{e_*} = 0|S_{e_*}) = \frac{1}{3}$ and $p(s_{e_*} = 1|S_{e_*}) = \frac{2}{3}$. We thus have the operational equivalences we need between the source settings:

$$[\top|S_{e_\top}] \simeq [\top|S_{e'_\top}] = \frac{I}{3}, \quad \forall e, e' \in E(\Sigma_G). \tag{83}$$

This choice of representation for $\Gamma_G$ and $\Sigma_G$ yields $p_0 = \frac{1}{3}$, $\mathrm{Corr} = 1$, and $R([s_{e_*} = 0|S_{e_*}]) = \sqrt{5}$, so that the inequality

$$R \leq 2 + \frac{1 - \mathrm{Corr}}{p_0} \tag{84}$$

is violated. However, note that this is an idealization (under which $\mathrm{Corr} = 1$) and, typically, the source events and measurement events will not be perfectly correlated ($\mathrm{Corr} < 1$) and the operational equivalences between the source settings need not correspond to the maximally mixed state. All that is required for a test of noncontextuality using this inequality is that the operational equivalences hold for some choice of preparations and measurements which need not be the same as that in the ideal KCBS construction.

To illustrate what happens when $\mathrm{Corr} < 1$, we consider the effect of a depolarizing channel on the states and measurements in the ideal KCBS construction. The channel is given by

$$\mathcal{D}_r(\cdot) = r\,\mathcal{I}(\cdot) + (1 - r)\,\frac{I}{3}\,\mathrm{Tr}(\cdot), \quad r \in [0, 1]. \tag{85}$$

The action of this channel (with parameter $r_1 \in [0, 1]$, say) on the pure states $\{\{|l_i\rangle\langle l_i|\}_{i=1}^{5}, |\psi\rangle\langle\psi|\}$ yields the noisy states given by

$$\mathcal{D}_{r_1}(|l_i\rangle\langle l_i|) = r_1|l_i\rangle\langle l_i| + (1 - r_1)\frac{I}{3}, \quad \forall i \in [5], \tag{86}$$

$$\mathcal{D}_{r_1}(|\psi\rangle\langle\psi|) = r_1|\psi\rangle\langle\psi| + (1 - r_1)\frac{I}{3}, \tag{87}$$

and the action of its adjoint (with parameter $r_2 \in [0, 1]$, say) on the ideal projectors, $\{|l_i\rangle\langle l_i|\}_{i=1}^{5}$, involved in the measurements correspondingly yields the POVM elements given by

$$\mathcal{D}^{\dagger}_{r_2}(|l_i\rangle\langle l_i|) = r_2|l_i\rangle\langle l_i| + (1 - r_2)\frac{I}{3}, \quad \forall i \in [5]. \tag{88}$$

Hence, we are imagining a situation where the preparation procedures are affected by depolarizing noise with parameter $r_1$ and measurement procedures are affected by depolarizing noise with parameter $r_2$, similar to the situation considered previously in Section

II of the Supplemental material of Ref. [12]. The operational equivalences required for our argument from preparation and measurement noncontextuality are satisfied by these noisy preparations and measurements. That is, in $\Gamma_G$, we can represent the measurement events for the setting $M_{e_i}$ (where $i \in [5]$ and, for $i = 5$, $i + 1 = 1$, i.e., addition modulo 5) by

$$[m_{e_i} = 0|M_{e_i}] = \mathcal{D}^{\dagger}_{r_2}(|l_i\rangle\langle l_i|), \tag{89}$$

$$[m_{e_i} = 1|M_{e_i}] = \mathcal{D}^{\dagger}_{r_2}(I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|), \tag{90}$$

$$[m_{e_i} = 2|M_{e_i}] = \mathcal{D}^{\dagger}_{r_2}(|l_{i+1}\rangle\langle l_{i+1}|). \tag{91}$$

It is easy to verify that these form elements of a valid POVM denoted by the measurement setting $M_{e_i}$ and that the operational equivalences between the measurement events (represented by $\Gamma_G$) are indeed respected. On the other hand, in $\Sigma_G$, the source events corresponding to source setting $S_{e_i}$ can be represented by

$$[s_{e_i} = 0|S_{e_i}] = \mathcal{D}_{r_1}(|l_i\rangle\langle l_i|), \tag{92}$$

$$[s_{e_i} = 1|S_{e_i}] = \mathcal{D}_{r_1}(I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|), \tag{93}$$

$$[s_{e_i} = 2|S_{e_i}] = \mathcal{D}_{r_1}(|l_{i+1}\rangle\langle l_{i+1}|), \tag{94}$$

where $p(s_{e_i} = b|S_{e_i}) = \frac{1}{3}$ for all $b \in \{0, 1, 2\}$, while the source events for source setting $S_{e_*}$ can be represented by

$$[s_{e_*} = 0|S_{e_*}] = \mathcal{D}_{r_1}(|\psi\rangle\langle\psi|), \tag{95}$$

$$[s_{e_*} = 1|S_{e_*}] = \mathcal{D}_{r_1}\!\left(\frac{I - |\psi\rangle\langle\psi|}{2}\right), \tag{96}$$

where $p(s_{e_*} = 0|S_{e_*}) = \frac{1}{3}$ and $p(s_{e_*} = 1|S_{e_*}) = \frac{2}{3}$. These satisfy the operational equivalences

$$[\top|S_{e_\top}] \simeq [\top|S_{e'_\top}] = \frac{I}{3}, \quad \forall e, e' \in E(\Sigma_G). \tag{97}$$

We then have

$$\mathrm{Corr} = \frac{1}{5}\sum_{e\in E(\Gamma_G)} \frac{1}{3}\sum_{b\in\{0,1,2\}} p(m_e = b|M_e, S_e, s_e = b). \tag{98}$$

Noting that for any qutrit pure state $|\phi\rangle$ and its corresponding projector $|\phi\rangle\langle\phi|$, each affected by depolarizing noise with parameters $r_1$ and $r_2$, respectively, we have

$$\mathrm{Tr}\!\left(\mathcal{D}_{r_1}(|\phi\rangle\langle\phi|)\,\mathcal{D}^{\dagger}_{r_2}(|\phi\rangle\langle\phi|)\right) = \frac{1}{3} + \frac{2}{3}r_1 r_2. \tag{99}$$

Now, each term in the summation defining $\mathrm{Corr}$, namely, $p(m_e = b|M_e, S_e, s_e = b)$, is obtained from a calculation of the type in Eq. (99). Hence, we have for each such term,

$$p(m_e = b|M_e, S_e, s_e = b) = \frac{1}{3} + \frac{2}{3}r_1 r_2, \tag{100}$$

so that

$$\mathrm{Corr} = \frac{1}{3} + \frac{2}{3}r_1 r_2. \tag{101}$$

In the noiseless regime, i.e., $r_1 = r_2 = 1$, this reduces to the ideal KCBS scenario. On the other hand, we have

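$$\begin{aligned}
R([s_{e_*} = 0|S_{e_*}]) &= \sum_{v\in V(G)} p(v|S_{e_*}, s_{e_*} = 0) \\
&= \sum_{i=1}^{5} \mathrm{Tr}\!\left(\mathcal{D}^{\dagger}_{r_2}(|l_i\rangle\langle l_i|)\,\mathcal{D}_{r_1}(|\psi\rangle\langle\psi|)\right) \\
&= \sum_{i=1}^{5}\left(r_1 r_2|\langle l_i|\psi\rangle|^2 + \frac{r_2(1 - r_1)}{3} + \frac{r_1(1 - r_2)}{3} + \frac{(1 - r_1)(1 - r_2)}{3}\right) \\
&= r_1 r_2\sum_{i=1}^{5}|\langle l_i|\psi\rangle|^2 + \frac{5}{3}(1 - r_1 r_2).
\end{aligned} \tag{102}$$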
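The closed forms in Eqs. (99) and (102) can be checked numerically. The sketch below builds the KCBS vectors explicitly, applies the depolarizing map of Eq. (85) to states and effects, and evaluates the traces directly. It uses only the Python standard library with 3×3 matrices as nested lists; the helper names and the sample noise values are illustrative choices, not part of the paper.

```python
import math

# KCBS vectors: |l_i> = (sin t cos phi_i, sin t sin phi_i, cos t),
# with phi_i = 4*pi*i/5 and cos t = 5**(-1/4); |psi> = (0, 0, 1).
COS_T = 5 ** -0.25
SIN_T = math.sqrt(1 - COS_T ** 2)
L = [(SIN_T * math.cos(4 * math.pi * i / 5),
      SIN_T * math.sin(4 * math.pi * i / 5),
      COS_T) for i in range(1, 6)]
PSI = (0.0, 0.0, 1.0)


def projector(v):
    """Rank-1 projector |v><v| for a real unit vector v."""
    return [[a * b for b in v] for a in v]


def depolarize(op, r):
    """r * op + (1 - r) * Tr(op) * I/3; the adjoint map has the same form."""
    tr = sum(op[k][k] for k in range(3))
    return [[r * op[i][j] + ((1 - r) * tr / 3 if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def trace_product(a, b):
    """Tr(a b) for 3x3 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(3) for j in range(3))


def corr_term(v, r1, r2):
    """Left-hand side of Eq. (99) for the pure state |v>."""
    p = projector(v)
    return trace_product(depolarize(p, r1), depolarize(p, r2))


def r_value(r1, r2):
    """Second line of Eq. (102), summed over the five KCBS projectors."""
    rho = depolarize(projector(PSI), r1)
    return sum(trace_product(depolarize(projector(l), r2), rho) for l in L)


r1, r2 = 0.97, 0.95  # arbitrary sample noise parameters
x = r1 * r2
print(math.isclose(corr_term(L[0], r1, r2), 1 / 3 + 2 * x / 3))
print(math.isclose(r_value(r1, r2), x * math.sqrt(5) + 5 * (1 - x) / 3))
```

Because the depolarizing map is unital and trace preserving, its adjoint has the same functional form, which is why a single `depolarize` helper serves for both preparations and measurements here.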

Recall that violation of the noncontextuality inequality requires that

$$R > 2 + \frac{1 - \mathrm{Corr}}{p_0}. \tag{103}$$

That is,

$$r_1 r_2\sum_{i=1}^{5}|\langle l_i|\psi\rangle|^2 + \frac{5}{3}(1 - r_1 r_2) > 2 + 3\left(1 - \left(\frac{1}{3} + \frac{2}{3}r_1 r_2\right)\right). \tag{104}$$
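The condition in Eq. (104) depends on the noise only through the product $x = r_1 r_2$, so the crossover point can be located numerically. The following sketch bisects on $x$, taking $\sum_i |\langle l_i|\psi\rangle|^2 = \sqrt{5}$ from Eq. (82) and $p_0 = 1/3$ from the ideal construction; the function names and the tolerance are illustrative assumptions.

```python
import math

SQRT5 = math.sqrt(5)  # ideal KCBS value of sum_i |<l_i|psi>|^2, Eq. (82)
P0 = 1 / 3            # probability of the source event [s_{e*} = 0 | S_{e*}]


def lhs(x):
    """R as a function of x = r1 * r2, last line of Eq. (102)."""
    return x * SQRT5 + (5 / 3) * (1 - x)


def rhs(x):
    """2 + (1 - Corr)/p0 with Corr = 1/3 + (2/3) x, Eqs. (101) and (103)."""
    corr = 1 / 3 + (2 / 3) * x
    return 2 + (1 - corr) / P0


def violation_threshold(tol=1e-12):
    """Bisect for the x in [0, 1] at which lhs(x) - rhs(x) changes sign."""
    lo, hi = 0.0, 1.0  # no violation at x = 0, violation at x = 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lhs(mid) > rhs(mid):
            hi = mid
        else:
            lo = mid
    return hi


x_min = violation_threshold()
corr_min = 1 / 3 + (2 / 3) * x_min
print(round(x_min, 3), round(corr_min, 3))  # prints: 0.908 0.939
```

Bisection is valid here because `lhs(x) - rhs(x)` is linear and increasing in `x`, so there is exactly one sign change on the interval.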

Given that $\sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 = \sqrt{5}$, this becomes

$$r_1 r_2\sqrt{5} + \frac{5}{3}(1 - r_1 r_2) > 2 + 2(1 - r_1 r_2). \tag{105}$$

Rewriting this, we obtain

$$r_1 r_2 > 1 - \frac{\sqrt{5} - 2}{\sqrt{5} + \frac{1}{3}} \approx 0.908, \tag{106}$$

that is, the noncontextuality inequality can be violated only when the depolarizing noise is below a certain threshold given by $r_1 r_2 > 0.908$. In terms of $\mathrm{Corr}$, this requires $\mathrm{Corr} > 0.939$. The noiseless case $r_1 = r_2 = 1$ takes us back to the $\mathrm{Corr} = 1$ regime that we previously discussed.

6 Discussion

6.1 Measurement-measurement correlations vs. source-measurement correlations

Note that the usual Kochen-Specker experiment, as conceptualized in Refs. [21–23, 25], for example, involves only the quantity $R([s|S])$, representing correlations between various measurement events when all the measurements are implemented on a system prepared according to the same preparation procedure, denoted by the source event $[s|S]$. Thus, $R$ represents measurement-measurement correlations on a system prepared according to a fixed choice of preparation procedure.

On the other hand, the experiment we have conceptualized in this paper involves, besides the quantity $R$, a quantity $\mathrm{Corr}$ representing source-measurement correlations, characterizing the quality of the measurements in terms of their response to corresponding preparations.

Our noncontextuality inequalities represent a trade-off relation that must hold between $R$ and $\mathrm{Corr}$ in an operational theory that admits a noncontextual ontological model. Here we note that the first example of such a tradeoff relation, albeit only for the case of operational quantum theory with unsharp measurements, appeared in Ref. [39] as the Liang-Spekkens-Wiseman (LSW) inequality [40] which has been shown to be experimentally violated in Ref. [53].³⁶ And, indeed, the developments reported in Ref. [16] and the present paper have their origins in the idea of such a trade-off relation that first appeared in Ref. [39].

³⁶ This experiment, however, is not in a position to make claims about contextuality without presuming the operational theory is quantum theory simply because the LSW inequality presumes operational quantum theory. The noncontextuality inequalities in this paper do not require the operational theory to be quantum theory and can therefore be experimentally tested using techniques from Refs. [13, 37, 54].

6.2 Can our noise-robust noncontextuality inequalities be saturated by a noncontextual ontological model?

A natural question concerns the tightness of these noncontextuality inequalities, i.e., can Eq. (72) be saturated by a noncontextual ontological model? This requires one to specify a noncontextual ontological model reproducing the operational equivalences between the measurement events and between the source settings, such that

$$\mathrm{Corr} + p_0\,(1 - \beta(\Gamma_G, q))\,\frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)} = 1. \tag{107}$$

The assumption of measurement noncontextuality is already implicit in our characterization of the response functions $\xi(m_e|M_e, \lambda)$, and for this reason it is, indeed, trivial to satisfy measurement noncontextuality while saturating these noncontextuality inequalities. Measurement noncontextuality, alone, in fact even allows a violation of the inequality (when no preparation noncontextuality is imposed), the extreme case being $R = \alpha^*(G, w)$ and $1 \geq \mathrm{Corr} > 1 - p_0(1 - \beta(\Gamma_G, q))$. It's the assumption of preparation noncontextuality that is nontrivial to satisfy and we do not know if there exists a general construction of a noncontextual ontological model saturating our noncontextuality inequalities. We outline the general situation below.

6.2.1 The special case of facet-defining Bell-KS inequalities: Corr = 1

If outcome determinism is presumed (as in traditional Bell-KS type treatments), then we know that there exists a necessary and sufficient set of Bell-KS inequalities (each corresponding to a particular choice of $R([s|S])$) that are satisfied by any operational theory admitting a KS-noncontextual ontological model. In particular, each such (facet) Bell-KS inequality can be saturated by KS-noncontextual ontological models that yield probabilities (from $\mathcal{G}(\Gamma_G)$) corresponding to the facet-defining Bell-KS inequality, i.e., which satisfy $R([s|S]) = \alpha(G, w)$ for such a Bell-KS inequality. Indeed, our noise-robust noncontextuality inequalities corresponding to these choices of $R([s|S])$ (i.e., facet-defining Bell-KS inequalities of the Bell-KS polytope which is given by the convex hull of points in $\mathcal{G}(\Gamma_G)|_{\rm det}$) can always be saturated when $\mathrm{Corr} = 1$, because in that case outcome determinism is justified by preparation noncontextuality (cf. Ref. [16]) and our inequalities are identical to the Bell-KS inequalities (saturated by $R = \alpha(G, w)$).

6.2.2 The general case: Corr < 1

Since we do not want to assume outcome determinism, nor necessarily the idealization of $\mathrm{Corr} = 1$,

what is at stake here is the assumption of preparation noncontextuality. This assumption must be satisfied while saturating the noise-robust noncontextuality inequality in order for a measurement noncontextual ontological model to be universally noncontextual. Constructing such a noncontextual ontological model amounts to specifying the distributions $\mu(s_e|S_e, \lambda)$ and $\nu(\lambda)$ such that

$$\forall \lambda \in \Lambda: \ \mu(\lambda|S_e) = \nu(\lambda), \quad \forall e \in E(\Sigma_G), \tag{108}$$

i.e., preparation noncontextuality holds, and we have (rewriting the saturation condition from Eq. (107))

$$(\alpha^*(G, w) - \alpha(G, w))\,\mathrm{Corr} + p_0\,(1 - \beta(\Gamma_G, q))\,R = (\alpha^*(G, w) - \alpha(G, w)) + p_0\,\alpha(G, w)\,(1 - \beta(\Gamma_G, q)), \tag{109}$$

where

$$\mathrm{Corr} = \sum_{s_{e_*}} p(s_{e_*}|S_{e_*})\,\mathrm{Corr}_{s_{e_*}}, \tag{110}$$

$$\mathrm{Corr}_{s_{e_*}} = \sum_{\lambda\in\Lambda} \mathrm{Corr}(\lambda)\,\mu(\lambda|S_{e_*}, s_{e_*}), \tag{111}$$

$$\mathrm{Corr}(\lambda) \equiv \sum_{e\in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\,\xi(m_e|M_e, \lambda)\,\mu(s_e|S_e, \lambda), \tag{112}$$

and

$$R = \sum_{\lambda\in\Lambda} R(\lambda)\,\mu(\lambda|S_{e_*}, s_{e_*} = 0). \tag{113}$$

Unfortunately, we do not have a general construction that can show this to be possible for any noise-robust noncontextuality inequality obtained according to the approach we have outlined. We therefore leave it as an open question whether such an inequality can (always?) be saturated by a noncontextual ontological model.

6.3 Can trivial POVMs ever violate these noncontextuality inequalities?

No.

Recall that a trivial POVM is defined as an assignment of positive operators $p(v)I$ to the vertices of $\Gamma_G$, where $I$ is the identity operator on some Hilbert space and $p : V(\Gamma_G) \to [0, 1]$, such that $\sum_{v\in e} p(v) = 1$ for all $e \in E(\Gamma_G)$, is a probabilistic model on $\Gamma_G$.

6.3.1 The case $p \in \mathcal{C}(\Gamma_G)$

Consider trivial POVMs corresponding to any KS-noncontextual probabilistic model, i.e., $p \in \mathcal{C}(\Gamma_G)$ is a convex mixture of deterministic vertices, $\mathcal{G}(\Gamma_G)|_{\rm det}$, or equivalently, of ontic states in $\Lambda_{\rm det}$. In other words, $\mathcal{C}(\Gamma_G) \equiv \mathrm{ConvHull}(\mathcal{G}(\Gamma_G)|_{\rm det})$, the convex hull of points in $\mathcal{G}(\Gamma_G)|_{\rm det}$. The largest value $\mathrm{Corr}$ can take in this case is less than or equal to 1. This means that the upper bound on $R$ from our noncontextuality inequality, Eq. (71), will be greater than or equal to $\alpha(G, w)$, whereas we know that for a KS-noncontextual probabilistic model, $R \leq \alpha(G, w)$. Hence, there is no violation of our noncontextuality inequality for such trivial POVMs.

6.3.2 The case $p \in \mathrm{ConvHull}(\mathcal{G}(\Gamma_G)|_{\rm ind})$

Now consider trivial POVMs that correspond to the indeterministic vertices, $\mathcal{G}(\Gamma_G)|_{\rm ind}$ (correspondingly, $\Lambda_{\rm ind}$), or their convex mixtures. We know that for these trivial POVMs, $\mathrm{Corr} \leq \beta(\Gamma_G, q)$. For any $R \leq \alpha^*(G, w)$ that is achieved by these trivial POVMs, our noncontextuality inequality reads

$$\mathrm{Corr} \leq 1 - p_0\,(1 - \beta(\Gamma_G, q))\,\frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}. \tag{114}$$

A sufficient condition for this inequality to be satisfied is that

$$\beta(\Gamma_G, q) \leq 1 - p_0\,(1 - \beta(\Gamma_G, q))\,\frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{115}$$

which reduces, for $R > \alpha(G, w)$, to

$$p_0 \leq \frac{\alpha^*(G, w) - \alpha(G, w)}{R - \alpha(G, w)}, \tag{116}$$

where the upper bound is greater than or equal to 1, since $\alpha(G, w) < R \leq \alpha^*(G, w)$. This is trivially satisfied since $p_0 \leq 1$.

For $R < \alpha(G, w)$, the sufficient condition of Eq. (115) is again trivially satisfied since it reduces to

$$p_0 \geq -\frac{\alpha^*(G, w) - \alpha(G, w)}{\alpha(G, w) - R}, \tag{117}$$

and we must anyway have $p_0 \geq 0$.

For $R = \alpha(G, w)$, the sufficient condition reduces to $\beta(\Gamma_G, q) \leq 1$, which is again trivially satisfied since $\beta(\Gamma_G, q) < 1$ by definition.

6.3.3 The general case $p \in \mathcal{G}(\Gamma_G)$

In general, a probabilistic model achieved by trivial POVMs can be in the convex hull of both deterministic ($\Lambda_{\rm det}$) and indeterministic ($\Lambda_{\rm ind}$) ontic states, with the total weight on deterministic ontic states denoted by $\Pr(\Lambda_{\rm det})$ and that on indeterministic ontic states by $\Pr(\Lambda_{\rm ind})$, so that $\Pr(\Lambda_{\rm det}) + \Pr(\Lambda_{\rm ind}) = 1$. We then have

$$\begin{aligned}
\mathrm{Corr} &\leq \Pr(\Lambda_{\rm det}) + \Pr(\Lambda_{\rm ind})\,\beta(\Gamma_G, q), \\
R &\leq \Pr(\Lambda_{\rm det})\,\alpha(G, w) + \Pr(\Lambda_{\rm ind})\,\alpha^*(G, w).
\end{aligned} \tag{118}$$
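The bounds of Eq. (118) can be substituted into the tradeoff of Eq. (72) to see numerically that trivial POVMs stay on the noncontextual side. The sketch below does this for the KCBS values $\alpha = 2$, $\alpha^* = 5/2$, $\beta = 1/2$ over random choices of $\Pr(\Lambda_{\rm ind})$ and $p_0$. Evaluating at both upper bounds is the worst case because the left-hand side of Eq. (72) is nondecreasing in both $\mathrm{Corr}$ and $R$. The function names, sample size, and random seed are illustrative assumptions.

```python
import random

# KCBS values from Eqs. (76) and (78); other scenarios would change these.
ALPHA, ALPHA_STAR, BETA = 2.0, 2.5, 0.5


def tradeoff_lhs(corr, r, p0):
    """Left-hand side of Eq. (72); noncontextuality requires it to be <= 1."""
    return corr + p0 * (1 - BETA) * (r - ALPHA) / (ALPHA_STAR - ALPHA)


def worst_case_lhs(pr_ind, p0):
    """Eq. (72) evaluated at the upper bounds on Corr and R from Eq. (118)."""
    pr_det = 1 - pr_ind
    corr_max = pr_det + pr_ind * BETA
    r_max = pr_det * ALPHA + pr_ind * ALPHA_STAR
    return tradeoff_lhs(corr_max, r_max, p0)


random.seed(0)
worst = max(worst_case_lhs(random.random(), random.random())
            for _ in range(100_000))
print(worst <= 1 + 1e-12)
```

Algebraically, the worst case equals $1 - \Pr(\Lambda_{\rm ind})(1 - \beta)(1 - p_0)$, which never exceeds 1 and reaches it only when $\Pr(\Lambda_{\rm ind}) = 0$ or $p_0 = 1$.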

A sufficient condition for satisfaction of the noncontextuality inequality is then

$$1 - \Pr(\Lambda_{\rm ind})\,(1 - \beta(\Gamma_G, q)) \leq 1 - p_0\,(1 - \beta(\Gamma_G, q))\,\frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{119}$$

which becomes

$$p_0 \leq \frac{\alpha^*(G, w) - \alpha(G, w)}{R - \alpha(G, w)}\,\Pr(\Lambda_{\rm ind}) \tag{120}$$

when $R > \alpha(G, w)$. Noting that from

$$R \leq \alpha(G, w) + \Pr(\Lambda_{\rm ind})\,(\alpha^*(G, w) - \alpha(G, w)),$$

we have

$$\Pr(\Lambda_{\rm ind}) \geq \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{121}$$

so that the sufficient condition for satisfaction of the noncontextuality inequality becomes $p_0 \leq 1$, which is trivially satisfied.

When $R = \alpha(G, w)$, the sufficient condition becomes $\beta(\Gamma_G, q) \leq 1$, which is again trivially satisfied. Finally, when $R < \alpha(G, w)$, the sufficient condition becomes

$$p_0 \geq -\frac{\alpha^*(G, w) - \alpha(G, w)}{\alpha(G, w) - R}\,\Pr(\Lambda_{\rm ind}), \tag{122}$$

which is again trivially satisfied since $p_0 \geq 0$.

Hence trivial POVMs cannot yield a violation of our noncontextuality inequalities. This is the sense in which trivial POVMs cannot lead to nonclassicality in our approach, unlike the case of traditional Kochen-Specker approaches [21–23, 25] applied to the case of POVMs [73]. To violate our noncontextuality inequalities, the POVMs must necessarily have some nontrivial projective component (that is not the identity operator or zero) but they need not be projectors. Further, we do not rely on restricting the notion of joint measurability [44] (cf. Section 2.4) to commutativity for POVMs. Taking joint measurability to be just commutativity is the approach adopted in, for example, Ref. [25]. We refer to Appendix A and Appendix C for more discussion on these issues, in particular Appendix C for the role of commutativity vs. joint measurability.

7 Conclusions

We have obtained a hypergraph framework for obtaining noise-robust noncontextuality inequalities corresponding to KS-colourable scenarios, suitably augmented with preparation procedures in the spirit of Spekkens contextuality [18]. The inequalities take the form of a noncontextual tradeoff between the three operational quantities $\mathrm{Corr}$, $R$, and $p_0$, cf. Eq. (72). This framework leverages the graph invariants from the graph-theoretic framework of CSW for doing this, in addition to a new hypergraph invariant (Eq. (45)) that we call the weighted max-predictability. Our approach is general enough to be applicable to any situation involving noisy preparations and measurements that arises from a KS-colourable contextuality scenario.

We conclude with a list of open questions raised in this paper and other directions for future research:

1. Characterizing structural Specker's principle from probabilistic models on a hypergraph $\Gamma$:
Given that $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ for some $\Gamma$, is it the case that $\Gamma$ must then necessarily satisfy structural Specker's principle, namely, that every clique in $O(\Gamma)$ is a subset of some hyperedge in $\Gamma$? Or is it the case that there exists a hypergraph $\Gamma'$ for which $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$ but structural Specker's principle fails?
More generally, is there any characterization of a hypergraph satisfying structural Specker's principle entirely in terms of the probabilistic models on it?
As already pointed out earlier, this open question relates to the open Problem 7.2.3 of Ref. [23] of characterizing $\Gamma$ for which $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. It is known that $\Gamma$ representing bipartite Bell scenarios [55] satisfy the property $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ and we have provided a generic recipe for converting any $\Gamma$ that does not satisfy structural Specker's principle to a $\Gamma'$ that does satisfy it so that $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$. The question is if there are any other $\Gamma$ that also satisfy $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$.

2. Almost quantum theory:
We know that an almost quantum theory cannot satisfy Specker's principle [49] but it satisfies statistical Specker's principle (or consistent exclusivity). An open question that remains is:
Can an almost quantum theory satisfy structural Specker's principle?
If not, this would render the satisfaction of consistent exclusivity by an almost quantum theory unexplained by a natural structural feature of measurements in the theory, namely, the satisfaction of structural Specker's principle, i.e., almost quantum theory would not fall in the category of operational theories envisaged in Ref. [30].

3. Conditions for saturating the noise-robust noncontextuality inequalities:
As mentioned in Section 6.2, it is an open question whether the noise-robust noncontextuality inequalities of Eq. (72) based on our generalization of the CSW framework [22] can be saturated by a noncontextual ontological model.

More generally, the status of these noise-robust noncontextuality inequalities vis-à-vis the algorithmic approach of Ref. [17] for finding necessary and sufficient conditions for noncontextuality in a general prepare-and-measure scenario remains to be explored. One would suspect that the algorithmic approach of Ref. [17] when adapted to the kind of situation considered in this paper would yield nontrivial noncontextuality inequalities that aren't merely generalizations of the ones obtained in the CSW framework [22]. It would be interesting to investigate the full structure of this set of inequalities and compare it with the facet-defining Bell-KS inequalities of the CSW framework.

4. Properties of the weighted max-predictability, $\beta(\Gamma_G, q)$:
Since the crucial new hypergraph-theoretic ingredient in our inequalities is the weighted max-predictability, it would be interesting to understand properties of this hypergraph invariant on both counts: as a new mathematical object in its own right, one we haven't been able to find a reference to in the hypergraph theory literature, as well as an important parameter of a hypergraph relevant for noise-robustness of a noise-robust noncontextuality inequality. Indeed, as we point out in Footnote 34, identifying a distribution $q$ (in the definition of $\mathrm{Corr}$, Eq. (53)) that minimizes $\beta(\Gamma_G, q)$ for a given $\Gamma_G$ would lead to better noise-robustness in the inequalities of Eqs. (70) or (71).

5. Noise-robust applications of quantum protocols based on KS-contextuality:
A general research direction is to construct noise-robust versions of applications that have previously been suggested for KS-contextuality. Our approach provides a recipe for doing this for any Bell-KS inequality appearing in such applications. Besides serving as a witness for strong nonclassicality [56] (i.e., Spekkens contextuality),³⁷ noise-robust versions of these applications can help benchmark the experiments in terms of the noise that can be tolerated while still witnessing nonclassicality. Examples of such applications include those from Refs. [58–63].

³⁷ As opposed to weak nonclassicality that can arise in epistemically restricted classical theories [57]. See also the talk at Ref. [56], 41:43 minutes, for a short discussion.

Acknowledgments

I would like to thank Andreas Winter for his comments on an earlier version of some of these ideas, Tobias Fritz for the ping-pong and the sing-song in which we often talked about hypergraphs, Rob Spekkens for the often argumentative (but always productive) conversations over lunch, and participants at the Contextuality conference (CCIOSA) at Perimeter Institute, during July 24 - 28, 2017, for very stimulating discussions that fed into the narrative of this paper. I would also like to thank David Schmid, Ana Belén Sainz, Elie Wolfe, and Tomáš Gonda for helping me better articulate the difference between structural vs. statistical readings of Specker's principle, and Eric Cavalcanti for comments on the manuscript. Theorem 1 owes its origin to a discussion with Tomáš Gonda. I would also like to thank anonymous referees for suggestions that immensely improved the presentation of these results. Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada, and by the Province of Ontario through the Ministry of Research, Innovation and Science.

A Status of KS-contextuality as an experimentally testable notion of nonclassicality for POVMs in quantum theory

The purpose of this section is to emphasize how the progression from KS-contextuality to Spekkens contextuality for KS-type contextuality experiments is a natural one rather than an ad hoc move from one framework to another. That is, Spekkens contextuality is not just another notion of nonclassicality that is incomparable with KS-contextuality, but is indeed intimately connected in its motivations to the limitations of KS-contextuality [18]. In particular, we will focus on the role of KS-contextuality with respect to POVMs and why allowing arbitrary POVMs poses a difficulty for KS-contextuality as a notion of nonclassicality that is experimentally testable, i.e., a notion that applies to noisy measurements (POVMs) typically implemented in a laboratory experiment.³⁸ While one may be tempted to reject this premise for assessing the suitability of KS-contextuality as a notion of nonclassicality (claiming instead that KS-contextuality was never meant for POVMs and applies only to "purified" experiments, namely, ones with only PVMs and pure states), the reasons for doing so are rooted in the literature on KS-contextuality where POVMs have indeed been considered and (at least) two kinds of conclusions drawn: one, that there exists a Kochen-Specker contradiction for POVMs, even on a qubit, so KS-contextuality for POVMs is interesting [64] and two, that allowing arbitrary POVMs in assessing nonclassicality would make the research program of identifying device-independent principles for quantum correlations in KS-contextuality experiments ill-defined, so quantum correlations allowing arbitrary POVMs are "pathological" [73]. We will look at these arguments in turn and use the latter, in particular, to segue into our motivations for the framework proposed in this paper.

³⁸ And how a rather compelling way to arrive at a notion that is experimentally testable is Spekkens contextuality.

A.1 Limitations of KS-contextuality vis-à-vis POVMs

A.1.1 KS-contextuality for POVMs in the literature

The first paper that applied KS-contextuality to the case of POVMs was by Cabello [64] where a KS-uncolourability argument for POVMs on a single qubit was proposed. This was motivated by the Gleason-type derivation of the Born rule starting with the structure of POVMs due to Busch [66] and Caves et al. [67], analogous to the case of the Kochen-Specker theorem [19] which can be seen as motivated by Gleason's theorem [68]. Insofar as there exists a Gleason-type theorem for POVMs [66, 67], one could motivate KS-contextuality as a reasonable notion of nonclassicality for POVMs, as was presumably the case in Ref. [64]. The role of this notion of nonclassicality is then just to argue, using a finite set of POVM elements, that no KS-noncontextual assignment of outcomes is possible for certain finite sets of POVMs in quantum theory. Should we, however, assume that it is reasonable to demand deterministic assignment of outcomes to POVM elements in an ontological model, just as we do for PVM elements? The argument of Ref. [64] was later criticized on various counts [18, 28, 65] and we refer the reader to Ref. [28] for criticisms pertinent to this paper, namely, that outcome determinism for all unsharp measurements (ODUM in Ref. [28]) in quantum theory is untenable.³⁹ Other works in the literature where KS-contextuality for POVMs has been explored include Refs. [69–72].

Besides, doubts about the experimental testability of the KS theorem were raised in the late '90s in a series of papers by Meyer, Clifton, and Kent [74–76]. A review can be found in Ref. [77]. These doubts were premised on the idea that the set of KS-colourable projectors (or PVMs) on any given Hilbert space is dense in the set of all projectors (or PVMs) on that Hilbert space. That is, for any given set of PVMs yielding a KS contradiction, it is always possible to find PVMs which are arbitrarily "close" to the PVMs required for a KS contradiction (for any finite precision) but which do not themselves lead to a KS contradiction. The property of denseness of KS-colourable sets of measurements in the set of all measurements in fact extends to even the most general case when the measurements are POVMs on any Hilbert space. So, even a KS contradiction for POVMs (such as the one in Ref. [64]) falls prey to the Meyer-Clifton-Kent argument [77]. As Ref. [77] notes:

"Dealing with projective measurements is arguably not enough. One quite popular view of quantum theory holds that a correct version of the measurement rules would take POV measurements as fundamental, with projective measurements either as special cases or as idealisations which are never precisely realised in practice. In order to define an NCHV theory catering for this line of thought, Kent constructed a KS-colourable dense set of positive operators in a complex Hilbert space of arbitrary dimension, with the feature that it gives rise to a dense set of POV decompositions of the identity (Kent, 1999). Clifton and Kent constructed a dense set of positive operators in complex Hilbert space of arbitrary dimension with the special feature that no positive operator in the set belongs to more than one decomposition of the identity (Clifton & Kent, 2000). Again, the resulting set of POV decompositions is dense, and the special feature ensures that one can average over hidden states to recover quantum predictions."

Hence, in any finite precision experiment it would be impossible to test the Kochen-Specker theorem, i.e., such an experimental test would require an infinitely precise measurement and measurements in a real-world laboratory are never infinitely precise. Although there was a lively debate along these lines (see the references in [77]), the resolutions that were proposed all involved modifying the notion of KS-noncontextuality by adding auxiliary assumptions that seek to exclude the Meyer-Clifton-Kent type arguments. A recent attempt in this direction can be found in Ref. [78] where a notion of "ontological faithfulness" is proposed. As such, it was already recognized, for reasons independent of Spekkens contextuality [18], that the notion of KS-noncontextuality needs to be revised if one is to make it experimentally testable.⁴⁰ What Spekkens brought to the fore [18], besides generalizing the notion of contextuality to all experimental procedures rather than measurements alone, was the idea that an experimental test of noncontextuality should not rely on inequalities that presume outcome determinism, just as a test of local causality does not require the assumption of outcome

39Ref. [28] is also a good resource for a detailed analysis of 40Of course, this takes nothing away from the importance of arguments concerning dilations of POVMs, which we will not the Kochen-Specker theorem [19] as a no-go theorem concerning get into here. Besides, it also provides a principled recipe for the logical structure of quantum theory and the constraints it assigning response functions to POVMs. places on the ontological models possible for the theory.

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 34 determinism. Indeed, the assumption of outcome de- that places a non-trivial restriction on cor- terminism for sharp measurements in quantum the- relations will be respected. Thus, this kind ory is derived in the Spekkens framework from the as- of “quantum model” is clearly pathological. sumption of preparation noncontextuality rather than being assumed independently. One way to motivate the present work is as a re- We will now consider the more modern approach sponse to the pathology that Henson and Sainz allude to KS-contextuality along the lines of the frameworks to: that trivial POVMs can realize any probabilistic in Refs.[22, 23, 25] to segue into our framework for model, hence allowing arbitrary POVMs makes the Spekkens contextuality which we develop in this pa- problem of finding principles to identify quantum cor- per. relations in KS-contextuality scenarios trivial, i.e., all probabilistic models are quantum and there is nothing to be learnt about post-quantum probabilistic mod- A.1.2 Classifying probabilistic models: restriction of els. This is because any set of probabilities satisfying quantum models to PVMs the “no-disturbance” or “no-signalling” condition (of Research on KS-contextuality took a different turn which the E1 correlations of CSW [22] are a subset, with the advent of the graph-theoretic framework of in general) can be achieved by (trivial) POVMs by Cabello, Severini and Winter in 2010 [21] (revised simply multiplying an identity operator with every 42 slightly in 2014 [22]), the sheaf-theoretic framework of probability in such an assignment of probabilities. Abramsky and Brandenburger in 2011 [25], and the By the lights of KS-noncontextuality as one’s notion hypergraph based formalism of Ac`ın,Fritz, Leverrier, of classicality, then, trivial POVMs saturating the and Sainz in 2012 [23]. 
The unifying theme of these general probabilistic bound on the correlations would contributions was that they took the key mathemat- seem to be maximally nonclassical (i.e., maximally ical idea underlying KS-noncontextuality and Bell- KS-contextual). To avoid such “pathological” quan- locality — namely, that both are instances of the clas- tum models, they restrict the definition of a quantum sical marginal problem [26, 32, 33] — and built frame- model to allow only projective measurements. Indeed, works that sought to distinguish between classical the- with recent work on a sensible notion of “sharp” mea- ories (namely, those admitting KS-noncontextual on- surement in a general probabilistic theory [30, 31], tological models), quantum theory, and post-quantum an appeal to the “fundamental sharpness” of all mea- general probabilistic theories by classifying their em- surements (see, e.g., [29]) is made to restrict attention pirical predictions relative to a Kochen-Specker ex- to sharp measurements in both quantum theory and periment into these categories. All these frameworks, general probabilistic theories. motivated by the device-independence paradigm, es- On the other hand, the approach in this paper is chewed the erstwhile restriction of the notion of KS- different. In particular, we want our approach to cap- noncontextuality to quantum theory and sought to ture the intuition that trivial POVMs are “classical” make their analysis theory-independent, relying only (and not pathological), so we must go beyond KS- on empirical predictions relative to a KS experiment noncontextuality. A simple operational sense in which to classify theories. 
They separated the assumption trivial POVMs are “classical” is that they reveal noth- of KS-noncontextuality from the operational theory – ing about the quantum state on which they are mea- namely, quantum theory – to which it was originally sured, being incapable of distinguishing any pair of 43 meant to apply, allowing arbitrary operational theo- states whatsoever. The correlations (denoted by ries in their analysis. However, there was a key dis- R([s|S])) usually examined in a KS-contextuality ex- tinction between Bell scenarios and KS-contextuality periment do not allow such experiments to witness scenarios that was lost in this formal unification: the “triviality” of trivial POVMs, i.e., the fact that namely, that while the definition of a quantum proba- they correspond to a fixed probability distribution bilistic model in a Bell scenario need not be restricted that doesn’t vary even as the choice of preparation to (local) PVMs (and arbitrary local POVMs can be is varied. Moreover, since all nonprojective mea- allowed without changing the set of quantum mod- surements are excluded by fiat in traditional Kochen- els), the same is not true of a KS-contextuality sce- Specker type approaches [22, 23] for reasons alluded nario. Indeed, as Henson and Sainz note in their work to by Henson and Sainz [73], one loses out on the po- [73],41 reflecting on the question of allowing arbitrary tential to explore the possibilities that nontrivial and POVMs in the definition of a quantum probabilistic nonprojective measurements offer with respect to con- model: 42Trivial POVMs are, therefore, trivial resolutions of the identity, where every POVM element is proportional to identity, i.e., {a } , such that a ∈ [0, 1] and P a = 1. 
...if we allow general POVMs rather than I a a projective measurements then no principle 43Indeed, any trivial POVM can be realized in the follow- ing operational manner: take the quantum system prepared in 41Proposing a principle bounding the KS-contextuality pos- some state, throw it in the garbage, and then sample from the sible in quantum theory, namely, “Macroscopic Noncontextual- classical probability distribution corresponding to the trivial ity”. POVM.

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 35 textuality.44 Our approach, therefore, is to allow arbi- derstood in terms of trivial local POVMs. Hence, it is trary POVMs when considering probabilistic models the locality of the trivial POVMs in a Bell experiment arising from quantum theory (and not restricting to that prevents them from violating a Bell inequality any notion of “sharp measurements” in general prob- and renders them non-pathological, unlike in the case abilistic theories) but examine more quantities than of KS-contextuality. The fact that they are “trivial” are examined in traditional approaches, i.e., besides in the sense of being unable to distinguish two quan- the quantity R typical in a KS-contextuality scenario, tum states plays a role in the sense that, regardless of we invoke the quantity Corr to account for noise in the the shared quantum state, these POVMs yield fixed measurements. distributions over the measurement outcomes, thus If one restricts attention to operational theories always allowing the construction of a fixed (that is, that can always achieve Corr = 1 for any KS- independent of the quantum state) global joint prob- contextuality scenario, then the usual classification ability distribution over all measurements in a Bell of probabilistic models following Refs. [22, 23] holds scenario. Since there are no such locality constraints (Eq. (72)). What is of interest in our framework, how- on the form of the POVM elements in a Kochen- ever, is the tradeoff between R and Corr: how large Specker experiment, they can easily violate any KS- can both R and Corr be in an operational theory? noncontextuality inequality, e.g., the two-party CHSH (See Eq. (72).) 
experiment considered as a Kochen-Specker experi- ment with four observables in a 4-cycle where ad- A.2 Robustness of Bell nonlocality vis-`a-vis jacent pairs are jointly measurable allows for trivial POVMs (like the PR-box trivial POVM above) violat- POVMs ing the CHSH-type Bell-KS inequality in this scenario Note that whenever we refer to “Bell-KS” functionals maximally. By the lights of KS-noncontextuality, this or inequalities for Kochen-Specker type experiments, violation would indicate the maximum possible KS- we are not thinking of experiments that are Bell ex- contextuality with respect to this CHSH-type inequal- 45 periments [4,5,7,9–11], which have spacelike sepa- ity. For all these reasons, our discussion of KS- ration between multiple parties, each performing lo- noncontextuality as a notion of classicality — in an cal measurements on a shared multipartite prepara- experiment with no locality constraints on the mea- tion. For the case of Bell experiments, trivial local surements — does not extend to the case of Bell- POVMs assigned to each party in a Bell experiment locality (or local causality) as a notion of classicality do not lead to Bell violations for a simple reason: the in a Bell experiment, where the experiment must re- trivial POVMs for each party are all compatible with spect locality constraints on the measurements for a each other, thereby admitting a joint probability dis- Bell inequality violation to be meaningful. tribution over their outcomes for each party; taking a product of these local joint probability distributions The unification of Bell nonlocality and KS- (one for each party) results in a joint distribution over contextuality `ala Refs. [22, 23, 25] forces a certain all measurements of all parties, hence satisfying Bell dichotomy in these approaches: while in Bell scenar- inequalities. 
The fact that the POVMs are trivial ios, one need not restrict to any notion of a “sharp” ensures that the Bell inequalities are satisfied regard- measurement in the definition of probabilistic models less of the choice of shared quantum state. On the (and thus claim “theory independence”), in Kochen- other hand, forgetting the constraint of local POVMs, Specker scenarios, one must make some statement there always exist global trivial POVMs that can vi- about the nature of the measurements (concerning olate Bell inequalities: e.g., just take the Popescu- their presumed sharpness [29], or that their joint mea- Rohrlich (PR) box distribution [43], and multiply an surability [42, 44] is restricted to commutativity [25]), identity operator (on the joint Hilbert space of Al- rendering any putative “theory independence” claim ice and Bob) with each probability in the PR-box; (on a level at par with Bell nonlocality) unfounded.46 this results in four trivial POVMs, defined over the joint Hilbert space, that together violate the CHSH inequality maximally. But, of course, this violation is uninteresting because it doesn’t obey the locality constraint on the measurements in a Bell experiment. This is mathematically reflected in the fact that the PR-box distribution cannot be written as a convex 45See AppendixC for more discussion. mixture of product distributions, one for each party, 46See Ref. [27] for how this lack of locality of measurements in hence the corresponding trivial POVM cannot be un- a Kochen-Specker type experiment translates, at the ontolog- ical level, to the unreasonableness of assuming factorizability 44All trivial POVMs are nonprojective, but not all nonprojec- in the ontological model; this factorizability (or the stronger tive POVMs are trivial. Indeed, see Refs. 
[39–42] for examples condition of outcome determinism) is invoked to justify the re- of generalized contextuality [18] with nonprojective measure- sulting derivation of Bell-KS inequalities as constraints from a ments, albeit assuming operational quantum theory. classical marginal problem.
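The contrast drawn in A.2 between trivial local POVMs and a global trivial POVM built from the PR-box can be checked numerically. The following is a minimal sketch, assuming two qubits and using NumPy; the function names and the weights `q` and `r` are hypothetical choices made for illustration and do not appear in the original analysis. It evaluates the CHSH-type quantity $\frac{1}{4}\sum_{a \oplus b = xy} p(a,b|x,y)$, whose bound for product (local) statistics is 3/4.

```python
import itertools
import numpy as np


def random_density_matrix(dim, rng):
    """Return a random density matrix (positive semidefinite, unit trace)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real


def chsh_value(p):
    """(1/4) * sum of p[a, b, x, y] over entries with a XOR b == x AND y."""
    return sum(
        p[a, b, x, y]
        for a, b, x, y in itertools.product((0, 1), repeat=4)
        if (a ^ b) == (x & y)
    ) / 4


def local_trivial_statistics(q, r, rho):
    """p(a,b|x,y) = Tr[rho (q[x][a] I_A) (x) (r[y][b] I_B)] for trivial local POVMs."""
    eye = np.eye(2)
    p = np.zeros((2, 2, 2, 2))
    for a, b, x, y in itertools.product((0, 1), repeat=4):
        effect = np.kron(q[x][a] * eye, r[y][b] * eye)
        p[a, b, x, y] = np.trace(rho @ effect).real
    return p


def pr_box_statistics(rho):
    """p(a,b|x,y) = Tr[rho G], with G equal to the PR-box probability times the identity."""
    eye = np.eye(rho.shape[0])
    p = np.zeros((2, 2, 2, 2))
    for a, b, x, y in itertools.product((0, 1), repeat=4):
        weight = 0.5 if (a ^ b) == (x & y) else 0.0
        p[a, b, x, y] = np.trace(rho @ (weight * eye)).real
    return p


rng = np.random.default_rng(0)
rho_1 = random_density_matrix(4, rng)
rho_2 = random_density_matrix(4, rng)

# Hypothetical outcome weights: row x (or y) is a probability vector over outcomes.
q = np.array([[0.9, 0.1], [0.2, 0.8]])
r = np.array([[0.7, 0.3], [0.4, 0.6]])

local_value = chsh_value(local_trivial_statistics(q, r, rho_1))
global_value = chsh_value(pr_box_statistics(rho_1))
print(local_value, global_value)
```

Under these assumptions the local trivial POVMs give a value of about 0.475, below the bound of 3/4, and the statistics are the same for `rho_1` and `rho_2`. The global trivial POVM gives the value 1 for any state, which is the maximal violation described in the text.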

B Ontological models without respecting coarse-graining relations

Here we will construct explicit examples where the coarse-graining relations are not respected in an ontological model, in contrast to the requirement on the representation of coarse-grainings that we invoked in Section II.C of the main text. The goal is to emphasize that the requirement of Section II.C is necessary not only for the treatment of Spekkens contextuality but also for Kochen-Specker contextuality. Below, we first demonstrate how a “KS-noncontextual” model can be constructed for any scenario that proves the KS theorem by using the example of the KCBS setup [47]. We then proceed to demonstrate how a “preparation and measurement noncontextual” model can be constructed in a similar way when considering generalized noncontextuality [18].

B.1 How to construct a “KS-noncontextual” ontological model of the KCBS experiment [47] without coarse-graining relations

Here we have that $\mathbb{M}$ contains at least the following measurement settings: $\{M_i\}_{i=1}^{5}$, each with three possible outcomes, $m_i \in \{0, 1, 2\}$. The measurement events for each measurement setting $M_i$ can be coarse-grained in two different ways, defining new measurement settings $M'_i$ (with outcomes $m'_i \in \{0, \bar{0}\}$) and $M''_i$ (with outcomes $m''_i \in \{2, \bar{2}\}$), where the coarse-graining relations are given by

$[0|M'_i] \equiv [0|M_i],$ (123)
$[\bar{0}|M'_i] \equiv [1|M_i] + [2|M_i],$ (124)
$[2|M''_i] \equiv [2|M_i],$ (125)
$[\bar{2}|M''_i] \equiv [0|M_i] + [1|M_i].$ (126)

In the operational theory, these coarse-graining relations are respected, i.e., for all $[s|S]$, $s \in V_S$, $S \in \mathbb{S}$,

$p(0, s|M'_i, S) \equiv p(0, s|M_i, S),$ (127)
$p(\bar{0}, s|M'_i, S) \equiv p(1, s|M_i, S) + p(2, s|M_i, S),$ (128)
$p(2, s|M''_i, S) \equiv p(2, s|M_i, S),$ (129)
$p(\bar{2}, s|M''_i, S) \equiv p(0, s|M_i, S) + p(1, s|M_i, S).$ (130)

However, we do not require that these relations be respected in an ontological model. Now, the KCBS argument requires the following operational equivalences,

$[2|M''_i] \simeq [0|M'_{i+1}],$ (131)

for all $i \in \{1, 2, 3, 4, 5\}$, where addition is modulo 5, so that $i + 1 = 1$ for $i = 5$. A KS-noncontextual ontological model for this experiment requires that

$\xi(2|M''_i, \lambda) = \xi(0|M'_{i+1}, \lambda) \in \{0, 1\}, \quad \forall \lambda \in \Lambda.$ (132)

Constructing such a model requires one to specify response functions for the measurements $\{M_i, M'_i, M''_i\}_{i=1}^{5}$. However, since there are no constraints from coarse-graining relations on these response functions, there is no obstruction to the construction of a “KS-noncontextual model” of this type for any set of operational statistics. In particular, since we do not require that $\forall \lambda \in \Lambda: \xi(0|M'_{i+1}, \lambda) \equiv \xi(0|M_{i+1}, \lambda)$, nor that $\forall \lambda \in \Lambda: \xi(2|M''_i, \lambda) \equiv \xi(2|M_i, \lambda)$, we can assign arbitrary response functions to $\{M'_i, M''_i\}_{i=1}^{5}$, subject only to the condition from KS-noncontextuality that $\forall \lambda \in \Lambda: \xi(2|M''_i, \lambda) = \xi(0|M'_{i+1}, \lambda) \in \{0, 1\}$.47 Note that, because coarse-graining relations are not respected, this does not imply that $\forall \lambda \in \Lambda: \xi(2|M_i, \lambda) = \xi(0|M_{i+1}, \lambda) \in \{0, 1\}$, which is the usual constraint we would have presumed from KS-noncontextuality when coarse-graining relations are respected in the ontological model. In the absence of any such constraints on the response functions for $\{M_i\}_{i=1}^{5}$, one can always reproduce their operational statistics, in particular the operational equivalences of the type $[2|M_i] \simeq [0|M_{i+1}]$, which follow from Eqs. (123), (125), and (131).

B.2 How to construct a “preparation and measurement noncontextual” ontological model without coarse-graining relations

Just as for measurements in the case of KS-noncontextuality, abandoning the coarse-graining relations for preparations in the case of generalized noncontextuality [18] makes possible the existence of a “preparation and measurement noncontextual” ontological model for any set of operational statistics. For the kinds of proofs of contextuality relevant to this article, the relevant notion of coarse-graining is that of complete coarse-graining: that is, consider two source settings $S$ and $S'$ with (respective) source events $\{[s|S]\}_{s \in V_S}$ and $\{[s'|S']\}_{s' \in V_{S'}}$, that can be completely coarse-grained to yield the operational equivalence $[\top|S_\top] \simeq [\top|S'_\top]$, cf. Eq. (18). In the operational description, where we assume the coarse-graining relation is respected, this is represented by

$\forall [m|M],\ m \in V_M,\ M \in \mathbb{M}: \quad \sum_{s} p(m, s|M, S) = \sum_{s'} p(m, s'|M, S').$ (133)

In the ontological description, however, we do not impose the coarse-graining relations $\mu(\lambda, \top|S_\top) \equiv \sum_s \mu(\lambda, s|S)$ and $\mu(\lambda, \top|S'_\top) \equiv \sum_{s'} \mu(\lambda, s'|S')$, which makes it trivial to write down probability distributions $\mu(\lambda, \top|S_\top)$ and $\mu(\lambda, \top|S'_\top)$ such that $\mu(\lambda, \top|S_\top) = \mu(\lambda, \top|S'_\top)$ (as required by preparation noncontextuality applied to $[\top|S_\top] \simeq [\top|S'_\top]$) but where we do not require that $\sum_s \mu(\lambda, s|S) = \sum_{s'} \mu(\lambda, s'|S')$ (which is not required by preparation noncontextuality). Note how the refusal to respect the coarse-graining relations, i.e., identifying $\mu(\lambda, \top|S_\top)$ with $\sum_s \mu(\lambda, s|S)$ and $\mu(\lambda, \top|S'_\top)$ with $\sum_{s'} \mu(\lambda, s'|S')$, lifts the constraint from preparation noncontextuality that would have been in place if the coarse-graining relations were respected. The same refusal for the case of measurements lifts any constraints (just as in the case of KS-noncontextuality above) from measurement noncontextuality on the ontological model. It thus becomes trivial to construct a “preparation and measurement noncontextual” ontological model without coarse-graining relations.

C Trivial POVMs

C.1 Bell-CHSH scenario

We have the Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$ for Alice ($\mathcal{H}_A$) and Bob ($\mathcal{H}_B$). Consider four binary-outcome POVMs, $\{A^{(0)}, A^{(1)}, B^{(0)}, B^{(1)}\}$, where

$A^{(0)} \equiv \{A^{(0)}_0, A^{(0)}_1\},$
$A^{(1)} \equiv \{A^{(1)}_0, A^{(1)}_1\},$
$B^{(0)} \equiv \{B^{(0)}_0, B^{(0)}_1\},$
$B^{(1)} \equiv \{B^{(1)}_0, B^{(1)}_1\},$ (134)

$0 \le A^{(0)}_0, A^{(1)}_0 \le \mathbb{I}_{\mathcal{H}_A}$, $0 \le B^{(0)}_0, B^{(1)}_0 \le \mathbb{I}_{\mathcal{H}_B}$, $A^{(0)}_0 + A^{(0)}_1 = A^{(1)}_0 + A^{(1)}_1 = \mathbb{I}_{\mathcal{H}_A}$, and $B^{(0)}_0 + B^{(0)}_1 = B^{(1)}_0 + B^{(1)}_1 = \mathbb{I}_{\mathcal{H}_B}$. The quantum probability, given a shared quantum state $\rho_{AB}$ defined on $\mathcal{H}_A \otimes \mathcal{H}_B$, is given by

$p(a, b|x, y) = \mathrm{Tr}(\rho_{AB}\, A^{(x)}_a \otimes B^{(y)}_b),$ (135)

for $a, b, x, y \in \{0, 1\}$. Here $A^{(x)} \otimes \mathbb{I}_{\mathcal{H}_B}$ is jointly measurable with $\mathbb{I}_{\mathcal{H}_A} \otimes B^{(y)}$, just because of the commutativity of their respective POVM elements. The joint observable being measured is $A^{(x)} \otimes B^{(y)}$. Now, consider the case when all the POVM elements are trivial, i.e., $A^{(x)}_a = q^{(x)}_a \mathbb{I}_{\mathcal{H}_A}$ and $B^{(y)}_b = r^{(y)}_b \mathbb{I}_{\mathcal{H}_B}$, for some $q^{(x)}_a, r^{(y)}_b \in [0, 1]$ for all $a, b, x, y \in \{0, 1\}$. We then have

$p(a, b|x, y) = q^{(x)}_a r^{(y)}_b, \quad \forall a, b, x, y \in \{0, 1\}.$ (136)

A global joint probability distribution which reproduces the above as marginals is simply given by their product:

$p(a^{(0)}, a^{(1)}, b^{(0)}, b^{(1)}) \equiv q^{(0)}_{a^{(0)}}\, q^{(1)}_{a^{(1)}}\, r^{(0)}_{b^{(0)}}\, r^{(1)}_{b^{(1)}}.$ (137)

Hence, trivial POVMs never violate any Bell-CHSH inequality for this scenario.

C.2 CHSH-type contextuality scenario: 4-cycle

We now consider the Bell-CHSH scenario without the constraint of spacelike separation. What the lack of spacelike separation means from the quantum perspective is that one no longer needs to model this spacelike separation by requiring a tensor product structure, or (more generally) by requiring the commutativity of the observables that are jointly measured [25, 79, 80]. That is, there is no physical justification for imposing the tensor product structure or the commutativity of jointly measured observables.48 Thus, we have the Hilbert space $\mathcal{H}$ and we consider four binary-outcome POVMs, $\{A^{(0)}, A^{(1)}, B^{(0)}, B^{(1)}\}$, on $\mathcal{H}$, where

$A^{(0)} \equiv \{A^{(0)}_0, A^{(0)}_1\},$
$A^{(1)} \equiv \{A^{(1)}_0, A^{(1)}_1\},$
$B^{(0)} \equiv \{B^{(0)}_0, B^{(0)}_1\},$
$B^{(1)} \equiv \{B^{(1)}_0, B^{(1)}_1\},$ (138)

$0 \le A^{(0)}_0, A^{(1)}_0, B^{(0)}_0, B^{(1)}_0 \le \mathbb{I}_{\mathcal{H}}$, $A^{(0)}_0 + A^{(0)}_1 = A^{(1)}_0 + A^{(1)}_1 = B^{(0)}_0 + B^{(0)}_1 = B^{(1)}_0 + B^{(1)}_1 = \mathbb{I}_{\mathcal{H}}$. Further, the following sets of POVMs are jointly measurable: $\{A^{(0)}, B^{(0)}\}$, $\{A^{(0)}, B^{(1)}\}$, $\{A^{(1)}, B^{(0)}\}$, $\{A^{(1)}, B^{(1)}\}$. The most general joint measurement for a pair of compatible POVMs $\{A^{(x)}, B^{(y)}\}$ is given by a POVM $G^{(xy)} \equiv \{G^{(xy)}_{00}, G^{(xy)}_{01}, G^{(xy)}_{10}, G^{(xy)}_{11}\}$ (that isn’t necessarily unique [42]) such that: $G^{(xy)}_{00} + G^{(xy)}_{01} = A^{(x)}_0$, $G^{(xy)}_{10} + G^{(xy)}_{11} = A^{(x)}_1$, $G^{(xy)}_{00} + G^{(xy)}_{10} = B^{(y)}_0$, $G^{(xy)}_{01} + G^{(xy)}_{11} = B^{(y)}_1$. In particular, if (and only if) the POVMs $A^{(x)}$ and $B^{(y)}$ commute, we can construct the joint POVM as a product: $G^{(xy)}_{ab} = A^{(x)}_a B^{(y)}_b$ for all $a, b, x, y \in \{0, 1\}$. In the absence of such commutativity, the joint POVM cannot be written as a product.

The quantum probability, given a quantum state $\rho$ on $\mathcal{H}$, is given by

$p(a, b|x, y) = \mathrm{Tr}(\rho\, G^{(xy)}_{ab}),$ (139)

for $a, b, x, y \in \{0, 1\}$. Note that this probability depends on the joint measurement $G^{(xy)}$ implementing $A^{(x)}$ and $B^{(y)}$ together, and that, in general, there may be multiple choices of $G^{(xy)}$ possible. This is easy to see since there is one undetermined positive operator in the joint measurement that is not fixed by $A^{(x)}$ or $B^{(y)}$, i.e., we can write the POVM elements of $G^{(xy)}$ as: $G^{(xy)}_{01} = A^{(x)}_0 - G^{(xy)}_{00}$, $G^{(xy)}_{10} = B^{(y)}_0 - G^{(xy)}_{00}$, $G^{(xy)}_{11} = \mathbb{I}_{\mathcal{H}} - A^{(x)}_0 - B^{(y)}_0 + G^{(xy)}_{00}$, where $G^{(xy)}_{00}$ is a positive semidefinite operator satisfying $A^{(x)}_0 + B^{(y)}_0 - \mathbb{I}_{\mathcal{H}} \le G^{(xy)}_{00} \le A^{(x)}_0, B^{(y)}_0$. Here $G^{(xy)}_{00}$ represents the freedom in the choice of how the joint measurement might be implemented within quantum theory. This freedom reflects the fact that since the jointly measured observables are no longer spacelike separated, it is possible to introduce correlations between them that are stronger than what is allowed in the corresponding Bell scenario in quantum theory. The strength of these correlations is only limited by the constraints on $G^{(xy)}_{00}$ imposed by the marginal observables $A^{(x)}$ and $B^{(y)}$. This is in contrast to the case where $A^{(x)}$ and $B^{(y)}$ are spacelike separated observables and the only choice of joint POVM consistent with spacelike separation is fixed by $G^{(xy)}_{00} = A^{(x)}_0 B^{(y)}_0$, i.e., the strength of correlations between $A^{(x)}$ and $B^{(y)}$ is fixed entirely by them and there is no freedom in choosing $G^{(xy)}$.

Thus, we have that $A^{(x)}$ is jointly measurable with $B^{(y)}$ and $G^{(xy)}$ denotes a joint POVM of $A^{(x)}$ and $B^{(y)}$. Now, consider the case when all the POVM elements are trivial, i.e., $A^{(x)}_a = q^{(x)}_a \mathbb{I}_{\mathcal{H}}$ and $B^{(y)}_b = r^{(y)}_b \mathbb{I}_{\mathcal{H}}$, for some $q^{(x)}_a, r^{(y)}_b \in [0, 1]$ for all $a, b, x, y \in \{0, 1\}$.

In particular, consider the case where $q^{(x)}_a = r^{(y)}_b = \frac{1}{2}$ for all $a, b, x, y \in \{0, 1\}$. A possible joint POVM for these trivial POVMs is then the product POVM:

$G^{(xy)}_{ab} = A^{(x)}_a B^{(y)}_b = \frac{1}{4} \mathbb{I}_{\mathcal{H}}.$ (140)

If one restricted joint measurability of $A^{(x)}$ and $B^{(y)}$ to just commutativity – a sufficient but not necessary condition for joint measurability49 [44] – we would take the above choice of the product POVM as a “natural” one. Being a product of trivial POVMs, this choice will never lead to a violation of the CHSH-type inequality for this scenario. Indeed, the structure of a Bell scenario – requiring the decomposition of the Hilbert space as $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ (tensor product paradigm), or more generally, imposing the commutativity requirement $[A^{(x)}_a, B^{(y)}_b] = 0$ (commutativity paradigm) – is such that the only possible choice of joint measurement that can be implemented by spacelike separated parties is the one that corresponds to the product POVM, given by operators $G^{(xy)}_{ab} = A^{(x)}_a B^{(y)}_b$.

However, this is not the only allowed joint measurement for these trivial POVMs, particularly when there is no locality constraint on the measurements from spacelike separation.50 An extreme choice of joint POVM is the following:

$G^{PR(xy)}_{ab} = \delta_{a \oplus b, xy}\, \frac{\mathbb{I}_{\mathcal{H}}}{2},$ (141)

which leads to the probability distribution $p(a, b|x, y) = \frac{1}{2} \delta_{a \oplus b, xy}$ for any choice of quantum state. Hence, this joint POVM $G^{PR(xy)}$ always yields statistics corresponding to the PR-box, maximally violating the CHSH-type inequality for this scenario, namely,

$\sum_{a, b, x, y:\ a \oplus b = xy} \frac{1}{4}\, p(a, b|x, y) \le \frac{3}{4}.$ (142)

Physically, it’s possible to implement this (without requiring any quantum resources) by providing a box that always produces these correlations between measurement settings denoted by $(xy) \in \{0, 1\}^2$, regardless of the input state. Such a black-box would maximally violate the CHSH-type inequality (viewed as a Bell-KS inequality witnessing KS-contextuality), but that shouldn’t be surprising in the absence of spacelike separation. Also, the trivial PR-box joint POVM $G^{PR(xy)}_{ab}$ is a perfectly valid way to implement the joint measurement of trivial POVMs $A^{(x)}$ and $B^{(y)}$ within the standard paradigm of operational quantum theory.51

To summarize, we note the following:

• Within the traditional framework of KS-noncontextuality, if one wants to go beyond projective measurements to arbitrary POVMs in a contextuality scenario, then one must – in order to avoid the pathology of trivial POVMs violating the Bell-KS inequalities maximally – restrict by fiat the notion of joint measurability to merely commutativity. This is, for example, the attitude adopted in Ref. [25].

• However, if one is going beyond projective measurements, we know that commutativity is only a sufficient condition for joint measurability, not a necessary one [44].

• This brings us to our observation that the traditional notion of KS-noncontextuality is pathological once the most general situation in quantum theory is considered: arbitrary POVMs with the general notion of joint measurability (see, e.g., Ref. [44] for this notion and its relation to commutativity). In particular, in the absence of spacelike separation, there is no physical justification to restrict the notion of joint measurability to merely commutativity.

• A similar consideration applies at the level of a KS-noncontextual ontological model: there, factorizability is not justified in the absence of spacelike separation. So, on those grounds alone, one should go beyond KS-noncontextuality as one’s notion of classicality; particularly, if one wants a notion of classicality that does not presume outcome determinism, just as local causality doesn’t presume it. This was argued in Ref. [27]: imagine an adversarial setting where because of the absence of spacelike separation in a KS-contextuality experiment, two measurement settings on the same system can exhibit correlations that are independent of those induced by the system on which the measurements are being implemented, thus allowing them to exhibit stronger correlations than are possible in a KS-noncontextual model. We use trivial POVMs only to drive home that this can be done arbitrarily well (achieving PR-box type correlations, in fact) if there is no constraint on the strength of correlations the measurement settings can exhibit. The way such constraints on the correlations between the measurement settings show up in our analysis within the Spekkens framework is in terms of the quantity Corr: if Corr is really high, the measurements in a noncontextual ontological model cannot be arbitrarily strongly correlated, i.e., R cannot be arbitrarily high (cf. Eq. (72)).

D The KS-uncolourable hypergraph $\Gamma_{18}$

It is instructive to consider the KS-uncolourable hypergraph $\Gamma_{18}$, originally appearing in Ref. [51], and studied in the light of Spekkens contextuality in Ref. [12]. This hypergraph fails both criteria for the hypergraphs $\Gamma$ considered in this paper, namely, $\mathcal{C}(\Gamma) \neq \emptyset$ (KS-colourability) and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. For probabilistic models on $\Gamma_{18}$, the following hold: $\mathcal{C}(\Gamma_{18}) = \emptyset \subsetneq \mathcal{CE}^1(\Gamma_{18}) \subsetneq \mathcal{G}(\Gamma_{18})$. This was considered in Ref. [12], where $\mathcal{CE}^1(\Gamma_{18})$ excludes the extremal probabilistic model in $\mathcal{G}(\Gamma_{18})$ that corresponds to the upper bound on the noise-robust noncontextuality inequality of Ref. [12]. As argued in Ref. [12], this noise-robust noncontextuality inequality is the appropriate operational generalization (to possibly noisy measurements) of the Kochen-Specker contradiction first demonstrated in Ref. [51]; this generalization cannot be accommodated in our generalization of the CSW framework [22].

If one extends the KS-uncolourable $\Gamma_{18}$ to a KS-colourable hypergraph $\Gamma_{27}$ with 9 “no-detection” events, one for each hyperedge, then we have $\mathcal{C}(\Gamma_{27}) \neq \emptyset$, but it’s still the case that $\mathcal{C}(\Gamma_{27}) \subsetneq \mathcal{CE}^1(\Gamma_{27}) \subsetneq \mathcal{G}(\Gamma_{27})$ for this hypergraph.52 Hence, $\Gamma_{27}$ cannot be understood in our generalization of the CSW framework either.53

Indeed, if one “blindly” writes down a CSW classical bound for some Bell-KS expression defined on

Figure 8: The hypergraph $\Gamma_{27}$ and its subhypergraphs, i.e., $\Gamma_{18}$ and $\Gamma_{3}$, appearing in the three Bell-KS expressions of Eq. (143). The probabilistic model p considered in Eq. (143) is a probabilistic model on $\Gamma_{27}$, and not on the subhypergraphs. We have illustrated the subhypergraphs separately only for clarity regarding the subsets of vertices to which the Bell-KS expressions refer: the probabilities assigned to these vertices are obtained from probabilistic models on $\Gamma_{27}$.

47 This “KS-noncontextual” ontological model will thus reproduce operational equivalences of the type $[2|M''_i] \simeq [0|M'_{i+1}]$ (cf. Eq. (131)).

48 On the other hand, what this lack of spacelike separation means from the perspective of an ontological model is that one no longer has a justification for assuming factorizability [25] and, consequently, the generalization of Fine’s theorem [26] fails to prove that there is no loss of generality in assuming outcome determinism in discussions of KS-contextuality (unlike the case of Bell scenarios, where factorizability is justified by spacelike separation); there is a definite loss of generality, in that measurement noncontextual and outcome-indeterministic ontological models that are non-factorizable are not empirically equivalent to measurement noncontextual and outcome-deterministic (or KS-noncontextual) ontological models. See Ref. [27] for a discussion of this aspect.

49 Particularly in the absence of spacelike separation. It is the need to model spacelike separation in a quantum Bell experiment that makes commutativity a necessary (and sufficient) condition for joint measurability of spacelike separated observables in a Bell scenario.

50 To incorporate such a constraint, spacelike separation needs to be modelled via either the tensor product paradigm or the commutativity paradigm. Both these ways of modelling spacelike separation lead to the same set of quantum correlations for any finite-dimensional Hilbert space $\mathcal{H}$ [79]. The question of whether the two paradigms lead to the same set of correlations in the case of infinite dimensional Hilbert spaces is the subject of Tsirelson’s problem [79, 80]. Most studies of Bell-nonlocality are primarily concerned with finite dimensional Hilbert spaces; should one encounter infinite dimensional Hilbert spaces, the commutativity paradigm is the proper way to model spacelike separation.

51 Note that the point of this demonstration is to show how, in the absence of spacelike separation justifying commutativity or a promise that the measurements are sharp, arbitrary correlations are achievable in quantum theory if unsharp measurements are allowed. All trivial POVMs are unsharp, but the converse is not true. That is, one can consider nontrivial POVMs that don’t violate the CHSH-type inequality maximally, but which violate it (arbitrarily) more than is allowed by sharp measurements in quantum theory. One could construct them, for example, by just taking a convex combination of the PR-box trivial POVM with some sharp (and thus product) joint POVM.

52 This follows from noting that extremal probabilistic models on $\Gamma_{18}$ are still extremal probabilistic models on $\Gamma_{27}$: ones where the no-detection events are assigned zero probabilities. See Theorem 2.5.3 of Ref. [23].

53 Note that adding these no-detection events is equivalent to allowing subnormalized probabilities (i.e., sum of probabilities assigned to measurement events in a hyperedge can be less than 1) on $\Gamma_{18}$. Hence, even allowing for subnormalization on $\Gamma_{18}$, which means that one is looking at probabilistic models on the hypergraph $\Gamma_{27}$, does not eliminate the gap between $\mathcal{CE}^1$ probabilistic models and general probabilistic models, so that any upper bound on a Bell-KS expression given by probabilistic models in $\mathcal{CE}^1(\Gamma_{27})$ is not always the same as the general probabilistic upper bound from probabilistic models in $\mathcal{G}(\Gamma_{27})$. The CSW framework only considers the upper bound given by $\mathcal{CE}^1(\Gamma_{27})$ probabilistic models.

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 40 general probabilistic models don’t agree, and we can- not take the graph-theoretic upper bounds of CSW for granted in our noise-robust noncontextuality inequal- ities. Indeed, the general probabilistic upper bound for any Bell-KS expression defined on a contextual- ity scenario is a hypergraph invariant — in the sense that it is a property that is shared by all hypergraphs isomorphic to each other — that may or may not be expressible as a graph invariant `ala CSW. What, then, do the bounds given by graph invari- ants of CSW for O(Γ18) mean in our generalization of the CSW framework? Following our approach, out- lined in Sec. III.B, we can go from G = O(Γ18) to the

hypergraph ΓG = ΓO(Γ18) (see Fig.9) for which we

have (by construction) C(ΓO(Γ18)) 6= ∅ (so that the underlying hypergraph is no longer KS-uncolourable) Figure 9: Going from the orthogonality graph, G, of Γ18 to 1 and CE (ΓO(Γ18)) = G(ΓO(Γ18)) (so that, for any Bell- the hypergraph ΓG (on the right) to which our noise-robust noncontextuality inequality pertains. KS expression, the upper bound given by the frac- tional packing number α∗(G, w) in the CSW frame- work agrees with the general probabilistic upper

O(Γ18), then such a bound is equivalently a bound bound). Since this construction proceeds by con- for the same Bell-KS expression defined on Γ27 (where verting all maximal cliques in Γ18 to hyperedges in normalization is restored). Further, the E1 bound on ΓO(Γ18) and adding a new vertex to each such hy- 1 Γ18 is a CE bound on Γ27. The GPT bound happens peredge, it achieves both purposes: firstly, adding a to agree with the CE1 bound for a particular Bell-KS (no-detection) vertex to every maximal clique that is expression (sum of all probabilities) but differs for a hyperedge in Γ18 ensures the KS-colourability of some other Bell-KS expressions defined on this hy- ΓO(Γ18), i.e., C(ΓO(Γ18)) 6= ∅, and secondly, adding a pergraph. Consider, for example, the following three vertex to every maximal clique that is not a hyperedge 1 expressions (see Fig.8): in Γ18 ensures that CE (ΓO(Γ18)) = G(ΓO(Γ18)). Once these two properties are satisfied, the graph invari- ants of CSW [22] become applicable to any Bell-KS X Expr1 ≡ p(v), expression defined for any set of vertices in the sub-

v∈V (Γ18) hypergraph Γ18 of ΓO(Γ18). X Expr ≡ p(v), Our noise-robust noncontextuality inequality then 2 applies to the KS-colourable hypergraph Γ , v∈V (Γ ) O(Γ18) 3 where the graph invariants of CSW make sense, rather X X Expr3 ≡ p(v) + p(v). (143) than the KS-uncolourable hypergraph Γ18. On the v∈V (Γ18) v∈V (Γ3) other hand, an appropriate noise-robust noncontex- tuality inequality for the KS-uncolourable hypergraph We have: 54 Γ18 is, then, the one reported in Ref. [12].

1 C(Γ27) CE (Γ27) G(Γ27) Expr1 ≤ 8 < 9 = 9, References C(Γ27) CE1(Γ ) G(Γ27) 3 Expr ≤ 1 =27 1 < , 2 2 [1] L. Hardy, “Quantum Theory From Five Reason- 1 able Axioms”, arXiv:quant-ph/0101012 (2001). C(Γ27) CE (Γ27) G(Γ27) Expr3 ≤ 9 < 10 < 10.5. (144) [2] L. Masanes and M. P. Mueller, “A derivation of quantum theory from physical requirements”, Thus, Expr3 is a Bell-KS expression that discrimi- New J. Phys. 13, 063001 (2011). nates between probabilistic models at all three levels [3] G. Chiribella, G. M. D’Ariano, and P. Perinotti, of the hierarchy. Indeed, the upper bound on Expr3 “Probabilistic theories with purification”, Phys. 1 for CE (Γ27) models can be saturated by projective Rev. A 81, 062348 (2010). quantum realizations of the hypergraph, in particular [4] J. S. Bell, “On the Einstein-Podolsky-Rosen para- the standard realization with 18 rays, with the zero dox”, Physics 1, 195 (1964). Reprinted in Ref. [6], operator for the no-detection events [51]. The fact Chapter 2. that there exists such a Bell-KS expression as Expr3 1 means that the CE upper bounds from the CSW 54The approach for KS-uncolourable hypergraphs will be fur- approach can be violated by a general probabilistic ther developed in hypergraph-theoretic terms in forthcoming model, i.e., the upper bounds for CE1 models and work [34].

[5] J. S. Bell, “On the problem of hidden variables in quantum mechanics”, Rev. Mod. Phys. 38, 447 (1966). Reprinted in Ref. [6], Chapter 1.
[6] J. S. Bell, “Speakable and Unspeakable in Quantum Mechanics”, 2nd Edition, Cambridge University Press, 2004.
[7] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, “Proposed Experiment to Test Local Hidden-Variable Theories”, Phys. Rev. Lett. 23, 880 (1969).
[8] N. Brunner, D. Cavalcanti, S. Pironio, V. Scarani, and S. Wehner, “Bell nonlocality”, Rev. Mod. Phys. 86, 419 (2014).
[9] B. Hensen et al., “Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometres”, Nature 526, 682-686 (2015).
[10] Lynden K. Shalm et al., “Strong Loophole-Free Test of Local Realism”, Phys. Rev. Lett. 115, 250402 (2015).
[11] M. Giustina et al., “Significant-Loophole-Free Test of Bell’s Theorem with Entangled Photons”, Phys. Rev. Lett. 115, 250401 (2015).
[12] R. Kunjwal and R. W. Spekkens, “From the Kochen-Specker Theorem to Noncontextuality Inequalities without Assuming Determinism”, Phys. Rev. Lett. 115, 110403 (2015).
[13] M. D. Mazurek, M. F. Pusey, R. Kunjwal, K. J. Resch, R. W. Spekkens, “An experimental test of noncontextuality without unphysical idealizations”, Nat. Commun. 7, 11780 (2016).
[14] A. Krishna, R. W. Spekkens, and E. Wolfe, “Deriving robust noncontextuality inequalities from algebraic proofs of the Kochen-Specker theorem: the Peres-Mermin square”, New J. Phys 19, 123031 (2017).
[15] D. Schmid and R. W. Spekkens, “Contextual Advantage for State Discrimination”, Phys. Rev. X 8, 011015 (2018).
[16] R. Kunjwal and R. W. Spekkens, “From statistical proofs of the Kochen-Specker theorem to noise-robust noncontextuality inequalities”, Phys. Rev. A 97, 052110 (2018).
[17] D. Schmid, R. W. Spekkens, and E. Wolfe, “All the noncontextuality inequalities for arbitrary prepare-and-measure experiments with respect to any fixed set of operational equivalences”, Phys. Rev. A 97, 062103 (2018).
[18] R. W. Spekkens, “Contextuality for preparations, transformations, and unsharp measurements”, Phys. Rev. A 71, 052108 (2005).
[19] S. Kochen and E. P. Specker, “The Problem of Hidden Variables in Quantum Mechanics”, J. Math. Mech. 17, 59 (1967). Also available at JSTOR.
[20] N. Harrigan and R. W. Spekkens, “Einstein, Incompleteness, and the Epistemic View of Quantum States”, Found. Phys. 40, 125 (2010).
[21] A. Cabello, S. Severini, and A. Winter, “(Non-)Contextuality of Physical Theories as an Axiom”, arXiv:1010.2163 [quant-ph] (2010).
[22] A. Cabello, S. Severini, and A. Winter, “Graph-Theoretic Approach to Quantum Correlations”, Phys. Rev. Lett. 112, 040401 (2014).
[23] A. Acín, T. Fritz, A. Leverrier, and A. B. Sainz, “A Combinatorial Approach to Nonlocality and Contextuality”, Comm. Math. Phys. 334(2), 533-628 (2015).
[24] J. Barrett, “Information processing in generalized probabilistic theories”, Phys. Rev. A 75, 032304 (2007).
[25] S. Abramsky and A. Brandenburger, “The sheaf-theoretic structure of non-locality and contextuality”, New J. Phys. 13, 113036 (2011).
[26] A. Fine, “Hidden Variables, Joint Probability, and the Bell Inequalities”, Phys. Rev. Lett. 48, 291 (1982).
[27] R. Kunjwal, “Fine’s theorem, noncontextuality, and correlations in Specker’s scenario”, Phys. Rev. A 91, 022108 (2015).
[28] R. W. Spekkens, “The Status of Determinism in Proofs of the Impossibility of a Noncontextual Model of Quantum Theory”, Found. Phys. 44, 1125-1155 (2014).
[29] A. Cabello, “What do we learn about quantum theory from Kochen-Specker quantum contextuality?”, PIRSA:17070034 (2017).
[30] G. Chiribella and X. Yuan, “Measurement sharpness cuts nonlocality and contextuality in every physical theory”, arXiv:1404.3348 [quant-ph] (2014).
[31] G. Chiribella and X. Yuan, “Bridging the gap between general probabilistic theories and the device-independent framework for nonlocality and contextuality”, Information and Computation 250, 15-49 (2016).
[32] R. Chaves and T. Fritz, “Entropic approach to local realism and noncontextuality”, Phys. Rev. A 85, 032113 (2012).
[33] Tobias Fritz and Rafael Chaves, “Entropic Inequalities and Marginal Problems”, IEEE Trans. on Information Theory, vol. 59, pages 803-817 (2013).
[34] R. Kunjwal, “Hypergraph framework for irreducible noncontextuality inequalities from logical proofs of the Kochen-Specker theorem”, arXiv:1805.02083 [quant-ph] (2018).
[35] A. Cabello, “Specker’s fundamental principle of quantum mechanics”, arXiv:1212.1756 [quant-ph] (2012).
[36] R. W. Spekkens, “Noncontextuality: how we should define it, why it is natural, and what to do about its failure”, PIRSA:17070035 (2017).
[37] M. D. Mazurek, M. F. Pusey, K. J. Resch, and R. W. Spekkens, “Experimentally bounding deviations from quantum theory in the landscape of generalized probabilistic theories”, arXiv:1710.05948 [quant-ph] (2017).
[38] M. F. Pusey, L. del Rio, and B. Meyer, “Contextuality without access to a tomographically complete set”, arXiv:1904.08699 (2019).
[39] Y. C. Liang, R. W. Spekkens, H. M. Wiseman, “Specker’s parable of the overprotective seer: A road to contextuality, nonlocality and complementarity”, Phys. Rep. 506, 1 (2011).
[40] R. Kunjwal and S. Ghosh, “Minimal state-dependent proof of measurement contextuality for a qubit”, Phys. Rev. A 89, 042118 (2014).
[41] R. Kunjwal, C. Heunen, and T. Fritz, “Quantum realization of arbitrary joint measurability structures”, Phys. Rev. A 89, 052126 (2014).
[42] R. Kunjwal, “A note on the joint measurability of POVMs and its implications for contextuality”, arXiv:1403.0470 [quant-ph] (2014).
[43] S. Popescu and D. Rohrlich, “Quantum nonlocality as an axiom”, Found. Phys. 24, 379-385 (1994).
[44] T. Heinosaari, D. Reitzner, and P. Stano, “Notes on Joint Measurability of Quantum Observables”, Found. Phys. 38, 1133-1147 (2008).
[45] R. Kunjwal, “How to go from the KS theorem to experimentally testable noncontextuality inequalities”, PIRSA:17070059 (2017).
[46] Konrad Engel, “Sperner theory: Encyclopedia of Mathematics and its Applications”, Vol. 65, Cambridge University Press, Cambridge (1997).
[47] A. A. Klyachko, M. A. Can, S. Binicioğlu, and A. S. Shumovsky, “Simple Test for Hidden Variables in Spin-1 Systems”, Phys. Rev. Lett. 101, 020403 (2008).
[48] C. Held, “The Kochen-Specker Theorem”, The Stanford Encyclopedia of Philosophy (Spring 2018 Edition), Edward N. Zalta (ed.).
[49] T. Gonda, R. Kunjwal, D. Schmid, E. Wolfe, and A. B. Sainz, “Almost Quantum Correlations are Inconsistent with Specker’s Principle”, Quantum 2, 87 (2018).
[50] M. Navascués, Y. Guryanova, M. J. Hoban, and A. Acín, “Almost quantum correlations”, Nat. Commun. 6, 6288 (2015).
[51] A. Cabello, J. Estebaranz, and G. Garcia-Alcaine, “Bell-Kochen-Specker theorem: A proof with 18 vectors”, Phys. Lett. A 212, 183 (1996).
[52] E. G. Beltrametti and S. Bugajski, “A classical extension of quantum mechanics”, J. Phys. A 28, 3329 (1995).
[53] X. Zhan, E. G. Cavalcanti, J. Li, Z. Bian, Y. Zhang, H. M. Wiseman, and P. Xue, “Experimental generalized contextuality with single-photon qubits”, Optica 4, 966-971 (2017).
[54] R. Kunjwal, “Contextuality beyond the Kochen-Specker theorem”, arXiv:1612.07250 [quant-ph] (2016).
[55] T. Fritz, A. B. Sainz, R. Augusiak, J. B. Brask, R. Chaves, A. Leverrier, and A. Acín, “Local orthogonality: a multipartite principle for correlations”, Nat. Commun. 4, 2263 (2013).
[56] R. W. Spekkens, “Nonclassicality as the failure of noncontextuality”, PIRSA:15050081 (2015) (see the slide at 41:43 minutes).
[57] R. W. Spekkens, “Quasi-Quantization: Classical Statistical Theories with an Epistemic Restriction”, in: G. Chiribella and R. Spekkens (eds), Quantum Theory: Informational Foundations and Foils, Fundamental Theories of Physics, vol. 181, Springer, Dordrecht.
[58] T. Vidick and S. Wehner, “Does Ignorance of the Whole Imply Ignorance of the Parts? Large Violations of Noncontextuality in Quantum Theory”, Phys. Rev. Lett. 107, 030402 (2011).
[59] R. Raussendorf, “Contextuality in measurement-based quantum computation”, Phys. Rev. A 88, 022322 (2013).
[60] M. Howard, J. Wallman, V. Veitch, and J. Emerson, “Contextuality supplies the ‘magic’ for quantum computation”, Nature 510, 351 (2014).
[61] N. Delfosse, P. A. Guerin, J. Bian, and R. Raussendorf, “Wigner Function Negativity and Contextuality in Quantum Computation on Rebits”, Phys. Rev. X 5, 021003 (2015).
[62] J. Bermejo-Vega, N. Delfosse, D. E. Browne, C. Okay, R. Raussendorf, “Contextuality as a resource for qubit quantum computation”, Phys. Rev. Lett. 119, 120505 (2017).
[63] J. Singh, K. Bharti, and Arvind, “Quantum key distribution protocol based on contextuality monogamy”, Phys. Rev. A 95, 062333 (2017).
[64] A. Cabello, “Kochen-Specker Theorem for a Single Qubit using Positive Operator-Valued Measures”, Phys. Rev. Lett. 90, 190401 (2003).
[65] A. Grudka and P. Kurzyński, “Is There Contextuality for a Single Qubit?”, Phys. Rev. Lett. 100, 160401 (2008).
[66] P. Busch, “Quantum States and Generalized Observables: A Simple Proof of Gleason’s Theorem”, Phys. Rev. Lett. 91, 120403 (2003).
[67] C. M. Caves, C. A. Fuchs, K. Manne, and J. M. Renes, “Gleason-Type Derivations of the Quantum Probability Rule for Generalized Measurements”, Found. Phys. 34, 193 (2004).
[68] A. M. Gleason, “Measures on the closed subspaces of a Hilbert space”, J. Math. Mech. 6, 885 (1957). Also available at JSTOR.
[69] P. K. Aravind, “The generalized Kochen-Specker theorem”, Phys. Rev. A 68, 052104 (2003).
[70] A. A. Methot, “Minimal Bell-Kochen-Specker proofs with POVMs on qubits”, Int. J. Quantum Inf. 5, 353 (2007).
[71] Q. Zhang, H. Li, T. Yang, J. Yin, J. Du, J. W. Pan, “Experimental Test of the Kochen-Specker Theorem for Single Qubits using Positive Operator-Valued Measures”, arXiv:quant-ph/0412049 (2004).

[72] L. Mancinska, G. Scarpa, and S. Severini, “New Separations in Zero-Error Channel Capacity Through Projective Kochen Specker Sets and Quantum Coloring”, IEEE Transactions on Information Theory 59, 4025 (2013).
[73] J. Henson and A. B. Sainz, “Macroscopic noncontextuality as a principle for almost-quantum correlations”, Phys. Rev. A 91, 042114 (2015).
[74] D. A. Meyer, “Finite Precision Measurement Nullifies the Kochen-Specker Theorem”, Phys. Rev. Lett. 83, 3751 (1999).
[75] A. Kent, “Noncontextual Hidden Variables and Physical Measurements”, Phys. Rev. Lett. 83, 3755 (1999).
[76] R. Clifton and A. Kent, “Simulating quantum mechanics by non-contextual hidden variables”, Proc. R. Soc. Lond. A 456, 2101-2114 (2000).
[77] J. Barrett and A. Kent, “Non-contextuality, finite precision measurement and the Kochen-Specker theorem”, Stud. Hist. Philos. Mod. Phys. 35, 151 (2004).
[78] A. Winter, “What does an experimental test of quantum contextuality prove or disprove?”, J. Phys. A: Math. Theor. 47, 424031 (2014).
[79] V. B. Scholz and R. F. Werner, “Tsirelson’s Problem”, arXiv:0812.4305 [math-ph] (2008).
[80] T. Fritz, “Tsirelson’s problem and Kirchberg’s conjecture”, Rev. Math. Phys. 24 (5), 1250012 (2012).
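Supplementary sketch for Appendix D. Two of the claims made there can be checked by brute force: that the 18-vertex hypergraph of Ref. [51] admits no KS-colouring, and that appending one no-detection vertex to each of its nine hyperedges restores KS-colourability. The Python sketch below is illustrative only and is not part of the original analysis. The nine bases are our own transcription of the 18-ray construction of Ref. [51] and should be checked against that reference; the extension step assumes only what the appendix states (one fresh vertex per hyperedge); and all function names are hypothetical.

```python
from itertools import combinations, product

# Nine orthogonal bases (unnormalized rays in R^4).
# ASSUMPTION: our transcription of the 18-ray set of Ref. [51].
BASES = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]


def is_orthogonal(basis):
    """True if every pair of rays in the basis has zero inner product."""
    return all(sum(a * b for a, b in zip(u, v)) == 0
               for u, v in combinations(basis, 2))


def build_hypergraph(bases):
    """Label distinct rays 0..n-1; return (n, hyperedges as index tuples)."""
    rays = sorted({ray for basis in bases for ray in basis})
    index = {ray: i for i, ray in enumerate(rays)}
    edges = [tuple(index[ray] for ray in basis) for basis in bases]
    return len(rays), edges


def is_ks_colouring(bits, edges):
    """A KS-colouring assigns exactly one 1 to every hyperedge."""
    return all(sum(bits[v] for v in edge) == 1 for edge in edges)


def find_ks_colouring(n_vertices, edges):
    """Exhaustive search over all {0,1} assignments (2**n_vertices cases)."""
    for bits in product((0, 1), repeat=n_vertices):
        if is_ks_colouring(bits, edges):
            return bits
    return None


def add_no_detection_vertices(n_vertices, edges):
    """Append one fresh (no-detection) vertex to each hyperedge."""
    extended = [edge + (n_vertices + k,) for k, edge in enumerate(edges)]
    return n_vertices + len(edges), extended


n18, edges18 = build_hypergraph(BASES)
colouring18 = find_ks_colouring(n18, edges18)  # None means KS-uncolourable

n27, edges27 = add_no_detection_vertices(n18, edges18)
# Candidate colouring: every no-detection vertex gets 1, every ray gets 0.
colouring27 = tuple([0] * n18 + [1] * (n27 - n18))

print("vertices:", n18, "hyperedges:", len(edges18))
print("KS-colouring found on the 18-vertex hypergraph:", colouring18)
print("candidate colouring valid after extension:",
      is_ks_colouring(colouring27, edges27))
```

The exhaustive search is only a sanity check; the same conclusion follows from a parity count, since each vertex lies in exactly two of the nine hyperedges, so the number of 1s counted over all hyperedges is even for any assignment, whereas a KS-colouring would require it to equal nine.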

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0.