From statistical proofs of the Kochen-Specker theorem to noise-robust noncontextuality inequalities

Ravi Kunjwal∗ and Robert W. Spekkens† Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, Ontario Canada N2L 2Y5 (Dated: May 11, 2018) The Kochen-Specker theorem rules out models of quantum theory wherein projective measure- ments are assigned outcomes deterministically and independently of context. This notion of non- contextuality is not applicable to experimental measurements because these are never free of noise and thus never truly projective. For nonprojective measurements, therefore, one must drop the requirement that an outcome is assigned deterministically in the model and merely require that it is assigned a distribution over outcomes in a manner that is context-independent. By demand- ing context-independence in the representation of preparations as well, one obtains a generalized principle of noncontextuality that also supports a quantum no-go theorem. Several recent works have shown how to derive inequalities on experimental data which, if violated, demonstrate the impossibility of finding a generalized-noncontextual model of this data. That is, these inequalities do not presume quantum theory and, in particular, they make sense without requiring an opera- tional analogue of the quantum notion of projectiveness. We here describe a technique for deriving such inequalities starting from arbitrary proofs of the Kochen-Specker theorem. It extends signifi- cantly previous techniques that worked only for logical proofs, which are based on sets of projective measurements that fail to admit of any deterministic noncontextual assignment, to the case of statis- tical proofs, which are based on sets of projective measurements that do admit of some deterministic noncontextual assignments, but not enough to explain the quantum statistics.

I. INTRODUCTION the measurements here means recovering them as a con- vex mixture of deterministic noncontextual assignments In quantum theory, a given sharp measurement (i.e., to these measurements. one associated to a projector-valued measure) can be compatible with each of two other sharp measurements As an aside, we note that in most of the literature on that are incompatible with one another. In this case, the the subject (see, e.g., [5]), a of the KS theorem is latter pair of measurements define two distinct compat- said to be state-independent if the proof works for an arbi- ibility contexts for the first measurement. The Kochen- trary choice of quantum state on which the measurements Specker (KS) theorem [1] rules out a particular kind of are implemented, and it is said to be state-dependent if explanation of the operational predictions of quantum the proof appeals to a special quantum state. In our theory, namely, one wherein sharp measurements are as- terminology, logical proofs of the KS theorem are always signed outcomes deterministically and independently of state-independent. The is that logical proofs by context. We will term this sort of explanation a KS- definition appeal to sets of sharp measurements that ad- noncontextual model of the statistics. mit of no deterministic noncontextual assignments (for A particular proof of the KS theorem is said to be logi- example, those that appeal to sets of rays that admit of cal if it appeals to the existence of a set of sharp quantum no KS-colourings [1, 4]), and if the set of deterministic measurements that admit of no deterministic noncontex- noncontextual assignments is empty, then the set of mix- tual assignments. (For the case where all measurement tures of deterministic noncontextual assignments is also outcomes correspond to rank-1 projectors, i.e., projectors empty, and we cannot explain measurement statistics in onto rays of Hilbert space, a deterministic noncontextual terms of such a mixture regardless of the quantum state. assignment is one that assigns, for every basis of orthog- Statistical proofs of the KS theorem, on the other hand, arXiv:1708.04793v3 [quant-ph] 10 May 2018 onal rays, the value 1 to precisely one such ray and the are those that appeal to sets of sharp measurements that value 0 to the others, termed a KS-colouring of the rays. admit of one or more deterministic noncontextual assign- In this case, a proof is said to be logical if it appeals to the ments (for example, sets of rays that admit of one or more existence of a set of rays in a Hilbert space that admit of KS-colourings). These can be state-dependent, such as no KS-colourings [1–4].) A proof is said to be statistical if the proof in Ref. [6], or state-independent, such as the it appeals to a set that does admit of some deterministic proof in Ref. [5, 7]. noncontextual assignments, but these assignments are in- sufficient to explain the statistics of the measurements on In recent years, there has been much work deriving one or more quantum states. Explaining the statistics of inequalities for operational statistics that follow directly from an assumption of noncontextuality, without assum- ing the validity of quantum theory. If these are violated ∗ [email protected] experimentally, one can conclude that not just quantum † [email protected] theory, but any successor thereof must fail to admit of a 2 noncontextual model.1 One approach to deriving these conditions on obtaining this coarse-grained outcome. We inequalities is to seek inspiration from particular proofs denote the full set of outcomes by >, so that the prepa- of the KS theorem. Refs. [10] and [11] do so for logical ration procedure corresponding to implementing S and proofs. Here, we consider statistical proofs. not conditioning on obtaining any particular outcome is Section II reviews background material, in particu- denoted [>|S].2 A measurement M has outcome denoted lar, the notions of operational theories, ontological mod- by m, and the event of obtaining outcome m of measure- els, measurement noncontextuality and Kochen-Specker ment M is termed a measurement event, denoted [m|M]. (KS) noncontextuality. For pedagogical clarity, we The measurement event [V |M] for a subset V of the out- present our result as a generalization of the results ob- comes is defined in the obvious manner, similarly to the tained previously for logical proofs [10], but cast into a case of sources. slightly different form. In Section III, therefore, we re- An operational theory specifies a rule for assigning cast these earlier results. Along the way, we review the a joint p(m, s|M,S), denoting notion of preparation noncontextuality and how it can be the probability that in a prepare-and-measure experi- used to infer that certain measurements are assigned out- ment with source S and measurement M, the source comes deterministically in the ontological model if cer- event [s|S] occurs followed by the measurement event tain operational correlations hold. We also review how [m|M]. For example, when the operational theory is this allows for a proof of the failure of KS non- quantum theory, the source event [s|S] is represented contextuality to be translated into a proof of the fail- by some density matrix, say ρ[s|S], the measurement ure of preparation and measurement noncontextuality. event [m|M] is represented by a positive operator, say Then, in Section IV, we generalize to the case of statis- E[m|M], and the rule for assigning the joint probability tical proofs. Our main result is presented in Theorem 1 is p(m, s|M,S) = p(s|S)Tr(E[m|M]ρ[s|S]), where p(s|S) and we give some concrete examples of its application. is the probability that the source S yields outcome s. We close with a discussion in Section V. Note that our analysis in this paper does not depend on Xu et al. [12] have previously obtained noise-robust the particular representation of preparations and mea- noncontextuality inequalities for some statistical proofs surements in an operational theory, nor on the particular of the KS theorem. Our approach in this article is dis- probability rule associated with the theory; in this sense, tinct from that of Ref. [12] and allows a consideration we consider operational theories more general than quan- of arbitrary statistical proofs of the KS theorem, rather tum theory. than particular examples of it. A discussion of the re- An ontological model of an operational theory posits lation between Ref. [12] and this article is provided in that the causal influence of the source on the measure- Section V and Appendix E. ment is mediated by the ontic state, λ, of the system (a point in the underlying ontic state space Λ, which for our purposes can be taken to be discrete). For a source S with II. PRELIMINARIES outcome s, the ontological model associates a conditional P P probability µ(λ, s|S) such that s λ∈Λ µ(λ, s|S) = We conceptualize an experiment as a source followed 1. Here µ(λ, s|S) = µ(λ|s, S)p(s|S). For a measure- by a measurement. A source is a procedure that samples ment M with outcome m, the ontological model as- sociates a conditional probability ξ(m|M, λ) such that a random variable and implements a preparation on the P system conditioned on the value of this variable. The m ξ(m|M, λ) = 1 for all λ ∈ Λ. Finally, the ontological value of the variable is termed the outcome of the source. model must reproduce the statistical predictions of the The event of obtaining an outcome s of a source S will operational theory, be termed a source event, denoted [s|S]. For each source X event [s|S], one can associate a preparation procedure pr(m, s|M,S) = ξ(m|M, λ)µ(λ, s|S). (1) on the system; it is the one implemented by the source λ∈Λ S conditional on outcome s occurring. The event of ob- In the case of quantum theory, two measurement pro- taining any outcome in some subset V of all possible out- cedures differ only by context if and only if they yield comes of a source S will be denoted by [V |S], and is asso- the same statistics for all quantum states. In this case, ciated to the preparation procedure wherein one imple- they are represented by the same positive operator- ments S, coarse-grains over the outcomes in V , and then valued measure (POVM).3 Equivalently, two measure- ment events differ only by context if and only if they are

1 The difference betwen a quantum no-go theorem for noncontex- tuality and a noncontextuality inequality is precisely analogous 2 Note that our notational conventions here align with those of to the difference between Bell’s 1964 no-go theorem establishing Ref. [11] rather than those of Ref. [10]. For example, the prepa- the impossibility of a locally causal model of quantum theory [8] (ave) ration procedure [>|S] would be represented as PS in the and the CHSH inequalities [9], which are constraints on opera- notation of Ref. [10], where S is the choice of source setting. tional statistics that follow directly from the assumption of local 3 The type of measurement that is considered in most discus- causality, without presuming the validity of quantum theory. sions of the Kochen-Specker theorem is a projective measure- 3 assigned the same probability by all quantum states. In is a third measurement with an outcome set that is the this case, they are represented by the same positive oper- Cartesian product of the two outcome sets such that one ator less than identity (or the same projector in the case simulates M1 and M2 by marginalization (i.e., for which of a sharp measurement). the marginalized versions are operationally equivalent to By analogy, in an arbitrary operational theory, two M1 and M2). measurement events, [m|M] and [m0|M 0], differ only by context if and only if for every preparation procedure, the probability of [m|M] is the same as that of [m0|M 0]. The condition can be formalized as:4 III. FROM LOGICAL PROOFS OF THE KS THEOREM TO OPERATIONAL CRITERIA FOR ∀[s|S] : pr(m|M, s, S) = pr(m0|M 0, s, S). (2) UNIVERSAL NONCONTEXTUALITY

0 0 When measurement events [m|M] and [m |M ] differ only It follows from the above that the proof schema which by context, they are said to be operationally equivalent, 0 0 generalizes a logical proof of the KS theorem from quan- denoted [m|M] ' [m |M ]. tum theory to an arbitrary operational theory is of the In Ref. [13], a measurement noncontextual ontological following form: model was defined to be one wherein operationally equiv- alent measurement events are represented by equivalent response functions: Proposition 1 (No-go for KS noncontextuality from log- ical proof). [m|M] ' [m0|M 0] =⇒ ξ(m|M, λ) = ξ(m0|M 0, λ), ∀λ ∈ Λ, Measurement noncontextuality (Eq. (3)) (3) + Outcome determinism (Eq. (4)) ∀λ ∈ Λ, ∀M ∈ M + Operational equivalences in the set M (proof depen- where one allows the measurements to respond dent) indeterministically to λ: ξ(m|M, λ) ∈ [0, 1]. On the other =⇒ Contradiction. hand, many have proposed to generalize the notion of KS-noncontextuality from quantum theory to arbitrary In the quantum case, this contradiction can be inferred operational theories in a different manner, namely, by from KS-uncolourability; Kochen and Specker’s original assuming Eq. (3) but with measurements responding de- proof of the KS theorem is an example [1]. terministically, that is, Note that in the face of this contradiction, one can ξ(m|M, λ) ∈ {0, 1}. (4) always salvage the spirit of noncontextuality (measure- We term the latter proposal KS-noncontextuality: ment noncontextuality) simply by abandoning outcome determinism. By contrast, this is not a way out of Bell’s KS-noncontextuality theorem because the notion of local causality does not = Measurement noncontextuality (Eq. (3)) presume outcome determinism. For these , it was argued in Ref. [13] that + Outcome determinism (Eq. (4)) ∀λ ∈ Λ. (5) one should drop the assumption of outcome determin- In the following, we let M denote a set of measure- ism that is part of KS-noncontextuality and simply as- ment procedures whose operational features include the sume measurement noncontextuality. Such a move blocks compatibility relations and operational equivalences that the derivation of the contradiction in Proposition 1. It their quantum counterparts satisfy in a proof of the KS might appear, therefore, that there is in fact no conflict theorem. between quantum theory and the spirit of noncontextu- To operationalize the KS theorem, the notion of com- ality if one excises the notion of outcome determinism patibility must also be generalized to an arbitrary oper- from the latter. However, it turns out that the property ational theory. We follow the proposal of Ref. [14]: mea- of outcome determinism can be inferred for certain mea- surements by applying a notion of noncontextuality to surements M1 and M2 are deemed compatible if there preparations [13], as we now explain. Two source events, [s|S] and [s0|S0], are operationally equivalent, denoted [s|S] ' [s0|S0], if for every measure- ment (which we here refer to as a sharp measurement). Such ment event [m|M], the joint probability of obtaining [s|S] measurements are a special class of POVMs wherein the positive and [m|M] is the same as that of obtaining [s0|S0] and operators are all projectors. Note that these are the only mea- [m|M], surements in quantum theory that can be represented by a single Hermitian operator, namely, the one whose spectral projectors are the elements of the projector-valued measure. ∀[m|M]: p(m, s|M,S) = p(m, s0|M,S0). (6) 4 By a Bayesian inversion, this condition is equivalent to ∀[s|S]: pr(m, s|M,S) = pr(m0, s|M 0,S), a form that makes more appar- ent the close analogy with the operational equivalence relation The assumption of preparation noncontextuality re- among source events which we define further on. quires that operationally equivalent source events should 4 be represented equivalently in the ontological model:5 The proof is as follows. From Eqs. (7) and (10) we conclude that [s|S] ' [s0|S0] =⇒ µ(λ, s|S) = µ(λ, s0|S0), ∀λ ∈ Λ. (7) 0 ∀i, i : µ(λ|Si) = µ(λ|Si0 ) It was argued in Ref. [13] that whatever reasons can be ≡ ν(λ). (12) given in support of measurement noncontextuality, these are also reasons to believe in preparation noncontextual- By Bayesian inversion, µ(s|λ, S) = µ(λ, s|S)/ν(λ). Sub- ity and therefore that the only reasonable assumption to stituting this into Eq. (1), we have make is noncontextuality for all experimental procedures, X termed universal noncontextuality. It is the assumption pr(mi, si|Mi, Si) = ξ(mi|Mi, λ)µ(si|λ, Si)ν(λ). (13) we make here. λ∈Λ Ref. [15] showed that for quantum theory, preparation noncontextuality implies that measurements should be Given this expression, the only way to explain the per- assigned outcomes deterministically if and only if they fect correlation of Eq. (9), then, is if the measurements are projective. Ref. [10] generalized the logic to all op- respond deterministically for all ontic states in the sup- erational theories by focusing not just on the set M of port of ν, that is, measurements, but on a corresponding set S of sources ∀λ ∈ supp(ν): ξ(m |M , λ) ∈ {0, 1}. (14) as well. Outcome determinism is justified when the fol- i i lowing operational criteria are satisfied: where supp(ν) ≡ {λ ∈ Λ: ν(λ) > 0}. But given Eq. (12), (i) For each equivalence class of measurements, Mi ∈ M, there must exist an equivalence class of sources, Si ∈ supp(ν) = supp(µ(·|S)) ∀S ∈ S, (15) S, such that the outcomes of Mi and Si, denoted mi and si respectively, are perfectly correlated. This implies that which concludes the proof. the average correlation over the pairings {(Mi, Si): i ∈ To obtain a contradiction in the no-go theorem of {1, . . . , n}}, Proposition 1, it is sufficient to assume outcome deter- n minism just for the ontic states in the union of the ontic 1 X X supports of the distributions representing the sources in Corr ≡ δm ,s pr(mi, si|Mi, Si), (8) n i i S, even though this might be a subset of the full set i=1 mi,si of ontic states. This is because any ontic state outside satisfies this subset does not affect the operational statistics of Corr = 1. (9) any experiment involving measurements on sources in S. Hence, the premiss of outcome determinism ∀λ ∈ Λ in the (ii) The sources must obey the following operational no-go of Proposition 1 can be replaced by the same pre- equivalence relations: miss ∀λ ∈ ∪S∈Ssupp(µ(·|S)) and thus by the antecedent 0 of Proposition 2. ∀i, i :[>|Si] ' [>|Si0 ], (10) The no-go that one obtains by combining Proposition where, as stipulated earlier, [>|S] denotes the event cor- 2 with Proposition 1 is a no-go for universal noncontex- responding to implementing S and not conditioning on tuality based on a logical proof of the KS theorem. its outcome. Note that if the source event [s|S] is represented in the Proposition 3 (No-go for universal noncontextuality ontological model by µ(λ, s|S), then [>|S] is represented from logical proof). by Universal noncontextuality (Eq. (3), Eq. (7)) X +Operational equivalences in the set S (Eq. (10)) µ(λ|S) ≡ µ(λ, s|S). (11) +Operational equivalences in the set M (proof dependent) s +Perfect Correlation between outcomes of Si and Mi for The inference established in Ref. [10] can then be ex- all i (Eq. (9)) pressed as follows: =⇒ Contradiction. Proposition 2 (Justifying outcome determinism). As noted in Ref. [10], it implies that any operational Preparation noncontextuality (Eq. (7)) theory that does admit of a universally noncontextual +Operational equivalences in the set S (Eq. (10)) model while exhibiting the appropriate operational fea- +Perfect Correlation between outcomes of Si and Mi for tures of M and S must exhibit imperfect correlations for all i (Eq. (9)) the pairings {(Si, Mi)}, that is, it must satisfy Corr < 1. =⇒ Outcome determinism ∀λ ∈ ∪S∈Ssupp(µ(·|S)) The precise amount by which Corr is bounded away (Eqs. (14),(15)) ∀M ∈ M. from 1 is determined as follows. Substituting Eq. (13) into the definition of Corr (Eq. (8)), we obtain X 5 Note that Eq. (6) is analogous to Eq. (2) and Eq. (7) is analogous Corr = Corr(λ)ν(λ), (16) to Eq. (3). λ 5 where there is a special source event, that is, the event of ob- taining a special outcome s = 0 of a special source S , n ∗ ∗ 1 X X denoted [s = 0|S ], such that the measurement statistics Corr(λ) ≡ δ ξ(m |M , λ)µ(s |S , λ). (17) ∗ ∗ n mi,si i i i i one obtains for the special preparation associated to this , i=1 mi si event are inconsistent with KS-noncontextuality. For a given choice of λ corresponding to a noncontextual We index the compatible subsets of M by α and de- assignment to the { } , Corr(λ) is maximized by taking note the equivalence class of measurements that jointly Mi i (α) µ( | , λ) = 1 for = max, where max is any value of simulates the elements of this subset by M , with the si Si si mi mi (α) max vector of outcomes denoted by m~ . mi such that maxmi ξ(mi|Mi, λ) = ξ(mi |Mi, λ). We 1 Pn Let R be the value of a particular linear function F then have Corr(λ) = n i=1 maxmi ξ(mi|Mi, λ). Be- cause every noncontextual assignment is indeterminis- of the operational statistics for the compatible subsets of measurements when these are implemented on the special tic for some Mi, Corr(λ) is bounded away from 1 for preparation, i.e., all λ. Letting Corrind denote the maximum value of 1 Pn n i=1 maxmi ξ(mi|Mi, λ) in a variation over λ that cor- (α) (α) respond to indeterministic noncontextual assignments, R = F ({pr(m~ |M , s∗ = 0, S∗)}α). (19) we have We define Rdet to be the largest value of R consistent with deterministic noncontextual measurement assign- Corr ≤ Corrind. (18) ments, and Rind to be the largest value of R consistent with indeterministic noncontextual measurement assign- The qualifier that λ correspond to indeterministic non- ments. For every statistical proof of the KS theorem, it contextual assignments in the variation over λ that de- is possible to construct a function F such that fines Corrind may seem unnecessary at this stage since all λ in a logical proof of the KS theorem correspond to such R ≥ R > R . (20) assignments. However, we will soon consider statistical ind det proofs of the KS theorem, where there exist λ that cor- (An example is given in Eq. (23).) respond to deterministic noncontextual assignments and We define where the qualifier that Corrind is computed by varying over λ that correspond to indeterministic noncontextual p∗ ≡ pr(s∗ = 0|S∗) (21) assignments becomes necessary. The compatibility and operational equivalence rela- to be the probability of the source S∗ yielding the out- tions on the {Mi}i, combined with the assumption of come s∗ = 0. The assumption that the special prepara- measurement noncontextuality, define linear constraints tion sometimes occurs can be formalized as on the n-tuple of response functions {ξ(mi|Mi, λ)}i. These linear constraints describe the facets of a p∗ > 0. (22) polytope, termed the noncontextual measurement- assignment polytope. One can obtain the vertices of Recalling (5), it follows that the proof schema for a no- this polytope from its facets using convex hull algo- go theorem for KS noncontextuality based on a statistical rithms. One determines Corrind by determining the max- proof of the KS theorem is as follows: imum value of 1 Pn max ξ(m |M , λ) in a variation n i=1 mi i i Proposition 4 (No-go for KS noncontextuality from over the (indeterministic) vertices. The noncontextual- statistical proof). ity inequality one obtains for a given logical proof of the Measurement noncontextuality (Eq. (3)) KS theorem, therefore, is simply the inequality one ob- +Outcome determinism (Eq. (4)) ∀λ ∈ Λ, ∀M ∈ M tains by substituting the determined value of Corr into ind +Operational equivalences in the set M (proof- Eq. (18). Refs. [10, 11] provide examples of how to de- dependent) rive such inequalities for specific logical proofs of the KS +Features of correlations among compatible subsets of theorem. M for the special preparation (Eq. (20)) +Nonzero probability of the special preparation (Eq. (22)) =⇒ Contradiction. IV. FROM STATISTICAL PROOFS OF THE KS THEOREM TO OPERATIONAL CRITERIA FOR A simple example of such a no-go theorem is based UNIVERSAL NONCONTEXTUALITY on the n-cycle scenario for odd n [6, 14, 16, 17]. Here, there are n equivalence classes of binary-outcome mea- n We can now turn to the question of how to obtain surements, M ≡ {Mi}i=1, where adjacent pairs are com- operational criteria for the failure of universal noncon- patible, so that there are n compatible subsets, which textuality from statistical, rather than logical, proofs of we can index by {(1, 2), (2, 3),..., (n − 1, n), (n, 1)}. Let the KS theorem. In the operationalized version of such M(i,i⊕1) denote the equivalence class of measurements proofs (i.e., one that makes no reference to the quantum which jointly simulates Mi and Mi⊕1 (here, ⊕ denotes formalism), the contradiction is achieved by noting that sum modulo n). Let M (i,i⊕1) denote a procedure in the 6

(i,i⊕1) equivalence class M , and let Mi(i⊕1) be the proce- dure one obtains by implementing M (i,i⊕1) and marginal- izing over the outcome of Mi⊕1. Note that by this def- inition, Mi(i⊕1) is in the equivalence class Mi. Define Mi(i 1) similarly. The difference between Mi(i⊕1) and Mi(i 1) is merely a difference of context, corresponding to the neighbour with which Mi is jointly implemented. The relevant operational equivalence relations (implicit in the definition of the equivalence classes) are therefore ∀i : Mi(i⊕1) ' Mi(i 1). An assignment of outcomes (de- terministic or indeterministic) to these measurements in the ontological model is noncontextual if it is indepen- dent of this choice. Clearly, if n is odd, then not all adjacent pairs of mea- surements can have anticorrelated outcomes if the assign- ment is deterministic. At most, this can occur for n − 1 out of the n pairs. Consequently, if one defines R to be FIG. 1. Quantum construction for a statistical proof of the the probability of seeing anticorrelated outcomes when KS theorem based on the 5-cycle [6]. jointly implementing an adjacent pair of measurements (Mi and Mi⊕1) on the special preparation, averaged over all such pairs, i.e., the set S (redefined to include the special source S∗), as n long as one assumes that the operational features of the 1 X (i,i⊕1) R ≡ pr(m 6= m |M , s = 0, S ), (23) set S now include the fact that [>|S∗] is operationally n i i⊕1 ∗ ∗ i=1 equivalent with all other marginalized sources,

0 0 then the largest value achievable by deterministic non- ∀i, i :[>|Si] ' [>|S ] ' [>|S∗]. (24) n−1 i contextual assignments is Rdet = n , and KS- n−1 noncontextuality implies R ≤ n . It follows that in any Therefore, among the premisses of Proposition 4, we n−1 can replace the assumption of outcome determinism with operational theory that predicts p∗ > 0 and R > n , we obtain a contradiction with KS-noncontextuality. the antecedent of the inference of Proposition 2 (with S As is well known, this is the case for quantum theory. now including S∗ and Eq. (10) replaced by Eq. (24)), to The first instance of such a proof, due to Klyachko, Can, obtain a no-go theorem for universal noncontextuality: Binicioglu and Shumovsky (KCBS) [6], was for the 5- cycle. It showed that the compatibility relations required Proposition 5 (No-go for universal noncontextuality 5 from statistical proof). to hold among the {Mi} can be achieved with sharp i=1 Universal noncontextuality (Eq. (3), Eq. (7)) quantum measurements on a qutrit if Mi corresponds +Operational equivalences in the set S (Eq. (24)) to the projector-valued measure {|liihli|, 1 − |liihli|}, 4πi +Operational equivalences in the set M (proof dependent) where |lii = (sin θ cos φi, sin θ sin φi, cos θ), φi = 5 , and 1 +Perfect Correlation between outcomes of and for cos θ = √ . These are depicted as 3-dimensional vec- Si Mi 4 5 all i (Eq. (9)) tors in Fig. 1. The special preparation event [s∗ = 0|S∗] +Features of correlations among compatible subsets of M corresponds to the quantum state |ψi = (0, 0, 1), also de- for the special preparation (Eq. (20)) picted in Fig. 1. Consequently, by letting S∗ be a quan- +Nonzero probability of the special preparation (Eq. (22)) tum source that prepares |ψi with nonzero probability, =⇒ Contradiction. we ensure that p∗ > 0. The average anticorrelation for |ψi is found to be R = √2 ≈ 0.89442, contradicting the 5 It is useful to see how this proof schema yields a no-go 4 for universal noncontextuality in quantum theory when prediction of KS-noncontextuality that R ≤ 5 . All of this generalizes to arbitrary odd n ≥ 5 [14, 16, 17]: the |lii one particularizes to the n-cycle scenario. Let {Mi}i be n−1 the set of projective binary-outcome measurements on have the same form with i ∈ {1, 2, . . . , n}, φi = n πi, and cos2 θ = cos(π/n)/(1+cos(π/n)), with |ψi as before, a qutrit specified earlier. Let Si be the quantum source π 1 1−|liihli| 2 cos( ) n−1 n that prepares |liihli| with probability 3 and 2 with so that p∗ > 0 and R = 1+cos( π ) > n . n probability 2 . Similarly, let be the quantum source To convert a statistical proof of the KS theorem into a 3 S∗ 1 proof of the failure of universal noncontextuality, we pro- that prepares |ψihψ| with probability 3 (corresponding 1−|ψihψ| 2 ceed analogously to the conversion procedure for logical to outcome s∗ = 0) and 2 with probability 3 . proofs. We note that in the no-go of Proposition 4, one Clearly, we have the operational equivalences of (24) by can continue to derive a contradiction if one assumes out- virtue of the fact that all of these ensembles average to 1 come determinism just for the ontic states in the union 3 1. Furthermore, we have Corr = 1. Thus all of the an- of the supports of the distributions for the sources in tecedents of the inference of Proposition 5 are satisfied, 7 and we have a proof of the failure of universal noncon- outcomes to the measurements deterministically, which textuality in quantum theory. in turn implies that Corr(s∗ 6= 0) can only be upper With the proof schema of Proposition 5 in hand, we bounded by its logical maximum, can finally turn to the question of how to derive a non- contextuality inequality from statistical proofs of the KS Corr(s∗ 6= 0) ≤ 1. (31) theorem. In all, therefore, we have It suffices to note that Proposition 5 implies that any operational theory that does admit of a universally Corr ≤ p∗Corr(s∗ = 0) + (1 − p∗). (32) noncontextual model while exhibiting the specified op- erational features of the sets M and S cannot satisfy Given the dependence of Corr(s∗ = 0) on R, this equa- Eqs. (9), (20) and (22), that is, it cannot satisfy Corr = 1, tion specifies a tradeoff relation between Corr, R, and R > Rdet and p∗ > 0. To derive a noncontextuality in- p∗. Such a tradeoff relation constitutes a noncontextual- equality, therefore, one must simply determine the pre- ity inequality derived from a statistical proof of the KS cise trade-off relation satisfied by Corr,R, and p∗ in a theorem. universally noncontextual model. The precise amount by which Corr is bounded away Applying the assumption of preparation noncontextu- from 1 for a given value of R in Eq. (30) depends on ality to Eq. (24), we can infer that two quantities: (i) the maximum value of R(λ) for any deterministic noncontextual assignment, denoted here by X ν(λ) = µ(λ|S∗) = µ(λ|s∗, S∗)pr(s∗|S∗). (25) Rdet, (ii) the maximum value of R(λ) for any indetermin-

s∗ istic noncontextual assignment, denoted here by Rind, and (iii) the maximum value of Corr(λ) for any inde- Substituting this into Eq. (16), we obtain terministic noncontextual assignment, which (as in the case of logical proofs of the KS theorem) we denote by X Corr = pr(s∗|S∗)Corr(s∗), (26) Corrind. The values of Rdet, Rind, and Corrind depend

s∗ on the particular statistical proof of the KS theorem one is considering. We will show that where Rind − R X Corr(s∗ = 0) ≤ (1 − Corrind) + Corrind. Corr(s∗) ≡ Corr(λ)µ(λ|s∗, S∗). (27) Rind − Rdet λ (33)

Corr(s∗) quantifies the average degree of correlation for Substituting this into Eq. (32), we obtain the main the pairings {(Si, Mi)} predicted by ontic state λ, aver- result of this article. aged over the ontic states in the support of µ(·|s∗, S∗). The argument proceeds by showing that the quantity Theorem 1. In a prepare-and-measure experiment that admits of a universally noncontextual ontological model, Corr(s∗ = 0) has a nontrivial upper bound. Define the following tradeoff relation between Corr, R, and p∗ (defined in Eqs. (8), (19), and (21), respectively), holds: (α) (α) R(λ) = F ({ξ(m~ |M , λ)}α), (28)   R − Rdet Corr ≤ 1 − p∗(1 − Corrind) . (34) where F is the linear function specified in Eq. (19), so Rind − Rdet that the expression for R in the ontological model is This is our noise-robust noncontextuality inequality. X Recall that by assumption, the precise form of R de- R = R(λ)µ(λ| , ) (29) s∗ S∗ pends on which statistical proof of the KS theorem one λ is considering, and that the definition of R is such that R > R . Recalling that Rdet denotes the maximum value that can ind det be achieved by deterministic noncontextual assignments Note that this inequality implies that if Corr = 1 and p > 0, then R ≤ R , so that our noise-robust non- to the measurements, if R > Rdet, then some of the ontic ∗ det contextuality inequality for a given statistical proof of states in the support of µ(·|s∗ = 0, S∗) must be incon- sistent with a convex mixture of deterministic noncon- the KS theorem reduces to the KS-noncontextuality in- equality that is conventionally associated to that proof textual assignments. In this case, Corr(s∗ = 0) must be bounded away from 1, [6, 17–22]. Experimentally, however, one never achieves perfect correlation, that is, one always finds Corr < 1, so Corr(s∗ = 0) < 1. (30) that our criterion for noncontextuality never reduces to a conventional KS-noncontextuality inequality in a real By contrast, given that the no-go result does not make experiment. It follows that a violation of a conventional any appeal to the statistics of measurements on the KS-noncontextuality inequality (R ≤ Rdet) in a real ex- preparations associated to [s∗ 6= 0|S∗], the ontic states periment is insufficient to demonstrate the failure of non- in the support of µ(·|s∗ 6= 0, S∗) could potentially assign contextuality. 8

A question that arises at this point is whether it might indeterministic assignments, denoted Λdet and Λind re- still be appropriate to test the inequality R ≤ Rdet spectively, so that Λ = Λdet ∪ Λind. For all λ ∈ Λdet, on the grounds that it tests the assumption of KS- R(λ) satisfies the nontrivial upper bound R(λ) ≤ Rdet noncontextuality rather than the assumption of univer- (where Rdet < Rind because R is, by construction, a func- sal noncontextuality. Recalling from Eq. (5) that the tion that cannot achieve the logically maximal value of assumption of KS-noncontextuality incorporates an as- Rind for deterministic noncontextual assignments) while sumption of outcome determinism, to adopt such a view Corr(λ) can always achieve its logical maximum of 1, so would be to simply assume outcome determinism rather that the bound is trivial, Corr(λ) ≤ 1. By contrast, for than seeking to justify it from preparation noncontextu- all λ ∈ Λind, Corr(λ) satisfies the nontrivial upper bound ality and perfect correlations between sources and mea- Corr(λ) ≤ Corrind (where Corrind < 1 because inde- surements. However, it was shown in Ref. [15] (see also terministic noncontextual assignments necessarily imply Ref. [23]) that assuming KS-noncontextuality (hence out- a failure to achieve perfect source-measurement correla- come determinism) for unsharp quantum measurements tions), but because there exist λ ∈ Λind such that R(λ) leads to absurd conclusions, such as the failure of KS- achieves its maximum of Rind, we have only the trivial noncontextuality for experiments that are completely bound R(λ) ≤ Rind. We now make use of these facts to classical (in the sense that all states and measurements determine the upper bound on Corr(s∗ = 0) for a given are diagonal in the same basis). Therefore, the assump- value of R. Defining tion of KS-noncontextuality is only applicable to sharp, X i.e., noiseless, quantum measurements, which are never µdet ≡ µ(λ|s∗ = 0, S∗) achieved experimentally. For operational theories other λ∈Λdet than quantum theory, the same argument holds: every measurement that can be achieved in a real experiment and fails to satisfy the ideal of noiselessness, and it is only for X such noiseless measurements that the assumption of KS µind ≡ µ(λ|s∗ = 0, S∗), 6 noncontextuality is justified. λ∈Λind The noncontextuality inequality of Eq. (34), on the other hand, accommodates noisy experimental data for so that µdet + µind = 1, and recalling Eqs. (27) and (29), which Corr < 1. Thus, even if sources and measurements we have deviate from the ideal of sharpness in an experiment, one can still see a violation of our inequality. It is in this sense Corr(s∗ = 0) ≤ µdet + Corrindµind, (35) that it is robust to noise. and One can also deduce the precise limit to noise tolerance for such an inequality. For a fixed p∗, in order to obtain R ≤ R µ + R µ . (36) a nontrivial upper bound on R (i.e., a bound smaller det det ind ind than R ), one must have Corr > 1 − p (1 − Corr ). ind ∗ ind Eliminating µdet and µind from these constraints, we ob- If the noise is such that Corr is reduced to a value be- tain Eq. (33). low this bound, then it becomes impossible to witness The noncontextuality inequality of Eq. (34) can be contextuality via this inequality. Further, if in addi- saturated by a noncontextual ontological model in cer- tion to fixing p∗, one achieves a certain value of R, say 7 max tain circumstances. Define Λdet ≡ {λ ∈ Λdet : R = Rexpt, then contextuality is witnessed if and only max R(λ) = Rdet, Corr(λ) = 1} and Λind ≡ {λ ∈ Λind :  Rexpt−Rdet  max if Corr > 1 − p∗(1 − Corrind) R −R (which is just R(λ) = Rind, Corr(λ) = Corrind}. Clearly, Λdet ⊆ Λdet ind det max a rewriting of the violation of Eq. (34)). In Appendix and Λind ⊆ Λind. For noncontextual measurement- assignment polytopes based on statistical proofs of the C, we provide further details about what an experiment max max must achieve in order to test an inequality of the form of KS theorem, Λdet is always a non-empty set. If Λind Eq. (34). is non-empty, then the noncontextuality inequality of The proof of Eq. (33) proceeds as follows. Without Eq. (34) can be saturated. The reason is that in this case one can choose the support of µ(λ|s∗ = 0, S∗) on Λdet to any loss of generality, we identify the set of ontic states Λ max be restricted to Λdet and the support of µ(λ|s∗ = 0, S∗) with the set of vertices of the polytope of noncontextual max assignments to the elements of M. We divide the ver- on Λind to be restricted to Λind , in which case the in- equalities in Eqs. (35) and (20) become equalities. The tices into two sets, corresponding to deterministic and max condition that Λind be non-empty is satisfied for the case of odd n-cycle scenario, and therefore the noncontextu- ality inequalities that we derive in this case will be tight.

6 Note that the inappropriateness of applying KS-noncontextuality to real experiments has been argued in detail elsewhere over the years [10, 11, 13, 15, 23–25] and our comments here are meant merely to highlight the precise sense in which this plays out for 7 See Section VI.B of Ref.[23] for a detailed discussion of this non- statistical proofs of the KS theorem. contextual ontological model. 9

To illustrate our technique on a concrete example, we theirs for the case of the odd n-cycle scenario. One differ- now turn to the odd n-cycle scenario. ence is that all of the simulating measurements in their For the case of the n-cycle scenario, the function F approach are 3-outcome measurements, rather than the defining the quantity R is specified in Eq. (23). As noted 4-outcome case we have considered here (see also Ap- in our previous discussion of the n-cycle scenario, for de- pendix D). We also show that their inequality for the terministic vertices of the noncontextual measurement- odd n-cycle case is a special case of our noncontextuality assignment polytope, there is a nontrivial upper bound inequality for the odd n-cycle scenario, Eq. (40), when on R(λ), namely, R(λ) ≤ Rdet, where p∗ is presumed to take the value of 1/3. Hence, unlike Ref. [12], our inequality does not presume that all the n − 1 ensembles of preparations in the experiment correspond Rdet = . (37) n to uniformly random probability distributions (i.e., prob- ability 1/3 for each preparation). Rather, it merely pre- Similarly, for every indeterministic vertex of the non- sumes that the relevant operational equivalences hold. If contextual measurement-assignment polytope, there is a the value of p realized in an experiment is different from nontrivial upper bound on Corr(λ), namely, Corr(λ) ≤ ∗ the 1/3 value of the ideal quantum realization, then our Corr , where ind inequality specializes to one that is different from that 1 of Ref. [12]. These differences are not very significant, Corr = . (38) ind 2 however. We consider the main advantage of our ap- proach over that of Ref. [12] to be that it makes clear Also, we have precisely which aspects of the polytope of noncontextual measurement-assignments (for any given statistical proof Rind = 1. (39) of the KS theorem) need to be identified in order to de- termine the form of the noncontextuality inequality. (For further discussion of the vertices of the polytope in The value of upper bounds on R(λ) for determin- the case of the n-cycle scenario, see Appendix B.) Sub- istic and indeterministic vertices of the noncontextual stituting Eqs. (37) and (38) into Eq. (34), we find that a measurement-assignment polytope, denoted here by R noise-robust noncontextuality inequality for the n-cycle det and R respectively, are well-studied for many statis- scenario is: ind tical proofs of the KS theorem[16, 17]. The values of n  n − 1 the upper bounds on Corr(λ) for indeterministic ver- Corr ≤ 1 − p∗ R − . (40) 2 n tices of this polytope, denoted here by Corrind, have not been studied previously, but are just as easy to deter- The quantum realization of the n-cycle scenario [6, 14] mine.8 From these, one can determine a noncontextu- that was discussed earlier clearly violates this inequality. ality inequality for any statistical proof of the KS theo- It suffices to note that by Eq. (40), if Corr = 1 and rem via Eq. (34). A study of how our technique allows n−1 one to convert the graph-theoretic framework of [17] to p∗ > 0, then R ≤ n , while this quantum realization π 1 2 cos( n ) n−1 a hypergraph-theoretic framework for noise-robust non- achieves Corr = 1, p∗ = 3 and R = 1+cos( π ) > n . The n contextuality inequalities is carried out in [23]. 1 violation persists in the presence of noise: if p∗ = 3 , there is a range of values of Corr and R below those achieved in the ideal quantum realization — that is, where Corr < 1 π 2 cos( n ) and R < π — such that the inequality is still ACKNOWLEDGMENTS 1+cos( n ) violated. We would like to thank David Schmid and Elie Wolfe for discussions, and Debashis Saha and Zhen-Peng Xu for V. DISCUSSION comments on an earlier version of this paper. Research at Perimeter Institute is supported by the Government of As noted earlier, Xu et al. [12] have previously obtained Canada through the Department of Innovation, Science noise-robust noncontextuality inequalities starting from and Economic Development Canada and by the Province the KCBS [6] and Yu-Oh [7] statistical proofs of the KS of Ontario through the Ministry of Research, Innovation theorem. In Appendix E, we compare our approach to and Science.

8 Indeed, this quantity makes appearance as a hypergraph invari- ant in Ref. [23], in addition to the usual invariants in the graph- 10

of Quantum Contextuality in Neutron Interferometry”, [1] S. Kochen and E. P. Specker, “The Problem of Hidden Phys. Rev. Lett. 103, 040403 (2009). Variables in Quantum Mechanics”, J. Math. Mech. 17, [20] R. Lapkiewicz, P. Li, C. Schaeff, N. K. Langford, 59 (1967). Available at Indiana University Mathematics S. Ramelow, M. Wie´sniak,A. Zeilinger, “Experimental Journal. 17, 59 (1968). non-classicality of an indivisible quantum system”, Na- [2] A. Peres, “Two simple proofs of the Kochen-Specker the- ture 474, 490 - 493 (2011). orem,” J. Phys. A 24, L175 (1991). [21] C. Zu, Y.-X. Wang, D.-L. Deng, X.-Y. Chang, K. Liu, [3] N. D. Mermin, “Hidden variables and the two theorems P.-Y. Hou, H.-X. Yang, and L.-M. Duan, “State- of John Bell,” Rev. Mod. Phys. 65, 803 (1993). Independent Experimental Test of Quantum Contextu- [4] A. Cabello, Adan, J. Estebaranz, and G. Garcia-Alcaine, ality in an Indivisible System”, Phys. Rev. Lett. 109, “Bell-Kochen-Specker theorem: A proof with 18 vectors,” 150401 (2012). Physics Letters A 212, 183 (1996). [22] F. M. Leupold, M. Malinowski, C. Zhang, V. Negnevit- [5] A. Cabello, M. Kleinmann, J. R. Portillo, “Quantum sky, J. Alonso, A. Cabello, J. P. Home, “Sustained state- state-independent contextuality requires 13 rays”, J. independent quantum contextual correlations from a sin- Phys. A: Math. Theor. 49, 38LT01 (2016). gle ion”, arXiv:1706.07370 [quant-ph] (2017). [6] A. A. Klyachko, M. A. Can, S. Binicio˘glu,and A. S. Shu- [23] R. Kunjwal, “Beyond the Cabello-Severini-Winter frame- movsky, “Simple Test for Hidden Variables in Spin-1 Sys- work: making sense of contextuality without sharpness tems”, Phys. Rev. Lett. 101, 020403 (2008). of measurements”, arXiv:1709.01098 [quant-ph] (2017). [7] S. Yu and C. H. Oh, “State-Independent Proof of [24] R. Kunjwal, “Fine’s theorem, noncontextuality, and cor- Kochen-Specker Theorem with 13 Rays”, Phys. Rev. relations in Specker’s scenario”, Phys. Rev. A 91, 022108 Lett. 108, 030402 (2012). (2015). [8] J. S. Bell, “On the Einstein-Podolsky-Rosen Paradox”, [25] M. D. Mazurek, M. F. Pusey, R. Kunjwal, K. J. Resch, Physics 1, 195 (1964). Reprinted in Ref. [28], chap. 2. R. W. Spekkens, “An experimental test of noncontextu- [9] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt, ality without unphysical idealizations”, Nat. Commun. 7, “Proposed Experiment to Test Local Hidden-Variable 11780 (2016). Theories”, Phys. Rev. Lett. 23, 880 (1969). [26] S. Popescu and D. Rohrlich, “Quantum nonlocality as an [10] R. Kunjwal and R. W. Spekkens, “From the Kochen- axiom”, Found Phys (1994) 24: 379. Specker theorem to noncontextuality inequalities without [27] A. Fine, Hidden Variables, Joint Probability, and the Bell assuming determinism”, Phys. Rev. Lett. 115, 110403 Inequalities, Phys. Rev. Lett. 48, 291 (1982). (2015). [28] J. S. Bell, “Speakable and unspeakable in quantum me- [11] A. Krishna, R. W. Spekkens, and E. Wolfe, “Deriv- chanics” (Cambridge University Press, New York, 1987). ing robust noncontextuality inequalities from algebraic proofs of the Kochen-Specker theorem: the Peres-Mermin square”, New. J Phys 19, 123031 (2017). Appendix A: Elaboration of the ideas underlying the [12] Z-P. Xu, D. Saha, H-Y. Su, M. Pawlowski, and J-L. Chen, technique “Reformulating noncontextuality inequalities in an oper- ational approach”, Phys. Rev. A 94, 062103 (2016). [13] R. W. Spekkens, “Contextuality for preparations, trans- The features of the correlations within compatible sub- formations, and unsharp measurements”, Phys. Rev. A sets of M (for the special preparation [s∗ = 0|S∗]) which 71, 052108 (2005). underlie the no-go theorem of Proposition 5 constitute [14] Y. C. Liang, R. W. Spekkens, H. M. Wiseman, “Specker’s a witness that these correlations cannot arise from a parable of the overprotective seer: A road to contextual- distribution µ(λ|s∗ = 0, S∗) supported only on ontic ity, nonlocality and complementarity”. Phys. Rep. 506, states corresponding to measurement-noncontextual and 1 (2011). outcome-deterministic assignments. These features are [15] R. W. Spekkens,“The status of determinism in proofs of represented by a quantity R that is upper bounded by a the impossibility of a noncontextual model of quantum constant number if the correlations do arise from a dis- theory”, Found. Phys. 44, 1125 (2014). [16] M. Ara´ujo,M. T. Quintino, C. Budroni, M. T. Cunha, tribution µ(λ|s∗ = 0, S∗) supported only on such ontic and A. Cabello, “All noncontextuality inequalities for the states. n-cycle scenario”, Phys. Rev. A 88, 022118 (2013). This is precisely analogous to how, in a Bell scenario, [17] A. Cabello, S. Severini, and A. Winter, “(Non- the violation of a Bell inequality witnesses the fact that )Contextuality of Physical Theories as an Axiom”, the distribution over ontic states has support on indeter- arXiv:1010.2163 [quant-ph] (2010), and “Graph- ministic vertices of the no-signalling polytope, e.g., PR- Theoretic Approach to Quantum Correlations”, Phys. boxes [26] in the case of the CHSH scenario. The ontic Rev. Lett. 112, 040401 (2014). states corresponding to deterministic vertices of the no- [18] A. Cabello, S. Filipp, H. Rauch, and Y. Hasegawa, “Pro- signalling polytope define the Bell polytope. A Bell vio- posed Experiment for Testing Quantum Contextuality lation thus rules out outcome-deterministic locally-causal with Neutrons”, Phys. Rev. Lett. 100, 130404 (2008). [19] H. Bartosik, J. Klepp, C. Schmitzer, S. Sponar, A. Ca- ontological models. bello, H. Rauch, and Y. Hasegawa, “Experimental Test By Fine’s theorem [27], a Bell violation also rules out locally-causal ontological models that are outcome- indeterministic. This is because the notion of local causality implies factorizability of the joint response func- theoretic approach of Ref. [17]. tion, so that these joint response functions are convex 11 mixtures of outcome-deterministic assignments to the lo- cal measurements. In other words, Fine’s theorem shows that there is no loss of generality in assuming outcome determinism in tests of locality.9 However, for tests of noncontextuality involving nontrivial applications of the assumption of measurement noncontextuality (as is the case for every test of noncontextuality arising from a sta- tistical or logical proof of the KS theorem), there is no analogue of Fine’s theorem. The reasons for this are de- scribed in Ref. [15]. Now, for Corr to be bounded away from 1, a non-empty subset of the ontic states in the union of the supports of sources in S must correspond to the indeterministic vertices of the polytope of measurement noncontextual assignments of probabilities to measurement outcomes. In the case of noncontextuality inequalities inspired by logical proofs of the KS theorem [10], all the ontic states FIG. 2. Compatibility relations for the 5-cycle case: the in the union of the supports of sources in S correspond to vertices represent the equivalence classes of binary-outcome indeterministic vertices of this polytope, simply because measurements and the edges denote compatibility (i.e., joint the polytope admits no deterministic vertices on account measurability) of the vertices they contain. Specifically, the of the KS-uncolourability that such proofs hinge upon. ith vertex denotes the equivalence class Mi, and the ith edge On the other hand, for noncontextuality inequalities in- denotes the equivalence class M(i,i⊕1). spired by statistical proofs of the KS theorem, we need a witness for the fact that some non-empty subset of the union of the ontic supports of sources in S corresponds The assumption of noncontextuality implies that the to indeterministic vertices of the measurement noncon- response function representing a measurement depends textuality polytope. Only then can we expect Corr to be only on its equivalence class, so that we infer the con- bounded away from 1. This witness corresponds to the straints quantity R exceeding its KS-noncontextual bound Rdet. X (i,i 1) ∀i : ξ(mi, mi 1|M , λ)

mi 1 Appendix B: Noncontextual X (i,i⊕1) measurement-assignment polytope for the n-cycle = ξ(mi, mi⊕1|M , λ). (B3) scenario for odd n mi⊕1 Together with the conditions of being a probability distri- An ontological model must specify a conditional prob- (i,i⊕1) bution, 0 ≤ ξ(mi, mi⊕1|M , λ) ≤ 1, and of normal- ability distribution for every compatible subset of mea- P (i,i⊕1) ization, ξ(mi, mi⊕1|M , λ) = 1, Eq. (B3) surements. In the n-cycle scenario, there are n such sub- mi,mi⊕1 defines the constraints on the response functions. The sets, corresponding to all adjacent pairs of measurements set of solutions to these constraints in turn determines in the cycle. This is depicted in Fig. 2. Recalling that the set of solutions for the n-tuple of response functions M(i,i⊕1) denotes the equivalence class of measurement for the binary-outcome measurements, {ξ(m |M , λ)} , procedures that jointly simulate M and M , an onto- i i i i i⊕1 through Eq (B2). logical model must specify an n-tuple of response func- For every λ, such an n-tuple of response functions de- tions of the form fines a possible n-tuple of (deterministic or indetermin- (i,i⊕1) istic) assignments to all of the measurements. Follow- ξ(mi, mi⊕1|M , λ). (B1) ing Ref. [11], we term the latter set the noncontextual Marginalization of an outcome of a procedure is mod- measurement-assignment polytope. (It is equivalent to elled in the ontological model by marginalization of the what is termed the “no-disturbance” polytope elsewhere; response function, so that the ontological representation for the n-cycle scenario, it was characterized in Ref. [16].) of Mi satisfies There are two types of vertex for this polytope, cor- responding to noncontextual measurement assignments X (i,i⊕1) ξ(mi|Mi, λ) = ξ(mi, mi⊕1|M , λ). (B2) that are deterministic and indeterministic respectively. mi⊕1 If we identify the set of ontic states Λ with the set of ver- tices of the polytope, as in the main text, then the two types define a partition of Λ into subsets Λdet and Λind. The deterministic vertices are simply those that assign 9 See Ref. [24] for an analysis of the role of Fine’s theorem in tests an outcome to each of the Mi independently. There are of locality vis-`a-vistests of noncontextuality. consequently 2n of these. All of these can achieve the 12

FIG. 3. An example of a deterministic measurement- FIG. 4. The indeterministic measurement-noncontextual as- noncontextual assignment that maximizes the amount of an- signment that achieves perfect anticorrelation, R(λ) = 1. 4 ticorrelation achievable by such assignments, R(λ) = 5 .

The vertex κ∗ is depicted in Fig. 4. It is the only element 1 logical maximum value of 1 for Corr(λ), and of these, of the set Λind that saturates the inequalities Corr(λ) ≤ 2 n−1 there are 2n that achieve R(λ) = n . For instance, this and R(λ) ≤ 1 described in the main text. n is achieved for the vertices {κj}j=1 of the form For each statistical proof of the KS theorem, one can determine the polytope of noncontextual measurement ∀i ∈ [n]:ξ(mi, mi⊕1|κj) assignments. The details of this polytope will determine the nontrivial upper bound on R(λ) for the deterministic = ξ(mi|κj)ξ(mi⊕1|κj) (B4) vertices (where R is defined in a manner that is specific where to the statistical proof one is considering; see Eq. (19) for the general form and Eq. (23) for an example from the n-cycle scenario) and the nontrivial upper bound on ξ(mi|κj) = δmi,+1 for i ∈ {j, j ⊕ 2, . . . , j ⊕ (n − 1)} Corr(λ) for the indeterministic vertices. These determi- = δmi,−1 for i ∈ {j ⊕ 1, j ⊕ 3, . . . , j ⊕ (n − 2)}. (B5) nations are all that one requires to derive a noncontextu- ality inequality for any given statistical proof of the KS theorem. κ1 is depicted in Fig. 3. To obtain another set of n vertices that achieve R(λ) = n−1 , it suffices to flip the sign of all the assignments. n Appendix C: How to test such inequalities These 2n vertices constitute the subset of Λdet that can n−1 experimentally saturate the inequalities Corr(λ) ≤ 1 and R(λ) ≤ n described in the main text. The indeterministic vertices are those that exhibit ei- Recall that α is a variable that runs over the com- (α) ther perfect positive correlation or perfect negative corre- patible subsets of measurements in M and that M lation for each of the adjacent pairs of measurements with denotes the measurement that jointly simulates the com- the number of pairs that exhibit perfect negative corre- patible subset associated to α, that is, {Mi}i∈α. Strictly (α) lation being odd. There are 2n−1 such assignments. All speaking, M denotes an equivalence class of measure- (α) of these achieve Corr(λ) = 1 . One of them also achieves ment procedures. Let M denote a specific proce- 2 (α) R(λ) = 1, namely, the one, denoted κ∗, corresponding to dure in the class M , and let Mi(α) denote the proce- perfect negative correlation for all pairs, dure in the equivalence class Mi that is obtained by im- plementing the joint measurement procedure M (α) and ∀i ∈ [n]:ξ(mi, mi⊕|κ∗) post-processing its outcome (specifically, by marginaliz- 1 1 ing over the outcomes of all measurements other than Mi = δmi,+1δmi⊕,−1 + δmi,−1δmi⊕,+1, (B6) in the compatible subset). Supposing that α = ai and 2 2 0 α = ai both correspond to compatible subsets of mea- surements that include M , then M and M 0 such that the assignment to each Mi is uniformly ran- i i(α=ai) i(α=ai) dom, are distinct procedures in the operational equivalence class Mi. 1 1 ∀i ∈ [n]:ξ(m |κ ) = δ + δ . (B7) Any experiment that involves the set of measurements i ∗ 2 mi,−1 2 mi,+1 M and seeks to test noncontextuality must aim to im- 13 plement a specific measurement procedure M (α) for each Appendix D: An alternative way of operationalizing α such that every operational equivalence relation of the the KCBS proof of the KS theorem form M ' M 0 holds. i(α=ai) i(α=ai) For instance, in the case of the n-cycle scenario, In this article, we have operationalized the KCBS proof where M(i,i⊕1) denotes the equivalence class of measure- of the KS theorem as an n-cycle scenario with odd n, ment procedures that jointly simulate Mi and Mi⊕1, that is, as n binary-outcome measurements arranged in where M (i,i⊕1) denotes a specific procedure in the class a cycle such that adjacent pairs are jointly measurable. (i,i⊕1) M , and where Mi(i⊕1) denotes the procedure one The outcome set of each joint measurement can be taken obtains by implementing M (i,i⊕1) and marginalizing over to be the Cartesian product of the outcome sets of the the outcome mi⊕1, any experiment that seeks to test non- two measurements being simulated, so that each joint contextuality must aim to implement a specific measure- measurement has four outcomes. Recall that the binary- (i,i⊕1) ment procedure M for each i ∈ [n] such that the outcome measurements were denoted Mi with outcome (i,i⊕1) operational equivalence relation Mi(i⊕1) ' Mi(i 1) holds mi, and the joint measurements were denoted M for all i ∈ [n]. with outcome (mi, mi⊕1) (For more details, see Appendix Furthermore, any experiment that involves the set of B.) sources S and seeks to test noncontextuality must aim However, one can also imagine operationalizing the to implement a specific binary-outcome source procedure proof as an odd number n of three-outcome measure- tri Si for each i, as well as a special source procedure S∗ ments. Denoting the ith measurement by Mi , and tak- tri such that the operational equivalence relations [>|Si] ' ing the outcome set to be mi ∈ {0, 1, 2}, the opera- 0 tri [>|Si0 ] ' [>|S∗] for all i, i hold (see Eq. (24)). tional equivalence relations have the form ∀i : [0|Mi ] ' (α) tri Whichever measurement procedures {M }α and [2|Mi⊕1]. In words, the last outcome of one measure- source procedures {Si}i and S∗ one targets, however, ex- ment in the cycle is operationally equivalent to the first perimental imperfections ensure that the relevant oper- outcome of the next measurement in the cycle. ational equivalence relations are not achieved precisely. To translate between the two approaches, it suffices tri But given that noncontextuality is an inference from op- to recognize that the three-outcome measurement Mi erational equivalences to equivalences in the ontological can be identified with the four-outcome measurement model, such imprecision blocks the derivation of any con- M(i,i⊕1) as long as one of the outcomes of the latter sequences for the ontological model of the experiment. has probability zero for all preparations. Specifically, we This was termed the problem of no strict operational take (mi = +1, mi⊕1 = +1) to be the outcome that is al- equivalences in Ref. [25]. It was shown there how to ways assigned zero probability, and make the translations tri tri solve it using the technique of secondary procedures (see mi = 0 ↔ (mi = +1, mi⊕1 = −1), mi = 1 ↔ (mi = tri also Sec. V of Ref. [11]). The idea is to identify, within −1, mi⊕1 = −1), and mi = 2 ↔ (mi = −1, mi⊕1 = +1). the convex hull of the sources and measurements that If an experiment implements n four-outcome measure- were experimentally implemented (termed the primary ments where all of the outcomes have nonzero probabil- procedures), sources and measurements that satisfy the ity, then one must use the n-cycle approach, whereas if operational equivalence relations exactly (termed the sec- the measurements that are implemented have only three ondary procedures), and then to test the noncontextual- outcomes that ever occur, then either approach can be ity inequalities on the secondary procedures. used. Note that operational equivalence of two measure- Some illustrative quantum examples help to clarify ments (sources) requires equivalence of statistics for all these ideas. sources (measurements), or equivalently, equivalence for Recall that in KCBS’s quantum construction [6], a tomographically complete set of sources (measure- the binary-outcome measurement Mi is represented by ments). Experiments seeking to test operational equiv- (i) (i) a projection-valued measure (PVM) {Π+ , Π− } where alence relations, therefore, must accumulate in Π(i) ≡ |l ihl |, and Π(i) = I − Π(i). By con- favour of a given set of procedures being tomographically + i i − + (i) (i) complete. See, e.g., the evidence described in Ref. [25]. struction, any two neighbouring PVMs, {Π+ , Π− } (i⊕1) (i⊕1) Note that there is a loophole, which we term the tomog- and {Π+ , Π− }, consist of projectors that com- raphy loophole, for experiments testing universal noncon- mute and satisfy Π(i)Π(i⊕1) = 0. It follows that the textuality: no matter how much evidence one accumu- + + joint measurement (i,i⊕1) is represented (uniquely) by lates for the tomographic completeness of some set of M the four-outcome PVM consisting of the products of procedures, it is possible that future experiments will un- the projectors from each of the neighbouring PVMs, cover new procedures whose statistics are not predicted (i) (i⊕1) (i) (i⊕1) (i) (i⊕1) (i) (i⊕1) by the statistics of those in the set. It follows that any {Π+ Π+ , Π+ Π− , Π− Π+ , Π− Π− }. But this (i) (i⊕1) (i) (i⊕1) of tomographic completeness of some set is simplifies to {0, Π+ , Π+ ,I − Π+ − Π+ }, so that necessarily tentative. One should endeavour to falsify it it is clear that the first outcome always has probabil- experimentally, and as long as it resists falsification, one ity zero of occuring and consequently the joint measure- has good evidence for the hypothesis. But one can never ment is translatable into one of the three-outcome vari- verify it. ety. This, therefore, is an example of the type described 14

above, wherein the outcome (mi = +1, mi⊕1 = +1) of the We are confident that generalizations of the two tech- four-outcome joint measurement M(i,i⊕1) never occurs. niques can address these concerns, because the intuition Now consider quantum realizations of the n-cycle sce- behind them (articulated in the last paragraph of Sec. 2.2 nario wherein the binary-outcome measurement Mi is of Ref. [12]) is in line with that of our approach. How- not represented projectively, but rather by a nonpro- ever, Xu et al. did not disentangle this intuition from (i) (i) features of the ideal quantum realization as cleanly as we jective POVM, which we denote by {E+ ,E− }, where (i) (i) do here. E = I − E . The constraint that M and M − + i i⊕1 We consider the main advantages of the approach de- be jointly simulatable implies that there must exist a (i) (i) (i) (i) scribed in this article, relative to those of Ref [12], to be four-outcome POVM, {G++,G+−,G−+,G−−} such that two-fold: (i) we have derived the inequalities in a princi- (i) (i) (i) (i) (i) (i⊕1) G++ + G+− = E+ and G++ + G−+ = E+ . pled manner, motivating the logic with a no-go theorem For certain compatible pairs of nonprojective POVMs, for universal noncontextuality, and excising all features of (i) (i⊕1) namely, those for which E+ + E+ ≤ I, there the ideal quantum realization that are not needed to de- exists a joint measurement POVM of the form rive nontrivial inequalities (such as the particular choice (i) (i) (i) (i) (i) (i) (i) of probabilities for source outcomes), and (ii) we have {G++,G+−,G−+,G−−} where G++ = 0, G+− = E+ , described explicitly which parameters in the noncontex- G(i) = E(i⊕1), and G(i) = I−E(i) −E(i⊕1). This consti- −+ + −− + + tuality inequality depend on the choice of statistical proof tutes another example of a four-outcome joint measure- and how to compute these parameters by characterizing ment where the first outcome never occurs, so that it is the noncontextual measurement-assignment polytope as- translatable into one of the three-outcome variety. sociated to that proof. On the other hand, for generic compatible pairs of non- For the case of the KS theorem based on the odd n- projective POVMs, i.e., those for which it is not the case (i) (i⊕1) cycle scenario, the second technique described in Xu et that E+ + E+ ≤ I, the joint measurement must be al. leads to an inequality that is a special case of our represented by a genuinely four-outcome POVM. It fol- noncontextuality inequality for this scenario when some lows that if one considers a quantum realization of the parameters are fixed. In the rest of this section, we make n-cycle wherein the compatible pairs are of this sort, then the connection explicit. the four-outcome joint measurement is not translatable First, we note that the manner in which the KCBS into one of the three-outcome variety. statistical proof of the KS theorem is operationalized in Ref. [12] differs from the manner in which we do so here in precisely the sort of way outlined in the previous Ap- Appendix E: Comparison with the approach of Xu pendix. Strictly speaking, therefore, their inequality is et. al. only applicable for experiments that aim to implement a tri set of n three-outcome measurements, {Mi }i, with op- We here compare our approach to obtaining inequali- tri erational equivalence relations of the form ∀i : [0|Mi ] ' ties for universal noncontextuality from statistical proofs tri [2|Mi⊕1]. of the KS theorem to the one described in Xu et. al. [12]. For ease of comparison with our results, however, In fact, Ref. [12] describes two approaches to doing so. we conceptualize these three-outcome measurements as The first is described in Sections IV A and B of that four-outcome measurements wherein one of the outcomes paper and the second is described in Section V. Neither never occurs, and we make the particular identification approach, however, is presented in a manner that fully between outcomes outlined in the previous Appendix. excises reference to the ideal quantum realization. We can then rewrite their inequalities using the nota- The first technique, for instance, makes explicit refer- tional conventions of this article. ence to predictions of quantum theory when determining Doing so, the inequality of Ref. [12] becomes the upper bound on their quantity A (in the case of the KCBS proof); they presume that their quantity I can n  n − 1 √ Corr ≤ 1 − R − . (E1) achieve the maximum quantum value of 5 (their Eq. 6 n (17)). In the second technique (also in the case of the n-cycle which is a special case of our inequality for the n-cycle proof), the choice of coefficients in the inequality is par- scenario (Eq. (40)) where p∗ is presumed to take the value ticular to the case when the source outcomes are uni- that it takes in the ideal quantum realization of the no-go 10 1 formly random, as in the ideal quantum realization. result for universal noncontextuality, namely p∗ = 3 .

10 Note that some of the commentary provided in Ref. [12] on their notation) are perfectly distinguishable, so that their ontic sup- derivation of the inequality may create the impression that it is ports are disjoint. See, e.g., the comment above Eq. (20) in their important that two preparation procedures (P and P¯ in their article. This idealization is not, however, required to derive the inequality.