
Quantum Contextuality

Costantino Budroni,1,2 Adán Cabello,3,4 Otfried Gühne,5 Matthias Kleinmann,5 and Jan-Åke Larsson6
1 Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
2 Institute for Quantum Optics and Quantum Information (IQOQI), Austrian Academy of Sciences, Boltzmanngasse 3, 1090 Vienna, Austria
3 Departamento de Física Aplicada II, Universidad de Sevilla, 41012 Sevilla, Spain
4 Instituto Carlos I de Física Teórica y Computacional, Universidad de Sevilla, 41012 Sevilla, Spain
5 Naturwissenschaftlich-Technische Fakultät, Universität Siegen, Walter-Flex-Straße 3, 57068 Siegen, Germany
6 Institutionen för Systemteknik och Matematiska Institutionen, Linköpings Universitet, 58183 Linköping, Sweden
(Dated: March 15, 2021)

A central result in the foundations of quantum mechanics is the Kochen-Specker theorem. In short, it states that quantum mechanics is in conflict with classical models in which the result of a measurement does not depend on which other compatible measurements are jointly performed. Here, compatible measurements are those that can be performed simultaneously or in any order without disturbance. This conflict is generically called quantum contextuality. In this article, we present an introduction to this subject and its current status. We review several proofs of the Kochen-Specker theorem and different notions of contextuality. We explain how to experimentally test some of these notions and discuss connections between contextuality and nonlocality or graph theory. Finally, we review some applications of contextuality in quantum information processing.

arXiv:2102.13036v2 [quant-ph] 12 Mar 2021

Contents

I. Introduction
II. Quantum contextuality in a nutshell
   A. A first example
   B. A second look
III. The Kochen-Specker theorem
   A. Kochen-Specker sets
   B. Generalized Kochen-Specker-type arguments
      1. The magic square and the magic pentagram
      2. Yu and Oh's set
   C. As a special instance of Gleason's theorem
IV. Contextuality as a property of nature
   A. Noncontextuality inequalities
      1. Mathematical structure of noncontextual hidden variable models
      2. State-dependent contextuality
      3. State-independent contextuality
      4. Other approaches to noncontextual hidden variable models
   B. Operational definitions and physical assumptions: ideal measurements
      1. Two perspectives: observables and effects
      2. Operational definitions of contexts: OP
      3. Operational definitions of contexts: EP
   C. Modeling experimental imperfections
      1. Quantifying disturbance in sequential measurements
      2. Context-independent time evolution
      3. First proposals of experimentally testable inequalities
      4. Approximate quantum models
      5. Maximally noncontextual models
   D. Experimental realizations
      1. Early experiments
      2. A test of the Peres-Mermin inequality with trapped ions
      3. A test of the Peres-Mermin inequality with photons
      4. A test of the KCBS inequality with photons
      5. Final considerations on KS contextuality experiments
   E. A different notion of contextuality: Spekkens' approach
      1. Spekkens' definition of noncontextuality
      2. Inequalities for Spekkens' noncontextuality
      3. Experimental tests of Spekkens' contextuality
      4. Relation with different notions of hidden variable theory
V. Advanced topics and methods
   A. The noncontextuality polytope
      1. Simplest example
      2. Basics of convex polytopes, affine geometry, and linear programming
      3. Noncontextuality inequalities
   B. Graph theory and contextuality
      1. Basic notions
      2. Graphs, hypergraphs, and marginal scenarios
      3. Exclusivity graphs and their independence, Lovász, and fractional packing numbers
      4. The graph approach and the quest for a principle for quantum correlations
      5. Chromatic and fractional chromatic numbers
   C. The connections between Kochen-Specker and Bell theorems
   D. Classical simulation of quantum contextuality
      1. Simulation with Mealy machines
      2. Simulation with ε-transducers
      3. Other related results
   E. The so-called nullifications of Kochen-Specker theorem
      1. Meyer's "nullification" of the KS theorem
      2. Clifton and Kent's "nullification" of the KS theorem
VI. Applications of quantum contextuality
   A. Contextuality and quantum computation
      1. Contextuality and magic states
      2. Contextuality and shallow quantum circuits
   B. Contextuality and quantum cryptography
      1. Svozil's quantum key distribution protocol
      2. Contextuality offers device-independent security
   C. Random number generation
   D. Further applications
      1. Parity-oblivious multiplexing
      2. State discrimination
      3. Zero-error channel capacities
      4. Dimension witnesses
      5. Self-testing
      6. Further applications on the horizon
VII. Summary and outlook
Acknowledgments
A. Quantum contextuality from a historical perspective
      1. The problem of hidden variables
      2. The Kochen-Specker theorem
      3. The origin of the word "contextuality"
      4. The relation between the KS and Bell theorems and the need for a theory-independent notion of noncontextuality
      5. Noncontextuality for ideal measurements
      6. The hidden history of noncontextuality inequalities
References

I. INTRODUCTION

Quantum contextuality is a phenomenon that combines many of the intriguing aspects of quantum theory in a single framework: from measurement incompatibility, as the impossibility of performing simultaneous measurements of arbitrary observables, to Bell nonlocality and entanglement, when the system examined is composed of several spatially separated parts. The adopted perspective is that of observed statistics, which allows for a theory-independent description and analysis of the experimental results. On the one hand, quantum contextuality generated an intense debate on the foundations of quantum mechanics and stimulated the search for physical principles explaining why quantum theory is the way it is. On the other hand, the nonclassical properties of contextual correlations have been directly connected to quantum information processing applications such as quantum computation.

The central role of quantum contextuality in quantum theory, both from a fundamental and an applied perspective, is what motivates this review. The difficulty is that there is a broad variety of perspectives from which to approach quantum contextuality, ranging from physics to mathematics, computer science, and philosophy, to mention some, and consequently a vast literature. It is impossible to review all of this literature, and doing so would not be useful for the reader. We are thus forced to make a selection of topics to be presented.

Our goal with this review is to provide an introduction to contextuality that covers all the most important (of course, from our subjective perspective) topics. In particular, we address the following questions:

(a) What is the structure of noncontextual hidden variable models?

(b) What are the physical assumptions involved in the definition of contextuality and how to operationally define contexts?

(c) How to perform experimental tests? What are the assumptions, the loopholes, and the methods to deal with them?

(d) What are the applications of quantum contextuality in quantum information processing?

These can be summarized as: What is quantum contextuality, how to test it, what is it useful for, and what do we learn from it?

This review aims to reach a broad audience, from people with little or no experience with quantum contextuality to experts working in the field, both on the theoretical and the experimental side. As a consequence, it can be read in different ways, and some parts can be skipped by readers who already have some experience with contextuality.

The content of this review can be briefly outlined as follows. Section II contains a brief introduction to the basic concepts involved in quantum contextuality, such as compatible measurements, contexts, and noncontextuality inequalities. Section III contains the statement and proof of the original Kochen-Specker theorem and further simplifications and related arguments. Section IV addresses the main question of this review: what is the mathematical structure of noncontextual hidden variable theories and how can we put these theories to the test in an experiment? This includes questions such as the operational identification of contexts, the problem of imperfect experimental realizations, and the experimental tests performed so far. A reader with some experience in quantum contextuality could skip Secs. II and III and start directly from here. In Sec. V, we present a collection of advanced topics associated with quantum contextuality, from the definition of the noncontextuality polytope and the computation of noncontextuality inequalities, to the relations between contextuality and graph theory and the connection between quantum contextuality and Bell nonlocality. Finally, in Sec. VI, information-theoretic applications of quantum contextuality, such as quantum computation and random number generation, are discussed. In addition, Appendix A puts the results presented in the review into a historical perspective.

II. QUANTUM CONTEXTUALITY IN A NUTSHELL

In this section, we briefly explain the essence of the Kochen-Specker theorem and the modern view on quantum contextuality. This is intended to be a simple explanation that introduces readers with little or no experience to the topic of quantum contextuality. Many of the subtleties and open problems, in particular those connected to the definition of contexts and compatible measurements, are discussed in more detail in the subsequent sections.

A. A first example

In the realm of classical physics it is possible to consistently assume the existence of values for intrinsic properties (e.g., the length) of a physical object. While it might be that a deficit in the preparation or measurement procedure leads to nondeterministic measurement outcomes, this uncertainty can always be assumed to be the result of insufficient control of certain parameters in the experiment. Quantum theory fundamentally contradicts such a point of view, and quantum contextuality is one way to reveal this contradiction.

Arguably, the simplest example of this phenomenon, in which state preparation plays no role, is provided by the so-called "Peres-Mermin square" (Mermin, 1990b, 1993; Peres, 1990, 1991, 1992, 1993), a construction of nine measurements arranged in a square:

$$ \begin{bmatrix} A & B & C \\ a & b & c \\ \alpha & \beta & \gamma \end{bmatrix}. \tag{1} $$

Each measurement is dichotomic, i.e., it has only two possible outcomes, in this case labeled by $+1$ and $-1$. If we think in classical terms, there could be nine properties of an object, and performing a measurement reveals whether the property is present ($+1$) or absent ($-1$). In the following, it is assumed that the three measurements in each column and row form a "context", i.e., a set of measurements whose values could, in principle, be jointly measured without disturbance. We write $ABC$ to denote the product of the values of the measurements $A$, $B$, and $C$ for a single object. Similarly we use $abc$, $Aa\alpha$, etc. In a classical model describing the object, each of the nine measurements has a definite value, independent of which context the measurement is contained in. Such a value assignment is then said to be noncontextual. Then, for the set $\{ABC, abc, \alpha\beta\gamma, Aa\alpha, Bb\beta, Cc\gamma\}$ there can only be an even number of products with assigned value $+1$. This holds since assigning $+1$ to all measurements gives six positive products, and changing the value assigned to any measurement changes the value of two of the products, since each measurement appears in two of them.

Defining the expectation value

$$ \langle ABC \rangle \equiv \mathrm{Prob}[ABC = +1] - \mathrm{Prob}[ABC = -1], \tag{2} $$

we have thus shown the validity of the inequality (Cabello, 2008)

$$ \langle \mathrm{PM} \rangle \equiv \langle ABC \rangle + \langle abc \rangle + \langle \alpha\beta\gamma \rangle + \langle Aa\alpha \rangle + \langle Bb\beta \rangle - \langle Cc\gamma \rangle \le 4. \tag{3} $$

The significance of this inequality comes, of course, from the fact that it can be violated by a quantum system. The quantum example works for a system composed of two spin-1/2 particles. If we denote by $\sigma_x$, $\sigma_y$, and $\sigma_z$ the Pauli operators, the observables are

$$ \begin{bmatrix} A & B & C \\ a & b & c \\ \alpha & \beta & \gamma \end{bmatrix} = \begin{bmatrix} \sigma_z \otimes \mathbb{1} & \mathbb{1} \otimes \sigma_z & \sigma_z \otimes \sigma_z \\ \mathbb{1} \otimes \sigma_x & \sigma_x \otimes \mathbb{1} & \sigma_x \otimes \sigma_x \\ \sigma_z \otimes \sigma_x & \sigma_x \otimes \sigma_z & \sigma_y \otimes \sigma_y \end{bmatrix}. \tag{4} $$

Notice that the observables within one row or one column mutually commute, which allows one to simultaneously measure them and make sense of the expectation value for the product of outcomes, e.g., $ABC$. One verifies that for a system in the state $|\psi\rangle$, such an expectation value is given by $\langle ABC \rangle = \langle\psi|ABC|\psi\rangle$. In fact, for the value of the terms in $\langle \mathrm{PM} \rangle$ the state $|\psi\rangle$ does not play any role, since $ABC = \mathbb{1}$ and we thus have $\langle ABC \rangle = +1$, and similarly for all products except $Cc\gamma = -\mathbb{1}$, which gives $\langle Cc\gamma \rangle = -1$. Summing these we obtain $\langle \mathrm{PM} \rangle = 6$, in clear contradiction to Eq. (3) (Cabello, 2008). Since we derived Eq. (3) under the assumption that it is possible to consistently assign a value to the nine observables of the object, the violation of Eq. (3) implies either that there is no value assignment or that the value assignment must depend on which context the observable appears in. This phenomenon is known as quantum contextuality.

B. A second look

At this point, we want to give a short preview of why quantum contextuality is a more subtle topic than it might seem from the argument that we presented so far. As an entry point, one might wonder why we chose the particular form of the above inequality, instead of a simpler form like

$$ \langle ABC \; abc \; \alpha\beta\gamma \; Aa\alpha \; Bb\beta \; Cc\gamma \rangle = +1. \tag{5} $$

The reason is that, in the quantum example, in order to violate inequality (3), we have to choose the observables in such a way that they are not all jointly measurable, i.e., they do not all mutually commute, as in the r.h.s. of Eq. (4). In such a case, according to quantum mechanics there is no measurement able to reveal the value of those observables on the same object consistently. For example, there is no common eigenstate of the observables $A$ and $b$ defined above. Hence, in order to test experimentally the expression in Eq. (5) one would need to perform a joint measurement of incompatible observables.

Another common misconception, associated with the particular realization of the PM square as two-qubit observables in Eq. (4), is that the measurements of each observable in the last row and last column can be performed as two single-qubit local measurements (with four outcomes) instead of a global (dichotomic) measurement. By doing so, however, one has, e.g., in the last row a measurement of six incompatible single-qubit observables, which cannot form a context. The use of single-qubit measurements, then, is at variance with the assumption that the last row and last column form a context and hence removes the contradiction. Performing coherent global measurements on the two qubits can indeed be a crucial challenge in experiments, see also Sec. IV.D.

A third misconception, diametrically opposite to the previous one, is to consider the measurement of each row and column as a single global and fundamental measurement. In this view, one has six measurements, corresponding to the three rows and columns, which simulate the nine measurements $A, \dots, \gamma$. Each of these six global measurements has four outcomes, corresponding to the outcomes of the three simulated measurements under the constraint that their product equals $+1$ (or $-1$ for the last column). Then, one may be surprised that none of the $2^9$ joint assignments of outcomes to the measurements $A, \dots, \gamma$ are logically possible. For the Peres-Mermin square, however, only the nine dichotomic measurements $A, \dots, \gamma$ are fundamental entities, and an evaluation of, for instance, $\langle Cc\gamma \rangle$ entails a product of three numbers, which experimentally is by no means guaranteed to be minus one, see also Fig. 11 in Sec. IV.D.

Finally, coming back to the discussion after Eq. (5), it should be noted that incompatibility does not immediately rule out a classical description: One can imagine a classical theory where the values of all physical properties are simultaneously defined, but the (classical) measurement procedure of a property introduces some disturbance in the system and modifies the value of other physical properties. We come back to this problem in Sec. IV.

There is a price we have to pay to see a violation of the inequality in Eq. (3): The current status of research is that it is impossible to conceive a quantum experiment featuring contextual behavior without additional assumptions. The basic reason is that we accepted that there are sets of observables the value of which cannot be revealed on the same object. But how can we ensure that a specific measurement in two different contexts does reveal the value of the same physical property?

This question brings us to the notion of compatibility. Intuitively, this corresponds to some notion of simultaneous measurability and non-disturbance among quantum measurements. In textbook quantum mechanics, an observable corresponds to a Hermitian operator, i.e., $A = A^\dagger$, with outcomes identified with its eigenvalues, i.e., $A = \sum_i \lambda_i P_i$, and the spectral projections $P_i$ identified with the measurement effects, i.e., $\mathrm{Prob}(\lambda_i) = \mathrm{tr}(\rho P_i)$; two observables $A$ and $B$ are said to be compatible if they commute, i.e., $[A, B] = 0$. Measurements of this type are called projective, ideal, or sharp, depending on which property one wants to emphasize. Commutativity is a strong property that implies several other properties of these measurements. In fact, if $[A, B] = 0$, then there exists another observable $C$ such that the spectral projections of $A$ and $B$ are a coarse-graining of those of $C$ and, thus, measuring $C$ allows one to infer the result of both $A$ and $B$, a property called joint measurability. Similarly, from the state-update rule $\rho \mapsto P_i \rho P_i$, one can see that the outcomes of $A$ are not disturbed by a subsequent measurement of $B$ and the outcomes are repeated by a later measurement of $A$, e.g., in the sequence $ABA$. This is the property of outcome repeatability. Conversely, projective measurements that satisfy one of the above properties are necessarily commuting (Heinosaari and Wolf, 2010).

This is no longer true in the case of generalized measurements, which may be nonprojective (or nonideal, unsharp). In this case, notions such as commutativity, nondisturbance, and joint measurability are no longer equivalent (Heinosaari and Wolf, 2010), and the term "incompatibility" usually denotes the lack of joint measurability (Heinosaari et al., 2016). Different notions, corresponding to stronger or weaker assumptions, are also possible.

For the moment, we do not enter into this problem. The reader can think of the case of projective measurements, where these ambiguities do not arise and which were the focus of most contextuality arguments until recent times (e.g., all examples presented in Sec. III and Sec. IV.A). The problem of nonideal measurements arises at the moment of providing an operational definition of contextuality and designing experimental tests. This problem is discussed in detail in Secs. IV.B, IV.C, and IV.D.

III. THE KOCHEN-SPECKER THEOREM

This section is devoted to the Kochen-Specker theorem, which can be considered the starting point of the research in quantum contextuality. In brief, it states that noncontextual models conflict with quantum theory. Understanding this theorem, and the several variants presented in the following, is important to understand quantum contextuality and the further developments in this research line.

A. Kochen-Specker sets

The example presented in the previous section is based on the violation of an inequality that is satisfied by any noncontextual value assignment. This is a "modern" tool to witness quantum contextuality. In contrast, the original argument by Kochen and Specker (1967) was designed as a logical impossibility proof for value assignments. In the following, we explain the original argument and some of its simplifications.

The Kochen-Specker (KS) theorem (Kochen and Specker, 1967) deals with assignments of truth values to potential measurement results. In quantum mechanics, such measurement results are, for ideal measurements, described by projectors. Each projector defines a subspace of the Hilbert space, namely the subspace that is left invariant under the action of the projector. If this subspace is one-dimensional, then it is a ray spanned by a single vector. The KS theorem can be seen as a statement about the impossibility of certain assignments to sets of vectors or, equivalently, to sets of rank-1 projectors.
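The inequality-based example recalled at the beginning of this subsection lends itself to a direct numerical check. The following Python sketch assumes NumPy; the names `SQUARE`, `CONTEXTS`, `SIGNS`, `contexts_commute`, `quantum_pm`, and `classical_max` are illustrative choices and not notation from the text. It builds the observables of Eq. (4), checks that each row and column consists of commuting operators, evaluates the expression of Eq. (3) on a randomly chosen two-qubit state, and compares the result with a brute-force maximum over all $2^9$ noncontextual assignments.

```python
import itertools
import numpy as np

# Pauli matrices and the 2x2 identity.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Two-qubit observables of Eq. (4), arranged as in the square of Eq. (1).
SQUARE = [
    [np.kron(Z, I2), np.kron(I2, Z), np.kron(Z, Z)],  # A, B, C
    [np.kron(I2, X), np.kron(X, I2), np.kron(X, X)],  # a, b, c
    [np.kron(Z, X), np.kron(X, Z), np.kron(Y, Y)],    # alpha, beta, gamma
]

# Six contexts as lists of (row, column) positions: three rows, then three columns.
CONTEXTS = ([[(r, c) for c in range(3)] for r in range(3)]
            + [[(r, c) for r in range(3)] for c in range(3)])
# The last column (C, c, gamma) enters Eq. (3) with a minus sign.
SIGNS = [+1, +1, +1, +1, +1, -1]


def contexts_commute():
    """Return True if all observables sharing a row or column commute pairwise."""
    for ctx in CONTEXTS:
        for (r1, c1), (r2, c2) in itertools.combinations(ctx, 2):
            p, q = SQUARE[r1][c1], SQUARE[r2][c2]
            if not np.allclose(p @ q, q @ p):
                return False
    return True


def quantum_pm(state):
    """Value of <PM> from Eq. (3) for a normalized two-qubit state vector."""
    total = 0.0
    for sign, ctx in zip(SIGNS, CONTEXTS):
        product = np.eye(4, dtype=complex)
        for r, c in ctx:
            product = product @ SQUARE[r][c]
        total += sign * np.vdot(state, product @ state).real
    return total


def classical_max():
    """Largest value of the PM expression over all 2**9 noncontextual assignments."""
    best = -6
    for values in itertools.product((+1, -1), repeat=9):
        v = np.array(values).reshape(3, 3)
        terms = [int(np.prod([v[r, c] for r, c in ctx])) for ctx in CONTEXTS]
        best = max(best, sum(s * t for s, t in zip(SIGNS, terms)))
    return best


# A random pure state; the seed is arbitrary.
rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

print(contexts_commute(), classical_max(), round(quantum_pm(psi), 9))
```

Under these assumptions the script should report that all contexts commute, a classical maximum of 4, and a quantum value of 6 for the random state, in line with Eqs. (3) and (4). The exhaustive enumeration is used only because the scenario is tiny; it is not meant as a general method.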

Let us first fix the framework for the theorem. In a $d$-dimensional Hilbert space $\mathcal{H}$ consider $d$ rank-1 projectors $P_1, P_2, \dots, P_d$ associated with $d$ orthogonal vectors in $\mathcal{H}$. They satisfy the following relations:

(O) $P_i P_j = 0$ for any $i \neq j$ (orthogonality).

(C) $\sum_{i=1}^{d} P_i = \mathbb{1}$, or equivalently using (O), $\prod_{i=1}^{d} (\mathbb{1} - P_i) = 0$ (completeness).

Such relations can be interpreted in terms of yes/no questions (or true/false propositions) $Q_1, \dots, Q_d$ as follows:

(O') $Q_i$ and $Q_j$ are exclusive, i.e., they cannot be simultaneously "true" for $i \neq j$,

(C') $Q_1, \dots, Q_d$ cannot be simultaneously "false", one of them has to be "true".

For an arbitrary set of rank-1 projectors in dimension $d$, only certain subsets may obey conditions (O) and (C) and, analogously, for an arbitrary set of propositions and a fixed $d$, certain subsets can be subject to conditions (O') and (C'). A set of $d$ mutually exclusive propositions is a context. We use two different graphical representations for the relations (C'), (O') in a set of propositions, as explained in Fig. 2. In one representation, sets of mutually exclusive propositions are nodes in the same straight or smooth line, see Fig. 2(a) and also Figs. 1, 3, and 4. In the other representation, in Fig. 2(b), edges simply connect exclusive propositions, see also Figs. 5 and 7. Then, the constraints (O'), (C') can be translated into rules for coloring the vertices of the graph with two colors, e.g., green for "true" and red for "false", namely, that any two exclusive nodes cannot both be green, condition (O'), and each set of $d$ mutually exclusive propositions must contain a green node, condition (C'), see also Fig. 5. The problem of finding a coloring with two colors according to the above rules is referred to as the KS colorability problem (Belinfante, 1973).

FIG. 1 The set in the original proof of KS has 117 vectors and 118 contexts. Each node represents a vector. Nodes in the same straight line or circumference represent mutually orthogonal vectors. The red node is orthogonal to all nodes connected to the red edge. Similarly for the green and yellow nodes. A proof of the KS theorem can be obtained as follows. One of the nodes 1, 2 and 11 has to be true. The symmetry of the graph allows us to assume without loss of generality that it is node 1. Then, node 9 must be false, because of the "bug" subgraph (see Fig. 2) between node 1 and node 9. Then, since 2, 9 and 10 are mutually orthogonal and 2 is connected to 1, node 10 must be true. Applying the same argument, 12 must be false and 13 must be true, since (12, 13, 2) form a basis. Repeating it again twice, 14 must be true. However, 1 and 14 cannot be both true. This concludes the proof. This figure improves the representation in Kochen and Specker (1967), where the 117 vectors are represented by 120 nodes, using two nodes each for three of the 117 vectors.

Kochen and Specker provided a physical interpretation of certain rank-1 projectors in $d = 3$ as spin operators for a spin-1 particle. More precisely, the spin operator along one direction, say $S_x$ along the direction $x$ in the Euclidean three-dimensional space, has eigenvalues $-1, 0, +1$; hence, the operator $S_x^2$ is a projector on the two-dimensional space corresponding to the eigenvalues $\pm 1$. Moreover, such spin operators have the property that for each triple of orthogonal directions, say $x, y, z$, the squared operators commute, i.e., $[S_x^2, S_y^2] = [S_x^2, S_z^2] = [S_y^2, S_z^2] = 0$. This way, directions in the Euclidean space correspond to spin measurements and can be identified with directions in the Hilbert space. For a given direction $\vec{v}$, the one-dimensional projector $P_{\vec{v}} = \mathbb{1} - S_{\vec{v}}^2$ can be interpreted as the measurement outcome 0 of a spin measurement in direction $\vec{v}$. For this reason, in the context of the KS theorem one often considers vectors $\vec{v} \in \mathbb{R}^3$ in place of rank-1 projectors $P$ on a three-dimensional Hilbert space. It is then straightforward to show that a joint measurement of the dichotomic observables $P_{\vec{u}}, P_{\vec{v}}, P_{\vec{w}}$ for three orthogonal vectors $\vec{u}$, $\vec{v}$, and $\vec{w}$ can be obtained as a single trichotomic measurement having as effects precisely $\{P_{\vec{u}}, P_{\vec{v}}, P_{\vec{w}}\}$, i.e., a measurement in a given orthogonal basis.

With the usual convention for truth assignments, i.e., 1 for "true" and 0 for "false", the above assignments can be formulated as a map from a (finite) set of vectors $S \subset \mathbb{R}^3$ to $\{0, 1\}$ such that for any orthogonal basis contained in the set $S$ one and only one of the vectors is mapped to the value 1. The set $S$ consists of several bases, possibly with intersection, i.e., one vector may be part of several bases in $S$. Kochen and Specker then proved the following:

Theorem. (Kochen-Specker, 1967) There exists a finite set $S \subset \mathbb{R}^3$ such that there exists no function $f : S \to \{0, 1\}$ satisfying

$$ f(\vec{u}) + f(\vec{v}) + f(\vec{w}) = 1 \tag{6} $$

for all triples $(\vec{u}, \vec{v}, \vec{w})$ of mutually orthogonal vectors in $S$.
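The spin-1 interpretation that precedes the theorem can be checked with a few lines of linear algebra. The sketch below assumes NumPy and the standard spin-1 matrices with $\hbar = 1$ in the basis where $S_z$ is diagonal; the helper names `spin_squared`, `projector`, and `check_triple` are hypothetical and chosen only for this illustration. For one orthogonal triple of directions it tests that the squared spin operators commute and that the operators $P_{\vec{v}} = \mathbb{1} - S_{\vec{v}}^2$ are rank-1 projectors obeying (O) and (C).

```python
import itertools
import numpy as np

# Spin-1 operators (hbar = 1) in the basis where S_z is diagonal.
SX = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / np.sqrt(2)
SY = np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex) / np.sqrt(2)
SZ = np.diag([1, 0, -1]).astype(complex)
ID3 = np.eye(3, dtype=complex)


def spin_squared(direction):
    """S_v^2 for the spin component along the real 3-vector v (normalized here)."""
    v = np.asarray(direction, dtype=float)
    v = v / np.linalg.norm(v)
    s_v = v[0] * SX + v[1] * SY + v[2] * SZ
    return s_v @ s_v


def projector(direction):
    """P_v = 1 - S_v^2, associated with outcome 0 of the spin measurement along v."""
    return ID3 - spin_squared(direction)


def check_triple(u, v, w):
    """Check commutation of the squared spins and conditions (O), (C) for P_u, P_v, P_w."""
    squares = [spin_squared(x) for x in (u, v, w)]
    projs = [projector(x) for x in (u, v, w)]
    commuting = all(np.allclose(a @ b, b @ a)
                    for a, b in itertools.combinations(squares, 2))
    rank_one = all(np.allclose(p @ p, p) and np.isclose(np.trace(p).real, 1.0)
                   for p in projs)
    orthogonal = all(np.allclose(p @ q, 0)
                     for p, q in itertools.combinations(projs, 2))
    complete = bool(np.allclose(sum(projs), ID3))
    return commuting, rank_one, orthogonal, complete


# An orthogonal triple that is not aligned with the coordinate axes.
result = check_triple((1, 1, 0), (0, 0, 1), (1, -1, 0))
print(result)
```

All four checks should come out true for this triple. The particular directions are an arbitrary choice; any three mutually orthogonal real vectors could be passed to `check_triple` instead.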

In general, a KS set $S$ in dimension $d$ is defined as a set of vectors in a $d$-dimensional Hilbert space with the property that there is no map $f : S \to \{0, 1\}$ satisfying $\sum_{|\psi\rangle \in B} f(|\psi\rangle) = 1$ for every subset $B \subset S$ of $d$ orthogonal vectors. Because any such set provides a proof of the KS theorem in dimension $d$, these sets are also called a "proof of the KS theorem." That there is no KS set for $d = 2$ follows from the fact that one can construct explicit noncontextual assignments for all projectors in $\mathbb{C}^2$, see, e.g., Kochen and Specker (1967).

FIG. 2 Different representations of the same orthogonality relations. (a) Vectors are represented by nodes, contexts are represented by straight lines (or, more generally, by smooth lines). Three vectors in the same straight line are mutually orthogonal. This graph is also called a Greechie diagram, see Greechie (1971); (b) Vectors are represented by nodes, orthogonal vectors are connected by edges. In particular, triples of mutually orthogonal vectors form triangles, e.g., nodes 2, 3, 4. An example of vectors realizing such relations is given by: $v_1 = (1, -1, 1)$, $v_2 = (1, 1, 0)$, $v_3 = (0, 0, 1)$, $v_4 = (1, -1, 0)$, $v_5 = (1, 1, -1)$, $v_6 = (1, 0, 1)$, $v_7 = (0, 1, 0)$, $v_8 = (1, 0, -1)$. This graph is called the "bug" (Specker, 1999) and it has the property that, for $d = 3$, an assignment of true to node 1 implies a false for node 5. In fact, if both 1 and 5 are true, then 2, 4, 6, and 8 must be false (they are connected), which implies 3 and 7 are true (they are the remaining nodes of two triples), which gives a contradiction since 3 and 7 are connected. As can be seen from Fig. 1, it is the building block of the original proof of the KS theorem.

The original proof of the KS theorem consists of a set of 117 vectors that realize the graph in Fig. 1, for which it is impossible to assign values "true" and "false" such that no two adjacent nodes are both true, condition (O'), and each set of three mutually exclusive nodes contains a value "true", condition (C'). For any two vectors which are orthogonal, but do not participate in a basis, one can readily add a vector to complete the pair to a basis. This enlarges the 117 vectors to 192 vectors, and those 192 vectors then form the set $S$ in the KS theorem.

The above proof is long and complicated, given the high number of vectors necessary to obtain a contradiction. Some authors worked on the problem and simplified it by finding KS sets with an increasingly smaller number of vectors in different dimensions. For example, in dimension 3 (Alda, 1980; Belinfante, 1973; Bub, 1996; Conway and Kochen, 2000; de Obaldia et al., 1988; Peres, 1991, 1993; Peres and Ron, 1988), in dimension 4 (Cabello et al., 1996a; Kernaghan, 1994; Penrose, 2000; Peres, 1991; Zimba and Penrose, 1993), in dimension 6 (Lisoněk et al., 2014), and in dimension 8 (Kernaghan and Peres, 1995; Toh, 2013a,b). Subsequent works have identified many other examples of KS sets in different dimensions (Aravind and Lee-Elkin, 1998; Arends et al., 2011; Cabello, 1994; Gould and Aravind, 2010; Megill et al., 2011; Pavičić, 2006; Pavičić et al., 2011, 2005; Ruuge, 2012; Waegell and Aravind, 2010, 2011a,b, 2012, 2013b, 2015, 2017; Waegell et al., 2011). The method used in Kochen and Specker (1967) can be extended to construct KS sets in any dimension $d > 3$ (Cabello and García-Alcaine, 1996). Other methods for obtaining KS sets in $d > 3$ have been proposed in Cabello et al. (2005); Matsuno (2007); Pavičić et al. (2005); Ruuge (2007); Zimba and Penrose (1993). KS sets with a continuum of vectors in dimension 3 have been presented in Galindo (1975); Gill and Keane (1996). A way to further reduce the number of vectors has been discussed in Cabello et al. (1996b).

The smallest KS set in terms of vectors is the 18-vector (9-context) set in dimension 4 introduced in Cabello et al. (1996a) and shown in Fig. 3. A proof of the minimality is presented in Xu et al. (2020a). The impossibility of an assignment satisfying conditions (O), (C) is proven by a parity argument: since there are 9 contexts, and each context must contain exactly one "true", the value "true" must occur exactly 9 times when counted context by context. However, this is not possible, since each vector appears in two contexts and any noncontextual assignment therefore yields an even count. The smallest KS set known in terms of contexts is the (21-vector) 7-context set in dimension 6 introduced in Lisoněk et al. (2014) and shown in Fig. 4. This KS set has been proven to be the one with the smallest number of contexts allowing for a symmetric parity proof of the KS theorem (Lisoněk et al., 2014).

The first step in the proof in Kochen and Specker (1967) consists of identifying a set of 8 vectors whose relations of orthogonality are represented by the 8-node graph in Fig. 5. Specker called this graph the "bug" (Specker, 1999). It has the peculiarity that, whenever A is true, B must be false. This is at the basis of several KS-type contradictions, such as the ones by Stairs (1983) and Clifton (1993). They provide a realization of the bug as orthogonality relations of a set of rank-1 projectors, $P_A, \dots, P_B$, such that $P_A = |\psi\rangle\langle\psi|$, with $\langle\psi|P_A|\psi\rangle = 1$ and $\langle\psi|P_B|\psi\rangle > 0$, in contradiction with the KS assignment rules (see Fig. 2). Interestingly, the bug is the simplest example (Cabello et al., 2018b) of other "true-implies-false" structures, see Cabello and García-Alcaine (1995). Hardy's proof (Hardy, 1993) can be re-cast as a true-implies-false proof (Cabello et al., 1996a) in which the initial truth corresponds to being in a particular entangled state. Similarly, one can construct proofs in which the initial and final propositions are product states (Cabello, 1997).

B. Generalized Kochen-Specker-type arguments

A different approach to the Kochen-Specker contradiction has been undertaken using other types of algebraic relations instead of (O) and (C). Important examples are the Peres-Mermin magic square (Mermin, 1990b, 1993; Peres, 1990), the Mermin magic pentagram, and the scenario of Yu and Oh (Bengtsson et al., 2012; Kleinmann et al., 2012; Yu and Oh, 2012).

FIG. 3 Graphical representation of the 18-vector KS set by Cabello et al. (1996a). Each node represents a vector. For simplicity’s sake, vectors are unnormalized. Each smooth line, i.e., every straight line or ellipse, represents a context. Vectors in each context are mutually orthogonal. Each vector appears in exactly two contexts, e.g., v12 appears both in context 1 and 2, and so on. As a consequence, by assigning a noncontextual “true” to some vectors, one obtains an even number of “true”, whereas one should get exactly nine “true”, one for each context. Therefore, the set is not KS colorable. Notice that there are additional relations of orthogonality not shown in the graph and not used to prove the contradiction. Vectors are also listed in the table, where each row represents a context. For simplicity, we denote −1 with $\bar{1}$. [Figure not reproduced. Node labels: (1,0,0,0), (0,1,0,0), (0,0,0,1), (0,1,1,0), (0,1,-1,0), (0,0,1,1), (0,0,1,-1), (1,0,0,1), (1,-1,0,0), (1,0,1,0), (1,0,-1,0), (0,1,0,-1), (1,1,1,1), (1,1,1,-1), (1,1,-1,1), (1,1,-1,-1), (1,-1,1,-1), (-1,1,1,1).]

FIG. 4 Orthogonality relations between the vectors of the 21-vector, 7-context KS set in Lisoněk et al. (2014). Vectors are represented by nodes and contexts by straight lines. 1010ab denotes the vector $(1, 0, 1, 0, a, b)$, where $a = e^{2\pi i/3}$ and $b = a^2$. For simplicity, normalization factors are omitted. Contexts contain mutually orthogonal vectors. The set has 21 vectors and each vector is in two contexts. To map one and only one of the vectors in each context to 1 and preserve the latter property, one would need to associate 1 with 7/2 vectors (seven contexts, each vector counted twice), which is not an integer. This makes the mapping impossible and proves that the set is a KS set. [Figure not reproduced. Node labels: 100000, 010000, 001000, 000100, 000010, 000001, 111100, 001111, 110011, 10ba01, 0110ba, ba0110, 01ab01, 10ab10, ab1010, 01ba10, 0101ab, ba1001, 1001ba, ab0101, 1010ab.]

FIG. 5 Subgraph of the Yu-Oh graph given by the vertices { A, B, 1, 2, 4, 5, 7, 8 }, corresponding to a basic block of the original KS graph (the “bug”), with a valid coloring, i.e., green=true, red=false. As discussed in Sec. III.A (Fig. 2), A and B must be exclusive events. [Figure not reproduced.]
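The parity argument in the caption of Fig. 3 can be checked mechanically. The Python sketch below is an illustration only, not code from the works cited: the nine contexts are one common presentation of the 18-vector set (an assumption, since the table of Fig. 3 is not reproduced here and may use different but equivalent representatives), and the names `CONTEXTS` and `count_ks_colorings` are hypothetical. It verifies orthogonality within each context, checks that every vector lies in exactly two contexts, and then searches all $2^{18}$ assignments for one with exactly one “true” per context.

```python
from itertools import combinations, product

# Assumed presentation of the 18-vector, 9-context KS set in dimension 4.
CONTEXTS = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]


def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))


vectors = sorted({v for ctx in CONTEXTS for v in ctx})

# Every context must be an orthogonal basis.
all_orthogonal = all(
    dot(u, v) == 0 for ctx in CONTEXTS for u, v in combinations(ctx, 2)
)

# Number of contexts in which each vector appears (expected: always 2).
appearances = {v: sum(v in ctx for ctx in CONTEXTS) for v in vectors}


def count_ks_colorings():
    """Count 0/1 assignments with exactly one 1 in every context."""
    index = {v: i for i, v in enumerate(vectors)}
    ctx_idx = [[index[v] for v in ctx] for ctx in CONTEXTS]
    count = 0
    for bits in product((0, 1), repeat=len(vectors)):
        if all(sum(bits[i] for i in idx) == 1 for idx in ctx_idx):
            count += 1
    return count


n_colorings = count_ks_colorings()
print(len(vectors), all_orthogonal, set(appearances.values()), n_colorings)
```

The exhaustive search is redundant once the double-counting property is established, because nine contexts would require an odd total of “true” slots while each “true” vector fills two of them. It is kept here as an independent check of the parity argument.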

the Peres-Mermin magic square (Mermin, 1990b, 1993; Peres, 1990), the Mermin magic pentagram, and the scenario of Yu and Oh (Bengtsson et al., 2012; Kleinmann et al., 2012; Yu and Oh, 2012).

1. The magic square and the magic pentagram

The Peres-Mermin “magic” square (Mermin, 1990b, 1993; Peres, 1990), already introduced in Sec. II, is a proof of the Kochen-Specker theorem even though it does not explicitly use a KS set of vectors. The difference resides in the fact that, instead of imposing (O) and (C) relations on rank-1 projectors, one imposes analogous algebraic relations on ±1 observables. Consider, again, the square of observables

$$ \begin{bmatrix} A & B & C \\ a & b & c \\ \alpha & \beta & \gamma \end{bmatrix} = \begin{bmatrix} \sigma_z \otimes \mathbb{1} & \mathbb{1} \otimes \sigma_z & \sigma_z \otimes \sigma_z \\ \mathbb{1} \otimes \sigma_x & \sigma_x \otimes \mathbb{1} & \sigma_x \otimes \sigma_x \\ \sigma_z \otimes \sigma_x & \sigma_x \otimes \sigma_z & \sigma_y \otimes \sigma_y \end{bmatrix}. \tag{7} $$

Each row and column contains a set of commuting observables. In addition, the product of the observables along each row and column is $+\mathbb{1}$, with the exception of the last column, where it is $-\mathbb{1}$. The logical relations (O’) and (C’) are here substituted by the algebraic relations

$$ v(A)v(B)v(C) = v(a)v(b)v(c) = \ldots = -v(C)v(c)v(\gamma) = +1, \tag{8} $$

where $v(A)$ denotes the value ±1 assigned to the measurement $A$, etc. It is then clear that Eq. (8) can never be satisfied, since it would imply

$$ [v(A)v(B)v(C)]\,[v(a)v(b)v(c)] \ldots [v(C)v(c)v(\gamma)] = 1 \times 1 \times \ldots \times (-1) = -1. \tag{9} $$

But, on the other hand, similarly to Eq. (5),

$$ v(A)v(B)v(C)\,v(a)v(b)v(c) \ldots v(C)v(c)v(\gamma) = v(A)^2 v(B)^2 v(C)^2 \ldots v(\gamma)^2 = 1, \tag{10} $$

which gives a contradiction. The magic square can be converted into a standard proof of the KS theorem with vectors (Peres, 1991).

There is a similar compact proof of the KS theorem with Pauli operators for three qubits found by Mermin (1990b, 1993). It is based on ten observables which can be arranged as shown in Fig. 6 (b), a construction that is sometimes called the magic pentagram.

The Peres-Mermin magic square and Mermin’s magic pentagram have the minimum number of Pauli observables required for proving the KS theorem for two and three qubits, respectively. There are several similar proofs of the KS theorem with Pauli observables for more than three qubits (Planat, 2012, 2013; Saniga and Planat, 2012; Waegell, 2014; Waegell and Aravind, 2013a,b). Interestingly, a result by Arkhipov (2012) shows that all critical (i.e., the contradiction disappears by removing one observable) parity proofs (i.e., based on a parity argument, as described before, see Sec. III.A) of the KS theorem for more than three qubits with Pauli observables, where each observable is in exactly two contexts, can be reduced to the magic square or the magic pentagram.

2. Yu and Oh’s set

Yu and Oh’s argument (Yu and Oh, 2012) does not provide a proof of the Kochen-Specker theorem. However, it fits in the more general framework of state-independent contextuality (SIC), namely, it is a contextuality argument that does not depend on the choice of a particular quantum state, but rather on the properties of the observables alone. Similarly to KS proofs, Yu and Oh’s argument is based on a set of rank-1 projectors. However, contrary to KS proofs, the corresponding vectors admit a value assignment consistent with the conditions (O’) and (C’). The contextuality argument arises from the fact that every probability distribution consistent with those assignments, i.e., coming from a convex mixture of them, is in contradiction with the probabilities that can be obtained using the 13 projectors, for all quantum states.

The basic elements are 13 vectors in $\mathbb{C}^3$, listed in Fig. 7, and the corresponding set of projectors $|v\rangle\langle v|$. The orthogonality relations of such vectors are depicted in the graph in Fig. 7. The projectors associated with nodes A, B, C, D sum up to a multiple of the identity, namely,

$$ |v_A\rangle\langle v_A| + |v_B\rangle\langle v_B| + |v_C\rangle\langle v_C| + |v_D\rangle\langle v_D| = \frac{4}{3}\,\mathbb{1}, \tag{11} $$

thus, for any quantum state the sum of their probabilities is $\frac{4}{3} > 1$. On the other hand, the orthogonality relations among the vectors $\{v_i\}$, which correspond to exclusivity of the respective propositions, imply that the propositions associated with nodes A, B, C, D are also exclusive, i.e., they cannot be simultaneously “true”. This exclusivity implies that the sum of probabilities Prob(A) + Prob(B) + Prob(C) + Prob(D) ≤ 1 in any noncontextual hidden variable model.

This can be easily proven by identifying subgraphs containing two of the vertices { A, B, C, D } and two of the triangles { 1, 4, 7 }, { 2, 5, 8 }, { 3, 6, 9 } as the basic blocks of the original KS proof, i.e., the bug, depicted in Fig. 2; for instance, the subgraph { A, B, 1, 2, 4, 5, 7, 8 }, depicted in Fig. 5, implies that A and B are exclusive. By symmetry the same argument applies to any two vertices in { A, B, C, D }.

To summarize, even if the nodes A, B, C, D are not connected in the graph in Fig. 7, their relations with other compatible elements imply that such elements correspond to exclusive propositions and thus the sum of their probabilities is bounded by one, whereas in QM such a bound can be violated. We will see in Sec. IV.A.3 how one can demonstrate this contradiction via a state-independent violation of a noncontextuality inequality.

It has been proven that the Yu-Oh set is the SIC set with the smallest number of vectors in any dimension (Cabello et al., 2016b).

C. As a special instance of Gleason’s theorem

In the following, we outline the connection between Gleason’s theorem (Gleason, 1957) and the Kochen-Specker theorem. It is helpful to recall the result underlying Gleason’s theorem.

Theorem. (Gleason, 1957) Let $f: S^2 \to \mathbb{R}$ be a nonnegative function on the real sphere $S^2 \subset \mathbb{R}^3$, such that all orthonormal bases $(\vec{u}, \vec{v}, \vec{w})$ in $S^2$ obey

$$ f(\vec{u}) + f(\vec{v}) + f(\vec{w}) = 1. \tag{12} $$

Then, there exists a positive semidefinite matrix $R$ with $\mathrm{tr}(R) = 1$, such that

$$ f(\vec{v}) = \vec{v}^{\top} R\, \vec{v}. \tag{13} $$

It is easy to see that an assignment of values 0, 1 according to (O’) to all the orthonormal bases in $\mathbb{C}^3$ satisfies the assumptions of the theorem, hence it must be given by a density matrix, according to Eq. (13). On the other hand, there is no density matrix providing 0, 1 assignments to all orthonormal bases in $\mathbb{R}^3$, hence, the contradiction (Kochen and Specker, 1967). In contrast, Kochen and Specker obtain a contradiction using only a finite set of vectors.

A similar argument, connecting Gleason’s theorem to the impossibility of a noncontextual hidden variable assignment, has been provided by Bell (1966). He showed that, given a function $f$ satisfying Eq. (12), two vectors $\vec{v}$ and $\vec{w}$ such that $f(\vec{v}) = 1$ and $f(\vec{w}) = 0$ cannot be arbitrarily close. This, in turn, is in contradiction with the possibility of assigning 0, 1 values to all orthonormal bases while obeying the rules in Eq. (12), since there would be arbitrarily close pairs with different assignments. Such a minimal angle between vectors has been quantified as $\tan^{-1}(\frac{1}{2}) \approx 0.464$ and the argument has been further refined by Mermin (1993), who noticed that the above reasoning can be easily extended to an argument that uses only a finite set of vectors, as the original KS argument.

IV. CONTEXTUALITY AS A PROPERTY OF NATURE

The Kochen-Specker theorem, originally presented as a logical impossibility proof, did not involve any statistical argument but was rather based on perfect assignments of 0 (false) or 1 (true) to a set of quantum propositions. This caused a debate on the role of finite precision measurements (see Sec. V.E) that also stimulated the development of statistical versions of the Kochen-Specker contradiction. The results of this effort were noncontextuality inequalities, which, under certain assumptions, are able to experimentally detect the phenomenon of quantum contextuality. In the following, we introduce the basic notions and open problems associated with noncontextuality inequalities and contextuality tests. We present the definition of noncontextual hidden variable theories and noncontextuality inequalities in Sec. IV.A, the operational definition of contexts in Sec. IV.B, the problem of noise and imperfections in Sec. IV.C, and finally experimental tests of contextuality in Sec. IV.D. In Sec. IV.E, we review a different notion of contextuality introduced by Spekkens (2005).

A. Noncontextuality inequalities

Noncontextuality inequalities provide bounds obeyed by noncontextual hidden variable theories, in analogy to Bell inequalities that provide bounds for local hidden variable models (Bell, 1964; Brunner et al., 2014). The first proposals of Kochen-Specker-type inequalities were made by Larsson (2002) and Simon et al. (2001), but these require stronger assumptions than later noncontextuality inequalities (Cabello, 2008; Klyachko et al., 2008), see below.

We start by introducing the mathematical formulation of noncontextual hidden variable theories. Then, we discuss basic examples of noncontextuality inequalities such as the Klyachko-Can-Binicioğlu-Shumovsky (KCBS) inequality (Klyachko et al., 2008) and the Yu-Oh inequality (Yu and Oh, 2012), exhibiting, respectively, state-dependent and state-independent quantum violations. Finally, we compare these constructions with other related approaches to noncontextual hidden variable theories.

1. Mathematical structure of noncontextual hidden variable models

Different definitions of noncontextual hidden variable (NCHV) models are present in the literature, which, despite their substantial equivalence, use different terminology and different mathematical structures, from the marginal problem definition in the work of KCBS (Chaves and Fritz, 2012; Fritz and Chaves, 2013; Klyachko et al., 2008), the (equivalent) definition of the noncontextuality polytope (Kleinmann et al., 2012), to the sheaf-theoretic approach (Abramsky and Brandenburger, 2011), the graph-theoretic approach by Cabello et al. (2014), and the hypergraph-theoretic approach by Acín et al. (2015) (see also the book by Amaral and Terra Cunha, 2018). Here, we adopt what we consider a minimal mathematical structure based on the noncontextuality polytope and the marginal problem characterization of NCHV. Further properties of NCHV models related to graphs and hypergraphs are discussed in Sec. V.

FIG. 6 (a) Peres-Mermin magic square and (b) Mermin’s magic pentagram. Each dot represents an observable with possible outcomes −1 or 1. Each line contains mutually compatible observables. For each line, the product of the corresponding observables is the identity, except for the bold lines, where it is minus the identity. A possible choice of observables satisfying the conditions in (a) is given in Eq. (7). A possible choice of observables satisfying the conditions in (b) is the following: $O_1 = \sigma_z^{(1)} \otimes \sigma_x^{(2)} \otimes \sigma_x^{(3)}$, $O_2 = \sigma_x^{(1)} \otimes \sigma_z^{(2)} \otimes \sigma_x^{(3)}$, $O_3 = \sigma_x^{(1)} \otimes \sigma_x^{(2)} \otimes \sigma_z^{(3)}$, $O_4 = \sigma_z^{(1)} \otimes \sigma_z^{(2)} \otimes \sigma_z^{(3)}$, $O_{12} = \sigma_x^{(3)}$, $O_{13} = \sigma_x^{(2)}$, $O_{14} = \sigma_z^{(1)}$, $O_{24} = \sigma_z^{(2)}$, $O_{34} = \sigma_z^{(3)}$. [Figure not reproduced.]

FIG. 7 Graph of orthogonality between the vectors of the Yu and Oh set (Yu and Oh, 2012). Adjacent nodes represent orthogonal vectors. For simplicity’s sake, vectors are unnormalized. See the text for further details. [Figure not reproduced. Node labels: $v_1 = (1,0,0)$, $v_2 = (0,1,0)$, $v_3 = (0,0,1)$, $v_4 = (0,1,-1)$, $v_5 = (1,0,-1)$, $v_6 = (1,-1,0)$, $v_7 = (0,1,1)$, $v_8 = (1,0,1)$, $v_9 = (1,1,0)$, $v_A = (-1,1,1)$, $v_B = (1,-1,1)$, $v_C = (1,1,-1)$, $v_D = (1,1,1)$.]
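The two halves of the Yu-Oh argument of Sec. III.B.2, namely Eq. (11) and the exclusivity of the propositions A, B, C, D, can be checked with a short Python sketch. This is an illustration under stated assumptions, not code from the original paper: the vectors are the 13 Yu-Oh vectors with the labels of Fig. 7, taking $v_1 = (1,0,0)$ as required by its orthogonality to $v_4$ and $v_7$, and the names `V`, `projector_sum`, and `max_h_true` are hypothetical. Condition (C’) is imposed on the complete orthogonal bases found among the 13 vectors.

```python
from fractions import Fraction
from itertools import combinations, product

# Yu-Oh vectors (unnormalized), labelled as in Fig. 7; v1 = (1,0,0) is assumed.
V = {
    "1": (1, 0, 0), "2": (0, 1, 0), "3": (0, 0, 1),
    "4": (0, 1, -1), "5": (1, 0, -1), "6": (1, -1, 0),
    "7": (0, 1, 1), "8": (1, 0, 1), "9": (1, 1, 0),
    "A": (-1, 1, 1), "B": (1, -1, 1), "C": (1, 1, -1), "D": (1, 1, 1),
}
labels = list(V)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Orthogonality graph: an edge joins two orthogonal vectors.
edges = [(p, q) for p, q in combinations(labels, 2) if dot(V[p], V[q]) == 0]

# Complete orthogonal bases (triangles of the graph), used for condition (C').
triangles = [
    t for t in combinations(labels, 3)
    if all(dot(V[p], V[q]) == 0 for p, q in combinations(t, 2))
]


def projector_sum(names):
    """Exact sum of the normalized projectors |v><v| / <v|v>."""
    total = [[Fraction(0)] * 3 for _ in range(3)]
    for name in names:
        v = V[name]
        norm2 = dot(v, v)
        for r in range(3):
            for c in range(3):
                total[r][c] += Fraction(v[r] * v[c], norm2)
    return total


h_sum = projector_sum("ABCD")  # left-hand side of Eq. (11)


def max_h_true():
    """Scan all 0/1 assignments obeying (O') and (C').

    Returns the largest number of nodes among A, B, C, D that are
    simultaneously 'true', and the number of valid assignments.
    """
    best, n_valid = 0, 0
    for bits in product((0, 1), repeat=len(labels)):
        a = dict(zip(labels, bits))
        if any(a[p] and a[q] for p, q in edges):
            continue  # violates (O')
        if any(sum(a[n] for n in t) != 1 for t in triangles):
            continue  # violates (C')
        n_valid += 1
        best = max(best, sum(a[n] for n in "ABCD"))
    return best, n_valid


best_h, n_valid = max_h_true()
print(len(edges), len(triangles), h_sum[0][0], best_h, n_valid > 0)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison with $\frac{4}{3}\mathbb{1}$ does not depend on floating-point tolerances.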

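The algebraic contradiction of the magic square, Eqs. (7)-(10), can also be verified numerically. The sketch below is a hedged illustration in plain Python (no external libraries); the helper names `kron`, `matmul`, `sign_of`, and `max_satisfied` are hypothetical. It first computes the operator products along rows and columns of Eq. (7), then tries all $2^9$ assignments of ±1 values and records how many of the six product constraints can be met at once.

```python
from itertools import product

# Pauli matrices and the 2x2 identity as nested lists.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sign_of(m):
    """Return +1 or -1 if m equals +identity or -identity, else None."""
    n = len(m)
    for s in (1, -1):
        if all(abs(m[i][j] - (s if i == j else 0)) < 1e-12
               for i in range(n) for j in range(n)):
            return s
    return None


# Rows of Eq. (7): (A, B, C), (a, b, c), (alpha, beta, gamma).
SQUARE = [
    [kron(Z, I2), kron(I2, Z), kron(Z, Z)],
    [kron(I2, X), kron(X, I2), kron(X, X)],
    [kron(Z, X), kron(X, Z), kron(Y, Y)],
]


def triple_product(ops):
    return matmul(matmul(ops[0], ops[1]), ops[2])


row_signs = [sign_of(triple_product(row)) for row in SQUARE]
col_signs = [sign_of(triple_product([SQUARE[r][c] for r in range(3)]))
             for c in range(3)]


def max_satisfied():
    """Largest number of the six constraints met by a +-1 assignment."""
    best = 0
    for vals in product((1, -1), repeat=9):
        g = [vals[0:3], vals[3:6], vals[6:9]]
        ok = sum(g[r][0] * g[r][1] * g[r][2] == row_signs[r] for r in range(3))
        ok += sum(g[0][c] * g[1][c] * g[2][c] == col_signs[c] for c in range(3))
        best = max(best, ok)
    return best


best = max_satisfied()
print(row_signs, col_signs, best)
```

A noncontextual assignment can satisfy at most five of the six constraints, which is the content of the contradiction between Eqs. (9) and (10).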
Given a set of observables $G = \{A_1, \ldots, A_n\}$, a collection of contexts is a subset $M$ of the power set of $G$, i.e., $M \subset 2^G$; $M$ is sometimes called the marginal scenario (Chaves and Fritz, 2012). The idea behind this name is that the observed data from measurements in each context arises as a marginal of a global probability distribution on all observables. In an NCHV theory, we assume the existence of a hidden variable which determines the outcomes of each observable independently of the context. For the observed probabilities, e.g., for a context given by the observables $\{A_i\}_{i \in C}$, $C \subset \{1, \ldots, n\}$ and the outcomes $\{a_i\}_{i \in C}$ this corresponds to

$$ p(\{a_i\}_{i \in C}) = \sum_{\lambda} p(\lambda) \prod_{i \in C} p(a_i|\lambda), \tag{14} $$

with $p(\lambda) \geq 0$, $\sum_{\lambda} p(\lambda) = 1$, $p(a_i|\lambda) \geq 0$, $\sum_{a_i} p(a_i|\lambda) = 1$, for $i \in C$. Notice that the outcomes $a_s$ are, at this level, arbitrary. In most cases, we consider either $a_s \in \{0, 1\}$ or $a_s \in \{-1, 1\}$.

Equation (14) implies that each outcome only depends on the hidden variable $\lambda$ and not on the specific context in which the observable is measured. It can be easily proven that the response functions $p(a_s|\lambda)$ that are not deterministic (i.e., $\neq 0, 1$) can always be transformed into deterministic functions of a new hidden variable $\lambda'$. By further developing this intuition, an argument known in the literature as Fine’s theorem (Fine, 1982a) (see also Abramsky and Brandenburger, 2011) shows that Eq. (14) is equivalent to the existence of a global probability distribution over all observables $A_1, \ldots, A_n$, such that $p(\{a_i\}_{i \in C})$ is obtained by summing over all possible outcomes for the other observables, namely,

$$ p(\{a_i\}_{i \in C}) = \sum_{a_s:\, s \notin C} p(a_1, \ldots, a_n). \tag{15} $$

Starting from the above definition, noncontextuality inequalities can be derived with the same methods used for Bell inequalities, namely with the correlation polytope method (Pitowsky, 1989). Similarly to Bell inequalities, noncontextuality inequalities are satisfied by NCHV theories, and their violation by data collected in a quantum experiment demonstrates quantum contextuality. We discuss general methods to derive them in Sec. V.A.1 and Sec. V.A.2.

2. State-dependent contextuality

We can now proceed to discuss noncontextuality inequalities in the state-dependent scenario. The minimal dimension to witness contextuality is $d = 3$ and the KCBS scenario (Klyachko et al., 2008) is the simplest scenario where qutrits produce contextuality. The scenario is defined by five measurements $A_0, \ldots, A_4$, with outcomes $a_i \in \{-1, 1\}$ such that $A_i, A_{i+1}$, with sum modulo 5, are compatible, i.e., with a marginal scenario $M = \{(A_i, A_{i+1})\}_{i=0}^{4}$; see Fig. 8. KCBS proposed the following inequality valid for NCHV theories

$$ \langle A_0 A_1 \rangle + \langle A_1 A_2 \rangle + \langle A_2 A_3 \rangle + \langle A_3 A_4 \rangle + \langle A_4 A_0 \rangle \geq -3, \tag{16} $$

where $\langle A_i A_j \rangle := \sum_{a_i, a_j} a_i a_j\, p(a_i, a_j)$. According to the discussion in the previous section and by convexity arguments, the noncontextual bound −3 can be proven by trying all possible ±1 noncontextual assignments to the observables $A_i$. All other noncontextuality inequalities for this scenario can be obtained by relabeling the outcomes, i.e., by mapping $A_i \mapsto -A_i$. For instance, with the transformation $A_1 \mapsto -A_1$ and $A_3 \mapsto -A_3$, we obtain the inequality

$$ \langle A_0 A_1 \rangle + \langle A_1 A_2 \rangle + \langle A_2 A_3 \rangle + \langle A_3 A_4 \rangle - \langle A_4 A_0 \rangle \leq 3. \tag{17} $$

In contrast to Bell inequalities, here there is no bipartition of the set of observables such that every observable in one part is compatible with every observable of the other. Consequently, Eq. (16) cannot be interpreted as a Bell inequality: The measurements must be performed on a single system.

On a three-level system Eq. (16) can be violated up to $5 - 4\sqrt{5} \approx -3.94$ with the state $|\psi\rangle = (1, 0, 0)$ and measurement settings $A_j = 2|v_j\rangle\langle v_j| - \mathbb{1}$ and $|v_j\rangle = (\cos\theta, \sin\theta \cos[4\pi j/5], \sin\theta \sin[4\pi j/5])$ with $\cos^2\theta = \cos(\pi/5)/[1 + \cos(\pi/5)]$, see Fig. 8. One can straightforwardly verify that $\langle v_i | v_{i+1} \rangle = 0$ and, thus, $[A_i, A_{i+1}] = 0$, where the sum is mod 5. The KCBS inequality has been violated in several experiments, see Sec. IV.D for more details.

The KCBS inequality, together with the inequality by Clauser, Horne, Shimony, and Holt (CHSH 1969), are part of a general family of noncontextuality inequalities associated with compatibility structures forming an n-cycle, for n = 5 and 4 respectively. As a result of Vorob’ev’s theorem (Vorob’ev, 1962), see also the discussion in Sec. V.B.2, cycles of length strictly greater than 3 are necessary to witness contextuality. Other forms of cycle inequalities have been investigated in Bengtsson (2009) for n = 7 and in Cabello et al. (2010) for arbitrary n. The general form of the KCBS-like inequality for odd cycles of length ≥ 5 was introduced in Liang et al. (2011) and in Cabello et al. (2010). The n-cycle inequalities were proven to be tight (see Sec. V.A for a formal definition) for any n-cycle contextuality scenario for n ≥ 5 in Araújo et al. (2013). The case n = 4 (CHSH) was already proven by Fine (1982a). More generally, for even n, with n > 4, the NC inequalities can also be interpreted as Bell inequalities, the chained inequalities (Braunstein and Caves, 1990), but they are not tight in the Bell scenario.

3. State-independent contextuality

State-independent contextuality is directly related to proofs of the KS theorem. In fact, Badziąg et al. (2009) proved that each KS set can be converted into an inequality showing SIC, and Yu et al. (2015) developed a similar method for proofs of the KS theorem based on more general algebraic conditions, as is the case for the PM square.

Yu and Oh (2012) proved a stronger statement, which was already partially discussed above. Yu and Oh provided a set of projectors admitting a {0, 1}-assignment, according to the constraints (O’), (C’), but that nevertheless demonstrates SIC. From the set of vectors $\{|v_i\rangle\}_i$ listed in Fig. 7, one constructs the observables $A_i \equiv \mathbb{1} - 2|v_i\rangle\langle v_i|$. The compatibility relations among the observables $A_i$ follow from the orthogonality relations of the corresponding vectors, and are again summarized in the graph in Fig. 7. Each pair of observables $A_i A_j$ such that $(i, j)$ is an edge of the graph is compatible. One can thus write the following NC inequality

$$ \sum_i \langle A_i \rangle - \frac{1}{2} \sum_{\mathrm{edges}} \langle A_i A_j \rangle \leq 8, \tag{18} $$

where the NCHV bound 8 is simply computed by trying all possible $2^{13}$ noncontextual value assignments for $\{A_i\}_i$. However, using the vectors in Fig. 7, one can easily compute the quantum value for the operator

$$ L = \sum_i A_i - \frac{1}{2} \sum_{\mathrm{edges}} A_i A_j = \frac{25}{3}\,\mathbb{1}, \tag{19} $$

giving

$$ \langle L \rangle_\rho = \frac{25}{3} > 8, \tag{20} $$

for any quantum state $\rho$. The inequality associated with the Yu-Oh set has been further improved to maximize the gap between NCHV and quantum values (Kleinmann et al., 2012).

Subsequently, several other SIC sets that are not KS proofs have been proposed (Bengtsson et al., 2012; Xu et al., 2015), and a systematic construction of state-independent contextuality inequalities based on antidistinguishable sets of vectors was proposed by Leifer and Duarte (2020). Methods to identify such SIC sets have also been proposed (Cabello et al., 2015; Kleinmann et al., 2012; Ramanathan and Horodecki, 2014); in particular, graph-theoretical methods allow for an exhaustive search of SIC sets (Cabello et al., 2015, 2016b; Ramanathan and Horodecki, 2014), as discussed in more detail in Sec. V.B.

4. Other approaches to noncontextual hidden variable models

The definitions discussed above represent the minimal requirements for a theory where outcomes have a context-independent assignment. Closer to the original formulation of the KS theorem, one can add exclusivity, arising from the condition (O’), or completeness, arising from the condition (C’) in Sec. III.A. Noncontextuality inequalities derived by using both these assumptions are called Kochen-Specker inequalities in Larsson (2002), and a similar argument was presented in Simon et al. (2001). More recent results do not use these extra conditions, but they do appear, often in parallel with a derivation of general NC inequalities, like in the KCBS (Klyachko et al., 2008) and Yu and Oh (2012) papers, and in several other subsequent works. It is thus worth mentioning such approaches to contextuality and emphasizing the difference and connection with the above one.

A typical example is the following. Given a set of rank-1 projectors $\{|v_i\rangle\langle v_i|\}_i$ in dimension $d$, their compatibility relations correspond to orthogonality relations, i.e., $|v_i\rangle\langle v_i|$ is compatible with $|v_j\rangle\langle v_j|$ if and only if $|\langle v_i | v_j \rangle| \in \{0, 1\}$. If we denote with $a_i = 1, 0$ the classical value associated with the POVM $\{|v_i\rangle\langle v_i|, \mathbb{1} - |v_i\rangle\langle v_i|\}$ in the NCHV model, then the assumptions of exclusivity and completeness correspond, respectively, to $p(a_i = a_j = 1|\lambda) = 0$ whenever $\langle v_i | v_j \rangle = 0$, and to $\sum_{i \in I} p(a_i = 1|\lambda) = 1$ whenever $\langle v_i | v_j \rangle = 0$ for all $i, j \in I$, $i \neq j$ and $\sum_{i \in I} |v_i\rangle\langle v_i| = \mathbb{1}$.

It is clear that any inequality valid for an NCHV model is also valid for an NCHV model with the above extra assumptions. Conversely, an inequality valid for an NCHV model with such extra assumptions can be transformed into an inequality valid for NCHV models without the extra assumptions by adding extra terms, as we briefly discuss in the following for the case of the extra assumption of exclusivity. A similar argument can be constructed for the case in which the completeness condition (C’) is also assumed.

By assuming exclusivity, we obtain the inequality

$$ \sum_{i \in I} \mu_i\, p(a_i = 1) \overset{\mathrm{NCHV+E}}{\leq} \Omega, \tag{21} $$

with the superscript indicating that it holds when the extra exclusivity assumption is made. We can, then, add on the l.h.s. pairwise correlation terms $-\mu_{ij}\, p(a_i = 1, a_j = 1)$, with an appropriately chosen weight $\mu_{ij} \geq 0$, such that the total value of the expression decreases when the exclusivity condition is violated. For suitably chosen weights we have

$$ \sum_{i \in I} \mu_i\, p(a_i = 1) - \sum_{(i,j) \in I'} \mu_{ij}\, p(a_i = 1, a_j = 1) \overset{\mathrm{NCHV}}{\leq} \Omega, \tag{22} $$

making Eq. (22) valid also for general NCHV theories. It is straightforward to use this conversion in the KCBS inequality (Klyachko et al., 2008)

$$ \sum_{i=0}^{4} p(a_i = 1) \overset{\mathrm{NCHV+E}}{\leq} 2, \tag{23} $$

transforming it into an inequality valid for general NCHV theories, namely

$$ \sum_{i=0}^{4} p(a_i = 1) - \sum_{i=0}^{4} p(a_i = 1, a_{i+1} = 1) = \sum_{i=0}^{4} p(a_i = 1, a_{i+1} = -1) \overset{\mathrm{NCHV}}{\leq} 2, \tag{24} $$

where the sum in $a_{i+1}$ is modulo 5 and we simplified the expression by using the marginal condition in Eq. (15). Not only does this transformation not change the classical bound, but it also does not modify the quantum value obtained from the quantum observables $A_j = 2|v_j\rangle\langle v_j| - \mathbb{1}$ discussed in the previous section, since they satisfy by construction $\langle v_i | v_{i+1} \rangle = 0$, giving $p(a_i = 1, a_{i+1} = 1) = 0$. This idea has been exploited in several works (Asadian et al., 2015; Cabello et al., 2015; Yu and Oh, 2012), and a completely general treatment of the problem has been presented in Cabello (2016); Yu and Tong (2014). Experimental tests of contextuality are very challenging, so it is very useful to be able to reduce the set of assumptions to an absolute minimum when designing such tests.

Finally, a construction of further noncontextuality inequalities that use extra exclusivity assumptions can be obtained within the graph-theoretic approach to quantum contextuality, see Sec. V.B.

B. Operational definitions and physical assumptions: ideal measurements

In the previous sections, an abstract notion of a context was enough to introduce the mathematical structures of NCHV models and the Kochen-Specker theorem; only some intuition on its physical meaning was provided. This is no longer sufficient when discussing possible experimental tests. In particular, the physical implications of a violation of a noncontextuality inequality actually depend on the specific notion of context chosen, the type of measurements considered, the details of their experimental implementation, and so on. All these details must be verified to check their consistency with the notion of NCHV theory one wishes to put to the test. In the following, we address questions such as: What assumptions must be fulfilled by this implementation, in order to make an experiment a reasonable test of contextual behavior? What models can be disproved by such experiments?

To achieve this, we proceed in two steps. First, we need an operational definition of contexts that allows us to identify them with certain experimental joint measurements. More precisely, we focus on the notion of disturbance for sequential measurements. This is done in the framework of ideal measurements, where properties such as perfect nondisturbance are achievable. The second step is to extend the notion of noncontextuality to discuss the case of nonideal measurements. We show in Sec. IV.C how this can be achieved via an explicit quantification of the disturbance.

1. Two perspectives: observables and effects

To describe experimental realizations of contextuality tests, we adopt as far as possible a “black-box” description of the measurements. Such a description does not presuppose the validity of quantum mechanics, even though the design of the operations carried out to obtain the measurement results (e.g., which laser pulses to use in ion experiments, or where to put beam splitters and polarizers in photonic experiments) may be motivated by a quantum mechanical description. Each measurement apparatus is seen as a box that takes as input a physical system in a certain state and returns a classical output and, possibly, the physical system in a new state. We are not interested in the details of the functioning of the apparatus; however, we still need some physical assumptions on how to combine the different experimental apparatuses to observe joint probability distributions. In fact, a prerequisite for contextuality, in the sense of the inequalities derived above, is the possibility of performing together two or more measurements, corresponding to nontrivial marginal scenarios.

It is important to note that different notions of contextuality exist in the literature. In order to clarify the origin and relation between the different notions of contextuality, it is helpful to go back to the original discussion by Kochen and Specker (1967). The starting point of their argument is that it is always possible to construct a hidden variable theory for a set of observables, if such a theory does not need to satisfy functional relations among them. At the same time, they were dissatisfied by the impossibility proof by von Neumann (1931, 1932), which used linear relations among incompatible observables. In contrast, they chose an intermediate perspective, inspired by Gleason’s approach (Gleason, 1957), where functional relations are assumed only among compatible measurements, since they can be experimentally tested in a joint measurement. As a notion of compatibility, they define the notion of commeasurability, meaning that the statistics of a set of observables $\{A_i\}_i$ can be recovered as a function of a single measurement $B$. In particular, for ideal measurements, this notion was shown to be equivalent to pairwise commutativity (Kochen and Specker, 1967). In a more modern terminology, we may identify this idea with the notion of joint measurability (Busch et al., 2014, 1996, 2016) valid for more general measurements. A generalized measurement is represented by a positive operator valued measure (POVM): a collection of effect operators $\{E_i\}_i$ such that $E_i \geq 0$, $\sum_i E_i = \mathbb{1}$ with the computation of probabilities via $\mathrm{Prob}(i) = \mathrm{tr}(\rho E_i)$, and quantum instruments $\{\mathcal{I}_i\}_i$ for the computation of the state-update rule, i.e., $\rho \mapsto \mathcal{I}_i(\rho)$, where $\mathcal{I}_i$ is a completely positive map and $\sum_i \mathcal{I}_i$ is a quantum channel (see, e.g., Heinosaari and Ziman, 2012). Two POVMs $A$ and $B$ with effects $A_i$ and $B_j$ are called jointly measurable, if there is a third POVM $G$ with effects $\{G_{ij}\}_{ij}$, such that $A_i = \sum_j G_{ij}$ and $B_j = \sum_i G_{ij}$. Equivalently, one can substitute the sum over one index with a more general classical postprocessing (Ali et al., 2009). The equivalence between joint measurability and commutativity is no longer true for non-ideal measurements; this point plays an important role in the discussion of experimental tests.

One may summarize the above by saying that the basic elements of the KS theorem are dichotomic observables, i.e., the $\{P_i\}_i$ of Sec. III.A with values {0, 1} or {false, true}, and contexts are defined as sets of commeasurable observables. On the other hand, as discussed already by Kochen and Specker (1967), the simplest way of performing a joint measurement of three observables, belonging to a context, is given by a single trichotomic measurement, where the triple of orthogonal projectors $P_i, P_j, P_k$ are interpreted as its effects. Thus, the KS theorem can be equivalently analyzed from the observable perspective (OP) or from the effect perspective (EP), namely,

OP The basic objects of contextuality are observables and their compatibility (joint measurability) relations. A context is defined by a set of compatible observables. A noncontextual hidden variable theory is one that assigns values to each observable independently of which joint measurement they appear in.

EP The basic objects of contextuality are effects and their relation of being part of the same observable. A context is defined by a single observable. A noncontextual hidden variable theory is one that assigns values to each effect independently of which observable they appear in.

It is important to remark that this distinction between OP and EP has been introduced here to clarify and separate different ideas. In some works on contextuality this distinction was not so sharp and the two perspectives were often interchanged.

This equivalence no longer holds if one wants to pass from the scenario with idealized measurements to actual experimental tests. Several questions arise, such as: What happens if the measurements are noisy? How can we operationally identify contexts in an experimental scenario? Possible ways of generalizing these two different perspectives to deal with actual experiments gave rise to different notions of contextuality. The observable perspective OP is the one we are considering mostly in this review. The effect perspective, EP, was the most common in the initial works on the KS theorem, see e.g., the discussion in Sec. III.A in terms of value assignments to triples of orthogonal vectors. In more recent times, Spekkens analyzed this approach to the Kochen-Specker theorem and contextuality, arguing against the possibility of applicability of such notions to unsharp measurements and, consequently, to experimental tests (Spekkens, 2005, 2014). Such ideas have led to a definition of contextuality and an approach to contextuality tests different from the one presented so far.

For the sake of completeness, we want to clarify the difference between our approach, OP, the most common alternative, EP, and Spekkens’ definition of contextuality in some detail. We discuss EP briefly in Sec. IV.B.3. In particular, we discuss Spekkens’ criticism of the latter perspective. Later on, in Sec. IV.E, we provide a more detailed discussion of Spekkens’ approach to contextuality.

2. Operational definitions of contexts: OP

Given the two perspectives above, a natural question is how to translate such notions into experimental procedures in the lab. For instance, an observable can be identified with an experimental measurement procedure (e.g., a sequence of laser pulses and detection for an ion-trap experiment, or optical elements and detection for a photonic one), an effect can be connected with the probability of a certain outcome in a measurement. How can

FIG. 8 The set of vectors $v_j \in \mathbb{R}^3$ giving the dichotomic observables $A_j = 2|v_j\rangle\langle v_j| - \mathbb{1}$ providing the maximum violation of the KCBS inequality form a regular pentagram, with orthogonal vectors connected by a blue line. The state $\psi$ is directed along its symmetry axis. A realization is $|\psi\rangle = (1, 0, 0)^{\top}$, $|v_k\rangle = (\cos\theta, \sin\theta \cos\varphi_k, \sin\theta \sin\varphi_k)^{\top}$, with $\cos\theta = 1/\sqrt[4]{5}$, $\varphi_k = 2\pi k/5$, and $k = (2j + 1) \bmod 5$. [Figure not reproduced. Node labels: $v_0, v_1, v_2, v_3, v_4$.]

we say that we implement the “same object” (i.e., observable in OP, or effect in EP) in “different contexts”?

Let us start with OP. In this case, it is easy to identify the basic objects, i.e., the observables, whereas a harder task is to identify contexts and compatible observables. The minimal requirement for a context is to be given by a joint measurement. A possible way to perform joint measurements is simply to perform the measurements in a sequence. In this way, it is clear how to identify “the same observable” in “different contexts”, since an observable is given by a specific set of measurement procedures (laser pulses, beam splitters, etc.). One just has to repeat the same procedures in different sequences.

FIG. 9 Different experimental procedures in Bell and more general noncontextuality scenarios. (a) Two contexts in Bell scenario: The same measurement $A_1$ is performed in two different contexts, either with $B_1$ or with $B_2$. (b) Two contexts in PM scenario: The measurement of $A$ is performed in two contexts, either with $B$ and $C$ or with $a$ and $\alpha$. Similarly to the Bell scenario, the same measurement procedure, represented by the box with label $A$, is repeated in different contexts. (c) Additional measurements to quantify experimental imperfections. The measurement of $A$ is repeated alone or together with $B$. Subsequent measurements of $A$ must confirm the same outcome, represented by the yellow light on the bottom of the device.

What are then the joint measurements? Of course, the simple fact of performing one measurement after the other is not enough to consider it to be a joint measurement. Intuitively, one would need some notion of nondisturbance, in the sense that the physical property revealed by a measurement $B$ is not altered if $A$ is measured first, and the same if the order is exchanged. This intuition is particularly clear when one considers a Bell scenario: the choice of measurement performed by one party, say Alice, cannot have any influence on the measurement outcome of the other party, say Bob, since these events are assumed to happen in space-like separated regions. Special relativity guarantees that no “influence” or “disturbance” could propagate from one space-time region to the other. Bell nonlocality can be considered a particular form of contextuality, where the assumption of context-independence is identified with the locality assumption: Alice's outcome is independent of what measurement Bob is performing. One may relax such a constraint, e.g., by assuming that two measurements are performed on two systems a few meters apart in the same lab and still do not disturb each other, or even measurements on different degrees of freedom of the same system, and so on. This motivates us to further develop this idea to identify contexts as sets of measurements that are in some sense nondisturbing, and to deal with experimental imperfection from this perspective, as we extensively discuss in Sec. IV.C. It is interesting to remark that similar ideas of classical hidden variable models based on the notion of nondisturbance among measurements have been developed by Leggett and Garg in the notion of macrorealist models (Emary et al., 2014; Leggett and Garg, 1985).

Before defining a notion of nondisturbance, we should clarify what type of measurements are relevant. A wide range of measurements are possible in quantum mechanics, from sharp measurements to the trivial POVM $\{\mathbb{1}/2, \mathbb{1}/2\}$, which always gives one of two outcomes with equal probability. We would like the observables to represent the measurement of a property, rather than a random coin flip. A possible condition is that of self-repeatability, namely, if we perform the same measurement twice, we obtain the same outcome. This is a prerequisite for speaking about nondisturbance for single runs of an experiment. In fact, if the outcome changes randomly when repeating a measurement, as in the case of the trivial POVM above, it does not make sense to speak about the outcome not being disturbed. To better understand this point, consider the notion of nondisturbance for quantum sequential measurements introduced by Heinosaari and Wolf (2010): $A$ does not disturb $B$ if it is impossible to detect through a measurement of $B$ whether $A$ was measured (and its outcome discarded) before $B$. However, this property does not imply that $A$ has no effect on $B$. For instance, imagine we perform the sequence of measurements $BAB$: if we obtain the outcome $+1$ for the first measurement of $B$ and $-1$ for the second, we may conclude that $A$ “disturbed” $B$. Of course, this notion makes sense only if $B$ is self-repeatable. A given set of observables $\{A, B, C, \ldots\}$ satisfies the outcome repeatability property if for any sequence of them, in any possible order and with each observable appearing multiple times, the outcome of their first appearance is always repeated in all subsequent measurements. A set of observables that satisfy outcome repeatability, and in addition the statistical nondisturbance conditions of Heinosaari and Wolf (2010) and their generalization to arbitrary sequences (see, e.g., Uola et al., 2019), is called a context (in OP).

The above definition has been inspired by the quantum mechanical notion of projective measurements. In fact, for projective measurements, all notions of compatibility, i.e., joint measurability, nondisturbance, outcome repeatability, etc., are equivalent to the commutativity of observables. This notion is the basis of the NCHV models presented in Sec. IV.A.1, e.g., Eq. (14), since one requires that the single outcomes, i.e., the corresponding deterministic response functions, are not disturbed.

In summary, we may say that in OP observables are given by experimental measurement procedures, which are repeated in an identical way in each context, and contexts are defined by sequential measurements of outcome-repeatable observables. The different steps in the realization of a contextuality test can be listed as

S.1 Define experimental measurement procedures and associate to each one a classical random variable with the same values as possible outcomes.

S.2 Identify contexts in terms of outcome-repeatable and statistical-nondisturbing measurements.

S.3 Perform measurements in different sequences, according to the defined contexts. For each measurement the same procedure is repeated in different contexts.

S.4 Compare the observed statistics for contexts (sequences) with the one predicted by the NCHV for the corresponding classical variables.

This is the perspective explicitly adopted in, e.g., Gühne et al. (2010); Kirchmair et al. (2009); Larsson et al. (2011), to mention a few. We remark that the above steps are defined for ideal measurements. In actual experiments, the outcome repeatability property is never exactly satisfied due to unavoidable errors and imprecision in the experimental implementations. Nevertheless, having this theoretical framework in mind, one can devise practical methods to deal with experimental imperfections. In real experiments, then, we need an additional step

S.5 Perform additional experimental runs to quantify deviations from ideal (outcome-repeatable and nondisturbing) measurements and compare with the classical models accordingly.

See Fig. 9 for some examples of the different measurement procedures. The problem of quantifying deviations from ideal measurements and the comparison with the classical models is extensively discussed in Sec. IV.C.

Before concluding this subsection, it is interesting to comment on the possibility of using simply the notion of joint measurability, as in the original work (Kochen and Specker, 1967), by extending it to non-ideal measurements. In Larsson et al. (2011), it has been extensively argued in favor of using sequential measurements for contextuality tests. The main motivations can be summarized as follows. For joint measurement devices, a change of context corresponds to a physically entirely different setup even if one of the settings within the context remains unchanged. It is, then, difficult to argue that the outcome of the unchanged setting is unchanged from physical principles, as already noted by Bell (1966, 1982), see also the discussion by Cabello (2009). In the black box scenario outlined above, one can always decide whether one performs a measurement alone or in sequence with other measurements. The existence of “context-less” devices, associated with the single measurement setting and then combined in the sequential measurement setup, is then the argument for noncontextual behavior. In a sequential measurement one uses the very observables of which one wants to verify the contextual behavior. This allows for a direct identification of the single observables in each context, and a change of the context occurs by substituting only some observables in the sequence. This is in contrast to the joint measurement scenario, where the whole device changes and other means of identifying observables have to be applied.

Finally, an argument against the use of joint measurability to define contexts for non-ideal measurements comes from the discussion by Fritz (2012). Even though the original argument dealt with nonlocality, it straightforwardly generalizes to contextuality. Similar considerations can be found in Henson and Sainz (2015), see also the discussion by Kunjwal (2020). We briefly present the main idea in the following, by considering the CHSH scenario as a contextuality scenario. We denote the measurement effects as $\{A^x_a\}$ and $\{B^y_b\}$, where $a, b$ are the outputs and $x, y$ the measurement settings, and define contexts as sets of jointly measurable observables. In the CHSH scenario, each context consists of a pair $\{A^x, B^y\}$. If we assume that contexts are defined by jointly measurable observables, this implies that there exists a joint measurement $G^{xy}_{ab}$ for each pair of settings $x, y$. The joint measurability conditions amount to

$$A^x_a = \sum_b G^{xy}_{ab}, \qquad B^y_b = \sum_a G^{xy}_{ab}, \qquad \text{for all } a, b, x, y. \tag{25}$$

Notice that such operators automatically give rise to a nonsignaling distribution $p(ab|xy)$ (Popescu and Rohrlich, 1994), namely, it satisfies $\sum_b p(ab|xy) = \sum_b p(ab|xy')$ for all $a, x, y, y'$, and $\sum_a p(ab|xy) = \sum_a p(ab|x'y)$ for all $b, x, x', y$. On the other hand, for any given nonsignaling distribution $\{p(ab|xy)\}_{abxy}$, we can construct such joint measurements simply as

$$G^{xy}_{ab} := p(ab|xy)\,\mathbb{1}, \qquad A^x_a := \sum_b G^{xy}_{ab}, \qquad B^y_b := \sum_a G^{xy}_{ab}. \tag{26}$$

Since all operators are a multiple of the identity, one can simply take a one-dimensional Hilbert space. At the level of correlations, the nonsignaling conditions amount precisely to the condition of joint measurability for one-dimensional quantum systems.

The above construction implies that, by defining contexts simply in terms of jointly measurable observables, all nonsignaling correlations (Popescu and Rohrlich, 1994) can be obtained by one-dimensional quantum systems. The argument extends straightforwardly to any arbitrary contextuality scenario, where the counterparts of the nonsignaling correlations are the nondisturbing (Kurzyński et al., 2012) or the nonsignaling-in-time (Kofler and Brukner, 2013) correlations. In other words, by defining contexts simply in terms of joint measurability, all maximally contextual correlations, defined as the extreme points of the nondisturbing polytope (Kurzyński et al., 2012), can be reached by one-dimensional (hence, classically simulable) quantum systems.

Intuitively, Fritz's argument shows that by using a too broad notion of contexts, i.e., joint measurability, contextuality becomes a trivial property: quantum systems are no longer needed to violate noncontextuality inequalities.

This does not mean that the possibility of defining contexts through joint measurability has not been explored and that other approaches are not possible. For instance, the notion of joint measurability has been used as a definition of context for non-ideal measurements by Liang et al. (2011). In order to deal with the problem posed by Fritz, the authors introduced a sharpness parameter and a consequent modification of noncontextual bounds for correlations. Observables proportional to the identity, as in the example of Eq. (26), are then maximally unsharp, and the modified bound of the considered noncontextuality inequality, then, becomes the algebraic maximum (Liang et al., 2011). This approach was further investigated (Kunjwal, 2015; Kunjwal and Ghosh, 2014) and the corresponding inequality was experimentally tested by Zhan et al. (2017).

3. Operational definitions of contexts: EP

Different problems arise in the perspective EP. Here, a context is easily identifiable operationally, as it consists simply of a single measurement. The difficult part is to identify the “same effect” in “different measurements”. A solution is an identification of effects based on the observed statistics. As an example, we can consider Spekkens' approach, first proposed in Spekkens (2005) and further clarified in Spekkens (2014), based on the notion of statistical indistinguishability. In simple terms, one identifies the effect associated with the outcome $k$ of the measurement $M$ with the effect associated with the outcome $k'$ of the measurement $M'$ if

$$p(k|P, M) = p(k'|P, M'), \qquad \text{for all preparations } P, \tag{27}$$

where $p(k|P, M)$ denotes the probability of the outcome $k$ of the measurement $M$ given the preparation $P$. In other words, if two effects have the same statistics, they must be represented by the same object in the hidden variable model.

In this framework, a hidden variable model, or ontological model according to the terminology by Spekkens (2005), describes observed probabilities for a given preparation $P$ and measurement $M$ with outcome $k$ as

$$p(k|P, M) = \sum_\lambda \mu_P(\lambda)\, \xi_{M,k}(\lambda), \tag{28}$$

where $\mu_P(\lambda)$ represents the probability distribution on the space of the hidden variable $\lambda$ associated with the preparation $P$, satisfying $\mu_P(\lambda) \geq 0$, $\sum_\lambda \mu_P(\lambda) = 1$, and $\xi_{M,k}(\lambda)$ the response function associated with the outcome $k$ of the measurement $M$, satisfying $\xi_{M,k}(\lambda) \geq 0$, $\sum_k \xi_{M,k}(\lambda) = 1$ for all $\lambda$. Noncontextuality, then, amounts to the assumption that $\xi_{M,k} = \xi_{M',k'}$ if condition (27) is satisfied. One can, then, compare these assumptions with the usual assumptions of the Kochen-Specker theorem (Kunjwal and Spekkens, 2015; Leifer and Maroney, 2013; Spekkens, 2005).

We have seen in Sec. IV.A.1, Eq. (14), that nondeterministic response functions can always be transformed into deterministic ones by extending the space of the hidden variable. However, if one assumes that two effects with the same statistics are represented by the same object in the hidden variable theory, one can no longer transform nondeterministic response functions $\xi_{M,k}(\lambda)$ into deterministic ones if the measurement is not sharp. A proof of this fact can be found in Spekkens (2014). The intuition at the basis of the proof could be summarized as follows. For a given effect $E$ of an unsharp measurement, take its spectral decomposition $\sum_i \lambda_i \Pi_i$, which has all eigenvalues in $[0, 1]$; then the projectors $\{\Pi_i\}_i$ constitute a projective measurement that gives $E$ via some postprocessing. Since $E$ is statistically indistinguishable from the above-constructed postprocessing of a projective measurement, the corresponding response functions in the ontological model must be identical; hence, the response function associated with $E$ must also arise from the same postprocessing, i.e., it is not deterministic.

This observation, together with the practical impossibility of obtaining perfect projective measurements in actual experimental implementations, motivated the development of a different notion of noncontextuality inequalities (Krishna et al., 2017; Kunjwal and Spekkens, 2015, 2018; Mazurek et al., 2016; Pusey, 2018; Schmid et al., 2018; Spekkens et al., 2009; Xu et al., 2016), based on the identification of measurement effects according to Eq. (27), and an analogous notion for preparations called preparation noncontextuality (Spekkens, 2005). We briefly review the approach based on the latter perspective in Sec. IV.E.

C. Modeling experimental imperfections

Starting from the operational definition of contexts and compatibility provided in the previous section for ideal projective measurements, we further develop these ideas in order to deal with imperfect and noisy measurements typical of any experimental implementation of a contextuality test. It is important to emphasize that there is no general recipe that can be applied to all experiments. On the contrary, for every experimental realization it is necessary to make some assumptions on the hidden variable model describing the type of noise present. Typically, it is necessary to perform additional measurements to quantify the amount of noise and check its consistency with the above-mentioned assumptions, and possibly to modify the noncontextuality inequality accordingly. Several approaches have been proposed and implemented in experimental tests of contextuality. In the following, we discuss the theoretical work presented in Gühne et al. (2010); Kirchmair et al. (2009), in Szangolies (2015); Szangolies et al. (2013), and in Kujala et al. (2015); Larsson (2002); Simon et al. (2001); Winter (2014).

1. Quantifying disturbance in sequential measurements

To deal with actual experiments, we need to relax the assumption of perfectly compatible measurements by admitting measurements that produce a (certain type of) disturbance on subsequent measurements, and then trying to quantify such a disturbance. This is the approach proposed by Kirchmair et al. (2009) and further developed by Gühne et al. (2010); we mostly follow the latter reference. Such an approach can be summarized as follows. Probabilities associated with perfectly compatible measurements are still described by NCHV models; however, we admit the possibility of incompatible measurements that introduce disturbance and change the outcomes of subsequent measurements in a sequence, giving rise to context-dependent outcomes. The amount of disturbance is then estimated experimentally, under the physical assumption that the amount of noise introduced by the measurements is cumulative, i.e., it does not cancel out by performing more measurements.

To keep the notation simple, we consider only the case of $\pm 1$-valued observables; a generalization to arbitrary (finite) outcomes is straightforward. Consider a hidden variable model describing the probabilities for all possible sequences $S_{AB} = \{A, B, AA, AB, BA, BB, AAA, \ldots\}$ of two $\pm 1$-valued observables $A, B$. We denote the outcome probabilities for single measurements as $p[\pm|A]$, $p[\pm|B]$, and similarly, for sequences, $p[\pm\pm|AA]$, $p[\pm\pm|AB]$, ..., for, respectively, measurement of the sequence $AA$, $AB$, etc., where the temporal ordering of the sequence is from left to right. We admit the possibility of a discarded outcome, e.g., $p[+\bullet-|BAB]$ denotes the probability for the outcome $+$ for the first measurement of $B$, a discarded outcome ($\bullet$) for the measurement of $A$, and $-$ for the final measurement of $B$.

As we anticipated, our model departs from the NCHV model discussed in Sec. IV.A.1. If $A, B$ are compatible observables, it must necessarily be $p[+\bullet-|BAB] = 0$, as it follows immediately from Eq. (14). Consider now the case of $A$ and $B$ not compatible. Terms that are not experimentally accessible are still well defined in this theory, such as, e.g., $p[(+|A) \text{ and } (+|B)]$, namely, the probability that the first measurement gives the outcome $+1$, both in the case of a measurement of $A$ and of a measurement of $B$. However, the outcome probability $p[++|AB]$ for their sequential measurement does not correspond to the one above, and it is, in fact, not included in the description given by the NCHV theory. In this sense, the present discussion extends the NCHV framework to a new hidden variable theory that includes the description of sequences of incompatible measurements, and in which certain outcomes are allowed to depend on the particular sequence of measurements, if such a sequence involves incompatible measurements. This is a central point common to all analyses of experimental imperfections.

For measurements that are assumed to be compatible, say $A$ and $B$, the hidden variable theory satisfies the usual noncontextuality conditions:

$$v(A_1|S_1, \lambda) = v(A_2|S_2, \lambda), \quad \text{for all } \lambda, \quad \text{for sequences } S_1 = \{A\},\ S_2 = \{BA\}, \tag{29}$$

where $v(A_i|S, \lambda)$ denotes the value assigned by the hidden variable theory to the observable $A$ in position $i$ in the sequence $S$ for a given $\lambda$. Notice that Eq. (29) is a condition on the hidden variable model that is not directly experimentally testable; hence, it cannot be taken as an operational definition of compatibility. How can we quantify the disturbance introduced by a measurement in this model? The first observation is that for this model the following inequality holds

$$p[(+|A) \text{ and } (+|B)] \leq p[++|AB] + p[(+|A) \text{ and } (\bullet-|AB)]. \tag{30}$$

Intuitively, given a specific value $\lambda$ of the hidden variable such that it contributes to the l.h.s., the value of $B$ either stays the same, i.e., $\lambda$ contributes to $p[++|AB]$, or the value of $B$ is flipped by the measurement of $A$, i.e., $\lambda$ contributes to $p[(+|A) \text{ and } (\bullet-|AB)]$.

The correlator $\langle AB\rangle = \sum_{a,b=\pm 1} ab\; p[(a|A) \text{ and } (b|B)]$, representing the correlation between $A$ and $B$ assigned by the hidden variable theory, can be written as

$$\langle AB\rangle = 1 - 2p[(+|A) \text{ and } (-|B)] - 2p[(-|A) \text{ and } (+|B)]. \tag{31}$$

In contrast, the correlation between $A, B$ observed in an experiment where we measure the sequence $S = \{AB\}$, denoted by $\langle A_1B_2\rangle$, is given by $\langle A_1B_2\rangle = \sum_{a,b=\pm 1} ab\; p[ab|AB]$. In general, $\langle A_1B_2\rangle \neq \langle AB\rangle$ if $A, B$ are incompatible.

By defining $p^{\rm flip}[AB]$ as the probability that the outcome of $B$ is flipped by the measurement of $A$, namely,

$$p^{\rm flip}[AB] := p[(+|B) \text{ and } (\bullet-|AB)] + p[(-|B) \text{ and } (\bullet+|AB)], \tag{32}$$

and using Eq. (30), we can bound $\langle AB\rangle$ with the experimentally observable correlator $\langle A_1B_2\rangle$ as follows:

$$\langle A_1B_2\rangle - 2p^{\rm flip}[AB] \leq \langle AB\rangle \leq \langle A_1B_2\rangle + 2p^{\rm flip}[AB]. \tag{33}$$
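As an illustration of Eq. (33), the following minimal Python sketch builds a toy hidden variable model and checks the two-sided bound numerically. The representation is an assumption made only for this example and is not taken from Gühne et al. (2010): each hidden state is a triple (a, b, b_after_a), where b is the value B shows when measured alone and b_after_a is the value it shows right after a measurement of A. The function names `correlators` and `random_model` are hypothetical.

```python
import random
from itertools import product


def correlators(model):
    """Return (<AB>, <A1 B2>, p_flip[AB]) for a toy hidden variable model.

    `model` maps a hidden state (a, b, b_after_a), each entry +1 or -1,
    to its probability. This encoding is an illustrative assumption.
    """
    # Correlation assigned by the hidden variable theory, <AB>
    hv_corr = sum(p * a * b for (a, b, _), p in model.items())
    # Correlation observed in the sequence AB, <A1 B2>
    seq_corr = sum(p * a * b2 for (a, _, b2), p in model.items())
    # Probability that measuring A flips the value of B, as in Eq. (32)
    p_flip = sum(p for (_, b, b2), p in model.items() if b != b2)
    return hv_corr, seq_corr, p_flip


def random_model(rng):
    """Random probability distribution over the eight hidden states."""
    states = list(product((+1, -1), repeat=3))
    weights = [rng.random() for _ in states]
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}


if __name__ == "__main__":
    rng = random.Random(0)
    tol = 1e-12  # tolerance for floating-point rounding
    for _ in range(1000):
        hv, seq, flip = correlators(random_model(rng))
        # Two-sided bound of Eq. (33)
        assert seq - 2 * flip - tol <= hv <= seq + 2 * flip + tol
    print("Eq. (33) holds for all sampled toy models")
```

The check succeeds for every distribution because each hidden state contributes a(b - b_after_a) to the difference of the two correlators, which is zero when B is undisturbed and has magnitude 2 when it is flipped.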

By Eq. (29), $p^{\rm flip}[AB] = 0$ for compatible measurements.

Applying this reasoning to the Clauser-Horne-Shimony-Holt (CHSH) inequality (Clauser et al., 1969),

$$\langle AB\rangle + \langle BC\rangle + \langle CD\rangle - \langle AD\rangle \leq 2, \tag{34}$$

we have the corresponding expression for the case of sequential measurements (Gühne et al., 2010)

$$\langle\chi^{\rm seq}_{\rm CHSH}\rangle := \langle A_1B_2\rangle + \langle C_1B_2\rangle + \langle C_1D_2\rangle - \langle A_1D_2\rangle \leq 2\left(1 + p^{\rm flip}[AB] + p^{\rm flip}[CB] + p^{\rm flip}[CD] + p^{\rm flip}[AD]\right). \tag{35}$$

What is left is to bound the unobservable quantity $p^{\rm flip}$. We introduce

$$p^{\rm err}[BAB] := p[+\bullet-|BAB] + p[-\bullet+|BAB], \tag{36}$$

corresponding to the probability of flipping the value of $B$ in a sequence of three measurements with an intermediate measurement of $A$, which is experimentally measurable.

At this point, one needs an assumption on the HV model describing the experimental noise, namely

(CN) Cumulative noise assumption:

$$p[(\pm|B) \text{ and } (\bullet\mp|AB)] \leq p[(\pm|B) \text{ and } (\pm\bullet\mp|BAB)] = p[\pm\bullet\mp|BAB]. \tag{37}$$

Notice that this assumption is not directly testable, as it contains some inaccessible correlations that are defined only at the level of the HV model. Nevertheless, one can have a physical intuition of what this means. In fact, this assumption corresponds to the idea of a cumulative noise, i.e., the noise always increases with additional measurements: it is more likely to flip the outcome of $B$ if we perform a measurement of both $B$ and $A$ than it is if we perform only a measurement of $A$. This seems a reasonable assumption if we want to model experimental imperfections, where the measurements are not supposed to “collude” to cancel out the noise when arranged in specific sequences. Similar ideas have been explored by Wilde and Mizel in their discussion of the so-called clumsiness loophole for Leggett-Garg inequalities (Wilde and Mizel, 2012).

Then, Eq. (37) directly implies $p^{\rm flip}[AB] \leq p^{\rm err}[BAB]$, and allows us to rewrite Eq. (35) as

$$\langle\chi^{\rm seq}_{\rm CHSH}\rangle \leq 2\left(1 + p^{\rm err}[BAB] + p^{\rm err}[BCB] + p^{\rm err}[DCD] + p^{\rm err}[DAD]\right), \tag{38}$$

which involves only experimentally testable quantities. In Kirchmair et al. (2009), the authors reported an experimental violation of Eq. (38) for a sequential measurement of the CHSH expression, with the value $\langle\chi^{\rm seq}_{\rm CHSH}\rangle - 2(p^{\rm err}[BAB] + p^{\rm err}[BCB] + p^{\rm err}[DCD] + p^{\rm err}[DAD]) = 2.23(5)$. A similar analysis has been performed also in the experiment by Jerger et al. (2016).

2. Context-independent time evolution

An implicit assumption hidden in the definition of the previous model is that the hidden variable $\lambda$ is static, i.e., not evolving during the time passing between one measurement and the subsequent one.

Szangolies (2015) and Szangolies et al. (2013) proposed a relaxation of such conditions, admitting a hidden variable changing in time, but with an evolution that is still context-independent, in the sense that it does not depend on the specific measurements performed. Such an investigation led to modified noncontextuality inequalities satisfied by this extended set of noncontextual correlations, which are thus able to demonstrate quantum contextuality in a broader framework.

The notion of noncontextual evolution (NCE) has been formalized as follows (Szangolies, 2015): The system evolves according to a sequence of hidden variable states $\lambda_i \to \lambda_j \to \lambda_k \to \ldots$ that is independent of the measurements performed. To understand such a notion, it is helpful to consider a simple example of a noncontextuality inequality that can be maximally violated by such models. Consider the CHSH inequality, but now evaluated according to the following sequential measurement scheme

$$\langle A_1B_2\rangle + \langle B_1C_2\rangle + \langle C_1D_2\rangle - \langle D_1A_2\rangle \leq 2, \tag{39}$$

where, as in the previous section, we denote via the symbol $\langle A_1B_2\rangle$ the fact that we first perform the measurement $A$, then the measurement $B$, and take the expectation value of the product of their outcomes. The authors construct the following noncontextual model with evolution. The hidden variable $\lambda$ evolves from an initial state $\lambda_1$ to $\lambda_2$, independently of the measurement performed at the initial time. The measurements are chosen such that $A, B, C, D$ always give outcome $+1$ on the state $\lambda_1$ and similarly for $\lambda_2$, with the only exception of $A$, which gives the value $-1$ on $\lambda_2$. It is then straightforward to check that Eq. (39) can be violated up to the algebraic maximum 4. On the other hand, forcing the measurements to always appear in the same order, namely

$$\langle A_1B_2\rangle + \langle B_1C_2\rangle + \langle C_1D_2\rangle - \langle A_1D_2\rangle \leq 2, \tag{40}$$

restores the classical bound 2. Intuitively, since observables are forced to appear always in the same position in the sequence, they are always drawn from the same distribution for $\lambda = \lambda_1$ or $\lambda_2$; hence, they are not affected by the evolution of $\lambda$. This behavior is analogous to what happens in Leggett-Garg tests (Emary et al., 2014; Leggett and Garg, 1985), where the hidden variable $\lambda$ is allowed to evolve freely in time.

An analogous reasoning is applied to different noncontextuality inequalities (Szangolies et al., 2013), such as the PM inequality seen in Eq. (3)

$$\langle A_1B_2C_3\rangle + \langle c_1a_2b_3\rangle + \langle \beta_1\gamma_2\alpha_3\rangle + \langle A_1a_2\alpha_3\rangle + \langle \beta_1B_2b_3\rangle - \langle c_1\gamma_2C_3\rangle \leq 4, \tag{41}$$

where observables are forced to appear always in the same position in each sequence of measurements, e.g., $A$ always appears first, $a$ always second, $\alpha$ always third, and so on. The same inequality has been investigated also from the perspective of dimension witnesses based on quantum contextuality (Gühne et al., 2014) and for the comparison between the spatial and temporal scenario (Xu and Cabello, 2017).

Moreover, similar ideas can be applied even to noncontextuality inequalities where observables cannot be forced to be always in the same position, such as, e.g., the KCBS inequality (Szangolies et al., 2013, cf. Eq. (16))

$$\langle A_1B_2\rangle + \langle B_1C_2\rangle + \langle C_1D_2\rangle + \langle D_1E_2\rangle + \langle E_1A_2\rangle - \langle A_1A_2\rangle \geq -4, \tag{42}$$

where the extra term $\langle A_1A_2\rangle$ is designed to give a “penalty” whenever the value of $A$ changes with a change in $\lambda$.

In summary, this approach provides a simple method of dealing with context-independent time evolution of the hidden variable, which can be easily combined with the one presented in the previous section.

3. First proposals of experimentally testable inequalities

The first attempts to derive experimentally testable noncontextuality inequalities are from the early 2000s (Larsson, 2002; Simon et al., 2001). These authors relax the Kochen-Specker assumptions by requiring that the noncontextuality assumption and the completeness condition (C') of Sec. III.A are only approximately satisfied. More precisely, the model proposed by Simon et al. (2001) considers a relaxation of (C'), whereas Larsson (2002) considers a relaxation of both (C') and noncontextual value assignment. In the following, we discuss in detail the approach of Larsson (2002).

For any pair of intersecting contexts appearing in a KS set, i.e., triples of vectors $(a, b, c)$ and $(a, b', c')$, we have the assumption that the value assignment is approximately context-independent, i.e.,

$$p\left(v_{(a,b,c)}(a) \neq v_{(a,b',c')}(a)\right) \leq \varepsilon, \tag{43}$$

where $v_{(a,b,c)}(a)$ denotes the value assigned to $a$ in the context $(a, b, c)$. Models that obey the above assumption are called $\varepsilon$-ontologically faithful noncontextual ($\varepsilon$-ONC) models by Winter (2014). A formal statement of the second assumption is that the value assignments on any given context approximately satisfy the condition (C'), i.e.,

$$p\left(\sum_{i=a,b,c} v_{(a,b,c)}(i) \neq 1\right) \leq \varepsilon'. \tag{44}$$

Let us now write $M$ for the number of interconnections between contexts [e.g., $M = 1$ for the single shared vector between $(a, b, c)$ and $(a, b', c')$], and $N$ for the number of contexts. A Kochen-Specker proof consists of a set of vectors for which it is impossible to assign a noncontextual value satisfying all logical relations. In the above language, this implies the impossibility of an assignment with $\varepsilon = \varepsilon' = 0$. Naturally, this is expected to hold also for small disturbances. This intuition can be made quantitative, with the following inequality derived by Larsson (2002) from the above assumptions:

$$M\varepsilon + N\varepsilon' \geq 1, \tag{45}$$

implying that if the errors in the logical relations ($\varepsilon'$) are small, noncontextuality must fail often ($\varepsilon$ must be large). The presence of the logical relations associated with the KS theorem makes it a KS-inequality like those discussed in Sec. IV.A.4.

By using experimental estimates of the probability of failure of the logical relations, one can draw conclusions on how far from being noncontextual the data are. Unfortunately, for the Kochen-Specker proofs available at the time, the numbers $M$ (interconnections) and $N$ (contexts) are very high, so that even small values of $\varepsilon$ and $\varepsilon'$ give a value of the left-hand side of (45) that is larger than 1, see the numbers listed in Table I. An experiment to violate one of these inequalities would be very challenging, given that errors in the directions would translate to an increased probability of failure of noncontextuality, and therefore lower the bound on $\varepsilon'$. However, there is no direct connection between directional accuracy and $\varepsilon$, the probability of failure of noncontextuality, and no immediate way to estimate this failure probability from experimentally measurable quantities.

4. Approximate quantum models

An attempt to use the quantum description of the system and measurements for the estimate of the failure probability of the previous section was presented by Winter (2014). The author considers quantum effects $Q_i$ that are $\varepsilon$-close, in the operator norm, to the ideal projectors $P_i$ associated with each direction $i$, so that

$$\|Q_i - P_i\| \leq \varepsilon. \tag{46}$$

In the paper, this is given the name $\varepsilon$-precise quantum ($\varepsilon$-PQ) model. Note that in principle $\varepsilon$ can be estimated through a tomographic characterization of the measurement. It is now tempting to equate the $\varepsilon$ from the $\varepsilon$-ONC model (as defined above) with the $\varepsilon$ from the $\varepsilon$-precise quantum model, the reason being that they each constitute a distance measure in their respective realm. In the following, we briefly review how this identification can play a role in contextuality tests according to Winter (2014).

Let us consider an $\varepsilon$-ONC model consisting of a set of $\{0, 1\}$-valued classical random variables $X_i^C$, each associated with a rank-1 projector $P_i$ and a context $C$, such

Reference                        d   n         N    M    ε' (ε = 0)   ε' (ε = 0.01)
Peres (1993)                     3   57 (33)   40   96   0.0250       0.0010
Kochen & Conway (Peres, 1993)    3   51 (31)   37   91   0.0270       0.0024
Schütte (Bub, 1997)              3   49 (33)   36   87   0.0278       0.0036
Kernaghan and Peres (1995)       8   36        11   72   0.0909       0.0255
Kernaghan (1994)                 4   20        11   30   0.0909       0.0636
Cabello et al. (1996a)           4   18         9   18   0.1111       0.0911
Lisoněk et al. (2014)            6   21         7   21   0.1429       0.1129
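The two right-hand columns of the table can be recomputed from Eq. (45): solving $M\varepsilon + N\varepsilon' \ge 1$ for $\varepsilon'$ gives $\varepsilon' \ge (1 - M\varepsilon)/N$. The following is a minimal sketch of that arithmetic, assuming the $(N, M)$ pairs as listed in the table; the function name and the clipping at zero are illustrative choices, not taken from Larsson (2002).

```python
# Lower bound on eps' implied by Eq. (45): M*eps + N*eps' >= 1.
# The (N, M) pairs are copied from Table I; names are illustrative only.
TABLE_I = {
    "Peres (1993)": (40, 96),
    "Kochen & Conway (Peres, 1993)": (37, 91),
    "Schütte (Bub, 1997)": (36, 87),
    "Kernaghan and Peres (1995)": (11, 72),
    "Kernaghan (1994)": (11, 30),
    "Cabello et al. (1996a)": (9, 18),
    "Lisoněk et al. (2014)": (7, 21),
}


def eps_prime_lower_bound(n_contexts, n_interconnections, eps=0.0):
    """Smallest eps' compatible with Eq. (45) for a given eps (clipped at 0)."""
    return max(0.0, (1.0 - n_interconnections * eps) / n_contexts)


for name, (n, m) in TABLE_I.items():
    at_zero = eps_prime_lower_bound(n, m, eps=0.0)
    at_one_percent = eps_prime_lower_bound(n, m, eps=0.01)
    print(f"{name:32s} {at_zero:.4f} {at_one_percent:.4f}")
```

For the larger sets the bound drops quickly with $\varepsilon$: once $M\varepsilon \ge 1$, Eq. (45) places no constraint on $\varepsilon'$ at all, which is why the sketch clips the result at zero.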

TABLE I Different Kochen-Specker proofs: the dimension $d$, the number of projectors $n$ (the number inside parentheses is the number used in the contradiction, the number outside counts all vectors when completing the bases), the number of contexts $N$, and the number of context changes $M$. The final two columns are lower bounds for the probability of failure of the logical relations, $\varepsilon'$, given a noncontextual model ($\varepsilon = 0$), and when there is a small probability of failure of noncontextuality from experimental causes (for example $\varepsilon = 0.01$).

that

$$\sum_{i\in C} X_i^C \le 1 \ \text{ for all contexts } C, \qquad \mathrm{Prob}\big(X_i^C \neq X_i^{C'}\big) \le \varepsilon \ \text{ for all } i, C, C' \text{ s.t. } i \in C \cap C'. \tag{47}$$

In other words, the model is explicitly contextual, i.e., for each $i$ and each context $C$, we have a different random variable $X_i^C$, but $X_i^C$ and $X_i^{C'}$ take different values at most with probability $\varepsilon$. The case $\varepsilon = 0$ clearly coincides with the usual NCHV model.

From the above explicitly contextual model one can define context-independent variables $Y_i$ as

$$Y_i := \prod_C X_i^C. \tag{48}$$

From the definition it is clear that the probability that $Y_i$ is different from $X_i^C$ for some $C$ is equal to the probability that the $X_i^C$ are not all equal for different $C$, hence, it is smaller than $(k_i - 1)\varepsilon$, where $k_i$ is the number of contexts in which $i$ appears. We thus have

$$\langle X_i^C\rangle \le \langle Y_i\rangle + (k_i - 1)\varepsilon, \tag{49}$$

corresponding to the modified bound for $\varepsilon$-ONC models

$$\sum_i \lambda_i \langle P_i\rangle \le \beta_{\varepsilon\text{-ONC}} := \beta_{0\text{-ONC}} + \varepsilon \sum_i \lambda_i (k_i - 1), \tag{50}$$

where $\beta_{0\text{-ONC}}$ denotes the usual NCHV bound and $\beta_{\varepsilon\text{-ONC}}$ the modified bound for an $\varepsilon$-ONC model.

For a given violation $\beta_q$ of the NCHV bound $\beta_{0\text{-ONC}}$ we can have a contradiction up to the precision $\varepsilon_0 = (\beta_q - \beta_{0\text{-ONC}})/[\sum_i \lambda_i (k_i - 1)]$, i.e., the violation cannot be explained by a model with imprecision $\varepsilon \le \varepsilon_0$. Winter (2014) also provides some estimates of the $\varepsilon_0$ associated with the maximal quantum violation of different noncontextuality inequalities, e.g., the KCBS inequality with $\varepsilon_0^{\mathrm{KCBS}} < 0.047$ and the PM-square inequality with $\varepsilon_0^{\mathrm{PM}} < 0.0138$.

By equating the $\varepsilon$ of the $\varepsilon$-ONC model to the $\varepsilon$ of the $\varepsilon$-PQ model, e.g., with the latter extracted via a tomography of the quantum effects, one could use the above argument to disprove noncontextuality for imprecise measurements. However, there is no direct connection between the two, as both definitions, of an approximate quantum model and of an ontologically faithful noncontextual classical model, are independent. In fact, the measures arise in different domains and measure distances between different conceptual object types: in one case the operator-norm distance between an ideal measurement ($P_i$) and the realized effect ($Q_i$), and in the other case the statistical distance between an outcome assignment in one context [e.g., $v_{(a,b,c)}(a)$ in the previous example] and the corresponding outcome assignment in another context [e.g., $v_{(a,b',c')}(a)$].

5. Maximally noncontextual models

Another approach to the quantification of measurement disturbance in contextuality tests was proposed by Kujala et al. (2015), via the notion of maximal noncontextuality. The authors relax the definition of a NCHV model by allowing some disturbance in the measurements, not necessarily seen as sequential measurements, leading to an apparent context-dependence, namely, the observation of different marginal distributions for the measurement of the same observable in different contexts. In this way, they obtain an explicitly contextual classical model, in which observables in different contexts are represented by different classical random variables.

It is instructive to show a simple example. Consider the KCBS scenario discussed in Sec. IV.A.2: five $\{+1, -1\}$-valued observables $A_0, \ldots, A_4$ with compatible pairs $A_i, A_{i+1}$, with the sum modulo 5. Consider the following version of the KCBS inequality

$$\sum_{i=0}^{3} \langle A_iA_{i+1}\rangle - \langle A_0A_4\rangle \le 3, \tag{51}$$

where addition in the indices is interpreted modulo 5. The NCHV model is extended by Kujala et al. (2015) by taking a copy of each classical variable for each context. In this case, there are five contexts $\{A_i, A_{i+1}\}$ for $i = 0, \ldots, 4$ and each $A_i$ appears twice, i.e., in the $i$-th and the $(i-1)$-th contexts. The authors construct an explicitly contextual model by taking context-dependent copies, i.e., $A_i^{(i)}$ and $A_i^{(i-1)}$, where the superscript indicates the context. This construction of a contextual model is similar to that in Sec. IV.C.1, where the value assignment of an observable explicitly depends on the sequence it appears in, see Eq. (29). However, to allow an easier comparison with the original paper, we follow the notation by Kujala et al. (2015).

As in the previous cases, in order to interpret experimental results one needs some assumption on the type of disturbance present. The authors choose to introduce the notion of maximally noncontextual model. A rigorous definition is provided by Kujala et al. (2015); here, we may reformulate it in simple terms as follows: variables representing observables in different contexts are equal to each other with the maximum probability allowed by the observed marginals. Intuitively, this notion states that there is no more disturbance than the one observed in the marginals. Similarly to what we discussed in Sec. IV.C.1, it is reasonable to apply this model if the noise is supposed to arise from some clumsiness of the measurements, i.e., the measurement apparatuses are not colluding to cancel out the noise when combined in a certain way.

The authors derive a class of inequalities valid for maximally noncontextual models, for what they call cyclic systems, also called the $n$-cycle scenario (Araújo et al., 2013), namely a collection of $\{+1, -1\}$-valued observables $A_0, \ldots, A_{n-1}$ with compatible pairs $A_i, A_{i+1}$, with the sum modulo $n$. This scenario includes the Leggett-Garg inequality (Leggett and Garg, 1985), the CHSH inequality (Clauser et al., 1969), and the KCBS inequality (Klyachko et al., 2008), corresponding to, respectively, $n = 3, 4, 5$. The KCBS inequality in Eq. (51) becomes

$$\sum_{i=0}^{3} \langle A_i^{(i)}A_{i+1}^{(i)}\rangle - \langle A_0^{(4)}A_4^{(4)}\rangle - \sum_{i=0}^{4} \big|\langle A_i^{(i)}\rangle - \langle A_i^{(i-1)}\rangle\big| \le 3. \tag{52}$$

Such inequalities can be derived with methods similar to those associated with standard noncontextuality inequalities (cf. Sec. V.A): under the assumption of a joint probability distribution over all variables $\{A_i^{(c)}\}_{i,c}$, one computes the projection of the corresponding probability simplex onto the space of observable marginals; in this case, the correlators $\{\langle A_i^{(i)}A_{i+1}^{(i)}\rangle\}_i$ and expectation values (marginals) $\{\langle A_i^{(c)}\rangle\}_{i,\,c=i-1,i}$. A rigorous derivation of Eq. (52) can be found in Kujala et al. (2015).

In contrast to the proposal of Sec. IV.C.1, the present method does not require additional measurements, as the experimental data obtained for the usual test of the KCBS inequality already contain all the information necessary to evaluate the l.h.s. of Eq. (52). In fact, Kujala et al. (2015) compare this expression with the experimental results for the test of the KCBS inequality published in Lapkiewicz et al. (2011), by computing a 99.99999999% confidence interval for the l.h.s. of Eq. (52) and obtaining the interval [3.127, 4.062], confirming a violation of the inequality. This result can then be interpreted as a disproof of maximally noncontextual models. In Amaral and Duarte (2019); Amaral et al. (2018), the authors investigated general methods to derive inequalities such as Eq. (52) for arbitrary scenarios.

D. Experimental realizations

In this section, we discuss some experimental tests of quantum contextuality. Clearly, we cannot give a detailed description of the experimental techniques. Instead, we aim to explain some typical experiments and their underlying assumptions.

1. Early experiments

Let us first mention some of the early experiments aiming at a test of quantum contextuality (Bartosik et al., 2009; Hasegawa et al., 2006; Huang et al., 2003; Michler et al., 2000). These early experiments are characterized by the fact that they did not measure one of the contextuality inequalities from Section IV.A. Instead, some predictions of quantum mechanics are assumed to be correct in order to interpret the observations as a refutation of noncontextuality.

As an example, let us discuss the experiment by Bartosik et al. (2009) in some detail. We consider six observables on a two-qubit system,

$$A = \sigma_x \otimes \mathbb{1}, \quad B = \mathbb{1} \otimes \sigma_x, \quad a = \sigma_y \otimes \mathbb{1}, \quad b = \mathbb{1} \otimes \sigma_y, \quad G = \sigma_x \otimes \sigma_y, \quad g = \sigma_y \otimes \sigma_x. \tag{53}$$

Then, for any noncontextual model the inequality

$$\langle\Theta\rangle = -\langle AB\rangle - \langle ab\rangle + \langle GAb\rangle + \langle gaB\rangle - \langle Gg\rangle \le 3 \tag{54}$$

holds. This can be directly checked by considering all the $\pm 1$ assignments to the measurements. For a two-qubit Bell state in an appropriate basis, one can reach the value $\langle\Theta\rangle = 5$, as the Bell state can be a common eigenstate of $G$ and $g$ with eigenvalue $-1$.

As our observables are defined locally, one can assume the relation $\langle GAb\rangle = \langle gaB\rangle = 1$. Note, however, that this assumption is justified only for the given definition of observables; it does not need to hold if the measurements are considered to be black boxes. But then, the inequality simplifies to

$$\langle\theta\rangle = -\langle AB\rangle - \langle ab\rangle - \langle Gg\rangle \le 1. \tag{55}$$

For testing this inequality, Bartosik et al. (2009) used neutron interferometry. Here, the two qubits are represented by the spin and the path of a neutron in an

interferometer. These two degrees of freedom are independently accessible, and the terms $\langle AB\rangle$ and $\langle ab\rangle$ can be measured directly.

Under the assumption of the validity of quantum mechanics, the term $\langle Gg\rangle$ can be measured by performing a Bell measurement (i.e., a measurement in the basis of all four Bell states) on the two qubits. This also allows one to reconstruct the values of $G$ and $g$ separately. Typically, such a Bell measurement is nonlocal and therefore difficult, but since the two qubits are encoded on a single neutron, this is feasible here. Finally, a value of $\langle\theta\rangle = 2.291 \pm 0.008$ was found, resulting in a clear violation of Eq. (55).

2. A test of the Peres-Mermin inequality with trapped ions

One of the first experiments testing contextuality inequalities was performed with ion traps (Kirchmair et al., 2009) and it can be considered the prototypical example from which the general description in Sec. IV.B has been developed, as well as the basis for many other subsequent experiments. This experiment aimed at an implementation of the inequality coming from the Peres-Mermin square, see Eq. (3).

Let us start by describing the experimental setup: A pair of $^{40}\mathrm{Ca}^+$ ions in a Paul trap was used to model the four-dimensional Hilbert space. For each ion, the qubit is represented by the states $|1\rangle = |S_{1/2}, m_S = 1/2\rangle$ and $|0\rangle = |D_{5/2}, m_D = 3/2\rangle$, see also Fig. 10. Manipulating and reading out this single system can be done by laser pulses with high fidelity. For performing non-local gates, a Mølmer-Sørensen gate was used, where both ions are illuminated with the same laser. This makes it possible to perform non-local gates even if the ion crystal is not in the thermal ground state. This is important, as the later measurements require state detection of one ion, which can excite the motional quantum number.

FIG. 10 Upper panel: level scheme of a single $^{40}\mathrm{Ca}^+$ ion, highlighting the electronic levels used for the qubit. Lower panel: Implementation of the measurement sequence of one column or row. In order to retrieve only one bit of information, a measurement is implemented by first performing a nonlocal unitary transformation, then only one qubit is read out and finally the unitary transformation is reversed. Both figures are taken from Gühne et al. (2010).

A crucial point of the experiment is the appropriate implementation of the measurements of the Peres-Mermin square. Here, it is important that these are global measurements on a four-dimensional system with only two outcomes $\pm 1$. This means that a measurement like $C = \sigma_z \otimes \sigma_z$ cannot be implemented by measuring $\sigma_z$ on both particles separately, as this would give four different results and destroy the coherence in the subspaces corresponding to the eigenvalues of $\sigma_z \otimes \sigma_z$, as discussed also in our first presentation of the PM square in Sec. II. In order to circumvent this, one can write $C$ and any other observable in the Peres-Mermin square as

$$C = \sigma_z \otimes \sigma_z = U_C^\dagger [\sigma_z \otimes \mathbb{1}] U_C, \tag{56}$$

where $U_C$ is some non-local unitary gate. Physically, this makes it possible to implement $C$ by first applying $U_C$ to the state, then reading out only the first ion, and finally undoing the transformation $U_C$ again. In the experiment, the internal state of the second ion was in addition transferred to a different level during the readout of the first ion, in order to protect it from fluorescence light during the detection process.

In this way, all the non-local measurements of the Peres-Mermin square can be implemented, but it should be noted that the measurement of the third row and the third column requires the implementation of six non-local gates within the sequence. Consequently, the fidelity of a single non-local gate must be very high (in the experiment it was around 98%) in order to observe the desired results. For the interpretation of the experiment, the details of the decomposition $C = U_C^\dagger [\sigma_z \otimes \mathbb{1}] U_C$ are not relevant: A measurement like $C$ is seen as a black box, where a state is subjected to certain measurement procedures and a result $\pm 1$ is obtained. The details of the decomposition are inside the black box (see also Fig. 10). Of course, one has to check in the experiment whether these black boxes represent repeatable and nondisturbing measurements; this has also been done (see below).

With these measurements, one can start to test the noncontextuality inequality. Performing a sequence of measurements, one obtains eight possible results, since each of the measurements results in a $\pm 1$ outcome (see also Fig. 11). Multiplying the results gives the total value, which is then used for computing the total expectation value in the inequality. As a first result, if one takes a two-qubit singlet state as an input state, a violation of

$$\langle \mathrm{PM}\rangle = 5.46 \pm 0.04 > 4 \tag{57}$$

has been found, displaying a clear violation. A further central prediction of quantum mechanics is the state independence of the violation. For that, 10 different states have been tested, including mixed states and separable states. In all cases, a violation has been found with values of $\langle \mathrm{PM}\rangle$ ranging from $5.23 \pm 0.05$ to $5.46 \pm 0.04$.

FIG. 11 Upper panel: Example of measurement correlations for a sequence of measurements for a partially entangled input state. The colors indicate whether the product of the three results gives $+1$ (yellow) or $-1$ (red). The volume of a sphere is proportional to the likelihood of finding the corresponding measurement outcome. Lower panel: Permutations of the observables within the rows and columns can serve as a test of the compatibility of the measurements. The figure shows the measured absolute values of the products of observables for any of the six possible permutations. For each permutation, 1,100 copies of the singlet state were used. Both figures are taken from Kirchmair et al. (2009).

In order to be a complete contextuality test, some more issues have to be discussed. First, one has to test and quantify the extent to which the implemented measurements are indeed compatible. Closely related to that is the question why the observed violation was not the one expected from quantum mechanics. All this finally allows one to exclude hidden variable models, albeit with additional assumptions besides noncontextuality.

Concerning the compatibility of the measurements within one row or column, the experiment by Kirchmair et al. (2009) made several tests. First, for compatible measurements the order of the measurements within one row or column should not matter. This was tested (see Fig. 11) and confirmed. Second, for compatible measurements the values within a sequence of measurements should not change. For that, one can consider the measurement $A = \sigma_z \otimes \mathbb{1}$ and a sequence of measurements $A_1A_2A_3\cdots$ of it, where we use again the notation of Sec. IV.C.1, with $A_i$ denoting the measurement of $A$ in the sequence position $i$ (to avoid confusion, we remark that this notation is no longer used in the following sections). Then, the question is whether the results of $A_i$ and $A_j$ are the same; this can be quantified by the mean value $\langle A_iA_j\rangle$. Here, values from $\langle A_1A_2\rangle = 0.97 \pm 0.01$ to $\langle A_1A_5\rangle = 0.95 \pm 0.01$ have been reported (Gühne et al., 2010). For a nonlocal measurement such as $c = \sigma_x \otimes \sigma_x$ the imperfections are larger and one finds, for instance, $\langle c_1c_3\rangle = 0.88 \pm 0.01$. In addition, measurement sequences like $c_1C_2c_3$ can be used to test the compatibility of $C$ and $c$. Here, values of $\langle c_1c_3\rangle = 0.83 \pm 0.02$ have been found.

The observations confirm that the implemented measurements are to a certain extent repeatable and compatible. The question remains whether this is sufficient to rule out hidden variable models with some extra assumptions. For that, Kirchmair et al. (2009) used a model where certain error probabilities for short sequences are assumed to be bounded by the error probabilities of longer sequences, see Sec. IV.C.1 and Gühne et al. (2010) for a detailed discussion. With that, the inequality

$$\langle\chi\rangle = \langle BC\rangle + \langle bc\rangle + \langle Bb\rangle - \langle Cc\rangle - 2p^{\mathrm{err}}[CBC] - 2p^{\mathrm{err}}[cbc] - 2p^{\mathrm{err}}[bBb] - 2p^{\mathrm{err}}[cCc] \le 2 \tag{58}$$

can be derived. Here, $p^{\mathrm{err}}[CBC]$ denotes the probability that the value of $C$ is flipped if the sequence $C_1B_2C_3$ is measured. Experimentally, a value of $\langle\chi\rangle = 2.23 \pm 0.05$ was found, ruling out this type of hidden variable model.

3. A test of the Peres-Mermin inequality with photons

A further test of the Peres-Mermin square has been performed with photons (Amselem et al., 2009). Here, a single photon was used to carry two qubits: One qubit was encoded in the polarization, and a second one in the path of the photon (see also Fig. 12).

Given this two-qubit system, one has to implement the nine measurements $A, \ldots, \gamma$ from the Peres-Mermin square. Note that a standard measurement of the polarization or path with photon detectors is not suitable, as then the photon is absorbed and no further sequence of measurements can be carried out. In Amselem et al. (2009) this was done by constructing for each measurement an interferometric setup, where the result of the measurement is encoded in the output port of the interferometer, see Fig. 12 (c). For instance, $A = \sigma_z^s$ on the spatial qubit is effectively an empty interferometer: the photon leaves the setup in direct correspondence to the input. The polarization measurement $B = \sigma_z^p$ can be implemented with a polarizing beam splitter, which

makes the output port dependent on the input polarization. These are only simple examples; some other measurements essentially require an entangled Bell measurement for their implementation.

FIG. 12 (a) Encoding of two qubits in one photon. A tunable polarizing beam splitter distributes the photon over two spatial modes, resulting in the spatial qubit. The polarization qubit is adjusted by half- and quarter-wave plates. (b) A row or column of the Peres-Mermin square is measured by a sequence of interferometers. After the sequence, the photon can be in one of eight different outputs, representing the eight outcomes of the sequence of measurements. (c) Detailed interferometric setups for all the nine measurements in the Peres-Mermin square. The figure is taken from Amselem et al. (2009).

For measuring a sequence like $CAB$ one has to concatenate these interferometers, see Fig. 12 (b). Here, one needs to build the setup for measuring $A$ twice, once for each possible output port of $C$. Moreover, one needs four copies of the setup for $B$, as $CA$ can have four possible results, i.e., the photon may be in four different paths. Finally, the result of the entire sequence is measured by a click in one of eight detectors. The inequality has been checked for 20 different input states. On average, a value of

$$\langle \mathrm{PM}\rangle = 5.4550 \pm 0.0006 > 4 \tag{59}$$

was found.

4. A test of the KCBS inequality with photons

The KCBS inequality from Section IV.A has first been tested in an experiment using photons (Lapkiewicz et al., 2011). Here, the three basis states of the Hilbert space were given by three possible paths of a photon in an interferometer (see also Fig. 13).

First, a single photon is coherently distributed over the three paths via beam splitters, thus preparing the initial state. A measurement is done by detection in a possible path (result $-1$) and if no photon is detected the measured value is $+1$. A pair of observables from the KCBS inequality is measured simultaneously by marking two paths; if the photon is in such a path, the product is assigned the value $-1$; no detection corresponds to the value $+1$. For example, $A_1A_2$ can be directly measured (Fig. 13(b)), but for the next term in the inequality, optical elements are used to manipulate the two paths which are not needed for the measurement of $A_2$. The two paths are then used for $A_3$ and the next term in the inequality (Fig. 13(c)).

The setup has to ensure that, e.g., the observable $A_2$, which is measured both in the term $A_1A_2$ and in $A_2A_3$, corresponds exactly to the same experimental setup. In the experiment this problem was solved by a careful design of a measurement sequence (see the right-hand side of Fig. 13). However, in the last correlation one does not measure $A_5A_1$, but $A_5A_1'$, where $A_1'$ has a different structure than the measurement $A_1$ in the correlation $A_1A_2$. One can however compare the properties of $A_1$ and $A_1'$ and then argue that it is effectively the same measurement. In the experiment there was a small deviation between $A_1$ and $A_1'$, but it was suggested to take this into account with a correction term in the KCBS inequality, leading to a modified classical bound of $-3.081 \pm 0.002$. For the KCBS correlation, a value of

$$\langle A_1A_2\rangle + \langle A_2A_3\rangle + \langle A_3A_4\rangle + \langle A_4A_5\rangle + \langle A_5A_1'\rangle = -3.893 \pm 0.006 \tag{60}$$

was found, violating the contextuality inequality by 120 standard deviations.
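To put the measured value in context, the uncorrected noncontextual bound of the KCBS expression can be found by enumerating all $\pm 1$ assignments to the five observables. The sketch below does this and compares the result with the reported value; the quantum minimum $5 - 4\sqrt{5}$ is the standard textbook value for a qutrit and is an assumption here, as it is not stated in this section. The variable and function names are illustrative.

```python
from itertools import product
from math import sqrt


def kcbs_value(assignment):
    """Sum of A_i * A_{i+1} around the 5-cycle for one +/-1 assignment."""
    return sum(assignment[i] * assignment[(i + 1) % 5] for i in range(5))


# Noncontextual (NCHV) minimum: every observable has one context-independent value.
noncontextual_min = min(kcbs_value(a) for a in product((-1, 1), repeat=5))

# Standard quantum minimum for a three-level system (assumed, not from this section).
quantum_min = 5 - 4 * sqrt(5)

# Value reported in Eq. (60), central value only.
measured = -3.893

print("noncontextual minimum:", noncontextual_min)
print("quantum minimum:      ", round(quantum_min, 3))
print("measured below bound: ", measured < noncontextual_min)
```

The enumeration reflects a parity argument: on an odd cycle the five products cannot all be $-1$, so at most four of them are, which limits the sum from below. The experimentally corrected bound of $-3.081$ quoted above is slightly lower than this ideal value because of the $A_1$ versus $A_1'$ mismatch.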

FIG. 13 Left: Setup for the measurement of the correlations $\langle A_iA_j\rangle$ in the KCBS inequality in Eq. (60). The measurement $A_i$ has the result $-1$ if the corresponding detector clicks, otherwise the value $+1$ is assigned. Consequently, the product $A_iA_j$ has the value $+1$ if both or no detectors register a photon, otherwise the value is $-1$. Right: Concrete experimental setup. The transformations $T_i$ are implemented via insertion of the half-wave plates $\mathrm{WP}_i$, which, in combination with polarizing beam splitters, distribute the photons across the modes. The figure is taken from Lapkiewicz et al. (2011).
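The assignment rule in the caption can be written as a short formula for the correlator. This is only a sketch under an assumption not stated in the caption: each run contains at most one photon, so the two detectors never click together. The symbols $p_i$ and $p_j$ (click probabilities of the two detectors) are introduced here for illustration.

```latex
% Assumption: single-photon runs, so "both detectors click" has probability 0.
% p_i, p_j: hypothetical click probabilities of the detectors for A_i and A_j.
\begin{align}
  \langle A_i \rangle     &= (+1)(1 - p_i) + (-1)\,p_i = 1 - 2p_i, \\
  \langle A_i A_j \rangle &= (+1)(1 - p_i - p_j) + (-1)(p_i + p_j)
                           = 1 - 2\,(p_i + p_j).
\end{align}
```

Under this assumption each term of the KCBS sum is fixed by two click probabilities, so a strongly negative correlator corresponds to the photon being found in one of the two marked paths most of the time.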

5. Final considerations on KS contextuality experiments

To conclude this section, let us first briefly mention some other experimental tests of contextuality inequalities. The contextuality inequality from the Peres-Mermin square has also been tested using nuclear magnetic resonance (Moussa et al., 2010) and, in its entropic version, with photons (Qu et al., 2020); a similar inequality has been tested with photons (Liu et al., 2009), and the Mermin star inequality has been tested with nitrogen-vacancy centers in diamond (van Dam et al., 2019). The KCBS inequality and its generalizations have been tested with superconducting qubits (Jerger et al., 2016), photons (Arias et al., 2015; Borges et al., 2014), and ions (Malinowski et al., 2018), and Um et al. (2013, 2020) explored the connection to randomness generation.

The inequality of Yu and Oh has first been tested with photons by Zu et al. (2012) (see also the discussion in Amselem et al., 2013; Zu et al., 2013). Further tests have been implemented with a single trapped ion (Zhang et al., 2013) and nitrogen-vacancy centers in diamond (Kong et al., 2012, 2016). In a more recent experiment with a trapped ion, the compatibility relations of the observables have been studied in detail (Leupold et al., 2018). Finally, there are experimental works which aim at an observation of contextuality effects in classical systems (Frustaglia et al., 2016; Li et al., 2017; Zhang et al., 2019), see also the discussion of Markiewicz et al. (2019).

In all the experiments listed above, several different approaches have been proposed to test contextuality. It is important to distinguish three main categories. In Sec. IV.B and IV.C, we argued in favor of the use of sequential measurements, such as, e.g., the experiments by Kirchmair et al. (2009) discussed in this section. The alternative is that of joint measurements. Some experiments adopted this approach, such as the one by Lapkiewicz et al. (2011) described in this section. Even in the joint measurement approach, effort has been put into the identification of which part of the device corresponds to each single measurement, and the quantification of experimental imperfections and their consequences on the noncontextual bound, in a similar spirit as the approaches discussed in Sec. IV.C. Finally, there are other experiments where the single measurements in each context are not so well characterized, sometimes even implemented in different ways in different contexts, making the experimental procedure itself context dependent. This can lead to discussions about the interpretation of the experiment (Amselem et al., 2013; Zu et al., 2012, 2013).

The experiments chosen as examples are just representatives of a broad range of different experiments with analogies and differences. The variability of the different approaches is arguably due to the lack of a clear and comprehensive theoretical description of a quantum contextuality experiment. The present review aims to fill this gap.

Finally, there are two types of experiments we have not mentioned so far, namely, experiments on Spekkens' contextuality, which are covered in detail in the next section, and the Liang-Spekkens-Wiseman approach (Liang et al., 2011), e.g., tested by Zhan et al. (2017), which we do not cover in this review.

E. A different notion of contextuality: Spekkens' approach

1. Spekkens' definition of noncontextuality

A different notion of contextuality has been introduced by Spekkens (2005) and further explored and developed in subsequent papers (Kunjwal, 2020; Kunjwal and Spekkens, 2015, 2018; Mazurek et al., 2016; Pusey, 2018; Schmid et al., 2019, 2020, 2018; Spekkens, 2014; Xu et al., 2016). The starting point is an operational interpretation of a physical theory, namely, a construction where the primitive elements are preparation, transformation, and measurement procedures. Such procedures are intended as a list of instructions for “operations” that can be performed in a laboratory. For the case of a prepare-and-measure scenario, i.e., ignoring for the moment transformation procedures, the basic elements are preparations and measurement effects together with rules for calculating probabilities, i.e., $p(k|P,M)$, representing the probability of obtaining the outcome $k$ for the measurement $M$, given the preparation procedure $P$.

The notion of a noncontextual operational theory is based on the idea of statistical indistinguishability of procedures. In simple terms, one may call operationally equivalent those procedures that give rise to the same statistics, and require that they should be represented by the same elements of the theory. Consequently, Spekkens defines equivalence classes of preparations and measurements, as follows

$$P \sim P' \iff p(k|P,M) = p(k|P',M) \ \text{ for all measurements and outcomes } k, M, \tag{61}$$

$$(M,k) \sim (M',k') \iff p(k|P,M) = p(k'|P,M') \ \text{ for all preparations } P. \tag{62}$$

If one applies the equivalence in Eq. (61) to experimental procedures described according to quantum mechanics, one obtains that each equivalence class $[P]$ is associated with a quantum state $\rho$, since there is no way to distinguish via a quantum measurement two preparations that give rise to the same state $\rho$. A typical example is that of a spin-1/2 particle: it is not possible to distinguish the preparation of an equal mixture of states polarized along $z$, corresponding to the quantum state $\rho = (|0\rangle\langle 0| + |1\rangle\langle 1|)/2$, from the preparation of an equal mixture of states polarized along $x$, corresponding to the quantum state $\rho = (|+\rangle\langle +| + |-\rangle\langle -|)/2$. Similarly for Eq. (62), each measurement $M$ is associated with a POVM $\{E_k\}_k$ and each outcome $k$ with a single element $E_k$, with $E_k \ge 0$ and $\sum_k E_k = \mathbb{1}$, independently of the particulars of the experimental implementation of the measurement.

A hidden variable description of preparations and measurement procedures is given by an ontological model, which plays a role similar to that of a NCHV model in Sec. IV.A.1. A crucial difference, however, is that such procedures involve only a preparation and a measurement, possibly also a transformation, but we do not consider this case here. Joint or sequential measurements are not considered. An ontological model for the probability $p(k|P,M)$ is then given by

$$p(k|P,M) = \int d\lambda\, \mu_P(\lambda)\, \xi_{M,k}(\lambda), \tag{63}$$

where $\mu_P: \Omega \to [0, 1]$ is the probability density associated with the preparation procedure $P$, i.e., $\int d\lambda\, \mu_P(\lambda) = 1$, and $\xi_{M,k}: \Omega \to [0, 1]$ represents the indicator function associated with the outcome $k$ of $M$, satisfying $\sum_k \xi_{M,k}(\lambda) = 1$ for all $\lambda \in \Omega$. Equation (63) is reminiscent of the expression for joint measurements or local hidden state models arising in the context of quantum steering (Uola et al., 2020). In fact, some formal equivalence between Spekkens' contextuality and these two phenomena has been shown (Tavakoli and Uola, 2020).

The condition of noncontextuality, then, amounts to the requirement that to each equivalence class in the operational model corresponds the same description in the ontological model. In other words, if $P \sim P'$ then $\mu_P = \mu_{P'}$ (condition of preparation noncontextuality) and if $(M, k) \sim (M', k')$, then $\xi_{M,k} = \xi_{M',k'}$ (condition of measurement noncontextuality). It is then possible to obtain a contradiction between the above assumptions and the predictions of quantum mechanics, hence showing the impossibility of a noncontextual ontological model.

In the simplest example of the impossibility of a preparation noncontextual ontological model, Spekkens (2005) writes the maximally mixed state of a qubit, i.e., $\mathbb{1}/2$, as a convex combination of different rank-1 projectors, namely,

$$\frac{\mathbb{1}}{2} = \frac{1}{2}\big(|\psi_a\rangle\langle\psi_a| + |\psi_A\rangle\langle\psi_A|\big) = \frac{1}{2}\big(|\psi_b\rangle\langle\psi_b| + |\psi_B\rangle\langle\psi_B|\big) = \frac{1}{2}\big(|\psi_c\rangle\langle\psi_c| + |\psi_C\rangle\langle\psi_C|\big) = \frac{1}{3}\big(|\psi_a\rangle\langle\psi_a| + |\psi_b\rangle\langle\psi_b| + |\psi_c\rangle\langle\psi_c|\big) = \frac{1}{3}\big(|\psi_A\rangle\langle\psi_A| + |\psi_B\rangle\langle\psi_B| + |\psi_C\rangle\langle\psi_C|\big), \tag{64}$$

where the vectors, depicted in Fig. 14 in the Bloch representation, are defined as

$$|\psi_a\rangle = (1, 0), \quad |\psi_A\rangle = (0, 1), \quad |\psi_b\rangle = \tfrac{1}{2}\big(1, \sqrt{3}\big), \quad |\psi_B\rangle = \tfrac{1}{2}\big(\sqrt{3}, -1\big), \quad |\psi_c\rangle = \tfrac{1}{2}\big(1, -\sqrt{3}\big), \quad |\psi_C\rangle = \tfrac{1}{2}\big(\sqrt{3}, 1\big). \tag{65}$$

FIG. 14 Different decompositions of the maximally mixed state $\mathbb{1}/2$, represented in the $(x, z)$-plane of the Bloch sphere, in terms of the pairs along opposite lines, e.g., $|\psi_a\rangle, |\psi_A\rangle$, or triples in the same triangle, e.g., $|\psi_a\rangle, |\psi_b\rangle, |\psi_c\rangle$.

Under the assumption of preparation noncontextuality, the corresponding probability measures in the ontological models must coincide

$$\frac{\mu_a(\lambda) + \mu_A(\lambda)}{2} = \frac{\mu_b(\lambda) + \mu_B(\lambda)}{2} = \frac{\mu_c(\lambda) + \mu_C(\lambda)}{2} = \frac{\mu_a(\lambda) + \mu_b(\lambda) + \mu_c(\lambda)}{3} = \frac{\mu_A(\lambda) + \mu_B(\lambda) + \mu_C(\lambda)}{3}. \tag{66}$$

Moreover, due to the orthogonality conditions, i.e., $\langle\psi_a|\psi_A\rangle = \langle\psi_b|\psi_B\rangle = \langle\psi_c|\psi_C\rangle = 0$, one has that the corresponding distributions in the ontological model should not overlap, namely,

$$\mu_a(\lambda)\mu_A(\lambda) = \mu_b(\lambda)\mu_B(\lambda) = \mu_c(\lambda)\mu_C(\lambda) = 0 \ \text{ for all } \lambda, \tag{67}$$

since they can be distinguished with certainty with a single-shot measurement. By checking all possible assignments of null values according to the above product conditions, one immediately realizes that the only possible solution to Eqs. (66) and (67) is $\mu_a(\lambda) = \mu_A(\lambda) = \mu_b(\lambda) = \mu_B(\lambda) = \mu_c(\lambda) = \mu_C(\lambda) = 0$. The same argument can be extended to show preparation contextuality of any mixed state (Banik et al., 2014). The above ideas have been further elaborated to devise experimental tests of Spekkens' contextuality, as we discuss in the next section.

2. Inequalities for Spekkens' noncontextuality

Spekkens' notion of noncontextuality has been tested experimentally in Anwer et al. (2019); Hameedi et al. (2017); Mazurek et al. (2016); Spekkens et al. (2009). In order to do so, it is necessary first to derive noncontextuality inequalities that are testable against the observed statistics. This has been done in several works (Krishna et al., 2017; Kunjwal and Spekkens, 2015, 2018; Mazurek et al., 2016; Pusey, 2018; Schmid et al., 2018; Spekkens et al., 2009; Xu et al., 2016). In the following, we present the results by Mazurek et al. (2016), both on the theoretical and experimental side, as the authors managed to overcome some difficulties present in the first experiment (Spekkens et al., 2009), in particular, in relation to the operational equivalence of preparations and measurements.

The noncontextuality inequality by Mazurek et al. (2016), based on both preparation and measurement noncontextuality, is presented in the following. First, one needs to introduce six preparations $P_{t,b}$, for $t = 1, 2, 3$ and $b = 0, 1$, such that

$$P_* := \frac{1}{2}(P_{t,0} + P_{t,1}) = \frac{1}{2}(P_{t',0} + P_{t',1}) \ \text{ for all } t, t' = 1, 2, 3, \tag{68}$$

and three dichotomic measurements $M_t$ for $t = 1, 2, 3$ such that the average measurement is the “fair coin flip” measurement, i.e.,

$$M_* := \frac{1}{3}\sum_t M_t, \ \text{ with } p(b|M_*, P) = \frac{1}{2}, \ \forall P \text{ and } b = 0, 1, \tag{69}$$

where, as usual, $p(b|M,P)$ denotes the probability of an output $b$ given the preparation $P$ and the measurement $M$. Then, the noncontextuality inequality reads

$$A = \frac{1}{6}\sum_{t=1,2,3}\ \sum_{b=0,1} p(b|M_t, P_{t,b}) \le \frac{5}{6}. \tag{70}$$

Such an upper bound can be computed in terms of the ontological model as follows

$$\frac{1}{6}\sum_{t=1,2,3}\ \sum_{b=0,1}\ \sum_\lambda \xi(b|M_t, \lambda)\,\mu(\lambda|P_{t,b}) \le \frac{1}{3}\sum_{t=1,2,3}\ \sum_\lambda \eta(M_t, \lambda)\Big(\frac{1}{2}\sum_{b=0,1}\mu(\lambda|P_{t,b})\Big) = \sum_\lambda \mu(\lambda|P_*)\Big(\frac{1}{3}\sum_{t=1,2,3}\eta(M_t, \lambda)\Big) \le \max_\lambda \Big(\frac{1}{3}\sum_{t=1,2,3}\eta(M_t, \lambda)\Big), \tag{71}$$

where $\eta(M_t, \lambda) := \max_{b=0,1} \xi(b|M_t, \lambda)$. The assumption that $M_*$ is the “fair coin flip” implies that $\frac{1}{3}\sum_t \xi(b|M_t, \lambda) = \xi(b|M_*, \lambda) = \frac{1}{2}$ for $b = 0, 1$, which constrains the three-dimensional vector $(\xi(0|M_t, \lambda))_{t=1,2,3}$ on a two-dimensional polytope inside the $[0, 1]^3$ cube,

whose extremal points are given by $(1, \tfrac{1}{2}, 0)$ and coordinate permutations. Taking the outcome maximization defining $\eta$, i.e., flipping one or more outcomes, one obtains at most assignments $(1, \tfrac{1}{2}, 1)$, and permutations, which give the upper bound $\tfrac{5}{6}$. Notice that the derivation of Eq. (70) uses both preparation noncontextuality [$\sum_b \mu(\lambda|P_{t,b})/2 = \mu(\lambda|P_*)$] and measurement noncontextuality [$\sum_t \xi(b|M_t, \lambda)/3 = \xi(b|M_*, \lambda)$]; in fact, the inequality can be violated by models that violate at least one of the constraints, as shown by Mazurek et al. (2016). Moreover, it is clear that quantum theory can violate it up to the algebraic maximum 1. This is done by using as $P_{t,b}$ the six preparations in Fig. 14, with the pairs $b = 0, 1$, for fixed $t$, associated with antipodal points (i.e., $|\psi_a\rangle, |\psi_A\rangle$ etc.), and as measurements $M_t$ the three projective measurements with two outcomes, rotated by $2\pi/3$ on the Bloch sphere, such that the three effects, each for the 0 outcome of $M_t$ for $t = 1, 2, 3$, form the triangle $|\psi_a\rangle, |\psi_b\rangle, |\psi_c\rangle$. General methods for computing maximal violations of such noncontextuality inequalities have been developed (Chaturvedi et al., 2020; Tavakoli et al., 2020).

FIG. 15 Experimental setup in Mazurek et al. (2016). The quantum system consists of a single photon of which a specific polarization is prepared via a polarizer and two waveplates (preparation $P_{t,b}$) and then measured via two waveplates, a polarizing beamsplitter and two detectors (measurement $M_t$). The figure is taken from Mazurek et al. (2016).

3. Experimental tests of Spekkens’ contextuality

In the derivation of Eq. (70), both the assumptions of preparation and measurement noncontextuality enter. Notice that, without further assumptions, to infer operational indistinguishability one needs to perform all possible measurements as in Eq. (62). To avoid this problem a minimal set of measurements is assumed to be necessary to infer that two preparation procedures are operationally indistinguishable. Such a minimal set is said to be tomographically complete. Similarly, a tomographically complete set of preparations is assumed to exist, in order to infer that two measurements are operationally indistinguishable. Without assuming quantum theory in its entirety, the set of tomographically complete operations is usually assumed to be the same as the quantum one in order to interpret experimental results (Mazurek et al., 2016). Under this assumption, one can avoid testing the operational equivalence over infinitely many preparations and measurements. In Mazurek et al. (2016), it is assumed that, respectively, three measurements and four state preparations are tomographically complete, as it is the case for qubit systems.

A second problem arises, namely that due to experimental imperfections, it is not possible to find pairs of preparations $\{P_{t,b}\}_b$ with the same average preparation $P_*$ and a triple of measurements with the same average measurement $M_*$ corresponding to a “fair coin flip”. This problem is addressed by constructing the so-called secondary preparations and measurements, as convex mixtures of the primary ones (8 state preparations and 4 measurements), which are those directly implemented in the experiment. In other words, from the primary preparations and measurements performed, one can infer what would have been the value of their convex combinations. Among the secondary operations, one can select those that are the closest to the primary ones and at the same time satisfy exactly the conditions in Eqs. (68) and (69). To avoid any reference to quantum theory, secondary preparations and measurements are described in terms of a generalized probability theory, with the above assumption on tomographic completeness for three measurements and four preparations.

The experiment is performed by preparing and measuring the polarization degree of freedom of a single photon, as depicted in Fig. 15. An experimental value of $A = 0.99709 \pm 0.00007$ for the parameter $A$ in Eq. (70) is then observed, based on the inferred values of the secondary preparations and measurements, violating the noncontextual bound $5/6$.

4. Relation with different notions of hidden variable theory

In Spekkens’ notion of contextuality, a fundamental role is played by two properties, namely, the existence of a non-unique decomposition of quantum mechanical mixed states into pure states, and the requirement that the indistinguishability present at the operational level, identified here with quantum mechanical predictions, is satisfied at the level of the ontological model.

In classical probability theory (let us consider for simplicity a finite number of events), the space of probabilities is a simplex, hence, each point can be written as a unique decomposition of its extreme points, interpreted as classical pure states and corresponding to $\{0, 1\}$-valued probability assignments (see also the discussion in Sec. V.A). It is perhaps not surprising, then, that attempts to construct (measurement) noncontextual hidden variable models for two-level quantum systems, e.g., the models by Bell (1966) and by Kochen and Specker (1967), turn out to be explicitly preparation contextual (Leifer and Maroney, 2013). In these models, in fact, if we were able to “directly measure” the hidden variable $\lambda$, we could distinguish between an equal mixture of $|0\rangle$ and $|1\rangle$ and an equal mixture of $|+\rangle$ and $|-\rangle$, even though both mixtures give rise to the same quantum state $\rho = \mathbb{1}/2$. This is an explicit example of preparation contextuality: According to the hidden variable model such preparations are indeed different, even though quantum measurements are not able to distinguish them.

The notion of operational indistinguishability, based on the above notion of a tomographically complete set of operations, implicitly assumes that the ontological model predicts no new phenomena, but merely provides a classical description of the same operations as those allowed in quantum mechanics. Operational indistinguishability is, then, justified by what Spekkens calls Leibniz’s principle of the ontological identity of empirical indiscernibles (Spekkens, 2019).

This assumption may or may not be justified, depending on the chosen perspective on hidden variable models: Is it possible that some state preparations are indistinguishable in quantum mechanics, but at the level of a deeper theory, they are different? An analogy often encountered is that of thermodynamics and classical statistical mechanics: we have the assumption of a microscopic theory, but with limitations on available state preparations (e.g., thermal states described by the Maxwell-Boltzmann distribution) and measurements (e.g., macroscopic variables such as volume, pressure, temperature, total magnetization, heat capacity). Eventually, atomic theory led to observable consequences. In contrast, it is still under debate whether hidden variable theories even predict deviations from quantum theory and, if that is the case, whether we will be able to observe them at all, due to practical or fundamental limitations. The former is an open problem in, e.g., Bohmian mechanics, in relation to Bell experiments (Correggi and Morchio, 2002; Kiukas and Werner, 2010) and tunneling time (Hauge and Støvneng, 1989; Landauer and Martin, 1994; Stomphorst, 2002), or in hidden variable theories analyzed from a thermodynamical perspective (Cabello et al., 2016a), whereas the latter is an open problem in, e.g., collapse models (Bassi et al., 2013).

Another interpretation of preparation contextuality, such as the one in the model of Kochen and Specker (1967), has been presented by Ballentine (2014): the symmetry of the high-level theory (in this case, quantum theory) is not satisfied by the low-level theory (hidden variable). For instance, in the example above, an equal mixture of $|0\rangle$ and $|1\rangle$ would have a cylindrical symmetry along the $z$-axis, whereas an equal mixture of $|+\rangle$ and $|-\rangle$ would have a cylindrical symmetry along the $x$-axis. According to Ballentine (2014), however, this is a common phenomenon in both classical and quantum physics. For instance, it appears in the theory of electrical conductivity when comparing the (high-level) continuum theory and the (low-level) crystal lattice theory.

Symmetry may play a central role in the formulation of a physical theory also in the form of gauge symmetry. In fact, even in the absence of observable differences, a redundant description, i.e., one where different theoretical descriptions are assigned to the same physical situation, may still prove itself useful, as it is manifest in the case of classical electrodynamics.

Of course, we have so far no experimental evidence suggesting that we should abandon quantum mechanics for some alternative theory, classical or not. In contrast, much effort has been devoted to disproving various classes of hidden variable theories on the basis of experimental data. In this sense, a minimal set of initial assumptions on hidden variable theories allows one to disprove the largest possible class of such theories.

V. ADVANCED TOPICS AND METHODS

This section is intended to be a detailed one where we discuss more advanced topics and methods associated with quantum contextuality. We address questions such as how to compute noncontextuality inequalities for a given scenario, what the corresponding maximal quantum violation is, which scenarios give rise to contextuality and to state-independent contextuality, how contextuality is related to nonlocality, and so on.

The section is organized as follows. In Sec. V.A, we introduce the noncontextuality polytope, which describes noncontextual correlations associated with a given scenario, i.e., a fixed set of measurements and contexts, and allows one to compute noncontextuality inequalities. In Sec. V.B, we discuss the connection between graph theory and noncontextuality: since contexts can be represented as graphs, or more generally hypergraphs, several properties of contextuality scenarios can be investigated in terms of graph theoretic properties. In Sec. V.C, we discuss the connection between Bell nonlocality and contextuality. In Sec. V.D, we discuss different approaches to the classical simulation of contextual correlations. In Sec. V.E, we present the debate on the nullification of the KS theorem.

A. The noncontextuality polytope

The set of correlations that can be achieved within a noncontextual theory forms a polytope in the space of probability assignments. Analogously to the case of Bell nonlocality, the study of these polytopes plays a fundamental role in the investigation of noncontextuality inequalities. We introduce the basic notions and briefly discuss some results that have been achieved using this approach. Correlation polytopes have been introduced in the study of Bell inequalities by Froissart (1981), Garg and Mermin (1984), and Pitowsky (1986), see also the

book Pitowsky (1989). The latter has been the most widely used reference. Interestingly, Horn and Tarski (1948) already provided a solution to the marginal problem several years earlier, which turned out to be equivalent to the correlation polytope approach (see De Simone and Pták, 2015). In the context of noncontextuality inequalities, the first references to systematically use these notions are Klyachko et al. (2008) and Kleinmann et al. (2012).

1. Simplest example

In this subsection, we introduce, in basic terms and by means of the simplest example, the notion of correlation polytope. In the next subsection, we discuss their basic mathematical properties.

In the case of a finite number of measurement settings and outcomes, a probability distribution is described by some positive numbers $p_i \ge 0$, $i = 1, \ldots, n$ such that $\sum_i p_i = 1$. We can interpret them geometrically as the set $S_n = \{\boldsymbol{p} \in \mathbb{R}^n \mid p_i \ge 0, \sum_i p_i = 1\}$, also known as a simplex: an $(n-1)$-dimensional polyhedron with $n$ facets and $n$ extremal points, i.e., the generalization of the triangle, tetrahedron, etc. Equivalently, it can be seen as the set of convex combinations of the elements of the canonical basis of $\mathbb{R}^n$, $\{e_i\}_{i=1}^{n}$, namely $S_n = \{\sum_i p_i e_i \mid p_i \ge 0, \sum_i p_i = 1\} =: \mathrm{conv}(\{e_i\}_i)$. Each extremal point $e_i$ can be interpreted as a probability assignment of 1 to the $i$-th event and 0 to the others.

Ultimately, we want to represent probabilities of outcomes for a certain set of measurements, hence, each single event $i$ is associated with a specific sequence of outcomes. For instance, we may have the case of two measurements $A_1, A_2$ with values 0 or 1. Then, we define events as $\{00, 01, 10, 11\}$ and their probabilities as $p_{00} := \mathrm{Prob}(A_1 = 0, A_2 = 0)$, $p_{01} := \mathrm{Prob}(A_1 = 0, A_2 = 1)$, $p_{10} := \mathrm{Prob}(A_1 = 1, A_2 = 0)$, $p_{11} := \mathrm{Prob}(A_1 = 1, A_2 = 1)$. It is straightforward to check that the corresponding polytope has dimension three, since there is an equality constraint (normalization of probability). We can, then, perform a linear transformation $(p_{00}, p_{01}, p_{10}, p_{11}) \mapsto (p_1, p_2, p_{12})$, by computing the marginals, i.e., as $p_i = \mathrm{Prob}(A_i = 1)$ and $p_{12} = \mathrm{Prob}(A_1 = A_2 = 1)$. The four vertices of the polytope are, then, the four vectors $\boldsymbol{p} = (\varepsilon_1, \varepsilon_2, \varepsilon_1\varepsilon_2)$ for $\varepsilon_1, \varepsilon_2 = 0, 1$, corresponding to deterministic assignments of values to $A_1$ and $A_2$. Such vectors form the tetrahedron plotted in Fig. 16. It is straightforward to verify that the faces of the tetrahedron are given by the following inequalities

$$
\begin{aligned}
p_{12} &\ge 0, \\
p_1 - p_{12} &\ge 0, \\
p_2 - p_{12} &\ge 0, \\
1 - p_1 - p_2 + p_{12} &\ge 0,
\end{aligned}
\tag{72}
$$

which simply represent the constraints of positivity of the four probabilities $\mathrm{Prob}(A_1 = x, A_2 = y)$, $x, y = 0, 1$, rewritten in terms of the marginals $p_1, p_2, p_{12}$.

FIG. 16 Polytope associated with two measurements $A_1, A_2$. The four vertices are the deterministic assignments, with corresponding coordinate labelling. Ineqs. (72) are associated with the four faces of the tetrahedron, e.g., $p_{12} = 0$ is the plane containing the vertices $(0, 0, 0)$, $(1, 0, 0)$ and $(0, 1, 0)$, and so on.

2. Basics of convex polytopes, affine geometry, and linear programming

The general case of an arbitrary number of events is not that far from the above simple one. Before proceeding with it, we first recall some basic notions about convex polytopes; for a more detailed exposition, see Grünbaum (2003). A convex polytope can be defined in two different ways. In the vertex-representation, one specifies the extremal points of the polytope, i.e., the vertices. If the set of vertices is finite, then the convex hull of those points is a convex polytope. The other way to specify a polytope is as an intersection of a finite family of closed half-spaces $H_i = \{\boldsymbol{x} \mid \boldsymbol{m}_i \cdot \boldsymbol{x} \le b_i\}$. If the resulting set is bounded, it is also a convex polytope, otherwise, it is simply called a polyhedron, or a polyhedral set. An intersection of a polytope with an affine subspace (a section) or the image of a polytope under an affine map (e.g., a projection) both yield again a polytope.

A family of vectors $(\boldsymbol{x}_1, \boldsymbol{x}_2, \ldots)$ is affinely independent if there is only the trivial solution to the equations $\sum_k \lambda_k \boldsymbol{x}_k = 0$ and $\sum_k \lambda_k = 0$. Accordingly, the affine dimension of a family of vectors is $d = n - 1$ if $n$ is the maximal number of affinely independent vectors from the family. A facet $F$ of a $d$-dimensional polytope is the intersection of an affine $(d-1)$-dimensional hyperplane with the polytope, so that one of the open half-spaces defined by the hyperplane does not contain any part of the polytope. If the polytope is specified by a minimal set of (closed) half-spaces $\{H_i\}_i$, then those hyperplanes are $H_i \cap -H_i$. A facet is a $(d-1)$-polytope and the extremal

points of the facet are exactly those extremal points of the polytope that belong to the facet. Pitowsky’s construction (Pitowsky, 1989) makes use of those facts: The intersection of a half-space $H$ with a $d$-polytope $P$ is a facet of that polytope if and only if the extremal points of $P$ within $H$ span a $(d-1)$-dimensional affine subspace.

For the theory of Bell inequalities or noncontextuality inequalities the theory of linear optimization is central. A linear program (LP) is an optimization problem of the type “minimize $\boldsymbol{c} \cdot \boldsymbol{x}$ over $\boldsymbol{x} \in K$”, where $\boldsymbol{c}$ is some constant vector and $K$ is a polyhedral set, i.e., a finite intersection of closed half-spaces. The set of optimal solutions forms again a polyhedral set, or if the set $K$ is a polytope, it is again a polytope. If $K$ is specified by the vertices, then solving the program is simple, since the optimum is attained at one of the vertices. The most important insight about linear programs is that, even if $K$ is specified as an intersection of half-spaces, the optimization can be solved by numerical means very efficiently and with a certificate of optimality (Boyd and Vandenberghe, 2004).

3. Noncontextuality inequalities

The noncontextual polytope can be explicitly constructed as follows. Given a set of observables $\{A_i\}_{i=1}^{N}$, we denote their possible value assignments as $V = V_1 \times V_2 \times \cdots \times V_N$, where $V_k$ is the set of possible values that the observable $A_k$ can assume, e.g., $V_1 = \{0, 1\}$ for $A_1$ being a dichotomic observable. The corresponding probability simplex is the convex hull of all assignments on the set $V$, i.e., $\{\boldsymbol{p} \in \mathbb{R}^{|V|} \mid p_v \ge 0, \sum_v p_v = 1\}$, where $v = (v_1, \ldots, v_N) \in V$, and $p_v = \mathrm{Prob}(A_1 = v_1, \ldots, A_N = v_N)$. The set of all possible contexts of $\{A_i\}_{i=1}^{N}$, then, defines the marginal scenario $\mathcal{M}$, i.e., the set of marginals that are experimentally accessible, e.g., the pairs $\{A_i, A_{i+1}\}$ in the KCBS scenario (Klyachko et al., 2008). The correlation polytope is, then, the projection of the probability simplex onto the coordinates corresponding to observable probabilities. To do so, we first need a change of coordinates. The new coordinates are obtained by considering the marginals $P(A_1 = v_1), \ldots, P(A_i = v_i, A_j = v_j), \ldots$. Alternatively, one can choose any affine transformation (e.g., representation in terms of expectation values, correlators, etc.). An invertible affine transformation guarantees that the obtained conditions are still necessary and sufficient for a probability vector to have a noncontextual hidden variable model. If the transformation is not invertible, one obtains only necessary conditions. Notice that not all coordinates of $\boldsymbol{p}$ are independent, due to normalization ($\sum_v p_v = 1$) and nondisturbance conditions (Kurzyński et al., 2012), hence, invertibility must be checked with respect to this subspace.

For instance, for the case of $V_i = \{0, 1\}$ for all $i$, the correlation polytope associated with a marginal scenario $\mathcal{M}$ is the convex hull of vectors

$$
\boldsymbol{u}_{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_N, \ldots, \varepsilon_i\varepsilon_j, \ldots, \varepsilon_{i_1}\varepsilon_{i_2}\cdots\varepsilon_{i_m}, \ldots),
\tag{73}
$$

where $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N) \in \{0, 1\}^N$ and $\varepsilon_i$ represents a $\{0, 1\}$-valued assignment to $p(A_i) := \mathrm{Prob}(A_i = 1)$, and the marginals $p(A_i, A_j), \ldots$ are those appearing in the marginal scenario $\mathcal{M}$. These extreme points are precisely the projection of the extreme points of the simplex onto the subspace of observable probabilities.

Once these extreme points are defined, the corresponding noncontextuality inequalities can be obtained by computing the half-space representation of the polytope. There are several algorithms performing this transformation and several implementations of them, such as Avis (2018); Christof and Loebel (2015); Fukuda (2018); Lörwald and Reinelt (2015).

A noncontextuality inequality is of the form $\boldsymbol{\lambda} \cdot \boldsymbol{p} \le \eta$, where the inequality holds true for any $\boldsymbol{p}$ in the noncontextuality polytope. That is, a noncontextuality inequality is a half-space containing the noncontextuality polytope. Of course, such an inequality is only useful if it can be violated by a quantum system. We write $\boldsymbol{\Pi}$ for the vector of projectors $\boldsymbol{\Pi} := (P_1, \ldots, P_N, \ldots, P_iP_j, \ldots, P_{i_1}P_{i_2}\cdots P_{i_m}, \ldots)$, in analogy to Eq. (73), such that the $i$-th entry of the probability vector $\boldsymbol{p}$ can be computed from the $i$-th entry of $\boldsymbol{\Pi}$ as $p_i = \mathrm{tr}(\varrho\,\Pi_i)$. For a nontrivial noncontextuality inequality we, then, have $\boldsymbol{\lambda} \cdot \mathrm{tr}(\varrho\,\boldsymbol{\Pi}) > \eta$ for some quantum state $\varrho$.

The violation of an inequality for a fixed $\boldsymbol{\Pi}$ is defined as

$$
\Gamma(\boldsymbol{\lambda}) = \frac{\max\{\boldsymbol{\lambda} \cdot \mathrm{tr}(\varrho\,\boldsymbol{\Pi}) \mid \varrho \text{ quantum state}\}}{\max\{\boldsymbol{\lambda} \cdot \boldsymbol{p} \mid \boldsymbol{p} \text{ in NC polytope}\}} - 1.
\tag{74}
$$

Hence $\Gamma = 0$ corresponds to the situation where the inequality does not have a violation for the projectors $\boldsymbol{\Pi}$. Note that both maximizations may be restricted to the extremal points, i.e., the maximization for the quantum value can be performed over the pure states, while for the noncontextuality polytope, it is sufficient to consider all extremal points of the polytope. As a consequence, the validity of a noncontextuality inequality $\boldsymbol{m} \cdot \boldsymbol{x} \le b$ can be checked by verifying it is not violated by any vertex of the polytope, for the vertex representation $\mathrm{conv}(\boldsymbol{v}_1, \ldots, \boldsymbol{v}_k)$, and by linear programming if the description of the polytope is given in terms of half-spaces $\{\boldsymbol{x} \mid A\boldsymbol{x} \le \boldsymbol{b}\}$, i.e., as $\max_{\boldsymbol{x}} \boldsymbol{m} \cdot \boldsymbol{x}$ subject to $A\boldsymbol{x} \le \boldsymbol{b}$.

Similarly, given a probability vector $\boldsymbol{p}$, one can check if it belongs to a given noncontextuality polytope via an LP. If not, this LP provides, via its dual formulation, a noncontextuality inequality violated by $\boldsymbol{p}$. One example of this is given by the contextual fraction (CF) LP (Abramsky et al., 2017), which can also be interpreted as a geometric quantification of contextuality. In simple terms, the noncontextual fraction (NCF) is the maximum $\alpha \in [0, 1]$ such that $\boldsymbol{p}$ can be decomposed as $\boldsymbol{p} = \alpha\,\boldsymbol{p}_{\mathrm{NC}} + (1 - \alpha)\,\boldsymbol{p}_{\mathrm{C}}$, where $\boldsymbol{p}_{\mathrm{NC}}$ is a vector belonging to the noncontextuality polytope, and $\mathrm{CF} = 1 - \mathrm{NCF}$.

Several related questions are addressed in the following sections, such as the identification of the interesting contextuality scenarios, i.e., giving rise to some $\Gamma > 0$, or even SIC scenarios, the computation of quantum bounds, and so on.

An analogous approach, based on ideas on convex optimization, polyhedral sets and linear programming, can be developed for the analysis of entropy, rather than probability. Following the idea initially developed by Braunstein and Caves (1988), several authors investigated entropic noncontextuality inequalities (Chaves, 2013; Chaves and Fritz, 2012; Durucan and Grinbaum, 2020; Fritz and Chaves, 2013; Kurzyński et al., 2012; Raeisi et al., 2015). In particular, Chaves and Fritz (2012); Fritz and Chaves (2013) developed a systematic method to derive noncontextuality inequalities for an arbitrary marginal scenario, which can be briefly described as follows. In the entropic approach, one can derive the entropic inequalities by projecting the entropic cone, describing the joint entropies over all variables, onto the variables corresponding to the observed marginals, in analogy with the projection of the probability simplex described above. A complete characterization of the entropy cone is not known for more than 3 variables, however, an outer approximation in terms of the so-called Shannon inequalities is known (see Yeung, 2008, for a textbook introduction). In contrast to the probability case, entropic inequalities provide only a necessary condition for noncontextuality, except in some special cases (Chaves, 2013).

Finally, a case not covered in the correlation polytope approach is the continuous-variable (CV) case. The first proposal of a CV contextuality test was presented by Plastino and Cabello (2010) for a CV version of the Peres-Mermin square based on modular variables. The argument was further improved by Asadian et al. (2015), removing one assumption from the NCHV model (classical complex variables of modulus 1). The same scenario was further explored by Laversanne-Finot et al. (2017), who considered more general observables. Recently, Soares Barbosa et al. (2019) presented a general framework for the investigation of CV contextuality.

B. Graph theory and contextuality

Since the original paper of Kochen and Specker (1967), graphs played a central role in contextuality arguments. In the following, we discuss the connection between contextuality and graph theory, with particular emphasis on two types of graphs, namely, compatibility graphs and exclusivity graphs. We review several problems that can be formulated in terms of graph properties and graph-theoretic results. This comprises the following questions: What compatibility structures always admit a noncontextual hidden variable model? Or, equivalently, what structures are interesting for contextuality? How can we derive noncontextuality inequalities and compute the corresponding quantum bound efficiently? Which scenarios give rise to state-independent contextuality?

FIG. 17 Compatibility graph associated to the observables of the KCBS scenario, corresponding to the marginal scenario $\{(A_i, A_{i+1})\}_{i=0}^{4}$. This graph can be used also to illustrate the basic notions of path, cycles, and independent sets. A path is given by any sequence of sequentially connected vertices, e.g., $(A_i, A_{i+1}, A_{i+2})$, for any $i$ and with sum modulo 5. A cycle is a closed path such as $(A_0, A_1, \ldots, A_4, A_0)$. An independent set is a set of disconnected vertices such as $(A_i, A_{i+2})$. The pentagon contains independent sets of at most size two. One can easily show that the complement of a pentagon is again a pentagon with edges $(A_i, A_{i+2})$, for $i = 0, \ldots, 4$ and sum modulo 5.

1. Basic notions

We start by introducing basic notions and definitions in graph theory; an extensive discussion of this topic can be found in Beeri et al. (1983); Bretto (2013); Diestel (2018); Lauritzen (1996). A graph is a pair $G = (V, E)$ where $V$ is the set of vertices, or nodes, and $E$ the set of edges, i.e., unordered pairs $(i, j)$ for some $i, j \in V$. Two vertices $i, j \in V$ of a graph are adjacent, or connected, if $(i, j) \in E$. A set of mutually connected vertices is called a clique of the graph. A set of vertices such that no two of them are connected is called an independent set. A path is a sequence of distinct vertices $v_0, \ldots, v_n$ such that $v_i$ is connected to $v_{i+1}$, for $i = 0, \ldots, n-1$. A cycle is defined in the same way, but with $v_0 = v_n$. A graph is an acyclic, or a tree, graph, if it contains no cycle. A graph is triangulated, or chordal, if every cycle of length $n \ge 4$ contains a chord, i.e., an edge connecting two vertices $(v_i, v_{i+2})$ of the cycle. The complement of a graph $G = (V, E)$ is a graph $\bar{G} = (V, \bar{E})$ where $\bar{E} = \{(i, j) \mid i, j \in V\} \setminus E$, i.e., every pair of connected vertices in $G$ is disconnected in $\bar{G}$ and vice versa.

A hypergraph is a generalization of the above idea obtained by allowing edges to connect more than two vertices, namely, a pair $H = (V, E)$, where $V$ is the set of vertices and $E$ the set of hyperedges, i.e., $E \subset 2^V$, with $2^V$ the power set of $V$. Hypergraphs can also arise from graphs; for instance, the clique hypergraph $H$ of a graph $G$ is defined by the same set of vertices and has as hyperedges the cliques of $G$. If a hypergraph contains only maximal hyperedges, i.e., for each hyperedge $E$ there is no hyperedge $E'$ such that $E' \subsetneq E$, the hypergraph is said to be reduced. Given a hypergraph $H$, we say that $H'$ is the reduced hypergraph of $H$ if it is obtained from $H$ by removing all non-maximal hyperedges.

As opposed to the case of graphs, different notions of acyclicity are possible for hypergraphs. The relevant one for us is given by the following two equivalent definitions. First, a hypergraph is acyclic if it has the running intersection property, i.e., if there exists an ordering of the hyperedges, $E_1, \ldots, E_n$, such that

$$
E_i \cap (E_1 \cup \ldots \cup E_{i-1}) \subset E_j, \quad \text{with } j < i, \text{ for all } i,
\tag{75}
$$

namely, there exists an ordering such that the intersection with any new hyperedge is completely contained in one of the previous hyperedges. The second, equivalent, definition is that a hypergraph is acyclic if it is the clique hypergraph of a triangulated graph.

Their equivalence is not obvious, cf. Beeri et al. (1983); Lauritzen (1996), however, one can easily check that these definitions coincide in the case of hyperedges of cardinality two with that of trees for graphs. We will see that the running intersection property plays a central role in the construction of NCHV models.

This notion is usually called $\alpha$-acyclicity in the literature (Beeri et al., 1983; Lauritzen, 1996). In the following, we simply refer to it as acyclicity.

2. Graphs, hypergraphs, and marginal scenarios

In the abstract formulation of NCHV in Sec. IV.A.1, we defined a marginal scenario $\mathcal{M}$ as the set of all contexts for a given set of measurements $A_1, \ldots, A_n$. A natural representation of a marginal scenario is given by a hypergraph $H$: vertices represent measurements, whereas hyperedges represent contexts, see also Amaral and Terra Cunha (2018). Here, we consider the most general structure possible, without entering into the details of the specific way of realizing such contexts in practice, as we discussed in Sec. IV.B. Given its relevance, we often discuss the specific case of sharp measurements. For sharp measurements in quantum mechanics, Specker’s principle applies (Cabello, 2012; Kochen and Specker, 1967; Specker, 1960), namely that pairwise compatibility is equivalent to global compatibility. For this reason, for sharp measurements it is enough to represent the marginal scenario as a graph, interpreting edges as pairwise compatibility relations and cliques as contexts. For the case of sharp measurements, we call such graphs compatibility graphs. It is interesting to notice that any graph can be interpreted as such a compatibility graph for sharp measurements, in the sense that these compatibility relations can be realized by a set of sharp observables on a Hilbert space (Heunen et al., 2014). Similarly, if one considers contexts simply as sets of jointly measurable observables, then every hypergraph can be interpreted as a set of joint measurability relations for a given set of POVMs (Kunjwal et al., 2014). We recall that we discussed in Sec. IV.B the problems associated with the definition of contextuality simply in terms of joint measurability.

An example of a compatibility graph is given in Fig. 17 for the KCBS scenario. Each vertex represents a measurement setting $A_0, \ldots, A_4$, and edges connect vertices corresponding to the joint measurements $\langle A_iA_{i+1}\rangle$ appearing in the KCBS inequality (Klyachko et al., 2008).

The relation between the existence of NCHV models and graph-theoretic properties of the marginal scenario hypergraph, or of the compatibility graph for the case of sharp measurements, has been investigated by several authors (Budroni and Morchio, 2010; Kurzyński et al., 2012; Ramanathan et al., 2012). The general result for marginal scenario hypergraphs follows from the Vorob’ev theorem (or Vorob’yev, depending on the transliteration from Cyrillic used), originally stated in Vorob’ev (1959) [the English translation in Vorob’yev (1967)] and later proven in Vorob’ev (1962) [see also Vorob’ev (1963)]. The same result was also independently proven by other authors (Kellerer, 1964a,b; Malvestuto, 1988). In our terminology, the theorem can be stated as follows:

Theorem (Vorob’ev, 1962). Any marginal scenario represented by an acyclic hypergraph admits a joint probability distribution.

In other words, given a set of probabilities associated with a marginal scenario and coinciding on their intersection, if their structure is represented by an acyclic hypergraph, then there always exists a probability distribution of which they are the marginals. Intuitively, Vorob’ev’s result can be understood as the construction of a global probability by “gluing together” probability distributions on their intersection, a notion referred to as “adhesivity” (Matúš, 2007a). The acyclicity property of hypergraphs, in particular the running intersection property, guarantees that such a construction can always be made in a consistent way. It is instructive to illustrate this idea with the simplest example, as follows. Consider three variables $A, B, C$ and two distributions $p_1(a, b)$ and $p_2(b, c)$, such that $\sum_a p_1(a, b) = \sum_c p_2(b, c) =: p(b)$. This corresponds to a marginal scenario described by a line graph $A - B - C$, which is acyclic. One can explicitly construct a joint distribution on $A, B, C$ by “gluing” the distributions on their intersection, namely

$$
p(a, b, c) := \frac{p_1(a, b)\,p_2(b, c)}{p(b)} = p_1(a|b)\,p_2(c|b)\,p(b),
\tag{76}
$$

with the convention that $p(a, b, c) := 0$ if $p(b) = 0$. Interestingly, this is precisely the construction used by Fine (1982a) to prove that CHSH inequalities are necessary and sufficient conditions for the existence of local hidden variable models in the case of two inputs and two outputs.

The Vorob’ev result has been recently discussed in relation to contextuality (Soares Barbosa, 2014, 2015; Xu and Cabello, 2019) and causal discovery methods (Budroni et al., 2016). This result has implications for the computation of correlation polytopes and entropic cones associated with noncontextuality scenarios (Araújo et al.,

2013; Budroni and Cabello, 2012; Kujala et al., 2015) and more general causal structures (Budroni et al., 2016; Chaves et al., 2014). It is also interesting to notice that such a result is at the basis of the derivation of non-Shannon inequalities in classical information theory (Matúš, 2007b; Zhang, 2003). For the case of compatibility graphs, it is sufficient to check that the graph is triangulated, since then the corresponding hypergraph of contexts, the clique hypergraph, is acyclic according to the above definition, see also the discussion in Xu and Cabello (2019) for more details.

The above result has allowed for the identification of the simplest noncontextuality scenarios. The argument presented in Kurzyński et al. (2012) can be briefly summarized as follows. The simplest compatibility graph giving rise to contextual correlations must contain a cycle of length bigger than three, i.e., it is a square corresponding to the CHSH scenario (Clauser et al., 1969). For sharp measurements, such a graph can only be obtained in dimension $d = 4$. For $d = 3$, one needs at least a pentagon, corresponding precisely to the KCBS scenario (Klyachko et al., 2008). This can be seen as follows: A nontrivial sharp measurement in $d = 3$ is represented by the POVM $\{|v\rangle\langle v|, \mathbb{1} - |v\rangle\langle v|\}$, since it cannot be the identity and it cannot be nondegenerate, otherwise compatibility becomes a transitive relation (Correggi and Morchio, 2002), giving rise only to (a collection of) fully connected graphs, hence, always admitting a NCHV model. The compatibility of two measurements, with associated rank-1 projectors $|v\rangle\langle v|$ and $|w\rangle\langle w|$, corresponds to $\langle v|w\rangle = 0$; since two distinct nonorthogonal vectors in $d = 3$ have a unique one-dimensional subspace orthogonal to both, it is impossible to get a square compatibility graph with four different measurements.

3. Exclusivity graphs and their independence, Lovász, and fractional packing numbers

In this section, we introduce the notion of exclusivity graph, namely a graph where connected vertices represent mutually exclusive events, and discuss the significance of associated graph-theoretic quantities such as the independence number, Lovász number, and fractional packing number, following the discussion presented by Cabello, Severini, and Winter (CSW, 2014).

Given a measurement context, denoted by compatible settings $(s_1,\dots,s_n)$, an event corresponds to a given set of joint outcomes $(o_1,\dots,o_n|s_1,\dots,s_n)$. Two events $(o_1,\dots,o_n|s_1,\dots,s_n)$ and $(o'_1,\dots,o'_n|s'_1,\dots,s'_n)$ are said to be exclusive if there exist $i, j$ such that $s_i = s'_j$ but $o_i \neq o'_j$. In other words, two events are exclusive if at least one pair of measurement settings coincides, but they have different outcomes.

It is helpful to consider in detail a simple example given by the graph in Fig. 18: each vertex represents two possible outcomes for two settings, and two vertices are connected by an edge if the corresponding events are mutually exclusive. Such a graph represents a version of the KCBS noncontextuality inequality, namely Eq. (24) discussed in Sec. IV.A.4,

$$S_{\rm KCBS} = \sum_{i=0}^{4} p(-1,+1|i,i+1) \overset{\rm NCHV}{\le} 2, \tag{77}$$

where $p(-1,+1|i,i+1) \equiv \mathrm{Prob}(A_i = -1, A_{i+1} = +1)$ and the indices are taken modulo 5.

FIG. 18 Exclusivity graph associated with the five events appearing in the inequality (77), with vertices $(-1,+1|0,1)$, $(-1,+1|1,2)$, $(-1,+1|2,3)$, $(-1,+1|3,4)$, and $(-1,+1|4,0)$. The notation $(-1,+1|0,1)$ refers to the event of outcome $-1$ for the measurement of $A_0$ and outcome $+1$ for the measurement of $A_1$, and so on.

For the specific choice of quantum observables in the KCBS scenario, discussed in Sec. IV.A.2, such events are represented by projectors, e.g., $(-1,+1|i,i+1) \mapsto Q_i$ and $p(-1,+1|i,i+1) = \mathrm{tr}(\rho Q_i)$, where $Q_i := \Pi_i^{+}\Pi_{i+1}^{-}$ and $\Pi_i^{\pm}$ is the projector associated with the outcome $\pm 1$ of $A_i$. Mutually exclusive events correspond to orthogonal projectors, e.g., $Q_i Q_{i+1} = (\Pi_i^{+}\Pi_{i+1}^{-})(\Pi_{i+1}^{+}\Pi_{i+2}^{-}) = 0$.

In Cabello et al. (2014), CSW noticed the similarity between Eq. (77) and the definition of the Lovász number of a graph. The Lovász number was initially introduced in Lovász (1979) as an upper bound on the Shannon capacity of a graph (Shannon, 1956); it is a well-studied object in graph theory, and it can be efficiently computed via semidefinite programming (Lovász, 2009). For a graph $G = (V,E)$ its Lovász number $\vartheta$ is given by

$$\vartheta(G) = \max_{v_i,\psi} \sum_{i\in V} |\langle\psi|v_i\rangle|^2, \tag{78}$$

where the maximum is taken over all unit vectors $|\psi\rangle$ and over all unit vectors $|v_i\rangle$ such that $\langle v_i|v_j\rangle = 0$ whenever $i, j \in V$ are adjacent vertices. This set of vectors is also called an orthogonal representation (OR) of $\bar{G}$, the complement of $G$. Notice that the OR of a graph is defined by the fact that non-adjacent nodes are associated with vectors that are orthogonal (Lovász, 2009), which is why we consider the complement graph $\bar{G}$ to define the OR entering in Eq. (78). This seemingly counterintuitive convention might be due to the original definition of Shannon capacity of a (confusability) graph (Shannon, 1956), see Sec. VI.D.3 for more details, where nodes represent symbols of an alphabet and edges their "confusability", whereas in our case edges represent exclusivity.
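The exclusivity rule above and the classical bound in Eq. (77) can be checked by direct enumeration. The following is a minimal sketch, not taken from the original references: the dictionary encoding of events and the helper names `exclusive` and `s_kcbs` are illustrative assumptions.

```python
from itertools import combinations, product

# The five KCBS events (-1, +1 | i, i+1): outcome -1 for A_i, +1 for A_{i+1}.
# Each event is encoded (hypothetical encoding) as {setting: outcome}.
events = [{i: -1, (i + 1) % 5: +1} for i in range(5)]

def exclusive(e1, e2):
    """Exclusive if some shared setting is assigned different outcomes."""
    return any(s in e2 and e2[s] != o for s, o in e1.items())

# Edges of the exclusivity graph, as pairs (a, b) with a < b.
edges = {(a, b) for a, b in combinations(range(5), 2)
         if exclusive(events[a], events[b])}

def s_kcbs(assignment):
    """Value of the sum in Eq. (77) for a deterministic assignment A_i = +/-1."""
    return sum(assignment[i] == -1 and assignment[(i + 1) % 5] == +1
               for i in range(5))

# NCHV bound: maximum over all 2^5 deterministic noncontextual assignments.
nchv_bound = max(s_kcbs(a) for a in product((-1, +1), repeat=5))

print(sorted(edges))  # the five edges of a pentagon
print(nchv_bound)     # 2
```

Since probabilistic NCHV models are convex mixtures of deterministic assignments, maximizing over the latter is sufficient for the bound.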

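For the pentagon, an explicit set of vectors satisfying the orthogonality constraints of Eq. (78) is the well-known "umbrella" configuration of Lovász (1979). The sketch below only verifies that this particular choice is feasible and evaluates the sum for it; it does not perform the maximization itself, and the variable names are illustrative assumptions.

```python
import math

n = 5
# The polar angle phi of the umbrella ribs is fixed by <v_i|v_{i+1}> = 0:
# sin^2(phi) * cos(4*pi/5) + cos^2(phi) = 0  =>  cos^2(phi) = 1/sqrt(5).
cos2 = 1 / math.sqrt(5)
c, s = math.sqrt(cos2), math.sqrt(1 - cos2)

psi = (0.0, 0.0, 1.0)  # "handle" of the umbrella
vecs = [(s * math.cos(4 * math.pi * i / n),
         s * math.sin(4 * math.pi * i / n),
         c) for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Adjacent vertices (exclusive events) must be assigned orthogonal vectors.
max_overlap = max(abs(dot(vecs[i], vecs[(i + 1) % n])) for i in range(n))

# Value of the sum in Eq. (78) for this choice of |psi> and |v_i>.
value = sum(dot(psi, v) ** 2 for v in vecs)

print(max_overlap)             # numerically zero
print(value, math.sqrt(5))     # both approximately 2.236
```

The value obtained, $\sqrt{5} \approx 2.236$, exceeds the NCHV bound of 2 in Eq. (77) and coincides with the known Lovász number of the pentagon.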
The maximum of the expression $S_{\rm KCBS}$ in QM can, in fact, be written as

$$\max_{\rho, Q_i} \sum_i \mathrm{tr}(\rho Q_i) = \max_{v_i,\psi} \sum_{i\in V} |\langle\psi|v_i\rangle|^2 = \vartheta(G), \tag{79}$$

where each vertex of the graph $G = (V,E)$ corresponds to a projector appearing on the l.h.s. of (79) and two vertices are adjacent if the corresponding projectors are orthogonal. Notice that with the above definition of the Lovász number, the quantum maximum is given by the Lovász number of the exclusivity graph. Moreover, the use of a pure state $|\psi\rangle$ instead of $\rho$ is no restriction since, by a convexity argument, the maximum of $S_{\rm KCBS}$ is always achieved by pure states. Similarly, the use of one-dimensional projectors $|v_i\rangle\langle v_i|$ is no restriction, since for an arbitrary projector $Q_i$, we have $\langle\psi|Q_i|\psi\rangle = |\langle\psi|v_i\rangle|^2$ where $|v_i\rangle := Q_i|\psi\rangle / \sqrt{\langle\psi|Q_i|\psi\rangle}$.

In addition to the Lovász number, two other graph-theoretic quantities are central in the discussion on correlation bounds in different theories: the independence number, $\alpha$, and the fractional packing number, $\alpha^*$. The former is defined as the cardinality of the largest independent set of a graph, which can be interpreted as the maximum number of "ones", i.e., logical "true", that can be assigned to a set of vertices without violating the exclusivity condition, namely

$$\alpha(G) = \max_{c_i} \sum_{i\in V} c_i, \quad \text{such that } c_i \in \{0,1\},\ c_i c_j = 0 \text{ if } (i,j) \in E. \tag{80}$$

In terms of NCHV theories with additional exclusivity constraints (NCHV+E) discussed in Sec. IV.A.4, $\alpha$ can be interpreted as the maximum over deterministic assignments that respect the exclusivity condition, namely, that two adjacent vertices cannot both be assigned the value "1".

The fractional packing number is a linear program relaxation of the independence number, namely, the maximum sum of weights such that in every clique the sum of weights is at most one,

$$\alpha^*(G) = \max_{p_i} \sum_{i\in V} p_i, \quad \text{such that } p_i \ge 0, \text{ and } \sum_{i\in C} p_i \le 1, \text{ for all cliques } C. \tag{81}$$

The interpretation is the following. Probabilities for single events are identified independently of the context and the sum of probabilities of exclusive events within each context is less than or equal to one. This can be interpreted as a bound for generalized probability theories (GPTs) that still respect some notion of exclusivity within each context, i.e., sum of probabilities not exceeding one. We will come back to this notion of exclusivity below.

In summary, we may say that different graph-theoretic quantities, i.e., $\alpha$, $\vartheta$, and $\alpha^*$, provide information on the bounds on correlations for different theories, classical, quantum, and generalized probability theories, respectively. For the expression $S_{\rm KCBS}$ in Eq. (77), we know that the Lovász number provides a tight bound, i.e., it can be achieved in quantum mechanics. More precisely, in the KCBS case there exist sharp measurements $A_0,\dots,A_4$, with $A_i = \{\Pi_i^{+}, \Pi_i^{-}\}$, such that the events $(+1,-1|i,i+1)$ are exclusive and the rank-1 projectors $|v_i\rangle\langle v_i|$ maximizing the expression (78) are given by $|v_i\rangle\langle v_i| = \Pi_i^{+}\Pi_{i+1}^{-}$, as we already discussed in Sec. IV.A.2.

Depending on the specific assumptions on the measurement scenario, however, the bounds obtained by the Lovász number may not be tight. A typical example is the pentagon (Sadiq et al., 2013), interpreted as the exclusivity graph of a subset of events in the CHSH scenario, i.e., of the form $(a,b|x,y)$, with $x$ the setting of Alice and $y$ of Bob. The reason why this bound is not tight is that in order to interpret these events in the CHSH scenario, we need additional compatibility constraints on the measurements, in order for them to be distributed between two parties, i.e., Alice's observables are compatible with Bob's, a condition which is not encoded in the exclusivity graph. A possible extension of the exclusivity graph approach to nonlocality scenarios via multi-graphs, encoding the separation into different parties, has been proposed in Rabelo et al. (2014).

This situation, however, is not specific to Bell scenarios, but it happens also for contextuality scenarios if additional assumptions on the compatibility relations among measurements are made, in other words, if one does not simply want to reconstruct the effect operators, but also the original observables and their compatibility relations. This is similar to what happens in the Navascués-Pironio-Acín characterization of multipartite quantum correlations (Navascués et al., 2007, 2008), and is the reason why one needs to define a hierarchy of SDP conditions rather than a single one. In fact, even if the single operators $|v_i\rangle\langle v_i|$ can be reconstructed by the Lovász number SDP, it is not clear that one can reconstruct the observables, e.g., $\{A_{a|x}\}_{a,x}$, $\{B_{b|y}\}_{b,y}$ associated to the events $(a,b|x,y)$ in a Bell scenario, of which they are assumed to be effects, with the correct compatibility (in this case, commutativity) relations among them.

An alternative approach is that of taking the notion of observables and contexts as our starting point and developing from there the exclusivity relations: given the observables $\{A_{o|s}\}_{o,s}$, one constructs all possible events, i.e., for each context $C$ all the events $p(o_1,\dots,o_{|C|}|s_1,\dots,s_{|C|})$, constituting the nodes of the hypergraph, whereas the hyperedges are defined by the same exclusivity relations as above (at least one pair of identical settings with different associated outcomes). This is the hypergraph-theoretic approach to contextuality introduced and extensively investigated by Acín, Fritz, Leverrier, and Sainz (AFLS, 2015). In the AFLS approach, one starts from the set of observables and contexts and then constructs the hypergraph of effects (nodes) and exclusivity relations (hyperedges).
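As a concrete check of Eqs. (80) and (81) for the pentagon, the sketch below computes $\alpha$ by brute force and certifies $\alpha^*$ with a pair of primal and dual feasible points (by LP duality, equal values certify optimality). It is an illustration under stated assumptions rather than a general solver: the choice of weights $1/2$ and all variable names are hypothetical and specific to this graph.

```python
from fractions import Fraction
from itertools import combinations

n = 5
vertices = range(n)
edges = {frozenset((i, (i + 1) % n)) for i in vertices}  # pentagon

def is_independent(subset):
    """True if no two vertices of the subset are adjacent."""
    return all(frozenset(pair) not in edges
               for pair in combinations(subset, 2))

# Independence number, Eq. (80): brute force over all vertex subsets.
alpha = max(len(sub) for k in range(n + 1)
            for sub in combinations(vertices, k) if is_independent(sub))

# Fractional packing number, Eq. (81). The pentagon has no triangles, so its
# maximal cliques are its edges; single-vertex cliques are implied by them.
cliques = list(edges)
half = Fraction(1, 2)

# Primal feasible point: weight 1/2 on every vertex.
p = {i: half for i in vertices}
primal_ok = all(sum(p[i] for i in c) <= 1 for c in cliques)
primal_value = sum(p.values())

# Dual feasible point: weight 1/2 on every clique, so that each vertex is
# covered with total weight >= 1; its value upper-bounds any primal value.
y = {c: half for c in cliques}
dual_ok = all(sum(y[c] for c in cliques if i in c) >= 1 for i in vertices)
dual_value = sum(y.values())

print(alpha)                     # 2
print(primal_value, dual_value)  # 5/2 5/2
```

For the pentagon this gives $\alpha = 2 < \vartheta = \sqrt{5} < \alpha^* = 5/2$, matching the ordering of classical, quantum, and GPT bounds described above.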
In contrast, in the CSW approach, one starts from the effects (nodes) and their exclusivity relations (edges) and tries to construct a general noncontextuality inequality, as we explain below.

Let us now discuss how one can find noncontextuality inequalities in the graph approach. Here, one starts from an exclusivity graph, e.g., one for which it is known that $\alpha(G) < \vartheta(G)$, and interprets it as a compatibility graph, i.e., promotes each single event to a measurement. An associated noncontextuality inequality can be constructed such that the classical and quantum bounds correspond to the independence and Lovász number, respectively. A general method has been presented in the original paper (Cabello et al., 2014); however, here we discuss a slightly different, and arguably simpler, approach, which we already encountered in the KCBS example in Sec. IV.A.4. This approach is based on a general method to transform KS inequalities into NC inequalities, see Cabello (2016); Yu and Tong (2014) for more details.

Let us assume we have a graph $G$ such that $\alpha(G) < \vartheta(G)$ and we wish to construct a noncontextuality inequality and to provide a state and sharp quantum observables, with the correct compatibility relations, able to show a violation of the inequality. First, to construct the NC model, to each node of the graph $G = (V,E)$, we associate a classical variable $P_i$ with values in $\{0,1\}$. We write $\mathrm{Prob}(P_i = 1) =: \langle P_i\rangle$ and the joint probability $\mathrm{Prob}(P_i = 1, P_j = 1) =: \langle P_i P_j\rangle$. From the independence number we can derive the following bound for NCHV theories with the additional exclusivity assumption among connected events (see Sec. IV.A.4), namely,

$$\sum_{i\in V} \langle P_i\rangle \overset{\rm NCHV+E}{\le} \alpha(G). \tag{82}$$

The meaning of Eq. (82) is that the bound of $\alpha(G)$ is valid only in NCHV theories where events satisfy additional exclusivity relations, corresponding to those encoded in the graph $G$, namely, connected nodes cannot both be assigned the value 1. Following the discussion in Sec. IV.A.4, we transform the above inequality into a general noncontextuality inequality as follows

$$\sum_{i\in V} \langle P_i\rangle - \sum_{(i,j)\in E} \langle P_i P_j\rangle \overset{\rm NCHV}{\le} \alpha(G). \tag{83}$$

Intuitively, whenever the noncontextual assignments do not respect the exclusivity condition the l.h.s. gets a "penalty", which keeps the noncontextual bound the same. Let us denote by $\mathcal{A} \subset \{P_i\}_i$ the subset of variables to which 1 is assigned. It can be divided into an assignment to a maximal independent set $\mathcal{I}$ plus some extra variables $\mathcal{E}$, i.e., $\mathcal{A} = \mathcal{I} \cup \mathcal{E}$. Each variable $P_i \in \mathcal{E}$, however, must violate at least one exclusivity constraint involving an element of $\mathcal{I}$, since $\mathcal{I}$ is a maximal independent set by definition, thus giving a contribution $+1$ to the first term and a contribution $\le -1$ to the second term in the l.h.s. of Eq. (83). The quantum model can be constructed from the OR of $\bar{G}$ as in Eq. (78), namely, vectors $\{|v_i\rangle\}_i$ and a state $|\psi\rangle$ such that $\langle v_i|v_j\rangle = 0$ if $(i,j) \in E$ and

$$\sum_i |\langle\psi|v_i\rangle|^2 = \vartheta(G). \tag{84}$$

By constructing the POVMs $\tilde{P}_i = \{|v_i\rangle\langle v_i|, \mathbb{1} - |v_i\rangle\langle v_i|\}$ and considering the initial state $\rho = |\psi\rangle\langle\psi|$, one obtains $\langle\tilde{P}_i\rangle_\rho = |\langle\psi|v_i\rangle|^2$ and $\langle\tilde{P}_i\tilde{P}_j\rangle_\rho = \langle\psi|\tilde{P}_i\tilde{P}_j|\psi\rangle = 0$, whenever $(i,j) \in E$. As a consequence, one obtains

$$\sum_{i\in V} \langle\tilde{P}_i\rangle_\rho - \sum_{(i,j)\in E} \langle\tilde{P}_i\tilde{P}_j\rangle_\rho = \sum_i |\langle\psi|v_i\rangle|^2 = \vartheta(G) > \alpha(G), \tag{85}$$

giving a violation of the noncontextuality inequality in Eq. (83).

In addition to classical, quantum and GPT bounds for a given expression, the exclusivity graph approach also allows for the definition of the set of their correlations through the notions of stable set polytope STAB(G), theta body TH(G), and clique constrained stable set polytope QSTAB(G), of a given exclusivity graph $G$, see Amaral and Terra Cunha (2018); Cabello et al. (2014) for a detailed discussion. These sets are closely related to the quantities $\alpha$, $\vartheta$ and $\alpha^*$ defined above:

$$\begin{aligned}
\mathrm{STAB}(G) &= \mathrm{conv}\left\{ x \in \{0,1\}^{|V|} \,\middle|\, x_i x_j = 0 \text{ if } (i,j) \in E \right\},\\
\mathrm{TH}(G) &= \left\{ p \in \mathbb{R}_+^{|V|} \,\middle|\, p_i = |\langle\psi|v_i\rangle|^2,\ \{|v_i\rangle\}_i \text{ OR of } \bar{G} \right\},\\
\mathrm{QSTAB}(G) &= \left\{ p \in \mathbb{R}_+^{|V|} \,\middle|\, \sum_{i\in C} p_i \le 1,\ \forall \text{ cliques } C \right\}.
\end{aligned} \tag{86}$$

In other words, STAB(G) is given by the probability vectors in the convex hull of the deterministic assignments respecting exclusivity, i.e., of 1 to all the elements of an independent (or stable) set and 0 to the other elements, as in Eq. (80); TH(G) is given by the assignments coming from a vector $|\psi\rangle$ and the vectors of an orthogonal representation of $\bar{G}$, as in Eq. (78); and QSTAB(G) is the set of probability assignments such that the sum of probabilities on each clique is bounded by 1 as in Eq. (81). Interestingly, these sets can also be characterized in terms of the quantities $\alpha$, $\vartheta$, $\alpha^*$, but in a reverse order w.r.t. what we have seen above (Acín et al., 2015), arising from a dual approach in their description (Grötschel et al., 1993; Knuth, 1994).

The notions of the stable set polytope STAB(G) and the theta body TH(G) also allowed one to find the minimal Greenberger-Horne-Zeilinger-like proof of contextuality. This was then shown to imply that the set of 18 vectors found by Cabello et al. (1996a), see also Fig. 3, is the minimal Kochen-Specker set (Xu et al., 2020a).

The set QSTAB(G) encodes the condition that the probability of mutually exclusive events (represented by a clique in $G$) is bounded by 1, a condition that has been introduced under the name (consistent) exclusivity principle for contextuality (Cabello, 2012) (see also Acín et al., 2015; Henson, 2012) (or E-principle (Cabello,
2013)), and local orthogonality for Bell nonlocality (Fritz et al., 2013). This condition has been extensively investigated as a possible principle that bounds contextual and nonlocal correlations in QM (Acín et al., 2015; Amaral et al., 2014; Cabello, 2013, 2015; Fritz et al., 2013; Henson, 2015; Yan, 2013).

In the hypergraph approach, Acín et al. (2015) show that consistent exclusivity cannot bound the set of quantum correlations, even in the limit of an infinite number of copies of the original hypergraph, with the following argument. They show that the set of probability vectors, i.e., with the normalization condition $\sum_i p_i = 1$, obtained in this limit is the one characterized by the Shannon capacity of a graph (Shannon, 1956), which includes the set $\mathrm{TH}(G) \cap \{\sum_i p_i = 1\}$, i.e., the theta body with extra normalization constraints. As discussed above for the CHSH case, when the scenario constraints are imposed, i.e., each event is associated to a collection of outcomes for a joint measurement, the Lovász number provides only an upper bound to quantum correlations. Moreover, Acín et al. (2015) show that the set $\mathrm{TH}(G) \cap \{\sum_i p_i = 1\}$ corresponds to the first level of an NPA-type hierarchy associated with the hypergraph, see also Navascués et al. (2015).

By not fixing the measurement scenario and discussing simply events and their exclusivity relations, the graph approach provides a different perspective on the derivation of quantum bounds on correlations. The results of this research direction are summarized in the next section.

4. The graph approach and the quest for a principle for quantum correlations

In the context of the program initiated by Cirel'son (1993) (or Tsirelson depending on the transliteration) for finding simple characterizations of the sets of quantum correlations for Bell scenarios, Popescu and Rohrlich (1994) asked the following question: Why are correlations in nature not more nonlocal?

Principles such as nontrivial communication complexity (van Dam, 1999), information causality (Pawłowski et al., 2009), macroscopic locality (Navascués and Wunderlich, 2010), and local orthogonality (Fritz et al., 2013) managed to exclude some nonquantum nonlocal correlations. However, none of them managed to single out even the set of quantum correlations for the simplest Bell scenario (Navascués et al., 2015).

Arguably, the focus on Bell scenarios, where no further measurements can be carried out on a physical system after it has been measured once, is missing a fundamental aspect of quantum theory: measurements are transformations.

After Lüders' work (Lüders, 1951) fixing von Neumann's quantum theory of measurement (von Neumann, 1932), it became clear that a self-adjoint operator is not simply a tool to compute the probabilities of an observable, but it represents a particular process associated to a particular way to measure an observable. This process yields a result and transforms the state in a specific way. From a theory-independent perspective (Chiribella and Yuan, 2014, 2016; Kleinmann, 2014), this process corresponds to an interaction between a measurement device and a measured system that yields the same result when repeated and does not disturb any compatible observable, that is, to an ideal (or sharp) measurement (Cabello, 2019b; Chiribella et al., 2020; Chiribella and Yuan, 2014, 2016; Kleinmann, 2014).

Moreover, Naimark's (or Neumark's, depending on the transliteration) dilation theorem (Neumark, 1940a,b, 1943), showing that any POVM can be obtained from a projective measurement on a larger Hilbert space, implies that, in quantum mechanics, nonideal measurements cannot produce correlations which cannot be attained with ideal measurements.

If we call quantum theory the abstract probability theory behind quantum mechanics as in, e.g., Chiribella et al. (2010); Hardy (2001); Masanes and Müller (2011), then the previous observations point out that quantum theory is a probability theory for events produced by ideal measurements. This suggests that, to understand quantum correlations, an interesting question is: Why are correlations between ideal measurements in nature not more contextual (Cabello et al., 2010)?

The graph-theoretic approach introduced in Cabello et al. (2010, 2014) provides a tool that brings a unique perspective to answer this question. While the standard approach to Bell nonlocal correlations investigates measurement scenarios one by one, the graph-theoretic approach allows us to investigate graphs of exclusivity one by one.

Given $n$ events $\{e_j\}_{j=1}^{n}$ produced by a set of measurements $\{M_i\}$ (that also defines a measurement scenario) and an initial state $\rho$, one can represent the relations of mutual exclusivity between these events by an $n$-vertex graph in which each event is represented by a vertex (node) and mutually exclusive events are connected by an edge. Recall that two events are mutually exclusive if there is a measurement that produces both of them, each of them associated to a different outcome.

Given an $n$-vertex graph $G$, there are infinitely many measurement scenarios producing events whose graph of exclusivity is $G$. Let us consider all pairs $(\rho, \{M_i\})$, where $\rho$ is an initial state and $\{M_i\}$ is a set of ideal measurements, that produce $n$ events $\{e_j\}_{j=1}^{n}$ whose graph of exclusivity is $G$. For each pair, there is a set of probabilities $\{p(e_j)\}_{j=1}^{n}$. Let us denote by $\mathcal{P}(G)$ the set of all sets $\{p(e_j)\}_{j=1}^{n}$.

In Cabello et al. (2014), it is shown that, for any $G$, in quantum mechanics, $\mathcal{P}(G) = \mathrm{TH}(G)$. The fact that this physical set has a simple mathematical characterization suggests the following question: Why in nature is $\mathcal{P}(G) = \mathrm{TH}(G)$ for any $G$?

If we define ideal measurements as those that (i) yield the same result when repeated, (ii) do not disturb any compatible observable, and (iii) are such that all their coarse grainings can be implemented satisfying (i) and (ii), then the events produced by ideal measurements satisfy the exclusivity principle: Given a set of events such that every pair of them is mutually exclusive, the sum of the probabilities of all of them is bounded by 1 (Cabello, 2019b; Chiribella et al., 2020; Chiribella and Yuan, 2014).

For theories allowing for statistically independent copies of any set $\{p(e_j)\}_{j=1}^{n}$ and events satisfying the exclusivity principle (as those originating from ideal measurements), the largest possible $\mathcal{P}(G)$ is $\mathrm{TH}(G)$ for any $G$ (Cabello, 2019b).

Taking into account the constraints due to normalization, nondisturbance, and that the probability of each event must only be a function of the state and measurement outcomes that define it, the two previous results allow us to prove (Cabello, 2019b) that the set of quantum correlations (or behaviors) for any scenario with ideal measurements is equal to the largest set allowed under the assumption that there is a statistically independent joint realization of any two behaviors.

Cabello (2019a) argued that this suggests a principle for quantum correlations: Plato's principle of plenitude that whatever can exist must exist (also called the totalitarian principle). The quantum sets of contextual correlations for the different measurement scenarios are a signature of an absence of constraints when some parts of the world interact.

5. Chromatic and fractional chromatic numbers

The chromatic and fractional chromatic numbers of a graph are also graph-theoretic quantities that play an important role in quantum contextuality, more precisely, in state-independent contextuality (SIC). In the following, we briefly recall their definition and discuss their relation with contextuality, following the work of Cabello (2011), Ramanathan and Horodecki (2014) and Cabello et al. (2015). A $k$-coloring of a graph $G$ is an assignment of one out of $k$ colors to each vertex of a graph such that adjacent vertices are assigned different colors. The minimal number $k$ such that this coloring is possible is called the chromatic number of the graph and denoted as $\chi(G)$. Equivalently, the chromatic number can be understood as the minimal number of independent sets into which the vertices of the graph can be partitioned. Similarly, the fractional chromatic number $\chi_f(G)$ is the minimum of $a/b$ such that vertices have $b$ associated colors, out of $a$ colors, where again vertices connected by an edge have associated disjoint sets of colors. As a consequence, we have that $\chi_f(G) \le \chi(G)$. A simple example of chromatic and fractional chromatic number for the pentagon is given in Fig. 19.

FIG. 19 Different $a:b$ colorings of the pentagon, i.e., $b$ colors associated to each vertex out of a total of $a$ colors. (a) $3:1$ coloring of the pentagon, i.e., 3 colors, one for each vertex, giving a chromatic number $\chi = 3$. (b) $6:2$ coloring of the pentagon obtained by doubling the colors for each vertex. (c) One color from the $6:2$ coloring can be removed, giving a $5:2$ coloring corresponding to a fractional chromatic number $\chi_f = 5/2$ for the pentagon.

The chromatic number of a graph is, in general, a difficult quantity to compute. It is NP-complete to decide whether a graph admits a $k$-coloring, except for $k = 0, 1, 2$, and it is NP-hard to compute the chromatic number (Garey and Johnson, 2002). The fractional chromatic number can be defined as an LP relaxation of the chromatic number; hence, it may seem easier to compute. However, computing the fractional chromatic number of a graph is NP-hard (Lund and Yannakakis, 1994). Intuitively, this comes from the fact that the LP definition of the fractional chromatic number involves the knowledge of all independent sets of a graph, i.e., all sets of mutually disconnected vertices.

To discuss the connection between SIC and the chromatic and fractional chromatic number, we first need to recall some basic definitions. First, we call a state-independent noncontextuality (SI-NC) inequality an inequality that, for a fixed set of measurements, is violated by any initial state. A set of elementary tests that can be used to violate such an inequality is called a SIC set. A typical example is the Yu-Oh inequality in Eq. (18) and the Yu-Oh set in Fig. 7. A related notion is that of SIC-graph introduced by Ramanathan and Horodecki (2014). A SIC-graph is a graph that, for any fixed quantum state, has a realization in terms of orthogonal projectors, i.e., to each vertex is associated a projector and two projectors are orthogonal if the corresponding vertices in the graph are connected, such that the given state violates a NC inequality. Clearly, a SIC set gives rise to a SIC graph, but the converse is not always true. A typical example (Cabello et al., 2015) is obtained from the Yu-Oh set by increasing the dimension by one, i.e., $v_i \mapsto (v_i, 0)$, and adding an extra vector orthogonal to all the others, i.e., $v_E = (0,0,0,1)$. Clearly, the set is no longer a SIC set, since by preparing the initial state $|v_E\rangle$, one would obtain a noncontextual value assignment to all variables, namely, all zero except $|v_E\rangle\langle v_E|$. On the other hand, for any pure initial state $|\psi\rangle$, one can find a realization of the graph such that the Yu-Oh NC inequality of Eq. (18) is violated. It is sufficient to choose a realization for which $\langle\psi|v_E\rangle = 0$. According to Ramanathan and Horodecki (2014), a realization can be found also for any mixed state.

The connection between graph coloring and SIC has been discussed in the specific case of rank-1 projectors
$\{\Pi_i\}$ with corresponding dichotomic measurements given by $\{\Pi_i, \mathbb{1} - \Pi_i\}$. The compatibility graph and the exclusivity graph then coincide, i.e., the projectors are compatible if and only if they are orthogonal, ignoring the trivial case of identical projectors. One can call the corresponding graph the orthogonality graph of $\{\Pi_i\}$. We then have the following results, proven by (i) Ramanathan and Horodecki (2014) and (ii) Cabello (2011):

Theorem (Cabello, 2011; Cabello et al., 2015; Ramanathan and Horodecki, 2014). For a set of rank-1 projectors $\{\Pi_i\}$ in dimension $d$, the conditions (i) $\chi_f(G) > d$ and (ii) $\chi(G) > d$, for the orthogonality graph $G$, are necessary for SIC.

Notice that since $\chi_f(G) \le \chi(G)$, condition (ii) is actually weaker than condition (i). However, condition (ii) has the advantage of being solvable exactly by simple integer arithmetic, while condition (i) is the solution to a linear program.

The condition $\chi(G) > d$ can be intuitively understood as necessary, since any coloring of the graph with $d$ different colors assigns different values to each set of $d$ orthogonal rank-1 projectors (forming a basis in dimension $d$); in particular, it is a consistent assignment of 0 and 1. The appearance of the fractional chromatic number is more puzzling, but it can be derived more or less straightforwardly by transforming the SDP defining a SIC set $S = \{\Pi_i\}_i$ (for rank-1 projectors) into an LP by fixing the quantum state to be the maximally mixed one (Cabello et al., 2015). From this LP, one can extract the weights $w$ and construct the SI-NC inequality

\[ \sum_i w_i \langle \Pi_i \rangle_\rho - \sum_i w_i \sum_{j \in \mathcal{N}(i)} \langle \Pi_i \Pi_j \rangle_\rho \overset{\mathrm{NCHV}}{\le} 1, \tag{87} \]

which has the property that the maximal NCHV assignment is one respecting exclusivity relations and it is violated by the maximally mixed state with a value $\chi_f(G)/d > 1$.

Notwithstanding the computational complexity of such problems, explicit calculations are still possible for small enough graphs. Using this result, it has been proven that the Yu-Oh set is the minimal SIC set in $d = 3$ (Cabello et al., 2015), namely, that there are no SIC graphs with fewer than 13 vertices in dimension 3. This result was further extended by proving that any SIC set must contain at least 13 projectors independent of the dimension (Cabello et al., 2016b). The above results are valid under the assumption of rank-1 projectors; however, they have been extended to the case of uniform (i.e., all projectors of the same rank) rank-2 and rank-3 by Xu et al. (2020b), who were also able to exclude the case of 8 (arbitrary) projectors or fewer.

C. The connections between Kochen-Specker and Bell theorems

The connection between the proofs of the KS theorem and Bell nonlocality arguments has been extensively investigated since the 1970s (Brown and Svetlichny, 1990; Clifton, 1993; Elby, 1990a,b; Elby and Jones, 1992; Kernaghan and Peres, 1995; Krips, 1987; Mermin, 1990b; Redhead, 1987; Stairs, 1978, 1983). On the one hand, it is clear that any Bell inequality can be interpreted as a noncontextuality inequality, and there are methods to convert some noncontextuality inequalities into Bell inequalities violated by quantum theory; see, e.g., Aolita et al. (2012); Cabello et al. (2012).

Historically, the first results related to this question are those on the so-called "KS with locality theorem" (Brown and Svetlichny, 1990; Heywood and Redhead, 1983; Kochen, 1970; Redhead, 1987; Stairs, 1983), which later gave rise to the so-called "free will theorem" (Conway and Kochen, 2006, 2009). Common to all these results is that the KS proof for a single spin-1 particle is expanded into a related algebraic proof involving the KS set and a maximally entangled state of two spin-1 particles.

The second wave of results connecting KS and Bell's proofs was motivated by the GHZ proof of Bell's theorem (Greenberger, Horne, and Zeilinger, 1989). First came Mermin's observation that GHZ can be converted into a tripartite Bell inequality (Mermin, 1990a) and a state-independent proof of the KS theorem (Mermin, 1990b, 1993). Next came the observation that Hardy's proof of Bell's theorem (Hardy, 1992, 1993) can be seen as the state-dependent version of a KS proof (Cabello et al., 1996a). Finally, there is the GHZ-like proof for two parties sharing qubits (Cabello, 2001b), which can be seen as originating from the Peres-Mermin (PM) KS proof (Mermin, 1990b; Peres, 1990), and which can be converted into a bipartite Bell inequality (Cabello, 2001a) as explained below. Around all these tools, there is an extensive literature adopting different perspectives and names: "all-versus-nothing" proofs (Cabello, 2001a), "nonlocal games" (Cleve et al., 2004), and "quantum pseudo telepathy" (Brassard et al., 2005; Renner and Wolf, 2004).

More recently, other methods have been introduced to transform inequalities associated with state-independent contextuality (SIC) scenarios to Bell inequalities (Aolita et al., 2012; Cabello, 2020; Cabello et al., 2012). The simplest approach to the problem is, arguably, to map single measurements and two-time sequential measurements on a single system into bipartite measurements on a maximally entangled state. Here, we will follow approximately the discussion in Cabello (2020), but with a different class of NC inequalities, namely those discussed in Sec. V.B.5. To understand this method, we start from the basic observation that for the state $|\Psi\rangle = \frac{1}{\sqrt{d}} \sum_k |kk\rangle$,

\[ \langle \Psi | A \otimes B^t | \Psi \rangle = \mathrm{tr}(AB)/d, \tag{88} \]

where $t$ represents the transposition w.r.t. the basis $\{|k\rangle\}_k$.
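Equation (88) can be confirmed numerically for a small case. The following is a minimal Python sketch (standard library only) for $d = 3$; the helper names and the test matrices `A` and `B` are illustrative assumptions and do not come from the cited works. It builds $|\Psi\rangle$ and $A \otimes B^t$ explicitly and compares the expectation value with $\mathrm{tr}(AB)/d$.

```python
# Numerical check of Eq. (88) for d = 3, using only the standard library.
# All helper names and the test matrices A, B are illustrative assumptions.

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product a (x) b of two square matrices (nested lists)."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def expectation(psi, op):
    """<psi| op |psi> for a state vector psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))

def max_entangled(d):
    """|Psi> = d^(-1/2) sum_k |kk>, with |k>|l> stored at index k*d + l."""
    psi = [0j] * (d * d)
    for k in range(d):
        psi[k * d + k] = complex(d ** -0.5)
    return psi

d = 3
A = [[1, 2 - 1j, 0], [2 + 1j, -1, 3j], [0, -3j, 2]]
B = [[0, 1j, 1], [-1j, 2, 0.5], [1, 0.5, -1]]
psi = max_entangled(d)

lhs = expectation(psi, kron(A, transpose(B)))          # <Psi| A (x) B^t |Psi>
rhs = sum(matmul(A, B)[k][k] for k in range(d)) / d    # tr(AB)/d
no_transpose = expectation(psi, kron(A, B))            # differs in general

print(abs(lhs - rhs) < 1e-12, abs(no_transpose - rhs) > 1e-6)
```

The third quantity, computed without the transposition on the second factor, is included only to show that the transposition in Eq. (88) matters for generic complex matrices.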

In other words, expectation values of bipartite operators on the maximally entangled state are equal (up to a transposition in the basis $\{|k\rangle\}_k$) to the expectation value of their product on the maximally mixed (one party) state $\mathbb{1}/d$. Now, using the fact that in any SIC scenario the noncontextuality inequality is violated even by the maximally mixed state, we can transform the SIC scenario into a bipartite Bell inequality. This idea is very general and can be applied to a wide variety of scenarios and inequalities. To make a concrete example, we will discuss the specific case of noncontextuality inequalities arising from the fractional chromatic number of a SIC-graph from the previous section. Given a SIC set $\{\Pi_i\}_i$, let us consider the associated SI-NC inequality presented in Eq. (87),

\[ \sum_i w_i\, p(\Pi_i = 1) - \sum_i w_i \sum_{j \in \mathcal{N}(i)} p(\Pi_i = \Pi_j = 1) \le 1, \tag{89} \]

such that the optimal classical assignment corresponds to one satisfying the exclusivity relations, and which is violated by the maximally mixed state with a value $\chi_f(G)/d > 1$. This inequality can be transformed into the Bell inequality

\[ \sum_i w_i\, p(\Pi_i^A = \Pi_i^B = 1) - \frac{1}{2} \sum_i w_i \sum_{j \in \mathcal{N}(i)} \left[ p(\Pi_i^A = \Pi_j^B = 1) + p(\Pi_i^B = \Pi_j^A = 1) \right] \le 1, \tag{90} \]

by distributing a copy of all projectors $\{\Pi_i\}_i$ to both Alice and Bob, i.e., $\Pi_i^A = \Pi_i$ and $\Pi_i^B = \Pi_i^t$, where the above transposition has been applied to Bob's projectors. By Eq. (88), the value on $|\Psi\rangle$, computed from

\begin{align}
\langle \Psi | \Pi_i \otimes \Pi_i^t | \Psi \rangle &= \mathrm{tr}(\Pi_i \Pi_i)/d = \mathrm{tr}(\Pi_i)/d, \nonumber \\
\langle \Psi | \Pi_i \otimes \Pi_j^t | \Psi \rangle &= \mathrm{tr}(\Pi_i \Pi_j)/d = 0, \quad \text{if } i, j \text{ appear in a correlator in Eq. (89)}, \tag{91}
\end{align}

is the same as that of the l.h.s. of Eq. (89) on the maximally mixed state $\mathbb{1}/d$, namely, $\chi_f(G)/d > 1$.

For the local hidden variable bound, one can use an argument similar to the one presented in Cabello et al. (2015) for Eq. (89). The main idea is that the weights $w$ are chosen such that the maximum deterministic value assignment is one that respects the orthogonality conditions among the projectors, i.e., if $\Pi_i \Pi_j = 0$, the corresponding classical variables, let us denote them by $\pi_i, \pi_j$, are assigned one 0 and one 1. Every time that we violate one of these constraints for just one of the parties, say on Alice's side, by flipping the value of $\Pi_i^A$, we get a term $-\frac{w_i}{2} \sum_{j \in \mathcal{N}(i)} \pi_j^B$, which decreases the total value (assuming we are violating an orthogonality constraint, so at least one of the $\pi_j^B$ is not 0). Similarly, if we violate one orthogonality for both $\Pi_i^A$ and $\Pi_i^B$, we get a term $w_i - \frac{w_i}{2} \sum_{j \in \mathcal{N}(i)} (\pi_j^A + \pi_j^B)$, which is again negative. It is clear, then, that optimal classical assignments are those respecting the orthogonality relations on both Alice's and Bob's side.

The above construction is a simple one, but other constructions are possible and we refer to the corresponding references mentioned above. In particular, it is interesting to highlight the fact that some of these constructions, such as the one in Aolita et al. (2012), are also able to construct Bell inequalities where the quantum and nonsignaling bounds coincide.

The first experiments on the now-called Peres-Mermin Bell inequality were based on the encoding proposed in Chen et al. (2003) and carried out in Cinelli et al. (2005); Yang et al. (2005), and subsequently repeated by the group in Rome to improve the violation and fix a conceptual problem with the first experiment (Barbieri et al., 2005, 2007); Aolita et al. (2012) also report the results of an improved experiment in Rome. On the basis of the proposal in Cabello (2010), an experiment with sequential measurements on entangled photons has been performed (Liu et al., 2016).

D. Classical simulation of quantum contextuality

The fact that quantum mechanics makes different predictions than noncontextual theories leads to the question of which contextual theories can simulate the quantum mechanical behaviour. More precisely, one can ask which classical resources are needed in order to classically simulate the quantum behaviour in a contextuality experiment.

This question has some tradition in the analysis of Bell scenarios. There, one may ask how much communication between the two parties is needed in order to achieve a maximal violation of a Bell inequality. For the case of the simplest Clauser-Horne-Shimony-Holt inequality, this has been discussed in detail and optimized simulation schemes have been designed (Cerf et al., 2005; Toner and Bacon, 2003).

Concerning contextuality, several contextual models have been designed, e.g., for the Peres-Mermin square (Blasiak, 2015; La Cour, 2009). In addition, there are, of course, general approaches for simulating quantum mechanics with classical models, including contextuality (van Enk, 2007; Larsson, 2012; Spekkens, 2007). These models, however, were not constructed to be resource-efficient and they do not allow for a clear estimate of the minimal necessary classical resources.

In the following, we will discuss models to simulate quantum contextuality in sequential measurements. It is clear that such contextual models require some memory to work: For instance, for obtaining the maximal value $\langle \mathrm{PM} \rangle = 6$ in Eq. (3) one needs to remember the previous measurements in the measurement sequence. So, the question arises what minimal memory is needed for the simulation. This depends on the underlying computational model and the quantum mechanical correlations that should be simulated. In the following, we first discuss two concrete models and then we shortly mention some more general approaches for simulating temporal correlations.

1. Simulation with Mealy machines

A simple attempt to simulate contextual behaviour in sequential measurements is the following (Kleinmann et al., 2011). One assumes a classical automaton with $k$ internal states. For a given internal state one can ask a certain question (or, in physical terms, perform a certain measurement) and obtain an answer (or result). After providing the result, the automaton changes its internal state, depending on the measurement that was performed. So, in this model, each internal state is characterized by two discrete functions: One function determines the output, depending on the measurement, and the other function determines the update of the internal state, again depending on the measurement.

Such a model is called a Mealy machine (Mealy, 1955). Given this class of models, one can easily define the memory cost of a simulation as the minimal number of internal states that is required for it.

This concept is best explained with an example. We focus on the Peres-Mermin square as in Eq. (1). For a simulation, we assume that the automaton has three internal states $S_1$, $S_2$, and $S_3$. For each state, we define the automaton via the tables

\[
S_1: \begin{bmatrix} + & + & (+,2) \\ + & + & (+,3) \\ + & + & + \end{bmatrix}, \quad
S_2: \begin{bmatrix} + & (+,1) & + \\ - & + & - \\ - & (-,3) & + \end{bmatrix}, \quad
S_3: \begin{bmatrix} + & - & - \\ (+,1) & + & + \\ (-,2) & - & + \end{bmatrix}. \tag{92}
\]

This defines the automaton as follows: Assume that the Mealy machine is in state $S_1$ and we measure the observable $\gamma$. Then, we consider the first table at the position of $\gamma$ (i.e., the last entry in the third row). The simple $+$ sign at this position indicates that the measurement result will be $+1$, while the system stays in state $S_1$. If we continue and measure $C$, we encounter the entry $(+, 2)$, which indicates the measurement result $+1$ and a subsequent change to the internal state $S_2$. Being in state $S_2$, the second table defines the behavior for the next measurement: For instance, a measurement of $c$ now yields the result $-1$ and the system stays in state $S_2$.

Thus, starting in $S_1$, the measurement results for the sequence $\gamma C c$ are $+1, +1, -1$, so that the product is $-1$, in accordance with the quantum prediction. It is straightforward to verify that this model yields $\langle \mathrm{PM} \rangle = 6$. In addition, the observables within each context (defined by a column or row) are compatible in the sense that in sequences of the form $A_1 A_2$, $A_1 B_2 A_3$, or $A_1 \alpha_2 a_3 A_4$ (here, the subindices indicate the temporal ordering in the measurement sequence) the first and last measurement of $A$ yield the same output.

One can show that this automaton is the smallest automaton to reproduce these predictions, that is, Mealy machines with two internal states cannot do that (Kleinmann et al., 2011).

In this example, however, one has to be careful about the correlations one wishes to simulate. For instance, while the above Mealy machine reaches $\langle \mathrm{PM} \rangle = 6$, it does not reproduce all deterministic quantum predictions: Starting in $S_1$, the sequence $B_1 C_2 \beta_3 B_4$ yields the sequence of results $(+1, +1, -1, -1)$, i.e., $B$ changes its value. This is in contrast to quantum mechanics, since $C$ and $\beta$ are both compatible with $B$. So, while the above machine reproduces compatibility constraints within one column or row, it does not reproduce all compatibility conditions. For incorporating more compatibility constraints one needs four internal states (Kleinmann et al., 2011).

In this example, Mealy machines were only used to simulate some outcomes of quantum mechanics in a deterministic manner. However, no quantum state gives deterministic outcomes for all measurements of the Peres-Mermin square. For this, one can consider probabilistic mixtures of different Mealy machines. For instance, Fagundes and Kleinmann (2017) considered a class of variations of the automaton in Eq. (92), and then probabilistic mixtures of these. They showed that the predictions for any quantum state could be reproduced, as long as only compatibility relations within the columns and rows are considered. In an extension of this research line, it was also shown that a Mealy-type machine with a single qubit cannot simulate contextual correlations (Budroni et al., 2019).

2. Simulation with ε-transducers

A different approach to simulate contextuality or temporal correlations comes from the analysis of time series and can also be used to quantify the memory needed for a simulation. Before starting the explanation of ε-transducers, it is useful to recall the notions of hidden Markov models (HMMs, Rabiner, 1989) and ε-machines (Crutchfield, 1994).

HMMs are probabilistic automata to simulate time series. The automaton contains a set of internal states $S_k$, and for each internal state, there is a probability distribution $P_k$ of the outcomes and a probability distribution $T_k$ of the transitions. If the automaton is in a given state $S_k$, it will output an outcome drawn from $P_k$ and move to another state, chosen according to $T_k$. For the description of a given time series the HMM does not have to be in a definite state; instead one has a probability distribution over all internal states. Two examples of HMMs are given in Fig. 20.

ε-machines can be seen as a special instance of HMMs. Consider an infinite time series $\overleftrightarrow{X} = \{\dots, X_{-2}, X_{-1}, X_0, X_1, X_2, \dots\}$, where the $X_i$ are random variables over some alphabet. One can split it into a past and a future.
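Returning to the three-state Mealy machine of Eq. (92): its stated properties are small enough to check mechanically. The following is a minimal Python sketch, under two assumptions that are not spelled out in this section: that the square of Eq. (1) is laid out with rows $(A, B, C)$, $(a, b, c)$, $(\alpha, \beta, \gamma)$, and that Eq. (3) has the standard form $\langle \mathrm{PM} \rangle = \langle ABC \rangle + \langle abc \rangle + \langle \alpha\beta\gamma \rangle + \langle Aa\alpha \rangle + \langle Bb\beta \rangle - \langle Cc\gamma \rangle$. All function and variable names are illustrative.

```python
# Three-state Mealy machine of Eq. (92) for the Peres-Mermin square.
# Assumed layout of the square (Eq. (1)): rows (A,B,C), (a,b,c), (alpha,beta,gamma).
LAYOUT = [["A", "B", "C"], ["a", "b", "c"], ["alpha", "beta", "gamma"]]

# Tables as in Eq. (92): a bare sign means "output and stay in the same state",
# a tuple means (output, next_state).
RAW = {
    1: [[+1, +1, (+1, 2)], [+1, +1, (+1, 3)], [+1, +1, +1]],
    2: [[+1, (+1, 1), +1], [-1, +1, -1], [-1, (-1, 3), +1]],
    3: [[+1, -1, -1], [(+1, 1), +1, +1], [(-1, 2), -1, +1]],
}

def build(raw):
    """Expand the tables into a map (state, observable) -> (output, next_state)."""
    machine = {}
    for state, table in raw.items():
        for r in range(3):
            for c in range(3):
                entry = table[r][c]
                out, nxt = entry if isinstance(entry, tuple) else (entry, state)
                machine[(state, LAYOUT[r][c])] = (out, nxt)
    return machine

MACHINE = build(RAW)

def run(sequence, state=1):
    """Outputs of a sequence of measurements, starting from the given state."""
    outputs = []
    for obs in sequence:
        out, state = MACHINE[(state, obs)]
        outputs.append(out)
    return outputs

def pm_value(state):
    """Assumed form of Eq. (3): three rows plus first two columns minus last column."""
    rows = [LAYOUT[r] for r in range(3)]
    cols = [[LAYOUT[r][c] for r in range(3)] for c in range(3)]
    total = 0
    for i, context in enumerate(rows + cols):
        o = run(context, state)
        product = o[0] * o[1] * o[2]
        total += -product if i == 5 else product  # last column enters with a minus
    return total

print(run(["gamma", "C", "c"]))            # sequence discussed in the text
print(run(["B", "C", "beta", "B"]))        # B changes its value
print([pm_value(s) for s in (1, 2, 3)])    # PM value for each starting state
```

In this sketch each context is measured in a fixed order (rows left to right, columns top to bottom), starting from the same internal state; other orderings are not checked here.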

The past and the future of the series are

\[ \overleftarrow{X} = \{\dots, X_{-3}, X_{-2}, X_{-1}\}, \qquad \overrightarrow{X} = \{X_0, X_1, X_2, \dots\}. \tag{93} \]

Then, one can define an equivalence relation on the set of all possible pasts by calling two pasts equivalent if they both predict the same future. Mathematically, one defines $\overleftarrow{x} \sim \overleftarrow{x}'$ if and only if $P(\overrightarrow{X} | \overleftarrow{x}) = P(\overrightarrow{X} | \overleftarrow{x}')$. The equivalence classes of this relation are then called the causal states $\{S_k\}$. Clearly, the causal state contains all information from the past that is relevant for the future; the knowledge of the precise history does not add anything to it.

Given a causal state, one obtains an $x_0$ as a new output. This additional output defines a new history, belonging to a potentially different causal state and, consequently, the output $x_0$ defines a transition to a new causal state. Note that in a general HMM the output does not determine the transition. Finally, one can consider the probability distribution of the causal states and its entropy, $H = -\sum_k p(S_k) \log[p(S_k)]$. This is the statistical complexity of the process and can be used to quantify the memory needed for a simulation. Recently it was shown that in this context quantum mechanics can help to reduce the memory required for a simulation of time series (Gu et al., 2012; Palsson et al., 2017).

Before discussing the application to contextuality, it is useful to explain in more detail the difference between an ε-machine and general HMMs. In an HMM (or a Mealy machine) it can happen that the simulation automaton contains information about the future that cannot be derived from the past. Consider Alice and Bob, where Alice only knows the current internal state of the automaton and Bob only knows the past sequence of results. If the simulation works properly, Alice can predict the future as well as Bob. It can happen, however, that Alice can predict the future better than Bob, e.g., if the given internal state $S_k$ predicts a deterministic outcome for the next measurement, which cannot be deduced from the past. This difference is also illustrated in Fig. 20. Physically, such general HMMs with oracular information may be excluded by the demand that only the past observations should be used for simulating the future. This leads then to ε-machines.

FIG. 20 Examples of a HMM and an ε-machine for the simulation of a biased coin flip (also called a perturbed coin). The process is given by a coin which flips with a certain probability, as given by $P(x_i = H) = P(x_i = T) = 1/2$ and $P(x_i = T | x_{i-1} = H) = 1/2 - \delta$ and $P(x_i = H | x_{i-1} = T) = 1/2 - \delta$. This can be seen as a fair coin ($\delta = 0$) which is disturbed towards a constant process; a detailed description of this and the following models is given in Löhr and Ay (2009). (a) A general HMM could simulate this with three internal states, which correspond to a fair coin and two maximally biased coins, which always give heads or tails. The fact that the biased coin flip tends to reproduce the previous result is modeled by the rule that the automaton acts most of the time as a fair coin, but sometimes this is replaced by a deterministic coin. (b) For constructing the ε-machine, one first observes that only the last output matters, so these define the causal states. Since both outputs are equally probable, the statistical entropy of the process is 1 bit. The HMM in (a) requires less memory, but it contains oracular information: If one knows that the automaton is in a state corresponding to a maximally biased coin, the next output is foreseeable.

For simulating contextuality, one still has to extend the scheme a bit, as different measurements in each time step are possible. This, however, can easily be done by combining the sequence of measurement choices $\overleftrightarrow{Q}$ and the sequence of results $\overleftrightarrow{A}$ into a single variable, $\overleftrightarrow{X} = (\overleftrightarrow{Q}, \overleftrightarrow{A})$. The corresponding ε-machine is then called the ε-transducer (Barnett and Crutchfield, 2015).

The simulation of the Peres-Mermin square with ε-transducers was considered by Cabello et al. (2018a). There, it turns out that the causal states are effectively the 24 eigenstates occurring in the observables of the square. They occur with equal probability, so the required memory is $H = \log_2(24) \approx 4.585$ bits. For the Yu and Oh scenario, the causal states are more difficult to identify, but at least $H \approx 5.740$ bits are required for a simulation. It is interesting that, although the Yu-Oh scenario is more difficult to simulate, the scenario has a smaller degree of contextuality according to several contextuality measures (Abramsky and Brandenburger, 2011; Grudka et al., 2014; Kleinmann et al., 2012).

3. Other related results

Contextuality is relevant for quantum computation (see also Sec. VI.A), so the simulation of both phenomena is connected. In quantum computation, the so-called stabilizer operations form an important class, which is, however, not sufficient for universal computation. The simulation of these stabilizer operations with a contextual model has been discussed by Lillystone and Emerson (2019); Lillystone et al. (2019). For making stabilizer operations universal, so-called magic states need to be added, and an (explicitly contextual) hidden variable model for these has been found by Zurel et al. (2020). Similarly, explicitly contextual classical models that simulate quantum contextuality have been investigated by Bravyi et al. (2018, 2020). In this case, the cost of the classical simulation of contextual correlations is quantified in terms of the circuit depth, which increases with the input size for classical models, but remains constant for any input size for quantum models.

The previous results lead to the question of how general temporal quantum correlations can be simulated in a classical manner. First, if the dimension of the underlying quantum system is not bounded, the space of all correlations forms a polytope (Abbott et al., 2016; Hoffmann et al., 2018), while the correlation space becomes nonconvex for fixed dimension (Mao et al., 2020). Then, Mealy machines have been used to characterize the memory cost for simulating correlations (Budroni et al., 2019, 2020), and the minimal dimensions for reaching certain correlations have been characterized (Mao et al., 2020; Spee et al., 2020).

E. The so-called nullifications of Kochen-Specker theorem

The KS theorem was developed in the framework of ideal measurements and, as we have seen in Secs. IV.B and IV.C, several problems arise when trying to map those ideal measurements to actual experimental implementations. Some of the first criticisms regarding the physical implications of the KS theorem precisely involved this transition from ideal to actual measurements, in particular the impossibility of arbitrarily precise measurements, and were raised by Meyer (1999), Kent (1999), Clifton and Kent (2000), and Barrett and Kent (2004). These works played a fundamental role in the development of the modern approach to contextuality by stimulating the extension of the KS notion of contextuality from a logical to a probabilistic framework. The first KS inequalities date, in fact, from those years and were explicitly motivated by these works (Larsson, 2002; Simon et al., 2001). In our opinion, this transition from the logical to the probabilistic perspective in Kochen-Specker's contextuality, and in particular the subsequent theoretical and experimental effort in testing contextuality on physical systems, is the most interesting outcome of this debate. Notwithstanding the value of both the criticisms by Meyer, Kent, Clifton, and Barrett and the responses they received from several authors (Appleby, 2000, 2001, 2002, 2005; Cabello, 1999; Havlicek et al., 2001; Larsson, 2002; Mermin, 1999; Peres, 2003; Simon et al., 2001), we decided not to present them here in detail, as to do so another review paper would be necessary.

Instead, after briefly presenting the arguments by Meyer (1999), Kent (1999) and the one by Clifton and Kent (2000), we compare them with the broader perspective of probabilistic approaches to contextuality discussed in previous sections, and in particular the problem of designing and implementing valid experimental tests of contextuality (Secs. IV.B and IV.C). In our opinion, these results provide the strongest argument against any possible claim of "nullification" of Kochen and Specker's contextuality.

1. Meyer's "nullification" of the KS theorem

What if not all sharp measurements are physically realizable? Meyer (1999) suggested an explicit way in which this can happen while being undetectable due to the unavoidable finite precision of actual measurements. Each direction in the three-dimensional Euclidean space can be represented by a unit vector

\[ \langle v_j | = \frac{1}{\sqrt{x_j^2 + y_j^2 + z_j^2}} (x_j, y_j, z_j), \tag{94} \]

with $\frac{x_j}{\sqrt{x_j^2 + y_j^2 + z_j^2}}, \frac{y_j}{\sqrt{x_j^2 + y_j^2 + z_j^2}}, \frac{z_j}{\sqrt{x_j^2 + y_j^2 + z_j^2}} \in \mathbb{R}$. According to quantum theory, each direction in the three-dimensional Euclidean space corresponds to a sharp measurement on a three-dimensional quantum system. The corresponding $\pm 1$-valued observable is represented in quantum theory by the self-adjoint operator constructed from

\[ A_j = 2 |v_j\rangle\langle v_j| - \mathbb{1}, \tag{95} \]

where $\mathbb{1}$ is the $3 \times 3$ identity matrix. The possible outcomes are the eigenvalues of $A_j$: $-1$ (doubly degenerate) and $1$ (non-degenerate).

If all directions $|v_j\rangle$ correspond to physically realizable sharp measurements, then at least four colors are needed to color every $|v_j\rangle$ respecting that orthogonal $|v_j\rangle$'s are colored differently (Hales and Straus, 1982). However, if $\frac{x_j}{\sqrt{x_j^2 + y_j^2 + z_j^2}}, \frac{y_j}{\sqrt{x_j^2 + y_j^2 + z_j^2}}, \frac{z_j}{\sqrt{x_j^2 + y_j^2 + z_j^2}} \in \mathbb{Q}$, then only three colors are needed (Godsil and Zaks, 1988). This implies that noncontextual assignments of $-1$ and $1$ respecting that, for each orthogonal trio, $1$ is assigned to only one vector are possible (Meyer, 1999). In fact, it is enough to take the three colors and assign the value "0" to two of them and the value "1" to the remaining one to obtain a valid KS assignment. Moreover, the rational unit sphere is dense in the real unit sphere and thus there is no experimental way to distinguish between the two spheres.

According to Meyer, this shows that, despite the KS theorem, NCHV theories can simulate the predictions of quantum theory within any fixed finite experimental precision. Kent (1999) generalized Meyer's result and showed a construction of KS-colourable dense sets of projectors onto vectors with rational components in complex Hilbert spaces of arbitrary finite dimension. Kent claims that this shows that "noncontextual hidden variable theories cannot be excluded by theoretical arguments of the KS type once the imprecision in real world experiments is taken into account" (Kent, 1999).

A simple counterargument to Meyer and Kent's NCHV models is given by the fact that their model cannot reproduce the probabilistic predictions of quantum theory. The following example is taken from Cabello and Larsson (2010). Consider $d = 3$ and the initial state

\[ \langle \psi | = \frac{1}{527} (354, 357, -158) \tag{96} \]

and the sharp measurements associated to

\begin{align}
\langle v_1 | &= (1, 0, 0), \tag{97a} \\
\langle v_2 | &= (0, 1, 0), \tag{97b} \\
\langle v_3 | &= \frac{1}{73} (48, 0, -55), \tag{97c} \\
\langle v_4 | &= \frac{1}{3277} (1925, 2052, 1680), \tag{97d} \\
\langle v_5 | &= \frac{1}{221} (0, 140, -171). \tag{97e}
\end{align}

This state and all these ideal measurements are allowed according to Meyer. For this state and these measurements, quantum theory predicts

\[ \kappa = 3.941, \tag{98} \]

for

\[ \kappa = -\langle A_1 A_2 \rangle - \langle A_2 A_3 \rangle - \langle A_3 A_4 \rangle - \langle A_4 A_5 \rangle - \langle A_5 A_1 \rangle. \tag{99} \]

However, for any NCHV model (Klyachko, 2007; Klyachko et al., 2008),

\[ \kappa \le 3. \tag{100} \]

Therefore, Meyer's NCHV models fail to simulate the predictions of quantum theory. Notice that the set of inequalities of the form (100) with $\kappa$ of the form (99) with an odd number of minus signs [e.g., (99) has five minus signs] provides a necessary and sufficient condition for the existence of a NCHV model (Araújo et al., 2013).

2. Clifton and Kent's "nullification" of the KS theorem

Clifton and Kent (2000), starting from similar ideas, adopt a different approach. They ask the question: What if every sharp measurement only belongs to one context? They showed that there is a set of directions in the three-dimensional Euclidean space that is dense in the real unit sphere and consists of directions such that none of them is orthogonal to any of the others. Therefore, one can assign any predetermined outcome to any of these directions (Clifton and Kent, 2000). In other words, they substitute the measurements defining the set of contexts (compatible ideal measurements) with other measurements for which contexts consist of a single measurement, i.e., all measurements are mutually incompatible. In this way, as already noted by Kochen and Specker (1967), no constraint is imposed on the NCHV theory except the reproduction of single measurement marginals; hence a NCHV model can be constructed as a product of all single-measurement distributions.

This can be formulated as a problem of imperfect compatibility, in analogy with the one addressed by Larsson (2002), Winter (2014), and Kujala et al. (2015). Possible solutions to this problem, within the probabilistic framework of contextuality and involving modifications to the standard NC inequalities, have been extensively discussed in Sec. IV.C, so we refer the reader to that section for further details.

It is interesting, at this point, to comment on the relation between Bell nonlocality, contextuality, and imperfect compatibility. It is true, as Barrett and Kent (2004) claim, that the original KS contextuality and Bell nonlocality are logically independent concepts. However, in the probabilistic framework for contextuality, developed precisely after the whole "nullification" debate, Bell nonlocality can be seen as a special case of contextuality: we have a notion of contexts, compatible/joint measurements, and the goal of reproducing observed correlations, associated to single contexts, from a global probability distribution (see Sec. IV.B). In a Bell scenario, however, perfect compatibility is always guaranteed by the space-like separation of measurement events; hence, by the locality condition of special relativity, no disturbance is allowed between them. Moreover, imperfect measurements can also always be "dilated" to projective measurements by Neumark's theorem (Holevo, 1982; Neumark, 1940a,b, 1943; Peres, 1993).

Finally, imperfect measurements seem to forbid contextuality by another mechanism, namely, the transformation of degenerate observables into nondegenerate ones. The degeneracy property of quantum observables is fundamental in creating a nontrivial structure of measurement contexts (see the discussion of Vorob'ev's theorem in Sec. V.B). In fact, for nondegenerate observables, commutativity becomes a transitive property (i.e., $[A, B] = [B, C] = 0$ implies $[A, C] = 0$), which guarantees the existence of a NCHV model (Correggi and Morchio, 2002); in our language, we may say that the compatibility graph is a collection of disconnected cliques (cf. Sec. V.B). For physically relevant observables of a single system, the degeneracy is a consequence of some symmetry of the system (think about the rotational symmetry of the hydrogen atom), which is removed when the symmetry is no longer exact (think about the level splitting due to a small electric or magnetic field). Arguably, this is the case of imperfect experimental realization (e.g., it is impossible to completely remove any electric and magnetic field). In contrast, the symmetries related to the space-time structure are robust, i.e., never removed by small imperfections, and thus preserve the degeneracy of the relevant physical observables (e.g., think about observables of the form $A_x \otimes \mathbb{1}$ and $\mathbb{1} \otimes B_y$ for a bipartite system). Again, this problem can be analyzed from the perspective of experimental imperfections presented in Sec. IV.B.

VI. APPLICATIONS OF QUANTUM CONTEXTUALITY

Since contextuality is a fundamental phenomenon of quantum mechanics, it is not surprising that several authors studied its applications and its relevance for quantum information processing. In the following, we will discuss three examples in some detail: First, the role of contextuality in quantum computation; second, potential applications in quantum cryptography; and third, an application of contextuality in randomness generation. Finally, we shortly mention some further connections to information processing.

A. Contextuality and quantum computation

The first works to investigate the relation between quantum contextuality and computation appeared in the framework of measurement-based quantum computation (MBQC, Briegel et al., 2009; Raussendorf and Briegel, 2001), where computation is performed by adaptively measuring single qubits prepared in a large entangled state. Thus, in each experimental run a set of compatible measurements (i.e., measurements on different qubits) is performed. It is natural to interpret the whole experiment as a contextuality experiment (notice that it cannot be interpreted as a Bell experiment, as the systems are not far apart) and ask whether the computational power arises as a consequence of quantum contextuality. The first result in this direction was presented by Anders and Browne (2009), by showing that GHZ-type correlations enable the deterministic computation of the NAND gate, effectively promoting a classical parity computer into a universal (classical) one; see also the non-adaptive case in Hoban et al. (2011). Starting from this observation, Raussendorf (2013) proved that all l2-MBQC, i.e., MBQC with mod 2 linear classical processing, which compute a non-linear Boolean function with sufficiently high success probability are contextual. The result of Raussendorf was then further generalized: connecting the success probability to the contextual fraction (Abramsky et al., 2017), to include reliable computation, i.e., with success probability strictly greater than 1/2 (Oestereich and Galvão, 2017), and beyond qubits (Frembs et al., 2018).

A fundamental result showing a strong interplay between contextuality and computation is that of Howard et al. (2014). More precisely, the result connects contextuality in the framework of NCHV theories with additional exclusivity (see Sec. IV.A.4), and quantum computation in the framework of quantum computation via magic state distillation for qudit systems with $d$ an odd prime number. Finally, another important result is the one by Bravyi et al. (2018, 2020), who showed an example of a problem that can be solved with quantum circuits of constant depth, independently of the input size (shallow circuits), but requires classical circuits of depth increasing (logarithmically) with the input size. This result can be directly connected to the problem of classical simulation of contextual correlations.

Given the relevance of such results in the quantum information community and their direct connection with topics discussed in this review, namely, the graph-theoretic approach to contextuality presented in Sec. V.B.3 for Howard et al. (2014) and the cost of classical simulation of contextuality presented in Sec. V.D for Bravyi et al. (2018, 2020), we briefly summarize these two results in the following.

1. Contextuality and magic states

One of the basic building blocks of the paradigm of computation via magic state distillation are stabilizer codes, which provide a fault-tolerant implementation of a subset of preparations, measurements, and unitary transformations. Such a subset of operations, however, is not only non-universal for computation, but also efficiently classically simulable, as shown by the Gottesman-Knill theorem (Gottesman, 1997). The additional resource that provides universal quantum computation are non-stabilizer states, called magic states, possibly provided in a noisy form, but distillable to some target magic state (Bravyi and Kitaev, 2005). With these states, non-Clifford gates, such as the $\pi/8$ gate or its qudit generalization, can be implemented, thus promoting stabilizer computation to universal quantum computation. Not all magic states are useful, as a large class of them cannot be distilled to pure states and some can even be efficiently classically simulated (Aaronson and Gottesman, 2004; Mari and Eisert, 2012; Veitch et al., 2012, 2014).

In the work of Howard et al. (2014), the authors show that a state is contextual, with respect to stabilizer measurements, if and only if the state is outside the polytope of efficiently simulable states $P_{\mathrm{SIM}}$. The authors prove the statement for qudits with $d$ being an odd prime and for the special case of qubits. This result identifies contextuality as a necessary condition for universal quantum computation via magic state distillation. The proof of sufficiency would require one to show that any state $\rho \notin P_{\mathrm{SIM}}$ can be distilled to a sufficiently pure magic state.

In the following, we briefly explain the stabilizer formalism and how the set of efficiently simulable states can be identified with the noncontextual states, w.r.t. stabilizer measurements, defined as all projective measurements consisting of rank-1 projectors onto stabilizer states. We consider a system of dimension $p$, where $p$ is an odd prime; the qubit case has also been discussed by Howard et al. (2014).

First, let us recall the definition of the displacement operators in the discrete phase space (Gibbons et al., 2004; Gross, 2006; Vourdas, 2004)

\[ D_{l,m} := \omega^{2^{-1} l m} X^l Z^m, \tag{101} \]

where the generalized $X$ and $Z$ are defined in the computational basis by $Z|k\rangle = \omega^k |k\rangle$, $X|k\rangle = |k+1\rangle$, and $\omega := e^{i 2\pi / p}$, and $2^{-1}$ is the multiplicative inverse of 2 in the field $\mathbb{Z}_p$, i.e., $\omega^{2^{-1}} = e^{i\pi/p}$.

In analogy with the continuous-variable case, the discrete Wigner function of a state $\rho$ can then be defined as the expectation values of displacement operators on it. Since the dimension is finite and some of the displacement operators commute (since $D_{l,m} D_{l',m'} = \omega^{2^{-1}(l m' - l' m)} D_{l+l',m+m'}$), it is enough to consider only $p + 1$ of them, e.g., $\mathcal{L} = \{D_{0,1}, D_{1,0}, D_{1,1}, D_{1,2}, \dots, D_{1,p-1}\}$, as their eigenvectors form a complete set of mutually unbiased bases (Appleby et al., 2008). Now, denote by $\Pi_j^{q_j}$ the projector onto the eigenvector with eigenvalue $\omega^{q_j}$ for the $j$-th operator in $\mathcal{L}$. Then, for a vector $q \in \mathbb{Z}_p^{p+1}$ define the operator $A^q := -\mathbb{1} + \sum_{j=1}^{p+1} \Pi_j^{q_j}$ and finally the discrete Wigner function as (Gibbons et al., 2004; Gross, 2006)

\[ W_\rho(q) = \mathrm{tr}(\rho A^q). \tag{102} \]

Following Mari and Eisert (2012); Veitch et al. (2012), the efficiently simulable states are identified with those with a positive Wigner function, namely

2. Contextuality and shallow quantum circuits

The name shallow circuit refers to circuits that are of constant depth, independently of the input size. Constant depth implies that the corresponding operations can be run in a constant time, as operations on different sets of qubits can be run in parallel.

As a starting point, Bravyi et al. (2018) showed that there exist problems that can be solved by a quantum algorithm with certainty and in constant time for any input size, i.e., with a shallow circuit, but require a time logarithmic in the length of the input for any classical circuit that solves them with a sufficiently high probability. This work has been further extended by Bravyi et al. (2020) to take into account what happens if noise is explicitly modeled. In Bravyi et al. (2020), an alternative, and arguably simpler, argument has been presented to show the gap between quantum and classical shallow circuits.

It is important to remark that the arguments presented

xa+zb in Bravyi et al. (2018) and Bravyi et al. (2020) do not PSIM := {ρ | tr(ρA ) ≥ 0, x, z ∈ Zp}, (103) involve any oracle. In fact, a speed-up proven in the oracular setting may not translate into a real-world ad- with a := (1, 0, 1, . . . , p − 1) and b := −(0, 1, 1,..., 1). vantage, as a classical algorithm may take advantage of The connection between simulability and contextuality the internal structure of the oracle to solve the problem is then obtained by the authors by associating an exclu- more efficiently (Johansson and Larsson, 2017, 2019). sivity graph to a collection of projectors, precisely those Two different arguments have been presented in Bravyi xa+zb entering in the definition of A in Eq. (103), com- et al. (2018) and Bravyi et al. (2020), but both are based puting its independence number (i.e., the noncontextual on a similar reasoning. An intuitive understanding of xa+zb bound) and showing that the condition tr(ρA ) < 0 them, based on nonlocal games, can be obtained as fol- amounts to the violation of the noncontextual bound. lows. Let us first consider the case of a circuit that is fed More precisely, for a two-qudit system, with the appro- with a fixed entangled state plus classical bits as input, sj r priately chosen set of projectors {Πj } and their sum Σ , but implements only local (e.g., single qubit) operations. the authors show that This means that each gate has only a wire coming in and one coming out. Recall that the number of wires coming tr[Σr(ρ ⊗ σ)] ≤ p3 ⇔ tr(ρAr) ≥ 0, (104) in each gate, not necessarily among nearest neighbors, is called the degree or fan-in of the gate. If the input- for any state σ on the second qudit. 
Finally, the bound output relation of the circuit is modeled after that of a p3 is proven to be the independence number of the graph nonlocal game, then the observation of a probability of Γr associated with the set of projectors appearing in Σr, success for the correct output above a certain threshold i.e., α(Γr) = p3, showing that contextual states (ρ ⊗ σ can be interpreted as the violation of a Bell inequality. for all σ) are precisely those associated with a negative Now, imagine we allow for fan-in 2, with nearest neighbor Wigner function for ρ. This connection between contex- interactions. Then, in the analogy of the nonlocal game, tual states and the negativity of their Wigner function any two parties can collaborate to win, but only if their has been proven to be general (Delfosse et al., 2017) for distance is less than 2D, where D is the circuit depth. To n-qudit systems with n > 1 and d an odd prime, without visualize this, one can imagine that each output of the requiring the construction of the tensor product ρ ⊗ σ as circuit has some “past lightcone” indicating the initial in Eq. (104). inputs that can have influenced it. The result by Howard et al. (2014) has then been The number of parties in the game corresponds to the extended to rebits, i.e., a restriction of qubits to real- input size of the circuit, hence, to allow for a collab- valued density matrices and operators (Delfosse et al., oration among distant parties, the depth of the circuit 2015), and finally to qubits (Bermejo-Vega et al., 2017; must grow logarithmically with the input size. In other Raussendorf et al., 2017). Finally, a rather different words, to win with the above classical communication notion of contextuality, called sequential contextuality strategy, the depth of the circuit must grow (logarithmi- has been investigated from the perspective of computa- cally) with the input size. 
The argument is, then, ex- tion (Mansfield and Kashefi, 2018) and quantum infor- tended to the case of gates of fan-in K, with K arbitrary mation processing tasks such as quantum random access but fixed for all possible input sizes, and removing the codes for systems with bounded memory (Emeriau et al., condition of nearest-neighbor interaction. Finally, the 2020). nonlocal game is chosen such that it can be won by quan- 47 tum players without any communication, corresponding the game). The results in Bravyi et al. (2018) and Bravyi to a fixed depth circuit necessary to prepare the correct et al. (2020) can, then, be interpreted in the framework initial entangled state. Notice that, even if the number of memory cost (or, communication cost) for the classi- of operations needed for preparing this entangled state cal simulation of quantum contextuality (see Sec. V.D). grows with the input, they can be performed in parallel, Finally, notice that even if we discussed nonlocal games hence, the depth of the circuit remains constant. and Bell inequalities, in any realization of the consid- A more detailed description can be provided by explic- ered circuit the single measurements are not far apart. It itly considering the game in Bravyi et al. (2020), which is is, then, more appropriate to identify the corresponding based on the PM square nonlocal game (Mermin, 1990b; phenomenon as quantum contextuality rather than Bell Peres, 1990), and called 1D Magic square problem. In nonlocality. Bravyi et al. (2018) a similar game is used based on a GHZ-type contradiction, but the proof is more elaborate as the game is based on a two-dimensional qubit archi- B. Contextuality and quantum cryptography tecture. In Bravyi et al. (2020), the N input wires of the circuit represent the N players and additional classical 1. Svozil’s quantum key distribution protocol inputs are provided to specify the game. 
In each round, only two of them play the PM square game, whereas the The possibilities of contextuality for quantum key other players collaborate to create the right correlations. distribution (QKD) were first devised by Bechmann- The different roles are assigned at random at the begin- Pasquinucci and Peres(2000). Here, we review a QKD ning of each round, such that a perfect winning (clas- scheme introduced by Svozil(2010), which provides a nice sical) strategy would necessarily require a collaboration example of how contextuality adds features beyond those between any pair of players. provided by measurement incompatibility. Specifically, it The nonlocal game can be straightforwardly translated shows how contextuality can be used to counteract a pos- into an abstract relation problem: a relation is simply a sible attack, described in Svozil(2006), that may be used set of valid input-output pairs, (zin, zout), defined by a to attach the standard BB84 QKD protocol (Bennett and function R(zin, zout) taking values 0 or 1. A circuit is Brassard, 1984). said to solve the relation problem R if for each input zin Let us recall how the BB84 protocol works. There are it produces an output zout such that R(zin, zout) equals two separated parties, Alice and Bob, who want to obtain one. The pairs (zin, zout) can, then, be derived from the a shared secret key (i.e., a sequence of bits known only to input-output pairs of the nonlocal game. The problem is them). For that, they send physical systems from Alice to said to have n input-output bits if |zin| + |zout| = n. Bob and share classical information over a public channel. We can then summarize the result by Bravyi et al. The protocol goes as follows: (i) Alice randomly picks one (2020) as follows. 
They show that for each n there ex- from two basis of qubit states (the computational basis ists a relation problem R with n input-output bits and a and the Hadamard basis) and sends Bob over a public set of inputs S, with |S| = poly(n) such that (i) R can and authenticated quantum channel a randomly chosen be solved for all inputs with a constant-depth quantum state of that basis. (ii) Bob picks a basis at random from circuit and (ii) any classical circuit with constant fan-in the two and measures in this basis the system received that solves R with probability psuccess ≥ 90%, for S uni- from Alice. (iii) Over a public channel, Bob announces formly distributed, has depth growing at least as fast as his bases and Alice announces those events in which the log(n). state sent belongs to the measured basis. (iv) Alice and In the same paper, Bravyi et al. (2020) generalize Bob repeat steps (i) to (iii) many times and, from the the result to the case of quantum circuits with local bits where both Alice and Bob used the same basis, Alice stochastic noise, namely, a random Pauli error is applied randomly chooses half of them and discloses her choices at each time step to the ideal quantum circuit. More over the public channel. Both Alice and Bob announce precisely, the authors show that a constant-depth noisy these bits publicly and run a check to see whether more quantum circuit, this time with a 3D geometric struc- than a certain number of them agree. If this check passes, ture, can solve the n-bit problem R with high probability then Alice and Bob use the remaining undisclosed bits to (psuccess ≥ 99%), whereas any classical noiseless circuit create a shared secret key via additional techniques like that solves it with psuccess ≥ 90% has depth growing at error correction and privacy amplification. least as fast as log(n)/log[log(n)]. Let us now describe Svozil’s attack to the BB84 pro- Beyond the technical details of the results and from a tocol (Svozil, 2006). 
The adversary replaces the prepa- contextuality perspective, the interesting observation is rations and measurements of quantum states by classical the following. The quantum circuit can solve the prob- preparations and measurements. So, in step (i), Alice lem with a constant depth due to its ability to generate is actually picking one of two differently colored eye- contextual correlations. In fact, any classical simulation glasses (instead of one of the two bases) and picking a of these contextual correlations requires communication ball from an urn (instead of picking one quantum state) among the parties, which in turn requires the depth of the with two color symbols in it (corresponding to the basis circuit to grow with the number of parties (or inputs of the state belongs to). Each one of the two differently 48 colored eyeglasses allows her to see only one of the two implies nonlocality), then they can use them to extract colors. Svozil observed that the adversary can mimic secure key in a device-independent manner. As an appli- the quantum predictions if: (a) each of the balls has one cation, the authors introduce a QKD protocol exploiting symbol Si ∈ {0, 1} written in two different colors cho- the properties of the Peres-Mermin (PM) magic square sen among the two possible pairs. Her choice of eyeglass (see Sec. III.B.1). We do not provide here the details of decides which symbols Alice can see. (b) All colors are the QKD protocol, we just describe the resources used equally probable and, for a given color, the two symbols and the steps to prove device-independent security. are equally probable. Therefore, in step (ii), Bob is ac- A distributed PM box, shared by Alice and Bob, is tually picking one of two differently colored eyeglasses defined as follows (Cabello, 2001a). Both Alice and Bob and reading the corresponding symbol. Since the re- have a PM set of observables. 
Alice measures columns of quirements (a) and (b) can be satisfied simultaneously, the PM magic square, while Bob measures rows, as it was the strategy can successfully imitate the quantum statis- first proposed in (Cabello, 2001a). That is, a distributed tics of the BB84 protocol so, if the replacement remains PM box is a set of 9 conditional distributions p(a, b|x, y), unnoticed to Alice and Bob, just by checking the statis- where x labels the columns of the PM table, y labels the tics they cannot realize that the adversary may have full rows, and a = (a1, a2, a3) and b = (b1, b2, b3) are the out- knowledge of their “secret” key. comes of the joint measurement of the three observables However, Svozil(2010) also notices that, if one replaces in the respective column or row, where ai, bj ∈ {+1, −1}. the quantum states and basis of the BB84 protocol by The outcomes are assumed to satisfy the correspond- those used in a proof of the KS theorem, then in the ing quantum predictions. That is, a1a2a3 = +1 for all classical attack requirements (a) and (b) cannot be sat- columns of the PM table, except for the last one (for isfied simultaneously and then the adversary cannot sim- which a1a2a3 = −1), and b1b2b3 = +1 for all rows. ulate the quantum statistics. Specifically, Svozil’s proto- In addition, there are perfect correlations between the col uses the 9-basis 18-state KS proof in dimension 4 in outcomes of the same observables on Alice’s side and Fig.3. Now, in step (i), Alice randomly picks a basis from Bob’s side. Finally, non-signaling holds. That is, Alice’s the nine bases in Fig.3 and sends Bob a randomly chosen (Bob’s) local distributions do not depend on the choice state of that basis. (ii) Bob picks a basis at random from of measurement by Bob (Alice). In quantum mechanics, the nine and measures the system received from Alice. 
such a distributed PM box can be realized, if both parties The remaining steps are as in BB84, but noticing that share two singlet states. now each measurement has four outcomes (rather than Consider a PM distributed box such that the parties two). do not know how it is implemented, i.e., what observables are measured, and in which quantum state. However, if they assume the validity of quantum mechanics as usually 2. Contextuality offers device-independent security done in the the device-independent paradigm (Ac´ın et al., 2007), it can be shown (Horodecki et al., 2010) that the The core idea behind BB84 is that information gain outcomes of a fixed row/column possess about 0.44 bits about one quantum observable must cause disturbance of intrinsic randomness and hence the correlations offer to another incompatible observable. This does not re- security. It can also be proven (Horodecki et al., 2010) quire entanglement or composite systems. However, that a secure key can be obtained both in the noiseless there is a second generation of QKD protocols, offer- case and assuming a small amount of noise in the state. ing much higher levels of security, which crucially relies on nonlocality. These protocols were initiated by Ek- ert(1991), stimulated by the work of Barrett, Hardy, C. Random number generation and Kent(2005), and lead to the schemes for device- independent QKD (Ac´ın et al., 2007), in which security The inherent randomness of quantum mechanics to- is verified solely through the statistics of the measure- gether with the impossibility of a classical simulation of ment outcomes, without making assumptions about the some of its aspects suggest the possibility of using it for inner working of the devices (except those that are stan- the generation of random numbers. For instance, proto- dard in cryptography). 
Since nonlocality can be seen as cols of randomness expansion, based on Bell nonlocality contextuality produced by local measurements on com- and with minimal assumptions on the measuring devices, posite systems, all these schemes may be considered as have been proposed (Colbeck, 2006). applications of contextuality. However, in most of them In the following, we will review a protocol of ran- the exact role of contextuality is difficult to follow. domness generation based on quantum contextuality pro- Here, we review a result and a corresponding QKD posed by Abbott et al. (2012). The main idea is to exploit scheme using entanglement presented in Horodecki et al. a Kochen-Specker-type contradiction, namely, the impos- (2010), in which local contextuality plays a crucial role. sibility of a pre-assigned valued to certain quantum prop- The result can be summarized as follows: if two par- erties of a system to claim that the outcomes generated ties share systems which, locally, shows a KS contradic- by the measurement of such properties are genuinely ran- tion and, in addition, exhibit perfect correlations (which dom. In contrast to previous approaches, Abbott et al. 49

(2012) not only use the impossibility of a simultaneous assignment to all variables (NCHV), but precisely localize which variable cannot have a definite value. The intuition is similar to the one at the basis of the “bug” graph in Fig. 5: if A is assigned the value 1, then B must be assigned the value 0. Abbott et al. (2012) extend this idea by proving a stronger result: they find a graph in d = 3 such that, whenever A is assigned the value 1, both the assignments B = 0 and B = 1 generate some contradiction. In this case B is said to be value indefinite. Moreover, they show that this graph can be constructed for any two projectors $P_A = |a\rangle\langle a|$, $P_B = |b\rangle\langle b|$ such that $\sqrt{5/14} \leq |\langle a|b\rangle| \leq 3/\sqrt{14}$. These bounds have been subsequently improved in Abbott et al. (2015), to the general condition $0 < |\langle a|b\rangle| < 1$. This construction relies on the assumption that value assignments respect QM predictions for one-dimensional projectors, in particular, orthogonality (O′) and completeness (C′) for an orthogonal basis (see Sec. III.A). Moreover, the assumption of a definite value for $P_A$ is translated, through the eigenstate assumption (Abbott et al., 2012), to the assumption of the preparation of an eigenstate of $P_A$.

The above argument can then be translated to a practical random number generation protocol consisting in preparing a three-level quantum system in the pure state $|\psi\rangle$ and then measuring it in a basis containing vectors $|\phi_+\rangle, |\phi_-\rangle$ such that $0 < |\langle\psi|\phi_\pm\rangle| < 1$. An explicit implementation is given in terms of the spin operators for a spin-1 system with $|\psi\rangle = |S_z = 0\rangle$ and $|\phi_\pm\rangle = |S_x = \pm 1\rangle$. In other words, the system is prepared in the eigenstate associated with 0 for the spin along the z direction, and a measurement in the $S_x$ basis is performed. Interestingly, by the geometry of the problem $\langle S_z = 0|S_x = 0\rangle = 0$ and $\langle S_z = 0|S_x = \pm 1\rangle = 1/\sqrt{2}$, implying that the outcome 0 never appears in the measurement of $S_x$ and the two (value indefinite) outcomes ±1 appear with equal probability. In addition to the realization with spin-1 systems, the authors discuss an implementation based on photon interferometry.

This approach has been explored experimentally by Kulikov et al. (2017), and the quality of the randomness produced in the experiment has been further analyzed in Abbott et al. (2019). Experimental random number generation based on contextuality has been also explored by Um et al. (2013, 2020), but not in the framework developed by Abbott et al. (2012).

D. Further applications

Finally, we would like to mention some other applications where contextuality has been proven useful.

1. Parity-oblivious multiplexing

This is an information processing task for two parties, where preparation contextuality, as explained in Sec. IV.E, is useful (Spekkens et al., 2009). Let us first describe the problem. Consider a two-party system, where Alice receives a bit string $x \in \{0, 1\}^n$ of length n. Bob receives a number $y \in \{1, \ldots, n\}$ and has to predict the bit $x_y$ of Alice’s string. In order to succeed, Alice can send Bob some information about her string. So far, this is a quite general scenario, which also occurs in random access codes (Ambainis et al., 2002); the interesting point is to put constraints on the information Alice is allowed to send to Bob and then investigate the physical consequences.

In the scenario considered by Spekkens et al. (2009) one adds the constraint that the information that Alice is allowed to send to Bob should not give any information about the parity of her string on any subset containing two or more bits. Mathematically formulated, let $s \in \{0, 1\}^n$ be an arbitrary bit string with at least two entries “1”, then no information on $\sum_i x_i s_i$ should be revealed, where addition is modulo two. This constraint makes the information transmission from Alice to Bob “parity oblivious”.

First, one can ask what is the optimal classical success probability for this game. In a classical system, the constraint effectively ensures that Alice can only transfer one single bit of the string x; without losing generality one can assume that this is the first bit. Then, Bob can predict the bit correctly for y = 1, and he has to guess for all other values of y. This leads to a success probability of $p(b = x_y) = 1/n + (1/2) \times (n-1)/n = (n+1)/(2n)$. In ontological models obeying the constraint of preparation noncontextuality, one also cannot exceed this value. The reason is that in these models parity-obliviousness at the level of Alice’s preparations and Bob’s measurements already implies parity-obliviousness at the level of hidden variables, see (Spekkens et al., 2009) for a detailed argumentation.

In quantum mechanics, however, this bound does not hold. Consider the case n = 2. The four possible strings for Alice can be encoded in four single-qubit states with the Bloch vectors lying in the x-y plane via $\vec{r}_{x_1,x_2} = ((-1)^{x_1}, (-1)^{x_2}, 0)/\sqrt{2}$, and the states are $\varrho_{x_1,x_2} = (\mathbb{1} + \vec{r}_{x_1,x_2} \cdot \vec{\sigma})/2$. Since $\varrho_{11} + \varrho_{00} = \varrho_{10} + \varrho_{01}$, no quantum measurement can give information on the parity of x. If Bob wishes to know $x_1$ he measures $\sigma_x$ and for predicting $x_2$ he measures $\sigma_y$. This gives the right bit with probability $\cos^2(\pi/8) \approx 0.8536$, which is larger than the classical optimum of 3/4.

The choice of the signal states is, of course, closely related to the examples of inequalities for preparation noncontextuality, see e.g. Eq. (64) in Sec. IV.E. The connection of parity-oblivious communication to preparation contextuality has been further generalized in several directions (Ambainis et al., 2019; Banik et al., 2015; Chailloux et al., 2016; Ghorai and Pan, 2018; Hameedi et al., 2017; Saha and Chaturvedi, 2019; Saha et al., 2019).
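The n = 2 example of parity-oblivious multiplexing can be checked numerically. The following is a minimal sketch in plain Python, assuming only the Bloch-vector encoding and the σx/σy decoding described in the text; the function names (`bloch_vector`, `prob_plus`, `quantum_success_probability`, `classical_bound`, `parity_mixture`) are illustrative choices and do not come from the cited works. It uses the standard fact that measuring the Pauli observable along a unit axis on a qubit with Bloch vector r yields outcome +1 with probability (1 + axis·r)/2.

```python
import math
from itertools import product

# Measurement axes on the Bloch sphere for sigma_x and sigma_y.
SIGMA_X = (1.0, 0.0, 0.0)
SIGMA_Y = (0.0, 1.0, 0.0)


def bloch_vector(x1, x2):
    """Bloch vector of the signal state encoding the bit string (x1, x2)."""
    s = 1.0 / math.sqrt(2.0)
    return ((-1) ** x1 * s, (-1) ** x2 * s, 0.0)


def prob_plus(axis, r):
    """Probability of outcome +1 when measuring axis.sigma on Bloch vector r."""
    return 0.5 * (1.0 + sum(a * b for a, b in zip(axis, r)))


def quantum_success_probability():
    """Average probability that Bob guesses x_y, for uniform x and y."""
    total = 0.0
    for x1, x2 in product((0, 1), repeat=2):
        r = bloch_vector(x1, x2)
        for target, axis in ((x1, SIGMA_X), (x2, SIGMA_Y)):
            p_plus = prob_plus(axis, r)
            # Bob guesses bit 0 on outcome +1 and bit 1 on outcome -1.
            total += p_plus if target == 0 else 1.0 - p_plus
    return total / 8.0  # 4 strings times 2 questions


def classical_bound(n):
    """Optimal classical (and preparation-noncontextual) value (n + 1) / (2n)."""
    return (n + 1) / (2 * n)


def parity_mixture(parity):
    """Bloch vector of the equal mixture of all strings with the given parity."""
    vecs = [bloch_vector(x1, x2)
            for x1, x2 in product((0, 1), repeat=2)
            if (x1 ^ x2) == parity]
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))


if __name__ == "__main__":
    print("quantum success  :", quantum_success_probability())
    print("cos^2(pi/8)      :", math.cos(math.pi / 8) ** 2)
    print("classical bound  :", classical_bound(2))
    print("parity-0 mixture :", parity_mixture(0))
    print("parity-1 mixture :", parity_mixture(1))
```

Both parity mixtures have the same (vanishing) Bloch vector, which is the statement $\varrho_{11} + \varrho_{00} = \varrho_{10} + \varrho_{01}$ in Bloch-vector form, so the encoding reveals nothing about the parity while still exceeding the bound of 3/4.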

2. State discrimination

The task of minimum error state discrimination is a well-studied problem since the early days of quantum information processing, see (Barnett and Croke, 2009) for a review. In the simplest scenario, two non-orthogonal states $|\psi\rangle$ and $|\phi\rangle$ are given with equal probability. Then, the task is to make a measurement and identify the state. As the states are non-orthogonal this cannot be done perfectly, so the task is to minimize the error probability of the guess. Note that there is also the different notion of unambiguous state discrimination, where no error is allowed, but it is possible to pass as a third option.

This can be connected to preparation noncontextuality as shown by Schmid and Spekkens (2018). Consider two single-qubit states $|\psi\rangle$ and $|\phi\rangle$ with overlap $c = |\langle\psi|\phi\rangle|^2$. We can assume without losing generality that their Bloch vectors are of the form $\vec{r}_{\psi/\phi} = (\cos(\alpha), 0, \pm\sin(\alpha))$. Then, the optimal measurement is clearly given by $\sigma_z$, leading to a success probability of $s = (1 + \sqrt{1 - c})/2$. One can consider in addition the orthogonal vectors $|\psi^\perp\rangle$ and $|\phi^\perp\rangle$ with the Bloch vectors $\vec{r}_{\psi^\perp/\phi^\perp} = (-\cos(\alpha), 0, \mp\sin(\alpha))$. They lead to essentially the same state discrimination problem, with the same success probability.

From the perspective of preparation contextuality it is important that $|\psi\rangle\langle\psi| + |\psi^\perp\rangle\langle\psi^\perp| = |\phi\rangle\langle\phi| + |\phi^\perp\rangle\langle\phi^\perp| = \mathbb{1}$. This puts constraints on the hidden variable distributions describing these four states in preparation-noncontextual theories, see also Eqs. (64, 66). Under these constraints and under the assumption that the relations and symmetries between the four states are preserved, one can prove that the success probability for the two-state problem given by $|\psi\rangle$ and $|\phi\rangle$ is bounded by $s \leq 1 - c/2$. This is, for any c, strictly lower than the quantum mechanical value given above (Schmid and Spekkens, 2018). This result can be shown to hold also for states affected by noise.

3. Zero-error channel capacities

In general, a classical channel N transforms inputs x ∈ X on Alice’s side to outputs y ∈ Y on Bob’s side, so it can be considered as a conditional probability distribution p(y|x). If only a single use of the channel is allowed, Bob may not be able to uniquely determine Alice’s input from his output. So, one may ask what is the largest subset of inputs that can perfectly be distinguished. This is also called the one-shot zero-error capacity $c_0(N)$ of the channel.

This quantity can be interpreted in a graph-theoretical manner, by using the so-called confusability graph G(N). The vertices of this graph are the input symbols x ∈ X and two vertices $x_1$ and $x_2$ are connected if and only if the probability distributions $p(y|x_1)$ and $p(y|x_2)$ overlap. This means that there is a possible output y which may originate from the two $x_i$, so these two inputs are confusable. In other words, the one-shot zero-error capacity $c_0(N)$ corresponds to the size of a maximum independent set of the confusability graph. In this language, Bob’s perspective can be described as follows: He receives an output y which may originate from several $x_i$. Any two of these $x_i$ are confusable, so the possible inputs $x_i$ form a clique in the confusability graph.

How does the capacity of a channel change if Alice and Bob also have access to additional resources such as shared randomness or entangled states? It was shown by Cubitt et al. (2010, 2011) that from any Kochen-Specker set of vectors one can construct an example of a channel where the one-shot zero-error capacity in the presence of a shared entangled state, denoted by $c_E(N)$, is strictly larger than the capacity without shared entanglement $c_0(N)$.

This connection is best explained with an example. Consider the nine orthogonal bases in four-dimensional space, which came out of the 18 vector proof in Fig. 3 in Sec. III.A. The overall 36 vectors can be organised in a 9 × 4 array $|\psi_{ij}\rangle$ and the indices (ij) constitute the input space X of the channel N. Then, one constructs the channel such that two inputs (ij) and (kl) are confusable if the vectors $|\psi_{ij}\rangle$ and $|\psi_{kl}\rangle$ are orthogonal. This can be achieved in different ways; for instance, one can just take as an output space Y = X and start with $p(y|x) = \delta_{xy}$, then small disturbances are added in order to build the desired confusability graph.

The resulting channel has now $c_0(N) \leq 8$. To see this, assume that $c_0(N) = 9$ (or larger). Clearly, the nine distinguishable $x_m$ have to belong to the nine different bases (or rows in the array), since inputs within a row are by construction not perfectly distinguishable by Bob. Moreover, if the same vector appears in two positions in the array ($|\psi_{ij}\rangle = |\psi_{kl}\rangle$) and (ij) belongs to the set $\{x_m\}$, then also (kl) belongs to the set, since $|\psi_{ij}\rangle$ is orthogonal to all other vectors in the row k. So, one arrives at an assignment of values to the 36 vectors that obeys the rules of noncontextuality, and this is by construction not possible.

On the other hand, it is easy to see that with the help of entanglement $c_E(N) \geq 9$. Assume that Alice and Bob share a maximally entangled state in a 4 × 4-system. Then, in order to send the row index i to Bob, Alice just performs a projective measurement of the corresponding basis on her part of the state. She obtains the random result j and sends (ij) through the channel. From the channel output y, Bob can identify a clique of four possible inputs (kl). The corresponding states $|\psi_{kl}\rangle$ are orthogonal, so he can identify Alice’s input by performing a projective measurement on his reduced state, which is given by $|\psi_{ij}\rangle$.

4. Dimension witnesses

As already mentioned in Section III.A, the Kochen-Specker theorem requires at least a three-dimensional Hilbert space. It is therefore natural to connect the violation of contextuality inequalities to the dimension.

For the case of the Peres-Mermin square, this has been done by Gühne et al. (2014). There, the contextuality inequality as in Eq. (40) has been considered and it has been studied how the violation depends on the underlying dimension. Then, it has been shown that

$$\langle \mathrm{PM} \rangle \overset{\text{2D, com.}}{\leq} 2 \overset{\text{3D, com.}}{\leq} 4(\sqrt{5} - 1) \approx 4.94, \qquad (105)$$

where the bounds hold for the respective dimensions under the assumption that the measurements are projective and obey the compatibility (or commutation) relations of the Peres-Mermin square. These bounds can be generalized to certain POVMs and also to the KCBS inequality (Gühne et al., 2014). More recently, a general method on dimension witnesses using the graph-theoretic approach has been introduced (Ray et al., 2021).

5. Self-testing

Quantum self-testing (Mayers and Yao, 2004) is the art of certifying quantum states, quantum measurements, and other quantum features from the input-output statistics of measurement experiments and some minimal assumptions, which do not include assumptions about the quantum system. The method is based on the observation that some input-output statistics corresponding to extremal points in the corresponding sets of quantum correlations can only be achieved, up to isometries, with specific states and measurements. The idea was initially used for self-testing quantum states and measurements in Bell scenarios (Mayers and Yao, 2004), and then extended to other features and scenarios. Here we review its application for self-testing states and measurements in contextuality experiments with sequences of ideal measurements (Bharti et al., 2019a,b) and for self-testing states and measurements in Bell scenarios by exploiting the connection between quantum contextuality and graph invariants (Bharti, 2021).

In the first case, the distinctive assumption is that measurements are ideal. Under this assumption, it has been proven (Bharti et al., 2019b) that the quantum violations of the KCBS inequality and all the tight inequalities for the odd n-cycle scenarios (with n ≥ 5) (Araújo et al., 2013) allow for self-testing. It has also been proven (Bharti et al., 2019a) that the quantum violations of the anti-hole noncontextuality inequalities (Cabello et al., 2013) allow for self-testing. The interest of the latter result lies in the fact that it allows for self-testing quantum states and measurements of any odd dimension d ≥ 3.

In (Bharti, 2021) it is shown that the connection between quantum contextuality and graph invariants makes it possible to simplify the proofs of self-testability of certain Bell nonlocal correlations that were known to allow for self-testing, identify new Bell nonlocal correlations that allow for self-testing, and prove a conjecture about the closed form expression of the Lovász theta number for a family of graphs.

6. Further applications on the horizon

Recently, some other works considered potential applications of contextuality, but at the moment it is difficult to predict the future impact of these research lines. The novel applications include machine learning (Gao et al., 2021), postselected metrology (Arvidsson-Shukur et al., 2020), and state-dependent cloning (Lostaglio and Senno, 2020).

VII. SUMMARY AND OUTLOOK

Since the discovery of quantum contextuality more than 50 years ago, the topic has received increasing attention and the largest number of significant contributions occurred just during the last decade. This development parallels the increased interest in Bell nonlocality and has been partially driven by the fast-growing community of quantum information scientists. A key breakthrough for quantum contextuality was the transformation of the logical contradiction that underlies the original theorem by Kochen and Specker (1967) to experimentally accessible noncontextuality inequalities, see Sec. IV. A recent key development has been the establishment of the connection between computational resources and the presence of contextuality, see Sec. VI. As the field of quantum contextuality is evolving faster than ever before, we identify three key topics that are, in our opinion, essential for the consolidation of our current understanding of quantum contextuality and the further development of the field.

First, the mathematical structure of quantum contextuality has not yet been fully revealed despite numerous seminal results. For example, the smallest scenario for state-independent contextuality is not known. It is likely that it will be the scenario by Yu and Oh (2012), but a conclusive proof has not yet been provided. Also, scenarios which are maximally contextual in certain ways have to be identified. For example, it is known (Amaral et al., 2014) that the quotient ϑ(G)/α(G) for an exclusivity graph G tends to the number of vertices of G, but a family of graphs with this property is not yet known.

Second, although there is a large number of convincing experiments that have confirmed quantum contextuality in physical systems, see Sec. IV.D, the handling of experimental imperfections, or “loopholes”, has not reached the thoroughness that has been achieved for Bell nonlocality (Brunner et al., 2014; Larsson, 2014). There are several, partially competing, methods to handle experimental imperfections, see Sec. IV.C, but a comprehensive description in a unified framework is missing. Even if a truly loophole-free experiment might be fundamentally impossible, this does not lessen the need for comprehensive treatment. Part of these difficulties can be traced back to the fact that quantum contextuality (with the notable exception of Spekkens’ notion of contextuality, see Sec.
IV.E) is based on the notion of ideal measure- 52 ments, and in the case of implementations with sequential Bourennane, Harvey Brown, Caslavˇ Brukner, Jeffrey measurements, on the role of L¨udersrule for ideal mea- Bub, Gustavo Ca˜nas, Jaime Cari˜ne, Gonzalo Carva- surements. Our understanding of both concepts within cho, Marcos Carvalho, Daniel Cavalcanti, Rafel Chaves, the foundations of quantum theory is not fully developed Jiang-Shan Chen, Jing-Ling Chen, Giulio Chiribella, and might be a source of our struggle with the design Andrea Chiuri, Sujit K. Choudhary, Rob Clifton, An- loophole-free contextuality experiments. drea Crespi, Jin-Ming Cui, Vincenzo D’Ambrosio, Lars The last topic we would like to mention is the role of E. Danielsen, Pierre-Louis de Assis, Dong-Ling Deng, contextuality in quantum computation and communica- Ehtibar Dhzafarov, Cristhiano Duarte, Joseph Emer- tion, see also Sec.VI. For example, there are yet no strong son, Paul Erker, Jos´eM. Estebaranz, Sebasti´anEtchev- methods that would allow to quantify the memory cost erry, Gabriel Fagundes, Armando Fern´andezPrieto, Jos´e of quantum contextuality. Specifically, it is not known Ferraz, Stefan Filipp, Manuel J. Freire, Tobias Fritz, whether there is a quantum advantage regarding the cost Diego Frustaglia, Chris Fuchs, Rodrigo Gallego, Ernesto when simulating a sequential implementation of quantum Galv˜ao,Ren´eGerritsma, Helena Granstr¨om,Esteban S. contextuality by means of a classical finite state machine. G´omez,Philippe Grangier, Mile Gu, Guang-Can Guo, Current affirmative results (see Sec. IV.B) are based on Yong-Jian Han, Lucien Hardy, Yuji Hasegawa, Teiko sequences of incompatible observables, which is an alien Heinosaari, Joe Henson, Isabelle Herbauts, Jonathan P. concept to quantum contextuality. 
A much broader and Home, Michael Horne, Pawel Horodecki, Mark Howard, more general question regards whether and how quan- Xiao-Min Hu, Yun-Feng Huang, Greg Jaeger, Dagomir tum contextuality plays a role in universal quantum com- Kaszlikowski, Adrian Kent, Michael Kernaghan, Andreii putation. So far this has been answered for the case Khrennikov, Kihwan Kim, Gerhard Kirchmair, J¨urgen of measurement-based quantum computation, quantum Klepp, Alexander Klyachko, Simon Kochen, Ravi Kun- computation via magic states, and shallow quantum cir- jwal, Pawel Kurzy´nski,Leong-Chuan Kwek, Brian La cuits, see Sec. VI.A. But, for example, whether and in Cour, Raymond Laflamme, RadekLapkiewicz,Matthew what sense contextuality plays a role in the circuit model Leifer, Florian M. Leupold, Gustavo Lima, Chuan-Feng is a widely open question. Besides those more specific Li, Qiang Li, Petr Lisonˇek, Bi-Heng Liu, Zheng-Hao questions, we expect various new key applications of con- Liu, Antonio J. L´opez Tarrida, Vicente Losada, Aintzane textuality in quantum information science to emerge in Lujambio, Maciej Malinowski, Shane Mansfield, Breno the near future. Marques, Paolo Mataloni, Hui-Xian Meng, David Mer- In conclusion, quantum contextuality plays a central min, David A. Meyer, Giovanni Morchio, Osama Moussa, role in quantum theory, encompassing both measurement Eleonora Nagali, Miguel Navascu´es,Mohamed Nawareg, incompatibility at a fundamental level, and Bell nonlo- Vlad Negnevitsky, Choo Hiap Oh, Roberto Osellame, Se- cality and entanglement when subsystems are spatially basti˜aoP´adua,Jian-Wei Pan, Mladen Paviˇci´c,Marcin separated, and is also strongly connected to new devel- Paw lowski, Asher Peres, Michel Planat, Angel´ R. Plas- opments in quantum technology. Our view is that quan- tino, Itamar Pitowsky, Ioannis Pitsios, Matt Pusey, Da- tum contextuality is at the heart of the matter, more vide Poderini, Jos´eR. 
Portillo, Marco T´ulioQuintino, so than quantum uncertainty or quantum interference; Rafael Rabelo, Magnus R˚admark, Ravishankar Ra- both of these could in principle be present in a classi- manathan, Helmut Rauch, Robert Raussendorf, Ma- cal model, whereas quantum contextuality cannot, as al- harshi Ray, Renato Renner, Christian Roos, Davide ready shown in Kochen and Specker(1967). Paraphras- Rusca, Carlos Saavedra, Muhammad Sadiq, Debashis ing the conclusion of that paper, “This way of viewing Saha, Ana Bel´enSainz, Emilio Santos, Valerio Scarani, the results [presented here], seems to us to display a new R¨udigerSchack, Claus Schmitzer, Fabio Sciarrino, Si- feature of quantum mechanics in its departure from clas- mone Severini, Abner Shimony, Rui Soares Barbosa, sical mechanics.” Quantum contextuality is what makes Alberto Sol´ıs, Adrian Specker, Ernst Specker, Susan quantum theory fundamentally non-classical, and will Specker, Rob Spekkens, Stephan Sponar, Allen Stairs, undoubtably play an important role in future develop- Hong-Yi Su, Kai Sun, Karl Svozil, Jochen Szangolies, ments of quantum physics. Marcelo Terra Cunha, Stefan Trandafir, Giuseppe Val- lone, Antonios Varvitsiotis, Mar´ıa C. Vel´azquez Ahu- mada, Giuseppe Vitagliano, Mordecai Waegell, Naqueeb Ahmad Warsi, Harald Weinfurter, Marcin Wie´sniak, An- Acknowledgments dreas Winter, Elie Wolfe, Chunfeng Wu, Guilherme B. Xavier, Ya Xiao, Jin-Shi Xu, Zhen-Peng Xu, Bin Yan, We thank Samson Abramsky, Antonio Ac´ın, Evelyn Sheng Ye, Sixia Yu, Xiao-Dong Yu, Florian Z¨ahringer, Acu˜na,Joseba Alonso, B´arbaraAmaral, Elias Amse- , Chi Zhang, Jie Zhou, Zong-Quan Zhou, lem, Leandro Aolita, Marcus Appleby, Mateus Ara´ujo, Marek Zukowski,˙ and Wojciech Zurek˙ for interesting dis- Mauricio Arias, Ali Asadian, Flavio Baccari, Guido Bac- cussions on quantum contextuality over the years. ciagaluppi, Piotr Badzi¸ag, Jos´eP. Baltan´as, Johanna F. 
Barra, Hannes Bartosik, Ingemar Bengtsson, Marco We also thank Alastair Abbott, Manik Banik, Kishor Bentivegna, Kishor Bharti, Kate Blanchfield, Rainer Bharti, Cris Calude, Hyppolite Dourdent, Pierre- Blatt, Naresh Goud Boddu, Gilberto Borges, Mohamed Emmanuel Emeriau, Alexei Grinbaum, Zheng-Hao Liu, 53

Shane Mansfield, Karl Svozil, and Armin Tavakoli for helpful comments on the manuscript.

This work has been supported by Project Qdisc (Project No. US-15097), with FEDER funds, Project No. FIS2017-89609-P (MINECO, Spain), with FEDER funds, QuantERA grant SECRET, by MINECO (Project No. PCI2019-111885-2), the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, Project No. 447948357 and No. 440958198), the FQXi Fund (Silicon Valley Community Foundation) through projects "The Nature of Information in Sequential Quantum Measurements" and "The Observer Observed: a Bayesian Route to the Reconstruction of Quantum Theory", the Sino-German Center for Research Promotion, the Austrian Science Fund (FWF) through projects ZK 3 (Zukunftskolleg) and F7113 (BeyondC), and the ERC (Consolidator Grant 683107/TempoQ).

Appendix A: Quantum contextuality from a historical perspective

Here, we present a historical introduction to quantum contextuality from its origins to the time when the basis for experimentally testing Kochen-Specker contextuality was established. The aim of this part is to frame the results presented in the review within a historical context and trace connections between them that may help us to understand the evolution and ramifications of the field.

1. The problem of hidden variables

The discussion, in the late 1920s, of whether quantum mechanics can be supplemented by "hidden variables" was motivated by two results: Born's probabilistic interpretation of Schrödinger's wave function (Born, 1926a,b), expressing the fundamentally probabilistic character of the predictions of quantum mechanics, and Heisenberg's uncertainty principle (Heisenberg, 1927), asserting a fundamental limit to the precision with which the values of position and momentum can be predicted in quantum mechanics. While Heisenberg, Born, Pauli, and, notably, Bohr made strong claims that quantum mechanics provided a complete framework for physics and expressed their skepticism about the possibility of completing it with hidden variables, Schrödinger, de Broglie, and, especially, Einstein hoped that incompatible observables such as position and momentum could be shown to have simultaneous values in a deeper non-probabilistic theory and viewed the quantum state as an incomplete description in need of supplementation by hidden variables (Lorentz, 1928).

At the Solvay Conference in 1927, de Broglie presented an explicit hidden variable theory (Lorentz, 1928). However, the criticisms received, particularly from Pauli (Lorentz, 1928), persuaded de Broglie to abandon his theory.

In 1931, the skepticism of Bohr received support from a proof of impossibility of hidden variables presented by von Neumann (1931) and included in his book (von Neumann, 1932, see Sec. IV.2). This proof was soon after shown to be inconclusive by Hermann (1935), but her work was mostly ignored for several years (Mermin and Schack, 2018). The influence of von Neumann's book then strongly discouraged any discussion of hidden variable theories for decades.

Paradoxically, at that time, Wigner (1932) found something that could have been used against hidden variables: when attempting to link Schrödinger's wave function to a distribution on phase space (which would be the analogue in quantum mechanics of the distribution function of classical statistical mechanics), Wigner found that such a distribution has negative values and cannot be made non-negative. The importance of this discovery was not recognized until much later.

In 1935, Einstein, Podolsky, and Rosen (EPR) showed that quantum mechanics is incomplete, in the sense that it does not assign definite outcomes to measurements whose results can be predicted with certainty from the outcomes of spacelike separated measurements. Several years later Bell (1964) would show that EPR's hidden variable theories collide with quantum mechanics but, at that time, the EPR argument reinforced Einstein's (and many others') reluctance to accept quantum mechanics as a final theory.

Meanwhile, von Neumann observed that the two-valued observables, represented in quantum mechanics by projection operators, constitute a sort of "logic" of experimental propositions and, together with Birkhoff (Birkhoff and von Neumann, 1936), developed a "quantum logic", a set of algebraic rules governing operations to combine, and predicates to relate, propositions associated with physical events. This logic would eventually provide a new basis for discussing the problem of hidden variables.

In 1952, Bohm (1952a,b) presented a hidden variable theory which is a further elaboration of de Broglie's theory of 1927. Bohm's theory is deterministic and explicitly nonlocal at the level of hidden variables.

In parallel, Mackey (1957) had asked whether every measure on the lattice of projections of a Hilbert space can be defined by a positive operator with unit trace. A positive answer would show that the Born rule follows from a particular set of axioms (framing a generalized probability theory) for quantum mechanics (Mackey, 1957, 1963). Though Kadison (Chernoff, 2009) (and later Bell (1966) and Kochen and Specker (1967)) proved this was false for two-dimensional Hilbert spaces, Gleason (1957) showed it to be true for higher dimensions. Gleason's theorem would go on to play a crucial role in the discussion of hidden variables. Mackey's program (Mackey, 1957) was further developed in several directions by Foulis and Randall (1972, 1974); Ludwig (1964, 1967, 1968, 1972); Piron (1964, 1976); Randall and Foulis (1970, 1973), and others. All these works provided the basis of what is nowadays called the framework of generalized probabilistic theories (see, e.g., Chiribella et al., 2010; Hardy, 2001), which views quantum (probability) theory as one possibility in a landscape of probability theories and asks what is special about it.

2. The Kochen-Specker theorem

In 1960, Specker, a mathematician with theological concerns, inspired by "the question whether the omniscience of God also extends to events that would have occurred in case something would have happened that did not happen" (Specker, 1960) and by the logic of Birkhoff and von Neumann, reformulated the question of hidden variables as follows: "Is it possible to extend the description of a quantum mechanical system through the introduction of supplementary – fictitious – propositions in such a way that in the extended domain the classical propositional logic holds". Specker found that "the answer to this question is negative, except in the case of Hilbert spaces of dimension 1 and 2" as "an elementary geometrical argument shows" (Specker, 1960) [quote from the English translation in Seevinck (2011)]. In fact, according to Specker (Meon, 1990) "the basic theorem of the paper was proved shortly [after a seminar on the foundations of quantum theory]", a seminar that probably took place during the summer semester of 1948 (see Enz et al., 1997). The geometrical argument was not fully presented until the paper of 1967 in collaboration with Kochen (Kochen and Specker, 1967), although its fundamental building block can already be found in Kochen and Specker (1965b) (Fig. 1, p. 182; see also Kochen and Specker, 1965a). The Kochen-Specker (KS) theorem shows the incompatibility between some predictions of quantum mechanics and a type of hidden variables that would later be called "noncontextual."

In 1963, although published in 1966, Bell developed a similar geometrical argument but using a more complex building block (Bell, 1966). Bell also used an infinite set of quantum observables. In contrast, Kochen and Specker managed to prove their theorem using 117 observables by concatenating their building block 15 times. In his paper, Bell seems to have found this geometrical argument after Jauch drew his attention to the consequences of Gleason's theorem for the problem of hidden variables (Bell, 1966). In fact, Bell later refers to this proof as "observed by Jauch" (Bell, 1971) and "subsequent set out by S. Kochen and E. P. Specker" (Bell, 1971) and, even later, Bell wrote that he "was told of it by J. M. Jauch in 1963" (Bell, 1982) and that "the idea was later rediscovered by Kochen and Specker" (Bell, 1982). As we pointed out, the idea was already in print in 1960.

More importantly, Bell was not convinced that the proof was compelling. His source of discomfort was the observation that measuring the same observable in different contexts "require[s] different experimental arrangements; [and thus] there is no a priori reason to believe that the results (...) should be the same" (Bell, 1966). Bell adds: "The result of observation may reasonably depend not only on the state of the system (including hidden variables) but also on the complete disposition of the apparatus" (Bell, 1966). Both Kochen and Specker (1967, see p. 73) and Bell (1966, see p. 451) seem to believe that the only way to measure the same observable in two contexts is by measuring two maximal (and incompatible) quantum observables, one for each context. They do not consider the possibility, also offered by quantum mechanics, of measuring each observable using the same apparatus so that in each context one measures sequentially the observables of the context, as is done in modern sequential contextuality experiments (e.g., Kirchmair et al., 2009).

Bell also noticed the nonlocality in Bohm's theory of 1952 (he writes: "in this theory an explicitly causal mechanism exists whereby the disposition of one piece of apparatus affects the results obtained with a distant piece," Bell, 1966) and how this is an unwanted feature, as it solves the EPR paradox "in the way Einstein would have liked least" (Bell, 1966). Finally, Bell pointed out that "there is no proof that any hidden variable account of quantum mechanics must have this extraordinary character. It would therefore be interesting (...) to pursue some further "impossibility proofs," replacing the arbitrary axioms objected to above by some condition of locality, or separability, of distant systems" (Bell, 1966). This led to Bell's famous proof of impossibility of "local" hidden variables (Bell, 1964).

3. The origin of the word "contextuality"

The term "contextuality" in association with quantum mechanics derives (Jaeger, 2019; Shimony, 2009) from the term introduced by Shimony (Shimony, 1971) to designate the hidden-variable theories "in which the value of an observable O is allowed to depend not only upon the hidden state λ, but also upon the set C of compatible observables measured along with O" (Shimony, 1971). Shimony called them "contextualistic" hidden-variable theories. The shortening to "contextual" was made by Beltrametti and Cassinelli (1981) and then adopted by Shimony and others. In the 1990s, "Contextuality" became the title of a chapter of Peres' influential book on quantum theory (Peres, 1993).

4. The relation between the KS and Bell theorems and the need for a theory-independent notion of noncontextuality

While Bell's theorem gained prominence among physicists and the general public after the experiments of Freedman and Clauser (1972), Clauser (1976a,b), Aspect et al. (1982), and others, and its applications to cryptography (Ekert, 1991) and quantum information, the KS theorem was for a long time a subject that interested primarily philosophers of science and a few physicists concerned about the foundations of quantum mechanics.

The situation began to change in the 1990s. On the one hand, Peres (1990, 1991, 1992, 1993) and Mermin (1990b, 1993) simplified the proof of the KS theorem using a small number of two- and three-qubit observables, making the KS theorem accessible to a wider audience. On the other hand, Mermin's Bell inequality (Mermin, 1990a) and his "unified form for the major no-hidden-variables theorems" (Mermin, 1990b, 1993) connected the proof of Greenberger, Horne, and Zeilinger (GHZ, 1989) with Bell inequalities and the KS theorem, respectively. Similar connections between Bell and KS theorems had been found before by Kochen in a private communication to Shimony (Heywood and Redhead, 1983; Stairs, 1983); Stairs (1983), to whom Mermin acknowledges input (Mermin, 1993); and Heywood and Redhead (1983), see also Brown and Svetlichny (1990).

There was still something that blocked unifying the KS theorem and Bell's theorem of impossibility of local hidden-variable theories. While Bell's theorem leads to experimental tests of whether the world can be explained with theories that can be defined without any reference to quantum mechanics, the KS theorem is deeply attached to quantum mechanics. This attachment is threefold.

First, the KS theorem does not refer to general measurements, but to those that are represented in quantum mechanics by the spectral projectors of a self-adjoint operator. What does this restriction mean from a theory-independent point of view? Moreover, in quantum mechanics there are measurements that are not represented by projective measurements but by positive operator valued measures (POVMs).

Second, the proof of the KS theorem includes constraints that are specific to quantum systems. Examples of these constraints are that the values of the squared spin components of spin-1 particles for any orthogonal triad {x, y, z} should satisfy $v(S_x^2) + v(S_y^2) + v(S_z^2) = 2$ (Kochen and Specker, 1967) and that the values for the Pauli observables of two spin-1/2 particles should satisfy $v(\sigma_x^{(1)})\, v(\sigma_x^{(2)})\, v(\sigma_x^{(1)} \otimes \sigma_x^{(2)}) = 1$ (Mermin, 1990b; Peres, 1990).

Third, the experimental translation of the KS theorem (as proposed by KS and Bell) assumes quantum mechanics, as it is assumed that coarse-grainings of two different (and incompatible) measurements represent the same observable based on the fact that, in quantum mechanics, both yield the same outcome statistics.

Therefore, the problem was how to translate the KS theorem into experimental tests of contextuality in nature (Cabello and García-Alcaine, 1998). For that, what was needed was a theory-independent notion of contextuality that removes all the quantum constraints and includes a theory-independent definition of the type of measurements for which the assumption of outcome noncontextuality is made (similar to the focus of Bell's theorem on local measurements), a definition of the sets of measurements (contexts) whose correlations are considered (similar to the focus of Bell's theorem on spatially separated local measurements), and a physical motivation for assuming outcome noncontextuality for these measurements and contexts (one that plays the same role as the impossibility of communication between spacelike separated events in Bell's theorem).

Nevertheless, the lack of such a formal framework did not impede experimental progress. The first "experiments towards falsification of noncontextual hidden variable theories" on single systems (Michler et al., 2000) took advantage of the analogy between two two-dimensional separated subsystems and two dichotomic degrees of freedom of a single photon and tested the violation of the single-particle equivalent of the Clauser-Horne-Shimony-Holt Bell inequality (Clauser et al., 1969) (see also Hasegawa et al., 2003 for a similar experiment with neutrons).

However, it was the criticisms of Meyer (1999), Kent (1999), and Clifton and Kent (2000) towards the idea of giving the KS theorem a similar experimental status as Bell's theorem (Cabello and García-Alcaine, 1998) that gave the definitive push to the transformation of contextuality into an experimentally testable property with no reference to quantum mechanics.

These criticisms sparked lively discussions (Appleby, 2000, 2001, 2002, 2005; Barrett and Kent, 2004; Cabello, 1999, 2002; Havlicek et al., 2001; Mermin, 1999; Peres, 2003) and, more importantly, stimulated new developments. On the one hand, they stimulated the attempt to obtain experimentally testable "KS inequalities" (Larsson, 2002; Simon et al., 2001). However, these first inequalities still made assumptions that hold only in quantum mechanics.

On the other hand, they stimulated a new notion of noncontextuality (Spekkens, 2005). This notion implicitly assumes that the hidden variables (or ontological models) merely provide a classical description of the same operations as those allowed in quantum mechanics, without any possibility of predictions deviating from those of quantum mechanics or even redundancy in the description, i.e., a different description at the level of the formalism for physically equivalent situations, as happens, e.g., for gauge symmetries.

5. Noncontextuality for ideal measurements

The final boost for a general theory-independent framework for contextuality rooted in the notion of noncontextuality used by KS (but free of the assumptions that only hold in quantum mechanics) was the discovery of the quantum violation of the Klyachko, Can, Binicioğlu, and Shumovsky (KCBS) inequality (Klyachko et al., 2008) by single qutrits in a specific quantum state, followed by the discovery of similar inequalities that are violated by any quantum state (of a given dimension) (Badziąg et al., 2009; Cabello, 2008). Unlike previous inequalities, the bounds of these inequalities are derived only from the assumption of outcome noncontextuality, without extra constraints inspired by quantum mechanics.

Interestingly, the KCBS inequality was introduced in an earlier paper (Klyachko, 2007) as a way of showing that single spin-1 particles can exhibit a form of "single-particle entanglement", defined as maximal uncertainty of a set of observables associated with a Lie algebra.

Then the question was: For what type of measurements and contexts is there an "a priori reason to believe that the results (...) should be the same"? (Bell, 1966). One possible answer is: for those measurements that yield the same result when performed repeatedly on the same physical system and do not produce any change in the outcomes of any jointly measurable observable, and for contexts made of compatible sets of them. These measurements are called "ideal" (Cabello, 2019b) or "sharp" (Chiribella and Yuan, 2016). Intuitively, ideal measurements reveal pre-existing context-independent "properties" of the measured system that are preserved after the act of measuring. However, in general, this may not be the case.

The focus on contexts made of compatible ideal measurements allows us to formulate a notion of contextuality in the operational framework of general probabilistic theories without any reference to quantum mechanics. This theory-independent notion of contextuality is referred to as "contextuality for ideal measurements" or "KS contextuality", as it is clearly inspired by the work of Kochen and Specker. Crucially, this notion allows us to replace or remove the two assumptions of the KS theorem which make reference to quantum mechanics, namely, (I) that measurements represented in quantum theory by self-adjoint operators reveal preexisting values which are independent of the "context", where a context means a set of measurements represented by mutually commuting self-adjoint operators, and (II) that measurement outcomes must satisfy the same functional relations that quantum mechanics predicts for commuting measurements on quantum systems of a given dimension. Instead, in KS contextuality, (I) is replaced by the assumption of outcome noncontextuality for ideal measurements and (II) is completely removed. More importantly, the new notion provides a basis for experimentally testing KS contextuality in nature.

6. The hidden history of noncontextuality inequalities

Interestingly, the mathematical tools needed for studying contextuality were developed independently of physics and long before quantum mechanics. Let us call a contextuality scenario a set of abstract ideal measurements, each of them having a number of possible outcomes, together with their relations of compatibility. For example, the scenario considered by KCBS (Klyachko et al., 2008) has 5 measurements M_i, i = 0, ..., 4, each of them with two possible outcomes, and such that M_i and M_{i+1} (with the sum modulo 5) are compatible. Therefore, in the KCBS contextuality scenario there are 5 contexts. For each contextuality scenario, a "matrix of correlation", "behavior" or simply "correlation" is a set of probabilities for all possible combinations of outcomes in each of the contexts. One obtains one of these correlations using a specific initial state and measurements. Probabilities have to satisfy the corresponding normalization and nondisturbance (or nonsignaling) constraints.

Similarly to what happens for Bell scenarios (Fine, 1982a,b; Froissart, 1981; Garg and Mermin, 1984; Pitowsky, 1986, 1989, 1991; Suppes and Zanotti, 1981), for any KS scenario, the set of correlations satisfying outcome noncontextuality is a polytope. Here it is called the noncontextuality polytope of the scenario. Correlations outside this set are "contextual" and violate one of the linear inequalities (in the probabilities) that define the facets of the noncontextuality polytope. Each of these facets corresponds to an inequality that is necessary for noncontextuality and is called a tight noncontextuality inequality. These inequalities were introduced long before quantum mechanics.

In 1990, during a symposium in Jerusalem and Tel Aviv coincidentally entitled "Einstein in context", Pitowsky distributed among the participants a draft (later published as Pitowsky, 1994) where he pointed out that Boole (1862), one of the fathers of modern logic, had developed a set of equalities and inequalities he called "conditions of possible experience" (Boole, 1862) and that the Bell inequalities violated by quantum mechanics were a subset of them.

This observation leads to the following questions: (i) Which of Boole's inequalities can be violated? (ii) What is the largest set of correlations that is possible for a given scenario? (iii) How does this set compare to the one in quantum theory? Answering these questions would have helped to answer the central question Pitowsky asked: "WHY is that microphysical phenomena and classical phenomena differ in the way they do?" (Pitowsky, 1994).

The answer to question (i) was known in the 1960s. A theorem introduced by Vorob'ev (or Vorob'yev, depending on the transliteration) (Vorob'ev, 1962, 1959; Vorob'yev, 1967) shows that a violation of Boole's inequalities can only occur for scenarios in which the graph of compatibility contains induced cycles of size larger than three (i.e., following the path along some edges of the graph one obtains a square, or a pentagon, or a hexagon, and so on). The graph of compatibility is the one in which compatible measurements are represented by adjacent vertices. Otherwise, there is always a joint probability distribution and, therefore, a noncontextual model. Bell inequalities violated by quantum mechanics correspond to scenarios with this property.

The Clauser-Horne-Shimony-Holt (CHSH) scenario (Clauser et al., 1970), with four dichotomic measurements whose graph of compatibility is a square, is therefore the one with the smallest number of ideal measurements that allow for contextual correlations. In fact, the CHSH inequality is the only nontrivial tight noncontextuality inequality for the CHSH scenario (Fine, 1982a,b).

Both Bell (1966) and Kochen and Specker (1967) noticed that the statistics of ideal measurements on a two-dimensional quantum system (or qubit) can be reproduced with noncontextual models. Therefore, an interesting question is in which scenario a three-dimensional quantum system (or qutrit) violates noncontextuality inequalities with ideal measurements, and which inequalities these are. The answer to the first question is the KCBS scenario (Klyachko et al., 2008). The KCBS inequality is the only tight noncontextuality inequality for the KCBS scenario (Araújo et al., 2013). The KCBS scenario, which was previously considered in some papers on quantum logic (Gerelle et al., 1974; Wright, 1978), is also the scenario with the smallest number of ideal measurements whose relations of compatibility (and incompatibility) cannot occur in a Bell scenario. All these features made the KCBS inequality a key to the world of KS contextuality.

References

Aaronson, S., and D. Gottesman (2004), Phys. Rev. A 70, 052328.
Abbott, A. A., C. S. Calude, J. Conder, and K. Svozil (2012), Phys. Rev. A 86, 062109.
Abbott, A. A., C. S. Calude, M. J. Dinneen, and N. Huang (2019), Physica Scripta 94, 045103.
Abbott, A. A., C. S. Calude, and K. Svozil (2015), J. Math. Phys. 56 (10), 102201.
Abbott, A. A., C. Giarmatzi, F. Costa, and C. Branciard (2016), Phys. Rev. A 94, 032131.
Abramsky, S., R. S. Barbosa, and S. Mansfield (2017), Phys. Rev. Lett. 119, 050504.
Abramsky, S., and A. Brandenburger (2011), New J. Phys. 13, 113036.
Acín, A., N. Brunner, N. Gisin, S. Massar, S. Pironio, and V. Scarani (2007), Phys. Rev. Lett. 98, 230501.
Anwer, H., N. Wilson, R. Silva, S. Muhammad, A. Tavakoli, and M. Bourennane (2019), arXiv:1904.09766 [quant-ph].
Aolita, L., R. Gallego, A. Acín, A. Chiuri, G. Vallone, P. Mataloni, and A. Cabello (2012), Phys. Rev. A 85, 032107.
Appleby, D. M. (2000), arXiv:quant-ph/0005010 [quant-ph].
Appleby, D. M. (2001), arXiv:quant-ph/0109034 [quant-ph].
Appleby, D. M. (2002), Phys. Rev. A 65, 022105.
Appleby, D. M. (2005), Stud. Hist. Philos. Sci. B 36, 1.
Appleby, D. M., I. Bengtsson, and S. Chaturvedi (2008), J. Math. Phys. 49, 012102.
Araújo, M., M. T. Quintino, C. Budroni, M. Terra Cunha, and A. Cabello (2013), Phys. Rev. A 88, 022118.
Aravind, P. K., and F. Lee-Elkin (1998), J. Phys. A 31, 9829.
Arends, F., J. Ouaknine, and C. W. Wampler (2011), in Graph-Theoretic Concepts in Computer Science, edited by P. Kolman and J. Kratochvíl (Springer, New York) pp. 23–34.
Arias, M., G. Cañas, E. S. Gómez, J. F. Barra, G. B. Xavier, G. Lima, V. D'Ambrosio, F. Baccari, F. Sciarrino, and A. Cabello (2015), Phys. Rev. A 92, 032126.
Arkhipov, A. (2012), arXiv:1209.3819 [quant-ph].
Arvidsson-Shukur, D. R. M., N. Yunger Halpern, H. V. Lepage, A. A. Lasek, C. H. W. Barnes, and S. Lloyd (2020), Nat. Commun. 11, 3775.
Asadian, A., C. Budroni, F. E. S. Steinhoff, P. Rabl, and O. Gühne (2015), Phys. Rev. Lett. 114, 250403.
Aspect, A., J. Dalibard, and G. Roger (1982), Phys. Rev. Lett. 49, 1804.
Avis, D. (2018), "lrs," http://cgm.cs.mcgill.ca/~avis/C/lrs.html.
Badziąg, P., I. Bengtsson, A. Cabello, and I. Pitowsky (2009), Phys. Rev. Lett. 103, 050401.
Ballentine, L. (2014), arXiv:1402.5689 [quant-ph].
Banik, M., S. S. Bhattacharya, S. K. Choudhary, A. Mukherjee, and A. Roy (2014), Foundations of Physics 44, 1230.
Banik, M., S. S. Bhattacharya, A. Mukherjee, A. Roy, A. Ambainis, and A. Rai (2015), Phys. Rev. A 92, 030103.
Barbieri, M., C. Cinelli, P. Mataloni, and F. De Martini (2005), Phys. Rev. A 72, 052110.
Barbieri, M., G. Vallone, F. De Martini, and P. Mataloni (2007), Opt. Spectrosc. 103, 129.
Acín, A., T. Fritz, A. Leverrier, and A. B. Sainz (2015), Commun. Math.
Phys. 334, 533. Barnett, N., and J. P. Crutchfield (2015), J. Stat. Phys. 161, Alda, V. (1980), Aplikace Matematiky 25, 373. 404. Ali, S. T., C. Carmeli, T. Heinosaari, and A. Toigo (2009), Barnett, S. M., and S. Croke (2009), Adv. Opt. Photonics 1, Found. Phys. 39, 593. 238. Amaral, B., and C. Duarte (2019), Phys. Rev. A 100, 062103. Barrett, J., L. Hardy, and A. Kent (2005), Phys. Rev. Lett. Amaral, B., C. Duarte, and R. I. Oliveira (2018), J. Math. 95, 010503. Phys. 59, 072202. Barrett, J., and A. Kent (2004), Stud. Hist. Philos. Sci. B Amaral, B., and M. Terra Cunha (2018), On graph ap- 35, 151. proaches to contextuality and their role in quantum theory Bartosik, H., J. Klepp, C. Schmitzer, S. Sponar, A. Cabello, (Springer). H. Rauch, and Y. Hasegawa (2009), Phys. Rev. Lett. 103, Amaral, B., M. Terra Cunha, and A. Cabello (2014), Phys. 040403. Rev. A 89, 030101. Bassi, A., K. Lochan, S. Satin, T. P. Singh, and H. Ulbricht Ambainis, A., M. Banik, A. Chaturvedi, D. Kravchenko, and (2013), Rev. Mod. Phys. 85, 471. A. Rai (2019), Quant. Inf. Process. 18, 1. Bechmann-Pasquinucci, H., and A. Peres (2000), Phys. Rev. Ambainis, A., A. Nayak, A. Ta-Shma, and U. Vazirani Lett. 85, 3313. (2002), J. ACM 49, 496. Beeri, C., R. Fagin, D. Maier, and M. Yannakakis (1983),J. Amselem, E., M. Bourennane, C. Budroni, A. Cabello, ACM 30, 479. O. G¨uhne,M. Kleinmann, J.-A.˚ Larsson, and M. Wie´sniak Belinfante, F. J. (1973), A Survey of Hidden-Variables Theo- (2013), Phys. Rev. Lett. 110, 078901. ries, 1st ed., Monographs in Natural Philosophy (Elsevier Amselem, E., M. R˚admark,M. Bourennane, and A. Cabello Ltd, Pergamon Press, New York). (2009), Phys. Rev. Lett. 103, 160405. Bell, J. S. (1964), Physics 1, 195. Anders, J., and D. E. Browne (2009), Phys. Rev. Lett. 102, Bell, J. S. (1966), Rev. Mod. Phys. 38, 447. 050502. Bell, J. S. (1971), in Foundations of Quantum Mechanics, 58

Proceedings of the International School of Physics ‘Enrico Fermi’, Vol. IL, edited by B. D’Espagnat (Academic Press, New York) pp. 171–181.
Bell, J. S. (1982), Found. Phys. 12, 989.
Beltrametti, E. G., and C. Cassinelli (1981), The Logic of Quantum Mechanics (Addison-Wesley, Reading, MA).
Bengtsson, I. (2009), AIP Conf. Proc. 1101, 241.
Bengtsson, I., K. Blanchfield, and A. Cabello (2012), Phys. Lett. A 376, 374.
Bennett, C. H., and G. Brassard (1984), in Proceedings of IEEE International Conference on Computers, Systems & Signal Processing, Bangalore, India (IEEE, New York) pp. 175–179.
Bermejo-Vega, J., N. Delfosse, D. E. Browne, C. Okay, and R. Raussendorf (2017), Phys. Rev. Lett. 119, 120505.
Bertlmann, R. A., and A. Zeilinger (2013), Quantum (un)speakables: from bell to quantum information (Springer Science & Business Media).
Bharti, K., M. Ray, A. Varvitsiotis, A. Cabello, and L.-C. Kwek (2019a), arXiv:1911.09448 [quant-ph].
Bharti, K., M. Ray, A. Varvitsiotis, N. A. Warsi, A. Cabello, and L.-C. Kwek (2019b), Phys. Rev. Lett. 122, 250403.
Bharti, K. M. (2021), In preparation.
Birkhoff, G., and J. von Neumann (1936), Ann. Math. 37, 823.
Blasiak, P. (2015), Ann. Phys. 353, 326.
Bohm, D. (1952a), Phys. Rev. 85, 166.
Bohm, D. (1952b), Phys. Rev. 85, 180.
Boole, G. (1862), Philos. Trans. R. Soc. Lond. 152, 225.
Borges, G., M. Carvalho, P.-L. de Assis, J. Ferraz, M. Araújo, A. Cabello, M. Terra Cunha, and S. Pádua (2014), Phys. Rev. A 89, 052106.
Born, M. (1926a), Z. Physik 38, 803.
Born, M. (1926b), Z. Physik 37, 863.
Boyd, S., and L. Vandenberghe (2004), Convex optimization (Cambridge University Press, Cambridge).
Brassard, G., A. Broadbent, and A. Tapp (2005), Found. Phys. 35, 1877.
Braunstein, S. L., and C. M. Caves (1988), Phys. Rev. Lett. 61, 662.
Braunstein, S. L., and C. M. Caves (1990), Ann. Phys. 202, 22.
Bravyi, S., D. Gosset, and R. König (2018), Science 362, 308.
Bravyi, S., D. Gosset, R. König, and M. Tomamichel (2020), Nat. Phys. 16, 1040.
Bravyi, S., and A. Kitaev (2005), Phys. Rev. A 71, 022316.
Bretto, A. (2013), Hypergraph theory (Springer, Heidelberg).
Briegel, H. J., D. E. Browne, W. Dür, R. Raussendorf, and M. Van den Nest (2009), Nat. Phys. 5, 19.
Brown, H. R., and G. Svetlichny (1990), Found. Phys. 20, 1379.
Brunner, N., D. Cavalcanti, S. Pironio, V. Scarani, and S. Wehner (2014), Rev. Mod. Phys. 86, 419.
Bub, J. (1996), Found. Phys. 26, 787.
Bub, J. (1997), Interpreting the Quantum World (Cambridge University Press, Cambridge).
Budroni, C., and A. Cabello (2012), J. Phys. A: Math. Theor. 45, 385304.
Budroni, C., G. Fagundes, and M. Kleinmann (2019), New J. Phys. 21, 093018.
Budroni, C., N. Miklin, and R. Chaves (2016), Phys. Rev. A 94, 042127.
Budroni, C., and G. Morchio (2010), J. Math. Phys. 51, 122205.
Budroni, C., G. Vitagliano, and M. P. Woods (2020), arXiv:2005.04241 [quant-ph].
Busch, P., P. Lahti, and R. F. Werner (2014), Rev. Mod. Phys. 86, 1261.
Busch, P., P. J. Lahti, and P. Mittelstaedt (1996), The Quantum Theory of Measurement, 2nd ed., Lecture Notes in Physics Monographs, Vol. 2 (Springer-Verlag Berlin Heidelberg).
Busch, P., P. J. Lahti, J.-P. Pellonpää, and K. Ylinen (2016), Quantum measurement, Vol. 890 (Springer).
Cabello, A. (1994), Eur. J. Phys. 15, 179.
Cabello, A. (1997), Phys. Rev. A 55, 4109.
Cabello, A. (1999), arXiv:quant-ph/9911024 [quant-ph].
Cabello, A. (2001a), Phys. Rev. Lett. 87, 010403.
Cabello, A. (2001b), Phys. Rev. Lett. 86, 1911.
Cabello, A. (2002), Phys. Rev. A 65, 052101.
Cabello, A. (2008), Phys. Rev. Lett. 101, 210401.
Cabello, A. (2009), in Foundations of Probability and Physics - 5 (Växjö, Sweden, 24–27 August 2008), edited by L. Accardi, G. Adenier, C. Fuchs, G. Jaeger, A. Y. Khrennikov, J.-Å. Larsson, and S. Stenholm (American Institute of Physics, New York) pp. 246–254.
Cabello, A. (2010), Phys. Rev. Lett. 104, 220401.
Cabello, A. (2011), arXiv:1112.5149 [quant-ph].
Cabello, A. (2012), arXiv:1212.1756 [quant-ph].
Cabello, A. (2013), Phys. Rev. Lett. 110, 060402.
Cabello, A. (2015), Phys. Rev. Lett. 114, 220402.
Cabello, A. (2016), Phys. Rev. A 93, 032102.
Cabello, A. (2019a), Philos. Trans. R. Soc. A 377, 20190136.
Cabello, A. (2019b), Phys. Rev. A 100, 032120.
Cabello, A. (2020), arXiv:2011.13790 [quant-ph].
Cabello, A., E. Amselem, K. Blanchfield, M. Bourennane, and I. Bengtsson (2012), Phys. Rev. A 85, 032108.
Cabello, A., L. E. Danielsen, A. J. López-Tarrida, and J. R. Portillo (2013), Phys. Rev. A 88, 032104.
Cabello, A., J. M. Estebaranz, and G. García-Alcaine (1996a), Phys. Lett. A 212, 183.
Cabello, A., J. M. Estebaranz, and G. García-Alcaine (1996b), Phys. Lett. A 218, 115.
Cabello, A., J. M. Estebaranz, and G. García-Alcaine (2005), Phys. Lett. A 339, 425.
Cabello, A., and G. García-Alcaine (1995), J. Phys. A: Math. Gen. 28, 3719.
Cabello, A., and G. García-Alcaine (1996), J. Phys. A: Math. Gen. 29, 1025.
Cabello, A., and G. García-Alcaine (1998), Phys. Rev. Lett. 80, 1797.
Cabello, A., M. Gu, O. Gühne, J.-Å. Larsson, and K. Wiesner (2016a), Phys. Rev. A 94, 052127.
Cabello, A., M. Gu, O. Gühne, and Z.-P. Xu (2018a), Phys. Rev. Lett. 120, 130401.
Cabello, A., M. Kleinmann, and C. Budroni (2015), Phys. Rev. Lett. 114, 250402.
Cabello, A., M. Kleinmann, and J. R. Portillo (2016b), J. Phys. A: Math. Theor. 49, 38LT01.
Cabello, A., and J.-Å. Larsson (2010), Phys. Lett. A 375, 99.
Cabello, A., J. R. Portillo, A. Solís, and K. Svozil (2018b), Phys. Rev. A 98, 012106.
Cabello, A., S. Severini, and A. Winter (2010), arXiv:1010.2163 [quant-ph].
Cabello, A., S. Severini, and A. Winter (2014), Phys. Rev. Lett. 112, 040401.
Cerf, N. J., N. Gisin, S. Massar, and S. Popescu (2005), Phys.
Rev. Lett. 94, 220403.
Chailloux, A., I. Kerenidis, S. Kundu, and J. Sikora (2016), New J. Phys. 18, 045003.
Chaturvedi, A., M. Farkas, and V. J. Wright (2020), arXiv:2010.05853 [quant-ph].
Chaves, R. (2013), Phys. Rev. A 87, 022102.
Chaves, R., and T. Fritz (2012), Phys. Rev. A 85, 032113.
Chaves, R., L. Luft, and D. Gross (2014), New J. Phys. 16, 043001.
Chen, Z.-B., J.-W. Pan, Y.-D. Zhang, Č. Brukner, and A. Zeilinger (2003), Phys. Rev. Lett. 90, 160408.
Chernoff, P. R. (2009), Not. AMS 87, 1253.
Chiribella, G., A. Cabello, M. Kleinmann, and M. P. Müller (2020), Phys. Rev. Res. 2, 042001.
Chiribella, G., G. M. D’Ariano, and P. Perinotti (2010), Phys. Rev. A 81, 062348.
Chiribella, G., and X. Yuan (2014), arXiv:1404.3348 [quant-ph].
Chiribella, G., and X. Yuan (2016), Inf. Comput. 250, 15.
Christof, T., and A. Loebel (2015), “Porta http://porta.zib.de/,”.
Cinelli, C., M. Barbieri, R. Perris, P. Mataloni, and F. De Martini (2005), Phys. Rev. Lett. 95, 240405.
Cirel’son, B. S. (1993), Hadron. J. Suppl. 8, 329.
Clauser, J. F. (1976a), Phys. Rev. Lett. 36, 1223.
Clauser, J. F. (1976b), Nuovo Cimento B 33, 740.
Clauser, J. F., M. A. Horne, A. Shimony, and R. A. Holt (1969), Phys. Rev. Lett. 23, 880.
Clauser, J. F., M. A. Horne, A. Shimony, and R. A. Holt (1970), Phys. Rev. Lett. 24, 549, (Erratum).
Cleve, R., P. Hoyer, B. Toner, and J. Watrous (2004), in Proceedings. 19th IEEE Annual Conference on Computational Complexity, pp. 236–249.
Clifton, R. K. (1993), Am. J. Phys. 61, 443.
Clifton, R. K., and A. Kent (2000), Proc. R. Soc. Lond. A 456, 2101.
Colbeck, R. (2006), Quantum and Relativistic Protocols for Secure Multi-Party Computation, Ph.D. thesis (University of Cambridge), arXiv:0911.3814 [quant-ph].
Conway, J., and S. Kochen (2006), Found. Phys. 36, 1441.
Conway, J., and S. Kochen (2009), Not. AMS 56, 226.
Conway, J. H., and S. Kochen (2000), “Reported in (Peres, 1993) and as a letter “The geometry of quantum paradoxes” in (Bertlmann and Zeilinger, 2013),”.
Correggi, M., and G. Morchio (2002), Ann. Phys. 296, 371.
Crutchfield, J. P. (1994), Phys. D: Nonlinear Phenom. 75, 11.
Cubitt, T. S., D. Leung, W. Matthews, and A. Winter (2010), Phys. Rev. Lett. 104, 230503.
Cubitt, T. S., D. Leung, W. Matthews, and A. Winter (2011), IEEE Trans. Inf. Theory 57, 5509.
van Dam, S. B., J. Cramer, T. H. Taminiau, and R. Hanson (2019), Phys. Rev. Lett. 123, 050401.
van Dam, W. (1999), Nonlocality & Communication Complexity, Ph.D. thesis (University of Oxford), chapter 9.
De Simone, A., and P. Pták (2015), Linear Algebra and its Applications 481, 243.
Delfosse, N., P. Allard Guerin, J. Bian, and R. Raussendorf (2015), Phys. Rev. X 5, 021003.
Delfosse, N., C. Okay, J. Bermejo-Vega, D. E. Browne, and R. Raussendorf (2017), New J. Phys. 19, 123024.
Diestel, R. (2018), Graph theory (Springer Publishing Company, Incorporated).
Durucan, S., and A. Grinbaum (2020), arXiv:2007.03450 [quant-ph].
Einstein, A., B. Podolsky, and N. Rosen (1935), Phys. Rev. 47, 777.
Ekert, A. K. (1991), Phys. Rev. Lett. 67, 661.
Elby, A. (1990a), Found. Phys. 20, 1389.
Elby, A. (1990b), Found. Phys. Lett. 3, 239.
Elby, A., and M. R. Jones (1992), Phys. Lett. A 171, 11.
Emary, C., N. Lambert, and F. Nori (2014), Rep. Prog. Phys. 77, 016001.
Emeriau, P.-E., M. Howard, and S. Mansfield (2020), arXiv:2007.15643 [quant-ph].
van Enk, S. J. (2007), Found. Phys. 37, 1447.
Enz, C. P., B. Glaus, and G. Oberkofler (1997), Wolfgang Pauli und sein Wirken an der ETH Zürich (vdf Hochschulverlag AG an der ETH Zürich).
Fagundes, G., and M. Kleinmann (2017), J. Phys. A: Math. Theor. 50, 325302.
Fine, A. (1982a), Phys. Rev. Lett. 48, 291.
Fine, A. I. (1982b), J. Math. Phys. 23, 1306.
Foulis, D. J., and C. H. Randall (1972), J. Math. Phys. 13, 1667.
Foulis, D. J., and C. H. Randall (1974), Synthese 29, 81.
Freedman, S. J., and J. F. Clauser (1972), Phys. Rev. Lett. 28, 938.
Frembs, M., S. Roberts, and S. D. Bartlett (2018), New J. Phys. 20, 103011.
Fritz, T. (2012), Rev. Math. Phys. 24, 1250012.
Fritz, T., and R. Chaves (2013), IEEE Trans. Inf. Theory 59, 803.
Fritz, T., A. B. Sainz, R. Augusiak, J. B. Brask, R. Chaves, A. Leverrier, and A. Acín (2013), Nat. Commun. 4, 1.
Froissart, M. (1981), Nuovo Cimento B 64, 241.
Frustaglia, D., J. P. Baltanás, M. C. Velázquez-Ahumada, A. Fernández-Prieto, A. Lujambio, V. Losada, M. J. Freire, and A. Cabello (2016), Phys. Rev. Lett. 116, 250404.
Fukuda, K. (2018), “cdd https://people.inf.ethz.ch/fukudak/cdd_home/,”.
Galindo, A. (1975), in Algunas Cuestiones de Física Teórica (G. I. F. T., Zaragoza) pp. 3–9.
Gao, X., E. R. Anschuetz, S.-T. Wang, J. I. Cirac, and M. D. Lukin (2021), arXiv:2101.08354 [quant-ph].
Garey, M. R., and D. S. Johnson (2002), Computers and intractability, Vol. 29 (W. H. Freeman, New York, NY).
Garg, A., and N. D. Mermin (1984), Found. Phys. 14, 1.
Gerelle, E. R., R. J. Greechie, and F. R. Miller (1974), in Physical Reality & Mathematical Description (Reidel, Dordrecht, Holland) pp. 169–192.
Ghorai, S., and A. K. Pan (2018), Phys. Rev. A 98, 032110.
Gibbons, K. S., M. J. Hoffman, and W. K. Wootters (2004), Phys. Rev. A 70, 062101.
Gill, R. D., and M. Keane (1996), J. Phys. A 29, L289.
Gleason, A. M. (1957), J. Math. Mech. 6, 885.
Godsil, C. D., and J. Zaks (1988), University of Waterloo Research Report No. CORR 88-12 arXiv:1201.0486 [math.CO].
Gottesman, D. (1997), Stabilizer codes and quantum error correction, Ph.D. thesis (California Institute of Technology), arXiv:quant-ph/9705052 [quant-ph].
Gould, E., and P. K. Aravind (2010), Found. Phys. 40, 1096.
Greechie, R. J. (1971), Journal of Combinatorial Theory, Series A 10 (2), 119.
Greenberger, D. M., M. A. Horne, and A. Zeilinger (1989), in Bell’s theorem, quantum theory and conceptions of the universe (Springer) pp. 69–72.
Grötschel, M., L. Lovász, and A. Schrijver (1993), Geometric
algorithms and combinatorial optimization: Algorithms and Combinatorics (Springer-Verlag, Berlin).
Gross, D. (2006), J. Math. Phys. 47, 122107.
Grudka, A., K. Horodecki, M. Horodecki, P. Horodecki, R. Horodecki, P. Joshi, W. Klobus, and A. Wójcik (2014), Phys. Rev. Lett. 112, 120401.
Grünbaum, B. (2003), Convex polytopes, 2nd ed. (Springer, New York).
Gu, M., K. Wiesner, E. Rieper, and V. Vedral (2012), Nat. Commun. 3, 762.
Gühne, O., C. Budroni, A. Cabello, M. Kleinmann, and J.-Å. Larsson (2014), Phys. Rev. A 89, 062107.
Gühne, O., M. Kleinmann, A. Cabello, J.-Å. Larsson, G. Kirchmair, F. Zähringer, R. Gerritsma, and C. F. Roos (2010), Phys. Rev. A 81, 022121.
Hales, A. W., and E. G. Straus (1982), Pac. J. Math. 99, 31.
Hameedi, A., A. Tavakoli, B. Marques, and M. Bourennane (2017), Phys. Rev. Lett. 119, 220402.
Hardy, L. (1992), Phys. Rev. Lett. 68, 2981.
Hardy, L. (1993), Phys. Rev. Lett. 71, 1665.
Hardy, L. (2001), arXiv:quant-ph/0101012 [quant-ph].
Hasegawa, Y., R. Loidl, G. Badurek, M. Baron, and H. Rauch (2003), Nature (London) 425, 45.
Hasegawa, Y., R. Loidl, G. Badurek, M. Baron, and H. Rauch (2006), Phys. Rev. Lett. 97, 230401.
Hauge, E. H., and J. A. Støvneng (1989), Rev. Mod. Phys. 61, 917.
Havlicek, H., G. Krenn, J. Summhammer, and K. Svozil (2001), J. Phys. A 34, 3071.
Heinosaari, T., T. Miyadera, and M. Ziman (2016), J. Phys. A: Math. Theor. 49, 123001.
Heinosaari, T., and M. M. Wolf (2010), J. Math. Phys. 51, 092201.
Heinosaari, T., and M. Ziman (2012), The mathematical language of quantum theory: from uncertainty to entanglement (Cambridge University Press, Cambridge, New York).
Heisenberg, W. (1927), Z. Physik 43, 172.
Henson, J. (2012), arXiv:1210.5978 [quant-ph].
Henson, J. (2015), Phys. Rev. Lett. 114, 220403.
Henson, J., and A. B. Sainz (2015), Phys. Rev. A 91, 042114.
Hermann, G. (1935), Abhandlungen der Fries’schen Schule 6, 75, reprinted in (Herrmann, 2019).
Herrmann, K. (2019), Grete Henry-Hermann: Philosophie–Mathematik–Quantenmechanik (Springer).
Heunen, C., T. Fritz, and M. L. Reyes (2014), Phys. Rev. A 89, 032121.
Heywood, P., and M. L. G. Redhead (1983), Found. Phys. 13, 481.
Hoban, M. J., E. T. Campbell, K. Loukopoulos, and D. E. Browne (2011), New J. Phys. 13, 023014.
Hoffmann, J., C. Spee, O. Gühne, and C. Budroni (2018), New J. Phys. 20, 102001.
Holevo, A. S. (1982), Probabilistic and Statistical Aspects of Quantum Theory (North-Holland, Amsterdam).
Horn, A., and A. Tarski (1948), Transactions of the American Mathematical Society 64, 467.
Horodecki, K., M. Horodecki, P. Horodecki, R. Horodecki, M. Pawlowski, and M. Bourennane (2010), arXiv:1006.0468 [quant-ph].
Howard, M., J. Wallman, V. Veitch, and J. Emerson (2014), Nature (London) 510, 351.
Huang, Y.-F., C.-F. Li, Y.-S. Zhang, J.-W. Pan, and G.-C. Guo (2003), Phys. Rev. Lett. 90, 250401.
Jaeger, G. (2019), Philos. Trans. R. Soc. A 377, 20190025.
Jäger, G., H. Läuchli, B. Scarpellini, and V. Strassen (1990), Ernst Specker Selecta (Birkhäuser Verlag Basel).
Jerger, M., Y. Reshitnyk, M. Oppliger, A. Potočnik, M. Mondal, A. Wallraff, K. Goodenough, S. Wehner, K. Juliusson, N. K. Langford, and A. Fedorov (2016), Nat. Commun. 7, 12930.
Johansson, N., and J.-Å. Larsson (2017), Quant. Inf. Process. 16, 233.
Johansson, N., and J.-Å. Larsson (2019), Entropy 21, 800.
Kellerer, H. G. (1964a), Math. Ann. 153, 168.
Kellerer, H. G. (1964b), Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 3, 247.
Kent, A. (1999), Phys. Rev. Lett. 83, 3755.
Kernaghan, M. (1994), J. Phys. A 27, L829.
Kernaghan, M., and A. Peres (1995), Phys. Lett. A 198, 1.
Kirchmair, G., F. Zähringer, R. Gerritsma, M. Kleinmann, O. Gühne, A. Cabello, R. Blatt, and C. F. Roos (2009), Nature (London) 460, 494.
Kiukas, J., and R. F. Werner (2010), J. Math. Phys. 51, 072105.
Kleinmann, M. (2014), J. Phys. A: Math. Theor. 47, 455304.
Kleinmann, M., C. Budroni, J.-Å. Larsson, O. Gühne, and A. Cabello (2012), Phys. Rev. Lett. 109, 250402.
Kleinmann, M., O. Gühne, J. R. Portillo, J.-Å. Larsson, and A. Cabello (2011), New J. Phys. 13, 113011.
Klyachko, A. A. (2007), in Physics and Theoretical Computer Science: From Numbers and Languages to (Quantum) Cryptography, edited by J.-P. Gazeau, J. Nešetřil, and B. Rovan (IOP Press, London) pp. 25–54.
Klyachko, A. A., M. A. Can, S. Binicioğlu, and A. S. Shumovsky (2008), Phys. Rev. Lett. 101, 020403.
Knuth, D. E. (1994), Electron. J. Comb. 1, A1.
Kochen, S. (1970), “letter to A. Shimony, early in the 1970s, reported in (Stairs, 1983) and in (Heywood and Redhead, 1983),”.
Kochen, S., and E. P. Specker (1965a), in Proceedings of the 1964 International Congress for Logic, Methodology and Philosophy of Science, Jerusalem, edited by Y. Bar-Hillel (North-Holland, Amsterdam) pp. 45–57.
Kochen, S., and E. P. Specker (1965b), in Symposium on the Theory of Models: Proceedings of the 1963 International Symposium at Berkeley, edited by J. W. Addison, L. Henkin, and A. Tarski (North-Holland, Amsterdam) pp. 177–189.
Kochen, S., and E. P. Specker (1967), J. Math. Mech. 17, 59.
Kofler, J., and Č. Brukner (2013), Phys. Rev. A 87, 052115.
Kong, X., M. Shi, F. Shi, P. Wang, P. Huang, Q. Zhang, C. Ju, C. Duan, S. Yu, and J. Du (2012), arXiv:1210.0961 [quant-ph].
Kong, X., M. Shi, M. Wang, F. Shi, P. Wang, F. Kong, P. Huang, Q. Zhang, W. Ma, H. Chen, C. Ju, M. Tian, C. Duan, S. Yu, and J. Du (2016), arXiv:1602.02455 [quant-ph].
Krips, H. (1987), The Metaphysics of Quantum Theory (Clarendon Press, Oxford).
Krishna, A., R. W. Spekkens, and E. Wolfe (2017), New J. Phys. 19, 123031.
Kujala, J. V., E. N. Dzhafarov, and J.-Å. Larsson (2015), Phys. Rev. Lett. 115, 150401.
Kulikov, A., M. Jerger, A. Potočnik, A. Wallraff, and A. Fedorov (2017), Phys. Rev. Lett. 119, 240501.
Kunjwal, R. (2015), Phys. Rev. A 91, 022108.
Kunjwal, R. (2020), Quantum 4, 219.
Kunjwal, R., and S. Ghosh (2014), Phys. Rev. A 89, 042118.
Kunjwal, R., C. Heunen, and T. Fritz (2014), Phys. Rev. A 89, 052126.
Kunjwal, R., and R. W. Spekkens (2015), Phys. Rev. Lett. 115, 110403.
Kunjwal, R., and R. W. Spekkens (2018), Phys. Rev. A 97, 052110.
Kurzyński, P., R. Ramanathan, and D. Kaszlikowski (2012), Phys. Rev. Lett. 109, 020404.
La Cour, B. R. (2009), Phys. Rev. A 79, 012102.
Landauer, R., and T. Martin (1994), Rev. Mod. Phys. 66, 217.
Lapkiewicz, R., P. Li, C. Schaeff, N. Langford, S. Ramelow, M. Wieśniak, and A. Zeilinger (2011), Nature (London) 474, 490.
Larsson, J.-Å. (2002), Europhys. Lett. 58, 799.
Larsson, J.-Å. (2012), AIP Conf. Proc. 1424, 211.
Larsson, J.-Å. (2014), J. Phys. A 47, 424003.
Larsson, J.-Å., M. Kleinmann, O. Gühne, and A. Cabello (2011), in Advances in Quantum Theory (Växjö, Sweden, 14–17 June 2010), edited by G. Jaegger, A. Y. Khrennikov, M. Schlosshauer, and G. Weihs (American Institute of Physics, New York) pp. 401–409.
Lauritzen, S. L. (1996), Graphical models (Clarendon Press).
Laversanne-Finot, A., A. Ketterer, M. R. Barros, S. P. Walborn, T. Coudreau, A. Keller, and P. Milman (2017), J. Phys. A: Math. Gen. 50, 155304.
Leggett, A. J., and A. Garg (1985), Phys. Rev. Lett. 54, 857.
Leifer, M., and C. Duarte (2020), Phys. Rev. A 101, 062113.
Leifer, M. S., and O. J. E. Maroney (2013), Phys. Rev. Lett. 110, 120401.
Leupold, F. M., M. Malinowski, C. Zhang, V. Negnevitsky, A. Cabello, J. Alonso, and J. P. Home (2018), Phys. Rev. Lett. 120, 180401.
Li, T., Q. Zeng, X. Song, and X. Zhang (2017), Sci. Rep. 7, 44467.
Liang, Y.-C., R. W. Spekkens, and H. M. Wiseman (2011), Phys. Rep. 506, 1.
Lillystone, P., and J. Emerson (2019), arXiv:1904.04268 [quant-ph].
Lillystone, P., J. J. Wallman, and J. Emerson (2019), Phys. Rev. Lett. 122, 140405.
Lisoněk, P., P. Badziąg, J. R. Portillo, and A. Cabello (2014), Phys. Rev. A 89, 042101.
Liu, B.-H., X.-M. Hu, J.-S. Chen, Y.-F. Huang, Y.-J. Han, C.-F. Li, G.-C. Guo, and A. Cabello (2016), Phys. Rev. Lett. 117, 220402.
Liu, B. H., Y. F. Huang, Y. X. Gong, F. W. Sun, Y. S. Zhang, C. F. Li, and G. C. Guo (2009), Phys. Rev. A 80, 044101.
Löhr, W., and N. Ay (2009), in International Conference on Complex Sciences (Springer) pp. 265–276.
Lorentz, H. A. (1928), Électrons et Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24 au 29 Octobre 1927 sous les Auspices de L’Institut International de Physique Solvay (Gauthier-Villars, Paris).
Lörwald, S., and G. Reinelt (2015), EURO J. Comput. Optim. 3, 297.
Lostaglio, M., and G. Senno (2020), Quantum 4, 258.
Lovász, L. (1979), IEEE Trans. Inf. Theory 25, 1.
Lovász, L. (2009), Geometric representations of graphs (Lecture notes).
Lüders, G. (1951), Ann. Phys. (Leipzig) 443, 323, (German, English translation (Lüders, 2006)).
Lüders, G. (2006), Ann. Phys. (Leipzig) 15, 663.
Ludwig, G. (1964), Z. Physik 181, 233.
Ludwig, G. (1967), Commun. Math. Phys. 4, 331.
Ludwig, G. (1968), Commun. Math. Phys. 9, 1.
Ludwig, G. (1972), Commun. Math. Phys. 26, 78.
Lund, C., and M. Yannakakis (1994), J. ACM 41, 960.
Mackey, G. W. (1957), Am. Math. Mon. 64, 45.
Mackey, G. W. (1963), Mathematical Foundations of Quantum Mechanics (Benjamin, New York).
Malinowski, M., C. Zhang, F. M. Leupold, A. Cabello, J. Alonso, and J. P. Home (2018), Phys. Rev. A 98, 050102.
Malvestuto, F. M. (1988), Discrete Math. 69, 61.
Mansfield, S., and E. Kashefi (2018), Phys. Rev. Lett. 121, 230401.
Mao, Y., C. Spee, Z.-P. Xu, and O. Gühne (2020), arXiv:2005.13964 [quant-ph].
Mari, A., and J. Eisert (2012), Phys. Rev. Lett. 109, 230503.
Markiewicz, M., D. Kaszlikowski, P. Kurzyński, and A. Wójcik (2019), npj Quant. Inf. 5, 1.
Masanes, L., and M. P. Müller (2011), New J. Phys. 13, 063001.
Matsuno, S. (2007), J. Phys. A: Math. Theor. 40, 9507.
Matúš, F. (2007a), Discrete Math. 307, 2464.
Matúš, F. (2007b), in Information Theory, 2007. ISIT 2007. IEEE International Symposium on (IEEE) pp. 41–44.
Mayers, D., and A. Yao (2004), Quant. Inf. Comput. 4, 273.
Mazurek, M. D., M. F. Pusey, R. Kunjwal, K. J. Resch, and R. W. Spekkens (2016), Nat. Commun. 7, 11780.
Mealy, G. H. (1955), Bell Syst. Tech. J. 34, 1045.
Megill, N. D., K. Fresl, M. Waegell, P. K. Aravind, and M. Pavičić (2011), Phys. Lett. A 375, 3419.
Meon, J. (1990), [pseudonym for E. P. Specker], printed in (Jäger et al., 1990, p. XI).
Mermin, N. D. (1990a), Phys. Rev. Lett. 65, 1838.
Mermin, N. D. (1990b), Phys. Rev. Lett. 65, 3373.
Mermin, N. D. (1993), Rev. Mod. Phys. 65, 803.
Mermin, N. D. (1999), arXiv:quant-ph/9912081 [quant-ph].
Mermin, N. D., and R. Schack (2018), Found. Phys. 48, 1007.
Meyer, D. A. (1999), Phys. Rev. Lett. 83, 3751.
Michler, M., H. Weinfurter, and M. Żukowski (2000), Phys. Rev. Lett. 84, 5457.
Moussa, O., C. A. Ryan, D. G. Cory, and R. Laflamme (2010), Phys. Rev. Lett. 104, 160501.
Navascués, M., Y. Guryanova, M. J. Hoban, and A. Acín (2015), Nat. Commun. 6, 6288.
Navascués, M., S. Pironio, and A. Acín (2007), Phys. Rev. Lett. 98, 010401.
Navascués, M., S. Pironio, and A. Acín (2008), New J. Phys. 10, 073013.
Navascués, M., and H. Wunderlich (2010), Proc. R. Soc. A 466, 881.
von Neumann, J. (1931), Ann. Math. 32, 191.
von Neumann, J. (1932), Mathematische Grundlagen der Quantenmechanik (Springer-Verlag, Berlin).
Neumark, M. A. (1940a), Izv. Akad. Nauk SSSR Ser. Mat. 4, 53.
Neumark, M. A. (1940b), Izv. Akad. Nauk SSSR Ser. Mat. 4, 277.
Neumark, M. A. (1943), C. R. (Doklady) Acad. Sci. URSS (N.S.) 41, 359.
de Obaldia, E., A. Shimony, and F. Wittel (1988), Found. Phys. 18, 1013.
Oestereich, A. L., and E. F. Galvão (2017), Phys. Rev. A 96,
062305.
Palsson, M. S., M. Gu, J. Ho, H. M. Wiseman, and G. J. Pryde (2017), Sci. Adv. 3, e1601302.
Pavičić, M. (2006), Quantum Computation and Quantum Communication: Theory and Experiments (Springer, New York).
Pavičić, M., N. D. Megill, P. K. Aravind, and M. Waegell (2011), J. Math. Phys. 52, 022104.
Pavičić, M., J.-P. Merlet, B. D. McKay, and N. D. Megill (2005), J. Phys. A 38, 1577.
Pawlowski, M., T. Paterek, D. Kaszlikowski, V. Scarani, A. Winter, and M. Żukowski (2009), Nature (London) 461, 1101.
Penrose, R. (2000), in Quantum Reflections, edited by J. Ellis and D. Amati (Cambridge University Press, Cambridge) pp. 1–27.
Peres, A. (1990), Phys. Lett. A 151, 107.
Peres, A. (1991), J. Phys. A: Math. Gen. 24, L175.
Peres, A. (1992), Found. Phys. 22, 357.
Peres, A. (1993), Quantum Theory: Concepts and Methods (Kluwer, Dordrecht).
Peres, A. (2003), arXiv:quant-ph/0310035 [quant-ph].
Peres, A., and A. Ron (1988), in Microphysical Reality and Quantum Formalism, Vol. 2, edited by A. van der Merwe, F. Selleri, and G. Tarozzi (Kluwer, Dordrecht) pp. 115–123.
Piron, C. (1964), Helv. Phys. Acta 37, 439.
Piron, C. (1976), Foundations of Quantum Mechanics (W. A. Benjamin Inc., Reading, Massachusetts).
Pitowsky, I. (1986), J. Math. Phys. 27, 1556.
Pitowsky, I. (1989), Quantum Probability-Quantum Logic, Lecture Notes in Physics, Vol. 321 (Springer-Verlag, Berlin).
Pitowsky, I. (1991), Math. Program. 50, 395.
Pitowsky, I. (1994), Br. J. Philos. Sci. 45, 95.
Planat, M. (2012), Eur. Phys. J. Plus 127, 1.
Planat, M. (2013), in Symmetries and Groups in Contemporary Physics, edited by C. Bai, J.-P. Gazeau, and M.-L. Ge (World Scientific, Singapore) p. 295.
Plastino, A. R., and A. Cabello (2010), Phys. Rev. A 82, 022114.
Popescu, S., and D. Rohrlich (1994), Found. Phys. 24, 379.
Pusey, M. F. (2018), Phys. Rev. A 98, 022112.
Qu, D., P. Kurzyński, D. Kaszlikowski, S. Raeisi, L. Xiao, K. Wang, X. Zhan, and P. Xue (2020), Phys. Rev. A 101, 060101.
Rabelo, R., C. Duarte, A. J. López-Tarrida, M. Terra Cunha, and A. Cabello (2014), J. Phys. A: Math. Theor. 47, 424021.
Rabiner, L. R. (1989), Proc. IEEE 77, 257.
Raeisi, S., P. Kurzyński, and D. Kaszlikowski (2015), Phys. Rev. Lett. 114, 200401.
Ramanathan, R., and P. Horodecki (2014), Phys. Rev. Lett. 112, 040404.
Ramanathan, R., A. Soeda, P. Kurzyński, and D. Kaszlikowski (2012), Phys. Rev. Lett. 109, 050404.
Randall, C. H., and D. J. Foulis (1970), Am. Math. Mon. 77, 363.
Randall, C. H., and D. J. Foulis (1973), J. Math. Phys. 14, 1472.
Raussendorf, R. (2013), Phys. Rev. A 88, 022322.
Raussendorf, R., and H. J. Briegel (2001), Phys. Rev. Lett. 86, 5188.
Raussendorf, R., D. E. Browne, N. Delfosse, C. Okay, and J. Bermejo-Vega (2017), Phys. Rev. A 95, 052334.
Ray, M., N. G. Boddu, K. Bharti, L.-C. Kwek, and A. Cabello (2021), New J. Phys. 10.1088/1367-2630/abcacd.
Redhead, M. L. G. (1987), Incompleteness, Nonlocality, and Realism (Oxford University Press, New York).
Renner, R., and S. Wolf (2004), in International Symposium on Information Theory, 2004. ISIT 2004. Proceedings. (IEEE) pp. 322–322.
Ruuge, A. E. (2007), J. Phys. A: Math. Theor. 40, 2849.
Ruuge, A. E. (2012), J. Phys. A: Math. Theor. 45, 465304.
Sadiq, M., P. Badziąg, M. Bourennane, and A. Cabello (2013), Phys. Rev. A 87, 012128.
Saha, D., and A. Chaturvedi (2019), Phys. Rev. A 100, 022108.
Saha, D., P. Horodecki, and M. Pawlowski (2019), New J. Phys. 21, 093057.
Saniga, M., and M. Planat (2012), Quant. Inf. Comput. 11, 1011.
Schmid, D., J. Selby, E. Wolfe, R. Kunjwal, and R. W. Spekkens (2019), arXiv:1911.10386 [quant-ph].
Schmid, D., J. H. Selby, M. F. Pusey, and R. W. Spekkens (2020), arXiv:2005.07161 [quant-ph].
Schmid, D., and R. W. Spekkens (2018), Phys. Rev. X 8, 011015.
Schmid, D., R. W. Spekkens, and E. Wolfe (2018), Phys. Rev. A 97, 062103.
Seevinck, M. P. (2011), arXiv:1103.4537 [physics.hist-ph].
Shannon, C. (1956), IRE Trans. Inf. Theory 2 (3), 8.
Shimony, A. (1971), in Foundations of Quantum Mechanics: Proceedings of the International School of Physics “Enrico Fermi”, Course IL, Varenna on Lake Como, Villa Monastero, 29th June–11th July 1970, edited by B. D’Espagnat (Academic Press, New York) pp. 182–194.
Shimony, A. (2009), in Compendium of Quantum Physics: Concepts, Experiments, History and Philosophy, edited by D. Greenberger, K. Hentschel, and F. Weinert (Springer-Verlag, Berlin) pp. 287–291.
Simon, C., Č. Brukner, and A. Zeilinger (2001), Phys. Rev. Lett. 86, 4427.
Soares Barbosa, R. (2014), EPTCS 172, 36.
Soares Barbosa, R. (2015), Contextuality in quantum mechanics and beyond, Ph.D. thesis (University of Oxford).
Soares Barbosa, R., T. Douce, P.-E. Emeriau, E. Kashefi, and S. Mansfield (2019), arXiv:1905.08267 [quant-ph].
Specker, E. (1960), Dialectica 14, 239.
Specker, E. (1999), “Private communication to Karl Svozil, mentioned in (Abbott et al., 2012).”.
Spee, C., C. Budroni, and O. Gühne (2020), New J. Phys. 22, 103037.
Spekkens, R. W. (2005), Phys. Rev. A 71, 052108.
Spekkens, R. W. (2007), Phys. Rev. A 75, 032110.
Spekkens, R. W. (2014), Found. Phys. 44, 1125.
Spekkens, R. W. (2019), arXiv:1909.04628 [physics.hist-ph].
Spekkens, R. W., D. H. Buzacott, A. J. Keehn, B. Toner, and G. J. Pryde (2009), Phys. Rev. Lett. 102, 010401.
Stairs, A. (1978), Quantum Mechanics, Logic and Reality, Ph.D. thesis (University of Western Ontario, London, Ontario).
Stairs, A. (1983), Philos. Sci. 50, 578.
Stomphorst, R. G. (2002), Phys. Lett. A 292, 213.
Suppes, P., and M. Zanotti (1981), Synthese 48, 191.
Svozil, K. (2006), Am. J. Phys. 74, 800.
Svozil, K. (2010), in Physics and Computation 2010, edited by H. Guerra (University of Azores, Ponta Delgada, Portugal)
pp. 235–249. (2011), Found. Phys. 41, 883. Szangolies, J. (2015), Testing Quantum Contextuality: The Wigner, E. (1932), Phys. Rev. 40, 749. Problem of Compatibility (Springer Spektrum, Wiesbaden). Wilde, M. M., and A. Mizel (2012), Found. Phys. 42, 256. Szangolies, J., M. Kleinmann, and O. G¨uhne(2013), Phys. Winter, A. (2014), J. Phys. A: Math. Theor. 47, 424031. Rev. A 87, 050101. Wright, R. (1978), in Mathematical Foundations of Quantum Tavakoli, A., and R. Uola (2020), Phys. Rev. Research 2, Theory, edited by A. R. Marlow, Chap. The state of the 013011. pentagon. A nonclassical example (Academic Press, New Tavakoli, A., E. Zambrini Cruzeiro, R. Uola, and A. A. Ab- York) pp. 255–274. bott (2020), arXiv:2010.04751 [quant-ph]. Xu, Z.-P., and A. Cabello (2017), Phys. Rev. A 96, 012122. Toh, S. P. (2013a), Chin. Phys. Lett. 30, 100302. Xu, Z.-P., and A. Cabello (2019), Phys. Rev. A 99, 020103. Toh, S. P. (2013b), Chin. Phys. Lett. 30, 100303. Xu, Z.-P., J.-L. Chen, and O. G¨uhne(2020a), Phys. Rev. Toner, B. F., and D. Bacon (2003), Phys. Rev. Lett. 91, Lett. 124, 230401. 187904. Xu, Z.-P., J.-L. Chen, and H.-Y. Su (2015), Phys. Lett. A Um, M., X. Zhang, J. Zhang, Y. Wang, Y. Shen, D.-L. Deng, 379, 1868. L.-M. Duan, and K. Kim (2013), Sci. Rep. 3, 1627. Xu, Z.-P., D. Saha, H.-Y. Su, M. Pawlowski, and J.-L. Chen Um, M., Q. Zhao, J. Zhang, P. Wang, Y. Wang, M. Qiao, (2016), Phys. Rev. A 94, 062103. H. Zhou, X. Ma, and K. Kim (2020), Phys. Rev. Appl. 13, Xu, Z.-P., X.-D. Yu, and M. Kleinmann (2020b), 034077. arXiv:2011.04048 [quant-ph]. Uola, R., A. C. S. Costa, H. C. Nguyen, and O. G¨uhne(2020), Yan, B. (2013), Phys. Rev. Lett. 110, 260406. Rev. Mod. Phys. 92, 015001. Yang, T., Q. Zhang, J. Zhang, J. Yin, Z. Zhao, M. Zukowski,˙ Uola, R., G. Vitagliano, and C. Budroni (2019), Phys. Rev. Z.-B. Chen, and J.-W. Pan (2005), Phys. Rev. Lett. 95, A 100, 042117. 240406. Veitch, V., C. Ferrie, D. Gross, and J. Emerson (2012), New Yeung, R. W. 
(2008), Information theory and network cod- J. Phys. 14, 113011. ing, Information technology–transmission, processing, and Veitch, V., S. A. H. Mousavian, D. Gottesman, and J. Emer- storage (Springer). son (2014), New J. Phys. 16, 013009. Yu, S., and C. H. Oh (2012), Phys. Rev. Lett. 108, 030402. Vorob’ev, N. (1962), Theory Probab. Appl. 7, 147. Yu, X.-D., Y.-Q. Guo, and D. M. Tong (2015), New J. Phys. Vorob’ev, N. (1963), Theory Probab. Appl. 8, 420. 17, 093001. Vorob’ev, N. N. (1959), Dokl. Akad. Nauk SSSR 124, 253. Yu, X.-D., and D. M. Tong (2014), Phys. Rev. A 89, 010101. Vorob’yev, N. N. (1967), Theory Probab. Appl. 12, 251. Zhan, X., E. G. Cavalcanti, J. Li, Z. Bian, Y. Zhang, H. M. Vourdas, A. (2004), Rep. Prog. Phys. 67, 267. Wiseman, and P. Xue (2017), Optica 4, 966. Waegell, M. (2014), Phys. Rev. A 89, 012321. Zhang, A., H. Xu, J. Xie, H. Zhang, B. J. Smith, M. S. Kim, Waegell, M., and P. K. Aravind (2010), J. Phys. A 43, and L. Zhang (2019), Phys. Rev. Lett. 122, 080401. 105304. Zhang, X., M. Um, J. Zhang, S. An, Y. Wang, D.-l. Deng, Waegell, M., and P. K. Aravind (2011a), J. Phys. A 44, C. Shen, L.-M. Duan, and K. Kim (2013), Phys. Rev. 505303. Lett. 110, 070401. Waegell, M., and P. K. Aravind (2011b), Found. Phys. 41, Zhang, Z. (2003), Commun. Inf. Syst. 3, 47. 1786. Zimba, J. R., and R. Penrose (1993), Stud. Hist. Philos. Sci. Waegell, M., and P. K. Aravind (2012), J. Phys. A: Math. A 24, 697. Theor. 45, 405301. Zu, C., Y.-X. Wang, D.-L. Deng, X.-Y. Chang, K. Liu, P.-Y. Waegell, M., and P. K. Aravind (2013a), Phys. Lett. A 377, Hou, H.-X. Yang, and L.-M. Duan (2012), Phys. Rev. Lett. 546. 109, 150401. Waegell, M., and P. K. Aravind (2013b), Phys. Rev. A 88, Zu, C., Y.-X. Wang, D.-L. Deng, X.-Y. Chang, K. Liu, P.-Y. 012102. Hou, H.-X. Yang, and L.-M. Duan (2013), Phys. Rev. Lett. Waegell, M., and P. K. Aravind (2015), J. Phys. A: Math. 110, 078902. Theor. 48, 225301. Zurel, M., C. Okay, and R. Raussendorf (2020), Phys. Rev. Waegell, M., and P. K. 
Aravind (2017), Phys. Rev. A 95, Lett. 125, 260404. 050101. Waegell, M., P. K. Aravind, N. D. Megill, and M. Paviˇci´c