
Universidad de Los Andes

Undergraduate Thesis

Backward Causation in Weak Measurements

Author: Sebastián Murgueitio Ramírez

Supervisor: Alonso Botero, Ph.D.

A thesis submitted in fulfilment of the requirements for the degree of Physicist

in the

Department of Physics

August 6, 2014

Declaration of Authorship

I, Sebastián Murgueitio Ramírez, declare that this thesis titled ‘Backward Causation in Weak Measurements’ and the work presented in it are my own. I confirm that:

• This work was done wholly or mainly while in candidature for a research degree at this University.

• Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.

• Where I have consulted the published work of others, this is always clearly attributed.

• Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.

• I have acknowledged all main sources of help.

• Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Signed:

Date:

“What is time? If no one asks me, I know; but if I were desirous to explain it to one that should ask me, plainly I know not.”

Saint Augustine

UNIVERSIDAD DE LOS ANDES

Abstract

Faculty of Science

Department of Physics

Physicist

Backward Causation in Weak Measurements

by Sebastián Murgueitio Ramírez

In this thesis I argue that the weak values revealed in weak measurements are to be understood as processes that involve backward causation. The first part of the thesis provides a philosophical examination of the concepts of causation and backward causation: the main theories of causation are examined, and the principal objections against the possibility of backward causation are addressed. The second part presents the theory of weak measurements, together with the problem of how these weak measurements are to be interpreted. The last part carefully examines the dependence of the weak values on the future post-selection of the system. Different positions on this dependence on the future are analyzed, and it is argued that the only satisfactory way of explaining it is by appealing to backward causation. At the end of the work an optical experiment, a modified Bell-type experiment, is proposed. This experiment will not only stress the allegedly “backward in time” nature of the weak values but will also shed new light on the old Einstein-Bohr debate about the completeness of the quantum description of nature.

Acknowledgements

I would like to thank my advisor Alonso Botero, who introduced me to the fascinating field of weak measurements and who also gave me great advice during this semester. Alonso encouraged me to pursue a project that kept my main interests close, namely, physics and philosophy. It certainly was a fruitful (and demanding) experience. I would also like to thank Veronica, with whom I shared countless discussions about time and causation. Without her help this project would have been more difficult to carry out. In addition, I would like to thank Alejandra Valencia, who helped me plan the experiment proposed in the last chapter of the present work. I am very grateful to Pedro and Leonardo as well; with them I had many interesting and helpful debates about quantum mechanics. I also want to thank my friends Jose, Camilo, Manuelito, Laco and Maria P., who have made these past years very nice and unforgettable. Last but not least, I want to thank my lovely family for their constant support and help. They encouraged me to study physics and philosophy and provided me with both the material means and the emotional support needed to pursue my studies.

Contents

Declaration of Authorship
Abstract
Acknowledgements
List of Figures
List of Tables

1 Introduction
  1.1 Thesis
  1.2 Structure of the work

I Backward Causation

2 Causation
  2.1 What is causation?
    2.1.1 Regularist theories of causation
    2.1.2 Counterfactual theories of causation
    2.1.3 Probabilistic theories of causation
    2.1.4 Manipulabilist theories
    2.1.5 Process theories of causation

3 Backward causation
  3.1 Historical approach
    3.1.1 Theories of causation dealing with retro causation
  3.2 Objections
    3.2.1 The Bilking Argument
    3.2.2 A brief discussion of the arrow of time
      3.2.2.1 The symmetry of the second law
      3.2.2.2 A subjective arrow of time
      3.2.2.3 The Block Universe
  3.3 Backward causation in physics
    3.3.1 The Wheeler-Feynman absorber theory of radiation
    3.3.2 Cramer’s transactional interpretation of quantum mechanics
    3.3.3 Wheeler delayed choice experiment
  3.4 Broad overview of the chapter

II Weak Measurements

4 Indirect Measurements
  4.1 Indirect or ancilla measurement
    4.1.1 Interaction between the system and the pointer
    4.1.2 Reading the meter
  4.2 The Von Neumann protocol
  4.3 Some results
  4.4 Example

5 Weak Measurements
  5.1 An intuitive approach
    5.1.1 Definition of the weak value
  5.2 Mechanical interpretation of weak values
    5.2.1 Playing with pre and post-selected ensembles
    5.2.2 The action-reaction picture
    5.2.3 General pointer variable statistics
    5.2.4 Comments
  5.3 Example 2
  5.4 Final Remarks of the chapter

III Backward causation revealed in weak measurements

6 On the physical meaning of weak values
  6.1 Two possible objections
    6.1.1 Against the postulates of quantum mechanics?
    6.1.2 Weak values can be complex or very eccentric
  6.2 Two Stern-Gerlach Experiment
    6.2.1 Predictions in the weak regime
  6.3 More objections
    6.3.1 The Error View (EV)
    6.3.2 The coincidence view (CV)
    6.3.3 The no dependence on the future view (NDFV)

7 Backward causation in weak measurements
  7.1 An experiment
  7.2 Hidden variables, again
    7.2.1 The independence assumption (IA)
    7.2.2 A common cause in the future
  7.3 The Two-State Vector Formalism (TSVF)
    7.3.1 The ABL rule
    7.3.2 The Two States
    7.3.3 TSVF and weak measurements
    7.3.4 A last objection
  7.4 Summary of the chapter

8 Conclusions

A Derivation of some results
  A.1 Derivation of Eq. 4.13
  A.2 Derivation of Eq. 4.16
  A.3 Derivation of Eq. 4.22

B Details of the experiment

Bibliography

List of Figures

3.1 Set up of Wheeler’s delayed choice experiment.

4.1 Pointer’s distribution for a hypothetical case in which |α₀|² > |β₀|², ΔQ = 0.01 and g = 2/ħ.

5.1 Pointer’s distribution for ΔQ̂ = 1.

5.2 Pointer’s distribution for ΔQ̂ = 10.

5.3 Probability density for m, g = 20/ħ.

5.4 g = 0.002/ħ.

5.5 Pointer’s distribution for the pre and post selected states of example 2. The dispersion is ΔQ = 20, which corresponds to the weak regime (it is much bigger than the difference between the eigenvalues). I have drawn a line centered around the peak of the distribution so that the reader can easily see that this value is approximately equal to the weak value computed with Eq. 5.38.

5.6 Pointer’s distribution for the observable Ŝ_z with the condition of the pre and post selected states of example 2. I have set ΔQ = 0.1, which corresponds to the strong regime. Note that we have sharp peaks, centered around the eigenvalues 1 and −1. The right peak is considerably higher, as we expect from the ABL rule applied to this particular example (see footnote in the present or previous page).

6.1 Two Stern-Gerlach.

6.2 Screen Two Strong measurements.

6.3 Weakscreen.

7.1 Set up of the proposed experiment.

7.2 List of position per photon.

7.3 Table after post-selection.

7.4 Schematic representation of the outcomes in the case we post-select the final state given by Eq. 7.2. Notice the clear anti-correlation between the weak value and the final strong value of the opposite arm (compare green boxes with green boxes, and blue ones with blue ones).

7.5 Schematic representation of the backward light cones for the final measurements. Notice that the cones overlap in a region (in the past of the measurements) that includes the moment in which the photons are released. Therefore, at any moment during this overlapping region, it is possible for the final measurements to carry information (from the future) about the measurement performed, and one photon can locally transmit the information to its partner.

B.1 A type-II non-linear crystal is pumped by a pulsed laser and generates a pair of entangled photons via SPDC. Then the photons pass through the birefringent crystals, which induce a spatial displacement of the polarization (see Fig. B.2). The photons are strongly measured by the polarizer, and then they pass through two lenses that project the incoming light onto the imaging plane (an image is created). These images arrive at optical fibers that will move (see arrows) so that they serve as screens.

B.2 Illustration of how the birefringent crystal induces a displacement d of the |H⟩ and |V⟩ components of polarization for an incident beam (exactly the same principle applies to the other arm but with a displacement of |A⟩ and |D⟩). b) By increasing the waist of the beam, it is possible to obtain the conditions for weakness, since now the displacement is less than the initial waist of the beam (we cannot distinguish between the components). Analogously, we can obtain a weak measurement of the polarization by making the crystal sufficiently short so that the ray is almost unperturbed.

B.3 Pointer’s distribution for a weak measurement without post-selection. a) is a 3D plot and b) is the contour plot. Note that the Gaussian is centered at (0, 0), which indicates that the weak measurement of both arms yielded zero (the expectation value of σ̂_z and σ̂_x is zero for the ).

B.4 Pointer’s distribution for a weak measurement with the post-selection |ψ^(f)⟩ = |D⟩₁ ⊗ |H⟩₂. b) and c) are close-ups of the distributions; b) shows the x axis while c) shows the y axis. d) is a contour plot of the distribution.

B.5 Strong to weak measurement transition. It is interesting to note how the weak value is obtained as a result of special interference effects between the different states of the meter. The weakness was controlled by means of the dispersion; a) corresponds to a dispersion of ΔQ = 0.1; b) to a dispersion of ΔQ = 0.6; c) to a dispersion of ΔQ = 0.8; and d) to a dispersion of ΔQ = 1.2.

B.6 The four pointer’s distributions for the four possible final states. Each plot has as a title the post-selection to which it corresponds; for example, graphic c) corresponds to the post-selection |D⟩₁ ⊗ |V⟩₂. The reader should pay attention to the perfect anti-correlation between the results of the weak measurements and the final post-selection; the center of the Gaussian in the four cases indicates that the weak measurement yielded a state perfectly anti-correlated with the final state of the post-selection. For example, the center of the Gaussian in d) indicates that the weak measurement yielded a value consistent with the states |H⟩₁ ⊗ |D⟩₂ (see Table B.1), clearly anti-correlated with the final state of that particular post-selection, i.e., with the state |A⟩₁ ⊗ |V⟩₂.

List of Tables

7.1 This table shows how the backward causation position handles the problem of the interpretation of weak values and the outcomes of EPR experiments, in comparison with a position that denies the possibility of backward causation. Neither position is intuitive, but one might argue that the idea of backward causation renders that position completely counterintuitive. However, I consider that the “forward causation” position is far from being intuitive, mainly because of the lack of “realism” embodied by the violation of the Bell and Leggett inequalities. Someone could perfectly well mark either position as non-intuitive, or argue that the “backward causation” position, as long as it avoids very counterintuitive ideas, is much more intuitive than its rival.

B.1 Results of the measurements in the right and left arm, according to the final center of the Gaussian corresponding to the pointer’s distribution. This interpretation, however, is only valid in the strong regime, because in the weak one the system is not collapsed to the eigenstates.

To my family.

Chapter 1

Introduction

Suppose that you are peacefully walking down the street and suddenly a man starts yelling at you: “The number of coins you have right now in your pockets depends on what you decide to eat tomorrow night”. Almost certainly you will think that he has gone crazy, and you might feel a bit scared. Perhaps you think of giving him some of your coins so that he does not harm you in any way! After this initial surprise, you continue to think about the man’s words, just for fun. The number of coins you have right now surely depends on many factors, among them what you decided to eat during the day, but also the change you received during the day (from the coffee you drank, the bus you took, and the snacks you just bought). However, how can it be that something you have not yet decided, something that a friend of yours could decide for you, something that is uncertain because it lies in the future, might determine the number of coins you carry with you at this instant? As you think about this absurd idea, the crazy man yells again, but now at an innocent woman who is walking behind you. You listen carefully, because you are very curious: “If tomorrow you decide to eat a hot dog, then the number of coins you presently have will be different from the number you have now if you decide to eat a pizza instead. You will remember my words!” “Stop yelling at me!”, replies the woman (he has definitely put her in a bad mood). “What can I eat tomorrow to change your annoying presence?”, she adds. You laugh silently, and there is an awkward silence.

On your way home, you continue reflecting on everything that happened (because you are very curious). Your surprise or fear at the man’s words is of course due to the absurd idea that actions in the future could have any effect on our present. If this were so, it would imply that causes could occur after their effects, as if the ball started to move before I kicked it, or as if the window suddenly broke now because of the rock that is currently flying towards it. The very idea of backward causation, that is, the idea of a kind of causation in which effects precede their causes, looks logically contradictory. However, history has taught us that our intuitions are not always good indicators of how the world is. Quantum mechanics, the most successful theory of physics yet, has been describing for over a century a world very different from the world in which our intuitions live: things are both waves and particles; things do not have defined properties unless we measure them; things cannot have, simultaneously, well-defined properties such as momentum and position; particles evolve in a deterministic way but, at the same time, seem to behave in an indeterminate fashion once we come to measure them; and so on. So, would it really be a big deal if quantum mechanics included, together with non-locality, indeterminism in measurement, non-“reality” and so on, one more weird thing, namely, the possibility of backward causation? Perhaps we could be a little more tolerant (or less?) with the “crazy” man if his words were instead: “The present of the particles is determined, in part, by what we will measure later”. If these were the man’s words, then I would be in complete agreement with him, because in this thesis I will defend the idea that certain physical processes in quantum mechanics seem to be retro-causal processes (“backward causation” and “retro causation” are synonyms).

1.1 Thesis

It is time to abandon the crazy man’s story and present the main objectives and the general structure of the present work. The main thesis of the present work is the following: quantum-mechanical weak measurements provide evidence for physical processes that involve backward causation, that is, physical processes in which effects occur before their causes. As the reader might suspect, this is a highly polemical thesis, not only because the idea of effects occurring earlier than their causes seems absurd (this is why we judged the “yelling” man as crazy in the first place), but also because even if we could make sense in our imagination of a fancy world in which effects preceded their causes, this imagined world would certainly not be ours, since in our world we always encounter causes that precede their effects. In fact, there is an important philosophical debate about the legitimacy of this type of causation; for some philosophers, backward causation is a logical impossibility. Thus, this thesis requires a rigorous defense against several opponents. Nevertheless, I hope to convince the reader that the fight is not even nearly lost, however unpromising the panorama might look in the beginning.

The development of the main thesis of the present work will require a twofold investigation: 1) a philosophical investigation focused on the concepts of causation and backward causation, and 2) a physical discussion of the theory and interpretation of weak measurements. The central objective of the first investigation is to clarify to what extent backward causation is a consistent (not self-contradictory) idea, so that we can better understand the conditions that a physical process must satisfy in order to count as a retro-causal process. The main objective of the physical investigation, on the other hand, is to explain the details of the weak measurement theory and to provide an interpretation of the weak values in light of distinct physical arguments. By means of the two investigations, I expect to prepare the ground for the last and most important part of the work: showing why and how weak measurements can be taken to indicate that in our world not every cause precedes its respective effect.

1.2 Structure of the work

Let us briefly consider the structure of the present work. The work is divided into three parts. The first part, composed of two chapters, is mainly occupied with a philosophical discussion of backward causation. More precisely, in the first chapter of this part (Chapter 2) I will examine the philosophical debate about the concept of causation. There I will examine the most prominent current theories of causation, because in order to understand what is at stake with the idea of backward causation we first need to address a more basic question: what is causation? After discussing the main ideas and problems of these theories of causation we will go on to consider, in the second chapter of this part (Chapter 3), the concept of backward causation.

In the second chapter of this part I will briefly review the first widely known discussions of backward causation, and we will see that most of the theories of causation discussed in the first chapter are well suited to accommodate certain types of retro-causal physical processes. In the same chapter I will present two main philosophical objections to the idea that causes can sometimes come after their effects, and we will examine the possible answers to them. This last discussion will serve to clarify what limits we must respect so that we do not fall into contradictions when we claim that retro-causation is possible. At that point we will address, very succinctly, another possible objection to the idea that there exist retro-causal physical processes, namely, an objection along these lines: retro-causal physical processes are not possible because our universe has an arrow of time that goes from the past toward the future. We will see that the physical laws suggest exactly the opposite: our universe is time-symmetrical, and there does not exist a privileged direction in time.

To motivate the idea of retro-causation in the context of physics I will briefly show, in the same chapter (Chapter 3), that the idea that there are certain retro-causal physical processes is not completely new. During the last century some physicists have proposed physical theories that assume this type of causation; perhaps the best known are the “Wheeler-Feynman absorber theory of radiation” and “Cramer’s transactional interpretation of quantum mechanics”. A brief presentation of these theories, together with a thought experiment proposed by Wheeler, will be found in the last section of this chapter. Afterwards we come to the second part of the work, which consists of two chapters that develop the theory of weak measurements. In the first of these chapters I will explain the theory of indirect measurements, which is required to understand the weak measurement theory. In the next chapter I will discuss the theory of weak measurements.

In the final part of the work we will address the non-trivial task of interpreting weak values. In this part, consisting of two chapters, I will argue that the interpretation of weak values requires that we adopt a position of backward causation. In the last chapter I will consider an experiment that will shed new light on the old debate about hidden variables and that seems to provide strong evidence in favor of the main thesis of this work. This experiment is expected to be performed soon in the labs at our university.

A last word regarding a methodological issue: the reader might suspect after this introduction that he will find a strange mix of physics and philosophy in this thesis. Now, just to be sure, the present work is a thesis on physics, not philosophy. However, philosophy has a lot to say about the interpretation of physical theories and physical results. Indeed, no physicist can deny nowadays that many current theories, and in particular quantum mechanics, are in need of clarification. And this clarification requires hard work on interpretation and conceptual analysis, two things into which philosophy, above all the other disciplines, can provide great insight. So yes, this is a thesis on physics, but because of the topics it touches, and because of the emphasis that it places on interpretative issues, it is a work that will inevitably lead us to philosophical considerations.

Part I

Backward Causation

Chapter 2

Causation

In this chapter I will present the most prominent current philosophical theories of causation. The chapter is slightly technical, and I hope the reader excuses me for this (I have tried to present each theory as briefly as possible). I think that if we want to defend the idea of backward causation, we inevitably have to see how this idea fits into the most important theories of causation, and therefore a discussion of these theories is necessary.

2.1 What is causation?

Defining causation is a huge problem for philosophy, a problem that has concerned the ancient Greeks and contemporary philosophers alike. The modern discussion of the concept of causation can be traced to the Scottish philosopher David Hume (1711-1776). Although there is no complete consensus about Hume’s theory of causation (see [1]), some elements of his theory are more or less clear. For instance, for Hume causes need to be regularly followed by effects (this implies, among other things, that causes have temporal priority with respect to their effects; causes occur earlier than their effects). More precisely, events of type C are causes of events of type E if and only if events of type C are constantly followed by events of type E. As can be seen from this condition, causation in Hume is a relation between types of events: events of type C are causes of events of type E (this is what philosophers call “general causation” or “type-causation”). If we wanted to say that a particular event c is a cause of a particular event e, we would have to say that this is so because c belongs to events of type C, e belongs to events of type E, and events of type C are regularly followed by events of type E (note that capital letters denote types of events, while lower case letters denote particular events).


For example, the explosion of a bomb yesterday was a cause of the destruction of a building because bomb explosions are a type of event that is regularly followed by destructions of buildings. Finally, some consider that Hume’s is a psychological theory of causation, since in some passages he appears to defend the idea that causation is a connection that our minds make in response to the regularities we habitually experience; once we have experienced a thousand times that the release of objects is followed by their falling, we instinctively make the association: the release of objects is a cause of their falling.

In the 20th century, especially in the last 40 years, the question of what causation is returned to philosophy with more vitality than ever before. Many theories were developed in a short time, and it is fair to say that no consensus has emerged so far. Still, we have gained some clarity with respect to some features of causation and with respect to some ways of approaching the question. We now know, for instance, that there are three distinct questions regarding causation: 1) What does it mean to say that an event C caused an event E? 2) What events does a subject S take as evidence to infer that E was caused by C? 3) Is there an objective feature of the world that determines causal relations between events? Question 1) belongs to semantic inquiry, that is, inquiry into the meaning of causal statements. Question 2) focuses on the problem of how we come to infer causal relations between events; this is a more psychologically oriented investigation. Question 3) is a metaphysical question in the sense that it seeks to determine what causation is, and what conditions the world must satisfy so that a causal relation between different events comes to exist. In this work I am interested in the last type of question, that is, I am interested in discussing what causation is, and I will not discuss causation from the point of view of causal statements nor from that of a theory that wants to specify how we infer causal relations.

So, what is causation? To address this question I find it important to delineate the current panorama of the debate around it. In order to do this I will explain the five main contemporary theories of causation, with their central ideas and principal difficulties.¹

¹ I might oversimplify the presentation of each theory since the details are not relevant for my purposes.

2.1.1 Regularist theories of causation

Hume was the father of this theory. As we already saw, the main thesis of this theory is that an event c of type C causes an event e of type E if and only if events of type C are regularly followed by events of type E. However, in contrast to Hume’s theory, not every regularist theory takes space-time contiguity or the priority in time of causes with respect to effects as necessary conditions. The event “billiard ball striking a still ball” is a cause of the event “movement of the still ball” because events of the type “billiard ball striking a still billiard ball” are regularly followed by events of the type “struck billiard ball starting to move”.

These theories are not very popular nowadays due to some important difficulties. For instance, since not every smoker develops lung cancer, do we have to deny that smoking is a cause of this type of cancer because smoking does not always bring about cancer? On the other hand, how should we understand the regularities: are they objective or dependent on subjective states of affairs? [1]. Also, suppose that in the whole history of our universe the situation just described were the only one in which a billiard ball strikes another one; would we be unable to say that this is a causal interaction because it is a unique situation (rather than a regular one)?

2.1.2 Counterfactual theories of causation

The most elaborate and best-known counterfactual theory is due to David Lewis [2]. The main thesis is that c causes e because if c had not occurred, then e would not have occurred. The billiard ball striking the still billiard ball is a cause of the movement of the latter because if the first event (the striking of the still ball by the moving ball) had not occurred, the second event (the movement of the still ball) would not have occurred. This is one of the most widely adopted theories of causation. Let us briefly examine some of its difficulties. On the one hand, this theory depends on the concept of a counterfactual, but unfortunately this is a concept that is in need of much more clarification than the concept of causation itself. On the other hand, consider the following situation: Sebastian throws a rock at a window, and at the same time David does the same. We would like to say, for example, that the event “shattering of the window” was an effect of the event “Sebastian throwing a rock”. But this situation does not satisfy the counterfactual condition, for if Sebastian had not thrown a rock at the window, the window would still have been shattered by David’s rock (suppose that the two rocks hit the window at nearly the same time). These cases are known as “overdetermined” situations. There are many other situations that the counterfactual theory has problems dealing with (for example, “preemption”; see [3]).
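The overdetermination problem can be made concrete with a minimal sketch. It assumes a toy model, not Lewis's full possible-worlds analysis: the window shatters whenever at least one rock hits it, and an event counts as a cause only if removing it (with everything else held fixed) removes the effect. The function names and the dictionary encoding are hypothetical and serve only as illustration.

```python
def window_shatters(sebastian_throws: bool, david_throws: bool) -> bool:
    """Toy model (an assumption): one rock is enough to shatter the window."""
    return sebastian_throws or david_throws


def is_counterfactual_cause(model, actual: dict, candidate: str) -> bool:
    """Naive counterfactual test: the effect occurs in the actual situation,
    and it would not occur if the candidate event were absent while every
    other event stayed as it actually was."""
    effect_occurs = model(**actual)
    without_candidate = {**actual, candidate: False}
    return effect_occurs and not model(**without_candidate)


# Only Sebastian throws: his throw passes the counterfactual test.
single = {"sebastian_throws": True, "david_throws": False}
single_result = is_counterfactual_cause(window_shatters, single, "sebastian_throws")

# Both throw (overdetermination): Sebastian's throw fails the test,
# because David's rock would have shattered the window anyway.
both = {"sebastian_throws": True, "david_throws": True}
both_result = is_counterfactual_cause(window_shatters, both, "sebastian_throws")

print(single_result, both_result)  # True False
```

In the overdetermined case the same test also fails for David's throw, so on this naive reading neither throw counts as a cause, which is the counterintuitive verdict described above.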

2.1.3 Probabilistic theories of causation

Patrick Suppes and Hans Reichenbach are the main precursors of these theories [4]. The theories state that c is a cause of e because when c occurs, the probability of e occurring is increased; more precisely, the probability of e given the event c is greater than the probability of e given that c does not occur, P(e|c) > P(e|∼c). Thus, the probability of the window shattering when I throw the rock at it is clearly greater than the probability of the window shattering if I had not thrown a rock at it. So, my throwing the rock at the window is a cause of the shattering of the window according to these kinds of theories. These theories easily handle cases where the regularist theories fail; for instance, smoking can be taken as a cause of lung cancer because it raises the odds of developing that kind of cancer, despite the fact that in many cases smokers do not develop it. The problems for these theories have to do with situations in which an event c diminishes the probability of an event e, even though c is a cause of e. For example, it might happen that if an atom is in state S, then the probability that it subsequently goes into state S₂ is less than the probability that it comes into that same state S₂ given that the atom was previously in state F. However, it could happen that an atom in state S came to be in state S₂, and we would like to say that the previous state of the atom, the state S, is a cause of the subsequent state, regardless of the fact that this transition was against the odds. It could also happen that an event c raises the chances of e, that e occurs just after c occurs, and, in spite of all that, it is still possible that e was not caused by c [5].
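The probability-raising condition P(e|c) > P(e|∼c) can be checked numerically with a minimal sketch. All counts below are invented for illustration (they are not empirical data), and the function name is hypothetical. The second call mimics the atom example, treating "atom in state F" as the alternative to "atom in state S".

```python
def raises_probability(e_and_c: int, c_total: int,
                       e_and_not_c: int, not_c_total: int) -> bool:
    """Return True if P(e|c) > P(e|~c), estimating each conditional
    probability as a relative frequency from the given counts."""
    p_e_given_c = e_and_c / c_total
    p_e_given_not_c = e_and_not_c / not_c_total
    return p_e_given_c > p_e_given_not_c


# Hypothetical counts: 150 of 1000 smokers and 10 of 1000 non-smokers
# develop lung cancer. Smoking raises the probability, so it counts as
# a cause even though most smokers in this sample stay healthy.
smoking_is_cause = raises_probability(150, 1000, 10, 1000)

# Hypothetical counts for the atom example: 20 of 100 atoms in state S
# go to S2, against 80 of 100 atoms in state F. The condition fails,
# although for an atom that did go from S to S2 we would still like
# to call S a cause of S2.
state_s_is_cause = raises_probability(20, 100, 80, 100)

print(smoking_is_cause, state_s_is_cause)  # True False
```

The second result shows where the purely probabilistic criterion and the intuitive causal judgment come apart: the criterion is evaluated on frequencies across many cases, while the judgment concerns a single transition.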

2.1.4 Manipulabilist theories

The central idea of these theories is that an event c (which belongs to events of type C) causes e (which belongs to events of type E) because if we produce events of type C, then we can produce events of type E, and if we manipulate events of type C, then we can manipulate or alter events of type E [6]. These theories emphasize a very simple idea: a cause can be regarded as a means to obtain an end. In other words, it is possible for us to control and alter certain types of events by controlling other types of events if and only if there exists a certain causal relation between these events. The rock I threw caused the shattering of the window because if I control the way I throw rocks (speed, angle, and so on) I can control the kind of “shattering of the window” obtained (big hole, small damage, or something like that). One important problem of these theories is that they are accused of being circular, for the concept of manipulation itself seems to depend on or require the concept of cause; to manipulate something is to cause something. Moreover, the theory is undesirably anthropocentric, since it explicitly appeals to an intervention by agents; it is not hard to imagine causal situations in which manipulation or intervention is impossible, and it is not easy to figure out how the theory would handle those situations (however, Woodward has a defense against these criticisms; see [6]).

2.1.5 Process theories of causation

These are also known as physical connection theories of causation. Their historical pioneers are David Fair, Jerrold Aronson and Wesley Salmon, and their current and best-known advocate is the Australian philosopher Phil Dowe [3]. The main idea is that for an event c to be a cause of an event e, a physical process that connects c with e must obtain. Dowe’s theory appeals to two necessary conditions: 1) A certain type of physical interaction must obtain between events c and e. 2) In the physical interaction, there must occur a change in the values of certain physical quantities which are globally conserved according to physical laws (quantities such as charge, energy, linear momentum and so on). For example, the impact of the rock on the window constitutes a causal interaction because there is a physical process that connects the shattering of the window with the impact of the rock, and an exchange of some globally conserved physical quantities occurs; for example, there is a change in the energy, the charge, or the linear momentum of both the rock and the window.

The reader might wonder why condition 2) is required; could we not take condition 1) to do all the work? But condition 1) is very vague by itself. Condition 2) is required so that the theory does not consider, for example, a shadow as capable of participating in a causal interaction, for a shadow does not possess any conserved quantity –a shadow has neither charge, nor energy, nor mass–. Now, the reader might think that this theory is rather obvious, since the least we expect from a causal relation between an event c and an event e is that c causes e via a physical process. In short, the reader might think that this theory is talking about conditions that are, in one way or another, blatant. But perhaps the reader has not considered the following problem, which is very pressing for the physical connection theories: suppose that you water your plants daily, and then imagine that you cease to water them for several days. A day will come when your plants start to die because of the absence of water. So, your mother will say something like “your not watering the plants caused their death”, or “the absence of water caused the death of the plants”. If you now look closely at the main thesis of the physical connection theory of causation, you will notice that according to this theory you cannot say that the absence of water caused the death of the plants, because there is no physical interaction between an absence and an occurring event; it is impossible to trace a physical process between an absence –the absence of water– and a “normal” event. This kind of situation is known in the literature as “negative causation” or “causation by absence”. So, you must either abandon the physical connection theories of causation (and then you would have to throw away the appealing idea that every causal relation involves a physical process) and adopt one of the other theories that take situations such as the absence of water as causal, or firmly defend these theories and explain why absences cannot be taken as causes. Either option is very problematic (see [3]).

So far I have described a broad (perhaps very broad) panorama of the most important attempts to answer our initial question, namely, “what is causation?” Is causation a kind of relation that involves regularity, or is it a relation in which certain events make the occurrence of other events more likely? Is causation no more than a way of talking about our ability to transform and manipulate how things are in the world? Perhaps it is instead a very special kind of relation that has to do with counterfactual dependence. Or maybe causation is a physical relation that holds between events. Some think that an adequate theory of causation combines ideas from several of these theories –if we do so, we adopt a pluralist theory of causation–. My intention was not to argue for one theory or another, but to show what the current options are and what main difficulties we will have to overcome if we decide to adhere to one of these theories.

Now, despite the fact that these theories are rivals –it is not easy to make them compatible–, most of the situations in which we think that a causal relation obtains satisfy the conditions of all five theories –this is why these are the most popular theories of causation, after all–. So, what we should expect is that, given a situation in which we know or suspect that an event c caused an event e, the situation can be explained, if not by all, at least by the great majority of these theories of causation. If that is not the case, we had better reconsider whether that was actually a causal situation. Hence, we have enough tools to judge whether certain situations –for example, the weak measurement situations that we will explore later– can be counted as causal.

The purpose of this chapter was to study the concept of causation in the light of some of the contemporary theories of causation. This study was necessary in order to prepare our subsequent study of backward causation.

Chapter 3

Backward causation

Let’s go back to the strange words of the man: “The amount of coins you have right now in your pockets depends on what you decide to eat tomorrow night”. Our main task in this chapter is to consider how, if at all, we can make sense of these words. More precisely, in this chapter we are going to study what backward causation is, in order to determine whether this kind of causation is possible in our world. The order of the chapter is as follows: first we are going to present the ideas of the philosopher Michael Dummett, who was the first (in the 1950s) to provide a defense of the idea of backward causation. The main objective of that first section is to explain why the possibility of backward causation is not absurd or contradictory. Afterwards we are going to discuss how the main theories of causation, studied in the previous chapter, handle situations of backward causation. Then we will confront two objections that have been raised against the possibility of backward causation. The first objection is philosophical and intends to show that backward causation is impossible because it leads to causal paradoxes. The second objection is physical, and its idea, in broad terms, is that the universe has (or follows?) an arrow of time that “points” from past to future, so that backward processes would have to violate this arrow of time. After discussing these objections, we end the chapter by briefly discussing a philosophical theory of time, which proposes that there is no objective difference between the past, the present and the future; the difference between these is due only to subjective or psychological factors.

3.1 Historical approach

Consider the following example presented by the philosopher Michael Dummett:

Suppose we come across a tribe who have the following custom. Every second year the young men of the tribe are sent, as part of their initiation ritual, on a lion hunt: they have to prove their manhood. They travel for two days, hunt lions for two days, and spend two days on the return journey; observers go with them, and report to the chief upon their return whether the young men acquitted themselves with bravery or not. The people of the tribe believe that various ceremonies, carried out by the chief, influence the weather, the crops, and so forth.... While the young men are away from the village the chief performs ceremonies – dances, let us say – intended to cause the young men to act bravely. We notice that he continues to perform these dances for the whole six days that the party is away, that is to say, for two days during which the events that the dancing is supposed to influence have already taken place... [7]

To make the example more appealing, we can add, for instance, that on all the occasions on which the chief performed the ceremonies the hunters returned victorious, whereas on the few occasions on which the ceremonies were not performed the young men never returned. The point of the example is to illustrate that the idea of backward causation is not in itself absurd. Or, to put it another way, the point is that it is not irrational for the tribe members to consider that the dance can influence the success of a lion hunt which occurred one or two days before the dance itself. You might of course claim that these are no more than superstitions of a “primitive” tribe. But such a critique would miss the point; the important thing is not whether the tribe members are wrong in believing in supernatural things. The point is rather that we can conceive of situations in which a community composed of rational agents comes to believe that, for some reason, their actions can somehow affect earlier events. Indeed, if they had found that the young men returned on only one of the ten occasions on which no ceremonies were performed, and, let’s say, on nine of the ten occasions on which ceremonies were performed, would they not be justified in believing that the ceremonies exert some beneficial influence on the lion hunt?

Another simple example might help us stress Dummett’s point, which, to repeat, is only that the belief in backward causation need not be contradictory or absurd.

Imagine that I find that if I utter the word “Click!” before opening an envelope, that envelope never turns out to contain a bill; having made this discovery, I keep up the practice for several months, and upon investigation can unearth no ordinary reason for my having received no bill during that period. It would then not be irrational for me to utter the word “Click!” before opening an envelope in order that the letter should not be a bill; it would be superstitious in no stronger sense than that in which belief in causal laws is superstitious. Someone might argue: Either the envelope already contains a bill, or it does not; your uttering the word “Click!” is therefore either redundant or fruitless. I am not however necessarily asserting that my uttering the word “Click!” changes a bill into a letter from a friend; I am asserting (let us suppose) that it prevents anyone from sending me a bill the previous day. Admittedly in this case it follows from my saying “Click!” that if I had looked at the letter before I said it, it would not have been a bill; but from this it does not follow that the chances of its being a bill are the same whether I say “Click!” or not. If I observe that saying “Click!” appears to be a sufficient condition for its not being a bill, then my saying “Click!” is good evidence for its not being a bill; and if it is asked what is the point of collecting evidence for what can be found out for certain, with little trouble, the answer is that this evidence is not merely collected but brought about. Nothing can alter the fact that if one were really to have strong grounds for believing in such a regularity as this, and no alternative (causal) explanation for it, then it could not but be rational to believe in it and to make use of it [8].

Now, this last example brings into play an important element: a necessary condition for a retro-causal situation is that the event that we consider as the effect must not be explainable in any way by appealing to a “normal” (forward) causal process. It would not make any sense to claim that c “backwardly” caused e if we can explain how an event d caused e in a “normal” manner. So, if the tribe members understood the mechanism responsible for the failure or success of the lion hunt, and if this mechanism were found to have no relation at all with their ceremonies –as we think is the case–, then the performance of the dance and the ceremonies could not be taken to be causally relevant for the lion hunt. Analogously, the example of the “Click!” requires that we know of no other causal explanation of why the envelopes did not contain bills on precisely the occasions on which I uttered “Click!”. Dummett stresses this idea when he says, right at the end of the above fragment, “no alternative (causal) explanation for it”.

The previous idea is important for our purposes. In particular, if for some reason we claim that a certain experiment could be understood as revealing a retro-causal situation, we must make sure that no “forward causation” process can explain the results. If we could imagine a “forward in time”¹ process that could explain the results, then the whole case for backward causation would vanish, or at least the cause in the future would become redundant or innocuous, since we would already have a sufficient cause in the past². Fortunately, as we will see in the last two chapters, the alleged cases of backward causation that we will consider are protected from this problem: no physical mechanism –at least, none yet known– which goes from past to future can account for the results. Let’s call this restriction –the condition that every time we claim that c “retro-causes” e we should not know of any alternative causal mechanism according to which another event d caused e– the “no alternative mechanism” condition (NAM for short). Notice that this restriction is necessary; it makes no sense to state that c retro-causes e when e has already obtained and has been caused by d.

¹ From now on, “forward in time”, “forward in time process” and “forward causation” will be taken as synonyms; they all denote processes that evolve from past to future, or processes in which the cause occurs before the effect.

² It is important to mention that it is also a problem for “forward causation” when two or more events are claimed to be causes of a certain event (perhaps one of the causes renders the other innocuous). However, the case of backward causation is more pressing, since one would prefer to discard any cause in the future of the effect if another cause, which is in the past of the effect, can explain the same event (in the case of “forward causation”, one could accept both causes as valid causes, without discarding either of the two). Still, it might be that the cases of backward causation and forward causation are completely analogous with respect to this problem, but an investigation of this question is beyond the scope of the present thesis.

3.1.1 Theories of causation dealing with retro causation

The reader might have noticed that the five theories of causation that we studied –regularist, probabilistic, counterfactual, manipulabilist and physical connection theories– did not appeal to the concept of time. None of these theories states in its definition of causation that causes must occur earlier than their effects. To be fair, the original regularist theory, which was developed by Hume, explicitly mentioned the temporal priority of causes with respect to effects. But aside from Hume’s theory, the other theories do not directly involve the concept of time.

I will enumerate different reasons why these theories do not appeal to the concept of time in their definition of causation. First, the concept of time is very controversial; since we lack an appropriate theory of time, it would be reckless to make the concept of causation depend on the concept of time³. Second, it is not always possible to determine whether an event A is prior to an event B; as relativity has taught us, two events might be simultaneous for some observers while not simultaneous for others. Third, for some philosophers such as Phil Dowe and Huw Price (see [3] and [9]), a theory of causation must be able to analyze cases of backward causation; indeed, both Price and Dowe think that some results of quantum mechanics must be interpreted by appealing to backward causation⁴ (we will return to this in the last chapter). Hence, there are both philosophical and empirical considerations that have led philosophers not to take a certain temporal order as a necessary condition for causation; a theory of causation that made it mandatory that causes precede their effects would be unnecessarily demanding⁵.

³ It is true that the concept of “temporal order” is not, so far, controversial. However, the problem lies in determining why and how exactly the notion of causation demands a certain temporal order, and moreover, why causation demands a temporal order in which causes occur earlier than their effects.

⁴ Their position coincides with the main thesis of the present work, except for the fact that the processes they talk about are not weak measurements.

⁵ A more compelling argument against a theory of causation that requires a specific temporal order of causes with respect to effects will be found in the section on the arrow of time, in which we will see that the fundamental laws of physics allow backward processes.

Having said this, I will now show how these theories of causation could handle cases of backward causation. Let’s suppose that an event c retro caused an event e. The probabilistic theory will simply state that e was caused by c because the probability of obtaining e given that in the future we obtain c is greater than the probability of obtaining e given that in the future we do not obtain c. For instance, if we come to notice that c occurred, then we are justified in inferring that e occurred before, because the fact that c occurs now increases the chances that e occurred before. The process theory of causation –or physical connection theory of causation– will state that, for there to be a causal connection between two events c and e, a certain physical process that connects c and e must exist. This theory does not pay attention to the direction of time because, as we will see later, the fundamental laws of physics are time-symmetric. In short, the process theory will say that a certain physical process begins in c and retro-evolves until it brings about the event e.

At first sight the manipulabilist theory cannot easily deal with this kind of case. The problem stems from the fact that this theory is, as we have already seen, very anthropocentric; it seems strange that we could, for example, produce E-type events now by bringing about C-type events later. Equally absurd is the idea that by doing certain kinds of things we can come to cause certain things in the past. However, the theory can be adapted to avoid these kinds of difficulties. For example, we can say something along these lines: C-type events retro caused E-type events because every time that we produce, create, or manipulate events of type C, we come to notice that events of type E obtained earlier. If we restrict the backward causation cases to those cases in which we cannot be aware of the effect before the cause occurs –and we will soon see that this is a necessary condition–, then we can make the manipulabilist theory (and all the theories of causation, as we will see in the next section) capable of handling backward causation. Again, why are events of type C causes of events of type E under the manipulabilist theory? A possible account is the following: every time we bring about events of type C we discover that events of type E were previously produced, and we never find events of type E on occasions on which we do not bring about events of type C.

The regularist theory has obvious difficulties with backward causation since it says “events of type C are followed by events of type E”⁶. Fortunately, this theory is the most unpopular nowadays, so we can leave it aside in our analysis⁷. Finally, determining whether the counterfactual theory of causation is well suited to handle backward causation is very complex; the difficulty has to do with the interpretation of the counterfactuals. For some, counterfactuals are asymmetric in time, and because of this they cannot accommodate cases of backward causation. For others, however, retro causation is still possible within a counterfactual theory [9]. For example, it could be suggested that c retro caused e because, if c had not occurred at a certain time ta, then e would not have occurred at a time tb, where tb < ta. If the counterfactual theory is found to be capable of handling backward causation then that would be a point in its favor, but for our purposes it is more than enough that three of the most prominent contemporary theories –the process theory, the manipulabilist theory and the probabilistic theory– can handle situations of backward causation.

⁶ At first sight it seems that we could suggest that events of type C are regularly preceded by events of type E and propose that the events that are prior in time are the effects instead of the causes. However, how should the theory discriminate between the cause and the effect if not by means of the temporal order?

In this section I have explained that three theories of causation can handle cases of backward causation. This is very promising, since it means that we have prominent theories of causation capable of analyzing situations of backward causation. If, on the contrary, none of the current theories of causation could accommodate cases of backward causation, that would certainly be a good point against its possibility. Now we are going to examine two objections against the possibility of backward causation.

3.2 Objections

3.2.1 The Bilking Argument

The most infamous objection against the possibility of backward causation is the “killing your young grandfather” paradox. Suppose that c retro causes e. By the definition of backward causation, this means that e occurs at an earlier time than c. Therefore, if we detect e, it would be possible, at least in principle, to change the course of events so as to prevent the occurrence of c. But then we are confronted with a paradox, since we are claiming that c retro caused e while, on the other hand, we are preventing the occurrence of c; e would then be an event without a cause, which must be impossible. This argument is known as “the bilking argument”, and was first developed by Max Black in 1956 [10].

⁷ Indeed, I consider that the fact that this theory rules out the possibility of backward causation is a problem for the theory, given that the regularities the theory appeals to must be explained by physical laws, and physical laws are time-symmetric. Why, then, should the regularities, which are supported by physical laws, respect a temporal order?

A classical defense against the bilking argument is to suggest that whenever an event c retro causes an event e, we can only detect the effect e after c has occurred. In this way, we cannot, after detecting e, change the world so that c does not occur; we can notice that the event e occurred before c only after c has occurred. Another defense against the bilking argument is to claim that even though you can detect e before the occurrence of c, you still cannot do anything to change the course of events so that c does not obtain later. However, this second proposal has been subjected to increasing criticism, in part because it is not at all clear in virtue of what physical laws we would be unable to change the course of events [10].

In the last chapter of this thesis we will see that the backward processes revealed by weak measurements are immune to the bilking argument because of the first line of defense: we cannot detect e before c occurs. And why can we not detect e before c occurs? The answer to this question will become clear once we study the theory of weak measurements.

3.2.2 A brief discussion of the arrow of time

Another possible objection against the possibility of backward causation can be found in the idea that our universe has an arrow of time. According to this idea, the universe evolves in a certain temporal direction, from past to future, so we cannot make space for physical processes that run from future to past. The classical strategy for arguing in favor of this temporal asymmetry of the universe has been to appeal to the second law of thermodynamics.

3.2.2.1 The symmetry of the second law

We live in a world in which glasses break into pieces when dropped. In our world the opposite never occurs; we never see a bunch of pieces suddenly organizing themselves until a glass is formed. We live in a world of irreversible processes, physical processes that always occur in a certain order –never in the opposite one–. Since the second law of thermodynamics accounts for this irreversibility, many physicists and philosophers have been tempted to suggest that the second law can provide an arrow of time. If we have a law that dictates that dropped glasses turn into pieces but that pieces do not turn, by “themselves”, into glasses, then it seems that we have a law that reflects a temporal asymmetry of the world; our world is one in which processes occur respecting a certain privileged direction of time.

We often think of the second law as stating something like “entropy increases”. Apparently, this means that the entropy E of a certain system S in the future has to be greater than –or at least equal to– the entropy of the system in the past. Now, imagine an irreversible process. Since the entropy of the system concerned necessarily increases during an interval of time, we can say, at any moment during that interval: “at any future time, the entropy must be greater”. Thus, the gradient along which entropy increases seems to provide a ground for an asymmetry of time.

However, we commit a fallacy in taking the second law as providing an arrow of time. The statistical formulation of the second law, which was derived by Boltzmann, is atemporal –it does not talk in terms of time–; it is only concerned with probability considerations [9, p. 30]. For example, suppose that we have an ideal gas inside a container, and suppose that at a certain instant of time t most of the particles are clustered within one half of the container. The second law will say that the state of the gas at time t is very improbable, because there are exponentially more configurations consistent with the particles being homogeneously distributed throughout the container. This entails, of course, that in the near future the gas will reach a state of equilibrium –a much more disordered state–. In this sense we could come to think that entropy increases toward the future. However, the same second law entails that we should expect that at times prior to t the gas was in a much more disordered state, by the same statistical considerations. Thus, the second law is really time-symmetric; given a very ordered state, we should expect entropy to increase both toward the future and toward the past.
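The counting argument behind this claim can be sketched with a toy model. The model is my own simplifying assumption, not part of the argument cited from [9]: N distinguishable particles, each independently and equally likely to be found in the left or the right half of the container, so that a macrostate is fixed by the number of particles on the left. The function names and the choice N = 100 are likewise only illustrative.

```python
from math import comb


def microstates(n_particles: int, n_left: int) -> int:
    """Number of ways of choosing which n_left of the n_particles are in the left half."""
    return comb(n_particles, n_left)


def macrostate_probability(n_particles: int, n_left: int) -> float:
    """Probability of the macrostate 'n_left particles on the left' when each
    particle is independently equally likely to be in either half (toy assumption)."""
    return microstates(n_particles, n_left) / 2 ** n_particles


N = 100  # illustrative number of particles
clustered = macrostate_probability(N, 90)  # most particles in one half
balanced = macrostate_probability(N, 50)   # homogeneous distribution

print(f"P(90 of {N} on the left) = {clustered:.3e}")
print(f"P(50 of {N} on the left) = {balanced:.3e}")
print(f"balanced / clustered     = {balanced / clustered:.3e}")
```

Even for only a hundred particles the homogeneous macrostate is many orders of magnitude more probable than the clustered one. Note that no time variable appears anywhere in the calculation: the same count applies whether we ask about the state of the gas shortly after t or shortly before t, which is the sense in which the statistical argument is time-symmetric.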

Once we have noticed that the second law is time-symmetric, it should be pointed out, following Price, that the relevant question is not why things in our world now tend to evolve into a state of higher entropy later, because that is very clear given the second law; the present state of the world is very far from thermodynamical equilibrium, so we expect that in the future things will become more disordered (life, for example, is impossible in a universe in thermodynamical equilibrium). The relevant question is why the universe was in such an ordered state in the past; by probabilistic considerations alone, it is much more probable that the present world emerged as a fluctuation from a state of higher entropy than as the result of a gradual process from a much more ordered state [9]. As Price says, “The major task of an account of thermodynamic asymmetry is to explain why the universe as we find it is so far from thermodynamic equilibrium, and was even more so in the past” [9, p. 36].

Thus, we are not denying that the present world is one with a clear gradient of entropy, a gradient that goes from what we call the past toward what we call the future; a world that, on the macroscopic scale, is full of irreversible processes. What we are stressing is that the fact that we live in a world with a clear gradient of entropy does not mean that the second law is asymmetric in time⁸. In other words, the second law does not provide an arrow of time; if it were for this law alone, the past and the future should both be states of equilibrium. Rather, it seems that we need cosmological considerations to explain why the universe in its beginning was in such an ordered state; some have proposed –Cocke, Schulman, Hartle and Gell-Mann– [11, p. 145] that the clue to this mystery lies in certain boundary conditions for the universe. What concerns us, however, is that the second law neither explains nor suggests an arrow of time.

But if it is not the second law, what other physical laws could provide an arrow of time? It seems that none can, because the laws of physics are time-symmetric. This means that the fundamental laws allow a given process to occur in reverse, as if we were running a film backwards; if the movement of a ball, in going from time t1 to t2, is represented by a certain trajectory, then nothing prevents another event from occurring in which the ball describes exactly the same trajectory in reverse order. For example, Newton’s laws are symmetric in time; if F(x, t) is a solution of a certain classical equation of motion, so is F(x, −t). Another example: the Schrödinger equation also admits time-reversed solutions; if ψ(x, t) is a solution of the Schrödinger equation, then ψ*(x, −t) (the conjugate with the time reversed) is also a solution [11, p. 140]. Indeed, the fundamental laws of physics –those of the weak force, the strong force, gravitation and electromagnetism–, with only one exception (discussed in the next paragraph), are invariant under time reversal.
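The two time-reversal claims can be checked in a few lines. The following is only a sketch under stated assumptions of my own: one spatial dimension, a force derived from a potential V(x) that depends on position alone (no friction or other velocity-dependent forces), and, in the quantum case, a real, time-independent potential. The symbols x(t), y(t), V and φ are introduced here for illustration and do not appear in the cited source.

```latex
% Classical case: m x''(t) = -V'(x(t)). Define the reversed trajectory y(t) = x(-t).
\begin{align*}
  \frac{dy}{dt}(t) &= -\,\dot{x}(-t), &
  \frac{d^{2}y}{dt^{2}}(t) &= \ddot{x}(-t), \\
  m\,\frac{d^{2}y}{dt^{2}}(t) &= m\,\ddot{x}(-t) = -V'\bigl(x(-t)\bigr) = -V'\bigl(y(t)\bigr).
\end{align*}
% Hence y(t) satisfies the same equation of motion as x(t).

% Quantum case: i\hbar\,\partial_t \psi = H\psi, with
% H = -\frac{\hbar^{2}}{2m}\,\partial_x^{2} + V(x) and V real.
% Taking the complex conjugate of the equation gives
\begin{align*}
  -\,i\hbar\,\frac{\partial \psi^{*}}{\partial t}(x,t) &= H\,\psi^{*}(x,t).
\end{align*}
% Define \varphi(x,t) = \psi^{*}(x,-t). By the chain rule,
\begin{align*}
  i\hbar\,\frac{\partial \varphi}{\partial t}(x,t)
    &= -\,i\hbar\,\frac{\partial \psi^{*}}{\partial s}(x,s)\Big|_{s=-t}
     = H\,\psi^{*}(x,-t) = H\,\varphi(x,t).
\end{align*}
% Hence \varphi(x,t) = \psi^{*}(x,-t) is again a solution.
```

The sign change from the first time derivative is cancelled in the classical case because the equation is second order in time, and in the quantum case because complex conjugation flips the sign of i; this is why the conjugate is needed in the Schrödinger case.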

The exception has to do with the behavior of the neutral Kaon, first discovered in 1964. Neutral Kaons decay into Pions, but the opposite occurs much more rarely; Pions decay into Kaons at a very small rate compared to the decay of Kaons into Pions, so this process seems to be asymmetric in time. However, the symmetry is recovered if we recall that all physical processes respect CPT –all processes are invariant if we reverse charge, parity and time–. Moreover, even if the mentioned processes of the Kaons were asymmetric in time, it is still true that physical laws are generally time-invariant. So the case of the Kaons, although interesting, is not enough to supply an arrow of time for the universe; it is far from clear how this subatomic process would, by itself, counteract the time symmetry found in all the other physical processes (and in particular, as we will see in the last two chapters, some quantum mechanical processes seem to be simply unexplainable unless we take seriously the symmetry found in the Schrödinger equation).

8 Entropy is an atemporal concept; there is no privileged direction of time according to the second law. As argued before, the present entropy gradient calls for an explanation, not given by the second law, of why the universe was at its origin in a state so far from thermodynamic equilibrium. Chapter 3. Backward causation 21

3.2.2.2 A subjective arrow of time

If the laws of physics (leaving aside the case of the kaons) cannot provide an objective difference between the past and the future (cannot provide an arrow of time), it seems that the difference between the past and the future is due only to psychological or subjective facts. Boltzmann once said:

“... our sense of past and future depends on the entropy gradient, in such a way that we are bound to regard the future as being the direction in which entropy increases... elsewhere in the universe, or in the distant past or future, there might be creatures with the opposite temporal orientation, living in regions in which the predominant entropy gradient goes the other way. These creatures would think of our past as their future, and our future as their past; and there wouldn’t be a fact of the matter as to which of us was right”[9, p. 34].

This passage shows how Boltzmann tries to attach our temporal perspective to the entropy gradient of our world. Boltzmann is not saying that entropy provides an objective arrow of time, but that the direction of time is a concept that depends on our constitution.

One of the reasons why the asymmetry of time is psychological (subjective) has to do with an epistemic consideration: our epistemic access to the past is very different from our epistemic access to the future. We know the past “better” than the future; we have a higher degree of certainty regarding facts about the past than regarding facts about the future; indeed, we are said to remember the past but not the future! [9]. This epistemic difference explains why we tend to imagine the future as “open” and the past as “closed” [9] (we think of the future as something we can change, while we regard the past as something fixed, unchangeable). If we knew the future in the same manner that we know the past, we would not think of the future as something qualitatively different from the past; we would not think of the former as “open” and the latter as “fixed”, but rather of both as fixed.

The previous idea is related to the fact that we are agents. Mental processes such as planning and decision making require an asymmetric understanding of time: the past is considered to be “closed” (after all, there is nothing we can do to change it), but as agents we think of the future as open, as yet to be determined by our decisions and actions. Deliberating requires imagining different possible courses of action and understanding how they would affect the events that will happen. Given the epistemic fact mentioned above and our nature as agents, it is understandable that we have an asymmetrical perspective of time. However, a closer examination of the relation between these two things is beyond the scope of this investigation.

The previous was a very brief examination of the psychological arrow of time; my purpose was only to stress the following point: the temporal asymmetry that we deal with in our everyday life needs to be explained by means of subjective considerations and not in terms of an objective temporal asymmetry of the world. Perhaps the actual entropy gradient has something to do with our psychological temporal asymmetry, as Boltzmann suggested, but the asymmetry is, nonetheless, subjective (we will come back to this idea soon). That this is so should be clear, not only because of the epistemic consideration (we remember the past, not the future), but also because of the symmetry of the physical laws discussed in the previous section, which seems to leave no room for an objective temporal direction of our universe.

3.2.2.3 The Block Universe.

If we take seriously the symmetry embodied in the physical laws, then we should consider that there is no objective difference between the past, the present and the future. The present is a subjective notion, in the same way that “here” is. These two notions depend on our particular point of view, but they do not arise as a result of an objective difference in the world. “Just as “here” means roughly “this place,” so “now” means roughly “this time,” and in either case what is picked out depends on where the speaker stands. In this view there is no more an objective division of the world into the past, the present, and the future than there is an objective division of a region of space into here and there” [9]. This position is what is known as the block universe.

The idea of the block universe is that reality is not “something in time”, but that time is a part of reality (in the same way that we tend to imagine space as a part of reality and not reality as a part of space). For the block universe view, time is not a thing that “flows”; the past, the present and the future form a single whole, divisible only by subjective considerations. It is hard to imagine the universe from this perspective, because, as Price says, we are creatures in time [9]. Nevertheless, this objective perspective is the more appropriate one for investigating and analyzing physical processes because, as I have been stressing, the laws of physics do not reflect any qualitative difference between the present, the past and the future. The difference is inserted into the picture by us.

Taking seriously the idea of the block universe entails that we can think of any process both as evolving from what we call the past towards what we call the future, and as evolving backwards from what we call the future towards the past. There is no objective difference between these two descriptions. A process that we describe from a “future to past” perspective is exactly the same process that we regard as evolving from the past toward the future; what changes is only the point of view from which we describe it. For example, saying that entropy increases toward the future is exactly the same as saying that entropy decreases toward the past (past and future are subjective labels according to the block universe view). We could describe exactly the same world as if we were running a movie backwards; no objective difference should emerge (however, as will be explained soon, this temporal symmetry of the physical laws does not impose a symmetry on the concept of causation in the macroscopic realm).

What is important for our purposes is to recall that we should not judge as problematic a description of a certain process that goes from what we call the future towards what we call the past. If we find such a description problematic just because of its “direction of time”, that would be due solely to the fact that we live within an asymmetric (psychological) perspective of time. A description of a physical process “running” backward in time should not be taken as questionable or wrong per se. Once we recall that the physical laws do not provide an arrow of time, there remains no room for casting doubt on descriptions of physical processes that go in the “opposite” temporal direction (still, see the discussion below). I am not saying that we have to get used to descriptions that go from future to past, but that we should be able to recognize that the weirdness of such a description is due to our own point of view and not to a mistake in the description. We will apply these considerations when we come to see that some results in quantum mechanics are better understood by means of “backward in time” descriptions.

A clarification is in order. The symmetry of physical laws, and the block universe view, do not entail that every causal process can be described as a backward causal process. For example, “my dropping the glass caused the glass to break” is not as valid as “the breaking of the glass caused me to drop it”, even if we reverse time. What the symmetry of time entails is that the film running backwards or forwards is a film of the same process; the film running backward is not less valid (objective) than the film running forward. Suppose for instance that you could describe the position and the velocity of all the atoms of the glass that you dropped; at each instant of time you write down all the positions and velocities and you make a list with your data. If you were then presented with the movie of the same process (the dropping of the glass), but running backwards, and you wrote down again all the positions and velocities for each instant of time, no difference should be found between your first list and the second one, except that the signs of the velocities are reversed. However, as I said, in one case (past to future) you can say “the dropping of the glass caused it to break”, whereas in the other (future to past) you will not be able to say “the breaking of the glass caused you to drop it” or something like that. Hence, the symmetry of the physical laws does not entail that every causal situation is automatically a backward causal situation seen in reverse; it entails only that a physical description that does not appeal to terms such as “cause” is equally valid in both directions. In fact, some philosophers argue that if A causes B, B cannot be a cause of A even if we reverse the direction of time; this is known as the metaphysical priority of causes with respect to effects.
Thus, the symmetry of physical laws is not a sufficient condition for backward causation, but a necessary one; if the laws did not permit a description of a process with the time reversed, then backward causation would be impossible.
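
The list-keeping thought experiment above can be mimicked numerically. The following is a minimal sketch and not part of the original argument: it replaces the glass with a single particle in uniform gravity (a hypothetical toy model), and the names `verlet_step` and `film`, the time step and the initial conditions are all illustrative assumptions. It records the list of positions and velocities for the forward film, builds the backward film by reversing the order of the frames and the sign of the velocities, and then checks that the backward film is itself generated by the same law of motion.

```python
# Toy model (an assumption for illustration): one particle falling in uniform
# gravity stands in for the atoms of the glass.
G = -9.8    # acceleration in m/s^2 (assumed value)
DT = 0.01   # time between two frames of the film, in s (assumed value)
STEPS = 40  # number of frames after the first one


def verlet_step(x, v, dt=DT, a=G):
    """Advance position and velocity by one frame (velocity Verlet, constant a)."""
    x_new = x + v * dt + 0.5 * a * dt * dt
    v_new = v + a * dt
    return x_new, v_new


def film(x0, v0, steps=STEPS):
    """Return the list of (position, velocity) pairs, one per instant."""
    frames = [(x0, v0)]
    for _ in range(steps):
        frames.append(verlet_step(*frames[-1]))
    return frames


# First list: the film running from past to future.
forward = film(x0=1.0, v0=0.0)

# Second list: the same film shown backwards (same positions in reverse
# order, velocities with their signs flipped).
backward = [(x, -v) for x, v in reversed(forward)]

# The backward film obeys the same law: evolving its first frame with the
# same rule reproduces every later frame of the backward film.
x_last, v_last = forward[-1]
replayed = film(x0=x_last, v0=-v_last)

max_error = max(
    abs(xb - xr) + abs(vb - vr)
    for (xb, vb), (xr, vr) in zip(backward, replayed)
)
print(f"largest mismatch between the two films: {max_error:.2e}")
```

The mismatch is zero up to floating-point rounding. Note that nothing in the code refers to causes or effects; it only illustrates that a description in terms of positions and velocities is equally valid in both temporal directions, which is the limited claim made in the text.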

The last consideration was important because the reader might come to think that the block universe and the symmetry of physical laws are all we need to demonstrate that backward causation occurs in our world. As I already explained, the symmetry of physical laws and the block universe view only show that the universe is symmetric in time, not that every causal situation can be taken as a backward causal situation. The claim that backward causation occurs is rather that, given a certain (subjective) direction of time, there are causal situations that we cannot explain within that direction of time. Suppose that we set, as a matter of convention, that time goes from past to future. Then the symmetry of physical laws entails that another person who chooses the opposite convention (future to past), for example by replacing $t$ with $-t$, will describe exactly the same universe. But that does not mean that he can automatically talk in terms of causation in his “inverse” description (as the example of the glass tried to illustrate). The reason has to do with the fact that, macroscopically, our use of the concept of cause is heavily influenced by the entropy gradient we live in; our causal description of the world cannot ignore the fact that our world is one in which “dropped glasses” break, machines produce energy, things get damaged over the years, and so on. It is not only that most of the situations in which we appeal to the concept of cause in our daily routine can be shown to depend on irreversible processes (think of any causal situation during the day, and you will see that, almost with certainty, it involves irreversible processes), but also that the subjective arrow of time, which we cannot evade in our life, strongly induces a temporal asymmetry in our use of the concept of cause.
Macroscopically then, the concept of causation is found to be asymmetric with respect to time [9], but that is due to our subjective temporal asymmetry and not to the concept of causation itself (recall that most of the theories of causation do not attach a temporal asymmetry to the definition of causation).9

What backward causation requires is rather something more than the symmetry of the physical laws. It requires that once we have chosen a certain direction of time, “past to future” in our case, there are some processes that “escape” our chosen direction; more precisely, that there occur processes that force us to appeal to a causal explanation in which the effects occur before the causes. To sum up, it happens that, because of the world we inhabit, with its gradient of entropy, and because of the subjective temporal asymmetry that that gradient of entropy allegedly produces10, causation almost always respects a certain causal asymmetry. But we should not expect this causal asymmetry to hold in the quantum realm; if the causal asymmetry has something to do with the entropy gradient, then, in the quantum world, it could happen that the asymmetry vanishes and that we can describe, after all, many quantum processes as backward causal situations once we reverse time (see for example [12], in which it is argued that the temporal symmetry of the laws in the quantum realm might entail backward causation). As we will see in the last chapter, that is exactly what the Two States Vector Formalism proposes.

9 It would be interesting to investigate the relation between the entropy gradient we live in, our subjective arrow of time, and the asymmetry of causation in the macroscopic realm. Unfortunately, such an investigation is beyond the scope of the present work.

3.3 Backward causation in physics

Up to this point we have discussed seven main points. First, that backward causation is not absurd, as the examples of Dummett intended to show. Second, that three of the most prominent theories of causation are well suited to handle cases of backward causation. Third, that we can avoid causal paradoxes (recall the bilking argument) if we demand that we can only detect the effect of a backward causation situation once the cause has occurred. Fourth, that the second law of thermodynamics does not impose an arrow of time and, indeed, that the fundamental physical laws, with the exception of one solitary process (that of the kaons), are symmetric in time. Fifth, that the clear arrow of time that we find in our world is due to psychological considerations, perhaps related to the actual entropy gradient in which we live. Sixth, that the symmetry of the physical laws, and the asymmetry explained by means of psychological factors, suggest that a more appropriate view of time is the block universe model, in which no objective distinction between the past, the present and the future is to be found, i.e., it makes no difference if we describe the universe as evolving from what we call the future toward what we call the past. Finally, we saw that the temporal symmetry of the universe does not imply that every causal process can be regarded as a backward causal process when we reverse time. Instead, backward causation arises when, given a certain direction of time (which is subjective), some processes cannot be explained unless we admit that the causes occur after the effects (where “after” is to be understood according to the subjective direction of time). To finish this chapter, we are going to discuss three examples from physics in which a certain kind of backward causation is proposed.

10 As I said, the precise relation between the gradient of entropy and our subjective arrow of time needs further investigation.

Recall that one of Dummett’s points was that we can conceive of some situations in which we might be justified in believing that future events can cause past events. However, Dummett’s examples might seem very artificial, and that might detract from the credibility of the idea of backward causation. Thus, I think it is illustrative to study two physical theories developed during the last century that take seriously the idea of retrocausal processes: the Wheeler-Feynman absorber theory and Cramer’s transactional interpretation of quantum mechanics. At the end we will discuss a thought experiment proposed by Wheeler (Wheeler’s delayed choice experiment) that intends to show backward causation in quantum mechanics.

3.3.1 The Wheeler-Feynman absorber theory of radiation

This theory is also called the “Wheeler–Feynman time-symmetric theory”. Maxwell’s electromagnetic equations have both retarded (forward in time) and advanced (backward in time) solutions, although the advanced solutions are often discarded because it is argued that they are not physical [13, p. 658]. About 1946 Wheeler and Feynman developed their absorber theory of radiation, which was inspired by the time-symmetric nature of the solutions to the Maxwell equations. The theory was initially motivated by the problem of the self-energy of the electron (a problem that stems from the inconsistencies that arise when the electron is treated as a point charge). We can ignore the details of the self-energy problem since we are interested only in the time-symmetric theory of radiation that the authors developed.

We can motivate the discussion of the Wheeler-Feynman absorber theory (WFAT from now on) by asking: if Maxwell’s equations entail retarded and advanced solutions, why do we only find retarded waves in the world? The answer that WFAT provides is very ingenious. We know that an accelerating charged particle emits electromagnetic radiation, and that the particle loses energy as it radiates. The core idea of WFAT is that the energy loss of the accelerating particle can be explained solely by means of the electromagnetic field produced by other particles (absorbers), this electromagnetic field being composed of retarded and advanced solutions to Maxwell’s equations [14]. Also, it is a key condition of WFAT that the absorbers, which generate the field that acts on the source (the emitter), are numerous enough to absorb completely the radiation produced by the source and, as we will see, these absorbers must be isotropically distributed through space.

When the field produced by the source arrives at each particle of the absorber, the particle is set in motion and produces a field that is half-advanced and half-retarded. If we now sum the advanced fields produced by all the absorbers, we obtain a field that, evaluated at the location of the source, has the following properties (see [14]): 1) It is independent of the absorbing medium. 2) It is completely determined by the motion of the source. 3) It exerts a finite force on the source that is sufficient in magnitude and direction to account for the loss of energy of the source when radiating. 4) It is equal in magnitude to one-half the retarded field minus one-half the advanced field generated by the acceleration of the source. 5) The combination of the field produced by the absorbers with the field produced by the source (which is also half-retarded and half-advanced) provides the total retarded field that we usually observe (measure). Note that, using only time-symmetric considerations, Wheeler and Feynman not only succeed in reproducing the observed results of electromagnetism, but also provide a nice explanation of the “force of radiative damping”, avoiding the problematic interaction of the source with its own field.

In what follows I will explain, without going into the details, the main ideas that lead to the derivation of WFAT. The authors first consider the force that is produced by all the absorbers at the location of the source. This force, in turn, results from the motion of the absorbers due to the radiation produced by the source. By taking into consideration the lag due to the refractive index of the medium [14], the authors are able to deduce that the resultant force on the source, due to the advanced field produced by all the absorbers, is

$$F_t = \frac{2 q_s^2}{3 c^3}\,\frac{dE_a}{dt}, \tag{3.1}$$

where $E_a$ is the advanced field of the absorbers and $q_s$ is the charge of the source [14]. Now, Eq. 3.1 is the expression required to account for the loss of energy of the source when it radiates; it is not a new derivation by WFAT. What is new is the interpretation: the force is understood as a result of the advanced radiation of the absorbers, instead of as a result of the electron’s own field. We now turn to an ingenious idea from which WFAT derives its main results. The authors say:

“It is instructive to see how superposition of the advanced fields of a large number of particles can give the appearance of both retarded and advanced fields due to the source itself. The advanced field of a single charge of the absorber can be symbolized as a sphere which is converging towards the particle and which will collapse upon it at just the moment when it is disturbed by the source. But at the moment when the source particle itself was accelerated, the sphere in question had a substantial radius. One point on it touched, or nearly touched, the source. The shrinking sphere therefore appears to the source as a nearly plane wave which passes over it headed towards one of the particles of the absorber. When we consider the effect of all the absorbing charges, we have to visualize an array of approximately plane waves, all marching towards the source and passing over it in step. The resultant of these individual effects is a spherical wave, the envelope of the many nearly plane waves. The sphere converges, collapses on the source, and then pours out again as a divergent sphere. An observer in the neighborhood will gain the impression that this divergent wave originated from the source”. [14]

Note how important the condition of the isotropic distribution of the absorbers is; without this condition it would not be possible to obtain a spherical wave converging on the source. Note also that the spherical wave that converges on the source arrives at the source at the moment $t_s$ when the source started to radiate; the spherical wave starts in the future, i.e., at the time $t_a$ when the absorbers radiate due to the retarded field produced by the source, and travels towards the past, until the moment $t_s$ at which it arrives at the source. Also, it is important to recall that the source produces an advanced field (the time-symmetry of the Maxwell equations entails that both the absorbers and the source produce retarded and advanced fields). Then $E_{tc}$, the total field converging on the source, can be written as the difference between $E_{as}$, the advanced field of the source, and $E_a$, the “field composed of parts convergent on individual absorber particles” (this is the field that apparently converges on the source), each weighted by one half [14]. WFAT shows that [14]

$$E_{tc} = \frac{1}{2}E_{as} - \frac{1}{2}E_a = 0. \tag{3.2}$$

Therefore, before the moment $t_s$ at which the source started to radiate, there is no net disturbance on the source, because the advanced field of the source exactly cancels the advanced field of the absorbers [14]. It is important to mention that, because of the symmetry of the Maxwell equations, the advanced field produced by the source, $E_{as}$, is equal to the retarded field $E_{rs}$ produced by the source (the only difference is that the advanced wave propagates towards the past). Hence, we can write Eq. 3.2 as

$$E_{tc} = \frac{1}{2}E_{rs} - \frac{1}{2}E_a = 0. \tag{3.3}$$

It should be pointed out that during the time interval between $t_s$ and $t_a$ (the time between the moment when the source starts to radiate and the moment when the absorbers radiate “back”), we have two fields that are operationally indistinguishable: the retarded field produced by the source and the advanced field produced by the absorbers [14]:

$$E_T = \frac{1}{2}E_{rs} + \frac{1}{2}E_a, \tag{3.4}$$

where $E_{rs}$ is the retarded wave of the source and $E_a$ is the total field produced by the absorbers, now considered at times between $t_s$ and $t_a$. As the authors say, “a test particle will be unable to make a separation between the two retarded fields, one properly owing to the source, the other really owing to the advanced field of the absorber” [14]. Finally, let us attend to the following result: if we add the total field given by Eq. 3.4 to the zero field given by Eq. 3.3, we obtain [14]:

$$E_T + E_{tc} = \frac{1}{2}E_{rs} + \frac{1}{2}E_a + \frac{1}{2}E_{rs} - \frac{1}{2}E_a = E_{rs}, \tag{3.5}$$

which implies that the total field is operationally indistinguishable from the retarded field of the source (the field that we actually measure). However, although operationally we cannot distinguish the advanced field of the absorbers from the retarded field of the source, recall that there is one measurable effect produced by the advanced field, namely, the force on the source that accounts for its loss of energy during the radiative process [14]. To sum up this brief presentation of WFAT, I want to cite some words from the authors:

“Our picture of the mechanism of radiation is seen to be self-consistent. Any particle on being accelerated generates a field which is half-advanced and half-retarded. From the source a disturbance travels outward into the surrounding absorbing medium and sets into motion all the constituent particles. They generate a field which is equal to half the retarded minus half the advanced field of the source... The radiation field combines with the field of the source itself to produce the usual retarded effects which we expect from observation, and such retarded effects only. The radiation field also acts on the source itself to produce the force of radiative reaction. What we have said of one particle holds for every particle in a completely absorbing medium. All advanced fields are canceled by interference. Their effects show up directly only in the force of radiative reaction.” [14]

WFAT is interesting for our purposes because it is a time-symmetric theory (we can simply relabel the emitters as absorbers and the results will not change). Moreover, it explicitly takes backward processes as real; waves originating at a time that from our perspective is in the future can exert influence at times that from our perspective are in the past. Although some criticisms have been raised against this theory [9], my intention was only to show that some prominent physicists have taken seriously the idea that physical laws are time-symmetric and, in this sense, to motivate the idea that backward processes are to be expected in our universe.

3.3.2 Cramer’s transactional interpretation of quantum mechanics

Cramer’s interpretation of quantum mechanics, developed in the 1980s, is heavily influenced by WFAT. In fact, Cramer himself explains WFAT in order to introduce his quantum mechanical interpretation. Cramer’s motivations for developing his interpretation stem from the well-known difficulties that the Copenhagen interpretation suffers from: the puzzle raised by Schrödinger’s cat, the difficulties with the reality of quantum properties as argued in the famous EPR article [15], the interpretation of Born’s rule, and so on [13]. It is important to emphasize that Cramer’s view is only an interpretation; it in no way entails new predictions. Hence, no empirical evidence can be taken to support or discard Cramer’s interpretation; only interpretative considerations (consistency, capacity to handle the Copenhagen problems, etc.) can judge the success of Cramer’s attempt.

The scheme of this theory is almost identical to that of WFAT, although Cramer’s theory is aimed not at electromagnetic waves but at quantum waves $\psi$. Another difference is that Cramer analyses the case of a single emitter and a single absorber, whereas WFAT necessarily requires many absorbers (isotropically distributed). An emitter produces a wave $\psi$ that travels to an absorber. In response to the incoming wave, the absorber produces an advanced wave which travels back to the emitter. The emitter answers this incoming advanced wave by radiating back a retarded wave, and so on. The process keeps cycling until the net exchange of energy and other conserved quantities satisfies the quantum conditions of the system [13, p. 663].

Cramer illustrates the basic scheme of his theory by means of a metaphor: the emitter sends a retarded wave which he calls the “offer”; the absorber answers this wave by means of an advanced wave which he calls the “confirmation”. The cycle goes on until the above-mentioned quantum conditions are satisfied, and that final moment is what Cramer names the “transaction” (hence the name “transactional interpretation”).

By applying an argument similar to that of Wheeler and Feynman, and by taking into consideration that the advanced wave of the absorber travels through the same attenuating media through which the retarded wave of the emitter traveled, Cramer deduces [13, p. 662] the following:

$$F_{aa}(R_1, t_1) \propto F_{re}(R_2, t_2)\,F_{re}^*(R_2, t_2) = |F_{re}(R_2, t_2)|^2, \tag{3.6}$$

where $F_{aa}$ is the advanced wave of the absorber, $F_{re}$ the retarded wave of the emitter, $R_2$ the position of the absorber, $R_1$ the location of the emitter, $t_1$ the time at which the emitter “emits” the wave and $t_2$ the moment at which the emitter’s wave arrives at the absorber. “This means that the advanced “confirmation” or “echo” wave that the emitter receives from the absorber as the first exchange step of the incipient transaction is just the absolute square of the initial “offer” wave, as evaluated at the absorber locus” [13, p. 662]. Now, an observer only sees the “complete transaction”, and what he sees can be interpreted as a single wave front moving from the emitter to the absorber. Cramer explains that the advanced solution of the absorber, which has negative energy (it has a term $-\hbar\omega$), could be interpreted by an observer as a retarded wave with positive energy [13]. We can regard the complete transaction as a four-dimensional standing wave with boundary conditions fixed by the emitter and the absorber (as if it were a string held between two fixed points); “It [the standing wave] has been established between the terminating boundaries of the emitter, which blocks passage of the advanced wave further down the time stream, and the absorber, which blocks passage of the retarded wave further up the time stream” [13, p. 663].

The next passage helps us to better understand the status of the “offer” and “confirmation” waves in the light of quantum mechanical issues:

“The fundamental quantum-mechanical interaction is taken to be the transaction, as defined in the preceding section. The state vector of the quantum-mechanical formalism is a real physical wave with spatial extent and is identical with the initial “offer wave” of the transaction. The particle (photon, electron, etc.) and the collapsed state vector are identical with the completed transaction. The transaction may involve a single emitter and absorber or multiple emitters and absorbers, but is only complete when appropriate quantum boundary conditions are satisfied at all loci of emission and absorption”. [13, p. 665]

If $\psi$ denotes the retarded quantum mechanical wave produced by an emitter, then $\psi^*$ denotes the corresponding advanced wave (complex conjugation is equivalent to the operation of time reversal [13, p. 666]). In Cramer’s words, $\psi$ is the “offer wave” while $\psi^*$ is the “confirmation” wave. According to Eq. 3.6, $\psi\psi^*$ is then the “offer-confirmation wave echo”, in other words, the echo produced at the location of the emitter as a result of the emitter’s retarded wave interacting with the absorber’s advanced wave. On the other hand, the Born rule states that $\psi\psi^*$ is the probability of finding the particle at $x$ (here $x$ is the location of the emitter). Therefore, joining the Born rule and the transactional interpretation, we have that $\psi\psi^*$ is the probability of finding the echo wave yielded by a certain emitter and a certain absorber.

Following the previous strategy, we can think of $\int \psi\psi^{*}\,dx$ as the sum of all possible offer-confirmation echo waves over all locations in space [13, p. 666], and of $\int \psi_1\psi_2^{*}\,dx$ as the same except that we are imposing certain boundary conditions on the absorber ($\psi_2^{*}$ is the advanced wave originating at the position of the absorber). I think it is not relevant here to develop more details of Cramer’s theory.

As happens with all interpretations, Cramer’s interpretation has been criticized on several levels. But it has also been shown to resolve some quantum paradoxes, and we cannot deny that it could provide new insight into the quantum mechanical realm. It should be pointed out that Cramer insists on taking the transactional process as atemporal (is he adopting the block universe view with respect to the transaction?).

3.3.3 Wheeler delayed choice experiment

The last example of backward-in-time processes in physics that we are going to discuss is one proposed by John Wheeler (the same Wheeler of WFAT). The idea of the experiment is quite simple. Suppose that we perform Young’s double slit experiment but with some modifications. Now, in addition to the fluorescent screen that records the final position of the photons, we have a device (a pair of collimated lenses) that determines the slit through which the photon has passed (see figure 3.1). Of course, we cannot have the screen and the lenses working simultaneously; if the screen is set then the photons will be absorbed by it and we will see the famous interference pattern that reveals the wave nature of the photons. If we put down the screen, then the lenses will tell us which slit the photon went through, revealing the particle nature of the photons. Clearly, we can perform only one of the two experiments at a time.

The explanation of the previous experiment is that, when we place the screen, the photons behave like waves when passing through the double slit device, whereas when we put down the screen (allowing the photons to reach the lenses) the photons behave like particles. In short, in the first case the photon passes through both slits, while in the second case the photon passes through only one slit. Given the previous set up, Wheeler [13] proposed the following: suppose that we have a mechanism that quickly (very quickly) puts down the screen so that the photons arrive at the lenses. Moreover, suppose that after the photon has passed the double slit we decide, and suppose that we have enough time, to put down the screen, causing the photon to encounter one of the lenses. So, if after the photons have passed the double slit device we decide not to put down the screen, we will see an interference pattern, which entails that the photon passed through both slits. But if, after the photon has passed the double slit device, we put down the screen, then we will obtain a definite arrival position, which entails that the photon passed through only one of the two slits.

Recall that we assume that, when passing through the slits, the photon already behaves either as a particle or as a wave, but not as both (this is the complementarity principle), depending of course on the final measurement performed. In this case, however, we make the choice of what to measure only after the photon has passed the double slit device, hence the name delayed choice. Thus, it is as if the photon has not yet “decided” how, either as a wave or as a particle, it has passed the double slit. In short, what is very striking is that it is as if the final choice of the experimenter came to determine whether the photon passed through one slit or both slits, despite the fact that the photon has already passed the double slit device! A quantum optics version of this experiment has been performed [16].

Figure 3.1: a) We have a double slit experiment but now, in addition to the usual fluorescent screen that records the interference pattern, we place two lenses (L1 and L2) such that, if the screen is down, as in b), the electrons arrive at only one of the lenses; therefore, according to the lens at which it arrived, we can determine which slit the electron passed through.

Much debate has arisen around the interpretation of this experiment, but it should be pointed out that a very natural and simple way of explaining it is by appealing to backward causation; the final measurement made by the experimentalist retro-causes the previous behavior of the photon. If the experimentalist decides, at the final moment, to put down the screen, then the photon will arrive at one of the lenses, and this retro-causes the photon to pass through one of the slits (the slit that corresponds to the lens at which the photon arrived), whereas if the experimentalist leaves the screen in place, then the interaction of the photon with the screen retro-causes the photon to have passed the slits as a wave. In a certain sense, we can determine how the photon passed the slits after it has passed them!

Note that we can try to provide an alternative explanation, saying for instance that the photon passes through the slits while being in an undetermined state, neither wave nor particle; this was Wheeler’s view because he resisted the idea of backward causation. However, such an explanation will lead us to deep problems. To mention one: the complementarity principle must be wrong. The photon is a wave and a particle simultaneously, as if it had momentum and a definite position at the same time! As if it passed through only one slit and both simultaneously! There is no easy way of interpreting this experiment if it is not by means of retro-causation. Cramer himself uses his theory, which involves backward causation as we studied, to easily explain this experiment [13]. Wheeler’s delayed choice experiment seems to be suggesting that the quantum realm might drive us to backward causation. Accepting this idea would be a first step towards establishing the main thesis of this text: the interpretation of the quantum world demands that we appeal, more often than we could initially suspect, to causal descriptions in which effects anticipate their causes.

In this section we have briefly examined how some physicists have developed theories that involve backward causation and we have studied an experiment (the delayed choice experiment) that seems to be suggesting that backward causation is not only a possibility, but an essential feature revealed by quantum mechanical phenomena.

3.4 Broad overview of the chapter

In the first chapter we studied what causation is. In this chapter we studied backward causation. We went through some of the objections against the possibility of backward causation and we came to discuss the symmetry of the physical laws. We can think of these two chapters as concerned with the philosophical problems around the idea of backward causation. The most important thing to highlight is that backward causation is not contradictory but a genuine possibility. This is so because of at least two points: the current theories of causation can handle this kind of causation, and the symmetry of the physical laws suggests that this kind of causation is not only possible but should be expected (at least in the quantum realm, where our macroscopic notions are not “contaminated” by the entropy gradient). And finally, we have seen that this causation is not only possible and not only expected, but has been taken into serious consideration by some prominent physicists. In short, these two chapters, which together constitute the first part of the present work, can be thought of as a defense, mostly philosophical but also physical, of the legitimacy of backward causation.

Part II

Weak Measurements

Chapter 4

Indirect Measurements

The purpose of this chapter is to present the theory of indirect measurements. This theory provides a description of the measurement process that takes into account the interaction between the measurement device and the system. Indirect measurements explain how we come to measure the properties of the system by studying the influence that a system exerts on a measurement device. In other words, indirect measurements explain how we measure the system indirectly, by measuring the device after it has interacted with the system. The theory of indirect measurement is essential for the presentation of the weak measurement theory (which we will discuss in the next chapter) for a very simple reason: weak measurements can only be performed within the scheme of indirect measurements.

4.1 Indirect or ancilla measurement

In this section we are going to provide a schematic description of the measurement process. In order to do so we start by noting the simple but important fact that any measurement always involves both a system that will be measured and a measurement device (the measurement device is also known as a meter, pointer or ancilla). Let us imagine that we want to measure an observable $\hat{A}$ for a system S. We can think of the measurement process as consisting of two stages. First, a certain interaction between the device and the system occurs; during this stage the device and the system get coupled, and this coupling depends on the particular observable that we want to measure. Second, a projective measurement on the device is performed; because of the entanglement between the system and the device that occurred in the first stage, this projective measurement on the meter yields the outcome of the desired observable of the system (it yields a certain eigenvalue a of the system).


We are going to study indirect measurements according to the aforementioned two steps: in the next section we are going to study the details of the coupling of the measurement device with the system and in the subsequent section we will examine the way in which we have to measure the pointer in order to gather information about the system.

4.1.1 Interaction between the system and the pointer

From now on we will refer to the system S that we want to measure together with the measurement device as the total system. For example, if we are going to measure the spin of a certain particle using a Stern-Gerlach device, the system will be the spin degree of freedom, the device or meter will be the momentum of the particle along the direction of the magnetic field in the Stern-Gerlach apparatus, and the total system will be the system formed by the spin degree of freedom together with the momentum of the particle (the momentum along the direction just mentioned). The presentation of the indirect measurement theory given here is based on [17], a very well written work by Svensson.

We begin our description by modeling the meter as a quantum system φ with Hilbert space $H_\phi$ of dimension $D_\phi$. Let $\hat{\phi}$ be an operator on $H_\phi$ with eigenstates $|\phi_k\rangle$, $k = 1, ..., D_\phi$ (the set $\{|\phi_k\rangle\}$, with $k = 1, ..., D_\phi$, constitutes a basis of $H_\phi$). The observable associated with the operator $\hat{\phi}$ is $\hat{M}_\phi$. The observable $\hat{M}_\phi$ is known as the ‘pointer variable’ and the $|\phi_k\rangle$ are the pointer’s states$^{1}$. The meter will be initially prepared in a pure state $|\phi^{(0)}\rangle$, which does not have to be an eigenstate of $\hat{M}_\phi$. The initial density matrix of the meter is $\rho_{\phi^{(0)}} = |\phi^{(0)}\rangle\langle\phi^{(0)}|$, where “(0)” is used to recall that this is the initial state of the apparatus.

The system that will be measured is ψ with Hilbert space $H_\psi$. Let $|a_i\rangle$, $i = 1, ..., D_\psi$, constitute a complete orthonormal set of eigenstates of an observable $\hat{A}_\psi$ of the system (note that $D_\psi$ is the dimension of $H_\psi$). The observable $\hat{A}_\psi$ is the observable that we want to measure (in some sense, the whole purpose of our measurement is to determine the values of $\hat{A}_\psi$). Suppose that we write the initial state of the system in the basis spanned by the eigenvectors of $\hat{A}_\psi$: $|\psi^{(0)}\rangle = \sum_i c_i |a_i\rangle$. The initial density matrix of the system is of course $\rho_{\psi^{(0)}} = |\psi^{(0)}\rangle\langle\psi^{(0)}|$.

Now that we have specified the initial states of the meter and the system we can move on to consider their interaction. We begin our analysis by specifying the total system Ψ (not to be confused with ψ), which is composed of the system ψ and the meter φ. The corresponding Hilbert space of Ψ is given by the tensor product of $H_\psi$ and $H_\phi$: $H_\Psi = H_\psi \otimes H_\phi$. Since the meter and the system are initially uncorrelated (they have not yet come to interact), the initial density matrix of Ψ is given by the pure product state

$$\rho_{\Psi^{(0)}} = \rho_{\psi^{(0)}} \otimes \rho_{\phi^{(0)}}. \tag{4.1}$$

1 In the rest of this work I follow the convention that the subscript “φ” always refers to the meter; $\hat{M}_\phi$ is an observable on the meter’s space, $H_\phi$ is the meter’s Hilbert space and so on. On the other hand, the subscript “ψ” denotes entities related to the system. Sometimes, for reasons of readability, the subscript is omitted. In those cases the entity that lacks a subscript should be taken as pertaining to the system. For instance, $\hat{A}$ is an observable on the system.

Before proceeding let me note that we will work with non destructive (or non-demolition) measurements; this means that the system will not be destroyed during the measurement process (indeed, indirect measurements are by definition non-demolition measurements). For instance, measurements of the polarization of photons do not have to lead to the destruction of those photons while they pass through a polarizer, while measuring the final position of those photons when they collide with a fluorescent screen does lead to their destruction (mathematically, the condition for a non destructive measurement is that the Hamiltonian that governs the interaction of the system with the meter commutes with the observable, $[\hat{H}_{int}, \hat{A}_\psi] = 0$).

Let us consider now the moment in which the system and the meter interact (recall that the study of this event is the main purpose of the present section). The device and the system interact through a unitary time-evolution operator:

$$\hat{U} = e^{-\frac{i}{\hbar}\int dt\, \hat{H}_\Psi}, \tag{4.2}$$

see [17–20]. This initial stage of the measurement process, in which the interaction between the system and the meter occurs, is known as ‘pre-measurement’ [17]. It is during the pre-measurement that the system and the device get entangled; a coupling between the pointer’s variables and the different states of the system occurs. This entanglement is the cornerstone of the indirect measurement protocol; it is because of this coupling that a subsequent reading of the pointer’s variable yields a certain outcome of $\hat{A}_\psi$. For a pointer to be capable of measuring, the eigenstates of the system must be correlated with the states of the pointer, just like the needle of the oil indicator of a car is correlated with the state of the oil in the tank. By reading the oil indicator we get to know the amount of oil in the tank; by reading the meter’s pointer we get to know a certain value of $\hat{A}_\psi$. In the present section we will describe the pre-measurement in a quite abstract, and at the same time simple, way, without specifying the evolution operator (all we need to know for the moment is that during the pre-measurement the states of the system and the states of the meter get correlated). Later on (in section 4.2) we will see how the theory presented here can be applied to a particular evolution operator.

Given that the initial state of the total system is

$$|\Psi^{(0)}\rangle = \sum_i c_i |a_i\rangle \otimes |\phi^{(0)}\rangle, \tag{4.3}$$

and given the fact that during the pre-measurement the system and the device get coupled, the total state after the interaction is given by $\hat{U}|\Psi^{(0)}\rangle$. This yields:

$$|\Psi^{(0)}\rangle \xrightarrow{\hat{U}} \hat{U}\Big(\sum_i c_i |a_i\rangle \otimes |\phi^{(0)}\rangle\Big) = \sum_i c_i |a_i\rangle \otimes |\phi^{(i)}\rangle. \tag{4.4}$$

The upper index i in $|\phi^{(i)}\rangle$ serves to stress the fact that once the pre-measurement is over, the pointer’s states of the meter are coupled (correlated) with the different states of the system (recall that i runs from 1 to $D_\psi$, the dimension of the system’s Hilbert space). More precisely, the different eigenstates $|a_i\rangle$ of the system are now correlated with the pointer’s states $|\phi^{(i)}\rangle$. How exactly this correlation is established depends on the apparatus and on the system’s observable that we are interested in measuring. As I already explained, in forthcoming sections, when we come to discuss the measurement procedure known as the ‘Von Neumann protocol’, we will see how to apply the currently abstract formulation of the ‘pre-measurement’. For the moment we only need to keep in mind that the pre-measurement establishes a correlation between the meter and the system.

According to standard quantum mechanics, the evolution of a state described by an initial density matrix $\rho(0)$ is given by $\rho(t) = \hat{U}\rho(0)\hat{U}^{\dagger}$ (where $\hat{U}$ is the evolution operator). Therefore, the state of the total system Ψ after the pre-measurement is

$$\rho_{\Psi^{(1)}} = \hat{U}\rho_{\Psi^{(0)}}\hat{U}^{\dagger} = \hat{U}\big(\rho_{\psi^{(0)}} \otimes \rho_{\phi^{(0)}}\big)\hat{U}^{\dagger}, \tag{4.5}$$

where we have used Eq. 4.1. The initial density matrix of the total system is, using 4.3,

$$\rho_{\Psi^{(0)}} = \sum_i c_i |a_i\rangle \otimes |\phi^{(0)}\rangle \sum_j c_j^{*} \langle a_j| \otimes \langle\phi^{(0)}|. \tag{4.6}$$

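Therefore, the density matrix of the total system after the pre-measurement is:

$$\rho_{\Psi^{(1)}} = \hat{U}\rho_{\Psi^{(0)}}\hat{U}^{\dagger} = \sum_{i,j} |a_i\rangle \otimes |\phi^{(i)}\rangle\, c_i c_j^{*}\, \langle a_j| \otimes \langle\phi^{(j)}|, \tag{4.7}$$

where we have taken into account Eq. 4.4 and the linearity of $\hat{U}$. Moreover, since $\langle a_i|\rho_{\psi^{(0)}}|a_j\rangle = \langle a_i|\psi^{(0)}\rangle\langle\psi^{(0)}|a_j\rangle = c_i c_j^{*}$, Eq. 4.7 becomes

$$\rho_{\Psi^{(1)}} = \sum_{i,j} |a_i\rangle \otimes |\phi^{(i)}\rangle\, \langle a_i|\rho_{\psi^{(0)}}|a_j\rangle\, \langle a_j| \otimes \langle\phi^{(j)}|. \tag{4.8}$$

Finally, making use of the projectors $\hat{P}_{\psi_i} = |a_i\rangle\langle a_i|$ and $\hat{P}_{\psi_j} = |a_j\rangle\langle a_j|$ that act on the system’s space, we obtain:

$$\rho_{\Psi^{(1)}} = \sum_{i,j} |\phi^{(i)}\rangle\, \hat{P}_{\psi_i}\, \rho_{\psi^{(0)}}\, \hat{P}_{\psi_j}\, \langle\phi^{(j)}| \tag{4.9}$$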

[17]. Some points are worth mentioning. First, it is clear from Eq. 4.9 that it is not in general possible to write $\rho_{\Psi^{(1)}}$ as a product of a system’s state and a meter’s state; during the pre-measurement the system and the meter get entangled. Second, note that if the system’s initial state were the eigenstate $|a_k\rangle$, then Eq. 4.9 would be

$$\sum_{i,j} |\phi^{(i)}\rangle\, \hat{P}_{\psi_i} |a_k\rangle\langle a_k| \hat{P}_{\psi_j}\, \langle\phi^{(j)}| = |\phi^{(k)}\rangle |a_k\rangle\langle a_k| \langle\phi^{(k)}|. \tag{4.10}$$

Third, the pointer’s states $|\phi^{(i)}\rangle$ need not be, in general, eigenstates of the observable $\hat{M}_\phi$ [17]. Fourth, as we have been saying, after the pre-measurement we achieve a correlation between the states of the system and the states of the device; to each $|a_i\rangle$ there corresponds a particular meter state $|\phi^{(i)}\rangle$. Nevertheless, as we shall see soon, this does not preclude the possibility of an overlap of the different meter states. In other words, it is possible that some states $|\phi^{(i)}\rangle$, $|\phi^{(j)}\rangle$ are not mutually orthogonal (the importance of this fact will be appreciated later). The reason (at least one of them) for this possible overlapping of the meter states lies in the fact that nothing guarantees that the initial eigenstates $|\phi_k\rangle$ of the observable $\hat{M}_\phi$, when transformed by the evolution operator, remain eigenstates of $\hat{M}_\phi$.

We can obtain the state of the system after the pre-measurement by taking the partial trace of the total density matrix over the space of the instrument:

$$\begin{aligned}
\rho_{\psi^{(1)}} = Tr_\phi\, \rho_{\Psi^{(1)}} = \sum_k \langle\phi_k|\rho_{\Psi^{(1)}}|\phi_k\rangle &= \sum_{k,i,j} \langle\phi_k|\phi^{(i)}\rangle\, \hat{P}_{\psi_i}\, \rho_{\psi^{(0)}}\, \hat{P}_{\psi_j}\, \langle\phi^{(j)}|\phi_k\rangle \\
&= \sum_{i,j} \hat{P}_{\psi_i}\, \rho_{\psi^{(0)}}\, \hat{P}_{\psi_j}\, \langle\phi^{(j)}|\phi^{(i)}\rangle.
\end{aligned} \tag{4.11}$$

From the above equation we can clearly see that if the states $|\phi^{(i)}\rangle$ and $|\phi^{(j)}\rangle$ are orthogonal for all $i \neq j$, then the density matrix of the state of the system after the pre-measurement is diagonal. This means that in case none of the pointer states overlap, no interference effects between the different states of the system are possible [17]. It is precisely because this measurement scheme makes room for a possible interference of different eigenstates of the system that the scheme is more general than the traditional projective measurement theory; no interference of eigenstates pertaining to different eigenvalues is possible after a projective measurement.
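The role of the overlap in Eq. 4.11 can be checked numerically. The following is only a minimal sketch, assuming a two-level system in an equal superposition and a two-dimensional meter; the helper names (`inner`, `reduced_system_state`, `meter_pair`) and the parametrization of the overlap by an angle are illustrative assumptions, not part of [17].

```python
import math

def inner(u, v):
    """Inner product of two state vectors (conjugate-linear in the first argument)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def reduced_system_state(c, meter_states):
    """Matrix elements of rho_psi(1) in the a_i basis, following Eq. 4.11:
    element (i, j) is c_i c_j^* times the overlap of phi^(j) with phi^(i)."""
    n = len(c)
    return [[c[i] * c[j].conjugate() * inner(meter_states[j], meter_states[i])
             for j in range(n)] for i in range(n)]

def meter_pair(theta):
    """Two normalized meter states whose overlap is cos(theta) (hypothetical choice)."""
    return [[1 + 0j, 0j],
            [complex(math.cos(theta)), complex(math.sin(theta))]]

# Equal superposition of the two eigenstates of the system.
c = [complex(1 / math.sqrt(2)), complex(1 / math.sqrt(2))]

rho_orthogonal = reduced_system_state(c, meter_pair(math.pi / 2))   # no overlap
rho_overlapping = reduced_system_state(c, meter_pair(math.pi / 6))  # overlap cos(30 deg)
```

With orthogonal meter states the off-diagonal elements vanish (up to rounding), while with overlapping meter states they survive, scaled by the overlap.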

4.1.2 Reading the meter

In the last section we described how the instrument and the system get entangled. We are now ready to “read” the pointer in order to determine the specific state of the system; once the system and the instrument are entangled, we can measure the system by measuring the pointer’s states of the meter. This constitutes the second stage of the indirect measurement protocol. The reading of the meter consists of a projective measurement of the pointer’s variable $\hat{M}_\phi$. Suppose that this projective measurement$^{2}$ on the meter yields the outcome $\phi_k$. Then, Lüders’ rule states that the total system state $\rho_{\Psi^{(1)}}$ that we had obtained after the pre-measurement transforms to:

$$\rho_{\Psi^{(1)}} \xrightarrow{\hat{P}_{\phi_k}} (\rho_{\Psi^{(1)}}|\phi_k) = \frac{(\hat{I}_\psi \otimes \hat{P}_{\phi_k})\, \rho_{\Psi^{(1)}}\, (\hat{I}_\psi \otimes \hat{P}_{\phi_k})}{Prob(\phi_k|\rho_{\Psi^{(1)}})}, \tag{4.12}$$

where $\hat{P}_{\phi_k} = |\phi_k\rangle\langle\phi_k|$ and $Prob(\phi_k|\rho_{\Psi^{(1)}})$ is the probability of obtaining the outcome $\phi_k$ given that the total system is in the state $\rho_{\Psi^{(1)}}$. Note that $\hat{I}_\psi \otimes \hat{P}_{\phi_k}$ is an operator that applies the identity to the system’s Hilbert space $H_\psi$ and the projector $\hat{P}_{\phi_k}$ to the measuring device’s Hilbert space $H_\phi$. The probability $Prob(\phi_k|\rho_{\Psi^{(1)}})$ of obtaining the pointer’s variable outcome $\phi_k$ given that the total system Ψ is in the state $\rho_{\Psi^{(1)}}$ is:

$$\begin{aligned}
Prob(\phi_k|\rho_{\Psi^{(1)}}) = Tr[(\hat{I}_\psi \otimes \hat{P}_{\phi_k})\rho_{\Psi^{(1)}}] &= \sum_i |\langle\phi^{(i)}|\phi_k\rangle|^2\, \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle \\
&= \sum_i Prob(\phi_k|\rho_{\phi^{(i)}})\, Prob(a_i|\rho_{\psi^{(0)}}),
\end{aligned} \tag{4.13}$$

where $\rho_{\phi^{(i)}}$ refers to the meter’s state $|\phi^{(i)}\rangle\langle\phi^{(i)}|$ (a step by step derivation of the previous result can be found in appendix A). The result just obtained is interesting: the probability of obtaining a certain outcome $\phi_k$ of the pointer variable after the pre-measurement is the sum, over i, of the product of the probability of obtaining the outcome $\phi_k$ given that the meter’s state is $\rho_{\phi^{(i)}}$ with the probability of obtaining the outcome $a_i$ given the initial state of the system $\rho_{\psi^{(0)}}$ (during the pre-measurement this initial state does not change). Also, let us pay attention to the penultimate line of Eq. 4.13. Suppose that $\langle\phi^{(i)}|\phi_k\rangle = \delta_{ik}$. Then, according to Eq. 4.13, this would mean that after the pre-measurement the probability of obtaining the pointer variable outcome $\phi_k$ would be equal to the probability that the system is found in the state $|a_k\rangle$ (the state that corresponds to the eigenvalue $a_k$): $Prob(\phi_k|\rho_{\Psi^{(1)}}) = \langle a_k|\rho_{\psi^{(0)}}|a_k\rangle = Prob(a_k|\rho_{\psi^{(0)}})$. Hence, the probability distribution of the meter’s observable $\hat{M}_\phi$ would perfectly resemble the probability distribution of the system’s observable $\hat{A}_\psi$. This is another way of saying that in the case of no overlapping between the $|\phi_k\rangle$ and the $|\phi^{(i)}\rangle$ states (for $i \neq k$), we can completely determine the probabilities of the system’s observable by attending to the probability distribution yielded by the pointer’s variable. However, the general situation is one in which a certain overlap of the states $|\phi_k\rangle$ and $|\phi^{(i)}\rangle$ exists. Therefore, in the general case the probability distribution of the system’s observable cannot be completely known by reading the meter; depending on the overlap of the meter’s states, we can obtain more or less precise information about the system. We will later come back to these considerations.

2 Lüders’ rule affirms that the state S of the system, after we have performed a projective measurement on it by applying a projector $\hat{P}_k$ and after we have obtained a certain value $a_k$, transforms to $S^{*} = \frac{\hat{P}_k S \hat{P}_k}{Tr[S\hat{P}_k]}$ [21].

Following [19], we can write Eq. 4.13 in the following form (which of course is equivalent but perhaps more useful in the light of future discussions):

$$Prob(\phi_k|\rho_{\Psi^{(1)}}) = \sum_i \langle\psi^{(0)}|\hat{P}_{a_i}|\psi^{(0)}\rangle\, |\langle\phi^{(i)}|\phi_k\rangle|^2, \tag{4.14}$$

where $\hat{P}_{a_i}$ projects onto the subspace corresponding to the eigenvalue $a_i$.
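As an illustration of Eq. 4.14, the sketch below compares a meter whose post-interaction states coincide with the readout basis against one whose states overlap. It is a hedged toy example: the function name `meter_probabilities`, the two-dimensional meter and the numerical values are assumptions made only for illustration, and the readout states are taken to be the standard basis vectors.

```python
import math

def meter_probabilities(system_probs, meter_states):
    """Prob(phi_k) as in Eq. 4.14: sum over i of Prob(a_i) times the squared
    overlap of phi^(i) with phi_k, where phi_k is the k-th basis vector."""
    dim = len(meter_states[0])
    return [sum(p * abs(state[k]) ** 2 for p, state in zip(system_probs, meter_states))
            for k in range(dim)]

# Prob(a_1) and Prob(a_2) for the initial system state (illustrative values).
system_probs = [0.8, 0.2]

# Sharp meter: phi^(i) coincides with the readout state phi_i.
sharp = [[1 + 0j, 0j], [0j, 1 + 0j]]
# Blurred meter: phi^(2) overlaps with the readout state phi_1 (hypothetical choice).
theta = math.pi / 6
blurred = [[1 + 0j, 0j], [complex(math.cos(theta)), complex(math.sin(theta))]]

probs_sharp = meter_probabilities(system_probs, sharp)
probs_blurred = meter_probabilities(system_probs, blurred)
```

For the sharp meter the pointer statistics reproduce the system statistics; for the blurred meter they are distorted by the overlap, although they still add up to one.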

One of the most important objectives of a measurement, if not the most important, is to gather information about the state of the system. So, the question that concerns us now is: “what is the state of the system if the reading of the meter has yielded the outcome $\phi_k$?” Equivalently, “what is the state of the system if we know that the state of the meter is $|\phi_k\rangle$?” The answer is simple: perform the partial trace of the total system state given by Eq. 4.12 over the meter’s Hilbert space:

$$\rho_{\psi(\phi_k)} = Tr_\phi(\rho_{\Psi^{(1)}}|\phi_k) = Tr_\phi\left[\frac{(\hat{I}_\psi \otimes \hat{P}_{\phi_k})\, \rho_{\Psi^{(1)}}\, (\hat{I}_\psi \otimes \hat{P}_{\phi_k})}{Prob(\phi_k|\rho_{\Psi^{(1)}})}\right]. \tag{4.15}$$

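We can develop the above result (see appendix A) to obtain:

$$\begin{aligned}
\rho_{\psi(\phi_k)} &= \frac{1}{Prob(\phi_k)}\, \langle\phi_k|\, \hat{U}\big(\rho_{\psi^{(0)}} \otimes \rho_{\phi^{(0)}}\big)\hat{U}^{\dagger}\, |\phi_k\rangle \\
&= \frac{1}{Prob(\phi_k)}\, \langle\phi_k|\hat{U}|\phi^{(0)}\rangle\, \rho_{\psi^{(0)}}\, \langle\phi^{(0)}|\hat{U}^{\dagger}|\phi_k\rangle \\
&= \frac{1}{Prob(\phi_k)}\, \hat{M}_k\, \rho_{\psi^{(0)}}\, \hat{M}_k^{\dagger},
\end{aligned} \tag{4.16}$$

where the $\hat{M}_k = \langle\phi_k|\hat{U}|\phi^{(0)}\rangle$ are the measurement operators [17]. With Eq. 4.16 we can determine the system state $\rho_\psi$ once we have “read” the outcome $\phi_k$ for the state of the pointer. It is important to mention that the meter and the system become unentangled after the projective measurement on the meter.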
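Eq. 4.16 can be made concrete with a small numerical sketch. It assumes, as in Eq. 4.4, a non-demolition coupling that sends each eigenstate of the system to its own meter state, so that each measurement operator is diagonal in the eigenbasis of the system with entries given by the overlaps of the readout state with the meter states. The function names, the two-dimensional meter and the chosen states are hypothetical.

```python
import math

def measurement_operator_diagonal(k, meter_states):
    """Diagonal entries of M_k in the a_i basis: the overlap of the readout
    state phi_k (k-th basis vector) with each meter state phi^(i)."""
    return [state[k] for state in meter_states]

def post_measurement_state(rho, k, meter_states):
    """Return (rho_k, prob_k) with rho_k = M_k rho M_k^dagger / prob_k (Eq. 4.16).
    Assumes prob_k is nonzero, i.e. that the outcome phi_k can actually occur."""
    m = measurement_operator_diagonal(k, meter_states)
    n = len(rho)
    unnormalized = [[m[i] * rho[i][j] * m[j].conjugate() for j in range(n)]
                    for i in range(n)]
    prob = sum(unnormalized[i][i] for i in range(n)).real
    return [[x / prob for x in row] for row in unnormalized], prob

# System initially in an equal superposition of a_1 and a_2.
rho0 = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]
# Meter states after the pre-measurement, with overlap cos(30 deg) (illustrative).
theta = math.pi / 6
meter_states = [[1 + 0j, 0j], [complex(math.cos(theta)), complex(math.sin(theta))]]

rho_k0, prob_k0 = post_measurement_state(rho0, 0, meter_states)
rho_k1, prob_k1 = post_measurement_state(rho0, 1, meter_states)
```

In this toy model the second readout outcome can only occur if the system is in the second eigenstate, so `rho_k1` is the corresponding projector, whereas the first outcome leaves the system in a superposition with modified weights.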

Before we move on to the next section, I will present a brief summary of the last two sections. So far we have discussed the ancilla scheme (or the indirect measurement scheme), whose purpose is to provide a description of the measurement process. As we have stated, the ancilla scheme allows us to analyze the measurement process by means of two stages: the first stage is the pre-measurement, in which the different states of the system get correlated with the different states of the pointer; the second stage consists of the reading of the meter by means of a projective measurement over the pointer’s space. Given that at least one state of the measurement device corresponds to a state of the system, by projectively measuring the meter we gather information about the system (the state of the system after the reading of the meter is given by Eq. 4.16).

4.2 The Von Neumann protocol

The previous discussion of the measurement process was rather abstract in the sense that we specified neither the Hamiltonian of interaction nor (as a consequence) the evolution operator. The main objective of the present section is to provide a concrete example of a measuring protocol that can be seen as a particular case of the ancilla scheme. The particular measuring protocol that we will discuss was developed by Von Neumann and for this reason is known as the Von Neumann protocol.

Within the Von Neumann protocol we will usually use the pointer’s position $\hat{Q}$ as the pointer variable and $\{|q\rangle\}$ as the set of eigenstates of the meter (we could use instead the pointer’s momentum as the observable, but the protocol is exactly the same [18, 19]). $\hat{Q}$ varies continuously, since it refers to the position. We can think of the eigenvalues q as the different positions that the pointer can occupy. Using the completeness relation in the continuous case, we can write the initial and the “after pre-measurement” pointer states as $|\phi^{(0)}\rangle = \int dq\, |q\rangle\langle q|\phi^{(0)}\rangle = \int dq\, |q\rangle\, \phi_0(q)$ and $|\phi^{(i)}\rangle = \int dq\, |q\rangle\langle q|\phi^{(i)}\rangle = \int dq\, |q\rangle\, \phi_i(q)$ respectively. $\phi_0(q)$ and $\phi_i(q)$ are the wave functions associated with the initial and final pointer states (soon we will consider these wave functions to be Gaussian).

The next step in the protocol is to specify the Hamiltonian that rules the interaction between the system and the device. Following [22], we can choose $\hat{H}_{int} = g(t)\hat{A} \otimes \hat{P}$, where g(t) is a function with compact support in t that measures the strength of the interaction, $\hat{A}$ is the observable of the system that we want to measure and $\hat{P}$ is the pointer’s variable conjugate to $\hat{Q}$ [17] (for ease of notation I have omitted and I will omit for a while the subscripts ψ and φ; to avoid confusion we only need to remember that $\hat{A}$ is an observable of the system and $\hat{P}$ is an observable of the meter). We make the assumption that the time of the interaction between the device and the system is very short, so that we can neglect the “free” Hamiltonian$^{3}$ of the pointer and the system during the measurement and so that $\int dt\, \hat{H}_{int}$ can be approximated by $g\hat{A} \otimes \hat{P}$, with $g = \int dt\, g(t)$ (see for instance [17–20]). Hence, the evolution operator for the pre-measurement stage is:

$$\hat{U} = e^{-\frac{i}{\hbar} g \hat{A} \otimes \hat{P}}. \tag{4.17}$$

Let’s see now how this evolution operator transforms the initial state $|\psi^{(0)}\rangle \otimes |\phi^{(0)}\rangle$. Recall that in the ancilla scheme we had: $\sum_i c_i |a_i\rangle \otimes |\phi^{(0)}\rangle \xrightarrow{\hat{U}} \hat{U}\big(\sum_i c_i |a_i\rangle \otimes |\phi^{(0)}\rangle\big) = \sum_i c_i |a_i\rangle \otimes |\phi^{(i)}\rangle$. Given the evolution operator of Eq. 4.17, we have:

$$\begin{aligned}
\sum_i c_i |a_i\rangle \otimes |\phi^{(i)}\rangle &= e^{-\frac{i}{\hbar} g \hat{A} \otimes \hat{P}} \Big(\sum_i c_i |a_i\rangle \otimes |\phi^{(0)}\rangle\Big) \\
&= \sum_i c_i |a_i\rangle \otimes e^{-\frac{i}{\hbar} g a_i \hat{P}} |\phi^{(0)}\rangle \\
&= \sum_i c_i |a_i\rangle \otimes e^{-\frac{i}{\hbar} g a_i \hat{P}} \int dq\, |q\rangle\, \phi_0(q) \\
&= \sum_i c_i |a_i\rangle \otimes \int dq\, |q\rangle\, \phi_0(q - g a_i),
\end{aligned} \tag{4.18}$$

where the $a_i$ denote the eigenvalues of $\hat{A}$ and where in the last step we have used the translation operator$^{4}$. Note that $|\phi^{(i)}\rangle = \int dq\, |q\rangle\, \phi_0(q - g a_i)$. Hence, we can infer that $\phi_i(q) = \phi_0(q - g a_i)$. What this means is that during the pre-measurement the initial pointer’s position is translated by the amount $g a_i$. Perhaps it is more illustrative to write this last result as $|\Psi^{(1)}\rangle = \sum_i c_i |a_i\rangle \otimes |\phi_{q - g a_i}\rangle$. By writing it in this way, we highlight the fact that once the pre-measurement is over, there is a correlation between the translation of the pointer and the state of the system; by reading the pointer’s translation we can determine the system’s state (this is why Von Neumann chose the Hamiltonian of interaction we are discussing). However, recall that it was possible for the “after pre-measurement” pointer’s states to overlap. In the light of the present discussion, what this means is that the translation of the pointer’s states is not always enough to determine with certainty the corresponding state of the system.
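The last step of Eq. 4.18 can be spelled out as a short worked derivation. It is only a sketch, assuming the standard position representation of the momentum operator, $\hat{P} \to -i\hbar\, \partial/\partial q$ (an assumption not stated explicitly in the text), and a wave function $\phi_0$ that is smooth enough to be Taylor expanded:

```latex
\begin{aligned}
\langle q|\, e^{-\frac{i}{\hbar} g a_i \hat{P}} \,|\phi^{(0)}\rangle
  &= e^{-g a_i \frac{\partial}{\partial q}}\, \phi_0(q)
     && \text{(using } \hat{P} \to -i\hbar\, \partial/\partial q\text{)} \\
  &= \sum_{n=0}^{\infty} \frac{(-g a_i)^n}{n!}\, \frac{\partial^n \phi_0}{\partial q^n}(q)
     && \text{(power series of the exponential)} \\
  &= \phi_0(q - g a_i)
     && \text{(Taylor expansion of } \phi_0 \text{ around } q\text{)}.
\end{aligned}
```

Each eigenvalue $a_i$ therefore displaces the pointer’s wave function by its own amount $g a_i$, which is the correlation between pointer position and system state described above.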

3 If the intrinsic or “free” Hamiltonian of ψ is $\hat{H}_s$ and the intrinsic Hamiltonian of φ is $\hat{H}_m$, then the total Hamiltonian of Ψ will be $\hat{H}_\Psi = \hat{H}_s + \hat{H}_m + \hat{H}_{int}$, where $\hat{H}_{int}$ is the Hamiltonian of interaction. For the sake of simplicity we are going to suppose that the intrinsic or free Hamiltonian of the meter vanishes (however, if we do not suppose this we can appeal to the interaction picture and work with time dependent observables [17]).

4 The translation operator is $\hat{T} = e^{-\frac{i}{\hbar} \lambda \hat{P}}$ (in Eq. 4.18, $\lambda = g a_i$). It is a unitary operator that works in the following way: $\hat{T}$ applied to an eigenstate of the position operator $\hat{Q}$ constructs another eigenstate of $\hat{Q}$. In particular, $\langle q|\hat{T}|\psi\rangle = \langle q - \lambda|\psi\rangle = \psi(q - \lambda)$. Thus, this operator translates the wave function by the constant λ, hence the name [23, Chapter 2].

It is illustrative to assign a particular wave function to the pointer’s state (here I am following [18]). Let’s assume that the initial wave function of the pointer in the q representation is a normalized Gaussian centered around zero: $\phi_0(q) = \frac{1}{(2\pi\Delta^2)^{1/4}}\, e^{-\frac{q^2}{4\Delta^2}}$. Let’s assume also that the system is in a single eigenstate $|a_i\rangle$ (not in a superposition). Therefore, after the pre-measurement the pointer’s wave function would be translated in q by the amount $g a_i$: $\phi_i(q) = \frac{1}{(2\pi\Delta^2)^{1/4}}\, e^{-\frac{(q - g a_i)^2}{4\Delta^2}}$. We will come back to this later, in section 4.4.

4.3 Some results

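Recall that the total density matrix after the pre-measurement was given by $\rho_{\Psi^{(1)}} = \sum_{k,j} |\phi^{(k)}\rangle\, \hat{P}_{\psi_k}\, \rho_{\psi^{(0)}}\, \hat{P}_{\psi_j}\, \langle\phi^{(j)}|$ (Eq. 4.9; we have used k instead of i in order to avoid confusion in what follows). We can obtain the meter’s density matrix after the pre-measurement by taking the partial trace of $\rho_{\Psi^{(1)}}$ over the system’s space:

$$\begin{aligned}
\rho_{\phi^{(1)}} = Tr_\psi\, \rho_{\Psi^{(1)}} &= \sum_i \langle a_i|\rho_{\Psi^{(1)}}|a_i\rangle \\
&= \sum_i \langle a_i| \Big(\sum_{k,j} |\phi^{(k)}\rangle\, \hat{P}_{\psi_k}\, \rho_{\psi^{(0)}}\, \hat{P}_{\psi_j}\, \langle\phi^{(j)}|\Big) |a_i\rangle \\
&= \sum_i |\phi^{(i)}\rangle\, \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle\, \langle\phi^{(i)}|.
\end{aligned} \tag{4.19}$$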

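In going from the second line to the third one we used that $\hat{P}_{\psi_j}|a_i\rangle = \delta_{i,j}|a_i\rangle$ and $\langle a_i|\hat{P}_{\psi_k} = \delta_{i,k}\langle a_i|$, and hence the sums over k and over j disappear. Once we have the density matrix, it is straightforward to compute the matrix elements for the meter:

$$\begin{aligned}
\langle\phi_k|\rho_{\phi^{(1)}}|\phi_l\rangle &= \langle\phi_k| \Big(\sum_i |\phi^{(i)}\rangle\, \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle\, \langle\phi^{(i)}|\Big) |\phi_l\rangle \\
&= \sum_i \langle\phi_k|\phi^{(i)}\rangle\, \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle\, \langle\phi^{(i)}|\phi_l\rangle.
\end{aligned} \tag{4.20}$$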

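Notice that, were it not for the fact that some $|\phi_k\rangle$ and $|\phi^{(i)}\rangle$ with $i \neq k$ are not orthogonal, there would have been a perfect correlation between the system’s elements and the meter’s elements: $\langle\phi_k|\rho_{\phi^{(1)}}|\phi_l\rangle = \delta_{kl}\, \langle a_k|\rho_{\psi^{(0)}}|a_k\rangle$. Since $|\phi^{(i)}\rangle = \int dq\, |q\rangle\, \phi_0(q - g a_i)$, we can write Eq. 4.19 in the following way:

$$\begin{aligned}
\rho_{\phi^{(1)}} &= \sum_i \Big(\int dq\, |q\rangle\, \phi_0(q - g a_i)\Big)\, \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle\, \Big(\int dx\, \phi_0^{*}(x - g a_i)\, \langle x|\Big) \\
&= \sum_i \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle \iint dq\, dx\, |q\rangle\, \phi_0(q - g a_i)\, \phi_0^{*}(x - g a_i)\, \langle x|.
\end{aligned} \tag{4.21}$$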

Both x and q are eigenvalues of the position operator $\hat{Q}$. The previous result enables us to calculate the mean value of any function $f(\hat{Q})$ of the pointer's variable q after the pre-measurement [17]. In general we have (see Appendix A for the derivation of the next result):

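$$
\begin{aligned}
\langle f(\hat{Q})\rangle_1 &= \mathrm{Tr}\big[f(\hat{Q})\,\rho_\phi^{(1)}\big] \\
&= \sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle \int dq\, f(q)\, |\phi_0(q-g a_i)|^2 .
\end{aligned}
\tag{4.22}
$$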

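For instance, let us calculate $\langle\hat{Q}\rangle_1$ (the subscript 1 indicates that this mean value is calculated after the pre-measurement). Using Eq. 4.22, the mean value after the pre-measurement is $\langle\hat{Q}\rangle_1 = \sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle \int dq\, q\, |\phi_0(q-g a_i)|^2$. Decomposing the squared modulus of $\phi_0(q-g a_i)$ and moving q appropriately, we have $\sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle \int dq\, \phi_0^*(q-g a_i)\, q\, \phi_0(q-g a_i)$. Performing the change of variable $f = q - g a_i$, the previous result reads $\sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle \int df\, \phi_0^*(f)\,(f+g a_i)\,\phi_0(f)$. Then, we have:

$$
\begin{aligned}
\langle\hat{Q}\rangle_1 &= \sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle \Big(\int df\, \phi_0^*(f)\, f\, \phi_0(f) + \int df\, \phi_0^*(f)\, g a_i\, \phi_0(f)\Big) \\
&= \sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle \big(\langle\hat{f}\rangle_0 + g a_i\big) \\
&= \langle\hat{f}\rangle_0 \sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle + g \sum_i a_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle \\
&= \langle\hat{f}\rangle_0 + g \sum_i \langle a_i|\rho_\psi^{(0)}\hat{A}|a_i\rangle \\
&= \langle\hat{f}\rangle_0 + g \langle\hat{A}\rangle_0 .
\end{aligned}
\tag{4.23}
$$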

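In the fourth line we used that $\sum_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle = 1$, and in the fifth line that $g\sum_i a_i \langle a_i|\rho_\psi^{(0)}|a_i\rangle = g\sum_i \langle a_i|\rho_\psi^{(0)}\hat{A}|a_i\rangle = g\langle\hat{A}\rangle_0$. Let us relabel the position operator $\hat{f}$ as before, namely $\hat{Q}$. Finally, given that we can always set $\langle\hat{Q}\rangle_0 = 0$, we obtain

$$
\langle\hat{Q}\rangle_1 = g\,\langle\hat{A}\rangle_0 . \tag{4.24}
$$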

Thus, after the pre-measurement the mean value of the variable that corresponds to the position of the pointer has been shifted (from 0) by a quantity proportional to the mean value of the observable $\hat{A}$. The importance of this result should be highlighted: within the Von Neumann protocol we can compute the average of any observable of the system (the spin, the position, the energy, etc.) by computing the average value of the observable $\hat{Q}$ of the measuring device after the pre-measurement. This implies that when the system is in the initial state $\sum_i c_i|a_i\rangle$, the position of the pointer will, on average, indicate the mean value of the observable $\hat{A}$ (scaled by g). In other words, the Gaussian corresponding to the pointer's wave function (in the q representation) will be centered on g times the mean value of $\hat{A}$; $\phi_1(q) = N e^{-(q-g\langle\hat{A}\rangle_0)^2/4\Delta^2}$ (N is a normalization constant).

Let us return to some of the previous ideas. If we start off with the system in a superposition of states $|a\rangle = \sum_i c_i|a_i\rangle$, after the pre-measurement the pointer's states will be in a superposition as well. More precisely, the pointer's wave function will have several modes centered at the different eigenvalues of the system [20], and on average the pointer will be ‘visiting’ the average of $\hat{A}$. For example, if a certain value $a_j$ is much more likely, then the pointer's wave function in the q basis will be centered nearer the position that corresponds to this eigenvalue. In the next section I will present an example to clarify these points. We can repeat the last procedure, starting from the change of variable just discussed and keeping in mind that $\langle\hat{Q}\rangle_0 = 0$, to obtain $\langle\hat{Q}^2\rangle_1$. The result is

$$
\langle\hat{Q}^2\rangle_1 = \langle\hat{Q}^2\rangle_0 + g^2 \langle\hat{A}^2\rangle_0 . \tag{4.25}
$$

Finally, given the above results, the variance of $\hat{Q}$ after the pre-measurement is:

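$$
\begin{aligned}
\Delta Q_1^2 &= \langle\hat{Q}^2\rangle_1 - \big(\langle\hat{Q}\rangle_1\big)^2 \\
&= \langle\hat{Q}^2\rangle_0 + g^2 \langle\hat{A}^2\rangle_0 - g^2 \big(\langle\hat{A}\rangle_0\big)^2 \\
&= \Delta Q_0^2 + g^2 \Delta A_0^2 .
\end{aligned}
\tag{4.26}
$$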

The last line appealed to the fact that $\Delta Q_0^2 = \langle\hat{Q}^2\rangle_0 - (\langle\hat{Q}\rangle_0)^2 = \langle\hat{Q}^2\rangle_0$, since $\langle\hat{Q}\rangle_0 = 0$. Hence, the initial variance of the pointer's position observable is increased by $g^2$ times the variance of the observable $\hat{A}$. This makes sense, since we expect the dispersion of any variable related to the meter to increase once the meter and the system have been coupled.
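As a numerical sanity check of Eq. 4.24 and Eq. 4.26, the following minimal Python sketch (not part of the original derivation) integrates the pointer distribution of Eq. 4.22 for a Gaussian pointer and compares the resulting mean and variance with $g\langle\hat{A}\rangle_0$ and $\Delta Q_0^2 + g^2\Delta A_0^2$. The eigenvalues, the populations $\langle a_i|\rho_\psi^{(0)}|a_i\rangle$, and the values of g and ∆ are hypothetical choices made only for illustration, as are the function names.

```python
import math

def gaussian_density(q, center, delta):
    """|phi_0(q - center)|^2 for a Gaussian wave function with spread delta."""
    return math.exp(-(q - center) ** 2 / (2 * delta ** 2)) / math.sqrt(2 * math.pi * delta ** 2)

def pointer_moment(power, weights, eigenvalues, g, delta, n=20001, span=12.0):
    """Riemann-sum estimate of <Q^power>_1 from Eq. 4.22 with f(q) = q**power."""
    lo = g * min(eigenvalues) - span * delta
    hi = g * max(eigenvalues) + span * delta
    dq = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        q = lo + k * dq
        # Mixture of shifted Gaussians weighted by the populations <a_i|rho|a_i>.
        density = sum(w * gaussian_density(q, g * a, delta)
                      for w, a in zip(weights, eigenvalues))
        total += q ** power * density * dq
    return total

# Hypothetical two-level example (eigenvalues in arbitrary units).
eigenvalues = [1.0, -1.0]
weights = [0.7, 0.3]      # assumed populations <a_i|rho|a_i>, summing to 1
g, delta = 0.5, 1.0       # assumed coupling and initial pointer spread

mean_A = sum(w * a for w, a in zip(weights, eigenvalues))
var_A = sum(w * a ** 2 for w, a in zip(weights, eigenvalues)) - mean_A ** 2

mean_Q = pointer_moment(1, weights, eigenvalues, g, delta)
var_Q = pointer_moment(2, weights, eigenvalues, g, delta) - mean_Q ** 2

print(mean_Q, g * mean_A)                   # both sides of Eq. 4.24
print(var_Q, delta ** 2 + g ** 2 * var_A)   # both sides of Eq. 4.26
```

The check only uses the diagonal populations of the system's state, which is all that Eq. 4.22 requires.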

Before we move on to the next section, some remarks are necessary. The Von Neumann protocol is aimed at providing a concrete description of the pre-measurement process. Thus, it accomplishes the task of providing a detailed picture of what happens during the interaction of the meter and the system. However, the second step of the measurement process, which we named ‘reading the meter’, is not covered by the Von Neumann protocol. Once the system and the meter are coupled, we still need to perform a projective measurement on the meter in order to determine the state of the system. From what we studied in the previous sections, we know that the state of the meter after this projective measurement tells us the state of the system. Hence, according to the results obtained within the Von Neumann protocol, we expect that if we perform several measurements with equally prepared systems and meters, the average of the outcomes of the meter will be proportional to the average of the observable we wanted to measure on the system. In the following section we will see how to apply the main ideas of this and the previous sections to an example.

4.4 Example

Let us consider a simple application [20] of the ancilla scheme. Suppose that the observable we want to measure is $\hat{S}_z$, the spin along the z axis of a spin-1/2 particle. Then, following the Von Neumann protocol, our evolution operator can be written $e^{-\frac{i}{\hbar} g \hat{S}_{z\psi}\otimes\hat{P}_\phi}$. Writing our initial system's state in the $\hat{S}_{z\psi}$ basis, we have $|\psi^{(0)}\rangle = \alpha|+\rangle + \beta|-\rangle$ (with $|\alpha|^2 + |\beta|^2 = 1$). The initial state of the total system (particle plus apparatus) will be

$$
|\Psi^{(0)}\rangle = |\psi^{(0)}\rangle \otimes |\phi^{(0)}\rangle = (\alpha|+\rangle + \beta|-\rangle) \otimes |\phi^{(0)}\rangle . \tag{4.27}
$$

If we apply the evolution operator we obtain

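$$
\begin{aligned}
|\Psi^{(0)}\rangle \xrightarrow{\;\hat{U}\;} |\Psi^{(1)}\rangle &= e^{-\frac{i}{\hbar} g \hat{S}_{z\psi}\otimes\hat{P}_\phi} (\alpha|+\rangle + \beta|-\rangle) \otimes |\phi^{(0)}\rangle \\
&= \alpha|+\rangle\, e^{-\frac{i}{\hbar} g \frac{\hbar}{2}\hat{P}_\phi} |\phi^{(0)}\rangle + \beta|-\rangle\, e^{\frac{i}{\hbar} g \frac{\hbar}{2}\hat{P}_\phi} |\phi^{(0)}\rangle \\
&= \alpha|+\rangle \otimes |\phi^{(+)}\rangle + \beta|-\rangle \otimes |\phi^{(-)}\rangle .
\end{aligned}
\tag{4.28}
$$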

We have used $|\phi^{(+)}\rangle$ and $|\phi^{(-)}\rangle$ to denote the states of the apparatus after the pre-measurement. Notice that we cannot write the state of the total system after the pre-measurement as a tensor product of a state of the system alone and a state of the apparatus alone; during the pre-measurement the system and the meter became entangled. According to the Von Neumann protocol, the states $|\phi^{(+)}\rangle$ and $|\phi^{(-)}\rangle$ correspond to a translation of the initial pointer's state in the position representation by the quantities $g\frac{\hbar}{2}$ and $-g\frac{\hbar}{2}$ respectively. Thus, in the position representation we can write the final state of the total system as

$$
|\Psi^{(1)}\rangle = \alpha|+\rangle \otimes \int dq\, |q\rangle\, \phi_0\big(q - g\tfrac{\hbar}{2}\big) + \beta|-\rangle \otimes \int dq\, |q\rangle\, \phi_0\big(q + g\tfrac{\hbar}{2}\big) . \tag{4.29}
$$

Compare this result with Eq. 4.18. If the wave function $\phi_0(q)$ representing the device were a Gaussian initially centered at q = 0, then Eq. 4.29 could be written

$$
|\Psi^{(1)}\rangle = \alpha'|+\rangle \otimes \int dq\, |q\rangle\, e^{-\frac{(q - g\hbar/2)^2}{4\Delta^2}} + \beta'|-\rangle \otimes \int dq\, |q\rangle\, e^{-\frac{(q + g\hbar/2)^2}{4\Delta^2}} , \tag{4.30}
$$

where α′ and β′ have absorbed the constant factors of the Gaussians. Up to this point we have applied the Von Neumann protocol: during the pre-measurement the states of the pointer and the system become coupled, and this coupling in turn can be seen as a superposition of several Gaussians centered at the eigenvalues of the observable $\hat{S}_{z\psi}$ (to be more precise, it is a single wave function with two Gaussian modes, one centered around $g\frac{\hbar}{2}$ and the other around $-g\frac{\hbar}{2}$). If we were in the lab, up to this point we still could not “see” any result; the reading of the apparatus is required in order to actually measure something. Suppose now that we measure the meter and find that the pointer is in the state $|m\rangle$ (where $|m\rangle$ is an eigenstate of $\hat{Q}$). Mathematically, this measurement consists in applying the projector $\hat{\Pi}_\phi = |m\rangle\langle m|$ over the meter's space. Then, Eq. 4.30 gives

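$$
\begin{aligned}
(1_\psi \otimes |m\rangle\langle m|)\, |\Psi^{(1)}\rangle &= \alpha'|+\rangle \otimes |m\rangle\, e^{-\frac{(m - g\hbar/2)^2}{4\Delta^2}} + \beta'|-\rangle \otimes |m\rangle\, e^{-\frac{(m + g\hbar/2)^2}{4\Delta^2}} \\
&= \Big(e^{-\frac{(m - g\hbar/2)^2}{4\Delta^2}} \alpha'|+\rangle + e^{-\frac{(m + g\hbar/2)^2}{4\Delta^2}} \beta'|-\rangle\Big) \otimes |m\rangle .
\end{aligned}
\tag{4.31}
$$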

The first thing to notice is that the system and the device are disentangled after this reading of the pointer. More importantly, notice that the probability amplitudes of obtaining the system states $|+\rangle$ or $|-\rangle$ (to which correspond the coefficients α′ and β′ respectively) clearly depend on the value m yielded by the device. By direct inspection of Eq. 4.31 it can be seen that $e^{-\frac{(m - g\hbar/2)^2}{2\Delta^2}} |\alpha'|^2$ is the probability of obtaining the system in the state $|+\rangle$ and the meter in the state $|m\rangle$, while $e^{-\frac{(m + g\hbar/2)^2}{2\Delta^2}} |\beta'|^2$ is the probability of obtaining the system in the state $|-\rangle$ and the meter in the state $|m\rangle$ (see footnote 5). It should then be clear that the probability of obtaining a certain outcome m (in other words, the probability of “reading” the meter at position m) is given by the probability density function $e^{-\frac{(m - g\hbar/2)^2}{2\Delta^2}} |\alpha'|^2 + e^{-\frac{(m + g\hbar/2)^2}{2\Delta^2}} |\beta'|^2$ (this is the “pointer distribution” for this particular example). I have made the following graph to illustrate the idea:

Figure 4.1: Pointer's distribution for a hypothetical case in which $|\alpha'|^2 > |\beta'|^2$, $\Delta Q = 0.01$ and $g = 2/\hbar$.

As can be seen in the figure, if m is around 1 it is more likely that we will find the system in the state $|+\rangle$, while if m is around −1 the system is likely to be found in the state $|-\rangle$ (remember that the state $|+\rangle$ is associated with the factor $e^{-\frac{(m - g\hbar/2)^2}{2\Delta^2}} |\alpha'|^2$ and $|-\rangle$ with $e^{-\frac{(m + g\hbar/2)^2}{2\Delta^2}} |\beta'|^2$). This simply means that if we, for example, perform several measurements with equally prepared systems and pointers and most of our pointers yield a value of m near 1, we can infer that our systems were in the state $|+\rangle$. Thus, the value yielded by the pointer gives us information about the state of the system.
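The inference described above can be sketched numerically. The following Python snippet is a minimal illustration, assuming units in which ħ = 1 and hypothetical values $|\alpha'|^2 = 0.6$, $|\beta'|^2 = 0.4$ (the thesis only states that $|\alpha'|^2 > |\beta'|^2$); the function names are mine. It evaluates the two terms of the pointer distribution at a given reading m and the fraction of the density that is associated with $|+\rangle$.

```python
import math

def pointer_terms(m, alpha_p_sq, beta_p_sq, g, delta, hbar=1.0):
    """Return the two terms of the pointer distribution at reading m.

    alpha_p_sq and beta_p_sq stand for |alpha'|^2 and |beta'|^2.
    """
    shift = g * hbar / 2
    plus = math.exp(-(m - shift) ** 2 / (2 * delta ** 2)) * alpha_p_sq
    minus = math.exp(-(m + shift) ** 2 / (2 * delta ** 2)) * beta_p_sq
    return plus, minus

def fraction_plus(m, alpha_p_sq, beta_p_sq, g, delta, hbar=1.0):
    """Share of the pointer density at m that comes from the |+> term.

    Returns None when both terms underflow to zero (m far from both peaks).
    """
    plus, minus = pointer_terms(m, alpha_p_sq, beta_p_sq, g, delta, hbar)
    total = plus + minus
    if total == 0.0:
        return None
    return plus / total

# Parameters of Fig. 4.1 (g = 2/hbar, Delta Q = 0.01) with assumed weights.
sharp_right = fraction_plus(1.0, 0.6, 0.4, g=2.0, delta=0.01)
sharp_left = fraction_plus(-1.0, 0.6, 0.4, g=2.0, delta=0.01)
# Same reading with a much broader pointer: the two terms now overlap.
broad_right = fraction_plus(1.0, 0.6, 0.4, g=2.0, delta=10.0)

print(sharp_right, sharp_left, broad_right)
```

With the narrow pointer a reading near +1 or −1 identifies the spin state essentially unambiguously, whereas with the broad pointer the same reading barely changes the odds from the prior weights.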

Footnote 5. Apply $\langle +|\otimes\langle m|$ and you will obtain the amplitude $e^{-\frac{(m - g\hbar/2)^2}{4\Delta^2}} \alpha'$. Then, the probability of obtaining this result is simply $\big|e^{-\frac{(m - g\hbar/2)^2}{4\Delta^2}} \alpha'\big|^2 = e^{-\frac{(m - g\hbar/2)^2}{2\Delta^2}} |\alpha'|^2$.

Moreover, we might ask: what is the probability of finding the pointer at a certain position, for instance at $m = g\frac{\hbar}{2}$, given that initially our system was in the state $|\psi^{(0)}\rangle = \alpha|+\rangle + \beta|-\rangle$ and the pointer in the state $|\phi^{(0)}\rangle$? The probability density function $e^{-\frac{(m - g\hbar/2)^2}{2\Delta^2}} |\alpha'|^2 + e^{-\frac{(m + g\hbar/2)^2}{2\Delta^2}} |\beta'|^2$ provides the answer; we only need to plug $m = g\frac{\hbar}{2}$ into it. Let us suppose that the outcome of the reading of the meter was $m = g\hbar/2$. Then, the probability density becomes $|\alpha'|^2$ (the other term is zero or nearly so, as can be seen from the figure). Moreover, $|\alpha'|^2$ can be written as $|c'_1|^2 |\alpha|^2$, where $|c'_1|^2$ is the probability of obtaining the pointer's state $|m = g\hbar/2\rangle$ given the initial state of the pointer (this term depends on the normalization factor of the Gaussian). Thus, the probability of obtaining the outcome $m = g\hbar/2$ is the product of $|c'_1|^2$ with $|\alpha|^2$, which is nothing but the probability of obtaining the pointer state $|m = g\hbar/2\rangle$ given the initial state of the pointer times the probability of obtaining the eigenvalue $\hbar/2$ for the system. This is precisely what Eq. 4.13 and Eq. 4.14 state (the probability of obtaining a certain outcome $\phi_k$ for the observable of the meter is the product of the probability of obtaining $\phi_k$ given the initial state of the meter, times the probability of obtaining the system's eigenvalue $a_i$ whose eigenstate is associated with the meter's state $|\phi_i\rangle$).

Finally, according to Eq. 4.24, we can compute the mean value of $\hat{Q}$ by calculating the mean value of $\hat{S}_{z\psi}$, or vice versa. For instance, it is straightforward to derive the mean value of $\hat{S}_{z\psi}$: it yields $\frac{\hbar}{2}(|\alpha|^2 - |\beta|^2)$. If $|\alpha|$ is greater than $|\beta|$ this average will be positive, if they are equal the average will be zero, and otherwise it will be negative. From Eq. 4.30 it is clear that the average of $\hat{Q}$ depends only on the values of α′ and β′ (see Fig. 4.1). If $|\alpha'|$ is greater than $|\beta'|$ the pointer's average position will be to the right of the Y axis, if they are equal it will be zero, and it will be negative if $|\alpha'|$ is smaller than $|\beta'|$. The point I want to stress is simply that the average of the pointer's position depends directly on α′ and β′, and these values are in turn directly proportional to α and β. Therefore, $\langle\hat{Q}\rangle$ reproduces $\langle\hat{S}_z\rangle$ up to a constant factor, as we wanted to illustrate (of course, we can be more rigorous and compute the average of $\hat{Q}$ for the pointer directly from Eq. 4.30; if we do so we will obtain a result that depends precisely on β′ and α′). I expect that the above example has served its purpose, namely, the application and illustration of some central ideas of the theory developed in the previous sections.

Chapter 5

Weak Measurements

The main objective of this chapter is to develop the theory of weak measurements by applying the main results of the ancilla measurement scheme discussed in the previous chapter. The chapter is divided into three sections. In the first one, I will present an intuitive approach to weak measurements with the purpose of motivating the chapter. This first section will briefly discuss how the theory of weak measurements was originally presented in the article by Aharonov, Albert and Vaidman [18]. Afterwards, I will present the theory of weak measurements through a more general and careful treatment, following Aharonov and Botero in [19]. In this section we will see that weak measurements give us a better insight into the theory of indirect measurement, not only in the weak regime. Finally, I will end the chapter by considering an example.

5.1 An intuitive approach

In [18] the authors describe the coupling of a meter and a system by means of the interaction Hamiltonian $H = -g(t)\,\hat{Q}\hat{A}$, where $\hat{A}$ is an observable of the system with eigenvalues $a_i$, g(t) is a normalized function with compact support near the time of the measurement and $\hat{Q}$ is the variable corresponding to the meter's position (which is conjugate to the momentum $\hat{P}$). Next, they consider the initial state of the apparatus to be a Gaussian in the p representation. Their initial state is then (the notation is mine) $|\Psi^{(0)}\rangle = |\phi^{(0)}\rangle \otimes \sum_i c_i|a_i\rangle$. They then apply the corresponding evolution operator to it:

$$
e^{-i\int H(t)\,dt}\; |\phi^{(0)}\rangle \otimes \sum_i c_i|a_i\rangle . \tag{5.1}
$$

If the initial wave function of the pointer is a Gaussian centered at zero, $e^{-p^2/4\Delta^2}$, then during the pre-measurement we obtain, for the pointer, a superposition of wave functions translated by the different eigenvalues, $\sum_i e^{-(p-a_i)^2/4\Delta^2}$. Apart from the fact that the authors consider the translation of the pointer's variable in the momentum representation while we considered the translation in the q representation, and that their coupling function is normalized while our coupling g is a number not necessarily equal to one, nothing new has been done. Then the authors state, and this is the important part, that

‘The initial state of the measuring device in the ideal case has to be such that $\hat{P}$ is well defined. After the interaction we can ascertain the value of $\hat{A}$ from the final value of $\hat{P}$: $\hat{A} = \delta\hat{P}$... if the spread of the p distribution $\Delta\hat{P}$ is much smaller than the differences between the $a_i$, then, after the interaction, we shall be left with the mixture of Gaussians located around $a_i$ correlated with different eigenstates of $\hat{A}$’. [18]

What the authors are saying is better understood if we go back to the example of the spin-1/2 particle in Sec. 4.4. In Fig. 4.1 we used a dispersion of $\Delta Q = 0.01$. But what would happen if the dispersion $\Delta\hat{Q}$ of the pointer's states were bigger? The bigger the dispersion, the harder it is going to be for us to determine the state of the system by reading the meter, as Fig. 5.1 and Fig. 5.2 illustrate.

Figure 5.1: Pointer's distribution for $\Delta\hat{Q} = 1$.

The figures illustrate the effect of a big dispersion of the pointer's wave function. The bigger our uncertainty about the pointer's position, the worse the information about the system that we can obtain by reading the meter. In [18] weak measurements were presented by means of a big uncertainty in the initial state of the meter (in the variable $\hat{P}$). The idea was the same one I am explaining in the present section: a big dispersion of the pointer's states makes the determination of the state of the system almost impossible. As I will explain in later sections, this way of approaching weak measurements (big dispersion of the pointer's variable) can be a little misleading, but in spite of this unfortunate fact, this is the way in which most of the literature explains weak measurements (see [18, 20, 24, 25] among others). For obvious reasons I will refer to this way of understanding weak measurements, which emphasizes the big initial dispersion of the pointer, as the “historical approach”.

Figure 5.2: Pointer's distribution for $\Delta\hat{Q} = 10$.

There exists another way of tuning the desired uncertainty besides increasing the initial uncertainty of the pointer: tuning the coupling constant g. A very low value of g with respect to the dispersion yields a great uncertainty about the state of the system (see Fig. 5.4). On the other hand, even if the dispersion is big, a large value of g will allow us to distinguish the states of the system appropriately (Fig. 5.3).

Figure 5.3: Probability density for m, $g = 20/\hbar$.

Hence, the relevant factor for a successful determination of the system's states is, rather than the pointer's dispersion alone, the ratio $g/\Delta\hat{Q}$; a big dispersion can be handled with a big g, as Fig. 5.3 shows [17]. Now, recall that g is a value related to the interaction of the device and the system (it is an effective coupling constant, see Section 4.2). Having this in mind, we can say: the stronger the interaction between the system and the measuring device (the bigger g), the better we can discriminate the state of the system after the measuring process. On the other hand, a very weak coupling (a very small g) will make it nearly impossible to gather any information about the state of the system. Measurements following this kind of weak coupling are called “weak measurements”.

Figure 5.4: Probability density for m, $g = 0.002/\hbar$.

The reader may wonder why one would perform such an imprecise or “useless” measurement. The reason is that the weaker the interaction, the weaker the entanglement, and the weaker the perturbation of the system during the measurement process. In some cases, we would like to be capable of obtaining information about the system while altering it only negligibly. This could sound like a utopian hope within the traditional understanding of the measuring process in Quantum Mechanics, since we have been taught that the very act of measuring affects, collapses or perturbs the system; the system is projected onto the eigenstate corresponding to the observed eigenvalue.
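One way to quantify how well the two pointer states of the spin-1/2 example can be told apart is their overlap $\langle\phi^{(+)}|\phi^{(-)}\rangle$. The Python sketch below is an illustration that is not taken from the thesis: it assumes normalized Gaussian pointer states of spread ∆ centered at $\pm g\hbar/2$, units with ħ = 1, and function names of my own choosing. It integrates the overlap numerically and compares it with the closed form $e^{-(g\hbar)^2/8\Delta^2}$ that follows from completing the square, which depends only on the ratio g/∆.

```python
import math

def pointer_wavefunction(q, center, delta):
    """Normalized Gaussian pointer wave function of spread delta centered at `center`."""
    norm = (2 * math.pi * delta ** 2) ** -0.25
    return norm * math.exp(-(q - center) ** 2 / (4 * delta ** 2))

def pointer_overlap(g, delta, hbar=1.0, n=20001, span=12.0):
    """Riemann-sum estimate of <phi(+)|phi(-)> for pointer shifts of +/- g*hbar/2."""
    shift = g * hbar / 2
    lo = -shift - span * delta
    hi = shift + span * delta
    dq = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        q = lo + k * dq
        total += pointer_wavefunction(q, shift, delta) * pointer_wavefunction(q, -shift, delta) * dq
    return total

def closed_form_overlap(g, delta, hbar=1.0):
    """Gaussian-integral result exp(-(g*hbar)^2 / (8*delta^2))."""
    return math.exp(-(g * hbar) ** 2 / (8 * delta ** 2))

# Hypothetical spread delta = 1 with the three couplings quoted in the figures.
for g in (20.0, 2.0, 0.002):
    print(g, pointer_overlap(g, 1.0), closed_form_overlap(g, 1.0))
```

An overlap close to 0 corresponds to the strong regime, where a reading of the meter identifies the spin state, while an overlap close to 1 corresponds to the weak regime, where the two pointer states are practically indistinguishable.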

However, as we have studied in the previous chapter, not all quantum measurements need to be projective. Indeed, not only is a certain overlap between the pointer's states possible within the ancilla measurement scheme, but it is even possible to have a situation in which the pointer's states overlap almost completely. In such cases, the interaction between the system and the meter is said to be weak. And in such cases, as will be explained soon, we can obtain useful information (in spite of the big spread of the pointer's states or the low value of g) while affecting the system in a negligible way (see footnote 1).

Footnote 1. “Big dispersion” is to be understood by comparing the dispersion with the distance between the eigenvalues. More precisely, a big dispersion in this context is to be understood as $\Delta\hat{Q} \gg |a_j - a_i|$, where $|a_j - a_i|$ is the difference between the maximum and minimum eigenvalues. When the dispersion is this big compared to the difference between the eigenvalues, we cannot determine the state of the system from the density distribution of the pointer.

5.1.1 Definition of the weak value

The way in which the weak value (not to be confused with a weak measurement) was first derived is, broadly, as follows: apply the evolution operator we have been using for the pre-measurement to the initial total state (meter plus system), taking into account that in weak measurements the coupling factor g is very small (recall that setting a very small coupling factor is one of the ways to achieve the weak regime, and the other way is to consider a great dispersion of the pointer's position):

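$$
|\Psi^{(1)}\rangle = \hat{U}|\Psi^{(0)}\rangle = e^{-\frac{i}{\hbar} g \hat{A}_\psi\otimes\hat{P}_\phi}\, |\psi^{(0)}\rangle \otimes |\phi^{(0)}\rangle \approx \Big(1 - \frac{i}{\hbar} g \hat{A}_\psi\otimes\hat{P}_\phi\Big) |\psi^{(0)}\rangle \otimes |\phi^{(0)}\rangle , \tag{5.2}
$$

where we have made a Taylor expansion to first order, keeping in mind that g is very small. We now post-select (see footnote 2) a certain final state $|\psi^{(f)}\rangle$ for the system [17, 20]: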

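$$
\big(|\psi^{(f)}\rangle\langle\psi^{(f)}|\big)\, |\Psi^{(1)}\rangle = \big(|\psi^{(f)}\rangle\langle\psi^{(f)}|\big) \Big(1 - \frac{i}{\hbar} g \hat{A}_\psi\otimes\hat{P}_\phi\Big) |\psi^{(0)}\rangle \otimes |\phi^{(0)}\rangle . \tag{5.3}
$$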

Using simple algebra we can rewrite the above equation as follows:

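$$
\begin{aligned}
\big(|\psi^{(f)}\rangle\langle\psi^{(f)}|\big)\, |\Psi^{(1)}\rangle &= |\psi^{(f)}\rangle \otimes \langle\psi^{(f)}|\psi^{(0)}\rangle \Big(1 - \frac{i}{\hbar}\, g\, \frac{\langle\psi^{(f)}|\hat{A}_\psi|\psi^{(0)}\rangle}{\langle\psi^{(f)}|\psi^{(0)}\rangle}\, \hat{P}_\phi\Big) |\phi^{(0)}\rangle \\
&\approx |\psi^{(f)}\rangle \otimes \langle\psi^{(f)}|\psi^{(0)}\rangle\, e^{-\frac{i}{\hbar} g A_w \hat{P}_\phi}\, |\phi^{(0)}\rangle ,
\end{aligned}
\tag{5.4}
$$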

where we have appealed to the definition of the weak value:

$$
A_w \equiv \frac{\langle\psi^{(f)}|\hat{A}_\psi|\psi^{(0)}\rangle}{\langle\psi^{(f)}|\psi^{(0)}\rangle} . \tag{5.5}
$$

Note that $e^{-\frac{i}{\hbar} g A_w \hat{P}_\phi}\, |\phi^{(0)}\rangle$ (in Eq. 5.4) translates the wave function of the pointer, in the q representation, by the amount $g A_w$. Therefore, the pointer's position will be centered at g times the weak value, i.e., $\langle\hat{Q}_\phi\rangle = g A_w$. Before we move on to a much more delicate treatment of weak measurements, some points deserve to be highlighted. First, note that the weak value depends on $|\psi^{(0)}\rangle$, the initial state of the system, and on $|\psi^{(f)}\rangle$, the final state of the system. This simple point will be crucial in the last chapters of the present work. Second, recall that the previous result was derived under the assumption of small g and big dispersion of the pointer in q. Third, note that the weak value is obtained through a procedure of pre-selection (we prepare an initial state of the system) and post-selection (we select some final state of the system). Fourth, it is clear from the definition of weak values that these values need not be bounded by the spectrum of eigenvalues of the observable; if the initial and final states are nearly orthogonal, then this value can easily surpass the maximum eigenvalue (this was the way in which the authors of [18] obtained a value of 100 for a spin-1/2 particle).

Footnote 2. To better understand what a post-selection is, it is illustrative to review the steps of a measurement without post-selection. Before such a measurement we prepare the system in a certain initial state. Then we make the system and the meter interact as we have studied in the previous chapter. After that we “read” the meter and by doing so we determine the state of the system. Now, when we perform a post-selection we do not read the meter right after it has interacted with the system; instead we select some final state of the system (after the system has interacted with the meter), and then we read the meter keeping in mind the selected final state of the system. For instance, let $\psi_i$ be the initial state of the particles we want to measure. The particles (the system) then interact with the meter. After that, we decide to consider only the particles that ended up in a certain state $\psi_f$. Finally, we read the meter associated with the particles that ended up in this final state (without post-selection we read the meter irrespective of the final state of the system).
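The fourth point can be made concrete with a short numerical sketch of Eq. 5.5. The Python snippet below is only an illustration under stated assumptions: the spin is measured in units of ħ/2 (so the eigenvalues of $\hat{S}_z$ are ±1), the pre-selected state is $\cos(\theta/2)|+\rangle + \sin(\theta/2)|-\rangle$, and the post-selected state is $(|+\rangle - |-\rangle)/\sqrt{2}$. These particular states and the function names are my own choices and are not claimed to be the configuration used in [18].

```python
import math

def weak_value(post, eigenvalues, pre):
    """A_w = <post|A|pre> / <post|pre>, with both states given in the eigenbasis of A."""
    numerator = sum(f.conjugate() * a * c for f, a, c in zip(post, eigenvalues, pre))
    denominator = sum(f.conjugate() * c for f, c in zip(post, pre))
    return numerator / denominator

def pre_state(theta):
    """Assumed pre-selection: cos(theta/2)|+> + sin(theta/2)|->."""
    return [complex(math.cos(theta / 2)), complex(math.sin(theta / 2))]

eigenvalues = [1.0, -1.0]  # S_z in units of hbar/2
post = [complex(1 / math.sqrt(2)), complex(-1 / math.sqrt(2))]

# Pre- and post-selected states far from orthogonal: the weak value is ordinary.
ordinary = weak_value(post, eigenvalues, pre_state(0.0))

# Nearly orthogonal pre- and post-selection: the weak value leaves the spectrum.
eccentric = weak_value(post, eigenvalues, pre_state(math.pi / 2 - 0.02))
overlap_sq = abs(sum(f.conjugate() * c
                     for f, c in zip(post, pre_state(math.pi / 2 - 0.02)))) ** 2

print(ordinary.real, eccentric.real, overlap_sq)
```

For this choice of states the weak value equals $\tan(\theta/2 + \pi/4)$, so it grows without bound as the overlap $\langle\psi^{(f)}|\psi^{(0)}\rangle$ goes to zero, while the probability of a successful post-selection shrinks accordingly.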

In the next section we will generalize and make more precise the conditions for weak measurements. It is important not to forget, however, that weak measurements entail that the states of the pointer overlap significantly, so that it is difficult for us to gain any information about the system, and that, because of the very weakness of the coupling, the system is nearly unchanged during the interaction. Having said this, it should be pointed out that in the next section we are going to take an approach different from the “historical approach”, one that provides a better insight into weak measurements and into indirect measurements in general (not necessarily weak ones).

5.2 Mechanical interpretation of weak values

5.2.1 Playing with pre and post-selected ensembles

Until now we have ignored the fact that an almost negligible dispersion of the observable $\hat{P}$ entails, because of the uncertainty principle, a great dispersion of $\hat{Q}$. Indeed, this is another way of presenting the conditions for a weak measurement; a great uncertainty in $\hat{Q}$ will make the pointer indicate, after a final post-selection, the corresponding weak value. More precisely, the pointer's expectation value of $\hat{Q}$ for a given post-selection $|\psi^{(f)}\rangle$, in the limit in which $\Delta\hat{P}$ tends to zero (negligible dispersion), approaches the weak value of the observable $\hat{A}$ [19]:

$$
\lim_{\Delta\hat{P}\to 0} \langle\hat{Q}\rangle_f = g A_w . \tag{5.6}
$$

As we will now see, giving emphasis to the observable that enters into $\hat{H}_{int}$, rather than to the observable conjugate to it, will help us to reveal some interesting features of weak measurements. From now on we will drop the subindex of the operators, because all we need to keep in mind is that $\hat{A}$ is the observable of the system whereas the momentum $\hat{P}$ and the position $\hat{Q}$ are observables of the meter.

As we have studied in the previous chapter, the interaction of the system with the meter is described by means of an evolution operator $e^{i\hat{P}\hat{A}}$, with $\hat{P}$ the momentum of the meter and $\hat{A}$ the observable we want to measure on the system. Note that from now on, for the sake of simplicity, we are setting the coupling constant g = 1. The total system (system plus device) after the pre-measurement, given the evolution operator just mentioned, is $e^{i\hat{P}\hat{A}}\, |\psi^{(0)}\rangle |\phi^{(0)}\rangle$. If we now post-select the final system's state $|\psi^{(f)}\rangle$, the state of the meter will be:

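$$
|\phi^{(f)}\rangle = \frac{1}{\sqrt{\mathrm{Prob}(\psi_f|\psi_0, \phi_0, M)}}\, \langle\psi^{(f)}|\, e^{i\hat{P}\hat{A}}\, |\psi^{(0)}\rangle\, |\phi^{(0)}\rangle , \tag{5.7}
$$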

where $\mathrm{Prob}(\psi_f|\psi_0, \phi_0, M)$ is the probability that the system ends in the state $|\psi_f\rangle$, given that the system is initially in the state $|\psi_0\rangle$, that the meter is initially in the state $|\phi_0\rangle$, and given the measurement interaction (the pre-measurement), which I have represented as M. For economy, we are going to name this probability Prob(φ). It is clear, from Eq. 5.7, that

$$
\mathrm{Prob}(\phi) = \int \big|\langle\psi^{(f)}|\, e^{i\hat{P}\hat{A}}\, |\psi^{(0)}\rangle\big|^2\, |\phi(p)|^2\, dp , \tag{5.8}
$$

where the integral over the momentum of the pointer is required in order to take the probability of the final state of the system regardless of the final state of the meter. In the position representation (applying $\int dq\, |q\rangle\langle q|$) Eq. 5.7 yields

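$$
|\phi^{(f)}\rangle = \frac{1}{\sqrt{\mathrm{Prob}(\phi)}} \int dq\, \langle\psi^{(f)}|\, e^{i\hat{P}\hat{A}}\, |\psi^{(0)}\rangle\, \phi_0(q)\, |q\rangle . \tag{5.9}
$$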

Then, the pointer's distribution is $\mathrm{Prob}(q|\Psi^{(f)}) = |\phi_f(q)|^2$, where $\Psi^{(f)}$ is the final state of the meter given the initial total system and the measurement interaction. Suppose now that the post-selection is done by a complete measurement of a non-degenerate observable $\hat{F}$ of the system, with eigenstates $\{|f\rangle\}$. Then, the pointer's distribution for the total system $\Psi^{(1)}$ after the pre-measurement but without post-selection (recall that the upper index 1 in $\Psi^{(1)}$ denotes times after the pre-measurement, whereas f denotes times after the post-selection) can be written as

$$
\mathrm{Prob}(q|\Psi^{(1)}) = \sum_f \mathrm{Prob}(\phi_f)\, \mathrm{Prob}(q|\Psi^{(f)}) , \tag{5.10}
$$

where $\mathrm{Prob}(\phi_f)$ is $\mathrm{Prob}(\psi^f|\psi^0, \phi^0, M)$ [19]. This arises from the law of total probability, $\mathrm{Prob}(a|c) = \sum_b \mathrm{Prob}(b|c)\,\mathrm{Prob}(a|b,c)$. In our case, the probability of obtaining a value q for the pointer, given the initial total state and the measurement interaction ($\mathrm{Prob}(q|\Psi^{(1)})$), is the probability of obtaining q for a particular final state of the system ($\mathrm{Prob}(q|\Psi^{(f)})$, which is given by $|\phi_f(q)|^2$), times the corresponding probability of obtaining that final state ($\mathrm{Prob}(\psi^f|\psi^0, \phi^0, M)$), summed over all the possible final states. By exactly the same reasoning that led us to Eq. 5.10, we can write the expectation value of q for the total system before the post-selection (but after the measurement interaction) in terms of the expectation values for the different final states of the system, that is,

$$
\langle\hat{Q}\rangle_1 = \sum_f \mathrm{Prob}(\phi_f)\, \langle\hat{Q}\rangle_f , \tag{5.11}
$$

where $\langle\hat{Q}\rangle_f$ is the expectation value of q calculated after the post-selection. If we consider Eq. 5.6, we can write Eq. 5.11 as

$$
\langle\hat{Q}\rangle_1 = \sum_f \mathrm{Prob}(\phi_f)\, g A_{w(f)} . \tag{5.12}
$$

This is an interesting result, since it explicitly shows that the pointer's distribution after the measurement interaction and without a post-selection can be regarded as a mixture of distributions, each one centered at the weak value $g A_{w(f)}$ and weighted by $\mathrm{Prob}(\phi_f)$, the probability of obtaining the final state $|\psi^{(f)}\rangle$ given the initial total state and the measurement interaction [19]. Moreover, if we set $\langle\hat{Q}\rangle_0 = 0$, recall from the Von Neumann protocol (Eq. 4.24) that the expectation value of the observable $\hat{Q}$ after the pre-measurement always yields g times the expectation value of the observable $\hat{A}$ of the system. Hence, because of Eq. 5.12, we have $\langle\hat{A}\rangle = \sum_f \mathrm{Prob}(\phi_f)\, A_{w(f)}$.

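Indeed, notice the following result, which makes sense given the above considerations: $\langle\psi^{(0)}|\hat{A}|\psi^{(0)}\rangle = \sum_f \langle\psi^{(0)}|\psi^{(f)}\rangle \langle\psi^{(f)}|\hat{A}|\psi^{(0)}\rangle$, where we have inserted the identity. Now multiply each term by $\frac{\langle\psi^{(f)}|\psi^{(0)}\rangle}{\langle\psi^{(f)}|\psi^{(0)}\rangle}$. You finally obtain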

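$$
\langle\hat{A}\rangle = \sum_f \big|\langle\psi^{(f)}|\psi^{(0)}\rangle\big|^2\, \frac{\langle\psi^{(f)}|\hat{A}|\psi^{(0)}\rangle}{\langle\psi^{(f)}|\psi^{(0)}\rangle} = \sum_f \big|\langle\psi^{(f)}|\psi^{(0)}\rangle\big|^2\, A_{w(f)} . \tag{5.13}
$$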

This last result, together with the previous discussion around Eq. 5.12, entails that we can interpret the expectation value of an observable $\hat{A}$ for an initial state $|\psi^{(0)}\rangle$ as an average of weak values instead of as an average of eigenvalues [19]. The reader might think that, given that the weak values are not restricted to lie within the range of eigenvalues, it is strange that Eq. 5.13 yields the expectation value of $\hat{A}$. If we had a very eccentric weak value, would that value not shift the distribution of the pointer away from the expectation value of $\hat{A}$? (The expectation value of $\hat{A}$ must, of course, lie within the range of eigenvalues.) The answer to this concern lies in the term $|\langle\psi^{(f)}|\psi^{(0)}\rangle|^2$, which weights the corresponding weak value in Eq. 5.13; in order for the weak value to be eccentric, it must emerge from nearly orthogonal states, but then $|\langle\psi^{(f)}|\psi^{(0)}\rangle|^2$ will be very small and that weak value will contribute negligibly to the pointer's distribution. As Aharonov and Botero affirm, “The rule of thumb, “eccentric weak values are unlikely”, therefore captures succinctly the mechanism by which standard expectation values can be viewed as averages of weak values” [19].
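Eq. 5.13 and the rule of thumb can be checked numerically. The following Python sketch is a minimal illustration under assumptions of my own: the observable is $\hat{S}_z$ in units of ħ/2, the pre-selected state is $\cos(\theta/2)|+\rangle + \sin(\theta/2)|-\rangle$ with θ = 1.5, and the post-selection is a complete measurement in the basis $(|+\rangle \pm |-\rangle)/\sqrt{2}$. It sums the weak values weighted by $|\langle\psi^{(f)}|\psi^{(0)}\rangle|^2$ and compares the result with the ordinary expectation value.

```python
import math

def inner(bra, ket):
    """<bra|ket> for states given as lists of complex amplitudes."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def weak_value(post, eigenvalues, pre):
    """A_w = <post|A|pre> / <post|pre>, with states given in the eigenbasis of A."""
    a_pre = [a * c for a, c in zip(eigenvalues, pre)]
    return inner(post, a_pre) / inner(post, pre)

eigenvalues = [1.0, -1.0]                                 # S_z in units of hbar/2
theta = 1.5                                               # assumed pre-selection angle
pre = [complex(math.cos(theta / 2)), complex(math.sin(theta / 2))]

s = 1 / math.sqrt(2)
post_basis = [[complex(s), complex(s)],                   # (|+> + |->)/sqrt(2)
              [complex(s), complex(-s)]]                  # (|+> - |->)/sqrt(2)

weights = [abs(inner(f, pre)) ** 2 for f in post_basis]   # |<psi_f|psi_0>|^2
weak_values = [weak_value(f, eigenvalues, pre) for f in post_basis]

average_of_weak_values = sum(w * aw for w, aw in zip(weights, weak_values))
expectation = sum(a * abs(c) ** 2 for a, c in zip(eigenvalues, pre))

for w, aw in zip(weights, weak_values):
    print(w, aw.real)
print(average_of_weak_values.real, expectation)
```

In this example the second post-selected state is nearly orthogonal to the pre-selected one, so its weak value lies far outside the interval [−1, 1], but its weight is correspondingly small and the weighted sum still reproduces the expectation value.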

5.2.2 The action-reaction picture

Weak values impose a restriction on the observable $\hat{P}$; the momentum of the measuring device should be sharply defined (and hence the dispersion of the position is considerable). What this implies is that if we were to measure the meter's momentum, we would obtain a well defined outcome. In other words, the evolution operator $e^{i\hat{P}\hat{A}}$, applied to a total state $|\psi^{(0)}\rangle|\phi^{(0)}\rangle$, would simply yield $e^{i\hat{P}\hat{A}}\, |\psi^{(0)}\rangle|\phi^{(0)}\rangle = e^{ip\hat{A}}\, |\psi^{(0)}\rangle|\phi^{(0)}\rangle$, with p being the momentum that corresponds to the pointer's state. This simple fact suggests that we can treat the evolution operator $e^{i\hat{P}\hat{A}}$ as an operator that depends on the parameter p, with p being a definite value of the meter's momentum. In what follows we will explore some interesting implications of the operator $e^{ip\hat{A}}$, implications that derive naturally from the very conditions that weak measurements impose.

I will write again the state of the meter after the post-selection, following Eq. 5.7, but now keeping in mind that the evolution operator is parametrized by p, $e^{ip\hat{A}}$. The state of the meter is then:

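$$
|\phi^{(f)}\rangle = \frac{1}{\sqrt{\mathrm{Prob}(\psi_f|\psi_0, p, M)}}\, \langle\psi^{(f)}|\, e^{ip\hat{A}}\, |\psi^{(0)}\rangle\, |\phi^{(0)}\rangle , \tag{5.14}
$$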

where $\mathrm{Prob}(\psi_f|\psi_0, p, M)$ is the transition probability of going from $e^{ip\hat{A}}|\psi^{(0)}\rangle$ to $|\psi^{(f)}\rangle$, that is, of obtaining the final state of the system $|\psi^{(f)}\rangle$ given the initial total state, the measurement interaction and the parameter p (given that the meter's momentum is well defined around a certain value p). For the sake of simplicity, we will denote this probability as $\mathrm{Prob}(\phi)_p$ (see footnote 3) to remind ourselves that it depends on p. It is very important not to confuse $\mathrm{Prob}(\phi)_p$ with Prob(φ). Although they are very similar, $\mathrm{Prob}(\phi)_p$ takes into account a specified parameter p for the momentum of the meter, whereas Prob(φ) does not specify any particular momentum (the evolution operator for Prob(φ) has the observable $\hat{P}$ instead of the parameter p).

From now on we are going to pay attention to the term $\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle$ of Eq. 5.14. Let us perform the polar decomposition⁴ of this expression:

$$\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle = |\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle|\, e^{iS_{12}(p)} = \sqrt{\mathrm{Prob}(\phi)_p}\, e^{iS_{12}(p)}, \tag{5.15}$$

³ The reason why I do not use the traditional notation for probability is that, in this case, it would be somewhat confusing because of the two consecutive parentheses: $\mathrm{Prob}(\phi)(p)$.
⁴ The polar decomposition of a complex number $z = a + ib$ is $z = |z|e^{i\theta}$, where $\theta$ is the argument of $z$, i.e., the angle that $z$ makes with the real axis ($\tan\theta = b/a$).

where $S_{12}(p)$ is the phase (the argument) of $\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle$ [19]. Notice two points. First, given these last results, the state of the meter after the post-selection can be written as $|\phi^{(f)}\rangle = e^{iS_{12}(p)}|\phi^{(0)}\rangle$. This means that the phase $S_{12}(p)$ generates a certain unitary transformation of the initial state of the apparatus [19]. Recall that this phase stems from the system itself (it depends on the initial and final states of the system), so the system, via the phase factor, induces a unitary transformation of the device. Second, because of $e^{ip\hat{A}}$, the system undergoes a unitary transformation that takes it from $|\psi^{(0)}\rangle$ to $e^{ip\hat{A}}|\psi^{(0)}\rangle = |\psi^{(p)}\rangle$. This transformation is mediated by the parameter $p$, so the meter, via the parameter $p$, generates a unitary transformation of the system. We will study both points in more detail, but it is already important to notice that the weak measurement involves an interplay between the system and the meter in which the system affects the meter (via the phase factor) while the meter affects the system (via the parameter $p$). This is the essence of the back-reaction scheme presented in [19].

We now direct our attention to the first point, the effect of the phase factor on the meter. As I said, $S_{12}(p)$ induces a unitary transformation of the device, parametrized by $p$. We can easily study this effect by working in the Heisenberg picture:

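$$\begin{aligned}
\hat{Q}_1 &= \hat{U}^{\dagger}\hat{Q}_0\hat{U} = e^{iS_{12}(\hat{p})}\,\hat{Q}_0\, e^{-iS_{12}(\hat{p})} \\
&= (1 + iS_{12}(\hat{p}))\,\hat{Q}_0\,(1 - iS_{12}(\hat{p})) + \dots = \hat{Q}_0 - i\hat{Q}_0 S_{12}(\hat{p}) + iS_{12}(\hat{p})\hat{Q}_0 + \dots \\
&= \hat{Q}_0 - i[\hat{Q}_0, S_{12}(\hat{p})] = \hat{Q}_0 + S'_{12}(\hat{p}).
\end{aligned} \tag{5.16}$$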

The previous result is not an approximation, since the remaining terms cancel out: they involve commutators of $S_{12}(\hat{p})$ with $S'_{12}(\hat{p})$⁵. Let us see what $S'_{12}(\hat{p})$ is. From Eq. 5.15, we can solve for $S_{12}(\hat{p})$:

$$S_{12}(\hat{p}) = -i\,\mathrm{Ln}\!\left(\frac{\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle}{\sqrt{\mathrm{Prob}(\phi)_p}}\right) + 2\pi n, \tag{5.17}$$

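where $n \in \mathbb{Z}$. Then, $S'_{12}(\hat{p})$ is:

$$S'_{12}(\hat{p}) = \frac{d}{dp}\left(-i\,\mathrm{Ln}\,\frac{\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle}{\sqrt{\mathrm{Prob}(\phi)_p}}\right) = \mathrm{Re}\,\frac{\langle\psi^{(f)}|\hat{A}e^{ip\hat{A}}|\psi^{(0)}\rangle}{\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle}. \tag{5.18}$$

Note then that $S'_{12}(p)$ is a weak value: it is the weak value of the observable $\hat{A}$ for the system in the initial state $|\psi^{(p)}\rangle$ (recall that $e^{ip\hat{A}}|\psi^{(0)}\rangle = |\psi^{(p)}\rangle$) and final state $|\psi^{(f)}\rangle$.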

⁵ It does no harm to treat $\hat{p}$ as an operator in deriving the above result; indeed, the approximation consists in treating it as a constant parameter $p$.

In short,

$$S'_{12}(\hat{p}) = \mathrm{Re}\,\frac{\langle\psi^{(f)}|\hat{A}|\psi^{(p)}\rangle}{\langle\psi^{(f)}|\psi^{(p)}\rangle} = A_w(p), \tag{5.19}$$

where $A_w(p)$ stands for the weak value of the observable $\hat{A}$ given the initial state $|\psi^{(p)}\rangle$ and the final state $|\psi^{(f)}\rangle$ (note also that what interests us is only the real part). Therefore, Eq. 5.16 can be written

$$\hat{Q}_1 = \hat{Q}_0 + S'_{12}(\hat{p}) = \hat{Q}_0 + A_w(p). \tag{5.20}$$

In words, during a weak measurement the pointer variable $\hat{Q}$ is shifted by the weak value $A_w(p)$, which in turn is determined by the initial state of the system, after it has undergone a unitary transformation parametrized by the momentum $p$ of the pointer, and by the final post-selected state $|\psi^{(f)}\rangle$. Indeed, this is, up to a phase, the same as the weak value for the initial state $|\psi^{(0)}\rangle$ and the final state $|\psi^{(f)}\rangle$. This is a more elegant way of seeing two things: first, that during a weak interaction the system changes negligibly (it only undergoes a unitary transformation) and second, that the pointer distribution of the meter indicates the weak value.
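The identification of $S'_{12}(p)$ with the real part of a weak value (Eqs. 5.18 and 5.19) can be verified numerically. The sketch below is only an illustration under stated assumptions: it takes a spin-1/2 system with $\hat{A} = \sigma_z$ and a particular pair of initial and final states (both chosen here for convenience), computes $S_{12}(p)$ as the argument of $\langle\psi^{(f)}|e^{ip\hat{A}}|\psi^{(0)}\rangle$, and compares a finite-difference derivative of this phase with $\mathrm{Re}\,A_w(p)$. The function names are hypothetical.

```python
import numpy as np

# Illustrative choices (assumptions of this sketch): A = sigma_z and two fixed states.
A = np.diag([1.0, -1.0])
psi0 = np.array([1.0, -1.0]) / np.sqrt(2)
psi_f = np.array([np.cos(np.pi / 8), np.sin(np.pi / 8)])

def rotated_state(p):
    """|psi(p)> = exp(i p A)|psi0>, built from the eigen-decomposition of A."""
    evals, evecs = np.linalg.eigh(A)
    return evecs @ (np.exp(1j * p * evals) * (evecs.conj().T @ psi0))

def phase_S12(p):
    """S12(p): the argument of <psi_f| exp(i p A) |psi0> (Eq. 5.15)."""
    return np.angle(np.vdot(psi_f, rotated_state(p)))

def weak_value(p):
    """A_w(p) = <psi_f|A|psi(p)> / <psi_f|psi(p)> (Eq. 5.19 before taking Re)."""
    psi_p = rotated_state(p)
    return np.vdot(psi_f, A @ psi_p) / np.vdot(psi_f, psi_p)

p, h = 0.05, 1e-6
numeric = (phase_S12(p + h) - phase_S12(p - h)) / (2 * h)   # central difference for dS12/dp
analytic = weak_value(p).real
print(numeric, analytic)
```

The two printed numbers should agree up to the accuracy of the finite difference; the value of `p` is kept small so that the phase does not cross a branch cut of `np.angle`.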

5.2.3 General pointer variable statistics

We can write the state of the meter after the pre-measurement in the p representation. From Eq. 5.7 we have

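$$|\phi^{(f)}\rangle = \frac{1}{\sqrt{\mathrm{Prob}(\phi)}} \int dp\, \langle\psi^{(f)}|e^{i\hat{P}\hat{A}}|\psi^{(0)}\rangle\, \phi(p)\, |p\rangle. \tag{5.21}$$

If we now apply the results of the previous section, in particular Eq. 5.15, keeping in mind the domain of very well defined $\hat{P}$, i.e., $\Delta\hat{P} \to 0$, then we can rewrite Eq. 5.21 as:

$$|\phi^{(f)}\rangle = \int dp\, \sqrt{\frac{\mathrm{Prob}(\phi)_p}{\mathrm{Prob}(\phi)}}\, e^{iS_{12}(p)}\, \phi(p)\, |p\rangle. \tag{5.22}$$

In the $q$ representation we therefore have:

$$\phi(q) = \int dp\, \sqrt{\frac{\mathrm{Prob}(\phi)_p}{\mathrm{Prob}(\phi)}}\, e^{iS_{12}(p)}\, \phi(p)\, \frac{e^{ipq}}{\sqrt{2\pi}} = \int dp\, \sqrt{\frac{\mathrm{Prob}(\phi)_p}{\mathrm{Prob}(\phi)}}\, \phi(p)\, \frac{e^{i(pq + S_{12}(p))}}{\sqrt{2\pi}}. \tag{5.23}$$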

Following Aharonov and Botero in [19], we are going to assume that $p$ is restricted to lie within a finite range around the value $p_i$ (the motivation being that the momentum of the pointer is very well defined). Hence, we can model $\phi(p)$ as a window function: $\phi(p) = \phi(p_i) = 1/\sqrt{\epsilon}$ for $|p - p_i| < \epsilon/2$, and $\phi(p) = 0$ for $|p - p_i| \geq \epsilon/2$ [19]. Now, the inverse Fourier transform of the window function $\phi(p_i)$ is

$$\bar{F}(\phi(p_i)) = W_{p_i,\epsilon}(q) = \sqrt{\frac{2}{\pi\epsilon}}\, \frac{\sin(\epsilon q/2)}{q}\, e^{iqp_i}. \tag{5.24}$$

If $\epsilon$ is small enough so that $S_{12}(p)$ and $\mathrm{Prob}(\phi)_p$ are nearly constant within the interval $|p - p_i| < \epsilon/2$, then we can replace $\mathrm{Prob}(\phi)_p$ by $\mathrm{Prob}(\phi)_{p_i}$ and approximate $\mathrm{Prob}(\phi)$ by $\mathrm{Prob}(\phi)_{p_i}$ [19]. Under these conditions we obtain

$$\phi(q) = \int dp\, \frac{1}{\sqrt{\epsilon}}\, \frac{e^{i(pq + S_{12}(p))}}{\sqrt{2\pi}}. \tag{5.25}$$

Finally, we can expand the term $pq + S_{12}(p)$ around $p_i$:

$$pq + S_{12}(p) = p_i q + S_{12}(p_i) + (q + S'_{12}(p_i))(p - p_i), \tag{5.26}$$

neglecting terms of second and higher order in $(p - p_i)$. Using simple algebra we can write Eq. 5.25 as

$$\phi(q) = \frac{1}{\sqrt{2\pi}}\, e^{i(S_{12}(p_i) - p_i S'_{12}(p_i))} \int dp\, \frac{1}{\sqrt{\epsilon}}\, e^{ip(S'_{12}(p_i) + q)}. \tag{5.27}$$

Hence, we can finally perform the above integral, keeping in mind that the inverse Fourier transform of the window function is given by Eq. 5.24. The result is

$$\phi(q)_f \approx e^{i(S_{12}(p_i) - p_i S'_{12}(p_i))}\, W_{p_i,\epsilon}(q + S'_{12}(p_i)). \tag{5.28}$$

Recall that we started the previous computations by modeling the pointer’s wave function in $p$ as a window function centered at $p_i$, whose $q$ representation is $W_{p_i,\epsilon}(q)$ (Eq. 5.24). Eq. 5.28 then shows that the pointer’s wave function, after the pre-measurement and after a certain post-selection of the system, is the same initial wave function translated by the amount $S'_{12}(p_i)$ (which in turn is the weak value for an initial state $|\psi^{(p_i)}\rangle$ and a final state $|\psi^{(f)}\rangle$, see Eq. 5.19) and multiplied by the phase factor $e^{i(S_{12}(p_i) - p_i S'_{12}(p_i))}$. This approximation becomes exact in the limit $\epsilon \to 0$ [19]. In the next section we are going to comment on what that limit represents.
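The step from Eq. 5.25 to Eq. 5.28 can also be checked numerically. The following sketch makes several assumptions that are not in [19]: the phase $S_{12}(p)$ is replaced by a hypothetical quadratic function `S`, and the values of `eps` and `p_i` are arbitrary. It integrates Eq. 5.25 directly over the window $|p - p_i| < \epsilon/2$ and compares the result with the closed form of Eq. 5.28, using Eq. 5.24 for $W_{p_i,\epsilon}$.

```python
import numpy as np

eps, p_i = 0.1, 0.7          # hypothetical window width and center
s0, s1, b = 0.3, 2.0, 1.0    # hypothetical phase S(p) = s0 + s1*p + b*p**2

def S(p):
    return s0 + s1 * p + b * p**2

def S_prime(p):
    return s1 + 2 * b * p

def W(x):
    """Eq. 5.24 written with np.sinc, which is regular at x = 0."""
    return np.sqrt(eps / (2 * np.pi)) * np.sinc(eps * x / (2 * np.pi)) * np.exp(1j * x * p_i)

q = np.linspace(-40.0, 40.0, 801)

# Eq. 5.25: midpoint-rule integration over the window |p - p_i| < eps/2.
n = 2000
dp = eps / n
p = p_i - eps / 2 + (np.arange(n) + 0.5) * dp
integrand = np.exp(1j * (np.outer(q, p) + S(p))) / np.sqrt(2 * np.pi * eps)
phi_numeric = integrand.sum(axis=1) * dp

# Eq. 5.28: phase factor times the translated window transform.
phi_closed = np.exp(1j * (S(p_i) - p_i * S_prime(p_i))) * W(q + S_prime(p_i))

max_error = np.max(np.abs(phi_numeric - phi_closed))
peak_q = q[np.argmax(np.abs(phi_numeric))]
print(max_error, peak_q, -S_prime(p_i))
```

With the sign conventions of Eq. 5.28, the modulus of the pointer’s wave function is centered where the argument $q + S'_{12}(p_i)$ vanishes. The residual `max_error` comes from the second-order term neglected in Eq. 5.26 and shrinks as `eps` is reduced.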

We can obtain a more general result if we model the initial wave function of the device by a superposition of “window functions” instead of a single window function [19]. By doing this we are “dropping” the condition that the pointer’s dispersion in momentum is negligible, which means that we are moving out of the conditions for a weak measurement, i.e., we are in the general domain of measurements of arbitrary strength. Now, modeling the initial pointer’s wave function (which we assume to be real and smooth) as a superposition of infinitesimally wide window functions, we obtain:

$$\phi(p) = \lim_{\epsilon \to 0} \sum_{i=-\infty}^{\infty} \sqrt{\epsilon}\, \phi(p_i)\, W_{p_i,\epsilon}(p), \tag{5.29}$$

[19], where $W_{p_i,\epsilon}(p)$ denotes the window function in $p$ centered at $p_i$. If we insert this superposition of window functions into Eq. 5.23, we obtain:

$$\begin{aligned}
\phi(q)_f &= \lim_{\epsilon \to 0} \sum_{i=-\infty}^{\infty} \int dp\, \sqrt{\frac{\mathrm{Prob}(\phi)_{p_i}}{\mathrm{Prob}(\phi)}}\, \left(\sqrt{\epsilon}\, \phi(p_i)\, W_{p_i,\epsilon}(p)\right) \frac{e^{i(pq + S_{12}(p))}}{\sqrt{2\pi}} \\
&= \lim_{\epsilon \to 0} \sum_{i=-\infty}^{\infty} \sqrt{\epsilon}\, \sqrt{\frac{\mathrm{Prob}(\phi)_{p_i}}{\mathrm{Prob}(\phi)}}\, \phi(p_i) \int dp\, W_{p_i,\epsilon}(p)\, \frac{e^{i(pq + S_{12}(p))}}{\sqrt{2\pi}},
\end{aligned} \tag{5.30}$$

where we have taken $\mathrm{Prob}(\phi)_{p_i}$ and $\mathrm{Prob}(\phi)$ out of the integral since they are nearly constant when $\epsilon$ goes to zero. Finally, what remains is exactly the same as Eq. 5.25, i.e., the inverse Fourier transform of the window function, which we already know how to compute. Hence, we can use Eq. 5.28 to finally write:

$$\phi(q)_f = \lim_{\epsilon \to 0} \sum_{i=-\infty}^{\infty} \sqrt{\epsilon}\, \sqrt{\frac{\mathrm{Prob}(\phi)_{p_i}}{\mathrm{Prob}(\phi)}}\, \phi(p_i)\, e^{i(S_{12}(p_i) - p_i S'_{12}(p_i))}\, W_{p_i,\epsilon}(q + S'_{12}(p_i)), \tag{5.31}$$

[19]. Compare this with the initial pointer’s wave function, Eq. 5.29. Note that each component of the initial wave function is translated by the amount $S'_{12}(p_i)$, which is the weak value of $\hat{A}$ for the rotated state of the system $|\psi^{(p_i)}\rangle$ and the final state $|\psi^{(f)}\rangle$. In addition, note that the initial weight of the wave function, $\phi(p_i)$, is now replaced by the weight $\sqrt{\mathrm{Prob}(\phi)_{p_i}/\mathrm{Prob}(\phi)}\;\phi(p_i)$ [19]. It is important to remember what the different terms represent: $\mathrm{Prob}(\phi)_{p_i}$ is the probability of obtaining the final state of the system given the initial total state and given a certain measurement interaction in which the pointer generates a unitary transformation mediated by the parameter $p_i$. $\mathrm{Prob}(\phi)$ is exactly the same, but taking the measurement interaction without demanding a certain momentum $p_i$ for the meter.

We can better grasp the above result as follows. First, by Bayes’ theorem, we have that

$$\mathrm{Prob}(p|\Psi^{(f)}) = \frac{\mathrm{Prob}(p)\,\mathrm{Prob}(\Psi^{(f)}|p)}{\mathrm{Prob}(\Psi^{(f)})}. \tag{5.32}$$

Now, $\mathrm{Prob}(p) = \sum_{\phi} \mathrm{Prob}(p|\phi)\mathrm{Prob}(\phi) = \mathrm{Prob}(p|\phi)$, since we have only one initial state $\phi$ for the pointer (inside this sum only, $\mathrm{Prob}(\phi)$ denotes the probability that the pointer is prepared in the state $\phi$, which equals 1). $\mathrm{Prob}(\Psi^{(f)}|p)$ is the transition probability for the system going from its initial state to the final state, given a certain parameter $p$ for the pointer; from what we have said before, it is clear that $\mathrm{Prob}(\Psi^{(f)}|p)$ equals $\mathrm{Prob}(\phi)_p$. Finally, $\mathrm{Prob}(\Psi^{(f)})$ is simply the probability that the system goes from its initial state to its final state, given only the initial total state, so $\mathrm{Prob}(\Psi^{(f)}) = \mathrm{Prob}(\phi)$. Collecting these ideas, we have:

$$\mathrm{Prob}(p|\Psi^{(f)}) = \frac{\mathrm{Prob}(p|\phi)\,\mathrm{Prob}(\phi)_p}{\mathrm{Prob}(\phi)}. \tag{5.33}$$

Note that $\mathrm{Prob}(p|\Psi^{(f)})$ is the probability distribution of the pointer for strong measurements, given the initial and final states of the system. From Eq. 5.33 we can evaluate the weight $\sqrt{\mathrm{Prob}(\phi)_{p_i}/\mathrm{Prob}(\phi)}\;\phi(p_i)$ [19].

In this section we have seen that the final wave function of the pointer can be regarded as a superposition of weak measurements around the different values $p_i$ (see Eq. 5.31), with each weak value weighted by the factor $\sqrt{\mathrm{Prob}(\phi)_{p_i}/\mathrm{Prob}(\phi)}\;\phi(p_i)$. Recall also that we started with a pointer’s wave function that did not necessarily satisfy the conditions for a weak measurement, i.e., $\Delta\hat{P}$ did not tend to zero for the initial function, which was modeled by a superposition of window functions. Even in that general case, we can still write the pointer’s wave function after the measurement interaction and after the post-selection as a superposition of weak measurements. It can then be argued that indirect measurements of arbitrary strength can be better understood within the scheme of weak measurements, and this shows that weak measurements can provide great insight into the measurement process.

From this discussion, we can see that a more general condition for a weak measurement is that $\mathrm{Prob}(p|\Psi^{(f)})$ is sharp (well defined) around a certain value $p_i$ [19]. In that case, as we have studied, the pointer’s wave function is going to be shifted by the amount $S'_{12}(p_i)$, which is a weak value. Another way to see this is to imagine that $\mathrm{Prob}(p|\Psi^{(f)})$ were not sharply defined around a certain value $p_i$ but were instead a probability distribution encompassing several values $p_1 \dots p_i$; then the pointer’s wave function would be affected by an interplay of weak values and we would not be able to determine a well-defined shift of the pointer (this situation would be that of a strong measurement).

5.2.4 Comments

Recall that our expression for the final state of the pointer, when we modeled its initial wave function in $p$ as a single window function, was more precise the smaller $\epsilon$ was. In the limit in which $\epsilon$ goes to zero (the window function looks like a Dirac delta function), Eq. 5.28 becomes exact. Note that in this limit the state of the device tends to an eigenstate of $\hat{P}$, the eigenstate to which we associate the eigenvalue $p_i$.

The sharper the pointer’s wave function around $p_i$, the better we can approximate the evolution operator by one in which we replace $\hat{P}$ by $p_i$: $e^{ip_i\hat{A}}$.

Remember one of the points stressed when we discussed the action-reaction picture: the system undergoes a unitary transformation that takes it from $|\psi^{(0)}\rangle$ to $e^{ip_i\hat{A}}|\psi^{(0)}\rangle = |\psi^{(p_i)}\rangle$.

Compare this to a case in which a quantum system is subjected to an external constant magnetic field. In this case the evolution operator is $e^{i\gamma\hat{S}B}$, where $\gamma$ is a constant and $\hat{S} = \sigma/2$. Note that $e^{i\gamma\hat{S}B}$ explicitly treats the external magnetic field $B$ as a constant, as a number, despite the fact that this quantity is a quantum observable of the external device. We often find situations in which we treat the evolution operator of a system as unitary even though one of the parameters that enters into the Hamiltonian is a quantum physical variable and not a simple number. The reason we can take those parameters as classical, as numbers and not as operators, is that the uncertainty of the corresponding external physical quantity is negligible [19]. In the last example, we know the external magnetic field $B$ with great precision. Analogously, when we appeal to $|\psi^{(p_i)}\rangle$, we are treating the momentum of the external device as if it were classical, as if it were known with considerable precision, i.e., as if our uncertainty about $p_i$ were infinitesimal.

To illustrate the idea, imagine that we had a great uncertainty with respect to $p_i$, a situation that would correspond to a strong measurement in our scheme. Then the system would undergo a very uncertain transformation, with the uncertainty stemming from the uncertainty in $p_i$. This is a nice way of noting that, when interacting with the meter, the system changes negligibly in a weak measurement whereas it is abruptly changed in a strong one; i.e., our knowledge of the system during the measurement interaction is considerable in a weak measurement and diminishes to a considerable extent in a strong one.

It is because of the previous considerations that the authors of [19] refer to the evolution operator $e^{ip\hat{A}}$ as the generator of an infinitesimally uncertain unitary transformation on the system (parametrized by $p$). We arrive then at what the authors call the “mechanical interpretation of weak values”: “The essence of a weak measurement is thus to approach, as close as possible, the ideal conditions of an infinitesimally uncertain transformation” [19]. As we will see in the next chapter, this manner of understanding weak measurements will be important in answering some objections against the physical interpretation of weak values.

The action-reaction picture also helped us illustrate the idea that weak values are sharp physical properties in the sense that they correspond to well-defined shifts of the pointer. Therefore, weak values are measurable quantities, just as “strong values”, i.e., eigenvalues, are. We will come back to this idea in the next chapter. The fact that the meter is shifted by the weak value is something that we already studied in the historical approach. However, the fact that we can regard the system as undergoing a unitary transformation, and all the implications that this idea leads to, is something widely ignored in the literature and is a key concept in the mechanical interpretation of weak values presented here. Another important remark concerns the generality of weak measurements. We did not only see that the expectation value of any observable of the system can be operationally defined as an average of weak values (see Eq. 5.13), but we also saw that indirect measurements of arbitrary strength can be represented and understood by means of weak measurements alone. These two ideas suggest that weak measurements play a fundamental role in the understanding of the measurement process.

Within the framework discussed in this section, a more general and fruitful understanding of the weak measurement procedure has been obtained. Contrary to what most of the literature tends to suggest, a great dispersion of the pointer variable $\hat{Q}$ does not guarantee that we achieve a weak measurement when the system and the meter interact. This is because the conjugate variable of the pointer, in this case $\hat{P}$, could also present a great dispersion, and if the meter has a great dispersion in $\hat{P}$, then the condition for a weak measurement is not met, since the system will not undergo an infinitesimally uncertain unitary transformation (we will not be justified in using $e^{ip\hat{A}}$ as the evolution operator for the system because $p$ will not be well defined). The criterion discussed in this section is therefore more general than the criterion usually mentioned in the literature (where what is emphasized is the great uncertainty of the pointer [17, 18, 25, 26]). If the measurement approaches the conditions of an infinitesimally uncertain transformation, then we can say that the system will remain almost unchanged during the interaction with the meter and we can guarantee that the meter, in turn, will get shifted by the weak value of $\hat{A}$. The condition of “approaching an infinitesimally uncertain unitary transformation” is then a necessary and sufficient condition, while the condition of “not being capable of discriminating between the different eigenvalues” is only a sufficient condition (and the condition of “great dispersion in $\hat{Q}$” is neither necessary nor sufficient). Indeed, it is a consequence of the “approaching an infinitesimally uncertain unitary transformation” condition that we cannot discriminate between the eigenvalues in a weak measurement procedure. Notice that this whole framework started by paying attention to the fact that, by requiring a negligible uncertainty in $\hat{P}$, we obtain a great dispersion in $\hat{Q}$.
Surprisingly, this simple fact, despite the powerful insight that it provides, has been widely ignored in the literature.

5.3 Example 2

We are going to end this section by presenting a simple example that illustrates the idea that during a weak measurement the pointer is rigidly shifted by a weak value. Let us go back to the spin-1/2 particle example. We will now assume particular values for $\alpha$ and $\beta$. Let us suppose that the initial state of the system is $|\psi^{(0)}\rangle = \frac{1}{\sqrt{2}}(|+\rangle - |-\rangle)$.

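We will measure $\hat{S}_z$ again. Thus, we can repeat the steps of the first example up to the following point:

$$|\Psi^{(1)}\rangle = \frac{1}{\sqrt{2}}|+\rangle \otimes \int dq\, |q\rangle\, \phi_0\!\left(q - g\frac{\hbar}{2}\right) - \frac{1}{\sqrt{2}}|-\rangle \otimes \int dq\, |q\rangle\, \phi_0\!\left(q + g\frac{\hbar}{2}\right). \tag{5.34}$$

Now suppose that before we measure the meter, we select the spins that are aligned in a certain direction. For instance, suppose that we want to post-select the particles according to the final state $|\psi^{(f)}\rangle = \frac{1}{2}\sqrt{2+\sqrt{2}}\,|+\rangle + \frac{1}{2}\sqrt{2-\sqrt{2}}\,|-\rangle$ [20]. Then Eq. 5.34 becomes:

$$|\Psi^{(1)}\rangle = \frac{1}{2\sqrt{2}}\sqrt{2+\sqrt{2}} \int dq\, |q\rangle\, \phi_0\!\left(q - g\frac{\hbar}{2}\right) - \frac{1}{2\sqrt{2}}\sqrt{2-\sqrt{2}} \int dq\, |q\rangle\, \phi_0\!\left(q + g\frac{\hbar}{2}\right). \tag{5.35}$$

Now we can continue as before (in the first example), measuring the meter by applying the projector $|m\rangle\langle m|$ onto the meter’s space. We then obtain:

$$N\left(e^{-\frac{(m - g\hbar/2)^2}{4\Delta^2}}\, \frac{1}{2\sqrt{2}}\sqrt{2+\sqrt{2}} - e^{-\frac{(m + g\hbar/2)^2}{4\Delta^2}}\, \frac{1}{2\sqrt{2}}\sqrt{2-\sqrt{2}}\right)|m\rangle, \tag{5.36}$$

where $N$ is a normalization constant. The pointer’s distribution after the post-selection is clearly

$$N^2\left(e^{-\frac{(m - g\hbar/2)^2}{4\Delta^2}}\, \frac{1}{2\sqrt{2}}\sqrt{2+\sqrt{2}} - e^{-\frac{(m + g\hbar/2)^2}{4\Delta^2}}\, \frac{1}{2\sqrt{2}}\sqrt{2-\sqrt{2}}\right)^2. \tag{5.37}$$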

Figure 5.5 is a plot of the pointer’s distribution given by Eq. 5.37 for the case of a weak measurement; I controlled the weakness by means of a large uncertainty (large with respect to the distance between $g$ times the eigenvalues). Note that instead of the two peaks of the first example, the pointer’s distribution consists of a single peak and, what is more interesting, note where the peak is centered: it is not centered around 1 or −1, the two possible eigenvalues of $\hat{S}_z$ (recall that we have been using $g = \hbar/2$). Perhaps the strangest fact is that it is not centered at any value between the two possible eigenvalues but slightly to the right of the maximum eigenvalue of $\hat{S}_z$. So, by choosing an appropriate set of initial and final states it is possible to obtain a weak value far outside the range of eigenvalues (this is how Aharonov, Albert and Vaidman achieved the very eccentric weak value of 100 for a spin-1/2 particle [18]). In our example, the peak of the Gaussian is centered around the weak value of $\hat{S}_z$ determined by the particular initial and final states that were chosen:

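$$S_{zw} \equiv \mathrm{Re}\,\frac{\langle\psi^{(f)}|\hat{S}_z|\psi^{(0)}\rangle}{\langle\psi^{(f)}|\psi^{(0)}\rangle} = \frac{\left(\frac{1}{2}\sqrt{2+\sqrt{2}}\,\langle +| + \frac{1}{2}\sqrt{2-\sqrt{2}}\,\langle -|\right)\hat{S}_z\,\frac{1}{\sqrt{2}}\left(|+\rangle - |-\rangle\right)}{\left(\frac{1}{2}\sqrt{2+\sqrt{2}}\,\langle +| + \frac{1}{2}\sqrt{2-\sqrt{2}}\,\langle -|\right)\frac{1}{\sqrt{2}}\left(|+\rangle - |-\rangle\right)} = \frac{\hbar}{2}\left(1+\sqrt{2}\right). \tag{5.38}$$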

Figure 5.5: Pointer’s distribution for the pre- and post-selected states of example 2. The dispersion is $\Delta Q = 20$, which corresponds to the weak regime (it is much larger than the difference between the eigenvalues). I have drawn a line through the peak of the distribution so that the reader can easily see that this value is approximately equal to the weak value computed with Eq. 5.38.

Therefore, it should not be a surprise that the point where the peak is centered is precisely $(1+\sqrt{2})$, the weak value given by Eq. 5.38. This example is illustrative because it explicitly shows how the indirect measurement scheme and weak measurement theory are related. To say the same thing in another way, this example explicitly shows that weak measurement theory fits nicely within the indirect measurement formalism.
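The numbers of this example can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions, not the script used for the figures: it works in units where the pointer shifts $g\hbar/2$ equal 1 (so that $\hat{S}_z$ has eigenvalues $\pm 1$ on the pointer scale), takes the pre-selected state with the relative minus sign that appears in Eq. 5.34, and evaluates the unnormalized distribution of Eq. 5.37 on a grid. The ABL probabilities are computed for an ideal strong measurement of a nondegenerate observable. All variable and function names are hypothetical.

```python
import numpy as np

# Assumed units: g*hbar/2 = 1, so S_z is represented by diag(+1, -1).
Sz = np.diag([1.0, -1.0])
psi0 = np.array([1.0, -1.0]) / np.sqrt(2)   # pre-selection, sign as in Eq. 5.34
psi_f = np.array([np.sqrt(2 + np.sqrt(2)), np.sqrt(2 - np.sqrt(2))]) / 2   # post-selection

# Weak value of S_z (Eq. 5.38, in units of hbar/2).
weak_value = (np.vdot(psi_f, Sz @ psi0) / np.vdot(psi_f, psi0)).real

def pointer_distribution(m, delta):
    """Unnormalized Eq. 5.37 for a Gaussian pointer of width delta."""
    amplitude = (psi_f[0] * psi0[0] * np.exp(-(m - 1) ** 2 / (4 * delta**2))
                 + psi_f[1] * psi0[1] * np.exp(-(m + 1) ** 2 / (4 * delta**2)))
    return amplitude**2

m = np.linspace(-100.0, 100.0, 200001)
weak_peak = m[np.argmax(pointer_distribution(m, delta=20.0))]    # weak regime
strong_peak = m[np.argmax(pointer_distribution(m, delta=0.1))]   # strong regime

# ABL probabilities of the outcomes +1 and -1 in an ideal strong measurement.
joint = np.abs(psi_f * psi0) ** 2
abl = joint / joint.sum()

print(weak_value, weak_peak, strong_peak, abl)
```

For a finite dispersion the peak of the weak distribution lies close to, but not exactly at, $1+\sqrt{2}$; the agreement improves as the dispersion grows. In the strong regime the highest peak sits at the eigenvalue 1, and the ABL probabilities come out close to 0.85 and 0.15.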

Now, the reader might ask: “why is it a big deal that the pointer’s distribution is centered around a value that does not correspond to the eigenvalues? The pointer’s distribution, he might continue, is of course changed because we have imposed some particular restrictions related to the final post-selection. Is this not what should have happened if we had performed a strong measurement of $\hat{S}_z$ in the intermediate time between the pre-selection and the post-selection?” The criticism of this hypothetical reader can be answered as follows. If we had performed a strong measurement of $\hat{S}_z$, it is true that the pointer’s distribution for an “only pre-selected” ensemble would have been different from the pointer’s distribution for a “pre- and post-selected” ensemble, in the sense that the probabilities of obtaining a certain eigenvalue of $\hat{S}_z$ would have depended on both the initial and the final selections. In fact, in that case we would have computed the pointer’s distribution by using the ABL rule (this rule is explained in detail in the last chapter). However, there is a very important difference between the two cases (strong and weak): the pointer’s distribution, in the case of a strong measurement of $\hat{S}_z$ between the pre- and post-selection, will consist of two sharp peaks centered around the two possible eigenvalues of $\hat{S}_z$, just as happened in example one of the previous chapter. The difference between a strong measurement with post-selection and one without post-selection has to do only with the height of the peaks: without post-selection the two peaks are of the same height given the initial state $\frac{1}{\sqrt{2}}(|+\rangle - |-\rangle)$, while in the post-selected case the peak corresponding to the eigenvalue 1 is much higher than the peak corresponding to the eigenvalue −1. This arises naturally from the fact that the amplitude of $|+\rangle$ in the post-selected state is larger than the amplitude of $|-\rangle$. So, to sum up, the point at which the pointer’s distribution is centered in the weak case is far from trivial; a strong measurement on a pre- and post-selected ensemble will never imitate this behavior. In the latter case, the probabilities of measuring 1 or −1 are sensitive to the final state (as can be calculated by the ABL rule⁶), but no new value is revealed (the two peaks will still be centered around the same points). See figure 5.6. Note also that the probability of obtaining the weak value of Eq. 5.38 is considerably smaller than the probability of obtaining one of the eigenvalues (this can easily be seen from the height of the peaks in the strong limit compared to the height of the peak in the weak one).

Figure 5.6: Pointer’s distribution for the observable $\hat{S}_z$ with the pre- and post-selected states of example 2. I have set $\Delta Q = 0.1$, which corresponds to the strong regime. Note that we have sharp peaks, centered around the eigenvalues 1 and −1. The right peak is considerably higher, as we expect from the ABL rule applied to this particular example (see footnote 6).

⁶ Using the ABL rule, the probability of obtaining the eigenvalue 1 given the initial and final conditions and assuming an ideal strong measurement is 0.85. For −1 it is of course 0.15, so that the sum equals 1. So, it is much more probable to obtain 1 given the final and initial states, in agreement with figure 5.6.

5.4 Final Remarks of the chapter

In the present chapter we have studied the theory of weak measurements. We initially presented the way in which these kinds of measurements have been traditionally explained. In particular, we saw how weak measurements require that the states of the pointer overlap considerably, which entails that we cannot discriminate between the different eigenvalues of the system. We also saw that the expectation value of the pointer indicates the weak value. In the second section, we approached weak measurements from another perspective, paying particular attention to the observable $\hat{P}$ of the pointer rather than to $\hat{Q}$. We then examined how the interaction of the meter and the system could be conceived within a picture of action-reaction, in which the system undergoes an infinitesimally uncertain unitary transformation because of the meter, while the meter undergoes a rigid shift, proportional to the weak value. This picture provided a more general condition than the “great dispersion condition” for the implementation of a weak measurement, namely, that the measurement approaches the ideal conditions for an infinitesimally uncertain unitary transformation [19]. We also studied how indirect measurements, not necessarily weak, could be described by weak measurements (as a sampling over weak values, to be more precise). In the next chapter we are going to discuss the interpretation of weak values by concentrating on a special feature that we have overlooked so far: weak values depend on both the pre-selection and the post-selection, that is, weak values depend on what we choose to measure in the post-selection. How this dependence on the future state of the system should be understood will be the main concern of the two final chapters.

Part III

Backward causation revealed in weak measurements

Chapter 6

On the physical meaning of weak values

In the last chapter the theory of weak measurements was presented. The reader might have noticed that the whole theory was developed within the formalism of standard quantum mechanics. In that sense, weak measurements and weak values cannot be questioned on mathematical or formal grounds since, as I said, nothing new is added to the rules and postulates of standard quantum mechanics. The fact that nobody before Aharonov and collaborators in 1988 [18] investigated what the rules of quantum mechanics entail in a regime of weak interactions should not make the reader think that this is a new theory or a questionable extension of the already established theory of quantum mechanics; weak values are not the result of a novel mathematical development¹. Moreover, questioning the mathematics is problematic given that a large body of experimental evidence agrees with the predictions (see [26, 28, 29] for example). Having said this, I want to stress that the aim of the present chapter is to discuss the physical interpretation of a weak value.

How should a weak value be understood? Is it a property of the system in the same sense that eigenvalues are? Let us begin by noting that weak values are measurable quantities; more precisely, they correspond to a rigid shift of the pointer’s wave function. If the wave functions of the pointer were Gaussians, then these Gaussians would be translated from their initial position before the interaction with the system to their final position after the measurement interaction plus post-selection. If we go to the lab and perform a weak measurement experiment, we will see that the distribution of the pointer is clearly centered on the weak value corresponding to the initial and final states of the system. So weak values are measurable, and the empirical evidence for them coincides with the theoretical predictions. If any objections are to be raised against weak values, they cannot be aimed at their testability.

¹ I am not saying that mathematical questions cannot be raised. In fact, the original paper of Aharonov and collaborators was criticized on mathematical grounds [25], [24]. However, several works with a more careful treatment of some mathematical aspects, among which I include [17], have provided solid mathematical grounds for the weak values. See also [19] and [27].

In what follows I will briefly discuss two possible objections against the idea that weak values are physical properties. After doing that I will discuss an example.

6.1 Two possible objections

6.1.1 Against the postulates of quantum mechanics?

Someone might argue that the fact that the weak value does not necessarily coincide with an eigenvalue is certainly strange. It is strange because, for one thing, it is a postulate of quantum mechanics that the “allowed” outcomes of a measurement are the eigenvalues of the observable measured. In addition, it is strange because these values are obtained via a measurement that does not disturb the system (or disturbs it only negligibly), which is, again, problematic under the basic postulates of quantum mechanics.

The reply to the previous objections is simple: the alleged postulates are limited to projective (strong) measurements and, as we have already studied, weak measurements arise in the context of a more general theory of measurement, namely, indirect measurements². Furthermore, we are not going against the basic postulates of quantum mechanics because we are not gathering information about each particle without disturbing its state (that would certainly be prohibited by the postulates). Rather, we obtain information from a large ensemble (the weaker the measurement, the larger the ensemble).

6.1.2 Weak values can be complex or very eccentric

The first objection is that weak values can be complex, whereas it is assumed that the properties of quantum systems are real. It is true that weak values can be, in general, complex. However, this does not undermine their status as “properties”, since it has been explained that complex weak values can be understood as shifts of the pointer’s conjugate variable [30], [31] (the variable conjugate to the pointer’s observable that enters into the interaction Hamiltonian). So, complex weak values do not require too much interpretative effort.

² Is it an immovable condition that all quantum measurements need to be performed in the strong limit? If this is so, it seems more like an ad hoc restriction than a justified one.

On the other hand, some have cast doubt on weak values (see [32]), arguing that they can yield very strange values that are difficult to account for within the traditional quantum mechanical scheme. I will consider an example. When weak measurements are applied to Hardy’s paradox, they yield a negative number of particles (in a certain arm we have “−1” electrons, for example [33]). To simplify considerably, Hardy’s paradox entails, according to quantum mechanics, a contradictory claim: an electron is both annihilated and detected later (as if it had not been annihilated). I think that criticizing weak values for yielding a negative number of particles is questionable for the following reason: a negative number of particles is no more problematic than the paradox itself.

Indeed, I think that this paradox (which has already been experimentally verified, see [28]) should have cast serious doubt on our traditional understanding of quantum mechanics, and should then have motivated a revision of its standard interpretation. My point is that we should not make such a big deal of the weak values obtained in the context of Hardy’s paradox when the paradox itself is already so surprising. Indeed, a negative number of particles is an interesting solution to the paradox, as argued in [33]. Why do the weak values obtained in the context of Hardy’s paradox raise suspicion when the paradox itself does not?

The other line of defense against the criticism of eccentric weak values is provided by Aharonov and Botero in [19]. As long as weak values are understood within the context of uncertain infinitesimal transformations, no surprise should arise from “eccentric” weak values: “...[eccentric weak values] are unique quantum-mechanical effects associated with the role of the observable as a generator of infinitesimal transformations. One would hardly suspect that such effects could indeed be possible given the physical interpretations that we have traditionally attached to the eigenvalues of a quantum-mechanical observable.” [19]. What the authors are saying is that the back-reaction on the system that results from the interaction with the meter can sometimes induce unique transformations on the system. In this sense, the mechanical interpretation of weak values discussed in the previous chapter can account for the origin of these eccentric weak values; as long as we regard the system as undergoing an infinitesimal transformation generated by the observable $\hat{A}$, we can hardly take these eccentric weak values as problematic.

Figure 6.1: An experiment with two consecutive Stern-Gerlach devices. The first one measures $\hat{S}_z$, deviating the particles along the z axis, while the second device measures $\hat{S}_x$, and deviates the particles along the x axis. At the end we place a screen. Depending on the final position on the screen, we can determine the results of the two measurements.

6.2 Two Stern-Gerlach Experiment

The following experiment will not only help us to address the interpretative questions we are considering but will also serve to highlight some very peculiar features of weak values. Suppose that we have two consecutive Stern-Gerlach devices (see Fig. 6.1), one measuring $\hat{\sigma}_z$ and the other measuring $\hat{\sigma}_x$ (the time $t_z$ at which the particles interact with the first device is earlier than the time $t_x$ at which they interact with the second). We orient the devices so that the spin-1/2 particles are deflected along the z axis by the first device ($\hat{\sigma}_z$) and along the x axis by the second.

All the particles that exit the first device enter the second one despite the deflection generated by the first device (this is similar to the setup proposed in the first discussions of weak values, in [18]). At the end we place a screen that registers the point of arrival of each particle (the point of arrival consists of two coordinates, (x, z)). Suppose also that we shoot the particles one at a time so that we can individuate them: “particle number 3 arrived at position R1”, “particle number 7 arrived at position R3”, and so on. Finally, suppose that the particles are all prepared in the initial state $|\psi^{(0)}\rangle = \frac{1}{2}|+\rangle - \frac{\sqrt{3}}{2}|-\rangle$. In addition, we have the screen connected to a computer that can create a table of data with the number of each particle and its position.

What do we expect? In the case of two consecutive strong measurements the prediction is very simple: the first device deflects the particles along the z axis so that, if it were only for this device, we would obtain, after N particles, two spots on the screen, corresponding to the eigenvalues $1_z$ and $-1_z$ of $\hat{\sigma}_z$. Since the particles enter the second device after coming out of the first, we will obtain a further deflection, this time along the x axis, according to the values $1_x$ and $-1_x$ of $\hat{\sigma}_x$. Four spots will then be formed on the screen: one will correspond to particles that yielded $1_z$ and $1_x$, another to particles that yielded $1_z$ and $-1_x$, and so on. Thus, by looking at the final position of each particle we can determine the value of $\hat{\sigma}_z$ (and the corresponding state) once the particle has passed through the first device, and the value of $\hat{\sigma}_x$ when it interacted with the second device (see Fig. 6.2).

Figure 6.2: Screen after N shots. We see four spots because the first device deflects the particles along z and the second one along x.

If we order the computer to print the list of data of the x position alone, and we plot it, we expect two sharp Gaussian peaks centered on the eigenvalues $1_x$ and $-1_x$ of $\hat{\sigma}_x$. We can do the same for the list of the data along z; again two peaks will be formed, centered on the eigenvalues of $\hat{\sigma}_z$. The heights of the z peaks will not be equal, because the initial state is biased towards $|-\rangle_z$, as can be seen from $|\psi^{(0)}\rangle$. Indeed, the peak corresponding to the eigenvalue $-1_z$ should be higher (three times higher). The heights of the peaks for $\hat{\sigma}_x$, however, should be equal (the probability of obtaining $1_x$ or $-1_x$ is the same for a state $|+\rangle_z$ or a state $|-\rangle_z$, which are the only possible states after the first interaction).
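The peak heights just described follow from the Born rule, and can be checked with a short sketch. It assumes real initial amplitudes 1/2 and −√3/2 in the z basis, chosen here because they reproduce the three-to-one ratio quoted above; the function names are illustrative only.

```python
import math

# Assumed amplitudes in the z basis (illustrative; chosen to give the 3:1 ratio).
A_PLUS, A_MINUS = 0.5, -math.sqrt(3) / 2

def spot_probabilities(a_plus, a_minus):
    """Born-rule probability of each (z, x) spot for strong sigma_z then strong sigma_x.

    The amplitudes are assumed to be real numbers.
    """
    p_z = {+1: a_plus**2, -1: a_minus**2}
    # After the first device the spin is |+>_z or |->_z; either one gives
    # x = +1 or x = -1 with probability 1/2.
    return {(z, x): p_z[z] * 0.5 for z in (+1, -1) for x in (+1, -1)}

def marginal(spots, axis):
    """Total probability per outcome along one axis (0 for z, 1 for x)."""
    heights = {+1: 0.0, -1: 0.0}
    for outcome, p in spots.items():
        heights[outcome[axis]] += p
    return heights

spots = spot_probabilities(A_PLUS, A_MINUS)
z_heights = marginal(spots, 0)
x_heights = marginal(spots, 1)
print("z peaks:", z_heights)
print("x peaks:", x_heights)
```

The four entries of `spots` are the relative intensities of the four spots on the screen, and the two marginals are the relative heights of the peaks in the Z and X plots.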

Figure 6.3: This is how the screen would look if we performed a weak measurement of $\hat{S}_z$. The separation of the values along x is clear, but not the separation of the values along z.

Finally, imagine that we print the lists, one for X and another for Z. Suppose that Veronica takes the X list and I take the Z list. Now, imagine that Veronica marks in her list all the particles that ended “up”, i.e., that yielded the outcome $1_x$, to which the state $|+\rangle_x$ corresponds, and that she makes a list of those particles. The list will be something like “particle 1, particle 4, particle 9, particle 13, particle 21” and so on. Afterwards, she sends me that list (assume that I do not know why she chose those particles, that is, I do not know that those are the particles that yielded $1_x$). What happens if I then plot my Z data for only the particles on the list that Veronica sent me? I will obtain two sharp peaks, one centered on each of the eigenvalues of $\hat{\sigma}_z$, as it should be, since my list only has information about $\hat{\sigma}_z$. The peaks will be lower than those I obtain when I plot all my data (not filtered by Veronica’s list), because now we are considering a smaller number of particles. From the two peaks I obtain, I will not be able to guess, for example, that the particles on Veronica’s list are those that ended in the state $|+\rangle_x$; as we know from standard quantum mechanics, from these two peaks alone I cannot infer the final result for an observable $\hat{\sigma}_x$ that does not commute with the one I measured, namely, $\hat{\sigma}_z$.
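The filtering of the Z list by Veronica’s list in the strong case can be imitated with a small simulation. This is only a sketch under stated assumptions: the probability 3/4 for the outcome $-1_z$ is the value implied by the three-to-one ratio of the peaks, the x outcome is taken to be 50/50 whatever the z outcome, and all names are illustrative.

```python
import random

def simulate_strong_run(n_particles, p_z_minus=0.75, seed=0):
    """Return (z_list, x_list): outcome per particle number for two strong measurements."""
    rng = random.Random(seed)
    z_list, x_list = {}, {}
    for number in range(1, n_particles + 1):
        z_list[number] = -1 if rng.random() < p_z_minus else +1
        # After a strong sigma_z measurement, sigma_x gives +1 or -1 with probability 1/2.
        x_list[number] = +1 if rng.random() < 0.5 else -1
    return z_list, x_list

def fraction_z_minus(z_list, numbers):
    """Fraction of the given particle numbers whose z outcome was -1."""
    numbers = list(numbers)
    return sum(1 for n in numbers if z_list[n] == -1) / len(numbers)

z_list, x_list = simulate_strong_run(20000)
# Veronica's list: the particle numbers that yielded +1 along x.
veronica = [n for n, x in x_list.items() if x == +1]
all_frac = fraction_z_minus(z_list, z_list)
filtered_frac = fraction_z_minus(z_list, veronica)
print(f"fraction of -1_z, all particles:   {all_frac:.3f}")
print(f"fraction of -1_z, Veronica's list: {filtered_frac:.3f}")
```

Within statistical fluctuations the two fractions agree, so in the strong case the filtered Z data carry no trace of the criterion that Veronica used.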

6.2.1 Predictions in the weak regime

Let us go weak. If the interaction of the first device with the particles were within the weak regime (so that, for example, the distance between the spots for $1_z$ and $-1_z$ were much smaller than the dispersion of the Z spots on the screen), then we would expect to see only two spots on the screen, separated along the X axis (see Fig. 6.3); one spot will correspond to $1_x$ and the other to $-1_x$ (the spots deflected along z, corresponding to $1_z$ and $-1_z$, will merge into a single one³). In this case we could learn, from the position on the screen, what the outcome of each particle along X was, but not along Z. The deflection in Z is smaller than the dispersion, so the error will not allow us to determine this outcome for each particle; if we could determine it, then it would not be a weak regime after all!
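How little a single reading tells us in the weak regime can be quantified with a simple sketch. It assumes, for illustration only, a spin prepared in an eigenstate of $\hat{\sigma}_z$ and a Gaussian pointer with dispersion `delta`, and asks how often one reading falls on the wrong side of zero. The function name is hypothetical.

```python
import math

def misread_probability(delta, eigenvalue=1.0):
    """Chance that one Gaussian pointer reading has the wrong sign.

    Assumes the reading is normally distributed around `eigenvalue`
    with standard deviation `delta`.
    """
    # The sign is wrong when the reading falls on the other side of zero.
    return 0.5 * math.erfc(abs(eigenvalue) / (delta * math.sqrt(2)))

for delta in (0.1, 1.0, 10.0, 100.0):
    print(f"dispersion {delta:6.1f}: misread probability {misread_probability(delta):.4f}")
```

As the dispersion grows relative to the separation of the eigenvalues, the misread probability approaches one half, so a single reading is no better than a coin toss at telling $1_z$ from $-1_z$.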

If we now plot the Z data (irrespective of $\hat{\sigma}_x$), then we will obtain a single Gaussian centered at the average of $\hat{\sigma}_z$, as the theory predicts. But if we plot the Z data depending on the final state in $\hat{\sigma}_x$, one plot of Z for the final state $-1_x$ and another for $1_x$, then we will obtain two Gaussians, one for each case, centered at the weak values corresponding to the initial state and the final post-selection (for the post-selection $1_x$ the Gaussian will be centered at 1, and for the post-selection $-1_x$ the Gaussian will be centered at −1).

³ Perhaps the spots are slightly wider than in the case of a single Stern-Gerlach experiment, because the first interaction will slightly move the particles apart along the Z axis.

So far nothing special has been done; we have been following what the formalism dictates. But perhaps by treating the formalism from a purely mathematical perspective we have come to ignore an intriguing fact that we already discussed briefly in section 5.3, namely, that the information from the first, weak measurement interaction is somehow sensitive, in a non-trivial way, to the information about the final state. It is not simply that our no-post-selection Gaussian of $\hat{\sigma}_z$, centered on the mean value, becomes smaller or taller, but that it is translated to a new value. It is important to be clear that this new value, be it an eigenvalue or not, is still a value of $\hat{\sigma}_z$, because the translation of the pointer variable indicates the values of the observable that enters into the interaction Hamiltonian, according to the indirect measurement formalism. So this data, obtained from a weak measurement and corresponding to values of $\hat{\sigma}_z$, carries information, in a mysterious way, about the final state that was selected by a measurement of $\hat{\sigma}_x$. How is this possible? Does it make sense?

Suppose that Veronica were the only person with access to the X list, and that she gave me the same kind of list as before, namely, the list with the numbers of the particles that ended with spin up according to her list (again, I do not know that the list is a list of the particles that ended “up” in X). If I plot the data of my Z list for the particles mentioned in Veronica’s list, then I will obtain a Gaussian centered at 1, the weak value corresponding to the final post-selection $|+\rangle_x$. This entails that I can tell her: “hey, I know that the particles named in the list you sent me are the particles that ended with spin up in X”. This is of course totally unexpected, because these two observables do not commute, so no correlation is to be expected, and hence no relevant information about the final state should be expected from my data, which is data from a “previous state” corresponding to a different, non-commuting observable.

Hence, something very odd is happening with weak values; somehow my Z list is not totally independent of the X list, and somehow there exists a very strong correlation between the two, though I am unaware of it before the post-selection. If Veronica had not told me which particles to consider, I would not have seen this information! On the other hand, the very principles of quantum mechanics seem to prohibit any correlation between non-commuting observables. We are then trapped in a very intriguing mystery: how can we understand these correlations? We are, again, back to the main question of this chapter: how should weak values be interpreted?

6.3 More objections

In this last section I will consider further possible objections, now focused on the peculiar results of the previous experiment.

6.3.1 The Error View (EV)

One might come to think that weak values are errors, since the condition for a weak measurement is that we have a large dispersion, so that each individual result lies within the range of error of the measurement process. However, the fact that each individual result is incapable of surpassing the “error threshold” does not entail that n weak measurements cannot, together, surpass it. We should recall that the standard error of the mean over a sample of n elements decreases as $\Delta/\sqrt{n}$, where $\Delta$ is the dispersion. Then, if n is big enough, that is, if we perform a sufficiently large number of measurements, we can make the statistical error arbitrarily small; we can make the distribution of the sample mean, centered at the weak value, narrow enough that we can read the weak value from it with confidence. Indeed, Aharonov and Vaidman have shown that we could even obtain a weak value from a single measurement if we measure a large ensemble of particles at once, although under very rare conditions [31]. In short, the “error view” can be refuted easily with statistical arguments; we can reduce the error by performing enough measurements, and when we do so, we find that the pointer systematically indicates a certain value.
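The statistical argument can be turned into a rough estimate of how many weak measurements are needed. The sketch below only uses the $\Delta/\sqrt{n}$ scaling of the standard error; the function names and the numbers in the example are illustrative and are not taken from any experiment.

```python
import math

def standard_error(delta, n):
    """Standard error of the mean of n independent readings with dispersion delta."""
    return delta / math.sqrt(n)

def shots_needed(delta, target_error):
    """Smallest number of readings n for which delta / sqrt(n) <= target_error."""
    return math.ceil((delta / target_error) ** 2)

# Example: a dispersion ten times the eigenvalue gap, resolved to a tenth of that gap.
n = shots_needed(delta=10.0, target_error=0.1)
print("readings needed:", n)
print("resulting standard error:", standard_error(10.0, n))
```

The required number of readings grows as the square of the dispersion, which is the quantitative content of the remark that the weaker the measurement, the larger the ensemble.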

6.3.2 The coincidence view (CV)

This view holds something along the following lines: there does not exist any correlation between the Z and the X lists; what we obtain with the splitting is a very odd statistical coincidence. This option, however, does not stand very firmly once we consider some points. First, it would mean that in all the cases in which a weak measurement is performed, the same coincidence always appears; no matter which observable we are weakly measuring, no matter what we post-select, we will always find that the data of the weak measurement can be divided in a special way that matches the final state of the system. The translation of the Gaussians when we split the data according to the final state is systematic; if we perform a large enough number of measurements, the Gaussian always settles on the weak value (this is very similar to the argument that we raised against the “error” view).

Perhaps the defender of CV would hold that the shift of the Gaussian to a weak value does not entail that the weak value has a physical meaning. Along these lines, a CV defender will accept that the pointer always translates to a value that depends on the final state, but will remark that this translation should not be taken as indicating something physical; “not every shift of the pointer is physically meaningful, despite being a systematic shift”, a CV defender will claim. It is just a strange accident of the universe (or a strange accident of the mathematical formalism of quantum mechanics) that in a weak measurement the pointers translate to a value that depends on the final state. The correlation between the Z data and the final outcomes of X is a brute, inexplicable and accidental fact of the universe, that’s all.

However, it will certainly be problematic for CV that weak values can coincide with strong ones; is it not very strange that a mere accident also coincides with the physical properties to which strong measurements point? Let us not forget that weak measurements without post-selection do provide information about the system, namely, the expectation value of the observable we are measuring, and this expectation value coincides with the one obtained in the strong regime.

CV is, I believe, untenable. The mere fact that a clear correlation exists, the fact that the Gaussians of the weak measurements undergo a shift according to the final state of the system, should be more than enough to suggest that there is a physical phenomenon underlying the weak values. Why is it that in strong measurements the shifts of the Gaussians are taken as indicating properties but in the weak regime they are not? Is it not arbitrary to discard the physical meaning of “shifting pointers” just because they are weak? Not to mention that the transition from weak to strong is continuous, so any “barrier” that sharply defines what is weak and what is strong would be somewhat arbitrary. If, according to the quantum formalism, the distribution of the pointer is what indicates the states of the system, why should we discard the particular pointer distribution obtained in a weak measurement? Is the fact that the measurement is weak a motive to impose a wholly different understanding of the shift of the pointer? If so, it appears to be an ad hoc restriction; where should the boundary between strong and weak be placed so that we are justified in discarding some pointer shifts as meaningless?

On the other hand, I want to stress the point that a correlation is the kind of thing that demands a physical explanation. Suppose that we summon 1000 people, on the condition that all of them wear white shirts. We then register the color of the shoes of all of them. We do not find any pattern. Finally, we impose the restriction that only the people wearing blue jeans stay. At that moment, after all the others have gone, we discover that the shoes of the people who remain are all the same color, let us say red. If we perform the same “experiment” many times, and we keep noticing the same odd correlation between red shoes, white shirts and blue jeans, we had better start to investigate possible reasons for this fact rather than discard it as an odd coincidence of the universe. It stopped being a coincidence once it happened a million times, with different groups of people! To sum up, it is against the spirit of science that, in the face of systematic correlations between certain physical events, we adopt the coincidence view rather than the attitude “there is something we must look for”.

The defender of CV could finally appeal to one more idea: “according to quantum mechanics, it makes no sense that a correlation is established between non-commuting observables”. But, in saying that, the defender of CV is ignoring one of the central results of weak measurements: we are not obtaining information about each particle when we measure them weakly. Hence, the correlation we obtain is not between the eigenvalues of the X list and the eigenvalues of the Z one; the Z list is not a list of eigenvalues. Quantum mechanics does not prohibit a correlation between weak data that does not store any information about the observable $\hat{\sigma}_z$ per particle and a final list that does have information (about $\hat{\sigma}_x$) per particle; it just prohibits a correlation between the eigenvalues of both lists, which is not the case here. So the fact that the observables do not commute cannot constitute a good objection against the physical meaning of the correlation that we find between X and Z.

6.3.3 The no dependence on the future view (NDFV)

There seems to be one last objection to the idea that weak values are something physical. It is the idea that, according to the formalism of quantum mechanics, the Z measurements produce shifts of the pointers because of whatever interaction has occurred between the system and the device. This interaction occurs only during the time $t_z$ in which the system interacts with the first Stern-Gerlach device, so nothing related to a time later than $t_z$ should induce a pointer shift in the Z data. For example, in the case of two consecutive strong measurements, the pointer states are shifted in the first interaction according to that interaction alone. The last measurement also induces shifts of the pointer of the last device, but in no way will the last measurement alter or modify the shifts of the previous measurement device (see section 5.3). To attribute physical meaning to a weak value and to these correlations would apparently entail that the data obtained during $t_z$ is sensitive to data obtained from the last measurement, which appears to be somewhat absurd.

NDFV then states that the fact that the post-selection can shift the pointers corresponding to the weak data is the best reason to suggest that the results of weak measurements are “nothing real”, since the shifting of the pointer for one set of data has nothing to do with the other set of data (one is obtained at a prior time). So, for NDFV, quantum mechanics starts to be inapplicable when we consider the weak regime, and we have to learn to live with these illusory correlations.

A variation of the last argument (recall that the argument is “it does not make sense to take as a physical property one that changes depending on the final state”) in favor of NDFV has been invoked before in the literature. For example, in discussing the results of weak measurements applied to Hardy’s paradox, Svensson claims that it would certainly be problematic, under the traditional interpretation of quantum mechanics, to consider that the number of particles at a certain time varies according to the final measurement of the particles, which occurs later [32]. Weak values of the number of particles depend of course, as all weak values do, on the final state. How then is it possible that the number of particles that I have here, right now, varies according to a measurement that we have not performed yet? (Does not the idea that “the number of particles that you have right now depends on what you come to measure later” bring to the reader’s mind the words of a crazy man?) According to NDFV, the very fact that weak values depend on the final state of the particles is enough for us to be justified in denying that these weak values indicate something physical⁴.

Svensson is right in pointing out that, within the traditional interpretation, the fact that the number of particles at a time t is sensitive to a measurement performed later is puzzling, to say the least; it is as if I could change the number of particles of a “past” state by changing what I measure in the future. That the traditional interpretation is limited in understanding this fact is one idea that we can share with Svensson. But this is not a good reason to raise suspicion against the legitimacy of weak values as physical properties, as Svensson believes, but rather a motive to find a better interpretation of the weak values. Notice that NDFV is problematic in the sense that, as long as it denies the physical status of weak values, it will have to subscribe to a certain version of the “coincidence view” and hold that these correlations are magical coincidences. If we can provide a physical interpretation for the weak values, then we can overcome all the problems that CV faces. Notice also that NDFV is problematic in the light of the extensive discussion of the first chapters; given the temporal symmetry of physical laws, and in particular of quantum laws, it is somewhat arbitrary to discard quantum processes that occur in the direction of time opposite to that dictated by our psychological arrow of time. The development of an improved interpretation will drive us to consider the weak values as the result of backward processes. The justification of that idea is the main objective of the last chapter of the present work.

⁴ The other option, which perhaps nobody will consider, is to cast doubt on the interpretation of the number-of-particles operator itself; why are we so sure that this operator should be taken as indicating the number of particles? Clearly, anyone who proposes this will be immediately silenced and ejected from the physics community, but it is a possibility nonetheless.

Chapter 7

Backward causation in weak measurements

In this last chapter I will justify the following claim: weak values are physical entities that can only be understood as the result of both forward and backward causation. Recall that the main thesis of this work is that weak measurements seem to provide evidence for retrocausal processes. In this chapter I will develop the arguments that justify that thesis.

I will begin by discussing an experiment¹ that we are planning to perform soon in the labs at our university. As we will see, the implications of this experiment are certainly important; it will shed new light on the old Bohr-Einstein debate about the completeness or incompleteness of the quantum mechanical description of nature. Additionally, by discussing the results of the experiment we will not only give new life to this old debate but will also be in a position to provide a straightforward interpretation of weak values, an interpretation that appeals to retrocausal processes.

7.1 An experiment

The proposed experiment is a Bell-type experiment. In the left arm we perform a weak measurement of the observable $\hat{\sigma}_x$ and then a strong measurement of $\hat{\sigma}_z$. In the right arm we invert the order, so that first we weakly measure the photons along $\hat{\sigma}_z$ and then strongly along $\hat{\sigma}_x$ (see Fig. B.1). In addition, we must place a screen at the end of each arm to record the arrival position of the photons (in Appendix B we will see how to achieve these things in the lab). The general idea is then to create a list that has the final position on the screen of each photon of each pair (see Fig. 7.2). We must guarantee that the stored positions refer to photons of the same pair; hence, we must look for coincidences (we need to be able to differentiate pairs).

Figure 7.1: A Bell-type experiment adapted to implement weak measurements on each arm before the final strong ones. At the end of each arm we place a screen that will allow us to read the weak values (the details of the experiment, including a detailed description of the setup, can be found in Appendix B).

¹ The idea for this experiment is due to Alonso Botero. In [34] the authors discuss a situation that involves a similar experiment.

As in all EPR experiments, we begin with an initial entangled state

$$|\psi^{(0)}\rangle = \frac{1}{\sqrt{2}}\left(|+\rangle_1|-\rangle_2 - |-\rangle_1|+\rangle_2\right), \tag{7.1}$$

where the subindex 1 denotes the right photon and 2 the left one. Imagine then that we look for the post-selection given by the final state

$$|\psi^{(f)}\rangle = |+\rangle_{x1}|+\rangle_{z2}. \tag{7.2}$$

This final state means that the photons of the right arm end in the state $|+\rangle_x = \frac{1}{\sqrt{2}}(|+\rangle_z + |-\rangle_z)$ (written in the basis of $\hat{\sigma}_z$) and the photons of the left arm end in the state $|+\rangle_z$. From now on, every ket that does not carry the subindex x will denote a ket in the z basis. If we compute the weak value $\sigma_{z_w}$ (right arm) for the mentioned post-selection, we obtain

$$\sigma_{z_w} = \frac{\langle +|_{x1}\langle +|_{2}\,(1_2 \otimes \hat{\sigma}_{z1})\,\frac{1}{\sqrt{2}}\left(|+\rangle_1|-\rangle_2 - |-\rangle_1|+\rangle_2\right)}{\langle +|_{x1}\langle +|_{2}\,\frac{1}{\sqrt{2}}\left(|+\rangle_1|-\rangle_2 - |-\rangle_1|+\rangle_2\right)} = -1. \tag{7.3}$$

Figure 7.2: These two tables contain the data with the final position of each photon (the top table corresponds to the right photon, and the bottom one to the left photon). The number of the photon identifies the photons that belong to the same pair (corresponding columns of the two tables refer to the same pair). Next to each table we have a density distribution for the data of the table: one distribution is for the right screen (right photon) and the other for the left screen (left photon). Note that, because of the weakness of the measurement, these distributions do not indicate the eigenvalues of the observables (if the measurement were strong, we would have two sharp peaks centered at 1 and −1). Warning: this is an example; I made up all the data, and my purpose is only to illustrate the procedure.

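To compute the weak value on the left arm we simply change $1_2 \otimes \hat{\sigma}_{z1}$ for $1_1 \otimes \hat{\sigma}_{x2}$ (recall that in the left arm the photons are subjected to a weak measurement of $\hat{\sigma}_x$). The result for this weak value is

$$\sigma_{x_w} = \frac{\langle +|_{x1}\langle +|_{2}\,(1_1 \otimes \hat{\sigma}_{x2})\,\frac{1}{\sqrt{2}}\left(|+\rangle_1|-\rangle_2 - |-\rangle_1|+\rangle_2\right)}{\langle +|_{x1}\langle +|_{2}\,\frac{1}{\sqrt{2}}\left(|+\rangle_1|-\rangle_2 - |-\rangle_1|+\rangle_2\right)} = -1. \tag{7.4}$$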

Therefore, if we plot the data according to the final state given by Eq. 7.2, we expect to see, on both screens, a Gaussian distribution centered at −1, where −1 is the weak value for both arms (see Fig. 7.3). Recall that the translation of the pointer’s distribution is due to the weak measurement of $\hat{\sigma}_z$ in the right arm and of $\hat{\sigma}_x$ in the left one. It is clear then that screen two (left arm) will reveal the weak value $\sigma_{x_w}$, while the other screen will reveal the weak value $\sigma_{z_w}$. The important point is that both $\sigma_{z_w}$ and $\sigma_{x_w}$ equal −1. This is very important since it shows a clear anti-correlation between the weak value of each arm and the final state of the partner photon in the opposite arm; $\sigma_{z_w}$ of the right arm is perfectly anti-correlated with the $\hat{\sigma}_z$ outcome of the left photon (of the same pair), and the same happens for $\sigma_{x_w}$ of the left photon and the $\hat{\sigma}_x$ outcome of the right one, as illustrated in Fig. 7.4.
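The two weak values of Eqs. 7.3 and 7.4 can be checked numerically with a few lines of plain Python. The sketch below assumes the basis ordering written in its comments and uses only real amplitudes, which is enough for this particular pre- and post-selection; the helper names are illustrative.

```python
import math

S = 1 / math.sqrt(2)
# Basis order for a pair (z basis): |+>_1|+>_2, |+>_1|->_2, |->_1|+>_2, |->_1|->_2.
SINGLET = [0.0, S, -S, 0.0]          # initial state of Eq. 7.1
IDENTITY = [[1, 0], [0, 1]]
SIGMA_Z = [[1, 0], [0, -1]]
SIGMA_X = [[0, 1], [1, 0]]
PLUS_Z = [1.0, 0.0]
PLUS_X = [S, S]

def kron_vec(u, v):
    """Tensor product of two single-photon states (photon 1 first)."""
    return [a * b for a in u for b in v]

def kron_mat(m1, m2):
    """Tensor product of two 2x2 operators (photon 1 first)."""
    return [[m1[i][j] * m2[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def apply(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def inner(u, v):
    # All amplitudes here are real, so no complex conjugation is needed.
    return sum(a * b for a, b in zip(u, v))

def weak_value(operator, initial, final):
    """<final| operator |initial> / <final|initial>."""
    return inner(final, apply(operator, initial)) / inner(final, initial)

# Final state of Eq. 7.2: |+>_x for photon 1 (right), |+>_z for photon 2 (left).
FINAL = kron_vec(PLUS_X, PLUS_Z)
sigma_z_w = weak_value(kron_mat(SIGMA_Z, IDENTITY), SINGLET, FINAL)  # right arm
sigma_x_w = weak_value(kron_mat(IDENTITY, SIGMA_X), SINGLET, FINAL)  # left arm
print("weak value of sigma_z (right arm):", sigma_z_w)
print("weak value of sigma_x (left arm): ", sigma_x_w)
```

Changing `FINAL` to another product state gives the weak values for a different post-selection, provided the overlap with the initial state does not vanish.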

Figure 7.3: Now we want to analyze the distribution of the positions according to the post-selected final state. Therefore, for screen one (right arm) we need to plot only the data of the photons whose partners ended up in $|+\rangle$ in the left arm, since these are the photons that match the condition on the final state (recall Eq. 7.2). And for screen two (left arm) we want to plot only the data of the photons whose partners ended up in $|+\rangle_x$ in the right arm. For example, to plot the data for screen one (right arm) we have to plot not all the “positions” but only the positions of those photons whose partners ended up in the state $|+\rangle$ in the left arm. The photons that ended up in $|+\rangle$ in the left arm are highlighted in brown, so for the first screen we plot only the highlighted photons. An analogous argument determines which photons’ positions are to be plotted for screen two (highlighted in yellow). Next to these tables the reader can find the density distributions for these “filtered” data. We obtain Gaussians centered on the weak value −1, which is the same in both cases. Warning: this is an example; I invented all the data, and my purpose is only to illustrate the procedure.

Figure 7.4: Schematic representation of the outcomes in the case that we post-select the final state given by Eq. 7.2. Notice the clear anti-correlation between the weak value and the final strong value of the opposite arm (compare green boxes with green boxes, and blue ones with blue ones).

We know from simple quantum mechanics that when we have an EPR pair, if we obtain a final value of 1 for one of the photons, the other member of the pair will certainly yield −1 if measured along the same direction. Notice that the weak measurement of the right photon (exactly the same argument applies to the left weak measurement) is a measurement of the same observable as that measured strongly on the partner photon; in both cases we are dealing with $\hat{\sigma}_z$, although in one case (the left arm) we have a strong measurement and in the other (the right arm) a weak one. What our experiment suggests is then a striking fact, namely, that the weak measurement, which is done earlier than the final post-selection, yields a value that is consistent with the final outcome of the partner photon; one is tempted to say, given these results, that the photons have a certain property, revealed by the weak measurements, that accounts for the final outcomes. For example, one is tempted to say that the fact that the right photon yielded a weak value of −1 when measured along $\hat{\sigma}_z$ explains why the partner photon yielded 1 when measured along $\hat{\sigma}_z$. Nevertheless, the reader might come to think that this is rather trivial; is it not an established fact that, when measured along the same direction, one photon must yield the outcome opposite to the one yielded by its partner? Why is it a big deal that, for instance, the weak value of the right photon is anti-correlated with the final strong value of its partner, the left photon?

To answer the last question, recall that a weak measurement does not change (or only negligibly changes) the state of the particles, which means that the weak measurement does not collapse the particles into an eigenstate of the observable; the right photon, when weakly measured along $\hat{\sigma}_z$, does not collapse to the state $|-\rangle$, and hence the fact that the partner photon yields the state $|+\rangle$ cannot be explained by appealing to a collapse of the right photon.² Therefore, one is tempted to say that the weak measurement shows that the state of the photons before the final post-selection, which is nearly identical to the state of the photons if we do not perform a weak measurement at all, already has a property (a hidden variable?) that explains why the outcomes of the post-selections are the ones obtained. One is tempted to say that, somehow, the photons do have properties, independent of any measurement, while traveling through the arms of the experiment, and that these properties are the ones we should have expected given the final outcomes (−1 for the right photon if its partner ended in 1, for example). One is tempted to say that Einstein, after all, was right. However, this temptation, so strongly suggested by our experiment, requires a much more careful discussion before we can take it seriously.
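The anti-correlation discussed above can be checked numerically with the standard weak value formula $\langle\phi|\hat{O}|\psi\rangle / \langle\phi|\psi\rangle$. The following is only a minimal sketch under stated assumptions: the pair is taken to be pre-selected in the singlet state, the left photon is post-selected in an eigenstate of $\hat{\sigma}_z$ and the right photon in $|+\rangle_x$. These choices and all names in the code are illustrative; they are not necessarily the post-selection of Eq. 7.2.

```python
import numpy as np

# Single-qubit basis states and operators (spin-1/2 notation, as in the text).
plus_z = np.array([1, 0], dtype=complex)
minus_z = np.array([0, 1], dtype=complex)
plus_x = (plus_z + minus_z) / np.sqrt(2)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
identity = np.eye(2, dtype=complex)


def weak_value(post, operator, pre):
    """Return <post|operator|pre> / <post|pre>."""
    overlap = np.vdot(post, pre)  # np.vdot conjugates its first argument
    if np.isclose(overlap, 0):
        raise ValueError("post-selected state is orthogonal to the pre-selected state")
    return np.vdot(post, operator @ pre) / overlap


# Assumption: pre-selected singlet state, ordered as (left photon) x (right photon).
singlet = (np.kron(plus_z, minus_z) - np.kron(minus_z, plus_z)) / np.sqrt(2)

# sigma_z acting on the right photon only.
sigma_z_right = np.kron(identity, sigma_z)

# Assumption: the right photon is post-selected in |+x>, the left one in |+z> or |-z>.
w_left_plus = weak_value(np.kron(plus_z, plus_x), sigma_z_right, singlet)
w_left_minus = weak_value(np.kron(minus_z, plus_x), sigma_z_right, singlet)

print("left photon ends in |+z>: weak value of sigma_z (right) =", w_left_plus.real)
print("left photon ends in |-z>: weak value of sigma_z (right) =", w_left_minus.real)
```

Under these assumptions the weak value of $\hat{\sigma}_z$ on the right photon is −1 when the left photon ends in $|+\rangle$ and +1 when it ends in $|-\rangle$, which is the anti-correlation described in the text.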

² One could modify our experiment in many ways to corroborate that the weak measurement does not collapse the state; for example, one could strongly measure $\hat{\sigma}_z$ in the right arm after the weak measurement and verify that the photons were not collapsed by the weak measurement into the state $|-\rangle$. Moreover, one can first perform a weak measurement of $\hat{\sigma}_x$ in the left arm; if the weak measurements were collapsing the state, then all the entanglement between the pair would be lost, and so a later weak measurement of $\hat{\sigma}_z$ in the right arm could not possibly induce any effect on the left photon. However, if we were to weakly measure $\hat{\sigma}_x$ in the left arm shortly before weakly measuring $\hat{\sigma}_z$ in the right arm, the same results explained in this section would be obtained, namely, an anti-correlation between the weak measurement of $\hat{\sigma}_z$ and the final state of the left photons. Therefore, the effect, if any, induced by a weak measurement cannot be responsible for the final outcomes of the partner photon, which clearly shows why the anti-correlation is far from trivial.

7.2 Hidden variables, again

Einstein was unsatisfied with the idea that we cannot, according to quantum mechanics, talk about properties of the system without performing measurements. He believed that quantum mechanics, as it stands, was incomplete: “This [the fact that we cannot simultaneously attribute a position and a momentum to the particles in the EPR pair] makes the reality of P and Q [the momentum and the position] depend upon the process of measurement carried out on the first system [our left photon for example], which does not disturb the second system [the right photon] in any way. No reasonable definition of reality could be expected to permit this” [15]. This position, famously defended by Einstein, would eventually be characterized as the “hidden variable position”. The hidden variable position, in broad terms, holds that there is something missing (hidden) from the quantum mechanical description given by the wave function; that missing thing (presumably a property of the system) could in principle explain in a more satisfactory manner the results yielded by EPR-type experiments. Having said this, it is tempting to claim, as I suggested before, that the experiment we are proposing reveals a subtle kind of hidden variable, one that is only accessible through the “lenses” of a weak measurement.

Before asserting that our experiment reveals a kind of hidden variable, many considerations must be put in place. It is an established fact of contemporary physics that Bell inequalities can be violated (see for example [35]). The violation of Bell inequalities entails that nature (at least at the quantum level) cannot be described by the conjunction of two assumptions: b1) that there are hidden variables and b2) that physical systems respect locality, that is, that no faster-than-light influence between systems is possible [36]. So, because of the violation of these inequalities, we have to give up one or both of these assumptions. Imagine then that we hold firmly to the position that our experiment reveals a kind of hidden variable. Then, because of the violation of Bell inequalities, we must abandon locality, which, after all, is not very problematic in the context of quantum mechanics.
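To make the violation concrete, the following sketch evaluates one common form of Bell inequality, the CHSH combination, for the quantum predictions of a singlet state. The CHSH form, the spin-1/2 description and the measurement angles are assumptions chosen for illustration; they are not taken from [35] or [36]. Any theory satisfying b1 and b2 obeys $|S| \le 2$, whereas quantum mechanics predicts $|S| = 2\sqrt{2}$ for these settings.

```python
import numpy as np

sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)


def spin_along(theta):
    """Spin observable along a direction at angle theta in the x-z plane."""
    return np.cos(theta) * sigma_z + np.sin(theta) * sigma_x


def correlation(state, theta_a, theta_b):
    """Expectation value of the product of the two local spin observables."""
    joint = np.kron(spin_along(theta_a), spin_along(theta_b))
    return np.vdot(state, joint @ state).real


up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# Illustrative settings (radians) that maximize the quantum CHSH value.
a, a_prime = 0.0, np.pi / 2
b, b_prime = np.pi / 4, 3 * np.pi / 4

chsh = (correlation(singlet, a, b) - correlation(singlet, a, b_prime)
        + correlation(singlet, a_prime, b) + correlation(singlet, a_prime, b_prime))

print("|S| =", abs(chsh), "(local hidden variable bound: 2)")
```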

However, in the last 15 years or so another very important set of inequalities has been violated, namely, the Leggett inequalities. These inequalities are derived from two assumptions: l1) that there are hidden variables and l2) that physical systems can exert instantaneous influences, i.e., physical systems need not respect locality. Hence, given that both the Leggett and the Bell inequalities are violated, it follows that nature cannot be described by a hidden variable theory. The conclusion, as stated by Leggett, is extraordinary, to say the least: “I believe that the results of the present investigation provide quantitative backing for a point of view which I believe is by now certainly well accepted at the qualitative level, namely that the incompatibility of the predictions of objective local theories with those of quantum mechanics has relatively little to do with locality and much to do with objectivity” [37]. Moreover, the authors who performed the first experimental violation of these inequalities state: “Our result suggests that giving up the concept of locality is not sufficient to be consistent with quantum experiments, unless certain intuitive features of realism are abandoned” [38]. Thus, the idea that there are properties not taken into account by quantum mechanics is certainly problematic; those properties, those hidden variables, seem to be impossible.

A contradiction has then emerged. On the one hand, the Leggett and Bell inequalities, taken together, leave no room for a hidden variable theory. On the other hand, weak measurements seem to reveal a subtle kind of hidden variable, as can be seen in our experiment. Given this contradictory result, it seems that we have to reject once and for all the idea that the weak values refer to physical properties of the system. But then we will have to adopt the “coincidence view” and argue that all of the correlations found in weak values are magical coincidences, inexplicable facts of nature. And then we will have to go all the way back and attack the core postulates of quantum mechanics, since the formalism of weak measurements is completely contained within the formalism of quantum mechanics. As the reader can see, this option is profoundly unpromising.

7.2.1 The independence assumption (IA)

The purpose of the present and subsequent subsections is to examine one way of avoiding the mentioned contradiction (recall the contradiction: weak measurements seem to reveal hidden variables in the proposed experiment, whereas the Bell and Leggett inequalities prohibit this kind of variable). Instead of rejecting weak values once and for all, it is possible to examine some of the assumptions made by Bell and Leggett in the derivation of their inequalities. If we find an assumption that can be abandoned, then we can discard their inequalities and avoid the contradiction. In short, we could make room for some kind of hidden variable if we can see how to overcome the inequalities. One of the assumptions³ that both Bell and Leggett made in order to derive their inequalities is the idea that the alleged hidden variables are independent of the settings of the measurement device. Let us call this, following Price, the independence assumption (IA). Leggett himself states, at the end of [37], that “It might, ... [be possible] to allow the hidden-variable distribution...to depend on the settings a and b of the polarizers.”

Now, Bell entertained some thoughts on the possibility of abandoning IA, but he noticed how problematic it would be to do so. For example, suppose that there is a certain relation between the state of the particle to be measured and the settings of the device. We then have to assume that in the past there was an event that explains why the present state of the particle is not independent of the present setting of the device. For example, we can imagine that the day before the experiment the source from which the photons are extracted and the polarizer were stored in the same drawer and, somehow, a weird entanglement formed, so that the next day the photons produced by the source inherit this entanglement, with the result that the hidden variables of the particles are not independent of the settings of the measurement device. We would of course take it as a very puzzling fact that a correlation between properties of the system and the device is obtained in a case like this one; but we might recall that the quantum world is strange after all, and it is then possible that the rejection of IA, in a case such as this, is not absurd.

³ Of course, this is not the only assumption. For instance, the authors of [38] suggest that it could be possible to abandon Aristotelian logic, and with it the idea that P and not P is a contradiction. However, as we will see, the assumption we are examining, IA, is the one whose rejection best contributes to an interpretation of both weak values and Bell outcomes.

However, we can easily construct a much more extreme example that highlights the problems that arise with the rejection of IA. Suppose, for example, that we decide to choose the settings of the device at the last second, when the particle is already on its way to the detector. It would then seem very odd that a last-second choice, made by the experimentalist, would not be able to “erase” any correlation between the state of the system and the state of the meter. It would seem as if the final choice of the experimentalist was already determined! Rejecting IA seems to undermine the possibility of free will, and Bell found this very problematic, as we can see in the following passage:

“It may be that it is not permissible to regard the experimental settings a and b in the analyzers as independent variables, as we did... Now even if we have arranged that a and b are generated by apparently random radioactive devices, or by the Swiss national lottery machines, or by elaborate computer programmers, or by apparently free willed physicists, or by some combination of all of these, we cannot be sure that a and b are not significantly influenced by the same factors X that influence A and B [the measurement results]. But this way of arranging quantum mechanical correlations would be even more mind boggling than one in which causal chains go faster than light. Apparently separate parts of the world would be conspiratorially entangled, and our apparent free will would be entangled with them”. [9, p. 237]

Although it is debatable whether the fact that our actions are determined is really problematic (it is possible to believe in free will and, at the same time, in determinism; this is precisely what in philosophy is called compatibilism), we can understand the magnitude of the implications that follow from the rejection of IA: it would follow that there is an underlying reality of which we have no idea, that completely surpasses the reach of our science, and that controls everything, even the most (apparently) random and disconnected events (the Swiss lottery, a radioactive process, etc.). Price says: “The conclusion that our actions are influenced by physical states of affairs of a previously unimagined kind could easily lead one to fatalism!” [9, p. 237]. It follows then that the rejection of IA drives us to a reality much more mysterious and inexplicable than the one revealed by the violation of Bell inequalities. IA seems very solid after all.

7.2.2 A common cause in the future

Despite the previous considerations, we are going to abandon IA. But the way we will abandon IA does not commit us to the frightening reality that scared Bell. The other option for rejecting IA is one that some physicists, principally Costa de Beauregard⁴ and Cramer, and some philosophers, such as Phil Dowe and Huw Price, have proposed: reject IA by appealing to backward causation. In this section I will explain this suggestion by following Price in [9]. The core idea of this rejection of IA is that the state of the meter and the hidden variables of the system are not independent, not because some event in the past accounts for their present correlation, but because an event in the future accounts for it; the moment when the particle comes to interact with the meter is the event that explains why the correlations in the past exist. In Price's words, this is “a correlation explicable not in terms of some common cause in the past, but simply in terms of the existing interaction in the future” [9, p. 238]. We then do not need a very complicated and horrific reality to explain the dependence between the hidden variables and the setting of the device; we need only appeal to the simple future event in which the system and the device come to interact. The interaction between the meter and the system retro-causes the earlier state of the particle, and this is why the earlier state of the system is not independent of the settings of the measurement.

By appealing to advanced action (backward causation)⁵, it is straightforward to explain the outcomes of EPR experiments. We simply need to propose that the statistics of the hidden variables of the particles before being measured change depending on the later measurement performed on the particles.

⁴ Costa de Beauregard proposed retro causation even before the development of Bell experiments, around 1947, but his thesis advisor, Louis de Broglie, did not like the idea. Later, de Beauregard returned to retro causation to explain the EPR experiments [39].

⁵ To better understand Price's motivations, it should be pointed out that he wants to sustain the “block universe” view. Recall that for the “block universe” there is no objective difference between the past, the present and the future (all the differences are subjective). Therefore, Price believes that it is just as valid to take the world as “going” from what we call the future toward what we call the past as it is to take it the way we usually do, from past to future [9]. For him it is then completely natural to abandon IA as he does.

Figure 7.5: Schematic representation of the backward light cones of the final measurements. Notice that the cones overlap in a region (in the past of the measurements) that includes the moment at which the photons are released. Therefore, at any moment within this overlap region, it is possible for the final measurements to carry information (from the future) about the measurement performed, and one photon can locally transmit the information to its partner.

For instance, suppose that photon 1 is measured along $\hat{\sigma}_z$ and the partner photon along $\hat{\sigma}_x$. Then these measurements retro-cause the particles, at the moment before they separate, to get into a certain state, i.e., the state that accounts for their later outcomes. More precisely, Price appeals to “a probabilistic decoupling factor which depended on the actual spin measurements to be performed on each particle and which influenced the underlying spin properties of the particles concerned” [9, p. 246]. This “decoupling factor” could be, for example, the following: “In those directions G and H (if any) in which the spins are going to be measured, the probability that the particles have opposite spins is $\cos^2(\alpha/2)$, where α is the angle between G and H” [9, p. 247]. So, when G and H are aligned in the same direction, α is zero and the probability of obtaining opposite spins is maximal, which is indeed what happens when we perform the same spin measurement on each of the particles of the EPR pair. And when G and H are orthogonal, we of course expect to obtain anti-correlated spins 50% of the time, as $\cos^2(\alpha/2)$ guarantees. Hence, this simple decoupling factor accounts for the relevant statistics obtained in an EPR experiment.
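The two limiting cases just mentioned can be evaluated directly from the decoupling factor. This is a minimal sketch; the function names are hypothetical, and the last function uses the elementary identity $1 - 2\cos^2(\alpha/2) = -\cos\alpha$ to express the resulting spin correlation.

```python
import math


def prob_opposite_spins(alpha):
    """Decoupling factor cos^2(alpha / 2); alpha is the angle between G and H in radians."""
    return math.cos(alpha / 2) ** 2


def spin_correlation(alpha):
    """P(same spins) - P(opposite spins) implied by the decoupling factor."""
    p_opposite = prob_opposite_spins(alpha)
    return (1 - p_opposite) - p_opposite


for alpha in (0.0, math.pi / 2, math.pi):
    print(f"alpha = {alpha:.3f} rad: P(opposite) = {prob_opposite_spins(alpha):.3f}, "
          f"correlation = {spin_correlation(alpha):+.3f}")
```

For aligned directions the spins are always opposite, for orthogonal directions they are opposite half of the time, and the correlation equals $-\cos\alpha$ for every angle.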

Another outstanding advantage of backward causation in EPR experiments is that we can save locality, i.e., we can avoid “spooky” action at a distance. This is because the moment at which the two particles become coupled (entangled) lies within the past light cones of both future measurements (see Fig. 7.5). The effect (the state of the particles induced by the future measurements) is transmitted from the moment of the measurement back to the moment of entanglement simply by the particles themselves, “who [the particles] bear the marks of their future as they bear the marks of their past” [9, p. 242].

Remember that any situation of backward causation must be protected from the bilking argument so as to avoid causal paradoxes. Price's suggestion is easily protected from the bilking argument. A causal paradox is possible if we can determine the state of the particles at a moment before the final measurement, so that we can then change the final measurement in such a way as to contradict the earlier state of the particles. But how are we going to determine the state of the particles before the final measurement if not by another measurement? And if we perform another measurement, it is this measurement that retro-causes the earlier state of the particles, so the paradox is blocked. In Price's words: “Here the claimed earlier effect is the arrangement of spins in the directions G and H which are later to be measured. But what would it take to detect this arrangement in any particular case? It would take, clearly, a measurement of the spins of particles concerned in the directions G and H. However, such a measurement is precisely the kind of event which is being claimed to have this earlier effect” [9, p. 247].

The appeal to advanced action in EPR experiments is thus very powerful while being at the same time very simple. It provides a very sound case for abandoning IA without needing to appeal to a frightening reality that determines everything and that seriously harms any possibility of free will. It is the free final choice of the experimentalist that causes the earlier state of the particles, and not the other way around. This proposal rescues hidden variables and locality in EPR cases, and it is protected from causal paradoxes in the context of EPR experiments. The idea, however, is more of a sketch of what is happening than a detailed physical theory (for example, how exactly does the final measurement affect the previous state?).⁶ Also, the idea presented here is aimed at explaining the EPR experiments, but it is not clear how the same idea of advanced action could be used to explain the weak values. Fortunately, since 1964, Aharonov and collaborators have been developing a formulation of quantum mechanics that is completely symmetric in time, in which physical states evolve both toward the future and toward the past. This theory, called the Two-State Vector Formalism, can provide solid ground for those who have suggested advanced action in EPR cases, and can shed new light on the physical meaning of weak values. I will now explain the Two-State Vector Formalism.

7.3 The Two-State Vector Formalism (TSVF)

7.3.1 The ABL rule

TSVF is a time-symmetric quantum theory that traces its origins to a work by Aharonov, Bergmann and Lebowitz from 1964 [40]. In [40] the authors developed a time-symmetric account of the measurement process for systems that are subjected to two consecutive strong measurements. The idea presented there was that, since the quantum laws are symmetric in time, we need a symmetrical account of the measurement process. The measurement process has been understood as asymmetric in the sense that we take a system at a certain time t to be determined by a previous measurement, but we do not take the system before a measurement to be determined also by this future measurement. Thus, motivated by the idea of restoring the symmetry in time that the measurement process has apparently stolen, the authors developed the following idea: the state of a system between two measurements is equally determined by both of the measurements, and not only by the past one. The most complete information about a system at a time between a past measurement and a future one, if we know the results of both the final measurement and the initial one, is provided by the two corresponding state vectors (the future one and the past one). By following this symmetrical treatment the authors were able to develop the ABL rule, which is a rule for computing the probabilities of the outcomes of an intermediate observable measured between the two strong measurements. The rule is:

⁶ Cramer's transactional interpretation might have an answer, but I consider that a much clearer account of these retro-causal processes is given by the Two-State Vector Formalism (see the next section).

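$$\mathrm{Prob}(c_i \mid \psi^{(0)}, \psi^{(f)}) = \frac{|\langle \psi^{(f)} | \psi^{(c_i)} \rangle|^2 \, |\langle \psi^{(c_i)} | \psi^{(0)} \rangle|^2}{\sum_j |\langle \psi^{(f)} | \psi^{(c_j)} \rangle|^2 \, |\langle \psi^{(c_j)} | \psi^{(0)} \rangle|^2}, \tag{7.5}$$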

where $|\psi^{(c_i)}\rangle$ refers to an eigenstate of an observable $\hat{C}$ measured between the pre-selected state $|\psi^{(0)}\rangle$ and the post-selected state $|\psi^{(f)}\rangle$. Thus, if for example we make an initial measurement to prepare the system in a certain state $|\psi^{(0)}\rangle$, then make another measurement of an observable $\hat{C}$, and finally make a third measurement, we can ask for the probability of obtaining an intermediate result $c_i$ given the initial preparation of the system and a certain final state that we selected (recall example two in Sec. 5.3).

This probability, in the case that $c_i$ is non-degenerate, is provided by Eq. 7.5. What concerns us about the ABL rule is that it is aimed at answering the question “what is the probability of obtaining a certain state $|\psi^{(c_i)}\rangle$ given that in the future a state $|\psi^{(f)}\rangle$ will be obtained?” The ABL rule then nicely illustrates that, in the same way that standard quantum mechanics deals with the probabilities of obtaining a certain quantum state given a prior measurement, we can also ask and answer, with the same quantum formalism, questions about the probabilities of quantum states given a later measurement.
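As an illustration of how Eq. 7.5 is evaluated, the following sketch computes ABL probabilities for a spin-1/2 system. The function name and the chosen pre- and post-selected states are assumptions made for this example only; it covers the non-degenerate case with no time evolution between the measurements.

```python
import numpy as np


def abl_probabilities(pre, post, eigenstates):
    """ABL probabilities (Eq. 7.5) for the non-degenerate eigenstates of the
    intermediate observable, given normalized pre- and post-selected states."""
    weights = np.array([
        abs(np.vdot(post, c)) ** 2 * abs(np.vdot(c, pre)) ** 2
        for c in eigenstates
    ])
    total = weights.sum()
    if np.isclose(total, 0):
        raise ValueError("this post-selection never succeeds for the given pre-selection")
    return weights / total


plus_z = np.array([1, 0], dtype=complex)
minus_z = np.array([0, 1], dtype=complex)
plus_x = (plus_z + minus_z) / np.sqrt(2)
plus_y = (plus_z + 1j * minus_z) / np.sqrt(2)

# Intermediate observable: sigma_z, with eigenstates |+z> and |-z>.
sigma_z_eigenstates = [plus_z, minus_z]

# Pre-selected |+x>, post-selected |+z>: the intermediate outcome is fixed by the future.
fixed_by_future = abl_probabilities(plus_x, plus_z, sigma_z_eigenstates)

# Pre-selected |+x>, post-selected |+y>: both intermediate outcomes are equally likely.
undetermined = abl_probabilities(plus_x, plus_y, sigma_z_eigenstates)

print(fixed_by_future)
print(undetermined)
```

In the first case the post-selection alone fixes the intermediate outcome (probability 1 for $|+z\rangle$), even though the pre-selected state by itself would give 1/2 for each outcome.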

7.3.2 The Two States

The original idea presented in [40] was strengthened and expanded in later works by Aharonov and Vaidman, until TSVF was developed. TSVF can be regarded as an extension of the “one vector state formalism” (the “one vector state” is none other than the traditional state vector of quantum mechanics). For TSVF the quantum state of a system between two strong complete⁷ measurements is described by two vectors: the “old” vector $|\psi^{(0)}\rangle$, which resulted from the first measurement and evolves with a unitary operator $\hat{U}(t_0, t_m)$ from time $t_0$ to $t_m$, and another vector, which the authors represent as $\langle\psi^{(f)}|$, that resulted from the final strong measurement and retro-evolves from a time $t_f$ to a prior time $t_m$ according to the adjoint of the “normal” unitary operator, such that:

$$\langle \psi_2^{(m)} | = \langle \psi^{(f)} | \, \hat{U}^{\dagger}(t_m, t_f). \tag{7.6}$$

Hence, the most complete description of a quantum state, between two complete mea- surements, is given by the two-state vector

$$\langle \psi_2^{(m)} | \; | \psi_1^{(m)} \rangle, \tag{7.7}$$

where I am following the way the authors write this state [31] (note the subindices 1 and 2 that distinguish the states; one state is not the adjoint of the other). Notice how nicely TSVF fits with the ABL rule; the probability of obtaining a certain outcome $c_i$, corresponding to an eigenstate $|\psi^{(c_i)}\rangle$ of an observable $\hat{C}$, at an intermediate time $t_m$ between $t_0$ and $t_f$ is

$$\mathrm{Prob}(c_i \mid \psi^{(0)}, \psi^{(f)}) = \frac{|\langle \psi_2^{(m)} | \psi^{(c_i)} \rangle|^2 \, |\langle \psi^{(c_i)} | \psi_1^{(m)} \rangle|^2}{\sum_j |\langle \psi_2^{(m)} | \psi^{(c_j)} \rangle|^2 \, |\langle \psi^{(c_j)} | \psi_1^{(m)} \rangle|^2}, \tag{7.8}$$

see [31]. Note that the result is obtained directly from the two-state vector of Eq. 7.7. TSVF is then a great step in the direction of restoring time symmetry in the measurement process of quantum mechanics. Although there are some details that I leave aside, and some further possible issues that might undermine some of the symmetry of the TSVF treatment [31], the point is that TSVF provides a nice theoretical ground for the possibility of advanced action in quantum mechanics.
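The role of the two vectors in Eq. 7.8 can be sketched numerically. In the following example the time evolution is replaced by two arbitrary single-qubit rotations, which is purely an assumption for illustration: `u_0m` stands for the evolution from $t_0$ to $t_m$ and `u_mf` for the evolution from $t_m$ to $t_f$, so that the backward-evolved bra is $\langle\psi^{(f)}|$ multiplied on the right by `u_mf` (one possible reading of the operator convention in Eq. 7.6). The sketch cross-checks the two-state expression against the standard forward-in-time calculation.

```python
import numpy as np


def rotation_y(theta):
    """Unitary exp(-i * theta * sigma_y / 2), a stand-in for the time evolution."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def two_state_probabilities(ket_forward, ket_backward, eigenstates):
    """Eq. 7.8: probabilities from the forward ket and the backward bra at time t_m.
    ket_backward is the ket whose adjoint is the backward-evolving bra."""
    weights = np.array([
        abs(np.vdot(ket_backward, c)) ** 2 * abs(np.vdot(c, ket_forward)) ** 2
        for c in eigenstates
    ])
    return weights / weights.sum()


psi_0 = np.array([1, 0], dtype=complex)               # pre-selected state at t_0
psi_f = np.array([1, 1], dtype=complex) / np.sqrt(2)  # post-selected state at t_f
u_0m = rotation_y(0.7)   # hypothetical evolution t_0 -> t_m
u_mf = rotation_y(1.1)   # hypothetical evolution t_m -> t_f

psi_1 = u_0m @ psi_0             # forward-evolved ket at t_m
psi_2 = u_mf.conj().T @ psi_f    # its adjoint is the bra <psi_f| u_mf at t_m

eigenstates = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
probs = two_state_probabilities(psi_1, psi_2, eigenstates)

# Standard calculation: evolve forward, project at t_m, evolve, project at t_f, normalize.
joint = np.array([
    abs(np.vdot(psi_f, u_mf @ c)) ** 2 * abs(np.vdot(c, u_0m @ psi_0)) ** 2
    for c in eigenstates
])
standard = joint / joint.sum()

print(probs, standard)
```

Both calculations agree, which illustrates the statement that TSVF reorganizes the standard formalism rather than changing its predictions.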

It is important to mention that, according to the authors, TSVF makes no predictions beyond those of traditional quantum mechanics (the ABL rule, for example, is derived simply from standard quantum mechanics and Bayes' theorem). TSVF is no more than an interpretation: “the two formalisms [the standard one and TSVF] describe the same theory with the same predictions. The difference is that the standard approach is time asymmetric and it is assumed that only the results of the measurements in the past exist” [31].
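The remark that the ABL rule follows from standard quantum mechanics and Bayes' theorem can be made explicit in a few lines. The sketch below assumes projective measurements, a non-degenerate intermediate observable and no time evolution between the measurements; the symbol $P(\cdot)$ for the ordinary forward-in-time probabilities is introduced here only for this derivation.

```latex
\begin{align}
  % Standard quantum mechanics: joint probability of the intermediate outcome c_i
  % followed by the final outcome f, given the pre-selected state.
  P(c_i, f \mid \psi^{(0)})
    &= |\langle \psi^{(c_i)} | \psi^{(0)} \rangle|^2 \,
       |\langle \psi^{(f)} | \psi^{(c_i)} \rangle|^2 , \\
  % Total probability that the post-selection succeeds.
  P(f \mid \psi^{(0)})
    &= \sum_j P(c_j, f \mid \psi^{(0)}) , \\
  % Bayes' theorem: condition on the post-selection having succeeded.
  \mathrm{Prob}(c_i \mid \psi^{(0)}, \psi^{(f)})
    &= \frac{P(c_i, f \mid \psi^{(0)})}{P(f \mid \psi^{(0)})}
     = \frac{|\langle \psi^{(f)} | \psi^{(c_i)} \rangle|^2 \,
             |\langle \psi^{(c_i)} | \psi^{(0)} \rangle|^2}
            {\sum_j |\langle \psi^{(f)} | \psi^{(c_j)} \rangle|^2 \,
                    |\langle \psi^{(c_j)} | \psi^{(0)} \rangle|^2} .
\end{align}
```

The last line coincides with Eq. 7.5, so no assumption beyond the standard Born rule and conditioning on the final outcome is needed.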

⁷ TSVF can treat cases in which both the initial and final strong measurements are not complete. This is achieved by the introduction of what the authors call “generalized vectors” [41].

7.3.3 TSVF and weak measurements

As we have seen, standard quantum mechanics is profoundly puzzled by the weak values, since they seem to contain information about future states of the system. We have shown in previous chapters that there seems to be no way to understand the physical meaning of the weak values within standard quantum mechanics. TSVF, however, easily explains the weak value, as I will now show.

As we already know, a weak value is a value yielded by means of two measurements. In that sense, the same way TSVF interprets the ABL rule can be used here to interpret the weak value; a vector evolving from the past as a result of a strong initial measurement simultaneously coexists with a vector evolving from the future from a future strong measurement; thus, in the intermediate time, we must have two vectors, one having the information of the initial state and the other one that of the final state.

Now, suppose that the initial measurement of an observable $\hat{A}$ yielded a value a, to which the state $|a\rangle$ corresponds, and the final measurement of an observable $\hat{B}$ yielded a value b, with which the state $|b\rangle$ is associated. Then two things must be true if TSVF is correct: 1) if we perform an intermediate measurement of $\hat{A}$, then we must obtain a again, as standard quantum mechanics affirms (we know this actually happens); 2) if we were to measure $\hat{B}$ in the time between the two measurements, we should obtain b because of the “future to past” vector $\langle b|$ (recall that this vector evolves from future to past and that we denote it as a bra, following the notation explained in the previous section). However, we cannot test this second prediction with a strong measurement, since any such measurement performed in the intermediate time will destroy the past state (changing the initial boundary conditions of the experiment). In that sense, the second prediction of TSVF is untestable by strong measurements.

However, if we were capable of measuring $\hat{B}$ without destroying the “past to future” state, the state $|a\rangle$, then it would be possible to test the second prediction, namely, to corroborate that we obtain b because of the “future to past” state $\langle b|$. Fortunately, we can do exactly that by performing a weak measurement! If we perform a weak measurement of $\hat{B}$ with an initially prepared state $|a\rangle$ and a final state $|b\rangle$, then our outcome will be given by

$$B_w = \frac{\langle b | \hat{B} | a \rangle}{\langle b | a \rangle} = b. \tag{7.9}$$

(If we had weakly measured $\hat{A}$, then the weak value would have yielded a, as it must because of the initial state.) That this second prediction can be experimentally corroborated provides very strong support for TSVF, even though it is only an interpretation. Moreover, if we measure $\hat{B}$ and then $\hat{A}$ in the intermediate time, and TSVF is true, then we must obtain b for the first measurement and a for the second one, because the first one is “extracting” the information of the “future to past” state vector, which is $\langle b|$, while the second one is “extracting” the information from the “past to future” state vector $|a\rangle$. Strong measurements, as I said, are incapable of testing these predictions since they disturb the initial and final conditions; if we strongly measure $\hat{B}$ in the intermediate time we will “erase” the information of $\hat{A}$ in the case of non-commuting observables, and this entails that measuring $\hat{A}$ after measuring $\hat{B}$ will not guarantee that we obtain a again. But with weak measurements we can do exactly that and find first b and then a, and if we weakly measure $\hat{B}$ again, we will obtain b, as it has to be because of the “future to past” state vector $\langle b|$. To sum up, the weak measurements are weak enough to permit the coexistence of both the “forward” state and the “backward” state.
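Eq. 7.9 and the parenthetical remark about $\hat{A}$ can be verified with a small numerical sketch. The specific observables are assumptions chosen for illustration: $\hat{A} = \hat{\sigma}_z$ with pre-selected eigenstate $|a\rangle = |+z\rangle$ (a = +1) and $\hat{B} = \hat{\sigma}_x$ with post-selected eigenstate $|b\rangle = |-x\rangle$ (b = −1).

```python
import numpy as np


def weak_value(post, operator, pre):
    """Weak value <post|operator|pre> / <post|pre>."""
    overlap = np.vdot(post, pre)  # np.vdot conjugates its first argument
    if np.isclose(overlap, 0):
        raise ValueError("pre- and post-selected states are orthogonal")
    return np.vdot(post, operator @ pre) / overlap


sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)  # plays the role of A
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)   # plays the role of B

ket_a = np.array([1, 0], dtype=complex)                # |a> = |+z>, eigenvalue a = +1
ket_b = np.array([1, -1], dtype=complex) / np.sqrt(2)  # |b> = |-x>, eigenvalue b = -1

b_weak = weak_value(ket_b, sigma_x, ket_a)  # determined by the post-selected state
a_weak = weak_value(ket_b, sigma_z, ket_a)  # determined by the pre-selected state

print("weak value of B:", b_weak.real)
print("weak value of A:", a_weak.real)
```

The weak value of $\hat{B}$ equals the eigenvalue b of the post-selected state, and the weak value of $\hat{A}$ equals the eigenvalue a of the pre-selected state, even though the two observables do not commute.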

As we have seen, TSVF provides a very elegant and natural interpretation of the weak values: a weak value depends on the past and on the future as well, because there are two state vectors, one evolving and the other retro-evolving. Together, both vectors provide all the information that an intermediate state between two strong measurements has. It is true that the weak value emerges from the standard formalism of quantum mechanics, and in that sense it cannot be taken as a novel prediction of TSVF. But the standard interpretation of quantum mechanics cannot explain how it is that the weak values contain information about the future, while TSVF, as I said, explains it easily. Although it is only an interpretation, I consider that we can take the results yielded by weak measurements as providing strong empirical support for TSVF (it should be a matter of further investigation to determine how weak values can be understood within other time-symmetric quantum theories such as Cramer's).

If TSVF appeals to a vector evolving from the future, it must be committed to advanced action, and then it must be protected from the bilking argument. TSVF is certainly protected from the bilking argument. The reason is that, experimentally, we can only obtain the weak value after performing the post-selection. But once we make the post-selection, the intermediate state no longer exists, so there is no way that we can manipulate it to produce a causal paradox. More precisely, weak measurements avoid causal paradoxes for two reasons: first, we cannot obtain information about a single system, but only about a sufficiently large ensemble. Second, the information about the ensemble is obtained only a posteriori, after the post-selection has been performed. Hence, the mark left by the post-selection is only revealed for a large enough sample and, of course, after the post-selection has been performed. The weak value is then protected from the bilking argument, as is TSVF in general, given that TSVF predictions concerning the “future to past” state vector can only be evaluated through the implementation of weak measurements plus post-selection.

Moreover, TSVF easily handles EPR experiments: it claims that the strong measurement performed on each arm creates a “future to past” state vector, which carries information about this final measurement to an earlier state of the particles. However, to my knowledge, the defenders of TSVF have never used the theory to explain the outcomes of Bell experiments. Whether Price’s ideas and TSVF are fully compatible is a matter for further research. As far as I can see, TSVF provides an adequate physical ground to support advanced action in EPR experiments.

In this section we have studied TSVF and shown that this theory not only easily explains how weak values depend on the future, but also provides a powerful extension of traditional quantum theory, an extension that can be used to explain Bell experiments by appealing to advanced action, following the ideas of de Beauregard, Cramer, Price and Dowe. Moreover, after the last two sections it seems that we have overcome the contradiction: the hidden variables suggested by our experiment are not ruled out by the Bell and Leggett inequalities, because “ours” seem to be hidden variables coming from the future, and since those inequalities rest on the independence assumption, they have nothing to say against such hidden variables. But what we have achieved throughout the last sections is not only a resolution of the contradiction, but also a very powerful interpretation that gives new insight into the quantum realm: by appealing to backward causation we are able to better understand the outcomes of EPR experiments, we avoid non-locality in those experiments, we can explain Wheeler’s delayed-choice experiment, we adopt a position more coherent with the temporal symmetry of quantum laws, and we are finally able to understand why weak values seem to present information about future measurements (see Table 7.1). Given all these advantages, it is very hard to resist the idea that backward causation is a core feature of the quantum world. The cost, if any, is the counterintuitive nature of the idea of backward causation. However, with respect to this last point, I ask: is it not profoundly counterintuitive to have a world in which things do not have properties that account for their outcomes, in which the concept of reality is seriously harmed, in which unexplained correlations occur (recall the correlations of weak values), and in which paradoxes such as those entailed by Wheeler’s delayed-choice experiment are unresolvable? If intuition is to play any role, then it seems that intuition would prefer a quantum realm governed by both forward and backward causation rather than one with only forward causation.

7.3.4 A last objection

Even if we concede that weak values interpreted according to TSVF are protected from the bilking argument, someone could argue that the weak value can be explained solely in terms of forward in time causation. Let us name this position OFC (for “only forward causation”). As we already studied in the previous chapter, the standard interpretation of quantum mechanics is incapable of providing an adequate interpretation of a weak value, and this was clear enough in the cases in which a special correlation between two non-commuting observables was obtained. However, someone could object that even though the standard interpretation falls short in explaining the weak values, there is still a possibility of explaining these weak values (in the context of the experiment proposed here) with forward in time processes. I will now reply to this last and desperate attempt to avoid backward causation in the context of weak measurements.

Position                Explains weak values  Respects realism  An intuitive position  Locality in EPR experiments  Explains our experiment  Additional advantages
Backward causation      Yes                   Yes               No                     Yes                          Yes                      Time symmetrical
Only forward causation  No                    No                Yes*                   No                           No                       ?

Table 7.1: This table shows how the backward causation position handles the problem of the interpretation of weak values and the outcomes of EPR experiments, in comparison with a position that denies the possibility of backward causation. Neither position is intuitive, but one might argue that the idea of backward causation renders that position completely counterintuitive. However, I consider that the “forward causation” position is far from being intuitive, and this is mainly due to the absence of “realism” embodied by the violation of the Bell and Leggett inequalities. Someone could perfectly well mark both positions as not intuitive, or argue that the “backward causation” position, insofar as it avoids very counterintuitive ideas, is much more intuitive than its rival.

If we adopt a position that suggests that weak values can be completely explained in terms of forward in time causation, then we must explain how the correlations found between the future strong measurement and the past weak measurement are obtained. The only possible way to do this, in the case of OFC, is to suggest that the weak measurements exert a kind of influence on the system such that we obtain the final outcomes that we do obtain for the strong measurement in the post-selection. In short, OFC will claim that, in the case of our proposed experiment, the weak measurement $\hat{\sigma}_{zw}$ induces a certain bias in the system, so that the other particle notices this bias and adopts the state $|+\rangle_z$ (which would explain why this second particle successfully passes the post-selection).

This suggestion, however, is not only refuted by the no-hidden-variables conclusion that we studied in section 7.2, but is also problematic for another reason that we briefly examined before. If the bias is strong enough to explain the clear correlation found between the $\hat{\sigma}_z$ list and the $\sigma_{zw}$ list (between the weak value in one arm and the final post-selection made on the partner photon in the second arm), then it would have had to collapse the state of the photon at the moment of the weak measurement, and no Bell inequalities could be violated (in principle, by a suitable arrangement of the polarizers, we could check for Bell inequalities in our experiment). Aharonov and collaborators say, in [34]: “How robust is the alleged bias introduced by the weak measurements? If it is robust enough to oblige the strong measurements, then it is equivalent to full collapse, namely local hidden variables, already ruled out by Bell’s inequality”. So it should not be a significant bias. However, if weak measurements do not impose a robust bias, then OFC is confronted with an additional problem: the clear correlation found between the $\hat{\sigma}_z$ list and the $\sigma_{zw}$ list remains unexplained, since it seems unsatisfactory that a bias so subtle and weak could produce such a strong correlation. Unfortunately for OFC, the explanation of this correlation was the whole motivation of its suggestion.

7.4 Summary of the chapter

As a way of connecting the main results of the previous sections, I will rebuild the chain of ideas that led me to claim that weak measurements reveal retrocausal physical processes. I start by recalling that in section 7 we saw that the proposed experiment seems to point to a hidden-variable position. Nevertheless, the violation of the Bell and Leggett inequalities seriously challenges any hidden-variable position. Confronted with these “no hidden variables results”, we can consider the possibility of taking the weak values as not physical. But then we would have to go against quantum mechanics itself, from which the indirect measurement theory is developed, and we would have to give up any explanation of the clear correlations between strong and weak measurements, advocating something along the lines of CV, i.e., arguing that these correlations are simply inexplicable. This in turn would lead us to adopt a not very scientific attitude, in the sense that we would have to give up explaining correlations between physical events. Given this obscure path, it seems better to return to the Bell and Leggett inequalities and see if something can be done.

If we admit that weak values are physical, we must surely assert one thing: the weak value provides information about the future state (this does not necessarily imply a backward causation view; in classical mechanics, for example, the velocity and position of a particle provide information about the future states of the particle). But if we admit that weak values provide information about the future state, we are confronted with two options: 1) the information about the future state is present in the weak value because the system already had that information at the moment of the weak measurement and transmitted it to the final result (forward in time causation); or 2) there is backward causation, so that the final state somehow comes to affect the earlier state. Either route commits us to a kind of hidden-variable theory, but only the “forward in time hidden variables” position is ruled out by Bell and Leggett. We can make room for a hidden-variable position by questioning the idea that the state of the particle is independent of the setting of the device (that is, by questioning IA).

If we abandon IA, we can still adopt an only forward causation position, so we are again confronted with the two options cited above, i.e., OFC or backward causation. However, if we reject IA and adopt OFC, we fall directly into a monstrous determinism, which entails that our scientific image of the world is completely mistaken: the Swiss lottery, the free will of the experimentalist and radioactive processes would be tightly coordinated by a frightening underlying mechanism. If this were the only way to abandon IA, then it would seem preferable to abandon weak measurements once and for all, despite all the problems mentioned above. But, as I said, we can also abandon IA by committing ourselves to a backward causation position. We can then follow Price, de Beauregard, Cramer and Dowe (among others) and suggest that the state of the system and the meter are not independent, not because of a common event in the past, but because there is a common cause in the future that accounts for this dependence: this event is none other than the moment in which the system and the meter interact.

By considering the possibility of advanced action we can then easily abandon IA without compromising, for example, the apparently free choice of the experimentalist: he can choose whatever he wants, and his choice is what determines the earlier state of the system, via backward causation. Moreover, adopting a backward causation view permits us to easily explain the outcomes of Bell experiments and to avoid action at a distance, as Price explains. Furthermore, to make room for backward causation in quantum mechanics we only need to extend the standard formalism, as Aharonov and Vaidman do. This is achieved by adding to the “old” state vector another one that retro-evolves. We then have the same predictions as quantum mechanics (all its achievements are preserved), and a better understanding of quantum phenomena emerges.

By appealing to TSVF we can finally understand why the weak value contains information about the future. By adopting a backward in time interpretation we are also taking seriously the idea that the physical laws are time symmetric. What would be puzzling is not that backward causation occurs in the world, as I claim it does within the context of weak measurements, but that there were no backward processes, given, as I said, the symmetry of quantum laws. Finally, the advanced action we are adopting, via TSVF and Price’s ideas, is protected from the bilking argument. Everything then fits nicely together. Thus, weak measurements help us to recover a much more Einsteinian reality, in which quantum states have properties that account for their outcomes. These properties, nevertheless, cannot be explained without reference to backward in time processes.

EPR ended their article: “While we have shown that the wave function does not provide a complete description of the physical reality, we left open the question of whether or not such a description exists. We believe, however, that such a theory is possible” [15]. I think that such a theory is possible; it is the same old quantum mechanics but considered within a time-symmetric perspective, following the block universe view, in which the future and the past can produce effects on the present. That this is so is revealed by weak values, as I have argued in this work.

Chapter 8

Conclusions

As a way of concluding I want to present in a succinct manner the main results of the three parts of the present work.

In the first part we discussed five different contemporary theories of causation, since a study of causation is a natural first step before studying the concept of backward causation. In the next chapter we studied backward causation itself. We saw that three theories of causation (the process theory, the manipulabilist theory and the probabilistic theory) are well suited to handle backward causation. The discussion of backward causation led us to an examination of the arrow of time of the universe. I briefly explained that the physical laws do not provide any arrow of time, and then I explained how the arrow of time could be understood by means of subjective considerations. We also took a glance at the block universe model, which takes the present, the past and the future as objectively equal. The symmetry of the physical laws together with the block universe model suggests that backward causation should be expected, at least in the quantum realm, far from the asymmetries that arise in the macroscopic world. We ended this first part by discussing some physical theories that have appealed to the idea of backward causation in order to explain some physical processes. This first part was, in short, dedicated to the conceptual analysis of backward causation.

In the second part we studied the theory of weak measurements. First we studied the theory of indirect measurements, in which the measurement process is described by taking into account both the system that we want to measure and the pointer. In the next chapter I presented the theory of weak measurements, first by looking at the original paper in which it was developed, and then through a different and more general approach that takes into account the different effects that arise due to the interplay (the action-reaction picture) between the system and the measurement device. We saw that weak measurements are to be understood, not as measurements with large uncertainty, but as measurements that tend to approach the conditions of uncertain infinitesimal transformations.

The last part was dedicated to the interpretation of weak values. In particular, it was dedicated to understanding the fact that these values depend on the final state of the system. In the first chapter of this part we discussed possible objections against the idea that weak values are physical properties of the system. We saw that taking the weak values as not physical would commit us to the problematic idea that they are “magical” entities and that the clear correlations between the weak values and the post-selection are simply inexplicable. Indeed, in that chapter we came to the conclusion that a position that considers the weak values as unphysical is unpromising. In the last chapter we saw how we could easily understand weak values by means of backward processes: the final post-selection together with the initial selection causes the intermediate state of the system that yields the corresponding weak value. As we examined, this proposal fits nicely into the Two State Vector Formalism.

Moreover, in the last chapter an experiment was proposed which, as explained, tends to suggest a very subtle type of hidden variables: the weak measurement performed on the photon in one arm is perfectly anti-correlated with the final state of the photon in the other arm. We showed that these alleged hidden variables are apparently prohibited by the violations of the Bell and Leggett inequalities, and so we stressed that a contradiction had emerged: weak measurements indicate hidden variables and the mentioned violations prohibit them. Given that both the inequalities and weak measurements are developed within the formalism of quantum mechanics, this contradiction appeared to put the internal consistency of quantum mechanics at serious risk. The way out was to revise the independence assumption that both Bell and Leggett make (it is not the only assumption that we can revise, but it is the one whose rejection best explains weak values and the EPR results): the state of the device is not independent of the state of the particle before the measurement. We saw that we do not need to be committed to a frightening reality that accounts for these relations between the system and the final measurement, but need only appeal to the idea of backward causation: the state of the system is not independent because of the future interaction with the meter.

Note that the hypothesis of backward causation was mandatory¹ if we wanted to safeguard the consistency of quantum mechanics. Moreover, note that besides the not negligible fact that backward causation is a direct way out of the contradiction mentioned above, appealing to backward causation allows us, following Price among others, to explain Bell’s results, something that we have not yet done within the forward causation perspective. Hence, backward causation has virtues independent of that of explaining weak values. Given that it easily solves some of the deepest mysteries of the quantum world, and given that many years ago physicists such as Cramer and de Beauregard suggested it, it is strange that the idea of backward causation in quantum mechanics has not received more attention. Moreover, given the time symmetry of the physical laws, backward causation, at least in the quantum realm, is exactly what we should have expected. An Einsteinian reality is thus at hand if we are willing to accept that the quantum world does not need to respect a subjective arrow of time. And weak values, as we have studied throughout this work, are perhaps the best allies of Einstein.

¹ Actually, what is mandatory is the rejection of some assumption required by the Bell and Leggett inequalities, in order to avoid the internal contradiction mentioned above. But the point is that only the hypothesis of backward causation accomplishes the double task of avoiding the internal contradiction and explaining weak values.

Appendix A

Derivation of some results

In this Appendix the reader will find the derivation of three results presented during the thesis.

A.1 Derivation of Eq. 4.13

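\[
\mathrm{Tr}\big[(I_\psi \otimes \hat{P}_{\phi_k})\,\rho_{\Psi^{(1)}}\big]
= \sum_i \sum_m \langle a_i|\langle m|\,(I_\psi \otimes \hat{P}_{\phi_k})\,\rho_{\Psi^{(1)}}\,|m\rangle|a_i\rangle
= \sum_i \sum_m \langle m|\phi_k\rangle\langle\phi_k|\phi^{(i)}\rangle\langle\phi^{(i)}|\phi_k\rangle\,\langle a_i|\rho_{\psi^{(0)}}|a_i\rangle ,
\tag{A.1}
\]

where we have used that $\rho_{\Psi^{(1)}} = \rho_{\phi^{(i)}} \otimes \rho_{\psi^{(0)}} = |\phi^{(i)}\rangle\langle\phi^{(i)}| \otimes \rho_{\psi^{(0)}}$ and the definition of $\hat{P}_{\phi_k}$, namely $\hat{P}_{\phi_k} = |\phi_k\rangle\langle\phi_k|$. Finally, using that $\langle m|\phi_k\rangle = \delta_{k,m}$, we obtain:

\[
\mathrm{Tr}\big[(I_\psi \otimes \hat{P}_{\phi_k})\,\rho_{\Psi^{(1)}}\big]
= \sum_i \langle\phi_k|\phi^{(i)}\rangle\langle\phi^{(i)}|\phi_k\rangle\,\langle a_i|\rho_{\psi^{(0)}}|a_i\rangle
= \sum_i \Big( |\langle\phi^{(i)}|\phi_k\rangle|^2\, \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle \Big).
\tag{A.2}
\]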

A.2 Derivation of Eq. 4.16

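\[
\mathrm{Tr}_\phi\!\left[\frac{(I_\psi \otimes \hat{P}_{\phi_k})\,\rho_{\Psi^{(1)}}\,(I_\psi \otimes \hat{P}_{\phi_k})}{\mathrm{Prob}(\phi_k|\rho_{\Psi^{(1)}})}\right]
= \frac{1}{\mathrm{Prob}(\phi_k)} \sum_m \langle m|\,\big((I_\psi \otimes \hat{P}_{\phi_k})\,\rho_{\Psi^{(1)}}\,(I_\psi \otimes \hat{P}_{\phi_k})\big)\,|m\rangle .
\tag{A.3}
\]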


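Using that $\rho_{\Psi^{(1)}} = \hat{U}\,\rho_{\phi^{(0)}} \otimes \rho_{\psi^{(0)}}\,\hat{U}^{\dagger}$, and that $\hat{P}_{\phi_k} = |\phi_k\rangle\langle\phi_k|$, we have:

\[
\begin{aligned}
\frac{1}{\mathrm{Prob}(\phi_k)} \sum_m \langle m|\,\big((I_\psi \otimes \hat{P}_{\phi_k})\,\rho_{\Psi^{(1)}}\,(I_\psi \otimes \hat{P}_{\phi_k})\big)\,|m\rangle
&= \frac{1}{\mathrm{Prob}(\phi_k)} \sum_m \langle m|\,\big(|\phi_k\rangle\langle\phi_k|\,\hat{U}\,\rho_{\phi^{(0)}} \otimes \rho_{\psi^{(0)}}\,\hat{U}^{\dagger}\,|\phi_k\rangle\langle\phi_k|\big)\,|m\rangle \\
&= \frac{1}{\mathrm{Prob}(\phi_k)} \big(\langle\phi_k|\,\hat{U}\,\rho_{\phi^{(0)}} \otimes \rho_{\psi^{(0)}}\,\hat{U}^{\dagger}\,|\phi_k\rangle\big),
\end{aligned}
\tag{A.4}
\]

since $\langle m|\phi_k\rangle = \delta_{m,k}$. Finally, define the measurement operators $\hat{M}_k \equiv \langle\phi_k|\hat{U}|\phi^{(0)}\rangle$ and $\hat{M}_k^{\dagger} \equiv \langle\phi^{(0)}|\hat{U}^{\dagger}|\phi_k\rangle$. Then we obtain:

\[
\mathrm{Tr}_\phi\!\left[\frac{(I_\psi \otimes \hat{P}_{\phi_k})\,\rho_{\Psi^{(1)}}\,(I_\psi \otimes \hat{P}_{\phi_k})}{\mathrm{Prob}(\phi_k|\rho_{\Psi^{(1)}})}\right]
= \frac{1}{\mathrm{Prob}(\phi_k)}\,\hat{M}_k\,\rho_{\psi^{(0)}}\,\hat{M}_k^{\dagger},
\tag{A.5}
\]

where we have used that $\rho_{\phi^{(0)}} = |\phi^{(0)}\rangle\langle\phi^{(0)}|$.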
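The identity in Eq. A.5 can be checked numerically. The following is only a minimal sketch under my own assumptions: a two-dimensional system, a three-dimensional meter, a random unitary standing in for $\hat{U}$, the meter starting in basis state 0, and the tensor ordering system ⊗ meter. None of these choices comes from the thesis; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
ds, dm = 2, 3  # assumed system and meter dimensions (illustrative only)

# Random unitary on system (x) meter, obtained from a QR decomposition.
z = rng.normal(size=(ds * dm, ds * dm)) + 1j * rng.normal(size=(ds * dm, ds * dm))
U, _ = np.linalg.qr(z)

# Random density matrix for the system.
a = rng.normal(size=(ds, ds)) + 1j * rng.normal(size=(ds, ds))
rho_sys = a @ a.conj().T
rho_sys /= np.trace(rho_sys)

# The meter starts in the pure state |phi^(0)> = basis vector 0.
phi0 = np.zeros(dm)
phi0[0] = 1.0
rho_meter = np.outer(phi0, phi0)

# State after the pre-measurement: U (rho_sys (x) rho_meter) U^dagger.
rho1 = U @ np.kron(rho_sys, rho_meter) @ U.conj().T

# Project the meter onto |phi_k> and trace the meter out (left side of A.5,
# before dividing by the probability).
k = 1
Pk = np.zeros((dm, dm))
Pk[k, k] = 1.0
proj = np.kron(np.eye(ds), Pk)
unnormalised = (proj @ rho1 @ proj).reshape(ds, dm, ds, dm)
lhs = np.einsum("imjm->ij", unnormalised)
prob = np.trace(lhs).real

# Measurement operator M_k = <phi_k| U |phi^(0)>, acting on the system only.
Mk = U.reshape(ds, dm, ds, dm)[:, k, :, 0]
rhs = Mk @ rho_sys @ Mk.conj().T

print(np.allclose(lhs, rhs), prob)
```

Dividing both `lhs` and `rhs` by `prob` gives the normalised post-measurement state of the system, as in Eq. A.5.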

A.3 Derivation of Eq. 4.22

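\[
\langle f(\hat{Q})\rangle_1 = \mathrm{Tr}\big[f(\hat{Q})\,\rho_{\phi^{(1)}}\big]
= \int dm\, \langle m|\, f(\hat{Q}) \Big( \sum_i \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle \iint dq\,dx\, |q\rangle\, \phi_0(q - g a_i)\,\phi_0^{*}(x - g a_i)\, \langle x| \Big) |m\rangle ,
\tag{A.6}
\]

where we have used Eq. 4.21 and Eq. 4.22. Because of linearity we have

\[
\langle f(\hat{Q})\rangle_1 = \sum_i \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle \int dm \iint dq\,dx\, \langle m|\, f(\hat{Q})\, |q\rangle\, \phi_0(q - g a_i)\,\phi_0^{*}(x - g a_i)\, \langle x|m\rangle .
\tag{A.7}
\]

Finally, note that two integrals disappear, first through the term $\langle x|m\rangle$ and then through $\langle m|f(\hat{Q})|q\rangle = f(q)\,\langle m|q\rangle$. Hence, we can write Eq. A.7 as:

\[
\langle f(\hat{Q})\rangle_1 = \sum_i \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle \int dq\, f(q)\, \phi_0(q - g a_i)\,\phi_0^{*}(q - g a_i)
= \sum_i \langle a_i|\rho_{\psi^{(0)}}|a_i\rangle \int dq\, f(q)\, |\phi_0(q - g a_i)|^2 .
\tag{A.8}
\]

Appendix B

Details of the experiment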

I will end this thesis by discussing some details of the experiment¹. We are going to perform the weak measurements by means of birefringent crystals, similarly to the way in which a weak value was experimentally realized for the first time [26]. In our case, in the left arm the birefringent crystal induces a spatial displacement between the horizontal and vertical polarization components of the photons, that is, between $|H\rangle$ and $|V\rangle$. In the right arm the crystal axis is oriented such that a spatial displacement between the $|D\rangle$ and $|A\rangle$ components is obtained. The displacement produced by the crystal depends on the length of the crystal (see Fig. B.2), which we can vary according to our needs (we expect to determine the appropriate length experimentally, taking into account that, in order to obtain a weak measurement, the induced displacement has to be much smaller than the waist of the beam).
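The weakness condition stated above (a displacement much smaller than the waist of the beam) can be quantified through the overlap of the two displaced beam profiles. The sketch below is only illustrative and rests on assumptions of mine: the transverse profile is taken to be a Gaussian amplitude $e^{-q^2/4w^2}$, with `width` the standard deviation of the intensity, and the function name `overlap` is hypothetical. An overlap close to 1 means the two polarization components cannot be distinguished (weak regime); an overlap close to 0 means they are fully separated (strong regime).

```python
import numpy as np

def overlap(displacement, width, n=20001):
    """Normalised overlap of two Gaussian amplitudes separated by `displacement`.

    Assumed profile: exp(-q**2 / (4 * width**2)), so `width` is the standard
    deviation of the intensity. For this profile the overlap has the closed
    form exp(-displacement**2 / (8 * width**2)).
    """
    half_span = 10.0 * width + abs(displacement)
    q = np.linspace(-half_span, half_span, n)
    a = np.exp(-(q - displacement / 2.0) ** 2 / (4.0 * width**2))
    b = np.exp(-(q + displacement / 2.0) ** 2 / (4.0 * width**2))
    return float(np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2)))

weak = overlap(displacement=0.1, width=1.0)     # displacement << width
strong = overlap(displacement=10.0, width=1.0)  # displacement >> width
print(weak, strong)
```

With these illustrative numbers, the first case leaves the two components almost completely overlapping, while in the second they are essentially disjoint.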

Note that in this experiment our system is the polarization of the photons and our meter is the transverse momentum of the photons: when the beam passes through the birefringent crystal, it is deflected transversally according to the incoming polarization (it deviates upwards if the polarization is $|H\rangle$ and does not deviate if it is $|V\rangle$). The deflection thus indicates the polarization: by reading the deviation we gather information about the polarization. The transverse deviation acquired during the interaction with the birefringent crystal is measured by placing a screen (a photodetector) at the end, which keeps track of the final position of the photons. This applies to both arms.

¹ It should be pointed out that some details are kept out of the present discussion, since they will be determined experimentally, i.e., in the lab.

Figure B.1: A type-II nonlinear crystal is pumped by a pulsed laser and generates a pair of entangled photons via SPDC. The photons then pass through the birefringent crystals, which induce a spatial displacement of the polarization components (see Fig. B.2). The photons are strongly measured by the polarizer, and then they pass through two lenses that project the incoming light onto the imaging plane (an image is created). These images arrive at optical fibers that move (see arrows), so that they serve as screens.

Figure B.2: a) Illustration of how the birefringent crystal induces a displacement d between the $|H\rangle$ and $|V\rangle$ components of polarization for an incident beam (exactly the same principle applies to the other arm, but with a displacement between $|A\rangle$ and $|D\rangle$). b) By increasing the waist of the beam, it is possible to obtain the conditions for weakness, since now the displacement is less than the initial waist of the beam (we cannot distinguish between the components). Analogously, we can obtain a weak measurement of the polarization by making the crystal sufficiently short, so that the ray is almost unperturbed.

In what follows I will apply the indirect measurement formalism, studied in chapter 4, to calculate the density distribution of the final positions of the photons. Based on these density distributions we are going to interpret the results of our experiments (we could say that these density distributions are the predictions). Before I continue, it should be noted that the eigenvalues of $\hat{\sigma}_x$ and $\hat{\sigma}_z$ are simply 1 and −1; $\hat{\sigma}_x$ is, in this context, the observable of the polarization along x, to which the eigenvectors $|D\rangle$ (with eigenvalue 1) and $|A\rangle$ (with eigenvalue −1) are associated. $\hat{\sigma}_z$ is the observable of the polarization along z, to which the eigenstates $|H\rangle$ (with eigenvalue 1) and $|V\rangle$ (with eigenvalue −1) correspond.
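These eigenvalue assignments can be verified in a few lines. This is a minimal sketch that assumes the usual matrix representation, with $|H\rangle = (1, 0)$ and $|V\rangle = (0, 1)$, and the convention $|D\rangle = (|H\rangle + |V\rangle)/\sqrt{2}$, $|A\rangle = (|H\rangle - |V\rangle)/\sqrt{2}$, which is the one compatible with the change of basis used later in this appendix.

```python
import numpy as np

# Assumed representation: |H> = (1, 0), |V> = (0, 1).
H = np.array([1.0, 0.0])
V = np.array([0.0, 1.0])
D = (H + V) / np.sqrt(2.0)
A = (H - V) / np.sqrt(2.0)

sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])

# Each entry is True when the state is an eigenvector with the stated eigenvalue.
checks = {
    "sigma_z |H> = +|H>": np.allclose(sigma_z @ H, H),
    "sigma_z |V> = -|V>": np.allclose(sigma_z @ V, -V),
    "sigma_x |D> = +|D>": np.allclose(sigma_x @ D, D),
    "sigma_x |A> = -|A>": np.allclose(sigma_x @ A, -A),
}
for statement, ok in checks.items():
    print(statement, ok)
```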

We start with an EPR state, more precisely, with the singlet state $|\psi^{(0)}\rangle = \frac{1}{\sqrt{2}}\big(|H\rangle_1|V\rangle_2 - |V\rangle_1|H\rangle_2\big)$. In addition, let $|\phi^{(0)}\rangle_1$ and $|\phi^{(0)}\rangle_2$ be the meter states in the right and left arms respectively (recall that in our case the meter corresponds to the transverse momentum of the photons), so that the total initial state of the meter is $|\phi^{(0)}\rangle = |\phi^{(0)}\rangle_1 |\phi^{(0)}\rangle_2$ (we are assuming that the wave function of the meter is separable into the wave functions of each arm, a condition that can be achieved experimentally). The initial state of the total system (particle plus apparatus) will be

\[
|\Psi^{(0)}\rangle = |\psi^{(0)}\rangle \otimes |\phi^{(0)}\rangle = \frac{1}{\sqrt{2}}\big(|H\rangle_1|V\rangle_2 - |V\rangle_1|H\rangle_2\big) \otimes |\phi^{(0)}\rangle_1 |\phi^{(0)}\rangle_2 .
\tag{B.1}
\]

The pre-measurement in this case involves two evolution operators, one for each arm: $\hat{U}_2 = e^{-\frac{i}{\hbar} g\, \hat{\sigma}_{x_2} \otimes \hat{P}_2}$ is the evolution operator for the left arm, in which we will measure $\hat{\sigma}_x$, while $\hat{U}_1 = e^{-\frac{i}{\hbar} g\, \hat{\sigma}_{z_1} \otimes \hat{P}_1}$ is the evolution operator for the right arm, in which we will measure $\hat{\sigma}_z$. Here $\hat{P}_1$ is the transverse momentum operator of the right photon, while $\hat{P}_2$ is that of the left one. First, I will make the operators act on the states of the system:

Uˆ i ˆ i ˆ 1 (0) (1) ˆ ˆ (0) − gσˆx ⊗P2 − gσˆz ⊗P1 (0) (0) |Ψ i −→|Ψ i = U2U1 |Ψ i = e ~ 2 e ~ 1 √ (|Hi |V i − |V i |H2i) ⊗ |φ i |φ i 2 1 2 1 1 2 i i 1 − gσˆx ⊗Pˆ2 − gσˆz ⊗Pˆ1 (0) (0) = e ~ 2 e ~ 1 √ |Hi |V i ⊗ |φ i |φ i 2 1 2 1 2 i i 1 − gσˆx ⊗Pˆ2 − gσˆz ⊗Pˆ1 (0) (0) − e ~ 2 e ~ 1 √ |V i |Hi ⊗ |φ i |φ i 2 1 2 1 2 1 i i − gσˆx ⊗Pˆ2 − gσˆz ⊗Pˆ1 (0) (0) = e ~ 2 e ~ 1 |Hi (|Di − |Ai ) ⊗ |φ i |φ i 2 1 2 2 1 2 1 i i − gσˆx ⊗Pˆ2 − gσˆz ⊗Pˆ1 (0) (0) − e ~ 2 e ~ 1 |V i (|Di + |Ai ) ⊗ |φ i |φ i 2 1 2 2 1 2 1 i i 1 i i − gPˆ2 − gPˆ1 (0) (0) gPˆ2 − gPˆ1 (0) (0) = e ~ e ~ |Hi |Di ⊗ |φ i |φ i − e ~ e ~ |Hi |Ai ⊗ |φ i |φ i 2 1 2 1 2 2 1 2 1 2 1 i i 1 i i − gPˆ2 gPˆ1 (0) (0) gPˆ2 gPˆ1 (0) (0) − e ~ e ~ |V i |Di ⊗ |φ i |φ i − e ~ e ~ |V i |Ai ⊗ |φ i |φ i . 2 1 2 1 2 2 1 2 1 2 (B.2)

Note that, when necessary, I passed from the $\hat{\sigma}_z$ basis to the $\hat{\sigma}_x$ basis (for example, $|H\rangle = \frac{1}{\sqrt{2}}(|A\rangle + |D\rangle)$). Note also that $\hat{\sigma}_z$ is now replaced, in the exponentials, by 1 or −1 (the corresponding eigenvalues), and the same happens to $\hat{\sigma}_x$. Now I will make the operators act on the states of the meter, taking into account the fact that the exponentials in the previous result translate the pointer’s wave function in the q representation. The first term is:

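\[
\begin{aligned}
\frac{1}{2}\, e^{-\frac{i}{\hbar} g \hat{P}_2}\, e^{-\frac{i}{\hbar} g \hat{P}_1}\, |H\rangle_1 |D\rangle_2 \otimes |\phi^{(0)}\rangle_1 |\phi^{(0)}\rangle_2
&= \frac{1}{2}\, |H\rangle_1 |D\rangle_2 \otimes |\phi(-g)\rangle_1 |\phi(-g)\rangle_2 \\
&= \frac{1}{2}\, |H\rangle_1 |D\rangle_2 \otimes \int \phi_1(q_1 - g)_1\, |q_1\rangle\, dq_1 \int \phi_2(q_2 - g)_2\, |q_2\rangle\, dq_2 .
\end{aligned}
\tag{B.3}
\]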
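The step used in Eq. B.3, namely that $e^{-\frac{i}{\hbar} g \hat{P}}$ turns $\phi(q)$ into $\phi(q - g)$, can be illustrated numerically by applying the exponential as a phase in momentum space. This is only a sketch under my own assumptions: units with $\hbar = 1$, a Gaussian pointer of width 5 on a periodic grid, and an FFT as a stand-in for the change to the momentum representation. None of these numbers are the experimental ones.

```python
import numpy as np

hbar = 1.0   # assumed units
g = 1.0      # coupling, i.e. the size of the shift
sigma = 5.0  # assumed pointer width (illustrative)

# Periodic position grid, wide enough for the Gaussian to vanish at the edges.
n = 4096
q = np.linspace(-100.0, 100.0, n, endpoint=False)
dq = q[1] - q[0]
phi = np.exp(-q**2 / (4.0 * sigma**2))

# Momentum values associated with the FFT components.
p = hbar * 2.0 * np.pi * np.fft.fftfreq(n, d=dq)

# exp(-i g P / hbar) is a pure phase in the momentum representation.
shifted = np.fft.ifft(np.exp(-1j * g * p / hbar) * np.fft.fft(phi))

# Expected result: the same Gaussian, now centred at q = +g.
expected = np.exp(-(q - g) ** 2 / (4.0 * sigma**2))
max_err = np.max(np.abs(shifted - expected))
print(max_err)
```

Changing the sign in the exponent shifts the Gaussian to $q = -g$ instead, which is the case of the terms with eigenvalue −1.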

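Doing the same for all four terms, and writing the eigenstates of $\hat{\sigma}_x$ for photon 2 in the $\hat{\sigma}_z$ basis (and those of $\hat{\sigma}_z$ for photon 1 in the $\hat{\sigma}_x$ basis), we have:

\[
\begin{aligned}
|\Psi^{(0)}\rangle \xrightarrow{\;\hat{U}\;} |\Psi^{(1)}\rangle = \frac{1}{4}\Big[
&\big(|D\rangle_1 + |A\rangle_1\big)\big(|H\rangle_2 + |V\rangle_2\big) \otimes \int \phi_1(q_1 - g)_1\, |q_1\rangle\, dq_1 \int \phi_2(q_2 - g)_2\, |q_2\rangle\, dq_2 \\
-\,&\big(|D\rangle_1 + |A\rangle_1\big)\big(|H\rangle_2 - |V\rangle_2\big) \otimes \int \phi_1(q_1 - g)_1\, |q_1\rangle\, dq_1 \int \phi_2(q_2 + g)_2\, |q_2\rangle\, dq_2 \\
-\,&\big(|D\rangle_1 - |A\rangle_1\big)\big(|H\rangle_2 + |V\rangle_2\big) \otimes \int \phi_1(q_1 + g)_1\, |q_1\rangle\, dq_1 \int \phi_2(q_2 - g)_2\, |q_2\rangle\, dq_2 \\
-\,&\big(|D\rangle_1 - |A\rangle_1\big)\big(|H\rangle_2 - |V\rangle_2\big) \otimes \int \phi_1(q_1 + g)_1\, |q_1\rangle\, dq_1 \int \phi_2(q_2 + g)_2\, |q_2\rangle\, dq_2 \Big].
\end{aligned}
\tag{B.4}
\]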

The previous expression is the total state of the system and the pointer after the measurement interaction. It is very important to understand that $\phi_1(q_1 - g)_1$ is the right meter state associated with the eigenvalue 1, corresponding to the eigenstate $|H\rangle$, whereas $\phi_1(q_1 + g)_1$ is the right meter state associated with the eigenvalue −1, corresponding to the eigenstate $|V\rangle$. Analogously, the meter state $\phi_2(q_2 - g)_2$, which belongs to the left arm, encodes the information that the system is in the state $|D\rangle$, while $\phi_2(q_2 + g)_2$ encodes the information of $|A\rangle$. (Of course, in a weak measurement the deviation does not indicate any state, since the system is not collapsed.) Having said this, recall that to obtain a weak value we need to post-select a final state; let us post-select the final state $|\psi^{(f)}\rangle = |D\rangle_1 \otimes |H\rangle_2$. But we also have to “read” the meter (recall that this was the second step of the ancilla scheme), and we do so by applying the projector $|q_1\rangle\langle q_1| \otimes |q_2\rangle\langle q_2|$. Hence, the post-selection of the state $|\psi^{(f)}\rangle = |D\rangle_1 \otimes |H\rangle_2$, plus the reading of the meter, yields, according to Eq. B.4,

\[
|\phi^{(f)}\rangle = \frac{1}{4}\big[\phi_1(q_1 - g)_1 \phi_2(q_2 - g)_2 - \phi_1(q_1 - g)_1 \phi_2(q_2 + g)_2 - \phi_1(q_1 + g)_1 \phi_2(q_2 - g)_2 - \phi_1(q_1 + g)_1 \phi_2(q_2 + g)_2\big]\, |q_1\rangle \otimes |q_2\rangle ,
\tag{B.5}
\]

where $|\phi^{(f)}\rangle$ refers to the state of the meter after the post-selection. Doing exactly the same for the other possible post-selected states, namely $|\psi^{(f)}\rangle = |A\rangle_1 \otimes |H\rangle_2$, $|\psi^{(f)}\rangle = |D\rangle_1 \otimes |V\rangle_2$ and $|\psi^{(f)}\rangle = |A\rangle_1 \otimes |V\rangle_2$, we have respectively:

\[
|\phi^{(f)}\rangle = \frac{1}{4}\big[\phi_1(q_1 - g)_1 \phi_2(q_2 - g)_2 - \phi_1(q_1 - g)_1 \phi_2(q_2 + g)_2 + \phi_1(q_1 + g)_1 \phi_2(q_2 - g)_2 + \phi_1(q_1 + g)_1 \phi_2(q_2 + g)_2\big]\, |q_1\rangle \otimes |q_2\rangle ,
\tag{B.6}
\]

\[
|\phi^{(f)}\rangle = \frac{1}{4}\big[\phi_1(q_1 - g)_1 \phi_2(q_2 - g)_2 + \phi_1(q_1 - g)_1 \phi_2(q_2 + g)_2 - \phi_1(q_1 + g)_1 \phi_2(q_2 - g)_2 + \phi_1(q_1 + g)_1 \phi_2(q_2 + g)_2\big]\, |q_1\rangle \otimes |q_2\rangle ,
\tag{B.7}
\]

\[
|\phi^{(f)}\rangle = \frac{1}{4}\big[\phi_1(q_1 - g)_1 \phi_2(q_2 - g)_2 + \phi_1(q_1 - g)_1 \phi_2(q_2 + g)_2 + \phi_1(q_1 + g)_1 \phi_2(q_2 - g)_2 - \phi_1(q_1 + g)_1 \phi_2(q_2 + g)_2\big]\, |q_1\rangle \otimes |q_2\rangle .
\tag{B.8}
\]
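The sign patterns of the four post-selected meter states can be cross-checked directly from the singlet state, without going through Eq. B.4. The sketch below assumes the representation $|H\rangle = (1, 0)$, $|V\rangle = (0, 1)$, $|D\rangle = (|H\rangle + |V\rangle)/\sqrt{2}$, $|A\rangle = (|H\rangle - |V\rangle)/\sqrt{2}$; the helper name `coefficients` is hypothetical. For each pointer branch, labelled by an eigenstate of $\hat{\sigma}_z$ for photon 1 and of $\hat{\sigma}_x$ for photon 2, the coefficient is the product of the overlaps of the branch with the post-selected state and with the singlet.

```python
import numpy as np

# Assumed representation of the polarization states.
H = np.array([1.0, 0.0])
V = np.array([0.0, 1.0])
D = (H + V) / np.sqrt(2.0)
A = (H - V) / np.sqrt(2.0)

singlet = (np.kron(H, V) - np.kron(V, H)) / np.sqrt(2.0)

# Pointer branches, in the order used in Eqs. B.5-B.8:
# (q1-g, q2-g), (q1-g, q2+g), (q1+g, q2-g), (q1+g, q2+g).
branches = [(H, D), (H, A), (V, D), (V, A)]

def coefficients(f1, f2):
    """Coefficients of the four pointer branches after post-selecting f1 (x) f2."""
    return np.array([
        (f1 @ s1) * (f2 @ s2) * (np.kron(s1, s2) @ singlet)
        for s1, s2 in branches
    ])

post_selections = {"D1 H2": (D, H), "A1 H2": (A, H), "D1 V2": (D, V), "A1 V2": (A, V)}
results = {name: coefficients(f1, f2) for name, (f1, f2) in post_selections.items()}
for name, c in results.items():
    print(name, np.round(4.0 * c).astype(int))
```

Multiplied by 4, the coefficients reproduce the signs inside the square brackets of Eqs. B.5, B.6, B.7 and B.8, in that order.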

From these results it is straightforward to obtain the pointer’s distributions. In the first case, for the post-selected state $|\psi^{(f)}\rangle = |D\rangle_1 \otimes |H\rangle_2$, we simply have

\[
\mathrm{Prob}(q_1, q_2 | \Psi^{(f)}) = \frac{1}{16}\big|\big[\phi_1(q_1 - g)_1 \phi_2(q_2 - g)_2 - \phi_1(q_1 - g)_1 \phi_2(q_2 + g)_2 - \phi_1(q_1 + g)_1 \phi_2(q_2 - g)_2 - \phi_1(q_1 + g)_1 \phi_2(q_2 + g)_2\big]\big|^2 .
\tag{B.9}
\]

Let the pointers' wave functions in $q$ be Gaussians. From now on I will set $g = 1$ (the real value of $g$ will be determined experimentally) and use a dispersion of 30, which is more than enough to guarantee the weakness of the measurement, as will become clear soon. Let us begin with the case of no post-selection, which is realized in the lab by placing no polarizers in the arms. According to the theory of weak measurements, without post-selection the weak measurement should reveal the averages of $\hat{\sigma}_z$ and $\hat{\sigma}_x$, and both averages are zero for the singlet state. Fig. B.3 shows the pointer's distribution for this case (the center of the Gaussian is much easier to see in the contour plot). Since the distribution is a function of $q_1$ and $q_2$, it appears as a Gaussian surface in a 3D plot ($x$ corresponds to $q_1$ and $y$ to $q_2$). Note that the Gaussian is centered at the origin, which agrees with the theory, since it shows that the expectation values of $\hat{\sigma}_z$ and $\hat{\sigma}_x$ are zero.

Compare the no post-selection case with Fig. B.4, which plots the pointer's distribution for the post-selection $|\psi^{(f)}\rangle = |D\rangle_1 \otimes |H\rangle_2$. This Gaussian is translated from the origin to the point $-1$ in $x$ and $-1$ in $y$. The translation is hard to see in a), so I made "zoomed" plots of the $x$ axis and the $y$ axis (panels b) and c), respectively). Of course, the clearest way of seeing the translation is the contour plot in d).

What does it mean that the Gaussian is centered at the point $(-1, -1)$? Recall that in the strong measurement regime $\phi_1(q_1 + g)_1$ is associated with the state $|V\rangle_1$, whereas $\phi_2(q_2 + g)_2$ is associated with the state $|A\rangle_2$ (in the weak regime we cannot associate these states with the system, because the system is not changed when measured). So, if $g = 1$, $\phi_1(q_1 + g)_1$ together with $\phi_2(q_2 + g)_2$ yields a Gaussian centered at the point $(-1, -1)$. In short, in the strong regime a Gaussian centered at $(-1, -1)$ entails that the measurement of $\hat{\sigma}_z$ yielded the state $|V\rangle_1$, whereas that of $\hat{\sigma}_x$ yielded the state $|A\rangle_2$. In light of this information, we can write the state of the EPR pair just after passing the birefringent crystals as $|\psi^{(1)}\rangle = |V\rangle_1 \otimes |A\rangle_2$. Table B.1 gives a simple guide for interpreting the different results of the measurement (for the case of strong measurements) according to the point at which the Gaussian is centered.

Figure B.3: Pointer's distribution for a weak measurement without post-selection. a) is a 3D plot and b) is the contour plot. Note that the Gaussian is centered at $(0, 0)$, which indicates that the weak measurement on both arms yielded zero (the expectation values of $\hat{\sigma}_z$ and $\hat{\sigma}_x$ are zero for the singlet state).

Figure B.4: Pointer's distribution for a weak measurement with the post-selection $|\psi^{(f)}\rangle = |D\rangle_1 \otimes |H\rangle_2$. b) and c) are close-ups of the distribution; b) shows the $x$ axis and c) the $y$ axis. d) is a contour plot of the distribution.
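The centers reported for Figs. B.3 and B.4 can be checked numerically. The following is a minimal sketch, not the code used to produce the figures. It assumes that "dispersion" means the standard deviation of $|\phi|^2$, that the pre-selected state is the singlet $(|H\rangle|V\rangle - |V\rangle|H\rangle)/\sqrt{2}$, and that eigenvalues are assigned as in Table B.1 ($|H\rangle, |D\rangle \to +1$ and $|V\rangle, |A\rangle \to -1$). The function names are hypothetical, and the center is estimated as the mean of the distribution.

```python
import numpy as np

# Polarization states written in the {|H>, |V>} basis.
H = np.array([1.0, 0.0])
V = np.array([0.0, 1.0])
D = (H + V) / np.sqrt(2)
A = (H - V) / np.sqrt(2)

# Singlet (|H>|V> - |V>|H>)/sqrt(2), stored as psi[i, j] = <i, j|Psi>.
SINGLET = (np.outer(H, V) - np.outer(V, H)) / np.sqrt(2)

# Eigenstates and eigenvalues of the weakly measured observables:
# sigma_z on photon 1 and sigma_x on photon 2 (convention of Table B.1).
SIGMA_Z_BASIS = [(H, +1), (V, -1)]
SIGMA_X_BASIS = [(D, +1), (A, -1)]


def gaussian(q, center, dispersion):
    """Pointer amplitude; assumption: |phi|^2 has standard deviation `dispersion`."""
    return np.exp(-((q - center) ** 2) / (4.0 * dispersion**2))


def pointer_distribution(post1, post2, dispersion, g=1.0, n=401):
    """Unnormalized P(q1, q2) on a square grid for the post-selection post1 (x) post2."""
    q = np.linspace(-g - 6.0 * dispersion, g + 6.0 * dispersion, n)
    amplitude = np.zeros((n, n))
    for state1, a in SIGMA_Z_BASIS:
        for state2, b in SIGMA_X_BASIS:
            # <post|a, b><a, b|Psi>: weight of the pointer state shifted to (a*g, b*g).
            coeff = (post1 @ state1) * (post2 @ state2) * (state1 @ SINGLET @ state2)
            amplitude += coeff * np.outer(
                gaussian(q, a * g, dispersion), gaussian(q, b * g, dispersion)
            )
    return q, amplitude**2


def center(q, prob):
    """Mean (q1, q2) of the distribution; rows index q1, columns index q2."""
    prob = prob / prob.sum()
    return float(prob.sum(axis=1) @ q), float(prob.sum(axis=0) @ q)


# Post-selection |D>_1 (x) |H>_2 with g = 1 and dispersion 30.
q, prob = pointer_distribution(D, H, dispersion=30.0)
print("post-selected center:", center(q, prob))

# No post-selection: add the distributions of a complete set of final states.
total = sum(pointer_distribution(p1, p2, 30.0)[1] for p1 in (D, A) for p2 in (H, V))
print("no post-selection center:", center(q, total))
```

Calling `pointer_distribution` with a small dispersion such as 0.1 gives the strong regime instead. There the distribution splits into separate peaks and the mean is no longer a useful summary of it.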

Center of the Gaussian    $\hat{\sigma}_z$ (right photon)    $\hat{\sigma}_x$ (left photon)
(1, 1)                    $|H\rangle_1$                      $|D\rangle_2$
(1, -1)                   $|H\rangle_1$                      $|A\rangle_2$
(-1, 1)                   $|V\rangle_1$                      $|D\rangle_2$
(-1, -1)                  $|V\rangle_1$                      $|A\rangle_2$

Table B.1: Results of the measurements in the right and left arms, according to the final center of the Gaussian corresponding to the pointer's distribution. This interpretation, however, is only valid in the strong regime, because in the weak regime the system is not collapsed to the eigenstates.

Before we see the pointer's distributions for the four possible final post-selections, I find it illustrative to show how varying the weakness transforms the distributions. This can be seen in Fig. B.5, where I vary the dispersion: a) has a dispersion of 0.1 (strong measurement), b) a dispersion of 0.6, c) one of 0.8 and d) one of 1.2. From these plots it is clear how interference effects yield the final weak value once the dispersion is bigger than the difference between the eigenvalues. Note also that in the case of a strong measurement we obtain four narrow peaks; two of them correspond to the states $|H\rangle$ and $|V\rangle$ (measured on the right arm), whereas the other two correspond to the states $|D\rangle$ and $|A\rangle$ (measured on the left arm). Although Fig. B.5 refers to a particular post-selection, in the strong regime all four possible post-selections give the four Gaussians presented in a) (this is completely analogous to the example of the two consecutive Stern-Gerlach experiments of chapter 6).

Figure B.5: Strong to weak measurement transition. It is interesting to note how the weak value is obtained as a result of special interference effects between the different states of the meter. The weakness was controlled by means of the dispersion; a) corresponds to a dispersion of $\Delta Q = 0.1$, b) to a dispersion of $\Delta Q = 0.6$, c) to a dispersion of $\Delta Q = 0.8$ and d) to a dispersion of $\Delta Q = 1.2$.
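The way the four shifted pointer states of Eq. (B.9) merge into a single displaced Gaussian can be made explicit with a first-order expansion. What follows is a sketch, assuming that the dispersion is much larger than $g$, so that $\phi(q \mp g) \approx \phi(q) \mp g\,\phi'(q)$. The coefficients $c_{ab}$ and the symbols $\Phi$, $A_w$ and $B_w$ are notation introduced here for illustration, with $a, b = \pm 1$ labelling the eigenvalues of $\hat{\sigma}_z$ and $\hat{\sigma}_x$, and the trailing system subscripts of Eq. (B.9) are dropped.

\[
\begin{aligned}
\Phi(q_1, q_2) &= \sum_{a, b = \pm 1} c_{ab}\,\phi_1(q_1 - a g)\,\phi_2(q_2 - b g),
\qquad c_{++} = \tfrac{1}{4}, \quad c_{+-} = c_{-+} = c_{--} = -\tfrac{1}{4}, \\
\Phi(q_1, q_2) &\approx \Bigl(\sum_{a,b} c_{ab}\Bigr)\Bigl[\phi_1(q_1)\,\phi_2(q_2) - g A_w\,\phi_1'(q_1)\,\phi_2(q_2) - g B_w\,\phi_1(q_1)\,\phi_2'(q_2)\Bigr] \\
&\approx \Bigl(\sum_{a,b} c_{ab}\Bigr)\,\phi_1(q_1 - g A_w)\,\phi_2(q_2 - g B_w), \\
A_w &= \frac{\sum_{a,b} a\,c_{ab}}{\sum_{a,b} c_{ab}} = \frac{1/2}{-1/2} = -1,
\qquad
B_w = \frac{\sum_{a,b} b\,c_{ab}}{\sum_{a,b} c_{ab}} = \frac{1/2}{-1/2} = -1 .
\end{aligned}
\]

Under these assumptions the amplitude reduces to $-\tfrac{1}{2}\,\phi_1(q_1 + g)\,\phi_2(q_2 + g)$, a single Gaussian centered at $(-g, -g)$. When the dispersion is comparable to or smaller than $g$, the expansion fails and the four terms remain distinguishable, which is the regime of panel a) in Fig. B.5.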

Finally, let us attend to the contour plots of the four possible post-selections, presented in Fig. B.6. Note that a perfect anti-correlation between the post-selection outcomes and the weak measurements is obtained: when a photon yields $1_z$ in the post-selection, the weak measurement on the opposite arm yields $-1_z$; when it yields $1_x$, the weak measurement on the other photon yields $-1_x$; and so on. Moreover, the weak measurement reflects the state before the final post-selection, so it is as if the photons had, before the final post-selection, the information of the future states (the information of the post-selection), as we have studied throughout the last two chapters.

Figure B.6: The four pointer distributions for the four possible final states. The title of each plot gives the post-selection to which it corresponds; for example, plot c) corresponds to the post-selection $|D\rangle_1 \otimes |V\rangle_2$. The reader should pay attention to the perfect anti-correlation between the results of the weak measurements and the final post-selection: the center of the Gaussian in the four cases indicates that the weak measurement yielded a state perfectly anti-correlated with the final state of the post-selection. For example, the center of the Gaussian in d) indicates that the weak measurement yielded a value consistent with the states $|H\rangle_1 \otimes |D\rangle_2$ (see Table B.1), clearly anti-correlated with the final state of that particular post-selection, i.e., with the state $|A\rangle_1 \otimes |V\rangle_2$.
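The anti-correlation pattern of Fig. B.6 can also be tabulated directly from the weak values, without simulating the pointers. The sketch below assumes the standard weak-value expression $\langle f|\hat{A}|\psi\rangle / \langle f|\psi\rangle$, the singlet as pre-selected state, and the eigenvalue convention of Table B.1. The function names are hypothetical.

```python
import numpy as np

# Polarization states in the {|H>, |V>} basis.
H = np.array([1.0, 0.0])
V = np.array([0.0, 1.0])
D = (H + V) / np.sqrt(2)
A = (H - V) / np.sqrt(2)

SIGMA_Z = np.array([[1.0, 0.0], [0.0, -1.0]])  # +1 on |H>, -1 on |V>
SIGMA_X = np.array([[0.0, 1.0], [1.0, 0.0]])   # +1 on |D>, -1 on |A>
IDENTITY = np.eye(2)

# Pre-selected singlet state of the photon pair.
SINGLET = (np.kron(H, V) - np.kron(V, H)) / np.sqrt(2)


def weak_value(observable, pre, post):
    """Standard weak value <post|O|pre> / <post|pre>."""
    return (post.conj() @ observable @ pre) / (post.conj() @ pre)


def weak_values_for(post1, post2):
    """Weak values of sigma_z on photon 1 and sigma_x on photon 2."""
    post = np.kron(post1, post2)
    wz = weak_value(np.kron(SIGMA_Z, IDENTITY), SINGLET, post)
    wx = weak_value(np.kron(IDENTITY, SIGMA_X), SINGLET, post)
    return wz, wx


# The four post-selections: photon 1 in |D> or |A>, photon 2 in |H> or |V>.
labels = {"D": D, "A": A, "H": H, "V": V}
for name1 in ("D", "A"):
    for name2 in ("H", "V"):
        wz, wx = weak_values_for(labels[name1], labels[name2])
        print(f"|{name1}>_1 (x) |{name2}>_2: sigma_z weak value {wz:+.0f}, sigma_x weak value {wx:+.0f}")
```

Each pair of weak values can be read against Table B.1 to recover the state with which the corresponding Gaussian center is consistent.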

Again, in light of these results, one would be tempted to say that the photons had a certain hidden variable that accounts for their final outcomes, as Einstein would have liked to say, except that the hidden variables change according to the final post-selection (as if retrocausal processes were occurring). Again, the result of the weak measurements should not be taken as providing the states of the individual photons once they have passed through the birefringent crystals; i.e., that the Gaussian is centered at $(-1, -1)$ does not entail that the right photon was in the state $|V\rangle$ and the left one in the state $|A\rangle$ after passing the birefringent crystals. That would be true in the case of a strong measurement, which collapses the system into an eigenstate of the observable. What the center of the Gaussian indicates is rather that, on average, the outcome of the right photons when measured along $\hat{\sigma}_z$ was $-1$, as if they had a property that accounts for this outcome (that property could be understood in light of the Two-State Vector formalism, as was argued in the last chapter). We hope to perform the experiment soon and check for ourselves that these predictions are correct. I claim that, if it is successful, the only available way of interpreting the experimental results, given the current state of physics, is by appealing to hidden variables coming from the future.

[1] Stathis Psillos. Regularity theories, book section 7. Oxford handbooks. Oxford University Press, Oxford, 2009. ISBN 9780199279739 (hbk.) 019927973X (hbk.).

[2] David Lewis. Philosophical Papers : Volume II. Number v. 2. Oxford University Press, USA, 1987. ISBN 9780198020660. URL http://books.google.com.co/ books?id=NprZWnApecIC.

[3] Phil Dowe. Physical Causation. Cambridge studies in probability, induction, and decision theory. Cambridge University Press, 2007. URL http://espace.library. uq.edu.au/view/UQ:202458.

[4] Christopher Hitchcock. Probabilistic causation. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Winter 2012 edition, 2012.

[5] Christopher Hitchcock. Do All and Only Causes Raise the Probabilities of Effects?, book section 17, pages viii, 481 p. MIT Press, Cambridge, Mass., 2004. ISBN 0262532565 (pbk. alk. paper) 0262033178 (alk. paper).

[6] James F. Woodward. Agency and interventionist theories, book section 11. Oxford handbooks. Oxford University Press, Oxford, 2009. ISBN 9780199279739 (hbk.) 019927973X (hbk.).

[7] Hanoch Ben-Yami. The impossibility of backwards causation. Philosophical Quar- terly, 57(228):439–455, 2007. ISSN 00318094. doi: 10.1111/j.1467-9213.2007.494.x.

[8] Dummett, A. E. and Flew, A. Symposium: Can an effect precede its cause? Pro- ceedings of the Aristotelian Society, Supplementary Volumes, 28:27–62, 1954. ISSN 03097013. URL http://www.jstor.org/stable/4106593. Contributor:.

[9] Huw Price. Time’s Arrow Archimedes’ Point: New Directions for the Physics of Time. Oxford University Press, 1997. ISBN 9780195117981. URL http://books. google.com.co/books?id=WxQ4QIxNuD4C.

[10] Jan Faye. Backward causation. In Edward N. Zalta, editor, The Stanford Encyclo- pedia of Philosophy. Spring 2010 edition, 2010. 116 Bibliography 117

[11] Y. Aharonov and D. Rohrlich. Quantum Paradoxes: Quantum Theory for the Perplexed. Physics textbook. Wiley, 2008. ISBN 9783527619122. URL http: //books.google.com.co/books?id=3PSHDohngVgC.

[12] Huw Price. Does time-symmetry imply retrocausality? how the quantum world says “maybe”? Studies in History and Philosophy of Science Part B, 43(2):75–83, 2012.

[13] JG Cramer. The transactional interpretation of quantum mechanics. Reviews of Modern Physics, 58(3), 1986. URL http://rmp.aps.org/abstract/RMP/v58/i3/ p647_1.

[14] John Archibald Wheeler and Richard Phillips Feynman. Interaction with the ab- sorber as the mechanism of radiation. Rev. Mod. Phys., 17:157–181, Apr 1945. doi: 10.1103/RevModPhys.17.157. URL http://link.aps.org/doi/10.1103/ RevModPhys.17.157.

[15] A. Einstein, B. Podolsky, and N. Rosen. Can quantum-mechanical descrip- tion of physical reality be considered complete? Phys. Rev., 47:777–780, May 1935. doi: 10.1103/PhysRev.47.777. URL http://link.aps.org/doi/10.1103/ PhysRev.47.777.

[16] V Jacques, E Wu, F Grosshans, and F Treussart. Experimental realization of Wheeler ’ s delayed-choice GedankenExperiment. Science, pages 1–9, 2007. URL http://www.sciencemag.org/content/315/5814/966.short.

[17] Bengt Svensson. Pedagogical review of quantum measurement theory with an emphasis on weak measurements. Quanta, 2(1), 2013. ISSN 1314-7374. URL http://quanta.ws/ojs/index.php/quanta/article/view/12.

[18] Yakir Aharonov, David Z. Albert, and Lev Vaidman. How the result of a mea- surement of a component of the spin of a spin-1/2 particle can turn out to be 100. Physical Review Letters, 60(14):1351–1354, 1988. URL http://link.aps. org/doi/10.1103/PhysRevLett.60.1351. PRL.

[19] Yakir Aharonov and Alonso Botero. Quantum averages of weak values. Physical Re- view A, 72(5):052111, 2005. URL http://link.aps.org/doi/10.1103/PhysRevA. 72.052111. PRA.

[20] Boaz Tamir and Eliahu Cohen. Introduction to weak measurements and weak values. Quanta, 2(1), 2013. ISSN 1314-7374. URL http://quanta.ws/ojs/index. php/quanta/article/view/14. Bibliography 118

[21] P Busch and P Lahti. L¨udersRule. Compendium of Quantum Physics, pages 3–5, 2009. URL http://link.springer.com/chapter/10.1007/978-3-540-70626-7_ 110.

[22] . Mathematical Foundations of Quantum Mechanics. Investi- gations in physics. Princeton University Press, 1955. ISBN 9780691028934. URL http://books.google.com.co/books?id=JLyCo3RO4qUC.

[23] B.D.F.L. Claude Cohen-Tannoudji. Quantum Mechanics Volume 1. Hermann. ISBN 9782705683924. URL http://books.google.com.co/books?id=e_Ec0IlSbMMC.

[24] Stephen Parrott. What do quantum ”weak” measurements actually measure? Au- gust 2009. URL http://arxiv.org/abs/0908.0035.

[25] I. M. Duck, P. M. Stevenson, and E. C. G. Sudarshan. The sense in which a ”weak measurement” of a spin-1/2 particle’s spin component yields a value 100. Physical Review D, 40(6):2112–2117, 1989. URL http://link.aps.org/doi/10. 1103/PhysRevD.40.2112. PRD.

[26] N. W. M. Ritchie, J. G. Story, and Randall G. Hulet. Realization of a measure- ment of a “weak value”. Phys. Rev. Lett., 66:1107–1110, Mar 1991. doi: 10.1103/ PhysRevLett.66.1107. URL http://link.aps.org/doi/10.1103/PhysRevLett. 66.1107.

[27] Xuanmin Zhu, Yuxiang Zhang, Shengshi Pang, Chang Qiao, Quanhui Liu, and Shengjun Wu. Quantum measurements with preselection and postselection. Phys. Rev. A, 84:052111, Nov 2011. doi: 10.1103/PhysRevA.84.052111. URL http: //link.aps.org/doi/10.1103/PhysRevA.84.052111.

[28] J. S. Lundeen and A. M. Steinberg. Experimental joint weak measurement on a photon pair as a probe of hardy’s paradox. Physical Review Letters, 102 (2):020404, 2009. URL http://link.aps.org/doi/10.1103/PhysRevLett.102. 020404. PRL.

[29] G. Pryde, J. O’Brien, a. White, T. Ralph, and H. Wiseman. Measurement of Quantum Weak Values of Photon Polarization. Physical Review Letters, 94(22): 220405, June 2005. ISSN 0031-9007. doi: 10.1103/PhysRevLett.94.220405. URL http://link.aps.org/doi/10.1103/PhysRevLett.94.220405.

[30] Richard Jozsa. Complex weak values in quantum measurement. Phys. Rev. A, 76: 044103, Oct 2007. doi: 10.1103/PhysRevA.76.044103. URL http://link.aps. org/doi/10.1103/PhysRevA.76.044103. Bibliography 119

[31] Yakir Aharonov and Lev Vaidman. The Two-State Vector Formalism: An Up- dated Review, volume 734 of Lecture Notes in Physics, book section 13, pages 399– 447. Springer Berlin Heidelberg, 2007. ISBN 978-3-540-73472-7. doi: 10.1007/ 978-3-540-73473-4 13. URL http://dx.doi.org/10.1007/978-3-540-73473-4_ 13.

[32] Bengt Svensson. What is a quantum-mechanical “weak value” the value of? Foundations of Physics, 43(10):1193–1205, 2013. ISSN 0015-9018. doi: 10.1007/ s10701-013-9740-6. URL http://dx.doi.org/10.1007/s10701-013-9740-6.

[33] Yakir Aharonov, Alonso Botero, and Sandu Popescu. Revisiting Hardy’s para- dox: counterfactual statements, real measurements, entanglement and weak val- ues. Physics Letters A, 2002. URL http://www.sciencedirect.com/science/ article/pii/S0375960102009866.

[34] Y. Aharonov, E. Cohen, D. Grossman, and a.C. Elizutr. Can a Future Choice Affect a Past Measurement’s Outcome? EPJ Web of Conferences, 70:00038, April 2014. ISSN 2100-014X. doi: 10.1051/epjconf/20147000038. URL http://www. epj-conferences.org/10.1051/epjconf/20147000038.

[35] Marissa Giustina, Alexandra Mech, Sven Ramelow, Bernhard Wittmann, Johannes Kofler, J¨ornBeyer, Adriana Lita, Brice Calkins, Thomas Gerrits, Sae Woo Nam, Rupert Ursin, and Anton Zeilinger. Bell violation using entangled photons without the fair-sampling assumption. Nature, 497(7448):227–30, May 2013. ISSN 1476- 4687. doi: 10.1038/nature12012. URL http://www.ncbi.nlm.nih.gov/pubmed/ 23584590.

[36] J. S. Bell and Alain Aspect. Speakable and Unspeakable in Quantum Mechanics. Cambridge University Press, second edition, 2004. ISBN 9780511815676. URL http://dx.doi.org/10.1017/CBO9780511815676. Cambridge Books Online.

[37] AJ Leggett. Nonlocal hidden-variable theories and quantum mechanics: An in- compatibility theorem. Foundations of Physics, 33(10):1469–1493, 2003. URL http://link.springer.com/article/10.1023/A:1026096313729.

[38] Simon Gr¨oblacher, Tomasz Paterek, Rainer Kaltenbaek, Caslav Brukner, Marek Zukowski, Markus Aspelmeyer, and Anton Zeilinger. An experimental test of non- local realism. Nature, 446(7138):871–5, April 2007. ISSN 1476-4687. doi: 10.1038/ nature05677. URL http://www.ncbi.nlm.nih.gov/pubmed/17443179.

[39] Huw Price. Toy models for retrocausality. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 39(4):752– 761, 2008. ISSN 1355-2198. Bibliography 120

[40] Yakir Aharonov, Peter G. Bergmann, and Joel L. Lebowitz. Time symmetry in the quantum process of measurement. Physical Review, 134(6B):B1410–B1416, 1964. URL http://link.aps.org/doi/10.1103/PhysRev.134.B1410. PR.

[41] Y Aharonov and L Vaidman. Complete description of a quantum system at a given time. Journal of Physics A: Mathematical and General, 24(10): 2315–2328, May 1991. ISSN 0305-4470. doi: 10.1088/0305-4470/24/10/ 018. URL http://stacks.iop.org/0305-4470/24/i=10/a=018?key=crossref. 77952b3bfed522dd7df23d46efa38cbd.