
Density matrix

It seems very strange and uncomfortable that our fundamental equation is a linear equation for the wavefunction, while all observed quantities involve bi-linear combinations of the wavefunction. This disparity is also at the heart of the measurement paradox, because when we measure, our interpretation is based on bi-linear constructions, not to mention that people often talk of the wavefunction collapse. This is just unscientific, because wavefunctions are subject to the Schrödinger Equation, which does not have a 'collapse' term in it! In what follows I will discuss the density matrix approach to the same foundational problems/questions, so do not be surprised that the same issues will be discussed again, but from a different "angle". Let us introduce the notion of the density matrix, which offers an alternative and more powerful (for open systems, when we need to deal with the "rest of the world") description of system states. For a system isolated from its environment we define it as (below x is a collection of coordinates, or a state index, in whatever basis, in our system)

ρ̂ = |ψ⟩⟨ψ|   →   ρ_{x'x} = ψ(x') ψ*(x) ,        (1)

or, in any other basis, after substituting |ψ⟩ = Σ_n ψ_n |n⟩,

⟨n|ρ̂|m⟩ = ρ_{nm} = ψ_n ψ*_m .        (2)

We immediately see that ρ̂ is a Hermitian matrix with non-negative diagonal elements, and for normalized states Σ_n ρ_{nn} = Tr ρ̂ = 1. This allows one to interpret ρ_{nn} as the probability of finding the system in the state |n⟩. Of course, a matrix parameterized by a vector is a special case; we will refer to this situation as a "pure" state. We will see shortly that for open systems this will no longer be the case, in general, and then we are dealing with a "mixed" state of the density matrix (see below). Its time dependence immediately follows from the Schrödinger Equation:

iℏ (∂/∂t) ρ(x', x; t) = |Ĥψ(x', t)⟩⟨ψ(x, t)| − |ψ(x', t)⟩⟨Ĥψ(x, t)| = Ĥ |ψ(x', t)⟩⟨ψ(x, t)| − |ψ(x', t)⟩⟨ψ(x, t)| Ĥ ,        (3)

where in the last relation I have used the fact that Ĥ is Hermitian. In matrix notation we have

iℏ (∂/∂t) ρ_{nm}(t) = Ĥ_{nk} ρ_{km}(t) − ρ_{nk}(t) Ĥ_{km} = [Ĥ, ρ̂]_{nm} ,        (4)

which can be formulated in the basis-independent form

iℏ (∂/∂t) ρ̂(t) = [Ĥ, ρ̂] .        (5)
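As a minimal numerical sketch (not part of the original notes; the state vector and Hamiltonian below are arbitrary examples), the following Python/NumPy snippet builds ρ̂ = |ψ⟩⟨ψ|, checks Hermiticity and unit trace, and verifies that the von Neumann equation (5) agrees with the time derivative of |ψ(t)⟩⟨ψ(t)| obtained from the Schrödinger equation.

```python
import numpy as np

# Arbitrary 3-level example: a normalized state and a Hermitian Hamiltonian (hbar = 1).
psi = np.array([0.6, 0.8j, 0.0])
psi = psi / np.linalg.norm(psi)
H = np.array([[1.0, 0.3, 0.0],
              [0.3, 2.0, 0.5],
              [0.0, 0.5, 3.0]])          # real symmetric => Hermitian

rho = np.outer(psi, psi.conj())          # rho = |psi><psi|, Eqs. (1)-(2)

print(np.allclose(rho, rho.conj().T))    # Hermitian
print(np.isclose(np.trace(rho).real, 1)) # Tr rho = 1
print(np.all(np.diag(rho).real >= 0))    # non-negative populations

# Von Neumann equation (5): d(rho)/dt = -i [H, rho].
drho_dt = -1j * (H @ rho - rho @ H)

# Compare with the derivative of |psi(t)><psi(t)| from d|psi>/dt = -i H |psi>.
dpsi_dt = -1j * H @ psi
drho_dt_from_psi = np.outer(dpsi_dt, psi.conj()) + np.outer(psi, dpsi_dt.conj())
print(np.allclose(drho_dt, drho_dt_from_psi))
```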

For time-independent Hamiltonians with eigenstates/eigenenergies {|n⟩, E_n}, we immediately get, after trivial integration, that in the |n⟩-basis

ρ_{nm}(t) = ρ_{nm}(0) e^{it(E_m − E_n)/ℏ} .        (6)

This formula is especially important for the discussion of the apparent de-phasing relaxation in a system isolated with an extreme degree of accuracy. If, after a long enough time of the system's evolution, we completely lose control over the values of the phases t(E_m − E_n)/ℏ modulo 2π [due to natural limitations on experimental resolution and degree of isolation], then, FAPP, we

should replace our actual density matrix with the one averaged over the values of the phases t(E_m − E_n)/ℏ. This amounts to the vanishing of the off-diagonal terms:

ρ_{nm}(t) → δ_{nm} ρ_{nn}(0) ,        (7)

meaning that our state is indistinguishable from a statistical mixture of energy eigenstates, the state |n⟩ entering the ensemble with probability ρ_{nn}(0). [In the degenerate subspaces of Ĥ, one can always make Ĥ and ρ̂ diagonal at the same time by basis rotations.]
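A minimal numerical sketch of this de-phasing argument (the four-level spectrum and the random-time window below are assumptions for the illustration): averaging Eq. (6) over uncontrolled long times wipes out the off-diagonal elements while leaving the populations intact, as in Eq. (7).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example: a pure state of a 4-level system with incommensurate energies (hbar = 1).
E = np.array([0.0, 1.0, np.sqrt(2.0), np.pi])
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
rho0 = np.outer(psi, psi.conj())

# Eq. (6): rho_nm(t) = rho_nm(0) * exp(i t (E_m - E_n))
def rho_t(t):
    phase = np.exp(1j * t * (E[None, :] - E[:, None]))
    return rho0 * phase

# Average over many "uncontrolled" long times: off-diagonal phases wash out.
times = rng.uniform(1e3, 1e6, size=20000)
rho_avg = np.mean([rho_t(t) for t in times], axis=0)

print(np.round(np.abs(rho_avg), 2))                    # off-diagonal elements ~ 0
print(np.allclose(np.diag(rho_avg), np.diag(rho0)))    # populations untouched, Eq. (7)
```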

Measurement results are also directly related to the density matrix because

⟨Λ⟩ = Σ_λ ψ*_λ Λ_{λλ} ψ_λ = Σ_λ ρ_{λλ} Λ_{λλ} = Tr ρ̂Λ̂ .        (8)

Note that any basis can be used to do the calculation using the last equality.

Measurement

We see that all the necessary ingredients for doing quantum mechanical calculations can be formulated at the level of the density matrix. We still need the notion of the Hilbert space and the basis states of the Hermitian operators, but this is basically for mathematical convenience. Of course, dealing with vectors = wavefunctions is easier than with matrices. But if we are after the physics principle, then the density matrix should be our prime object for philosophical discussions, because only ρ̂ can deal with open systems. Why do we need open systems? Because truly isolated ones cannot be observed from outside (!) and thus are irrelevant for the material world. Any event qualifying as "system's observation" necessarily involves the system's interaction with the outside world, or environment, and has to deal with the enlarged Hilbert space of the system and "observer". However, in the system's measurements we do not care what happens to the rest of the world (otherwise we are measuring more than the system in question), and thus the probabilities of all possible outcomes for the world variables have to be summed up. This leads to the following definition of the density matrix for an open system: if ρ̂^(F) is the full density matrix for the system and its environment, then the system's density matrix is defined as

ρ̂^(S) = Tr_env ρ̂^(F) ,        (9)

where the trace is taken over the basis states of the environment, or, using the notation R_env for the collection of all degrees of freedom in the environment,

ρ̂^(S)(x', x) = ∫ dR_env Ψ(x', R_env) Ψ*(x, R_env) ,

which is an overlap of environmental states for different system parameters. Indeed, the average of any system's operator ⟨Ψ|Q̂|Ψ⟩ reduces to Tr_S ρ̂^(S) Q̂, because Q̂ is not acting on the environmental coordinates. Now consider the measuring process of some system's quantity characterized by the operator Λ̂ (with the set {λ} of possible measurement outcomes) as a physical process involving sufficiently strong coupling (interaction Hamiltonian) between the system and the environment = measuring machine, something similar to the SG-machine. The set of Λ̂ eigenvectors |e_λ⟩ is very convenient

because it is tuned to the properties of the measuring machine in such a way that when our system is in the state |e_λ⟩ the machine responds with its degrees of freedom strongly enough and goes into the state |M_λ⟩, such that the FAPP principle by Bell becomes applicable, namely the overlap ⟨M_λ|M_λ'⟩ is FAPP zero for λ ≠ λ'. This is because there are macroscopically many degrees of freedom which change their states strongly and differently enough for λ ≠ λ' under the action of the interaction Hamiltonian H_{Λ,env} = Σ_λ |λ⟩⟨λ| ĥ_env(λ). In the simplest example it can be a steel arrow of the voltmeter moving to a particular position. If this is not enough, complement it with your journal recording. Consider an arbitrary initial state expanded in the |e_λ⟩-basis (for simplicity, we will assume that λ is non-degenerate; a more general formalism will be introduced later)

|Ψ⟩ = Σ_λ ψ_λ(R_env) |e_λ⟩ .

To ensure that before the measurement the environment is "indifferent" to the state of the system, we write the expansion coefficients as a product, ψ_λ(R_env) = Φ(R_env) ψ_λ; i.e. the initial state is a simple product of the system and environment wavefunctions, |Ψ⟩ = |Φ⟩ Σ_λ ψ_λ |e_λ⟩. It will evolve during the measurement process into (the process has to be fast with respect to the system's dynamics to qualify as the "measurement of Λ in a given state of the system")

Ψ → Σ_λ e^{−i[Ĥ_env + ĥ_env(λ)]t} Φ(R_env) ψ_λ |e_λ⟩ = Σ_λ M_λ(R_env) ψ_λ |e_λ⟩ .

Correspondingly,

ρ̂^(S) = Σ_{λλ'} ψ*_{λ'} ψ_λ |e_λ⟩⟨e_{λ'}|

will evolve into

ρ̂^(S) = Σ_{λλ'} ⟨M_{λ'}|M_λ⟩ ψ*_{λ'} ψ_λ |e_λ⟩⟨e_{λ'}| = Σ_{λλ'} δ_{λλ'} ψ*_{λ'} ψ_λ |e_λ⟩⟨e_{λ'}| = Σ_λ |ψ_λ|² |e_λ⟩⟨e_λ| ,        (10)

with ⟨M_λ|M_{λ'}⟩ = δ_{λλ'}. This density matrix is diagonal in the λ-representation and leads to

⟨Λ⟩ = Σ_λ ρ^(S)_{λλ} λ = Σ_λ p_λ λ .        (11)

The subsequent evolution of this density matrix is exactly the same as the evolution of an ensemble of independent copies of the system, each in one of the eigenstates |e_λ⟩, as if each copy had collapsed into the state |e_λ⟩ because the observed value of Λ̂ happened to be λ. If the λ values have degeneracy ν, then the result will be

ρ̂^(S) = Σ_{λ,ν,ν'} ψ_{λν} ψ*_{λν'} |e_{λν}⟩⟨e_{λν'}| ,        (12)

or, identically,

ρ̂^(S) = Σ_λ p_λ Σ_{ν,ν'} (ψ_{λν} ψ*_{λν'} / p_λ) |e_{λν}⟩⟨e_{λν'}| ,        (13)

with

p_λ = Σ_ν |ψ_{λν}|² .        (14)
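A minimal numerical illustration of Eqs. (9)-(10) (the two-level system, the qubit "environment", and all angles below are assumptions made for the sketch, not the notes' specific model): an entangled system-environment state is built explicitly, the environment is traced out, and the reduced density matrix comes out essentially diagonal with weights |ψ_λ|² once the pointer states M_0, M_1 are (FAPP) orthogonal.

```python
import numpy as np

# Toy pointer model: a two-state system (lambda = 0, 1) coupled to N environment
# qubits playing the role of the measuring machine. Conditioned on lambda, every
# environment qubit is rotated by its own angle, so the two pointer states have
# an exponentially small overlap.
rng = np.random.default_rng(2)
psi = np.array([0.8, 0.6 + 0.0j])               # system amplitudes psi_lambda
N = 12                                          # environment of N qubits (2^N states)
angles = rng.uniform(1.0, 2.5, size=N)

def qubit(theta):
    # |0> rotated by angle theta about the y axis
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def product_state(factors):
    out = np.array([1.0 + 0.0j])
    for f in factors:
        out = np.kron(out, f)
    return out

M0 = product_state([qubit(0.0)] * N)            # pointer state for lambda = 0
M1 = product_state([qubit(a) for a in angles])  # pointer state for lambda = 1
print("overlap <M_0|M_1> =", np.vdot(M0, M1))   # FAPP zero already for N = 12

# Full entangled state Psi(lambda, R_env) = psi_lambda * M_lambda(R_env).
Psi = np.vstack([psi[0] * M0, psi[1] * M1])     # shape (2, 2^N)

# Eq. (9): reduced density matrix, tracing out the environment.
rho_S = Psi @ Psi.conj().T
print(np.round(np.abs(rho_S), 6))               # ~ diag(|psi_0|^2, |psi_1|^2), Eq. (10)
print(np.isclose(np.trace(rho_S).real, 1.0))
```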

This is essentially a 'multiple Universe' or 'multiverse' interpretation of Quantum Mechanics: the measuring process splits the world into distinctive states with FAPP zero overlap; these states FAPP never interfere in the future and go their separate ways FAPP forever. Please, do not worry about those multiple splittings and where they are all 'stored': the Hilbert space of 10^60 degrees of freedom is FAPP infinite!

Mathematical formalism

Since splitting things into the system and environment is rather arbitrary and mostly depends on the properties of the measured quantity Λ, let us summarize our discussion into the following formal rules/procedures. When the quantity Λ is measured, the possible outcomes are given by the probabilities

p_λ = Tr ρ̂ P̂_λ ,        (15)

where P̂_λ is the projector on the subspace spanned by the eigenstates |e_{λ,ν}⟩ with the same eigenvalue,

P̂_λ = Σ_ν |e_{λ,ν}⟩⟨e_{λ,ν}| .        (16)

Here the index ν refers to all possible degeneracies for a given eigenvalue of Λ, or degrees of freedom decoupled from Λ (generalizing the previous discussion). FAPP, the new density matrix is given by

ρ̂^(new) = P̂_λ ρ̂ P̂_λ / Tr(P̂_λ ρ̂ P̂_λ) = P̂_λ ρ̂ P̂_λ / p_λ = Σ_{νν'} (ρ_{λ,ν;λ,ν'} / p_λ) |e_{λ,ν}⟩⟨e_{λ,ν'}| .        (17)

The denominator in (17) is a normalization constant necessary to obey Tr ρ̂^(new) = 1. We did nothing new relative to Eq. (10), except for introducing the extremely convenient notion of the projector P̂_λ, and used the fact that P̂_λ² = P̂_λ. The following three properties of projectors—associated with a certain Hermitian operator Λ̂—are especially important:

Λ̂ = Σ_λ λ P̂_λ ,        (18)

P̂_λ P̂_{λ'} = δ_{λ,λ'} P̂_λ ,        (19)

Σ_λ P̂_λ = 1̂   (completeness relation),        (20)

where 1̂ is the identity operator. As a Hermitian operator, an arbitrary ρ̂ can be written in diagonal form in the basis of its own eigenvectors,

ρ̂ |ψ_j⟩ = w_j |ψ_j⟩ ,        (21)

as

ρ̂ = Σ_j w_j |ψ_j⟩⟨ψ_j| ,   ⟨ψ_{j1}|ψ_{j2}⟩ = δ_{j1 j2} ,        (22)

along with w_j ≥ 0, Σ_j w_j = 1, allowing one to interpret w_j as the probabilities of finding the system in the state |j⟩. Correspondingly, ρ̂ can be interpreted as a statistical mixture (with appropriate probabilities)—and that is why the term "mixed"—of a special set of pure states. Of course, if

all w_j are zero except one, we are back to the pure-state situation. Indeed, the probability of measuring λ for the quantity Λ is given by [see (15)]

p_λ = Tr ρ̂ P̂_λ = Σ_j w_j ⟨ψ_j|P̂_λ|ψ_j⟩ = Σ_j w_j |⟨ψ_j|λ⟩|² .        (23)
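A minimal sketch of the projective-measurement rules (15)-(17) and (22)-(23) in Python/NumPy. The observable, its degeneracies, and the mixture weights below are arbitrary assumptions for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed example: an observable Lambda (diagonal in the computational basis)
# with a doubly degenerate eigenvalue: eigenvalues (5, 5, -1).
lam_values = {5.0: [0, 1], -1.0: [2]}     # eigenvalue -> basis indices

def projector(indices, dim=3):
    P = np.zeros((dim, dim))
    for i in indices:
        P[i, i] = 1.0
    return P

# A generic mixed state rho = sum_j w_j |psi_j><psi_j| with orthonormal |psi_j>, Eq. (22).
w = np.array([0.5, 0.3, 0.2])
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
rho = sum(wj * np.outer(Q[:, j], Q[:, j].conj()) for j, wj in enumerate(w))

probs = {}
for lam, idx in lam_values.items():
    P = projector(idx)
    p = np.trace(rho @ P).real            # Eq. (15): p_lambda = Tr(rho P_lambda)
    rho_new = P @ rho @ P / p             # Eq. (17): "collapsed" density matrix
    probs[lam] = p
    print(lam, round(p, 4), np.isclose(np.trace(rho_new).real, 1.0))

print(np.isclose(sum(probs.values()), 1.0))   # completeness relation, Eq. (20)
```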

As the system evolves according to (4), or (5), all the eigenvalues (but not the eigenvectors!) of ρ̂ remain preserved, because

ρ̂(t) = Σ_j w_j |ψ_j(t)⟩⟨ψ_j(t)| ,   ⟨ψ_{j1}(t)|ψ_{j2}(t)⟩ = δ_{j1 j2} ,        (24)

where |ψ_j(t)⟩ = e^{−iĤt} |ψ_j(0)⟩ evolves in accordance with the Schrödinger equation for a pure state. The same result can be written as (using the unitary evolution operator)

ρ̂(t) = U(t) ρ̂(0) [U(t)]† .        (25)
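A short numerical check of the statement above Eq. (24)—unitary evolution (25) changes the eigenvectors of ρ̂ but not its eigenvalues. The Hamiltonian, the weights, and the evolution time are arbitrary assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed example Hamiltonian and mixed state (hbar = 1).
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2                       # random Hermitian H

w = np.array([0.6, 0.3, 0.1])
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
rho0 = (Q * w) @ Q.conj().T                    # rho(0) with eigenvalues w

# U(t) = exp(-i H t), built from the eigendecomposition of H.
t = 1.7
E, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T

rho_t = U @ rho0 @ U.conj().T                  # Eq. (25)

print(np.round(np.sort(np.linalg.eigvalsh(rho0)), 6))
print(np.round(np.sort(np.linalg.eigvalsh(rho_t)), 6))   # same spectrum {w_j}
```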

Problem 8. The Hilbert space of the system X is three-dimensional. The current state of the system is described by the density matrix

ρ̂ = Σ_{j=1}^{3} w_j |ψ_j⟩⟨ψ_j| ,   ⟨ψ_{j1}|ψ_{j2}⟩ = δ_{j1 j2} .        (26)

The matrix of the observable A—in the ONB of eigenvectors of A—has the form

A = [[2015, 0, 0], [0, 2015, 0], [0, 0, −2003]] .        (27)

In the same ONB (of eigenvectors of A), the vectors |ψ_j⟩ have the following representation:

|ψ_1⟩ = (1/√2, 1/√2, 0)ᵀ ,   |ψ_2⟩ = (1/√2, −1/√2, 0)ᵀ ,   |ψ_3⟩ = (0, 0, 1)ᵀ .        (28)

(a) What is the probability to find the eigenvalue 2015 when measuring A in the state ρ̂?

(b) What will be the new (“collapsed”) density matrix after the eigenvalue 2015 is observed?

The interplay between “destructive” and probabilistic aspects of measurements can be quite nontrivial. The following example shows how a sequence of measurements with an essentially deterministic outcome can be used to perform a dramatic, but absolutely predictable change of a pure state of a system.

Problem 9. Let a spin-1/2 system be prepared in the spin-up state with respect to some axis z. Suppose one performs N successive measurements of the spin projection, with each new measurement being along the axis z_n that is rotated with respect to the original axis by the angle n(θ/N), where θ ∼ 1 is fixed and N is supposed to be very large compared to unity. You may find this useful (S⃗ = (1/2)σ⃗):

σ⃗·n̂ = [[n_z, n_x], [n_x, −n_z]] = [[cos φ, sin φ], [sin φ, −cos φ]] .

Prove that in each of the successive N measurements, the probability to find the spin-up state approaches unity as N → ∞. That is, the given sequence of measurements rotates the spin-up state by the angle θ deterministically.
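A numerical check of the claim in Problem 9 (this is an illustration with assumed parameters, not a substitute for the requested proof): repeated projective measurements along slowly rotating axes drag the spin-up state through the angle θ, with the probability of the all-"up" record approaching unity as N grows.

```python
import numpy as np

def sigma_n(phi):
    # (sigma . n) for a unit vector n = (sin phi, 0, cos phi) in the x-z plane,
    # as in the matrix quoted in the problem statement.
    return np.array([[np.cos(phi),  np.sin(phi)],
                     [np.sin(phi), -np.cos(phi)]])

def up_state(phi):
    # Eigenvector of sigma_n(phi) with eigenvalue +1 ("spin up" along that axis).
    vals, vecs = np.linalg.eigh(sigma_n(phi))
    return vecs[:, np.argmax(vals)]

theta = 1.0                       # total rotation angle, theta ~ 1
for N in (10, 100, 1000):
    state = up_state(0.0)         # start spin-up along z
    p_total = 1.0
    for n in range(1, N + 1):
        up = up_state(n * theta / N)
        p_up = abs(np.vdot(up, state))**2   # probability of "up" along the new axis
        p_total *= p_up
        state = up                # projective measurement: collapse onto "up"
    print(N, round(p_total, 6))   # probability of the all-"up" record -> 1 as N grows
```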

Composition postulate

The composition postulate deals with the structure of the Hilbert space of a composite system consisting of two or more subsystems. It is sufficient to formulate it for two subsystems, since the generalization (by induction) to the case of many subsystems is then trivial. The composition postulate states that the Hilbert space of a system consisting of two subsystems is a direct product of the corresponding two Hilbert spaces. This, in particular, means that, if {|e_n^(I)⟩} and {|e_n^(II)⟩} are ONBs in the Hilbert spaces of the systems I and II, respectively, then the set of vectors

|e_{nm}⟩ = |e_n^(I)⟩ |e_m^(II)⟩ ,   ⟨e_{n1 m1}|e_{n2 m2}⟩ = δ_{n1,n2} δ_{m1,m2} ,        (29)

forms an ONB in the Hilbert space of the composite system. Any vector of the combined system can then be represented as a linear superposition

|ψ⟩ = Σ_{nm} c_{nm} |e_{nm}⟩ ≡ Σ_{nm} c_{nm} |e_n^(I)⟩ |e_m^(II)⟩ .        (30)

Depending on the structure of the Hamiltonian, the two subsystems may, or may not, interact with each other. The two subsystems do not interact with each other if and only if the Hamiltonian of the composite system reduces to a direct sum of two independent Hamiltonians, each dealing with the corresponding subsystem:

H = H(I) + H(II) (noninteracting subsystems). (31)
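A minimal sketch of the composition postulate (29)-(31) in Python/NumPy (the dimensions and the example Hamiltonians below are assumptions): composite basis vectors are Kronecker products, expansion coefficients c_nm are just components in that basis, and a noninteracting Hamiltonian is a Kronecker sum.

```python
import numpy as np

dI, dII = 2, 3
eI, eII = np.eye(dI), np.eye(dII)

# |e_nm> = |e_n^(I)>|e_m^(II)>, Eq. (29); orthonormality is inherited automatically.
basis = [np.kron(eI[n], eII[m]) for n in range(dI) for m in range(dII)]
G = np.array([[np.vdot(u, v) for v in basis] for u in basis])
print(np.allclose(G, np.eye(dI * dII)))

# Any composite state is a superposition with coefficients c_nm, Eq. (30).
rng = np.random.default_rng(5)
c = rng.normal(size=(dI, dII)) + 1j * rng.normal(size=(dI, dII))
c /= np.linalg.norm(c)
psi = sum(c[n, m] * basis[n * dII + m] for n in range(dI) for m in range(dII))
print(np.allclose(psi, c.reshape(-1)))        # c_nm are the components in the product basis

# Noninteracting subsystems, Eq. (31): H = H_I (x) 1 + 1 (x) H_II.
HI = np.diag([0.0, 1.0])
HII = np.diag([0.0, 0.5, 2.0])
H = np.kron(HI, np.eye(dII)) + np.kron(np.eye(dI), HII)
print(np.round(np.diag(H), 2))                # energies are sums E_n^I + E_m^II
```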

Distinctly different from the issue of interaction is the notion of entanglement between the two subsystems. The two subsystems are called disentangled (with respect to each other) if the density matrix ρ̂ of their composed state reduces to a direct product of individual density matrices²:

ρ̂ = ρ̂^(I) ρ̂^(II)   (disentangled subsystems).        (32)

² Pure states (wavefunctions) correspond to a particular case of the density matrix and need not be considered separately.

If two subsystems are noninteracting and disentangled in the initial state, then the equation of motion readily implies that the two subsystems will remain disentangled, with the individual density matrices evolving in accordance with the individual Hamiltonians. Interaction leads to entanglement, by which, formally speaking, one means such a structure of ρ̂ that cannot be reduced to (32). As opposed to interaction, which can be switched off at a certain moment of time, the entanglement, once created, will persist. Physically, entanglement implies certain correlations between the results of measurements dealing with the observables of both systems. If all measurements look at observables of only one of the subsystems (say, subsystem I), then from the previous discussion it follows that the outcome of these measurements is exhaustively described by the reduced density matrix of subsystem I, obtained by tracing out the variables of the other subsystem, see (9). Note that measuring an observable of system I often (but not necessarily always!) leads to disentanglement of the originally entangled systems. A simple and very important sufficient condition for disentanglement upon measurement is that the observed eigenvalue λ is not degenerate (in subsystem I). In this case, the state of the systems disentangles into a product of the corresponding pure state |ψ_λ^(I)⟩ of system I and a certain new state (not necessarily pure!) of system II:

ρ̂_new = |ψ_λ^(I)⟩⟨ψ_λ^(I)| ρ̂_new^(II) .        (33)

Problem 10. Show the correctness of the above statement.

Thermal equilibrium and Gibbs distribution

In classical mechanics, the statistical description and the full dynamical description naturally deal with distinctively different (though related to each other) mathematical objects. The dynamical description—the canonical formalism—operates with canonical dynamical variables (generalized coordinates and momenta), while the statistical description is all about the distribution functions for these canonical variables. In quantum mechanics—thanks to its probabilistic nature, on one hand, and linearity, on the other hand—the statistical description operates with the very same object as the dynamic theory, the density matrix³. The only (essential) difference is that, in the dynamic theory, one would need to address the density matrix (or the wavefunction, in the case of a pure state) of the full system, while quantum statistics is constructed in terms of the reduced density matrix (9), describing either a smaller subsystem of a macroscopic body, or a system (no matter microscopic or macroscopic) weakly coupled to some heat bath. In the case of the equilibrium statistics considered below, both cases are essentially equivalent. For example, in the case of ideal Bose or Fermi gases, these small subsystems are nothing but single-particle eigenmodes.⁴ Our goal is to establish the form of the density matrix for a small part of a macroscopic body in the state of thermal equilibrium. By a macroscopic body we understand a system with an infinite (in the limit of infinite volume) number of subsystems of arbitrary nature that are weakly coupled to each other, our subsystem of interest being one of them. The condition of infinitesimal weakness of the coupling is crucial for the subsystems to preserve their individuality. In quantum-mechanical language,

³ Often referred to, in this context, as the statistical operator.
⁴ Recall that, in the case of bosons, a single-particle eigenmode is equivalent to a quantum harmonic oscillator; and, in the case of fermions, a single-particle eigenmode is equivalent to a two-level system.

this means that the entanglement between any two subsystems is negligibly small, so that the density matrix ρ̂_AB of any two subsystems A and B (viewed as a united system AB) splits into a direct product of individual density matrices [cf. (32)]:

ρ̂_AB = ρ̂_A ρ̂_B .        (34)

Apart from the rather simple requirement (34), the other circumstance one needs to take into account to establish the form of the equilibrium density matrix, say ρ̂_A, is that, by definition, the equilibrium matrix does not evolve in time. Since the infinitesimally weak interaction with the rest of the system can be ignored, from the equation of motion we thus have

[H_A, ρ̂_A] = 0 ,        (35)

where H_A is the Hamiltonian of subsystem A. By introducing, for convenience, the logarithm of the density matrix, and rewriting (34) and (35) as

ln ρ̂_AB = ln ρ̂_A + ln ρ̂_B ,   [H_A + H_B, ln ρ̂_AB] = 0 ,        (36)

we conclude that the logarithm of the equilibrium density matrix is an additive conserved quantity. In the most general case, there is only one universal additive conserved quantity (up to a global scaling factor and a global constant): the energy, with the corresponding operator—the system's Hamiltonian—being the sum of the individual Hamiltonians for all the subsystems. In this case, for any subsystem A of our weakly coupled macroscopic system, we find

ln ρ̂_A = c_0 + c_1 H_A   →   ρ̂_A = e^{−βH_A} / Z_A ,   Z_A = Tr e^{−βH_A} ,        (37)

where the value of the parameter c_1 = −β = −1/T is the same for each subsystem. Here we took into account the normalization condition Tr ρ̂_A = 1, allowing us to fix one of the two constants, namely the additive constant c_0. We see that, in the case when energy is the only additive conserved quantity, the equilibrium statistics is exhaustively characterized by the system's Hamiltonian and a global (for all the subsystems, or, equivalently, the heat bath) parameter T. Expression (37) is known as the Gibbs distribution for a quantum system.
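A minimal numerical sketch of the Gibbs distribution (37) in Python/NumPy; the Hamiltonian and temperature below are arbitrary assumptions (not those of Problem 11). It checks normalization, stationarity (35), and evaluates the mean energy and the entropy of Eq. (42) introduced later.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(4, 4))
H = (A + A.T) / 2                      # example Hermitian Hamiltonian
T = 0.7                                # temperature (k_B = 1)
beta = 1.0 / T

E, V = np.linalg.eigh(H)
Z = np.sum(np.exp(-beta * E))          # partition function Z = Tr e^{-beta H}
rho = V @ np.diag(np.exp(-beta * E) / Z) @ V.T     # rho = e^{-beta H} / Z, Eq. (37)

print(np.isclose(np.trace(rho), 1.0))              # normalization fixes c_0
print(np.allclose(H @ rho - rho @ H, 0))           # stationarity, Eq. (35)

E_mean = np.trace(rho @ H)                         # <E> = Tr(rho H)
w = np.exp(-beta * E) / Z
S = -np.sum(w * np.log(w))                         # entropy, cf. Eq. (42) below
print(round(E_mean, 4), round(S, 4))
```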

Problem 11. A three-level system (in some ONB) is described by the Hamiltonian

H = ε [[2, 1, 0], [1, 2, 0], [0, 0, 3]] .        (38)

The system is in equilibrium with a heat bath at temperature T = 0.5ε.

(a) What is the expectation value of the system’s energy?

(b) What is the probability that a measurement of the system’s energy will produce the result 3ε?

(c) What is the probability to find the system in the state |ψ_1⟩ = (1, 0, 0)ᵀ ?        (39)

Next in importance (and rather general in condensed matter) is the case when we are dealing with a system of N particles, the total number of which is a conserved quantity.⁵ Since N is also an additive integral of motion, the logarithm of the density matrix should contain an extra term c_2 N_A. The standard convention is to write c_2 = −μc_1 = βμ in terms of yet another constant μ, called the chemical potential. The final result is identical to the Gibbs distribution (37) with a modified Hamiltonian, H_A → H'_A:

H'_A = H_A − μN_A .        (40)

The Gibbs distribution based on H_A − μN_A is called the grand canonical distribution, while the Gibbs distribution (37) is called the canonical distribution. Likewise, H'_A is often referred to as the grand canonical Hamiltonian. The statistical operator is diagonal in the eigenenergy representation [for clarity, below we omit the subscript A and do not distinguish between the canonical and grand canonical cases]:

ρ̂ = Σ_n w_n |n⟩⟨n| ,   w_n = e^{−βE_n}/Z ,   Z = Σ_n e^{−βE_n} .        (41)

In accordance with the previously discussed statistical interpretation of the density matrix, Eq. (41) allows one to interpret quantum equilibrium statistics as that of an ensemble of pure states {|n⟩}. Such an ensemble is called the (grand) canonical ensemble. The rest of thermodynamics follows directly from Eq. (41). The corresponding derivations belong to the statistical mechanics course, where it is shown that, starting from an expression for Z as a function of T, μ, and the system's volume V, one can obtain all thermodynamic quantities by taking partial derivatives with respect to T, μ, and V. In particular, the entropy is given by (do it in two lines):

S = −Tr ρ̂ ln ρ̂ = −Σ_n w_n ln w_n .        (42)

Einstein-Podolsky-Rosen (EPR) paradox

In 1935 Einstein, Podolsky, and Rosen, being unhappy with the status of quantum mechanics as a theory, constructed an example of a system which they believed demonstrated a clear-cut contradiction between the quantum mechanical predictions and what a reasonable physical theory should be. Here is the setup. Consider two spins-1/2 prepared in the singlet state, which is a superposition state now known as the EPS (entangled pair state) state:

|EPS⟩ = ( |↑⟩|↓⟩ − |↓⟩|↑⟩ ) / √2 .        (43)

Then, making sure that the spin variables are decoupled from the rest of the world and from each other (i.e. keeping the EPS state undisturbed), take the first spin (and the lucky space explorer) to the

⁵ In this respect, very instructive is the difference between the quantum statistics of photons in a cavity (or phonons in a solid or liquid) and the statistics of an ideal gas of bosonic atoms. In the former case, the interaction with the other degrees of freedom of the system (or the heat bath) does not conserve the total number of photons (phonons), and, for each eigenmode, we should use Eq. (37), while in the latter case, the conservation of the total number of particles has a profound effect on the statistics, leading, in particular, to the phenomenon of Bose-Einstein condensation.

distant Galaxy, while the second spin remains on Earth. Next, try to measure the z-components of the spins very fast, so that even light cannot be used to communicate the results of the Earth and Galaxy experiments while they are conducted.

Problem 12. Show that |EPS⟩ has the same form, Eq. (43), for any choice of the direction of the quantization axis. To this end, find the two eigenstates |â±⟩ of the operator σ⃗·â, where â is a unit vector [the eigenvalues are ±1: (σ⃗·â)|â±⟩ = ±|â±⟩], and re-expand |EPS⟩ in terms of them.

Now, a reasonable physics theory should say that the result of the Galaxy measurement cannot depend on what is measured on Earth, and how, because physically the two spins are completely decoupled. At this point EPR noted that if on Earth the spin was measured "up" with respect to any axis, then the result of the Galaxy measurement is predetermined! It has to be "down" along the same axis for sure. But if the measurement on Earth was not done, then the Galaxy result is completely random! EPR concluded that this "spooky action at a distance" implies that there are some hidden (not mentioned explicitly yet) labels or variables which are "attached" to the spins. For example, both spins may carry a variable representing a solid angle Ω on the sphere, which is set at random when the EPS pair is created. The two spins carry their variables Ω_1 = Ω and Ω_2 = −Ω, and in later experiments one always finds them pointing in opposite directions. This type of reasoning is called the "theory of hidden variables". The EPR conclusion was that this example proves that quantum mechanics is an incomplete theory and it simply misses extra variables which would make it deterministic. Einstein carried his "God does not play dice" conviction all his life. Indeed, quantum mechanics predicts that if the Earth (E) and Galaxy (G) measurements for many EPS pairs are written in the lab journals, and after a few light years they are compared to each other, then one will find (u = "up", d = "down")

u d d u u d d u d u u d ...   (E)
d u u d d u u d u d d u ...   (G)        (44)

i.e. perfect anticorrelation. But so what? Can the lucky explorer ever tell whether experiments were conducted on Earth or not by looking at the G-journal? The answer is (see Problem below) NO!

Problem 13. Show that the reduced density matrix of each of the two spins—obtained by tracing the full density matrix ρ̂ = |EPS⟩⟨EPS| over the variables of the other spin—is nothing but one half of the unity operator.

If the E-measurements were not done, then the G-measurements would appear absolutely random, i.e. u and d would happen with probability 1/2. The G-measurements are predetermined after the E-measurements, but the E-measurements (done before the G-measurements) have random outcomes with probability 1/2 for u and d. It means that the sequence of u and d in the G-journal is statistically exactly the same no matter what is happening on Earth! This is the quantum mechanical version of no "spooky action at a distance"; thus, formally, there is no good reason for pushing the paradox if you cannot tell the difference, never ever!
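A short numerical illustration of this discussion in Python/NumPy (a check, not the solution write-up of Problem 13): for the singlet |EPS⟩ the reduced density matrix of either spin is one half of the identity, so the local statistics are completely random, while the joint z-outcomes are perfectly anticorrelated.

```python
import numpy as np

up, dn = np.array([1.0, 0.0]), np.array([0.0, 1.0])
EPS = (np.kron(up, dn) - np.kron(dn, up)) / np.sqrt(2)

rho = np.outer(EPS, EPS.conj()).reshape(2, 2, 2, 2)   # indices (s1', s2', s1, s2)
rho_1 = np.einsum('ijkj->ik', rho)                    # trace out spin 2
print(np.round(rho_1, 3))                             # = identity / 2 (Problem 13)

# Joint probabilities of (sigma_z^1, sigma_z^2) outcomes: only (u,d) and (d,u) occur.
for s1, v1 in (("u", up), ("d", dn)):
    for s2, v2 in (("u", up), ("d", dn)):
        p = abs(np.vdot(np.kron(v1, v2), EPS))**2
        print(s1, s2, round(p, 3))
```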

Bell’s inequality (theorem)

The other way to present the paradox is to say: "When spin 1 is measured along the z-axis and found to be ↑_z, this instantaneously changes the state of spin 2 to ↓_z. But outside of the light cone (so that any physical communication between the spins is impossible) the notion of past and future is relative, so in another reference frame spin 2 can be measured along the x-axis and can be found, say, to be in the state ↓_x, which means that the spin 1 state has to change to ↑_x. This is a contradiction." Well, the counter-argument is: "OK, if after the measurement of spin 1 along z we decide that spin 2 has to be in the state ↓_z, this still does not prevent one from finding spin 1 in the state ↓_x afterwards. What is your problem? If you want to check that spin 2 is now in the state ↑_x, then you have to skip the first measurement of spin 1 along z, because otherwise you are messing with the same spin twice and destroying the original state." Still, one may wonder, what if some kind of local hidden variables theory is indeed standing behind quantum mechanics? By "local" we mean that by measuring one of the spins we cannot change the state of its counterpart in any physical sense. The assumption of locality is most natural, given that the spatial distance between the two spins can be arbitrarily large. Furthermore, in the absence of locality, hidden variables can hardly be of any non-academic importance. Quantum mechanics is certainly non-local in this particular sense (this does not violate the principle of impossibility of communicating physical information at speeds exceeding c). After nearly 30 years of numerous attempts by many to construct a local hidden variables theory, John Bell elegantly showed that such a theory is impossible in principle, by formulating inequalities which every local hidden variables theory has to satisfy and which quantum mechanics violates. Subsequently, his inequalities were indeed found to be violated experimentally. This was probably the most important development in QM after 1926. The idea is to measure correlations in measurements along non-commuting axes. For example, measure spin 1 along ẑ and spin 2 along x̂, or spin 1 along x̂ and spin 2 along ŷ, etc., i.e. measure them along any two axes a and b with a ≠ b. If the two spins "agreed" ahead of time (using hidden variables) to point in opposite directions, then there will be correlations between the results of the two measurements. To study the correlations statistically, perform many identical experiments and record the frequency of different outcomes for various setups of the two axes. Let C_{a,b}(σ_a|σ_b) be the probability of the outcome σ_a = ±1/2 for spin 1 when the subsequent measurement resulted in σ_b for spin 2. Similarly for C_{a,b}(σ_b|σ_a), but now spin 1 is measured after spin 2. These are probabilities filtered according to the second measurement outcomes. By C̄_{a,b}(σ_a, σ_b) we denote the joint probability of having the outcomes σ_a, σ_b. Now, our spins are in different Galaxies, and any local classical theory has to state that the statistics of measurements on one of the spins may not depend on what we will do next to the other spin, so

C_{a,b}(σ_a|σ_b) = C_a(σ_a) ,   C_{a,b}(σ_b|σ_a) = C_b(σ_b) ,   and   C̄_{a,b}(σ_a, σ_b) = C_a(σ_a) C_b(σ_b) ,

no matter how the initial state is correlated through hidden variables. This equation is known as Bell's locality condition for a local hidden variables theory. Let us now prove a Lemma: the quantity

S = αβ + αβ' + α'β − α'β'   for   α, α', β, β' ∈ [−1/2, 1/2]

is such that |S| ≤ 1/2. Indeed, since S is linear in all variables, it has to reach its maximum and minimum at the corners of the domain of definition, i.e. when all

variables are ±1/2. Now, writing it as S = (α + α')(β + β') − 2α'β', we see that at the corners the first term can only be ±1 or 0, and the second only ∓1/2, with the opposite sign between the two. This proves that |S| ≤ 1/2. ∎

While doing measurements on spins one may calculate single-spin averages

α = Σ_σ σ C_a(σ) ,   α' = Σ_σ σ C_{a'}(σ) ,   β = Σ_σ σ C_b(σ) ,   β' = Σ_σ σ C_{b'}(σ) ,

and pair-wise averages

P(a, b) = Σ_{σ_a σ_b} σ_a σ_b C̄_{ab}(σ_a, σ_b) .

According to the locality condition, the joint probabilities factorize, and we have P(a, b) = αβ. Then the Lemma proved for S leads to the conclusion that

−1/2 ≤ P(a, b) + P(a, b') + P(a', b) − P(a', b') ≤ 1/2 .        (45)

Again, all probabilities are written for a particular set of hidden variables ζ (not mentioned anywhere yet, but the same in all expressions). At the end of the day one will further average all expressions over some f(ζ) to account for the possibility that, when the initial state was prepared, ζ was selected at random from the distribution f. The inequality (45) still holds. All of this is very appealing, but next comes the quantum mechanical calculation for a particular choice of axes that violates Bell's inequality! Let the initial state be the EPS pair, which looks the same when the spins are quantized along any axis. This simplifies the calculation, since we can always choose the axis of the magnet used to measure the first spin to be along ẑ, and the axis of the other magnet at angle θ to it. Quantum mechanically, the probability that we measure the first spin to be "up" along axis a and the other "up" along axis b is then C̄(+, +) = sin²(θ/2)/2. Similarly, C̄(−, −) = sin²(θ/2)/2 and C̄(+, −) = C̄(−, +) = cos²(θ/2)/2. Altogether, the quantum mechanical average of the different outcomes for σ_1 σ_2 will be

P^(Q)_{a,b} = (1/4) [ sin²((θ_a − θ_b)/2) − cos²((θ_a − θ_b)/2) ] = −(1/4) cos(θ_a − θ_b) .        (46)

Finally, choose the following axis set for our measurements (I quote the angles with respect to some predefined z-axis):

θ_a = π/2 ,   θ_a' = 0 ,   θ_b = −3π/4 ,   θ_b' = −π/4 ,

and compute the corresponding correlations:

P_{a,b} = −(1/4) cos(5π/4) = 1/(4√2) ,   P_{a,b'} = −(1/4) cos(3π/4) = 1/(4√2) ,

P_{a',b} = −(1/4) cos(3π/4) = 1/(4√2) ,   P_{a',b'} = −(1/4) cos(π/4) = −1/(4√2) .

This gives us S = 1/√2 ≈ 0.707, in clear violation of Bell's locality condition (45). One can further show that our choice of axes results in the maximum possible violation of (45) for the two spin-1/2 system.
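A quantum-mechanical cross-check of these numbers in Python/NumPy (a sketch using the same singlet state and the same four axes; the correlator is computed directly as an expectation value of the product of spin-1/2 projections):

```python
import numpy as np

def sigma_n(phi):
    # sigma . n for a unit vector n = (sin phi, 0, cos phi) in the x-z plane
    return np.array([[np.cos(phi),  np.sin(phi)],
                     [np.sin(phi), -np.cos(phi)]])

up, dn = np.array([1.0, 0.0]), np.array([0.0, 1.0])
EPS = (np.kron(up, dn) - np.kron(dn, up)) / np.sqrt(2)

def P(phi_a, phi_b):
    # correlator of (sigma.a)/2 for spin 1 with (sigma.b)/2 for spin 2 in |EPS>
    op = np.kron(sigma_n(phi_a) / 2, sigma_n(phi_b) / 2)
    return np.vdot(EPS, op @ EPS).real

th_a, th_a2, th_b, th_b2 = np.pi / 2, 0.0, -3 * np.pi / 4, -np.pi / 4
S = P(th_a, th_b) + P(th_a, th_b2) + P(th_a2, th_b) - P(th_a2, th_b2)
print(round(P(th_a, th_b), 4), round(-0.25 * np.cos(th_a - th_b), 4))  # agrees with Eq. (46)
print(round(S, 4), round(1 / np.sqrt(2), 4))   # S = 1/sqrt(2) > 1/2: violates Eq. (45)
```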

The actual experiments were done with photon pairs, using photon polarization states instead of spin projections and polarizers instead of magnets. The result was in agreement with quantum mechanics, not the hidden variables theory.

Schrödinger cats

Suppose the initial state of the system was prepared as a superposition of two eigenstates (let us denote them |↑⟩ and |↓⟩) of some quantity Λ:

|ψ⟩ = c_1 |↑⟩ + c_2 |↓⟩ .        (47)

The system itself can be anything you like, say a "cat", and the operator in question determines whether it is "dead" or "alive". Imagine now that Λ was measured, and assume that the "outside world" state was disentangled from the system before the measurement. As discussed previously, the state after the measurement will become

|final⟩ = c_1 M_↑(R_env) |↑⟩ + c_2 M_↓(R_env) |↓⟩ .        (48)

On one hand, this expression clearly states that the "cat" is both "dead" and "alive" at the same time, and we "know" about that because we measured the system! This is the essence of the "Schrödinger cat" paradox. However, one should not forget that the "dead" and "alive" states are now attached to the M_↑(R_env) and M_↓(R_env) states of the world, and the overlap between the two states in the superposition is FAPP zero. This means that not only after the measurement, but also in all future experiments ever done on the system, one would never be able to make the measured states interfere, and thus will never be able to establish that the cat was dead and alive at the same time—in practice it will always appear either dead or alive. Why do people keep discussing Schrödinger cats? Because experimentally, the "cat" AND the "world" can be made so small that the overlap between the two states in (48) is no longer zero, and detecting their interference (or converting one into another under evolution) becomes possible. However, the notion of "observation" or "measurement" does not apply to such systems, because the observers have not "made up their minds" yet!

Effective dynamics under quantum measurement

The requirement that the measurement of Λ leaves intact the eigenstates of the observable Λ implies a universal (for all possible measuring devices) form of the Hamiltonian describing the evolution of the system and device during the measurement process (we have used this form before):

H_mes = Σ_λ P̂_λ H_λ^(dev)(t)   (device + system-device interaction).        (49)

Here H_λ^(dev) is a certain (rather general and time-dependent—the measurement has to be performed fast) Hermitian operator acting in the Hilbert space of the measuring device. This form implies an equally simple and generic form of the corresponding evolution operator for the duration of the measurement:

U_mes = Σ_λ P̂_λ U_λ^(dev) .        (50)

Here U_λ^(dev) is the evolution operator (in the Hilbert space of the device) corresponding to the Hamiltonian H_λ^(dev) [to show that, use Eq. (19)]. In a standard measurement setup, the system and the device are initially disentangled, so that the initial total density matrix, ρ̂_0, is simply a direct product of two density matrices:

ρ̂_0 = ρ̂_0^(sys) ρ̂_0^(dev)   (before measurement).        (51)

After the measurement, the total density matrix becomes

ρ̂ = U_mes ρ̂_0 U_mes† = Σ_{λλ'} ( P̂_λ ρ̂_0^(sys) P̂_{λ'} ) U_λ^(dev) ρ̂_0^(dev) [U_{λ'}^(dev)]† .        (52)

For macroscopic devices, the requirement of macroscopic distinguishability of the device's states corresponding to different λ's (by myself!) is equivalent to adding myself to the device variables and then tracing out my degrees of freedom. This immediately leads to the λ = λ' condition, see (10). This leaves us with the effective density matrix after the measurement

ρ̂ = Σ_λ ( P̂_λ ρ̂_0^(sys) P̂_λ ) ρ̂_λ^(dev) ,   ρ̂_λ^(dev) = U_λ^(dev) ρ̂_0^(dev) [U_λ^(dev)]† .        (53)

Note that ρ̂_λ^(dev) has the very transparent meaning of the final state of the measuring device corresponding to the evolution operator U_λ^(dev). Finally, observe that

ρ̂ = Σ_λ p_λ ρ̂_λ^(sys) ρ̂_λ^(dev) .        (54)

That is, consistently with all aspects of the measuring axioms, ρ̂ has the form of a statistical mixture (with the weights p_λ) of new states of the system and measuring device. Needless to say, the new states of the measuring device are distinctively different (orthogonal):

ρ̂_λ^(dev) ρ̂_{λ'}^(dev) = 0 ,   if λ' ≠ λ.        (55)

Schmidt decomposition

Suppose we split a system into two subsystems, I and II. With a general choice of ONBs in the Hilbert spaces of the subsystems I and II, an expansion of any pure state |ψ⟩ of the total system has the generic form (30). There exists, however, a special choice of the orthonormal sets of vectors in the two subsystems, {|φ_n^(I)⟩} and {|φ_n^(II)⟩} (the same subscript is not a typo!), when

|ψ⟩ = Σ_n a_n |φ_n^(I)⟩ |φ_n^(II)⟩ .        (56)

In principle, the coefficients a_n can be rendered real and nonnegative, since a phase factor can always be absorbed into the basis vectors.

In mathematics, the representation (56) is known as the Schmidt decomposition. It has quite important physical implications. Namely, for the two reduced density matrices it yields

ρ̂^(I) = Σ_n |a_n|² |φ_n^(I)⟩⟨φ_n^(I)| ,   ρ̂^(II) = Σ_n |a_n|² |φ_n^(II)⟩⟨φ_n^(II)| ,        (57)

revealing that, despite all possible qualitative and quantitative differences between the two subsystems (including vastly different dimensions of the Hilbert spaces), the two reduced density matrices always feature a remarkable correspondence between their eigenvectors, with exactly the same eigenvalues, given by |a_n|².

Now the proof. Let the dimension of the Hilbert space of subsystem I be lower than or equal to that of subsystem II. Given the pure state |ψ⟩, construct ρ̂^(I) = Tr^(II) ρ̂ using ρ̂ = |ψ⟩⟨ψ|, and find its eigenvalues, {|a_n|²}, and eigenstates, {|φ_n^(I)⟩}. Then,

|ψ⟩ = Σ_n |φ_n^(I)⟩ |ϕ_n^(II)⟩ ,        (58)

where the {|ϕ_n^(II)⟩} are nothing but the coefficients of the expansion of ψ(X_I, X_II) over the basis {|φ_n(X_I)⟩}, for each X_II. They are certain states of system II. Correspondingly,

ρ̂ = Σ_{n1 n2} |φ_{n1}^(I)⟩ |ϕ_{n1}^(II)⟩ ⟨ϕ_{n2}^(II)| ⟨φ_{n2}^(I)| ,        (59)

ρ̂^(I) = Tr^(II) ρ̂ = Σ_{n1 n2} |φ_{n1}^(I)⟩ ⟨ϕ_{n2}^(II)|ϕ_{n1}^(II)⟩ ⟨φ_{n2}^(I)| .        (60)

So far, we have used only the fact that {|φ_n^(I)⟩} is a certain ONB in the Hilbert space of subsystem I. Now, recall that {|φ_n^(I)⟩} is the eigenbasis of ρ̂^(I), meaning that all n1 ≠ n2 terms in (60) have to be identically equal to zero:

⟨ϕ_{n2}^(II)|ϕ_{n1}^(II)⟩ = 0   if n1 ≠ n2.        (61)

After normalization, we arrive at (57) with

|φ_n^(II)⟩ = |ϕ_n^(II)⟩ / √(⟨ϕ_n^(II)|ϕ_n^(II)⟩) .        (62)

For a density matrix ρ̂, the quantity −Tr ρ̂ ln ρ̂ = −Σ_j w_j ln w_j is called the entropy [recall (42)]. The entanglement between two subsystems in a pure global state is characterized by the entanglement entropy

S_entang = −Tr ρ̂^(I) ln ρ̂^(I) = −Tr ρ̂^(II) ln ρ̂^(II) = −Σ_n |a_n|² ln |a_n|² .        (63)
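A minimal numerical sketch of the Schmidt decomposition and the entanglement entropy in Python/NumPy (the dimensions and the random state are assumptions for the illustration): for a pure state with coefficient matrix c_nm from Eq. (30), the Schmidt coefficients a_n are the singular values of c, and both reduced density matrices share the spectrum {|a_n|²} of Eq. (57).

```python
import numpy as np

rng = np.random.default_rng(7)
dI, dII = 3, 5
c = rng.normal(size=(dI, dII)) + 1j * rng.normal(size=(dI, dII))
c /= np.linalg.norm(c)                 # normalized pure state of I + II

U, a, Vh = np.linalg.svd(c)            # c = U diag(a) Vh; a_n are the Schmidt coefficients

rho_I = c @ c.conj().T                 # reduced matrix of I (trace over II)
rho_II = c.T @ c.conj()                # reduced matrix of II (trace over I)

print(np.round(np.sort(np.linalg.eigvalsh(rho_I))[::-1], 4))
print(np.round(np.sort(np.linalg.eigvalsh(rho_II))[::-1][:dI], 4))
print(np.round(a**2, 4))               # the common spectrum {|a_n|^2}, Eq. (57)

p = a**2
S_ent = -np.sum(p * np.log(p))         # entanglement entropy, Eq. (63)
print(round(S_ent, 4), round(np.log(dI), 4))   # bounded by ln(min dimension), cf. Problem 14
```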

Problem 14. Argue that the minimum possible value of S_entang equals zero. What case does it correspond to? Show that the maximal possible value of S_entang equals ln N, where N is the smaller of the two subsystems' Hilbert-space dimensions.

For a pure state of a macroscopic system and a macroscopically large interface between the two macroscopic subsystems, the entanglement entropy between the two subsystems is proportional to the size of the interface: the length of the interface in 2D and the surface area of the interface in 3D. (The so-called area law.) One way to see that is to split the system into three macroscopic subsystems, with the third subsystem localized (within the appropriate correlation length) around the interface and containing the entangled degrees of freedom, with total Hilbert-space dimension N ∝ e^{# · Area} and maximum entanglement entropy S_entang ≈ ln N ∝ Area.

Permanent observation

For a small enough time interval ∆t, we have (ℏ = 1)

ρ̂(t + ∆t) = ρ̂(t) − i∆t [H, ρ̂(t)] + O((∆t)²) .        (64)

Now suppose that before the evolution (64) has started, we measured some quantity Λ and observed λ. After the evolution (64) we measure Λ again. What will be the probability pλ(∆t) of observing the same eigenvalue λ? From (64) and the measuring axiom, one readily concludes that

p_λ(∆t) = 1 − O((∆t)²) .        (65)

Problem 15. Derive Eq. (65). Hint: an observation of λ at time t implies ρ̂(t) = P̂_λ ρ̂(t − 0) P̂_λ / p_λ.

Now, fix some finite time interval τ and perform N ≫ 1 measurements of the same quantity Λ, with the same time interval ∆τ = τ/N between successive measurements. From (65) we then see (observe the close similarity with Problem 9 in terms of taking the limit N → ∞) that in the limit N → ∞, each successive measurement will deterministically yield one and the same eigenvalue λ found in the very first measurement. This fact is known as the quantum Zeno effect, by analogy with the arrow paradox by Zeno of Elea (ca. 490–430 BC). The N → ∞ limit corresponds to permanent observation. In the light of the Zeno effect, a question arises of whether constant observation implies complete suppression of the system's evolution. If the eigenvalue λ (found in the first measurement) is nondegenerate, then the answer is "YES". The system simply stays in the corresponding pure state |ψ_λ⟩⟨ψ_λ|. Note, however, that the phase of the state is fundamentally indefinite. If the eigenvalue λ is degenerate (degeneracy is unavoidable for composite systems when the observable Λ deals exclusively with the variables of a certain subsystem), the system keeps evolving. The law of this permanent-measurement evolution is very intuitive (we restore ℏ):

iℏ (∂/∂t) ρ̂ = [H^(λ), ρ̂] ,        (66)

where

H^(λ) = P̂_λ H P̂_λ        (67)

is the projected—onto the degenerate λ-subspace—Hamiltonian. Equations (66)–(67) apply to the case of nondegenerate λ as well, with H^(λ) being just a number. In this case [H^(λ), ρ̂] ≡ 0 and the evolution is suppressed. To derive (67), simply project (64).
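A minimal numerical sketch of the quantum Zeno effect in Python/NumPy (the two-level Hamiltonian and the time interval are arbitrary assumptions): the system starts in |0⟩, the projector P_0 = |0⟩⟨0| is measured N times during a fixed interval τ, and the probability that every measurement yields |0⟩ approaches 1 as N grows.

```python
import numpy as np

H = np.array([[0.0, 1.0],
              [1.0, 0.0]])             # drives |0> <-> |1> oscillations (hbar = 1)
P0 = np.array([[1.0, 0.0],
               [0.0, 0.0]])
tau = 2.0

def U(dt):
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * dt)) @ V.conj().T

for N in (1, 10, 100, 1000):
    rho = P0.copy().astype(complex)    # rho(0) = |0><0|
    survival = 1.0
    Udt = U(tau / N)
    for _ in range(N):
        rho = Udt @ rho @ Udt.conj().T           # free evolution for tau/N
        p = np.trace(P0 @ rho).real              # probability to observe |0> again
        survival *= p
        rho = P0 @ rho @ P0 / p                  # collapse, Eq. (17)
    print(N, round(survival, 4))                 # -> 1 for frequent measurements
```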

No matter how trivial this derivation is, it paves the way to engineering a novel Hamiltonian by permanent observation of a composite system. Let the observable Λ deal only with the variables of subsystem II, and let λ be a non-degenerate eigenvalue within the Hilbert space of subsystem II. Let the interaction Hamiltonian between subsystems I and II be of the form

H_int = Σ_j H_j^(I) H_j^(II) ,        (68)

where H_j^(I) and H_j^(II) are some Hermitian operators acting in the Hilbert spaces I and II, respectively. Permanent observation of the above-mentioned eigenvalue λ results in the following projected interaction Hamiltonian

H_int^(λ) = Σ_j h_j^(λ) H_j^(I) ,   h_j^(λ) = ⟨ψ_λ^(II)| H_j^(II) |ψ_λ^(II)⟩ .        (69)

The physical meaning of the interaction Hamiltonian (69) is as follows. Subsystem II gets "frozen" in the state |ψ_λ^(II)⟩. The evolution of system I is driven by an effective Hamiltonian equal to the sum of the system's own Hamiltonian and the coupling to the generalized "external fields" h_j^(λ) (resulting in H_int^(λ)). In the spirit of Problem 9, the fields h_j^(λ) in the effective interaction Hamiltonian (69) can be rendered time-dependent by appropriately changing the observable Λ in time.
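A minimal numerical sketch of Eq. (69) in Python/NumPy for a fixed measurement axis (the time-dependent case is the subject of Problem 16 below; the axis angle here is an arbitrary assumption): two spins coupled by σ⃗_1·σ⃗_2, with the projection of spin 2 onto the axis n̂ permanently observed in its "up" eigenstate, so that the projected interaction reduces to an effective field n̂ acting on spin 1.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = [sx, sy, sz]

n = np.array([np.sin(0.7), 0.0, np.cos(0.7)])        # example measurement axis
sig_n = sum(ni * si for ni, si in zip(n, sigma))
vals, vecs = np.linalg.eigh(sig_n)
psi_up = vecs[:, np.argmax(vals)]                    # nondegenerate "up" state of spin 2

# h_j = <psi_up| sigma_j |psi_up>, Eq. (69); for this state h coincides with n.
h = np.array([np.vdot(psi_up, s @ psi_up).real for s in sigma])
print(np.round(h, 6), np.round(n, 6))

# Effective Hamiltonian acting on spin 1: H_eff = sum_j h_j sigma_j = sigma . n.
H_eff = sum(hj * sj for hj, sj in zip(h, sigma))
print(np.allclose(H_eff, sig_n))
```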

Problem 16. For two spins-1/2 interacting via the Hamiltonian σ⃗_1 · σ⃗_2, show that permanently measuring the projection of the second spin onto the time-dependent axis n̂(t) results in an effective time-dependent magnetic field acting on the first spin, provided the dependence of the unit vector n̂ on time is smooth.
