
Appendix A: Basic Probability Theory

In order to follow many of the arguments in these notes, especially when talking about entropies, it is necessary to have some basic knowledge of probability theory. Therefore, we review here the most important tools of probability theory that are used.

One of the basic notions of probability theory that also frequently appears throughout these notes is that of a discrete random variable. A random variable $X$ can take one of several values, the so-called realizations $x$, given by the alphabet $\mathcal{X}$. The probability that a certain realization $x \in \mathcal{X}$ occurs is given by the probability distribution $p_X(x)$. We usually use upper case letters to denote the random variable, lower case letters to denote realizations thereof, and calligraphic letters to denote the alphabet.

Suppose we have two random variables $X$ and $Y$, which may depend on each other. We can then define the joint probability distribution $p_{X,Y}(x, y)$ of $X$ and $Y$ that tells you the probability that $Y = y$ and $X = x$. This notion (and the following definitions) can be expanded to $n$ random variables, but we restrict ourselves to the case of pairs $X, Y$ here to keep the notation simple. Given the joint probability distribution of the pair $X, Y$, we can derive the marginal distribution $p_X(x)$ by

$$p_X(x) = \sum_{y \in \mathcal{Y}} p_{X,Y}(x, y) \qquad \forall\, x \in \mathcal{X} \tag{A.1}$$

and analogously for $p_Y(y)$. The two random variables $X$ and $Y$ are said to be independent if

$$p_{X,Y}(x, y) = p_X(x)\, p_Y(y). \tag{A.2}$$
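The marginalization rule (A.1) and the independence condition (A.2) can be checked numerically on a small example. The joint distribution below is hypothetical, chosen only for illustration; it is built as a product, so $X$ and $Y$ are independent by construction.

```python
from itertools import product

# Hypothetical marginals over X = {0, 1} and Y = {0, 1}.
p_X = {0: 0.3, 1: 0.7}
p_Y = {0: 0.6, 1: 0.4}
# Joint distribution built as a product, so X and Y are independent.
p_XY = {(x, y): p_X[x] * p_Y[y] for x, y in product(p_X, p_Y)}

# Marginalization (A.1): p_X(x) = sum over y of p_XY(x, y).
marg_X = {x: sum(p_XY[(x, y)] for y in p_Y) for x in p_X}
assert all(abs(marg_X[x] - p_X[x]) < 1e-12 for x in p_X)

# Independence (A.2): p_XY(x, y) = p_X(x) * p_Y(y) for all pairs.
independent = all(abs(p_XY[(x, y)] - p_X[x] * p_Y[y]) < 1e-12
                  for x, y in p_XY)
print(independent)  # True
```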

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. R. Wolf, Quantum Key Distribution, Lecture Notes in Physics 988, https://doi.org/10.1007/978-3-030-73991-1

Furthermore, we can define the conditional probability that $Y$ takes the value $y \in \mathcal{Y}$, given that $X$ takes the value $x \in \mathcal{X}$:

$$p_{Y|X}(y|x) = \frac{p_{X,Y}(x, y)}{p_X(x)}. \tag{A.3}$$

To avoid complications, we use the convention that $p_{Y|X}(y|x) = 0$ if $p_X(x) = 0$. If $X$ and $Y$ are independent, $p_{Y|X}(y|x) = p_Y(y)$ for all $y \in \mathcal{Y}$. Using the definition of the conditional probability, (A.1) can be rewritten as

$$p_X(x) = \sum_{y \in \mathcal{Y}} p_{X|Y}(x|y)\, p_Y(y) \qquad \forall\, x \in \mathcal{X}. \tag{A.4}$$

In this form it is also called the law of total probability. Another important rule that relates different conditional probabilities is Bayes’ rule:

$$p_{X|Y}(x|y) = p_{Y|X}(y|x)\, \frac{p_X(x)}{p_Y(y)}. \tag{A.5}$$

This rule can be proved as follows: Note that (A.3) can be rewritten as

$$p_{X,Y}(x, y) = p_{Y|X}(y|x)\, p_X(x). \tag{A.6}$$

It follows that

$$p_{X|Y}(x|y) = \frac{p_{X,Y}(x, y)}{p_Y(y)} = p_{Y|X}(y|x)\, \frac{p_X(x)}{p_Y(y)}. \tag{A.7}$$

Appendix B: Calderbank–Shor–Steane Codes

Calderbank–Shor–Steane (CSS) codes are a large class of quantum error correction codes that exploit ideas from classical linear error correction codes. In entanglement-based QKD protocols, they can be used to correct errors that occur during the distribution of entangled states.

B.1 Classical Linear Codes

Before we can understand CSS codes, we need to make a short detour into the theory of classical linear codes. A linear code $C$ that encodes $k$ bits into an $n$ bit code space (with $n > k$) is a set of $2^k$ codewords, where each codeword is a binary vector of length $n$. We call such a code an $[n, k]$ code. It is specified by an $n \times k$ generator matrix $G$ with elements in $\{0, 1\}$. $G$ maps messages to their equivalent in the code space: a $k$ bit message $x$ (represented by a column vector) is encoded as $y = Gx$. Note that all arithmetic operations (especially multiplications and additions) are done modulo 2.

As a simple example, consider the $[3, 1]$ repetition code that encodes 1 bit messages into three copies of them: 0 is mapped to $(0, 0, 0)^T$ and 1 is mapped to $(1, 1, 1)^T$. Hence, the generator matrix $G$ is

$$G = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \tag{B.1}$$
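The encoding step $y = Gx$ can be sketched in a few lines; this uses the $[3, 1]$ repetition code from (B.1), with numpy assumed and all arithmetic reduced modulo 2 as the text requires.

```python
import numpy as np

# Generator matrix of the [3, 1] repetition code, as in (B.1).
G = np.array([[1], [1], [1]])

def encode(x, G):
    """Encode a k-bit message (column vector) as y = Gx modulo 2."""
    return (G @ x) % 2

print(encode(np.array([0]), G))  # [0 0 0]
print(encode(np.array([1]), G))  # [1 1 1]
```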

To connect this definition of classical codes to error correction, we have to introduce a different formulation of linear codes, the parity check matrices. In this formulation, an [n, k] code is defined as all vectors x of length n with entries from {0, 1} such that

$$Hx = 0, \tag{B.2}$$

where $H$ is an $(n - k) \times n$ matrix with entries in $\{0, 1\}$ called the parity check matrix. To construct the parity check matrix $H$ from a generator matrix $G$, one has to pick out $n - k$ linearly independent vectors orthogonal to the columns of $G$. The corresponding parity check matrix for the $[3, 1]$ repetition code with $G$ given in (B.1) is then

$$H = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}. \tag{B.3}$$

In the language of parity check matrices, it is quite easy to see how error detection and correction work. Suppose we have a message $x$ that we encode as $y = Gx$. If an error $e$ occurs, the codeword $y$ is transformed into the corrupted codeword $y' = y + e$. Because $Hy = 0$ for all codewords $y$, it follows that $Hy' = Hy + He = He$. The quantity $He$ is called the error syndrome. If the syndrome is 0, we know that no error has occurred. Otherwise, it contains information about the error because of the way the parity check matrix $H$ was constructed.

In the example of the $[3, 1]$ repetition code, every codeword has a length of 3 bits. Therefore, errors can occur at three different positions. Denote by $e_i$ an error in the $i$th bit, i.e., a vector with a 1 at position $i$. Then for all codewords $y$, we have that $Hy' = He_i$; hence, the three different syndromes are

$$He_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad He_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad He_3 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{B.4}$$
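Syndrome decoding for the repetition code can be sketched as follows; the parity check matrix is the one from (B.3), and the syndrome table (B.4) is used to locate and flip the corrupted bit.

```python
import numpy as np

H = np.array([[1, 1, 0],
              [0, 1, 1]])  # parity check matrix (B.3)

def correct(y_corrupted, H):
    """Correct at most one bit flip using the syndrome s = H y' (mod 2)."""
    s = tuple((H @ y_corrupted) % 2)
    # Syndromes of the single-bit errors e_1, e_2, e_3, cf. (B.4),
    # mapped to the (0-based) index of the flipped bit.
    syndromes = {(1, 0): 0, (1, 1): 1, (0, 1): 2}
    y = y_corrupted.copy()
    if s in syndromes:            # nonzero syndrome: flip the indicated bit
        y[syndromes[s]] ^= 1
    return y

y_err = np.array([1, 0, 1])       # codeword (1,1,1) with a flip on bit 2
print(correct(y_err, H))          # [1 1 1]
```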

This makes it possible to read off the position of the error from the syndrome. Note that this procedure is only successful if we know that an error has occurred for at most one bit. Hence, the $[3, 1]$ repetition code can correct one error.

More general linear error correction codes can be obtained using the concept of the Hamming distance. The Hamming distance $d(x, y)$ between two binary vectors $x$ and $y$ is defined as the number of positions in which the two bit strings differ. For example, $d((1, 1, 0, 0)^T, (1, 0, 0, 1)^T) = 2$, because the vectors differ in the 2nd and 4th positions. Error correction now works as follows: suppose we have a codeword $y = Gx$ that is corrupted such that the resulting vector is $y' = y + e$. If the probability that an error occurs is less than $\frac{1}{2}$, the most likely codeword to have been encoded is the one that minimizes the Hamming distance to $y'$, i.e., $d(y, y')$, since this is the one with the least amount of bit flips. How many errors can such a code correct? This can also be analysed in terms of the Hamming distance: We define the distance of a code $C$ to be the minimum Hamming distance between any two of its codewords:

$$d(C) = \min_{x, y \in C,\ x \neq y} d(x, y). \tag{B.5}$$

We use the notation $d = d(C)$ and call $C$ an $[n, k, d]$ code. With a little bit of thinking one can see that a code with distance $2t + 1$ for some integer $t$ can be used to correct up to $t$ errors, simply by decoding the corrupted message $y'$ as the unique codeword $y$ that satisfies $d(y, y') \leq t$. If more than $t$ errors occur, this codeword is no longer unique and therefore, errors cannot be reliably detected and corrected.

The last concept we need from classical linear codes is duality. Suppose we have a linear $[n, k]$ code $C$ with generator matrix $G$ and parity check matrix $H$. We can then construct another code, the dual code $C^\perp$ of $C$, which consists of all codewords that are orthogonal to each codeword in $C$. Hence, the generator matrix of the dual code is $H^T$ and its parity check matrix is $G^T$.
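The distance definition (B.5) and the correctable-error count can be checked by brute force for the repetition code; this sketch computes $d(C)$ and $t = (d - 1)/2$.

```python
from itertools import combinations

def hamming(x, y):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(a != b for a, b in zip(x, y))

C = [(0, 0, 0), (1, 1, 1)]  # codewords of the [3, 1] repetition code

# Distance of the code, cf. (B.5): minimum over all pairs of distinct codewords.
d = min(hamming(x, y) for x, y in combinations(C, 2))
t = (d - 1) // 2            # a distance-(2t+1) code corrects up to t errors

print(d, t)  # 3 1
```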

B.2 CSS Codes

In the quantum case, the situation is a bit more complicated. Where in the classical case only one type of error is possible (namely the bit flip error), a qubit can undergo three different types of errors: a bit flip, which changes $|0\rangle$ to $|1\rangle$ and $|1\rangle$ to $|0\rangle$; a phase error, which maps $|1\rangle$ to $-|1\rangle$ but leaves $|0\rangle$ unchanged; and a combination of the two, which maps $|0\rangle \to -|1\rangle$ and $|1\rangle \to |0\rangle$.

The Calderbank–Shor–Steane (CSS) code is now defined as follows: Suppose we have two classical linear error correction codes, an $[n, k_1]$ code $C_1$ and an $[n, k_2]$ code $C_2$, such that $C_2 \subset C_1$ and both $C_1$ and $C_2^\perp$ correct up to $t$ errors. Using these two classical codes we can define a quantum error correction code, the CSS code of $C_1$ over $C_2$, denoted $\mathrm{CSS}(C_1, C_2)$. It is an $[n, k_1 - k_2]$ quantum code that is capable of correcting errors on up to $t$ qubits. The construction works as follows: for any codeword $x \in C_1$, we define the quantum state

$$|x + C_2\rangle = \frac{1}{\sqrt{|C_2|}} \sum_{y \in C_2} |x + y\rangle, \tag{B.6}$$

where $+$ is the bitwise addition modulo 2 and $|C_2|$ denotes the cardinality of $C_2$ (which is $2^{k_2}$, since this is the number of codewords of $C_2$). We have used coset notation here for a reason. If you are not familiar with the concept of a coset, we briefly recap some facts here: for a group $G$ and a subgroup $H \subset G$, and for any $g \in G$, the left coset of $H$ in $G$ determined by $g$ is defined as

$$g + H = \{g + h : h \in H\}. \tag{B.7}$$
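A small illustration of (B.7): the cosets of the subgroup $H = \{000, 111\}$ (the $[3, 1]$ repetition code) inside $G = \mathbb{F}_2^3$ under bitwise addition modulo 2. The choice of group and subgroup here is ours, for illustration only.

```python
from itertools import product

# G = F_2^3 under bitwise addition mod 2; H = the [3, 1] repetition code.
G = [tuple(v) for v in product((0, 1), repeat=3)]
H = [(0, 0, 0), (1, 1, 1)]

def coset(g, H):
    """The coset g + H = {g + h : h in H}, addition bitwise modulo 2."""
    return frozenset(tuple((gi + hi) % 2 for gi, hi in zip(g, h)) for h in H)

cosets = {coset(g, H) for g in G}
print(len(cosets))  # 4, since |G| / |H| = 8 / 2
```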

We denote by $G/H$ the set of all cosets of $H$ in $G$. Cosets have some convenient properties. Coming back to the notation for CSS codes, suppose that $x'$ is an element of $C_1$ such that $x - x' \in C_2$. Then it follows that $|x + C_2\rangle = |x' + C_2\rangle$, which implies that the state $|x + C_2\rangle$ only depends on the coset of $C_1/C_2$ in which $x$ is contained. In this sense, (B.6) is an equally weighted superposition of all the words in the coset represented by $x$. Another consequence of the coset formalism is that if $x$ and $x'$ belong to different cosets of $C_2$ in $C_1$, then there are no codewords $y, y' \in C_2$ such that $x + y = x' + y'$;

hence, $|x + C_2\rangle$ and $|x' + C_2\rangle$ are orthonormal states. The quantum code $\mathrm{CSS}(C_1, C_2)$ is defined to be the vector space spanned by $\{|x + C_2\rangle\}_{x \in C_1}$. Since the number of cosets of $C_2$ in $C_1$ is $|C_1|/|C_2|$, the dimension of this vector space is $|C_1|/|C_2| = 2^{k_1 - k_2}$, and therefore $\mathrm{CSS}(C_1, C_2)$ is an $[n, k_1 - k_2]$ quantum code.

It is now possible to exploit the classical error-correcting properties of the codes $C_1$ and $C_2^\perp$ to detect and correct quantum errors. The crucial point here is that bit flip errors and phase flip errors are corrected independently of each other. Bit flip errors are described by a vector $e_{\mathrm{bit}}$ of length $n$ that has 1s at those positions where a bit flip has occurred, and 0s otherwise. If the original state is denoted $|x\rangle$, bit flip errors transform this state to

$$|x\rangle \to |x + e_{\mathrm{bit}}\rangle. \tag{B.8}$$

Phase errors are described by a second vector $e_{\mathrm{phase}}$ of length $n$ with 1s at those positions where a phase error has occurred. In this case, the phase errors transform the state $|x\rangle$ as

$$|x\rangle \to (-1)^{x \cdot e_{\mathrm{phase}}} |x\rangle. \tag{B.9}$$

A crucial observation here is that when we apply the Hadamard transformation (see (2.28)), the phase error takes the same form as the bit flip error,¹ i.e., a state $|\tilde{x}\rangle$ in the Hadamard basis is transformed by phase errors as

$$|\tilde{x}\rangle \to |\tilde{x} + e_{\mathrm{phase}}\rangle. \tag{B.10}$$
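The footnote's claim can be verified numerically for a single qubit: since the Hadamard transformation is its own inverse, conjugating a phase flip by it gives a bit flip, $HZH = X$. numpy is assumed.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard transformation
X = np.array([[0, 1], [1, 0]])                # bit flip
Z = np.array([[1, 0], [0, -1]])               # phase flip

# In the Hadamard basis a phase flip acts as a bit flip: H Z H = X.
print(np.allclose(H @ Z @ H, X))  # True
```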

In summary, if $|x + C_2\rangle$ as defined in (B.6) is the original state, then the corrupted state is described as

$$\frac{1}{\sqrt{2^{k_2}}} \sum_{y \in C_2} (-1)^{(x+y) \cdot e_{\mathrm{phase}}} |x + y + e_{\mathrm{bit}}\rangle. \tag{B.11}$$

To detect bit flip errors, we need to compute the error syndrome for the code $C_1$. For this purpose it is convenient to introduce an ancilla state that consists of a sufficient number of qubits to store the syndrome and that is initially in the all zero state $|0\rangle$. To compute the syndrome, we apply the parity check matrix $H_1$ of the code $C_1$ and store the result in the ancilla state:

$$|x + y + e_{\mathrm{bit}}\rangle|0\rangle \to |x + y + e_{\mathrm{bit}}\rangle|H_1(x + y + e_{\mathrm{bit}})\rangle = |x + y + e_{\mathrm{bit}}\rangle|H_1 e_{\mathrm{bit}}\rangle. \tag{B.12}$$

Hence, to detect the error one simply measures the ancilla state, discards it, and applies NOT gates (i.e., gates that take $|0\rangle \to |1\rangle$ and $|1\rangle \to |0\rangle$) to those qubits

where a bit flip has occurred. This removes all the bit flip errors and the resulting state is

$$\frac{1}{\sqrt{2^{k_2}}} \sum_{y \in C_2} (-1)^{(x+y) \cdot e_{\mathrm{phase}}} |x + y\rangle. \tag{B.13}$$

¹ One can easily verify this statement by carrying out this computation for the two basis states of the Hadamard basis, namely $|+\rangle$ and $|-\rangle$ defined in (2.24).

The remaining part is to detect and correct phase errors. We can do this by applying a Hadamard transformation to each qubit, which transforms the state to

$$\frac{1}{\sqrt{2^{n+k_2}}} \sum_{z} \sum_{y \in C_2} (-1)^{(x+y) \cdot (e_{\mathrm{phase}} + z)} |z\rangle, \tag{B.14}$$

where the sum is over all possible $n$ bit values for $z$. We can rewrite this state by setting $z' = z + e_{\mathrm{phase}}$, which yields

$$\frac{1}{\sqrt{2^{n+k_2}}} \sum_{z'} \sum_{y \in C_2} (-1)^{(x+y) \cdot z'} |z' + e_{\mathrm{phase}}\rangle. \tag{B.15}$$

One can show that if $z' \in C_2^\perp$, then $\sum_{y \in C_2} (-1)^{y \cdot z'} = |C_2|$, while if $z' \notin C_2^\perp$, then $\sum_{y \in C_2} (-1)^{y \cdot z'} = 0$, which allows us to further rewrite the state:

$$\frac{1}{\sqrt{2^{n-k_2}}} \sum_{z' \in C_2^\perp} (-1)^{x \cdot z'} |z' + e_{\mathrm{phase}}\rangle, \tag{B.16}$$

which has exactly the form of a bit flip error described by the vector $e_{\mathrm{phase}}$. We can therefore simply repeat the procedure we did before, but now with the parity check matrix of the code $C_2^\perp$. Here, it becomes clear why we need $C_2^\perp$ to be able to correct $t$ errors and not $C_2$ itself. This allows us to correct all the errors and we receive the state

$$\frac{1}{\sqrt{2^{n-k_2}}} \sum_{z' \in C_2^\perp} (-1)^{x \cdot z'} |z'\rangle. \tag{B.17}$$

The last step is to apply the Hadamard transformation again to each qubit since it is its own inverse. The resulting state is

$$\frac{1}{\sqrt{2^{k_2}}} \sum_{y \in C_2} |x + y\rangle, \tag{B.18}$$

which is exactly the originally encoded state.
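The character-sum identity used to pass from (B.15) to (B.16), namely that $\sum_{y \in C_2} (-1)^{y \cdot z'}$ equals $|C_2|$ when $z' \in C_2^\perp$ and 0 otherwise, can be checked by brute force; here $C_2$ is taken to be the $[3, 1]$ repetition code, chosen only for illustration.

```python
from itertools import product

C2 = [(0, 0, 0), (1, 1, 1)]           # [3, 1] repetition code as C_2
# Dual code: all z with y . z = 0 (mod 2) for every codeword y in C2.
C2_perp = [z for z in product((0, 1), repeat=3)
           if all(sum(yi * zi for yi, zi in zip(y, z)) % 2 == 0 for y in C2)]

# Check: sum over y in C2 of (-1)^(y . z) is |C2| iff z is in the dual, else 0.
for z in product((0, 1), repeat=3):
    s = sum((-1) ** (sum(yi * zi for yi, zi in zip(y, z)) % 2) for y in C2)
    expected = len(C2) if z in C2_perp else 0
    assert s == expected
print("identity verified")
```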
