New Developments in Quantum Tomography

Yuanlong Wang

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy


School of Engineering and Information Technology, University of New South Wales in Canberra

June 2019


ORIGINALITY STATEMENT

'I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.'

COPYRIGHT STATEMENT

'I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'


AUTHENTICITY STATEMENT

'I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.'

INCLUSION OF PUBLICATIONS STATEMENT

UNSW is supportive of candidates publishing their research results during their candidature as detailed in the UNSW Thesis Examination Procedure.

Publications can be used in their thesis in lieu of a Chapter if:
• The student contributed greater than 50% of the content in the publication and is the "primary author", i.e. the student was responsible primarily for the planning, execution and preparation of the work for publication
• The student has approval to include the publication in their thesis in lieu of a Chapter from their supervisor and Postgraduate Coordinator.
• The publication is not subject to any obligations or contractual agreements with a third party that would constrain its inclusion in the thesis

Please indicate whether this thesis contains published material or not.
• This thesis contains no publications, either published or submitted for publication
• Some of the work described in this thesis has been published and it has been documented in the relevant Chapters with acknowledgement
• This thesis has publications (either published or submitted for publication) incorporated into it in lieu of a chapter and the details are presented below

CANDIDATE'S DECLARATION
I declare that:
• I have complied with the Thesis Examination Procedure
• where I have used a publication in lieu of a Chapter, the listed publication(s) below meet(s) the requirements to be included in the thesis.
Name Signature Date (dd/mm/yy)

Postgraduate Coordinator's Declaration
I declare that:
• the information below is accurate
• where listed publication(s) have been used in lieu of Chapter(s), their use complies with the Thesis Examination Procedure
• the minimum requirements for the format of the thesis have been met.
PGC's Name PGC's Signature Date (dd/mm/yy)

Abstract

This thesis investigates several topics in quantum tomography: quantum state tomography (QST), quantum Hamiltonian identifiability, quantum Hamiltonian/gate identification (QHI) and quantum detector tomography (QDT).

For QST, we propose a novel recursively adaptive quantum state tomography (RAQST) protocol, which can outperform static tomography protocols using mutually unbiased bases and a two-stage mutually unbiased bases adaptive strategy, even with the simplest product measurements. When nonlocal measurements are available, RAQST can beat the Gill-Massar bound for a wide range of quantum states with a modest number of copies.

For quantum Hamiltonian identifiability, we extend the similarity transformation approach (STA) in classical system identification theory to the quantum domain to prove for the first time the identifiability conclusions for arbitrary dimensional spin-1/2 chain systems assisted by single qubit probes. We further develop the traditional STA method by proposing a Structure Preserving Transformation (SPT) method for non-minimal systems. We use the SPT method to introduce an indicator for the existence of economic quantum Hamiltonian identification algorithms, and give two algorithm examples.

Within the framework of quantum process tomography, we propose a general two-step optimization (TSO) QHI algorithm. We then improve the TSO method to a more efficient pure-state-based gate identification (PGI) algorithm. By employing a series of predetermined pure probe states and developing a fast QST protocol specialized for pure states, we reduce the computational complexity from $O(d^6)$ with dimension $d$ in TSO to $O(d^3)$ in PGI. We provide theoretical error upper bounds for TSO and PGI methods.

Finally, we propose a novel QDT method. Using constrained linear regression estimation, a stage-1 estimate of the detector is obtained. Next a positive semidefinite requirement is added to guarantee a physical stage-2 estimate. We analyze the computational complexity and establish an error upper bound for this Two-Stage Estimation (TSE) method. Such a theoretical analysis is uncommon in other QDT methods. We also investigate optimization over the coherent probe states.

For RAQST, PGI and QDT, our collaborators have performed quantum optical experiments to validate the effectiveness of the proposed algorithms.

Acknowledgement

I would like to thank my primary supervisor, A/Prof. Daoyi Dong, for all the guidance and help he has given me, both academically and in life. He taught me everything from doing research and writing papers to making career plans. He encouraged me to attend academic conferences and introduced excellent collaborators for our projects. His patience, in mentoring students and in life, is the greatest I have ever encountered, and I doubt whether I can match it in the future. He is the necessary (and largely sufficient) condition for my satisfactory and memorable PhD research period.

I would also like to thank my joint supervisor Prof. Ian R. Petersen and co-supervisors Dr. Hidehiro Yonezawa and Prof. Elanor Huntington. I learned a lot from their rich academic experience. They taught me valuable research techniques for finding topics, overcoming research problems, revising papers, etc. Their varied backgrounds and specialties inspired me to view problems from different angles, which is especially important for understanding interdisciplinary research.

I am also sincerely grateful to my close collaborators: Dr. Bo Qi at CAS, Dr. Zhibo Hou, Dr. Qi Yin and Prof. Guo-Yong Xiang at USTC, Dr. Jun Zhang at SJTU, Dr. Akira Sone and Prof. Paola Cappellaro at MIT, and Dr. Shota Yokoyama at UNSW. Collaborating with them has been a truly fruitful and memorable journey, from which I have learned and improved a great deal.

Special thanks to my group members Wei Zhang, Qi Yu and Yanan Liu. Our academic discussions were rich and instrumental, and our support for each other in life was warm and precious. Special thanks also to my friends Ruxiu Liu, Wei Zhang and Di Liu, whose emotional support is vital to my life. Many thanks to my family, who have always supported my career decisions.

Finally, thank life, for everything.

Certificate of Originality

I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person, nor material which to a substantial extent has been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by colleagues, with whom I have worked at UNSW or elsewhere, during my candidature, is fully acknowledged.

I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.

YUANLONG WANG

List of publications

[Journal articles]

1. Y. Wang, S. Yokoyama, D. Dong, I. R. Petersen, E. H. Huntington, and H. Yonezawa, Two-stage estimation for quantum detector tomography: Error analysis, numerical and experimental results, in preparation.

2. Y. Wang, D. Dong, A. Sone, I. R. Petersen, H. Yonezawa, and P. Cappellaro, Quantum Hamiltonian identifiability via a similarity transformation approach and beyond, submitted to IEEE Transactions on Automatic Control, 2018.

3. Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, Quantum gate identification: Error analysis, numerical results and optical experiment, Automatica, vol. 101, pp. 269-279, 2019.

4. Y. Wang, D. Dong, B. Qi, J. Zhang, I. R. Petersen, and H. Yonezawa, A quantum Hamiltonian identification algorithm: Computational complexity and error analysis, IEEE Transactions on Automatic Control, vol. 63, no. 5, pp. 1388-1403, 2018.

5. B. Qi, Z. Hou, Y. Wang, D. Dong, H.-S. Zhong, L. Li, G.-Y. Xiang, H. M. Wiseman, C.-F. Li, and G.-C. Guo, Adaptive quantum state tomography via linear regression estimation: Theory and two-qubit experiment, npj Quantum Information, vol. 3, no. 1, p. 19, 2017.

6. D. Dong, I. R. Petersen, Y. Wang, X. Yi, and H. Rabitz, Sampled-data design for robust control of open two-level quantum systems with operator errors, IET Control Theory & Applications, vol. 10, no. 18, pp. 2415-2421, 2016.

7. Z. Hou, H.-S. Zhong, Y. Tian, D. Dong, B. Qi, L. Li, Y. Wang, F. Nori, G.-Y. Xiang, C.-F. Li and G.-C. Guo, Full reconstruction of a 14-qubit state within four hours, New Journal of Physics, vol. 18, no. 8, p. 083036, 2016.

[Conference papers]

1. Y. Wang, D. Dong, and I. R. Petersen, An approximate quantum Hamiltonian identification algorithm using a Taylor expansion of the matrix exponential function, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 5523-5528, Melbourne, Australia, December 2017.

2. Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, Efficient identification of unitary quantum processes, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 196-201, Gold Coast, Australia, December 2017.

3. D. Dong, and Y. Wang, Several recent developments in estimation and robust control of quantum systems, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 190-195, Gold Coast, Australia, December 2017.

4. Y. Wang, D. Dong, I. R. Petersen, and J. Zhang, An approximate algorithm for quantum Hamiltonian identification with complexity analysis, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 11744-11748, Toulouse, France, July 2017.

5. D. Dong, Y. Wang, Z. Hou, B. Qi, Y. Pan, and G.-Y. Xiang, State tomography of qubit systems using linear regression estimation and adaptive measurements, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 13014-13019, Toulouse, France, July 2017.

6. Y. Wang, B. Qi, D. Dong, and I. R. Petersen, An iterative algorithm for Hamiltonian identification of quantum systems, in 2016 IEEE 55th Annual Conference on Decision and Control (CDC), pp. 2523-2528, Las Vegas, USA, December 2016.

Contents

Abstract
Acknowledgements
Declaration
List of Publications
Table of Contents
List of Figures
List of Tables
List of Symbols
List of Common Acronyms

1 Introduction

2 Quantum mechanics and standard tomography methods
2.1 Quantum mechanics foundations
2.2 Standard maximum likelihood estimation
2.2.1 Quantum state tomography via maximum likelihood estimation
2.2.2 Quantum process tomography via maximum likelihood estimation
2.2.3 Quantum detector tomography via maximum likelihood estimation

3 Recursively adaptive multi-qubit state tomography
3.1 Introduction
3.2 RAQST protocol
3.2.1 Establishment of linear regression model
3.2.2 Recursive LRE updating rule and physical projection
3.2.3 Optimization criterion
3.2.4 Two versions of RAQST
3.3 Numerical results
3.4 Experimental results
3.5 Summary and open problems

4 Quantum Hamiltonian identifiability via a similarity transformation approach and beyond
4.1 Introduction
4.2 Model establishment
4.2.1 Problem formulation of Hamiltonian identifiability and identification
4.2.2 Laplace transform approach and atypical cases
4.3 Similarity Transformation Approach
4.3.1 General procedures for minimal systems
4.3.2 General procedures for non-minimal systems
4.3.3 Structure Preserving Transformation method for non-minimal systems
4.4 Quantum Hamiltonian identifiability via STA
4.4.1 General framework
4.4.2 Exchange model without transverse field
4.4.3 Exchange model with transverse field
4.5 From identifiability to economic identification algorithms
4.5.1 An indicator for the existence of economic identification algorithms
4.5.2 Two economic Hamiltonian identification algorithms
4.5.3 Error analysis
4.5.4 Simulation performance
4.6 Conclusion and open problems

5 Quantum Hamiltonian/gate identification via TSO and PGI
5.1 Introduction
5.2 TSO Hamiltonian identification algorithm
5.2.1 Quantum process tomography
5.2.2 Problem formulation of Hamiltonian identification
5.2.3 Two-step Optimization algorithm
5.2.4 Error analysis
5.2.5 Numerical results of TSO
5.3 Pure-state-based Gate Identification
5.3.1 Problem formulation of gate identification
5.3.2 Fast pure-state tomography
5.3.3 Gate reconstruction
5.3.4 General procedure and computational complexity
5.3.5 Error analysis
5.3.6 Numerical results
5.3.7 Experimental results
5.4 Summary and open problems

6 Quantum detector tomography via two-stage estimation
6.1 Introduction
6.2 Two-stage Estimation method for quantum detector tomography
6.2.1 Problem formulation
6.2.2 Estimation algorithm
6.2.3 General procedure and computational complexity
6.2.4 Error analysis
6.3 Optimization of the coherent probe states
6.3.1 On the kinds of probe states
6.3.2 Optimization of the size of sampling square for probe states
6.4 Numerical results
6.4.1 Basic performance
6.4.2 On the kinds of probe states
6.4.3 Optimization of the size of sampling square for probe states
6.4.4 Comparison with MLE
6.5 Experimental results
6.5.1 Experimental setup
6.5.2 Modified estimation protocol
6.5.3 Experimental results
6.6 Summary and open problems

7 Conclusions and outlook

A Some common formulas
B Iterative algorithm for product projector optimization
C Gill-Massar bound for infidelity in two-qubit state tomography
D Proof of Lemma 4.1
E Proof of Lemma 4.2
F Proof of Lemma 4.3
G Proof of Lemma 4.4
H Proof of Lemma 4.5
I Proof of Proposition 5.1
J Proof of Theorem 5.1
K Proof of Theorem 5.2
L A sufficient condition for Assumption 5.1
M Proof of Lemma 5.2
N A basis set example for the space $B(d, m, \{d_j\})$

List of Figures

3.1 Simulated performance of the RAQST protocol for pure states.
3.2 Simulated performance of the RAQST protocol for mixed states.
3.3 Two-qubit state tomography experimental setup, adopted from [114].
3.4 Two-qubit state tomography experimental results.
4.1 Relationships between identifiability criteria.
4.2 Performance of Algorithm 4.1 with different data lengths.
4.3 Performance of Algorithm 4.2 with different noise variances.
4.4 Performance of Algorithm 4.2 with different data lengths.
5.1 General procedure of the TSO method.
5.2 MSE of TSO versus the total resource number.
5.3 MSE of TSO versus different evolution times.
5.4 MSE of TSO versus number of qubits.
5.5 Running time versus qubit number for the ERA method in [165] and our TSO method.
5.6 MSE versus resource number for each output state.
5.7 Running time and MSE versus qubit number for MLE and PGI methods.
5.8 The schematic of experimental setup, adopted from [153].
5.9 MSE versus resource number for the experimental single-qubit gate.
6.1 The projection amplitude function $h(k, j)$.
6.2 MSE versus the total resource number.
6.3 MSE versus probe state kinds.
6.4 MSE versus the size of sampling square for probe states.
6.5 The optimal sampling square size versus dimension.
6.6 Comparison between our TSE algorithm with MLE for different qubit number.
6.7 Comparison between our algorithm with MLE for different number of POVM matrices.
6.8 Quantum optical experimental setup for QDT [161].
6.9 Experimental and simulation results for Group I.
6.10 Experimental and simulation results for Group II.

List of Tables

6.1 The coherent probe states for TSE QDT experiment.

List of Symbols

$\ast$: An indeterminate variable, vector or matrix
$\otimes$: Tensor product between two Hilbert space, or Kronecker product between two matrices
$\oplus$: Matrix direct sum
$\equiv$: Identity symbol
$\hat{a}$: Estimation of variable $a$
$a^*$: Conjugate of $a$
$\lfloor x \rfloor$: Largest integer that is not larger than $x \in \mathbb{R}$
$|\psi^\perp\rangle$: A state orthogonal to $|\psi\rangle$; i.e., $\langle\psi|\psi^\perp\rangle = 0$
$A \geq 0$: $A \in \mathbb{C}^{d\times d}$ is positive semidefinite
$\sqrt{A}$, $A^{\frac{1}{2}}$ $\triangleq U \mathrm{diag}(\sqrt{P_{11}}, \sqrt{P_{22}}, ..., \sqrt{P_{dd}}) U^\dagger$, where $A \geq 0$ with spectral decomposition $A = UPU^\dagger$
$A_{\sigma i}$ ($A_{i\sigma}$): $i$-th column (row) of matrix $A$
$A^T$: Transpose of $A$
$A^\dagger$: Transpose and conjugate of $A$
$A^{\otimes N} \triangleq \underbrace{A \otimes A \otimes \cdots \otimes A}_{N}$: $N$ times tensor product of $A$
$||A||$: Frobenius norm of $A$
$(x, y) \triangleq (x-1)K + y$ for $1 \leq x, y \leq K$, when $(x, y)$ is used as a number, especially in subscripts or superscripts
$\langle a, b\rangle$ ($\langle\phi|\psi\rangle$) $\triangleq a^\dagger b$ ($\triangleq (|\phi\rangle)^\dagger|\psi\rangle$): inner product between column vectors (pure states)
$\langle A, B\rangle \triangleq \mathrm{Tr}(A^\dagger B)$: inner product of matrices $A$ and $B$
$[A, B] \triangleq AB - BA$: commutation
$\binom{j}{k} \triangleq \frac{j!}{k!(j-k)!}$: binomial coefficient
$A_i$: Kraus operators
$B(d, m, \{d_j\}) \triangleq \{L_1 \oplus L_2 \oplus ... \oplus L_m \,|\, \forall\, 1 \leq j \leq m,\ L_j \in \mathbb{C}^{d_j\times d_j}\}$: set of all block diagonal matrices with $m$ blocks, where $j$-th block $L_j$ is $d_j \times d_j$ dimensional and $\sum_{j=1}^{m} d_j = d$
$\mathbb{C}$: Complex domain
$\mathbb{C}^d$: $d$-dimensional complex vector space
$\mathbb{C}^{d\times d}$: Set of all $d \times d$ complex matrices
CM: Controllability matrix
$\mathrm{Cov}(e) \triangleq E[(e - E(e))(e - E(e))^T]$: covariance matrix of random variable vector $e$
$c_q$: Size of sampling square in complex plane for coherent state preparation in TSE QDT
$c_o \triangleq \arg\min E\{\mathrm{Tr}[(X_0^T X_0)^{-1}]\}$: optimal size of sampling square $c_q$ for coherent probe states in TSE QDT
$D$ ($D_d$, $D(\mathcal{H})$): Set of all ($d$-dimensional) density matrices (in space $\mathcal{H}$)
$\mathrm{diag}(a)$: A diagonal matrix with diagonal line consisting of elements in vector $a$
$\mathrm{diag}(A)$: A diagonal matrix obtained from square matrix $A$ by setting all non-diagonal elements in $A$ as zero
$\dim(\mathcal{H})$: Dimension of space $\mathcal{H}$
$E(\cdot)$: Expectation on all possible measurement results
$E(\cdot)$: Expectation on classical random variables $x$ and $y$
$\mathcal{E}$: Quantum process
$e$: Base of natural logarithm
$\exp(A)$: Matrix exponential function on square matrix $A$
$\{F_i\}$: Set of basis matrices for $\mathbb{C}^{d\times d}$
$F(\rho_1, \rho_2) \triangleq \mathrm{Tr}^2\left(\sqrt{\sqrt{\rho_1}\rho_2\sqrt{\rho_1}}\right)$: fidelity between states $\rho_1$ and $\rho_2$
$\bar{G}$: Accessible set
$H$: Hamiltonian
$H \triangleq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$: single-qubit Hadamard gate
$\mathcal{H}$: Hilbert space
$\hbar$: Reduced Planck constant
$I$: Identity operator or identity matrix
$I_{-k}$: Identity matrix with $k$-th diagonal element $-1$
$i \triangleq \sqrt{-1}$: imaginary unit
$\{iH_m\}_{m=1}^{d^2-1}$: An orthonormal basis set of $su(d)$
$K$: Number of steps in adaptive QST
$L$: Number of kinds of different probe states in QDT
$\mathcal{L}$: Lagrange function
$M$ ($M(\hat{\Theta}, \Theta)$) $\triangleq E(\hat{\Theta} - \Theta)(\hat{\Theta} - \Theta)^T$: Mean Squared Error matrix
$M$ ($M^{(j)}$): Number of matrices in ($j$-th) POVM
$M_t \triangleq \sum_{k=1}^{t} M^{(j_k)}$: number of regression equations after $t$ times POVM measurement
$\mathcal{M} \triangleq \bigcup_{j=1} \mathcal{M}^{(j)}$: admissible measurement set
$\mathcal{M}^{(j)} \triangleq \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$: $j$-th step POVM measurement operators
$N_D$: Length of experimental data (i.e., sampling times)
$N_H$: Number of unknown parameters in Hamiltonian
$N_L$: Dimension of linear system model (4.51)
$N_O$: Number of copies of each output state in QHI/QGI
$N_P$: Number of different POVM sets to reconstruct one output state in QHI/QGI
$N_q$: Number of qubits
$N_t$: Total number of copies of all quantum states used in an experiment
$n$: Number of elements in accessible set
$n^{(j)}$: Number of times $j$-th measurement $\mathcal{M}^{(j)}$ is performed
$n_m^{(j)}$: Number of occurrence of $m$-th outcome from $n^{(j)}$ measurement trials of $\mathcal{M}^{(j)}$
$n_{ij}$: Number of occurrence of $i$-th POVM outcome on $\rho_j$
OM: Observability matrix
$P_i$ ($P_i^{(j)}$): $i$-th matrix in ($j$-th) POVM
$p_i$ ($p_{ij}$) $\triangleq \mathrm{Tr}(P_i\rho)$ ($\triangleq \mathrm{Tr}(P_i\rho_j)$): measurement probability of $i$-th outcome on $\rho$ ($\rho_j$)
$\mathbb{R}$: Real domain
$\mathbb{R}^d$: $d$-dimensional real vector space
$\mathbb{R}^{d\times d}$: Set of all $d \times d$ real matrices
$\mathrm{Re}(a)$: Real part of $a$
$S_{jkl}$: Structure constants of $su(d)$
$S$: Similarity transformation matrix in STA
$s$: Laplace variable in Laplace transform
$su(d)$: Lie algebra consisting of all $d \times d$ skew-Hermitian traceless matrices
$T_R$: Running time of simulation programs in seconds
$\mathrm{Tr}_A(X)$ ($\mathrm{Tr}_B(X)$): Partial trace on space $\mathcal{H}_A$ ($\mathcal{H}_B$) where $X \in \mathcal{H}_A \otimes \mathcal{H}_B$
$t$: Time
$t_s$: Sampling time
$U$: Unitary propagator/quantum gate
$u$: Control signal
$\mathrm{vec}(A_{m\times n}) \triangleq [A_{11}, A_{21}, ..., A_{m1}, A_{12}, ..., A_{m2}, ..., A_{mn}]^T$: column vectorization of $A$
$\mathrm{vec}^{-1}(a)$: Inverse function of vectorization, from $\mathbb{C}^{d^2}$ to $\mathbb{C}^{d\times d}$
$W_m^{(j_k)}$: Weight of linear regression equation obtained from $m$-th outcome of $j_k$-th POVM
$X$: Process matrix
$\Delta A \triangleq \hat{A} - A$: difference between $A$ and its estimation $\hat{A}$
$\delta_{ij}$: Kronecker Delta function
$\delta(t)$: Dirac Delta function
$\eta$: Penalty coefficient in modified TSE method
$\Theta$ ($\Theta_j$) $\triangleq (\theta_1, ..., \theta_{d^2})^T$ ($\triangleq (\theta_1^{(j)}, ..., \theta_{d^2}^{(j)})^T$): vector of parametrization of ($j$-th) state
$\Theta \triangleq (\theta_2, ..., \theta_{d^2})^T$: effective vector of parametrization of state
$\hat{\Theta}_n$: LRE estimation of $\Theta$ using $n$ regression equations
$\theta_i$ ($\theta_i^{(j)}$) $\triangleq \mathrm{Tr}(\rho\Omega_i)$ ($\triangleq \mathrm{Tr}(\rho_j\Omega_i)$): parametrization coefficient of ($j$-th) state in $i$-th basis matrix $\Omega_i$
$\vartheta_i$: $i$-th unknown parameter in Hamiltonian
$\vartheta \triangleq (\vartheta_1, ..., \vartheta_{N_H})^T$: vector of all unknown Hamiltonian parameters
$\Lambda(A)$: Set of all eigenvalues of $A$ (repeated eigenvalues appear multiple times)
$\Lambda_L$: Lagrange multiplier variable/matrix
$\lambda_i(A)$: $i$-th eigenvalue of $A$
$\Xi \triangleq [\xi_{mn}]$: matrix composed of elements $\xi_{mn}$
$\xi_{mn}$: Expansion coefficient of $\rho_m$ using $\rho_n$
$\rho$: Density matrix
$\rho_i$: $i$-th probe state in QDT
$\rho_{in}$ ($\rho_{out}$): Input (output) state of a quantum process
$\Sigma \triangleq (A, B, C, D)$ ($\triangleq (A, B, C)$): 4(3)-tuples denoting a linear system
$\sigma_x, \sigma_y, \sigma_z, \sigma_+, \sigma_-$: Pauli matrices and their linear combinations, $\sigma_\pm = \frac{1}{2}(\sigma_x \pm i\sigma_y)$
$\Phi \triangleq (\Phi_1^T, \Phi_2^T, ..., \Phi_M^T)^T$: vector of all parameters in POVM matrices
$\Phi_{i,(j)} \triangleq (\phi^i_{1,(j)}, \phi^i_{2,(j)}, \phi^i_{3,(j)}, \phi^i_{4,(j)})^T$: vector of parametrization of $i$-th matrix of single-qubit projector on $j$-th qubit
$\Phi_j$ ($\Phi_j^{(k)}$) $\triangleq (\phi_1^{(j)}, ..., \phi_{d^2}^{(j)})^T$ ($\triangleq (\phi_{1,k}^{(j)}, ..., \phi_{d^2,k}^{(j)})^T$): vector of parametrization of $j$-th matrix of ($k$-th) POVM
$\Phi_j$ ($\Phi_j^{(k)}$) $\triangleq (\phi_2^{(j)}, ..., \phi_{d^2}^{(j)})^T$ ($\triangleq (\phi_{2,k}^{(j)}, ..., \phi_{d^2,k}^{(j)})^T$): effective vector of parametrization of $j$-th matrix of ($k$-th) POVM
$\Phi_i^{CE}$: Vector of parametrization of “Cyclic eigenvalues” method’s result to estimate eigenvector corresponding to $i$-th near-0 eigenvalue
$\phi_i^{(j)}$ ($\phi_{i,k}^{(j)}$) $\triangleq \mathrm{Tr}(P_j\Omega_i)$ ($\mathrm{Tr}(P_j^{(k)}\Omega_i)$): parametrization coefficient of ($j$-th) matrix of ($k$-th) POVM in $i$-th basis matrix $\Omega_i$
$\phi_{i,(k)}^{(j)} \triangleq \mathrm{Tr}[P_j(\Omega_1^{\otimes(k-1)} \otimes \Omega_i \otimes \Omega_1 \otimes \cdots \otimes \Omega_1)]$: parametrization coefficient of $k$-th qubit measurement component of product projector’s $j$-th matrix in $i$-th single-qubit basis $\Omega_i$
$|\psi\rangle$: Unit complex column vector representing a pure state
$\{\Omega_i\}_{i=1}^{d^2}$: A set of Hermitian bases for $\mathbb{C}^{d\times d}$, where $\mathrm{Tr}(\Omega_i\Omega_j) = \delta_{ij}$ and $\Omega_1 = I/\sqrt{d}$

List of Common Acronyms

BBO: β-barium borate
BME: Bayesian mean estimation
BS: Beam splitter
CLS: Constrained least squares
CW: Continuous wave
ERA: Eigenstate realization algorithm
GM: Gill-Massar
HWP: Half-wave plate
IF: Interference filter
LRE: Linear regression estimation
LS: Least squares
MES: Maximally entangled states
MIMO: Multiple-input multiple-output
MLE: Maximum likelihood estimation
MSE: Mean squared error
MUB: Mutually unbiased bases
PBS: Polarizing beam splitter
PGI: Pure-state-based gate identification
POVM: Positive operator valued measure
QGI: Quantum gate identification
QHI: Quantum Hamiltonian identification
QT: Quantum tomography
QST: Quantum state tomography
QPT: Quantum process tomography
QDT: Quantum detector tomography
QWP: Quarter-wave plate
RAQST: Recursively adaptive quantum state tomography
SIC-POVM: Symmetric informationally complete positive operator-valued measure
SISO: Single-input single-output
SNSPD: Superconducting nanowire single photon detector
SPD: Single photon detector
SPT: Structure preserving transformation
STA: Similarity transformation approach
TSE: Two-stage estimation
TSO: Two-step optimization

Chapter 1

Introduction

The search for the principles of nature has always been a central goal of physics. One of the most significant achievements since the beginning of the 20th century is quantum science, which reveals the special properties of the universe at the microscopic scale. A natural question then arises: what practical influence or applications can quantum science bring us?

Through decades of effort, scientists have developed a number of revolutionary quantum technologies based on the principles of quantum mechanics. For example, quantum computation utilizes the superposition of quantum states to perform certain computation tasks with an efficiency much higher than classical (which means non-quantum throughout this thesis) computers [43]. Quantum communication encodes information in quantum states to realize secure exchange of key information [10]. Quantum sensing employs quantum properties to perform measurements with high sensitivity or precision [39]. These achievements, together with many other developing branches, are promising candidates for next-generation technologies in many information-related subjects.

Realizing and developing these quantum technologies usually requires accurate manipulation of certain quantum objects. A prerequisite is to obtain enough information about the unknown quantum entity; i.e., information about certain key structures or parameters of the entity needs to be extracted. This highlights the significance of system identification and parameter estimation, which is often called quantum tomography (QT) in quantum-associated subjects.

In contrast to the classical world, the quantum no-cloning theorem [41, 111, 158] implies that for a single copy of an unknown quantum state, its information cannot be totally recovered. To overcome this obstacle, QT usually assumes a framework where a large number of independent identical copies of the unknown quantum entity are available, and data are obtained through proper interaction (e.g., quantum measurement) with these copies following certain protocols. Then the information about the entity can be extracted through a reconstruction algorithm using the data. The final target is thus to obtain an estimate of the whole entity (called full QT) or of partial aspects of the entity. Common criteria for evaluating tomography methods include computational complexity, estimation error, efficiency, reliability, etc.

For this thesis, a main focus is on designing novel full tomography methods. We start from estimating the state of an unknown quantum system, and this technique is called quantum state tomography (QST). Then we move our focus to the evolution of states. Specifically, we concentrate on closed quantum systems, where the system evolution is governed by the Hamiltonian, and the task is usually referred to as Hamiltonian identification. Before designing a novel Hamiltonian identification algorithm, we investigate the problem of Hamiltonian identifiability; i.e., whether a given experimental setting is enough to uniquely determine all the unknown parameters in the Hamiltonian. Finally, we “complete the triad, state, process and detector tomography, required to fully specify an experiment” [97] by considering quantum detector tomography.

In Ch.3, we propose a novel Recursively Adaptive Quantum State Tomography (RAQST) protocol for multi-qubit systems. Based on the linear regression estimation algorithm, RAQST recursively incorporates new measurement data into a historical estimate. Then, according to the updated estimate, an index is used to predict the performance of any candidate measurement bases. Simulation shows that even with the simplest 2-qubit product measurements, RAQST can outperform nonadaptive protocols, and also beat the Gill-Massar bound for a wide range of pure states. A quantum optical experiment on a two-qubit system demonstrates the effectiveness of our adaptive method.

Ch.4 switches the focus to the Hamiltonian identifiability problem, which investigates whether a given experimental setting can uniquely determine all the unknown parameters in the Hamiltonian. We employ the Similarity Transformation Approach (STA) in classical control theory to solve the quantum Hamiltonian identifiability problem, and prove identifiability conclusions for spin-1/2 chain systems with arbitrary dimensions assisted by single-qubit probes. We further extend the traditional STA method by proposing a Structure Preserving Transformation (SPT) method for non-minimal systems. We use the SPT method to introduce an indicator for the existence of economic quantum Hamiltonian identification algorithms, whose computational complexity directly depends on the number of unknown parameters (which could be much smaller than the system dimension). Finally, we give two examples of such economic Hamiltonian identification algorithms and perform simulations to demonstrate their effectiveness.

A test of quantum Hamiltonian identifiability is instrumental in saving time and cost for practical Hamiltonian identification experiments. With this precursory problem solved in Ch.4, we proceed to specific identification algorithms in Ch.5. We identify an unknown quantum Hamiltonian within the framework of quantum process tomography. In our method, different pre-designed probe states are input into the quantum system and the output states are estimated using the quantum state tomography protocol via linear regression estimation. To reconstruct the time-independent system Hamiltonian, we establish the identification problem as an optimization problem, and design an approximate solution method using two-step optimization (TSO). We analyze the computational complexity and identification error of the TSO method, and provide numerical examples to demonstrate its effectiveness. Furthermore, we improve our TSO method to provide a more efficient Pure-state-based Gate Identification (PGI) algorithm, with the computational complexity reduced from $O(d^6)$ to $O(d^3)$ for a $d$-dimensional system. We note that theoretically both the input and output states in our protocol are pure. We thus design a fast pure-state tomography to reconstruct the output states more efficiently. We establish an analytical error upper bound, and perform a single-qubit optical experiment to validate the effectiveness of the PGI method.

In Ch.6 we come to the tomography of a quantum detector using a Two-Stage Estimation (TSE) method. First, a series of different probe states are employed to generate measurement data. Then, using constrained linear regression estimation, a stage-1 estimate of the detector is obtained. Finally, the positive semidefinite requirement on the POVM matrices is added to guarantee a physical stage-2 estimate. We analyze the computational complexity of this approach and establish an error upper bound. We also discuss optimization of the coherent probe states. We perform simulation and a quantum optical experiment to verify the effectiveness of the TSE method.

Ch.7 summarizes the aforementioned results and discusses possible future work in the field of QT.

Chapter 2

Quantum mechanics and standard tomography methods

2.1 Quantum mechanics foundations

Quantum mechanics is a set of mathematical and physical formalisms for describing nature at the scale of atoms and subatomic particles. Axiomatic methods were employed in the development and reformulation of quantum mechanics, leading to a number of fundamental postulates from which the whole theory can be deduced. We start from these postulates to introduce the necessary preliminaries for this thesis. The postulates in different textbooks have minor differences, and here we adopt the version in [104].

Postulate 2.1. Any isolated quantum system is associated to a complex vector space with inner product (namely, a Hilbert space) known as the state space of the system. The system is completely described by its state vector, which is a unit vector in the state space of the system.

Mathematically, a quantum state is usually denoted as a unit complex vector $|\psi\rangle$ in the underlying Hilbert space $\mathcal{H}$, which can also be viewed as a column vector.

States that can each be represented by a single vector are called pure states. In contrast, a statistical ensemble of pure states, called a mixed state, cannot be described with a single vector. Hence, a mixed state is usually denoted as a density matrix $\rho$, which is Hermitian, positive semidefinite and satisfies $\mathrm{Tr}(\rho) = 1$. For a closed quantum system with state $|\psi\rangle$, we have $\rho = |\psi\rangle\langle\psi|$. In this thesis, we denote by $\mathcal{D}_d$ ($\mathcal{D}(\mathcal{H})$) the set of all $d$-dimensional quantum states (in space $\mathcal{H}$), also simplified as $\mathcal{D}$ when there is no ambiguity. Since quantum states are fundamental in quantum research, their efficient or accurate estimation is certainly an important problem.

The dimension of the underlying Hilbert space can be infinite or finite, depending upon the specific physical system. For the infinite dimensional case, the above notations $|\psi\rangle$ and $\rho$ are more commonly interpreted as operators instead of matrices, due to the fact that infinite dimensional matrices are difficult to tackle mathematically. This thesis is mainly focused on finite dimensional cases, and thus one can identify $|\psi\rangle$ and $\rho$ with finite dimensional vectors and matrices, respectively. The simplest nontrivial Hilbert space is two-dimensional, upon which the quantum system is called a qubit. An orthonormal basis of a qubit system is usually denoted as $|0\rangle$ and $|1\rangle$, corresponding to the classical 0-1 bit and forming the basic unit of quantum information.

Postulate 2.2. The time evolution of the state of a closed quantum system is described by the Schrödinger equation

$$ i\hbar \frac{d|\psi(t)\rangle}{dt} = H|\psi(t)\rangle, \tag{2.1} $$

where $i = \sqrt{-1}$, $\hbar$ is the reduced Planck constant (set to 1 in atomic units), and $H$ is a fixed Hermitian operator known as the Hamiltonian of the closed system.

Postulate 2.2 describes how a quantum state will evolve with time. It also has a density matrix version, which is the Liouville-von Neumann equation

$$ \dot{\rho} = -i[H, \rho], \tag{2.2} $$

where $[A, B] = AB - BA$ is the commutator and we use atomic units to set $\hbar = 1$ throughout the rest of this thesis.

When the Hamiltonian does not change with time, we say it is time-independent, and the solution to the Schrödinger equation is thus

$$ |\psi(t_2)\rangle = U(t_2, t_1)|\psi(t_1)\rangle, \tag{2.3} $$

where we define

$$ U(t_2, t_1) \triangleq \exp[-iH(t_2 - t_1)]. \tag{2.4} $$

Since only the difference between $t_2$ and $t_1$ determines $U$, one can further write $U$ as $U(t_2 - t_1)$. An operator $U$ of the form (2.4) is always unitary, and it is called the propagator or quantum gate. Common single-qubit operators include the Pauli matrices

$$ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, $$

and the Hadamard gate

$$ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. $$


In the mixed-state case, (2.3) becomes

$$ \rho(t_2) = U(t_2 - t_1)\,\rho(t_1)\,U^\dagger(t_2 - t_1). \tag{2.5} $$

From Postulate 2.2 we see the dynamics of quantum states in closed systems are governed by the Hamiltonian/gate, which highlights the importance of the Hamiltonian/gate identification problem.
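As a small numerical illustration of (2.3)-(2.5), the following sketch builds the propagator of a time-independent Hamiltonian and evolves a pure state in both the vector and density matrix pictures. The choice $H = \sigma_x$, the elapsed time `t` and the helper name `propagator` are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def propagator(H, t):
    """U(t) = exp(-iHt) for Hermitian H (hbar = 1), via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T   # V diag(e^{-iwt}) V^dagger

# Illustrative assumption: H = sigma_x and elapsed time t = t2 - t1 = 0.3.
H = np.array([[0, 1], [1, 0]], dtype=complex)
t = 0.3
U = propagator(H, t)

psi0 = np.array([1, 0], dtype=complex)          # |0>
rho0 = np.outer(psi0, psi0.conj())              # rho = |psi><psi|

psi_t = U @ psi0                                # pure-state evolution, Eq. (2.3)
rho_t = U @ rho0 @ U.conj().T                   # density matrix evolution, Eq. (2.5)

is_unitary = np.allclose(U.conj().T @ U, np.eye(2))
pictures_agree = np.allclose(rho_t, np.outer(psi_t, psi_t.conj()))
```

Both flags should come out true: the propagator is unitary, and evolving the density matrix agrees with evolving the state vector first and then forming $|\psi\rangle\langle\psi|$.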

If the system under consideration has interaction with the environment, it becomes an open quantum system and has a more complicated state evolution. Usually a quantum process/operation is used to describe the transformation of the state.

Suppose there is a state $\rho_{in} \in \mathcal{D}(\mathcal{H}_A)$, then the process is a map $\mathcal{E}$ that transforms it to another state

$$ \rho_{out} = \mathcal{E}(\rho_{in}), \tag{2.6} $$

where $\rho_{out} \in \mathcal{D}(\mathcal{H}_B)$. In the general case, the input and output Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ can have different dimensions, while for simplicity in this thesis we assume they are the same space.

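For a process to be physical, it must further satisfy two restrictions:

(i) $\mathcal{E}$ must be trace preserving; i.e.,

$$ \forall \rho_{in} \in \mathcal{D}(\mathcal{H}_A), \quad \mathrm{Tr}[\mathcal{E}(\rho_{in})] \equiv \mathrm{Tr}(\rho_{in}). \tag{2.7} $$

(ii) $\mathcal{E}$ must be completely positive; i.e., for an arbitrary Hilbert space $\mathcal{H}_C$, $(\mathcal{E} \otimes I_C)\rho^{AC} \in \mathcal{D}(\mathcal{H}_B \otimes \mathcal{H}_C)$, $\forall \rho^{AC} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_C)$, where $I_C$ is the identity operator in $\mathcal{H}_C$.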

There are a number of specific equivalent representations for $\mathcal{E}$. In Sec. 2.2.2, we will introduce the Choi-Jamiołkowski isomorphism, and in Sec. 5.2.1 we will introduce the Kraus operator-sum representation.

Postulate 2.3. A quantum measurement is associated to a collection $\{Q_i\}$ of measurement operators, acting on the state space of the system being measured. They satisfy the completeness equation

$$ \sum_i Q_i^\dagger Q_i = I, \tag{2.8} $$

where $I$ is the identity operator. The index $i$ labels the possible measurement outcomes. If the quantum system has state $|\psi\rangle$ immediately before the measurement, then the probability that the $i$-th result occurs is given by

$$ p_i = \langle\psi|Q_i^\dagger Q_i|\psi\rangle, $$

and the post-measurement state is

$$ \frac{Q_i|\psi\rangle}{\sqrt{\langle\psi|Q_i^\dagger Q_i|\psi\rangle}}. $$

One can check that the completeness equation is equivalent to requiring that all the probabilities sum to one. If $\{Q_i\}$ further satisfies $Q_iQ_j = \delta_{ij}Q_i$ and each $Q_i$ is Hermitian, then the measurement is called a projective measurement, and $Q_i$ a projector. When the post-measurement state is not of much interest, a more widely used formalism is the Positive Operator-Valued Measure (POVM) measurement, which can be deduced from Postulate 2.3. We define $P_i \triangleq Q_i^\dagger Q_i$. Then a POVM measurement is associated with a set of positive operators $\{P_i\}$, with their sum equal to the identity. The probability of the $i$-th outcome is now determined as

$$ p_i = \langle\psi|P_i|\psi\rangle, $$

which in the mixed-state case is

$$ p_i = \mathrm{Tr}(P_i\rho). \tag{2.9} $$


Each $P_i$ is a POVM element, and in the finite dimensional case it corresponds to a positive semidefinite matrix.

Suppose we have a series of probabilities (2.9). If the measurement outcome probabilities (the $p_i$'s) are approximated by experiments and the POVM elements (the $P_i$'s) are known, the technique to deduce the unknown state $\rho$ is called quantum state tomography (QST). Conversely, if the probabilities and the state are known, the procedure to deduce the unknown POVM elements is called quantum detector tomography (QDT), because POVM elements are the mathematical representation of quantum detectors (measurement devices).
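To make the Born rule (2.9) concrete, here is a minimal sketch that forms POVM elements from measurement operators, checks the completeness equation (2.8), and computes the outcome probabilities for a mixed qubit state. The particular measurement (projectors onto the $\sigma_x$ eigenstates) and the state are illustrative assumptions.

```python
import numpy as np

# Measurement operators Q_i: projectors onto the sigma_x eigenstates |+> and |->.
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
Q = [np.outer(plus, plus.conj()), np.outer(minus, minus.conj())]

# POVM elements P_i = Q_i^dagger Q_i; completeness means they sum to the identity.
P = [q.conj().T @ q for q in Q]
complete = np.allclose(sum(P), np.eye(2))

# Illustrative mixed state: 0.8 |0><0| + 0.2 |+><+|.
rho = 0.8 * np.diag([1, 0]).astype(complex) + 0.2 * np.outer(plus, plus.conj())

# Outcome probabilities p_i = Tr(P_i rho), Eq. (2.9).
probs = np.array([np.trace(p @ rho).real for p in P])
```

In QST the entries of `probs` would be replaced by measured frequencies and `rho` would be the unknown; in QDT the roles of `rho` and `P` are exchanged.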

Postulate 2.4. The state space of a composite physical system is the tensor product of the state spaces of the component physical systems. Specifically, let $|\psi_i\rangle$ be the state of the $i$-th subsystem and let there be $n$ subsystems altogether; then the total system has the joint state $|\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle$, which is often written in the short notation $|\psi_1\psi_2\cdots\psi_n\rangle$.

From Postulate 2.4 we know the composite of density matrices (or of operators) is also in the tensor product form. When the operators are Pauli matrices, a similar and more common way to express their composite is to omit the tensor product and identity notation, and use number subscripts to denote the qubit number. For example, for a 4-qubit system, $Y_1X_2$ is in fact short for $\sigma_y \otimes \sigma_x \otimes I_2 \otimes I_2$, where $I_2$ is the $2\times 2$ identity. If a state of the total system can be written as the tensor product of states of the component systems, as in the form in Postulate 2.4, then it is called separable. Non-separable states are called entangled states, which are one of the most important resources in quantum science.
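The shorthand for Pauli strings can be expanded mechanically with Kronecker products. The sketch below uses a hypothetical helper `pauli_string` (not part of the thesis) that places the named Pauli matrices on the given qubits and the $2\times 2$ identity everywhere else.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
PAULI = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(ops, n_qubits):
    """ops maps a 1-based qubit index to 'X', 'Y' or 'Z'; all other qubits get I2."""
    factors = [PAULI[ops[q]] if q in ops else I2 for q in range(1, n_qubits + 1)]
    return reduce(np.kron, factors)   # left-to-right tensor product

# Y_1 X_2 on a 4-qubit system: sigma_y (x) sigma_x (x) I2 (x) I2, a 16 x 16 matrix.
Y1X2 = pauli_string({1: "Y", 2: "X"}, n_qubits=4)
```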

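Postulate 2.4 directly shows how to obtain composite states from states on subsystems. To go in the opposite direction, we need the partial trace to obtain a reduced density operator. For any $|a\rangle, |b\rangle \in \mathcal{H}_A$ and $|c\rangle, |d\rangle \in \mathcal{H}_B$, the partial trace over $\mathcal{H}_A$ is defined by

$$ \mathrm{Tr}_A(|a\rangle\langle b| \otimes |c\rangle\langle d|) = \mathrm{Tr}(|a\rangle\langle b|)\,|c\rangle\langle d|. $$

Suppose $\rho^{AB}$ is a state on $\mathcal{H}_A \otimes \mathcal{H}_B$. Then $\rho^{AB}$ restricted to $\mathcal{H}_B$ is the reduced density operator for system $\mathcal{H}_B$, as

$$ \rho^{B} \triangleq \mathrm{Tr}_A(\rho^{AB}). $$

In particular, if $\rho^{AB} = \rho \otimes \sigma$, then $\rho^{B} = \mathrm{Tr}_A(\rho \otimes \sigma) = \sigma$.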
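The partial trace is easy to compute numerically by viewing a bipartite matrix as a four-index tensor. The following sketch (with the assumed helper name `partial_trace_A` and illustrative example states) checks the product-state identity above and also shows that tracing out half of a maximally entangled pair leaves the maximally mixed state.

```python
import numpy as np

def partial_trace_A(rho_AB, dA, dB):
    """Trace out the first subsystem of an operator on H_A (x) H_B."""
    r = rho_AB.reshape(dA, dB, dA, dB)     # tensor indices: a, b, a', b'
    return np.einsum("abac->bc", r)        # sum over a = a'

# Product state rho (x) sigma: the reduced state on B is sigma.
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
sigma = np.array([[0.5, 0.5j], [-0.5j, 0.5]], dtype=complex)
rho_B = partial_trace_A(np.kron(rho, sigma), 2, 2)

# Bell state (|00> + |11>)/sqrt(2): the reduced state on B is I/2.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_B_bell = partial_trace_A(np.outer(bell, bell.conj()), 2, 2)
```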

2.2 Standard maximum likelihood estimation

The main focus of this thesis is on designing new tomography algorithms, and later it will be necessary to compare them with existing methods. Hence, in this section we briefly introduce one of the most commonly used tomography methods, the Maximum Likelihood Estimation (MLE) method.

2.2.1 Quantum state tomography via maximum likelihood estimation

Quantum State Tomography (QST) is the technique to deduce an unknown quantum state from measurement data. We hereby introduce the MLE method to perform QST, based on [75, 79].

Suppose we have $N_t$ identical independent copies of an unknown state $\rho$. Usually we perform a series of POVM measurements $\{P_i\}$ on them to extract information. Denote the observed number of occurrences of the $i$-th outcome as $n_i$; then the frequency $\hat{p}_i = n_i/N_t$ is an experimental approximation to the true probability $p_i$. The probability that we observe such data is in fact $\prod_i p_i^{n_i}$, which after taking the logarithm and dividing by $N_t$ is equivalent to the log-likelihood functional

$$ L(\rho) = \sum_i \hat{p}_i \ln \mathrm{Tr}(\rho P_i). $$

The core idea of MLE is to search for the solution that maximizes the probability to observe the data in hand. Hence, MLE takes

$$ \hat{\rho}_{MLE} = \arg\max_{\rho} \sum_i \hat{p}_i \ln \mathrm{Tr}(\rho P_i) $$

as the estimate of the state. By analyzing the extremal equation, one can obtain an iterative search algorithm:

1. Assign an admissible guess to the initial state; e.g., $\rho^{(0)} = I/d$.

2. At step-$k$, compute

$$ R^{(k)} = \sum_i \frac{\hat{p}_i}{\mathrm{Tr}[\rho^{(k)} P_i]} P_i. $$

3. Update the estimate at step-$(k+1)$ as

$$ \rho^{(k+1)} = \frac{R^{(k)}\rho^{(k)}R^{(k)}}{\mathrm{Tr}[R^{(k)}\rho^{(k)}R^{(k)}]}. $$

4. Terminate the iteration if the distance between $\rho^{(k)}$ and $\rho^{(k+1)}$ is smaller than a given threshold; otherwise, let $k$ be $k+1$ and repeat the iteration.

It is straightforward to check that during the above procedures the estimated value of the state is kept positive semidefinite and with trace 1. Hence, MLE always gives a physical estimate. Furthermore, there are improved versions of MLE QST to accelerate the algorithm; e.g., see [128, 140].
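The four steps of the iterative MLE search can be sketched as follows. This is a minimal illustration, not the implementation used in the thesis: the function name `mle_qst`, the Frobenius-norm stopping rule, the six-outcome Pauli POVM and the use of ideal (noise-free) frequencies are all assumptions made for the example.

```python
import numpy as np

def mle_qst(povm, freqs, tol=1e-10, max_iter=10000):
    """Iterative MLE state estimate from POVM elements and observed frequencies."""
    d = povm[0].shape[0]
    rho = np.eye(d, dtype=complex) / d                   # step 1: rho^(0) = I/d
    for _ in range(max_iter):
        probs = [np.trace(rho @ P).real for P in povm]
        # step 2: R = sum_i (p_hat_i / Tr[rho P_i]) P_i; unobserved outcomes contribute 0
        R = sum((f / p) * P for f, p, P in zip(freqs, probs, povm) if f > 0)
        new = R @ rho @ R                                # step 3: update ...
        new = new / np.trace(new).real                   # ... and renormalize
        if np.linalg.norm(new - rho) < tol:              # step 4: stopping rule
            return new
        rho = new
    return rho

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Six-outcome POVM built from the Pauli eigenprojectors, each weighted by 1/3.
povm = [(I2 + s * S) / 6 for S in (X, Y, Z) for s in (+1, -1)]

rho_true = (I2 + 0.3 * X + 0.4 * Y + 0.5 * Z) / 2        # illustrative mixed state
freqs = [np.trace(rho_true @ P).real for P in povm]      # ideal frequencies
rho_est = mle_qst(povm, freqs)
```

With noise-free frequencies from an informationally complete POVM the likelihood is maximized by the true state, so `rho_est` should reproduce `rho_true` up to the stopping tolerance. With sampled frequencies the same code applies unchanged, and the result is the physical state that best explains the data.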


2.2.2 Quantum process tomography via maximum likelihood estimation

Quantum Process Tomography (QPT) is a technique to employ quantum states (which are usually known) to estimate an unknown quantum process, which is a map between quantum states. Here we rephrase the framework in [79] to perform MLE QPT.

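Suppose the process is a map $\mathcal{E}$ from $\mathcal{H}_A$ to $\mathcal{H}_B$; i.e., for any state $\rho_{in} \in \mathcal{D}(\mathcal{H}_A)$, $\mathcal{E}(\rho_{in}) = \rho_{out} \in \mathcal{D}(\mathcal{H}_B)$. From the Choi-Jamiołkowski isomorphism [31, 66, 78, 110], $\mathcal{E}$ is in one-to-one correspondence with an operator $Q$ on $\mathcal{H}_A \otimes \mathcal{H}_B$, such that

$$ \mathcal{E}(\rho_{in}) = \mathrm{Tr}_A[Q(\rho_{in}^T \otimes I^B)]. $$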

The trace preserving restriction (2.7) is equivalent to

$$ \mathrm{Tr}_B(Q) = I^A \tag{2.10} $$

and the completely positive restriction on the process amounts to requiring that $Q$ is positive semidefinite. Under this representation, reconstructing the process $\mathcal{E}$ amounts to reconstructing $Q$. Usually a series of different states $\rho_m$ are input to the process, and POVM measurements $\{P_l^{(m)}\}$ are performed on each corresponding output state. Let $\hat{p}_{lm}$ denote the observed frequency of the corresponding outcomes from the POVM $\{P_l^{(m)}\}$. We then aim to maximize the constrained log-likelihood functional as

$$ \hat{Q}_{MLE} = \arg\max_{Q} \sum_{m,l} \hat{p}_{lm} \ln \mathrm{Tr}[Q(\rho_m^T \otimes P_l^{(m)})] - \mathrm{Tr}[(\Lambda_L \otimes I^B)Q], $$

where $\Lambda_L$ is the Lagrange multiplier matrix accounting for the trace-preservation condition (2.10). One can then again design a numerical iteration algorithm to search for the optimal solution:

1. Assign an admissible guess to the initial process; e.g., $Q^{(0)} = I^{AB}/\dim(\mathcal{H}_B)$.

2. At step-$k$, compute

$$ K^{(k)} = \sum_{m,l} \frac{\hat{p}_{lm}}{\mathrm{Tr}[Q^{(k)}(\rho_m^T \otimes P_l^{(m)})]}\, \rho_m^T \otimes P_l^{(m)} $$

and

$$ \Lambda_L^{(k)} = [\mathrm{Tr}_B(K^{(k)} Q^{(k)} K^{(k)})]^{1/2}. $$

3. Update the estimate at step-$(k+1)$ as

$$ Q^{(k+1)} = [(\Lambda_L^{(k)})^{-1} \otimes I^B]\, K^{(k)} Q^{(k)} K^{(k)}\, [(\Lambda_L^{(k)})^{-1} \otimes I^B]. $$

4. Terminate the iteration if the distance between $Q^{(k)}$ and $Q^{(k+1)}$ is smaller than a given threshold; otherwise, let $k$ be $k+1$ and repeat the iteration.

One can check that the above procedures keep $Q$ positive semidefinite and preserve the condition (2.10).
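The Choi-Jamiołkowski representation used in this subsection can be checked numerically on a known channel. The sketch below assumes the common convention $Q = \sum_{ij} |i\rangle\langle j| \otimes \mathcal{E}(|i\rangle\langle j|)$, which is consistent with the relation $\mathcal{E}(\rho_{in}) = \mathrm{Tr}_A[Q(\rho_{in}^T \otimes I^B)]$ but is not spelled out in the text. The example channel (a Hadamard gate followed by depolarizing noise) and all function names are illustrative assumptions.

```python
import numpy as np

def choi_matrix(channel, d):
    """Q = sum_{ij} |i><j| (x) E(|i><j|), an operator on H_A (x) H_B (assumed convention)."""
    Q = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            E_ij = np.zeros((d, d), dtype=complex)
            E_ij[i, j] = 1.0
            Q += np.kron(E_ij, channel(E_ij))
    return Q

def apply_choi(Q, rho_in, d):
    """E(rho_in) = Tr_A[Q (rho_in^T (x) I^B)]."""
    M = (Q @ np.kron(rho_in.T, np.eye(d))).reshape(d, d, d, d)
    return np.einsum("abac->bc", M)            # partial trace over A

def trace_B(Q, d):
    return np.einsum("abcb->ac", Q.reshape(d, d, d, d))

# Illustrative channel: Hadamard gate followed by depolarizing noise of strength p.
Hd = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
p = 0.2

def channel(A):
    # Linear extension of rho -> (1 - p) Hd rho Hd^dagger + p Tr(rho) I/2.
    return (1 - p) * Hd @ A @ Hd.conj().T + p * np.trace(A) * np.eye(2) / 2

Q = choi_matrix(channel, 2)
rho_in = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]], dtype=complex)

out_via_choi = apply_choi(Q, rho_in, 2)
out_direct = channel(rho_in)
tp_ok = np.allclose(trace_B(Q, 2), np.eye(2))            # condition (2.10)
cp_ok = np.linalg.eigvalsh(Q).min() > -1e-12             # Q positive semidefinite
```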

2.2.3 Quantum detector tomography via maximum likelihood estimation

Quantum Detector Tomography (QDT) amounts to reconstructing the POVM elements of a set of quantum measurements, since detectors are a kind of physical realization of POVMs. In this section, we rephrase the MLE QDT method in [110].

Suppose we perform POVM measurements $\{P_l\}_{l=1}^{M}$ (called one detector) on a series of different states $\rho_m$, and the observed corresponding frequency is $\hat{p}_{lm}$ for the $l$-th outcome. To reconstruct $\{P_l\}$, we need to consider the solution that maximizes the constrained log-likelihood functional

$$ \{\hat{P}_l\}_{MLE} = \arg\max_{\{P_l\}} \sum_{m,l} \hat{p}_{lm} \ln \mathrm{Tr}(\rho_m P_l) - \sum_l \mathrm{Tr}(\Lambda_L P_l), $$

where $\Lambda_L$ is the Lagrange multiplier matrix accounting for the constraint

$$ \sum_l P_l = I. $$

One can again design a numerical iteration algorithm as follows:

1. Assign an admissible guess to the initial detector; e.g., $P_l^{(0)} = I/M$.

2. At step-$k$, for each $l$, compute

$$ R_l^{(k)} = \sum_m \frac{\hat{p}_{lm}}{\mathrm{Tr}[\rho_m P_l^{(k)}]}\, \rho_m. $$

Then update the Lagrange multiplier matrix as

$$ \Lambda_L^{(k)} = \Big( \sum_l R_l^{(k)} P_l^{(k)} R_l^{(k)} \Big)^{1/2}. $$

3. Update the estimate at step-$(k+1)$ for each $l$ as

$$ P_l^{(k+1)} = (\Lambda_L^{(k)})^{-1} R_l^{(k)} P_l^{(k)} R_l^{(k)} (\Lambda_L^{(k)})^{-1}. $$

4. Terminate the iteration if the distance between $\{P_l^{(k)}\}$ and $\{P_l^{(k+1)}\}$ is smaller than a given threshold; otherwise, let $k$ be $k+1$ and repeat the iteration.

One can check that the above procedures guarantee that the POVM matrices are positive semidefinite and sum to the identity.
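The detector iteration above can be sketched in the same style as the state case. Again this is only an illustration under stated assumptions: the function names, the stopping rule, the two-outcome detector, the six Pauli eigenstates used as probes and the ideal frequencies are all chosen for the example and are not taken from the thesis.

```python
import numpy as np

def inv_sqrt_psd(A):
    """A^{-1/2} for a Hermitian positive definite matrix A."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.conj().T

def mle_qdt(probes, freqs, M, tol=1e-10, max_iter=10000):
    """freqs[l][m] is the observed frequency of outcome l for probe state m."""
    d = probes[0].shape[0]
    P = [np.eye(d, dtype=complex) / M for _ in range(M)]          # step 1
    for _ in range(max_iter):
        # step 2: R_l = sum_m (p_hat_lm / Tr[rho_m P_l]) rho_m
        R = [sum(freqs[l][m] / np.trace(rho @ P[l]).real * rho
                 for m, rho in enumerate(probes) if freqs[l][m] > 0)
             for l in range(M)]
        G = sum(R[l] @ P[l] @ R[l] for l in range(M))             # equals Lambda_L^2
        L_inv = inv_sqrt_psd(G)                                   # Lambda_L^{-1}
        new = [L_inv @ R[l] @ P[l] @ R[l] @ L_inv for l in range(M)]   # step 3
        step = max(np.linalg.norm(a - b) for a, b in zip(new, P))
        P = new
        if step < tol:                                            # step 4
            break
    return P

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

P_true = [(I2 + 0.5 * Z) / 2, (I2 - 0.5 * Z) / 2]                 # illustrative detector
probes = [(I2 + s * S) / 2 for S in (X, Y, Z) for s in (+1, -1)]  # Pauli eigenstates
freqs = [[np.trace(rho @ Pl).real for rho in probes] for Pl in P_true]
P_est = mle_qdt(probes, freqs, M=2)
```

By construction every iterate is positive semidefinite and the iterates sum to the identity, which is the property stated above. For this noise-free example the iteration should also approach `P_true`.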

Although MLE has been widely accepted and used in quantum tomography, it still has some intrinsic drawbacks that might be improved by other methods. For example, the iterative procedure is not very amenable to adaptivity, and usually results in a heavy computational burden. Also, it is not easy to theoretically characterize the estimation error. These drawbacks also appear in most other existing algorithms such as Bayesian Mean Estimation [18, 76]. Alleviating or overcoming these drawbacks is thus the main motivation for the research in this thesis.

Chapter 3

Recursively adaptive multi-qubit state tomography

The work reported in this chapter has been partially published in the following articles:

1. B. Qi, Z. Hou, Y. Wang, D. Dong, H.-S. Zhong, L. Li, G.-Y. Xiang, H. M. Wiseman, C.-F. Li, and G.-C. Guo, Adaptive quantum state tomography via linear regression estimation: Theory and two-qubit experiment, npj Quantum Information, vol. 3, no. 1, p. 19, 2017.

2. D. Dong, Y. Wang, Z. Hou, B. Qi, Y. Pan, and G.-Y. Xiang, State tomography of qubit systems using linear regression estimation and adaptive measurements, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 13014-13019, Toulouse, France, July 2017.

3. D. Dong, and Y. Wang, Several recent developments in estimation and robust control of quantum systems, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 190-195, Gold Coast, Australia, December 2017.

3.1 Introduction

One of the central problems in quantum science and technology is the estimation of an unknown quantum state [104]. Quantum state tomography (QST), as a procedure for experimentally determining an unknown quantum state, has become a standard technology for verification and benchmarking of quantum devices [7, 33, 35, 58, 62, 77, 85, 93, 96, 99, 110, 113, 121, 137, 141, 144]. Two key tasks in QST are data acquisition and data analysis. The aim of data acquisition is to acquire information for reconstructing the quantum state through appropriate measurement strategies. Then in the data analysis step, the acquired data is associated with an estimate of the unknown quantum state using a reconstruction algorithm.

For data acquisition, in order to enhance the efficiency, it is desirable to develop optimal measurement strategies for collecting data. However, an optimal measurement strategy, which is only known for a few special cases [29, 58, 65, 70, 170], depends on the state to be reconstructed. To circumvent this issue, many kinds of fixed sets of measurement bases have been designed to be optimal either in terms of the average over a certain quantum state space [2, 15, 38, 106, 157] or in terms of the worst case in the quantum state space [113]. For instance, improved state estimation can be achieved by taking advantage of mutually unbiased bases (MUB) [2, 46, 157] or symmetric informationally complete positive operator-valued measures (SIC-POVM) [11, 117]. For multi-partite quantum systems, MUB and SIC-POVM are difficult to realize experimentally since they involve nonlocal measurements. It remains open how to efficiently acquire information about an unknown quantum state using simple measurements that are straightforward to realize experimentally.

For data analysis in tomography, although many methods, such as maximum-likelihood estimation (MLE) [17, 110, 130, 139, 140], Bayesian mean estimation (BME) [18, 76] and least-squares inversion [109], have been used to reconstruct a quantum state, this task can be computationally intensive, and may take even more time than the experiments themselves. It has been reported in [62] that using the maximum-likelihood method to reconstruct an eight-qubit state took weeks of computation. Therefore, the development of an efficient data analysis algorithm is a critical issue in quantum state tomography [98, 113]. In [113], a recursive linear regression estimation algorithm was presented which is much more computationally efficient than the maximum-likelihood method, in the sense that it greatly reduces the cost of computation while sacrificing only a small amount of accuracy.

For a given number of copies of the system, in order to improve the tomography accuracy by better tomographic measurements, a natural idea is to develop an adaptive tomography protocol where the measurement can be adaptively optimized based on data collected so far. Adaptive measurements have shown more powerful capability than nonadaptive measurements in quantum phase estimation [67, 155, 160], phase tracking [162], quantum state discrimination [1, 68], and Hamiltonian estimation [49, 125]. Actually, adaptivity has been proposed for quantum state tomography in various contexts [5, 58, 76, 86, 90, 98, 108, 136]. For example, the results on one qubit have demonstrated that adaptive quantum state tomography can improve the accuracy quadratically considering the infidelity index [98]. However, when generalizing these results to multi-qubit systems, two new problems arise:

Problem 3.1. In the adaptive tomography protocol, the optimized measurement bases may be nonlocal, which are difficult to realize in experiments. The question of how to adapt the theory to practical experiments is an open problem.

Problem 3.2. Ref. [98] points out that estimating the near-zero eigenvalues of the state is vital to reduce the infidelity. Single-qubit states have at most one near-zero eigenvalue, while a multi-qubit state can have many, namely, being (or approximately being) degenerate. What effect does this have upon the adaptive protocol?

In this chapter, we combine the computational efficiency of the recursive technique of [113] with a new adaptive protocol that does not necessarily require nonlocal measurements to present a recursively adaptive quantum state tomography (RAQST) protocol. In Sec. 3.2, we introduce our RAQST protocol, where no prior assumption (except the dimension) is made on the state to be reconstructed. The state estimate is recursively updated based on the current estimate and the new measurement data, via certain closed-form formulas. Thus, compared with MLE and BME, the combination of historical information with the newly acquired data is much more efficient in our method. Thanks to the simple recursive estimation procedure, we can obtain the state estimate in real time, and using the estimate we can adaptively optimize the measurement strategies to be performed in the forthcoming step. In our RAQST protocol, the measurement to be performed at each step is optimized w.r.t. the corresponding admissible measurement set determined by the experimental conditions. As an example, we consider the case where only product measurements are performed, and we design a numerical algorithm to optimize the measurement basis among all product measurements.

In Sec. 3.3, we present simulation results for the RAQST protocol. It is first demonstrated numerically that our RAQST, even with the simplest product measurements, can outperform the tomography protocols using MUBs and the two-stage MUB adaptive strategy. For maximally entangled states, the infidelity can even be reduced to beat the Gill-Massar bound, which is a quantum Cramér-Rao inequality [58]. Moreover, if nonlocal measurements are available, with our RAQST the infidelity can be further reduced. For a wide range of quantum states, the infidelity of our RAQST can be reduced to beat the Gill-Massar bound with a modest number of copies. We perform two-qubit state tomography experiments in Sec. 3.4 using only the simplest product measurements, and the experimental results demonstrate that the improvement of our RAQST over nonadaptive tomography is significant for states with a high level of purity. This limit (very high purity) is the one relevant for most forms of quantum information processing. Finally, Sec. 3.5 concludes this chapter and presents relevant open problems.

3.2 RAQST protocol

A basic observation for proposing adaptivity is that the optimal measurement basis w.r.t. the estimation error usually depends on the specific state to be estimated [98]. Hence, the general idea of adaptivity is to employ historical estimates to deduce an optimal or near-optimal measurement basis to improve the accuracy. Let $K$ denote the number of total adaptive steps (non-adaptivity corresponds to $K = 1$). In the $k$-th step ($2 \le k \le K$), all the data collected up to and including the $(k-1)$-th step is employed to deduce an estimate of the state, according to which a new measurement basis is determined to perform the $k$-th step's measurement. The problem is to find an appropriate mathematical framework to describe the adaptivity process, preferably reducing the estimation error significantly compared with non-adaptive protocols, achieving an efficiency advantage and (partly) answering Problem 3.1 and Problem 3.2.

Ref. [113] proposed a linear regression estimation (LRE) method for quantum state tomography, where the results have shown that the LRE approach has much lower computational complexity than the MLE method for quantum tomography. Also, the LRE solution has a closed form, which should be advantageous in performing adaptive QST. Here, we further develop this LRE method to present a RAQST protocol that can greatly improve the precision of tomography.

3.2.1 Establishment of linear regression model

We first convert the quantum state tomography problem into a parameter estimation problem for a linear regression model. Consider a $d$-dimensional quantum system with Hilbert space $\mathcal{H}$. Let $\{\Omega_i\}_{i=1}^{d^2}$ denote a complete Hermitian basis set of $\mathbb{C}^{d\times d}$, satisfying $\mathrm{Tr}(\Omega_i\Omega_j) = \delta_{ij}$. Also let $\Omega_1 = I/\sqrt{d}$, and then the rest of the $\Omega_i$'s are all traceless. Using this set, the quantum state $\rho$ to be reconstructed can be parameterized as

$$ \rho = \frac{I}{d} + \sum_{i=2}^{d^2} \theta_i \Omega_i, \tag{3.1} $$

where $\theta_i = \mathrm{Tr}(\rho\Omega_i)$ is real. Since $\theta_1 = 1/\sqrt{d}$ is fixed, we take the effective parametrization vector as $\Theta = (\theta_2, \cdots, \theta_{d^2})^T$.
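For a single qubit, one convenient choice of such a basis is the set of normalized Pauli matrices. The sketch below uses this choice (an illustrative assumption; any orthonormal Hermitian basis with $\Omega_1 = I/\sqrt{d}$ would do) to map a state to its parametrization vector $\Theta$ and back via (3.1). The helper names `to_theta` and `from_theta` are hypothetical.

```python
import numpy as np

d = 2
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Orthonormal Hermitian basis: Omega_1 = I/sqrt(d), the rest traceless.
basis = [S / np.sqrt(d) for S in (I2, X, Y, Z)]
gram = np.array([[np.trace(A @ B).real for B in basis] for A in basis])

def to_theta(rho):
    """Effective parametrization vector Theta = (theta_2, ..., theta_{d^2})."""
    return np.array([np.trace(rho @ Om).real for Om in basis[1:]])

def from_theta(theta):
    """Rebuild rho = I/d + sum_{i>=2} theta_i Omega_i, Eq. (3.1)."""
    return np.eye(d, dtype=complex) / d + sum(t * Om for t, Om in zip(theta, basis[1:]))

rho = (I2 + 0.3 * X + 0.4 * Y + 0.5 * Z) / 2     # illustrative state
theta = to_theta(rho)
theta_1 = np.trace(rho @ basis[0]).real          # fixed to 1/sqrt(d) for any state
rho_back = from_theta(theta)
```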

A quantum measurement can be described by a positive operator-valued measure (POVM) $\{P_i\}_{i=1}^{M}$, which is a set of positive semidefinite matrices that sum to the identity, i.e., $P_i \geq 0$ and $\sum_{i=1}^{M} P_i = I$. In QST, different sets of POVMs should be appropriately combined to efficiently acquire information about the unknown quantum state. Let $\mathcal{M} = \bigcup_{j=1} \mathcal{M}^{(j)}$ denote the admissible measurement set, which is a union of POVMs determined by the experimental conditions. Each POVM is denoted as $\mathcal{M}^{(j)} = \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$. Using the set $\{\Omega_k\}_{k=1}^{d^2}$, elements of the POVM can be parameterized as

$$P_i^{(j)} = \phi_{1,j}^{(i)} \frac{I}{\sqrt{d}} + \sum_{k=2}^{d^2} \phi_{k,j}^{(i)} \Omega_k,$$

where $\phi_{1,j}^{(i)} = \mathrm{Tr}(P_i^{(j)})/\sqrt{d}$, and $\phi_{k,j}^{(i)} = \mathrm{Tr}(P_i^{(j)}\Omega_k)$. Let the effective parametrization vector of the $i$-th matrix of the $j$-th POVM be $\Phi_i^{(j)} = (\phi_{2,j}^{(i)}, \cdots, \phi_{d^2,j}^{(i)})^T$. If we perform the POVM $\mathcal{M}^{(j)}$ on copies of a system in state $\rho$, the probability that we observe the result $m$ is given by

$$p(m|\mathcal{M}^{(j)}) = \mathrm{Tr}(P_m^{(j)}\rho) = \phi_{1,j}^{(m)}/\sqrt{d} + \Theta^T \Phi_m^{(j)}. \tag{3.2}$$
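The linear form (3.2) can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the two-qubit basis $\Omega_k$ to be normalized tensor products of Pauli matrices (so that $\Omega_1 = I/2$), and the helper names `povm_params` and `outcome_probability` are hypothetical.

```python
import itertools
import numpy as np

# Assumed basis for d = 4: tensor products of {I, X, Y, Z}, divided by 2.
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
BASIS = [np.kron(a, b) / 2 for a, b in itertools.product(PAULIS, repeat=2)]

def povm_params(P):
    """Return (phi_1, Phi) for a POVM element P, following the text's definitions."""
    d = P.shape[0]
    phi1 = np.trace(P).real / np.sqrt(d)
    Phi = np.array([np.trace(P @ om).real for om in BASIS[1:]])
    return phi1, Phi

def outcome_probability(P, theta):
    """Right-hand side of (3.2): phi_1/sqrt(d) + Theta^T Phi."""
    phi1, Phi = povm_params(P)
    return phi1 / np.sqrt(P.shape[0]) + theta @ Phi

# Singlet state measured in the computational (HH, HV, VH, VV) basis.
psi = np.array([0, 1, -1, 0]) / np.sqrt(2)
rho = np.outer(psi, psi.conj())
theta = np.array([np.trace(rho @ om).real for om in BASIS[1:]])
projectors = [np.diag(row).astype(complex) for row in np.eye(4)]
probs = [outcome_probability(P, theta) for P in projectors]
```

Each entry of `probs` agrees with $\mathrm{Tr}(P_m\rho)$ computed directly, which is what makes the regression model linear in $\Theta$.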

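Assume that the total number of copies of the state is $N_t$, and we perform a measurement described by $\mathcal{M}^{(j)} = \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$ $n^{(j)}$ times. Let $n_m^{(j)}$ denote the number of occurrences of the outcome $m$ in the $n^{(j)}$ measurement trials of $\mathcal{M}^{(j)}$. Let $\hat{p}(m|\mathcal{M}^{(j)}) = n_m^{(j)}/n^{(j)}$ and $e_m^{(j)} = \hat{p}(m|\mathcal{M}^{(j)}) - p(m|\mathcal{M}^{(j)})$. Since $n_m^{(j)}$ is binomially distributed, $e_m^{(j)}$ has mean zero and variance

$$[p(m|\mathcal{M}^{(j)}) - p^2(m|\mathcal{M}^{(j)})]/n^{(j)}. \tag{3.3}$$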

Using (3.2), we have the linear regression equations for m = 2, ··· , M(j),

$$\hat{p}(m|\mathcal{M}^{(j)}) = \phi_{1,j}^{(m)}/\sqrt{d} + \Theta^T \Phi_m^{(j)} + e_m^{(j)}. \tag{3.4}$$

Note that $\hat{p}(m|\mathcal{M}^{(j)})$, $\phi_{1,j}^{(m)}/\sqrt{d}$ and $\Phi_m^{(j)}$ are all available, while $e_m^{(j)}$ may be considered as the observation noise. Hence, the problem of QST is converted into the estimation of the unknown vector $\Theta$.

Denote the total number of regression equations as $M_K = \sum_{k=1}^{K} M^{(j_k)}$, and the estimate using the first $n$ equations as $\hat{\Theta}_n$. To give an estimate with a high level of accuracy, the basic idea of LRE is to find an estimate $\hat{\Theta}_{M_K}$ such that

$$\hat{\Theta}_{M_K} = \arg\min_{\hat{\Theta}} \sum_{k=1}^{K} \sum_{m=1}^{M^{(j_k)}} W_m^{(j_k)} \left[\hat{p}(m|\mathcal{M}^{(j_k)}) - \phi_{1,j_k}^{(m)}/\sqrt{d} - \hat{\Theta}^T \Phi_m^{(j_k)}\right]^2. \tag{3.5}$$

Here, $\mathcal{M}^{(j_k)}$ denotes the POVM $\mathcal{M}^{(j_k)} = \{P_m^{(j_k)}\}_{m=1}^{M^{(j_k)}}$ being performed at the $k$-th step. The notation $W_m^{(j_k)}$ denotes the weight of the corresponding linear regression equation. In general, the smaller the variance of $e_m^{(j_k)}$ is, the more information can be extracted by $P_m^{(j_k)}$. Therefore, the corresponding weight of the regression equation should be larger. A sound choice of $W_m^{(j_k)}$ is the estimate of the inverse of the variance of $e_m^{(j_k)}$; i.e., $W_m^{(j_k)} = n^{(j_k)}/[\hat{p}(m|\mathcal{M}^{(j_k)}) - \hat{p}^2(m|\mathcal{M}^{(j_k)})]$.

3.2.2 Recursive LRE updating rule and physical projection

We utilize the recursive LRE algorithm [113] to find a closed-form solution for $\hat{\Theta}_{M_K}$. First, we transform the linear regression equations (3.4) into a compact form. After $t$ POVMs have been performed, we obtain in total $M_t = \sum_{k=1}^{t} M^{(j_k)}$ linear regression equations. We denote them as 2-tuples $[1, (j_1)], \cdots, [M^{(j_1)}, (j_1)], \cdots, [1, (j_t)], \cdots, [M^{(j_t)}, (j_t)]$, where $[m, (j_k)]$ corresponds to the linear regression equation with the outcome $m$ when the POVM $\mathcal{M}^{(j_k)}$ is performed at the $k$-th step. To facilitate the presentation, we relabel them according to the natural order. Thus, the notation $\hat{p}(m|\mathcal{M}^{(j_k)})$, $\phi_{1,j_k}^{(m)}$, $\Phi_m^{(j_k)}$, $e_m^{(j_k)}$, and $W_m^{(j_k)}$ can be simplified with the corresponding sequence number $n$ as $\hat{p}_n$, $\phi_1^{(n)}$, $\Phi_n$, $e_n$, and $W_n$. Let

$$Y_t = \left(\hat{p}_1 - \frac{\phi_1^{(1)}}{\sqrt{d}}, \cdots, \hat{p}_n - \frac{\phi_1^{(n)}}{\sqrt{d}}, \cdots, \hat{p}_{M_t} - \frac{\phi_1^{(M_t)}}{\sqrt{d}}\right)^T,$$

$$X_t = (\Phi_1, \cdots, \Phi_n, \cdots, \Phi_{M_t})^T,$$

$$e_t = (e_1, \cdots, e_n, \cdots, e_{M_t})^T,$$

$$W_t = \mathrm{diag}(W_1, \cdots, W_n, \cdots, W_{M_t}).$$

Using this notation, the linear regression equations (3.4) can be expressed in a compact form

$$Y_t = X_t \Theta + e_t. \tag{3.6}$$

The solution to (3.5) is

$$\hat{\Theta}_t = (X_t^T W_t X_t)^{-1} X_t^T W_t Y_t. \tag{3.7}$$
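The closed form (3.7) is an ordinary weighted least-squares solve. The following sketch illustrates it together with the weight choice $W = n/(\hat{p} - \hat{p}^2)$; the function names and the clipping constant `eps` (a guard against $\hat{p} = 0$ or $1$, where the weight formula divides by zero) are assumptions of this example rather than part of the protocol as stated.

```python
import numpy as np

def lre_weights(p_hat, n_trials, eps=1e-6):
    """Weights n / (p_hat - p_hat^2); clipping by eps is an added safeguard (assumption)."""
    p = np.clip(p_hat, eps, 1 - eps)
    return n_trials / (p - p**2)

def lre_estimate(X, Y, W):
    """Solve (3.7): Theta = (X^T W X)^{-1} X^T W Y, with diagonal W passed as a vector."""
    XtW = X.T * W                      # scales column m of X^T by W[m]
    return np.linalg.solve(XtW @ X, XtW @ Y)

# Synthetic check with noiseless data: the estimate recovers the true parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))           # rows play the role of Phi_n^T
theta_true = np.array([0.1, -0.2, 0.3])
Y = X @ theta_true                     # Y_n = p_hat_n - phi_1^(n)/sqrt(d)
W = rng.uniform(1.0, 2.0, size=20)
theta_hat = lre_estimate(X, Y, W)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical convenience; it returns the same $\hat{\Theta}_t$ as (3.7).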

We now show how to rewrite (3.7) in a recursive way. Deﬁne

$$Q_n = \left(\sum_{i=1}^{n} W_i \Phi_i \Phi_i^T\right)^{-1}, \qquad a_n = \left(\frac{1}{W_n} + \Phi_n^T Q_{n-1} \Phi_n\right)^{-1}. \tag{3.8}$$

Using the matrix inversion formula (see, e.g., page 19 of [72])

$$(A - BCD)^{-1} = A^{-1} + A^{-1}B(C^{-1} - DA^{-1}B)^{-1}DA^{-1},$$

for n = 2, ··· , Mt, we have

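$$Q_n = Q_{n-1} - a_n Q_{n-1} \Phi_n \Phi_n^T Q_{n-1}. \tag{3.9}$$

From (3.7)-(3.9), the recursive form of $\hat{\Theta}_n$ can be obtained as

$$\hat{\Theta}_n = \hat{\Theta}_{n-1} + a_n Q_{n-1} \Phi_n \left(\hat{p}_n - \frac{\phi_1^{(n)}}{\sqrt{d}} - \Phi_n^T \hat{\Theta}_{n-1}\right). \tag{3.10}$$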
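The recursion (3.8)-(3.10) can be sketched in a few lines and compared against the batch solution (3.7). The function name `rls_update` and the synthetic data are assumptions for illustration; the initial $Q$ and $\hat{\Theta}$ are taken from a batch solve on the first few equations so that $Q$ exists.

```python
import numpy as np

def rls_update(theta, Q, Phi, y, W):
    """One step of (3.8)-(3.10), where y = p_hat_n - phi_1^(n)/sqrt(d)."""
    QPhi = Q @ Phi
    a = 1.0 / (1.0 / W + Phi @ QPhi)               # a_n in (3.8)
    theta_new = theta + a * QPhi * (y - Phi @ theta)  # (3.10)
    Q_new = Q - a * np.outer(QPhi, QPhi)           # (3.9), using Q = Q^T
    return theta_new, Q_new

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 3))
Y = rng.normal(size=12)
W = rng.uniform(1.0, 3.0, size=12)

# Initialize from the first four equations, then add the rest one at a time.
Q = np.linalg.inv((X[:4].T * W[:4]) @ X[:4])
theta = Q @ ((X[:4].T * W[:4]) @ Y[:4])
for n in range(4, 12):
    theta, Q = rls_update(theta, Q, X[n], Y[n], W[n])

# Batch solution (3.7) on all twelve equations, for comparison.
theta_batch = np.linalg.solve((X.T * W) @ X, (X.T * W) @ Y)
```

Each update costs a few matrix-vector products in the dimension of $\Theta$, regardless of how many equations have already been absorbed.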

Observing (3.8)-(3.10), we find that the historical data are “compressed” in $\hat{\Theta}_{n-1}$ and $Q_{n-1}$. Each time new data arrive and the estimate is updated, the historical data participate in the computation as a whole, instead of one by one. Hence, the earlier calculations involving old data spare the update from repeating computations that have already been done. This is quite different from the MLE or BME method, where one has to go through all the historical data many times, which is computationally intensive. The algorithm in Sec. 2.2.1 shows that when fresh measurement data from a new POVM arrive, the historical calculation is of little use in reducing the computational burden of the new round of searching.

A further necessary procedure of our protocol is physical projection. Specifically, using the solution $\hat{\Theta}_{M_K}$ in (3.5) and the relationship in (3.1), we can obtain a Hermitian matrix $\hat{\mu}$ with $\mathrm{Tr}\,\hat{\mu} = 1$. However, $\hat{\mu}$ may have negative eigenvalues and be nonphysical, due to the randomness of the measurement results or the finite number of copies. In this work, the final physical estimate $\hat{\rho}^{ph}$ is chosen to be the closest density matrix to $\hat{\mu}$ under the Frobenius norm; i.e.,

$$\hat{\rho}^{ph} = \arg\min_{\rho \in \mathcal{D}_d} \|\rho - \hat{\mu}\|. \tag{3.11}$$

In standard state reconstruction algorithms, this task is computationally intensive [130]. However, we can employ the fast algorithm in [130] with computational complexity $O(d^3)$ to solve (3.11), since we have a Hermitian estimate $\hat{\mu}$ with $\mathrm{Tr}\,\hat{\mu} = 1$. It can be verified that pulling $\hat{\mu}$ back to a physical state $\hat{\rho}^{ph}$ can further reduce the mean squared error [137]. Using this technique, we project the pseudo estimate $\hat{\mu}$ to the physical space $\mathcal{D}_d$ composed of all density matrices and obtain the final estimate.
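One way to carry out the projection (3.11) is to project the eigenvalues of $\hat{\mu}$ onto the probability simplex while keeping its eigenvectors, which costs one eigendecomposition, i.e., $O(d^3)$. The sketch below follows that common eigenvalue-truncation scheme; it is assumed here, not confirmed, that this matches the algorithm of [130], and the function name is hypothetical.

```python
import numpy as np

def project_to_density_matrix(mu):
    """Closest density matrix (Frobenius norm) to a Hermitian, trace-one matrix mu."""
    vals, vecs = np.linalg.eigh(mu)            # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]     # reorder to descending
    d = len(vals)
    lam = np.zeros(d)
    acc = 0.0                                  # negative weight removed so far
    i = d                                      # number of eigenvalues still kept
    # Zero the smallest eigenvalues while spreading the deficit would leave one negative.
    while i > 1 and vals[i - 1] + acc / i < 0:
        acc += vals[i - 1]
        i -= 1
    lam[:i] = vals[:i] + acc / i
    return (vecs * lam) @ vecs.conj().T

# Nonphysical example: one negative eigenvalue, trace equal to one.
mu = np.diag([0.6, 0.5, 0.1, -0.2])
rho_ph = project_to_density_matrix(mu)
```

In this example the negative eigenvalue is set to zero and its weight is shared equally among the three remaining eigenvalues, so the trace stays equal to one.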

Finally, it should be pointed out that the physical projection procedure should not directly interfere with the update of the estimate, so that the raw data remain intact. Specifically, each time we obtain an updated estimate $\hat{\Theta}_{n-1}$, which can be non-physical, we employ physical projection to obtain a temporary genuine estimate $\hat{\rho}^{ph}_{n-1}$. Based on $\hat{\rho}^{ph}_{n-1}$, we employ optimization algorithms (like that in Sec. 3.2.4) to determine the measurement basis $\Phi_n$ chosen for the $n$-th step. When the measurement data of the $n$-th step arrive, $\hat{\Theta}_n$ should be obtained using (3.10) on the basis of $\hat{\Theta}_{n-1}$ instead of $\hat{\rho}^{ph}_{n-1}$, which is a key point in simulation.

We would like to stress two advantages of the recursive LRE method: (a) as we have demonstrated in [113], the recursive LRE method can greatly reduce the cost of computation in comparison with the MLE method; (b) the recursive LRE algorithm is naturally suitable for optimizing measurements adaptively. The argument for advantage (b) is as follows. For state tomography, the optimal measurements generally depend upon the state to be reconstructed. By utilizing the recursive LRE algorithm, we can obtain the estimate of the real state in a computationally efficient way. Using the state estimate, the measurements to be performed can be adaptively optimized.

3.2.3 Optimization criterion

In this subsection, we illustrate how to deduce the criterion for optimization of the measurement basis; i.e., on what standard should we determine the forthcoming measurement basis, given a historical estimate of the state.

As pointed out in [58], as the number of copies $N_t$ becomes large, the only relevant measure of the quality of estimation becomes the mean squared error matrix $\mathbf{M}(\hat{\Theta}, \Theta) \triangleq E(\hat{\Theta} - \Theta)(\hat{\Theta} - \Theta)^T$, where $E(\cdot)$ denotes the expectation over all possible measurement results. To be specific, for a good estimation strategy, a reasonable expectation is that the elements of the mean squared error matrix decrease as $O(1/N_t)$, i.e., $\mathbf{M}_{ij}(\hat{\Theta}, \Theta) = E(\hat{\theta}_i - \theta_i)(\hat{\theta}_j - \theta_j) = O(1/N_t)$. Assume that $f(\hat{\Theta}, \Theta)$ is any smooth cost function that can measure how much the estimate $\hat{\Theta}$ ($\hat{\rho}$) differs from the true value $\Theta$ ($\rho$). From equations (1)–(3) in [58], there exist a function $f_0(\Theta)$ and a positive semidefinite matrix $C(\Theta)$ such that the mean value of $f(\hat{\Theta}, \Theta)$ under a reasonable estimation strategy will decrease as

$$E f(\hat{\Theta}, \Theta) = f_0(\Theta) + \frac{1}{2}\mathrm{Tr}\left(C(\Theta)\,\mathbf{M}(\hat{\Theta}, \Theta)\right) + o(1/N_t). \tag{3.12}$$

Note that $f_0(\Theta)$ and $C(\Theta)$ depend only on the cost function and the true state, while $\mathbf{M}(\hat{\Theta}, \Theta)$ depends on the true state as well as the estimate $\hat{\Theta}$. Hence, from (3.12), we can minimize the mean squared error matrix $\mathbf{M}(\hat{\Theta}, \Theta)$ to minimize any smooth cost function by choosing appropriate POVMs and suitable estimation strategies.

In the following, we only minimize $\mathrm{Tr}(\mathbf{M}(\hat{\Theta}, \Theta))$ instead of $\mathbf{M}(\hat{\Theta}, \Theta)$ itself for simplicity, although this is not equivalent. To give a criterion on how to optimize the POVMs, we first look at the mean squared error matrix of $\hat{\Theta}_n$. From (3.7) and (3.8), we have

$$E(\hat{\Theta}_n - \Theta)(\hat{\Theta}_n - \Theta)^T = Q_n X_n^T W_n \mathrm{Cov}(e_n) W_n X_n Q_n,$$

where $\mathrm{Cov}(e_n)$ is the covariance matrix of $e_n = (e_1, \cdots, e_n)^T$. To minimize the trace of $E(\hat{\Theta}_n - \Theta)(\hat{\Theta}_n - \Theta)^T$, we can minimize $Q_n$. We now present an intuitive justification for minimizing $Q_n$. It can be seen that if the weight matrix $W_n$ satisfies $W_n \approx \mathrm{Cov}(e_n)^{-1}$ when $N_t$ becomes large, then $E(\hat{\Theta}_n - \Theta)(\hat{\Theta}_n - \Theta)^T \approx Q_n$. Recall that the weight of the $i$-th linear regression equation is approximately equal to the inverse of $\mathrm{Cov}(e_n)_{ii}$. Moreover, if $e_i$ and $e_j$ correspond to different POVMs, they are independent, and so $\mathrm{Cov}(e_n)_{ij} = 0$. Therefore, we can adaptively choose POVMs to minimize $Q_n$.
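The approximation used here can be verified in one line. As a sketch, assume the idealized case in which the weights equal the inverse noise covariance exactly, $W_n = \mathrm{Cov}(e_n)^{-1}$, and recall from (3.8) that $Q_n = (X_n^T W_n X_n)^{-1}$:

```latex
\begin{aligned}
E(\hat{\Theta}_n-\Theta)(\hat{\Theta}_n-\Theta)^T
  &= Q_n X_n^T W_n \,\mathrm{Cov}(e_n)\, W_n X_n Q_n \\
  &= Q_n X_n^T W_n X_n Q_n \\
  &= Q_n Q_n^{-1} Q_n \;=\; Q_n .
\end{aligned}
```

In practice the equality holds only approximately, because the weights are estimated from data and outcomes of the same POVM are correlated.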

We illustrate some specifics of our RAQST protocol before minimizing $Q_n$. RAQST is generally divided into two stages. In the first stage, we perform a standard (static/non-adaptive) linear regression estimation on $N_0$ copies with the standard cube measurement bases (other common static bases are also applicable) to get a preliminary $\hat{\Theta}$ and $Q$ [113]. Next, in the second stage, we set the initial values $Q_0 = Q$ in (3.9) and $\hat{\Theta}_0 = \hat{\Theta}$ in (3.10), and then utilize the remaining $N_t - N_0$ copies for $K - 1$ steps of adaptive linear regression estimation.

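Suppose after $s$ steps, we get $Q_{M_s}$ and $\hat{\Theta}_{M_s}$, where $M_s = \sum_{k=1}^{s} M^{(j_k)}$. Recall that $\mathcal{M}^{(j_k)}$ denotes the POVM $\mathcal{M}^{(j_k)} = \{P_m^{(j_k)}\}_{m=1}^{M^{(j_k)}}$ being performed at the $k$-th step. If $s = 0$, $M_s = 0$. From (3.9), we can see that $Q_{M_s+1} \leq Q_{M_s}$, and

$$-g_{M_s+1} \triangleq \mathrm{Tr}(Q_{M_s+1}) - \mathrm{Tr}(Q_{M_s}) = -\frac{\Phi_{M_s+1}^T Q_{M_s}^2 \Phi_{M_s+1}}{\frac{1}{W_{M_s+1}} + \Phi_{M_s+1}^T Q_{M_s} \Phi_{M_s+1}}. \tag{3.13}$$

The remaining question is how to choose POVMs to increase the rate of decrease.

We can choose $P_i^{(j_{s+1})}$ ($\Phi_i^{(j_{s+1})}$) from the admissible measurement set $\mathcal{M} = \bigcup_{j=1} \mathcal{M}^{(j)} = \bigcup_{j=1} \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$ such that it maximizes $g_{M_s+1}$. In other words, we need to solve

$$\max_{\Phi_{M_s+1} \in \bigcup_{j=1} \{P_i^{(j)}\}_{i=1}^{M^{(j)}}} \frac{\Phi_{M_s+1}^T Q_{M_s}^2 \Phi_{M_s+1}}{\frac{1}{W_{M_s+1}} + \Phi_{M_s+1}^T Q_{M_s} \Phi_{M_s+1}}. \tag{3.14}$$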

In cases when $\mathcal{M}$ is a finite set determined by the practical experimental setting, one can simply enumerate all the candidates to determine the $\Phi_{M_s+1}$ which gives the largest corresponding $g_{M_s+1}$ value. Otherwise, when $\mathcal{M}$ is infinite, the general solution to (3.14) remains an open problem, and we present a heuristic answer in Sec. 3.2.4.

Once the new measurement basis $\Phi_{M_s+1}$ (i.e., $P_i^{(j_{s+1})}$) is chosen, we perform the corresponding POVM $\mathcal{M}^{(j_{s+1})} = \{P_k^{(j_{s+1})}\}_{k=1}^{M^{(j_{s+1})}}$ at the $(s+1)$-th step. By doing this, we obtain $M^{(j_{s+1})}$ linear regression equations. Thus, we can utilize (3.9) and (3.10) to get $Q_{M_{s+1}}$ and $\hat{\Theta}_{M_{s+1}}$, where $M_{s+1} = \sum_{k=1}^{s+1} M^{(j_k)}$. The above procedure is repeated until all the copies are consumed.

Two points deserve attention in the above RAQST protocol. The first is that when choosing $P_i^{(j_{s+1})}$ to maximize $g_{M_s+1}$, we cannot obtain information about $\hat{p}_{M_s+1}$ in $W_{M_s+1}$ because we have not yet performed the experiment. From (3.4), we can use its estimate

$$\tilde{p}_{M_s+1}(\hat{\Theta}_{M_s}) = \phi_{1,(j_{s+1})}^{(i)}/\sqrt{d} + \hat{\Theta}_{M_s}^T \Phi_i^{(j_{s+1})} \tag{3.15}$$

to replace $\hat{p}_{M_s+1}$. The second is how to determine, given the total number of copies $N_t$, the number of copies $N_0$ used in the first stage and the number of adaptive steps $K - 1$ in the second stage. The optimal values remain open, while we give effective empirical formulas for two-qubit systems as an example in Sec. 3.3.
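For a finite admissible set, the enumeration described above can be sketched as follows. The objective of (3.14) is evaluated for each candidate, with the unknown frequency in the weight replaced by the prediction (3.15). The function names, the argument `n_copies` (copies planned for the candidate POVM) and the clipping constant `eps` are assumptions of this example.

```python
import numpy as np

def predicted_gain(Phi, phi1, theta_hat, Q, n_copies, d, eps=1e-6):
    """Objective of (3.14) for one candidate, using the predicted probability (3.15)."""
    p_tilde = phi1 / np.sqrt(d) + theta_hat @ Phi      # (3.15)
    p_tilde = np.clip(p_tilde, eps, 1 - eps)           # added safeguard (assumption)
    W = n_copies / (p_tilde - p_tilde**2)              # predicted weight
    QPhi = Q @ Phi
    return (QPhi @ QPhi) / (1.0 / W + Phi @ QPhi)      # Phi^T Q^2 Phi / (1/W + Phi^T Q Phi)

def choose_measurement(candidates, theta_hat, Q, n_copies, d):
    """Enumerate candidates given as (phi1, Phi) pairs; return the best index and all gains."""
    gains = [predicted_gain(Phi, phi1, theta_hat, Q, n_copies, d)
             for phi1, Phi in candidates]
    return int(np.argmax(gains)), gains

# Toy example with a two-parameter model: Q is large along the first direction,
# so a candidate probing that direction should give the larger trace reduction.
Q = np.diag([1.0, 0.01])
theta_hat = np.zeros(2)
candidates = [(1 / np.sqrt(2), np.array([0.5, 0.0])),
              (1 / np.sqrt(2), np.array([0.0, 0.5]))]
best, gains = choose_measurement(candidates, theta_hat, Q, n_copies=100, d=2)
```

The returned gain is the predicted decrease of $\mathrm{Tr}(Q)$ in (3.13), so the selected candidate is the one expected to shrink the trace of $Q$ the most in a single update.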

3.2.4 Two versions of RAQST

To (partly) answer Problem 3.1 and Problem 3.2, we develop a specialized version of our protocol, namely RAQST1, where the admissible measurement set consists of all the product measurements (including the product of standard cube measurement bases [38]). This kind of measurement is one of the simplest to realize in optical systems. In this section, we use the two-qubit case as an example to illustrate RAQST1, which can be readily extended to systems with three or more qubits.

We first consider Problem 3.1 and start from searching for a suboptimal solution to (3.14). Ref. [98] pointed out that, in order to reduce the infidelity significantly compared with non-adaptive protocols, we must accurately estimate the small eigenvalues of the state to be reconstructed, particularly those near-zero eigenvalues. To do this, a preferable choice is to take the projector measurement along with or close to the eigenvectors corresponding to the near-zero eigenvalues. Furthermore, we consider the following problem:

It is straightforward to find that $|\lambda_1\rangle$ is just a solution to Problem 3.3, which means that finding the eigenvector corresponding to the least eigenvalue amounts to minimizing the expected measurement value of a projector on the state. Inspired by this, at each iteration step we aim to find a product projector that minimizes $\tilde{p}_{M_s+1}$ in (3.15), and we name this procedure “product projector optimization”. This operation makes the corresponding regression equation as accurate as possible, since the variance of the relevant observation noise (3.3) is minimized. Furthermore, this procedure maximizes the weight factor $W_{M_s+1}$, thus promising to achieve a large $g_{M_s+1}$ according to (3.14). The validity of product projector optimization is supported by the above analysis. The product projector optimization can be modelled as a standard constrained extremum problem, and then solved by a simple iteration algorithm. The details are given in Appendix B. The measurement basis determined using product projector optimization might not always be the one that maximizes $g_{M_s+1}$, but it deserves to be added to the admissible set $\mathcal{M}^{(j_s)}$.
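To make the idea of product projector optimization concrete, the sketch below minimizes $\langle\psi_1\psi_2|\hat{\rho}|\psi_1\psi_2\rangle$ over two-qubit product states by alternating minimization: with one factor fixed, the best choice of the other is the eigenvector with the least eigenvalue of a $2\times 2$ reduced operator. This is an assumed illustrative scheme and is not claimed to be the iteration algorithm of Appendix B; the function name, iteration count and random initialization are hypothetical.

```python
import numpy as np

def product_projector_min(rho, iters=50, seed=0):
    """Alternating minimization of <psi1 psi2| rho |psi1 psi2> for a two-qubit rho."""
    rng = np.random.default_rng(seed)
    R = rho.reshape(2, 2, 2, 2)                  # R[a, b, c, d] = <ab| rho |cd>
    psi2 = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi2 /= np.linalg.norm(psi2)
    for _ in range(iters):
        # Fix psi2: reduced operator on qubit 1, then take its least eigenvector.
        A1 = np.einsum('b,abcd,d->ac', psi2.conj(), R, psi2)
        psi1 = np.linalg.eigh(A1)[1][:, 0]
        # Fix psi1: reduced operator on qubit 2, then take its least eigenvector.
        A2 = np.einsum('a,abcd,c->bd', psi1.conj(), R, psi1)
        psi2 = np.linalg.eigh(A2)[1][:, 0]
    psi = np.kron(psi1, psi2)
    value = float(np.real(psi.conj() @ rho @ psi))
    return psi1, psi2, value

# Example: a nearly pure Werner-type state, 0.997 * singlet + 0.003 * I/4.
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)
rho = 0.997 * np.outer(singlet, singlet) + 0.003 * np.eye(4) / 4
psi1, psi2, value = product_projector_min(rho)
```

Each half-step cannot increase the objective, so the value is non-increasing along the iteration. Like any alternating scheme, it may in general stop at a local rather than a global minimum.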

Now we come to Problem 3.2. In a single-qubit system, there can be at most one near-zero eigenvalue, and it thus suffices to estimate this unique one accurately. However, for multi-qubit systems, due to degeneracy or near-degeneracy, there can be more than one (at most $2^{N_q} - 1$ in $N_q$-qubit systems) near-zero eigenvalues, and their eigenvectors together can form a non-trivial linear subspace. The eigenvector of the least eigenvalue is only one component (or basis vector) of this subspace, and our target now is to estimate this whole subspace, instead of estimating just one component. To do this, we design a “cyclic eigenvalues” method as follows.

Suppose the current estimate of the state has spectral decomposition

$$\hat{\rho} = \sum_{i=1}^{d} \hat{\lambda}_i |\hat{\lambda}_i\rangle\langle\hat{\lambda}_i|,$$

where $0 \leq \hat{\lambda}_1 \leq \cdots \leq \hat{\lambda}_d \leq 1$. If we directly employ the product projector optimization algorithm in Appendix B on $\hat{\rho}$, we obtain a product projector $\Phi_1^{CE}$ which can accurately estimate $|\hat{\lambda}_1\rangle$. Now suppose there are altogether $j$ near-zero eigenvalues and we want to estimate $|\hat{\lambda}_k\rangle$, where $2 \leq k \leq j < d$. We can write down a dummy state $\hat{\rho}_k$.

We then employ the algorithm in Appendix B for this dummy state $\hat{\rho}_k$, and the result is a product projector $\Phi_k^{CE}$ which can accurately estimate $|\hat{\lambda}_k\rangle$. Using this method for all $2 \leq k \leq j$, we can thus accurately estimate all the near-zero eigenvalues.

In practice, we still need to determine the exact number of near-zero eigenvalues. Since the estimation infidelity decreases as $O(1/\sqrt{N_0})$ in the first static stage of our protocol [98], we set the threshold as $10/\sqrt{N_0}$. After the first stage, suppose we obtain $j$ eigenvalues in $\hat{\Theta}_0$ smaller than $10/\sqrt{N_0}$; then we need to accurately estimate the eigenvectors corresponding to the least $j$ eigenvalues of the state. In each of the forthcoming adaptive steps, we first add $\Phi_1^{CE}$ to the admissible set and choose the optimal one as the measurement, using $(N_t - N_0)/(K - 1)/j$ resources, and then we consume the remaining $(j - 1)(N_t - N_0)/(K - 1)/j$ resources evenly on the measurement bases $\Phi_2^{CE}, \Phi_3^{CE}, \ldots, \Phi_j^{CE}$.
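The thresholding and resource split described above amount to simple bookkeeping, sketched below. The function names are hypothetical, and the sketch assumes $j \geq 1$ and that fractional copy numbers are rounded elsewhere.

```python
import numpy as np

def near_zero_count(rho_hat, n0):
    """Number of eigenvalues of the first-stage estimate below 10/sqrt(N0)."""
    threshold = 10 / np.sqrt(n0)
    return int(np.sum(np.linalg.eigvalsh(rho_hat) < threshold))

def copies_per_adaptive_step(n_total, n0, K, j):
    """Split the copies of one adaptive step between the optimized measurement
    and the j - 1 cyclic-eigenvalue bases; each receives (Nt - N0)/(K - 1)/j."""
    if j < 1:
        raise ValueError("this sketch assumes at least one near-zero eigenvalue")
    share = (n_total - n0) / (K - 1) / j
    return share, [share] * (j - 1)

# Example: a first-stage estimate with three small eigenvalues, N0 = 10^4.
rho_hat = np.diag([0.9, 0.05, 0.03, 0.02])
j = near_zero_count(rho_hat, n0=10**4)        # threshold is 10/sqrt(10^4) = 0.1
optimal_share, ce_shares = copies_per_adaptive_step(10**5, 10**4, K=4, j=j)
```

With these illustrative numbers each adaptive step has 30000 copies, divided equally among the optimized measurement and the two additional cyclic-eigenvalue bases.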

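Moreover, each adaptive product projective measurement basis that we perform in two-qubit RAQST1 must be expanded to a complete POVM. Specifically, if the chosen projector is $|\psi_1\rangle\langle\psi_1| \otimes |\psi_2\rangle\langle\psi_2|$, the corresponding POVM is $\{|\psi_1\rangle\langle\psi_1| \otimes |\psi_2\rangle\langle\psi_2|,\ |\psi_1^\perp\rangle\langle\psi_1^\perp| \otimes |\psi_2\rangle\langle\psi_2|,\ |\psi_1\rangle\langle\psi_1| \otimes |\psi_2^\perp\rangle\langle\psi_2^\perp|,\ |\psi_1^\perp\rangle\langle\psi_1^\perp| \otimes |\psi_2^\perp\rangle\langle\psi_2^\perp|\}$, where $|\psi_i^\perp\rangle$ is orthogonal to $|\psi_i\rangle$. This completes RAQST1.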
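The completion of a product projector to a four-outcome POVM can be written out directly. In this sketch the function names are hypothetical and the input single-qubit states are assumed to be normalized.

```python
import numpy as np

def orthogonal_qubit_state(psi):
    """Return a normalized state orthogonal to the normalized qubit state psi."""
    return np.array([-np.conj(psi[1]), np.conj(psi[0])])

def product_povm(psi1, psi2):
    """Four projectors in the order used in the text:
    (psi1, psi2), (psi1_perp, psi2), (psi1, psi2_perp), (psi1_perp, psi2_perp)."""
    psi1 = np.asarray(psi1, dtype=complex)
    psi2 = np.asarray(psi2, dtype=complex)
    psi1p = orthogonal_qubit_state(psi1)
    psi2p = orthogonal_qubit_state(psi2)
    povm = []
    for a, b in [(psi1, psi2), (psi1p, psi2), (psi1, psi2p), (psi1p, psi2p)]:
        v = np.kron(a, b)
        povm.append(np.outer(v, v.conj()))
    return povm

# Example with two arbitrary normalized qubit states.
psi1 = np.array([np.cos(0.3), np.exp(0.7j) * np.sin(0.3)])
psi2 = np.array([1, 1j]) / np.sqrt(2)
povm = product_povm(psi1, psi2)
```

The four elements are mutually orthogonal rank-one projectors that sum to the identity, so they form a valid projective measurement realizable with local operations on each qubit.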

For RAQST2, we further add the set of the eigenbases of the current state estimate into the admissible measurement set. This set, together with the product projector obtained in RAQST1, is enough to construct a satisfactory admissible set, and we thus omit the cyclic eigenvalues method. Note that the admissible measurement set in RAQST2 will involve non-local measurements in general, which may be difficult to perform reliably using current experimental techniques.


3.3 Numerical results

In this section, we present the numerical results. We perform numerical simulations of two-qubit tomography, using the LRE method by default, with six different measurement strategies: (i) standard cube measurements [38]; (ii) mutually unbiased bases (MUB) measurements; (iii) MUB half-half [98]; (iv) “known basis” [98]; (v) RAQST1: the admissible measurement set only contains the simplest product measurements; (vi) RAQST2: the admissible measurement set is not limited.

Each of the tomography protocols (iii)-(vi) is adaptive and consists of two stages. In the first stage, all of them use the standard cube measurements. For the MUB half-half, we first perform standard cube measurements on $N_t/2$ copies and obtain a preliminary estimate $\hat{\rho}_0$ via LRE, and then measure the remaining half of the copies so that one set of the bases is adaptively adjusted to diagonalize $\hat{\rho}_0$, and it together with another four sets of bases constitutes a complete set of MUB as proposed in [98]. In contrast to the MUB half-half, for the “known basis” [98], in the second stage we perform a set of measurements such that one of the five bases of the MUB is exactly the eigenbasis of the true state to be reconstructed. Although this is physically impossible with current technology, it is a useful comparison.

For the RAQST, we need to specify $N_0$, which is the number of copies measured in the first stage, and the number $K$ of the iteration steps. In principle, $K$ may depend on the preliminary estimate in the first stage. For simplicity, in this work, we give empirical formulas depending only upon the total number $N_t$ of copies. Note that in RAQST1 and RAQST2, the admissible measurement sets are different, and so are their empirical formulas. For RAQST1, $N_0^{(1)} = N_t/(1.3 + 0.1\log_{10} N_t)$, $K^{(1)} = \lfloor \log_{10} N_t \rfloor$, and for RAQST2, $N_0^{(2)} = N_t(0.8 - 0.01\log_{10} N_t)$, $K^{(2)} = \lfloor 1.5\log_{10} N_t - 1 \rfloor$. Obviously the formula for the resource distribution for RAQST2 applies only when $N_t$ is not too large.
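The empirical formulas above are easy to tabulate. The sketch below evaluates them for a given $N_t$; the function names are hypothetical, and rounding $N_0$ to an integer number of copies is left to the caller.

```python
import math

def raqst1_schedule(n_total):
    """Empirical first-stage size N0 and step number K for RAQST1."""
    n0 = n_total / (1.3 + 0.1 * math.log10(n_total))
    K = math.floor(math.log10(n_total))
    return n0, K

def raqst2_schedule(n_total):
    """Empirical first-stage size N0 and step number K for RAQST2."""
    n0 = n_total * (0.8 - 0.01 * math.log10(n_total))
    K = math.floor(1.5 * math.log10(n_total) - 1)
    return n0, K

# Example: total number of copies Nt = 10^4.
n0_1, K_1 = raqst1_schedule(10**4)
n0_2, K_2 = raqst2_schedule(10**4)
```

For $N_t = 10^4$ the RAQST1 formula assigns $10^4/1.7$ copies to the static stage, while RAQST2 assigns $0.76 \times 10^4$. In both cases more than half of the copies are spent before adaptation begins.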

[Figure 3.1, two panels. Panel (a) legend: Cube, MUB, MUB Half-half, Known Basis, RAQST1, RAQST2, GM Bound. Panel (b) legend: RAQST1 for Random MESs, RAQST1 for Random Pure States, RAQST2 for Random MESs, RAQST2 for Random Pure States.]

Figure 3.1: Simulated performance of the RAQST protocol for pure states.

We use Monte Carlo simulations to demonstrate the results. The figure of merit is the particularly well-motivated quantum infidelity [98], $1 - F(\rho, \hat{\rho}) = 1 - \mathrm{Tr}^2\left(\sqrt{\sqrt{\rho}\hat{\rho}\sqrt{\rho}}\right)$. Fig. 3.1(a) depicts the average infidelity versus $N_t$ for the maximally entangled state $\frac{|HV\rangle - |VH\rangle}{\sqrt{2}}$ with different tomography protocols. Each point is averaged over 100 realizations and the error bars are the standard deviation of the average. It can be seen that the average infidelity of the static tomography protocols (i.e., protocols (i) and (ii)) versus $N_t$ is of the order of $O(1/\sqrt{N_t})$. However, the Gill-Massar bound [58] for the infidelity in two-qubit state tomography is $\frac{75}{4N_t}$. This can be obtained by combining equations (5.29) and (A.8) in [170] (see Appendix C). It is clearly seen that, as compared to the static tomography protocols and the adaptive MUB half-half, the average infidelity using our RAQST protocol can be reduced to beat the Gill-Massar bound even with only the simplest product measurements. Furthermore, if there are no constraints on the admissible measurement set, RAQST2 can outperform the “known basis” tomography, and the average infidelity of RAQST2 versus $N_t$ can be significantly reduced to the order of the Gill-Massar bound, i.e., $O(1/N_t)$.
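The figure of merit and the bound quoted above can be evaluated as follows. The sketch computes the infidelity from eigendecompositions and the two-qubit Gill-Massar bound $75/(4N_t)$; the function names are hypothetical, and tiny negative eigenvalues caused by rounding are clipped to zero.

```python
import numpy as np

def psd_sqrt(A):
    """Square root of a Hermitian positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(A)
    vals = np.clip(vals, 0.0, None)            # remove rounding-level negatives
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def infidelity(rho, rho_hat):
    """1 - F with F = (Tr sqrt(sqrt(rho) rho_hat sqrt(rho)))^2."""
    s = psd_sqrt(rho)
    ev = np.clip(np.linalg.eigvalsh(s @ rho_hat @ s), 0.0, None)
    return 1.0 - np.sum(np.sqrt(ev)) ** 2

def gill_massar_bound_two_qubit(n_total):
    """Two-qubit bound 75 / (4 Nt) on the infidelity, as quoted in the text."""
    return 75.0 / (4.0 * n_total)

# Example: singlet state versus a slightly depolarized estimate.
psi = np.array([0, 1, -1, 0]) / np.sqrt(2)
rho = np.outer(psi, psi)
rho_hat = 0.9 * rho + 0.1 * np.eye(4) / 4
inf_example = infidelity(rho, rho_hat)
gm_example = gill_massar_bound_two_qubit(10**4)
```

For a pure true state the fidelity reduces to $\langle\psi|\hat{\rho}|\psi\rangle$, which gives a quick sanity check of the general formula.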

[Figure 3.2, two panels. Legend in both panels: Cube, MUB, MUB Half-half, Known Basis, RAQST1, RAQST2, GM Bound.]

Figure 3.2: Simulated performance of the RAQST protocol for mixed states.

Fig. 3.1(b) shows the histogram for RAQST over 200 randomly selected pure states and 200 maximally entangled states (MESs) when the total number of copies is $N_t = 10^4$ for each random state. Random pure states are created using the algorithm in [173]. Since all the maximally entangled states are equivalent under local unitary operations, they are created by applying randomly generated local unitary operators [100] to the same maximally entangled state $\frac{1}{\sqrt{2}}(0, 1, -1, 0)^T$. For each generated state, the RAQST protocol is repeated 200 times. We adopt the index

$$I_P = \frac{\log_{10}\mathrm{Cube} - \log_{10}\mathrm{RAQST}}{\log_{10}\mathrm{Cube} - \log_{10}\mathrm{GM}}$$

to evaluate the performance of our RAQST protocol. Here, Cube and RAQST represent the average infidelity between the corresponding estimate and the true state when the standard cube measurement bases and the RAQST are utilized, respectively, while GM is the Gill-Massar bound. Note that if $I_P > 0$, our adaptive protocol surpasses the standard measurement strategy, while if $I_P > 1$, our adaptive protocol beats the Gill-Massar bound. From Fig. 3.1(b) we can see that our RAQST protocol is particularly effective for the class of maximally entangled states, which are important resources in quantum information.
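The index $I_P$ is a ratio of logarithmic improvements and can be computed directly from three average infidelities. The function name and the numerical values in the example are illustrative assumptions, not simulation results from this chapter.

```python
import math

def performance_index(cube_infidelity, raqst_infidelity, gm_bound):
    """I_P = (log10 Cube - log10 RAQST) / (log10 Cube - log10 GM)."""
    num = math.log10(cube_infidelity) - math.log10(raqst_infidelity)
    den = math.log10(cube_infidelity) - math.log10(gm_bound)
    return num / den

# Illustrative values only: Cube at 1e-2, Gill-Massar bound at 1e-3.
ip_equal_to_bound = performance_index(1e-2, 1e-3, 1e-3)   # RAQST exactly at the bound
ip_no_gain = performance_index(1e-2, 1e-2, 1e-3)          # RAQST no better than Cube
ip_beyond_bound = performance_index(1e-2, 5e-4, 1e-3)     # RAQST below the bound
```

The index equals 0 when RAQST matches the cube measurements and 1 when it matches the Gill-Massar bound, so values above 1 correspond to beating the bound.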

Fig. 3.2(a) depicts the average infidelity versus $N_t$ with different tomography methods for the state

$$\rho = 0.997\,\frac{(|HV\rangle - |VH\rangle)(\langle HV| - \langle VH|)}{2} + 0.003\,\frac{I}{4},$$

which has purity $\mathrm{Tr}(\rho^2) = 0.9955$. Each point is averaged over 200 realizations, and the error bars are the standard deviation of the average. Note that there are kinks in the four curves corresponding to the four different adaptive protocols (iii)-(vi). We can see that each of the four curves can be divided into three segments from left to right.

In the first segment, the infidelity decreases quickly as $N_t$ increases, and then the curves go into the second segment where the infidelity decreases slowly. Finally, when the resource number is large, the infidelity decreases quickly again as $N_t$ increases. This is because infidelity is hypersensitive to misestimation of small eigenvalues, as pointed out in [98]. Whether a small number is close to zero is in fact a relative notion, instead of an absolute notion. When the resource number is not large enough to discriminate the near-zero eigenvalues from zero, the state “looks” pure from the viewpoint of the data, and the performance is thus the same as for pure-state tomography. Hence, the infidelity decreases as $O(1/N_t)$ at first. When the resource number increases to a level where the near-zero eigenvalues start to take effect, it is hard to estimate them accurately, so the decay rate of the infidelity decreases. Once the resource number is large enough to clearly discriminate the near-zero eigenvalues from zero, we are in essence performing mixed-state tomography, for which the infidelity again decreases as $O(1/N_t)$, as predicted in [98]. This completes the explanation of the three segments. A more detailed explanation of this phenomenon can be found in [164].

From Fig. 3.2(a) it can be further seen that our RAQST1 can beat the static tomography protocols and the adaptive MUB half-half protocol even with the simplest product measurements. The infidelity can be further reduced by using RAQST2, and when the total number of copies $N_t \geq 10^{4.5}$, the infidelity can be reduced to $O(1/N_t)$.

Fig. 3.2(b) shows the average infidelity versus purity when the total number of copies for each state is fixed as $N_t = 10^4$. The states to be estimated are

$$\alpha\,\frac{(|HV\rangle - |VH\rangle)(\langle HV| - \langle VH|)}{2} + \beta\,\frac{I}{4},$$

where $\alpha, \beta \geq 0$ and satisfy $\alpha + \beta = 1$. Each point is averaged over 1000 realizations.
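The family of states used here, and the purity value quoted for $\alpha = 0.997$, can be reproduced with a few lines. The function names are hypothetical; the basis ordering $(HH, HV, VH, VV)$ is an assumption consistent with the vector $\frac{1}{\sqrt{2}}(0, 1, -1, 0)^T$ used in the text.

```python
import numpy as np

def werner_state(alpha):
    """alpha * |singlet><singlet| + (1 - alpha) * I/4 in the (HH, HV, VH, VV) basis."""
    psi = np.array([0, 1, -1, 0]) / np.sqrt(2)
    return alpha * np.outer(psi, psi) + (1 - alpha) * np.eye(4) / 4

def purity(rho):
    """Tr(rho^2)."""
    return float(np.real(np.trace(rho @ rho)))

# Purity of the state used in Fig. 3.2(a), and the two extreme cases.
purity_0997 = purity(werner_state(0.997))
purity_mixed = purity(werner_state(0.0))
purity_pure = purity(werner_state(1.0))
```

Expanding $\mathrm{Tr}(\rho^2)$ for this family gives the closed form $(1 + 3\alpha^2)/4$, which ranges from $0.25$ for the completely mixed state to $1$ for the pure singlet.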


The results show that when the states have a high level of purity, our RAQST1 with the simplest product measurements can beat the MUB protocol. However, as the state becomes more mixed ($\mathrm{Tr}(\rho^2)$ decreases), using MUB measurements for state tomography can perform better than using the adaptive product measurements. This is due to the essential limit of product measurements on mixed states. As pointed out in [58], nonlocal measurements on a mixed state can extract more information. Thus, to estimate mixed states, it is better to use nonlocal measurements, e.g., MUB measurements. It is also clear that the infidelity achieved by using RAQST2 is much lower than that using MUB, and can beat the Gill-Massar bound for a wide range of quantum states.

3.4 Experimental results

In this section, we report the experimental results using our RAQST protocol for two-qubit quantum state tomography. The experiment was performed by our collaborators Zhibo Hou, Han-Sen Zhong, Li Li, Guo-Yong Xiang, Chuan-Feng Li and Guang-Can Guo at the University of Science and Technology of China. Since it is difficult to perform nonlocal measurements in real experiments, we only experimentally implement tomography protocols using (i) standard cube measurements and (v) RAQST1.

Figure 3.3: Two-qubit state tomography experimental setup, adopted from [114].

As shown in Fig. 3.3, the experimental setup includes two modules: state preparation (gray) and adaptive measurement (light blue). In the state preparation module, a pair of polarization-entangled photons with a central wavelength of $\lambda = 702.2$ nm is first generated when the continuous Ar+ laser at 351.1 nm with diagonal polarization pumps a pair of type I phase-matched β-barium borate (BBO) crystals whose optic axes are normal to each other [87]. The generation rate is about 3000 two-photon coincidence counts per second at a pump power of 60 mW. Half-wave plates (HWPs) at both ends of the two single mode fibers are used to control polarization. Then, one photon is either reflected by or transmitted through a 50/50 beam splitter (BS). In the transmission path, a quarter-wave plate (QWP) is tilted to compensate the phase of the two-photon state for the generation of $\frac{|HV\rangle - |VH\rangle}{\sqrt{2}}$. In the reflected path, three 446 λ quartz crystals and a half-wave plate at 22.5° are used to dephase the two-photon state into a completely mixed state $I/4$. The ratio of the two states mixed at the output port of the second BS can be changed by the two adjustable apertures for the generation of an arbitrary Werner state of the form

$$\rho = \alpha\,\frac{(|HV\rangle - |VH\rangle)(\langle HV| - \langle VH|)}{2} + (1 - \alpha)\,\frac{I}{4}.$$

Note that since the coherence length of the photon is only 176 λ (due to the 4 nm bandwidth of the interference filter (IF)), much smaller than the optical path difference which is about 0.5 m, two states from the reflected and transmission path only mix at the second BS rather than coherently superpose. In the adaptive measurement module, the two-photon product measurements are realized by the combinations of quarter-wave plates, half-wave plates, polarizing beam splitters (PBSs), single photon detectors (SPDs) and a coincidence circuit. The rotation angles of QWPs and HWPs can be adaptively adjusted by a controller according to the analysis of the collected coincidence data on a computer.

[Figure 3.4, two panels. Legend in both panels: Cube simulation, Cube experiment, MUB simulation, RAQST1 simulation, RAQST1 experiment, GM Bound.]

Figure 3.4: Two-qubit state tomography experimental results.

The experimental results are depicted in Fig. 3.4. Dots are the average infidelity of simulation results with 1000 repetitive runs of RAQST1 (red), MUB (khaki) and standard cube measurements (magenta), and circles are the corresponding average infidelity of experimental results. Error bars are the standard deviation of the average. In the first experiment, as shown in Fig. 3.4(a), we realize RAQST1 and standard cube measurements tomography protocols for entangled states with a high level of purity, for different numbers of resources $N_t$ ranging from 251 to 251189. First, we calibrate the true state $\rho$ using RAQST1 with $N_t = 10^7$ copies so that the infidelity of the calibrated true state is 10 times smaller than the estimation accuracy achieved at $N_t = 251189$ with RAQST1. The purity of the calibrated state is 0.983. Systematic error is crucial in the experiments. Beam displacers, which separate extraordinary and ordinary light, act as PBSs and have an extinction ratio of about 10000:1. As the precision of the rotation stages of the QWPs and HWPs is 0.01°, the rotation error is determined by the calibration error of the optic axes, which is 0.1° in our experiment. Phase errors of the currently used true zero-order QWPs and HWPs are 1.2°, which dominate the systematic error of practically realized measurements. These error sources induce a systematic error in the estimated state, which can be characterized by its infidelity from the true state. The systematic error is of the order of $10^{-3}$ when the error sources take the above values. For resource number $N_t \geq 10^3$, the systematic error is of the same scale as or even larger than the statistical error due to finite resources ($N_t$ copies). To deal with this problem, we employ error-compensation measurements [74] to reduce the systematic error to the order of $10^{-5}$. In the error-compensation measurement technique, multiple nominally equivalent measurement settings are applied to sub-ensembles such that the systematic errors cancel out to first order. Tomography experiments using both RAQST1 and standard cube measurements are repeated 10 times for each number of photon resources.

In the second experiment, as shown in Fig. 3.4(b), we realize tomography protocols using RAQST1 and standard cube measurements for Werner states with purities ranging from 0.25 to 0.98. The purities are changed by adjusting the apertures. Since the photon resource for each run of tomography protocols is only $10^4$, we use $10^6$ copies to calibrate the true state. There are 40 experimental runs and 1000 simulation runs for each of nine Werner states. In each RAQST experiment, four adaptive steps are used to optimize the measurements. To ensure measurement accuracy, error-compensation measurements are also employed.

In both of these experiments, our experimental results agree well with simulation results. The improvement of the RAQST1 protocol over the standard cube measurements strategy is significant. According to the simulation results of the MUB protocol and the experimental results of RAQST1, even with only the simplest product measurements, our RAQST1 can outperform the tomography protocols using MUBs for states with a high level of purity. Taking into account the trade-off between accuracy and implementation challenge, from Fig. 3.2 and Fig. 3.4, RAQST using the simplest product measurement seems to be the best choice for reconstructing entangled states with a high level of purity.


3.5 Summary and open problems

We have presented a new adaptive QST protocol using an adaptive LRE algorithm and reported a two-qubit experimental realization of the adaptive tomography protocol. In our RAQST protocol, no prior assumption is made on the state to be reconstructed. The infidelity of the adaptive tomography is greatly reduced and can even beat the Gill-Massar bound by adaptively optimizing the POVMs performed. We demonstrated that the fidelity obtained by using our RAQST with only the simplest product measurements can even surpass those obtained by using MUB and the two-stage MUB adaptive strategy, for states with a high level of purity. Considering the trade-off between accuracy and difficulty of implementation, it seems that RAQST using the product measurements is the best choice for reconstructing pure and nearly pure entangled states, which are among the most important resources for quantum information processing.

It is worth stressing that our RAQST protocol is flexible and extensible. For any finite-dimensional quantum system, once the admissible measurement set is given, we can utilize the adaptive measurement strategy to estimate an unknown quantum state. As demonstrated by numerical results, if experimental advances allow nonlocal measurements to be realized reliably, the admissible measurement set M can be enlarged, and our RAQST protocol can be better utilized accordingly.

A number of open questions deserve to be investigated:

(i) How to give a more effective formula for the parameters defining the second stage? Analytically derived formulas would be preferable over empirical ones, in particular allowing the parameters to depend upon the estimated state in the first stage. This is actually related to the tomography problem wherein some prior information is already known, e.g., pure entangled states, matrix-product states, low-rank states, etc. By taking full advantage of the prior information, an even more efficient RAQST protocol may be designed.

(ii) How to present a theoretical description of the convergence speed of the numerical algorithm in Appendix B? This would be helpful in characterizing the efficiency of the algorithm.

(iii) It remains open to find the general optimal solution to (3.14). The analytical solution in single-qubit systems has already been obtained [45], while for multi-qubit systems it is still difficult to solve.


Chapter 4

Quantum Hamiltonian identifiability via a similarity transformation approach and beyond

The work reported in this chapter has been partially published in the following articles:

1. Y. Wang, D. Dong, A. Sone, I. R. Petersen, H. Yonezawa, and P. Cappellaro, Quantum Hamiltonian identifiability via a similarity transformation approach and beyond, submitted to IEEE Transactions on Automatic Control, 2018.

2. Y. Wang, D. Dong, and I. R. Petersen, An approximate quantum Hamiltonian identification algorithm using a Taylor expansion of the matrix exponential function, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 5523-5528, Melbourne, Australia, December 2017.

3. Y. Wang, D. Dong, I. R. Petersen, and J. Zhang, An approximate algorithm for quantum Hamiltonian identification with complexity analysis, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 11744-11748, Toulouse, France, July 2017.


4.1 Introduction

The Hamiltonian is a fundamental quantity that governs the evolution of a quantum state, as described by the well-known Schrödinger equation (2.1). Hamiltonian identification is thus critical for tasks such as calibrating quantum devices [147] and characterizing quantum channels [40, 166]. Before performing identification experiments, a natural question arises: is the available data from a given experimental setting enough to identify (or determine) all the desired parameters in the Hamiltonian? We refer to such a problem as Hamiltonian identifiability. The solution to this problem is fundamental and necessary for designing experiments, and also gives us insights into the information extraction capability of certain probe systems.

There are several existing approaches to investigating the problems of quantum system identification [19, 54, 129] and identifiability [92]. For example, Ref. [27] proved that controllable quantum systems are indistinguishable if and only if they are related through a unitary transformation, which can be developed into an identifiability method for controllable systems. The identifiability problem for a Hamiltonian corresponding to a dipole moment was investigated in [22]. The identification of spin chains has been extensively investigated in [23, 24, 25, 28, 52, 53]. Ref. [63] presented identifiability conditions for parameters in passive linear quantum systems, and the requirement of passivity was removed in [91]. Control signals to enhance the observability of the quantum dipole moment matrix were introduced in [89]. Zhang and Sarovar [165] proposed a Hamiltonian identification method based on measurement time traces. Sone and Cappellaro [132] employed a Gröbner basis to test the Hamiltonian identifiability of spin-1/2 systems, and their method is also applicable to general finite-dimensional systems.

We assume the dimension [131] and structure (e.g., the coupling types) [82] of the Hamiltonian are already determined, and the task is to identify unknown parameters in the Hamiltonian. It is natural to resort to identifiability test methods in the classical (non-quantum) control field to tackle the quantum Hamiltonian identifiability problem. Common classical methods include the Laplace transform approach [8], the Taylor series expansion approach [112] and the Similarity Transformation Approach (STA) [142, 143, 145]. For a review, see [59, 105, 148]. The main idea of the Laplace transform approach is to determine the number of solutions of the multivariate equations formed by the coefficients of the transfer function. In contrast, the STA method transforms the identifiability problem into finding the existence of unequal solutions of similarity equations generated by a minimal system’s equivalent realizations, thus providing a chance to avoid directly solving multivariate polynomial equations, a considerable advantage in the case of high-dimensional systems or incomplete prior information. In this chapter, we extend the classical STA method to quantum Hamiltonian identifiability. We generalize and improve STA-based identifiability criteria, which are applicable to both classical control and quantum identification domains. We employ the STA method to analyze all the physical cases in [132] and present proofs for the associated identifiability conclusions.

We further propose a Structure Preserving Transformation (SPT) method for the STA-based identifiability analysis in non-minimal systems. In classical control, when faced with non-minimal systems, one usually prefers to change the system settings such that the system becomes minimal. In other words, the original settings are abandoned. This indirect solution is not applicable when the experimental settings are difficult to change or when we only expect to explore the information extraction capability of some particular physical probe systems. However, the SPT method provides a chance to preserve most of the key properties of the system after transformations while still performing identifiability analysis on its minimal subsystem. Hence, we employ the SPT method to prove that it is always possible to estimate one unknown parameter in the system matrix using a specifically designed experimental setting. This conclusion serves as an indicator for the existence of “economic” quantum Hamiltonian identification algorithms, whose computational complexity directly depends on the number of unknown parameters.

As an example, we provide two specific economic identification algorithms, where the computational complexity only depends on the number of unknown parameters and data length. Therefore, for physical systems with a small number of unknown parameters in the Hamiltonian, these economic identification algorithms can be efficient.

The structure of this chapter is as follows. In Sec. 4.2, we formulate the Hamiltonian identifiability problem in a linear systems framework, and present the classical Laplace transform approach and some necessary concepts. In Sec. 4.3, we introduce the general procedures of STA for identifiability problems, including the SPT method as a new tool for non-minimal systems. Sec. 4.4 applies the STA method to the three physical cases in [132]. Sec. 4.5 employs STA to indicate the existence of economic identification algorithms, and presents two examples of economic Hamiltonian identification algorithms. Sec. 4.6 concludes this chapter and introduces several open problems.

4.2 Model establishment

4.2.1 Problem formulation of Hamiltonian identifiability and identification

Since Hamiltonian identifiability and identification are two closely related problems, we start from their model establishment in a common framework; namely, we rephrase the framework in [165] to recast them as a linear system problem. Let H be the d-dimensional Hamiltonian to be identified, which can be parametrized as

$$H = \sum_{m=1}^{N_H} a_m(\vartheta) H_m, \tag{4.1}$$

where $\vartheta = (\vartheta_1, ..., \vartheta_{N_H})^T$ is a vector consisting of all the unknown real parameters, $N_H$ is the number of unknown parameters, $a_m$ are known functions of $\vartheta$ and $H_m$ are known Hermitian matrices (also called basis matrices). Let su(d) denote the Lie algebra consisting of all $d \times d$ skew-Hermitian traceless matrices. Then $\{iH_m\}_{m=1}^{d^2-1}$ can be chosen as an orthonormal basis of su(d), where the inner product is defined as $\langle iH_m, iH_n \rangle = \mathrm{Tr}(H_m^\dagger H_n)$. The traceless assumption is reasonable because $H$ has an intrinsic degree of freedom (see [150] for details).

Let Sjkl be the real structure constants of su(d), which satisfy

$$[iH_j, iH_k] = \sum_{l=1}^{d^2-1} S_{jkl}\,(iH_l),$$

where $j, k = 1, ..., d^2 - 1$. If $H_k$ is the observable, then the experimental data is obtained from Born’s rule

$$x_k = \mathrm{Tr}(H_k \rho). \tag{4.2}$$

The state evolution is described by the Liouville-von Neumann equation (2.2).

The identiﬁability is determined by the system structure. Hence, it is usually assumed that there are no imperfections in the available experimental data, which is the reason we identify theoretical values with practical data in (4.2).

From (4.1)-(4.2) and (2.2), we have

$$\dot{x}_k = \sum_{l=1}^{d^2-1} \Big( \sum_{m=1}^{N_H} S_{mkl}\, a_m(\vartheta) \Big) x_l. \tag{4.3}$$

If we directly rewrite (4.3) in matrix form, the dimension of the system matrix would be $d^2 - 1$, which is quite large for multi-qubit systems. To reduce the dimension, first consider the operators $O_i$ that we can directly measure in practice. We expand $O_i$ as $O_i = \sum_j o_j H_j$, and collect all the $H_j$ that appear in the expansion of all the $O_i$ as $K = \{H_{v_1}, ..., H_{v_p}\}$. Also, we collect all the $H_j$ that appear in the expansion of $H$ as $L = \{H_m\}_{m=1}^{N_H}$. Define an iterative procedure as

$$G^{(0)} = K, \qquad G^{(i)} = \{G^{(i-1)}, L\} \cup G^{(i-1)},$$

where $\{G^{(i-1)}, L\} \triangleq \{H_j \mid \mathrm{Tr}(H_j^\dagger [g, h]) \neq 0 \text{ for some } g \in G^{(i-1)},\ h \in L\}$. This iteration will terminate at a maximal set $\bar{G}$ (called the accessible set) because su(d) is finite-dimensional. We collect all the $x_i$ with $H_i \in \bar{G}$ in a vector $x$ of dimension $n$, and its dynamics satisfy the linear system equation

$$\dot{x} = Ax. \tag{4.4}$$

The elements in $A$ are the coefficients in (4.3), which are linear combinations of $a_m(\vartheta)$. $A$ is real and antisymmetric due to the antisymmetry of the structure constants. For some types of physical systems, the dimension $n$ can be much smaller than $d^2 - 1$. The output data can be denoted as

$$y = Cx, \tag{4.5}$$

where $C$ selects the entries in $x$ corresponding to the expectation values of the elements in $K$. Each specific experimental setting determines the initial state $x_0$ and the observation matrix $C$ of this linear system. Therefore, the quantum Hamiltonian identification problem can be formulated as follows:

Problem 4.1. Given the system matrix $A = A(\vartheta)$, initial state $x(0) = x_0$ and observation matrix $C$, design an algorithm to obtain an estimate $\hat{\vartheta}$ of $\vartheta$ from measurement data $\hat{y}$.
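The iterative construction of the accessible set described above can be sketched in a few lines for Hamiltonians whose basis matrices are Pauli strings. The sketch below is only an illustration under stated assumptions: Pauli strings are encoded as text (for example "ZYI"), phases are dropped because only set membership matters, and the helper names `pauli_product`, `anticommute`, `accessible_set` and `exchange_terms` are hypothetical. It relies on the fact that two Pauli strings have a non-zero commutator exactly when they anticommute, in which case the commutator is proportional to their product.

```python
def pauli_product(a, b):
    """Single-qubit Pauli product with the phase dropped (hypothetical helper)."""
    if a == "I":
        return b
    if b == "I":
        return a
    if a == b:
        return "I"
    return ({"X", "Y", "Z"} - {a, b}).pop()


def anticommute(g, h):
    """Pauli strings anticommute iff they differ on an odd number of
    sites where both act non-trivially."""
    clashes = sum(1 for a, b in zip(g, h) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1


def accessible_set(measured, hamiltonian_terms):
    """Iterate G(i) = {G(i-1), L} U G(i-1) until no new element appears."""
    accessible = list(measured)
    frontier = list(measured)
    while frontier:
        new = []
        for g in frontier:
            for h in hamiltonian_terms:
                if anticommute(g, h):
                    prod = "".join(pauli_product(a, b) for a, b in zip(g, h))
                    if prod not in accessible and prod not in new:
                        new.append(prod)
        accessible.extend(new)
        frontier = new
    return accessible


def exchange_terms(n_qubits):
    """Pauli strings X_i X_{i+1} and Y_i Y_{i+1} of a nearest-neighbour exchange chain."""
    terms = []
    for i in range(n_qubits - 1):
        for p in "XY":
            s = ["I"] * n_qubits
            s[i] = s[i + 1] = p
            terms.append("".join(s))
    return terms


# Three qubits, measuring X on the first qubit only.
G_bar = accessible_set(["XII"], exchange_terms(3))
print(G_bar)  # -> ['XII', 'ZYI', 'ZZX']
```

For this three-qubit chain the accessible set has only three elements, far fewer than the $d^2 - 1 = 63$ basis elements of su(8), which is the dimension reduction the procedure is meant to deliver.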

In this chapter we mainly consider a preceding question: for a system $A$, can we uniquely determine the unknown parameters, based on a given experimental setup (i.e., $x_0$ and $C$)? If not, then it may be necessary to redesign the experimental setup before starting the experiment. This is especially significant for quantum system identification, since implementing quantum experiments is usually expensive. The problem of identifiability is thus induced. Let $\vartheta$ denote the true value of the unknown parameter vector to be identified. Assume that the system under consideration has a parametric model structure with output data $PM(\vartheta)$, for a given experimental setup. The equation

$$PM(\vartheta) = PM(\vartheta') \tag{4.6}$$

means that the model with parameter set $\vartheta'$ outputs exactly the same data as the model with parameter set $\vartheta$. Identifiability then depends on the number of solutions to (4.6) for $\vartheta'$. We use the following definition from [148]:

Definition 4.1. [148] The model $PM$ is structurally globally identifiable (abbreviated as identifiable in the rest of this thesis), if for almost any value of $\vartheta$, (4.6) has only one solution $\vartheta' = \vartheta$.

Definition 4.1 is in essence the same as the definition of identifiability in [132]. It is necessary to ensure that identifiability holds for almost any value of the parameters because the number of solutions to (4.6) might change for some particular values of $\vartheta$, which are called atypical cases (to be illustrated later). Also, identifiability is determined by the system structure. Hence, we do not consider noise or uncertainty in the experimental data. A trivial necessary condition for a parameter to be identifiable is that it should appear in the system model $PM$, and in the following we only focus on this class of parameters.

4.2.2 Laplace transform approach and atypical cases

One of the most intuitive ways to solve identifiability problems is through the Laplace transform, which is also helpful in understanding concepts like atypical cases. Hence, we first briefly introduce the Laplace transform approach [148]. Consider the following standard MIMO linear system with zero initial condition:

$$\begin{cases} \dot{x} = A(\vartheta)x + B(\vartheta)u, \quad x(0) = 0, \\ y = C(\vartheta)x + D(\vartheta)u. \end{cases} \tag{4.7}$$

Throughout this thesis we use 4-tuples $\Sigma = (A, B, C, D)$ (or 3-tuples without $D$) to denote linear systems of the form (4.7). The Laplace transform solution to (4.7) is $Y(s, \vartheta) = T(s, \vartheta)U(s)$, where the transfer function matrix is $T(s, \vartheta) = C(\vartheta)[sI - A(\vartheta)]^{-1}B(\vartheta) + D(\vartheta)$. In the frequency domain, (4.6) is now

$$T(s, \vartheta)U(s) = T(s, \vartheta')U(s).$$

By cancelling $U(s)$, (4.6) is equivalent to

$$T(s, \vartheta) = T(s, \vartheta'), \quad \forall s \in \mathbb{C}. \tag{4.8}$$

Hence, the transfer function is exactly a tool to characterize identifiability. By writing (4.8) in a canonical form (e.g., transforming the numerators and denominators into monic polynomials) and equating coefficients on both sides of (4.8), one obtains a series of algebraic equations in $\vartheta$ and $\vartheta'$. If for almost any value of $\vartheta$, the solutions always satisfy $\vartheta' = \vartheta$, then the system is identifiable. From now on, we assume $a_m(\vartheta)$ are linear functions of $\vartheta$ for simplicity. In order to investigate identifiability, Sone and Cappellaro [132] employed a Gröbner basis to determine the conditions of identifiability. By directly solving (4.8) where the RHS is replaced by a specific transfer function reconstructed from experimental data, one can develop algorithms like that in [165] to identify the Hamiltonian.

The following property of the transfer function will be frequently used in the sequel:


Property 4.1. When a system undergoes a similarity transformation $x' = Px$ where $P$ is a nonsingular matrix, the transfer function remains the same, and thus the identifiability does not change.
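Property 4.1 can be checked numerically on a small example. The sketch below is an illustration only: the matrices $A$, $B$, $C$ and $P$ are arbitrary values chosen for the demonstration, and the helper names are hypothetical. It compares the Markov parameters $CA^kB$ of the original and transformed realizations, which are the coefficients of the expansion $T(s) = \sum_k CA^kB\, s^{-(k+1)}$ when $D = 0$. Exact rational arithmetic is used so the comparison involves no rounding.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Exact matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def inverse_2x2(M):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def markov_parameters(A, B, C, count):
    """Return [CB, CAB, CA^2B, ...] for a single-input single-output system."""
    params, AkB = [], B
    for _ in range(count):
        params.append(matmul(C, AkB)[0][0])
        AkB = matmul(A, AkB)
    return params


# Arbitrary illustrative realization (not taken from the thesis).
A = [[F(0), F(2)], [F(-2), F(-1)]]
B = [[F(1)], [F(0)]]
C = [[F(1), F(3)]]

# Similarity transformation x' = P x gives (P A P^-1, P B, C P^-1).
P = [[F(1), F(2)], [F(1), F(3)]]
P_inv = inverse_2x2(P)
A_new = matmul(matmul(P, A), P_inv)
B_new = matmul(P, B)
C_new = matmul(C, P_inv)

original = markov_parameters(A, B, C, 6)
transformed = markov_parameters(A_new, B_new, C_new, 6)
print(A_new != A, original == transformed)
```

The system matrix changes under the transformation, while the input-output description does not.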

We now illustrate atypical cases and hypersurfaces. Assume that the number of unknown parameters is $N_H$ and that we have no prior knowledge of the true values, so the candidate space for the parameters is $\mathbb{R}^{N_H}$. A hypersurface is a manifold or an algebraic variety of dimension $N_H - 1$, and it is usually obtained by adding an extra polynomial equation in the unknown parameters. Hypersurface sets have Lebesgue measure zero and can thus be neglected in practice. Atypical cases are subsets of hypersurfaces. Hence, analysis of atypical cases can also be omitted. When the complement of a hypersurface is open and dense in $\mathbb{R}^{N_H}$ and has full measure, it is often called a generic set [133]. For rigor, the phrase “almost always” is usually employed to indicate that atypical cases have already been neglected. We give an example of atypical cases from the point of view of transfer functions, similar to Example 3.1 in [148]. Consider a system with unknown parameters $\vartheta_1$ and $\vartheta_2$ and the transfer function

$$T(s, \vartheta) = \frac{\vartheta_1}{s + \vartheta_1 + \vartheta_2}. \tag{4.9}$$

The algebraic equations from (4.8) are thus $\vartheta_1' = \vartheta_1$ and $\vartheta_1' + \vartheta_2' = \vartheta_1 + \vartheta_2$. Therefore, the system (4.9) is generally identifiable, except in the case of $\vartheta_1 = 0$, which leads to a zero transfer function and erases all the information about $\vartheta_2$. Since $\vartheta_1 = 0$ is an atypical case, we can omit it and conclude that this system is (almost always) identifiable. In the rest of this thesis we will omit “almost always” if there is no ambiguity.
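The atypical case in this example is easy to reproduce. The following sketch, with hypothetical helper names and arbitrarily chosen sample points, evaluates the transfer function (4.9) exactly at a few values of $s$ and compares pairs of parameter sets. It is a sampling check under these assumptions rather than a proof.

```python
from fractions import Fraction as F


def transfer(s, theta1, theta2):
    """Transfer function (4.9): T(s) = theta1 / (s + theta1 + theta2)."""
    return F(theta1) / (F(s) + theta1 + theta2)


# Sample points chosen so that the denominator never vanishes below.
SAMPLES = [1, 2, 3, 5, 8]


def same_output(p, q):
    """True if parameter sets p and q give the same T(s) at every sample."""
    return all(transfer(s, *p) == transfer(s, *q) for s in SAMPLES)


# Generic case: changing theta2 changes the transfer function.
print(same_output((2, 3), (2, 4)))
# Atypical case theta1 = 0: T(s) is identically zero, so theta2 is invisible.
print(same_output((0, 3), (0, 7)))
```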


4.3 Similarity Transformation Approach

4.3.1 General procedures for minimal systems

Strictly speaking, the word “minimal” describes a realization of a transfer function that is both controllable and observable. In this thesis, we apply the word to the system itself and call a system “minimal” if it is both controllable and observable.

Let $\theta$ be the true value generating the system (4.7). Suppose that there is an alternative value $\theta'$ generating the same output data. Then $\theta'$ gives an alternative realization:

$$\begin{cases} \dot{x}' = A(\theta')x' + B(\theta')u, \quad x'(0) = 0, \\ y = C(\theta')x' + D(\theta')u. \end{cases} \tag{4.10}$$

Suppose that the system realization (4.7) is minimal. Then (4.10) is also minimal since the two realizations have the same dimension. From Kalman’s algebraic equivalence theorem [81], minimal realizations of a transfer function are equivalent; i.e., they are related by a similarity transformation:

$$\begin{cases} A(\theta) = S^{-1}A(\theta')S, \\ B(\theta) = S^{-1}B(\theta'), \\ C(\theta) = C(\theta')S, \\ D(\theta) = D(\theta'), \end{cases} \tag{4.11}$$

where $S$ is an invertible matrix. We call equations (4.11) the STA equations. We take $S$, $\theta$ and $\theta'$ as unknown variables and search for their solutions. The solvability of (4.11) is guaranteed because it always has the trivial solution $S = I$ and $\theta' = \theta$. If all the solutions satisfy $\theta' = \theta$, then the system (4.7) is identifiable. Otherwise it is unidentifiable. In cases when the signs of $\theta$ are not considered, one can check whether all the solutions to the STA equations satisfy $|\theta'| = |\theta|$ to determine the identifiability.


4.3.2 General procedures for non-minimal systems

If the system is not minimal, Kalman’s algebraic equivalence theorem (and hence the STA equations) can only be applied to the part that is both controllable and observable; i.e., the minimal subsystem. If one ignores whether the system is minimal and directly employs the solution to the STA equations to test the identifiability, an incorrect conclusion might be obtained. For example, consider the following 2-dimensional system:

Example 4.1.

$$\begin{cases} \dot{x} = \begin{pmatrix} \vartheta_1 & 0 \\ 0 & \vartheta_2 \end{pmatrix} x + \begin{pmatrix} 1 \\ 0 \end{pmatrix} u, \quad x(0) = 0, \\ y = (1 \ \ 0)\,x. \end{cases} \tag{4.12}$$

This system (4.12) is uncontrollable and unobservable. If one directly solves the STA equations, the conclusion is that it is identifiable. However, since $x_2$ evolves independently as $\dot{x}_2 = \vartheta_2 x_2$, the output $y = x_1$ never contains any information about $x_2$ or $\vartheta_2$. Hence, $\vartheta_2$ is in fact unidentifiable.

The fact that (4.8) is equivalent to (4.6) means a linear system’s identiﬁability is uniquely and completely determined by its transfer function. Therefore, unlike the situation using STA, non-minimal systems do not introduce extra requirements in the Laplace transform approach.

Regardless of controllability or observability, the transfer function of a system remains the same under similarity transformation. Therefore, for uncontrollable or unobservable systems, the solution using STA is [145]: (i) perform Kalman decomposition and obtain the controllable and observable (minimal) subsystem; (ii) write down the STA equations for the minimal subsystem; (iii) the original system is identifiable if and only if the solutions to the STA equations in (ii) all satisfy $\vartheta' = \vartheta$.

For Example 4.1, (4.12) is already in the Kalman canonical form and the minimal subsystem is $\dot{x}_1 = \vartheta_1 x_1 + u$, $y = x_1$, which does not involve $\vartheta_2$. Hence, $\vartheta_1$ is identifiable and $\vartheta_2$ is unidentifiable. This example also implies the following identifiability Criterion 4.1, which corresponds to the fact in [132] that the parameters that do not appear in the transfer function are unidentifiable.

Criterion 4.1. Suppose a system is non-minimal. Perform the Kalman decomposition to obtain its minimal subsystem and non-minimal subsystem. The unknown parameters that do not appear in the minimal subsystem are unidentiﬁable.
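Example 4.1 and Criterion 4.1 can be illustrated with a short exact computation. The sketch below uses hypothetical helper names and arbitrarily chosen numerical values for $\vartheta_1$ and $\vartheta_2$. It checks that the controllability and observability matrices of (4.12) are singular, and that the Markov parameters $CA^kB$, which determine the input-output behaviour, do not change when $\vartheta_2$ is changed.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Exact matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def example_system(theta1, theta2):
    """System (4.12): A = diag(theta1, theta2), B = (1, 0)^T, C = (1, 0)."""
    A = [[F(theta1), F(0)], [F(0), F(theta2)]]
    B = [[F(1)], [F(0)]]
    C = [[F(1), F(0)]]
    return A, B, C


def markov_parameters(theta1, theta2, count=6):
    """Return [CB, CAB, CA^2B, ...] for the system (4.12)."""
    A, B, C = example_system(theta1, theta2)
    params, AkB = [], B
    for _ in range(count):
        params.append(matmul(C, AkB)[0][0])
        AkB = matmul(A, AkB)
    return params


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A, B, C = example_system(2, 5)
AB = matmul(A, B)
CA = matmul(C, A)
controllability = [[B[0][0], AB[0][0]], [B[1][0], AB[1][0]]]  # [B, AB]
observability = [C[0], CA[0]]                                  # [C; CA]

# Both determinants vanish: the system is uncontrollable and unobservable.
print(det2(controllability), det2(observability))
# The input-output data is the same for different values of theta2.
print(markov_parameters(2, 5) == markov_parameters(2, -7))
```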

For a non-minimal system, even if all the unknown parameters appear in the minimal subsystem and the STA equations for the original system (rather than the minimal subsystem) exclude the solutions $\vartheta' \neq \vartheta$, this is not sufficient to guarantee the identifiability of the original system. A straightforward example can be obtained by substituting $\vartheta_1$ and $\vartheta_2$ in Example 4.1 with $\vartheta_1 + \vartheta_2$ and $\vartheta_1 - \vartheta_2$, respectively.

Although it is necessary to analyze the minimality before solving the STA equations in most situations, we ﬁnd a shortcut for some special cases.

Criterion 4.2. If the STA equations for a system have a (non-atypical) solution $\vartheta_0' \neq \vartheta_0$, the system is unidentifiable regardless of whether it is minimal or not.

For the proof of Criterion 4.2, we consider two specific realizations $(A(\vartheta_0), B(\vartheta_0), C(\vartheta_0), D(\vartheta_0))$ and $(A(\vartheta_0'), B(\vartheta_0'), C(\vartheta_0'), D(\vartheta_0'))$ for the system. According to the form of the STA equations (4.11), these two different (possibly non-minimal) realizations are related by a similarity transformation. By Property 4.1, they result in the same transfer function. Therefore, different system parameters generate the same system model. This means the system must be unidentifiable, which proves Criterion 4.2.

As pointed out in [42], the controllability and observability properties are neither sufficient nor necessary for identifiability. Example 4.1 has shown that non-minimal systems may be unidentifiable. Moreover, if one replaces $\vartheta_2$ in the system matrix of (4.12) with $\vartheta_1$, then the system becomes identifiable, which indicates non-minimal systems can also be identifiable.

Figure 4.1: Relationships between identifiability criteria.

In Fig. 4.1, we summarize all the results of Secs. 4.3.1 and 4.3.2. Note that for non-minimal systems Criterion 4.2 is necessary but not sufficient, unlike the case for minimal systems.

We would like to further emphasize the link between non-minimal systems and the Laplace transform approach. When one is deducing the transfer function (matrix), if pole-zero cancellation happens, then the system is in fact non-minimal; otherwise it is minimal. Hence, the transfer function matrix in (4.8) should be in the reduced form after possible pole-zero cancellation.


4.3.3 Structure Preserving Transformation method for non-minimal systems

The Structure Preserving Transformation (SPT) method is an idea we develop for identifiability analysis of non-minimal systems. Suppose there is a non-minimal system $\Sigma = (A, B, C, D)$ with state vector $x$. If Criterion 4.2 fails, traditionally we have to perform Kalman decomposition. We let $\bar{x} = Px$ such that the equivalent system $\bar{\Sigma} = (\bar{A}, \bar{B}, \bar{C}, \bar{D})$ has the Kalman canonical form. Then, we employ the STA equations for its minimal subsystem $\bar{\Sigma}_1 = (\bar{A}_1, \bar{B}_1, \bar{C}_1, \bar{D}_1)$, with the corresponding state vector $\bar{x}_1$ having a dimension smaller than that of $x$.

Quantum systems often generate clear structure properties in $A$. These structure properties may be completely disguised in the system $\bar{\Sigma}$, making the STA equations difficult to solve. This problem is seldom investigated in classical control theory, because for a classical system when faced with such problems, one prefers to change the system structure $(A, B, C, D)$ so that the system becomes minimal. On the contrary, quantum research sometimes investigates the physical capability of a certain fixed system setting and the initial quantum system states or the observables may be difficult to change. Therefore, changing $(A, B, C, D)$ may not be practical. How can we keep (some of) the structure properties of the original system $\Sigma$ and meanwhile perform STA analysis?

The idea of SPT is to further perform a similarity transformation on $\bar{\Sigma}$ to recover (some of) the structure properties of $\Sigma$, meanwhile preserving the canonically decomposed form. To do this, we let $\tilde{x} = (\tilde{P}^{-1} \oplus I)\bar{x}$ and obtain a system $\tilde{\Sigma} = (\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D})$, where $\tilde{P}^{-1}$ acts only on the minimal subsystem $\bar{\Sigma}_1$. Since the second transformation $\tilde{P}^{-1} \oplus I$ is block-diagonal, $\tilde{\Sigma}$ is still in the Kalman canonical form, and the matrices $(\tilde{A}_1, \tilde{B}_1, \tilde{C}_1, \tilde{D}_1)$ are submatrices of those in $\tilde{\Sigma}$, respectively. If $\tilde{P}$ is close to $P$ (in the form/appearance, not in norm), or $\tilde{P}^{-1}$ is close to $P^{-1}$, then we are likely to regain an $\tilde{A}_1$ similar to $A$, thus recovering key structure properties. Then we solve the STA equations for the minimal subsystem $\tilde{\Sigma}_1$ to determine the identifiability.

In the SPT method, $\tilde{P}$ can never be exactly equal to $P$, because their dimensions are different. The choice of $\tilde{P}$ is not unique and depends on the specific problem. One common choice is to let $\tilde{P}$ be a submatrix of $P$. An example using the SPT method is provided in Sec. 4.5.1.

4.4 Quantum Hamiltonian identifiability via STA

4.4.1 General framework

We clarify several points when using STA for analyzing Hamiltonian identifiability of a quantum system. For simplicity we only consider single input Hamiltonian systems (i.e., the state variable $x$ has only one column), while the result can be straightforwardly extended to multi-input systems. A quantum system of (4.4) and (4.5) with the initial state $x(0) = x_0$ is equivalent to the following zero-initial-state system:

$$\begin{cases} \dot{x} = Ax + Bu, \quad x(0) = 0, \\ y = Cx, \end{cases}$$

where $B = x_0$ and $u = \delta(t)$.

For a quantum Hamiltonian, x0 and C are usually determined and A is antisymmetric. We rewrite (4.11) as:

$$SA(\vartheta) = A(\vartheta')S, \tag{4.13}$$

$$Sx_0 = x_0, \tag{4.14}$$

$$C = CS, \tag{4.15}$$

together with the requirement that $S$ is nonsingular and other possible constraints on $\vartheta$ and $\vartheta'$. Eqs. (4.13)-(4.15) are the starting point for STA analysis for the rest of this chapter.

Next we use STA to test the identifiability of the single-probe-assisted spin-1/2 chain systems in [132], which have the form of a one-dimensional chain composed of multiple qubits, with their interaction governed by the system Hamiltonian. It is usually assumed that only the first qubit (the probe qubit) can be initialized and measured, while the rest of the qubits are all inaccessible (and thus they are assumed to be in the maximally mixed state initially). As in [132], we identify only the magnitude of the unknown parameters in the Hamiltonian; i.e., a system is identifiable if and only if all the solutions to the STA equations for the minimal (sub-)systems satisfy $|\vartheta_i'| = |\vartheta_i|$. There are four physical models in [132]. The transfer function of the Ising model without transverse field can be directly calculated, so we omit the STA analysis for this model. The Ising model with transverse field can also be skipped, because its system matrix has the same structure as that of the exchange model without transverse field. Hence, we only analyze the two exchange models, with and without transverse field. Let $\vartheta = (\vartheta_1, \vartheta_2, ..., \vartheta_{N_H})^T$ be the unknown parameters.

For the exchange model without transverse field, $N_H + 1$ is the total qubit number and the Hamiltonian can be written as

$$H = \sum_{i=1}^{N_H} \frac{(-1)^i \vartheta_i}{2} (X_i X_{i+1} + Y_i Y_{i+1}), \tag{4.16}$$

where the subscript $i$ denotes the $i$-th qubit. The observable is $X_1$ with the initial state being an eigenstate of $X_1$. For the exchange model with transverse field, $N_H$ must be odd and $\frac{N_H+1}{2}$ is the total qubit number. The Hamiltonian can be written as

$$H = \sum_{i=1}^{\frac{N_H+1}{2}} \frac{\vartheta_{2i-1}}{2} Z_i + \sum_{i=1}^{\frac{N_H-1}{2}} \frac{\vartheta_{2i}}{2} (X_i X_{i+1} + Y_i Y_{i+1}). \tag{4.17}$$

With the initial state being an eigenstate of $X_1$, the observable can be $X_1$ or $Y_1$. Hence, there are altogether three situations to be analyzed, which are summarized as Theorems 4.1-4.3. These three situations were first investigated in [132] and only verified numerically for several specific cases. Here, we provide a mathematical proof for arbitrary dimension. Also, Theorems 4.1-4.3 contain various situations to showcase the power of STA: Theorem 4.1 and Theorem 4.3 characterize identifiable minimal systems, while Theorem 4.2 corresponds to an unidentifiable minimal system. An example of dealing with identifiable non-minimal systems will be presented in Theorem 4.4. Moreover, the proof for Theorem 4.1 here is slightly different from the version in [151]. The proof here exploits the system’s symmetry, and this idea may be applicable to other symmetric systems.

4.4.2 Exchange model without transverse field

The Hamiltonian for this spin system is described in [132], which also derives the system model (4.16). As in [132], we choose $G^{(0)} = \{X_1\}$, and the accessible set is

$$\bar{G} = \{X_1, Z_1 Y_2, Z_1 Z_2 X_3, ..., Z_1 \cdots Z_{N_H} K_{N_H+1}\}, \tag{4.18}$$

where $K = X$ if $N_H$ is even, and $K = Y$ otherwise.

Then we start from the linear system form (4.7). In the system matrix $A$ only the elements directly above or below the main diagonal are non-zero:

$$A = \begin{pmatrix} 0 & \vartheta_1 & 0 & 0 & \cdots \\ -\vartheta_1 & 0 & \vartheta_2 & 0 & \cdots \\ 0 & -\vartheta_2 & 0 & \ddots & \\ 0 & 0 & \ddots & \ddots & \vartheta_{N_H} \\ \vdots & \vdots & & -\vartheta_{N_H} & 0 \end{pmatrix}_{(N_H+1)\times(N_H+1)}. \tag{4.19}$$

The initial state of the probe is an eigenstate of $X_1$. Hence, from (4.18) we know $B = x_0 = (1, 0, ..., 0)^T$. We measure $X_1$, and thus $C = (1, 0, ..., 0)$. We have the following theorem to characterize this system:


Theorem 4.1. The exchange model without transverse ﬁeld is identiﬁable when measuring X1 on the single qubit probe, with the initial state of the probe in an eigenstate of X1.

Proof. We ﬁrst prove this system is minimal for almost any value of the unknown parameters, and then test the identiﬁability.

4.4.2.1 Proof of minimality

Lemma 4.1. With (4.19) and $B = (1, 0, ..., 0)^T$, the controllability matrix $CM = [B, AB, ..., A^{N_H}B]$ has full rank for almost any value of $\vartheta$.

The proof of Lemma 4.1 is provided in Appendix D. Then, given the observability matrix

$$OM = \begin{pmatrix} C \\ CA \\ \vdots \\ CA^{N_H} \end{pmatrix} = \mathrm{diag}(1, -1, 1, -1, ..., (-1)^{N_H}) \cdot CM^T,$$

the system is also almost always observable. Therefore, it is almost always minimal.
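The minimality argument can be checked on a small instance. The sketch below is an illustration under stated assumptions: the parameter values are arbitrary, the helper names are hypothetical, and exact rational arithmetic replaces a numerical rank test. It builds the matrix (4.19) for $N_H = 3$, computes the rank of the controllability matrix, verifies the relation between $OM$ and $CM^T$ row by row, and shows an atypical parameter value ($\vartheta_2 = 0$) for which the rank drops.

```python
from fractions import Fraction as F


def chain_matrix(theta):
    """Antisymmetric tridiagonal matrix (4.19) for the parameter list theta."""
    n = len(theta) + 1
    A = [[F(0)] * n for _ in range(n)]
    for i, t in enumerate(theta):
        A[i][i + 1] = F(t)
        A[i + 1][i] = F(-t)
    return A


def matvec(A, v):
    """Exact matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def krylov_vectors(A, b):
    """Vectors b, Ab, ..., A^(n-1) b, i.e. the columns of the controllability matrix."""
    vectors, v = [], list(b)
    for _ in range(len(A)):
        vectors.append(v)
        v = matvec(A, v)
    return vectors


def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [list(r) for r in rows]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


theta = [2, 3, 5]                      # arbitrary generic values
A = chain_matrix(theta)
n = len(A)
e1 = [F(1)] + [F(0)] * (n - 1)         # B = C^T = (1, 0, ..., 0)^T

cm_columns = krylov_vectors(A, e1)     # entry k is the column A^k B of CM
A_T = [list(col) for col in zip(*A)]
om_rows = krylov_vectors(A_T, e1)      # entry k is the row C A^k of OM

print(rank(cm_columns))                # full rank for generic theta
print(all(om_rows[k] == [(-1) ** k * x for x in cm_columns[k]] for k in range(n)))
print(rank(krylov_vectors(chain_matrix([2, 0, 5]), e1)))  # atypical: rank drops
```

The sign pattern between $OM$ and $CM^T$ follows from $C = B^T$ and $A^T = -A$, so that $CA^k = (-1)^k (A^k B)^T$.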

4.4.2.2 Identiﬁability test

We now employ the STA equations to test the identiﬁability. We provide a proof slightly diﬀerent from the version in [151].

We observe that the parameters $\vartheta_2, \vartheta_3, ..., \vartheta_{N_H}$ are symmetric in this system. Namely, if we make a similarity transformation to just swap any two parameters $\vartheta_i$ and $\vartheta_j$ where $2 \leq i \neq j \leq N_H$, then $B$ and $C$ remain unchanged, and the only difference in $A$ is that the indices $i$ and $j$ in $\vartheta$ are swapped. Therefore, the parameters $\vartheta_2, \vartheta_3, ..., \vartheta_{N_H}$ must have the same identifiability conclusion; i.e., if one of them is identifiable, the rest are also identifiable, and vice versa. It thus suffices to prove that $\vartheta_1$ and $\vartheta_2$ are identifiable, which can be obtained as follows.

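Using (4.14) and (4.15) we know $S$ is of the form

$$S = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}_{(N_H+1)\times(N_H+1)}, \tag{4.20}$$

and (4.13) is now

$$\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix} \begin{pmatrix} 0 & \vartheta_1 & 0 & \cdots & 0 \\ -\vartheta_1 & 0 & \ddots & & \\ 0 & \ddots & & & \\ \vdots & & & & \end{pmatrix} = \begin{pmatrix} 0 & \vartheta_1' & 0 & \cdots & 0 \\ -\vartheta_1' & 0 & \ddots & & \\ 0 & \ddots & & & \\ \vdots & & & & \end{pmatrix} \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}. \tag{4.21}$$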

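For (4.21), considering $\mathrm{LHS}_{12} = \mathrm{RHS}_{12}$, we have $\vartheta_1 = \vartheta_1' S_{22}$. From $\mathrm{LHS}_{21} = \mathrm{RHS}_{21}$, we have $-\vartheta_1 S_{22} = -\vartheta_1'$. Hence, $\vartheta_1 = \vartheta_1 S_{22}^2$. Since the atypical case of $\vartheta_1 = 0$ is not considered, we know $|S_{22}| = 1$, which indicates $|\vartheta_1'| = |\vartheta_1|$.

Continuing with (4.21) and considering $\mathrm{LHS}_{\sigma 1} = \mathrm{RHS}_{\sigma 1}$ for $\sigma \geq 3$, we have $S_{32} = ... = S_{(N_H+1)2} = 0$. Similarly, from $\mathrm{LHS}_{1\sigma} = \mathrm{RHS}_{1\sigma}$ we know $S_{23} = ... = S_{2(N_H+1)} = 0$. Then from $\mathrm{LHS}_{23} = \mathrm{RHS}_{23}$, we have $\vartheta_2 S_{22} = \vartheta_2' S_{33}$. From $\mathrm{LHS}_{32} = \mathrm{RHS}_{32}$, we have $-\vartheta_2 S_{33} = -\vartheta_2' S_{22}$. Since $|S_{22}| = 1$, these two equations give $|S_{33}| = 1$ and $|\vartheta_2'| = |\vartheta_2|$, so $\vartheta_2$ is also identifiable, which completes the proof.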

Remark 4.1. The above proof of Theorem 4.1 is more general and enlightening than the original proof in [151], because here we take full advantage of the system symmetry. One can imagine that in other physical systems (not necessarily in the quantum domain) an appropriate symmetry might likewise force a common identifiability conclusion for some of the unknown parameters, thus making the identifiability analysis more straightforward and more focused on the vital part.

The relevant result in Theorem 4.1 was also presented in [52], where a specific Hamiltonian identification algorithm for the same system setting was proposed. Here we present our result as an example to illustrate the effectiveness of STA and the idea of symmetry exploitation.

4.4.3 Exchange model with transverse ﬁeld

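The Hamiltonian for this system is as in (4.17), where $N_H$ must be odd. As in [132], both $G^{(0)} = \{X_1\}$ and $G^{(0)} = \{Y_1\}$ yield the accessible set
\[
\bar{G} = \{X_1, Y_1, Z_1X_2, Z_1Y_2, Z_1Z_2X_3, Z_1Z_2Y_3, \dots, Z_1 \cdots Z_{(N_H-1)/2}\,Y_{(N_H+1)/2}\}. \tag{4.22}
\]

Then we start from the linear system form (4.7). In A, each $\vartheta_{2k+1}$ appears twice and each $\vartheta_{2k}$ appears four times:
\[
A = \begin{pmatrix}
0 & \vartheta_1 & 0 & -\vartheta_2 & \cdots \\
-\vartheta_1 & 0 & \vartheta_2 & 0 & \cdots \\
0 & -\vartheta_2 & 0 & \ddots & \\
\vartheta_2 & 0 & \ddots & \ddots & \vartheta_{N_H} \\
\vdots & \vdots & & -\vartheta_{N_H} & 0
\end{pmatrix}_{(N_H+1)\times(N_H+1)}. \tag{4.23}
\]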

The initial state of the probe is an eigenstate of $X_1$. Hence, from (4.22) we know $B = x_0 = (1, 0, \dots, 0)^T$. With Property 4.1, we can first rearrange A as follows: we take its odd rows in ascending order and then its even rows in ascending order, and then apply the same procedure to its columns. We thus rewrite A as
\[
A = \begin{pmatrix} 0 & \bar{A} \\ -\bar{A} & 0 \end{pmatrix}, \tag{4.24}
\]
where
\[
\bar{A} = \begin{pmatrix}
\vartheta_1 & -\vartheta_2 & 0 & \cdots & 0 \\
-\vartheta_2 & \vartheta_3 & -\vartheta_4 & \ddots & \vdots \\
0 & -\vartheta_4 & \vartheta_5 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -\vartheta_{N_H-1} \\
0 & \cdots & 0 & -\vartheta_{N_H-1} & \vartheta_{N_H}
\end{pmatrix} \tag{4.25}
\]
is symmetric. After this transformation, $B = (1, 0, \dots, 0)^T$ is unchanged.

4.4.3.1 Measuring X1

First we consider measuring X1. Then C = (1, 0, ..., 0). We have the following conclusion:

Theorem 4.2. The exchange model with transverse ﬁeld is unidentiﬁable when measuring X1 on the single qubit probe, with the initial state of the probe in an eigenstate of X1.

Proof. We employ Criterion 4.2 to prove the conclusion, and thus do not need to analyze minimality. When A in (4.23) is transformed to (4.24), C is unchanged and we assume S is transformed to $\bar{S}$. Now (4.14) and (4.15) imply that $\bar{S}$ is of the same form as (4.20). We do not need to find all the solutions to (4.13). Instead, we only need to find a special solution to (4.13) which gives $|\vartheta_i'| \ne |\vartheta_i|$ for some i. We assume
\[
\bar{S} = 1_{1\times 1} \oplus N_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \oplus M_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}},
\]
which satisfies the form (4.20). Eq. (4.13) now is
\[
\begin{pmatrix} 1 & & \\ & N & \\ & & M \end{pmatrix}
\begin{pmatrix} & \bar{A} \\ -\bar{A} & \end{pmatrix}
=
\begin{pmatrix} & \bar{A}' \\ -\bar{A}' & \end{pmatrix}
\begin{pmatrix} 1 & & \\ & N & \\ & & M \end{pmatrix}. \tag{4.26}
\]

We further assume N and M are orthogonal, which guarantees that $\bar{S}$ is nonsingular, and now (4.26) is in essence only one equation:
\[
\begin{pmatrix} 1 & \\ & N \end{pmatrix} \bar{A} M^T = \bar{A}'. \tag{4.27}
\]

We perform spectral decomposition on $\bar{A}$ to have $\bar{A} = PEP^T$, where P is orthogonal and E is diagonal. Denote by $\Lambda(A)$ the set of all the eigenvalues of A, where repeated eigenvalues appear multiple times. We have the following lemma to exclude the atypical cases:

Lemma 4.2. Given arbitrary $\lambda_0 \in \mathbb{C}$, it is atypical that $\lambda_0 \in \Lambda(\bar{A})$.

Lemma 4.2 is non-trivial. For example, if we change the structure of $\bar{A}$ to $\begin{pmatrix} \vartheta_1 & \vartheta_2 \\ \vartheta_1 & \vartheta_2 \end{pmatrix}$, then it is always true that $0 \in \Lambda(\bar{A})$.

We leave the details of the proof of Lemma 4.2 to Appendix E and only sketch the main idea here, since this idea is quite general and applies to many similar propositions. Usually we first reduce the problem to proving that a certain polynomial in the unknown parameters ($\det(\bar{A} - \lambda_0 I)$ in the case of Lemma 4.2) is almost always non-zero. Since a polynomial that is not identically zero vanishes only on an atypical (measure-zero) set of parameter values, it then suffices to find particular values of the unknown parameters for which the polynomial is non-zero.
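The witness argument sketched above can be illustrated numerically. The snippet below is only a sketch under stated assumptions: the helper `abar`, the value `lam0 = 0.5` and the choice $N_H = 7$ are hypothetical, and the random draw merely illustrates the "almost always" statement rather than proving it.

```python
import numpy as np

def abar(theta):
    """Symmetric tridiagonal matrix of (4.25) for an odd-length parameter vector theta."""
    theta = np.asarray(theta, dtype=float)
    off = -theta[1::2]                      # off-diagonal entries -theta_2, -theta_4, ...
    return np.diag(theta[0::2]) + np.diag(off, 1) + np.diag(off, -1)

lam0 = 0.5                                  # hypothetical test value for lambda_0
n_h = 7
dim = (n_h + 1) // 2

# Witness: theta_1 = theta_3 = ... = lam0 + 1 and theta_2 = theta_4 = ... = 0,
# so that Abar - lam0 * I is the identity and the polynomial equals 1.
witness = np.zeros(n_h)
witness[0::2] = lam0 + 1.0
det_witness = np.linalg.det(abar(witness) - lam0 * np.eye(dim))

# A randomly drawn parameter vector then avoids the root set with probability one.
rng = np.random.default_rng(1)
theta = rng.uniform(-2.0, 2.0, size=n_h)
det_random = np.linalg.det(abar(theta) - lam0 * np.eye(dim))

# Changed structure [[t1, t2], [t1, t2]]: the determinant vanishes identically.
t1, t2 = rng.uniform(-2.0, 2.0, size=2)
det_degenerate = np.linalg.det(np.array([[t1, t2], [t1, t2]]))
print(det_witness, det_random, det_degenerate)
```

The contrast between the last two determinants is the point of the remark that Lemma 4.2 is non-trivial: for the changed structure no witness exists.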


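Let
\[
I_{-k} = \mathrm{diag}(\underbrace{1, \dots, 1}_{k-1}, -1, 1, \dots, 1),
\]
where only the kth element is $-1$. We have the following assertion:

Lemma 4.3. $\exists\, k \in \{1, 2, \dots, N_H\}$ such that $|\vartheta_1| \ne |(PEI_{-k}P^T)_{11}|$.

The proof of Lemma 4.3 is given in Appendix F. Using Lemma 4.3, suppose $|\vartheta_1| \ne |(PEI_{-m}P^T)_{11}|$. We let
\[
M^T = PI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix}.
\]

As long as N is orthogonal, M is orthogonal. We denote the LHS of (4.27) as $\bar{L}$, and have
\[
\begin{aligned}
\bar{L} &= \begin{pmatrix} 1 & \\ & N \end{pmatrix} \bar{A} M^T \\
&= \begin{pmatrix} 1 & \\ & N \end{pmatrix} PEP^T\, PI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix} \\
&= \begin{pmatrix} 1 & \\ & N \end{pmatrix} PEI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix}.
\end{aligned} \tag{4.28}
\]

We thus know
\[
|\bar{L}_{11}| = \left| I_{1\sigma} \begin{pmatrix} 1 & \\ & N \end{pmatrix} PEI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix} I_{\sigma 1} \right| = |I_{1\sigma}PEI_{-m}P^TI_{\sigma 1}| = |(PEI_{-m}P^T)_{11}| \ne |\vartheta_1|.
\]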

From (4.28) we know $\bar{L}$ is always symmetric. Then we only need to find an appropriate orthogonal N to make $\bar{L}$ have the same positions of zeros as $\bar{A}$. Denote $Z = PEI_{-m}P^T$, which is symmetric. We design a series of orthogonal matrices $N^{(1)}_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}}, N^{(2)}_{\frac{N_H-3}{2}\times\frac{N_H-3}{2}}, \dots, N^{(\frac{N_H-3}{2})}_{2\times 2}$ such that
\[
N = \begin{pmatrix} I_{\frac{N_H-5}{2}\times\frac{N_H-5}{2}} & \\ & N^{(\frac{N_H-3}{2})} \end{pmatrix} \cdots \begin{pmatrix} I_{1\times 1} & \\ & N^{(2)} \end{pmatrix} N^{(1)}.
\]

We further denote a series of $\frac{N_H+1}{2}$-dimensional matrices $Z^{(1)}, Z^{(2)}, \dots, Z^{(\frac{N_H-3}{2})}$ such that
\[
Z^{(1)} = \begin{pmatrix} 1 & \\ & N^{(1)} \end{pmatrix} Z \begin{pmatrix} 1 & \\ & [N^{(1)}]^T \end{pmatrix} \tag{4.29}
\]
and $Z^{(i+1)} = (I_{i+1} \oplus N^{(i+1)})\,Z^{(i)}\,(I_{i+1} \oplus [N^{(i+1)}]^T)$ for $1 \le i \le \frac{N_H-5}{2}$. Then $Z^{(\frac{N_H-3}{2})} = \bar{L}$. We start from the innermost layer (4.29).

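We partition Z as
\[
Z = \begin{pmatrix} Z_{11} & J_{1\times\frac{N_H-1}{2}} \\ (J_{1\times\frac{N_H-1}{2}})^T & \mathcal{J}_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix},
\]
and have
\[
Z^{(1)} = \begin{pmatrix} Z_{11} & J[N^{(1)}]^T \\ N^{(1)}J^T & N^{(1)}\mathcal{J}[N^{(1)}]^T \end{pmatrix}. \tag{4.30}
\]

In (4.30), $Z_{11}$ is unchanged and we need to make $J[N^{(1)}]^T$ have the form
\[
J[N^{(1)}]^T = (*_{1\times 1}, 0, \dots, 0). \tag{4.31}
\]

We perform spectral decomposition to set
\[
J^TJ = U^{(1)}\mathrm{diag}(*, 0, \dots, 0)[U^{(1)}]^T.
\]

Then $N^{(1)} = [U^{(1)}]^T$ is orthogonal and (4.31) holds.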


For the next layer, we partition $Z^{(1)}$ as
\[
Z^{(1)} = \begin{pmatrix}
Z_{11} & * & 0_{1\times\frac{N_H-3}{2}} \\
* & * & K_{1\times\frac{N_H-3}{2}} \\
0_{\frac{N_H-3}{2}\times 1} & (K_{1\times\frac{N_H-3}{2}})^T & \mathcal{K}_{\frac{N_H-3}{2}\times\frac{N_H-3}{2}}
\end{pmatrix}.
\]

We then have
\[
Z^{(2)} = \begin{pmatrix} 1 & & \\ & 1 & \\ & & N^{(2)} \end{pmatrix} Z^{(1)} \begin{pmatrix} 1 & & \\ & 1 & \\ & & [N^{(2)}]^T \end{pmatrix}
= \begin{pmatrix}
Z_{11} & * & 0_{1\times\frac{N_H-3}{2}} \\
* & * & K[N^{(2)}]^T \\
0_{\frac{N_H-3}{2}\times 1} & N^{(2)}K^T & N^{(2)}\mathcal{K}[N^{(2)}]^T
\end{pmatrix}.
\]

$Z_{11}$ is unchanged and we need to make $K[N^{(2)}]^T$ take the form
\[
K[N^{(2)}]^T = (*_{1\times 1}, 0, \dots, 0).
\]

We perform spectral decomposition to make
\[
K^TK = U^{(2)}\mathrm{diag}(*, 0, \dots, 0)[U^{(2)}]^T,
\]
and then $N^{(2)} = [U^{(2)}]^T$ is what we need. Continuing the above procedure, we can finally determine an orthogonal N such that $\bar{L} = Z^{(\frac{N_H-3}{2})}$ has the same structure as $\bar{A}$. Since $Z_{11}$ is unchanged and $|Z_{11}| \ne |\vartheta_1|$, we know $|\bar{L}_{11}| \ne |\vartheta_1|$, which implies we have found a special unequal solution to the STA equations. Thus the system is unidentifiable.
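The construction in this proof can be traced numerically. The sketch below is illustrative only and rests on assumptions: the parameter values are hypothetical ($N_H = 7$), the helper names `abar`, `full_a` and `layered_rotation` are not from the thesis, and each layer obtains $N^{(i)}$ from `numpy.linalg.eigh` applied to the rank-one matrix $J^TJ$. It builds $\bar{S} = 1 \oplus N \oplus M$, reads a second parameter vector off $\bar{L}$, and compares the sampled input-output behaviour when measuring $X_1$.

```python
import numpy as np

def abar(theta):
    """Symmetric tridiagonal block (4.25)."""
    theta = np.asarray(theta, dtype=float)
    off = -theta[1::2]
    return np.diag(theta[0::2]) + np.diag(off, 1) + np.diag(off, -1)

def full_a(a_bar):
    """Block form (4.24)."""
    zero = np.zeros_like(a_bar)
    return np.block([[zero, a_bar], [-a_bar, zero]])

def layered_rotation(Z):
    """Orthogonal W = 1 (+) N such that W Z W^T is tridiagonal and keeps Z[0, 0]."""
    n = Z.shape[0]
    W, Zc = np.eye(n), Z.copy()
    for i in range(n - 2):
        J = Zc[i, i + 1:]                       # row segment to compress to (*, 0, ..., 0)
        _, U = np.linalg.eigh(np.outer(J, J))   # spectral decomposition of J^T J
        layer = np.eye(n)
        layer[i + 1:, i + 1:] = U[:, ::-1].T    # eigenvector of the nonzero eigenvalue first
        Zc = layer @ Zc @ layer.T
        W = layer @ W
    return W, Zc

theta = np.array([0.7, -1.2, 0.4, 0.9, -0.5, 1.1, 0.3])   # hypothetical values, N_H = 7
a_bar = abar(theta)
n = a_bar.shape[0]
evals, P = np.linalg.eigh(a_bar)                 # Abar = P E P^T
for m in range(n):                               # search for the index promised by Lemma 4.3
    flip = np.ones(n)
    flip[m] = -1.0
    Z = P @ np.diag(evals * flip) @ P.T
    if not np.isclose(abs(Z[0, 0]), abs(theta[0])):
        break
W, l_bar = layered_rotation(Z)                   # W = 1 (+) N
M = W @ P @ np.diag(flip) @ P.T                  # from M^T = P I_{-m} P^T (1 (+) N^T)

theta_new = np.empty_like(theta)                 # parameters read off from L-bar
theta_new[0::2] = np.diag(l_bar)
theta_new[1::2] = -np.diag(l_bar, 1)
S = np.block([[W, np.zeros((n, n))], [np.zeros((n, n)), M]])
A, A_new = full_a(a_bar), full_a(abar(theta_new))

B = np.zeros(2 * n)
B[0] = 1.0
C = B.copy()                                     # measuring X_1
markov_equal = all(
    np.isclose(C @ np.linalg.matrix_power(A, k) @ B,
               C @ np.linalg.matrix_power(A_new, k) @ B)
    for k in range(2 * n)
)
print(theta_new, markov_equal)
```

In this sketch the two parameter vectors differ in $|\vartheta_1|$ while the quantities $CA^kB$ agree, which is the numerical face of the unidentifiability statement.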


4.4.3.2 Measuring Y1

Now we consider measuring Y1, which sets C = (0, 1, 0, ..., 0). We have the following theorem to correct the conclusion in [132].

Theorem 4.3. The exchange model with transverse ﬁeld is identiﬁable when measuring Y1 on the single qubit probe, with the initial state of the probe in an eigenstate of X1.

Proof. After A in (4.23) is transformed to (4.24), C is transformed to
\[
C = (0_{1\times\frac{N_H+1}{2}}, \bar{C}), \quad \bar{C} = (1, 0_{1\times\frac{N_H-1}{2}}). \tag{4.32}
\]

Denote
\[
B = (\bar{B}^T, 0_{1\times\frac{N_H+1}{2}})^T, \quad \bar{B} = (1, 0_{1\times\frac{N_H-1}{2}})^T. \tag{4.33}
\]

We have the following lemma (the proof is given in Appendix G) to show that the system is minimal.

Lemma 4.4. With (4.24), (4.25), (4.32) and (4.33), both the controllability matrix $CM = [B, AB, \dots, A^{N_H}B]$ and the observability matrix $OM = [C^T, A^TC^T, \dots, (A^{N_H})^TC^T]^T$ have full rank for almost any value of $\vartheta$.

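By Property 4.1, we use STA to prove that the system (4.24) and (4.25) is identifiable with (4.32) and (4.33). We partition S as
\[
S = \begin{pmatrix} X_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} & *_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} \\ *_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} & Y_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} \end{pmatrix}.
\]

Then (4.13) is
\[
\begin{pmatrix} X & * \\ * & Y \end{pmatrix} \begin{pmatrix} 0 & \bar{A} \\ -\bar{A} & 0 \end{pmatrix} = \begin{pmatrix} 0 & \bar{A}' \\ -\bar{A}' & 0 \end{pmatrix} \begin{pmatrix} X & * \\ * & Y \end{pmatrix}, \tag{4.34}
\]
which is
\[
X\bar{A} = \bar{A}'Y, \tag{4.35}
\]
\[
Y\bar{A} = \bar{A}'X, \tag{4.36}
\]
where the other two equations on the indeterminate submatrices are omitted. Using (4.14) and (4.15), we have
\[
X_{\sigma 1} = (1, 0, \dots, 0)^T, \quad Y_{1\sigma} = (1, 0, \dots, 0). \tag{4.37}
\]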

From (4.35) and (4.36), we have
\[
X^TX\bar{A} = X^T\bar{A}'Y = \bar{A}Y^TY, \tag{4.38}
\]
\[
Y^TY\bar{A} = Y^T\bar{A}'X = \bar{A}X^TX. \tag{4.39}
\]

From (4.38) and (4.39), the following relationship holds,
\[
(X^TX - Y^TY)\bar{A} = -\bar{A}(X^TX - Y^TY), \tag{4.40}
\]
which is a special form of the Sylvester equation. We follow the general solution procedure for the Sylvester equation [169] to solve (4.40). We vectorize (4.40) column by column to obtain
\[
(\bar{A} \otimes I_{\frac{N_H+1}{2}} + I_{\frac{N_H+1}{2}} \otimes \bar{A})\,\mathrm{vec}(X^TX - Y^TY) = 0.
\]

Using the same idea as in Appendices E and G, it is straightforward to prove that $\bar{A} \otimes I + I \otimes \bar{A}$ is almost always nonsingular by considering $\bar{A} = I$. An equivalent expression is that we almost always have
\[
\lambda_i(\bar{A}) + \lambda_j(\bar{A}) \ne 0 \tag{4.41}
\]
for any $1 \le i, j \le \frac{N_H+1}{2}$. Therefore we can almost always have
\[
X^TX = Y^TY. \tag{4.42}
\]
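The equivalence between the nonsingularity of $\bar{A} \otimes I + I \otimes \bar{A}$ and condition (4.41) can be checked on a concrete instance. This is a sketch with hypothetical parameter values ($N_H = 7$) and a hypothetical helper `abar`; it illustrates a single typical case and does not replace the argument in the appendices.

```python
import numpy as np

def abar(theta):
    """Symmetric tridiagonal block (4.25)."""
    theta = np.asarray(theta, dtype=float)
    off = -theta[1::2]
    return np.diag(theta[0::2]) + np.diag(off, 1) + np.diag(off, -1)

theta = np.array([0.7, -1.2, 0.4, 0.9, -0.5, 1.1, 0.3])   # hypothetical values, N_H = 7
a_bar = abar(theta)
n = a_bar.shape[0]
eye = np.eye(n)

# Coefficient matrix acting on vec(X^T X - Y^T Y) after vectorizing (4.40).
kron_sum = np.kron(a_bar, eye) + np.kron(eye, a_bar)

lam = np.linalg.eigvalsh(a_bar)
pair_sums = (lam[:, None] + lam[None, :]).ravel()          # lambda_i + lambda_j for all i, j

# The eigenvalues of the Kronecker sum are exactly the pairwise sums.
spectrum_matches = np.allclose(np.sort(np.linalg.eigvalsh(kron_sum)), np.sort(pair_sums))
min_abs_pair_sum = np.min(np.abs(pair_sums))
full_rank = np.linalg.matrix_rank(kron_sum) == n * n
print(spectrum_matches, min_abs_pair_sum, full_rank)
```

When no pairwise sum vanishes, the only solution of the vectorized equation is the zero vector, which is (4.42).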

Similarly, $(XX^T - YY^T)\bar{A}' = -\bar{A}'(XX^T - YY^T)$, and thus
\[
(\bar{A}' \otimes I_{\frac{N_H+1}{2}} + I_{\frac{N_H+1}{2}} \otimes \bar{A}')\,\mathrm{vec}(XX^T - YY^T) = 0. \tag{4.43}
\]

Lemma 4.5. With (4.24), (4.25) and (4.34), $\bar{A}' \otimes I_{\frac{N_H+1}{2}} + I_{\frac{N_H+1}{2}} \otimes \bar{A}'$ is almost always nonsingular.

The proof of Lemma 4.5 is provided in Appendix H. With Lemma 4.5, we can almost always solve (4.43) to have
\[
XX^T = YY^T. \tag{4.44}
\]

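Considering (4.37), we partition X and Y as
\[
X = \begin{pmatrix} 1_{1\times 1} & E_{1\times\frac{N_H-1}{2}} \\ 0_{\frac{N_H-1}{2}\times 1} & \tilde{X}_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix}, \quad
Y = \begin{pmatrix} 1_{1\times 1} & 0_{1\times\frac{N_H-1}{2}} \\ F_{\frac{N_H-1}{2}\times 1} & \tilde{Y}_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix}.
\]

From (4.42), $(X^TX)_{11} = 1 = (Y^TY)_{11} = 1 + F^TF$, which means $F = 0$. Similarly from (4.44) we have $E = 0$. We partition $\bar{A}$ as
\[
\bar{A} = \begin{pmatrix} \vartheta_1 & G_{1\times\frac{N_H-1}{2}} \\ (G_{1\times\frac{N_H-1}{2}})^T & \tilde{A}_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix}.
\]

Then (4.35) is
\[
\begin{pmatrix} 1 & 0 \\ 0 & \tilde{X} \end{pmatrix} \begin{pmatrix} \vartheta_1 & G \\ G^T & \tilde{A} \end{pmatrix} = \begin{pmatrix} \vartheta_1' & G' \\ G'^T & \tilde{A}' \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \tilde{Y} \end{pmatrix},
\]
which implies $\vartheta_1 = \vartheta_1'$,
\[
G = G'\tilde{Y}, \tag{4.45}
\]
\[
\tilde{X}G^T = G'^T, \tag{4.46}
\]
\[
\tilde{X}\tilde{A} = \tilde{A}'\tilde{Y}. \tag{4.47}
\]

Eq. (4.45) is $(-\vartheta_2, 0, \dots, 0) = (-\vartheta_2', 0, \dots, 0)\tilde{Y}$, which implies $\tilde{Y}_{1\sigma} = (\vartheta_2/\vartheta_2', 0, \dots, 0)$. Similarly (4.46) gives $\tilde{X}_{\sigma 1} = (\vartheta_2'/\vartheta_2, 0, \dots, 0)^T$. With similar procedures, (4.36) gives $\tilde{X}_{1\sigma} = (\vartheta_2/\vartheta_2', 0, \dots, 0)$, $\tilde{Y}_{\sigma 1} = (\vartheta_2'/\vartheta_2, 0, \dots, 0)^T$ and
\[
\tilde{Y}\tilde{A} = \tilde{A}'\tilde{X}. \tag{4.48}
\]

By equating the two expressions for $\tilde{X}_{11}$ (or $\tilde{Y}_{11}$), we find $|\vartheta_2'| = |\vartheta_2|$. If $\vartheta_2' = \vartheta_2$, we have
\[
\tilde{Y}_{1\sigma} = (1, 0, \dots, 0) = (\tilde{X}_{\sigma 1})^T. \tag{4.49}
\]

Now (4.47), (4.48) and (4.49) have the same structures as (4.35), (4.36) and (4.37), respectively, but with the dimension decreased by 1. Otherwise, if $\vartheta_2' = -\vartheta_2$, we have $-\tilde{Y}_{1\sigma} = (1, 0, \dots, 0) = (-\tilde{X}_{\sigma 1})^T$ and we can rewrite (4.47) and (4.48) as $(-\tilde{X})\tilde{A} = \tilde{A}'(-\tilde{Y})$ and $(-\tilde{Y})\tilde{A} = \tilde{A}'(-\tilde{X})$. Therefore, either $\{\tilde{X}, \tilde{Y}, \tilde{A}, \tilde{A}'\}$ or $\{-\tilde{X}, -\tilde{Y}, \tilde{A}, \tilde{A}'\}$ has the same structure and property as $\{X, Y, \bar{A}, \bar{A}'\}$, but with the dimension decreased by 1. This procedure can thus be performed recursively, until we finally reach $X = Y = \mathrm{diag}(1, \pm 1, \dots, \pm 1)$ and $|\vartheta_i'| = |\vartheta_i|$ for every $1 \le i \le N_H$.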

Remark 4.2. Theorem 4.2 and Theorem 4.3 indicate that when the system matrix A has periodically repeated structure properties, STA analysis can avoid the curse of dimensionality and provide identiﬁability results for arbitrary dimension.

103 4.5. FROM IDENTIFIABILITY TO ECONOMIC IDENTIFICATION ALGORITHMS

4.5 From identiﬁability to economic identiﬁcation algorithms

If a system is identifiable, we may develop an appropriate identification algorithm to identify the parameters. In this section, we provide another application of STA and SPT to quantum Hamiltonian identification. Generally the dimension of a quantum system is exponential in the number of qubits. Hence, identification algorithms with polynomial complexity in the system dimension will in essence have exponential computational complexity in the number of qubits, which has been referred to as the exponential problem [104]. To avoid this problem, one method is to design identification algorithms whose computational complexity depends directly on quantities that grow much more slowly than the system dimension. Typical such quantities include the number of qubits in multi-qubit systems, or the number of unknown parameters for special physical systems. We find that STA can be a useful tool to indicate the existence of such efficient algorithms.

4.5.1 An indicator for the existence of economic identification algorithms

We aim to design an identification algorithm whose computational complexity only depends on the number of unknown (or interested) parameters. Suppose we have a d-dimensional Hamiltonian H with $N_H$ unknown parameters $\vartheta_i$. In most cases, the $a_i$'s in (4.1) are linear functions of $\vartheta_i$. Hence, we can expand H directly using $\vartheta$,
\[
H = \sum_{i=1}^{N_H} \vartheta_i H_i. \tag{4.50}
\]


Using the procedures in Sec. 4.2.1, we can model the evolution of the state as an $N_L$-dimensional linear system model
\[
\begin{cases}
\dot{x} = Ax + B\delta(t), \quad x(0) = 0, \\
y = Cx,
\end{cases} \tag{4.51}
\]
where $B = x_0$ is the initial state and each $\vartheta_i$ multiplied by a coefficient is an element of A. We hope the algorithm can identify one unknown element in A under one set of B and C, with the computational complexity for estimating one unknown parameter being $f(N_H)$ (or $f(N_L)$), a function of $N_H$ (or $N_L$) but not of d. Then the total computational complexity to identify the Hamiltonian is $N_Hf(N_H)$ (or $N_Hf(N_L)$), which does not directly depend on d. In fact, we can reduce $f(N_H)$ (or $f(N_L)$) to $O(1)$ in some appropriate cases.

We start by investigating the identification capability of the fundamental setting of $B = I_{\sigma i}$ and $C = I_{j\sigma}$. By changing indices, we assume that $B = I_{\sigma 2}$ and $C = I_{1\sigma}$.

In the most general case, there are no special properties for the structure of A. Assume that this system (A, B, C) is already minimal. Then from (4.14) and (4.15) we know the transformation matrix S is
\[
S = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix},
\]
and (4.13) is now
\[
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}
\begin{pmatrix}
A_{11} & A_{12} & \cdots \\
A_{21} & A_{22} & \cdots \\
\vdots & \vdots &
\end{pmatrix}
=
\begin{pmatrix}
A_{11}' & A_{12}' & \cdots \\
A_{21}' & A_{22}' & \cdots \\
\vdots & \vdots &
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}. \tag{4.52}
\]

From $\mathrm{LHS}_{12} = \mathrm{RHS}_{12}$ of (4.52), we have $A_{12} = A_{12}'$, which indicates that this fundamental setting of B and C has the capability of identifying one parameter for minimal systems. Interestingly, we succeed in extending this conclusion to non-minimal systems using STA.

Theorem 4.4. Given a linear system (A, B, C), Aij is identiﬁable (including its sign) if B = Iσj and C = Iiσ.

Proof. Without loss of generality, we can always assume that we are identifying $A_{12}$ or $A_{11}$ after appropriately arranging the order of the elements of x.

For the case of identifying $A_{12}$, $C = (1, 0, \dots, 0)$ and $B = (0, 1, 0, \dots, 0)^T$. Without loss of generality, we assume that the system is neither controllable nor observable. We tentatively calculate the first two rows of the observability matrix, which are
\[
\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ * & A_{12} & * & \cdots & * \end{pmatrix}. \tag{4.53}
\]

Since $A_{12} = 0$ is atypical, it is almost always true that (4.53) has rank two. Assume that the observable subsystem of (4.51) has dimension m. We thus have $2 \le m < N_L$.

Let
\[
T = \begin{pmatrix}
1 & & & & & \\
& 1 & & & & \\
-A_{32}/A_{12} & & 1 & & & \\
-A_{42}/A_{12} & & & 1 & & \\
\vdots & & & & \ddots & \\
-A_{N_L2}/A_{12} & & & & & 1
\end{pmatrix}_{N_L\times N_L}
\]
and perform a similarity transformation $\bar{x} = Tx$. Using Property 4.1, the equivalent system is
\[
\bar{A} = TAT^{-1} = \begin{pmatrix}
* & A_{12} & * & \cdots & * \\
* & * & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix},
\]
$\bar{B} = TB = (0, 1, 0, \dots, 0)^T$ and $\bar{C} = CT^{-1} = (1, 0, \dots, 0)$. The first two rows of the observability matrix OM of the new system $(\bar{A}, \bar{B}, \bar{C})$ have the same form as (4.53). Since OM has rank m, there exists a reordering $(j_3, j_4, \dots, j_{N_L})$ of $(3, 4, \dots, N_L)$ such that the matrix $(OM_{\sigma 1}, OM_{\sigma 2}, OM_{\sigma j_3}, OM_{\sigma j_4}, \dots, OM_{\sigma j_m})$ has full column rank. Let the matrix $U^{-1} = (I_{\sigma 1}, I_{\sigma 2}, I_{\sigma j_3}, I_{\sigma j_4}, \dots, I_{\sigma j_{N_L}})$ and perform a further similarity transformation $\tilde{x} = U\bar{x}$. Then the equivalent system is
\[
\tilde{A} = U\bar{A}U^{-1} = \begin{pmatrix}
* & A_{12} & * & \cdots & * \\
* & * & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}_{N_L\times N_L}, \tag{4.54}
\]
$\tilde{B} = U\bar{B} = (0, 1, 0, \dots, 0)^T$ and $\tilde{C} = \bar{C}U^{-1} = (1, 0, \dots, 0)$. Now the observability matrix of the system $\tilde{\Sigma} = (\tilde{A}, \tilde{B}, \tilde{C})$ is
\[
\widetilde{OM} = \begin{pmatrix} \tilde{C} \\ \tilde{C}\tilde{A} \\ \vdots \\ \tilde{C}\tilde{A}^{N_L-1} \end{pmatrix}
= \begin{pmatrix} \bar{C}U^{-1} \\ \bar{C}\bar{A}U^{-1} \\ \vdots \\ \bar{C}\bar{A}^{N_L-1}U^{-1} \end{pmatrix}
= OM \cdot U^{-1}.
\]

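Therefore, the first m columns of $\widetilde{OM}$ are of full rank. We can now employ the SPT method. To perform observability decomposition for the system $\tilde{\Sigma}$, we first select the first two rows and other $m - 2$ rows from $\widetilde{OM}$ to form a full-row-rank matrix $\tilde{E}_{m\times N_L}$ such that the first m columns of $\tilde{E}$ are also of full rank. We partition $\tilde{E}$ as $\tilde{E} = [\tilde{F}_{m\times m}\ \ f_{m\times(N_L-m)}]$, and then $\tilde{F}$ is invertible. The transformation matrix $\begin{pmatrix} \tilde{F} & f \\ 0 & I \end{pmatrix}$ can decompose the system $\tilde{\Sigma}$ into observable and unobservable parts. We choose the second transformation matrix as $\tilde{F}^{-1} \oplus I$. The total transformation is
\[
Q = \begin{pmatrix} \tilde{F}^{-1} & 0 \\ 0^T & I \end{pmatrix} \begin{pmatrix} \tilde{F} & f \\ 0^T & I \end{pmatrix} = \begin{pmatrix} I & \tilde{F}^{-1}f \\ 0^T & I \end{pmatrix},
\]
and its inverse is
\[
Q^{-1} = \begin{pmatrix} I & -\tilde{F}^{-1}f \\ 0^T & I \end{pmatrix}.
\]

Let $\acute{x} = Q\tilde{x}$ generate the system $\acute{\Sigma} = (\acute{A}, \acute{B}, \acute{C})$:
\[
\begin{cases}
\dot{\acute{x}} = \acute{A}\acute{x} + \acute{B}\delta(t), \quad \acute{x}(0) = 0, \\
y = \acute{C}\acute{x}.
\end{cases}
\]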

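We partition $\tilde{A}$ as
\[
\tilde{A} = \begin{pmatrix} \widetilde{UL}_{m\times m} & \widetilde{UR}_{m\times(N_L-m)} \\ \widetilde{DL}_{(N_L-m)\times m} & \widetilde{DR}_{(N_L-m)\times(N_L-m)} \end{pmatrix}.
\]

Then we have
\[
\acute{A} = Q\tilde{A}Q^{-1} = \begin{pmatrix} I & \tilde{F}^{-1}f \\ 0^T & I \end{pmatrix} \begin{pmatrix} \widetilde{UL} & \widetilde{UR} \\ \widetilde{DL} & \widetilde{DR} \end{pmatrix} \begin{pmatrix} I & -\tilde{F}^{-1}f \\ 0^T & I \end{pmatrix}
= \begin{pmatrix} \widetilde{UL} + \tilde{F}^{-1}f\,\widetilde{DL} & *_{m\times(N_L-m)} \\ *_{(N_L-m)\times m} & *_{(N_L-m)\times(N_L-m)} \end{pmatrix},
\]
$\acute{B} = Q\tilde{B} = (0, 1, 0, \dots, 0)^T$, and $\acute{C} = \tilde{C}Q^{-1} = (1, \underbrace{0, \dots, 0}_{m-1}, *, \dots, *)$. We partition $\acute{x} = (\grave{x}^T, *)^T$ where $\grave{x}$ is m-dimensional. Since the second transformation $\tilde{F}^{-1} \oplus I$ is block-diagonal, we know $\acute{\Sigma}$ is in the observable canonical form. Therefore, $\grave{x}$ corresponds to the observable subsystem of $\acute{\Sigma}$. We denote this m-dimensional observable subsystem as $\grave{\Sigma} = (\grave{A}, \grave{B}, \grave{C})$ where $\grave{A} = \widetilde{UL} + \tilde{F}^{-1}f\,\widetilde{DL}$, $\grave{B} = (0, 1, 0, \dots, 0)^T$ and $\grave{C} = (1, 0, \dots, 0)$. From (4.54) we know $\widetilde{DL}_{\sigma 2} = (0, 0, \dots, 0)^T$, and $\grave{A}_{\sigma 2} = \widetilde{UL}_{\sigma 2}$. Therefore, $\grave{A}_{12} = A_{12}$.

Similarly, we can employ the SPT method again to perform a controllability decomposition on $\grave{\Sigma}$ to finally obtain a t-dimensional ($2 \le t \le m$) minimal system $(\check{A}, \check{B}, \check{C})$ where we still have $\check{A}_{12} = A_{12}$, $\check{B} = (0, 1, 0, \dots, 0)^T$ and $\check{C} = (1, 0, \dots, 0)$.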

For $(\check{A}, \check{B}, \check{C})$, we can employ the STA method. Using (4.14) and (4.15) we know that the transformation matrix S is
\[
S = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}_{t\times t},
\]
and (4.13) is now
\[
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}
\begin{pmatrix}
* & A_{12} & * & \cdots & * \\
* & * & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & * & * & \cdots & *
\end{pmatrix}
=
\begin{pmatrix}
* & A_{12}' & * & \cdots & * \\
* & * & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & * & * & \cdots & *
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}. \tag{4.55}
\]

By equating the elements on the first row and second column of both sides of (4.55), we have $A_{12}' = A_{12}$. Thus, $A_{12}$ is identifiable.

For the case of identifying $A_{11}$, $B^T = C = (1, 0, \dots, 0)$. Its observability matrix is now
\[
OM = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ A_{11} & * & \cdots & * \\ \vdots & \vdots & & \vdots \end{pmatrix}.
\]

If $OM_{2\sigma}$ has non-zero elements other than $A_{11}$, then the first two rows of OM are linearly independent and we can use procedures similar to the case of identifying $A_{12}$ to prove that $A_{11}$ is identifiable. Otherwise, if $OM_{2\sigma} = (A_{11}, 0, \dots, 0)$, then $A_{1\sigma} = (A_{11}, 0, \dots, 0)$, which means (A, B, C) is already of the observable canonical form, where the observable subsystem is 1-dimensional:
\[
\begin{cases}
\dot{x}_1 = \vartheta_1 x_1 + 1 \cdot \delta(t), \quad x_1(0) = 0, \\
y = 1 \cdot x_1.
\end{cases}
\]

Hence, $\vartheta_1$ is certainly identifiable, which completes the proof.
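The first similarity transformation in this proof can be checked on a random instance. The sketch below assumes a generic dense `A` (random, with no physical meaning) and places the entries $-A_{k2}/A_{12}$ in the first column of `T`, which is the placement consistent with $\bar{B} = TB = B$; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6                                     # plays the role of N_L
A = rng.normal(size=(n, n))               # generic A with no special structure (assumption)
B = np.zeros(n)
B[1] = 1.0                                # B = I_{sigma 2}
C = np.zeros(n)
C[0] = 1.0                                # C = I_{1 sigma}

T = np.eye(n)
T[2:, 0] = -A[2:, 1] / A[0, 1]            # entries -A_{k2}/A_{12}, k = 3, ..., N_L
T_inv = np.linalg.inv(T)
A_bar = T @ A @ T_inv

second_column_cleared = np.allclose(A_bar[2:, 1], 0.0)   # column (A_12, *, 0, ..., 0)^T
entry_preserved = np.isclose(A_bar[0, 1], A[0, 1])       # A_12 survives the transformation
b_unchanged = np.allclose(T @ B, B)
c_unchanged = np.allclose(C @ T_inv, C)
print(second_column_cleared, entry_preserved, b_unchanged, c_unchanged)
```

The later transformations U and Q in the proof preserve the same three properties, which is what allows $A_{12}$ to be tracked down to the minimal subsystem.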


4.5.2 Two economic Hamiltonian identiﬁcation algorithms

Theorem 4.4 indicates the existence of economic quantum Hamiltonian identification algorithms. A natural following question is whether we can find any specific economic algorithm. In fact, the proof of Theorem 4.4 has already implied how to prepare the initial state of the system and select the observable. Here, we present two such identification algorithms.

We follow the notations in (4.50) and (4.51), and the aim is to estimate some elements of A. Suppose α(p)ϑp = Aa(p)b(p), where α(p) is a coeﬃcient depending on p, a = a(p) and b = b(p) are two indices also depending on p. α, a and b are determined by the model establishment procedures. Then we prepare the system initial value in a state corresponding to B = x0 = Iσb(p) and measure the observable corresponding to C = Ia(p)σ. If a(p) or b(p) is multivalued, then one can choose one value such that the corresponding experiment is straightforward to perform.

We assume that in actual experiments we can sample the system output with a fixed period of time $t_s$ (as assumed in [165]), and the data length is $N_D$. Then the data we obtain is denoted as an $N_D$-dimensional vector $\hat{y}^{(p)}$. For the true value, the i-th element should be
\[
y_i^{(p)} = C\exp(iAt_s)B = [\exp(iAt_s)]_{a(p)b(p)}, \tag{4.56}
\]
where i is the sample index rather than the imaginary unit.

From Theorem 4.4, Aa(p)b(p) is identifiable with this experimental setting. Our task is to find a specific algorithm to complete the identification.

We start from the matrix logarithm function. For a square matrix Z, we define its logarithm as
\[
\log(Z) = -\sum_{j=1}^{\infty} \frac{(I - Z)^j}{j}. \tag{4.57}
\]

This series is convergent and $\exp[\log(Z)] = Z$ when $\|I - Z\| < 1$ [55]. Furthermore, when $\|I - \exp(J)\| < 1$, we have $\log[\exp(J)] = J$ [135].

First we need to guarantee $\|I - \exp(At_s)\| < 1$ (the reason will be shown later), which one intuitively expects to hold when $t_s$ is small enough. Specifically, suppose we have prior knowledge on A, such as $\|A\| < F$ where F is given. Since A is antisymmetric, we can employ a unitary matrix $U_A$ to diagonalize it as $A = U_AJ_AU_A^\dagger$, where $J_A = \mathrm{diag}(i\lambda_1, i\lambda_2, \dots, i\lambda_K)$. Here for $j = 1, \dots, K$, $\lambda_j \in \mathbb{R}$ can be zero. We thus know $\max_j |\lambda_j| \le F$. Note that
\[
\begin{aligned}
\|I - \exp(At_s)\|^2 &= \|I - U_A\exp(J_At_s)U_A^\dagger\|^2 \\
&= \|I - \exp(J_At_s)\|^2 \\
&= \|\mathrm{diag}(1 - e^{i\lambda_1t_s}, 1 - e^{i\lambda_2t_s}, \dots, 1 - e^{i\lambda_Kt_s})\|^2 \\
&= \sum_{j=1}^{K} |1 - e^{i\lambda_jt_s}|^2 = \sum_{j=1}^{K} \left[(1 - \cos\lambda_jt_s)^2 + \sin^2\lambda_jt_s\right] \\
&= \sum_{j=1}^{K} (2 - 2\cos\lambda_jt_s).
\end{aligned}
\]

Suppose we have
\[
t_s < \frac{1}{F}\arccos\frac{2K-1}{2K}, \tag{4.58}
\]
then we know
\[
\cos\lambda_jt_s > \cos\left(\frac{\lambda_j}{F}\arccos\frac{2K-1}{2K}\right) > \frac{2K-1}{2K},
\]
and
\[
\|I - \exp(At_s)\|^2 = \sum_{j=1}^{K} (2 - 2\cos\lambda_jt_s) < \sum_{j=1}^{K} \left(2 - \frac{2K-1}{K}\right) = 1.
\]
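As a quick sanity check of the sufficient condition (4.58), the sketch below evaluates the bound for hypothetical values of F and K and a made-up spectrum satisfying $|\lambda_j| \le F$. The function names are illustrative, and the norm is computed from the eigenvalue expression derived above.

```python
import math

def max_sampling_period(F, K):
    """Right-hand side of (4.58): (1/F) * arccos((2K - 1) / (2K))."""
    return math.acos((2 * K - 1) / (2 * K)) / F

def norm_sq(lams, ts):
    """||I - exp(A ts)||^2 = sum_j (2 - 2 cos(lambda_j ts)) for eigenvalues i * lambda_j."""
    return sum(2.0 - 2.0 * math.cos(lam * ts) for lam in lams)

F, K = 4.0, 5                          # hypothetical prior bound and dimension
ts_max = max_sampling_period(F, K)
lams = [-3.9, -1.3, 0.0, 1.3, 3.9]     # hypothetical spectrum with |lambda_j| <= F
value = norm_sq(lams, 0.99 * ts_max)   # sampling period just below the bound
print(ts_max, value)
```

The bound is conservative by design: it must hold even if all K eigenvalues sit at the edge $|\lambda_j| = F$.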

Therefore, (4.58) is a sufficient condition on the sampling period $t_s$ to guarantee the validity of the matrix logarithm function. We can then continue from (4.57) to obtain
\[
\log[\exp(At_s)] = At_s = -\sum_{j=1}^{\infty} \frac{[I - \exp(At_s)]^j}{j}.
\]

We thus have
\[
\begin{aligned}
A_{a(p)b(p)}t_s &= -\sum_{j=1}^{\infty} \frac{1}{j} \left\{[I - \exp(At_s)]^j\right\}_{a(p)b(p)} \\
&= -\sum_{j=1}^{\infty} \frac{1}{j} \left\{\sum_{k=0}^{j} \binom{j}{k} I^{j-k}[-\exp(At_s)]^k\right\}_{a(p)b(p)} \\
&= -\sum_{j=1}^{\infty} \frac{1}{j} \sum_{k=0}^{j} (-1)^k \binom{j}{k} [\exp(kAt_s)]_{a(p)b(p)} \\
&= -\sum_{j=1}^{\infty} \frac{1}{j} \sum_{k=0}^{j} (-1)^k \binom{j}{k} y_k^{(p)},
\end{aligned} \tag{4.59}
\]
where
\[
\binom{j}{k} = \frac{j!}{k!(j-k)!}.
\]

We then truncate the infinite series in (4.59) to the $N_D$-th term and reconstruct $A_{a(p)b(p)}$. Based on the above discussion, the identification algorithm is designed as follows.

Algorithm 4.1. Step 1. Establish the system model as (4.50) and (4.51). Using available prior knowledge, choose $t_s$ such that $\|I - \exp(At_s)\| < 1$ (e.g., let $t_s$ satisfy (4.58)). Let $p = 1$.

Step 2. Suppose $\alpha(p)\vartheta_p = A_{a(p)b(p)}$, choose $C = I_{a(p)\sigma}$ and $x_0 = I_{\sigma b(p)}$. Record the sampled data $\hat{y}^{(p)}$.

Step 3. Reconstruct $A_{a(p)b(p)}$ according to
\[
\hat{A}_{a(p)b(p)} = -\frac{1}{t_s} \sum_{j=1}^{N_D} \frac{1}{j} \sum_{k=0}^{j} (-1)^k \binom{j}{k} \hat{y}_k^{(p)}. \tag{4.60}
\]
Then $\hat{\vartheta}_p = \hat{A}_{a(p)b(p)}/\alpha(p)$.

Step 4. When $p < N_H$, let $p = p + 1$, change the values of $\alpha(p)$, $a(p)$ and $b(p)$ accordingly and repeat Step 2 and Step 3 to reconstruct all the elements in $\vartheta$. Finally, $\hat{H}$ can be obtained from (4.50).
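Step 3 of Algorithm 4.1 can be sketched as follows. This is an illustrative implementation under assumptions, not the code used for the simulations: the matrix `A` is a hypothetical antisymmetric tridiagonal matrix, the samples are noise-free and generated by a helper `expm_antisym` that exponentiates via the Hermitian matrix $iA$, and `y[0]` holds the known value $\delta_{a(p)b(p)}$ at $t = 0$, which (4.60) uses as $\hat{y}_0$.

```python
import numpy as np
from math import comb

def expm_antisym(A, t):
    """exp(A t) for a real antisymmetric A, computed from the Hermitian matrix i*A."""
    w, V = np.linalg.eigh(1j * A)              # i*A = V diag(w) V^dagger, so A = -i V diag(w) V^dagger
    return (V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T).real

def reconstruct_entry(y, ts):
    """Truncated series (4.60); y[k] is the sample at time k * ts, with y[0] taken at t = 0."""
    n_d = len(y) - 1
    total = 0.0
    for j in range(1, n_d + 1):
        inner = sum((-1) ** k * comb(j, k) * y[k] for k in range(j + 1))
        total += inner / j
    return -total / ts

theta = [0.1, 1.5, -0.8, 3.1]                  # parameter values borrowed from the simulation section
A = np.diag(theta, 1) - np.diag(theta, -1)     # hypothetical antisymmetric tridiagonal A
ts, n_d = 0.1, 20
a, b = 0, 1                                    # zero-based indices of the entry to identify
y = [expm_antisym(A, k * ts)[a, b] for k in range(n_d + 1)]
estimate = reconstruct_entry(y, ts)
print(estimate, A[a, b])
```

With exact data the only error left is the truncation of the series, which is the quantity bounded later in the error analysis.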

We analyze the computational complexity of Algorithm 4.1, where the time spent in experiments is not considered because it depends on the specific experimental setting. The binomial coefficients in (4.60) can be calculated from the recursion relation
\[
\binom{j}{k} = \binom{j-1}{k-1} + \binom{j-1}{k}.
\]

Therefore, the computational complexity to calculate all $\binom{j}{k}$ ($1 \le j \le N_D$, $0 \le k \le j$) is $O(N_D^2)$, and this calculation can be performed off-line in advance. Hence, the online computational complexity in calculating (4.60) is $O(N_D^2)$. Since there are $N_H$ parameters to reconstruct, the total online computational complexity of the proposed identification algorithm is $O(N_HN_D^2)$.
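The off-line precomputation of the binomial coefficients can be sketched with Pascal's recursion. The function name `binomial_table` is illustrative; each entry costs one addition, so the table for $j \le N_D$ takes $O(N_D^2)$ operations.

```python
from math import comb

def binomial_table(n_d):
    """Rows j = 0..n_d of Pascal's triangle, built with C(j,k) = C(j-1,k-1) + C(j-1,k)."""
    table = [[1]]
    for j in range(1, n_d + 1):
        prev = table[-1]
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, j)] + [1]
        table.append(row)
    return table

table = binomial_table(20)
# Cross-check every entry against the closed form j! / (k! (j-k)!).
matches_closed_form = all(
    table[j][k] == comb(j, k) for j in range(21) for k in range(j + 1)
)
print(table[20][10], matches_closed_form)
```

Python integers are exact, so the table itself carries no rounding error; any amplification of data noise comes from the size of the coefficients.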

Algorithm 4.1 is in essence based on the matrix logarithm function. A natural question then arises: is it possible to employ the matrix exponential function to design economic identification algorithms? We give a positive answer here.

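We rewrite the true value of the data as
\[
\begin{aligned}
y_i^{(p)} = C\exp(iAt_s)B &= \sum_{r=0}^{\infty} \frac{i^rt_s^r}{r!} I_{a(p)\sigma}A^rI_{\sigma b(p)} \\
&= \delta_{a(p)b(p)} + \sum_{r=1}^{\infty} \frac{i^rt_s^r}{r!} (A^r)_{a(p)b(p)} \\
&\approx \delta_{a(p)b(p)} + \sum_{r=1}^{q} \frac{i^rt_s^r}{r!} (A^r)_{a(p)b(p)},
\end{aligned}
\]
where we should choose $q \le N_D$. Since A is always antisymmetric, its diagonal elements are always zero, and we must have $a(p) \ne b(p)$. Hence,
\[
y_i^{(p)} \approx \sum_{r=1}^{q} \frac{i^rt_s^r}{r!} (A^r)_{a(p)b(p)}. \tag{4.61}
\]

Denote $w = N_D\|A\|t_se$ and $z = 1 + \max(\lfloor w \rfloor, q)$ for simplicity. For any sample index $i \le N_D$, we can bound the truncated terms as
\[
\begin{aligned}
\left|\sum_{r=q+1}^{\infty} \frac{i^rt_s^r}{r!} (A^r)_{a(p)b(p)}\right|
&\le \sum_{r=q+1}^{\infty} \left|\frac{1}{\sqrt{2\pi r}} \frac{(it_se)^r}{r^r} (A^r)_{a(p)b(p)}\right| \\
&= \sum_{r=q+1}^{\infty} \frac{1}{\sqrt{2\pi r}} \left(\frac{it_se}{r}\right)^r \left|I_{a(p)\sigma}A^rI_{\sigma b(p)}\right| \\
&\le \sum_{r=q+1}^{\infty} \frac{1}{\sqrt{2\pi r}} \left(\frac{it_se}{r}\right)^r \|I_{a(p)\sigma}\| \cdot \|A\|^r \cdot \|I_{\sigma b(p)}\| \\
&= \sum_{r=q+1}^{\infty} \frac{1}{\sqrt{2\pi r}} \left(\frac{i\|A\|t_se}{r}\right)^r \\
&\le \frac{1}{\sqrt{2\pi(q+1)}} \sum_{r=q+1}^{\infty} \left(\frac{w}{r}\right)^r \\
&\le \frac{1}{\sqrt{2\pi(q+1)}} \sum_{r=q+1}^{z-1} \left(\frac{w}{r}\right)^r + \frac{1}{\sqrt{2\pi(q+1)}} \sum_{r=z}^{\infty} \left(\frac{w}{z}\right)^r \\
&= \frac{1}{\sqrt{2\pi(q+1)}} \sum_{r=q+1}^{z-1} \left(\frac{w}{r}\right)^r + \frac{(w/z)^z}{\sqrt{2\pi(q+1)}\,(1 - w/z)},
\end{aligned}
\]
where the first inequality comes from Stirling's approximation. Hence, the sum of the truncated terms is always finite.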

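Denote a vector $\Psi(p, q) = [\psi_1(p), \psi_2(p), \dots, \psi_q(p)]^T$ where $\psi_i(p) = (A^i)_{a(p)b(p)}$. Then we need to identify $\alpha(p)\vartheta_p = A_{a(p)b(p)} = \psi_1(p)$. Denote
\[
L = \begin{pmatrix}
\frac{1^1t_s^1}{1!} & \frac{1^2t_s^2}{2!} & \cdots & \frac{1^qt_s^q}{q!} \\
\frac{2^1t_s^1}{1!} & \frac{2^2t_s^2}{2!} & \cdots & \frac{2^qt_s^q}{q!} \\
\vdots & \vdots & & \vdots \\
\frac{N_D^1t_s^1}{1!} & \frac{N_D^2t_s^2}{2!} & \cdots & \frac{N_D^qt_s^q}{q!}
\end{pmatrix}_{N_D\times q}. \tag{4.62}
\]

From (4.61), we have $y^{(p)} \approx L\Psi(p, q)$. We use a least-squares method to obtain an estimate
\[
\hat{\Psi}(p, q) = (L^TL)^{-1}L^T\hat{y}^{(p)}, \tag{4.63}
\]
and $\hat{\vartheta}_p = \hat{\psi}_1(p)/\alpha(p)$.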

When determining the sampling period, the Nyquist Sampling Theorem should be satisfied. For details one can refer to the supplementary material of [165]. For example, assume that we know a priori that $\|A\| \le F$; then we require $Ft_s < \pi$. We summarize the procedure of this algorithm as follows.


Algorithm 4.2. Step 1. Establish the system model as (4.50) and (4.51). Choose $t_s$ such that the Nyquist Sampling Theorem is satisfied. Let $p = 1$.

Step 2. Suppose $\alpha(p)\vartheta_p = A_{a(p)b(p)}$, choose $C = I_{a(p)\sigma}$ and $x_0 = I_{\sigma b(p)}$. Record the sampled data $\hat{y}^{(p)}$.

Step 3. Reconstruct $\hat{\Psi}(p, q)$ according to (4.62) and (4.63). Then $\hat{\vartheta}_p = \hat{\psi}_1(p)/\alpha(p)$.

Step 4. When $p < N_H$, let $p = p + 1$, change the values of $\alpha(p)$, $a(p)$ and $b(p)$ accordingly and repeat Step 2 and Step 3 to reconstruct all the elements in $\vartheta$. Finally, $\hat{H}$ can be obtained from (4.50).
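Steps 2 and 3 of Algorithm 4.2 can be sketched as below. The setting is hypothetical and chosen only so that the truncation in (4.61) is mild: `A` is an assumed antisymmetric tridiagonal matrix, the data are noise-free, and `numpy.linalg.lstsq` is used as a numerically safer stand-in for forming $(L^TL)^{-1}L^T$ explicitly. The helper names are illustrative.

```python
import numpy as np
from math import factorial

def expm_antisym(A, t):
    """exp(A t) for a real antisymmetric A, computed from the Hermitian matrix i*A."""
    w, V = np.linalg.eigh(1j * A)
    return (V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T).real

def design_matrix(n_d, q, ts):
    """Matrix L of (4.62) with entries (i * ts)^r / r!, i = 1..N_D, r = 1..q."""
    times = ts * np.arange(1, n_d + 1, dtype=float)
    return np.column_stack([times ** r / factorial(r) for r in range(1, q + 1)])

def estimate_first_power(y_hat, q, ts):
    """Least-squares estimate (4.63); returns psi_hat_1, the estimate of A_{a(p)b(p)}."""
    L = design_matrix(len(y_hat), q, ts)
    psi_hat, *_ = np.linalg.lstsq(L, y_hat, rcond=None)
    return psi_hat[0]

theta = [0.1, 1.5, -0.8, 3.1]                  # parameter values borrowed from the simulation section
A = np.diag(theta, 1) - np.diag(theta, -1)     # hypothetical antisymmetric tridiagonal A
ts, n_d, q = 0.02, 12, 5                       # hypothetical sampling period, data length, order
a, b = 0, 1                                    # zero-based indices of the entry to identify
y = np.array([expm_antisym(A, i * ts)[a, b] for i in range(1, n_d + 1)])
estimate = estimate_first_power(y, q, ts)
print(estimate, A[a, b])
```

Only the first component of $\hat{\Psi}$ is needed, so the higher-order coefficients act as nuisance parameters that absorb the curvature of the sampled output.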

We analyze the computational complexity of Algorithm 4.2. The matrix L in (4.62) and the quantity $(L^TL)^{-1}L^T$ in (4.63) can be calculated off-line in advance, and thus do not count toward the online computation. Hence, we only need to perform a matrix multiplication in (4.63), which has complexity $O(qN_D)$. Since there are $N_H$ parameters to reconstruct, the total online computational complexity is $O(qN_HN_D)$.

Since q ≤ ND, the computational complexity of Algorithm 4.2 is not larger than that of Algorithm 4.1.

Remark 4.3. The essence of Algorithm 4.1 and Algorithm 4.2 is to estimate the unknown parameters one by one independently. Hence, these two algorithms are also applicable and eﬃcient in the case when there are a large number of unknown parameters but one may be only interested in a small portion of them.

4.5.3 Error analysis

We present an analysis of the identification error in Algorithm 4.1 and Algorithm 4.2. We start with Algorithm 4.1, and characterize the truncation error induced by truncating the infinite series (4.59) to a finite one (4.60). We propose the following theorem.

Theorem 4.5. Assuming that the experimental data exactly equal the true values, the identification error of Algorithm 4.1 satisfies
\[
\|\hat{H} - H\| \le \frac{\sqrt{N_H}\,\|I - \exp(At_s)\|^{N_D+1}}{t_s \min_j |\alpha(j)|\,(N_D+1)\,(1 - \|I - \exp(At_s)\|)}.
\]

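Proof. According to (4.59) and (4.60), we have
\[
\begin{aligned}
t_s|\hat{A}_{a(p)b(p)} - A_{a(p)b(p)}| &= \left|\left[\sum_{j=N_D+1}^{\infty} \frac{1}{j} (I - \exp(At_s))^j\right]_{a(p)b(p)}\right| \\
&\le \left\|\sum_{j=N_D+1}^{\infty} \frac{1}{j} [I - \exp(At_s)]^j\right\| \\
&\le \frac{1}{N_D+1} \sum_{j=N_D+1}^{\infty} \left\|[I - \exp(At_s)]^j\right\| \\
&\le \frac{1}{N_D+1} \sum_{j=N_D+1}^{\infty} \|I - \exp(At_s)\|^j \\
&\le \frac{\|I - \exp(At_s)\|^{N_D+1}}{(N_D+1)(1 - \|I - \exp(At_s)\|)},
\end{aligned} \tag{4.64}
\]
where the fourth line comes from (A.7).

We then bound the estimation error as
\[
\begin{aligned}
\|\hat{H} - H\| &= \left\|\sum_{m=1}^{N_H} (\hat{\vartheta}_m - \vartheta_m)H_m\right\| \\
&= \sqrt{\sum_{m,n=1}^{N_H} (\hat{\vartheta}_m - \vartheta_m)(\hat{\vartheta}_n - \vartheta_n)\langle H_m, H_n\rangle} \\
&= \sqrt{\sum_{m=1}^{N_H} (\hat{\vartheta}_m - \vartheta_m)^2} = \sqrt{\sum_{m=1}^{N_H} (\hat{A}_{a(m)b(m)} - A_{a(m)b(m)})^2/\alpha(m)^2} \\
&\le \frac{1}{\min_j |\alpha(j)|} \sqrt{\sum_{m=1}^{N_H} (\hat{A}_{a(m)b(m)} - A_{a(m)b(m)})^2} \\
&\le \frac{\sqrt{N_H}\,\|I - \exp(At_s)\|^{N_D+1}}{t_s \min_j |\alpha(j)|\,(N_D+1)\,(1 - \|I - \exp(At_s)\|)}.
\end{aligned} \tag{4.65}
\]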

Since $\|I - \exp(At_s)\| < 1$, Theorem 4.5 indicates that the truncation error should decrease exponentially as the data length increases. However, if error sources other than truncation are considered, the performance of Algorithm 4.1 is not satisfactory. The reason might be that the series (4.57) converges too slowly and the binomial coefficients in (4.60) grow fast, which may amplify the error in the data. In comparison, Algorithm 4.2 can perform better when the experimental data contain various kinds of noise, as stated in the following theorem.


Theorem 4.6. Assume the experimental data are subject to additive Gaussian noise with zero mean and correlation matrix $c^2I$. Using Algorithm 4.2, the MSE $E\{\mathrm{Tr}[(\hat{H} - H)^2]\}$ is linear in $c^2$.

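Proof. Let $\eta$ denote the noise vector such that $\hat{y}^{(p)} = y^{(p)} + \eta$. Also let
\[
R(p, q) = \sum_{r=q+1}^{\infty} \frac{t_s^r}{r!} (A^r)_{a(p)b(p)} (1^r, 2^r, \dots, N_D^r)^T.
\]

Its convergence can be seen from the fact that the factorial function increases much faster than the power function. The estimated vector is
\[
\begin{aligned}
\hat{\Psi}(p, q) &= (L^TL)^{-1}L^T(y^{(p)} + \eta) \\
&= (L^TL)^{-1}L^T[L\Psi(p, q) + R + \eta] \\
&= \Psi(p, q) + (L^TL)^{-1}L^TR + (L^TL)^{-1}L^T\eta.
\end{aligned}
\]

We then have
\[
\begin{aligned}
E[(\hat{\psi}_1(p) - \psi_1(p))^2] &= E[(I_{1\sigma}\hat{\Psi}(p, q) - I_{1\sigma}\Psi(p, q))^2] \\
&= R^TL(L^TL)^{-1}I_{\sigma 1}I_{1\sigma}(L^TL)^{-1}L^TR + \mathrm{Tr}[L(L^TL)^{-1}I_{\sigma 1}I_{1\sigma}(L^TL)^{-1}L^T E(\eta\eta^T)] \\
&= [R^TL(L^TL)^{-1}I_{\sigma 1}]^2 + c^2 I_{1\sigma}(L^TL)^{-1}I_{\sigma 1},
\end{aligned} \tag{4.66}
\]
where the cross term vanishes because $\eta$ has zero mean. Therefore $E[(\hat{A}_{a(p)b(p)} - A_{a(p)b(p)})^2]$ is linear in $c^2$. From the third line of (4.65) we know that $E\{\mathrm{Tr}[(\hat{H} - H)^2]\}$ is also linear in $c^2$.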
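The noise term in (4.66) depends only on the design matrix L, so its coefficient can be evaluated before any experiment. The sketch below uses a hypothetical, reasonably well-conditioned setting ($N_D = 20$, $q = 4$, $t_s = 0.1$) and an illustrative helper `design_matrix`; it compares $I_{1\sigma}(L^TL)^{-1}I_{\sigma 1}$ with the squared norm of the first row of $(L^TL)^{-1}L^T$, which is the weight vector applied to the noise.

```python
import numpy as np
from math import factorial

def design_matrix(n_d, q, ts):
    """Matrix L of (4.62) with entries (i * ts)^r / r!, i = 1..N_D, r = 1..q."""
    times = ts * np.arange(1, n_d + 1, dtype=float)
    return np.column_stack([times ** r / factorial(r) for r in range(1, q + 1)])

L = design_matrix(n_d=20, q=4, ts=0.1)         # hypothetical setting
gain = np.linalg.inv(L.T @ L)[0, 0]            # I_{1 sigma} (L^T L)^{-1} I_{sigma 1}
weights = np.linalg.pinv(L)[0]                 # first row of (L^T L)^{-1} L^T

# The noise part of psi_hat_1 is weights @ eta, so its variance is c^2 * (weights @ weights).
noise_variances = (1e-7, 1e-6, 1e-5)
contribution = {c2: c2 * gain for c2 in noise_variances}
print(gain, weights @ weights, contribution)
```

A large `gain` signals that the chosen $N_D$, q and $t_s$ amplify measurement noise strongly, independently of the Hamiltonian being identified.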

4.5.4 Simulation performance

We perform numerical simulations to test the performance of the two economic Hamiltonian identiﬁcation algorithms.

We consider a 5-qubit exchange model without transverse field ($n = 4$ in (4.16)), and the values of the Hamiltonian parameters are $\vartheta = (0.1, 1.5, -0.8, 3.1)$. The accessible set is $\bar{G} = \{X_1, Z_1Y_2, Z_1Z_2X_3, Z_1Z_2Z_3Y_4, Z_1Z_2Z_3Z_4X_5\}$. We set the initial states of the system as the eigenstates of $Z_1Y_2$, $Z_1Z_2X_3$, $Z_1Z_2Z_3Y_4$, $Z_1Z_2Z_3Z_4X_5$ and observe $X_1$, $Z_1Y_2$, $Z_1Z_2X_3$, $Z_1Z_2Z_3Y_4$, respectively. From Theorem 4.4 we know all the parameters are identifiable.

Figure 4.2: Performance of Algorithm 4.1 with different data lengths. [Plot: horizontal axis, data length $N_D$ (4 to 20); vertical axis, log of relative error, $\log_{10} E\,\|\hat{H} - H\| / \|H\|$ (−12 to −2).]

First we identify the Hamiltonian using Algorithm 4.1. The sampling period is $t_s = 0.1$. We perform identification with different data lengths $N_D$ and plot the result in Fig. 4.2. The logarithm of the truncation error decreases approximately linearly with the data length, which is consistent with Theorem 4.5.
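The near-linear trend on a logarithmic scale can also be read off the last line of the bound (4.65). The sketch below evaluates that bound for a range of data lengths; the numerical values of `x` (standing for ||I − exp(A ts)||), `n_h` and `alpha_min` are hypothetical and are not taken from the simulation above.

```python
import math

def truncation_bound(n_d, x, t_s, n_h, alpha_min):
    """Last line of (4.65): upper bound on ||H_hat - H|| for data length n_d.

    x stands for ||I - exp(A t_s)|| and must satisfy 0 < x < 1.
    """
    return math.sqrt(n_h) * x ** (n_d + 1) / (
        t_s * alpha_min * (n_d + 1) * (1.0 - x)
    )

# Hypothetical values chosen only for illustration.
x, t_s, n_h, alpha_min = 0.3, 0.1, 4, 1.0
logs = [math.log10(truncation_bound(n, x, t_s, n_h, alpha_min))
        for n in range(4, 21)]
# Decrease of log10(bound) per extra data point; tends to log10(x) from below.
steps = [b - a for a, b in zip(logs, logs[1:])]
print(steps[0], steps[-1], math.log10(x))
```

Each extra data point lowers the logarithm of the bound by slightly more than |log10(x)|, so the curve is close to a straight line with slope log10(x).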

We then identify the Hamiltonian using Algorithm 4.2. We set the sampling period as $t_s = 0.1$ and the parameter $q = \lfloor 0.3N_D \rfloor + 3$. We add zero-mean Gaussian noise to the sampled data, with the variance of the noise varying from $10^{-7}$ to $10^{-5}$. The identification result is shown in Fig. 4.3, where each point is the average of 500 repeated runs. The error and the noise variance generally have a linear relationship, which is consistent with Theorem 4.6.

Figure 4.3: Performance of Algorithm 4.2 with different noise variances. [Plot: horizontal axis, noise variance $c^2$ (0 to $1 \times 10^{-5}$); vertical axis, relative identification error, $E\,\|\hat{H} - H\|^2 / \|H\|^2$ (0 to $5 \times 10^{-3}$).]

We finally test the performance of Algorithm 4.2 with different data lengths, where the sampling period and $q$ are the same as in the simulation of Fig. 4.3. The noise variance is $10^{-6}$. The result is plotted in Fig. 4.4, where each point is the average of 500 repeated runs. The robustness of Algorithm 4.2 still has room for improvement, and we leave it as an open problem to develop economic identification algorithms with better performance.

4.6 Conclusion and open problems

In this chapter, we have extended the STA method in classical control theory to the domain of quantum Hamiltonian identification, and employed the STA method to prove the identifiability of spin-1/2 chain systems assisted by single-qubit probes [132]. STA has been demonstrated to be a powerful tool to analyze the identifiability for quantum systems with arbitrary dimension, which is also helpful for further designing identification algorithms. STA can also serve as a useful method for physicists to investigate the information extraction capability of quantum subsystems (like the single qubit probe in [132]).

Figure 4.4: Performance of Algorithm 4.2 with different data lengths. [Plot: horizontal axis, data length $N_D$ (4 to 20); vertical axis, relative identification error, $E\,\|\hat{H} - H\| / \|H\|$ (0.015 to 0.045).]

For non-minimal systems, an SPT method was proposed to efficiently test the identifiability, while preserving the key information in the original system matrix. We further employed the SPT method to provide an indicator for the existence of economic quantum Hamiltonian identification algorithms. The SPT method has proved to be a strong supplement to STA. SPT is also applicable to classical control systems, especially when the experimental settings are difficult to change.

We proposed two examples of economic quantum Hamiltonian identification algorithms, based on expansions of the matrix logarithm function and exponential function. The computational complexities of the two algorithms are directly determined by the data length and the number of unknown parameters, which can be quite small in some classes of physical systems. We presented analysis and numerical results to illustrate the performance of the identification algorithms.

Open problems in this domain include:

(i) The question of how to develop a general framework using STA to characterize the amount of identiﬁable information for an unidentiﬁable system is open.

(ii) It may be helpful to propose other general suﬃcient or necessary conditions for a system/some parameters to be identiﬁable, like the symmetry method in Sec. 4.4.2.2.

(iii) It is useful to develop other economic Hamiltonian identiﬁcation algorithms with better accuracy and robustness.

Chapter 5

Quantum Hamiltonian/gate identiﬁcation via TSO and PGI

The work, reported in this chapter, has been partially published in the following articles:

1. Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, Quantum gate identification: Error analysis, numerical results and optical experiment, Automatica, vol. 101, pp. 269-279, 2019.

2. Y. Wang, D. Dong, B. Qi, J. Zhang, I. R. Petersen, and H. Yonezawa, A quantum Hamiltonian identification algorithm: Computational complexity and error analysis, IEEE Transactions on Automatic Control, vol. 63, no. 5, pp. 1388-1403, 2018.

3. Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, Efficient identification of unitary quantum processes, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 196-201, Gold Coast, Australia, December 2017.

4. D. Dong, and Y. Wang, Several recent developments in estimation and robust control of quantum systems, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 190-195, Gold Coast, Australia, December 2017.

5. Y. Wang, B. Qi, D. Dong, and I. R. Petersen, An iterative algorithm for Hamiltonian identification of quantum systems, in 2016 IEEE 55th Annual Conference on Decision and Control (CDC), pp. 2523-2528, Las Vegas, USA, December 2016.

5.1 Introduction

Quantum processes, also called quantum operations, are linear, trace-preserving and completely positive maps that transform quantum states in one space to quantum states in another space [154]. Characterizing an unknown quantum process is vital for verifying and benchmarking quantum devices for quantum computation, communication and metrology [104]. The standard solution to characterizing a quantum process is quantum process tomography (QPT), wherein usually known input quantum states (probe states) are applied to the process and the output states are measured to reconstruct the quantum process [3, 51, 120].

For a closed quantum system, its state undergoes unitary evolution, which can be viewed as a special class of quantum process. The system Hamiltonian, as the generator of the unitary propagator, completely determines the state evolution, and it is thus an essential component in characterizing the dynamics of the system. In quantum information, the unitary evolutions on single or multiple qubits are called quantum gates, which serve as the quantum analog of logic gates in classical digital circuits [104]. Therefore, Hamiltonians and gates are closely related aspects of unitary evolution, and their identification is an active research topic in quantum system dynamics.

System identification has been widely investigated in classical (non-quantum) control theory and application, and many identification algorithms have been developed to estimate unknown dynamical parameters of linear or nonlinear input-output systems [30, 44, 94, 134]. In recent years, the problem of quantum system identification has attracted more and more attention due to the rapid development of emerging quantum technology [160, 162] and the increasing demand for characterizing quantum devices. For example, a framework for quantum system identification has been established in [27] to classify how much knowledge about a quantum system is attainable from a given experimental setup. Guţă and Yamamoto [63] considered a class of passive linear quantum input-output systems, and investigated the problem of identifiability and how to optimize the identification precision by preparing good input states and performing appropriate measurements on the output states.

In this chapter, first we focus on the problem of quantum Hamiltonian identification (QHI), which is a key task in characterizing the dynamics of quantum systems and achieving high-precision quantum control. There exist some results on QHI and various aspects of QHI have been investigated [26, 48, 52, 132]. For example, a symmetry-preserving observer has been developed for the Hamiltonian identification of a two-level quantum system [19]. The question of how to utilize quantum control to identify a Hamiltonian for a controllable system with nondegenerate transitions has been addressed [89]. Closed-loop learning control has been presented to optimally identify Hamiltonian information [57] and compressed sensing has been proposed to enhance the efficiency of identification algorithms for Hamiltonians with special structures [119, 127]. Several Hamiltonian identification algorithms have been developed using only measurement in a single fixed basis [122, 123, 125]. Wang et al. [149] utilized dynamical decoupling to identify Hamiltonians for quantum many-body systems with arbitrary couplings. Cole et al. [34] discussed the estimation error in identifying a two-state Hamiltonian and Zhang et al. [165] presented a QHI protocol using measurement time traces.

For quantum gate identification (QGI), a natural approach is to view the unitary gate as a special class of quantum process. Many results have been obtained from this point of view. Gutoski and Johnston [64] proved that any $d$-dimensional unitary channel can be determined with only $O(d^2)$ interactive observables. Baldwin et al. [6] showed that a $d$-dimensional unitary map is completely characterized by a minimal set of $d^2 + d$ measurement outcomes and this needs to be achieved using at least $d$ probe pure states. Wang et al. [146] proposed an adaptive unitary process tomography protocol which needs only $d^2 + d - 1$ measurement outcomes for a $d$-dimensional system. For quantum gates having an effective matrix product operator representation, Holzäpfel et al. [71] presented a tomography method that requires only measurements of linearly many local observables on the subsystems. Furthermore, standard quantum process tomography methods can be used to identify an unknown quantum gate. For example, maximum likelihood estimation (MLE) for QPT has been applied to gate identification in [13, 101, 107]. A Bayesian deduction method for QPT has also been applied to gate identification in [138].

There are also other methods to solve the gate identification problem. Kimmel et al. [84] developed a parameter estimation technique to calibrate key systematic parameters in a universal single-qubit gate set and achieved good robustness and efficiency. Rodionov et al. [118] utilized the compressed sensing QPT method from [126] to characterize quantum gates based on superconducting Xmon and phase qubits. They showed that the compressed sensing method may reduce the amount of required resources significantly. Kimmel et al. [83] used a randomized benchmarking method to reconstruct a unitary evolution and achieved robustness to preparation, measurement and gate imperfections. Many of these existing results on QHI and QGI have limitations for practical applications (e.g., estimating a single parameter [125, 163], identifying special Hamiltonians [34, 123]), and there are few theoretical results on the analysis of computational complexity and upper bounds on estimation errors.

In Sec. 5.2, we present an identification algorithm for general time-independent Hamiltonians and analyze its computational complexity and upper bounds on estimation errors. Our quantum Hamiltonian identification method is established within the framework of quantum process tomography. Some different input states are prepared for quantum systems and the corresponding output states are measured after a fixed time evolution under the Hamiltonian to be identified. These output states are reconstructed using the quantum state tomography technique via linear regression estimation (LRE) [113]. Using the information of estimated output states, the Hamiltonian is reconstructed via a two-step optimization (TSO) identification algorithm. Matrix differentiation and Schur's decomposition are used in the development of the TSO algorithm. The computational complexity is $O(d^6)$ for a $d$-dimensional Hamiltonian. We also establish an error upper bound of $O(d^3/\sqrt{N_o})$, where $N_o$ is the resource number in the tomography of each output state. We then numerically compare TSO with the QHI method using measurement time traces in [166], and our identification algorithm shows an efficiency advantage over the method in [166] in terms of the computational time.

We further extend the TSO algorithm to a Pure-state-based Gate Identification (PGI) method with lower computational complexity in Sec. 5.3. We notice that for a unitary quantum gate, the output state is always a pure state if the input state is pure. We thus design a fast pure-state tomography algorithm to reconstruct the output states. Then we improve the TSO algorithm to reconstruct the Hamiltonian more efficiently. The total computational complexity is thus reduced to $O(d^3)$. We also demonstrate that the expectation of the error scales as $O(d^{2.5}/\sqrt{N_o})$. Numerical comparison with the MLE method demonstrates the effectiveness and efficiency advantage of the PGI algorithm. We perform a quantum optical experiment on a one-qubit Hadamard gate to demonstrate the theoretical result.

Sec. 5.4 concludes this chapter and presents several open problems.

5.2 TSO Hamiltonian identiﬁcation algorithm

In this section, we introduce our TSO QHI method. First, we rephrase the framework of QPT in [104] in the matrix form, which is fundamental for the later illustration of our algorithm.

5.2.1 Quantum process tomography

For an open quantum system, the transformation from an input state $\rho_{in}$ to an output state $\rho_{out}$ can be given by the Kraus operator-sum representation

\[
\rho_{out} = \mathcal{E}(\rho_{in}) = \sum_i A_i \rho_{in} A_i^\dagger, \tag{5.1}
\]

where the quantum operation $\mathcal{E}$ maps $\rho_{in}$ to $\rho_{out}$ and $\{A_i\}$ is a set of mappings (called Kraus operators) from the input Hilbert space to the output Hilbert space with $\sum_i A_i^\dagger A_i \le I$. The form (5.1) already contains the completely positive restriction on the process, and the trace-preserving restriction (2.7) means that the completeness relation

\[
\sum_i A_i^\dagger A_i = I \tag{5.2}
\]

is satisfied. In particular, we consider $d$-dimensional quantum systems and have $A_i \in \mathbb{C}^{d \times d}$.

By expanding $\{A_i\}$ in a fixed family of basis matrices $\{F_i\}$, we obtain

\[
A_i = \sum_j c_{ij} F_j, \tag{5.3}
\]

and then

\[
\mathcal{E}(\rho) = \sum_{jk} F_j \rho F_k^\dagger x_{jk},
\]

with $x_{jk} = \sum_i c_{ij} c_{ik}^*$. If we define the matrix $C = [c_{ij}]$ and the matrix $X = [x_{ij}]$, then

\[
X = C^T C^*, \tag{5.4}
\]

which indicates that $X$ must be Hermitian and positive semidefinite. $X$ is called the process matrix [107]. The completeness constraint equation (5.2) becomes

\[
\sum_{j,k} x_{jk} F_k^\dagger F_j = I. \tag{5.5}
\]

It is diﬃcult to further simplify this relationship before the structure of {Fi} is determined. Note that the matrix X and the process E are in a one-to-one correspondence. Hence, we can obtain a full characterization of E by reconstructing X.

Let $\{\rho_m\}$ be a complete basis set of $\mathbb{C}^{d \times d}$, in the sense that every matrix in $\mathbb{C}^{d \times d}$ can be expressed as a finite complex linear combination of $\{\rho_m\}$. For example, all Pauli matrices together with $I_{2 \times 2}$ form a complete basis set of $\mathbb{C}^{2 \times 2}$. If we let $\{\rho_m\}$ be linearly independent matrices (with respect to addition between matrices, and multiplication between a scalar and a matrix) and we input $\rho_m$ to the process, then each process output can be expanded uniquely in the basis set $\{\rho_n\}$; i.e.,

\[
\rho_{out} = \mathcal{E}(\rho_{in}) = \mathcal{E}(\rho_m) = \sum_n \xi_{mn} \rho_n. \tag{5.6}
\]

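For simplicity, we choose $\{\rho_n\}$ to be the same set as $\{\rho_m\}$ although they could be different. We then need to find the relationship between $X$ and $\xi$, which is independent of the bases $\{F_i\}$. Considering the effects of the bases $\{\rho_n\}$ on $\{\rho_m\}$, we have

\[
F_j \rho_m F_k^\dagger = \sum_n \beta_{mn}^{jk} \rho_n. \tag{5.7}
\]

Hence,

\[
\sum_n \sum_{jk} \beta_{mn}^{jk} \rho_n x_{jk} = \sum_n \xi_{mn} \rho_n.
\]

From the linear independence of $\{\rho_n\}$, one can obtain

\[
\sum_{jk} \beta_{mn}^{jk} x_{jk} = \xi_{mn}. \tag{5.8}
\]

To rewrite this equation into a compact form, define the matrix $\Xi = [\xi_{mn}]$ and arrange the elements $\beta_{mn}^{jk}$ into a matrix $B$:

\[
B = \begin{pmatrix}
\beta_{11}^{11} & \beta_{11}^{21} & \cdots & \beta_{11}^{12} & \beta_{11}^{22} & \cdots & \beta_{11}^{d^2 d^2} \\
\beta_{21}^{11} & \beta_{21}^{21} & \cdots & \beta_{21}^{12} & \beta_{21}^{22} & \cdots & \beta_{21}^{d^2 d^2} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
\beta_{12}^{11} & \beta_{12}^{21} & \cdots & \beta_{12}^{12} & \beta_{12}^{22} & \cdots & \beta_{12}^{d^2 d^2} \\
\beta_{22}^{11} & \beta_{22}^{21} & \cdots & \beta_{22}^{12} & \beta_{22}^{22} & \cdots & \beta_{22}^{d^2 d^2} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
\beta_{d^2 d^2}^{11} & \beta_{d^2 d^2}^{21} & \cdots & \beta_{d^2 d^2}^{12} & \beta_{d^2 d^2}^{22} & \cdots & \beta_{d^2 d^2}^{d^2 d^2}
\end{pmatrix}_{d^4 \times d^4} \tag{5.9}
\]

so that we have

\[
B\,\mathrm{vec}(X) = \mathrm{vec}(\Xi). \tag{5.10}
\]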

Here, $B$ is determined once the bases $\{F_i\}$ and $\{\rho_m\}$ are chosen, and $\Xi$ is obtained from experimental data. $B$, $X$ and $\Xi$ are in general complex matrices. Note that $X$ should be Hermitian and positive semidefinite and satisfy the constraint (5.5). Considering that practical data $\hat{\Xi}$ usually have noise or uncertainty, direct inversion or pseudo-inversion of $B$ may fail to generate a physical solution. We try to find a physical estimate $\hat{X}$ which will generate an output $\hat{\rho}'_{ge}$ as close as possible to the estimated results $\hat{\rho}'$ from quantum state tomography. Because $\hat{\rho}'_{ge}$ and $\hat{\rho}'$ are characterized by $\hat{\Xi}_{ge}$ and $\hat{\Xi}$, respectively, we should minimize $\|\hat{\Xi}_{ge} - \hat{\Xi}\|$. Since

\[
\|\hat{\Xi}_{ge} - \hat{\Xi}\| = \|\mathrm{vec}(\hat{\Xi}_{ge}) - \mathrm{vec}(\hat{\Xi})\| = \|B\,\mathrm{vec}(\hat{X}) - \mathrm{vec}(\hat{\Xi})\|,
\]

we will take $\|B\,\mathrm{vec}(\hat{X}) - \mathrm{vec}(\hat{\Xi})\|$ as a performance index.

The problem is now the following optimization problem:

Problem 5.1. Given the matrix $B$ and experimental data $\hat{\Xi}$, find a Hermitian and positive semidefinite estimate $\hat{X}$ minimizing $\|B\,\mathrm{vec}(\hat{X}) - \mathrm{vec}(\hat{\Xi})\|$, such that (5.5) is satisfied.

It is difficult to obtain an analytical solution to Problem 5.1. In this thesis, we do not directly solve Problem 5.1 since the problem of QHI can be further specified based on Problem 5.1. Here we complete the deduction after one obtains an estimate $\hat{X}$. The remaining task is to obtain Kraus operators $\{\hat{A}_i\}$. Since $\hat{X}$ is Hermitian, it has spectral decomposition

\[
\hat{X} = \sum_{i=1}^{d^2} u_i |v_i\rangle\langle v_i|,
\]

where $u_i$ are real eigenvalues. Then

\[
\hat{C}^T = \sum_{i=1}^{d^2} \sqrt{u_i}\, |v_i\rangle\langle v_i|,
\]

and

\[
\hat{A}_i = \sum_j c_{ij} F_j.
\]

Though $X$ and $\mathcal{E}$ are in one-to-one correspondence, a notable property of the Kraus operator-sum representation is its non-uniqueness; i.e., there may be more than one set of Kraus operators that gives rise to the same process $\mathcal{E}$. This comes from the procedure of decomposing $X$ into $C^T C^*$, which is in fact non-unique because $X = C^T C^* = (C^T U^T)(U^* C^*)$ holds for any unitary $U$. Hence, the deduction of $C$ from $X$ is non-unique.
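The relation between Kraus operators and the process matrix, and the non-uniqueness just described, can be illustrated with a small numerical sketch. The amplitude-damping channel, its damping rate and the helper names `process_matrix` and `apply_process` are assumptions made for this example only; the natural basis is used for {Fi}.

```python
import numpy as np

d = 2
# Natural basis F_i = |j><k| with i = (j-1)d + k (row-major order).
F = [np.outer(np.eye(d)[j], np.eye(d)[k]).astype(complex)
     for j in range(d) for k in range(d)]

def process_matrix(kraus):
    """X = C^T C^*, where row i of C holds the coefficients of A_i in {F_j}."""
    C = np.array([A.reshape(-1) for A in kraus])   # c_ij = entry of A_i at |j><k|
    return C.T @ C.conj()

def apply_process(X, rho):
    """E(rho) = sum_{jk} x_jk F_j rho F_k^dagger."""
    return sum(X[j, k] * F[j] @ rho @ F[k].conj().T
               for j in range(d * d) for k in range(d * d))

# Amplitude-damping channel with an illustrative damping rate.
gamma = 0.3
A0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
A1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
kraus = [A0, A1]

# A second Kraus set obtained by mixing with a unitary: A'_i = sum_l U_il A_l.
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
kraus_mixed = [U[i, 0] * A0 + U[i, 1] * A1 for i in range(2)]

X = process_matrix(kraus)
X_mixed = process_matrix(kraus_mixed)
rho = np.array([[0.25, 0.1j], [-0.1j, 0.75]])
direct = sum(A @ rho @ A.conj().T for A in kraus)
print(np.allclose(X, X_mixed), np.allclose(apply_process(X, rho), direct))
```

Both Kraus sets yield the same Hermitian, positive semidefinite $X$, and the operator-sum form and the process-matrix form give the same output state.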

5.2.2 Problem formulation of Hamiltonian identiﬁcation

We write the closed-system evolution (2.5) in the following form

\[
\rho(t) = U(t)\rho(0)U^\dagger(t), \tag{5.11}
\]

where $U(t) = \exp(-iHt)$ and $H$ is a $d$-dimensional time-independent Hamiltonian to be identified. If we compare (5.11) with the Kraus representation (5.1), it is clear that the unitary propagator $U(t)$ is the only Kraus operator. Then from (5.3) we know that the matrix $C$ is a row vector. Hence, from (5.4) we know $X$ is of rank one. It is worth mentioning that, for any given process $\mathcal{E}$, although the Kraus operator-sum representation is not unique, the process matrix $X$ is in fact uniquely determined. Although there might be other Kraus operator-sum representations where the number of operators is more than 1, the conclusion that $X$ is of rank one is always true. Furthermore, when $X$ is of rank one, the semidefinite requirement is naturally satisfied. We thus let $X = gg^\dagger$ and $g = \mathrm{vec}(G)$.

Now we need to determine basis sets $\{F_i\}$ and $\{\rho_m\}$. Proper choice of these basis sets can greatly simplify the QHI problem, and we thus choose both of them as the natural basis $\{|j\rangle\langle k|\}_{1 \le j,k \le d}$, because the natural basis can simplify the completeness requirement (5.5) and Problem 5.1. These advantages can be demonstrated as follows.

Proposition 5.1. If $\{F_i\}$ is chosen as the natural basis and the relationship between $i$, $j$ and $k$ is $i = (j-1)d + k$, then the completeness constraint reads $\mathrm{Tr}_A X = I_d$.

Proposition 5.1 is the Choi-Jamiołkowski isomorphism [4, 172]. We restate its proof using the notation of this thesis in Appendix I. In this chapter, whenever we need to order number pairs $(x, y)$ ($1 \le x, y \le K$) we identify $(x, y)$ with $(x-1)K + y$ unless declared otherwise.
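For a unitary process, the rank-one structure $X = gg^\dagger$ and the partial-trace form of the completeness constraint can be checked directly. The sketch below assumes that vec(·) stacks columns and that subsystem A corresponds to the first index $j$ of the pair $(j, k)$; both are assumptions of this example (they are the conventions under which the check works out), and the Hadamard gate with a global phase is only an illustrative propagator.

```python
import numpy as np

d = 2
# Example propagator: a Hadamard gate multiplied by an arbitrary global phase.
A = np.exp(0.4j) * np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

# The coefficient of |j><k| in A is A[j, k]; with i = (j-1)d + k this is the
# row-major flattening of A, i.e. the column-stacking vec(G) for G = A^T.
G = A.T
g = G.reshape(-1, order="F")          # column-stacking vec(G)
X = np.outer(g, g.conj())             # X = g g^dagger

# Partial trace over the first index j of the pair (j, k).
tr_A = np.einsum("jkjl->kl", X.reshape(d, d, d, d))
print(np.linalg.matrix_rank(X), np.round(tr_A, 12))
```

Under these conventions the partial trace equals $GG^\dagger$, so the completeness constraint reduces to the unitarity of $G$.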

The natural basis is also useful in transforming Problem 5.1 into an optimization problem in a more convenient form:

Problem 5.2. Given the matrix $B$ and experimental data $\hat{\Xi}$, find a Hermitian and positive semidefinite estimate $\hat{X}$ minimizing $\|\hat{X} - \mathrm{vec}^{-1}(B^{-1}\mathrm{vec}(\hat{\Xi}))\|$, such that constraint (5.5) is satisfied.

Problem 5.2 is not necessarily equivalent to Problem 5.1. We need to determine when B is invertible and when these two problems are equivalent. To answer these two questions, we give the following conditions to characterize B.

Theorem 5.1. Let $\{F_i\}_{i=1}^{d^2}$ be a set of matrices in the space $\mathbb{C}^{d \times d}$ and let $\{\rho_m\}_{m=1}^{d^2}$ be a linearly independent basis of $\mathbb{C}^{d \times d}$. Define $B$ through (5.7) and (5.9). Then $\{F_i\}$ is a linearly independent basis of $\mathbb{C}^{d \times d}$ if and only if $B$ is invertible.

Theorem 5.2. Let $\{F_i\}_{i=1}^{d^2}$ be a set of matrices in $\mathbb{C}^{d \times d}$ and let $\{\rho_m\}_{m=1}^{d^2}$ be an orthonormal basis of $\mathbb{C}^{d \times d}$. Define $B$ through (5.7) and (5.9). Then $\{F_i\}$ forms an orthonormal basis of $\mathbb{C}^{d \times d}$ if and only if $B$ is unitary.

The detailed proofs of Theorem 5.1 and Theorem 5.2 are presented in Appendix J and Appendix K, respectively. Under the conditions in Theorem 5.2, $B$ is unitary, and we have

\[
\|B\,\mathrm{vec}(\hat{X}) - \mathrm{vec}(\hat{\Xi})\| = \|\mathrm{vec}(\hat{X}) - B^{-1}\mathrm{vec}(\hat{\Xi})\| = \|\hat{X} - \mathrm{vec}^{-1}(B^{-1}\mathrm{vec}(\hat{\Xi}))\|,
\]

which means Problem 5.1 is equivalent to Problem 5.2 in this case. The natural basis set satisfies the requirements in Theorem 5.1 and Theorem 5.2.
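As a concrete check of the statement about the natural basis, the sketch below builds $B$ for a single qubit from the definition (5.7), using the fact that for an orthonormal basis the coefficient $\beta_{mn}^{jk}$ equals $\mathrm{Tr}(\rho_n^\dagger F_j \rho_m F_k^\dagger)$. The row and column ordering follows the column-stacking pattern displayed in (5.9); this indexing is an assumption of the example and does not affect whether $B$ is unitary.

```python
import numpy as np

d = 2
K = d * d
# Natural basis |j><k| in the order i = (j-1)d + k, used for {F_i} and {rho_m}.
basis = [np.outer(np.eye(d)[j], np.eye(d)[k])
         for j in range(d) for k in range(d)]

B = np.zeros((K * K, K * K))
for m in range(K):
    for j in range(K):
        for k in range(K):
            # F_j rho_m F_k^dagger; the basis matrices are real, so dagger = transpose.
            out = basis[j] @ basis[m] @ basis[k].T
            for n in range(K):
                beta = np.sum(basis[n] * out)        # Tr(rho_n^dagger out)
                # Row (m, n) -> n*K + m, column (j, k) -> k*K + j.
                B[n * K + m, k * K + j] = beta

print(B.shape, np.allclose(B @ B.T, np.eye(K * K)))
```

For this choice every row and column of $B$ contains a single 1, so $B$ is a permutation matrix and in particular unitary.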

With the natural basis $\{|j\rangle\langle k|\}_{1 \le j,k \le d}$ for $\{F_i\}$ and $\{\rho_m\}$, we have

\[
\mathrm{Tr}_A(\mathrm{vec}(G)\mathrm{vec}(G)^\dagger) = I_d = GG^\dagger,
\]

which means the completeness constraint (5.2) is equivalent to the requirement that $G$ is unitary. Hence, we can transform Problem 5.2 into the following problem which is critical for QHI.

Problem 5.3. Assume that $\{\rho_m\}_{m=1}^{d^2}$ is an orthonormal basis of the space $\mathbb{C}^{d \times d}$, $\{F_i\}$ is chosen as $\{|j\rangle\langle k|\}_{1 \le j,k \le d}$, and the relationship between $i$, $j$ and $k$ is $i = (j-1)d + k$. Given the unitary matrix $B$ and experimental data $\hat{\Xi}$, find a unitary matrix $\hat{G}$ minimizing $\|\mathrm{vec}(\hat{G})\mathrm{vec}(\hat{G})^\dagger - \mathrm{vec}^{-1}(B^\dagger \mathrm{vec}(\hat{\Xi}))\|$.

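Remark 5.1. Note that we can experimentally measure only Hermitian physical variables. Hence, we cannot directly use $|j\rangle\langle k|$ ($j \ne k$) as probe states. According to [104], when $j \ne k$, one can take $|j\rangle\langle j|$, $|k\rangle\langle k|$, $|+\rangle\langle +|$ and $|-\rangle\langle -|$ as inputs, where $|+\rangle = (|j\rangle + |k\rangle)/\sqrt{2}$ and $|-\rangle = (|j\rangle + i|k\rangle)/\sqrt{2}$. Then $\mathcal{E}(|j\rangle\langle k|)$ can be obtained from

\[
\mathcal{E}(|j\rangle\langle k|) = \mathcal{E}(|+\rangle\langle +|) + i\,\mathcal{E}(|-\rangle\langle -|) - \frac{1+i}{2}\mathcal{E}(|j\rangle\langle j|) - \frac{1+i}{2}\mathcal{E}(|k\rangle\langle k|). \tag{5.12}
\]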
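Equation (5.12) relies only on the linearity of the process, which can be confirmed numerically. In the sketch below the process is a unitary channel built from a QR factorisation of an arbitrary fixed matrix; the dimension, the indices `j`, `k` and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def channel(rho, U):
    """A unitary process E(rho) = U rho U^dagger, used here as the test process."""
    return U @ rho @ U.conj().T

def proj(v):
    """Projector |v><v| onto a state vector v."""
    return np.outer(v, v.conj())

d, j, k = 3, 0, 2
e = np.eye(d, dtype=complex)
plus = (e[j] + e[k]) / np.sqrt(2)
minus = (e[j] + 1j * e[k]) / np.sqrt(2)

# An arbitrary unitary taken from the QR factorisation of a fixed matrix.
M = np.arange(1, d * d + 1).reshape(d, d) + 1j * np.eye(d)
U, _ = np.linalg.qr(M)

lhs = channel(np.outer(e[j], e[k]), U)          # E(|j><k|), not directly preparable
rhs = (channel(proj(plus), U) + 1j * channel(proj(minus), U)
       - (1 + 1j) / 2 * channel(proj(e[j]), U)
       - (1 + 1j) / 2 * channel(proj(e[k]), U))
print(np.allclose(lhs, rhs))
```

The four inputs on the right-hand side are physical states, so each of their outputs can be reconstructed by state tomography and then combined as in (5.12).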


5.2.3 Two-step Optimization algorithm

5.2.3.1 Solution to Problem 5.3

The direct solution to Problem 5.3 is diﬃcult [152] and we split it into two optimization sub-problems (i.e., two-step optimization):

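Problem 5.3.1. Let $\hat{D} = \mathrm{vec}^{-1}(B^\dagger \mathrm{vec}(\hat{\Xi}))$ be a given matrix. Find a $d \times d$ matrix $\hat{S}$ minimizing $\|\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - \hat{D}\|$.

Problem 5.3.2. Let $\hat{S}$ be given. Find a $d \times d$ unitary matrix $\hat{G}$ minimizing $\|\mathrm{vec}(\hat{G})\mathrm{vec}(\hat{G})^\dagger - \mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger\|$.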

First step optimization:

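For Problem 5.3.1, let

\[
\begin{aligned}
L_1 &= \|\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - \hat{D}\|^2 \\
&= \mathrm{Tr}\{[\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - \hat{D}][\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - \hat{D}^\dagger]\} \\
&= [\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S})]^2 - \mathrm{vec}(\hat{S})^\dagger (\hat{D} + \hat{D}^\dagger)\mathrm{vec}(\hat{S}) + \mathrm{Tr}(\hat{D}\hat{D}^\dagger).
\end{aligned}
\]

Then by partial differentiation we obtain the conjugate gradient matrix

\[
\frac{\partial L_1}{\partial \mathrm{vec}(\hat{S})^*} = 2\,\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S})\,\mathrm{vec}(\hat{S}) - (\hat{D}^\dagger + \hat{D})\mathrm{vec}(\hat{S}),
\]

which leads to

\[
(\hat{D}^\dagger + \hat{D})\mathrm{vec}(\hat{S}) = 2\,\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S})\,\mathrm{vec}(\hat{S}).
\]

Therefore the optimal $\mathrm{vec}(\hat{S})$ must be an eigenvector of $(\hat{D}^\dagger + \hat{D})$ corresponding to the positive eigenvalue $2\,\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S})$. Then

\[
\begin{aligned}
L_1 &= [\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S})]^2 - \mathrm{vec}(\hat{S})^\dagger (\hat{D} + \hat{D}^\dagger)\mathrm{vec}(\hat{S}) + \mathrm{Tr}(\hat{D}\hat{D}^\dagger) \\
&= \mathrm{Tr}(\hat{D}\hat{D}^\dagger) - [2\,\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S})]^2/4.
\end{aligned}
\]

Since $\hat{D}^\dagger + \hat{D}$ is Hermitian, we have the spectral decomposition

\[
\hat{D}^\dagger + \hat{D} = \sum_{i=1}^{d^2} \hat{\alpha}_i \,\mathrm{vec}(\hat{P}_i)\mathrm{vec}(\hat{P}_i)^\dagger, \tag{5.13}
\]

where $\hat{P}_i \in \mathbb{C}^{d \times d}$ and $\hat{\alpha}_1 \ge ... \ge \hat{\alpha}_{d^2}$. To minimize $L_1$, we should choose

\[
2\,\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S}) = \hat{\alpha}_1
\]

and

\[
\hat{S} = \sqrt{\frac{\hat{\alpha}_1}{2}}\, \hat{P}_1.
\]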
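The first step amounts to taking the dominant eigenpair of the Hermitian matrix $\hat{D}^\dagger + \hat{D}$. A minimal numerical sketch follows; the function names, the test matrix `S0` and the perturbation are hypothetical, and vec(·) is assumed to stack columns.

```python
import numpy as np

def first_step(D):
    """Return S minimising ||vec(S) vec(S)^dagger - D|| (Frobenius norm).

    Assumes the largest eigenvalue of D + D^dagger is positive.
    """
    d = int(round(np.sqrt(D.shape[0])))
    evals, evecs = np.linalg.eigh(D + D.conj().T)   # eigenvalues in ascending order
    alpha_1 = evals[-1]                             # largest eigenvalue
    p_1 = evecs[:, -1]                              # unit-norm eigenvector vec(P_1)
    s = np.sqrt(alpha_1 / 2) * p_1
    return s.reshape(d, d, order="F")               # undo column-stacking vec

def objective_L1(S, D):
    """L_1 = ||vec(S) vec(S)^dagger - D||^2."""
    s = S.reshape(-1, order="F")
    return np.linalg.norm(np.outer(s, s.conj()) - D) ** 2

# Synthetic check: a rank-one matrix built from a known S0 plus a small perturbation.
S0 = np.array([[1, 2j], [0.5, -1]], dtype=complex)
s0 = S0.reshape(-1, order="F")
P0 = np.outer(s0, s0.conj())
D_hat = P0 + 1e-3 * np.arange(16).reshape(4, 4)
S_hat = first_step(D_hat)
print(objective_L1(S_hat, D_hat), objective_L1(S0, D_hat))
```

The returned matrix is defined only up to a global phase, which leaves $\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger$ unchanged.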

Second step optimization:

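For Problem 5.3.2, note that

\[
\begin{aligned}
& \|\mathrm{vec}(\hat{G})\mathrm{vec}(\hat{G})^\dagger - \mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger\|^2 \\
&= \mathrm{Tr}\{[\mathrm{vec}(\hat{G})\mathrm{vec}(\hat{G})^\dagger - \mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger]^2\} \\
&= [\mathrm{vec}(\hat{G})^\dagger \mathrm{vec}(\hat{G})]^2 + [\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{S})]^2 - 2\,\mathrm{vec}(\hat{G})^\dagger \mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger \mathrm{vec}(\hat{G}) \\
&= d^2 + [\mathrm{Tr}(\hat{S}^\dagger \hat{S})]^2 - 2|\mathrm{Tr}(\hat{G}^\dagger \hat{S})|^2.
\end{aligned}
\]

Hence, Problem 5.3.2 is equivalent to maximizing $L_2 = |\mathrm{Tr}(\hat{G}^\dagger \hat{S})|^2$ among all unitary $\hat{G}$. We make a polar decomposition [14] of $\hat{S}$ to obtain $\hat{S} = \hat{L}\hat{Q}$, where $\hat{L} = \hat{S}(\hat{S}^\dagger \hat{S})^{-\frac{1}{2}}$ is unitary and $\hat{Q} = (\hat{S}^\dagger \hat{S})^{\frac{1}{2}}$ is positive semidefinite. We make a spectral decomposition of $\hat{Q}$ to obtain $\hat{Q} = \hat{Z}\hat{R}\hat{Z}^\dagger$, where $\hat{Z}$ is unitary and $\hat{R} = \mathrm{diag}(\hat{R}_{11}, \hat{R}_{22}, ..., \hat{R}_{dd})$ with $\hat{R}_{jj} \ge 0$. Without loss of generality, we assume $\hat{R}_{jj} > 0$ for all $1 \le j \le d$. Let $\hat{M} = \hat{Z}^\dagger \hat{G}^\dagger \hat{L}\hat{Z}$, and assume that $\hat{M}_{jj} = \hat{r}_j e^{i\hat{\psi}_j}$ with $\hat{r}_j \ge 0$ and $0 \le \hat{\psi}_j < 2\pi$. Because $\hat{M}$ is unitary, we must have $\hat{r}_j \le 1$. Hence, we have

\[
L_2 = |\mathrm{Tr}(\hat{G}^\dagger \hat{L}\hat{Z}\hat{R}\hat{Z}^\dagger)|^2 = |\mathrm{Tr}(\hat{M}\hat{R})|^2 = \Big|\sum_j \hat{r}_j \hat{R}_{jj} e^{i\hat{\psi}_j}\Big|^2.
\]

Then we let $\partial L_2 / \partial \hat{\psi}_j = 0$ for all $j$ and we obtain

\[
\frac{\sum_j \hat{r}_j \hat{R}_{jj} \sin\hat{\psi}_j}{\sum_j \hat{r}_j \hat{R}_{jj} \cos\hat{\psi}_j} = \tan\hat{\psi}_1 = \tan\hat{\psi}_2 = ... = \tan\hat{\psi}_d.
\]

Note that $L_2(\hat{M}) = L_2(e^{i\hat{\psi}_0}\hat{M})$ for any $\hat{\psi}_0 \in \mathbb{R}$. Hence, we can choose $\hat{\psi}_1 = 0$, which means $\hat{\psi}_j = 0$ or $\pi$ for $2 \le j \le d$. To maximize $L_2$, we should let all $\hat{\psi}_j$ equal 0. Therefore, $L_2 = (\sum_j \hat{r}_j \hat{R}_{jj})^2$, which indicates $\hat{r}_j = 1$ for all $j$. If all the diagonal elements of a unitary matrix are equal to one, then it must be the identity matrix.

Hence, for the optimal value we have $\hat{M} = I$. Considering an extra global phase, we finally have the optimal solution

\[
\hat{G} = e^{i\hat{\psi}}\hat{L} = e^{i\hat{\psi}}\hat{S}(\hat{S}^\dagger \hat{S})^{-\frac{1}{2}}, \tag{5.14}
\]

where $\hat{\psi} \in \mathbb{R}$. Combining the results of Problem 5.3.1 and Problem 5.3.2, we obtain the final solution.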
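The unitary polar factor in (5.14) can be computed from a singular value decomposition, which is a common numerical route and not necessarily the one used in the thesis. In the sketch below the matrix `S_hat` and the family of comparison unitaries are hypothetical, chosen only to illustrate that the polar factor maximises $L_2$.

```python
import numpy as np

def nearest_unitary(S):
    """Unitary polar factor L = S (S^dagger S)^{-1/2}, computed from the SVD.

    If S = W diag(sigma) V^dagger with all sigma > 0, then L = W V^dagger.
    """
    W, sigma, Vh = np.linalg.svd(S)
    if sigma.min() <= 0:
        raise ValueError("S must be nonsingular")
    return W @ Vh

def objective_L2(G, S):
    """Objective |Tr(G^dagger S)|^2 maximised in the second step."""
    return abs(np.trace(G.conj().T @ S)) ** 2

def rotation(theta):
    """Real 2x2 rotation, used only to generate other unitaries for comparison."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]], dtype=complex)

S_hat = np.array([[1.0, 0.2j], [0.1, 0.9 - 0.3j]])
G_hat = nearest_unitary(S_hat)
others = [G_hat @ rotation(t) for t in np.linspace(0.1, 3.0, 30)]
print(objective_L2(G_hat, S_hat), max(objective_L2(G, S_hat) for G in others))
```

At the optimum the objective equals the squared sum of the singular values of $\hat{S}$, and multiplying $\hat{G}$ by a global phase leaves it unchanged.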

Remark 5.2. When using the notation $(\hat{S}^\dagger \hat{S})^{-\frac{1}{2}}$, we have assumed that $\hat{S}$ is nonsingular. This is true when the resource number is sufficiently large. On one hand, the true value $S$ equals $G$, which is unitary and naturally nonsingular. On the other hand, when the resource number is large enough, $\hat{S}$ will be close to $S$. Thus, in the asymptotic sense we can assume that $\hat{S}$ is nonsingular.

After we solve Problem 5.3, we should calculate the Kraus operator $\hat{A}$ (which is also the unitary propagator $\hat{U}(t)$) from $\hat{G}$, and finally we calculate $\hat{H}$ from $\hat{A}$. Note that $\hat{U}(t)$ must be a unitary matrix. Two questions then arise: how to calculate $\hat{A}$ from $\hat{G}$, and whether the matrix $\hat{A}$ calculated from $\hat{G}$ is always unitary. We answer these questions as follows.

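Proposition 5.2. Under the assumptions of Problem 5.3, suppose we have obtained a solution $\hat{X} = \mathrm{vec}(\hat{G})\mathrm{vec}(\hat{G})^\dagger$. Then there is essentially only one Kraus operator $\hat{A}$ calculated from $\hat{G}$. $\hat{A}$ must be unitary and in fact $\hat{A}$ is equal to $e^{i\phi}\hat{G}^T$, where $\phi \in \mathbb{R}$.

Proof. Denote $\mathrm{vec}(\hat{G})_j$ as the $j$-th element of $\mathrm{vec}(\hat{G})$. Since $\hat{X} = \mathrm{vec}(\hat{G})\mathrm{vec}(\hat{G})^\dagger$, we have

\[
\hat{x}_{jk} = \mathrm{vec}(\hat{G})_j \,\mathrm{vec}(\hat{G})_k^*.
\]

Comparing with $x_{jk} = \sum_i c_{ij}c_{ik}^*$, a single Kraus operator with coefficients $\hat{c}_j = e^{i\phi}\mathrm{vec}(\hat{G})_j$ suffices, and under the natural basis with the index ordering of Problem 5.3 it reads $\hat{A} = \sum_j \hat{c}_j F_j = e^{i\phi}\hat{G}^T$. Therefore, there is essentially only one Kraus operator, which is $e^{i\phi}\hat{G}^T$ with $\phi \in \mathbb{R}$ undetermined, and $\hat{A} = e^{i\phi}\hat{G}^T$ is unitary.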

Remark 5.3. If $\hat{S}$ and $\hat{G}$ are the solutions to Problem 5.3.1 and Problem 5.3.2, respectively, then for any $\phi_1, \phi_2 \in \mathbb{R}$, $e^{i\phi_1}\hat{S}$ and $e^{i\phi_2}\hat{G}$ are also optimal solutions, respectively. Hence, there is in fact an undetermined global phase in $\hat{G}$, which can also be seen from Proposition 5.2. This stems from the global phase in the Hamiltonian, which is physically unobservable. Through proper prior knowledge, this global phase can be eliminated. For example, in [165] the prior knowledge of $\mathrm{Tr}H = 0$ is assumed. In our simulations of Section 5.2.5, we use the assumption that the smallest eigenvalue of $H$ is set to a determined value.

After obtaining $\hat{A}$, we need to solve $\hat{A} = \exp(-i\hat{H}t)$ to obtain $\hat{H}$. Traditionally this task is performed via the matrix logarithm function [165] or the Taylor expansion method [71], which are either time-consuming or inaccurate. We hereby propose a new solution that is both efficient and accurate.

Note that in real physical systems we always require $\hat{H}$ to be Hermitian. Another question which naturally arises is whether every solution $\hat{H}$ of the equation $\hat{A} = \exp(-i\hat{H}t)$ is Hermitian. We introduce Theorem 1.43 from [69] as well as its proof, since the proof provides a method to obtain $\hat{H}$.

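Lemma 5.1 ([69]). $A \in \mathbb{C}^{n \times n}$ is unitary if and only if $A = \exp(iH)$ for some Hermitian $H$.

Proof. The Schur decomposition of $A$ has the form $A = QDQ^\dagger$ with $Q$ unitary and

\[
D = \mathrm{diag}(e^{i\theta_1}, e^{i\theta_2}, ..., e^{i\theta_n}) = \exp(i\Theta),
\]

where $\Theta = \mathrm{diag}(\theta_1, ..., \theta_n) \in \mathbb{R}^{n \times n}$. Hence,

\[
A = Q\exp(i\Theta)Q^\dagger = \exp(iQ\Theta Q^\dagger) = \exp(iH),
\]

where $H = H^\dagger$.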

Lemma 5.1 satisfies our needs perfectly. Instead of using the general matrix logarithm function, we can just use the Schur decomposition to obtain the logarithm of the unitary matrix $\hat{A}$. Furthermore, from the proof of Lemma 5.1 we notice that all $\theta_j$ should lie in a region no larger than $\pi$, otherwise they cannot be uniquely determined. This indicates that the sampling period should be small enough. This can also be viewed as a result of the Nyquist sampling theorem, as stated in [165]. Hence, in this chapter we employ the following assumption.

Assumption 5.1. The evolution time $t$ satisfies

\[
0 < t < \frac{\pi}{\lambda_d(H) - \lambda_1(H)},
\]

where $\lambda_d(H)$ and $\lambda_1(H)$ are the largest and smallest eigenvalues of the Hamiltonian $H$, respectively.

In Appendix L we give an example of a sufficient condition for Assumption 5.1, which might be more convenient to determine $t$ in practice. Now with Assumption 5.1 satisfied and $\lambda_1(H)$ set, we design an algorithm to recover the Hamiltonian from a unitary $\hat{G}$ as follows.

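Algorithm 5.1. (i) Perform a Schur decomposition of $\hat{G}^T$ to get $\hat{G}^T = \hat{Q}_G \hat{J}_G \hat{Q}_G^\dagger$ with $\hat{Q}_G$ unitary, and $\hat{J}_G = \exp(i\hat{\Theta}_G)$, where $\hat{\Theta}_G = \mathrm{diag}(\hat{g}_1, ..., \hat{g}_d)$, $0 \le \hat{g}_1 \le \hat{g}_2 \le ... \le \hat{g}_d < 2\pi$.

(ii) If $\hat{g}_d - \hat{g}_1 < \pi$, go to step (iii); otherwise, find the smallest $k$ so that $\hat{g}_k - \hat{g}_1 \ge \pi$. Then for $j = k, k+1, ..., d$, replace $\hat{g}_j$ with $\hat{g}_j - 2\pi$. This step aims to ensure the reconstructed Hamiltonian has spectral region no larger than $\lambda_d(H) - \lambda_1(H)$.

(iii) Let $\hat{g}_0 = \max_j \hat{g}_j$. For all $1 \le j \le d$, take $\bar{g}_j = \hat{g}_j - \lambda_1(H)t - \hat{g}_0$. If we denote $\bar{\Theta}_G = \mathrm{diag}(\bar{g}_1, \bar{g}_2, ..., \bar{g}_d)$, then $\hat{H} = -\hat{Q}_G \bar{\Theta}_G \hat{Q}_G^\dagger / t$ is the final estimated Hamiltonian.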
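The three steps of Algorithm 5.1 can be sketched as follows. This is an illustrative sketch, not the thesis's code: it replaces the Schur decomposition by NumPy's general eigendecomposition, which yields an orthonormal eigenbasis for a unitary matrix only when its eigenvalues are non-degenerate (assumed here), and the test Hamiltonian, the global phase and the function name `hamiltonian_from_gate` are hypothetical.

```python
import numpy as np

def hamiltonian_from_gate(G, t, lambda_min):
    """Sketch of Algorithm 5.1: recover H from a unitary G, where A = G^T.

    Assumes non-degenerate eigenvalues, so that the eigenvectors returned by
    numpy.linalg.eig for the normal matrix G^T are mutually orthogonal.
    """
    evals, Q = np.linalg.eig(G.T)
    g = np.mod(np.angle(evals), 2 * np.pi)              # step (i): phases in [0, 2*pi)
    order = np.argsort(g)
    g, Q = g[order], Q[:, order]
    g = np.where(g - g[0] >= np.pi, g - 2 * np.pi, g)   # step (ii): unwrap large phases
    g_bar = g - lambda_min * t - g.max()                # step (iii): fix the global phase
    return -(Q * g_bar) @ Q.conj().T / t                # H = -Q diag(g_bar) Q^dagger / t

# Synthetic test Hamiltonian (illustrative values).
H = np.array([[0.3, 0.5 - 0.2j, 0.0],
              [0.5 + 0.2j, 1.1, 0.4j],
              [0.0, -0.4j, 2.0]])
h, V = np.linalg.eigh(H)
t = 0.9 * np.pi / (h[-1] - h[0])                        # satisfies Assumption 5.1
U = (V * np.exp(-1j * h * t)) @ V.conj().T              # U = exp(-iHt)
G = np.exp(-0.8j) * U.T                                 # includes an unknown global phase
H_hat = hamiltonian_from_gate(G, t, lambda_min=h[0])
print(np.max(np.abs(H_hat - H)))
```

In this noiseless example the known smallest eigenvalue removes the global-phase ambiguity, so the Hamiltonian is recovered up to numerical precision.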

5.2.3.2 General procedure and computational complexity

In Fig. 5.1, we summarize the general procedure of the QHI framework, where this chapter focuses on Box 2. All steps in Box 2 are data processing steps performed on a computer. Step 1 is quantum state tomography, which includes the acquisition of experimental data and post-processing of the experimental data. In this chapter we do not consider the time spent on experiments, since it depends on the specific experimental realization. In the following, we brieﬂy summarize each step and illustrate their corresponding computational complexity.


Figure 5.1: General procedure of the TSO method.

Step 1. Choose basis sets $\{F_i\}$ and $\{\rho_m\}$ and calculate $B$. Then use quantum state tomography to reconstruct experimental output states of the system. The number of resource copies $N_O$ for each output state determines the estimation error, but does not affect the computational complexity of the estimation algorithm. Generally the calculation of $B$ according to (5.7) has $O(d^{11})$ computational complexity. However, under the natural basis, this complexity can be reduced to only $O(d^4)$. For state reconstruction, we employ the method of QST using LRE for our numerical simulations. The computational complexity of LRE state tomography is $O(d^6)$ offline and $O(d^4)$ online [113]. Considering there are $d^2$ output states to be reconstructed, the total computational complexity of our LRE method for QHI is $O(d^6)$.

Step 2. Use (5.6) to determine $\hat{\Xi}$. Generally the computational complexity to solve (5.6) is $O(d^{12})$. But it is only $O(d^2)$ using the orthogonal property under the natural basis.

Step 3. Calculate $\hat{D} = \mathrm{vec}^{-1}(B^\dagger \mathrm{vec}(\hat{\Xi}))$. Generally the complexity is $O(d^8)$. But under the natural basis, we already know the specific structure and the value of $B$ (see (5.16)). Thus, the complexity now is only $O(d^4)$.

Step 4. Calculate $\hat{S}$ according to the spectral decomposition of $\hat{D} + \hat{D}^\dagger$. The computational complexity is determined by spectral decomposition, which is $O(d^6)$ (the computational complexity of spectral decomposition is cubic in a Hermitian matrix's dimension; see [60]).

Step 5. Use matrix polar decomposition to obtain $\hat{G} = \hat{S}(\hat{S}^\dagger \hat{S})^{-\frac{1}{2}}$. The computational complexity is $O(d^3)$ [60].

Step 6. Use the Schur decomposition to obtain the final estimated Hamiltonian $\hat{H}$ from $\hat{G}$. The computational complexity of Schur decomposition is $O(d^3)$ [60, 103].
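Steps 4 and 5 above can be illustrated with a short NumPy sketch. It is a minimal example under stated assumptions rather than the thesis code: $\mathrm{vec}(\cdot)$ is taken as column stacking, $\hat{S}$ is built from the leading eigenpair of $\hat{D} + \hat{D}^\dagger$ scaled so that $\|\hat{S}\|^2 = \hat{\alpha}_1/2$ (as in the error analysis of Step 4), and the polar factor is computed through an SVD. The function name `unitary_from_D` is hypothetical.

```python
import numpy as np

def unitary_from_D(D_hat):
    """Steps 4-5 sketch: leading eigenvector of D + D^dagger, then polar factor."""
    d = int(round(np.sqrt(D_hat.shape[0])))
    # Step 4: eigh returns eigenvalues in ascending order, so take the last pair.
    alphas, vecs = np.linalg.eigh(D_hat + D_hat.conj().T)
    alpha1, v1 = alphas[-1], vecs[:, -1]
    S_hat = np.sqrt(alpha1 / 2) * v1.reshape((d, d), order="F")   # vec^{-1}
    # Step 5: S = W diag(s) Vh  implies  S (S^dagger S)^{-1/2} = W Vh.
    W, _, Vh = np.linalg.svd(S_hat)
    return W @ Vh

# Noiseless check: D = vec(G) vec(G)^dagger for a random unitary G.
rng = np.random.default_rng(0)
d = 3
G, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
v = G.reshape(-1, order="F")                      # vec(G), column stacking
G_hat = unitary_from_D(np.outer(v, v.conj()))
overlap = abs(np.trace(G.conj().T @ G_hat)) / d   # equals 1 iff G_hat = e^{i phi} G
```

The overlap is insensitive to the global phase, which matches the remaining degree of freedom $\phi$ that the text removes using prior knowledge.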

Our Hamiltonian identification procedure has the following advantages. Firstly, the framework is general, since we formulate it within the QPT framework. We do not impose any restriction (such as sparseness) on the Hamiltonian. Secondly, Step 1 has the potential for parallel processing. One can deal with data on hand to reconstruct existing output states while at the same time inputting new probe states to the process and making measurements on them. Thirdly, the computational complexity can be analyzed. Regardless of the time spent in experiments, all steps in our QHI framework have clear computational complexity (at most $O(d^6)$). Finally, it is possible to analytically investigate the error upper bound and a detailed error analysis is presented in Section 5.2.4.

5.2.3.3 Practical consideration of storage requirements

One issue in the calculations is that the dimension of $B$ may increase rapidly. When there are 4 qubits, $B$ has $2^{32}$ elements. If it takes one byte to store one element of $B$, then we need 4 GB of storage space, which is already a very heavy task for a common PC. We notice that $B$ generated from the natural basis is a permutation matrix, which can be vital to improving the computation efficiency. A permutation matrix is a (square in this thesis) matrix such that all elements are 0 except exactly one 1 in each column and each row.

Notice that after $B$ is determined from equations (5.7) and (5.9), its practical usage is in Problem 5.3, where we need to multiply a vector by $B^\dagger$. This multiplication can be done in an alternative way that avoids storing $B$ in full. To be specific, we aim to make $B$ sparse. We then only need to store the positions of its few nonzero elements and can ignore the large number of zero elements, while still being able to perform the multiplication. This idea is realized by the following theorem:

Theorem 5.3. Let $\{F_i\}_{i=1}^{d^2}$ be a set of matrices in $\mathbb{C}^{d\times d}$. Choose $\{\rho_m\}_{m=1}^{d^2} = \{|j\rangle\langle k|\}_{1\le j,k\le d}$. Define $B$ through (5.7) and (5.9). Then $\{F_i\} = e^{i\theta}\{\rho_m\}$ if and only if $B$ is a permutation matrix. Here, $\theta \in \mathbb{R}$ is any fixed global phase.

Proof. Using Theorem 5.2 we know that equation (J.3) holds.

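Sufficiency:

Define $W(j,k)$ as a $d^2 \times d^2$ matrix where $W(j,k)$'s element in position $(a,b)$ is the number $\beta^{jk}_{ba}$, and denote $(x,y) = (x-1)K + y$ for $1 \le x, y \le K$. Denote $\rho_k = |g\rangle\langle h|$ and $\rho_j = |m\rangle\langle n|$. From the choice of $\{\rho_m\}$ we know $\{\mathrm{vec}(\rho_m)\}$ is a set of linearly independent column vectors forming a basis of the space $\mathbb{C}^{d^2}$. We multiply $\mathrm{vec}(\rho_{(p,q)})^\dagger$ from the left of (J.1) and use equation (A.1) to obtain

$$
\begin{aligned}
W(j,k)_{(p,q)(s,t)} &= \beta^{jk}_{(s,t)(p,q)} \\
&= \mathrm{vec}(\rho_{(p,q)})^\dagger \left(e^{-i\theta}|g\rangle^*\langle h|^* \otimes e^{i\theta}|m\rangle\langle n|\right)\mathrm{vec}(\rho_{(s,t)}) \\
&= (\langle q|^* \otimes \langle p|)(|g\rangle^* \otimes |m\rangle)(\langle h|^* \otimes \langle n|)(|t\rangle^* \otimes |s\rangle) \\
&= (\langle q|g\rangle^* \otimes \langle p|m\rangle)(\langle h|t\rangle^* \otimes \langle n|s\rangle) \\
&= \delta_{qg}\delta_{pm}\delta_{th}\delta_{sn}.
\end{aligned}
\tag{5.15}
$$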

Hence, each matrix $W(j,k)$ has exactly one 1 and all other elements are 0. From equation (J.4) we know each row of $B$ has exactly one 1 and all other elements are 0. When indices $j$ and $k$ run from 1 to $d^2$, the index combination $(g,h,m,n)$ never repeats; therefore, $W(j_1,k_1)$ and $W(j_2,k_2)$ have different positions of 1 as long as the index pair $(j_1,k_1) \ne (j_2,k_2)$. This means each column of $B$ has no more than one 1. Since $B$ is square, we know that each column of $B$ has exactly one 1. Hence, $B$ is a permutation matrix.

Necessity:

When $B$ is a permutation matrix, from equation (J.4) we know that each matrix $W(j,k)$ has exactly one 1 and all other elements are 0. According to $V$'s permutation property and equation (K.1) we know this property for each $W(j,k)$ also holds for each matrix $F_k^* \otimes F_j$. This means that each matrix $F_j$ has exactly one nonzero element, denoted as $y_j$. Then we have that $y_k^* y_j = 1$ holds for every $k, j = 1, 2, ..., d^2$. Let $j = k$, and we find $y_j = e^{i\theta_j}$. Then we know $\theta_1 = \theta_2 = ... = \theta_{d^2} = \theta$, where $\theta$ is any fixed real number. Since $B$ is invertible, from Theorem 5.1 we know $\{F_j\}$ is a linearly independent set. Thus each pair of matrices in $\{F_j\}$ have different positions of $e^{i\theta}$. Hence, we can write $\{F_j\} = e^{i\theta}\{\rho_m\}$.

From the proof of Theorem 5.3, one can also deduce an equation to directly calculate $B$. Using (5.15) to consider $W(j,k)_{(p,q)(s,t)}$, we obtain

$$
\beta^{jk}_{(s,t)(p,q)} = \delta_{qg}\delta_{pm}\delta_{th}\delta_{sn} = \beta^{(m,n)(g,h)}_{(s,t)(p,q)}. \tag{5.16}
$$

Therefore, one can easily write down $B$ when the size $d$ is given.

A special case of the sufficiency of Theorem 5.3, i.e., when $\{E_i\}$ and $\{\rho_m\}$ are the same natural basis sets with the same order of elements, also appeared in [159]. Our theorem and proof here are more general. Using this theorem, we only need to store the positions of all 1's in $B$, which requires only $d^4$ storage space. This is a great reduction compared with $d^8$, and the only cost is some additional coding to implement multiplication by $B$. Furthermore, the computational complexity of writing down $B$ is also reduced to only $O(d^4)$.
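A minimal sketch of how the multiplication by $B^\dagger$ can be carried out without ever forming $B$ is given below, in Python with NumPy. It assumes that (5.16) is used index-wise as $\hat{D}_{(p,s)(q,t)} = \hat{\Xi}_{(s,t)(p,q)}$ with $(x,y) = (x-1)d + y$, so that the permutation reduces to an axis transposition; the function name `rearrange_xi_to_d` and the random test data are hypothetical.

```python
import numpy as np

def rearrange_xi_to_d(Xi, d):
    """Apply the permutation B^dagger to Xi without storing B.

    Assumes D[(p,s),(q,t)] = Xi[(s,t),(p,q)] with (x,y) = (x-1)d + y.
    """
    X4 = Xi.reshape(d, d, d, d)                             # axes: s, t, p, q
    return X4.transpose(2, 0, 3, 1).reshape(d * d, d * d)   # axes: p, s, q, t

d = 3
rng = np.random.default_rng(1)
Xi = rng.normal(size=(d * d, d * d)) + 1j * rng.normal(size=(d * d, d * d))
D = rearrange_xi_to_d(Xi, d)

# Storage comparison for 4 qubits (d = 16).
dense_entries = 16 ** 8     # full B: 2**32 entries
sparse_entries = 16 ** 4    # positions of the 1's only
```

Because the operation is a pure reordering, the Frobenius norm of `D` equals that of `Xi`, which is the property used later in (5.21).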


5.2.4 Error analysis

The error in the Hamiltonian identification method under consideration has only three possible sources. The first occurs in state estimation, where the measurement frequency in practical simulations or experiments is used to approximate the measurement probability. The second is that the state reconstruction algorithm might produce errors. The third is that our TSO QHI algorithm may also produce errors. In this section, we give an upper bound for the ultimate identification error. We first fix the given evolution time $t$ and analyze the error of our QHI method. Then we use a similar method to analyze the relationship between the error and the time $t$.

Let $N_O$ be the number of resources in state tomography for each output state. For simplicity, we assume that $N_O$ is a constant for different output states in this chapter. If $N_O$ instead changes for different states, TSO still applies, but with the error characterization modified.

5.2.4.1 Upper error bound for ﬁxed evolution time

Theorem 5.4. If $\{F_i\}$ and $\{\rho_m\}$ are chosen as natural basis of $\mathbb{C}^{d\times d}$ and the evolution time $t$ is fixed and satisfies Assumption 5.1, then the estimation error of the TSO QHI method $\mathbb{E}\|\hat{H} - H\|$ scales as $O(\frac{d^3}{\sqrt{N_O}})$, where $\mathbb{E}(\cdot)$ denotes expectation with respect to all possible measurement results.

Proof. The proof of this theorem is divided into the following seven parts.

Error in Step 1

The quantum state tomography algorithm used in this paper is from [113], and the upper bound on the state estimation error is given by

$$
\sup_{\rho} \mathbb{E}\{\mathrm{Tr}[(\hat{\rho} - \rho)^2]\} = \frac{M}{4N}\mathrm{Tr}(X^T X)^{-1}, \tag{5.17}
$$

where $\rho$ is the true state and $\hat{\rho}$ its estimator, $M$ is the number of measurement bases, $N$ is the number of experiments (i.e., number of copies of $\rho$) in state tomography, and $X$ is a matrix determined by the measurement basis set (for details, see [113]). Henceforth, we denote this error upper bound (i.e., the RHS of (5.17)) as $\mathrm{Est}$. Following the deduction in the Methods section of [113], one can prove $\mathrm{Est} \sim O(\frac{d^4}{N_O})$.

We always denote $\Delta A \triangleq \hat{A} - A$ as the difference between variable/matrix $A$ and its estimation. When $\rho_m$ is Hermitian,

$$
\mathbb{E}\|\Delta\mathcal{E}(\rho_m)\|^2 \le \mathrm{Est}. \tag{5.18}
$$

When $\rho_m$ is not Hermitian, its process output is in fact calculated according to equation (5.12) rather than directly probed. Hence, we must analyze this situation specifically. Under the choice of $\{\rho_m\}$ as the natural basis, for $j \ne k$,

$$
\begin{aligned}
\mathbb{E}\|\Delta\mathcal{E}(|j\rangle\langle k|)\|^2
&= \mathbb{E}\left\|[\Delta\mathcal{E}(|+\rangle\langle +|)] + i[\Delta\mathcal{E}(|-\rangle\langle -|)] - \tfrac{1+i}{2}[\Delta\mathcal{E}(|j\rangle\langle j|)] - \tfrac{1+i}{2}[\Delta\mathcal{E}(|k\rangle\langle k|)]\right\|^2 \\
&\le \left(1 + |i| + \left|\tfrac{1+i}{2}\right| + \left|\tfrac{1+i}{2}\right|\right)^2 \mathrm{Est} \\
&= (6 + 4\sqrt{2})\,\mathrm{Est}.
\end{aligned}
\tag{5.19}
$$

Error in Step 2

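Now we calculate the error in the experimental data:

$$
\begin{aligned}
\mathbb{E}\|\Delta\Xi\|^2
&= \mathbb{E}\sum_m \sum_{n,k} (\Delta\xi_{mn})(\Delta\xi_{mk})^* \delta_{nk} \\
&= \mathbb{E}\sum_m \sum_{n,k} (\Delta\xi_{mn})(\Delta\xi_{mk})^* \mathrm{Tr}(\rho_n \rho_k^\dagger) \\
&= \mathbb{E}\sum_m \mathrm{Tr}\Big[\sum_n (\Delta\xi_{mn})\rho_n \sum_k (\Delta\xi_{mk})^* \rho_k^\dagger\Big] \\
&= \mathbb{E}\sum_m \mathrm{Tr}[(\Delta\mathcal{E}(\rho_m))^2] \\
&= \mathbb{E}\Big[\sum_{j=1}^{d}\sum_{k=1,k\ne j}^{d} \|\Delta\mathcal{E}(|j\rangle\langle k|)\|^2 + \sum_{l=1}^{d} \|\Delta\mathcal{E}(|l\rangle\langle l|)\|^2\Big] \\
&\le (6 + 4\sqrt{2})\,d(d-1)\,\mathrm{Est} + d\,\mathrm{Est}.
\end{aligned}
\tag{5.20}
$$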

Error in Step 3

From Theorem 5.3, we know $B^\dagger$ is a permutation matrix. Hence, its effect on $\mathrm{vec}(\Xi)$ is merely a series of swaps of two elements of $\mathrm{vec}(\Xi)$, and thus $D = \mathrm{vec}^{-1}(B^\dagger \mathrm{vec}(\Xi))$ is just a reordering of $\Xi$'s elements. For the same reason, $\Delta D$ is just a reordering of $\Delta\Xi$. Therefore

$$
\|\Delta D\| = \Big(\sum_{j,k} |\Delta D_{jk}|^2\Big)^{\frac{1}{2}} = \|\Delta\Xi\|. \tag{5.21}
$$

Error in Step 4

We present a lemma to be used in this part.

Lemma 5.2. Let $b$ and $c$ be two complex vectors with the same finite dimension and assume that they are not both zero simultaneously. Then we have

$$
\frac{\|bb^\dagger - cc^\dagger\|}{\|b\| + \|c\|} \le \min_{\theta\in\mathbb{R}} \|e^{i\theta}b - c\| \le \frac{\sqrt{2}\,\|bb^\dagger - cc^\dagger\|}{\sqrt{\|b\|^2 + \|c\|^2}}.
$$

The detailed proof of Lemma 5.2 can be found in Appendix M.
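As a sanity check rather than a proof, Lemma 5.2 can be tested numerically. The sketch below (Python with NumPy) uses the closed form $\min_\theta \|e^{i\theta}b - c\|^2 = \|b\|^2 + \|c\|^2 - 2|b^\dagger c|$, which follows from expanding the squared norm, and takes $\|\cdot\|$ as the Frobenius norm. The function name `lemma_5_2_terms` and the random test vectors are hypothetical.

```python
import numpy as np

def lemma_5_2_terms(b, c):
    """Return (lower bound, min_theta ||e^{i theta} b - c||, upper bound)."""
    gap = np.linalg.norm(np.outer(b, b.conj()) - np.outer(c, c.conj()))  # Frobenius
    nb, nc = np.linalg.norm(b), np.linalg.norm(c)
    # Closed form of the minimum over theta (np.vdot conjugates its first argument).
    dist = np.sqrt(max(nb**2 + nc**2 - 2 * abs(np.vdot(b, c)), 0.0))
    lower = gap / (nb + nc)
    upper = np.sqrt(2) * gap / np.sqrt(nb**2 + nc**2)
    return lower, dist, upper

rng = np.random.default_rng(3)
results = []
for _ in range(1000):
    b = rng.normal(size=5) + 1j * rng.normal(size=5)
    c = rng.normal(size=5) + 1j * rng.normal(size=5)
    results.append(lemma_5_2_terms(b, c))
```

Each triple in `results` should be ordered as lower bound, minimum distance, upper bound, up to rounding.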

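We first estimate $\|\hat{S}\|$. We have

$$
\|\hat{S}\|^2 = \mathrm{Tr}(\hat{S}^\dagger \hat{S}) = \frac{\hat{\alpha}_1}{2}\mathrm{Tr}(\hat{P}_1^\dagger \hat{P}_1) = \frac{\hat{\alpha}_1}{2}.
$$

Remember that $\hat{\alpha}_1$ is the largest eigenvalue of $\hat{D} + \hat{D}^\dagger$. Using Lemma 5.3,

$$
|\hat{\alpha}_1 - 2d| \le \|(\hat{D} + \hat{D}^\dagger) - 2D\| \le 2\|\Delta D\| = 2\|\Delta\Xi\|.
$$

We thus have

$$
2d - 2\|\Delta\Xi\| \le \hat{\alpha}_1 \le 2d + 2\|\Delta\Xi\|.
$$

Therefore,

$$
\sqrt{d - \|\Delta\Xi\|} \le \|\hat{S}\| = \sqrt{\hat{\alpha}_1/2} \le \sqrt{d + \|\Delta\Xi\|}. \tag{5.22}
$$

We also need to estimate $\|\Delta S\|$. Using Lemma 5.2, we have

$$
\begin{aligned}
\|\Delta S\| &= \|\Delta\mathrm{vec}(S)\| \\
&\le \frac{\sqrt{2}\,\|\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - \mathrm{vec}(S)\mathrm{vec}(S)^\dagger\|}{\sqrt{\|\mathrm{vec}(\hat{S})\|^2 + \|\mathrm{vec}(S)\|^2}} \\
&\le \frac{\sqrt{2}\,\|\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - \mathrm{vec}(S)\mathrm{vec}(S)^\dagger\|}{\sqrt{2d - \|\Delta\Xi\|}} \\
&= \Big[\tfrac{1}{\sqrt{d}} + o(1)\Big]\|\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - D\| \\
&\le \Big[\tfrac{1}{\sqrt{d}} + o(1)\Big]\big[\|\mathrm{vec}(\hat{S})\mathrm{vec}(\hat{S})^\dagger - \hat{D}\| + \|\hat{D} - D\|\big] \\
&= \Big[\tfrac{1}{\sqrt{d}} + o(1)\Big]\Big[\min_{\tilde{S}\in\mathbb{C}^{d\times d}} \|\mathrm{vec}(\tilde{S})\mathrm{vec}(\tilde{S})^\dagger - \hat{D}\| + \|\Delta D\|\Big] \\
&\le \Big[\tfrac{1}{\sqrt{d}} + o(1)\Big]\big[\|\mathrm{vec}(G)\mathrm{vec}(G)^\dagger - \hat{D}\| + \|\Delta D\|\big] \\
&= \Big[\tfrac{1}{\sqrt{d}} + o(1)\Big]\cdot 2\|\Delta D\| \sim \tfrac{2}{\sqrt{d}}\|\Delta\Xi\|.
\end{aligned}
\tag{5.23}
$$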

Error in Step 5

We introduce Weyl’s Perturbation Theorem, which can be found in [14].

Lemma 5.3 ([14]). Let $A$, $B$ be Hermitian matrices with eigenvalues $\lambda_1(A) \ge ... \ge \lambda_n(A)$ and $\lambda_1(B) \ge ... \ge \lambda_n(B)$, respectively. Then

$$
\max_j |\lambda_j(A) - \lambda_j(B)| \le \|A - B\|.
$$

Remark 5.4. The original version of Lemma 5.3 was for the operator norm. However, from [14] we know that for any finite-dimensional square matrix, its operator norm is not larger than its Frobenius norm. Therefore this theorem also holds for the Frobenius norm, which is our main focus throughout this thesis.

For the true value we have $S^\dagger S = G^\dagger G = I$ and $\|S\| = \sqrt{d}$. Denote $\hat{S}^\dagger\hat{S} - S^\dagger S = \Delta S^\dagger S$. From the spectral decomposition $\hat{S}^\dagger\hat{S} = \hat{U}_S \hat{E}_S \hat{U}_S^\dagger$, where $\hat{E}_S = \mathrm{diag}(1 + t_1, 1 + t_2, ..., 1 + t_d)$. Hence, $t_j \in \mathbb{R}$. Then

$$
\|\hat{S}^\dagger\hat{S} - S^\dagger S\|^2 = \|\hat{U}_S \hat{E}_S \hat{U}_S^\dagger - I\|^2 = \|\hat{E}_S - I\|^2 = \sum_j t_j^2 = \|\Delta S^\dagger S\|^2.
$$

Thus we know

$$
\begin{aligned}
\|\hat{G} - \hat{S}\|^2 &= \mathrm{Tr}[(\hat{G}^\dagger - \hat{S}^\dagger)(\hat{G} - \hat{S})] \\
&= d - 2\mathrm{Tr}\sqrt{\hat{S}^\dagger\hat{S}} + \mathrm{Tr}(\hat{S}^\dagger\hat{S}) \\
&= d - 2\sum_j \sqrt{1 + t_j} + \sum_j (1 + t_j) \\
&= \sum_j (\sqrt{1 + t_j} - 1)^2 = \sum_j \frac{t_j^2}{2 + t_j + 2\sqrt{1 + t_j}} \\
&= \sum_j t_j^2 \Big[\tfrac{1}{4} - \tfrac{1}{8}t_j + o(t_j)\Big] \\
&= \tfrac{1}{4}\|\Delta S^\dagger S\|^2 + o(\|\Delta S^\dagger S\|^2).
\end{aligned}
\tag{5.24}
$$

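For $\Delta S^\dagger S$, using property (A.7), we have

$$
\begin{aligned}
\|\Delta S^\dagger S\| &= \|\hat{S}^\dagger\hat{S} - S^\dagger S\| \\
&\le \|\hat{S}^\dagger\hat{S} - \hat{S}^\dagger S\| + \|\hat{S}^\dagger S - S^\dagger S\| \\
&\le \|\hat{S}^\dagger\| \cdot \|\hat{S} - S\| + \|S\| \cdot \|\hat{S}^\dagger - S^\dagger\| \\
&= (\|\hat{S}\| + \sqrt{d})\|\hat{S} - S\| \\
&\le (\sqrt{d + \|\Delta\Xi\|} + \sqrt{d})\|\Delta S\| \\
&\sim (\sqrt{d + \|\Delta\Xi\|} + \sqrt{d})\tfrac{2}{\sqrt{d}}\|\Delta\Xi\| \\
&\sim 4\|\Delta\Xi\|.
\end{aligned}
\tag{5.25}
$$

Combining (5.24) and (5.25), we obtain

$$
\|\hat{G} - \hat{S}\| \sim \tfrac{1}{2}\|\Delta S^\dagger S\| \le 2\|\Delta\Xi\|. \tag{5.26}
$$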

From Sec. 5.2.3, we know there is in fact an extra degree of freedom $\phi$ in the estimated $e^{i\phi}\hat{G}^T$ and it can be eliminated using prior knowledge. Here, we take

$$
\|\Delta G\| = \min_{\phi} \|e^{i\phi}\hat{G} - G\|.
$$

Then we have

$$
\begin{aligned}
\|\Delta G\| &\le \|\hat{G} - \hat{S}\| + \|\hat{S} - S\| + \|S - G\| \\
&= \|\hat{G} - \hat{S}\| + \|\Delta S\|.
\end{aligned}
\tag{5.27}
$$

Now by substituting equations (5.23) and (5.26) into (5.27), we have

$$
\begin{aligned}
\|\Delta G\| &\le \|\hat{G} - \hat{S}\| + \|\Delta S\| \\
&\le 2\|\Delta\Xi\| + \tfrac{2}{\sqrt{d}}\|\Delta\Xi\| \\
&\sim O(\|\Delta\Xi\|).
\end{aligned}
\tag{5.28}
$$

Error in Step 6

In this part we need the following lemma:

Lemma 5.4. For $\theta \in [-\pi, \pi]$, $\frac{2}{\pi^2}\theta^2 \le 1 - \cos\theta$.

Based on differential analysis up to the second-order derivative, the proof of Lemma 5.4 is straightforward and hence we omit the details.
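One possible way to fill in the omitted argument is sketched below; it is offered as an illustration and is not taken from the thesis. Define

$$
f(\theta) = 1 - \cos\theta - \frac{2}{\pi^2}\theta^2, \qquad f'(\theta) = \sin\theta - \frac{4}{\pi^2}\theta, \qquad f''(\theta) = \cos\theta - \frac{4}{\pi^2},
$$

so that $f(0) = f(\pi) = 0$, $f'(0) = 0$ and $f'(\pi) = -\frac{4}{\pi} < 0$. On $[0, \pi]$ the second derivative $f''$ is strictly decreasing with $f''(0) > 0 > f''(\pi)$, so $f'$ first increases from 0 and then decreases to a negative value, changing sign exactly once on $(0, \pi)$. Consequently $f$ first increases and then decreases between its zeros at $0$ and $\pi$, which gives $f(\theta) \ge 0$ on $[0, \pi]$. Since $f$ is even, the inequality extends to $[-\pi, \pi]$.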

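Suppose the system Hamiltonian has a spectral decomposition $tH = -Q_H \Theta_H Q_H^\dagger$, where $\Theta_H = \mathrm{diag}(\zeta_1, \zeta_2, ..., \zeta_d)$. Since $t$ satisfies Assumption 5.1, we have $|\zeta_i - \zeta_j| \le \pi$ for every $i, j = 1, 2, ..., d$. Let $K = \hat{Q}_H^\dagger Q_H$, which is also unitary. Then, we have

$$
\begin{aligned}
t^2\|\hat{H} - H\|^2 &= \|\hat{Q}_H \hat{\Theta}_H \hat{Q}_H^\dagger - Q_H \Theta_H Q_H^\dagger\|^2 \\
&= \|\hat{\Theta}_H - K\Theta_H K^\dagger\|^2 \\
&= \mathrm{Tr}(\hat{\Theta}_H^2 + \Theta_H^2) - 2\mathrm{Tr}(\hat{\Theta}_H K \Theta_H K^\dagger) \\
&= \sum_j (\hat{\zeta}_j^2 + \zeta_j^2) - 2\sum_{j,k} \hat{\zeta}_j \zeta_k |K_{jk}|^2 \\
&= \sum_{j,k} (\hat{\zeta}_j^2 + \zeta_k^2)|K_{jk}|^2 - 2\sum_{j,k} \hat{\zeta}_j \zeta_k |K_{jk}|^2 \\
&= \sum_{j,k} (\hat{\zeta}_j - \zeta_k)^2 |K_{jk}|^2.
\end{aligned}
$$

Now using Lemma 5.4, we have

$$
\begin{aligned}
\frac{4t^2}{\pi^2}\|\hat{H} - H\|^2 &\le 2\sum_{j,k} [1 - \cos(\hat{\zeta}_j - \zeta_k)]|K_{jk}|^2 \\
&= 2\sum_{j,k} |K_{jk}|^2 - 2\sum_{j,k} \cos(\zeta_k - \hat{\zeta}_j)|K_{jk}|^2 \\
&= 2d - 2\mathrm{Re}\Big(\sum_{j,k} e^{i(\zeta_k - \hat{\zeta}_j)}|K_{jk}|^2\Big) \\
&= \mathrm{Tr}(I + I) - 2\mathrm{Re}\Big(\sum_j e^{-i\hat{\zeta}_j}\sum_k e^{i\zeta_k}|K_{jk}|^2\Big) \\
&= \mathrm{Tr}[\exp(i\hat{\Theta}_H)\exp(-i\hat{\Theta}_H) + \exp(i\Theta_H)\exp(-i\Theta_H)] \\
&\quad - 2\mathrm{Re}\{\mathrm{Tr}[\exp(-i\hat{\Theta}_H)K\exp(i\Theta_H)K^\dagger]\} \\
&= \|\exp(i\hat{\Theta}_H) - K\exp(i\Theta_H)K^\dagger\|^2 \\
&= \|\hat{Q}_H\exp(i\hat{\Theta}_H)\hat{Q}_H^\dagger - Q_H\exp(i\Theta_H)Q_H^\dagger\|^2 \\
&= \|\hat{G}^T - G^T\|^2 = \|\Delta G\|^2.
\end{aligned}
\tag{5.29}
$$

Hence, we have

$$
\|\Delta H\| \sim O(\|\Delta G\|). \tag{5.30}
$$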

Total Error

We combine equations (5.30), (5.28), (5.20) and (5.17) to obtain

$$
\mathbb{E}\|\Delta H\|^2 \sim \mathbb{E}[O(\|\Delta G\|^2)] \sim \mathbb{E}[O(\|\Delta\Xi\|^2)] \sim O(d^2\,\mathrm{Est}) \sim O\Big(\frac{d^6}{N_O}\Big),
$$

which concludes the proof of Theorem 5.4.
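The passage from this mean-squared bound to the scaling of $\mathbb{E}\|\hat{H} - H\|$ stated in Theorem 5.4 can be made explicit as follows. This is a sketch; the use of Jensen's inequality is our reading and is not spelled out in the text. From (5.20), the prefactor of $\mathrm{Est}$ is $(6 + 4\sqrt{2})d(d-1) + d \sim O(d^2)$, and with $\mathrm{Est} \sim O(d^4/N_O)$,

$$
\mathbb{E}\|\Delta H\| \le \sqrt{\mathbb{E}\|\Delta H\|^2} \sim \sqrt{O\Big(\frac{d^6}{N_O}\Big)} = O\Big(\frac{d^3}{\sqrt{N_O}}\Big).
$$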

From Theorem 5.4, we can also obtain the following corollary.

Corollary 5.1. If $\{F_i\}$ and $\{\rho_m\}$ are chosen as natural basis of $\mathbb{C}^{d\times d}$, and the evolution time $t$ is fixed and satisfies Assumption 5.1, then the TSO Hamiltonian identification method is asymptotically unbiased.

5.2.4.2 Upper error bound vs evolution time

Using a similar idea to the above, we can characterize the estimation error for diﬀerent evolution times t:

Theorem 5.5. If $\{F_i\}$ and $\{\rho_m\}$ are chosen as a natural basis of $\mathbb{C}^{d\times d}$ and $N_O$ is fixed, then the estimation error of the TSO Hamiltonian identification method scales as $\mathbb{E}\|\hat{H} - H\| \sim O(\frac{1}{t})$ where $t$ satisfies Assumption 5.1.

The proof of Theorem 5.5 is similar to the proof of Theorem 5.4. Note from (5.29), we have

$$
\frac{2t}{\pi}\|\hat{H} - H\| \le \|\hat{G} - G\|, \tag{5.31}
$$

which combined with (5.28), (5.20) and (5.17) leads to the conclusion in Theorem 5.5. It is worth pointing out that in this theorem, the evolution time cannot be arbitrarily large; rather, it must be upper bounded according to Assumption 5.1. Hence, this scaling only holds in a certain region.

5.2.5 Numerical results of TSO

We perform numerical simulations using MATLAB on a PC. It is worth mentioning that the selection of natural bases is only a mathematical representation tool in the identification algorithm. When performing measurements on the output states, our framework is applicable to many general measurement bases, such as cube bases [38], MUB bases [73, 99, 157], SIC-POVMs [117], etc. In our simulations for the TSO method, we choose cube measurement bases. The single-qubit cube measurement set consists of six measurement operators: $\{\frac{I\pm\sigma_x}{2}, \frac{I\pm\sigma_y}{2}, \frac{I\pm\sigma_z}{2}\}$, and the multi-qubit cube measurement set is the tensor product of the single-qubit cube set. After the measurements, we then use the LRE method to reconstruct the output states. In fact other QPT methods (like MLE, BME, etc.) are also applicable in our framework, while the computational complexity and error characterization might change.
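For concreteness, the cube measurement set described above can be generated as in the following Python/NumPy sketch. It is only an illustration (the simulations in the thesis were done in MATLAB); the names `SINGLE_QUBIT_CUBE` and `cube_measurement_set` are hypothetical.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = [np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_x
          np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_y
          np.array([[1, 0], [0, -1]], dtype=complex)]     # sigma_z

# Six single-qubit operators (I +/- sigma)/2.
SINGLE_QUBIT_CUBE = [(I2 + s * P) / 2 for P in PAULIS for s in (+1, -1)]

def cube_measurement_set(n_qubits):
    """All tensor products of single-qubit cube operators (6**n_qubits of them)."""
    ops = []
    for combo in itertools.product(SINGLE_QUBIT_CUBE, repeat=n_qubits):
        op = np.array([[1.0 + 0j]])
        for factor in combo:
            op = np.kron(op, factor)
        ops.append(op)
    return ops

cube2 = cube_measurement_set(2)   # two-qubit cube set
```

Each operator is a rank-one projector, and the six single-qubit operators sum to $3I$, so the $n$-qubit set sums to $3^n I$.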


5.2.5.1 Performance illustration

First we illustrate the relationship between the mean squared error (MSE) and the resource number. Let $N_t$ be the total number of resources, i.e., the total number of copies of different quantum states used as probes. Considering (5.12), we have $N_t = \frac{3d^2 - d}{2}N_O$. In Fig. 5.2, the vertical axis is $\log_{10}\mathbb{E}\,\mathrm{Tr}[(\hat{H} - H)^2]$ and the horizontal axis is $\log_{10} N_t$. Assume that the real Hamiltonian is taken as

$$
H = \begin{pmatrix}
5 & 0.1 & 3i & 4i \\
0.1 & -1 & 1.8 & 0.9 \\
-3i & 1.8 & 2 & 0.7i \\
-4i & 0.9 & -0.7i & 3
\end{pmatrix}. \tag{5.32}
$$

The distance between its largest and smallest eigenvalues is 11.95. The evolution time is $t = 0.1$ and each point is the average of 10 repetitive runs. The fitting slope is $-1.0131 \pm 0.0154$, which matches the theoretical result in Theorem 5.4.

Now we demonstrate the relationship between the MSE and the evolution time. For the same 2-qubit Hamiltonian in (5.32), we fix the number of copies in state tomography for each output state as 36 × 1000 and perform simulations for different evolution times t. The result is shown in Fig. 5.3 and each point is the average of 10 repetitive runs. The fitting slope is −2.0759 ± 0.0268, which matches the theoretical result in Theorem 5.5.

Figure 5.2: MSE of TSO versus the total resource number. [Plot: $\log_{10}\mathrm{MSE}$ versus $\log_{10} N_t$.]

Figure 5.3: MSE of TSO versus different evolution times. [Plot: $\log_{10}\mathrm{MSE}$ versus $t$.]

Moreover, we present an example to illustrate the relationship between the MSE and the qubit number. Let $N_q$ denote the number of qubits; i.e., $d = 2^{N_q}$. We perform simulations when $N_q$ increases from 1 to 5. We set

$$
H = \begin{pmatrix} 1 & 0.9 + 0.9i \\ 0.9 - 0.9i & 2 \end{pmatrix}^{\otimes N_q}
$$

and $t = 0.01$. For $N_q = 5$, the distance between the largest and smallest eigenvalues of $H$ is 193.87. The number of copies for state tomography is $36 \times 1000$ for each output state. The result is shown in Fig. 5.4 and each point is the average of 10 repetitive runs. We observe that as $N_q$ increases, the error bars shrink. This is because the error itself grows with $N_q$, so the fluctuations become small relative to it. We examine various Hamiltonians and obtain similar results. Furthermore, we observe that the upper error bound in Theorem 5.4 indicates a slope larger than that of the fitted line in Fig. 5.4. This is because as $d$ increases, the Hamiltonian will necessarily change, while different Hamiltonians usually lead to different identification errors even when they have the same dimension. Also, it is possible that the bound in Theorem 5.4 w.r.t. $d$ is not tight. We thus leave it as an open problem to further investigate the relationship between the identification error and the dimension.

Figure 5.4: MSE of TSO versus number of qubits. [Plot: $\log_{10}\mathrm{MSE}$ versus $N_q$.]
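The tensor-power test Hamiltonian and its spectral spread can be reproduced with a few lines of Python/NumPy. This is a sketch for illustration only; the helper names `tensor_power` and `spectral_spread` are hypothetical, and the check $t(\lambda_d - \lambda_1) \le \pi$ reflects the condition $|\zeta_i - \zeta_j| \le \pi$ that the error analysis attributes to Assumption 5.1.

```python
from functools import reduce
import numpy as np

h1 = np.array([[1, 0.9 + 0.9j],
               [0.9 - 0.9j, 2]])

def tensor_power(h, n):
    """n-fold Kronecker power of a single-qubit matrix."""
    return reduce(np.kron, [h] * n)

def spectral_spread(H):
    """Distance between the largest and smallest eigenvalues of a Hermitian H."""
    w = np.linalg.eigvalsh(H)   # ascending order
    return w[-1] - w[0]

t = 0.01
spreads = {n: spectral_spread(tensor_power(h1, n)) for n in range(1, 6)}
```

For $N_q = 5$ the computed spread can be compared with the value 193.87 quoted in the text.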


5.2.5.2 Performance comparison

We compare the performance of the TSO QHI method with the QHI approach developed by Zhang and Sarovar in [165], which is based on the eigenstate realization algorithm in classical identiﬁcation theory (abbreviated as the ERA method hereafter).

The ERA method can be used to give a general solution to QHI although it was originally presented for the identification of partial parameters in the system Hamiltonian. The ERA method first converts QHI into a system identification problem in the real domain, where the transfer function of the equivalent linear system can be obtained. From temporal records of system observables, it can reconstruct the transfer function. Then equating the coefficients of the transfer functions with unknown parameters to those from the experimental data, the ERA method leads to a set of multivariate polynomial equations, whose solution yields the estimates of the Hamiltonian parameters.

The ERA approach is only efficient if the number of parameters to be identified in the Hamiltonian is small. This is because solving multivariate polynomial equations takes a considerable amount of time, especially for high dimensional systems or for full Hamiltonian identification with complex quantum systems. In fact, common algorithms for solving multivariate polynomial equations can be super-exponential when the number of variables scales up [16].

To illustrate the efficiency of the TSO Hamiltonian identification method, we compare it with the ERA method by numerical simulations, which we performed on a single thread of a computer cluster with 2 Intel Xeon E5-2680v3 CPUs and 256 GB memory. We consider the following Hamiltonian for a 1D chain of $N_q$ qubits, which is the example investigated in [165]:

$$
H = \sum_{k=1}^{N_q} \frac{\omega_k}{2}\sigma_z^k + \sum_{k=1}^{N_q - 1} \eta_k(\sigma_+^k\sigma_-^{k+1} + \sigma_-^k\sigma_+^{k+1}).
$$

Figure 5.5: Running time versus qubit number for the ERA method in [165] and our TSO method. [Plot: $\log_{10} T_R$ versus $N_q$; legend: ERA Method, TSO Algorithm.]

Here $\omega_k$ and $\eta_k$ are unknown parameters to be identified, where $\eta_k$ is the coupling strength between the $k$-th and $(k+1)$-th spins and $\sigma_\pm = \frac{1}{2}(\sigma_x \pm i\sigma_y)$. Running on the same computer cluster, we compare the consumed time of our TSO QHI method versus the ERA method for the cases of $N_q = 3$, 4, and 5. For the TSO method, we do not utilize the prior structural knowledge (1D-chain) of the targeted Hamiltonian, whereas this information is used in the ERA method. Fig. 5.5 shows the numerical result, where the vertical axis is the programs' running time $T_R$ (in units of seconds) in a logarithmic scale, and the horizontal axis is the number of qubits $N_q$. The red diamonds are the times from the ERA method, whereas the blue dots are for the TSO identification method. The numerical results show that the TSO method is much faster (e.g., around 100 times faster for $N_q = 4$) than the ERA method even if we do not use the prior knowledge of the Hamiltonian's structure. It is worth mentioning that the efficiency of the TSO algorithm usually depends on the system size but not the number of parameters for a given system size, while the performance of the ERA method significantly depends on the system size as well as the number of parameters to be identified. The efficiency advantage of the TSO algorithm becomes remarkable as the system size and the number of parameters to be identified increase.
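The 1D-chain Hamiltonian used in this comparison can be assembled numerically as in the Python/NumPy sketch below. It is an illustration under stated assumptions: site $k$ acts on the $k$-th tensor factor, the helper names `site_op` and `chain_hamiltonian` are hypothetical, and the parameter values are arbitrary placeholders rather than those used in [165].

```python
from functools import reduce
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SP, SM = (SX + 1j * SY) / 2, (SX - 1j * SY) / 2   # sigma_+ and sigma_-

def site_op(op, k, n):
    """Embed a single-qubit operator at site k (0-based) of an n-qubit chain."""
    return reduce(np.kron, [op if j == k else I2 for j in range(n)])

def chain_hamiltonian(omega, eta):
    """H = sum_k omega_k/2 sz^k + sum_k eta_k (s+^k s-^{k+1} + s-^k s+^{k+1})."""
    n = len(omega)
    H = sum(omega[k] / 2 * site_op(SZ, k, n) for k in range(n))
    for k in range(n - 1):
        hop = site_op(SP, k, n) @ site_op(SM, k + 1, n)
        H = H + eta[k] * (hop + hop.conj().T)
    return H

# Placeholder parameters for N_q = 3: three omegas and two couplings.
omega, eta = [1.0, 1.2, 0.8], [0.3, 0.4]
H3 = chain_hamiltonian(omega, eta)
```

For $N_q = 3$ this structure has five free parameters, whereas a general traceless $8 \times 8$ Hermitian matrix has 63, which is the contrast drawn in the error comparison that follows.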

For the case of $N_q = 3$, we further compare the identification error of the ERA method and the TSO method. We assume that the error in the measurement data in the TSO method is the same as the zero mean Gaussian noise in [165] with standard deviation 0.01, and the total measurement times are equal for both identification methods. We compare the percentage relative errors in the estimates of the five unknown parameters. For the ERA method, the mean relative errors are (−0.0018%, 0.1104%, 0.4338%, −0.0209%, −0.0809%), while the relative errors using the TSO method are (0.2200%, 0.2914%, 0.5403%, 0.0037%, 0.0129%). The identification errors of the TSO method are usually larger than those using the ERA method since the ERA method uses structure information and identifies only 5 unknown parameters while the TSO method identifies 63 unknown parameters in this example. It is worth mentioning that the focus of the ERA method and the focus of the TSO method are different. ERA aims to take full advantage of prior knowledge to identify the parameters in the Hamiltonian. Without prior knowledge, a two-qubit Hamiltonian has 15 unknown parameters to be identified, which is already a heavier task than the $N_q = 5$ case in Fig. 5.5 using the ERA method. For the TSO method, in contrast, the aim is to design a full identification algorithm with improved computational efficiency for a general Hamiltonian. Moreover, in the TSO method we can use Theorem 5.4 to estimate how many resources will be needed to attain a certain level of identification error.


5.3 Pure-state-based Gate Identiﬁcation

In this section, we improve the TSO method in Sec. 5.2 to a Pure-state-based Gate Identiﬁcation (PGI) method, with better computational complexity. We start from the problem formulation.

5.3.1 Problem formulation of gate identiﬁcation

Denote by $U$ the quantum gate to be identified. Then the system evolution equation is in the same form as (5.11):

$$
\rho_{out} = U\rho_{in}U^\dagger.
$$

Since the Hamiltonian is the generator of the unitary propagator, we can still model the gate identification problem as Problem 5.3. From Proposition 5.2 we know that the relationship between the gate and the estimated $\hat{G}$ from Problem 5.3 is $\hat{U} = e^{i\phi}\hat{G}^T$, where $\phi \in \mathbb{R}$ is the global phase. In the following, we introduce PGI as an improved version of TSO in two aspects: (i) the original LRE-based QST method on output states is replaced by a fast pure-state tomography method; (ii) the original TSO solution to Problem 5.3 is modified to enhance the efficiency.

5.3.2 Fast pure-state tomography

For general state tomography with no prior knowledge, one of the most efficient methods is LRE [113], which has computational complexity $O(d^4)$ for reconstructing a $d$-dimensional state. This is the reason we employ LRE in Step 1 of TSO. Note from (5.12) that all the input states are pure. Hence, all the output states from the quantum gates are also pure. Using this information, we can establish a fast pure-state tomography protocol with computational complexity $O(d^2)$.

Denote an output state to be reconstructed as $\rho$. Under the natural basis, $\rho_{ij} = \langle i|\rho|j\rangle$ for $1 \le i, j \le d$. Since $\rho$ is pure, we can write $\rho = |\psi\rangle\langle\psi|$. We do not know $\rho$ or $|\psi\rangle$ yet, but writing $\rho$ in this form will help to facilitate the theoretical analysis.

Denote the $i$-th row and $j$-th column of $\rho$ as $\rho_{i\sigma}$ and $\rho_{\sigma j}$, respectively. We have $\rho_{\sigma j} = |\psi\rangle\langle\psi|j\rangle$. Note that $\rho$ is of rank one and thus each nonzero column contains all the information apart from a trivial global phase. Therefore, we only need to reconstruct one column of $\rho$. The chosen column should have the largest (or almost the largest) $|\langle\psi|j\rangle|$ so that the error is suppressed. Since $\rho_{jj} = |\langle\psi|j\rangle|^2$, the norm of each column is thus indicated by the corresponding diagonal element. We thus need to reconstruct the column in which the largest diagonal element of $\rho$ resides.

We begin the practical measurement by first taking the measurement basis as $|j\rangle\langle j|$ ($1 \le j \le d$) to estimate all the diagonal elements of $\rho$. Among these estimated values, we denote the largest one as $\hat{\rho}_{ss}$. For the true values, the largest diagonal element might not be $\rho_{ss}$ due to experimental inaccuracy. We will show in Sec. 5.3.5 that our procedures still bound the final estimation error. Since we assume no prior knowledge on $\rho$, this index $s$ cannot be determined before the experiment. Now with $s$ fixed, for each $j \ne s$ we take the measurement basis as

$$
P_j^{(s)} = \frac{(|s\rangle + |j\rangle)(\langle s| + \langle j|)}{2}
$$

and

$$
Q_j^{(s)} = \frac{(|s\rangle + i|j\rangle)(\langle s| - i\langle j|)}{2}
$$

and obtain the corresponding estimators as $\hat{\rho}_{js}^{+}$ and $\hat{\rho}_{js}^{-}$; i.e., $\rho_{js}^{+} = \mathrm{Tr}(\rho P_j^{(s)})$ and $\rho_{js}^{-} = \mathrm{Tr}(\rho Q_j^{(s)})$. We multiply (5.12) with $\rho$, take the trace and replace the original indices $(j,k)$ with $(s,j)$ to obtain

$$
\rho_{js} = \rho_{js}^{+} + i\rho_{js}^{-} - \frac{1+i}{2}\rho_{ss} - \frac{1+i}{2}\rho_{jj},
$$

which can be used to calculate $\hat{\rho}_{js}$ since $\hat{\rho}_{js}^{+}$, $\hat{\rho}_{js}^{-}$, $\hat{\rho}_{ss}$ and $\hat{\rho}_{jj}$ are already in hand.

Aligning $\hat{\rho}_{1s}, \hat{\rho}_{2s}, ..., \hat{\rho}_{ds}$ into a column vector, we obtain $\hat{\rho}_{\sigma s}$, i.e., the estimate of $|\psi\rangle\langle\psi|s\rangle$. Then we take $\hat{\rho} = \hat{\rho}_{\sigma s}\hat{\rho}_{\sigma s}^\dagger / \mathrm{Tr}(\hat{\rho}_{\sigma s}\hat{\rho}_{\sigma s}^\dagger)$ as the final estimator of $\rho$. It is clear that the above procedure has computational complexity $O(d^2)$.
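The fast pure-state tomography protocol above can be sketched in Python/NumPy as follows. This is a minimal illustration, not the thesis implementation: the callable `prob` is a hypothetical stand-in for the experiment and returns the (estimated) probability $\mathrm{Tr}(\rho M)$ for a projector $M$; in the demonstration it returns exact probabilities, so the reconstruction is exact.

```python
import numpy as np

def fast_pure_state_tomography(prob, d):
    """Reconstruct a pure state from 1 + 2(d - 1) measurement settings."""
    basis = np.eye(d, dtype=complex)

    def proj(vec):
        return np.outer(vec, vec.conj())

    # Diagonal elements from the projectors |j><j|.
    diag = np.array([prob(proj(basis[:, j])) for j in range(d)])
    s = int(np.argmax(diag))                     # column with the largest norm
    col = np.zeros(d, dtype=complex)
    col[s] = diag[s]
    for j in range(d):
        if j == s:
            continue
        rho_plus = prob(proj((basis[:, s] + basis[:, j]) / np.sqrt(2)))        # Tr(rho P)
        rho_minus = prob(proj((basis[:, s] + 1j * basis[:, j]) / np.sqrt(2)))  # Tr(rho Q)
        col[j] = rho_plus + 1j * rho_minus - (1 + 1j) / 2 * (diag[s] + diag[j])
    # Normalized rank-one estimate built from the reconstructed column.
    return np.outer(col, col.conj()) / np.vdot(col, col).real

# Demonstration with exact probabilities for a random pure state.
rng = np.random.default_rng(2)
d = 4
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())
rho_hat = fast_pure_state_tomography(lambda M: np.trace(rho @ M).real, d)
```

With finite statistics, `prob` would return measured frequencies and `rho_hat` would only approximate `rho`; the error analysis in Sec. 5.3.5 addresses that case.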

We realize $P_j^{(s)}$ and $Q_j^{(s)}$ in different sets of POVMs in the simulation and experiment part of this chapter. Since all of the eigenvalues of $P_j^{(s)}$ are either 1 or 0, $I - P_j^{(s)}$ is also a positive semidefinite operator. For the same reason, $I - Q_j^{(s)}$ is also positive semidefinite. Hence, we perform two sets of POVM measurements $\{P_j^{(s)}, I - P_j^{(s)}\}$ and $\{Q_j^{(s)}, I - Q_j^{(s)}\}$ for every $j \ne s$. The total number of different POVM sets to reconstruct one output state is

$$
N_P = 1 + 2(d - 1) = 2d - 1. \tag{5.33}
$$

5.3.3 Gate reconstruction

Now we illustrate how to solve Problem 5.3 more efficiently. We begin with the special structure of Problem 5.3.1. In Problem 5.3.1, we need to recover $\mathrm{vec}(\hat{S})$ from $\hat{D}$. For the true value, $D$ is of rank one and thus each nonzero row (or column) contains sufficient information about $\mathrm{vec}(S)$. Hence, we do not need to obtain all the elements of $\hat{D}$. Also, Theorem 5.3 indicates that $\hat{D}$ is obtained by rearranging the elements in $\hat{\Xi}$. We can thus input a subset of the input states to the gate and reconstruct the corresponding output states. From these output states we obtain partial elements of $\hat{\Xi}$ and determine partial rows of $\hat{D}$, from which we recover $\mathrm{vec}(\hat{S})$. We thus need to determine the specific mapping from the data $\hat{\Xi}$ to $\hat{D}$.

From (5.16) we know all the nonzero elements of $B$ are

$$
\beta^{(p,s)(q,t)}_{(s,t)(p,q)} = 1 \tag{5.34}
$$

for $p, q, s, t = 1, 2, ..., d$. Since $\hat{D} = \mathrm{vec}^{-1}[B^T\mathrm{vec}(\hat{\Xi})]$, we have

$$
B^T\mathrm{vec}(\hat{\Xi}) = \mathrm{vec}(\hat{D}), \tag{5.35}
$$

which is equivalent to

$$
\sum_{ab} \beta^{jk}_{ab}\hat{\Xi}_{ab} = \hat{D}_{jk}. \tag{5.36}
$$

Comparing (5.34) with (5.36), we have

$$
\hat{\Xi}_{(s,t)(p,q)} = \hat{D}_{(p,s)(q,t)}, \tag{5.37}
$$

which shows how the elements in $\hat{\Xi}$ are rearranged to obtain $\hat{D}$.

To employ only partial rows of $\hat{D}$, a first idea is to employ only one row. Then it is necessary to fix both $p$ and $s$, instead of taking every integer from $\{1, 2, ..., d\}$. However, this is generally not applicable, because one might happen to take a row from $\hat{D}$ that has a very small norm, which greatly amplifies the identification error. Hence, we need to take more than one row of $\hat{D}$. Without loss of generality, we fix the value of $s$ as 1. Then we need to collect all the elements in the $((p-1)d + 1)$-th rows of $\hat{D}$ ($1 \le p \le d$); i.e., $\hat{D}_{(p,1)\sigma}$. Suppose we already have $\hat{\Xi}_{ab}$, which corresponds to $\hat{D}_{jk}$. From (5.37) we establish the correspondence between indices as

$$
\begin{cases}
a = (s,t) = (s-1)d + t, \\
b = (p,q) = (p-1)d + q, \\
j = (p,s) = (p-1)d + s, \\
k = (q,t) = (q-1)d + t.
\end{cases}
\tag{5.38}
$$

To clearly write down the rule to obtain $\hat{D}_{jk}$ from $\hat{\Xi}_{ab}$, the dominant indices should be $a$ and $b$, which range from 1 to $d$, and 1 to $d^2$, respectively. Then, we consider the subordinate indices $s$, $t$, $p$, $q$, $j$ and $k$. Since we focus on $\hat{D}_{(p,1)\sigma}$, we already have

$$
s \equiv 1 \tag{5.39}
$$

and therefore

$$
t = a. \tag{5.40}
$$

From the second equation of (5.38), we have

$$
q = 1 + (b - 1) \bmod d \tag{5.41}
$$

and

$$
p = \{b - [1 + (b - 1) \bmod d]\}/d + 1. \tag{5.42}
$$

From the third equation in (5.38), we have

$$
j = (p - 1)d + 1. \tag{5.43}
$$

By eliminating $s$, $t$, $p$ and $q$ from (5.38) to (5.43), we know $\hat{\Xi}_{ab}$ is in the $[b - (b-1) \bmod d]$-th row and $\{[(b-1) \bmod d]d + a\}$-th column of $\hat{D}$. The computational complexity is $O(d^3)$ to calculate all $\hat{D}_{(p,1)\sigma}$ ($1 \le p \le d$) from $\hat{\Xi}_{ab}$ ($1 \le a \le d$, $1 \le b \le d^2$). Moreover, from (5.6) and Theorem 5.3, we know that obtaining these $\hat{\Xi}_{ab}$ requires reconstructing $\mathcal{E}(|1\rangle\langle k|)$ for $k = 1, 2, ..., d$, where the inputs $|1\rangle\langle k|$ are just $\{\rho_m\}_{m=1}^{d}$. From (5.12), we need to input $3(d-1) + 1 = 3d - 2$ classes of probe states to the gate.

Then to solve Problem 5.3.1, we calculate $\|\hat{D}_{(p,1)\sigma}\|$ for all $1 \le p \le d$ and find the row with the largest row vector norm, denoted by $\hat{D}_{(z,1)\sigma}$; i.e.,

$$
z = \mathrm{argmax}_{p\in\{1,2,...,d\}} \|\hat{D}_{(p,1)\sigma}\|.
$$
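The index bookkeeping of (5.38) to (5.43) is easy to get wrong in code, so a small sketch may help. The following plain-Python function (the name `xi_to_d_index` is hypothetical) maps a 1-based position $(a, b)$ of $\hat{\Xi}$, with $s$ fixed to 1, to the 1-based position $(j, k)$ in $\hat{D}$.

```python
def xi_to_d_index(a, b, d):
    """Map 1-based (a, b), 1 <= a <= d and 1 <= b <= d*d, to 1-based (j, k)."""
    q = 1 + (b - 1) % d          # (5.41)
    p = (b - q) // d + 1         # (5.42)
    j = (p - 1) * d + 1          # (5.43), using s = 1
    k = (q - 1) * d + a          # fourth line of (5.38), using t = a
    return j, k

d = 3
index_map = {(a, b): xi_to_d_index(a, b, d)
             for a in range(1, d + 1) for b in range(1, d * d + 1)}
```

The map touches $d^3$ entries in total, which is consistent with the $O(d^3)$ complexity stated above, and it only ever lands in the rows $(p-1)d + 1$ of $\hat{D}$.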

For the true values, we have

† D(z,1)σ = G1zvec(G) . (5.44)

ˆ −1 ˆ † We thus take vec(S) = G1z D(z,1)σ, which is

ˆ −1 −1 ˆ † S = G1z vec [D(z,1)σ]. (5.45)

162 5.3. PURE-STATE-BASED GATE IDENTIFICATION

Though we do not know the value of G1z, we will later prove that G1z in (5.45) can be substituted by any nonzero number. Without loss of generality, we let G1z = 1 and use ˆ −1 ˆ † S = vec [D(z,1)σ] (5.46) instead of (5.45) in practical applications.

Remark 5.5. The procedure of finding $\hat{D}_{(z,1)\sigma}$ from $\hat{D}_{(p,1)\sigma}$ ($1 \le p \le d$) is in fact searching for $z = \operatorname{argmax}_{p \in \{1,2,...,d\}} |\hat{G}_{1p}|$, which generally has computational complexity $O(d^3)$. In the best case, this can be reduced to $O(1)$, although it may need some prior information about the gate or a different basis set. However, the computational complexity of the whole algorithm is in general at the level of $O(d^3)$.

Then we continue using the original solution to Problem 5.3.2 in Sec. 5.2.3.1, i.e., (5.14), from which we consider the effect of $G_{1z}$ when we use (5.46) instead of (5.45). If we multiply $\hat{S}$ by a nonzero real number, then the optimal $\hat{G}$ from (5.14) does not change. If we multiply $\hat{S}$ by $e^{i\theta}$, then this degree of freedom in the phase can be incorporated into $\hat{U}$. This proves that we can substitute $G_{1z}$ in (5.45) by any nonzero number, which validates the feasibility of employing (5.46).
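The scale and phase argument above can be illustrated numerically. Equation (5.14) lies outside this section, so the sketch below assumes that it returns the unitary closest to $\hat{S}$ in Frobenius norm (the unitary polar factor); `nearest_unitary` is a hypothetical helper written under that assumption, not the thesis implementation.

```python
import numpy as np


def nearest_unitary(S):
    """Unitary polar factor of S, i.e. the minimizer of ||G - S||_F over unitary G."""
    U, _, Vh = np.linalg.svd(S)
    return U @ Vh


rng = np.random.default_rng(0)
d = 4
# A random unitary "true" G and a slightly perturbed S_hat (synthetic data).
G_true, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
S_hat = G_true + 0.01 * (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))

G0 = nearest_unitary(S_hat)
G_scaled = nearest_unitary(2.7 * S_hat)                  # positive rescaling
theta = 0.6
G_phase = nearest_unitary(np.exp(1j * theta) * S_hat)    # global phase

print(np.allclose(G_scaled, G0))                         # rescaling leaves the estimate unchanged
print(np.allclose(G_phase, np.exp(1j * theta) * G0))     # a phase only shifts the global phase
```

Under this assumption, a positive factor changes only the singular values of $\hat{S}$, while a phase factor carries through to the estimate and is removed by the global-phase calibration.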

5.3.4 General procedure and computational complexity

We summarize the general procedure of the PGI method and analyze its computational complexity as follows.

Step 1. Employ (5.12) and the fast quantum state tomography protocol in Sec. 5.3.2 to reconstruct the output states $\hat{\mathcal{E}}(|1\rangle\langle k|)$. Since there are $3d-2$ classes of output states to be reconstructed, the computational complexity is $O(d) \times O(d^2) = O(d^3)$.

Step 2. Based on (5.6) and (5.37), obtain $\hat{D}_{(p,1)\sigma}$ for all $1 \le p \le d$. The computational complexity is $O(d^3)$ according to Sec. 5.3.1.


Step 3. For $\hat{D}_{(p,1)\sigma}$, find the row with the largest row vector norm as $\hat{D}_{(z,1)\sigma}$. Use (5.46) and (5.14) to obtain $\hat{G}$, and the estimated gate is $\hat{U} = \hat{G}^{T}$. Use prior knowledge to multiply $\hat{U}$ by $e^{i\theta}$ to calibrate the global phase. Based on Step 5 of the TSO method, we know the computational complexity is $O(d^3)$.

Hence, the total computational complexity of our PGI algorithm is $O(d^3)$, which is much lower than the complexity $O(d^6)$ of the TSO method.

5.3.5 Error analysis

We present the following theorem to characterize the gate identification error of the PGI method.

Theorem 5.6. If $\{F_i\}$ and $\{\rho_m\}$ are chosen as the natural basis of $\mathbb{C}^{d \times d}$, then the identification error $E\|\hat{U} - U\|$ scales as $O\left(\frac{d^{2.5}}{\sqrt{N_O}}\right)$ in the PGI method, where $N_O$ is the number of copies of each probe state.

Proof. a) Error in fast QST

Recall from (5.33) that $N_P$ is the number of different POVM sets for each output state. The $N_O/N_P$ outcomes for each measurement of $\rho_{jj}$ are i.i.d. (see, e.g., [113]).

According to the central limit theorem [32], $\Delta\rho_{jj}$ converges in distribution to a normal distribution with mean zero and variance $\frac{\rho_{jj} - \rho_{jj}^2}{N_O/N_P}$. Similarly, the distributions of $\Delta\rho_{js}^{+}$ and $\Delta\rho_{js}^{-}$ converge to zero-mean normal distributions with variances $\frac{\rho_{js}^{+} - (\rho_{js}^{+})^2}{N_O/N_P}$ and $\frac{\rho_{js}^{-} - (\rho_{js}^{-})^2}{N_O/N_P}$, respectively.

In the asymptotic sense, the identification error is small and we can ensure that $\hat{\rho}_{ss}$ is close to the largest diagonal element of $\rho$ (though it might not be $\rho_{ss}$). Therefore, we have $1 \ge \rho_{ss} > 0$ and $\rho_{ss} > \frac{\rho_{jj}}{2}$ for every $j \ne s$. Hence, we know

\[ \frac{E(\Delta\rho_{ss}^2)}{\rho_{ss}} = \frac{\rho_{ss} - \rho_{ss}^2}{\rho_{ss} N_O/N_P} \le \frac{N_P}{N_O}. \tag{5.47} \]


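For $j \ne s$,

\[ \frac{E(\Delta\rho_{jj}^2)}{\rho_{ss}} = \frac{\rho_{jj} - \rho_{jj}^2}{\rho_{ss} N_O/N_P} < \frac{2(1 - \rho_{jj})}{N_O/N_P} \le \frac{2N_P}{N_O}. \tag{5.48} \]

Decompose $\rho = L^{\dagger}L$. Denote $|L_s\rangle = L|s\rangle$ and $\langle L_j| = (L|j\rangle)^{\dagger}$. From the Cauchy-Schwarz inequality, we have

\[ |\rho_{js}|^2 = |\langle L_j|L_s\rangle|^2 \le \langle L_j|L_j\rangle\langle L_s|L_s\rangle = \rho_{jj}\rho_{ss}. \]

Thus the following relationship holds:

\[
\begin{aligned}
\rho_{js}^{+} &= \tfrac{1}{2}\mathrm{Tr}[\rho(|s\rangle + |j\rangle)(\langle s| + \langle j|)] \\
&= \tfrac{1}{2}[\rho_{ss} + \rho_{jj} + 2\mathrm{Re}(\rho_{js})] \\
&\le \tfrac{1}{2}(\rho_{ss} + \rho_{jj} + 2\sqrt{\rho_{jj}\rho_{ss}}) \\
&\le \tfrac{3 + 2\sqrt{2}}{2}\rho_{ss}.
\end{aligned}
\]

Therefore,

\[ \frac{E[(\Delta\rho_{js}^{+})^2]}{\rho_{ss}} = \frac{\rho_{js}^{+} - (\rho_{js}^{+})^2}{\rho_{ss} N_O/N_P} \le \frac{3 + 2\sqrt{2}}{2}\,\frac{(1 - \rho_{js}^{+})}{N_O/N_P} \le \frac{3N_P}{N_O}. \]

Similarly,

\[ \frac{E[(\Delta\rho_{js}^{-})^2]}{\rho_{ss}} \le \frac{3N_P}{N_O}. \]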

Using (4.7), we have

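\[
\begin{aligned}
E(|\Delta\rho_{js}|^2)
&= E[(\Delta\rho_{js}^{+} - \tfrac{1}{2}\Delta\rho_{jj} - \tfrac{1}{2}\Delta\rho_{ss})^2] + E[(\Delta\rho_{js}^{-} - \tfrac{1}{2}\Delta\rho_{jj} - \tfrac{1}{2}\Delta\rho_{ss})^2] \\
&= E[(\Delta\rho_{js}^{+})^2] + E[(\Delta\rho_{js}^{-})^2] + \tfrac{1}{2}E(\Delta\rho_{jj}^2) + \tfrac{1}{2}E(\Delta\rho_{ss}^2) - E(\Delta\rho_{js}^{+}\Delta\rho_{jj}) \\
&\quad - E(\Delta\rho_{js}^{+}\Delta\rho_{ss}) - E(\Delta\rho_{js}^{-}\Delta\rho_{jj}) - E(\Delta\rho_{js}^{-}\Delta\rho_{ss}) + E(\Delta\rho_{jj}\Delta\rho_{ss}) \\
&\le E[(\Delta\rho_{js}^{+})^2] + E[(\Delta\rho_{js}^{-})^2] + \tfrac{1}{2}E(\Delta\rho_{jj}^2) + \tfrac{1}{2}E(\Delta\rho_{ss}^2) + \sqrt{E[(\Delta\rho_{js}^{+})^2]E[(\Delta\rho_{jj})^2]} \\
&\quad + \sqrt{E[(\Delta\rho_{js}^{+})^2]E[(\Delta\rho_{ss})^2]} + \sqrt{E[(\Delta\rho_{js}^{-})^2]E[(\Delta\rho_{jj})^2]} \\
&\quad + \sqrt{E[(\Delta\rho_{js}^{-})^2]E[(\Delta\rho_{ss})^2]} + \sqrt{E[(\Delta\rho_{jj})^2]E[(\Delta\rho_{ss})^2]} \\
&= E[(\Delta\rho_{js}^{+})^2] + E[(\Delta\rho_{js}^{-})^2] + \tfrac{1}{2}E(\Delta\rho_{jj}^2) + \tfrac{1}{2}E(\Delta\rho_{ss}^2) \\
&\quad + \left\{\sqrt{E[(\Delta\rho_{js}^{+})^2]} + \sqrt{E[(\Delta\rho_{js}^{-})^2]}\right\}\left\{\sqrt{E[(\Delta\rho_{jj})^2]} + \sqrt{E[(\Delta\rho_{ss})^2]}\right\} + \sqrt{E[(\Delta\rho_{jj})^2]E[(\Delta\rho_{ss})^2]} \\
&\le \left[\frac{3N_P}{N_O} + \frac{3N_P}{N_O} + \frac{N_P}{N_O} + \frac{N_P}{2N_O} + \left(\sqrt{\frac{3N_P}{N_O}} + \sqrt{\frac{3N_P}{N_O}}\right)\left(\sqrt{\frac{2N_P}{N_O}} + \sqrt{\frac{N_P}{N_O}}\right) + \frac{\sqrt{2}N_P}{N_O}\right]\rho_{ss} \\
&\le \frac{18N_P}{N_O}\rho_{ss}.
\end{aligned}
\]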
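As a quick arithmetic check of the last step, the coefficient of $\frac{N_P}{N_O}\rho_{ss}$ in the bracket can be evaluated directly. The variable names below are introduced only for this illustration.

```python
import math

# Coefficients of (N_P / N_O) * rho_ss collected from the bracket above.
square_terms = 3 + 3 + 1 + 0.5
cross_terms = (math.sqrt(3) + math.sqrt(3)) * (math.sqrt(2) + 1) + math.sqrt(2)
total = square_terms + cross_terms

print(total)  # roughly 17.28, which is indeed below the stated constant 18
```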

Then, using (5.47), the following relationship holds:

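\[
\begin{aligned}
E(\|\Delta\rho_{\sigma s}\|^2)/\rho_{ss} &= E(\Delta\rho_{ss}^2)/\rho_{ss} + \sum_{j=1, j \ne s}^{d} E(|\Delta\rho_{js}|^2)/\rho_{ss} \\
&\le \frac{N_P}{N_O} + (d-1)\frac{18N_P}{N_O} = (18d - 17)\frac{N_P}{N_O}.
\end{aligned} \tag{5.49}
\]

Now we use Lemma 5.2 to obtain

\[
\begin{aligned}
\|\Delta\rho_{out}\| &= \|\hat{\rho}_{out} - \rho_{out}\| = \left\|\frac{\hat{\rho}_{\sigma s}\hat{\rho}_{\sigma s}^{\dagger}}{\|\hat{\rho}_{\sigma s}\|^2} - \frac{\rho_{\sigma s}\rho_{\sigma s}^{\dagger}}{\|\rho_{\sigma s}\|^2}\right\| \\
&\le \left(\left\|\frac{\hat{\rho}_{\sigma s}}{\|\hat{\rho}_{\sigma s}\|}\right\| + \left\|\frac{\rho_{\sigma s}}{\|\rho_{\sigma s}\|}\right\|\right)\left\|\frac{\hat{\rho}_{\sigma s}}{\|\hat{\rho}_{\sigma s}\|} - \frac{\rho_{\sigma s}}{\|\rho_{\sigma s}\|}\right\| = 2\left\|\frac{\hat{\rho}_{\sigma s}}{\|\hat{\rho}_{\sigma s}\|} - \frac{\rho_{\sigma s}}{\|\rho_{\sigma s}\|}\right\| \\
&\le 2\left\|\frac{\hat{\rho}_{\sigma s}}{\|\hat{\rho}_{\sigma s}\|} - \frac{\hat{\rho}_{\sigma s}}{\|\rho_{\sigma s}\|}\right\| + 2\left\|\frac{\hat{\rho}_{\sigma s}}{\|\rho_{\sigma s}\|} - \frac{\rho_{\sigma s}}{\|\rho_{\sigma s}\|}\right\| = 2\frac{|\,\|\hat{\rho}_{\sigma s}\| - \|\rho_{\sigma s}\|\,|}{\|\rho_{\sigma s}\|} + 2\frac{\|\hat{\rho}_{\sigma s} - \rho_{\sigma s}\|}{\|\rho_{\sigma s}\|} \\
&\le 4\frac{\|\hat{\rho}_{\sigma s} - \rho_{\sigma s}\|}{\|\rho_{\sigma s}\|} = 4\frac{\|\hat{\rho}_{\sigma s} - \rho_{\sigma s}\|}{|\langle\psi|s\rangle|} = 4\frac{\|\Delta\rho_{\sigma s}\|}{\sqrt{\rho_{ss}}}.
\end{aligned} \tag{5.50}
\]

Using (5.33), we establish an upper bound (in the asymptotic sense) for the mean squared error (MSE) of the fast pure-state tomography protocol:

\[ E(\|\Delta\rho_{out}\|^2) \le 16E(\|\Delta\rho_{\sigma s}\|^2)/\rho_{ss} \le 16(18d - 17)\frac{N_P}{N_O} \sim O\left(\frac{d^2}{N_O}\right). \tag{5.51} \]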

From (5.18) and (5.19), for each basis matrix ρm we have

||∆E(ρm)|| ∼ O(||∆ρout||). (5.52)

b) Error in G

We denote $w = G_{1z}^{*}$, which is a number dependent only on the true gate. Since the global phase in $G$ can be eliminated via prior knowledge, we assume that $w$ is real and positive. From (5.44) and (5.46) we know $S = G_{1z}^{*}G = wG$. To estimate $\|\Delta G\|$, we use

\[
\begin{aligned}
\|\Delta G\| &\le \left\|\hat{G} - \frac{\hat{S}}{w}\right\| + \left\|\frac{\hat{S}}{w} - \frac{S}{w}\right\| + \left\|\frac{S}{w} - G\right\| \\
&= \left\|\hat{G} - \frac{\hat{S}}{w}\right\| + \frac{\|\Delta S\|}{|w|},
\end{aligned} \tag{5.53}
\]

where $\|\hat{G} - \frac{\hat{S}}{w}\|$, $|w|$ and $\|\Delta S\|$ will be separately estimated below.

c) Error in S

From the analysis in (5.20) and (5.21), and using (5.52), we know

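\[ \|\Delta\Xi\|^2 = \sum_{m=1}^{d}\|\Delta\mathcal{E}(\rho_m)\|^2 \sim O(d\|\Delta\rho_{out}\|^2) \tag{5.54} \]

and

\[ \|\Delta D\| = \|\Delta\Xi\|. \tag{5.55} \]

We thus have

\[ \|\Delta S\| = \|\Delta\mathrm{vec}(S)^{\dagger}\| = \|\Delta D_{(z,1)\sigma}\| \le \|\Delta D\| = \|\Delta\Xi\|. \tag{5.56} \]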

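d) Estimation of $\|\hat{G} - \frac{\hat{S}}{w}\|$

For the true value $S$, we have

\[ S^{\dagger}S = |w|^2 I, \qquad \|S\| = |w|\sqrt{d}. \]

Perform the spectral decomposition $\hat{S}^{\dagger}\hat{S}/|w|^2 = \hat{V}_S\hat{F}_S\hat{V}_S^{\dagger}$, where

\[ \hat{F}_S = \mathrm{diag}(1 + r_1, 1 + r_2, ..., 1 + r_d) \]

and $r_j \in \mathbb{R}$. Then

\[ \|\hat{S}^{\dagger}\hat{S} - S^{\dagger}S\|^2/|w|^4 = \|\hat{V}_S\hat{F}_S\hat{V}_S^{\dagger} - I\|^2 = \|\hat{F}_S - I\|^2 = \sum_j r_j^2. \]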


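Thus we know

\[
\begin{aligned}
\left\|\hat{G} - \frac{\hat{S}}{w}\right\|^2 &= \mathrm{Tr}\left[\left(\hat{G}^{\dagger} - \frac{\hat{S}^{\dagger}}{w^{*}}\right)\left(\hat{G} - \frac{\hat{S}}{w}\right)\right] \\
&= \mathrm{Tr}(\hat{G}^{\dagger}\hat{G}) - \mathrm{Tr}\left(\frac{\hat{S}^{\dagger}}{w^{*}}\hat{G} + \hat{G}^{\dagger}\frac{\hat{S}}{w}\right) + \mathrm{Tr}\left(\frac{\hat{S}^{\dagger}\hat{S}}{|w|^2}\right) \\
&= d - 2\mathrm{Tr}\sqrt{\hat{S}^{\dagger}\hat{S}}/|w| + \mathrm{Tr}(\hat{S}^{\dagger}\hat{S})/|w|^2 \\
&= d - 2\sum_j\sqrt{1 + r_j} + \sum_j(1 + r_j) \\
&= \sum_j(\sqrt{1 + r_j} - 1)^2 = \sum_j\frac{r_j^2}{2 + r_j + 2\sqrt{1 + r_j}} \\
&= \sum_j r_j^2\left[\tfrac{1}{4} - \tfrac{1}{8}r_j + o(r_j)\right] \\
&= \tfrac{1}{4}\|\hat{S}^{\dagger}\hat{S} - S^{\dagger}S\|^2/|w|^4 + o(\|\hat{S}^{\dagger}\hat{S} - S^{\dagger}S\|^2),
\end{aligned} \tag{5.57}
\]

where the last line comes from the fact that $w$ is a constant. We further have

\[
\begin{aligned}
&\|\hat{S}^{\dagger}\hat{S} - S^{\dagger}S\|/|w|^2 \\
&\le \|\hat{S}^{\dagger}\hat{S} - \hat{S}^{\dagger}S\|/|w|^2 + \|\hat{S}^{\dagger}S - S^{\dagger}S\|/|w|^2 \\
&\le \|\hat{S}^{\dagger}\|/|w| \cdot \|\hat{S} - S\|/|w| + \|S\|/|w| \cdot \|\hat{S}^{\dagger} - S^{\dagger}\|/|w| \\
&= (\|\hat{S}\|/|w| + \sqrt{d})\|\hat{S} - S\|/|w| \\
&\le (\|\hat{S} - S\|/|w| + \|S\|/|w| + \sqrt{d})\|\hat{S} - S\|/|w| \\
&= (\|\hat{S} - S\|/|w| + 2\sqrt{d})\|\hat{S} - S\|/|w| \\
&\le (\|\Delta\Xi\|/|w| + 2\sqrt{d})\|\Delta\Xi\|/|w|,
\end{aligned} \tag{5.58}
\]

where the third line follows from (A.7).

Combining (5.57) and (5.58), we obtain

\[
\begin{aligned}
\left\|\hat{G} - \frac{\hat{S}}{w}\right\| &= \sqrt{\tfrac{1}{4}\|\hat{S}^{\dagger}\hat{S} - S^{\dagger}S\|^2/|w|^4 + o(\|\hat{S}^{\dagger}\hat{S} - S^{\dagger}S\|^2)} \\
&\le \sqrt{\left(\frac{\|\Delta\Xi\|^2}{|w|^2} + 4d + 4\sqrt{d}\frac{\|\Delta\Xi\|}{|w|}\right)\frac{\|\Delta\Xi\|^2}{4|w|^2} + o(\|\Delta\Xi\|^2)} \\
&= \sqrt{\frac{d\|\Delta\Xi\|^2}{|w|^2} + o(\|\Delta\Xi\|^2)} \\
&= \sqrt{d}\frac{\|\Delta\Xi\|}{|w|} + o(\|\Delta\Xi\|).
\end{aligned} \tag{5.59}
\]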

e) Estimation of |w|

In the asymptotic sense, the identification error will be small. Hence, $|w|$ will be close to $\max_{k \in \{1,2,...,d\}}|G_{1k}|$. Since $G$ is unitary, each row is a unit vector and

\[ \max_{k \in \{1,2,...,d\}}|G_{1k}| \ge \frac{1}{\sqrt{d}}. \]

Therefore, we have

\[ |w| \ge \frac{1}{2\sqrt{d}}. \tag{5.60} \]

f) Total error

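Now we substitute (5.56) and (5.59) into (5.53), and employ (5.54) and (5.60) to obtain

\[
\begin{aligned}
\|\Delta G\| &\le \left\|\hat{G} - \frac{\hat{S}}{w}\right\| + \frac{\|\Delta S\|}{|w|} \\
&\le \sqrt{d}\frac{\|\Delta\Xi\|}{|w|} + \frac{\|\Delta\Xi\|}{|w|} + o(\|\Delta\Xi\|) \\
&\sim O(d\|\Delta\Xi\|) \sim O(d^{1.5}\|\Delta\rho_{out}\|).
\end{aligned} \tag{5.61}
\]

Using (5.51) and taking expectation, we have

\[ E\|\Delta G\| \sim O(d^{1.5}E\|\Delta\rho_{out}\|) \sim O\left(\frac{d^{2.5}}{\sqrt{N_O}}\right). \]

Finally from Proposition 5.2 we have

\[ E\|\Delta U\| \sim O\left(\frac{d^{2.5}}{\sqrt{N_O}}\right). \tag{5.62} \]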

5.3.6 Numerical results

5.3.6.1 Error vs resource number

We present numerical simulations to validate Theorem 5.6 and to showcase the performance of the PGI algorithm. We consider a single-qubit Hadamard gate,

\[ H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{5.63} \]

Figure 5.6: MSE versus resource number for each output state. (Axes: $\log_{10}$ MSE versus $\log_{10} N_O$; legend: Pure Probe States, Mixed Probe States.)

We assume $\hat{H}_{11}$ is real, which can be guaranteed by choosing a suitable global phase. Using the PGI method, the corresponding simulation result is shown in Fig. 5.6, where the vertical axis is the logarithm of the MSE (i.e., $\log_{10} E\|\hat{H} - H\|^2$) and the horizontal axis is the logarithm of the resource number for each output state (i.e., $\log_{10} N_O$). The blue dots are simulation results with pure probe states and the blue line is the fitted line, with slope $-0.9870 \pm 0.0199$. Each point is the average of 50 repetitions. The slope is approximately $-1$, which closely matches the conclusion in Theorem 5.6 w.r.t. $N_O$.
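The slope comparison can be reproduced in outline with a least-squares line fit in log-log coordinates. The sketch below uses synthetic data that follow $\mathrm{MSE} = c/N_O$ exactly (the constant 3.0 is arbitrary), since the simulation data behind Fig. 5.6 are not part of the text; `fit_loglog_slope` is a hypothetical helper.

```python
import numpy as np


def fit_loglog_slope(n_copies, mse):
    """Least-squares fit of log10(MSE) = slope * log10(N_O) + intercept."""
    slope, intercept = np.polyfit(np.log10(n_copies), np.log10(mse), 1)
    return slope, intercept


# Synthetic data: E||dU|| ~ 1/sqrt(N_O) implies MSE ~ 1/N_O, i.e. slope -1.
n_copies = np.logspace(1, 5.5, 10)
mse = 3.0 / n_copies
slope, intercept = fit_loglog_slope(n_copies, mse)
print(slope, intercept)
```

With noisy simulation or experimental data the fitted slope deviates from $-1$ by an amount comparable to the quoted uncertainties.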

We further consider the case when the probe states are not completely pure. We mix each pure probe state with the maximally mixed state as $\rho_{in}' = \alpha\rho_{in} + (1 - \alpha)\frac{I}{d}$, where $\alpha = \sqrt{0.99}$ and the purity of the mixed states is $\mathrm{Tr}(\rho_{in}'^2) = 0.9950$. The simulation results are shown as red diamonds in Fig. 5.6, and the red fitted line has slope $-0.9032 \pm 0.0140$. This slope is not far from the theoretical value, which demonstrates that our identification algorithm is applicable even when the probe states are not completely pure, as arises in the laboratory with current optics techniques. From Fig. 5.6, it is also clear that if the probe states are not completely pure, it is possible to use more copies of the mixed states than of pure states to achieve a similar level of identification accuracy.

Figure 5.7: Running time and MSE versus qubit number for MLE and PGI methods. (Horizontal axis: qubit number, 1 to 7; legend: Time of MLE, Time of PGI, Error of MLE, Error of PGI.)
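The quoted purity follows from $\mathrm{Tr}(\rho_{in}'^2) = \alpha^2 + (1 - \alpha^2)/d$ for a pure $\rho_{in}$, which gives $0.99 + 0.01/2 = 0.995$ for a qubit. A minimal numerical check is sketched below; `depolarize` and `purity` are hypothetical helper names, and the choice $\rho_{in} = |0\rangle\langle 0|$ is arbitrary.

```python
import numpy as np


def depolarize(rho, alpha):
    """Mix rho with the maximally mixed state: alpha * rho + (1 - alpha) * I / d."""
    d = rho.shape[0]
    return alpha * rho + (1 - alpha) * np.eye(d) / d


def purity(rho):
    """Tr(rho^2) for a density matrix rho."""
    return float(np.real(np.trace(rho @ rho)))


rho_in = np.array([[1, 0], [0, 0]], dtype=complex)   # pure single-qubit probe state
rho_mixed = depolarize(rho_in, np.sqrt(0.99))
print(purity(rho_mixed))
```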

5.3.6.2 Comparison with MLE

We present numerical results to compare the running time and identification error of our algorithm with the maximum likelihood estimation (MLE) method. MLE is the most widely used quantum tomography method. Let $N_q$ denote the number of qubits, and assume the gate to be identified is the $N_q$-fold tensor product of the single-qubit Hadamard gate:

\[ H = \left[\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\right]^{\otimes N_q}. \]

Let $N_O = 10^3 \times d^2$. We perform POVM measurements, reconstruct the output states using the proposed fast QST protocol and identify $H$ using the PGI algorithm. Then we use the same POVM measurement bases and the corresponding measurement results for the MLE identification algorithm. The corresponding simulation result is illustrated in Fig. 5.7 and each point is the average of 10 repetitions. The MLE algorithm in [79] is employed, which is the one we introduced in Sec. 2.2.2. The standard MLE algorithm only reconstructs the unknown process matrix $X$ rather than the unitary gate $H$. Hence, the error we compare in Fig. 5.7 is $\log_{10} E\|\hat{X} - X\|^2$. We compare the running time for similar estimation errors. Specifically, the estimation errors in the MLE method are in [95%, 105%] of the corresponding estimation errors in the PGI algorithm. The running time ($T_R$) only includes the online computational time. From Fig. 5.7, our identification algorithm is much faster than MLE. For example, our algorithm takes less time for a seven-qubit ($d = 2^7$ dimensional) system than MLE for a four-qubit ($d = 2^4$ dimensional) system.
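For reference, the test gate and the copy number used in this comparison can be generated as follows. This is only a sketch of the setup, not of the PGI or MLE algorithms; `hadamard_n` and `copies_per_output` are hypothetical helper names.

```python
from functools import reduce

import numpy as np

H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # single-qubit Hadamard gate


def hadamard_n(n_qubits):
    """N_q-fold tensor product of the single-qubit Hadamard gate (dimension 2**N_q)."""
    return reduce(np.kron, [H1] * n_qubits)


def copies_per_output(n_qubits):
    """N_O = 10^3 * d^2 with d = 2**N_q, as chosen in the text."""
    d = 2 ** n_qubits
    return 10 ** 3 * d ** 2


H3 = hadamard_n(3)
print(H3.shape, copies_per_output(3))
```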

For the relationship between MSE and d, the same analysis as in Sec. 5.2.5.1 applies. Namely, it remains open whether the error bound in Theorem 5.6 is tight w.r.t. d, and what gates can achieve the bound.

5.3.7 Experimental results

In this section, we present experimental results on the identification of a one-qubit Hadamard gate. The quantum optical experiment was performed by our collaborators Qi Yin, Zhibo Hou and Guo-Yong Xiang at the University of Science and Technology of China.

Figure 5.8: The schematic of the experimental setup, adopted from [153]. (Stages: Target State Preparation, Gate, Projective Measurement. Components: Diode Laser 404nm, BBO, Glan Prism, Half Wave Plate, Quarter Wave Plate, Beam Displacer, Single Photon Detector, Step-Motors 1-4, Coincidence Unit.)

The experimental setup is illustrated in Fig. 5.8. From left to right, photon pairs are first created using type-I spontaneous parametric down-conversion in a nonlinear crystal. Then the photon in the lower path is sent immediately to a single photon detector to act as a trigger. The other photon in the upper path, as the probe state, is sent through a Glan prism (extinction ratio more than 2000:1 between horizontal and vertical polarization in the transmission direction) and a half-quarter wave plate combination to prepare it in any desired state of very pure polarization. The Hadamard gate is realized by a half wave plate with its axis placed at 22.5° relative to the lab horizon. Another quarter-half wave plate combination followed by a beam displacer with high extinction ratio (more than 10000:1) is used to project the photon onto any measurement basis on the Bloch sphere. The rotations of the wave plates in the state preparation part and in the projective measurement part are separately driven by four step-motors, which are connected to a computer running a LabVIEW program that automates the quantum gate identification. Since our method needs pure input states and we assume the output states are also pure, the Glan prism and the beam displacer with high extinction ratio are adopted in our experiment to reduce the system error as much as possible, which is measured to be about 2000:1 for both horizontal and vertical polarization for the whole setup. To alleviate the drift of the collection efficiency of the two photon detectors behind the beam displacer, multimode fibers fully covered by black plastic bags, instead of single-mode fibers, are used to collect the coupled photons. Because multimode fibers are introduced, we set the coincidence window to 1 ns to minimize the random coincidence counts so that their error is negligible.

Figure 5.9: MSE versus resource number for the experimental single-qubit gate. (Axes: $\log_{10}$ MSE versus $\log_{10} N_O$; legend: Simulation Result, Experimental Result.)

We generate a quantum gate close to the Hadamard gate, and use our identification algorithm to calibrate it with $N_t = 4 \times 10^6$ total resources. The identification result is taken as the real value of the gate to be identified later. Then we experimentally identify it with different numbers of resources far less than $4 \times 10^6$ using the PGI method. The result is shown as red diamonds in Fig. 5.9, where the vertical axis is the logarithm of the MSE (i.e., $\log_{10} E\|\hat{H} - H\|^2$) and the horizontal axis is the logarithm of the resource number for each output state (i.e., $\log_{10} N_O$). Every point is averaged over 50 experimental runs. In comparison, we also use blue dots to represent the simulation result of a single-qubit Hadamard gate using pure probe states. The averaged purity of the experimental output states is 0.9993. The fitted line of the experimental result has slope $-0.9772 \pm 0.0296$, which matches the theoretical result very well.

5.4 Summary and open problems

In this chapter, we have presented a new QHI method (TSO algorithm) and a new QGI method (PGI algorithm).

The TSO identification method is applicable to general time-independent Hamiltonians for closed quantum systems. The method is presented within the QPT framework. We have analyzed its computational complexity and also provided a theoretical upper bound for the identification error. We demonstrated the performance of the identification algorithm using numerical examples.

We further improved the TSO method to the PGI method for quantum gates, reducing the computational complexity from $O(d^6)$ to $O(d^3)$. We employed a series of pure input states, and designed a fast pure-state tomography protocol for the output states. We also established an error upper bound, which can be useful for designing gate-related experiments or simulation tasks. We performed simulations to compare our algorithm with MLE, which demonstrate the efficiency of our method. We performed a quantum optical experiment on a one-qubit Hadamard gate to illustrate the effectiveness of the PGI method.

Some questions remain open and deserve further investigation:


(i) Are the error bounds in Theorem 5.4 and Theorem 5.6 tight w.r.t. the system dimension d? If not, what are the tight bounds, and which Hamiltonians and gates can achieve the bounds?

(ii) How to extend the TSO algorithm to quantum process tomography for open quantum systems?

(iii) Can quantum entanglement enhance the performance of the TSO or PGI method? For example, is it possible to increase the efficiency or decrease the error by using entangled probe states?

Chapter 6

Quantum detector tomography via two-stage estimation

The work, reported in this chapter, has been partially included in the following paper:

1. Y. Wang, S. Yokoyama, D. Dong, I. R. Petersen, E. H. Huntington, and H. Yonezawa, Two-stage estimation for quantum detector tomography: Error analysis, numerical and experimental results, submitted to IEEE Transactions on Information Theory, 2019.

6.1 Introduction

Measurement, on a quantum entity or using a quantum object, is the connection between the classical world and the quantum domain, and plays a fundamental role in investigating and controlling a quantum system [12, 156]. For example, quantum computation can be performed through a series of appropriate measurements in certain schemes [115]. In quantum communication, measurement is a vital part of quantum key distribution [9]. In quantum metrology, adaptive measurement can achieve the Heisenberg limit in phase estimation [67].

Since quantum measurement can also be viewed as a class of quantum resources, its investigation and characterization are fundamentally important. Quantum detector tomography (QDT) is a technique to characterize quantum measurement devices [36, 95], and thus paves the way for other estimation tasks like quantum state tomography [110, 113, 114, 171], Hamiltonian identification [27, 150, 165] and quantum process tomography [51, 80, 120].

The investigation of protocols for quantum detector tomography dates back to [50], where the Maximum Likelihood Estimation (MLE) method is employed to reconstruct an unknown POVM detector. As one of the most widely recognized methods [37, 110], MLE can preserve the positivity and completeness of the detector, but it is difficult to characterize its error and computational complexity. Phase-insensitive detectors correspond to diagonal matrices in the photon number basis and are thus relatively straightforward to reconstruct. Ref. [61] modelled this problem as a linear-regression problem and obtained a least squares solution. In [47, 97], phase-insensitive detector tomography was modelled as a convex quadratic optimization problem and an efficient numerical solution was obtained. This method was also experimentally tested in [21, 102], and then was developed in [167] and [168] to model phase-sensitive detector tomography as a recursive constrained convex optimization problem, where the unknown parameters are recursively estimated. For phase-insensitive detectors with a large linear loss, an extension of detector tomography is introduced in [116] and tested on a superconducting multiphoton nanodetector.

In this chapter, we propose a novel quantum detector tomography protocol, which is applicable to both phase-insensitive and general phase-sensitive detectors. We first input a series of different states (probe states) to the detector and collect all the measurement data. The subsequent algorithm mainly consists of two stages. In the first stage, we find a constrained least squares estimate, which corresponds to a Hermitian estimate satisfying the completeness constraint. However, this estimate can be non-physical; i.e., the estimated detectors may have negative eigenvalues. Hence, in the second stage we further design a series of matrix transformations preserving the Hermitian and completeness constraints to find a physical approximation based on the result of the first stage, and thus obtain the final physical estimate. Our Two-stage Estimation (TSE) method has computational complexity $O(d^2ML)$, where $M$ and $d$ are the number and dimension of the detector matrices, respectively, and $L$ kinds of probe states are employed. We further prove an error upper bound $O\left(\frac{d^5M^2}{N_t}\right)$ on the condition that the probe states are optimal (if they are not optimal, the specific form of the changed bound is also given in Sec. 6.2.4), where $N_t$ is the total copy number of probe states. Such theoretical analyses of computational complexity and error are not common in other QDT methods.

In practical experiments, coherent states are more easily generated and manipulated. Hence, in Sec. 6.3, we investigate the optimization of $L$ (the kinds of coherent probe states) and the size of their sampling square. We then perform numerical simulation in Sec. 6.4 to validate the theoretical analysis and compare our algorithm with the MLE method.

In Sec. 6.5, we slightly modify our method to cater to a practical experimental situation, and we perform quantum optical experiments using two-mode coherent states to verify the effectiveness of our method. Sec. 6.6 concludes this chapter and presents associated open problems.

6.2 Two-stage Estimation method for quantum detector tomography

6.2.1 Problem formulation

As stated in Sec. 2.1, one of the most common quantum measurement methods is the positive operator valued measure (POVM), and quantum detectors are devices to realize a POVM, especially in optical systems. Suppose there is a set of POVM elements $\{P_i\}$ satisfying the completeness constraint (2.8), $\sum_{i=1}^{M} P_i = I$, and each $P_i$ is Hermitian and positive semidefinite. The measurement apparatus is the physical realization of a quantum detector, and $\{P_i\}$ is the mathematical representation. We thus directly call $\{P_i\}$ a quantum detector in this thesis. In the case when the $P_i$ are infinite dimensional, they are usually truncated at a finite dimension $d$ in practice.

The technique to deduce an unknown detector from known quantum states and measurement results is called quantum detector tomography. We design a series of different known quantum states $\rho_j$ (called probe states) and record the measurement results $\hat{p}_{ij}$ as the estimate of $p_{ij} = \mathrm{Tr}(P_i\rho_j)$. Assume that for the probe states, $L$ different kinds are employed and their total number of copies is $N_t$. Also assume different probe states use the same number of copies, which is $N_t/L$. Note that if the resources are distributed unevenly, our method still applies, but with the error characterization modified. We then aim to solve the following optimization problem:

Problem 6.1. Given experimental data $\{\hat{p}_{ij}\}$, solve $\min_{\{\hat{P}_i\}} \sum_{i=1}^{M}\sum_{j=1}^{L}[\mathrm{Tr}(\hat{P}_i\rho_j) - \hat{p}_{ij}]^2$ such that $\sum_{i=1}^{M}\hat{P}_i = I$ and $\hat{P}_i \ge 0$ for $1 \le i \le M$.

We parameterize the detector and input (probe) states. Let $\{\Omega_i\}_{i=1}^{d^2}$ be a complete set of $d$-dimensional traceless Hermitian matrices except $\Omega_1 = I/\sqrt{d}$, and they satisfy $\mathrm{Tr}(\Omega_i^{\dagger}\Omega_j) = \delta_{ij}$. Then we can parameterize the true values of the detector and probe states as

\[ P_i = \sum_{a=1}^{d^2}\phi_a^{(i)}\Omega_a, \tag{6.1} \]

\[ \rho_j = \sum_{b=1}^{d^2}\theta_b^{(j)}\Omega_b, \tag{6.2} \]

where $\phi_a^{(i)} \triangleq \mathrm{Tr}(P_i\Omega_a)$ and $\theta_b^{(j)} \triangleq \mathrm{Tr}(\rho_j\Omega_b)$ are real. When $\rho_j$ is input, the probability of obtaining the result corresponding to $P_i$ is calculated according to Born's rule as

\[ p_{ij} = \mathrm{Tr}(P_i\rho_j). \tag{6.3} \]


Substituting (6.1) and (6.2) into (6.3), we obtain

\[ p_{ij} = \sum_{a=1}^{d^2}\theta_a^{(j)}\phi_a^{(i)}. \]

Denote $\Theta_j \triangleq [\theta_1^{(j)}, \theta_2^{(j)}, ..., \theta_{d^2}^{(j)}]^T$ and $\Phi_i \triangleq [\phi_1^{(i)}, \phi_2^{(i)}, ..., \phi_{d^2}^{(i)}]^T$. Suppose that, when estimating $\hat{p}_{ij}$, the outcome for $P_i$ appears $n_{ij}$ times; then $\hat{p}_{ij} = n_{ij}/(N_t/L)$. According to the central limit theorem [32], the error $\Delta p_{ij} = \hat{p}_{ij} - p_{ij}$ converges in distribution to a normal distribution with mean zero and variance $(p_{ij} - p_{ij}^2)/(N_t/L)$. We thus have the linear regression equation

\[ \hat{p}_{ij} = \Theta_j^T\Phi_i + \Delta p_{ij}. \]

Let $\Phi = (\Phi_1^T, \Phi_2^T, ..., \Phi_M^T)^T$, which is the vector of all the unknown parameters to be estimated. Collect the parametrization of the probe states as $X_0 = (\Theta_1, \Theta_2, ..., \Theta_L)^T$. Let $\hat{Y} = (\hat{p}_{11}, \hat{p}_{12}, ..., \hat{p}_{1L}, \hat{p}_{21}, \hat{p}_{22}, ..., \hat{p}_{2L}, ..., \hat{p}_{ML})^T$, $X = I_M \otimes X_0$, $e = (\Delta p_{11}, \Delta p_{12}, ..., \Delta p_{1L}, \Delta p_{21}, \Delta p_{22}, ..., \Delta p_{2L}, ..., \Delta p_{ML})^T$, $H = (1, ..., 1)_{1 \times M} \otimes I_{d^2}$, and $D_{d^2 \times 1} = (\sqrt{d}, 0, ..., 0)^T$. Then the regression equations can be rewritten in a compact form:

\[ \hat{Y} = X\Phi + e, \tag{6.4} \]

with a linear constraint

\[ H\Phi = D. \tag{6.5} \]
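To make the regression form concrete, the following sketch builds $X_0$, $X$, $H$, $D$ and $\Phi$ for a single qubit and checks that $X\Phi$ reproduces the Born-rule probabilities and that $H\Phi = D$. The detector `P1`, the four probe states and the basis choice $\Omega = \{I, \sigma_x, \sigma_y, \sigma_z\}/\sqrt{2}$ are assumptions made only for this illustration.

```python
import numpy as np

d, M = 2, 2
I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
omega = [A / np.sqrt(d) for A in (I2, sx, sy, sz)]   # orthonormal Hermitian basis


def coords(A):
    """Real coordinates Tr(A Omega_a) of a Hermitian matrix A, as in (6.1) and (6.2)."""
    return np.array([np.trace(A @ W).real for W in omega])


# Illustrative two-outcome detector (both elements positive semidefinite).
P1 = np.array([[0.8, 0.1], [0.1, 0.3]], dtype=complex)
detector = [P1, I2 - P1]

# Illustrative probe states |0>, |1>, |+>, |+i>.
kets = [np.array([1, 0]), np.array([0, 1]),
        np.array([1, 1]) / np.sqrt(2), np.array([1, 1j]) / np.sqrt(2)]
probes = [np.outer(k, k.conj()) for k in kets]

X0 = np.array([coords(rho) for rho in probes])       # L x d^2, rows are Theta_j^T
X = np.kron(np.eye(M), X0)                           # I_M (x) X0
Phi = np.concatenate([coords(P) for P in detector])  # stacked Phi_i
H = np.kron(np.ones((1, M)), np.eye(d * d))
D = np.zeros(d * d)
D[0] = np.sqrt(d)

Y = X @ Phi                                          # noiseless version of (6.4)
Y_born = np.array([np.trace(P @ rho).real for P in detector for rho in probes])
print(np.allclose(Y, Y_born), np.allclose(H @ Phi, D))
```

In an experiment, `Y` would be replaced by the measured frequencies $\hat{p}_{ij}$, which differ from the Born-rule values by the sampling error $e$.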

Now Problem 6.1 can be transformed into the following equivalent form:

Problem 6.2. Given experimental data $\hat{Y}$, solve $\min_{\{\hat{P}_i\}} \|\hat{Y} - X\hat{\Phi}\|^2$ such that $H\hat{\Phi} = D$ and $\hat{P}_i \ge 0$ for $1 \le i \le M$, where $\hat{\Phi}$ is the parametrization of $\{\hat{P}_i\}$ via (6.1).


6.2.2 Estimation algorithm

6.2.2.1 Stage-1 approximation–constrained LRE

Problem 6.2 is difficult to solve directly. Hence, we split it into two approximate subproblems:

Problem 6.2.1. Given experimental data $\hat{Y}$, solve $\min_{\{\hat{F}_i\}} \|\hat{Y} - X\hat{\Phi}\|^2$ such that $H\hat{\Phi} = D$, where $\hat{\Phi}$ is the parametrization of $\{\hat{F}_i\}$ via (6.1).

Problem 6.2.2. Given $\sum_{i=1}^{M}\hat{F}_i = I$, solve $\min_{\{\hat{P}_i\}} \sum_i \|\hat{F}_i - \hat{P}_i\|^2$ such that $\sum_{i=1}^{M}\hat{P}_i = I$ and $\hat{P}_i \ge 0$ for $1 \le i \le M$.

Problem 6.2.1 is a linear regression problem with a linear constraint, and it can be solved analytically via the constrained least squares (CLS) method [124]. Assume the input states have enough diversity such that $X^TX$ is nonsingular. This indicates $L \ge d^2$ for general complete probe-state sets. The standard CLS solution is [124]

\[ \hat{\Phi}_{CLS} = \hat{\Phi}_{LS} - (X^TX)^{-1}H^T[H(X^TX)^{-1}H^T]^{-1}(H\hat{\Phi}_{LS} - D), \tag{6.6} \]

where $\hat{\Phi}_{LS}$ is the unconstrained least squares solution

\[ \hat{\Phi}_{LS} = (X^TX)^{-1}X^T\hat{Y}. \tag{6.7} \]

To further reduce the computational burden, we can simplify the form of (6.6) and (6.7). Let $Z_0 = (X_0^TX_0)^{-1}$. Then $(X^TX)^{-1} = I_M \otimes Z_0$, and

\[ [H(X^TX)^{-1}H^T]^{-1} = [H(I_M \otimes Z_0)H^T]^{-1} = (MZ_0)^{-1} = \frac{1}{M}Z_0^{-1}. \]

Eq. (6.7) is in fact

\[ \hat{\Phi}_{LS} = (I_M \otimes Z_0)(I_M \otimes X_0^T)\hat{Y} = (I_M \otimes Z_0X_0^T)\hat{Y}. \]


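Also (6.6) is

\[
\begin{aligned}
\hat{\Phi}_{CLS} &= (I_M \otimes Z_0X_0^T)\hat{Y} - (I_M \otimes Z_0)\begin{pmatrix} I_{d^2} \\ \vdots \\ I_{d^2} \end{pmatrix}\frac{1}{M}Z_0^{-1}\left[\begin{pmatrix} I_{d^2} & \cdots & I_{d^2} \end{pmatrix}(I_M \otimes Z_0X_0^T)\hat{Y} - D\right] \\
&= (I_M \otimes Z_0X_0^T)\hat{Y} - \frac{1}{M}\begin{pmatrix} Z_0 \\ \vdots \\ Z_0 \end{pmatrix}Z_0^{-1}\left[\begin{pmatrix} Z_0X_0^T & \cdots & Z_0X_0^T \end{pmatrix}\hat{Y} - D\right] \\
&= (I_M \otimes Z_0X_0^T)\hat{Y} - \frac{1}{M}\begin{pmatrix} I_{d^2} \\ \vdots \\ I_{d^2} \end{pmatrix}\left[\begin{pmatrix} Z_0X_0^T & \cdots & Z_0X_0^T \end{pmatrix}\hat{Y} - D\right] \\
&= (I_M \otimes Z_0X_0^T)\hat{Y} - \frac{1}{M}\begin{pmatrix} Z_0X_0^T & \cdots & Z_0X_0^T \\ \vdots & & \vdots \\ Z_0X_0^T & \cdots & Z_0X_0^T \end{pmatrix}\hat{Y} + \frac{1}{M}\begin{pmatrix} D \\ \vdots \\ D \end{pmatrix}.
\end{aligned} \tag{6.8}
\]

We then partition $\hat{Y}$ as $\hat{Y} = (\hat{Y}_1^T, \hat{Y}_2^T, ..., \hat{Y}_M^T)^T$, where $\hat{Y}_i = (\hat{p}_{i1}, \hat{p}_{i2}, ..., \hat{p}_{iL})^T$ for $1 \le i \le M$. Denote $Y_0 = ((1, ..., 1)_{1 \times L})^T = \sum_i\hat{Y}_i$. We transform (6.8) as

\[
\begin{aligned}
\hat{\Phi}_{CLS} &= \begin{pmatrix} Z_0X_0^T & & \\ & \ddots & \\ & & Z_0X_0^T \end{pmatrix}\begin{pmatrix} \hat{Y}_1 \\ \vdots \\ \hat{Y}_M \end{pmatrix} - \frac{1}{M}\begin{pmatrix} Z_0X_0^T & \cdots & Z_0X_0^T \\ \vdots & & \vdots \\ Z_0X_0^T & \cdots & Z_0X_0^T \end{pmatrix}\begin{pmatrix} \hat{Y}_1 \\ \vdots \\ \hat{Y}_M \end{pmatrix} + \frac{1}{M}\begin{pmatrix} D \\ \vdots \\ D \end{pmatrix} \\
&= \begin{pmatrix} Z_0X_0^T\hat{Y}_1 \\ \vdots \\ Z_0X_0^T\hat{Y}_M \end{pmatrix} - \frac{1}{M}\begin{pmatrix} Z_0X_0^T\sum_i\hat{Y}_i \\ \vdots \\ Z_0X_0^T\sum_i\hat{Y}_i \end{pmatrix} + \frac{1}{M}\begin{pmatrix} D \\ \vdots \\ D \end{pmatrix} \\
&= \begin{pmatrix} (X_0^TX_0)^{-1}X_0^T(\hat{Y}_1 - \frac{1}{M}Y_0) + \frac{1}{M}D \\ \vdots \\ (X_0^TX_0)^{-1}X_0^T(\hat{Y}_M - \frac{1}{M}Y_0) + \frac{1}{M}D \end{pmatrix}.
\end{aligned} \tag{6.9}
\]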


Compared with (6.6) and (6.7), Eq. (6.9) is a faster way to calculate $\hat{\Phi}_{CLS}$.

Let $\hat{\Phi}_{CLS} = (\hat{\Phi}_1^T, ..., \hat{\Phi}_M^T)^T$ and $\hat{\Phi}_i = (\hat{\phi}_1^{(i)}, ..., \hat{\phi}_{d^2}^{(i)})^T$. From $\hat{\Phi}_{CLS}$, we can obtain a stage-1 estimate $\hat{F}_i = \sum_{a=1}^{d^2}\hat{\phi}_a^{(i)}\Omega_a$. The error $\|\hat{F}_i - F_i\|$ will be referred to as the CLS error in the rest of this thesis. Note that the positive semidefiniteness requirement on $\hat{F}_i$ is not considered at this stage, and $\hat{F}_i$ may have negative eigenvalues. Hence, we need to further adjust $\hat{F}_i$ to obtain a physical estimate.
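The agreement between the standard CLS formula (6.6) and the block-wise form (6.9) can be checked numerically. The sketch below uses a random matrix as a stand-in for $X_0$ and synthetic frequency data whose outcomes sum to one for each probe state (so that $\sum_i\hat{Y}_i = Y_0$); all sizes and the random seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
d, M, L = 2, 3, 6
n = d * d

X0 = rng.normal(size=(L, n))                           # stand-in for the probe parametrization
Yhat_blocks = rng.dirichlet(np.ones(M), size=L).T      # shape (M, L); each column sums to 1
Yhat = Yhat_blocks.reshape(-1)                         # stacked (Y_1^T, ..., Y_M^T)^T

X = np.kron(np.eye(M), X0)
H = np.kron(np.ones((1, M)), np.eye(n))
D = np.zeros(n)
D[0] = np.sqrt(d)

# Standard CLS solution, Eqs. (6.7) and (6.6).
XtX_inv = np.linalg.inv(X.T @ X)
phi_ls = XtX_inv @ X.T @ Yhat
phi_cls = phi_ls - XtX_inv @ H.T @ np.linalg.solve(H @ XtX_inv @ H.T, H @ phi_ls - D)

# Block-wise form, Eq. (6.9): only the d^2 x d^2 matrix X0^T X0 is inverted.
Z0Xt = np.linalg.solve(X0.T @ X0, X0.T)                # (X0^T X0)^{-1} X0^T
Y0 = np.ones(L)
phi_fast = np.concatenate([Z0Xt @ (Yi - Y0 / M) + D / M for Yi in Yhat_blocks])

print(np.allclose(phi_cls, phi_fast), np.allclose(H @ phi_fast, D))
```

The block-wise form avoids building and inverting the $Md^2 \times Md^2$ matrix $X^TX$, which is where the saving comes from.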

6.2.2.2 Difference decomposition

Now we begin to solve Problem 6.2.2. First we decompose each $\hat{F}_i$ as the difference of two positive semidefinite matrices $\hat{\mathcal{F}}_i$ and $\hat{\mathcal{G}}_i$: $\hat{F}_i = \hat{\mathcal{F}}_i - \hat{\mathcal{G}}_i$. There are infinitely many such decompositions, because a new decomposition is obtained once another positive semidefinite matrix is added to both $\hat{\mathcal{F}}_i$ and $\hat{\mathcal{G}}_i$. We hope to view $\hat{\mathcal{G}}_i$ as a small disturbance, and thus seek a decomposition that minimizes the norm of $\hat{\mathcal{G}}_i$.

For each $\hat{F}_i$, we perform a spectral decomposition to obtain $\hat{F}_i = \hat{W}_i\hat{K}_i\hat{W}_i^{\dagger}$, where $\hat{W}_i$ is unitary and $\hat{K}_i$ is real diagonal. We have

\[ \hat{K}_i = \hat{W}_i^{\dagger}\hat{\mathcal{F}}_i\hat{W}_i - \hat{W}_i^{\dagger}\hat{\mathcal{G}}_i\hat{W}_i. \]

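Denote the optimal decomposition solution as $\hat{\mathcal{F}}_i^{o}$ and $\hat{\mathcal{G}}_i^{o}$. We assert that both $\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i^{o}\hat{W}_i$ and $\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i$ must be diagonal. Otherwise, we note that $\mathrm{diag}(\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i^{o}\hat{W}_i) - \mathrm{diag}(\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i)$ still equals $\hat{K}_i$, where $\mathrm{diag}(A)$ outputs the square matrix $A$ with all the nondiagonal elements set to zero. Since $\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i^{o}\hat{W}_i$ is positive semidefinite, all of its diagonal elements are nonnegative. This indicates that $\mathrm{diag}(\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i^{o}\hat{W}_i)$ is also positive semidefinite. Similarly, $\mathrm{diag}(\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i)$ is also positive semidefinite. Hence, $\mathrm{diag}(\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i^{o}\hat{W}_i)$ and $\mathrm{diag}(\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i)$ are also feasible solutions. Since $\|\mathrm{diag}(\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i)\| < \|\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i\|$, this contradicts the assumption that $\hat{\mathcal{G}}_i^{o}$ is the optimal solution. Therefore, $\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i^{o}\hat{W}_i$ and $\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i$ must be diagonal. We then have

\[ \min\|\hat{\mathcal{G}}_i\|^2 = \min\|\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i\|^2 = \min\sum_j(\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i^{o}\hat{W}_i)_{jj}^2, \]

and we can consider its elements: $(\hat{K}_i)_{jj} = (\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i\hat{W}_i)_{jj} - (\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i\hat{W}_i)_{jj}$. If $(\hat{K}_i)_{jj} > 0$, we should take $(\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i\hat{W}_i)_{jj} = (\hat{K}_i)_{jj}$ and $(\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i\hat{W}_i)_{jj} = 0$; if $(\hat{K}_i)_{jj} \le 0$, we should take $(\hat{W}_i^{\dagger}\hat{\mathcal{F}}_i\hat{W}_i)_{jj} = 0$ and $(\hat{W}_i^{\dagger}\hat{\mathcal{G}}_i\hat{W}_i)_{jj} = -(\hat{K}_i)_{jj}$.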

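Hence, the optimal decomposition can be obtained through the following procedure: assume there are $\hat{f}_i$ nonpositive eigenvalues for $\hat{F}_i$, and the eigenvalues are in decreasing order in $\mathrm{diag}(\hat{K}_i)$. Let

\[ \hat{\mathcal{F}}_i = \hat{W}_i\,\mathrm{diag}[(\hat{K}_i)_{11}, (\hat{K}_i)_{22}, ..., (\hat{K}_i)_{(d-\hat{f}_i)(d-\hat{f}_i)}, 0, ..., 0]\,\hat{W}_i^{\dagger} \tag{6.10} \]

and

\[ \hat{\mathcal{G}}_i = -\hat{W}_i\,\mathrm{diag}[0, ..., 0, (\hat{K}_i)_{(d-\hat{f}_i+1)(d-\hat{f}_i+1)}, (\hat{K}_i)_{(d-\hat{f}_i+2)(d-\hat{f}_i+2)}, ..., (\hat{K}_i)_{dd}]\,\hat{W}_i^{\dagger}. \]

Then we know $\hat{\mathcal{F}}_i \ge 0$, $\hat{\mathcal{G}}_i \ge 0$ and $\hat{F}_i = \hat{\mathcal{F}}_i - \hat{\mathcal{G}}_i$, and this $\hat{\mathcal{G}}_i$ has the least norm.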
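The difference decomposition amounts to splitting the spectrum of the stage-1 estimate into its positive and negative parts. A minimal sketch is given below; `split_psd` is a hypothetical helper name, the example matrix is arbitrary, and `numpy.linalg.eigh` returns eigenvalues in ascending rather than decreasing order, which does not affect the result.

```python
import numpy as np


def split_psd(F):
    """Split Hermitian F into (F_plus, G), both positive semidefinite, with F = F_plus - G."""
    evals, W = np.linalg.eigh(F)
    F_plus = (W * np.maximum(evals, 0.0)) @ W.conj().T   # keeps the positive eigenvalues
    G = (W * np.maximum(-evals, 0.0)) @ W.conj().T       # minus the nonpositive eigenvalues
    return F_plus, G


# Hermitian example with one negative eigenvalue (determinant is negative).
F = np.array([[0.6, 0.5], [0.5, 0.1]], dtype=complex)
F_plus, G = split_psd(F)

print(np.allclose(F_plus - G, F))
print(np.linalg.eigvalsh(F_plus).min() >= -1e-12, np.linalg.eigvalsh(G).min() >= -1e-12)
```

Because the two parts live on orthogonal eigenspaces, their product vanishes, which is the structure that the least-norm argument above relies on.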

6.2.2.3 Stage-2 approximation

From $I = \sum_i\hat{F}_i = \sum_i\hat{\mathcal{F}}_i - \sum_i\hat{\mathcal{G}}_i$, we have

\[ I + \sum_i\hat{\mathcal{G}}_i = \sum_i\hat{\mathcal{F}}_i. \tag{6.11} \]

Since each $\hat{\mathcal{G}}_i$ is positive semidefinite, we can decompose

\[ I + \sum_i\hat{\mathcal{G}}_i = \hat{C}\hat{C}^{\dagger}. \tag{6.12} \]

Then Eq. (6.11) is transformed into

\[ \sum_i\hat{C}^{-1}\hat{\mathcal{F}}_i\hat{C}^{-\dagger} = I. \]

We let $\hat{A}_i = \hat{C}^{-1}\hat{\mathcal{F}}_i\hat{C}^{-\dagger}$, and then each $\hat{A}_i$ is positive semidefinite and their sum is the identity. Hence, $\{\hat{A}_i\}$ is a genuine estimate of the detector and we call $\{\hat{A}_i\}$ the stage-2 approximation. A further optimization, described in the following, is needed in order to obtain the final estimation result.
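The stage-2 construction can be sketched as follows. The text only requires some factor $\hat{C}$ with $\hat{C}\hat{C}^{\dagger} = I + \sum_i\hat{\mathcal{G}}_i$; the Cholesky factor used here is one convenient choice and is an assumption of this sketch, as are the helper names and the example matrices.

```python
import numpy as np


def split_psd(F):
    """Positive and negative spectral parts of a Hermitian matrix F (F = F_plus - G)."""
    evals, W = np.linalg.eigh(F)
    return (W * np.maximum(evals, 0.0)) @ W.conj().T, (W * np.maximum(-evals, 0.0)) @ W.conj().T


def stage2(F_list):
    """Map stage-1 estimates summing to I onto positive semidefinite A_i summing to I."""
    d = F_list[0].shape[0]
    parts = [split_psd(F) for F in F_list]
    G_sum = sum(G for _, G in parts)
    C = np.linalg.cholesky(np.eye(d) + G_sum)     # C C^dagger = I + sum_i G_i, cf. (6.12)
    C_inv = np.linalg.inv(C)
    return [C_inv @ F_plus @ C_inv.conj().T for F_plus, _ in parts], C


# Illustrative stage-1 estimates: Hermitian, summing to I, the first one not PSD.
F1 = np.array([[0.6, 0.5], [0.5, 0.1]], dtype=complex)
F2 = np.eye(2) - F1
A_list, C = stage2([F1, F2])

print(np.allclose(sum(A_list), np.eye(2)))
print(all(np.linalg.eigvalsh(A).min() >= -1e-12 for A in A_list))
```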

6.2.2.4 Unitary optimization

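When decomposing $I + \sum_i \hat{\mathcal{G}}_i = \hat{C}\hat{C}^{\dagger}$ in (6.12), there is in fact another degree of freedom. For any unitary $\hat{U}$, it holds that $\hat{C}\hat{C}^{\dagger} = \hat{C}\hat{U}\hat{U}^{\dagger}\hat{C}^{\dagger}$. Therefore, $\hat{U}^{\dagger}\hat{A}_i\hat{U}$ can also be a valid estimate of the detector. We hope to choose a $\hat{U}$ such that the effect of $\hat{C}$ is (partly) neutralized. Hence, we aim to minimize $\|\hat{C}\hat{U} - I\|$.

We have $\|\hat{C}\hat{U} - I\|^2 = \mathrm{Tr}[(\hat{C}\hat{U} - I)(\hat{U}^{\dagger}\hat{C}^{\dagger} - I)] = d + \mathrm{Tr}(\hat{C}\hat{C}^{\dagger}) - \mathrm{Tr}(\hat{C}\hat{U} + \hat{U}^{\dagger}\hat{C}^{\dagger})$.

Let $L = -\mathrm{Tr}(\hat{C}\hat{U} + \hat{U}^{\dagger}\hat{C}^{\dagger}) + \mathrm{Tr}[(\Lambda_L + \Lambda_L^{\dagger})(\hat{U}\hat{U}^{\dagger} - I)]$, where $\Lambda_L$ is a Lagrange multiplier matrix. By partial differentiation we have

$$\frac{\partial L}{\partial \hat{U}^{*}} = -\hat{C}^{\dagger} + (\Lambda_L + \Lambda_L^{\dagger})\hat{U} = 0.$$

Therefore,

$$\hat{C}^{\dagger}\hat{U}^{\dagger} = \Lambda_L + \Lambda_L^{\dagger} = \hat{U}\hat{C}. \tag{6.13}$$

We perform a singular value decomposition to obtain $\hat{C} = \hat{U}_{\alpha}\hat{S}_C\hat{U}_{\beta}^{\dagger}$, where $\hat{U}_{\alpha}$ and $\hat{U}_{\beta}$ are unitary and $\hat{S}_C$ is diagonal and positive semidefinite. It is straightforward to verify that

$$\sqrt{\hat{C}^{\dagger}\hat{C}}\,\hat{C}^{-1} = \hat{U}_{\beta}\hat{U}_{\alpha}^{\dagger}.$$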

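Let $\hat{U}_{\gamma} = \hat{U}_{\beta}^{\dagger}\hat{U}\hat{U}_{\alpha}$. Then (6.13) is now equivalent to

$$\hat{S}_C\hat{U}_{\gamma}^{\dagger} = \hat{U}_{\gamma}\hat{S}_C. \tag{6.14}$$

Thus we have

$$\hat{U}_{\gamma}\hat{S}_C^2\hat{U}_{\gamma}^{\dagger} = \hat{U}_{\gamma}\hat{S}_C\hat{U}_{\gamma}\hat{S}_C = \hat{S}_C\hat{U}_{\gamma}^{\dagger}\hat{U}_{\gamma}\hat{S}_C = \hat{S}_C^2.$$

Therefore, $\hat{U}_{\gamma}\hat{S}_C^2\hat{U}_{\gamma}^{\dagger}$ is the spectral decomposition of $\hat{S}_C^2$. Since the probability for $\hat{S}_C$ to be degenerate is zero, we know $\hat{S}_C^2$ is nondegenerate. Thus we have $\hat{U}_{\gamma} = \mathrm{diag}(e^{i\kappa_1}, e^{i\kappa_2}, \ldots, e^{i\kappa_d})$ where $\kappa_j \in [0, 2\pi)$ for $1 \le j \le d$, which indicates that $\hat{U}_{\gamma}$ and $\hat{S}_C$ commute. From (6.14), we then have $\hat{S}_C = \hat{U}_{\gamma}\hat{S}_C\hat{U}_{\gamma} = \hat{S}_C\hat{U}_{\gamma}^2$. When the resource number $N_t$ is large enough, $\hat{C}$ will be close to a unitary matrix and we can view $\hat{S}_C$ as nonsingular. We thus have $\hat{U}_{\gamma}^2 = I$, which indicates $\hat{U}_{\gamma} = \mathrm{diag}(\pm 1, \pm 1, \ldots, \pm 1)$. We further have $L = -2\mathrm{Tr}(\hat{U}_{\gamma}\hat{S}_C)$, which indicates $\hat{U}_{\gamma} = I$. Therefore, the optimal solution is

$$\hat{U} = \hat{U}_{\beta}\hat{U}_{\alpha}^{\dagger} = \sqrt{\hat{C}^{\dagger}\hat{C}}\,\hat{C}^{-1}. \tag{6.15}$$

Hence, the final estimate is $\hat{P}_i = \hat{U}^{\dagger}\hat{A}_i\hat{U}$, where $\hat{U}$ is determined through (6.15). The error $\|\hat{P}_i - P_i\|$ will be referred to as the final (estimation) error, in contrast to the CLS error $\|\hat{F}_i - F_i\|$.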
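The optimal unitary in (6.15) can be illustrated with a short NumPy sketch. The function name `optimal_unitary` is an assumption made here for illustration; the only library call relied on is `numpy.linalg.svd`, which returns the factors $\hat{U}_{\alpha}$, the singular values, and $\hat{U}_{\beta}^{\dagger}$.

```python
import numpy as np

def optimal_unitary(C):
    """Unitary U minimizing the Frobenius norm ||C U - I||, via the SVD of C.

    With C = U_alpha S U_beta^dagger, the minimizer is U = U_beta U_alpha^dagger,
    so that C U = U_alpha S U_alpha^dagger is Hermitian and positive semidefinite.
    """
    U_alpha, _singular_values, U_beta_dag = np.linalg.svd(C)
    return U_beta_dag.conj().T @ U_alpha.conj().T
```

Given the stage-2 matrices $\hat{A}_i$, the final estimate would then be formed as `U.conj().T @ A_i @ U`.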

6.2.3 General procedure and computational complexity

We now summarize the general procedure of our TSE algorithm and analyze its computational complexity. We do not consider the time spent on experiments, since it depends on the experimental realization. In the following, we briefly summarize each step and state its computational complexity.

Step 1. Stage-1 Approximation. Choose the basis set $\{\Omega_i\}$ and probe states $\rho_j$, and calculate $\Theta_j$. Then perform measurement experiments to collect the data $\hat{p}_{ij}$. Obtain the constrained least squares solution from (6.9) and construct the stage-1 approximation $\hat{F}_i = \sum_a \hat{\phi}_a^{(i)}\Omega_a$. In (6.9), both $(X_0^TX_0)^{-1}X_0^T$ and $D$ can be calculated offline prior to the experiments, and the remaining online calculation has computational complexity $O(d^2ML)$.

Step 2. Difference Decomposition. Perform spectral decomposition on each $\hat{F}_i$ and obtain $\hat{F}_i = \hat{\mathcal{F}}_i - \hat{\mathcal{G}}_i$. Since the computational complexity of spectral decomposition is cubic in the dimension of a Hermitian matrix [60], this step has total computational complexity $O(d^3M)$.

Step 3. Stage-2 Approximation. The transformation of $I + \sum_i \hat{\mathcal{G}}_i$ into $\hat{C}\hat{C}^{\dagger}$ can be accomplished by spectral decomposition. Then, we obtain the stage-2 approximation $\hat{A}_i = \hat{C}^{-1}\hat{\mathcal{F}}_i\hat{C}^{-\dagger}$. The complexity of this step is $O(d^3M)$.

Step 4. Unitary Optimization. Calculate the global unitary matrix $\hat{U}$ according to (6.15) and obtain the final estimate $\hat{P}_i = \hat{U}^{\dagger}\hat{A}_i\hat{U}$. This step has computational complexity $O(d^3M)$.

Since $L \ge d^2$ for general complete probe-state sets, we have $d^3M \le d^2ML$. Hence, our algorithm has total computational complexity $O(d^2ML)$.

6.2.4 Error analysis

In this section, we present a theoretical upper bound for the ﬁnal estimation error of our TSE algorithm. It is necessary to ﬁrst characterize the probe states.

Assumption 6.1. The probe states used are optimal [113, 150]; i.e., they are $d$-dimensional pure states and $X_0^TX_0 = c_0 I$ for some $c_0 \in \mathbb{R}$. From [150], we have the following characterization:

$$\frac{L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}] \sim O\!\left(\frac{d^4}{N_t}\right). \tag{6.16}$$

Let E(·) denote the expectation w.r.t. all possible measurement results. We propose the following theorem to characterize the estimation error:

Theorem 6.1. Under Assumption 6.1, the final estimation error of our algorithm $E(\sum_i \|\hat{P}_i - P_i\|^2)$ scales as $O(\frac{d^5M^2}{N_t})$, where $d$ is the system dimension, $M$ is the number of detector POVM matrices and $N_t$ is the total number of resources of the probe states.

Proof. We prove the conclusion through analyzing the error in each step of our algorithm.

6.2.4.1 Error in stage-1 approximation

For simplicity, let $Z = (X^TX)^{-1} = I \otimes (X_0^TX_0)^{-1}$. The estimation error for constrained LRE is

$$\begin{aligned}
E(\|\hat{\Phi}_{CLS} - \Phi\|^2) &= E[\|ZX^Te - ZH^T(HZH^T)^{-1}(H\Phi + HZX^Te - D)\|^2]\\
&= E[\|ZX^Te - ZH^T(HZH^T)^{-1}HZX^Te\|^2]\\
&= \mathrm{Tr}\{E[(ZX^T - ZH^T(HZH^T)^{-1}HZX^T)^T(ZX^T - ZH^T(HZH^T)^{-1}HZX^T)ee^T]\}.
\end{aligned}$$

From [113], we know $E(ee^T) \le \frac{L}{4N_t}I$. Therefore,

$$\begin{aligned}
E[\|\hat{\Phi}_{CLS} - \Phi\|^2] &\le \frac{L}{4N_t}\mathrm{Tr}[(ZX^T - ZH^T(HZH^T)^{-1}HZX^T)^T(ZX^T - ZH^T(HZH^T)^{-1}HZX^T)]\\
&= \frac{L}{4N_t}\mathrm{Tr}[XZ^2X^T - XZH^T(HZH^T)^{-1}HZ^2X^T - XZ^2H^T(HZH^T)^{-1}HZX^T\\
&\qquad\qquad + XZH^T(HZH^T)^{-1}HZ^2H^T(HZH^T)^{-1}HZX^T]\\
&= \frac{L}{4N_t}\mathrm{Tr}(Z) - \frac{L}{4N_t}\mathrm{Tr}[(HZH^T)^{-1}HZ^2H^T].
\end{aligned}\tag{6.17}$$

We have

$$HZH^T = (I, \ldots, I)\,\mathrm{diag}[(X_0^TX_0)^{-1}, \ldots, (X_0^TX_0)^{-1}]\,(I, \ldots, I)^T = M(X_0^TX_0)^{-1}.$$

It is clear that $(HZH^T)^{-1} = X_0^TX_0/M$ and $HZ^2H^T = M(X_0^TX_0)^{-2}$. Continuing (6.17), we have

$$\begin{aligned}
E[\|\hat{\Phi}_{CLS} - \Phi\|^2] &\le \frac{L}{4N_t}\mathrm{Tr}[I_M \otimes (X_0^TX_0)^{-1}] - \frac{L}{4N_t}\mathrm{Tr}[X_0^TX_0/M \cdot M(X_0^TX_0)^{-2}]\\
&= \frac{L}{4N_t}M\,\mathrm{Tr}[(X_0^TX_0)^{-1}] - \frac{L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}]\\
&= \frac{(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}].
\end{aligned}$$

Hence, we have

$$E\Big(\sum_i \|\Delta F_i\|^2\Big) = E\Big\{\sum_i \mathrm{Tr}\Big\{\Big[\sum_{a=1}^{d^2}(\Delta\phi_a^{(i)})\Omega_a\Big]^2\Big\}\Big\} = E(\|\hat{\Phi}_{CLS} - \Phi\|^2) \le \frac{(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}], \tag{6.18}$$

which we refer to as the CLS bound.
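The trace identity behind the CLS bound can be checked numerically. The sketch below assumes a real $L \times v$ matrix $X_0$ of full column rank (here a random one, purely for illustration), builds $X = I_M \otimes X_0$ and $H = (1, \ldots, 1)_{1\times M} \otimes I$, and compares $\mathrm{Tr}(Z) - \mathrm{Tr}[(HZH^T)^{-1}HZ^2H^T]$ with $(M-1)\mathrm{Tr}[(X_0^TX_0)^{-1}]$. The common factor $L/(4N_t)$ is omitted, and the function name is hypothetical.

```python
import numpy as np

def cls_bound_terms(X0, M):
    """Return both sides of Tr(Z) - Tr[(H Z H^T)^-1 H Z^2 H^T] = (M-1) Tr[(X0^T X0)^-1]."""
    _, v = X0.shape
    X = np.kron(np.eye(M), X0)                   # X = I_M (x) X0
    Z = np.linalg.inv(X.T @ X)                   # Z = (X^T X)^-1
    H = np.kron(np.ones((1, M)), np.eye(v))      # H = (I, ..., I)
    HZH = H @ Z @ H.T
    lhs = np.trace(Z) - np.trace(np.linalg.solve(HZH, H @ Z @ Z @ H.T))
    rhs = (M - 1) * np.trace(np.linalg.inv(X0.T @ X0))
    return lhs, rhs
```

The identity does not rely on Assumption 6.1; the assumption only enters later, when the trace is evaluated through (6.16).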

Remark 6.1. In cases when the last POVM matrix PM is omitted for simplicity, unconstrained LRE can be used for stage-1 approximation, and a corresponding error upper bound can be obtained as in [113]:

$$\frac{L}{4N_t}\mathrm{Tr}[(X^TX)^{-1}] = \frac{L}{4N_t}\mathrm{Tr}\{[I_M \otimes (X_0^TX_0)]^{-1}\} = \frac{ML}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}]. \tag{6.19}$$

Comparing (6.18) and (6.19), we find that they differ only by a factor of $\frac{M-1}{M}$. For any given detector, $M$ is fixed and these two bounds behave the same, apart from a constant. We thus omit the analysis for the unconstrained LRE method.

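6.2.4.2 Error in $\|\Delta\mathcal{F}_i\|$

We start from the spectral decomposition (6.10). Since $\hat{W}_i^{\dagger}F_i\hat{W}_i$ is positive semidefinite, its diagonal elements are all nonnegative. Therefore, recalling that the true values satisfy $\mathcal{F}_i = F_i$, we have

$$\begin{aligned}
\|\Delta\mathcal{F}_i\|^2 &= \|\hat{\mathcal{F}}_i - \mathcal{F}_i\|^2\\
&= \sum_{j=1}^{d-\hat{f}_i}[(\hat{K}_i)_{jj} - (\hat{W}_i^{\dagger}F_i\hat{W}_i)_{jj}]^2 + \sum_{j=d-\hat{f}_i+1}^{d}(\hat{W}_i^{\dagger}F_i\hat{W}_i)_{jj}^2 + \sum_{j=1}^{d}\sum_{k=1,k\neq j}^{d}|(\hat{W}_i^{\dagger}F_i\hat{W}_i)_{jk}|^2\\
&\le \sum_{j=1}^{d-\hat{f}_i}[(\hat{K}_i)_{jj} - (\hat{W}_i^{\dagger}F_i\hat{W}_i)_{jj}]^2 + \sum_{j=d-\hat{f}_i+1}^{d}[(\hat{K}_i)_{jj} - (\hat{W}_i^{\dagger}F_i\hat{W}_i)_{jj}]^2 + \sum_{j=1}^{d}\sum_{k=1,k\neq j}^{d}|(\hat{W}_i^{\dagger}F_i\hat{W}_i)_{jk}|^2\\
&= \|\Delta F_i\|^2.
\end{aligned}\tag{6.20}$$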

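6.2.4.3 Error in $\|\hat{C}\hat{C}^{\dagger} - I\|$

Using (6.20), we have the following relationship:

$$\|\hat{C}\hat{C}^{\dagger} - I\| = \Big\|\sum_i \hat{\mathcal{F}}_i - I\Big\| = \Big\|\sum_i \Delta\mathcal{F}_i\Big\| \le \sum_i \|\Delta\mathcal{F}_i\| \le \sum_i \|\Delta F_i\|. \tag{6.21}$$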

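6.2.4.4 Error in $\|\hat{C}\hat{U} - I\|$

Let $\hat{S}_C^2 = \mathrm{diag}(1+s_1, \ldots, 1+s_d)$. We have $\|\hat{C}\hat{C}^{\dagger} - I\| = \|\hat{U}_{\alpha}\hat{S}_C^2\hat{U}_{\alpha}^{\dagger} - I\| = \|\hat{S}_C^2 - I\| = \sqrt{\sum_i s_i^2}$. Hence,

$$\begin{aligned}
\|\hat{C}\hat{U} - I\|^2 &= d + \mathrm{Tr}(\hat{C}\hat{C}^{\dagger}) - \mathrm{Tr}(\hat{C}\hat{U} + \hat{U}^{\dagger}\hat{C}^{\dagger})\\
&= d + \mathrm{Tr}(\hat{C}\hat{C}^{\dagger}) - 2\mathrm{Tr}\big(\sqrt{\hat{C}^{\dagger}\hat{C}}\big)\\
&= d + \mathrm{Tr}(\hat{S}_C^2) - 2\mathrm{Tr}(\hat{S}_C)\\
&= d + \sum_i (1+s_i) - 2\sum_i \sqrt{1+s_i}\\
&= \sum_i \big(\sqrt{1+s_i} - 1\big)^2\\
&= \sum_i \frac{s_i^2}{2 + s_i + 2\sqrt{1+s_i}} = \sum_i s_i^2\Big[\frac{1}{4} - \frac{1}{8}s_i + o(s_i)\Big]\\
&\sim O\Big(\frac{1}{4}\|\hat{C}\hat{C}^{\dagger} - I\|^2\Big).
\end{aligned}$$

Using (6.21), we know

$$\|\hat{C}\hat{U} - I\| = O\Big(\frac{1}{2}\sum_i \|\Delta F_i\|\Big), \tag{6.22}$$

where we keep the constant explicit rather than absorbing it into the $O$ notation until the end of this proof.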

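6.2.4.5 Error in $\|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - I\|$

Denote the singular values of $\hat{C}\hat{U}$ as $\hat{\mu}_i$ for $1 \le i \le d$. From (6.12) we know each $\hat{\mu}_i^2$ is an eigenvalue of $I + \sum_i \hat{\mathcal{G}}_i$. Hence, we have $\hat{\mu}_i \ge 1$ for every $1 \le i \le d$. Therefore,

$$\|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1}\| = \sqrt{\sum_i \frac{1}{\hat{\mu}_i^2}} \le \sqrt{d}. \tag{6.23}$$

Using (6.22), we have

$$\begin{aligned}
\|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - I\| &= \|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1}(\hat{U}^{\dagger}\hat{C}^{\dagger} - I)^2 - (\hat{U}^{\dagger}\hat{C}^{\dagger} - I)\|\\
&\le \|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1}\| \cdot \|\hat{U}^{\dagger}\hat{C}^{\dagger} - I\|^2 + \|\hat{U}^{\dagger}\hat{C}^{\dagger} - I\|\\
&\le \sqrt{d}\,\|\hat{U}^{\dagger}\hat{C}^{\dagger} - I\|^2 + \|\hat{U}^{\dagger}\hat{C}^{\dagger} - I\|\\
&\sim O(\|\hat{U}^{\dagger}\hat{C}^{\dagger} - I\|) = O(\|\hat{C}\hat{U} - I\|)\\
&= O\Big(\frac{1}{2}\sum_i \|\Delta F_i\|\Big).
\end{aligned}\tag{6.24}$$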

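6.2.4.6 Error in $\sum_i \|\Delta P_i\|^2$

Since each $\mathcal{F}_i = F_i$ is positive semidefinite, we have

$$\sum_i \|\mathcal{F}_i\|^2 = \sum_i \mathrm{Tr}(\mathcal{F}_i^2) = \mathrm{Tr}\Big(\sum_i \mathcal{F}_i^2\Big) \le \mathrm{Tr}\Big(\sum_i \mathcal{F}_i^2 + \sum_{i \neq j}\mathcal{F}_i\mathcal{F}_j\Big) = \mathrm{Tr}\Big[\Big(\sum_i \mathcal{F}_i\Big)^2\Big] = \mathrm{Tr}(I) = d.$$

For each $i$, we have

$$\|\mathcal{F}_i\| = \|F_i\| \le \|I\| = \sqrt{d}.$$

Using (6.20), (6.22), (6.23) and (6.24), we have

$$\begin{aligned}
\sum_i \|\Delta P_i\|^2 &= \sum_i \|(\hat{C}\hat{U})^{-1}\hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - \mathcal{F}_i\|^2\\
&= \sum_i \|(\hat{C}\hat{U})^{-1}\hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - \hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} + \hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - \hat{\mathcal{F}}_i + \hat{\mathcal{F}}_i - \mathcal{F}_i\|^2\\
&\le \sum_i \big[\|(\hat{C}\hat{U})^{-1}\hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - \hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1}\| + \|\hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - \hat{\mathcal{F}}_i\| + \|\hat{\mathcal{F}}_i - \mathcal{F}_i\|\big]^2\\
&\le \sum_i \big[\|(\hat{C}\hat{U})^{-1} - I\| \cdot \|\hat{\mathcal{F}}_i(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1}\| + \|\hat{\mathcal{F}}_i\| \cdot \|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - I\| + \|\Delta F_i\|\big]^2\\
&\le \sum_i \big[\|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - I\| \cdot \|\hat{\mathcal{F}}_i\| \cdot \|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1}\| + \|\hat{\mathcal{F}}_i\| \cdot \|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - I\| + \|\Delta F_i\|\big]^2\\
&\sim \sum_i \big[2\|\hat{\mathcal{F}}_i\| \cdot \|(\hat{U}^{\dagger}\hat{C}^{\dagger})^{-1} - I\| + \|\Delta F_i\|\big]^2\\
&\sim \sum_i \Big[\|\hat{\mathcal{F}}_i\|\,O\Big(\sum_j \|\Delta F_j\|\Big) + \|\Delta F_i\|\Big]^2\\
&= \sum_i \Big[\|\hat{\mathcal{F}}_i\|^2\,O\Big(\sum_j \|\Delta F_j\|\Big)^2 + \|\Delta F_i\|^2 + 2\|\hat{\mathcal{F}}_i\|\,O\Big(\sum_j \|\Delta F_j\|\Big)\|\Delta F_i\|\Big]\\
&\sim O\Big(\sum_j \|\Delta F_j\|\Big)^2\sum_i \|\mathcal{F}_i\|^2 + \sum_i \|\Delta F_i\|^2 + 2\,O\Big(\sum_j \|\Delta F_j\|\Big)\sum_i \|\mathcal{F}_i\| \cdot \|\Delta F_i\|\\
&\le d \cdot O\Big(\sum_j \|\Delta F_j\|\Big)^2 + \sum_i \|\Delta F_i\|^2 + 2\sqrt{d}\,O\Big(\sum_j \|\Delta F_j\|\Big)\sum_i \|\Delta F_i\|\\
&\le dM \cdot O\Big(\sum_j \|\Delta F_j\|^2\Big) + \sum_i \|\Delta F_i\|^2 + 2\sqrt{d}M\,O\Big(\sum_j \|\Delta F_j\|^2\Big)\\
&= (dM + 2\sqrt{d}M + 1)\,O\Big(\sum_j \|\Delta F_j\|^2\Big),
\end{aligned}\tag{6.25}$$

where the second last line comes from the Cauchy-Schwarz inequality

$$\Big(\sum_i \|\Delta F_i\|\Big)^2 \le M\Big(\sum_i \|\Delta F_i\|^2\Big).$$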

Taking the expectation of (6.25) and using (6.18), we have

$$\begin{aligned}
E\Big(\sum_i \|\Delta P_i\|^2\Big) &\sim (dM + 2\sqrt{d}M + 1)\,O\Big[E\Big(\sum_i \|\Delta F_i\|^2\Big)\Big]\\
&\sim O\Big\{\frac{(dM + 2\sqrt{d}M + 1)(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}]\Big\}.
\end{aligned}\tag{6.26}$$

Since we have explicitly shown the constants in the $O$ notation, (6.26) should be interpreted as stating that the following inequality holds asymptotically:

$$E\Big(\sum_i \|\Delta P_i\|^2\Big) \le \frac{(dM + 2\sqrt{d}M + 1)(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}] + o\Big(\frac{1}{N_t}\Big). \tag{6.27}$$

Using (6.16), we can further simplify (6.26) as

$$E\Big(\sum_i \|\hat{P}_i - P_i\|^2\Big) \sim O\Big(\frac{d^5M^2}{N_t}\Big). \tag{6.28}$$

Remark 6.2. If the probe states are not optimal, (6.28) might fail and only (6.27) holds. This proof also indicates that $\mathrm{Tr}[(X_0^TX_0)^{-1}]$ is a helpful index to guide the choice of the probe states. If different probe states are highly similar to each other, then they result in a large $\mathrm{Tr}[(X_0^TX_0)^{-1}]$ and thus (possibly) a large estimation error.

6.3 Optimization of the coherent probe states

Since the detector to be estimated is usually unknown in practice, the optimization among all the possible probe states should be independent of the speciﬁc detector. An advantage of our TSE method is that an explicit error upper bound is presented, which does not involve the speciﬁc form of the detector. This can be critical in the optimization of the probe states. Moreover, to adapt to practical applications, we assume the probe states are all coherent states in this section.

6.3.1 On the kinds of probe states

In quantum optics experiments, the preparation of number states $|k\rangle$ ($k$ is a nonnegative integer) is a difficult task, especially when $k$ is large. Therefore, in practice the input probe states are usually coherent states instead. A coherent state is denoted as $|\alpha\rangle$ where $\alpha \in \mathbb{C}$ and it can be expanded using number states as

$$|\alpha\rangle = e^{-\frac{|\alpha|^2}{2}}\sum_{i=0}^{\infty}\frac{\alpha^i}{\sqrt{i!}}|i\rangle.$$

Their inner product relationship is

$$\langle\beta|\alpha\rangle = e^{-\frac{1}{2}(|\beta|^2 + |\alpha|^2 - 2\beta^{*}\alpha)}. \tag{6.29}$$

We usually identify $\alpha$ with $|\alpha\rangle$ when there is no ambiguity.

Let $|\alpha\rangle_d = e^{-\frac{|\alpha|^2}{2}}\sum_{i=0}^{d-1}\frac{\alpha^i}{\sqrt{i!}}|i\rangle$. Coherent states are in essence infinite dimensional.

To estimate a $d$-dimensional detector, in practice we employ $|\alpha\rangle_d$ as the (approximate) mathematical description of $|\alpha\rangle$ in this chapter. Therefore, the part $|\alpha\rangle - |\alpha\rangle_d$ is viewed as noise, which should be suppressed. This requires the amplitude $|\alpha|$ to be not large. Furthermore, (6.29) indicates that if $|\alpha|$ and $|\beta|$ are both close to zero, their inner product will be close to one, which means that coherent states with small amplitudes are very much "alike". Hence we cannot employ probe state sets where all the amplitudes are small. Considering the above two requirements, we design the preparation procedure of the probe states as follows.

Probe States Preparation: Given an appropriate $c_q > 0$, generate two random numbers $x$ and $y$ independently, each uniformly distributed on $[-c_q, c_q]$. Then $|x + iy\rangle$ will be employed as a probe state, with $N_t/L$ copies. Repeat this process to generate $L$ probe states and employ them to perform detector tomography.
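The sampling procedure can be sketched in a few lines of Python. This is an illustration only: the function names `truncated_coherent_state` and `sample_probe_states` and the optional `rng` seed argument are assumptions introduced here, and the states are represented by their truncated amplitude vectors $|\alpha\rangle_d$ in the number basis.

```python
import numpy as np
from math import factorial

def truncated_coherent_state(alpha, d):
    """Amplitudes of |alpha>_d: the first d number-state components of |alpha>."""
    i = np.arange(d)
    sqrt_factorials = np.sqrt(np.array([float(factorial(k)) for k in range(d)]))
    return np.exp(-abs(alpha) ** 2 / 2) * alpha ** i / sqrt_factorials

def sample_probe_states(L, c_q, d, rng=None):
    """Draw L amplitudes x + iy uniformly from the square [-c_q, c_q]^2."""
    rng = np.random.default_rng(rng)
    x = rng.uniform(-c_q, c_q, size=L)
    y = rng.uniform(-c_q, c_q, size=L)
    alphas = x + 1j * y
    states = np.array([truncated_coherent_state(a, d) for a in alphas])
    return alphas, states
```

The norm of each truncated vector is below one; the deficit is the weight of the discarded part $|\alpha\rangle - |\alpha\rangle_d$, which grows with $|\alpha|$.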

Remark 6.3. Our sampling procedure is in essence sampling randomly within a given square in the complex plane. Another candidate method is to sample following a certain symmetric fixed pattern within this given square. Since simulation shows little difference in the final estimation error, we stick to our random-sample procedure.

With our probe state preparation procedure, we ask how the final estimation error depends on $L$ when other factors, such as the detector, the total number of copies $N_t$ of the probe states and the parameter $c_q$, remain unchanged.

To ensure that the inverse of $X^TX$ in (6.7) exists, at least $L \ge d^2$ is required. We further find that when $L$ is large enough, the final estimation error tends to a constant independent of $L$. We give an explanation as follows.

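First, the $j$-th probe state $|\alpha\rangle$ is approximately viewed as $|\alpha\rangle_d$, which has a corresponding parametrization $\Phi_j$. Let $\mathbb{E}(\cdot)$ denote the expectation of functions of $x$ and $y$, in contrast to the expectation $E(\cdot)$ in Theorem 6.1. Let $\gamma_j = \Phi_j - \mathbb{E}(\Phi_j)$. Then the $\gamma_j$ are i.i.d. with respect to the subscript $j$. According to (6.18), the estimation error upper bound is $\frac{(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}]$. We thus have

$$\begin{aligned}
\mathbb{E}\Big\{\frac{(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}]\Big\} &= \frac{M-1}{4N_t}\mathrm{Tr}\Big\{L\,\mathbb{E}\Big[\Big(\sum_{j=1}^{L}(\mathbb{E}(\Phi_j) + \gamma_j)(\mathbb{E}(\Phi_j)^T + \gamma_j^T)\Big)^{-1}\Big]\Big\}\\
&= \frac{M-1}{4N_t}\mathrm{Tr}\Big\{L\,\mathbb{E}\Big[\Big(L\,\mathbb{E}(\Phi_j)\mathbb{E}(\Phi_j)^T + \sum_{j=1}^{L}\gamma_j\gamma_j^T\Big)^{-1}\Big]\Big\}\\
&= \frac{M-1}{4N_t}\mathrm{Tr}\Big\{\Big[\mathbb{E}(\Phi_j)\mathbb{E}(\Phi_j)^T + \mathbb{E}\Big(\frac{1}{L}\sum_{j=1}^{L}\gamma_j\gamma_j^T\Big)\Big]^{-1}\Big\}.
\end{aligned}\tag{6.30}$$

According to the central limit theorem [32], as $L$ tends to infinity, $\mathbb{E}(\frac{1}{L}\sum_{j=1}^{L}\gamma_j\gamma_j^T)$ converges to a fixed matrix, and hence the expectation of the estimation error upper bound tends to a constant.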

Two points should be noted: (i) In practice $L$ cannot be arbitrarily large when $N_t$ is given. (ii) There is usually a gap between this bound (6.30) and the practical error. However, simulation results imply the effectiveness of the above analysis, which suggests that a modest number of different kinds of probe states should be enough for practical applications. To investigate the least $L$ that suffices for an estimation task with the dimension given, we only need to calculate $\mathbb{E}\{\frac{(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}]\}$ for several candidates of $L$, which is a quantity independent of the specific detector.

6.3.2 Optimization of the size of sampling square for probe states

As analyzed in Sec. 6.3.1, the estimation error would be large if $c_q$ is too small or too large. Hence, there should be an optimal value for the choice of $c_q$. This is further verified by the simulation results in Fig. 6.4.

To locate the optimal value of $c_q$, we consider the projection of a probe state onto the $d$-dimensional subspace where the detector resides. Theoretically, the optimal value of $c_q$ should be different for different detectors, even though the dimension is fixed. However, in simulations (for example, Fig. 6.4), we find that the optimal values for a practical detector and the bound $\mathbb{E}\{\frac{(M-1)L}{4N_t}\mathrm{Tr}[(X_0^TX_0)^{-1}]\}$ usually coincide. Therefore, as an approximation, we can investigate the optimization of this bound w.r.t. $c_q$. Furthermore, the value of $N_t$ does not affect this optimization, and from Sec. 6.3.1 we know that $L$, provided it is not too small, is also irrelevant to the optimization. Hence, we only need to optimize

$$\mathbb{E}\{\mathrm{Tr}[(X_0^TX_0)^{-1}]\}, \tag{6.31}$$

which is a quantity uniquely determined by the probe states.

We start from the real function defined on all nonnegative integers, $g(k) = e^{-|\alpha|^2}\frac{|\alpha|^{2k}}{k!}$, where $\alpha$ is the complex number corresponding to a probe state $|\alpha\rangle$. From $g(k) - g(k+1) = e^{-|\alpha|^2}|\alpha|^{2k}\frac{k+1-|\alpha|^2}{(k+1)!}$, we know $g(k)$ first increases and then decreases after $k \ge |\alpha|^2 - 1$. Hence, $g(k)$ reaches its maximum value around $|\alpha|^2 - 1$.

For any given probe state $|\alpha\rangle$, we consider the amplitude of its projection on each position $|k\rangle\langle j|$, which is

$$h(k, j) \triangleq |\langle j|\alpha\rangle\langle\alpha|k\rangle| = e^{-|\alpha|^2}\frac{|\alpha|^{j+k}}{\sqrt{j!\,k!}}.$$
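The projection amplitudes are easy to tabulate. The following sketch (the function name `projection_amplitude` is hypothetical) evaluates $h(k, j)$ on the $d \times d$ grid as an outer product, so that its diagonal reproduces $g(k)$.

```python
import numpy as np
from math import factorial

def projection_amplitude(alpha_abs, d):
    """Grid of h(k, j) = exp(-|alpha|^2) |alpha|^(j+k) / sqrt(j! k!) for 0 <= k, j < d."""
    k = np.arange(d)
    sqrt_factorials = np.sqrt(np.array([float(factorial(n)) for n in range(d)]))
    weights = alpha_abs ** k / sqrt_factorials   # |alpha|^k / sqrt(k!)
    return np.exp(-alpha_abs ** 2) * np.outer(weights, weights)
```

For $d = 8$ and $|\alpha| = 2$, the values at positions $(3, 3)$ and $(4, 4)$ coincide, in line with the statement that $g(|\alpha|^2 - 1) = g(|\alpha|^2)$ when $|\alpha|^2 - 1$ is an integer.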

Fig. 6.1 shows the gridded $h(k, j)$ with $d = 8$ and $|\alpha| = 2$. Note that $g(k)$ is the restriction of $h(k, j)$ to $j = k$. Using the same technique as for analyzing $g(k)$, it is straightforward to prove that the gridded $h(k, j)$ always has a single peak, with the position of the maximum around $(|\alpha|^2 - 1, |\alpha|^2 - 1)$. Generally, the larger $h(k, j)$ is, the better accuracy one can expect to obtain for estimating the element of a detector matrix at position $|k\rangle\langle j|$. To obtain the least estimation error, a natural idea is to maximize $h(k, j)$ for each position $(k, j)$. However, this is not practical, because from $\sum_{k=0}^{\infty} g(k) = 1$ one can see that $\sum_{k,j} h(k, j)$ is bounded. Therefore, to locate the optimal $c_q$ means to optimally allocate $h(k, j)$ on the $d \times d$ positions.

Figure 6.1: The projection amplitude function $h(k, j)$.

Generally speaking, when estimating a multivariate target $\{\theta_i\}$, the MSE $E(\sum_i |\theta_i - \hat{\theta}_i|^2)$ is usually dominated by the worst estimated parameter $\max_i |\theta_i - \hat{\theta}_i|$. Hence, the optimal $c_q$ (denoted as $c_o \triangleq \arg\min_{c_q} \mathbb{E}\{\mathrm{Tr}[(X_0^TX_0)^{-1}]\}$) should have a good performance for the worst estimation. When $c_q$ is too small, $\mathbb{E}(|\alpha\rangle\langle\alpha|)$ is overly concentrated near the origin, and the projections on positions $(k, j)$ far from the origin will be too small, resulting in a large (6.31); i.e., a bad estimate. Conversely, $c_q$ should not be too large, either. If we approximately view $h(k, j)$ as symmetric, it is natural to conclude that the projection of the maximum (or the middle point of the two maxima) of $h(k, j)$ should be at $(\frac{d-1}{2}, \frac{d-1}{2})$ for $c_o$. More specifically, if $|\alpha|^2 - 1$ is an integer, then $g(|\alpha|^2 - 1) = g(|\alpha|^2)$ are the two maxima. When $d$ is even, the maximum of $h(k, j)$ should be two contour points $(\frac{d-2}{2}, \frac{d-2}{2})$ and $(\frac{d}{2}, \frac{d}{2})$, and we should have $\frac{d-2}{2} = |\alpha|^2 - 1$. When $d$ is odd, $h(k, j)$ has one maximum and its projection should be $(\frac{d-1}{2}, \frac{d-1}{2})$, which further indicates $\frac{d-1}{2} = \frac{|\alpha|^2 + |\alpha|^2 - 1}{2}$. Therefore for $c_o$, we should always have

$$(\mathbb{E}|\alpha|)^2 = \frac{d}{2}. \tag{6.32}$$

From our sampling scheme for probe states in Sec. 6.3.1, we have

$$\mathbb{E}|\alpha| = \int_{-c_q}^{c_q}\int_{-c_q}^{c_q}\frac{\sqrt{x^2 + y^2}}{4c_q^2}\,dx\,dy = \frac{\sqrt{2} + \ln(1 + \sqrt{2})}{3}c_q. \tag{6.33}$$

Combining (6.32) and (6.33), we have the following heuristic formula

$$c_o = \frac{3\sqrt{d}}{2 + \sqrt{2}\ln(1 + \sqrt{2})}. \tag{6.34}$$
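The heuristic formula can be cross-checked against the sampling scheme. The sketch below evaluates (6.34) and estimates $\mathbb{E}|\alpha|$ by Monte Carlo for comparison with the closed form in (6.33); the names `heuristic_c_o`, `mean_amplitude_mc` and `MEAN_FACTOR`, as well as the sample size and seed, are illustrative assumptions.

```python
import numpy as np

# Closed-form factor in Eq. (6.33): E|alpha| = MEAN_FACTOR * c_q.
MEAN_FACTOR = (np.sqrt(2) + np.log(1 + np.sqrt(2))) / 3

def heuristic_c_o(d):
    """Heuristic optimal half-width of the sampling square, Eq. (6.34)."""
    return 3 * np.sqrt(d) / (2 + np.sqrt(2) * np.log(1 + np.sqrt(2)))

def mean_amplitude_mc(c_q, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E|alpha| for alpha uniform on [-c_q, c_q]^2."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-c_q, c_q, n_samples)
    y = rng.uniform(-c_q, c_q, n_samples)
    return np.mean(np.hypot(x, y))
```

By construction, `MEAN_FACTOR * heuristic_c_o(d)` equals $\sqrt{d/2}$, which is condition (6.32).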

Remark 6.4. If the probe state is the tensor product of single-qubit probe states, then one only needs to optimize each single-qubit probe state, which corresponds to the 2-dimensional version of (6.31). This can be straightforwardly achieved by running a numerical simulation, and the result is also covered in Fig. 6.5.

6.4 Numerical results

6.4.1 Basic performance

We simulate the estimation error under different total resource numbers. We consider a 2-dimensional system with a detector

$$P_1 = \begin{pmatrix} 0 & 0 \\ 0 & 0.3 \end{pmatrix},\quad P_2 = \begin{pmatrix} 0.1 & -0.02i \\ 0.02i & 0.2 \end{pmatrix}\quad\text{and}\quad P_3 = \begin{pmatrix} 0.9 & 0.02i \\ -0.02i & 0.5 \end{pmatrix}.$$

The sampling parameter for coherent states is $c_q = 0.015$. The number of different kinds of probe states is $L = 40$. We employ our method to estimate the detector using different resource numbers and present the results in Fig. 6.2. In Fig. 6.2, the green dashed line is the theoretical CLS error upper bound (6.18), the black line is the theoretical final error upper bound (6.26) (or equivalently, (6.27) without the higher order term), and the blue dots and red diamonds are the CLS error and final error, respectively. The horizontal axis is the logarithm of the total number of copies of probe states $N_t$ and the vertical axis is the logarithm of the Mean Square Error (MSE) $E(\sum_i \|\hat{P}_i - P_i\|^2)$. Each point in Fig. 6.2 is the average of 50 simulations.

Figure 6.2: MSE versus the total resource number. (Horizontal axis: $\log_{10} N_t$; vertical axis: $\log_{10}$ MSE; curves: Theoretical CLS Bound, Theoretical Final Bound, CLS Error, Final Error.)

In Fig. 6.2, the CLS bound is tighter than the final bound, because more relaxations are used in deriving the final bound, which makes it looser. We also notice that when the resource number is small ($N_t < 10^6$), the final estimation error is clearly smaller than the CLS error. This is because the estimation error of any physical estimate is in essence bounded by a constant, while the CLS estimate can be nonphysical and thus leads to an unbounded error. As a result, when the resource number is not large enough, the CLS estimate is rough and its error exceeds this constant, while the final error is still bounded by this constant. When the resource number is large ($N_t > 10^7$), the decreasing slope is close to $-1$, which verifies Theorem 6.1.

Figure 6.3: MSE versus probe state kinds. (Horizontal axis: $L$; vertical axis: $\log_{10}$ MSE; curves: Theoretical CLS Bound, Theoretical Final Bound, CLS Error, Final Error.)

6.4.2 On the kinds of probe states

We simulate the performance of our algorithm with different numbers of kinds of coherent probe states. The detector and $c_q$ are the same as in Sec. 6.4.1. The total resource number is $N_t = 1.44 \times 10^9$. We perform our estimation method with $L$ varying from 4 to 260, and present the results in Fig. 6.3, where each point is the average of 50 simulations. The legend is the same as Fig. 6.2, except that the horizontal axis is the number of kinds of probe states $L$. We can see that when $L$ is very small, both the theoretical bound and the practical errors are large, due to the fact that the probe states lack diversity and their linear dependence is high. When $L$ is 16 or more, both the bound and the practical errors quickly tend to constants, which validates our analysis in Sec. 6.3.1. Therefore, in practice a moderate number of different probe states should suffice.

Figure 6.4: MSE versus the size of sampling square for probe states. (Horizontal axis: $\log_{10} c_q$; vertical axis: $\log_{10}$ MSE; curves: Theoretical CLS Bound, Theoretical Final Bound, CLS Error, Final Error.)

6.4.3 Optimization of the size of sampling square for probe states

We perform simulations to illustrate that the optimal size of the sampling square for coherent probe states coincides (approximately) with the optimal point of the bound (6.31). We consider a system with the same detector as that in Sec. 6.4.1. The total resource number is $N_t = 10^6$, and the number of different kinds of probe states is $L = 32$. We perform our TSE method under different $c_q$, and present the results in Fig. 6.4. Each point is the average of 200 simulations. We can see that there is indeed an optimal point for the practical estimation error w.r.t. different sizes $c_q$, which validates the analysis in Sec. 6.3.2. Also, this practical optimal position of $c_q$ basically coincides with the optimal position of the error bound.

Figure 6.5: The optimal sampling square size versus dimension. (Horizontal axis: $d$; vertical axis: $|\alpha|$; series: practical optimal positions of the bound, predicted optimal positions.)

Using the same system, we search by simulation for the optimal size $c_q$ of the sampling square for probe states in different dimensions. The practical optimal positions we search for are the minimum points of the bound (6.31) under dimensions $d = 2, 4, 8, 16$, which are presented as red diamonds in Fig. 6.5. The blue line is the optimal $c_o$ predicted by our formula (6.34), which is close to the practical optimal values, although there is still room for improvement.

6.4.4 Comparison with MLE

We compare our TSE method with the Maximum Likelihood Estimation (MLE) method, which is one of the most widely used methods. We simulate an $N_q$-qubit detector with $P_1 + P_2 = I$ where

$$P_1 = U_1\,\mathrm{diag}\Big(1, \frac{1}{2}, \frac{1}{3}, \ldots, \frac{1}{2^{N_q}}\Big)U_1^{\dagger}$$

and

$$U_1 = \left[\frac{1}{2}\begin{pmatrix} 1 & \sqrt{3} \\ -\sqrt{3} & 1 \end{pmatrix}\right]^{\otimes N_q}.$$

The probe states are the tensor product of single-qubit states $\{\frac{I}{2}, \frac{I+\sigma_x}{2}, \frac{I+\sigma_y}{2}, \frac{I+\sigma_z}{2}\}$. For each $N_q$, the total resource number of the probe states is $N_t = 10^3 \times 2^{3N_q}$, and they are evenly distributed to each probe state. The MLE method we used is the method in [110], which is also the one we introduced in Sec. 2.2.3. We compare the estimation results of our TSE method and MLE in Fig. 6.6, where each point is the average of 10 simulations. The running time ($T_R$) is the online computational time.

Figure 6.6: Comparison between our TSE algorithm and MLE for different qubit numbers. (Horizontal axis: qubit number $N_q$; vertical axes: $\log_{10}$ MSE and $\log_{10} T_R$ in seconds; curves: Time of MLE, Time of TSE, Error of MLE, Error of TSE.)

Figure 6.7: Comparison between our algorithm and MLE for different numbers of POVM matrices. (Horizontal axis: $\log_2 M$; vertical axes: $\log_{10}$ MSE and $\log_{10} T_R$ in seconds; curves: Time of MLE, Time of TSE, Error of MLE, Error of TSE.)

For each detector, we first run our TSE algorithm, and then adjust the MLE method such that the averaged estimation error of MLE is within [95%, 105%] of the error of our algorithm. We see that for $N_q \ge 4$ qubits our algorithm can be faster than MLE by over 4 orders of magnitude. In this simulation, $L = d^2$, and we thus anticipate the computational complexity is $O(d^4)$, which indicates a theoretical slope 1.204 for our running time in the coordinates of Fig. 6.6. For the simulated running time of our algorithm, the slope of the fitting line of the right three points is 1.060, which is close to the theoretical value but still with some difference, possibly because the qubit number is not large enough.
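The quoted theoretical slope follows from a one-line calculation, sketched here under the assumption that the horizontal axis of Fig. 6.6 is $N_q$, the vertical axis is $\log_{10} T_R$, and $d = 2^{N_q}$:

```latex
T_R \propto d^4 = 2^{4N_q}
\;\Longrightarrow\;
\log_{10} T_R = 4\log_{10}(2)\,N_q + \mathrm{const} \approx 1.204\,N_q + \mathrm{const}.
```

The same reasoning with $T_R \propto M$ and a horizontal axis $\log_2 M$ gives a slope of $\log_{10} 2 \approx 0.301$.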

We also simulate the case when M increases. We ﬁx d = 4 and the detector is

1 1 1 j = V diag[ , (1, , ) ]V †, Pj j M 2 3 M2 j

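where for $1 \le j < M$ we have

$$V_j = \begin{cases} \left[\dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \dfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\right]^{\dagger}, & \text{when } j \text{ is odd}, \\[2ex] \exp(-i\sigma_x \otimes \sigma_x), & \text{when } j \text{ is even}. \end{cases}$$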

The probe states are the same as those in the above simulation. We choose $M$ to be a power of 2 and run the simulation for different values of $M$. The total resource number of the probe states is fixed as $N_t = 10^3 \times 2^3$, and they are evenly distributed to each probe state. We plot the running time ($T_R$) versus $M$ in logarithmic coordinates for our TSE method and MLE in Fig. 6.7, where each point is the average of 10 simulations. We see that TSE is significantly faster than MLE for large $M$.

Theoretically, TR = O(M) indicates a slope 0.301 for the TSE method. For the simulated running time of our algorithm, the slope of the ﬁtting line of the right three points is 0.293, which is close to the theoretical value. Furthermore, Fig. 6.6 and Fig. 6.7 also imply the relationship between the estimation error and M and d, which is not close to the prediction of (6.28). One possible reason is that the bound (6.28) might not be tight. Also, note that the practical error is dependent on the speciﬁc detector and when M and d change the detector necessarily changes. Hence, we leave it an open problem to better characterize the increasing tendency of the error w.r.t. M and d.

Remark 6.5. For practical detectors M is usually smaller than d. However, this pattern means a very large d in simulation, which is diﬃcult to perform if we are to simulate the performance of MLE as comparison. Hence, we do not enforce large d when performing simulation in Fig. 6.7.

Figure 6.8: Quantum optical experimental setup for QDT [161]. (Panel (a): fiber coupled CW laser, attenuator, probe states (two-mode coherent states), emulated two-mode detector and detector outcomes (on/off). Panels (b) and (c): emulated detector with H0 (22.5 deg) or Q0 (45 deg), PBS0, SNSPDs and an OR gate.)

6.5 Experimental results

6.5.1 Experimental setup

The quantum optical experiment in this section is performed by our collaborators Shota Yokoyama and Hidehiro Yonezawa at the University of New South Wales, Canberra. We use the experimental data to demonstrate our algorithm.

We ﬁrst brieﬂy explain the entire experimental setup (as in Fig. 6.8(a)), which determines the structure of the detector to be estimated. More details about this setup can be found in [161].

In Fig. 6.8, the pink dashed box corresponds to the emulated quantum detector, which works as a detector with two-mode input and one binary output. Two independent quantum modes are encoded within orthogonal polarization modes in one optical beam at the detector input. The two-mode quantum detector consists of two superconducting nanowire single photon detectors (SNSPDs), a polarization beam splitter (PBS), a half wave plate (HWP), and a logical OR gate. The polarization of the input beam is first rotated by a half wave plate (HWP0) with an azimuth angle of $22.5^{\circ}$ (Fig. 6.8(b)), or by a quarter wave plate (QWP0) with an azimuth angle of $45^{\circ}$ (Fig. 6.8(c)). Then the beam is split into two spatially separated beams via PBS0, and they are injected into two SNSPDs through optical fibers. The photon counting signals from the two SNSPDs are sent to a logical OR gate, and the final detector output is obtained as an on/off signal corresponding to the POVM elements $P_1$ and $P_2$ ($P_1 + P_2 = I$). Fig. 6.8(b) and (c) are different specific settings to generate different emulated detectors.

This experimental setup leads to a special class of detectors. Speciﬁcally, we require them to be block diagonal (e.g., see [161]). Denote

$$B(d, m, \{d_j\}) \triangleq \{(L_1 \oplus L_2 \oplus \cdots \oplus L_m)\,|\,\forall\, 1 \le j \le m,\ L_j \in \mathbb{C}^{d_j \times d_j}\},$$

where $m$ is the number of different blocks and $L_j$ is $d_j \times d_j$ dimensional with $\sum_{j=1}^{m} d_j = d$. In this section (Sec. 6.5), we require each detector matrix $P_i \in B(d, m, \{d_j\})$. Denote

$$P_i = L_1^{(i)} \oplus L_2^{(i)} \oplus \cdots \oplus L_m^{(i)}. \tag{6.35}$$

The requirement $P_i \ge 0$ is thus equivalent to $L_j^{(i)} \ge 0$ for all $1 \le j \le m$. Hence, we need to modify our original TSE method to reconstruct $\{P_i\}$.

6.5.2 Modiﬁed estimation protocol

First we choose $\{\Omega_i\}_{i=1}^{v}$ to be a complete orthogonal Hermitian basis set for $B(d, m, \{d_j\})$ (instead of for $\mathbb{C}^{d \times d}$), where $\Omega_1 = I_d/\sqrt{d}$ and $v$ equals $\sum_j d_j^2$ instead of $d^2$. Such a basis set $\{\Omega_i\}_{i=1}^{v}$ is not unique, and we give an example in Appendix N.

Then we have the parametrization under this basis set as

$$P_i = \sum_{a=1}^{v}\phi_a^{(i)}\Omega_a,$$

$$\rho_j = \sum_{b=1}^{v}\theta_b^{(j)}\Omega_b,$$

and the theoretical probability is $p_{ij} = \mathrm{Tr}(P_i\rho_j)$, which now becomes

$$p_{ij} = \sum_{a=1}^{v}\theta_a^{(j)}\phi_a^{(i)} \triangleq \Theta_j^T\Phi_i.$$

The linear regression equation is now

$$\hat{p}_{ij} = \Theta_j^T\Phi_i + \Delta p_{ij},$$

and the error $\Delta p_{ij} = \hat{p}_{ij} - p_{ij}$ converges in distribution to a normal distribution with mean 0 and variance $(p_{ij} - p_{ij}^2)/(N_t/L)$. Let $\Phi = (\Phi_1^T, \Phi_2^T, \ldots, \Phi_M^T)^T$ and $X_0 = (\Theta_1, \Theta_2, \ldots, \Theta_L)^T$. Then $X_0$ is $L \times v$ dimensional. Let $\hat{Y} = (\hat{p}_{11}, \hat{p}_{12}, \ldots, \hat{p}_{1L}, \hat{p}_{21}, \hat{p}_{22}, \ldots, \hat{p}_{2L}, \ldots, \hat{p}_{ML})^T$, $X = I_M \otimes X_0$, $e = (\Delta p_{11}, \Delta p_{12}, \ldots, \Delta p_{1L}, \Delta p_{21}, \ldots, \Delta p_{2L}, \ldots, \Delta p_{ML})^T$, $H = (1, 1, \ldots, 1)_{1 \times M} \otimes I_v$, $D_{v \times 1} = (\sqrt{d}, 0, \ldots, 0)^T$. Then the regression equations can be rewritten in a compact form:

$$\hat{Y} = X\Phi + e,$$

with a linear constraint

$$H\Phi = D,$$

which is the same form as (6.4) and (6.5), but with the dimensions of $\hat{\Phi}_{CLS}$ and $\hat{\Phi}_{LS}$ decreased from $d^2M$ to $vM$.

Before proceeding to the CLS solution, we introduce another amendment. In practical experiments, the kinds of probe states are not always rich enough, and the resource number can be small. These limitations lead to large CLS error and thus unsatisfactory final errors. More specifically, physical estimates $\{\hat{P}_i\}$ always have the eigenvalues of $\hat{P}_i$ between 0 and 1, while bad non-physical estimates usually make some of these eigenvalues far away from the region $[0, 1]$, which indicates $||\hat{\Phi}_{CLS}||$ is too large. To avoid a CLS estimate that deviates seriously from the true value, we enforce a further requirement on the cost function of the linear regression process. Note that the original CLS Problem 6.2.1 is

\[ \min_{\hat{\Phi}} ||\hat{Y} - X \hat{\Phi}||^2, \quad \text{s.t. } H \hat{\Phi} = D. \tag{6.36} \]

We now add an extra penalty term to modify (6.36) as

\[ \min_{\hat{\Phi}} ||\hat{Y} - X \hat{\Phi}||^2 + \eta ||\hat{\Phi}||^2, \quad \text{s.t. } H \hat{\Phi} = D, \tag{6.37} \]

where $\eta > 0$. The new cost function is $\hat{Y}^T \hat{Y} - 2 \hat{Y}^T X \hat{\Phi} + \hat{\Phi}^T (X^T X + \eta I) \hat{\Phi}$. Hence, the new CLS solution is obtained by replacing every $X^T X$ term in (6.6) and (6.7) with $X^T X + \eta I$:

\[ \hat{\Phi}_{LS} = (X^T X + \eta I)^{-1} X^T \hat{Y}, \tag{6.38} \]

and

\[ \hat{\Phi}_{CLS} = \hat{\Phi}_{LS} - (X^T X + \eta I)^{-1} H^T \cdot [H (X^T X + \eta I)^{-1} H^T]^{-1} (H \hat{\Phi}_{LS} - D). \tag{6.39} \]

The modification from (6.36) to (6.37) is in essence Tikhonov regularization [20], and the optimal parameter $\eta$ is usually difficult to determine by a fixed formula. Note that as the total resource number of all the probe states $N_t$ increases, $||\hat{Y} - X \hat{\Phi}||$ usually decreases, and $\eta$ should also decrease. We thus choose $\eta = 10^3/N_t$ for simplicity. From the CLS solution (6.39), we obtain the stage-1 estimate $\{\hat{F}_i\}$ which might not be positive semidefinite but satisfies all the other requirements.
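The regularized solution (6.38)-(6.39) translates directly into a few lines of linear algebra. The sketch below is illustrative only: the function name `regularized_cls`, the toy sizes and the random data are assumptions made for the example, and real-valued $X$ is assumed (the $\theta$ coefficients are real for Hermitian bases).

```python
import numpy as np

def regularized_cls(X, Y, H, D, eta):
    """Return (phi_ls, phi_cls) following the structure of (6.38) and (6.39)."""
    G = X.T @ X + eta * np.eye(X.shape[1])       # X^T X + eta I
    phi_ls = np.linalg.solve(G, X.T @ Y)         # (6.38)
    G_inv_Ht = np.linalg.solve(G, H.T)           # (X^T X + eta I)^{-1} H^T
    multiplier = np.linalg.solve(H @ G_inv_Ht, H @ phi_ls - D)
    phi_cls = phi_ls - G_inv_Ht @ multiplier     # (6.39)
    return phi_ls, phi_cls

# Toy problem (hypothetical sizes): M detectors, L probe states,
# block dimensions (1, 2) so that d = 3 and v = 1 + 4 = 5.
rng = np.random.default_rng(0)
M, L, v, d, N_t = 2, 6, 5, 3, 10**4
X0 = rng.normal(size=(L, v))                     # rows play the role of Theta_j^T
X = np.kron(np.eye(M), X0)                       # X = I_M (x) X_0
H = np.kron(np.ones((1, M)), np.eye(v))          # H = (1,...,1) (x) I_v
D = np.zeros(v)
D[0] = np.sqrt(d)                                # D = (sqrt(d), 0, ..., 0)^T
Y = rng.uniform(size=M * L)                      # stand-in for measured frequencies
eta = 1e3 / N_t                                  # the choice eta = 10^3 / N_t

phi_ls, phi_cls = regularized_cls(X, Y, H, D, eta)
constraint_residual = np.linalg.norm(H @ phi_cls - D)
```

Solving linear systems with `np.linalg.solve` instead of forming explicit inverses is a numerical convenience; it does not change the estimate defined by (6.39).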

Denote $\hat{F}_i = \hat{L}_1^{(i)} \oplus \hat{L}_2^{(i)} \oplus \cdots \oplus \hat{L}_m^{(i)}$. The block diagonal structure of (6.35) implies that the detector is decoupled on the subspaces $\mathbb{C}^{d_1 \times d_1}, \mathbb{C}^{d_2 \times d_2}, ..., \mathbb{C}^{d_m \times d_m}$. We thus can perform the procedures of Sec. 6.2.2.2, 6.2.2.3 and 6.2.2.4 on these subspaces separately. Specifically, for each $1 \le j \le m$, $\{\hat{L}_j^{(i)}\}_{i=1}^{M}$ is a set of Hermitian estimates on the space $\mathbb{C}^{d_j \times d_j}$ satisfying $\sum_{i=1}^{M} \hat{L}_j^{(i)} = I_{d_j}$. We thus employ difference decomposition, stage-2 approximation and unitary optimization in Sec. 6.2.2.2-Sec. 6.2.2.4 on $\{\hat{L}_j^{(i)}\}_{i=1}^{M}$ to obtain a set of physical estimates $\{\hat{Q}_j^{(i)}\}_{i=1}^{M}$ for each $j$. The final estimate of the detector is thus $\hat{P}_i = \hat{Q}_1^{(i)} \oplus \hat{Q}_2^{(i)} \oplus \cdots \oplus \hat{Q}_m^{(i)}$, which is physical and also satisfies the block-diagonal requirement.

Remark 6.6. An error upper bound similar to Theorem 6.1 can also be given for this modified case. However, the upper bound requires that the form (6.36) without the penalty term is employed and also that $N_t$ is large enough. In practical experiments, it is difficult to make $N_t$ arbitrarily large due to noise and imperfections.

6.5.3 Experimental results

We prepare two-mode coherent states for detector tomography by using an adequately attenuated continuous-wave (CW) fiber coupled laser as depicted in the yellow dashed box in Fig. 6.8(a). We express the general two-mode coherent state without global phase as $|\alpha, \beta e^{i\varsigma}\rangle$ ($\varsigma \in \mathbb{R}$, $\alpha, \beta \ge 0$), which can be expanded in the photon number basis as

\[ |\alpha, \beta e^{i\varsigma}\rangle = e^{-\frac{1}{2}(\alpha^2 + \beta^2)} \sum_{j,k=0}^{\infty} \frac{\alpha^j \beta^k e^{ik\varsigma}}{\sqrt{j!\,k!}} |j, k\rangle. \]

We can experimentally generate the above two-mode coherent states by attenuating the laser and rotating a QWP1 and a HWP1 after a PBS1. The probe states we used are the 19 states listed in Tab. 6.1.
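The photon-number expansion above can be evaluated numerically once the Fock space is truncated. The following sketch assumes a truncation at `n_max` photons per mode (a choice made here for illustration, not taken from the text) and uses the hypothetical helper name `two_mode_coherent_amplitudes`.

```python
import numpy as np
from math import factorial

def two_mode_coherent_amplitudes(alpha, beta, varsigma_deg, n_max):
    """Amplitudes <j,k|alpha, beta e^{i varsigma}> for 0 <= j, k <= n_max."""
    varsigma = np.deg2rad(varsigma_deg)
    prefactor = np.exp(-0.5 * (alpha**2 + beta**2))
    amps = np.zeros((n_max + 1, n_max + 1), dtype=complex)
    for j in range(n_max + 1):
        for k in range(n_max + 1):
            amps[j, k] = (prefactor * alpha**j * beta**k * np.exp(1j * k * varsigma)
                          / np.sqrt(factorial(j) * factorial(k)))
    return amps

# One of the probe settings of Tab. 6.1: alpha = beta = 0.316, varsigma = 45 deg.
amps = two_mode_coherent_amplitudes(0.316, 0.316, 45.0, n_max=6)
total_probability = float(np.sum(np.abs(amps)**2))   # close to 1 for weak states
vacuum_probability = float(np.abs(amps[0, 0])**2)    # equals exp(-(alpha^2 + beta^2))
```

For amplitudes as small as those in Tab. 6.1, a low truncation already captures essentially all of the norm, which is consistent with working in a small photon-number subspace.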

We performed experiments for two different sets of detectors, denoted as Group I and Group II, respectively. We take $\eta = 10^3/N_t$ for both groups.

Table 6.1: The coherent probe states for TSE QDT experiment.

α        β        ς [deg]
0.316    0.316    -135
0.316    0.316    -90
0.316    0.316    -45
0.316    0.316    0
0.316    0.316    45
0.316    0.316    90
0.316    0.316    135
0.316    0.316    180
0.447    0        -
0        0.447    -
0.194    0.112    -90
0.194    0.112    0
0.194    0.112    90
0.194    0.112    180
0.112    0.194    -90
0.112    0.194    0
0.112    0.194    90
0.112    0.194    180
0        0        -

For the true value of Group I (experimental setting as Fig. 6.8(b)), $P_1 = L_1^{(1)} \oplus L_2^{(1)} \oplus L_3^{(1)}$, and we have

\[ L_1^{(1)} = 2.91 \times 10^{-4}, \]

\[ L_2^{(1)} = \begin{pmatrix} 0.202 & 0.00109i \\ -0.00109i & 0.202 \end{pmatrix}, \]

and

\[ L_3^{(1)} = \begin{pmatrix} 0.363 & 0.00123i & 1.20 \times 10^{-6} \\ -0.00123i & 0.363 & 0.00123i \\ 1.20 \times 10^{-6} & -0.00123i & 0.363 \end{pmatrix}. \]

For the true value of Group II (experimental setting as Fig. 6.8(c)), we have

\[ L_1^{(1)} = 1.27 \times 10^{-4}, \]

\[ L_2^{(1)} = \begin{pmatrix} 0.0763 & -0.0440 + 0.0879i \\ -0.0440 - 0.0879i & 0.127 \end{pmatrix}, \]

and

\[ L_3^{(1)} = \begin{pmatrix} 0.147 & -0.0574 + 0.115i & 0.00580 + 0.00773i \\ -0.0574 - 0.115i & 0.184 & -0.0543 + 0.109i \\ 0.00580 - 0.00773i & -0.0543 - 0.109i & 0.238 \end{pmatrix}. \]

The error bars are at most 4%, which are derived from the precisions of quantum eﬃciency measurements for each SNSPD.

We record $10^5$ measurement outcomes for each input state, and repeat this 6 times. By truncating the outcome records in the time axis we can obtain data for different resource numbers. We employ our modified algorithm to reconstruct the two sets of detectors, and show the results in Fig. 6.9 and 6.10, respectively. We also plot the reconstruction results using simulated measurement data as a comparison. In Fig. 6.9, the simulation matches the experiment very well. The performance in Fig. 6.10 is not as good as that for Group I, due to the influence of the nondiagonal elements with amplitudes significantly larger than zero.

Figure 6.9: Experimental and simulation results for Group I. (Plot of log10 MSE against log10 N; curves: CLS error and final error, each for simulation and experiment.)

Figure 6.10: Experimental and simulation results for Group II. (Plot of log10 MSE against log10 N; curves: CLS error and final error, each for simulation and experiment.)


6.6 Summary and open problems

In this chapter, we have proposed a novel Two-Stage Estimation (TSE) quantum detector tomography method. We analysed the computational complexity for our algorithm and established an upper bound for the estimation error. We discussed the optimization of the coherent probe states, and presented simulation results to illustrate the performance of our algorithm. Quantum optical experiments were performed and the results validated the eﬀectiveness of our method.

Associated open problems include:

(i) Is it possible to give some theoretical support w.r.t. the simulation work in the former part of Sec. 6.4.3? Namely, to theoretically prove that the optimal size of the sampling square for coherent probe states is usually close to the optimal point of the bound (6.31).

(ii) It is desirable to search for more accurate formulas to predict the optimal sampling square size $c_o$ for any given dimension $d$.

(iii) How to determine the optimal η more accurately for the modiﬁed TSE method?

(iv) We guess that the error upper bound in Theorem 6.1 might not be tight w.r.t. $M$; i.e., the practical tight bound should be in the form of $O(M^k)$ where $k$ is not larger than 1, instead of being 2. This is because when the dimension is fixed, for any physical estimate $\hat{P}_i$, the error $E(||\hat{P}_i - P_i||^2)$ would be bounded by a constant $c$, and thus for $M$ matrices the total error should be bounded by $cM$, which implies that as $M$ increases the error should increase at most linearly instead of quadratically. We leave the better characterization of the error as an open problem.

Chapter 7

Conclusions and outlook

In this thesis we have discussed various topics in quantum tomography (QT) including quantum state tomography (QST), quantum Hamiltonian identifiability, quantum Hamiltonian/gate identification (QHI), and quantum detector tomography (QDT). We have mainly focused on discrete variable cases where the target to be estimated is time-independent. Apart from the system dimension, we assumed no prior knowledge of the system to be estimated, and our aim has mainly been to design novel full tomography algorithms.

We began by estimating the state of a quantum system; i.e., QST. In Ch. 3 we proposed a novel recursively adaptive quantum state tomography (RAQST) protocol. We introduced a recursive linear regression estimation solution to adaptive QST, where the forthcoming measurement bases can be optimized using historical measurement data. Numerical results show that the RAQST protocol can outperform static tomography protocols using mutually unbiased bases and the two-stage mutually unbiased bases adaptive strategy, even with the simplest product measurements. For high purity states, RAQST using only product measurements already has the potential to beat the Gill-Massar bound. When nonlocal measurements are available, RAQST can beat the Gill-Massar bound for a wide range of quantum states with a modest number of copies. We implemented a two-qubit tomography experiment using only the simplest product measurements, and demonstrated that the improvement of adaptive quantum state tomography over nonadaptive tomography is significant for states with a high level of purity.

Then we shifted our focus to characterizing the dynamics of quantum systems. As the generator of the unitary propagator, the Hamiltonian is an important quantity governing the evolution of closed quantum systems. Hence, in Ch. 4, we investigated the problem of quantum Hamiltonian identifiability. We extended the similarity transformation approach (STA) in classical system identification theory to the quantum domain to prove identifiability conclusions for arbitrary dimensional spin-1/2 chain systems assisted by single qubit probes. We further extended the traditional STA method by proposing a structure preserving transformation (SPT) method for non-minimal systems. We used the SPT method to introduce an indicator for the existence of economic quantum Hamiltonian identification algorithms, with computational complexity directly depending on the number of unknown parameters (which could be much smaller than the system dimension). We gave two examples of such economic Hamiltonian identification algorithms.

With the identifiability problem solved, we proceeded to general Hamiltonian/gate identification algorithms in Ch. 5. Within the framework of quantum process tomography, we proposed an effective two-step optimization (TSO) QHI algorithm. In our method, different probe states are input into the quantum systems and the output states are estimated using the QST protocol via linear regression estimation. The time-independent system Hamiltonian is reconstructed based on the reconstruction data for the output states. We established a theoretical error upper bound and also characterized the computational complexity as $O(d^6)$ for a $d$-dimensional system. We then improved this TSO method to a more efficient pure-state-based gate identification (PGI) algorithm. By employing a series of predetermined pure probe states and developing a fast QST protocol specialized for pure states, we reduced the computational complexity from $O(d^6)$ to $O(d^3)$. We established a theoretical error upper bound, and performed quantum optical experiments on a single-qubit Hadamard gate to validate the effectiveness of the PGI method.

Finally we considered the last ingredient, detectors, to complete a quantum measurement. In Ch. 6 we proposed a novel QDT method. In our method, first a series of different probe states are used to generate measurement data. Then, using constrained linear regression estimation, a stage-1 estimate of the detector is obtained. Finally, the positive semidefinite requirement is added to guarantee a physical stage-2 estimate. We analyzed the computational complexity and established an error upper bound for this two-stage estimation (TSE) method. We also investigated the optimization of the coherent probe states. A quantum optical experiment was performed to verify the effectiveness of the TSE method.

Here we further propose some outlook for QT on the basis of our research. Generally speaking, QT is a direction still developing. It is a vigorous interdisciplinary field that absorbs ideas and methods from many subjects like physics, mathematics, engineering, etc. Hence, we believe it will continue thriving in the near future. More specifically, full tomography will sooner or later face the problem of exponential explosion, and thus become intractable for multi-qubit systems. Hence, partial tomography, namely to estimate only partial parameters or information about the unknown system, will be an important solution. It is also desirable to thoroughly utilize the system prior knowledge to circumvent the exponential explosion problem.

An interesting thing about our work on quantum Hamiltonian identifiability via STA is that this extension to the quantum domain in return may supplement classical system identification theory. By proposing the SPT method, we found a way to directly solve the identifiability problem for non-minimal systems, which can also be helpful in classical engineering. This indicates that the quantum domain may provide new problems and new angles, which might be instrumental to the development of classical engineering. We thus anticipate more “feedback” like this from the quantum domain to the classical domain, in the research of QT.

Appendix A

Some common formulas

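Some common properties about the vectorization function are listed as follows [72, 154]:

\[ \mathrm{vec}(|a\rangle\langle b|) = |b\rangle^{*} \otimes |a\rangle, \tag{A.1} \]

\[ \mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X), \tag{A.2} \]

\[ \langle A, B \rangle = \langle \mathrm{vec}(A), \mathrm{vec}(B) \rangle, \tag{A.3} \]

\[ \mathrm{Tr}_A(\mathrm{vec}(X)\mathrm{vec}(Y)^{\dagger}) = X Y^{\dagger}, \tag{A.4} \]

\[ \mathrm{Tr}_B(\mathrm{vec}(X)\mathrm{vec}(Y)^{\dagger}) = (Y^{\dagger} X)^T. \tag{A.5} \]

Two important properties of the Frobenius norm are:

\[ ||A|| = ||UA||, \tag{A.6} \]

\[ ||AB|| \le ||A|| \cdot ||B||, \tag{A.7} \]

where $U$ is any unitary matrix.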
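These identities are easy to sanity-check numerically. The sketch below assumes the column-stacking convention for vec (which is the convention under which (A.1) and (A.2) hold in the stated form) and uses random test matrices chosen only for illustration.

```python
import numpy as np

def vec(X):
    """Column-stacking vectorization."""
    return X.reshape(-1, order="F")

rng = np.random.default_rng(1)
n = 3
A, X, B = (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)) for _ in range(3))
a = rng.normal(size=n) + 1j * rng.normal(size=n)
b = rng.normal(size=n) + 1j * rng.normal(size=n)

# (A.1): vec(|a><b|) = |b>^* (x) |a>
a1_ok = np.allclose(vec(np.outer(a, b.conj())), np.kron(b.conj(), a))

# (A.2): vec(AXB) = (B^T (x) A) vec(X)
a2_ok = np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X))

# (A.3): <A, B> = Tr(A^dagger B) = <vec(A), vec(B)>
a3_ok = np.isclose(np.trace(A.conj().T @ B), np.vdot(vec(A), vec(B)))

# (A.6): the Frobenius norm is invariant under a unitary factor U
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
a6_ok = np.isclose(np.linalg.norm(A), np.linalg.norm(U @ A))

# (A.7): submultiplicativity of the Frobenius norm
a7_ok = np.linalg.norm(A @ B) <= np.linalg.norm(A) * np.linalg.norm(B) + 1e-12
```

With a row-stacking convention, the Kronecker factors in (A.1) and (A.2) would appear in the opposite order, so fixing the convention first avoids sign and ordering mistakes later.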

Appendix B

Iterative algorithm for product projector optimization

We illustrate our algorithm in the case of two-qubit tomography. We first fix the basis set $\{\Omega_i\}_{i=1}^{4}$ for one qubit. Denote

\[ \phi_{i,(k)}^{(j)} \triangleq \mathrm{Tr}[P_j (\underbrace{\Omega_1 \otimes \cdots \otimes \Omega_1}_{k-1} \otimes\, \Omega_i \otimes \Omega_1 \otimes \cdots \otimes \Omega_1)] \]

the parametrization coefficient of the $k$-th qubit measurement component of the product projector's $j$-th matrix in the $i$-th single-qubit basis matrix $\Omega_i$. Then we can parameterize the single-qubit projector measurement on the $j$-th qubit as

\[ \Phi_{i,(j)} \triangleq (\phi_{1,(j)}^{i}, \phi_{2,(j)}^{i}, \phi_{3,(j)}^{i}, \phi_{4,(j)}^{i})^T, \quad i, j = 1, 2. \]

Here, $i$ denotes the $i$-th POVM matrix and the projector corresponding to $\Phi_{1,(j)}$ is orthogonal to that corresponding to $\Phi_{2,(j)}$.

For a two-qubit system, we take the tensor product of $\{\Omega_i\}_{i=1}^{4}$ as the basis set. Then any two-qubit product projector can be parameterized as

\[ \Big( \sum_{k=1}^{4} \phi_{k,(1)}^{i} \Omega_k \Big) \otimes \Big( \sum_{l=1}^{4} \phi_{l,(2)}^{j} \Omega_l \Big) = \sum_{k,l=1}^{4} (\Phi_{i,(1)} \otimes \Phi_{j,(2)})_{4(k-1)+l}\, \Omega_k \otimes \Omega_l. \]

Moreover, we can parameterize the quantum state as

\[ \rho = \sum_{k,j=1}^{4} \theta_{4(k-1)+j}\, \Omega_k \otimes \Omega_j. \tag{B.1} \]

Denote $\Theta = (\theta_1, \theta_2, ..., \theta_{16})^T$ and $P = \mathrm{vec}^{-1}(\Theta)$. Using (A.2), we have

\[ \tilde{p}_{i,j} = (\Phi_{i,(1)} \otimes \Phi_{j,(2)})^T \Theta = (\Phi_{i,(1)}^T \otimes \Phi_{j,(2)}^T)\,\mathrm{vec}(P) = \mathrm{vec}(\Phi_{j,(2)}^T P\, \Phi_{i,(1)}) = \Phi_{j,(2)}^T P\, \Phi_{i,(1)}. \tag{B.2} \]

Since we take $\Omega_1 = I/\sqrt{2}$, we have $\phi_{1,(1)}^{i} = \phi_{1,(2)}^{i} = 1/\sqrt{2}$ and $P_{11} = 1/2$. We then partition $\Phi_{i,(1)} = (\frac{1}{\sqrt{2}}, x^T)^T$, $\Phi_{j,(2)} = (\frac{1}{\sqrt{2}}, y^T)^T$ and

\[ P = \begin{pmatrix} 1/2 & P_a^T \\ P_b & P_D \end{pmatrix}. \]

Now we have

\[ \tilde{p}_{i,j} = \frac{1}{4} + \frac{1}{\sqrt{2}} y^T P_b + \frac{1}{\sqrt{2}} P_a^T x + y^T P_D x. \]

To minimize $\tilde{p}_{i,j}$ under the constraints $\Phi_{i,(1)}^T \Phi_{i,(1)} = \Phi_{j,(2)}^T \Phi_{j,(2)} = 1$, we introduce Lagrange multipliers $\lambda_L^{\alpha}$, $\lambda_L^{\beta}$, and write

\[ L = \frac{1}{4} + \frac{1}{\sqrt{2}} y^T P_b + \frac{1}{\sqrt{2}} P_a^T x + y^T P_D x + \lambda_L^{\alpha}(x^T x - 0.5) + \lambda_L^{\beta}(y^T y - 0.5). \]

At extreme points we should have

\[ \begin{cases} \dfrac{\partial L}{\partial x} = \dfrac{1}{\sqrt{2}} P_a + P_D^T y + 2\lambda_L^{\alpha} x = 0, \\[2mm] \dfrac{\partial L}{\partial y} = \dfrac{1}{\sqrt{2}} P_b + P_D x + 2\lambda_L^{\beta} y = 0. \end{cases} \tag{B.3} \]

From (B.3) we can design an iterative algorithm:

i) choose $x_0 = y_0 = 0$;

ii) in step $k$, let $x_k = -\frac{1}{2|\lambda_L^{\alpha}|}(\frac{1}{\sqrt{2}} P_a + P_D^T y_{k-1})$ and $y_k = -\frac{1}{2|\lambda_L^{\beta}|}(\frac{1}{\sqrt{2}} P_b + P_D x_{k-1})$, where $\lambda_L^{\alpha}$ and $\lambda_L^{\beta}$ are chosen such that $x_k^T x_k = y_k^T y_k = 0.5$;

iii) replace $k \leftarrow k + 1$ and repeat (ii) until $|L_k - L_{k-1}|$ is small enough.

The convergence of our algorithm is straightforward to prove. Note that it can be verified that $L_k \le L_{k-1}$ during iterations. Moreover, since $\tilde{p}_{i,j} \ge 0$, we have $L \ge 0$. Therefore, the sequence $\{L_k\}$ indeed has a limit and the convergence is guaranteed accordingly. It is worth noting that $\rho$ is actually unknown. Hence, we should take its current estimate instead of $\rho$ in the parametrization (B.1).
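The iteration in steps i)-iii) can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code used for the thesis results: the function name, the stopping tolerance, the iteration cap, and the rule of keeping the previous vector when the update direction vanishes are all choices made here. The demonstration assumes the single-qubit basis $\{I, \sigma_x, \sigma_y, \sigma_z\}/\sqrt{2}$ and the state $|00\rangle\langle 00|$.

```python
import numpy as np

def minimize_product_projector(P, tol=1e-12, max_iter=500):
    """Iterate x_k, y_k as in steps i)-iii) for a real 4x4 matrix P = vec^{-1}(Theta)."""
    P = np.asarray(P, dtype=float)
    Pa, Pb, PD = P[0, 1:], P[1:, 0], P[1:, 1:]   # P_a, P_b, P_D of the partition

    def cost(x, y):
        return 0.25 + (y @ Pb + Pa @ x) / np.sqrt(2) + y @ PD @ x

    def rescale(direction, previous):
        # Choosing |lambda| so that the squared norm is 0.5 amounts to a rescaling.
        norm = np.linalg.norm(direction)
        if norm < 1e-12:          # assumption: keep the old vector if the update vanishes
            return previous
        return -direction / (np.sqrt(2) * norm)

    x, y = np.zeros(3), np.zeros(3)               # step i)
    prev = cost(x, y)
    for _ in range(max_iter):
        # step ii): both updates use the vectors from step k-1
        x, y = (rescale(Pa / np.sqrt(2) + PD.T @ y, x),
                rescale(Pb / np.sqrt(2) + PD @ x, y))
        cur = cost(x, y)
        if abs(cur - prev) < tol:                 # step iii)
            break
        prev = cur
    return x, y, cur

# P for rho = |00><00| in the assumed Pauli basis: only the (I, Z) entries are non-zero.
P_demo = np.zeros((4, 4))
P_demo[0, 0] = P_demo[0, 3] = P_demo[3, 0] = P_demo[3, 3] = 0.5
x_opt, y_opt, p_min = minimize_product_projector(P_demo)
```

For this state the minimizing product projector is orthogonal to $|0\rangle$ on each qubit, so the minimized probability is zero and both vectors point along the negative $z$ direction.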

Appendix C

Gill-Massar bound for inﬁdelity in two-qubit state tomography

According to (A.8) in [170], the infidelity between two states $\rho$ and $\hat{\rho}$ is related to the squared Bures distance $D_B^2$ by

\[ 1 - F(\rho, \hat{\rho}) = D_B^2 - D_B^4/4. \tag{C.1} \]

Thus, the infidelity and the squared Bures distance share the same Gill-Massar bound in the first order approximation. Since the Gill-Massar bound for the mean squared Bures distance is $\frac{1}{4}(d + 1)^2 (d - 1)\frac{1}{N_t}$ in a $d$-dimensional quantum state tomography with a total number of copies $N_t$ ((5.29) in [170]), one can derive the Gill-Massar bound $\frac{75}{4 N_t}$ for the infidelity in two-qubit state tomography.
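For completeness, the two-qubit value follows by substituting $d = 4$ into the bound for the mean squared Bures distance and keeping only the first-order term of (C.1); the worked step below is a sketch of that arithmetic.

```latex
\[
  1 - F(\rho,\hat{\rho}) \approx D_B^2
  \quad\Longrightarrow\quad
  \frac{1}{4}(d+1)^2(d-1)\frac{1}{N_t}\bigg|_{d=4}
  = \frac{1}{4}\cdot 25 \cdot 3 \cdot \frac{1}{N_t}
  = \frac{75}{4N_t}.
\]
```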

Appendix D

Proof of Lemma 4.1

Proof. By induction we have

\[ A^k B = \Big[ \big( \underbrace{*, ..., *}_{k}, (-1)^k \prod_{i=1}^{k} \vartheta_i, 0, ..., 0 \big)^T \Big]_{(N_H+1) \times 1} \]

for $1 \le k \le N_H$ where $*$ are polynomials in $\vartheta_i$ and the last $N_H - k$ elements are zero. Therefore, CM is an upper triangular matrix and its determinant is

\[ \det(\mathrm{CM}) = \prod_{k=1}^{N_H} (-1)^k \prod_{i=1}^{k} \vartheta_i, \]

which is non-zero for almost any value of $\vartheta$. Hence, CM is almost always full-rank.

Appendix E

Proof of Lemma 4.2

Proof. We consider $\det(\bar{A} - \lambda_0 I)$, which must equal one of the following three possibilities:

(a) A non-trivial polynomial in the $\vartheta_i$ ($i = 1, 2, ..., N_H$);

(b) A non-zero constant;

(c) The constant zero.

We let $\vartheta_2 = \vartheta_4 = ... = \vartheta_{N_H - 1} = 0$ and $\vartheta_1 = \vartheta_3 = ... = \vartheta_{N_H} = \lambda_0 + 1$. Then from (4.24) and (4.25) we know $\det(\bar{A} - \lambda_0 I) = \det(I) = 1$. Therefore, (c) is excluded. No matter which of (a) and (b) is valid, $\det(\bar{A} - \lambda_0 I) \ne 0$ for almost any value of $\vartheta$, which implies that it is atypical to assume $\lambda_0 \in \Lambda(\bar{A})$.

Appendix F

Proof of Lemma 4.3

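Proof. Since $\bar{A} = P E P^T = \sum_{i=1}^{N_H} E_{ii} P_{\sigma i} (P^T)_{i\sigma}$, we have

\[ \vartheta_1 = I_{1\sigma} \bar{A} I_{\sigma 1} = \sum_{i=1}^{N_H} E_{ii} I_{1\sigma} P_{\sigma i} (P^T)_{i\sigma} I_{\sigma 1} = \sum_{i=1}^{N_H} E_{ii} P_{1i}^2. \tag{F.1} \]

Since $\sum_{i=1}^{N_H} P_{1i}^2 = 1$, $P_{1\sigma}$ cannot be all zero. Suppose there are $m$ non-zero elements in $P_{1\sigma}$ where $1 \le m \le N_H$. If $m = 1$, we suppose it is $P_{1t} \ne 0$. Then $P_{1t} = \pm 1$ and $P_{1i} = 0$ for every $i \ne t$. Since $\sum_{j=1}^{N_H} P_{jt}^2 = 1$, $P_{jt} = 0$ for every $j \ne 1$. We calculate

\[ -\vartheta_2 = I_{1\sigma} \bar{A} I_{\sigma 2} = \sum_{i=1}^{N_H} E_{ii} I_{1\sigma} P_{\sigma i} (P^T)_{i\sigma} I_{\sigma 2} = \sum_{i=1}^{N_H} E_{ii} P_{1i} P_{2i} = E_{tt} P_{1t} P_{2t} = 0, \]

which is atypical and can be ignored. Hence, it is almost always true that $m \ge 2$.

We assume that $P_{1 i_j} \ne 0$ for $i_j = i_1, i_2, ..., i_m$ and otherwise $P_{1i} = 0$.

We prove the conclusion of Lemma 4.3 by contradiction. Suppose for every $1 \le k \le N_H$, $|\vartheta_1| = |(P E I_k P^T)_{11}|$. Since

\[ \begin{aligned} (P E I_k P^T)_{11} &= I_{1\sigma} [P E P^T - P E (I - I_k) P^T] I_{\sigma 1} \\ &= I_{1\sigma} \Big[ \sum_{i=1}^{N_H} E_{ii} P_{\sigma i} (P^T)_{i\sigma} - 2 E_{kk} P_{\sigma k} (P^T)_{k\sigma} \Big] I_{\sigma 1} \\ &= \sum_{i=1}^{N_H} E_{ii} P_{1i}^2 - 2 E_{kk} P_{1k}^2 \\ &= \vartheta_1 - 2 E_{kk} P_{1k}^2, \end{aligned} \]

we always have

\[ |\vartheta_1| = |\vartheta_1 - 2 E_{kk} P_{1k}^2|. \tag{F.2} \]

We let $k = i_1$ in (F.2). From Lemma 4.2, we have $E_{i_1 i_1} \ne 0$. Since $P_{1 i_1} \ne 0$, we take the square of both sides of (F.2) and obtain $\vartheta_1 = E_{i_1 i_1} P_{1 i_1}^2$. For the same reason, we have $\vartheta_1 = E_{i_2 i_2} P_{1 i_2}^2 = ... = E_{i_m i_m} P_{1 i_m}^2$. Then (F.1) implies $\vartheta_1 = m E_{i_1 i_1} P_{1 i_1}^2$, which means $E_{i_1 i_1} P_{1 i_1}^2 = 0$ and implies a contradiction.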

Appendix G

Proof of Lemma 4.4

Proof. The controllability matrix is

\[ \mathrm{CM} = \begin{pmatrix} \bar{B} & 0 & -\bar{A}^2 \bar{B} & 0 & \cdots & 0 \\ 0 & -\bar{A}\bar{B} & 0 & \bar{A}^3 \bar{B} & \cdots & -\bar{A}(-\bar{A}^2)^{\frac{N_H - 1}{2}} \bar{B} \end{pmatrix}. \]

From Lemma 4.2 we know $\bar{A}$ is almost always nonsingular. Hence, it suffices to prove that $Q = (\bar{B}, \bar{A}^2 \bar{B}, ..., \bar{A}^{N_H - 1} \bar{B})$ is almost always nonsingular. Similar to the analysis in Appendix E, $\det(Q)$ has only three possibilities, where the possibility of $\det(Q) \equiv 0$ needs to be excluded. Hence, it suffices to find a special $\bar{A}$ such that $\det(Q) \ne 0$.

We take

\[ \bar{A} = \begin{pmatrix} 0 & 1 & & & \\ 1 & 0 & 1 & & \\ & 1 & \ddots & \ddots & \\ & & \ddots & 0 & 1 \\ & & & 1 & 1 \end{pmatrix}. \]

Then

\[ \bar{A}^2 = \begin{pmatrix} 1 & 0 & 1 & & & \\ 0 & 2 & 0 & 1 & & \\ 1 & 0 & 2 & 0 & \ddots & \\ & 1 & \ddots & \ddots & 0 & 1 \\ & & \ddots & 0 & 2 & 1 \\ & & & 1 & 1 & 2 \end{pmatrix}. \]

We can view $Q$ as the controllability matrix of another system $(\bar{A}^2, \bar{B})$, which should be controllable. Since controllability is unchanged under similarity transformation, we transform $\bar{A}^2$ into

\[ \tilde{A} = \begin{pmatrix} 1 & 1 & & & \\ 1 & 2 & 1 & & \\ & 1 & \ddots & \ddots & \\ & & \ddots & 2 & 1 \\ & & & 1 & 2 \end{pmatrix}. \tag{G.1} \]

This similarity transformation works in the following steps: (i) We take all the odd rows of $\bar{A}^2$ in ascending order. (ii) Following (i), we take all the even rows of $\bar{A}^2$ in descending order. (iii) We repeat (i) and (ii) on the columns of $\bar{A}^2$. After steps (i) and (ii), each 2 (except the 2 in the last row) will have a 1 just above it and a 1 just below it, and this property does not change in step (iii). Also, the transformation is symmetric. Hence, $\tilde{A}$ is symmetric with all the 2s on the diagonal line. $\tilde{A}$ thus has the form of (G.1). Under this transformation, $\tilde{B} = \bar{B}$ is unchanged.

For system $(\tilde{A}, \tilde{B})$, it can be proven by induction that the controllability matrix $\tilde{Q}$ is an upper triangular matrix with all the diagonal elements 1. Therefore $\det(\tilde{Q}) \ne 0$, and thus $\det(Q) \ne 0$ and the possibility (c) is excluded. Hence, CM is almost always full-rank.
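The row and column reordering in steps (i)-(iii) is a permutation similarity, which can be checked numerically for small sizes. The sketch below is only an illustration: the helper names are hypothetical, and `n` stands for the dimension of the special matrix $\bar{A}$ chosen above.

```python
import numpy as np

def special_A_bar(n):
    """Tridiagonal matrix with unit off-diagonals, zero diagonal, last entry 1."""
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    A[-1, -1] = 1.0
    return A

def odd_even_permutation(n):
    """Odd rows (1-based) in ascending order, then even rows in descending order."""
    odds = list(range(0, n, 2))          # 0-based indices of rows 1, 3, 5, ...
    evens = list(range(1, n, 2))[::-1]   # 0-based indices of rows ..., 6, 4, 2
    return odds + evens

def transformed_square(n):
    """Apply the same reordering to rows and columns of A_bar^2."""
    A = special_A_bar(n)
    A2 = A @ A
    perm = odd_even_permutation(n)
    return A2[np.ix_(perm, perm)]

def expected_tridiagonal(n):
    """Symmetric tridiagonal matrix with diagonal (1, 2, ..., 2) and unit off-diagonals."""
    T = np.diag([1.0] + [2.0] * (n - 1))
    return T + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)

A_tilde_5 = transformed_square(5)
matches_5 = np.array_equal(A_tilde_5, expected_tridiagonal(5))
```

Since the first row and column stay in place under this permutation, a vector supported only on the first coordinate is left unchanged by the transformation.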

For the observability matrix,

\[ \mathrm{OM} = \begin{pmatrix} 0 & \bar{C} \\ -\bar{C}\bar{A} & 0 \\ 0 & -\bar{C}\bar{A}^2 \\ \cdots & \cdots \\ -\bar{C}\bar{A}(-\bar{A}^2)^{\frac{N_H - 1}{2}} & 0 \end{pmatrix}. \]

Hence, it suffices to prove that $P = (\bar{C}^T, \bar{A}^{2T} \bar{C}^T, ..., \bar{A}^{(N_H - 1)T} \bar{C}^T)^T$ is almost always nonsingular. Since $\bar{A}$ is symmetric and $\bar{C}^T = \bar{B}$, we know $P = Q^T$. Therefore, OM is also almost always full-rank.

Appendix H

Proof of Lemma 4.5

Proof. First, we investigate the relationship between $\Lambda(\bar{A})$ and $\Lambda(\bar{A}')$. Since $A$ is similar to $A'$, we know $A^2$ is similar to $A'^2$, which implies $\Lambda(A^2) = \Lambda(A'^2)$. Therefore,

\[ \Lambda \begin{pmatrix} -\bar{A}^2 & 0 \\ 0 & -\bar{A}^2 \end{pmatrix} = \Lambda \begin{pmatrix} -\bar{A}'^2 & 0 \\ 0 & -\bar{A}'^2 \end{pmatrix}. \]

Denote $\lambda_i(A)$ the $i$-th eigenvalue of $A$. If we arrange the eigenvalues of $\bar{A}$ and $\bar{A}'$ both in ascending sequences, we have

\[ \lambda_i(\bar{A}') = h_i \lambda_i(\bar{A}) \tag{H.1} \]

for $1 \le i \le \frac{N_H + 1}{2}$ where $h_i = \pm 1$.

Second, we point out that it is atypical for $\bar{A}$ to have multiple eigenvalues. We consider $\det(\lambda I - \bar{A})$, which is a polynomial in $\lambda$ with the coefficients being polynomials in the $\vartheta_i$. $\det(\lambda I - \bar{A})$ has multiple roots if and only if its discriminant, which is a polynomial function in the coefficients of $\det(\lambda I - \bar{A})$, equals zero [56]. We can view this discriminant as a polynomial function in the $\vartheta_i$. If this discriminant is in fact the constant zero, then $\det(\lambda I - \bar{A})$ will always have multiple roots, which can be excluded by taking $\bar{A} = \mathrm{diag}(1, 2, ..., \frac{N_H + 1}{2})$. Therefore, the discriminant does not degenerate to zero, and its solution set is of zero measure. Hence, the set of $\vartheta$ that can make $\det(\lambda I - \bar{A})$ have multiple roots is of zero measure, which implies that it is atypical when $\bar{A}$ has multiple eigenvalues.

Third, we prove that we can almost always have $\lambda_i(\bar{A}') + \lambda_j(\bar{A}') \ne 0$ for any $1 \le i, j \le \frac{N_H + 1}{2}$. Using (H.1) we have

\[ \lambda_i(\bar{A}') + \lambda_j(\bar{A}') = h_i \lambda_i(\bar{A}) + h_j \lambda_j(\bar{A}). \tag{H.2} \]

If $i = j$, then the RHS of (H.2) is $2 h_i \lambda_i(\bar{A})$, which is almost always non-zero according to Lemma 4.2. If $i \ne j$, the RHS of (H.2) is $h_i[\lambda_i(\bar{A}) \pm \lambda_j(\bar{A})]$, which is also almost always non-zero because of (4.41) and the fact that $\bar{A}$ almost always has no multiple eigenvalues. Therefore, we can almost always have $\lambda_i(\bar{A}') + \lambda_j(\bar{A}') \ne 0$ for any $1 \le i, j \le \frac{N_H + 1}{2}$, which is equivalent to the statement that $\bar{A}' \otimes I_{\frac{N_H + 1}{2}} + I_{\frac{N_H + 1}{2}} \otimes \bar{A}'$ is almost always nonsingular.

Appendix I

Proof of Proposition 5.1

Proof. When $\{F_i\}$ is chosen as $\{|j\rangle\langle k|\}_{1 \le j, k \le d}$, expand (5.5) as

Therefore, we must have

\[ \sum_{s=1}^{d} x_{(s,t)(s,v)} = \delta_{tv} = \sum_{s=1}^{d} x_{((s-1)d+t)((s-1)d+v)} \]

for $t, v = 1, 2, ..., d$, which is just $\mathrm{Tr}_A X = I_d$.

Appendix J

Proof of Theorem 5.1

Proof. Using equation (A.2), we vectorize equation (5.7) to obtain

\[ (F_k^{*} \otimes F_j)\,\mathrm{vec}(\rho_m) = \sum_n \beta_{mn}^{jk}\,\mathrm{vec}(\rho_n). \tag{J.1} \]

Let $\{W(j, k)\}_{j,k=1}^{d^2}$ be a family of $d^4$ matrices. Each matrix $W(j, k)$ is $d^2 \times d^2$ and its element in position $(m, n)$ is the number $\beta_{nm}^{jk}$. Let $V = (\mathrm{vec}(\rho_1), \mathrm{vec}(\rho_2), ..., \mathrm{vec}(\rho_{d^2}))$. From equation (J.1) we have

\[ (F_k^{*} \otimes F_j) V = V W(j, k). \tag{J.2} \]

Since $\{\rho_m\}_{m=1}^{d^2}$ is a set of linearly independent matrices forming a basis of the space $\mathbb{C}^{d \times d}$, $V$ must be invertible. Therefore we know

\[ W(j, k) = V^{-1} (F_k^{*} \otimes F_j) V. \tag{J.3} \]

Sufficiency:

Since $\{F_i\}_{i=1}^{d^2}$ is a set of linearly independent matrices forming a basis of the space $\mathbb{C}^{d \times d}$, we know $\{F_k^{*} \otimes F_j\}_{j,k=1}^{d^2}$ is a set of linearly independent matrices forming a basis of the space $\mathbb{C}^{d^2 \times d^2}$. Therefore, $\{W(j, k)\}_{j,k=1}^{d^2}$ is also a set of linearly independent matrices forming a basis of $\mathbb{C}^{d^2 \times d^2}$. We then know $\{\mathrm{vec}(W(j, k)^T)\}_{j,k=1}^{d^2}$ is a set of linearly independent column vectors forming a basis of the space $\mathbb{C}^{d^4 \times 1}$, which leads to the conclusion that

\[ B = [\mathrm{vec}(W(1, 1)^T), \mathrm{vec}(W(1, 2)^T), ..., \mathrm{vec}(W(2, 1)^T), \mathrm{vec}(W(2, 2)^T), ..., \mathrm{vec}(W(d^2, d^2)^T)] \tag{J.4} \]

must be invertible.

Necessity:

When $B$ is invertible, from equation (J.4) we know that $\{\mathrm{vec}(W(j, k)^T)\}_{j,k=1}^{d^2}$ is a set of linearly independent column vectors forming a basis of the space $\mathbb{C}^{d^4 \times 1}$. Therefore $\{W(j, k)\}_{j,k=1}^{d^2}$ is a linearly independent basis of $\mathbb{C}^{d^2 \times d^2}$, and from equation (J.3) $\{F_k^{*} \otimes F_j\}_{j,k=1}^{d^2}$ is also a linearly independent basis of $\mathbb{C}^{d^2 \times d^2}$.

Now suppose that $\{F_i\}_{i=1}^{d^2}$ is not linearly independent. Then from equation (J.3), one can easily prove $\{F_k^{*} \otimes F_j\}_{j,k=1}^{d^2}$ is not linearly independent, which leads to a contradiction. Hence, we prove the necessity.

Appendix K

Proof of Theorem 5.2

Proof. We follow the notations in the Proof of Theorem 5.1. Since $\{\rho_m\}_{m=1}^{d^2}$ is an orthonormal basis of the space $\mathbb{C}^{d \times d}$, we know $V$ is unitary. Therefore, we know

\[ W(j, k) = V^{\dagger} (F_k^{*} \otimes F_j) V. \tag{K.1} \]

Sufficiency: Since $\{F_i\}_{i=1}^{d^2}$ is an orthonormal basis of the space $\mathbb{C}^{d \times d}$, we have

\[ \delta_{(p,q)(k,j)} = \delta_{pk}\delta_{qj} = \langle F_p^{*}, F_k^{*} \rangle \langle F_q, F_j \rangle = \langle F_p^{*} \otimes F_q, F_k^{*} \otimes F_j \rangle, \]

which means $\{F_k^{*} \otimes F_j\}_{j,k=1}^{d^2}$ is an orthonormal basis of the space $\mathbb{C}^{d^2 \times d^2}$. Therefore, from (K.1), we know that $\{W(j, k)\}_{j,k=1}^{d^2}$ is also an orthonormal basis of the space $\mathbb{C}^{d^2 \times d^2}$. Hence, $B$ must be unitary.

Necessity: Since $B$ is unitary, from (J.4) we know $\{W(j, k)\}_{j,k=1}^{d^2}$ is an orthonormal basis of $\mathbb{C}^{d^2 \times d^2}$. According to (K.1), we know $\{F_k^{*} \otimes F_j\}_{j,k=1}^{d^2}$ is also an orthonormal basis of $\mathbb{C}^{d^2 \times d^2}$. Hence, we have

\[ \langle F_p^{*} \otimes F_q, F_k^{*} \otimes F_j \rangle = \delta_{(p,q)(k,j)} = \delta_{pk}\delta_{qj} = \langle F_p^{*}, F_k^{*} \rangle \langle F_q, F_j \rangle. \tag{K.2} \]

Now we concentrate on the third equality in (K.2). Setting $p = k = q = j$, we obtain $1 = |\langle F_j, F_j \rangle|^2$. Since $\langle F_j, F_j \rangle = \mathrm{Tr}(F_j^{\dagger} F_j)$ is a positive real number, we must have $\langle F_j, F_j \rangle = 1$ for every $j = 1, 2, ..., d^2$. Setting $p = k$, we obtain $\delta_{qj} = \langle F_q, F_j \rangle$, which means that $\{F_i\}_{i=1}^{d^2}$ is an orthonormal basis of the space $\mathbb{C}^{d \times d}$.

Appendix L

A suﬃcient condition for Assumption 5.1

Let $||\cdot||_x$ be any submultiplicative matrix norm (i.e., $||\cdot||_x$ satisfies (A.7)). Suppose we know a priori that $||H||_x$ is upper bounded by a known value $h_m$. Then we can set the evolution time $t < \frac{\pi}{2 h_m}$.

The proof is straightforward. Theorem 1 in Chapter 10.3 of [88] states that the absolute value of any eigenvalue of a matrix is no larger than any submultiplicative norm of the matrix. Hence, the prior knowledge that $||H||_x$ is upper bounded by $h_m$ is a sufficient condition for Assumption 5.1 to be satisfied.

Appendix M

Proof of Lemma 5.2

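Proof.

\[ \begin{aligned} ||e^{i\theta} b - c||^2 &= (e^{-i\theta} b^{\dagger} - c^{\dagger})(e^{i\theta} b - c) \\ &= b^{\dagger} b + c^{\dagger} c - (e^{i\theta} c^{\dagger} b + e^{-i\theta} b^{\dagger} c). \end{aligned} \tag{M.1} \]

Let $b^{\dagger} c = r e^{i\phi}$, where $r \ge 0$, $\phi \in \mathbb{R}$. Then

\[ \begin{aligned} ||e^{i\theta} b - c||^2 &= b^{\dagger} b + c^{\dagger} c - (r e^{i\theta} e^{-i\phi} + r e^{-i\theta} e^{i\phi}) \\ &= b^{\dagger} b + c^{\dagger} c - 2r\cos(\theta - \phi). \end{aligned} \tag{M.2} \]

Therefore, we should take $\theta = \phi$ to obtain

\[ \min_{\theta} ||e^{i\theta} b - c||^2 = b^{\dagger} b + c^{\dagger} c - 2r. \tag{M.3} \]

We have

\[ \begin{aligned} ||b b^{\dagger} - c c^{\dagger}||^2 &= \mathrm{Tr}(b b^{\dagger} b b^{\dagger} + c c^{\dagger} c c^{\dagger} - 2 b b^{\dagger} c c^{\dagger}) \\ &= (b^{\dagger} b)^2 + (c^{\dagger} c)^2 - 2r^2. \end{aligned} \tag{M.4} \]

From the Cauchy-Schwarz inequality,

\[ r = |\langle b, c \rangle| \le ||b|| \cdot ||c|| = \sqrt{b^{\dagger} b}\sqrt{c^{\dagger} c}. \]

We thus have

\[ \begin{aligned} \frac{b^{\dagger} b + c^{\dagger} c}{2} \min_{\theta} ||e^{i\theta} b - c||^2 &= \frac{1}{2}(b^{\dagger} b + c^{\dagger} c)^2 - r(b^{\dagger} b + c^{\dagger} c) \\ &\le (b^{\dagger} b)^2 + (c^{\dagger} c)^2 - r(b^{\dagger} b + c^{\dagger} c) \\ &\le (b^{\dagger} b)^2 + (c^{\dagger} c)^2 - 2r\sqrt{b^{\dagger} b\, c^{\dagger} c} \\ &\le (b^{\dagger} b)^2 + (c^{\dagger} c)^2 - 2r^2 \\ &= ||b b^{\dagger} - c c^{\dagger}||^2. \end{aligned} \tag{M.5} \]

On the other hand,

\[ \begin{aligned} (\sqrt{b^{\dagger} b} + \sqrt{c^{\dagger} c})^2 \min_{\theta} ||e^{i\theta} b - c||^2 &= (b^{\dagger} b + c^{\dagger} c + 2\sqrt{b^{\dagger} b}\sqrt{c^{\dagger} c})(b^{\dagger} b + c^{\dagger} c - 2r) \\ &= (b^{\dagger} b + c^{\dagger} c)^2 - 4r\sqrt{b^{\dagger} b}\sqrt{c^{\dagger} c} + 2(b^{\dagger} b + c^{\dagger} c)(\sqrt{b^{\dagger} b}\sqrt{c^{\dagger} c} - r) \\ &\ge (b^{\dagger} b)^2 + (c^{\dagger} c)^2 + 2 b^{\dagger} b\, c^{\dagger} c - 4r\sqrt{b^{\dagger} b}\sqrt{c^{\dagger} c} \\ &= (b^{\dagger} b)^2 + (c^{\dagger} c)^2 + 2(\sqrt{b^{\dagger} b}\sqrt{c^{\dagger} c} - r)^2 - 2r^2 \\ &\ge (b^{\dagger} b)^2 + (c^{\dagger} c)^2 - 2r^2 \\ &= ||b b^{\dagger} - c c^{\dagger}||^2. \end{aligned} \tag{M.6} \]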
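The two chains of inequalities (M.5) and (M.6) sandwich $||bb^{\dagger} - cc^{\dagger}||^2$ between two multiples of the phase-optimized distance, and this can be spot-checked numerically. The sketch below uses random complex vectors and a brute-force grid over $\theta$; the function name and the grid resolution are choices made here for illustration.

```python
import numpy as np

def lemma_quantities(b, c):
    """Return the closed-form minimum (M.3), the projector distance (M.4), and both bounds."""
    r = abs(np.vdot(b, c))                                   # r = |b^dagger c|
    nb, nc = np.vdot(b, b).real, np.vdot(c, c).real          # b^dagger b, c^dagger c
    min_dist = nb + nc - 2 * r                               # (M.3)
    proj_dist = np.linalg.norm(np.outer(b, b.conj()) - np.outer(c, c.conj()))**2
    lower = 0.5 * (nb + nc) * min_dist                       # left-hand side of (M.5)
    upper = (np.sqrt(nb) + np.sqrt(nc))**2 * min_dist        # left-hand side of (M.6)
    return min_dist, proj_dist, lower, upper

rng = np.random.default_rng(2)
b = rng.normal(size=3) + 1j * rng.normal(size=3)
c = rng.normal(size=3) + 1j * rng.normal(size=3)
min_dist, proj_dist, lower, upper = lemma_quantities(b, c)

# Brute-force check of (M.3) on a grid of phases.
thetas = np.linspace(0.0, 2 * np.pi, 20001)
grid_min = min(np.linalg.norm(np.exp(1j * t) * b - c)**2 for t in thetas)
```

The grid minimum can only overestimate the true minimum, so it should sit slightly above the closed-form value from (M.3), while the projector distance should fall between `lower` and `upper`.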

Appendix N

A basis set example for the space $\mathcal{B}(d, m, \{d_j\})$

We start from the structure property of each $P_i$. There are altogether $p \triangleq \sum_{j=1}^{m} d_j(d_j - 1)/2$ non-diagonal non-zero variables in the upper triangular part of each $P_i$, and we denote them as $(P_i)_{r_1 c_1}, (P_i)_{r_2 c_2}, ..., (P_i)_{r_p c_p}$ where $r_k < c_k$ for all $1 \le k \le p$. Let the set $C^{\alpha} = \{C_1^{\alpha}, C_2^{\alpha}, ..., C_p^{\alpha}\}$ where $C_k^{\alpha} \in \mathbb{C}^{d \times d}$ has all the elements zero except $(C_k^{\alpha})_{r_k c_k} = 1/\sqrt{2}$ and $(C_k^{\alpha})_{c_k r_k} = 1/\sqrt{2}$. Also let the set $C^{\beta} = \{C_1^{\beta}, C_2^{\beta}, ..., C_p^{\beta}\}$ where $C_k^{\beta} \in \mathbb{C}^{d \times d}$ has all the elements zero except $(C_k^{\beta})_{r_k c_k} = i/\sqrt{2}$ and $(C_k^{\beta})_{c_k r_k} = -i/\sqrt{2}$.

Then we consider the linear equation

\[ x_1 + x_2 + ... + x_d = 0. \tag{N.1} \]

All the solutions to (N.1) form a $(d - 1)$-dimensional linear vector space, which can have an orthonormal basis set $\{u_1, u_2, ..., u_{d-1}\}$, where $u_k \in \mathbb{R}^d$ and $u_s^T u_t = \delta_{st}$ for all $1 \le k, s, t \le d - 1$. Then we let $C^{\gamma} = \{\mathrm{diag}(u_1), \mathrm{diag}(u_2), ..., \mathrm{diag}(u_{d-1})\}$.

Finally we construct

\[ \{\Omega_i\}_{i=1}^{v} = \{I_{d \times d}/\sqrt{d}\} \cup C^{\alpha} \cup C^{\beta} \cup C^{\gamma}. \]

Clearly all the elements in $\{\Omega_i\}$ are Hermitian. And using (A.3) it is straightforward to prove that $\mathrm{Tr}(\Omega_j^{\dagger} \Omega_k) = \delta_{jk}$ for all $1 \le j, k \le v$. Therefore, $\{\Omega_i\}_{i=1}^{v}$ is indeed an orthonormal basis set for the space $\mathcal{B}(d, m, \{d_j\})$.
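The construction above can be written out explicitly. The following sketch is one possible realization, under the assumption that the orthonormal solutions of (N.1) are taken as the vectors $u_k \propto (1, ..., 1, -k, 0, ..., 0)$; any other orthonormal basis of that solution space would serve equally well. The function name `block_diagonal_basis` is hypothetical, and the elements are generated in a different order from the union above (only $\Omega_1$ is kept first).

```python
import numpy as np

def block_diagonal_basis(block_dims):
    """Orthonormal Hermitian basis of B(d, m, {d_j}) with Omega_1 = I_d / sqrt(d)."""
    d = sum(block_dims)
    basis = [np.eye(d, dtype=complex) / np.sqrt(d)]
    start = 0
    for dj in block_dims:
        # Off-diagonal positions (r, c) with r < c inside the current block.
        for r in range(start, start + dj):
            for c in range(r + 1, start + dj):
                sym = np.zeros((d, d), dtype=complex)       # element of C^alpha
                sym[r, c] = sym[c, r] = 1 / np.sqrt(2)
                asym = np.zeros((d, d), dtype=complex)      # element of C^beta
                asym[r, c] = 1j / np.sqrt(2)
                asym[c, r] = -1j / np.sqrt(2)
                basis.extend([sym, asym])
        start += dj
    # Elements of C^gamma: diag(u_k) with u_k orthonormal and summing to zero.
    for k in range(1, d):
        u = np.zeros(d)
        u[:k] = 1.0
        u[k] = -float(k)
        u /= np.sqrt(k * (k + 1))
        basis.append(np.diag(u).astype(complex))
    return basis

dims = (1, 2, 3)                       # the block structure used in Sec. 6.5.3
omegas = block_diagonal_basis(dims)
v = sum(dj**2 for dj in dims)          # v = sum_j d_j^2
gram = np.array([[np.trace(a.conj().T @ b) for b in omegas] for a in omegas])
```

The Gram matrix of the resulting set under the inner product $\mathrm{Tr}(\Omega_j^{\dagger}\Omega_k)$ should be the $v \times v$ identity, which is the orthonormality property claimed in the text.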

Bibliography

[1] A. Acín, E. Bagan, M. Baig, L. Masanes, and R. Muñoz-Tapia. Multiple-copy two-state discrimination with individual measurements. Physical Review A, 71(3):032338, 2005.

[2] R. B. A. Adamson and A. M. Steinberg. Improving quantum state estimation with mutually unbiased bases. Physical Review Letters, 105(3):030406, 2010.

[3] J. B. Altepeter, D. Branning, E. Jeﬀrey, T. C. Wei, P. G. Kwiat, R. T. Thew, J. L. O’Brien, M. A. Nielsen, and A. G. White. Ancilla-assisted quantum process tomography. Physical Review Letters, 90(19):193601, 2003.

[4] P. Arrighi and C. Patricot. On quantum operations as quantum states. Annals of Physics, 311(1):26–52, 2004.

[5] E. Bagan, M. A. Ballester, R. D. Gill, R. Muñoz-Tapia, and O. Romero-Isart. Separable measurement estimation of density matrices and its fidelity gap with collective protocols. Physical Review Letters, 97(13):130501, 2006.

[6] C. H. Baldwin, A. Kalev, and I. H. Deutsch. Quantum process tomography of unitary and near-unitary maps. Physical Review A, 90(1):012110, 2014.

[7] K. Bartkiewicz, A. Černoch, K. Lemr, and A. Miranowicz. Priority choice experimental two-qubit tomography: Measuring one by one all elements of density matrices. Scientific Reports, 6:19610, 2016.


[8] R. Bellman and K. J. Åström. On structural identifiability. Mathematical Biosciences, 7(3-4):329–339, 1970.

[9] C. H. Bennett and G. Brassard. Quantum cryptography: Public key distribution and coin tossing. Theoretical Computer Science, 560(P1):7–11, 2014.

[10] C. H. Bennett and D. P. DiVincenzo. Quantum information and computation. Nature, 404(6775):247–255, 2000.

[11] N. Bent, H. Qassim, A. A. Tahir, D. Sych, G. Leuchs, L. L. Sánchez-Soto, E. Karimi, and R. W. Boyd. Experimental realization of quantum tomography of photonic qudits via symmetric informationally complete positive operator-valued measures. Physical Review X, 5(4):041006, 2015.

[12] M. Berta, J. M. Renes, and M. M. Wilde. Identifying the information gain of a quantum measurement. IEEE Transactions on Information Theory, 60(12):7987–8006, 2014.

[13] I. I. Beterov, M. Saffman, E. A. Yakshina, D. B. Tretyakov, V. M. Entin, G. N. Hamzina, and I. I. Ryabtsev. Simulated quantum process tomography of quantum gates with Rydberg superatoms. Journal of Physics B: Atomic, Molecular and Optical Physics, 49(11):114007, 2016.

[14] R. Bhatia. Matrix Analysis. Springer, Berlin, 1997.

[15] A. Bisio, G. Chiribella, G. M. D’Ariano, S. Facchini, and P. Perinotti. Optimal quantum tomography of states, measurements, and transformations. Physical Review Letters, 102(1):010404, 2009.

[16] D. Bleichenbacher and P. Q. Nguyen. Noisy polynomial interpolation and noisy Chinese remaindering. In International Conference on the Theory and Applications of Cryptographic Techniques, pages 53–69, 2000.

[17] R. Blume-Kohout. Hedged maximum likelihood quantum state estimation. Physical Review Letters, 105(20):200504, 2010.

[18] R. Blume-Kohout. Optimal, reliable estimation of quantum states. New Journal of Physics, 12(4):043034, 2010.

[19] S. Bonnabel, M. Mirrahimi, and P. Rouchon. Observer-based Hamiltonian identiﬁcation for quantum systems. Automatica, 45(5):1144–1155, 2009.

[20] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, U.K., 2004.

[21] G. Brida, L. Ciavarella, I. P. Degiovanni, M. Genovese, L. Lolli, M. G. Mingolla, F. Piacentini, M. Rajteri, E. Taralli, and M. G. A. Paris. Quantum characterization of superconducting photon counters. New Journal of Physics, 14(8):085001, 2012.

[22] C. Le Bris, M. Mirrahimi, H. Rabitz, and G. Turinici. Hamiltonian identification for quantum systems: Well-posedness and numerical approaches. ESAIM: Control, Optimisation and Calculus of Variations, 13(2):378–395, 2007.

[23] D. Burgarth and A. Ajoy. Evolution-free Hamiltonian parameter estimation through Zeeman markers. Physical Review Letters, 119(3):030402, 2017.

[24] D. Burgarth and K. Maruyama. Indirect Hamiltonian identiﬁcation through a small gateway. New Journal of Physics, 11(10):103019, 2009.

[25] D. Burgarth, K. Maruyama, and F. Nori. Coupling strength estimation for spin chains despite restricted access. Physical Review A, 79(2):020305, 2009.

[26] D. Burgarth, K. Maruyama, and F. Nori. Indirect quantum tomography of quadratic Hamiltonians. New Journal of Physics, 13(1):013019, 2011.

[27] D. Burgarth and K. Yuasa. Quantum system identiﬁcation. Physical Review Letters, 108(8):080502, 2012.

[28] D. K. Burgarth. Identifying combinatorially symmetric Hidden Markov Models. Available at https://arxiv.org/pdf/1709.02932.pdf, 2017.

[29] G. Chen, Y. Zou, X.-Y. Xu, J.-S. Tang, Y.-L. Li, J.-S. Xu, Y.-J. Han, C.-F. Li, G.-C. Guo, H.-Q. Ni, Y. Yu, M.-F. Li, G.-W. Zha, Z.-C. Niu, and Y. Kedem. Experimental test of the state estimation-reversal tradeoﬀ relation in general quantum measurements. Physical Review X, 4(2):021043, 2014.

[30] H.-F. Chen and W. Zhao. Recursive Identiﬁcation and Parameter Estimation. CRC Press, Taylor & Francis, Singapore, 2014.

[31] M. D. Choi. Completely positive linear maps on complex matrices. Linear Algebra and its Applications, 10(3):285–290, 1975.

[32] Y. S. Chow and H. Teicher. Probability Theory: Independence, Interchangeability, Martingales. Springer, New York, 1997.

[33] M. Christandl and R. Renner. Reliable quantum state tomography. Physical Review Letters, 109(12):120403, 2012.

[34] J. H. Cole, S. G. Schirmer, A. D. Greentree, C. J. Wellard, D. K. L. Oi, and L. C. L. Hollenberg. Identifying an experimental two-state Hamiltonian to arbitrary accuracy. Physical Review A, 71(6):062312, 2005.

[35] M. Cramer, M. B. Plenio, S. T. Flammia, R. Somma, D. Gross, S. D. Bartlett, O. Landon-Cardinal, D. Poulin, and Y.-K. Liu. Efficient quantum state tomography. Nature Communications, 1:149, 2010.

[36] G. M. D’Ariano, L. Maccone, and P. L. Presti. Quantum calibration of measurement instrumentation. Physical Review Letters, 93(25):250407, 2004.

[37] V. D’Auria, N. Lee, T. Amri, C. Fabre, and J. Laurat. Quantum decoherence of single-photon counters. Physical Review Letters, 107(5):050504, 2011.

[38] M. D. de Burgh, N. K. Langford, A. C. Doherty, and A. Gilchrist. Choice of measurement sets in qubit tomography. Physical Review A, 78(5):052122, 2008.

[39] C. L. Degen, F. Reinhard, and P. Cappellaro. Quantum sensing. Reviews of Modern Physics, 89(3):035002, 2017.

[40] S. J. Devitt, J. H. Cole, and L. C. L. Hollenberg. Scheme for direct measurement of a general two-qubit Hamiltonian. Physical Review A, 73(5):052317, 2006.

[41] D. Dieks. Communication by EPR devices. Physics Letters A, 92(6):271–272, 1982.

[42] J. DiStefano. On the relationships between structural identifiability and the controllability, observability properties. IEEE Transactions on Automatic Control, 22(4):652–652, 1977.

[43] D. P. DiVincenzo. Quantum computation. Science, 270(5234):255–261, 1995.

[44] D. Dong and I. R. Petersen. Quantum control theory and applications: A survey. IET Control Theory & Applications, 4(12):2651–2671, 2010.

[45] D. Dong, Y. Wang, Z. Hou, B. Qi, Y. Pan, and G.-Y. Xiang. State tomography of qubit systems using linear regression estimation and adaptive measurements. In Proceedings of the 20th IFAC World Congress, volume 50, pages 13014– 13019, 2017.

[46] T. Durt, B.-G. Englert, I. Bengtsson, and K. Życzkowski. On mutually unbiased bases. International Journal of Quantum Information, 8(04):535–640, 2010.

[47] A. Feito, J. S. Lundeen, H. Coldenstrodt-Ronge, J. Eisert, M. B. Plenio, and I. A. Walmsley. Measuring measurement: Theory and practice. New Journal of Physics, 11(9):093038, 2009.

[48] C. Ferrie, C. E. Granade, and D. G. Cory. Adaptive Hamiltonian estimation using Bayesian experimental design. In AIP Conference Proceedings 31st., volume 1443, pages 165–173, 2012.

[49] C. Ferrie, C. E. Granade, and D. G. Cory. How to best sample a periodic probability distribution, or on the accuracy of Hamiltonian ﬁnding strategies. Quantum Information Processing, 12(1):611–623, 2013.

[50] J. Fiurášek. Maximum-likelihood estimation of quantum measurement. Physical Review A, 64(2):024102, 2001.

[51] J. Fiurášek and Z. Hradil. Maximum-likelihood estimation of quantum processes. Physical Review A, 63(2):020101, 2001.

[52] C. Di Franco, M. Paternostro, and M. S. Kim. Hamiltonian tomography in an access-limited setting without state initialization. Physical Review Letters, 102(18):187203, 2009.

[53] C. Di Franco, M. Paternostro, and M. S. Kim. Bypassing state initialization in Hamiltonian tomography on spin-chains. International Journal of Quantum Information, 9(supp01):181–187, 2011.

[54] Y. Fu, H. Rabitz, and G. Turinici. Hamiltonian identiﬁcation in presence of large control ﬁeld perturbations. Journal of Physics A: Mathematical and Theoretical, 49(49):495301, 2016.

[55] J. Gallier. Logarithms and square roots of real matrices. Available at https://arxiv.org/pdf/0805.0245.pdf, 2008.

[56] I. M. Gelfand, M. Kapranov, and A. Zelevinsky, editors. Discriminants, Resultants, and Multidimensional Determinants. Birkhäuser, Boston, 2008.

[57] J. M. Geremia and H. Rabitz. Optimal identiﬁcation of Hamiltonian information by closed-loop laser control of quantum systems. Physical Review Letters, 89(26):263902, 2002.

[58] R. D. Gill and S. Massar. State estimation for large ensembles. Physical Review A, 61:042312, 2000.

[59] K. R. Godfrey and J. J. DiStefano III. Identiﬁability of model parameters. In IFAC Proceedings Volumes, volume 18, pages 89–114, 1985.

[60] G. H. Golub and C. F. Van Loan. Matrix Computations. JHU Press, Baltimore, MD, 2013.

[61] S. Grandi, A. Zavatta, M. Bellini, and M. G. Paris. Experimental quantum tomography of a homodyne detector. New Journal of Physics, 19(5):053015, 2017.

[62] D. Gross, Y.-K. Liu, S. T. Flammia, S. Becker, and J. Eisert. Quantum state tomography via compressed sensing. Physical Review Letters, 105(15):150401, 2010.

[63] M. Guţă and N. Yamamoto. System identification for passive linear quantum systems. IEEE Transactions on Automatic Control, 61(4):921–936, 2016.

[64] G. Gutoski and N. Johnston. Process tomography for unitary quantum channels. Journal of Mathematical Physics, 55(3):032201, 2014.

[65] M. Hayashi, editor. Asymptotic Theory of Quantum Statistical Inference: Selected Papers. World Scientific, Singapore, 2005.

[66] K.-E. Hellwig and K. Kraus. Operations and measurements. II. Communications in Mathematical Physics, 16(2):142–147, 1970.

[67] B. L. Higgins, D. W. Berry, S. D. Bartlett, H. M. Wiseman, and G. J. Pryde. Entanglement-free Heisenberg-limited phase estimation. Nature, 450(7168):393–396, 2007.

[68] B. L. Higgins, B. M. Booth, A. C. Doherty, S. D. Bartlett, H. M. Wiseman, and G. J. Pryde. Mixed state discrimination using optimal control. Physical Review Letters, 103(22):220503, 2009.

[69] N. J. Higham. Functions of Matrices: Theory and Computation. SIAM, Philadelphia, 2008.

[70] A. S. Holevo. Probabilistic and Statistical Aspects of Quantum Theory. North-Holland, Amsterdam, 1982.

[71] M. Holzäpfel, T. Baumgratz, M. Cramer, and M. B. Plenio. Scalable reconstruction of unitary processes and Hamiltonians. Physical Review A, 91(4):042129, 2015.

[72] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, U.K., 2012.

[73] Z. Hou, G.-Y. Xiang, D. Dong, C.-F. Li, and G.-C. Guo. Realization of mutually unbiased bases for a qubit with only one wave plate: Theory and experiment. Optics Express, 23(8):10018–10031, 2015.

[74] Z. Hou, H. Zhu, G.-Y. Xiang, C.-F. Li, and G.-C. Guo. Error-compensation measurements on polarization qubits. Journal of the Optical Society of America B, 33(6):1256–1265, 2016.

[75] Z. Hradil. Quantum-state estimation. Physical Review A, 55(3):R1561, 1997.

[76] F. Huszár and N. M. T. Houlsby. Adaptive Bayesian quantum tomography. Physical Review A, 85(5):052120, 2012.

[77] D. F. V. James, P. G. Kwiat, W. J. Munro, and A. G. White. Measurement of qubits. Physical Review A, 64:052312, 2001.

[78] A. Jamiołkowski. An effective method of investigation of positive maps on the set of positive definite operators. Reports on Mathematical Physics, 5(3):415–424, 1974.

[79] M. Ježek, J. Fiurášek, and Z. Hradil. Quantum inference of states and processes. Physical Review A, 68(1):012305, 2003.

[80] Z. Ji, G. Wang, R. Duan, Y. Feng, and M. Ying. Parameter estimation of quantum channels. IEEE Transactions on Information Theory, 54(11):5172–5185, 2008.

[81] R. E. Kalman. Mathematical description of linear dynamical systems. Journal of the Society for Industrial and Applied Mathematics, Series A: Control, 1(2):152–192, 1963.

[82] Y. Kato and N. Yamamoto. Structure identiﬁcation and state initialization of spin networks with limited access. New Journal of Physics, 16(2):023024, 2014.

[83] S. Kimmel, M. P. da Silva, C. A. Ryan, B. R. Johnson, and T. Ohki. Robust extraction of tomographic information via randomized benchmarking. Physical Review X, 4(1):011050, 2014.

[84] S. Kimmel, G. H. Low, and T. J. Yoder. Robust calibration of a universal single-qubit gate set via robust phase estimation. Physical Review A, 92(6):062315, 2015.

[85] A. B. Klimov, G. Björk, and L. L. Sánchez-Soto. Optimal quantum tomography of permutationally invariant qubits. Physical Review A, 87(1):012109, 2013.

[86] K. S. Kravtsov, S. S. Straupe, I. V. Radchenko, N. M. T. Houlsby, F. Huszár, and S. P. Kulik. Experimental adaptive Bayesian tomography. Physical Review A, 87(6):062122, 2013.

[87] P. G. Kwiat, E. Waks, A. G. White, I. Appelbaum, and P. H. Eberhard. Ultrabright source of polarization-entangled photons. Physical Review A, 60(2):R773, 1999.

[88] P. Lancaster and M. Tismenetsky. The Theory of Matrices: with Applications. Academic, New York, USA, 1985.

[89] Z. Leghtas, G. Turinici, H. Rabitz, and P. Rouchon. Hamiltonian identiﬁcation through enhanced observability utilizing quantum control. IEEE Transactions on Automatic Control, 57(10):2679–2683, 2012.

[90] S. Lerch and A. Stefanov. Adaptive quantum state estimation of an entangled qubit state. Optics Letters, 39(18):5399–5402, 2014.

[91] M. Levitt and M. Guţă. Identification of single-input-single-output quantum linear systems. Physical Review A, 95(3):033825, 2017.

[92] M. Levitt, M. Guţă, and H. I. Nurdin. Power spectrum identification for quantum linear systems. Automatica, 90:255–262, 2018.

[93] W.-T. Liu, T. Zhang, J.-Y. Liu, P.-X. Chen, and J.-M. Yuan. Experimental quantum state tomography via compressed sampling. Physical Review Letters, 108(17):170403, 2012.

[94] L. Ljung. System Identiﬁcation - Theory for the User. Prentice Hall, Upper Saddle River, N. J., 1999.

[95] A. Luis and L. L. Sánchez-Soto. Complete characterization of arbitrary quantum measurement processes. Physical Review Letters, 83(18):3573, 1999.

[96] J. S. Lundeen and C. Bamber. Procedure for direct measurement of general quantum states using weak measurement. Physical Review Letters, 108(7):070402, 2012.

[97] J. S. Lundeen, A. Feito, H. Coldenstrodt-Ronge, K. L. Pregnell, C. Silberhorn, T. C. Ralph, J. Eisert, M. B. Plenio, and I. A. Walmsley. Tomography of quantum detectors. Nature Physics, 5(1):27–30, 2009.

[98] D. H. Mahler, L. A. Rozema, A. Darabi, C. Ferrie, R. Blume-Kohout, and A. M. Steinberg. Adaptive quantum state tomography improves accuracy quadratically. Physical Review Letters, 111(18):183601, 2013.

[99] A. Miranowicz, K. Bartkiewicz, J. Peřina Jr., M. Koashi, N. Imoto, and F. Nori. Optimal two-qubit tomography based on local and global measurements: Maximal robustness against errors as described by condition numbers. Physical Review A, 90(6):062123, 2014.

[100] J. A. Miszczak. Generating and using truly random quantum states in Mathematica. Computer Physics Communications, 183(1):118–124, 2012.

[101] M. Mičuda, M. Miková, I. Straka, M. Sedlák, M. Dušek, M. Ježek, and J. Fiurášek. Tomographic characterization of a linear optical quantum Toffoli gate. Physical Review A, 92(3):032312, 2015.

[102] C. M. Natarajan, L. Zhang, H. Coldenstrodt-Ronge, G. Donati, S. N. Dorenbos, V. Zwiller, I. A. Walmsley, and R. H. Hadfield. Quantum detector tomography of a time-multiplexed superconducting nanowire single-photon detector at telecom wavelengths. Optics Express, 21(1):893–902, 2013.

[103] Y. M. Nechepurenko. New spectral analysis technology based on the Schur decomposition. Russian Journal of Numerical Analysis and Mathematical Modelling, 14(3):265–274, 1999.

[104] M. A. Nielsen and I. L. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, Cambridge, U.K., 2000.

[105] J. P. Norton. An investigation of the sources of nonuniqueness in deterministic identiﬁability. Mathematical Biosciences, 60(1):89–108, 1982.

[106] J. Nunn, B. J. Smith, G. Puentes, I. A. Walmsley, and J. S. Lundeen. Optimal experiment design for quantum state tomography: Fair, precise, and minimal tomography. Physical Review A, 81(4):042109, 2010.

[107] J. L. O’Brien, G. J. Pryde, A. Gilchrist, D. F. V. James, N. K. Langford, T. C. Ralph, and A. G. White. Quantum process tomography of a controlled-NOT gate. Physical Review Letters, 93(8):080502, 2004.

[108] R. Okamoto, M. Iefuji, S. Oyama, K. Yamagata, H. Imai, A. Fujiwara, and S. Takeuchi. Experimental demonstration of adaptive quantum state estimation. Physical Review Letters, 109(13):130404, 2012.

[109] T. Opatrný, D.-G. Welsch, and W. Vogel. Least-squares inversion for density-matrix reconstruction. Physical Review A, 56(3):1788, 1997.

[110] M. G. A. Paris and J. Řeháček, editors. Quantum State Estimation, volume 649 of Lecture Notes in Physics. Springer, Berlin, 2004.

[111] J. L. Park. The concept of transition in quantum mechanics. Foundations of Physics, 1(1):23–33, 1970.

[112] H. Pohjanpalo. System identiﬁability based on the power series expansion of the solution. Mathematical Biosciences, 41(1-2):21–33, 1978.

[113] B. Qi, Z. Hou, L. Li, D. Dong, G.-Y. Xiang, and G.-C. Guo. Quantum state tomography via linear regression estimation. Scientiﬁc Reports, 3:3496, 2013.

[114] B. Qi, Z. Hou, Y. Wang, D. Dong, H.-S. Zhong, L. Li, G.-Y. Xiang, H. M. Wiseman, C.-F. Li, and G.-C. Guo. Adaptive quantum state tomography via linear regression estimation: Theory and two-qubit experiment. npj Quantum Information, 3(1):19, 2017.

[115] R. Raussendorf and H. J. Briegel. A one-way quantum computer. Physical Review Letters, 86(22):5188, 2001.

[116] J. J. Renema, G. Frucci, Z. Zhou, F. Mattioli, A. Gaggero, R. Leoni, M. J. A. de Dood, A. Fiore, and M. P. van Exter. Modiﬁed detector tomography technique applied to a superconducting multiphoton nanodetector. Optics Express, 20(3):2806–2813, 2012.

[117] J. M. Renes, R. Blume-Kohout, A. J. Scott, and C. M. Caves. Symmetric informationally complete quantum measurements. Journal of Mathematical Physics, 45(6):2171–2180, 2004.

[118] A. V. Rodionov, A. Veitia, R. Barends, J. Kelly, D. Sank, J. Wenner, J. M. Martinis, R. L. Kosut, and A. N. Korotkov. Compressed sensing quantum process tomography for superconducting quantum gates. Physical Review B, 90(14):144504, 2014.

[119] K. Rudinger and R. Joynt. Compressed sensing for Hamiltonian reconstruction. Physical Review A, 92(5):052322, 2015.

[120] M. F. Sacchi. Maximum-likelihood reconstruction of completely positive maps. Physical Review A, 63(5):054104, 2001.

[121] J. Z. Salvail, M. Agnew, A. S. Johnson, E. Bolduc, J. Leach, and R. W. Boyd. Full characterization of polarization states of light via direct measurement. Nature Photonics, 7(4):316–321, 2013.

[122] S. G. Schirmer, A. Kolli, and D. K. L. Oi. Experimental Hamiltonian identification for controlled two-level systems. Physical Review A, 69(5):050306, 2004.

[123] S. G. Schirmer and D. K. L. Oi. Two-qubit Hamiltonian tomography by Bayesian analysis of noisy data. Physical Review A, 80(2):022333, 2009.

[124] G. A. F. Seber and A. J. Lee. Linear Regression Analysis. John Wiley & Sons, New York, 2012.

[125] A. Sergeevich, A. Chandran, J. Combes, S. D. Bartlett, and H. M. Wiseman. Characterization of a qubit Hamiltonian using adaptive measurements in a ﬁxed basis. Physical Review A, 84(5):052315, 2011.

[126] A. Shabani, R. L. Kosut, M. Mohseni, H. Rabitz, M. A. Broome, M. P. Almeida, A. Fedrizzi, and A. G. White. Efficient measurement of quantum dynamics via compressive sensing. Physical Review Letters, 106(10):100401, 2011.

[127] A. Shabani, M. Mohseni, S. Lloyd, R. L. Kosut, and H. Rabitz. Estimation of many-body quantum Hamiltonians via compressive sensing. Physical Review A, 84(1):012107, 2011.

[128] J. Shang, Z. Zhang, and H. K. Ng. Superfast maximum-likelihood reconstruction for quantum tomography. Physical Review A, 95(6):062336, 2017.

[129] C.-C. Shu, K.-J. Yuan, D. Dong, I. R. Petersen, and A. D. Bandrauk. Identifying strong-field effects in indirect photofragmentation reactions. The Journal of Physical Chemistry Letters, 8(1):1–6, 2016.

[130] J. A. Smolin, J. M. Gambetta, and G. Smith. Efficient method for computing the maximum-likelihood quantum state from measurements with additive Gaussian noise. Physical Review Letters, 108(7):070502, 2012.

[131] A. Sone and P. Cappellaro. Exact dimension estimation of interacting qubit systems assisted by a single quantum probe. Physical Review A, 96(6):062334, 2017.

[132] A. Sone and P. Cappellaro. Hamiltonian identifiability assisted by a single-probe measurement. Physical Review A, 95(2):022335, 2017.

[133] E. D. Sontag. For diﬀerential equations with r parameters, 2r+1 experiments are enough for identiﬁcation. Journal of Nonlinear Science, 12(6):553–583, 2002.

[134] E. D. Sontag, Y. Wang, and A. Megretski. Input classes for identiﬁability of bilinear systems. IEEE Transactions on Automatic Control, 54(2):195–207, 2009.

[135] J. Stillwell. Naive Lie Theory. Springer Science & Business Media, New York, 2008.

[136] T. Sugiyama, P. S. Turner, and M. Murao. Adaptive experimental design for one-qubit state estimation with ﬁnite data based on a statistical update criterion. Physical Review A, 85(5):052107, 2012.

[137] T. Sugiyama, P. S. Turner, and M. Murao. Precision-guaranteed quantum tomography. Physical Review Letters, 111(16):160406, 2013.

[138] B. Teklu, S. Olivares, and M. G. Paris. Bayesian estimation of one-parameter qubit gates. Journal of Physics B: Atomic, Molecular and Optical Physics, 42(3):035502, 2009.

[139] Y. S. Teo, B. Stoklasa, B.-G. Englert, J. Řeháček, and Z. Hradil. Incomplete quantum state estimation: A comprehensive study. Physical Review A, 85(4):042317, 2012.

[140] Y. S. Teo, H. Zhu, B.-G. Englert, J. Řeháček, and Z. Hradil. Quantum-state reconstruction by maximizing likelihood and entropy. Physical Review Letters, 107(2):020404, 2011.

[141] G. Tóth, W. Wieczorek, D. Gross, R. Krischek, C. Schwemmer, and H. Weinfurter. Permutationally invariant quantum tomography. Physical Review Letters, 105(25):250403, 2010.

[142] C. C. Travis and G. Haddock. On structural identiﬁcation. Mathematical Biosciences, 56(3-4):157–173, 1981.

[143] S. Vajda, K. R. Godfrey, and H. Rabitz. Similarity transformation approach to identiﬁability analysis of nonlinear compartmental models. Mathematical Biosciences, 93(2):217–248, 1989.

[144] J. Řeháček, D. Mogilevtsev, and Z. Hradil. Operational tomography: Fitting of data patterns. Physical Review Letters, 105(1):010402, 2010.

[145] E. Walter and Y. Lecourtier. Unidentiﬁable compartmental models: What to do? Mathematical Biosciences, 56(1-2):1–25, 1981.

[146] H. Y. Wang, W. Q. Zheng, N. K. Yu, K. R. Li, D. W. Lu, T. Xin, C. Li, Z. F. Ji, D. Kribs, B. Zeng, X. H. Peng, and J. F. Du. Quantum state and process tomography via adaptive measurements. Science China Physics, Mechanics & Astronomy, 59(10):100313, 2016.

[147] J. Wang, S. Paesani, R. Santagati, S. Knauer, A. A. Gentile, N. Wiebe, M. Petruzzella, J. L. O’Brien, J. G. Rarity, A. Laing, and M. G. Thompson. Experimental quantum Hamiltonian learning. Nature Physics, 13(6):551–555, 2017.

[148] L. Wang and H. Garnier, editors. System Identification, Environmental Modelling, and Control System Design. Springer-Verlag, London, U.K., 2012.

[149] S.-T. Wang, D.-L. Deng, and L.-M. Duan. Hamiltonian tomography for quantum many-body systems with arbitrary couplings. New Journal of Physics, 17(9):093017, 2015.

[150] Y. Wang, D. Dong, B. Qi, J. Zhang, I. R. Petersen, and H. Yonezawa. A quantum Hamiltonian identiﬁcation algorithm: Computational complexity and error analysis. IEEE Transactions on Automatic Control, 63(5):1388–1403, 2018.

[151] Y. Wang, D. Dong, A. Sone, I. R. Petersen, H. Yonezawa, and P. Cappellaro. Quantum Hamiltonian identifiability via a similarity transformation approach and beyond. Submitted to IEEE Transactions on Automatic Control. Available at https://arxiv.org/pdf/1809.02965.pdf, 2018.

[152] Y. Wang, B. Qi, D. Dong, and I. R. Petersen. An iterative algorithm for Hamiltonian identiﬁcation of quantum systems. In 2016 IEEE 55th Annual Conference on Decision and Control (CDC), pages 2523–2528, 2016.

[153] Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang. Quantum gate identiﬁcation: Error analysis, numerical results and optical experiment. Automatica, 101:269–279, 2019.

[154] J. Watrous. The Theory of Quantum Information. Cambridge University Press, Cambridge, U.K., 2018.

[155] H. M. Wiseman. Adaptive phase measurements of optical modes: Going beyond the marginal Q distribution. Physical Review Letters, 75(25):4587, 1995.

[156] H. M. Wiseman and G. J. Milburn. Quantum Measurement and Control. Cambridge University Press, Cambridge, U.K., 2009.

[157] W. K. Wootters and B. D. Fields. Optimal state-determination by mutually unbiased measurements. Annals of Physics, 191(2):363–381, 1989.

[158] W. K. Wootters and W. H. Zurek. A single quantum cannot be cloned. Nature, 299(5886):802–803, 1982.

[159] X. Wu and K. Xu. Partial standard quantum process tomography. Quantum Information Processing, 12(2):1379–1393, 2013.

[160] G. Y. Xiang, B. L. Higgins, D. W. Berry, H. M. Wiseman, and G. J. Pryde. Entanglement-enhanced measurement of a completely unknown optical phase. Nature Photonics, 5(1):43–47, 2011.

[161] S. Yokoyama, N. D. Pozza, T. Serikawa, K. B. Kuntz, T. A. Wheatley, D. Dong, E. H. Huntington, and H. Yonezawa. The quantum entanglement of measurement. Available at https://arxiv.org/pdf/1705.06441.pdf, 2017.

[162] H. Yonezawa, D. Nakane, T. A. Wheatley, K. Iwasawa, S. Takeda, H. Arao, K. Ohki, K. Tsumura, D. W. Berry, T. C. Ralph, H. M. Wiseman, E. H. Huntington, and A. Furusawa. Quantum-enhanced optical-phase tracking. Science, 337(6101):1514–1517, 2012.

[163] H. Yuan and C.-H. F. Fung. Optimal feedback scheme and universal time scaling for Hamiltonian parameter estimation. Physical Review Letters, 115(11):110401, 2015.

[164] A. Zhang, Y. Zhang, F. Xu, L. Li, and L. Zhang. Adaptive tomography of qubits: Purity versus statistical fluctuations. Available at https://arxiv.org/pdf/1805.04808.pdf, 2018.

[165] J. Zhang and M. Sarovar. Quantum Hamiltonian identiﬁcation from measurement time traces. Physical Review Letters, 113(8):080401, 2014.

[166] J. Zhang and M. Sarovar. Identiﬁcation of open quantum systems from observable time traces. Physical Review A, 91(5):052121, 2015.

[167] L. Zhang, H. B. Coldenstrodt-Ronge, A. Datta, G. Puentes, J. S. Lundeen, X.-M. Jin, B. J. Smith, M. B. Plenio, and I. A. Walmsley. Mapping coherence in measurement via full quantum tomography of a hybrid optical detector. Nature Photonics, 6(6):364–368, 2012.

[168] L. Zhang, A. Datta, H. B. Coldenstrodt-Ronge, X.-M. Jin, J. Eisert, M. B. Plenio, and I. A. Walmsley. Recursive quantum detector tomography. New Journal of Physics, 14(11):115005, 2012.

[169] K. Zhou, J. C. Doyle, and K. Glover. Robust Optimal Control. Prentice Hall, New Jersey, 1996.

[170] H. Zhu. Quantum State Estimation and Symmetric Informationally Complete POMs. PhD thesis, National University of Singapore, 2012.

[171] M. Zorzi, F. Ticozzi, and A. Ferrante. Minimum relative entropy for quantum estimation: Feasibility and general solution. IEEE Transactions on Information Theory, 60(1):357–367, 2014.

[172] K. Życzkowski and I. Bengtsson. On duality between quantum maps and quantum states. Open Systems & Information Dynamics, 11(01):3–42, 2004.

[173] K. Życzkowski and M. Kuś. Random unitary matrices. Journal of Physics A: Mathematical and General, 27(12):4235, 1994.
