New Developments in Quantum Tomography
Yuanlong Wang
A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy
School of Engineering and Information Technology, University of New South Wales in Canberra
June 2019
Thesis/Dissertation Sheet
Surname/Family Name: Wang
Given Name/s: Yuanlong
Abbreviation for degree as given in the University calendar: PhD
Faculty: UNSW Canberra
School: School of Engineering and Information Technology
Thesis Title: New Developments in Quantum Tomography
Abstract 350 words maximum:
This thesis investigates several topics in quantum tomography: quantum state tomography (QST), quantum Hamiltonian identifiability, quantum Hamiltonian/gate identification (QHI) and quantum detector tomography (QDT). For QST, we propose a novel recursively adaptive quantum state tomography (RAQST) protocol, which can outperform static tomography protocols using mutually unbiased bases and a two-stage mutually unbiased bases adaptive strategy, even with the simplest product measurements. When nonlocal measurements are available, RAQST can beat the Gill-Massar bound for a wide range of quantum states with a modest number of copies. For quantum Hamiltonian identifiability, we extend the similarity transformation approach (STA) in classical system identification theory to the quantum domain to prove for the first time the identifiability conclusions for arbitrary dimensional spin-1/2 chain systems assisted by single qubit probes. We further develop the traditional STA method by proposing a Structure Preserving Transformation (SPT) method for non-minimal systems. We use the SPT method to introduce an indicator for the existence of economic quantum Hamiltonian identification algorithms, and give two algorithm examples. Within the framework of quantum process tomography, we propose a general two-step optimization (TSO) QHI algorithm. We then improve the TSO method to a more efficient pure-state-based gate identification (PGI) algorithm. By employing a series of predetermined pure probe states and developing a fast QST protocol specialized for pure states, we reduce the computational complexity from $O(d^6)$ with dimension $d$ in TSO to $O(d^3)$ in PGI. We provide theoretical error upper bounds for TSO and PGI methods. Finally we propose a novel QDT method. Using constrained linear regression estimation, a stage-1 estimate of the detector is obtained. Next a positive semidefinite requirement is added to guarantee a physical stage-2 estimate. We analyze the computational complexity and establish an error upper bound for this Two-Stage Estimation (TSE) method. Such a theoretical analysis is uncommon in other QDT methods. We also investigate optimization over the coherent probe states. For RAQST, PGI and QDT, our collaborators have performed quantum optical experiments to validate the effectiveness of the proposed algorithms.
Declaration relating to disposition of project thesis/dissertation
I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only).
Signature:
Date:
The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in exceptional circumstances and require the approval of the Dean of Graduate Research.
FOR OFFICE USE ONLY
Date of completion of requirements for Award:
ORIGINALITY STATEMENT
'I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.'
Signed:
Date:
COPYRIGHT STATEMENT
'I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'
Signed:
Date:
AUTHENTICITY STATEMENT
'I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.'
Signed:
Date:
INCLUSION OF PUBLICATIONS STATEMENT
UNSW is supportive of candidates publishing their research results during their candidature as detailed in the UNSW Thesis Examination Procedure.
Publications can be used in their thesis in lieu of a Chapter if:
• The student contributed greater than 50% of the content in the publication and is the "primary author", i.e. the student was responsible primarily for the planning, execution and preparation of the work for publication
• The student has approval to include the publication in their thesis in lieu of a Chapter from their supervisor and Postgraduate Coordinator.
• The publication is not subject to any obligations or contractual agreements with a third party that would constrain its inclusion in the thesis
Please indicate whether this thesis contains published material or not.
• This thesis contains no publications, either published or submitted for publication
• Some of the work described in this thesis has been published and it has been documented in the relevant Chapters with acknowledgement
• This thesis has publications (either published or submitted for publication) incorporated into it in lieu of a chapter and the details are presented below
CANDIDATE'S DECLARATION
I declare that:
• I have complied with the Thesis Examination Procedure
• where I have used a publication in lieu of a Chapter, the listed publication(s) below meet(s) the requirements to be included in the thesis.
Name / Signature / Date (dd/mm/yy)
Postgraduate Coordinator's Declaration
I declare that:
• the information below is accurate
• where listed publication(s) have been used in lieu of Chapter(s), their use complies with the Thesis Examination Procedure
• the minimum requirements for the format of the thesis have been met.
PGC's Name / PGC's Signature / Date (dd/mm/yy)
Abstract
This thesis investigates several topics in quantum tomography: quantum state tomography (QST), quantum Hamiltonian identifiability, quantum Hamiltonian/gate identification (QHI) and quantum detector tomography (QDT).
For QST, we propose a novel recursively adaptive quantum state tomography (RAQST) protocol, which can outperform static tomography protocols using mutually unbiased bases and a two-stage mutually unbiased bases adaptive strategy, even with the simplest product measurements. When nonlocal measurements are available, RAQST can beat the Gill-Massar bound for a wide range of quantum states with a modest number of copies.
For quantum Hamiltonian identifiability, we extend the similarity transformation approach (STA) in classical system identification theory to the quantum domain to prove for the first time the identifiability conclusions for arbitrary dimensional spin-1/2 chain systems assisted by single qubit probes. We further develop the traditional STA method by proposing a Structure Preserving Transformation (SPT) method for non-minimal systems. We use the SPT method to introduce an indicator for the existence of economic quantum Hamiltonian identification algorithms, and give two algorithm examples.
Within the framework of quantum process tomography, we propose a general two-step optimization (TSO) QHI algorithm. We then improve the TSO method to a more efficient pure-state-based gate identification (PGI) algorithm. By employing a series of predetermined pure probe states and developing a fast QST protocol specialized for pure states, we reduce the computational complexity from $O(d^6)$ with dimension $d$ in TSO to $O(d^3)$ in PGI. We provide theoretical error upper bounds for TSO and PGI methods.
Finally we propose a novel QDT method. Using constrained linear regression estimation, a stage-1 estimate of the detector is obtained. Next a positive semidefinite requirement is added to guarantee a physical stage-2 estimate. We analyze the computational complexity and establish an error upper bound for this Two-Stage Estimation (TSE) method. Such a theoretical analysis is uncommon in other QDT methods. We also investigate optimization over the coherent probe states.
For RAQST, PGI and QDT, our collaborators have performed quantum optical experiments to validate the effectiveness of the proposed algorithms.
Acknowledgement
I would like to thank my primary supervisor, A/Prof. Daoyi Dong, for all the guidance and help he has given me, both academically and in life. He taught me everything from doing research and writing papers to making career plans. He encouraged me to attend academic conferences and introduced excellent collaborators for our projects. His patience, in mentoring students and in life, is the greatest I have ever encountered, and I doubt whether I can match it in the future. He is the necessary (and largely sufficient) condition for my satisfactory and memorable PhD research period.
I would also like to thank my joint supervisor Prof. Ian R. Petersen and co-supervisors Dr. Hidehiro Yonezawa and Prof. Elanor Huntington. I learned a lot from their rich academic experience. They taught me valuable research techniques for finding topics, overcoming research problems and revising papers. Their varied backgrounds and specialties inspired me to view problems from different angles, which is especially important for understanding interdisciplinary research.
I am also sincerely grateful to my close collaborators: Dr. Bo Qi at CAS, Dr. Zhibo Hou, Dr. Qi Yin and Prof. Guo-Yong Xiang at USTC, Dr. Jun Zhang at SJTU, Dr. Akira Sone and Prof. Paola Cappellaro at MIT, and Dr. Shota Yokoyama at UNSW. Collaborating with them has been a fruitful and memorable journey, from which I have learned a great deal.
Special thanks to my group members Wei Zhang, Qi Yu and Yanan Liu. Our academic discussions are rich and instrumental, and our support for each other in life is warm and precious. Also special thanks to my friends Ruxiu Liu, Wei Zhang and Di Liu, whose emotional support is vital to my life. Many thanks to my family, who have always supported my career decisions.
Finally, thank life, for everything.
Certificate of Originality
I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person, nor material which to a substantial extent has been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by colleagues, with whom I have worked at UNSW or elsewhere, during my candidature, is fully acknowledged.
I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.
YUANLONG WANG
List of publications
[Journal articles]
1. Y. Wang, S. Yokoyama, D. Dong, I. R. Petersen, E. H. Huntington, and H. Yonezawa, Two-stage estimation for quantum detector tomography: Error analysis, numerical and experimental results, in preparation.
2. Y. Wang, D. Dong, A. Sone, I. R. Petersen, H. Yonezawa, and P. Cappellaro, Quantum Hamiltonian identifiability via a similarity transformation approach and beyond, submitted to IEEE Transactions on Automatic Control, 2018.
3. Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, Quantum gate identification: Error analysis, numerical results and optical experiment, Automatica, vol. 101, pp. 269-279, 2019.
4. Y. Wang, D. Dong, B. Qi, J. Zhang, I. R. Petersen, and H. Yonezawa, A quantum Hamiltonian identification algorithm: Computational complexity and error analysis, IEEE Transactions on Automatic Control, vol. 63, no. 5, pp. 1388-1403, 2018.
5. B. Qi, Z. Hou, Y. Wang, D. Dong, H.-S. Zhong, L. Li, G.-Y. Xiang, H. M. Wiseman, C.-F. Li, and G.-C. Guo, Adaptive quantum state tomography via linear regression estimation: Theory and two-qubit experiment, npj Quantum Information, vol. 3, no. 1, p. 19, 2017.
6. D. Dong, I. R. Petersen, Y. Wang, X. Yi, and H. Rabitz, Sampled-data design for robust control of open two-level quantum systems with operator errors, IET Control Theory & Applications, vol. 10, no. 18, pp. 2415-2421, 2016.
7. Z. Hou, H.-S. Zhong, Y. Tian, D. Dong, B. Qi, L. Li, Y. Wang, F. Nori, G.-Y. Xiang, C.-F. Li and G.-C. Guo, Full reconstruction of a 14-qubit state within four hours, New Journal of Physics, vol. 18, no. 8, p. 083036, 2016.
[Conference papers]
1. Y. Wang, D. Dong, and I. R. Petersen, An approximate quantum Hamiltonian identification algorithm using a Taylor expansion of the matrix exponential function, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 5523-5528, Melbourne, Australia, December 2017.
2. Y. Wang, Q. Yin, D. Dong, B. Qi, I. R. Petersen, Z. Hou, H. Yonezawa, and G.-Y. Xiang, Efficient identification of unitary quantum processes, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 196-201, Gold Coast, Australia, December 2017.
3. D. Dong and Y. Wang, Several recent developments in estimation and robust control of quantum systems, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 190-195, Gold Coast, Australia, December 2017.
4. Y. Wang, D. Dong, I. R. Petersen, and J. Zhang, An approximate algorithm for quantum Hamiltonian identification with complexity analysis, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 11744-11748, Toulouse, France, July 2017.
5. D. Dong, Y. Wang, Z. Hou, B. Qi, Y. Pan, and G.-Y. Xiang, State tomography of qubit systems using linear regression estimation and adaptive measurements, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 13014-13019, Toulouse, France, July 2017.
6. Y. Wang, B. Qi, D. Dong, and I. R. Petersen, An iterative algorithm for Hamiltonian identification of quantum systems, in 2016 IEEE 55th Annual Conference on Decision and Control (CDC), pp. 2523-2528, Las Vegas, USA, December 2016.
Contents
Abstract
Acknowledgements
Declaration
List of Publications
Table of Contents
List of Figures
List of Tables
List of Symbols
List of Common Acronyms
1 Introduction
2 Quantum mechanics and standard tomography methods
2.1 Quantum mechanics foundations
2.2 Standard maximum likelihood estimation
2.2.1 Quantum state tomography via maximum likelihood estimation
2.2.2 Quantum process tomography via maximum likelihood estimation
2.2.3 Quantum detector tomography via maximum likelihood estimation
3 Recursively adaptive multi-qubit state tomography
3.1 Introduction
3.2 RAQST protocol
3.2.1 Establishment of linear regression model
3.2.2 Recursive LRE updating rule and physical projection
3.2.3 Optimization criterion
3.2.4 Two versions of RAQST
3.3 Numerical results
3.4 Experimental results
3.5 Summary and open problems
4 Quantum Hamiltonian identifiability via a similarity transformation approach and beyond
4.1 Introduction
4.2 Model establishment
4.2.1 Problem formulation of Hamiltonian identifiability and identification
4.2.2 Laplace transform approach and atypical cases
4.3 Similarity Transformation Approach
4.3.1 General procedures for minimal systems
4.3.2 General procedures for non-minimal systems
4.3.3 Structure Preserving Transformation method for non-minimal systems
4.4 Quantum Hamiltonian identifiability via STA
4.4.1 General framework
4.4.2 Exchange model without transverse field
4.4.3 Exchange model with transverse field
4.5 From identifiability to economic identification algorithms
4.5.1 An indicator for the existence of economic identification algorithms
4.5.2 Two economic Hamiltonian identification algorithms
4.5.3 Error analysis
4.5.4 Simulation performance
4.6 Conclusion and open problems
5 Quantum Hamiltonian/gate identification via TSO and PGI
5.1 Introduction
5.2 TSO Hamiltonian identification algorithm
5.2.1 Quantum process tomography
5.2.2 Problem formulation of Hamiltonian identification
5.2.3 Two-step Optimization algorithm
5.2.4 Error analysis
5.2.5 Numerical results of TSO
5.3 Pure-state-based Gate Identification
5.3.1 Problem formulation of gate identification
5.3.2 Fast pure-state tomography
5.3.3 Gate reconstruction
5.3.4 General procedure and computational complexity
5.3.5 Error analysis
5.3.6 Numerical results
5.3.7 Experimental results
5.4 Summary and open problems
6 Quantum detector tomography via two-stage estimation
6.1 Introduction
6.2 Two-stage Estimation method for quantum detector tomography
6.2.1 Problem formulation
6.2.2 Estimation algorithm
6.2.3 General procedure and computational complexity
6.2.4 Error analysis
6.3 Optimization of the coherent probe states
6.3.1 On the kinds of probe states
6.3.2 Optimization of the size of sampling square for probe states
6.4 Numerical results
6.4.1 Basic performance
6.4.2 On the kinds of probe states
6.4.3 Optimization of the size of sampling square for probe states
6.4.4 Comparison with MLE
6.5 Experimental results
6.5.1 Experimental setup
6.5.2 Modified estimation protocol
6.5.3 Experimental results
6.6 Summary and open problems
7 Conclusions and outlook
A Some common formulas
B Iterative algorithm for product projector optimization
C Gill-Massar bound for infidelity in two-qubit state tomography
D Proof of Lemma 4.1
E Proof of Lemma 4.2
F Proof of Lemma 4.3
G Proof of Lemma 4.4
H Proof of Lemma 4.5
I Proof of Proposition 5.1
J Proof of Theorem 5.1
K Proof of Theorem 5.2
L A sufficient condition for Assumption 5.1
M Proof of Lemma 5.2
N A basis set example for the space $B(d, m, \{d_j\})$
List of Figures
3.1 Simulated performance of the RAQST protocol for pure states
3.2 Simulated performance of the RAQST protocol for mixed states
3.3 Two-qubit state tomography experimental setup, adopted from [114]
3.4 Two-qubit state tomography experimental results
4.1 Relationships between identifiability criteria
4.2 Performance of Algorithm 4.1 with different data lengths
4.3 Performance of Algorithm 4.2 with different noise variances
4.4 Performance of Algorithm 4.2 with different data lengths
5.1 General procedure of the TSO method
5.2 MSE of TSO versus the total resource number
5.3 MSE of TSO versus different evolution times
5.4 MSE of TSO versus number of qubits
5.5 Running time versus qubit number for the ERA method in [165] and our TSO method
5.6 MSE versus resource number for each output state
5.7 Running time and MSE versus qubit number for MLE and PGI methods
5.8 The schematic of experimental setup, adopted from [153]
5.9 MSE versus resource number for the experimental single-qubit gate
6.1 The projection amplitude function $h(k, j)$
6.2 MSE versus the total resource number
6.3 MSE versus probe state kinds
6.4 MSE versus the size of sampling square for probe states
6.5 The optimal sampling square size versus dimension
6.6 Comparison between our TSE algorithm with MLE for different qubit number
6.7 Comparison between our algorithm with MLE for different number of POVM matrices
6.8 Quantum optical experimental setup for QDT [161]
6.9 Experimental and simulation results for Group I
6.10 Experimental and simulation results for Group II
List of Tables
6.1 The coherent probe states for TSE QDT experiment
List of Symbols
$*$: An indeterminate variable, vector or matrix
$\otimes$: Tensor product between two Hilbert spaces, or Kronecker product between two matrices
$\oplus$: Matrix direct sum
$\equiv$: Identity symbol
$\hat{a}$: Estimation of variable $a$
$a^{*}$: Conjugate of $a$
$\lfloor x \rfloor$: Largest integer that is not larger than $x \in \mathbb{R}$
$|\psi^{\perp}\rangle$: A state orthogonal to $|\psi\rangle$; i.e., $\langle\psi|\psi^{\perp}\rangle = 0$
$A \geq 0$: $A \in \mathbb{C}^{d\times d}$ is positive semidefinite
$\sqrt{A}$, $A^{\frac{1}{2}} \triangleq U\,\mathrm{diag}(\sqrt{P_{11}}, \sqrt{P_{22}}, ..., \sqrt{P_{dd}})\,U^{\dagger}$, where $A \geq 0$ with spectral decomposition $A = UPU^{\dagger}$
$A_{\sigma i}$ ($A_{i\sigma}$): $i$-th column (row) of matrix $A$
$A^{T}$: Transpose of $A$
$A^{\dagger}$: Transpose and conjugate of $A$
$A^{\otimes N} \triangleq \underbrace{A \otimes A \otimes \cdots \otimes A}_{N}$: $N$ times tensor product of $A$
$||A||$: Frobenius norm of $A$
$(x, y) \triangleq (x-1)K + y$ for $1 \leq x, y \leq K$, when $(x, y)$ is used as a number, especially in subscripts or superscripts
$\langle a, b\rangle$ ($\langle\phi|\psi\rangle$) $\triangleq a^{\dagger}b$ ($\triangleq (|\phi\rangle)^{\dagger}|\psi\rangle$): inner product between column vectors (pure states)
$\langle A, B\rangle \triangleq \mathrm{Tr}(A^{\dagger}B)$: inner product of matrices $A$ and $B$
$[A, B] \triangleq AB - BA$: commutation
$\binom{j}{k} \triangleq \frac{j!}{k!(j-k)!}$: binomial coefficient
$A_i$: Kraus operators
$B(d, m, \{d_j\}) \triangleq \{L_1 \oplus L_2 \oplus ... \oplus L_m \,|\, \forall\, 1 \leq j \leq m,\ L_j \in \mathbb{C}^{d_j\times d_j}\}$: set of all block diagonal matrices with $m$ blocks, where $j$-th block $L_j$ is $d_j \times d_j$ dimensional and $\sum_{j=1}^{m} d_j = d$
$\mathbb{C}$: Complex domain
$\mathbb{C}^{d}$: $d$-dimensional complex vector space
$\mathbb{C}^{d\times d}$: Set of all $d \times d$ complex matrices
$CM$: Controllability matrix
$\mathrm{Cov}(e) \triangleq E[(e - E(e))(e - E(e))^{T}]$: covariance matrix of random variable vector $e$
$c_q$: Size of sampling square in complex plane for coherent state preparation in TSE QDT
$c_o \triangleq \arg\min_{c_q} E\{\mathrm{Tr}[(X_0^{T}X_0)^{-1}]\}$: optimal size of sampling square for coherent probe states in TSE QDT
$D$ ($D_d$, $D(\mathcal{H})$): Set of all ($d$-dimensional) density matrices (in space $\mathcal{H}$)
$\mathrm{diag}(a)$: A diagonal matrix with diagonal line consisting of elements in vector $a$
$\mathrm{diag}(A)$: A diagonal matrix obtained from square matrix $A$ by setting all non-diagonal elements in $A$ as zero
$\dim(\mathcal{H})$: Dimension of space $\mathcal{H}$
$E(\cdot)$: Expectation on all possible measurement results
$E(\cdot)$: Expectation on classical random variables $x$ and $y$
$\mathcal{E}$: Quantum process
$e$: Base of natural logarithm
$\exp(A)$: Matrix exponential function on square matrix $A$
$\{F_i\}$: Set of basis matrices for $\mathbb{C}^{d\times d}$
$F(\rho_1, \rho_2) \triangleq \mathrm{Tr}^{2}\left(\sqrt{\sqrt{\rho_1}\rho_2\sqrt{\rho_1}}\right)$: fidelity between states $\rho_1$ and $\rho_2$
$\bar{G}$: Accessible set
$H$: Hamiltonian
$H \triangleq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$: single-qubit Hadamard gate
$\mathcal{H}$: Hilbert space
$\hbar$: Reduced Planck constant
$I$: Identity operator or identity matrix
$I_{-k}$: Identity matrix with $k$-th diagonal element $-1$
$\mathrm{i} \triangleq \sqrt{-1}$: imaginary unit
$\{\mathrm{i}H_m\}_{m=1}^{d^2-1}$: An orthonormal basis set of $su(d)$
$K$: Number of steps in adaptive QST
$L$: Number of kinds of different probe states in QDT
$\mathcal{L}$: Lagrange function
$M$ ($M(\hat{\Theta}, \Theta)$) $\triangleq E(\hat{\Theta} - \Theta)(\hat{\Theta} - \Theta)^{T}$: Mean Squared Error matrix
$M$ ($M^{(j)}$): Number of matrices in ($j$-th) POVM
$M_t \triangleq \sum_{k=1}^{t} M^{(j_k)}$: number of regression equations after $t$ times POVM measurement
$\mathcal{M} \triangleq \bigcup_{j=1} \mathcal{M}^{(j)}$: admissible measurement set
$\mathcal{M}^{(j)} \triangleq \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$: $j$-th step POVM measurement operators
$N_D$: Length of experimental data (i.e., sampling times)
$N_H$: Number of unknown parameters in Hamiltonian
$N_L$: Dimension of linear system model (4.51)
$N_O$: Number of copies of each output state in QHI/QGI
$N_P$: Number of different POVM sets to reconstruct one output state in QHI/QGI
$N_q$: Number of qubits
$N_t$: Total number of copies of all quantum states used in an experiment
$n$: Number of elements in accessible set
$n^{(j)}$: Number of times $j$-th measurement $\mathcal{M}^{(j)}$ is performed
$n_m^{(j)}$: Number of occurrence of $m$-th outcome from $n^{(j)}$ measurement trials of $\mathcal{M}^{(j)}$
$n_{ij}$: Number of occurrence of $i$-th POVM outcome on $\rho_j$
$OM$: Observability matrix
$P_i$ ($P_i^{(j)}$): $i$-th matrix in ($j$-th) POVM
$p_i$ ($p_{ij}$) $\triangleq \mathrm{Tr}(P_i\rho)$ ($\triangleq \mathrm{Tr}(P_i\rho_j)$): measurement probability of $i$-th outcome on $\rho$ ($\rho_j$)
$\mathbb{R}$: Real domain
$\mathbb{R}^{d}$: $d$-dimensional real vector space
$\mathbb{R}^{d\times d}$: Set of all $d \times d$ real matrices
$\mathrm{Re}(a)$: Real part of $a$
$S_{jkl}$: Structure constants of $su(d)$
$S$: Similarity transformation matrix in STA
$s$: Laplace variable in Laplace transform
$su(d)$: Lie algebra consisting of all $d \times d$ skew-Hermitian traceless matrices
$T_R$: Running time of simulation programs in seconds
$\mathrm{Tr}_A(X)$ ($\mathrm{Tr}_B(X)$): Partial trace on space $\mathcal{H}_A$ ($\mathcal{H}_B$) where $X \in \mathcal{H}_A \otimes \mathcal{H}_B$
$t$: Time
$t_s$: Sampling time
$U$: Unitary propagator/quantum gate
$u$: Control signal
$\mathrm{vec}(A_{m\times n}) \triangleq [A_{11}, A_{21}, ..., A_{m1}, A_{12}, ..., A_{m2}, ..., A_{mn}]^{T}$: column vectorization of $A$
$\mathrm{vec}^{-1}(a)$: Inverse function of vectorization, from $\mathbb{C}^{d^2}$ to $\mathbb{C}^{d\times d}$
$W_m^{(j_k)}$: Weight of linear regression equation obtained from $m$-th outcome of $j_k$-th POVM
$X$: Process matrix
$\Delta A \triangleq \hat{A} - A$: difference between $A$ and its estimation $\hat{A}$
$\delta_{ij}$: Kronecker Delta function
$\delta(t)$: Dirac Delta function
$\eta$: Penalty coefficient in modified TSE method
$\Theta$ ($\Theta_j$) $\triangleq (\theta_1, ..., \theta_{d^2})^{T}$ ($\triangleq (\theta_1^{(j)}, ..., \theta_{d^2}^{(j)})^{T}$): vector of parametrization of ($j$-th) state
$\Theta \triangleq (\theta_2, ..., \theta_{d^2})^{T}$: effective vector of parametrization of state
$\hat{\Theta}_n$: LRE estimation of $\Theta$ using $n$ regression equations
$\theta_i$ ($\theta_i^{(j)}$) $\triangleq \mathrm{Tr}(\rho\Omega_i)$ ($\triangleq \mathrm{Tr}(\rho_j\Omega_i)$): parametrization coefficient of ($j$-th) state in $i$-th basis matrix $\Omega_i$
$\vartheta_i$: $i$-th unknown parameter in Hamiltonian
$\vartheta \triangleq (\vartheta_1, ..., \vartheta_{N_H})^{T}$: vector of all unknown Hamiltonian parameters
$\Lambda(A)$: Set of all eigenvalues of $A$ (repeated eigenvalues appear multiple times)
$\Lambda_L$: Lagrange multiplier variable/matrix
$\lambda_i(A)$: $i$-th eigenvalue of $A$
$\Xi \triangleq [\xi_{mn}]$: matrix composed of elements $\xi_{mn}$
$\xi_{mn}$: Expansion coefficient of $\rho_m$ using $\rho_n$
$\rho$: Density matrix
$\rho_i$: $i$-th probe state in QDT
$\rho_{in}$ ($\rho_{out}$): Input (output) state of a quantum process
$\Sigma \triangleq (A, B, C, D)$ ($\triangleq (A, B, C)$): 4(3)-tuples denoting a linear system
$\sigma_x, \sigma_y, \sigma_z, \sigma_+, \sigma_-$: Pauli matrices and their linear combinations, $\sigma_{\pm} = \frac{1}{2}(\sigma_x \pm \mathrm{i}\sigma_y)$
$\Phi \triangleq (\Phi_1^{T}, \Phi_2^{T}, ..., \Phi_M^{T})^{T}$: vector of all parameters in POVM matrices
$\Phi_{i,(j)} \triangleq (\phi_{1,(j)}^{i}, \phi_{2,(j)}^{i}, \phi_{3,(j)}^{i}, \phi_{4,(j)}^{i})^{T}$: vector of parametrization of $i$-th matrix of single-qubit projector on $j$-th qubit
$\Phi_j$ ($\Phi_j^{(k)}$) $\triangleq (\phi_1^{(j)}, ..., \phi_{d^2}^{(j)})^{T}$ ($\triangleq (\phi_{1,k}^{(j)}, ..., \phi_{d^2,k}^{(j)})^{T}$): vector of parametrization of $j$-th matrix of ($k$-th) POVM
$\Phi_j$ ($\Phi_j^{(k)}$) $\triangleq (\phi_2^{(j)}, ..., \phi_{d^2}^{(j)})^{T}$ ($\triangleq (\phi_{2,k}^{(j)}, ..., \phi_{d^2,k}^{(j)})^{T}$): effective vector of parametrization of $j$-th matrix of ($k$-th) POVM
$\Phi_i^{CE}$: Vector of parametrization of “Cyclic eigenvalues” method’s result to estimate eigenvector corresponding to $i$-th near-0 eigenvalue
$\phi_i^{(j)}$ ($\phi_{i,k}^{(j)}$) $\triangleq \mathrm{Tr}(P_j\Omega_i)$ ($\mathrm{Tr}(P_j^{(k)}\Omega_i)$): parametrization coefficient of ($j$-th) matrix of ($k$-th) POVM in $i$-th basis matrix $\Omega_i$
$\phi_{i,(k)}^{(j)} \triangleq \mathrm{Tr}[P_j(\Omega_1^{\otimes(k-1)} \otimes \Omega_i \otimes \Omega_1 \otimes \cdots \otimes \Omega_1)]$: parametrization coefficient of $k$-th qubit measurement component of product projector’s $j$-th matrix in $i$-th single-qubit basis $\Omega_i$
$|\psi\rangle$: Unit complex column vector representing a pure state
$\{\Omega_i\}_{i=1}^{d^2}$: A set of Hermitian bases for $\mathbb{C}^{d\times d}$, where $\mathrm{Tr}(\Omega_i\Omega_j) = \delta_{ij}$ and $\Omega_1 = I/\sqrt{d}$
List of Common Acronyms
BBO: β-barium borate
BME: Bayesian mean estimation
BS: Beam splitter
CLS: Constrained least squares
CW: Continuous wave
ERA: Eigenstate realization algorithm
GM: Gill-Massar
HWP: Half-wave plate
IF: Interference filter
LRE: Linear regression estimation
LS: Least squares
MES: Maximally entangled states
MIMO: Multiple-input multiple-output
MLE: Maximum likelihood estimation
MSE: Mean squared error
MUB: Mutually unbiased bases
PBS: Polarizing beam splitter
PGI: Pure-state-based gate identification
POVM: Positive operator valued measure
QGI: Quantum gate identification
QHI: Quantum Hamiltonian identification
QT: Quantum tomography
QST: Quantum state tomography
QPT: Quantum process tomography
QDT: Quantum detector tomography
QWP: Quarter-wave plate
RAQST: Recursively adaptive quantum state tomography
SIC-POVM: Symmetric informationally complete positive operator-valued measure
SISO: Single-input single-output
SNSPD: Superconducting nanowire single photon detector
SPD: Single photon detector
SPT: Structure preserving transformation
STA: Similarity transformation approach
TSE: Two-stage estimation
TSO: Two-step optimization
Chapter 1
Introduction
The search for the principles of nature has always been a main goal of physics. One of the most significant achievements since the 20th century is quantum science, which unravels the special properties of the universe at the microscale. A question then arises: what practical influence or applications can quantum science bring us?
Through the endeavours of recent decades, scientists have developed a number of revolutionary quantum technologies based on the principles of quantum mechanics. For example, quantum computation utilizes the superposition of quantum states to perform certain computation tasks with an efficiency much higher than that of classical (which means non-quantum throughout this thesis) computers [43]. Quantum communication encodes information in quantum states to realize secure exchange of key information [10]. Quantum sensing employs quantum properties to perform measurements with high sensitivity or precision [39]. These achievements, together with many other developing branches, are promising candidates for next-generation technologies in many information-related subjects.
Realizing and developing these quantum technologies usually requires accurate manipulation of certain quantum objects. A prerequisite is to obtain enough information about the unknown quantum entity; i.e., information about certain key structures or parameters of the entity needs to be extracted. This highlights the significance of system identification and parameter estimation, which is often called quantum tomography (QT) in quantum-associated subjects.
In contrast to the classical world, the quantum no-cloning theorem [41, 111, 158] implies that the information in a single copy of an unknown quantum state cannot be totally recovered. To overcome this obstacle, QT usually assumes a framework where a large number of independent identical copies of the unknown quantum entity are available, and data are obtained through proper interaction (e.g., quantum measurement) with these copies following certain protocols. Then the information about the entity can be extracted through a reconstruction algorithm using the data. The final target is thus to obtain an estimate of the whole entity (called full QT) or of partial aspects of the entity. Common indices for evaluating tomography methods include computational complexity, estimation error, efficiency, reliability, etc.
The main focus of this thesis is on designing novel full tomography methods. We start from estimating the state of an unknown quantum system, a technique called quantum state tomography (QST). Then we move our focus to the evolution of states. Specifically, we concentrate on closed quantum systems, where the system evolution is governed by the Hamiltonian, and the task is usually referred to as Hamiltonian identification. Before designing a novel Hamiltonian identification algorithm, we investigate the problem of Hamiltonian identifiability; i.e., whether a given experimental setting is enough to uniquely determine all the unknown parameters in the Hamiltonian. Finally, we “complete the triad, state, process and detector tomography, required to fully specify an experiment” [97] by considering quantum detector tomography.
In Ch.3, we propose a novel Recursively Adaptive Quantum State Tomography (RAQST) protocol for multi-qubit systems. Based on the linear regression estimation algorithm, RAQST recursively incorporates new measurement data into a historical estimate. Then, according to the updated estimate, an index is used to predict the performance of any candidate measurement basis. Simulation shows that even with the simplest 2-qubit product measurements, RAQST can outperform nonadaptive protocols, and also beat the Gill-Massar bound for a wide range of pure states. A quantum optical experiment on a two-qubit system demonstrates the effectiveness of our adaptive method.
Ch.4 switches the focus to the Hamiltonian identifiability problem, which investigates whether a given experimental setting can uniquely determine all the unknown parameters in the Hamiltonian. We employ the Similarity Transformation Approach (STA) in classical control theory to solve the quantum Hamiltonian identifiability problem, and prove identifiability conclusions for spin-1/2 chain systems with arbitrary dimensions assisted by single-qubit probes. We further extend the traditional STA method by proposing a Structure Preserving Transformation (SPT) method for non-minimal systems. We use the SPT method to introduce an indicator for the existence of economic quantum Hamiltonian identification algorithms, whose computational complexity directly depends on the number of unknown parameters (which could be much smaller than the system dimension). Finally, we give two examples of such economic Hamiltonian identification algorithms and perform simulations to demonstrate their effectiveness.
A test of quantum Hamiltonian identifiability is instrumental in saving time and cost for practical Hamiltonian identification experiments. With this precursory problem solved in Ch.4, we proceed to specific identification algorithms in Ch.5. We identify an unknown quantum Hamiltonian within the framework of quantum process tomography. In our method, different pre-designed probe states are input into the quantum system and the output states are estimated using the quantum state tomography protocol via linear regression estimation. To reconstruct the time-independent system Hamiltonian, we formulate the identification problem as an optimization problem, and design an approximate solution method using two-step optimization (TSO). We analyze the computational complexity and identification error of the TSO method, and provide numerical examples to demonstrate its effectiveness. Furthermore, we improve our TSO method to provide a more efficient Pure-state-based Gate Identification (PGI) algorithm, with the computational complexity reduced from $O(d^6)$ to $O(d^3)$ for a $d$-dimensional system. We note that theoretically both the input and output states in our protocol are pure. We thus design a fast pure-state tomography method to reconstruct the output states more efficiently. We establish an analytical error upper bound, and perform a single-qubit optical experiment to validate the effectiveness of the PGI method.
In Ch.6 we come to the tomography of a quantum detector using a Two-Stage Estimation (TSE) method. First, a series of different probe states are employed to generate measurement data. Then, using constrained linear regression estimation, a stage-1 estimate of the detector is obtained. Finally, the positive semidefinite requirement on the POVM matrices is added to guarantee a physical stage-2 estimate. We analyze the computational complexity of this approach and establish an error upper bound. We also discuss optimization of the coherent probe states. We perform simulation and a quantum optical experiment to verify the effectiveness of the TSE method.
Ch.7 summarizes all the aforementioned results and discusses possible future work in the field of QT.
Chapter 2
Quantum mechanics and standard tomography methods
2.1 Quantum mechanics foundations
Quantum mechanics is a set of mathematical and physical formalisms for describing nature at the scale of atoms and subatomic particles. Axiomatic methods were employed in the development and reformulation of quantum mechanics, leading to a number of fundamental postulates from which the whole theory can be deduced. We start from these postulates to introduce the necessary preliminaries for this thesis. The postulates in different textbooks have minor differences, and here we adopt the version in [104].
Postulate 2.1. Any isolated quantum system is associated to a complex vector space with inner product (namely, a Hilbert space) known as the state space of the system. The system is completely described by its state vector, which is a unit vector in the state space of the system.
Mathematically, a quantum state is usually denoted as a unit complex vector $|\psi\rangle$ in the underlying Hilbert space $\mathcal{H}$, which can also be viewed as a column vector $v$. Its adjoint is denoted as $\langle\psi|$ ($v^\dagger$), which corresponds to the conjugate ($*$) and transpose ($T$) of $v$. The inner product between two states $|\phi\rangle$ and $|\psi\rangle$ is denoted as $\langle\phi|\psi\rangle \triangleq \langle\phi| \cdot |\psi\rangle$. A state is thus usually normalized so that $\langle\psi|\psi\rangle = 1$. As a linear combination of vectors, a superposition of quantum states is also a valid state, in the form of $\sum_i c_i|\psi_i\rangle$, where the $c_i$ are complex coefficients with $\sum_i |c_i|^2 = 1$.
States that can each be represented by a single vector are called pure states. In contrast, a statistical ensemble of pure states, called a mixed state, cannot be described with a single vector. Hence, a mixed state is usually denoted as a density matrix $\rho$, which is Hermitian, positive semidefinite and satisfies $\mathrm{Tr}(\rho) = 1$. For a closed quantum system with state $|\psi\rangle$, we have $\rho = |\psi\rangle\langle\psi|$. In this thesis, we denote by $\mathcal{D}_d$ ($\mathcal{D}(\mathcal{H})$) the set of all $d$-dimensional quantum states (in space $\mathcal{H}$), also simplified as $\mathcal{D}$ when there is no ambiguity. Since quantum states are fundamental in quantum research, their efficient or accurate estimation is certainly an important problem.
The dimension of the underlying Hilbert space can be infinite or finite, depending upon the specific physical system. For the infinite dimensional case, the above notations $|\psi\rangle$ and $\rho$ are more commonly interpreted as operators instead of matrices, due to the fact that infinite dimensional matrices are difficult to tackle mathematically. This thesis is mainly focused on finite dimensional cases, and thus one can identify $|\psi\rangle$ and $\rho$ with finite dimensional vectors and matrices, respectively. The simplest nontrivial Hilbert space is two-dimensional, upon which the quantum system is called a qubit. An orthonormal basis of a qubit system is usually denoted as $|0\rangle$ and $|1\rangle$, corresponding to the classical 0-1 bit and forming the basic unit of quantum information.
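The defining properties of a density matrix can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names (`pure_density`, `is_density_matrix`, `purity`) are hypothetical and introduced only for illustration. The purity $\mathrm{Tr}(\rho^2)$ is not used in the text above; it is included here merely as a standard way to distinguish pure from mixed states.

```python
import numpy as np

# Computational basis of a qubit as vectors.
ket0 = np.array([1.0, 0.0], dtype=complex)
ket1 = np.array([0.0, 1.0], dtype=complex)

def pure_density(psi):
    """Return rho = |psi><psi| after normalizing the state vector."""
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def is_density_matrix(rho, tol=1e-10):
    """Check Hermiticity, positive semidefiniteness and unit trace."""
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    psd = np.all(np.linalg.eigvalsh((rho + rho.conj().T) / 2) >= -tol)
    unit_trace = abs(np.trace(rho) - 1.0) < tol
    return bool(hermitian and psd and unit_trace)

def purity(rho):
    """Tr(rho^2): equals 1 for pure states and is below 1 for mixed states."""
    return float(np.real(np.trace(rho @ rho)))

plus = (ket0 + ket1) / np.sqrt(2)   # superposition of |0> and |1>
rho_pure = pure_density(plus)
# Statistical ensemble: |0> and |1> with equal weights.
rho_mixed = 0.5 * pure_density(ket0) + 0.5 * pure_density(ket1)
```

Both `rho_pure` and `rho_mixed` pass `is_density_matrix`, but only the former can be written as a single outer product $|\psi\rangle\langle\psi|$.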
Postulate 2.2. The time evolution of the state of a closed quantum system is described by the Schrödinger equation

$$ i\hbar \frac{d|\psi(t)\rangle}{dt} = H|\psi(t)\rangle, \tag{2.1} $$

where $i = \sqrt{-1}$, $\hbar$ is the reduced Planck's constant and set to 1 in atomic units, and $H$ is a fixed Hermitian operator known as the Hamiltonian of the closed system.
Postulate 2.2 describes how a quantum state will evolve with time. It also has a density matrix version, which is the Liouville-von Neumann equation

$$ \dot{\rho} = -i[H, \rho], \tag{2.2} $$

where $[A, B] = AB - BA$ is the commutator and we use atomic units to set $\hbar = 1$ throughout the rest of this thesis.
When the Hamiltonian does not change with time, we say it is time-independent, and the solution to the Schrödinger equation is thus

$$ |\psi(t_2)\rangle = U(t_2, t_1)|\psi(t_1)\rangle, \tag{2.3} $$

where we define

$$ U(t_2, t_1) \triangleq \exp[-iH(t_2 - t_1)]. \tag{2.4} $$
Since it is the difference between $t_2$ and $t_1$ that determines $U$, one can further write $U$ as $U(t_2 - t_1)$. An operator $U$ in the form of (2.4) is always unitary, and it is called the propagator or quantum gate. Common single-qubit operators include the Pauli matrices

$$ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, $$

and the Hadamard gate

$$ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. $$
In the mixed-state case, (2.3) becomes

$$ \rho(t_2) = U(t_2 - t_1)\rho(t_1)U^\dagger(t_2 - t_1). \tag{2.5} $$
From Postulate 2.2 we see the dynamics of quantum states in closed systems are governed by the Hamiltonian/gate, which highlights the importance of the Hamiltonian/gate identification problem.
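The relation between a time-independent Hamiltonian and its propagator in (2.4), and the mixed-state evolution (2.5), can be illustrated numerically. This is a minimal sketch, assuming NumPy and $\hbar = 1$; the function name `propagator` and the example Hamiltonian $\sigma_x/2$ are assumptions chosen only for illustration.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)

def propagator(H, t):
    """U(t) = exp(-i H t) for a Hermitian H, via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T

H = 0.5 * sigma_x            # hypothetical single-qubit Hamiltonian
U = propagator(H, np.pi)     # evolution time t2 - t1 = pi

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
rho_t = U @ rho0 @ U.conj().T                      # equation (2.5)
```

For this choice, $U = \exp(-i\pi\sigma_x/2) = -i\sigma_x$, so the state $|0\rangle\langle 0|$ is mapped to $|1\rangle\langle 1|$; the global phase $-i$ has no effect on the density matrix.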
If the system under consideration has interaction with the environment, it becomes an open quantum system and has a more complicated state evolution. Usually a quantum process/operation is used to describe the transformation of the state.
Suppose there is a state $\rho_{in} \in \mathcal{D}(\mathcal{H}_A)$; then the process is a map $\mathcal{E}$ that transforms it to another state

$$ \rho_{out} = \mathcal{E}(\rho_{in}), \tag{2.6} $$

where $\rho_{out} \in \mathcal{D}(\mathcal{H}_B)$. In the general case, the input and output Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ can have different dimensions, while for simplicity in this thesis we assume they are the same space.
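For a process to be physical, it must further satisfy two restrictions:

(i) $\mathcal{E}$ must be trace preserving; i.e.,

$$ \forall \rho_{in} \in \mathcal{D}(\mathcal{H}_A), \quad \mathrm{Tr}[\mathcal{E}(\rho_{in})] \equiv \mathrm{Tr}(\rho_{in}). \tag{2.7} $$

(ii) $\mathcal{E}$ must be completely positive; i.e., for an arbitrary Hilbert space $\mathcal{H}_C$, $(\mathcal{E} \otimes I_C)\rho^{AC} \in \mathcal{D}(\mathcal{H}_B \otimes \mathcal{H}_C)$, $\forall \rho^{AC} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_C)$, where $I_C$ is the identity operator in $\mathcal{H}_C$.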
There are a number of specific equivalent representations for $\mathcal{E}$. In Sec. 2.2.2, we will introduce the Choi-Jamiołkowski isomorphism, and in Sec. 5.2.1 we will introduce the Kraus operator-sum representation.
Postulate 2.3. A quantum measurement is associated to a collection $\{Q_i\}$ of measurement operators, acting on the state space of the system being measured. They satisfy the completeness equation

$$ \sum_i Q_i^\dagger Q_i = I, \tag{2.8} $$

where $I$ is the identity operator. The index $i$ labels the possible measurement outcomes. If the quantum system has state $|\psi\rangle$ immediately before the measurement, then the probability that the $i$-th result occurs is given by

$$ p_i = \langle\psi|Q_i^\dagger Q_i|\psi\rangle, $$

and the post-measurement state is

$$ \frac{Q_i|\psi\rangle}{\sqrt{\langle\psi|Q_i^\dagger Q_i|\psi\rangle}}. $$
One can check that the completeness equation is equivalent to requiring that all the probabilities sum to one. If $\{Q_i\}$ further satisfies $Q_i Q_j = \delta_{ij} Q_i$ and each $Q_i$ is Hermitian, then the measurement is called a projective measurement, and $Q_i$ a projector. When the post-measurement state is not of much interest, a more widely used formalism is the Positive Operator-Valued Measure (POVM) measurement, which can be deduced from Postulate 2.3. We define $P_i \triangleq Q_i^\dagger Q_i$. Then a POVM measurement is associated with a set of positive operators $\{P_i\}$, with their sum equal to the identity. The probability of the $i$-th outcome is now determined as

$$ p_i = \langle\psi|P_i|\psi\rangle, $$

which in the mixed-state case is

$$ p_i = \mathrm{Tr}(P_i \rho). \tag{2.9} $$

Each $P_i$ is a POVM element, and in the finite dimensional case it corresponds to a positive semidefinite matrix.
Suppose we have a series of probabilities (2.9). If the measurement outcome probabilities (the $p_i$) are approximated by experiments and the POVM elements (the $P_i$) are known, the technique to deduce the unknown state $\rho$ is called quantum state tomography (QST). Conversely, if the probabilities and the state are known, the procedure to deduce the unknown POVM elements is called quantum detector tomography (QDT), because POVM elements are the mathematical representation of quantum detectors (measurement devices).
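To make (2.9) concrete, the sketch below evaluates the outcome probabilities of a three-element qubit POVM. It assumes NumPy; the "trine" POVM used here is a standard textbook example chosen for illustration and is not taken from this thesis, and the function names are hypothetical.

```python
import numpy as np

def trine_povm():
    """Three rank-one POVM elements P_k = (2/3)|psi_k><psi_k| whose Bloch
    vectors are 120 degrees apart in the x-z plane, so they sum to I."""
    elements = []
    for k in range(3):
        angle = k * np.pi / 3
        psi = np.array([np.cos(angle), np.sin(angle)], dtype=complex)
        elements.append((2.0 / 3.0) * np.outer(psi, psi.conj()))
    return elements

def outcome_probabilities(povm, rho):
    """Born rule in the mixed-state form p_i = Tr(P_i rho), equation (2.9)."""
    return np.array([np.real(np.trace(P @ rho)) for P in povm])

povm = trine_povm()
rho = np.array([[1, 0], [0, 0]], dtype=complex)   # the state |0><0|
probs = outcome_probabilities(povm, rho)          # (2/3, 1/6, 1/6)
```

In QST the vector `probs` would be approximated by measured frequencies and `rho` would be the unknown; in QDT the roles of `rho` and `povm` are exchanged.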
Postulate 2.4. The state space of a composite physical system is the tensor product of the state spaces of the component physical systems. Specifically, let $|\psi_i\rangle$ be the state of the $i$-th subsystem and suppose there are $n$ subsystems altogether; then the total system has the joint state $|\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle$, which is often written in the short notation $|\psi_1\psi_2\cdots\psi_n\rangle$.
From Postulate 2.4 we know the composite of density matrices (or of operators) is also in the tensor product form. When the operators are Pauli matrices, a more common way to express their composite is to omit the tensor product and identity notation, and use number subscripts to denote the qubit number. For example, for a 4-qubit system, $Y_1X_2$ is in fact short for $\sigma_y \otimes \sigma_x \otimes I_2 \otimes I_2$.

If a state of the total system can be written as the tensor product of states of the component systems, as in the form in Postulate 2.4, then it is called separable. Non-separable states are called entangled states, which are one of the most important resources in quantum science.
Postulate 2.4 directly shows how to obtain composite states from states on subsystems. To go in the opposite direction, we need the partial trace to obtain a reduced density operator. For any $|a\rangle, |b\rangle \in \mathcal{H}_A$ and $|c\rangle, |d\rangle \in \mathcal{H}_B$, the partial trace over $\mathcal{H}_A$ is defined by

$$ \mathrm{Tr}_A(|a\rangle\langle b| \otimes |c\rangle\langle d|) = \mathrm{Tr}(|a\rangle\langle b|)\,|c\rangle\langle d|. $$

Suppose $\rho^{AB}$ is a state on $\mathcal{H}_A \otimes \mathcal{H}_B$. Then $\rho^{AB}$ restricted to $\mathcal{H}_B$ is the reduced density operator for system $\mathcal{H}_B$, as

$$ \rho^B \triangleq \mathrm{Tr}_A(\rho^{AB}). $$

In particular, if $\rho^{AB} = \rho \otimes \sigma$, then $\rho^B = \mathrm{Tr}_A(\rho \otimes \sigma) = \sigma$.
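The partial trace is straightforward to compute for finite dimensional systems. A minimal sketch, assuming NumPy and the Kronecker-product ordering in which subsystem $A$ is the first tensor factor; the function name `partial_trace_A` is hypothetical.

```python
import numpy as np

def partial_trace_A(rho_ab, dim_a, dim_b):
    """Trace out subsystem A of an operator on H_A (x) H_B."""
    # Index the matrix as (a, b, a', b') and sum over a = a'.
    reshaped = rho_ab.reshape(dim_a, dim_b, dim_a, dim_b)
    return np.trace(reshaped, axis1=0, axis2=2)

# Product state rho (x) sigma: the reduced state on B is sigma.
rho = np.array([[1, 0], [0, 0]], dtype=complex)            # |0><0|
sigma = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)  # |+><+|
reduced_product = partial_trace_A(np.kron(rho, sigma), 2, 2)

# Entangled (Bell) state: the reduced state on B is maximally mixed.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
reduced_bell = partial_trace_A(np.outer(bell, bell.conj()), 2, 2)
```

The second example shows that a subsystem of a pure entangled state is in general described by a mixed state.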
2.2 Standard maximum likelihood estimation
The main focus of this thesis is on designing new tomography algorithms, and later it will be necessary to compare them with existing methods. Hence, in this section we briefly introduce one of the most commonly used tomography methods, the Maximum Likelihood Estimation (MLE) method.
2.2.1 Quantum state tomography via maximum likelihood estimation
Quantum State Tomography (QST) is the technique to deduce an unknown quantum state from measurement data. We hereby introduce the MLE method to perform QST, based on [75, 79].
Suppose we have $N_t$ identical independent copies of an unknown state $\rho$. Usually we perform a series of POVM measurements $\{P_i\}$ on them to extract information. Denote the observed occurrence of the $i$-th outcome as $n_i$; then the frequency $\hat{p}_i = n_i/N_t$ is an experimental approximation to the true probability $p_i$. The probability that we observe such data is in fact $\prod_i p_i^{n_i}$, which after taking the logarithm and dividing by $N_t$ is equivalent to the log-likelihood functional

$$ \mathcal{L}(\rho) = \sum_i \hat{p}_i \ln \mathrm{Tr}(\rho P_i). $$
The core idea of MLE is to search for the solution that maximizes the probability to observe the data in hand. Hence, MLE takes

$$ \hat{\rho}_{MLE} = \arg\max_{\rho} \sum_i \hat{p}_i \ln \mathrm{Tr}(\rho P_i) $$

as the estimate of the state. By analyzing the extremal equation, one can obtain an iterative search algorithm:

1. Assign an admissible guess to the initial state; e.g., $\rho^{(0)} = I/d$.

2. At step-$k$, compute

$$ R^{(k)} = \sum_i \frac{\hat{p}_i}{\mathrm{Tr}[\rho^{(k)} P_i]} P_i. $$

3. Update the estimate at step-$(k+1)$ as

$$ \rho^{(k+1)} = \frac{R^{(k)}\rho^{(k)}R^{(k)}}{\mathrm{Tr}[R^{(k)}\rho^{(k)}R^{(k)}]}. $$

4. Terminate the iteration if the distance between $\rho^{(k)}$ and $\rho^{(k+1)}$ is smaller than a given threshold; otherwise, let $k$ be $k+1$ and repeat the iteration.
It is straightforward to check that during the above procedures the estimated value of the state is kept positive semidefinite and with trace 1. Hence, MLE always gives a physical estimate. Furthermore, there are improved versions of MLE QST to accelerate the algorithm; e.g., see [128, 140].
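The four steps above can be sketched in a few lines. This is an illustrative sketch, assuming NumPy; the function name `mle_qst`, the Frobenius-norm stopping rule, the six-element Pauli POVM, and the noise-free frequencies of a hypothetical mixed qubit state are all assumptions made for the example, not choices taken from [75, 79].

```python
import numpy as np

def mle_qst(povm, freqs, dim, max_iter=5000, tol=1e-10):
    """Iterate rho <- R rho R / Tr(R rho R), starting from rho = I/d."""
    rho = np.eye(dim, dtype=complex) / dim                # step 1
    for _ in range(max_iter):
        probs = [np.real(np.trace(rho @ P)) for P in povm]
        # Step 2: R = sum_i (p_hat_i / Tr[rho P_i]) P_i (zero counts skipped).
        R = sum(f / p * P for f, p, P in zip(freqs, probs, povm) if f > 0)
        new_rho = R @ rho @ R                             # step 3
        new_rho = new_rho / np.real(np.trace(new_rho))
        if np.linalg.norm(new_rho - rho) < tol:           # step 4
            return new_rho
        rho = new_rho
    return rho

I2 = np.eye(2, dtype=complex)
paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]
# Six-element POVM built from the Pauli eigenprojectors, each weighted by 1/3.
povm = [(I2 + s * sigma) / 6 for sigma in paulis for s in (+1, -1)]

# Hypothetical true state and its exact (noise-free) outcome frequencies.
rho_true = 0.5 * (I2 + 0.3 * paulis[0] + 0.2 * paulis[1] + 0.5 * paulis[2])
freqs = [np.real(np.trace(rho_true @ P)) for P in povm]
rho_hat = mle_qst(povm, freqs, dim=2)
```

Because each update has the form $R\rho R$ followed by a trace normalization, every iterate stays Hermitian, positive semidefinite and of unit trace, which is the property noted above.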
2.2.2 Quantum process tomography via maximum likelihood estimation
Quantum Process Tomography (QPT) is a technique to employ quantum states (which are usually known) to estimate an unknown quantum process, which is a map between quantum states. Here we rephrase the framework in [79] to perform MLE QPT.
Suppose the process is a map $\mathcal{E}$ from $\mathcal{D}(\mathcal{H}_A)$ to $\mathcal{D}(\mathcal{H}_B)$; i.e., for any state $\rho_{in} \in \mathcal{D}(\mathcal{H}_A)$, $\mathcal{E}(\rho_{in}) = \rho_{out} \in \mathcal{D}(\mathcal{H}_B)$. From the Choi-Jamiołkowski isomorphism [31, 66, 78, 110], $\mathcal{E}$ is in one-to-one correspondence with an operator $Q$ on $\mathcal{H}_A \otimes \mathcal{H}_B$, such that

$$ \mathcal{E}(\rho_{in}) = \mathrm{Tr}_A[Q(\rho_{in}^T \otimes I^B)]. $$

The trace preserving restriction (2.7) is equivalent to

$$ \mathrm{Tr}_B(Q) = I^A \tag{2.10} $$

and the completely positive restriction on the process amounts to requiring that $Q$ is positive semidefinite. Under this representation, reconstructing the process $\mathcal{E}$ amounts to reconstructing $Q$. Usually a series of different states $\rho_m$ are input to the process, and POVM measurements $\{P_l^{(m)}\}$ are performed on each corresponding output state. Let $\hat{p}_{lm}$ denote the observed frequency of the corresponding outcomes from the POVM $\{P_l^{(m)}\}$. We then aim to maximize the constrained log-likelihood functional as

$$ \hat{Q}_{MLE} = \arg\max_{Q} \sum_{m,l} \hat{p}_{lm} \ln \mathrm{Tr}[Q(\rho_m^T \otimes P_l^{(m)})] - \mathrm{Tr}[(\Lambda_L \otimes I^B)Q], $$

where $\Lambda_L$ is the Lagrange multiplier matrix accounting for the trace-preservation condition (2.10). One can then again design a numerical iteration algorithm to search for the optimal solution:

1. Assign an admissible guess to the initial process; e.g., $Q^{(0)} = I^{AB}/\dim(\mathcal{H}_B)$.

2. At step-$k$, compute

$$ K^{(k)} = \sum_{m,l} \frac{\hat{p}_{lm}}{\mathrm{Tr}[Q^{(k)}(\rho_m^T \otimes P_l^{(m)})]} \rho_m^T \otimes P_l^{(m)} $$

and

$$ \Lambda_L^{(k)} = [\mathrm{Tr}_B(K^{(k)} Q^{(k)} K^{(k)})]^{1/2}. $$

3. Update the estimate at step-$(k+1)$ as

$$ Q^{(k+1)} = [(\Lambda_L^{(k)})^{-1} \otimes I^B] K^{(k)} Q^{(k)} K^{(k)} [(\Lambda_L^{(k)})^{-1} \otimes I^B]. $$

4. Terminate the iteration if the distance between $Q^{(k)}$ and $Q^{(k+1)}$ is smaller than a given threshold; otherwise, let $k$ be $k+1$ and repeat the iteration.
One can check that the above procedures keep Q positive semidefinite and preserve the condition (2.10).
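The Choi-Jamiołkowski representation used in this subsection can be verified numerically on a simple example. The sketch below assumes NumPy and equal input and output dimensions; the construction $Q = \sum_{ij} |i\rangle\langle j| \otimes \mathcal{E}(|i\rangle\langle j|)$ is one convention that satisfies $\mathcal{E}(\rho_{in}) = \mathrm{Tr}_A[Q(\rho_{in}^T \otimes I^B)]$, and the function names and the choice of a Hadamard gate as the channel are assumptions made only for illustration.

```python
import numpy as np

def choi_matrix(channel, dim):
    """Q = sum_ij |i><j| (x) E(|i><j|) for a linear map E on dim x dim matrices."""
    Q = np.zeros((dim * dim, dim * dim), dtype=complex)
    for i in range(dim):
        for j in range(dim):
            E_ij = np.zeros((dim, dim), dtype=complex)
            E_ij[i, j] = 1.0
            Q += np.kron(E_ij, channel(E_ij))
    return Q

def apply_choi(Q, rho, dim):
    """Evaluate E(rho) = Tr_A[Q (rho^T (x) I^B)]."""
    X = Q @ np.kron(rho.T, np.eye(dim))
    return np.trace(X.reshape(dim, dim, dim, dim), axis1=0, axis2=2)

def partial_trace_B(Q, dim):
    """Tr_B(Q), which equals I^A for a trace-preserving process, see (2.10)."""
    return np.trace(Q.reshape(dim, dim, dim, dim), axis1=1, axis2=3)

hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
gate = lambda rho: hadamard @ rho @ hadamard.conj().T   # unitary channel

Q = choi_matrix(gate, 2)
# Hypothetical input state with Bloch vector (0.3, 0.2, 0.5).
rho_in = np.array([[0.75, 0.15 - 0.1j], [0.15 + 0.1j, 0.25]])
rho_out = apply_choi(Q, rho_in, 2)
```

For this unitary channel, `Q` is positive semidefinite and `partial_trace_B(Q, 2)` returns the identity, matching the two physicality conditions stated above.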
2.2.3 Quantum detector tomography via maximum likelihood estimation
Quantum Detector Tomography (QDT) amounts to reconstructing the POVM elements of a set of quantum measurements, since detectors are a kind of physical realization of POVMs. In this section, we rephrase the MLE QDT method in [110].
Suppose we perform POVM measurements $\{P_l\}_{l=1}^{M}$ (called one detector) on a series of different states $\rho_m$, and the observed corresponding frequency is $\hat{p}_{lm}$ for the $l$-th outcome. To reconstruct $\{P_l\}$, we need to consider the solution that maximizes the constrained log-likelihood functional

$$ \{\hat{P}_l\}_{MLE} = \arg\max_{\{P_l\}} \sum_{m,l} \hat{p}_{lm} \ln \mathrm{Tr}(\rho_m P_l) - \sum_l \mathrm{Tr}(\Lambda_L P_l), $$

where $\Lambda_L$ is the Lagrange multiplier matrix accounting for the constraint

$$ \sum_l P_l = I. $$

One can again design a numerical iteration algorithm as follows:

1. Assign an admissible guess to the initial detector; e.g., $P_l^{(0)} = I/M$.

2. At step-$k$, for each $l$, compute

$$ R_l^{(k)} = \sum_m \frac{\hat{p}_{lm}}{\mathrm{Tr}[\rho_m P_l^{(k)}]} \rho_m. $$

Then update the Lagrange multiplier matrix as

$$ \Lambda_L^{(k)} = \Big( \sum_l R_l^{(k)} P_l^{(k)} R_l^{(k)} \Big)^{1/2}. $$

3. Update the estimate at step-$(k+1)$ for each $l$ as

$$ P_l^{(k+1)} = (\Lambda_L^{(k)})^{-1} R_l^{(k)} P_l^{(k)} R_l^{(k)} (\Lambda_L^{(k)})^{-1}. $$

4. Terminate the iteration if the distance between $\{P_l^{(k)}\}$ and $\{P_l^{(k+1)}\}$ is smaller than a given threshold; otherwise, let $k$ be $k+1$ and repeat the iteration.
One can check that the above procedures guarantee that the POVM matrices are positive semidefinite and sum to the identity.
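The detector iteration can be sketched in the same style as for states. This is an illustrative sketch, assuming NumPy; the function names, the stopping rule, the six Pauli eigenstates used as probes and the hypothetical two-outcome detector with noise-free frequencies are assumptions for the example, not choices taken from [110].

```python
import numpy as np

def herm_power(A, power):
    """Power of a Hermitian positive definite matrix via eigendecomposition."""
    evals, evecs = np.linalg.eigh(A)
    return evecs @ np.diag(evals ** power) @ evecs.conj().T

def mle_qdt(probes, freqs, num_outcomes, dim, max_iter=2000, tol=1e-10):
    """freqs[l][m] is the observed frequency of outcome l for probe state m."""
    povm = [np.eye(dim, dtype=complex) / num_outcomes
            for _ in range(num_outcomes)]                  # step 1
    for _ in range(max_iter):
        R = []
        for l in range(num_outcomes):                      # step 2
            R_l = np.zeros((dim, dim), dtype=complex)
            for m, rho in enumerate(probes):
                if freqs[l][m] > 0:
                    R_l += freqs[l][m] / np.real(np.trace(rho @ povm[l])) * rho
            R.append(R_l)
        G = sum(R[l] @ povm[l] @ R[l] for l in range(num_outcomes))
        lam_inv = herm_power(G, -0.5)                      # inverse of Lambda_L
        new_povm = [lam_inv @ R[l] @ povm[l] @ R[l] @ lam_inv
                    for l in range(num_outcomes)]          # step 3
        change = max(np.linalg.norm(a - b) for a, b in zip(new_povm, povm))
        povm = new_povm
        if change < tol:                                   # step 4
            break
    return povm

I2 = np.eye(2, dtype=complex)
paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]
probes = [(I2 + s * sigma) / 2 for sigma in paulis for s in (+1, -1)]

# Hypothetical two-outcome detector and its exact outcome frequencies.
P0 = np.array([[0.9, 0.1], [0.1, 0.2]], dtype=complex)
true_povm = [P0, I2 - P0]
freqs = [[np.real(np.trace(rho @ P)) for rho in probes] for P in true_povm]
povm_hat = mle_qdt(probes, freqs, num_outcomes=2, dim=2)
```

Each update is a congruence transformation of $P_l^{(k)}$, and the normalization by $\Lambda_L^{(k)}$ forces the elements to sum to the identity, so every iterate is a valid POVM regardless of how close it is to the true detector.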
Although MLE has been widely accepted and used in quantum tomography, it still has some intrinsic drawbacks that might be improved by other methods. For example, the iterative procedure is not very amenable to adaptivity, and usually results in a heavy computational burden. Also, it is not easy to theoretically characterize the estimation error. These drawbacks also appear in most other existing algorithms such as Bayesian Mean Estimation [18, 76]. To alleviate or overcome these drawbacks is thus the main motivation for the research in this thesis.
Chapter 3
Recursively adaptive multi-qubit state tomography
The work reported in this chapter has been partially published in the following articles:

1. B. Qi, Z. Hou, Y. Wang, D. Dong, H.-S. Zhong, L. Li, G.-Y. Xiang, H. M. Wiseman, C.-F. Li, and G.-C. Guo, Adaptive quantum state tomography via linear regression estimation: Theory and two-qubit experiment, npj Quantum Information, vol. 3, no. 1, p. 19, 2017.
2. D. Dong, Y. Wang, Z. Hou, B. Qi, Y. Pan, and G.-Y. Xiang, State tomography of qubit systems using linear regression estimation and adaptive measurements, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 13014-13019, Toulouse, France, July 2017.
3. D. Dong and Y. Wang, Several recent developments in estimation and robust control of quantum systems, in 2017 Australian and New Zealand Control Conference (ANZCC), pp. 190-195, Gold Coast, Australia, December 2017.
3.1 Introduction
One of the central problems in quantum science and technology is the estimation of an unknown quantum state [104]. Quantum state tomography (QST), as a procedure for experimentally determining an unknown quantum state, has become a standard technology for verification and benchmarking of quantum devices [7, 33, 35, 58, 62, 77, 85, 93, 96, 99, 110, 113, 121, 137, 141, 144]. Two key tasks in QST are data acquisition and data analysis. The aim of data acquisition is to acquire information for reconstructing the quantum state through appropriate measurement strategies. Then in the data analysis step, the acquired data is associated with an estimate of the unknown quantum state using a reconstruction algorithm.
For data acquisition, in order to enhance the efficiency, it is desirable to develop optimal measurement strategies for collecting data. However, an optimal measurement strategy, which is only known for a few special cases [29, 58, 65, 70, 170], depends on the state to be reconstructed. To circumvent this issue, many kinds of fixed sets of measurement bases have been designed to be optimal either in terms of the average over a certain quantum state space [2, 15, 38, 106, 157] or in terms of the worst case in the quantum state space [113]. For instance, improved state estimation can be achieved by taking advantage of mutually unbiased bases (MUB) [2, 46, 157] or symmetric informationally complete positive operator-valued measures (SIC-POVM) [11, 117]. For multi-partite quantum systems, MUB and SIC-POVM are difficult to realize experimentally since they involve nonlocal measurements. It remains open how to efficiently acquire information about an unknown quantum state using simple measurements that are straightforward to realize experimentally.
For data analysis in tomography, although many methods, such as maximum-likelihood estimation (MLE) [17, 110, 130, 139, 140], Bayesian mean estimation (BME) [18, 76] and least-squares inversion [109], have been used to reconstruct a quantum state, this task can be computationally intensive, and may take even more time than the experiments themselves. It has been reported in [62] that using the maximum-likelihood method to reconstruct an eight-qubit state took weeks of computation. Therefore, the development of an efficient data analysis algorithm is a critical issue in quantum state tomography [98, 113]. In [113], a recursive linear regression estimation algorithm was presented which is much more computationally efficient than the maximum-likelihood method, at the cost of only a small loss of accuracy.
For a given number of copies of the system, a natural idea for improving the tomography accuracy is to develop an adaptive tomography protocol where the measurement is adaptively optimized based on the data collected so far. Adaptive measurements have been shown to be more powerful than nonadaptive measurements in quantum phase estimation [67, 155, 160], phase tracking [162], quantum state discrimination [1, 68], and Hamiltonian estimation [49, 125]. In fact, adaptivity has been proposed for quantum state tomography in various contexts [5, 58, 76, 86, 90, 98, 108, 136]. For example, the results on one qubit have demonstrated that adaptive quantum state tomography can improve the accuracy quadratically in terms of the infidelity index [98]. However, when generalizing these results to multi-qubit systems, two new problems arise:
Problem 3.1. In the adaptive tomography protocol, the optimized measurement bases may be nonlocal, which are difficult to realize in experiments. The question of how to adapt the theory to practical experiments is an open problem.
Problem 3.2. Ref. [98] points out that estimating the near-zero eigenvalues of the state is vital to reduce the infidelity. Single-qubit states have at most one near-zero eigenvalue, while a multi-qubit state can have many, namely, being (or approximately being) degenerate. What effect does this have upon the adaptive protocol?
In this chapter, we combine the computational efficiency of the recursive technique of [113] with a new adaptive protocol that does not necessarily require nonlocal measurement to present a recursively adaptive quantum state tomography (RAQST) protocol. In Sec. 3.2, we introduce our RAQST protocol, where no prior assumption (except the dimension) is made on the state to be reconstructed. The state estimate is recursively updated based on the current estimate and the new measurement data, via certain closed form formulas. Thus, compared with MLE and BME, the combination of historical information with the newly acquired data is much more efficient in our method. Thanks to the simple recursive estimation procedure, we can obtain the state estimate in real time, and using the estimate we can adaptively optimize the measurement strategies to be performed in the forthcoming step. In our RAQST protocol, the measurement to be performed at each step is optimized w.r.t. the corresponding admissible measurement set determined by the experimental conditions. As an example, we consider the case where only product measurements are performed, and we design a numerical algorithm to optimize the measurement basis among all product measurements.
In Sec. 3.3, we present simulation results for the RAQST protocol. It is first demonstrated numerically that our RAQST even with the simplest product measurements can outperform the tomography protocols using MUBs and the two-stage MUB adaptive strategy. For maximally entangled states, the infidelity can even be reduced to beat the Gill-Massar bound, which is a quantum Cramér-Rao inequality [58]. Moreover, if nonlocal measurements are available, with our RAQST the infidelity can be further reduced. For a wide range of quantum states, the infidelity of our RAQST can be reduced to beat the Gill-Massar bound with a modest number of copies. We perform two-qubit state tomography experiments in Sec. 3.4 using only the simplest product measurements, and the experimental results demonstrate that the improvement of our RAQST over nonadaptive tomography is significant for states with a high level of purity. This limit (very high purity) is the one relevant for most forms of quantum information processing. Finally, Sec. 3.5 concludes this chapter and presents relevant open problems.
3.2 RAQST protocol
A basic observation for proposing adaptivity is that the optimal measurement basis w.r.t. the estimation error usually depends on the specific state to be estimated [98]. Hence, the general idea of adaptivity is to employ historical estimates to deduce an optimal or near-optimal measurement basis to improve the accuracy. Let $K$ denote the total number of adaptive steps (non-adaptivity corresponds to $K = 1$). In the $k$-th step ($2 \le k \le K$), all the data collected up to and including the $(k-1)$-th step is employed to deduce an estimate of the state, according to which a new measurement basis is determined to perform the $k$-th step's measurement. The problem is to find an appropriate mathematical framework to describe the adaptivity process, preferably reducing the estimation error significantly compared with non-adaptive protocols, achieving an efficiency advantage and (partly) answering Problem 3.1 and Problem 3.2.
Ref. [113] proposed a linear regression estimation (LRE) method for quantum state tomography, where the results have shown that the LRE approach has much lower computational complexity than the MLE method for quantum tomography. Also, the LRE solution has a closed form, which should be advantageous in performing adaptive QST. Here, we further develop this LRE method to present a RAQST protocol that can greatly improve the precision of tomography.
3.2.1 Establishment of linear regression model
We first convert the quantum state tomography problem into a parameter estimation problem for a linear regression model. Consider a $d$-dimensional quantum system with Hilbert space $\mathcal{H}$. Let $\{\Omega_i\}_{i=1}^{d^2}$ denote a complete Hermitian basis set of $\mathbb{C}^{d \times d}$, satisfying $\mathrm{Tr}(\Omega_i\Omega_j) = \delta_{ij}$. Also let $\Omega_1 = I/\sqrt{d}$, and then the rest of the $\Omega_i$ are all traceless. Using this set, the quantum state $\rho$ to be reconstructed can be parameterized as

$$ \rho = \frac{I}{d} + \sum_{i=2}^{d^2} \theta_i \Omega_i, \tag{3.1} $$

where $\theta_i = \mathrm{Tr}(\rho\Omega_i)$ is real. Since $\theta_1 = 1/\sqrt{d}$ is fixed, we take the effective parametrization vector as $\Theta = (\theta_2, \cdots, \theta_{d^2})^T$.
A quantum measurement can be described by a positive operator-valued measure (POVM) $\{P_i\}_{i=1}^{M}$, which is a set of positive semidefinite matrices which sum to the identity, i.e., $P_i \ge 0$ and $\sum_{i=1}^{M} P_i = I$. In QST, different sets of POVMs should be appropriately combined to efficiently acquire information about the unknown quantum state. Let $\mathcal{M} = \bigcup_{j=1} \mathcal{M}^{(j)}$ denote the admissible measurement set, which is a union of POVMs determined by the experimental conditions. Each POVM is denoted as $\mathcal{M}^{(j)} = \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$. Using the set of $\{\Omega_k\}_{k=2}^{d^2}$, elements of the POVM can be parameterized as

$$ P_i^{(j)} = \phi_{1,j}^{(i)} \frac{I}{\sqrt{d}} + \sum_{k=2}^{d^2} \phi_{k,j}^{(i)} \Omega_k, $$

where $\phi_{1,j}^{(i)} = \mathrm{Tr}(P_i^{(j)})/\sqrt{d}$, and $\phi_{k,j}^{(i)} = \mathrm{Tr}(P_i^{(j)}\Omega_k)$. Let the effective parametrization vector of the $i$-th matrix of the $j$-th POVM be $\Phi_i^{(j)} = (\phi_{2,j}^{(i)}, \cdots, \phi_{d^2,j}^{(i)})^T$. If we perform the POVM $\mathcal{M}^{(j)}$ on copies of a system in state $\rho$, the probability that we observe the result $m$ is given by

$$ p(m|\mathcal{M}^{(j)}) = \mathrm{Tr}(P_m^{(j)}\rho) = \phi_{1,j}^{(m)}/\sqrt{d} + \Theta^T \Phi_m^{(j)}. \tag{3.2} $$
Assume that the total number of copies of the state is $N_t$, and we perform a measurement described by $\mathcal{M}^{(j)} = \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$ $n^{(j)}$ times. Let $n_m^{(j)}$ denote the number of occurrences of the outcome $m$ from the $n^{(j)}$ measurement trials of $\mathcal{M}^{(j)}$. Let $\hat{p}(m|\mathcal{M}^{(j)}) = n_m^{(j)}/n^{(j)}$, and $e_m^{(j)} = \hat{p}(m|\mathcal{M}^{(j)}) - p(m|\mathcal{M}^{(j)})$. According to the central limit theorem [32], $e_m^{(j)}$ converges in distribution to a normal distribution with mean 0 and variance

$$ [p(m|\mathcal{M}^{(j)}) - p^2(m|\mathcal{M}^{(j)})]/n^{(j)}. \tag{3.3} $$
Using (3.2), we have the linear regression equations for m = 2, ··· , M(j),
√ (j) (m) T (j) (j) pˆ(m|M ) = φ1,j / d + Θ Φm + em . (3.4)
√ (j) (m) (j) (j) Note thatp ˆ(m|M ), φ1,j / d and Φm are all available, while em may be considered as the observation noise. Hence, the problem of QST is converted into the estimation
54 3.2. RAQST PROTOCOL of the unknown vector Θ.
Denote the total number of regression equations as $M_K = \sum_{k=1}^{K} M^{(j_k)}$, and the estimation using the former $n$ equations as $\hat{\Theta}_n$. To give an estimate with a high level of accuracy, the basic idea of LRE is to find an estimate $\hat{\Theta}_{M_K}$ such that

$$\hat{\Theta}_{M_K} = \underset{\hat{\Theta}}{\mathrm{argmin}} \sum_{k=1}^{K} \sum_{m=1}^{M^{(j_k)}} W_m^{(j_k)} \left[\hat{p}(m|\mathcal{M}^{(j_k)}) - \phi_{1,j_k}^{(m)}/\sqrt{d} - \hat{\Theta}^T \Phi_m^{(j_k)}\right]^2. \tag{3.5}$$

Here, $\mathcal{M}^{(j_k)}$ denotes the POVM $\mathcal{M}^{(j_k)} = \{P_m^{(j_k)}\}_{m=1}^{M^{(j_k)}}$ being performed at the $k$-th step. The notation $W_m^{(j_k)}$ denotes the weight of the corresponding linear regression equation. In general, the smaller the variance of $e_m^{(j_k)}$ is, the more the information can be extracted by $P_m^{(j_k)}$. Therefore, the corresponding weight of the regression equation should be larger. A sound choice of $W_m^{(j_k)}$ is the estimate of the inverse of the variance of $e_m^{(j_k)}$; i.e., $W_m^{(j_k)} = n^{(j_k)}/[\hat{p}(m|\mathcal{M}^{(j_k)}) - \hat{p}^2(m|\mathcal{M}^{(j_k)})]$.
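The weight rule above can be illustrated with a short sketch that turns the outcome counts of one POVM into the frequencies $\hat{p}$ and the weights $W$. The function name `frequencies_and_weights` is hypothetical, and returning `None` when $\hat{p}$ equals 0 or 1 (where the formula is undefined) is an assumption; the text does not say how such cases are handled.

```python
def frequencies_and_weights(counts):
    """Map outcome counts n_m of one POVM to (p_hat, W) pairs.

    p_hat = n_m / n and W = n / (p_hat - p_hat**2), the estimated inverse
    variance of the observation noise. A weight of None marks the
    degenerate cases p_hat in {0, 1} (assumed handling, not from the text).
    """
    n = sum(counts)
    rows = []
    for n_m in counts:
        p_hat = n_m / n
        spread = p_hat - p_hat ** 2
        weight = n / spread if spread > 0 else None
        rows.append((p_hat, weight))
    return rows

rows = frequencies_and_weights([50, 30, 20])
```

For 100 trials with counts (50, 30, 20), the first outcome has $\hat{p} = 0.5$ and the smallest weight, since $\hat{p} - \hat{p}^2$ is largest at $\hat{p} = 0.5$.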
3.2.2 Recursive LRE updating rule and physical projection
We utilize the recursive LRE algorithm [113] to find a closed-form solution for $\hat{\Theta}_{M_K}$. First, we transform the linear regression equations (3.4) into a compact form. After $t$ times of POVMs, we can obtain in total $M_t = \sum_{k=1}^{t} M^{(j_k)}$ linear regression equations. We denote them as 2-tuples $[1, (j_1)], \cdots, [M^{(j_1)}, (j_1)], \cdots, [1, (j_t)], \cdots, [M^{(j_t)}, (j_t)]$, where $[m, (j_k)]$ corresponds to the linear regression equation with the outcome $m$ when the POVM $\mathcal{M}^{(j_k)}$ is performed at the $k$-th step. To facilitate the presentation, we relabel them according to the natural order. Thus, the notation $\hat{p}(m|\mathcal{M}^{(j_k)})$, $\phi_{1,j_k}^{(m)}$, $\Phi_m^{(j_k)}$, $e_m^{(j_k)}$, and $W_m^{(j_k)}$ can be simplified with the corresponding sequence number $n$ as $\hat{p}_n$, $\phi_1^{(n)}$, $\Phi_n$, $e_n$, and $W_n$. Let

$$Y_t = \left(\hat{p}_1 - \frac{\phi_1^{(1)}}{\sqrt{d}}, \cdots, \hat{p}_n - \frac{\phi_1^{(n)}}{\sqrt{d}}, \cdots, \hat{p}_{M_t} - \frac{\phi_1^{(M_t)}}{\sqrt{d}}\right)^T,$$

$$X_t = (\Phi_1, \cdots, \Phi_n, \cdots, \Phi_{M_t})^T,$$

$$e_t = (e_1, \cdots, e_n, \cdots, e_{M_t})^T,$$

$$W_t = \mathrm{diag}(W_1, \cdots, W_n, \cdots, W_{M_t}).$$

Using this notation, the linear regression equations (3.4) can be expressed in a compact form

$$Y_t = X_t\Theta + e_t. \tag{3.6}$$

The solution to (3.5) is

$$\hat{\Theta}_t = (X_t^T W_t X_t)^{-1} X_t^T W_t Y_t. \tag{3.7}$$
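We now show how to rewrite (3.7) in a recursive way. Define

$$Q_n = \left(\sum_{i=1}^{n} W_i \Phi_i \Phi_i^T\right)^{-1}, \quad a_n = \left(\frac{1}{W_n} + \Phi_n^T Q_{n-1} \Phi_n\right)^{-1}. \tag{3.8}$$

Using the matrix inversion formula (see, e.g., page 19 of [72])

$$(A - BCD)^{-1} = A^{-1} + A^{-1}B(C^{-1} - DA^{-1}B)^{-1}DA^{-1},$$

for $n = 2, \cdots, M_t$, we have

$$Q_n = Q_{n-1} - a_n Q_{n-1} \Phi_n \Phi_n^T Q_{n-1}. \tag{3.9}$$

From (3.7)-(3.9), the recursive form of $\hat{\Theta}_n$ can be obtained as

$$\hat{\Theta}_n = \hat{\Theta}_{n-1} + a_n Q_{n-1} \Phi_n \left(\hat{p}_n - \frac{\phi_1^{(n)}}{\sqrt{d}} - \Phi_n^T \hat{\Theta}_{n-1}\right). \tag{3.10}$$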
Observing (3.8)-(3.10), we find that the historical data are “compressed” in $\hat{\Theta}_{n-1}$ and $Q_{n-1}$. Each time when the new data come and the estimation is updated, the historical data participate in the computation as a whole, instead of one by one. Hence, the historical calculation involving old data directly helps to exempt the updating of the estimation from some repetitive computation tasks. This is quite different from the MLE or BME method where one has to go through all the historical data many times, which is computationally intensive. The algorithm in Sec. 2.2.1 shows that when fresh measurement data from new POVM come, the historical calculation is of little use to save the computation burden of the new round of searching.
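A minimal sketch of one step of the update (3.8)-(3.10) is given below, to make explicit that only $\hat{\Theta}_{n-1}$ and $Q_{n-1}$ are needed when a new equation arrives. The function name `lre_update`, the toy regressors and the large diagonal initial $Q_0$ are assumptions for illustration; in the protocol itself the initial values come from the static first stage.

```python
def lre_update(theta, Q, phi, w, y):
    """One recursive LRE step, following (3.8)-(3.10).

    theta : current estimate Theta_{n-1} (list of floats)
    Q     : current symmetric matrix Q_{n-1} (list of lists)
    phi   : regressor Phi_n
    w     : weight W_n
    y     : shifted frequency p_hat_n - phi_1^{(n)} / sqrt(d)
    """
    dim = len(theta)
    q_phi = [sum(Q[r][c] * phi[c] for c in range(dim)) for r in range(dim)]
    a = 1.0 / (1.0 / w + sum(phi[r] * q_phi[r] for r in range(dim)))
    residual = y - sum(phi[r] * theta[r] for r in range(dim))
    new_theta = [theta[r] + a * q_phi[r] * residual for r in range(dim)]
    # Q is symmetric, so Q Phi Phi^T Q equals the outer product of q_phi.
    new_Q = [[Q[r][c] - a * q_phi[r] * q_phi[c] for c in range(dim)]
             for r in range(dim)]
    return new_theta, new_Q

# Toy, noise-free data (assumed values) to check the recursion.
true_theta = [0.3, -0.2]
regressors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [0.8, -0.6]]
theta_hat = [0.0, 0.0]
Q = [[1e6, 0.0], [0.0, 1e6]]   # assumed uninformative start
for phi in regressors:
    y = sum(p * t for p, t in zip(phi, true_theta))
    theta_hat, Q = lre_update(theta_hat, Q, phi, 1.0, y)
```

Each call costs a few matrix-vector products in the dimension of $\Theta$, independent of how many equations have already been processed.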
A further necessary procedure of our protocol is physical projection. Specifically, using the solution $\hat{\Theta}_{M_K}$ in (3.5) and the relationship in (3.1), we can obtain a Hermitian matrix $\hat{\mu}$ with $\mathrm{Tr}\,\hat{\mu} = 1$. However, $\hat{\mu}$ may have negative eigenvalues and be nonphysical, due to the randomness of the measurement results or the finiteness of the resource number. In this work, the final physical estimate $\hat{\rho}^{ph}$ is chosen to be the closest density matrix to $\hat{\mu}$ under the Frobenius norm; i.e.,

$$\hat{\rho}^{ph} = \underset{\rho \in \mathcal{D}_d}{\mathrm{argmin}}\, ||\rho - \hat{\mu}||. \tag{3.11}$$

In standard state reconstruction algorithms, this task is computationally intensive [130]. However, we can employ the fast algorithm in [130] with computational complexity $O(d^3)$ to solve (3.11) since we have a Hermitian estimate $\hat{\mu}$ with $\mathrm{Tr}\,\hat{\mu} = 1$. It can be verified that pulling $\hat{\mu}$ back to a physical state $\hat{\rho}^{ph}$ can further reduce the mean squared error [137]. Using this technique, we project the pseudo estimation $\hat{\mu}$ to the physical space $\mathcal{D}_d$ composed of all density matrices and obtain the final estimation.
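One well-known way to solve (3.11) for a Hermitian, unit-trace $\hat{\mu}$ is to keep its eigenvectors and replace its eigenvalues by the closest probability vector. The sketch below shows only this eigenvalue step. Whether it coincides in every detail with the algorithm of [130] is an assumption, and the function name `project_eigenvalues` is hypothetical.

```python
def project_eigenvalues(mu):
    """Closest probability vector (Euclidean norm) to eigenvalues mu.

    Assumes sum(mu) == 1, as for a unit-trace Hermitian estimate.
    Negative mass is removed from the smallest eigenvalues and spread
    evenly over the eigenvalues that are kept.
    """
    order = sorted(range(len(mu)), key=lambda idx: mu[idx], reverse=True)
    lam = [0.0] * len(mu)
    acc = 0.0        # accumulated mass of the eigenvalues set to zero
    kept = len(mu)   # number of eigenvalues still kept
    while kept > 0 and mu[order[kept - 1]] + acc / kept < 0:
        acc += mu[order[kept - 1]]
        kept -= 1
    for idx in order[:kept]:
        lam[idx] = mu[idx] + acc / kept
    return lam

projected = project_eigenvalues([0.7, 0.5, -0.2])
```

The physical estimate is then assembled from the eigenvectors of $\hat{\mu}$ and the projected eigenvalues; the eigendecomposition dominates the cost, consistent with the $O(d^3)$ complexity quoted above.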
Finally, it should be pointed out that the physical projection procedure should not directly interfere with the update of the estimation, in order to guarantee the intactness of the raw data. Specifically, each time we obtain an updated estimation $\hat{\Theta}_{n-1}$, which can be non-physical, we employ physical projection to obtain a temporary genuine estimate $\hat{\rho}^{ph}_{n-1}$. Based on $\hat{\rho}^{ph}_{n-1}$, we employ optimization algorithms (like that in Sec. 3.2.4) to determine the chosen measurement basis for the $n$-th step $\Phi_n$. When the measurement data of the $n$-th step arrive, $\hat{\Theta}_n$ should be obtained using (3.10) on the basis of $\hat{\Theta}_{n-1}$ instead of $\hat{\rho}^{ph}_{n-1}$, which is a key point in simulation.
We would like to stress two advantages of the recursive LRE method: (a) as we have demonstrated in [113], the recursive LRE method can greatly reduce the cost of computation in comparison with the MLE method; (b) the recursive LRE algorithm is naturally suitable for optimizing measurements adaptively. The argument for the advantage (b) can be explained as follows. For state tomography the optimal measurements generally depend upon the state to be reconstructed. By utilizing the recursive LRE algorithm, we can obtain the estimate of the real state in a computationally efficient way. Using the state estimate, the measurements to be performed can be adaptively optimized.
3.2.3 Optimization criterion
In this subsection, we derive the criterion for optimizing the measurement basis; i.e., by what standard the forthcoming measurement basis should be determined, given a historical estimate of the state.
As pointed out in [58], as the number of copies $N_t$ becomes large, the only relevant measure of the quality of estimation becomes the mean squared error matrix $\mathcal{M}(\hat{\Theta}, \Theta) \triangleq E(\hat{\Theta} - \Theta)(\hat{\Theta} - \Theta)^T$, where $E(\cdot)$ denotes the expectation on all possible measurement results. To be specific, for a good estimation strategy, a reasonable expectation is that the elements of the mean squared error matrix decrease as $O(1/N_t)$, i.e., $\mathcal{M}_{ij}(\hat{\Theta}, \Theta) = E(\hat{\theta}_i - \theta_i)(\hat{\theta}_j - \theta_j) = O(1/N_t)$. Assume that $f(\hat{\Theta}, \Theta)$ is any smooth cost function that can measure how much the estimate $\hat{\Theta}$ ($\hat{\rho}$) differs from the true value $\Theta$ ($\rho$). From equations (1)–(3) in [58], there exist a function $f_0(\Theta)$ and a positive semidefinite matrix $C(\Theta)$ such that the mean value of $f(\hat{\Theta}, \Theta)$ under a reasonable estimation strategy will decrease as

$$E f(\hat{\Theta}, \Theta) = f_0(\Theta) + \frac{1}{2}\mathrm{Tr}(C(\Theta)\mathcal{M}(\hat{\Theta}, \Theta)) + o(1/N_t). \tag{3.12}$$

Note that $f_0(\Theta)$ and $C(\Theta)$ depend only on the cost function and the true state, while $\mathcal{M}(\hat{\Theta}, \Theta)$ depends on the true state as well as the estimation $\hat{\Theta}$. Hence, from (3.12), we can minimize the mean squared error matrix $\mathcal{M}(\hat{\Theta}, \Theta)$ to minimize any smooth cost function by choosing appropriate POVMs and suitable estimation strategies.
In the following, we only minimize $\mathrm{Tr}(\mathcal{M}(\hat{\Theta}, \Theta))$ instead of $\mathcal{M}(\hat{\Theta}, \Theta)$ itself for simplicity, although this is not equivalent. To give a criterion on how to optimize the POVMs, we first look at the mean squared error matrix of $\hat{\Theta}_n$. From (3.7) and (3.8), we have

$$E(\hat{\Theta}_n - \Theta)(\hat{\Theta}_n - \Theta)^T = Q_n X_n^T W_n \mathrm{Cov}(e_n) W_n X_n Q_n,$$

where $\mathrm{Cov}(e_n)$ is the covariance matrix of $e_n = (e_1, \cdots, e_n)^T$. To minimize the trace of $E(\hat{\Theta}_n - \Theta)(\hat{\Theta}_n - \Theta)^T$, we can minimize $Q_n$. We now present an intuitive explanation of the rationale for minimizing $Q_n$. It can be seen that if the weight matrix $W_n$ satisfies $W_n \approx \mathrm{Cov}(e_n)^{-1}$ when $N_t$ becomes large, then $E(\hat{\Theta}_n - \Theta)(\hat{\Theta}_n - \Theta)^T \approx Q_n$. Recall that the weight of the $i$-th linear regression equation is approximately equal to the inverse of $\mathrm{Cov}(e_n)_{ii}$. Moreover, if $e_i$ and $e_j$ correspond to different POVMs, they are independent, and so $\mathrm{Cov}(e_n)_{ij} = 0$. Therefore, we can adaptively choose POVMs to minimize $Q_n$.
We illustrate some specifics of our RAQST protocol before minimizing $Q_n$. RAQST is generally divided into two stages. In the first stage, we perform a standard (static/non-adaptive) linear regression estimation on $N_0$ copies with the standard cube measurement bases (other common static bases are also applicable) to get a preliminary $\hat{\Theta}$ and $Q$ [113]. Next, in the second stage, we set the initial values $Q_0 = Q$ in (3.9) and $\hat{\Theta}_0 = \hat{\Theta}$ in (3.10), and then utilize the remaining $N_t - N_0$ copies for $K - 1$ steps of adaptive linear regression estimation.
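Suppose after $s$ steps, we get $Q_{M_s}$ and $\hat{\Theta}_{M_s}$ where $M_s = \sum_{k=1}^{s} M^{(j_k)}$. Recall that $\mathcal{M}^{(j_k)}$ denotes the POVM $\mathcal{M}^{(j_k)} = \{P_m^{(j_k)}\}_{m=1}^{M^{(j_k)}}$ being performed at the $k$-th step. If $s = 0$, $M_s = 0$. From (3.9), we can see that $Q_{M_s+1} \leq Q_{M_s}$, and

$$-g_{M_s+1} \triangleq \mathrm{Tr}(Q_{M_s+1}) - \mathrm{Tr}(Q_{M_s}) = -\frac{\Phi_{M_s+1}^T Q_{M_s}^2 \Phi_{M_s+1}}{\frac{1}{W_{M_s+1}} + \Phi_{M_s+1}^T Q_{M_s} \Phi_{M_s+1}}. \tag{3.13}$$

The remaining question is how to choose POVMs to improve the rate of decreasing. We can choose $P_i^{(j_{s+1})}$ ($\Phi_i^{(j_{s+1})}$) from the admissible measurement set $\mathcal{M} = \bigcup_{j=1} \mathcal{M}^{(j)} = \bigcup_{j=1} \{P_i^{(j)}\}_{i=1}^{M^{(j)}}$ such that it maximizes $g_{M_s+1}$. In other words, we need to solve

$$\max_{\Phi_{M_s+1} \in \bigcup_{j=1} \{P_i^{(j)}\}_{i=1}^{M^{(j)}}} \frac{\Phi_{M_s+1}^T Q_{M_s}^2 \Phi_{M_s+1}}{\frac{1}{W_{M_s+1}} + \Phi_{M_s+1}^T Q_{M_s} \Phi_{M_s+1}}. \tag{3.14}$$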
In cases when $\mathcal{M}$ is a finite set determined by the practical experimental setting, one can simply enumerate all the candidates to determine the $\Phi_{M_s+1}$ which gives the largest corresponding $g_{M_s+1}$ value. Otherwise, as $\mathcal{M}$ is infinite, the general solution to (3.14) remains an open problem, and we present a heuristic answer in Sec. 3.2.4.
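For a finite admissible set, the enumeration mentioned above can be sketched as follows. Each candidate is represented by its regressor $\Phi$ and a predicted weight $W$; the function names `gain` and `best_candidate` and the toy numbers are assumptions for illustration, and the weight would in practice have to be predicted from the current estimate rather than measured.

```python
def gain(Q, phi, w):
    """Objective of (3.14): Phi^T Q^2 Phi / (1/W + Phi^T Q Phi)."""
    dim = len(phi)
    q_phi = [sum(Q[r][c] * phi[c] for c in range(dim)) for r in range(dim)]
    numerator = sum(v * v for v in q_phi)   # valid because Q is symmetric
    denominator = 1.0 / w + sum(phi[r] * q_phi[r] for r in range(dim))
    return numerator / denominator

def best_candidate(Q, candidates):
    """Index of the (phi, w) pair with the largest gain."""
    return max(range(len(candidates)), key=lambda k: gain(Q, *candidates[k]))

# Toy example (assumed values): the estimate is most uncertain along axis 0.
Q = [[4.0, 0.0], [0.0, 1.0]]
candidates = [([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0)]
chosen = best_candidate(Q, candidates)
```

In this toy case the candidate aligned with the direction of largest uncertainty in $Q$ is selected, because measuring it yields the largest decrease of $\mathrm{Tr}(Q)$.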
Once the new measurement basis $\Phi_{M_s+1}$ (i.e., $P_i^{(j_{s+1})}$) is chosen, we perform the corresponding POVM $\mathcal{M}^{(j_{s+1})} = \{P_k^{(j_{s+1})}\}_{k=1}^{M^{(j_{s+1})}}$ at the $(s+1)$-th step. By doing this, we obtain $M^{(j_{s+1})}$ linear regression equations. Thus, we can utilize (3.9) and (3.10) to get $Q_{M_{s+1}}$ and $\hat{\Theta}_{M_{s+1}}$, where $M_{s+1} = \sum_{k=1}^{s+1} M^{(j_k)}$. The above procedure is repeated until all the copies are consumed.
Two points deserve attention in the above RAQST protocol. The first is that, when choosing $P_i^{(j_{s+1})}$ to maximize $g_{M_s+1}$, we cannot obtain information about $\hat{p}_{M_s+1}$ in $W_{M_s+1}$ because we have not actually performed the experiment. From (3.4), we can use its estimate

$$\tilde{p}_{M_s+1}(\hat{\Theta}_{M_s}) = \phi_{1,(j_{s+1})}^{(i)}/\sqrt{d} + \hat{\Theta}_{M_s}^T \Phi_i^{(j_{s+1})} \tag{3.15}$$

to replace $\hat{p}_{M_s+1}$. The second is how to determine, given the total number of copies $N_t$, the number of copies $N_0$ used in the first stage and the number of adaptive steps $K - 1$ in the second stage. The optimal values remain open, while we give effective empirical formulas for two-qubit systems as an example in Sec. 3.3.
3.2.4 Two versions of RAQST
To (partly) answer Problem 3.1 and Problem 3.2, we develop a specialized version of our protocol, namely RAQST1, where the admissible measurement set consists of all the product measurements (including the product of standard cube measurement bases [38]). This kind of measurement is one of the simplest measurements to realize in optical systems. In this section, we employ the two-qubit case as an example to illustrate RAQST1, which can be readily extended to systems with three or more qubits.
We first consider Problem 3.1 and start from searching for a suboptimal solution to (3.14). Ref. [98] pointed out that, in order to reduce the infidelity significantly compared with non-adaptive protocols, we must accurately estimate the small eigenvalues of the state to be reconstructed, particularly those near-zero eigenvalues. To do this, a preferable choice is to take the projective measurement along or close to the eigenvectors corresponding to the near-zero eigenvalues. Furthermore, we consider the following problem:
Problem 3.3. Given $\rho \in \mathcal{D}_d$ with spectral decomposition $\rho = \sum_{i=1}^{d} \lambda_i |\lambda_i\rangle\langle\lambda_i|$ where $0 \leq \lambda_1 \leq \cdots \leq \lambda_d \leq 1$. Solve $\underset{|\alpha\rangle \in \mathbb{C}^d,\, \langle\alpha|\alpha\rangle = 1}{\mathrm{argmin}}\, \mathrm{Tr}(\rho|\alpha\rangle\langle\alpha|)$.

It is straightforward to find that $|\lambda_1\rangle$ is just a solution to Problem 3.3, which means that finding the eigenvector corresponding to the least eigenvalue amounts to minimizing the expected value of a projective measurement on the state. Inspired by this, at each iteration step we aim to find a product projector that minimizes $\tilde{p}_{M_s+1}$ in (3.15), and we name this procedure “product projector optimization”. This operation makes the corresponding regression equation to be obtained as accurate as possible since the variance of the relevant observation noise (3.3) is minimized. Furthermore, this procedure maximizes the weight factor $W_{M_s+1}$, thus promising to achieve a large $g_{M_s+1}$ according to (3.14). The validity of performing product projector optimization is supported by the above analysis. The product projector optimization can be modelled as a standard constrained extremum problem, and then solved by a simple iteration algorithm. The details are given in Appendix B. The measurement basis determined using product projector optimization might not always be the one that can maximize $g_{M_s+1}$, while it deserves to be added in the admissible set $\mathcal{M}^{(j_s)}$.
Now we come to Problem 3.2. In a single-qubit system, there can be at most one near-zero eigenvalue, and it thus suffices to estimate this unique one accurately. However, for multi-qubit systems, due to degeneration or near-degeneration, there can be more than one (at most $2^{N_q} - 1$ in $N_q$-qubit systems) near-zero eigenvalues, and their eigenvectors together can form a non-trivial linear subspace. The eigenvector of the least eigenvalue is only one component (or basis) of this subspace, and our target now is to estimate this whole subspace, instead of estimating just one component. To do this, we design a “cyclic eigenvalues” method as follows.
Suppose the current estimate of the state has spectral decomposition

$$\hat{\rho} = \sum_{i=1}^{d} \hat{\lambda}_i |\hat{\lambda}_i\rangle\langle\hat{\lambda}_i|$$

where $0 \leq \hat{\lambda}_1 \leq \cdots \leq \hat{\lambda}_d \leq 1$. If we directly employ the product projector optimization algorithm in Appendix B on $\hat{\rho}$, we obtain a product projector $\Phi_1^{CE}$ which can accurately estimate $|\hat{\lambda}_1\rangle$. Now suppose there are altogether $j$ near-zero eigenvalues and we want to estimate $|\hat{\lambda}_k\rangle$, where $2 \leq k \leq j < d$. We can write down a dummy state

$$\hat{\rho}_k = \sum_{i=1}^{j-k+1} \hat{\lambda}_{k+i-1} |\hat{\lambda}_i\rangle\langle\hat{\lambda}_i| + \sum_{i=j-k+2}^{j} \hat{\lambda}_{i-j+k-1} |\hat{\lambda}_i\rangle\langle\hat{\lambda}_i| + \sum_{i=j+1}^{d} \hat{\lambda}_i |\hat{\lambda}_i\rangle\langle\hat{\lambda}_i|.$$

We then employ the algorithm in Appendix B for this dummy state $\hat{\rho}_k$, and the result is a product projector $\Phi_k^{CE}$ which can accurately estimate $|\hat{\lambda}_k\rangle$. Using this method for all $2 \leq k \leq j$, we can thus accurately estimate all the near-zero eigenvalues.
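The index bookkeeping in the dummy state is easy to get wrong, so the following sketch spells out which eigenvalue is attached to which eigenvector in $\hat{\rho}_k$. The function name `dummy_spectrum` and the toy spectrum are assumptions for illustration; the eigenvectors themselves are left untouched, exactly as in the formula.

```python
def dummy_spectrum(eigs, j, k):
    """Eigenvalues of the dummy state rho_k, listed per eigenvector.

    eigs : estimated eigenvalues sorted ascending (lambda_1, ..., lambda_d)
    j    : number of near-zero eigenvalues
    k    : index with 2 <= k <= j < d
    Entry i-1 of the result is the eigenvalue attached to |lambda_i>.
    """
    d = len(eigs)
    if not (2 <= k <= j < d):
        raise ValueError("requires 2 <= k <= j < d")
    out = []
    for i in range(1, d + 1):
        if i <= j - k + 1:
            out.append(eigs[k + i - 2])        # lambda_{k+i-1}
        elif i <= j:
            out.append(eigs[i - j + k - 2])    # lambda_{i-j+k-1}
        else:
            out.append(eigs[i - 1])            # unchanged
    return out

eigs = [0.0, 0.01, 0.02, 0.97]   # toy spectrum with j = 3 near-zero values
shift_k2 = dummy_spectrum(eigs, j=3, k=2)
shift_k3 = dummy_spectrum(eigs, j=3, k=3)
```

In this toy case the first $j$ eigenvalues are cyclically shifted among the first $j$ eigenvectors, while the spectrum as a multiset is preserved. The smallest eigenvalue lands on the third eigenvector for $k = 2$ and on the second for $k = 3$, so the two dummy states together cover both remaining near-zero directions.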
In practice, we still need to determine the exact number of near-zero eigenvalues. Since the estimation infidelity decreases as $O(1/\sqrt{N_0})$ in the first static stage of our protocol [98], we set the threshold as $10/\sqrt{N_0}$. After the first stage, suppose we obtain $j$ eigenvalues in $\hat{\Theta}_0$ smaller than $10/\sqrt{N_0}$; then we need to accurately estimate the eigenvectors corresponding to the least $j$ eigenvalues of the state. In each of the forthcoming adaptive steps, we first add $\Phi_1^{CE}$ to the admissible set and choose the optimal one as the measurement, using $(N_t - N_0)/(K - 1)/j$ resources; then we consume the remaining $(j - 1)(N_t - N_0)/(K - 1)/j$ resources evenly on the measurement bases $\Phi_2^{CE}, \Phi_3^{CE}, \ldots, \Phi_j^{CE}$.
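The thresholding and the even split of resources can be written out as a short sketch. The function names `count_near_zero` and `copies_per_basis` and the toy numbers are assumptions; rounding to whole copies and the case of no near-zero eigenvalue are not specified in the text and are not handled here.

```python
import math

def count_near_zero(eigs, n0):
    """Number of estimated eigenvalues below the threshold 10/sqrt(N_0)."""
    threshold = 10 / math.sqrt(n0)
    return sum(1 for lam in eigs if lam < threshold)

def copies_per_basis(n_total, n0, K, j):
    """Copies spent on each of the j bases in one adaptive step.

    Implements (N_t - N_0)/(K - 1)/j and assumes j >= 1 and K >= 2.
    """
    return (n_total - n0) / (K - 1) / j

j = count_near_zero([0.001, 0.02, 0.3, 0.679], n0=10000)  # threshold 0.1
share = copies_per_basis(n_total=16000, n0=10000, K=4, j=j)
```

With these toy numbers two eigenvalues fall below the threshold, so each adaptive step divides its copies equally between $\Phi_1^{CE}$ (or the optimal admissible measurement) and $\Phi_2^{CE}$.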
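Moreover, each adaptive product projector chosen in two-qubit RAQST1 must be expanded to a complete POVM. Specifically, if the chosen projector is $|\psi_1\rangle\langle\psi_1| \otimes |\psi_2\rangle\langle\psi_2|$, the corresponding POVM is $\{|\psi_1\rangle\langle\psi_1| \otimes |\psi_2\rangle\langle\psi_2|,\ |\psi_1^\perp\rangle\langle\psi_1^\perp| \otimes |\psi_2\rangle\langle\psi_2|,\ |\psi_1\rangle\langle\psi_1| \otimes |\psi_2^\perp\rangle\langle\psi_2^\perp|,\ |\psi_1^\perp\rangle\langle\psi_1^\perp| \otimes |\psi_2^\perp\rangle\langle\psi_2^\perp|\}$, where $|\psi_i^\perp\rangle$ is orthogonal to $|\psi_i\rangle$. This completes RAQST1.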
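The completion of a product projector into a four-element POVM can be checked numerically with the sketch below. The helper names (`orthogonal`, `projector`, `kron`, `product_povm`) and the example states are assumptions made for illustration.

```python
import math

def orthogonal(psi):
    """Unit vector orthogonal to the normalized qubit state psi = (a, b)."""
    a, b = psi
    return (-b.conjugate(), a.conjugate())

def projector(psi):
    """Rank-one projector |psi><psi| as a 2x2 nested list."""
    return [[psi[r] * psi[c].conjugate() for c in range(2)] for r in range(2)]

def kron(A, B):
    """Kronecker product of two 2x2 matrices."""
    return [[A[r // 2][c // 2] * B[r % 2][c % 2] for c in range(4)]
            for r in range(4)]

def product_povm(psi1, psi2):
    """The four product projectors, in the order used in the text."""
    p1, p1_perp = projector(psi1), projector(orthogonal(psi1))
    p2, p2_perp = projector(psi2), projector(orthogonal(psi2))
    return [kron(p1, p2), kron(p1_perp, p2),
            kron(p1, p2_perp), kron(p1_perp, p2_perp)]

s = 1 / math.sqrt(2)
povm = product_povm((1, 0), (s, s * 1j))   # assumed example states
# Completeness check: the four elements should sum to the identity.
total = [[sum(P[r][c] for P in povm) for c in range(4)] for r in range(4)]
```

The sum equals the identity because each local pair of projectors resolves the single-qubit identity, so the chosen projector can be measured together with its three complements in one setting.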
For RAQST2, we further add the set of the eigenbases of the current state estimate into the admissible measurement set. This set, together with the product projector obtained in RAQST1, is enough to construct a satisfactory admissible set, and we thus omit the cyclic eigenvalues method. Note that the admissible measurement set in RAQST2 will involve non-local measurements in general, which may be difficult to perform reliably using the current experiment techniques.
3.3 Numerical results
In this section, we present the numerical results. We perform numerical simulations of two-qubit tomography, using the LRE method by default, with six different measurement strategies: (i) standard cube measurements [38]; (ii) mutually unbiased bases (MUB) measurements; (iii) MUB half-half [98]; (iv) “known basis” [98]; (v) RAQST1: the admissible measurement set only contains the simplest product measurements; (vi) RAQST2: the admissible measurement set is not limited.
Each of the tomography protocols (iii)-(vi) is adaptive and consists of two stages. In the first stage, all of them use the standard cube measurements. For the MUB half-half, we first perform standard cube measurements on $N_t/2$ copies and obtain a preliminary estimate $\hat{\rho}_0$ via LRE, and then measure the remaining half of the copies so that one set of the bases is adaptively adjusted to diagonalize $\hat{\rho}_0$ and it together with another four sets of bases constitutes a complete set of MUB as proposed in [98]. In contrast to the MUB half-half, for the “known basis” [98], in the second stage we perform a set of measurements so that one of the five bases of the MUB is exactly the eigenbasis of the true value of the state to be reconstructed. Although this cannot be realized physically with current technology, it is a useful comparison.
For the RAQST, we need to specify $N_0$, which is the number of copies measured in the first stage, and the number $K$ of the iteration steps. In principle, $K$ may depend on the preliminary estimate in the first stage. For simplicity, in this work, we give empirical formulas only depending upon the total number $N_t$ of the copies. Note that in RAQST1 and RAQST2, the admissible measurement sets are different, and so are their empirical formulas. For RAQST1, $N_0^{(1)} = N_t/(1.3 + 0.1\log_{10} N_t)$, $K^{(1)} = \lfloor \log_{10} N_t \rfloor$, and for RAQST2, $N_0^{(2)} = N_t(0.8 - 0.01\log_{10} N_t)$, $K^{(2)} = \lfloor 1.5\log_{10} N_t - 1 \rfloor$. Obviously the formula for the resource distribution for RAQST2 applies only when $N_t$ is not too large.
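The empirical resource formulas for the two RAQST versions are simple enough to evaluate directly. The sketch below does so; the function names `raqst1_parameters` and `raqst2_parameters` are hypothetical, and $N_0$ is returned unrounded because the text does not state a rounding rule.

```python
import math

def raqst1_parameters(n_total):
    """Return (N_0, K) for RAQST1 from the empirical formulas."""
    log_n = math.log10(n_total)
    n0 = n_total / (1.3 + 0.1 * log_n)
    k = math.floor(log_n)
    return n0, k

def raqst2_parameters(n_total):
    """Return (N_0, K) for RAQST2 from the empirical formulas."""
    log_n = math.log10(n_total)
    n0 = n_total * (0.8 - 0.01 * log_n)
    k = math.floor(1.5 * log_n - 1)
    return n0, k

n0_1, k_1 = raqst1_parameters(20000)
n0_2, k_2 = raqst2_parameters(20000)
```

For $N_t = 20000$ both versions reserve more than half of the copies for the static first stage, with RAQST2 spending a larger share there and using one more iteration step than RAQST1.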
Figure 3.1: Simulated performance of the RAQST protocol for pure states. Panel (a) legend: Cube, MUB, MUB Half-half, Known Basis, RAQST1, RAQST2, GM Bound. Panel (b) legend: RAQST1 for Random MESs, RAQST1 for Random Pure States, RAQST2 for Random MESs, RAQST2 for Random Pure States.
We use Monte Carlo simulations to demonstrate the results. The figure of merit is the particularly well-motivated quantum infidelity [98], $1 - F(\rho, \hat{\rho}) = 1 - \mathrm{Tr}^2\left(\sqrt{\sqrt{\rho}\hat{\rho}\sqrt{\rho}}\right)$. Fig. 3.1(a) depicts the average infidelity versus $N_t$ for the maximally entangled state $\frac{|HV\rangle - |VH\rangle}{\sqrt{2}}$ with different tomography protocols. Each point is averaged over 100 realizations and the error bars are the standard deviation of the average. It can be seen that the average infidelity of the static tomography protocols (i.e., protocol (i) and (ii)) versus $N_t$ is in the order of $O(1/\sqrt{N_t})$. However, the Gill-Massar bound [58] for the infidelity in two-qubit state tomography is $\frac{75}{4N_t}$. This can be obtained by combining the equations (5.29) and (A.8) in [170] (see Appendix C). It is clearly seen that, as compared to the static tomography protocols and the adaptive MUB half-half, the average infidelity using our RAQST protocol can be reduced to beat the Gill-Massar bound even with only the simplest product measurements. Furthermore, if there are no constraints on the admissible measurement set, the RAQST2 can outperform the “known basis” tomography, and the average infidelity of RAQST2 versus $N_t$ can be significantly reduced to the order of the Gill-Massar bound, i.e., $O(1/N_t)$.
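When the true state is pure, $\rho = |\psi\rangle\langle\psi|$, the fidelity above reduces to $\langle\psi|\hat{\rho}|\psi\rangle$, which makes the figure of merit easy to evaluate. The sketch below uses this special case together with the two-qubit Gill-Massar bound $75/(4N_t)$. The function names and the example estimate are assumptions for illustration, and the general mixed-state formula is not implemented.

```python
import math

def infidelity_pure(psi, rho_hat):
    """1 - <psi|rho_hat|psi>, valid only when the true state psi is pure."""
    dim = len(psi)
    fidelity = sum(psi[r].conjugate() * rho_hat[r][c] * psi[c]
                   for r in range(dim) for c in range(dim))
    return 1.0 - fidelity.real

def gill_massar_bound(n_total):
    """Two-qubit Gill-Massar bound on the infidelity, 75 / (4 N_t)."""
    return 75.0 / (4.0 * n_total)

s = 1 / math.sqrt(2)
# (|HV> - |VH>)/sqrt(2) in the ordered basis HH, HV, VH, VV.
singlet = [0.0, s, -s, 0.0]
# Assumed example estimate: 0.99 |psi><psi| + 0.01 I/4.
rho_hat = [[0.99 * singlet[r] * singlet[c] + (0.0025 if r == c else 0.0)
            for c in range(4)] for r in range(4)]
infid = infidelity_pure(singlet, rho_hat)
bound = gill_massar_bound(10 ** 4)
```

Comparing `infid` with `bound` for the same $N_t$ is the kind of comparison drawn in Fig. 3.1(a); here the estimate was chosen by hand, so the numbers only illustrate the bookkeeping.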
Figure 3.2: Simulated performance of the RAQST protocol for mixed states. Legend in both panels (a) and (b): Cube, MUB, MUB Half-half, Known Basis, RAQST1, RAQST2, GM Bound.

Fig. 3.1(b) shows the histogram for RAQST over 200 randomly selected pure states and 200 maximally entangled states (MESs) when the total number of copies is $N_t = 10^4$ for each random state. Random pure states are created using the algorithm in [173]. Since all the maximally entangled states are equivalent under local unitary operations, they are created by applying randomly generated local unitary operators [100] on the same maximally entangled state $\frac{1}{\sqrt{2}}(0, 1, -1, 0)^T$. Each generated state is repeated through the RAQST protocol for 200 times. We adopt the index

$$IP = \frac{\log_{10}\mathrm{Cube} - \log_{10}\mathrm{RAQST}}{\log_{10}\mathrm{Cube} - \log_{10}\mathrm{GM}}$$

to evaluate the performance of our RAQST protocol. Here, Cube and RAQST represent the average infidelity between the corresponding estimate and the true state when the standard cube measurement bases and the RAQST are utilized, respectively, while GM is the Gill-Massar bound. Note that if $IP > 0$, our adaptive protocol surpasses the standard measurement strategy, while if $IP > 1$, our adaptive protocol beats the Gill-Massar bound. From Fig. 3.1(b) we can see that our RAQST protocol is particularly effective for the class of maximally entangled states which are important resources in quantum information.
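The index $IP$ is a ratio of gaps on a logarithmic scale, which the following sketch evaluates. The function name `improvement_index` and the example infidelities are assumptions for illustration.

```python
import math

def improvement_index(cube, raqst, gm):
    """IP = (log10 Cube - log10 RAQST) / (log10 Cube - log10 GM).

    cube, raqst : average infidelities of the two protocols
    gm          : Gill-Massar bound for the same number of copies
    """
    return ((math.log10(cube) - math.log10(raqst))
            / (math.log10(cube) - math.log10(gm)))

ip_equal_to_bound = improvement_index(cube=1e-2, raqst=1e-3, gm=1e-3)
ip_no_gain = improvement_index(cube=1e-2, raqst=1e-2, gm=1e-3)
ip_beats_bound = improvement_index(cube=1e-2, raqst=1e-4, gm=1e-3)
```

An adaptive infidelity equal to the bound gives $IP = 1$, no improvement over the cube measurements gives $IP = 0$, and any infidelity below the bound gives $IP > 1$.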
Fig. 3.2(a) depicts average infidelity versus $N_t$ with different tomography methods for the state

$$\rho = 0.997\,\frac{(|HV\rangle - |VH\rangle)(\langle HV| - \langle VH|)}{2} + 0.003\,\frac{I}{4},$$

which has purity $\mathrm{Tr}(\rho^2) = 0.9955$. Each point is averaged over 200 realizations, and the error bars are the standard deviation of the average. Note that there are kinks in the four curves corresponding to the four different adaptive protocols (iii)-(vi). We can see that each of the four curves can be divided into three segments from left to right.

In the first segment, the infidelity decreases quickly as $N_t$ increases, and then the curves go into the second segment where the infidelity decreases slowly. Finally, when the resource number is large, the infidelity decreases quickly again as $N_t$ increases. This is because infidelity is hypersensitive to misestimation of small eigenvalues, as pointed out in [98]. Whether a small number is close to zero is in fact a relative notion, instead of an absolute notion. When the resource number is not large enough to discriminate the near-zero eigenvalues from zero, the state “looks” pure from the view of data, and the performance is thus the same as pure-state tomography. Hence, the infidelity decreases as $O(1/N_t)$ at first. When the resource number increases to a level where the near-zero eigenvalues start to take effect, it will be hard to estimate them accurately, so the decay rate of the infidelity decreases. Once the resource number is large enough to clearly discriminate the near-zero eigenvalues from zero, we are performing mixed-state tomography in essence, for which the infidelity decreases as $O(1/N_t)$ for all four protocols, as predicted in [98]. This completes the explanation of the performance of the three segments. A more detailed explanation of this phenomenon can be found in [164].
From Fig. 3.2(a) it can be further seen that our RAQST1 can beat the static tomography protocols and the adaptive MUB half-half protocol even with the simplest product measurements. The infidelity can be further reduced by using RAQST2, and when the total copies $N_t \geq 10^{4.5}$, the infidelity can be reduced to $O(1/N_t)$.
Fig. 3.2(b) shows the average infidelity versus different purity when the total number of the copies for each state is fixed as $N_t = 10^4$. The states to be estimated are

$$\alpha\,\frac{(|HV\rangle - |VH\rangle)(\langle HV| - \langle VH|)}{2} + \beta\,\frac{I}{4},$$

where $\alpha, \beta \geq 0$ and satisfy $\alpha + \beta = 1$. Each point is averaged over 1000 realizations. The results show that when the states have a high level of purity, our RAQST1 with the simplest product measurements can beat the MUB protocol. However, as the state becomes more mixed ($\mathrm{Tr}(\rho^2)$ decreases), using MUB measurements for state tomography can perform better than using the adaptive product measurements. This fact is due to the essential limit of product measurements on mixed states. As pointed out in [58], nonlocal measurements on a mixed state can extract more information. Thus, to estimate mixed states, it is better to use nonlocal measurements, e.g., MUB measurements. It is also clear that the infidelity achieved by using RAQST2 is much lower than that using MUB, and can beat the Gill-Massar bound for a wide range of quantum states.
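For the family of states used in Fig. 3.2, the purity follows directly from $\alpha$. Expanding $\mathrm{Tr}(\rho^2)$ with a rank-one projector and $I/4$ gives $\alpha^2 + \alpha\beta/2 + \beta^2/4$; this expansion is our own short derivation rather than a formula quoted from the text, and the function name `werner_purity` is hypothetical.

```python
def werner_purity(alpha):
    """Purity of alpha |psi><psi| + (1 - alpha) I/4 for a two-qubit |psi>.

    Tr(rho^2) = alpha^2 + alpha*beta/2 + beta^2/4 with beta = 1 - alpha,
    using Tr(P^2) = 1, Tr(P I/4) = 1/4 and Tr((I/4)^2) = 1/4.
    """
    beta = 1.0 - alpha
    return alpha ** 2 + alpha * beta / 2 + beta ** 2 / 4

purity_fig_a = werner_purity(0.997)   # state used in Fig. 3.2(a)
purity_mixed = werner_purity(0.0)     # completely mixed state
```

The value for $\alpha = 0.997$ reproduces the purity 0.9955 quoted for Fig. 3.2(a), and $\alpha = 0$ gives the minimum two-qubit purity 0.25.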
3.4 Experimental results
In this section, we report the experimental results using our RAQST protocol for two-qubit quantum state tomography. The experiment was performed by our collaborators Zhibo Hou, Han-Sen Zhong, Li Li, Guo-Yong Xiang, Chuan-Feng Li and Guang-Can Guo at the University of Science and Technology of China. Since it is difficult to perform nonlocal measurements in real experiments, we only experimentally implement tomography protocols using (i) standard cube measurements and (v) RAQST1.
As shown in Fig. 3.3, the experimental setup includes two modules: state preparation (gray) and adaptive measurement (light blue). In the state preparation module, a pair of polarization-entangled photons with a central wavelength of $\lambda = 702.2$ nm is first generated when a continuous Ar$^+$ laser at 351.1 nm with diagonal polarization pumps a pair of type I phase-matched β-barium borate (BBO) crystals whose optic axes are normal to each other [87]. The generation rate is about 3000 two-photon coincidence counts per second at a pump power of 60 mW. Half-wave plates (HWPs) at both ends of the two single mode fibers are used to control polarization. Then, one photon is either reflected by or transmitted through a 50/50 beam splitter (BS). In the transmission path, a quarter-wave plate (QWP) is tilted to compensate the phase of the two-photon state for the generation of $\frac{|HV\rangle - |VH\rangle}{\sqrt{2}}$. In the reflected path, three $446\lambda$ quartz crystals and a half-wave plate at 22.5° are used to dephase the two-photon state into a completely mixed state $I/4$. The ratio of the two states mixed at the output port of the second BS can be changed by the two adjustable apertures for the generation of arbitrary Werner states of the form

$$\rho = \alpha\,\frac{(|HV\rangle - |VH\rangle)(\langle HV| - \langle VH|)}{2} + (1 - \alpha)\,\frac{I}{4}.$$

Note that since the coherence length of the photon is only $176\lambda$ (due to the 4 nm bandwidth of the interference filter (IF)), much smaller than the optical path difference which is about 0.5 m, the two states from the reflected and transmission paths only mix at the second BS rather than coherently superpose. In the adaptive measurement module, the two-photon product measurements are realized by the combinations of quarter-wave plates, half-wave plates, polarizing beam splitters (PBSs), single photon detectors (SPDs) and a coincidence circuit. The rotation angles of QWPs and HWPs can be adaptively adjusted by a controller according to the analysis of the collected coincidence data on a computer.

Figure 3.3: Two-qubit state tomography experimental setup, adopted from [114].
Figure 3.4: Two-qubit state tomography experimental results. Legend in both panels (a) and (b): Cube simulation, Cube experiment, MUB simulation, RAQST1 simulation, RAQST1 experiment, GM Bound.
The experiment results are depicted in Fig. 3.4. Dots are the average infidelity of simulation results with 1000 repetitive runs of RAQST1 (red), MUB (khaki) and standard cube measurements (magenta), and circles are the corresponding average infidelity of experimental results. Error bars are the standard deviation of the average. In the first experiment, as shown in Fig. 3.4(a), we realize RAQST1 and standard cube measurements tomography protocols for entangled states with a high level of purity, w.r.t. different number of resources $N_t$ ranging from 251 to 251189. First, we calibrate the true state $\rho$ using RAQST1 with $N_t = 10^7$ copies so that the infidelity of the calibrated true state is even 10 times smaller than the estimate accuracy achieved at $N_t = 251189$ with RAQST1. The purity of the calibrated state is 0.983. Systematic error is crucial in the experiments. Beam displacers, which separate extraordinary and ordinary light, act as PBS and have an extinction ratio of about 10000:1. As the precision of rotation stages of QWPs and HWPs are 0.01°, the rotation error is determined by the calibration error of optic axes, which is 0.1° in our experiment. Phase errors of the currently used true zero-order QWPs and HWPs are 1.2°, which dominate the systematic error of practically realized measurements. These error sources induce a systematic error to the estimate state, which can be characterized by its infidelity from the true state. The systematic error is in the order of $10^{-3}$ when the error sources take the above values. For resource number $N_t \geq 10^3$, the systematic error is of the same scale as or even larger than the statistical error due to finite resources ($N_t$ copies). To deal with this problem, we employ error-compensation measurements [74] to reduce the systematic error to the order of $10^{-5}$. In error-compensation measurement technique, multiple nominally equivalent measurement settings are applied to sub-ensembles such that the systematic errors can cancel out in the first order. Tomography experiments using both RAQST1 and standard cube measurements are repeated 10 times for each number of photon resources.
In the second experiment, as shown in Fig. 3.4(b), we realize tomography protocols using RAQST1 and standard cube measurements for Werner states with purities ranging from 0.25 to 0.98. The purities are changed by adjusting the apertures. Since the photon resource for each run of tomography protocols is only $10^4$, we use $10^6$ copies to calibrate the true state. There are 40 experimental runs and 1000 simulation runs for each of nine Werner states. In each RAQST experiment, four adaptive steps are used to optimize the measurements. To ensure measurement accuracy, error-compensation measurements are also employed.
In both of these two experiments, our experimental results agree well with simulation results. The improvement of RAQST1 protocol over standard cube measurements strategy is significant. According to the simulation results of MUB protocol and the experimental results of RAQST1, even only with the simplest product measurements, our RAQST1 can outperform the tomography protocols using MUBs for states with a high level of purity. Taking into account the trade-off between accuracy and implementation challenge, from Fig. 3.2 and Fig. 3.4, RAQST using the simplest product measurement seems to be the best choice for reconstructing entangled states with a high level of purity.
3.5 Summary and open problems
We have presented a new adaptive QST protocol using an adaptive LRE algorithm and reported a two-qubit experimental realization of the adaptive tomography protocol. In our RAQST protocol, no prior assumption is made on the state to be reconstructed. The infidelity of the adaptive tomography is greatly reduced and can even beat the Gill-Massar bound by adaptively optimizing the POVMs performed. We demonstrated that the fidelity obtained by using our RAQST with only the simplest product measurements can even surpass those obtained by using MUB and the two-stage MUB adaptive strategy, for states with a high level of purity. Considering the trade-off between accuracy and difficulty of implementation, it seems that RAQST using the product measurements is the best choice for reconstructing pure and nearly pure entangled states, which are among the most important resources for quantum information processing.
It is worth stressing that our RAQST protocol is flexible and extensible. For any finite-dimensional quantum system, once the admissible measurement set is given, we can utilize the adaptive measurement strategy to estimate an unknown quantum state. As demonstrated by numerical results, if nonlocal measurements can be realized reliably following experimental advances, the admissible measurement set M can be enlarged, and our RAQST protocol can be better utilized accordingly.
A number of open questions deserve to be investigated:
(i) How to give a more effective formula for the parameters defining the second stage? Analytically derived formulas would be preferable over empirical ones, in particular allowing the parameters to depend upon the estimated state in the first stage. This is actually related to the tomography problem wherein some prior information is already known, e.g., pure entangled states, matrix-product states, low-rank states, etc. By taking full advantage of the prior information, an even more efficient RAQST protocol may be designed.
(ii) How to present a theoretical description of the convergence speed of the numerical algorithm in Appendix B? This would be helpful in characterizing the efficiency of the algorithm.
(iii) It remains open to find the general optimal solution to (3.14). The analytical solution in single-qubit systems has already been obtained [45], while for multi-qubit systems it is still difficult to solve.
Chapter 4
Quantum Hamiltonian identifiability via a similarity transformation approach and beyond
The work, reported in this chapter, has been partially published in the following articles:
1. Y. Wang, D. Dong, A. Sone, I. R. Petersen, H. Yonezawa, and P. Cappellaro, Quantum Hamiltonian identifiability via a similarity transformation approach and beyond, submitted to IEEE Transactions on Automatic Control, 2018.
2. Y. Wang, D. Dong, and I. R. Petersen, An approximate quantum Hamiltonian identification algorithm using a Taylor expansion of the matrix exponential function, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 5523-5528, Melbourne, Australia, December 2017.
3. Y. Wang, D. Dong, I. R. Petersen, and J. Zhang, An approximate algorithm for quantum Hamiltonian identification with complexity analysis, in the 20th World Congress of the International Federation of Automatic Control (IFAC), vol. 50, no. 1, pp. 11744-11748, Toulouse, France, July 2017.
4.1 Introduction
The Hamiltonian is a fundamental quantity that governs the evolution of a quantum state, as described by the well-known Schrödinger equation (2.1). Hamiltonian identification is thus critical for tasks such as calibrating quantum devices [147] and characterizing quantum channels [40, 166]. Before performing identification experiments, a natural question arises: is the available data from a given experimental setting enough to identify (or determine) all the desired parameters in the Hamiltonian? We refer to such a problem as Hamiltonian identifiability. The solution to this problem is fundamental and necessary for designing experiments, and also gives us insights into the information extraction capability of certain probe systems.
There are several existing approaches to investigating the problems of quantum system identification [19, 54, 129] and identifiability [92]. For example, Ref. [27] proved that controllable quantum systems are indistinguishable if and only if they are related through a unitary transformation, which can be developed into an identifiability method for controllable systems. The identifiability problem for a Hamiltonian corresponding to a dipole moment was investigated in [22]. The identification of spin chains has been extensively investigated in [23, 24, 25, 28, 52, 53]. Ref. [63] presented identifiability conditions for parameters in passive linear quantum systems, and the passivity requirement was subsequently removed in [91]. Control signals to enhance the observability of the quantum dipole moment matrix were introduced in [89]. Zhang and Sarovar [165] proposed a Hamiltonian identification method based on measurement time traces. Sone and Cappellaro [132] employed Gröbner bases to test the Hamiltonian identifiability of spin-1/2 systems, and their method is also applicable to general finite-dimensional systems.
We assume the dimension [131] and structure (e.g., the coupling types) [82] of the Hamiltonian are already determined, and the task is to identify unknown parameters in the Hamiltonian. It is natural to resort to identifiability test methods in the classical (non-quantum) control field to tackle the quantum Hamiltonian identifiability problem. Common classical methods include the Laplace transform approach [8], the Taylor series expansion approach [112] and the Similarity Transformation Approach (STA) [142, 143, 145]. For a review, see [59, 105, 148]. The main idea of the Laplace transform approach is to determine the number of solutions of the multivariate equations composed by coefficients of the transfer function. In contrast, the STA method transforms the identifiability problem into finding the existence of unequal solutions of similarity equations generated by a minimal system's equivalent realizations, thus providing a chance to avoid directly solving multivariate polynomial equations, a considerable advantage in the case of high-dimensional systems or incomplete prior information. In this chapter, we extend the classical STA method to quantum Hamiltonian identifiability. We generalize and improve STA-based identifiability criteria, which are applicable to both classical control and quantum identification domains. We employ the STA method to analyze all the physical cases in [132] and present proofs for the associated identifiability conclusions.
We further propose a Structure Preserving Transformation (SPT) method for the STA-based identifiability analysis in non-minimal systems. In classical control, when faced with non-minimal systems, one usually prefers to change the system settings such that it becomes minimal. In other words, the original settings are abandoned. This indirect solution is not applicable when the experimental settings are difficult to change or when we only expect to explore the information extraction capability of some particular physical probe systems. However, the SPT method provides a chance to preserve most of the system's key properties after transformations while still performing identifiability analysis on its minimal subsystem. Hence, we employ the SPT method to prove that it is always possible to estimate one unknown parameter in the system matrix using a specifically designed experimental setting. This conclusion serves as an indicator for the existence of “economic” quantum Hamiltonian identification algorithms, whose computational complexity directly depends on the number of unknown parameters.
As an example, we provide two specific economic identification algorithms, where the computational complexity only depends on the number of unknown parameters and data length. Therefore, for physical systems with a small number of unknown parameters in the Hamiltonian, these economic identification algorithms can be efficient.
The structure of this chapter is as follows. In Sec. 4.2, we formulate the Hamiltonian identifiability problem in a linear systems framework, and present the classical Laplace transform approach and some necessary concepts. In Sec. 4.3, we introduce the general procedures of STA for identifiability problems, including the SPT method as a new tool in non-minimal systems. Sec. 4.4 consists of the specific applications of the STA method to the three physical cases in [132]. Sec. 4.5 employs STA to indicate the existence of economic identification algorithms, and presents two examples of economic Hamiltonian identification algorithms. Sec. 4.6 concludes this chapter and introduces several open problems.
4.2 Model establishment
4.2.1 Problem formulation of Hamiltonian identifiability and identification
Since Hamiltonian identifiability and identification are two closely related problems, we start from their model establishment in a common framework; namely, we rephrase the framework in [165] to recast them as a linear system problem. Let $H$ be the $d$-dimensional Hamiltonian to be identified, which can be parametrized as
$$H = \sum_{m=1}^{N_H} a_m(\vartheta) H_m, \tag{4.1}$$
where $\vartheta = (\vartheta_1, ..., \vartheta_{N_H})^T$ is a vector consisting of all the unknown real parameters, $N_H$ is the number of unknown parameters, $a_m$ are known functions of $\vartheta$ and $H_m$ are known Hermitian matrices (also called basis matrices). Let $\mathfrak{su}(d)$ denote the Lie algebra consisting of all $d \times d$ skew-Hermitian traceless matrices. Then $\{iH_m\}_{m=1}^{d^2-1}$ can be chosen as an orthonormal basis of $\mathfrak{su}(d)$, where the inner product is defined as $\langle iH_m, iH_n \rangle = \mathrm{Tr}(H_m^\dagger H_n)$. The traceless assumption is reasonable because $H$ has an intrinsic degree of freedom (see [150] for details).
Let $S_{jkl}$ be the real structure constants of $\mathfrak{su}(d)$, which satisfy

$$[iH_j, iH_k] = \sum_{l=1}^{d^2-1} S_{jkl}\,(iH_l),$$

where $j, k = 1, ..., d^2 - 1$. If $H_k$ is the observable, then the experimental data is obtained from Born's rule

$$x_k = \mathrm{Tr}(H_k \rho). \tag{4.2}$$
The state evolution is described by the Liouville-von Neumann equation (2.2).
The identifiability is determined by the system structure. Hence, it is usually assumed that there are no imperfections in the available experimental data, which is the reason we identify theoretical values with practical data in (4.2).
From (4.1)-(4.2) and (2.2), we have
$$\dot{x}_k = \sum_{l=1}^{d^2-1} \Big( \sum_{m=1}^{N_H} S_{mkl}\, a_m(\vartheta) \Big) x_l. \tag{4.3}$$
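To make (4.3) concrete, the following is a minimal numerical sketch for a single qubit ($d = 2$). It assumes units with $\hbar = 1$, the Liouville-von Neumann equation in the form $\dot\rho = -i[H, \rho]$, and the normalized Pauli matrices $\sigma_m/\sqrt{2}$ as the orthonormal basis; the function names and the numerical values of $a_m$ and $\rho$ are illustrative choices, not part of the original framework.

```python
import numpy as np

# Orthonormal Hermitian basis for one qubit: H_m = sigma_m / sqrt(2) (assumed choice).
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
BASIS = [s / np.sqrt(2) for s in SIGMA]


def structure_constants(basis):
    """Return S[j, k, l] with [iH_j, iH_k] = sum_l S[j, k, l] (iH_l)."""
    n = len(basis)
    S = np.zeros((n, n, n))
    for j in range(n):
        for k in range(n):
            gj, gk = 1j * basis[j], 1j * basis[k]
            comm = gj @ gk - gk @ gj
            for l in range(n):
                # Project onto iH_l using <iH_m, iH_n> = Tr(H_m^dagger H_n).
                S[j, k, l] = np.trace((1j * basis[l]).conj().T @ comm).real
    return S


def system_matrix(S, a):
    """Coefficient matrix of (4.3): A[k, l] = sum_m S[m, k, l] * a[m]."""
    return np.einsum("mkl,m->kl", S, a)


S = structure_constants(BASIS)
a = np.array([0.3, -1.2, 0.7])  # illustrative values of a_m(theta)
A = system_matrix(S, a)

# Cross-check against a direct evaluation of d/dt Tr(H_k rho).
H = sum(am * Hm for am, Hm in zip(a, BASIS))
rho = 0.5 * (np.eye(2) + 0.2 * SIGMA[0] + 0.5 * SIGMA[1] - 0.4 * SIGMA[2])
rho_dot = -1j * (H @ rho - rho @ H)
x = np.array([np.trace(Hk @ rho).real for Hk in BASIS])
x_dot_direct = np.array([np.trace(Hk @ rho_dot).real for Hk in BASIS])
```

In this sketch `A @ x` reproduces `x_dot_direct`, and `A` comes out antisymmetric, in line with the antisymmetry of the structure constants.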
If we directly rewrite (4.3) into a matrix form, the dimension of the system matrix would be $d^2 - 1$, which is quite large for multi-qubit systems. To reduce the dimension, first consider the operators $O_i$ that we can directly measure in practice. We expand $O_i$ as $O_i = \sum_j o_j H_j$, and collect all the $H_j$ that appear in the expansion of all the $O_i$ as $K = \{H_{v_1}, ..., H_{v_p}\}$. Also, we collect all the $H_j$ that appear in the expansion of $H$ as $L = \{H_m\}_{m=1}^{N_H}$. Define an iterative procedure as

$$G^{(0)} = K, \qquad G^{(i)} = \{G^{(i-1)}, L\} \cup G^{(i-1)},$$

where $\{G^{(i-1)}, L\} \triangleq \{H_j \mid \mathrm{Tr}(H_j^\dagger [g, h]) \neq 0 \text{ for some } g \in G^{(i-1)},\ h \in L\}$. This iteration will terminate at a maximal set $\bar{G}$ (called the accessible set) because $\mathfrak{su}(d)$ is finite-dimensional. We collect all the $x_i$ with $H_i \in \bar{G}$ in a vector $x$ of dimension $n$, and its dynamics satisfy the linear system equation

$$\dot{x} = Ax. \tag{4.4}$$
The elements in $A$ are the coefficients in (4.3), which are linear combinations of $a_m(\vartheta)$. $A$ is real and antisymmetric due to the antisymmetry of the structure constants. For some types of physical systems, the dimension $n$ can be much smaller than $d^2 - 1$. The output data can be denoted as
$$y = Cx, \tag{4.5}$$

where $C$ selects the entries in $x$ corresponding to the expectation values of the elements in $K$. Each specific experimental setting determines the initial state $x_0$ and the observation matrix $C$ of this linear system. Therefore, the quantum Hamiltonian identification problem can be formulated as follows:
Problem 4.1. Given the system matrix $A = A(\vartheta)$, initial state $x(0) = x_0$ and observation matrix $C$, design an algorithm to obtain an estimate $\hat{\vartheta}$ of $\vartheta$ from measurement data $\hat{y}$.
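The iterative construction of the accessible set can be sketched numerically as follows. This is only an illustration under stated assumptions: the basis is taken to be the (unnormalized) multi-qubit Pauli strings, labels such as "ZY" stand for $Z_1 Y_2$, and the function name `accessible_set` and its arguments are hypothetical. The example uses the exchange coupling terms $X_i X_{i+1}$ and $Y_i Y_{i+1}$ with $X_1$ as the measured operator.

```python
import itertools
import numpy as np

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def pauli_string(label):
    """Tensor product of single-qubit Paulis, e.g. 'ZY' -> Z (x) Y."""
    out = np.array([[1.0 + 0j]])
    for ch in label:
        out = np.kron(out, PAULI[ch])
    return out


def accessible_set(measured, hamiltonian_terms, n_qubits, tol=1e-9):
    """Iterate G <- {G, L} U G until no new basis element appears."""
    labels = ["".join(p) for p in itertools.product("IXYZ", repeat=n_qubits)]
    labels.remove("I" * n_qubits)  # keep traceless elements only
    basis = {lab: pauli_string(lab) for lab in labels}
    current = set(measured)
    while True:
        new = set(current)
        for g in current:
            for h in hamiltonian_terms:
                comm = basis[g] @ basis[h] - basis[h] @ basis[g]
                for lab, mat in basis.items():
                    # H_j joins the set if Tr(H_j^dagger [g, h]) != 0.
                    if abs(np.trace(mat.conj().T @ comm)) > tol:
                        new.add(lab)
        if new == current:
            return current
        current = new


two_qubit = accessible_set({"XI"}, ["XX", "YY"], 2)
three_qubit = accessible_set({"XII"}, ["XXI", "YYI", "IXX", "IYY"], 3)
```

For three qubits the sketch returns three elements out of the 63 traceless Pauli strings, which illustrates how $n$ can be much smaller than $d^2 - 1$.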
In this chapter we mainly consider a preceding question: for a system $A$, can we uniquely determine the unknown parameters, based on a given experimental setup (i.e., $x_0$ and $C$)? If not, then it may be required to redesign the experimental setup before starting the experiment. This is especially significant for quantum system identification, since implementing quantum experiments is usually expensive. The problem of identifiability is thus induced. Let $\vartheta$ denote the true value of the unknown parameter vector to be identified. Assume that the system under consideration has a parametric model structure with output data $PM(\vartheta)$, for a given experimental setup. The equation

$$PM(\vartheta) = PM(\vartheta') \tag{4.6}$$

means that the model with parameter set $\vartheta'$ outputs exactly the same data as the model with parameter set $\vartheta$. Identifiability then depends on the number of solutions to (4.6) for $\vartheta'$. We use the following definition from [148]:
Definition 4.1. [148] The model $PM$ is structurally globally identifiable (abbreviated as identifiable in the rest of this thesis), if for almost any value of $\vartheta$, (4.6) has only one solution $\vartheta' = \vartheta$.
Definition 4.1 is in essence the same as the definition of identifiability in [132]. It is necessary to ensure that identifiability holds for almost any value of the parameters because the number of solutions to (4.6) might change for some particular values of $\vartheta$, which are called atypical cases (to be illustrated later). Also, identifiability is determined by the system structure. Hence, we do not consider noise or uncertainty in the experimental data. A trivial necessary condition for a parameter to be identifiable is that it should appear in the system model $PM$, and in the following we only focus on this class of parameters.
4.2.2 Laplace transform approach and atypical cases
One of the most intuitive ways to solve identifiability problems is through the Laplace transform, which is also helpful in understanding concepts like atypical cases. Hence, we first briefly introduce the Laplace transform approach [148]. Consider the following standard MIMO linear system with zero initial condition:

$$\begin{aligned} \dot{x} &= A(\vartheta)x + B(\vartheta)u, \quad x(0) = 0, \\ y &= C(\vartheta)x + D(\vartheta)u. \end{aligned} \tag{4.7}$$
Throughout this thesis we use 4-tuples $\Sigma = (A, B, C, D)$ (or 3-tuples without $D$) to denote linear systems with the form of (4.7). The Laplace transform solution to (4.7) is $Y(s, \vartheta) = T(s, \vartheta)U(s)$, where the transfer function matrix is $T(s, \vartheta) = C(\vartheta)[sI - A(\vartheta)]^{-1}B(\vartheta) + D(\vartheta)$. In the frequency domain, (4.6) is now

$$T(s, \vartheta)U(s) = T(s, \vartheta')U(s).$$

By cancelling $U(s)$, (4.6) is equivalent to

$$T(s, \vartheta) = T(s, \vartheta'), \quad \forall s \in \mathbb{C}. \tag{4.8}$$
Hence, the transfer function is exactly a tool to characterize identifiability. By writing (4.8) in a canonical form (e.g., transforming the numerators and denominators into monic polynomials) and equating coefficients on both sides of (4.8), one obtains a series of algebraic equations in $\vartheta$ and $\vartheta'$. If for almost any value of $\vartheta$, the solutions always satisfy $\vartheta' = \vartheta$, then the system is identifiable. From now on, we assume $a_m(\vartheta)$ are linear functions of $\vartheta$ for simplicity. In order to investigate identifiability, Sone and Cappellaro [132] employed Gröbner bases to determine the conditions of identifiability. By directly solving (4.8) where the RHS is replaced by a specific transfer function reconstructed from experimental data, one can develop algorithms like that in [165] to identify the Hamiltonian.
The following property of the transfer function will be frequently used in the sequel:
Property 4.1. When a system undergoes a similarity transformation $x' = Px$ where $P$ is a nonsingular matrix, the transfer function remains the same, and thus the identifiability does not change.
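Property 4.1 can be checked numerically. The following sketch assumes a randomly generated single-input single-output system with $D = 0$ and a hypothetical helper `transfer_function`; under $x' = Px$ the realization becomes $(PAP^{-1}, PB, CP^{-1})$, and the transfer function evaluated at a test frequency is compared before and after the transformation.

```python
import numpy as np


def transfer_function(A, B, C, s):
    """Evaluate T(s) = C (sI - A)^{-1} B at a complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B)


rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, 1))
C = rng.normal(size=(1, n))

# A small perturbation of the identity, used as a nonsingular transformation P.
P = np.eye(n) + 0.1 * rng.normal(size=(n, n))
P_inv = np.linalg.inv(P)

# Realization in the new coordinates x' = P x.
A_new, B_new, C_new = P @ A @ P_inv, P @ B, C @ P_inv

s_test = 0.7 + 1.3j
T_old = transfer_function(A, B, C, s_test)
T_new = transfer_function(A_new, B_new, C_new, s_test)
```

Up to floating-point error, `T_old` and `T_new` coincide for any test frequency that is not an eigenvalue of `A`.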
We specifically illustrate atypical cases and hypersurfaces. Assume that the number of unknown parameters is $N_H$ and we have no prior knowledge of the true values, which indicates the candidate space for the parameters is $\mathbb{R}^{N_H}$. A hypersurface is a manifold or an algebraic variety with dimension $N_H - 1$, and it is usually obtained by adding an extra polynomial equation about the unknown parameters. Hypersurface sets have Lebesgue measure zero and they can thus be neglected in practice. Atypical cases are subsets of hypersurfaces. Hence, analysis on atypical cases can also be omitted. When the complement of a hypersurface is open and dense in $\mathbb{R}^{N_H}$ and has full measure, it is often called a generic set [133]. For strictness, the phrase “almost always” is usually employed to indicate that atypical cases have already been neglected. We give an example of atypical cases from the point of view of transfer functions like Example 3.1 in [148]. Consider a system with unknown parameters $\vartheta_1$ and $\vartheta_2$ and the transfer function
$$T(s, \vartheta) = \frac{\vartheta_1}{s + \vartheta_1 + \vartheta_2}. \tag{4.9}$$

The algebraic equations from (4.8) are thus $\vartheta_1 = \vartheta_1'$ and $\vartheta_1 + \vartheta_2 = \vartheta_1' + \vartheta_2'$. Therefore, the system (4.9) is generally identifiable, except the case of $\vartheta_1 = 0$ which leads to a zero transfer function and erases all the information about $\vartheta_2$. Since $\vartheta_1 = 0$ is an atypical case, we can omit it and conclude that this system is (almost always) identifiable. In the rest of this thesis we will omit “almost always” if there is no ambiguity.
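The atypical case of (4.9) can also be seen numerically. The short sketch below evaluates the transfer function at a few arbitrarily chosen sample frequencies; the helper names and numerical values are illustrative assumptions.

```python
def transfer(s, theta1, theta2):
    """Transfer function (4.9): theta1 / (s + theta1 + theta2)."""
    return theta1 / (s + theta1 + theta2)


# Arbitrary sample frequencies used to compare two parameter sets.
S_SAMPLES = [0.5 + 1.0j, 2.0 - 0.3j, 1.0 + 0.0j]


def same_output(params_a, params_b, tol=1e-12):
    """True if both parameter sets give the same T(s) at all sample points."""
    return all(
        abs(transfer(s, *params_a) - transfer(s, *params_b)) < tol
        for s in S_SAMPLES
    )


# Atypical case theta1 = 0: the output is blind to theta2.
atypical_indistinguishable = same_output((0.0, 1.0), (0.0, 5.0))
# Generic case theta1 != 0: changing theta2 changes the output.
generic_indistinguishable = same_output((0.8, 1.0), (0.8, 5.0))
```

Here `atypical_indistinguishable` is `True` while `generic_indistinguishable` is `False`, matching the discussion above.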
4.3 Similarity Transformation Approach
4.3.1 General procedures for minimal systems
Strictly speaking, the word “minimal” is used to describe system realizations that are both controllable and observable. In this thesis, we call a system “minimal” if it is both controllable and observable.
Let $\theta$ be the true value generating the system (4.7). Suppose that there is an alternative value $\theta'$ generating the same output data. Then $\theta'$ gives an alternative realization:

$$\begin{aligned} \dot{x}' &= A(\theta')x' + B(\theta')u, \quad x'(0) = 0, \\ y &= C(\theta')x' + D(\theta')u. \end{aligned} \tag{4.10}$$
Suppose that the system realization (4.7) is minimal, then (4.10) is also minimal since they have the same dimension. From Kalman’s algebraic equivalence theorem [81], minimal realizations of a transfer function are equivalent; i.e., they are related by a similarity transformation:
$$\begin{aligned} A(\theta) &= S^{-1}A(\theta')S, & B(\theta) &= S^{-1}B(\theta'), \\ C(\theta) &= C(\theta')S, & D(\theta) &= D(\theta'), \end{aligned} \tag{4.11}$$

where $S$ is an invertible matrix. We call equations (4.11) the STA equations. We take $S$, $\theta$ and $\theta'$ as unknown variables and search for their solution. The solvability of (4.11) can be guaranteed because it always has a trivial solution $S = I$ and $\theta = \theta'$. If all the solutions satisfy $\theta = \theta'$, then the system (4.7) is identifiable. Otherwise it is unidentifiable. In cases when the signs of $\theta$ are not considered, one can check whether all the solutions to the STA equations satisfy $|\theta| = |\theta'|$ to determine the identifiability.
4.3.2 General procedures for non-minimal systems
If the system is not minimal, Kalman's algebraic equivalence theorem (and hence the STA equations) can only be applied to the part that is both controllable and observable; i.e., the minimal subsystem. If one ignores whether the system is minimal or not and directly employs the solution to the STA equations to test the identifiability, an incorrect conclusion might be obtained. For example, consider the following 2-dimensional system:
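Example 4.1.

$$\begin{aligned} \dot{x} &= \begin{pmatrix} \vartheta_1 & 0 \\ 0 & \vartheta_2 \end{pmatrix} x + \begin{pmatrix} 1 \\ 0 \end{pmatrix} u, \quad x(0) = 0, \\ y &= \begin{pmatrix} 1 & 0 \end{pmatrix} x. \end{aligned} \tag{4.12}$$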
This system (4.12) is uncontrollable and unobservable. If one directly solves the STA equations, the conclusion is that it is identifiable. However, since $x_2$ evolves independently as $\dot{x}_2 = \vartheta_2 x_2$, the output $y = x_1$ never contains any information about $x_2$ or $\vartheta_2$. Hence, $\vartheta_2$ is in fact unidentifiable.
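The claims in Example 4.1 can be verified with a few lines of code. The sketch below uses illustrative parameter values and hypothetical helper names; it computes the ranks of the controllability and observability matrices and compares the Markov parameters $CA^kB$ (which fix the impulse response) for two different values of $\vartheta_2$.

```python
import numpy as np


def example_system(theta1, theta2):
    """Matrices (A, B, C) of the 2-dimensional system (4.12)."""
    A = np.diag([theta1, theta2]).astype(float)
    B = np.array([[1.0], [0.0]])
    C = np.array([[1.0, 0.0]])
    return A, B, C


def markov_parameters(A, B, C, count=6):
    """Return C A^k B for k = 0, ..., count - 1."""
    out, Ak = [], np.eye(A.shape[0])
    for _ in range(count):
        out.append((C @ Ak @ B).item())
        Ak = Ak @ A
    return np.array(out)


A, B, C = example_system(0.7, -1.3)
ctrb = np.hstack([B, A @ B])   # controllability matrix [B, AB]
obsv = np.vstack([C, C @ A])   # observability matrix [C; CA]
rank_ctrb = np.linalg.matrix_rank(ctrb)
rank_obsv = np.linalg.matrix_rank(obsv)

# Same theta1, different theta2: the input-output behaviour is identical.
m_a = markov_parameters(*example_system(0.7, -1.3))
m_b = markov_parameters(*example_system(0.7, 4.2))
```

Both ranks equal 1 for this 2-dimensional system, and `m_a` equals `m_b`, so no input-output experiment can distinguish the two values of $\vartheta_2$.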
The fact that (4.8) is equivalent to (4.6) means a linear system's identifiability is uniquely and completely determined by its transfer function. Therefore, unlike the situation using STA, non-minimal systems do not introduce extra requirements in the Laplace transform approach.
Regardless of controllability or observability, the transfer function of a system remains the same under similarity transformation. Therefore, for uncontrollable or unobservable systems, the solution using STA is [145]: (i) perform Kalman decomposition and obtain the controllable and observable (minimal) subsystem; (ii) write down the STA equations for the minimal subsystem; (iii) the original system is identifiable if and only if the solutions to the STA equations in (ii) all satisfy $\vartheta = \vartheta'$.
For Example 4.1, (4.12) is already in the Kalman canonical form and the minimal subsystem is $\dot{x}_1 = \vartheta_1 x_1 + u$, $y = x_1$, which does not involve $\vartheta_2$. Hence, $\vartheta_1$ is identifiable and $\vartheta_2$ is unidentifiable. This example also implies the following identifiability Criterion 4.1, which corresponds to the fact in [132] that the parameters that do not appear in the transfer function are unidentifiable.
Criterion 4.1. Suppose a system is non-minimal. Perform the Kalman decomposition to obtain its minimal subsystem and non-minimal subsystem. The unknown parameters that do not appear in the minimal subsystem are unidentifiable.
For a non-minimal system, even if all the unknown parameters appear in the minimal subsystem and the STA equations for the original system (rather than the minimal subsystem) exclude the solutions $\vartheta \neq \vartheta'$, it is not sufficient for guaranteeing the identifiability of the original system. A straightforward example can be obtained by substituting $\vartheta_1$ and $\vartheta_2$ in Example 4.1 with $\vartheta_1 + \vartheta_2$ and $\vartheta_1 - \vartheta_2$, respectively.
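The modified example can be examined numerically. In the sketch below (illustrative values, hypothetical helper names), two parameter vectors with the same sum $\vartheta_1 + \vartheta_2$ but different differences give different system matrices, yet identical Markov parameters $CA^kB$ and hence identical outputs.

```python
import numpy as np


def modified_example(theta1, theta2):
    """Example 4.1 with the diagonal entries replaced by theta1 +/- theta2."""
    A = np.diag([theta1 + theta2, theta1 - theta2]).astype(float)
    B = np.array([[1.0], [0.0]])
    C = np.array([[1.0, 0.0]])
    return A, B, C


def markov_parameters(A, B, C, count=6):
    """Return C A^k B for k = 0, ..., count - 1."""
    out, Ak = [], np.eye(A.shape[0])
    for _ in range(count):
        out.append((C @ Ak @ B).item())
        Ak = Ak @ A
    return np.array(out)


theta = (1.0, 0.5)
theta_alt = (0.5, 1.0)   # same sum, different difference

A1, B1, C1 = modified_example(*theta)
A2, B2, C2 = modified_example(*theta_alt)

system_matrices_differ = not np.allclose(A1, A2)
outputs_coincide = np.allclose(
    markov_parameters(A1, B1, C1), markov_parameters(A2, B2, C2)
)
```

Both flags evaluate to `True`: only the combination $\vartheta_1 + \vartheta_2$ enters the minimal subsystem, so the individual parameters cannot be recovered from the output.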
Although it is necessary to analyze the minimality before solving the STA equations in most situations, we find a shortcut for some special cases.
Criterion 4.2. If the STA equations for a system have a (non-atypical) solution $\vartheta_0 \neq \vartheta_0'$, the system is unidentifiable regardless of whether it is minimal or not.
For the proof of Criterion 4.2, we consider two specific realizations $(A(\vartheta_0), B(\vartheta_0), C(\vartheta_0), D(\vartheta_0))$ and $(A(\vartheta_0'), B(\vartheta_0'), C(\vartheta_0'), D(\vartheta_0'))$ for the system. According to the form of STA equations (4.11), these two different (possibly non-minimal) realizations are related by a similarity transformation. Using Property 4.1 they result in the same transfer function. Therefore, different system parameters are generating the same system model. This means the system must be unidentifiable, which proves Criterion 4.2.
As pointed out in [42], the controllability and observability properties are neither sufficient nor necessary for identifiability. Example 4.1 has shown that non-minimal systems may be unidentifiable. Moreover, if one replaces $\vartheta_2$ in the system matrix of (4.12) with $\vartheta_1$, then the system becomes identifiable, which indicates non-minimal systems can also be identifiable.

Figure 4.1: Relationships between identifiability criteria.
In Fig. 4.1, we summarize all the results of Sec. 4.3.1 and 4.3.2. Note that for non-minimal systems Criterion 4.2 is necessary but not sufficient, different from the case for minimal systems.
We would like to further emphasize the link between non-minimal systems and Laplace transform approach. When one is deducing the transfer function (matrix), if pole-zero cancellation happens, then the system is in fact non-minimal; otherwise it is minimal. Hence, the transfer function matrix in (4.8) should be in the reduced form after possible pole-zero cancellation.
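As a worked illustration of this link, consider again the system (4.12) of Example 4.1. The computation below is a sketch that uses only the matrices given in (4.12):

```latex
T(s,\vartheta) = C(sI-A)^{-1}B
= \begin{pmatrix} 1 & 0 \end{pmatrix}
  \begin{pmatrix} \dfrac{1}{s-\vartheta_1} & 0 \\ 0 & \dfrac{1}{s-\vartheta_2} \end{pmatrix}
  \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \frac{s-\vartheta_2}{(s-\vartheta_1)(s-\vartheta_2)}
= \frac{1}{s-\vartheta_1}.
```

The factor $s - \vartheta_2$ cancels, so the reduced transfer function has degree one while the state dimension is two. This pole-zero cancellation reflects the non-minimality of (4.12), and the reduced form no longer contains $\vartheta_2$.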
4.3.3 Structure Preserving Transformation method for non-minimal systems
The Structure Preserving Transformation (SPT) method is an idea we develop for identifiability analysis on non-minimal systems. Suppose there is a non-minimal system $\Sigma = (A, B, C, D)$ with state vector $x$. If Criterion 4.2 fails, traditionally we have to perform Kalman decomposition. We let $\bar{x} = Px$ such that the equivalent system $\bar{\Sigma} = (\bar{A}, \bar{B}, \bar{C}, \bar{D})$ has the Kalman canonical form. Then, we employ the STA equations for its minimal subsystem $\bar{\Sigma}_1 = (\bar{A}_1, \bar{B}_1, \bar{C}_1, \bar{D}_1)$, with the corresponding state vector $\bar{x}_1$ having a dimension smaller than $x$.
Quantum systems often generate clear structure properties in $A$. These structure properties may be completely disguised in the system $\bar{\Sigma}$, making the STA equations difficult to solve. This problem is seldom investigated in classical control theory, because for a classical system when faced with such problems, one prefers to change the system structure $(A, B, C, D)$ so that the system becomes minimal. On the contrary, quantum research sometimes investigates the physical capability of a certain fixed system setting and the initial quantum system states or the observables may be difficult to change. Therefore, changing $(A, B, C, D)$ may not be practical. How can we keep (some of) the structure properties of the original system $\Sigma$ and meanwhile perform STA analysis?
The idea of SPT is to further perform a similarity transformation on $\bar{\Sigma}$ to recover (some of) the structure properties of $\Sigma$, meanwhile preserving the canonically decomposed form. To do this, we let $\tilde{x} = (\tilde{P}^{-1} \oplus I)\bar{x}$ and obtain a system $\tilde{\Sigma} = (\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D})$, where $\tilde{P}^{-1}$ acts only on the minimal subsystem $\bar{\Sigma}_1$. Since the second transformation $\tilde{P}^{-1} \oplus I$ is block-diagonal, $\tilde{\Sigma}$ is still in the Kalman canonical form, and the matrices $(\tilde{A}_1, \tilde{B}_1, \tilde{C}_1, \tilde{D}_1)$ are submatrices of those in $\tilde{\Sigma}$, respectively. If $\tilde{P}$ is close to $P$ (in form/appearance, not in norm), or $\tilde{P}^{-1}$ is close to $P^{-1}$, then we are likely to regain an $\tilde{A}_1$ that resembles $A$ in structure, thus recovering key structure properties. Then we solve the STA equations for the minimal subsystem $\tilde{\Sigma}_1$ to determine the identifiability.
In the SPT method, $\tilde{P}$ can never be exactly equal to $P$, because their dimensions are different. The choice of $\tilde{P}$ is not unique and depends on specific problems. One common choice is to let $\tilde{P}$ be a submatrix of $P$. An example using the SPT method is provided in Sec. 4.5.1.
4.4 Quantum Hamiltonian identifiability via STA
4.4.1 General framework
We clarify several points when using STA for analyzing Hamiltonian identifiability of a quantum system. For simplicity we only consider single input Hamiltonian systems (i.e., the state variable $x$ has only one column), while the result can be straightforwardly extended to multi-input systems. A quantum system of (4.4) and (4.5) with the initial state $x(0) = x_0$ is equivalent to the following zero-initial-state system:

$$\dot{x} = Ax + Bu, \quad x(0) = 0, \qquad y = Cx,$$

where $B = x_0$ and $u = \delta(t)$.
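The equivalence can be seen from the standard variation-of-constants formula. As a brief sketch, with the impulse applied at $t = 0$ and $B = x_0$:

```latex
x(t) = \int_{0^-}^{t} e^{A(t-\tau)} B\,\delta(\tau)\,\mathrm{d}\tau = e^{At} x_0,
\qquad
y(t) = C e^{At} x_0,
\qquad
Y(s) = C\,(sI - A)^{-1} x_0 .
```

The state $e^{At}x_0$ is exactly the solution of (4.4) started from $x(0) = x_0$, so the measured output of the quantum system coincides with the impulse response of the zero-initial-state system, and its Laplace transform is the transfer function with $B = x_0$.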
For a quantum Hamiltonian, $x_0$ and $C$ are usually determined and $A$ is antisymmetric. We rewrite (4.11) as:

$$SA(\vartheta) = A(\vartheta')S, \tag{4.13}$$

$$Sx_0 = x_0, \tag{4.14}$$

$$C = CS, \tag{4.15}$$

together with the requirement that $S$ is nonsingular and other possible constraints on $\vartheta$ and $\vartheta'$. Eqs. (4.13)-(4.15) are the starting point for STA analysis for the rest of this chapter.
Next we use STA to test the identifiability for single-probe-assisted spin-1/2 chain systems in [132], which have the form of a one-dimensional chain, composed of multiple qubits with their interaction governed by the system Hamiltonian. It is usually assumed that only the first qubit (the probe qubit) can be initialized and measured, while the rest of the qubits are all inaccessible (and thus they are assumed to be in the maximally mixed state initially). As in [132], we identify only the magnitude of the unknown parameters in the Hamiltonian; i.e., a system is identifiable if and only if all the solutions to the STA equations for the minimal (sub-)systems satisfy $|\vartheta_i| = |\vartheta_i'|$. There are four physical models in [132]. The transfer function of the Ising model without transverse field can be directly calculated, so we omit the STA analysis for this model. The Ising model with the transverse field can also be skipped, because the system matrix has the same structure as that in the exchange model without transverse field. Hence, we only analyze two exchange models, with and without transverse field. Let $\vartheta = (\vartheta_1, \vartheta_2, ..., \vartheta_{N_H})^T$ be the unknown parameters.
For the exchange model without transverse field, $N_H + 1$ is the total qubit number and the Hamiltonian can be written as

$$H = \sum_{i=1}^{N_H} \frac{(-1)^i \vartheta_i}{2} (X_i X_{i+1} + Y_i Y_{i+1}), \tag{4.16}$$

where the subscript $i$ denotes the $i$-th qubit. The observable is $X_1$ with the initial state being an eigenstate of $X_1$. For the exchange model with transverse field, $N_H$ must be odd and $\frac{N_H+1}{2}$ is the total qubit number. The Hamiltonian can be written as

$$H = \sum_{i=1}^{\frac{N_H+1}{2}} \frac{\vartheta_{2i-1}}{2} Z_i + \sum_{i=1}^{\frac{N_H-1}{2}} \frac{\vartheta_{2i}}{2} (X_i X_{i+1} + Y_i Y_{i+1}). \tag{4.17}$$
With the initial state being the eigenstate of $X_1$, the observable can be $X_1$ or $Y_1$. Hence, there are altogether three situations to be analyzed, which are summarized as Theorems 4.1-4.3. These three situations were first investigated in [132] and only verified numerically for several specific cases. Here, we provide a mathematical proof for arbitrary dimension. Also, Theorems 4.1-4.3 contain various situations to showcase the power of STA: Theorem 4.1 and Theorem 4.3 characterize identifiable minimal systems, while Theorem 4.2 corresponds to an unidentifiable minimal system. An example of dealing with identifiable non-minimal systems will be presented in Theorem 4.4. Moreover, the proof for Theorem 4.1 here is slightly different from the version in [151]. The proof here exploits the system's symmetry, and this idea can be applicable to other symmetric systems.
4.4.2 Exchange model without transverse field
The Hamiltonian for this spin system is described in [132], which also derives the system model (4.16). As in [132], we choose $G^{(0)} = \{X_1\}$, and the accessible set is

$$\bar{G} = \{X_1, Z_1 Y_2, Z_1 Z_2 X_3, ..., Z_1 \cdots Z_{N_H} K_{N_H+1}\}, \tag{4.18}$$

where $K = X$ if $N_H$ is even, and $K = Y$ otherwise.
Then we start from the linear system form (4.7). In the system matrix A only the elements directly above or below the main diagonal are non-zero:
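$$A = \begin{pmatrix} 0 & \vartheta_1 & 0 & \cdots & 0 \\ -\vartheta_1 & 0 & \vartheta_2 & \ddots & \vdots \\ 0 & -\vartheta_2 & 0 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vartheta_{N_H} \\ 0 & \cdots & 0 & -\vartheta_{N_H} & 0 \end{pmatrix}_{(N_H+1) \times (N_H+1)}. \tag{4.19}$$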
The initial state of the probe is an eigenstate of $X_1$. Hence, from (4.18) we know $B = x_0 = (1, 0, ..., 0)^T$. We measure $X_1$, and thus $C = (1, 0, ..., 0)$. We have the following theorem to characterize this system:
Theorem 4.1. The exchange model without transverse field is identifiable when measuring X1 on the single qubit probe, with the initial state of the probe in an eigenstate of X1.
Proof. We first prove this system is minimal for almost any value of the unknown parameters, and then test the identifiability.
4.4.2.1 Proof of minimality
Lemma 4.1. With (4.19) and $B = (1, 0, ..., 0)^T$, the controllability matrix $CM = [B, AB, ..., A^{N_H}B]$ has full rank for almost any value of $\vartheta$.

The proof of Lemma 4.1 is provided in Appendix D. Then, given the observability matrix

$$OM = \begin{pmatrix} C \\ CA \\ \vdots \\ CA^{N_H} \end{pmatrix} = \mathrm{diag}(1, -1, 1, -1, ..., (-1)^{N_H}) \cdot CM^T,$$

the system is also almost always observable. Therefore, it is almost always minimal.
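The minimality argument can be checked numerically for a small chain. The following sketch assumes $N_H = 4$ with arbitrarily chosen parameter values; the helper names are hypothetical. It verifies that the controllability matrix has full rank, that the observability matrix satisfies the relation above, and that an atypical value ($\vartheta_2 = 0$) destroys controllability.

```python
import numpy as np


def chain_matrix(theta):
    """Tridiagonal antisymmetric A of (4.19) for theta_1, ..., theta_NH."""
    theta = np.asarray(theta, dtype=float)
    return np.diag(theta, k=1) - np.diag(theta, k=-1)


def controllability_matrix(A, B):
    """[B, AB, ..., A^(n-1) B] for an n-dimensional system."""
    cols, v = [], B
    for _ in range(A.shape[0]):
        cols.append(v)
        v = A @ v
    return np.hstack(cols)


def observability_matrix(A, C):
    """[C; CA; ...; C A^(n-1)] for an n-dimensional system."""
    rows, v = [], C
    for _ in range(A.shape[0]):
        rows.append(v)
        v = v @ A
    return np.vstack(rows)


theta = [0.9, -1.4, 0.6, 2.1]          # illustrative values, N_H = 4
A = chain_matrix(theta)
n = A.shape[0]
B = np.zeros((n, 1))
B[0, 0] = 1.0
C = B.T

CM = controllability_matrix(A, B)
OM = observability_matrix(A, C)
signs = np.diag([(-1.0) ** k for k in range(n)])

# Atypical case: theta_2 = 0 decouples the chain from the probe.
CM_atypical = controllability_matrix(chain_matrix([0.9, 0.0, 0.6, 2.1]), B)
```

For the generic values both `CM` and `OM` have rank `n`, whereas `CM_atypical` is rank-deficient, illustrating why minimality holds only for almost any value of the parameters.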
4.4.2.2 Identifiability test
We now employ the STA equations to test the identifiability. We provide a proof slightly different from the version in [151].
We observe that the parameters $\vartheta_2, \vartheta_3, ..., \vartheta_{N_H}$ are symmetric in this system. Namely, if we make a similarity transformation to just swap any two parameters $\vartheta_i$ and $\vartheta_j$ where $2 \leq i \neq j \leq N_H$, then $B$ and $C$ will remain unchanged, and the only difference in $A$ is that the indices $i$ and $j$ in $\vartheta$ are swapped. Therefore, the parameters $\vartheta_2, \vartheta_3, ..., \vartheta_{N_H}$ must have the same identifiability conclusion; i.e., if one of them is identifiable, the rest are also identifiable, and vice versa. It thus suffices to prove that $\vartheta_1$ and $\vartheta_2$ are identifiable, which can be obtained as follows.
Using (4.14) and (4.15) we know S is of the form
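$$S = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}_{(N_H+1) \times (N_H+1)}, \tag{4.20}$$

and (4.13) is now

$$\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix} \begin{pmatrix} 0 & \vartheta_1 & 0 & \cdots \\ -\vartheta_1 & 0 & \ddots & \\ 0 & \ddots & \ddots & \\ \vdots & & & \end{pmatrix} = \begin{pmatrix} 0 & \vartheta_1' & 0 & \cdots \\ -\vartheta_1' & 0 & \ddots & \\ 0 & \ddots & \ddots & \\ \vdots & & & \end{pmatrix} \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}. \tag{4.21}$$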
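For (4.21), consider $\mathrm{LHS}_{12} = \mathrm{RHS}_{12}$; we have $\vartheta_1 = \vartheta_1' S_{22}$. From $\mathrm{LHS}_{21} = \mathrm{RHS}_{21}$, we have $-\vartheta_1 S_{22} = -\vartheta_1'$. Hence, $\vartheta_1 = \vartheta_1 S_{22}^2$. Since the atypical case of $\vartheta_1 = 0$ is not considered, we know $|S_{22}| = 1$, which indicates $|\vartheta_1| = |\vartheta_1'|$.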
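Continuing to analyze (4.21), consider $\mathrm{LHS}_{\sigma 1} = \mathrm{RHS}_{\sigma 1}$ for $\sigma \geq 3$; we thus have $S_{32} = ... = S_{(N_H+1)2} = 0$. Similarly from $\mathrm{LHS}_{1\sigma} = \mathrm{RHS}_{1\sigma}$, we know $S_{23} = ... = S_{2(N_H+1)} = 0$. Then from $\mathrm{LHS}_{23} = \mathrm{RHS}_{23}$, we have $S_{22}\vartheta_2 = \vartheta_2' S_{33}$. From $\mathrm{LHS}_{32} = \mathrm{RHS}_{32}$, we have $-\vartheta_2 S_{33} = -\vartheta_2' S_{22}$. Since $|S_{22}| = 1$ and the atypical case of $\vartheta_2 = 0$ is not considered, these two equations give $|S_{33}| = 1$ and $|\vartheta_2| = |\vartheta_2'|$. Hence, $\vartheta_2$ is also identifiable, which completes the proof.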
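The role of the sign ambiguity can be illustrated numerically. The sketch below takes $N_H = 3$ with arbitrary parameter values and hypothetical helper names; it shows that a diagonal sign matrix $S$ satisfies the STA equations (4.13)-(4.15) with sign-flipped parameters, whereas changing a magnitude already changes the second Markov parameter $CA^2B = -\vartheta_1^2$.

```python
import numpy as np


def chain_matrix(theta):
    """Tridiagonal antisymmetric A of (4.19)."""
    theta = np.asarray(theta, dtype=float)
    return np.diag(theta, k=1) - np.diag(theta, k=-1)


def sta_residuals(S, theta, theta_alt):
    """Largest absolute residuals of (4.13), (4.14) and (4.15)."""
    A, A_alt = chain_matrix(theta), chain_matrix(theta_alt)
    x0 = np.zeros(A.shape[0])
    x0[0] = 1.0
    C = x0  # C = (1, 0, ..., 0)
    return (
        np.abs(S @ A - A_alt @ S).max(),
        np.abs(S @ x0 - x0).max(),
        np.abs(C - C @ S).max(),
    )


theta = np.array([0.9, -1.4, 0.6])
S = np.diag([1.0, -1.0, 1.0, 1.0])

# Flipping the sign of S_22 flips the signs of theta_1 and theta_2.
theta_flipped = np.array([-0.9, 1.4, 0.6])
res_flipped = sta_residuals(S, theta, theta_flipped)

# Changing a magnitude is visible in the output: C A^2 B = -theta_1^2.
A = chain_matrix(theta)
A_scaled = chain_matrix([1.1, -1.4, 0.6])
m2 = (A @ A)[0, 0]
m2_scaled = (A_scaled @ A_scaled)[0, 0]
```

All three residuals vanish for the sign-flipped parameters, consistent with identifying only the magnitudes $|\vartheta_i|$, while `m2` and `m2_scaled` differ.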
Remark 4.1. The above new proof of Theorem 4.1 is more general and enlightening than the original proof in [151], because here we take full advantage of the system symmetry. One can imagine that in other physical systems (not necessarily in the quantum domain) some appropriate symmetry might also yield a common identifiability conclusion for some of the unknown parameters, thus making the identifiability analysis more straightforward and more focused on the vital part.
The relevant result in Theorem 4.1 was also presented in [52], where a specific Hamiltonian identification algorithm for the same system setting was proposed. Here we present our result as an example to illustrate the effectiveness of STA and the idea of symmetry exploitation.
4.4.3 Exchange model with transverse field
The Hamiltonian for this system is as in (4.17), where $N_H$ must be odd. As in [132], both $G^{(0)} = \{X_1\}$ and $G^{(0)} = \{Y_1\}$ yield the accessible set
\[
\bar G = \{X_1, Y_1, Z_1X_2, Z_1Y_2, Z_1Z_2X_3, Z_1Z_2Y_3, \dots, Z_1 \cdots Z_{(N_H-1)/2}Y_{(N_H+1)/2}\}. \tag{4.22}
\]
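Then we start from the linear system form (4.7). In $A$, each $\vartheta_{2k+1}$ appears twice and each $\vartheta_{2k}$ appears four times:
\[
A = \begin{pmatrix}
0 & \vartheta_1 & 0 & -\vartheta_2 & \cdots & \\
-\vartheta_1 & 0 & \vartheta_2 & 0 & \cdots & \\
0 & -\vartheta_2 & 0 & \ddots & & \\
\vartheta_2 & 0 & \ddots & \ddots & & \\
\vdots & \vdots & & & 0 & \vartheta_{N_H} \\
 & & & & -\vartheta_{N_H} & 0
\end{pmatrix}_{(N_H+1)\times(N_H+1)}. \tag{4.23}
\]
The initial state of the probe is an eigenstate of $X_1$. Hence, from (4.22) we know $B = x_0 = (1, 0, \dots, 0)^T$. With Property 4.1, we can first rearrange $A$ as follows: we take its odd rows in ascending sequence and then take its even rows in ascending sequence, and then apply the same procedure to its columns. We thus rewrite $A$ into
\[
A = \begin{pmatrix} 0 & \bar A \\ -\bar A & 0 \end{pmatrix}, \tag{4.24}
\]
where
\[
\bar A = \begin{pmatrix}
\vartheta_1 & -\vartheta_2 & 0 & \cdots & 0 \\
-\vartheta_2 & \vartheta_3 & -\vartheta_4 & \ddots & \vdots \\
0 & -\vartheta_4 & \vartheta_5 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -\vartheta_{N_H-1} \\
0 & \cdots & 0 & -\vartheta_{N_H-1} & \vartheta_{N_H}
\end{pmatrix} \tag{4.25}
\]
is symmetric. After this transformation, we have $B = (1, 0, \dots, 0)^T$ unchanged.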
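The rearrangement from (4.23) to (4.24) is a simultaneous row and column permutation, which can be illustrated numerically. The sketch below fills $A$ entry by entry following the pattern of (4.23) and then applies the odd-then-even reordering; the helper name `build_A` and the parameter values are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

def build_A(theta):
    """Fill A entry by entry following the pattern of (4.23); len(theta) must be odd."""
    NH = len(theta)
    A = np.zeros((NH + 1, NH + 1))
    for p, th in enumerate(theta):       # p = 0 corresponds to theta_1
        r = 2 * (p // 2)                 # 0-based index of the leading row
        if p % 2 == 0:                   # theta_1, theta_3, ...: appears twice
            A[r, r + 1], A[r + 1, r] = th, -th
        else:                            # theta_2, theta_4, ...: appears four times
            A[r, r + 3], A[r + 2, r + 1] = -th, -th
            A[r + 1, r + 2], A[r + 3, r] = th, th
    return A

theta = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # illustrative values, N_H = 5
A = build_A(theta)
n = A.shape[0]
# Odd rows (1-based) first, then even rows; the same order is applied to the columns.
perm = list(range(0, n, 2)) + list(range(1, n, 2))
A_perm = A[np.ix_(perm, perm)]
h = n // 2
A_bar = A_perm[:h, h:]                          # upper-right block of (4.24)
```

For these values `A_bar` is the symmetric tridiagonal matrix of (4.25), the lower-left block is its negative, and the first coordinate is not moved by the permutation, so $B$ is unchanged.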
4.4.3.1 Measuring X1
First we consider measuring X1. Then C = (1, 0, ..., 0). We have the following conclusion:
Theorem 4.2. The exchange model with transverse field is unidentifiable when measuring X1 on the single qubit probe, with the initial state of the probe in an eigenstate of X1.
Proof. We employ Criterion 4.2 to prove the conclusion, and thus do not need to analyze its minimality. When $A$ in (4.23) is transformed to (4.24), $C$ is unchanged and we assume $S$ is transformed to $\bar S$. Now (4.14) and (4.15) imply $\bar S$ is of the same form as (4.20). We do not need to find all the solutions to (4.13). Instead, we only need to find a special solution to (4.13) which gives $|\vartheta_i'| \ne |\vartheta_i|$ for some $i$. We assume
\[
\bar S = 1_{1\times 1} \oplus N_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \oplus M_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}},
\]
which satisfies the form (4.20). Eq. (4.13) now is
\[
\begin{pmatrix} 1 & & \\ & N & \\ & & M \end{pmatrix}
\begin{pmatrix} & \bar A \\ -\bar A & \end{pmatrix}
=
\begin{pmatrix} & \bar A' \\ -\bar A' & \end{pmatrix}
\begin{pmatrix} 1 & & \\ & N & \\ & & M \end{pmatrix}. \tag{4.26}
\]
We further assume $N$ and $M$ are orthogonal, which guarantees that $\bar S$ is nonsingular and now (4.26) is in essence only one equation:
\[
\begin{pmatrix} 1 & \\ & N \end{pmatrix} \bar A M^T = \bar A'. \tag{4.27}
\]
We perform spectral decomposition on $\bar A$ to have $\bar A = PEP^T$ where $P$ is orthogonal and $E$ is diagonal. Denote by $\Lambda(A)$ the set of all the eigenvalues of $A$, where repeated eigenvalues appear multiple times. We have the following lemma to exclude the atypical cases:
Lemma 4.2. Given arbitrary $\lambda_0 \in \mathbb{C}$, it is atypical that $\lambda_0 \in \Lambda(\bar A)$.
Lemma 4.2 is non-trivial. For example, if we change the structure of $\bar A$ to $\begin{pmatrix} \vartheta_1 & \vartheta_2 \\ \vartheta_1 & \vartheta_2 \end{pmatrix}$, then it is always true that $0 \in \Lambda(\bar A)$.
We leave the specific details of the proof of Lemma 4.2 to Appendix E and only sketch the main idea here, since this idea is quite general in proving many similar propositions. Usually we first reduce the problem to proving that a certain polynomial ($\det(\bar A - \lambda_0 I)$ in the case of Lemma 4.2) is almost always non-zero. Then, since a finite-order polynomial only has a finite number of roots, it suffices to find the polynomial non-zero for some particular values of the unknown parameters. For the detailed proof, please refer to Appendix E.
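Let
\[
I_{-k} = \mathrm{diag}(\underbrace{1, \dots, 1}_{k-1}, -1, 1, \dots, 1),
\]
where only the $k$th element is $-1$. We have the following assertion:
Lemma 4.3. $\exists\, k \in \{1, 2, \dots, N_H\}$ such that $|\vartheta_1| \ne |(PEI_{-k}P^T)_{11}|$.
The proof of Lemma 4.3 is given in Appendix F. Using Lemma 4.3, suppose $|\vartheta_1| \ne |(PEI_{-m}P^T)_{11}|$. We let
\[
M^T = PI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix}.
\]
As long as $N$ is orthogonal, $M$ is orthogonal. We denote the LHS of (4.27) as $\bar L$, and have
\[
\begin{aligned}
\bar L &= \begin{pmatrix} 1 & \\ & N \end{pmatrix} \bar A M^T \\
&= \begin{pmatrix} 1 & \\ & N \end{pmatrix} PEP^T PI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix} \\
&= \begin{pmatrix} 1 & \\ & N \end{pmatrix} PEI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix}.
\end{aligned} \tag{4.28}
\]
We thus know
\[
\begin{aligned}
|\bar L_{11}| &= \left| I_{1\sigma} \begin{pmatrix} 1 & \\ & N \end{pmatrix} PEI_{-m}P^T \begin{pmatrix} 1 & \\ & N^T \end{pmatrix} I_{\sigma 1} \right| \\
&= |I_{1\sigma}PEI_{-m}P^T I_{\sigma 1}| = |(PEI_{-m}P^T)_{11}| \ne |\vartheta_1|.
\end{aligned}
\]
From (4.28) we know $\bar L$ is always symmetric. Then we only need to find an appropriate orthogonal $N$ to make $\bar L$ have the same positions of zeros as $\bar A$. Denote $Z = PEI_{-m}P^T$, which is symmetric. We design a series of orthogonal matrices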
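$N^{(1)}_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}}, N^{(2)}_{\frac{N_H-3}{2}\times\frac{N_H-3}{2}}, \dots, N^{(\frac{N_H-3}{2})}_{2\times 2}$ such that
\[
N = \begin{pmatrix} I_{\frac{N_H-5}{2}\times\frac{N_H-5}{2}} & \\ & N^{(\frac{N_H-3}{2})} \end{pmatrix} \cdots \begin{pmatrix} I_{1\times 1} & \\ & N^{(2)} \end{pmatrix} N^{(1)}.
\]
We further denote a series of $\frac{N_H+1}{2}$-dimensional matrices $Z^{(1)}, Z^{(2)}, \dots, Z^{(\frac{N_H-3}{2})}$ such that
\[
Z^{(1)} = \begin{pmatrix} 1 & \\ & N^{(1)} \end{pmatrix} Z \begin{pmatrix} 1 & \\ & [N^{(1)}]^T \end{pmatrix} \tag{4.29}
\]
and $Z^{(i+1)} = (I_{i+1} \oplus N^{(i+1)}) Z^{(i)} (I_{i+1} \oplus [N^{(i+1)}]^T)$ for $1 \le i \le \frac{N_H-5}{2}$. Then $Z^{(\frac{N_H-3}{2})} = \bar L$. We start from the innermost layer (4.29).
We partition $Z$ as
\[
Z = \begin{pmatrix} Z_{11} & J_{1\times\frac{N_H-1}{2}} \\ (J_{1\times\frac{N_H-1}{2}})^T & \mathcal{J}_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix},
\]
and have
\[
Z^{(1)} = \begin{pmatrix} Z_{11} & J[N^{(1)}]^T \\ N^{(1)}J^T & N^{(1)}\mathcal{J}[N^{(1)}]^T \end{pmatrix}. \tag{4.30}
\]
In (4.30), $Z_{11}$ is unchanged and we need to make $J[N^{(1)}]^T$ have the form
\[
J[N^{(1)}]^T = (*_{1\times 1}, 0, \dots, 0). \tag{4.31}
\]
We perform spectral decomposition to set
\[
J^TJ = U^{(1)}\mathrm{diag}(*, 0, \dots, 0)[U^{(1)}]^T.
\]
Then $N^{(1)} = [U^{(1)}]^T$ is orthogonal and (4.31) holds.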
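For the next layer, we partition $Z^{(1)}$ as
\[
Z^{(1)} = \begin{pmatrix}
Z_{11} & * & 0_{1\times\frac{N_H-3}{2}} \\
* & * & K_{1\times\frac{N_H-3}{2}} \\
0_{\frac{N_H-3}{2}\times 1} & (K_{1\times\frac{N_H-3}{2}})^T & \mathcal{K}_{\frac{N_H-3}{2}\times\frac{N_H-3}{2}}
\end{pmatrix}.
\]
We then have
\[
\begin{aligned}
Z^{(2)} &= \begin{pmatrix} 1 & & \\ & 1 & \\ & & N^{(2)} \end{pmatrix} Z^{(1)} \begin{pmatrix} 1 & & \\ & 1 & \\ & & [N^{(2)}]^T \end{pmatrix} \\
&= \begin{pmatrix}
Z_{11} & * & 0_{1\times\frac{N_H-3}{2}} \\
* & * & K[N^{(2)}]^T \\
0_{\frac{N_H-3}{2}\times 1} & N^{(2)}K^T & N^{(2)}\mathcal{K}[N^{(2)}]^T
\end{pmatrix}.
\end{aligned}
\]
$Z_{11}$ is unchanged and we need to make $K[N^{(2)}]^T$ take the form
\[
K[N^{(2)}]^T = (*_{1\times 1}, 0, \dots, 0).
\]
We perform spectral decomposition to make
\[
K^TK = U^{(2)}\mathrm{diag}(*, 0, \dots, 0)[U^{(2)}]^T,
\]
and then $N^{(2)} = [U^{(2)}]^T$ is what we need. Continuing the above procedure, we can finally determine an orthogonal $N$ such that $\bar L = Z^{(\frac{N_H-3}{2})}$ has the same structure as $\bar A$. Since $Z_{11}$ is unchanged and $|Z_{11}| \ne |\vartheta_1|$, we know $|\bar L_{11}| \ne |\vartheta_1|$, which implies we have found a special unequal solution to the STA equations. Thus the system is unidentifiable.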
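The construction in this proof can be reproduced numerically. The sketch below is an illustration under stated assumptions rather than part of the original proof: the function names and parameter values are hypothetical, the index $m$ is simply chosen as the one that changes $|Z_{11}|$ most, and the final comparison of Markov parameters $CA^kB$ is an added check that the two parameter sets give the same output when measuring $X_1$.

```python
import numpy as np

def tridiag(theta):
    """A_bar of (4.25) from theta = (theta_1, ..., theta_NH), NH odd."""
    return np.diag(theta[0::2]) - np.diag(theta[1::2], 1) - np.diag(theta[1::2], -1)

def unequal_solution(theta):
    """Return L_bar with the zero pattern of A_bar but |L_bar[0, 0]| != |theta_1|."""
    A_bar = tridiag(theta)
    n = A_bar.shape[0]
    lam, P = np.linalg.eigh(A_bar)               # A_bar = P diag(lam) P^T
    # (P E I_{-k} P^T)_{11} = theta_1 - 2 P[0, k]^2 lam[k]; take the k that changes |.| most.
    z11 = theta[0] - 2.0 * P[0, :] ** 2 * lam
    m = int(np.argmax(np.abs(np.abs(z11) - abs(theta[0]))))
    flip = np.ones(n)
    flip[m] = -1.0
    Z = (P * (lam * flip)) @ P.T                 # Z = P E I_{-m} P^T
    for i in range(n - 2):                       # one pass per layer N^(i+1)
        K = Z[i, i + 1:]                         # block that must become (*, 0, ..., 0)
        _, U = np.linalg.eigh(np.outer(K, K))    # spectral decomposition of K^T K
        T = np.eye(n)
        T[i + 1:, i + 1:] = U[:, ::-1].T         # eigenvector of the non-zero eigenvalue first
        Z = T @ Z @ T.T
    return Z

def markov_parameters(A_bar, count):
    """C A^k B for k < count, with B = C^T = e_1 and A assembled as in (4.24)."""
    n = A_bar.shape[0]
    zero = np.zeros((n, n))
    A = np.block([[zero, A_bar], [-A_bar, zero]])
    return np.array([np.linalg.matrix_power(A, k)[0, 0] for k in range(count)])

theta = np.array([1.0, 0.7, 1.5, 0.4, 2.0, 0.9, 0.6])   # illustrative values, N_H = 7
A_bar = tridiag(theta)
L_bar = unequal_solution(theta)
theta_1_alt = L_bar[0, 0]                                # alternative value of theta_1
```

Reading the alternative parameters off `L_bar` (diagonal entries for odd indices, negated off-diagonal entries for even indices) gives a second parameter set with a different $|\vartheta_1|$ whose Markov parameters coincide with those of the original system, which is the unidentifiability stated in Theorem 4.2.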
4.4.3.2 Measuring Y1
Now we consider measuring Y1, which sets C = (0, 1, 0, ..., 0). We have the following theorem to correct the conclusion in [132].
Theorem 4.3. The exchange model with transverse field is identifiable when measuring Y1 on the single qubit probe, with the initial state of the probe in an eigenstate of X1.
Proof. After $A$ in (4.23) is transformed to (4.24), $C$ is transformed to
\[
C = (0_{1\times\frac{N_H+1}{2}}, \bar C), \quad \bar C = (1, 0_{1\times\frac{N_H-1}{2}}). \tag{4.32}
\]
Denote
\[
B = (\bar B^T, 0_{1\times\frac{N_H+1}{2}})^T, \quad \bar B = (1, 0_{1\times\frac{N_H-1}{2}})^T. \tag{4.33}
\]
We have the following lemma (the proof is given in Appendix G) to show that the system is minimal.
Lemma 4.4. With (4.24), (4.25), (4.32) and (4.33), both the controllability matrix $CM = [B, AB, \dots, A^{N_H}B]$ and the observability matrix $OM = [C^T, A^TC^T, \dots, (A^{N_H})^TC^T]^T$ have full rank for almost any value of $\vartheta$.
By Property 4.1, we use STA to prove the system (4.24) and (4.25) is identifiable with (4.32) and (4.33). We partition $S$ as
\[
S = \begin{pmatrix} X_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} & *_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} \\ *_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} & Y_{\frac{N_H+1}{2}\times\frac{N_H+1}{2}} \end{pmatrix}.
\]
Then (4.13) is
\[
\begin{pmatrix} X & * \\ * & Y \end{pmatrix}\begin{pmatrix} 0 & \bar A \\ -\bar A & 0 \end{pmatrix} = \begin{pmatrix} 0 & \bar A' \\ -\bar A' & 0 \end{pmatrix}\begin{pmatrix} X & * \\ * & Y \end{pmatrix}, \tag{4.34}
\]
which is
\[
X\bar A = \bar A'Y, \tag{4.35}
\]
\[
Y\bar A = \bar A'X, \tag{4.36}
\]
where the other two equations on the indeterminate submatrices are omitted. Using (4.14) and (4.15), we have
\[
X_{\sigma 1} = (1, 0, \dots, 0)^T, \quad Y_{1\sigma} = (1, 0, \dots, 0). \tag{4.37}
\]
From (4.35) and (4.36), we have
\[
X^TX\bar A = X^T\bar A'Y = \bar AY^TY, \tag{4.38}
\]
\[
Y^TY\bar A = Y^T\bar A'X = \bar AX^TX. \tag{4.39}
\]
From (4.38) and (4.39), the following relationship holds,
\[
(X^TX - Y^TY)\bar A = -\bar A(X^TX - Y^TY), \tag{4.40}
\]
which is a special form of the Sylvester equation. We rephrase the general procedure for solving the Sylvester equation [169] to solve (4.40). We vectorize (in column) (4.40) to have
\[
(\bar A \otimes I_{\frac{N_H+1}{2}} + I_{\frac{N_H+1}{2}} \otimes \bar A)\,\mathrm{vec}(X^TX - Y^TY) = 0.
\]
Using the same idea as in Appendices E and G, it is straightforward to prove that $\bar A \otimes I + I \otimes \bar A$ is almost always nonsingular by considering $\bar A = I$. An equivalent expression is that we almost always have
\[
\lambda_i(\bar A) + \lambda_j(\bar A) \ne 0 \tag{4.41}
\]
for any $1 \le i, j \le \frac{N_H+1}{2}$. Therefore we can almost always have
\[
X^TX = Y^TY. \tag{4.42}
\]
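The equivalence between nonsingularity of $\bar A \otimes I + I \otimes \bar A$ and condition (4.41) can be checked numerically. The following is a minimal sketch with illustrative parameter values (chosen so that $\bar A$ is diagonally dominant); the helper name `anticommutation_nullity` is an assumption introduced here. The last call shows an atypical case where (4.41) fails.

```python
import numpy as np

def anticommutation_nullity(A_bar):
    """Dimension of the solution space of D A_bar + A_bar D = 0 (column vectorization)."""
    n = A_bar.shape[0]
    K = np.kron(A_bar, np.eye(n)) + np.kron(np.eye(n), A_bar)
    return n * n - np.linalg.matrix_rank(K)

theta = np.array([3.0, 0.5, 4.0, 0.6, 5.0, 0.7, 6.0])   # illustrative values, N_H = 7
A_bar = np.diag(theta[0::2]) - np.diag(theta[1::2], 1) - np.diag(theta[1::2], -1)
n = A_bar.shape[0]

kron_sum = np.kron(A_bar, np.eye(n)) + np.kron(np.eye(n), A_bar)
lam = np.linalg.eigvalsh(A_bar)
pair_sums = np.sort((lam[:, None] + lam[None, :]).ravel())   # all lambda_i + lambda_j
kron_eigs = np.sort(np.linalg.eigvalsh(kron_sum))            # spectrum of the Kronecker sum

typical = anticommutation_nullity(A_bar)                     # only D = 0 solves (4.40)
atypical = anticommutation_nullity(np.diag([1.0, -1.0]))     # lambda_1 + lambda_2 = 0
```

In the typical case the only solution of (4.40) is $X^TX - Y^TY = 0$, while the diagonal example with eigenvalues $\pm 1$ admits a two-dimensional family of non-zero solutions.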
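Similarly, $(XX^T - YY^T)\bar A' = -\bar A'(XX^T - YY^T)$, and thus
\[
(\bar A' \otimes I_{\frac{N_H+1}{2}} + I_{\frac{N_H+1}{2}} \otimes \bar A')\,\mathrm{vec}(XX^T - YY^T) = 0. \tag{4.43}
\]
Lemma 4.5. With (4.24), (4.25) and (4.34), $\bar A' \otimes I_{\frac{N_H+1}{2}} + I_{\frac{N_H+1}{2}} \otimes \bar A'$ is almost always nonsingular.
The proof of Lemma 4.5 is provided in Appendix H. With Lemma 4.5, we can almost always solve (4.43) to have
\[
XX^T = YY^T. \tag{4.44}
\]
Considering (4.37), we partition $X$ and $Y$ as
\[
X = \begin{pmatrix} 1_{1\times 1} & E_{1\times\frac{N_H-1}{2}} \\ 0_{\frac{N_H-1}{2}\times 1} & \tilde X_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix}, \quad
Y = \begin{pmatrix} 1_{1\times 1} & 0_{1\times\frac{N_H-1}{2}} \\ F_{\frac{N_H-1}{2}\times 1} & \tilde Y_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix}.
\]
From (4.42), $(X^TX)_{11} = 1 = (Y^TY)_{11} = 1 + F^TF$, which means $F = 0$. Similarly from (4.44) we have $E = 0$. We partition $\bar A$ as
\[
\bar A = \begin{pmatrix} \vartheta_1 & G_{1\times\frac{N_H-1}{2}} \\ (G_{1\times\frac{N_H-1}{2}})^T & \tilde A_{\frac{N_H-1}{2}\times\frac{N_H-1}{2}} \end{pmatrix}.
\]
Then (4.35) is
\[
\begin{pmatrix} 1 & 0 \\ 0 & \tilde X \end{pmatrix}\begin{pmatrix} \vartheta_1 & G \\ G^T & \tilde A \end{pmatrix} = \begin{pmatrix} \vartheta_1' & G' \\ G'^T & \tilde A' \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & \tilde Y \end{pmatrix},
\]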
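which implies $\vartheta_1 = \vartheta_1'$,
\[
G = G'\tilde Y, \tag{4.45}
\]
\[
\tilde XG^T = G'^T, \tag{4.46}
\]
\[
\tilde X\tilde A = \tilde A'\tilde Y. \tag{4.47}
\]
Eq. (4.45) is $(-\vartheta_2, 0, \dots, 0) = (-\vartheta_2', 0, \dots, 0)\tilde Y$, which implies $\tilde Y_{1\sigma} = (\vartheta_2/\vartheta_2', 0, \dots, 0)$. Similarly (4.46) gives $\tilde X_{\sigma 1} = (\vartheta_2'/\vartheta_2, 0, \dots, 0)^T$. With similar procedures, (4.36) gives $\tilde X_{1\sigma} = (\vartheta_2/\vartheta_2', 0, \dots, 0)$, $\tilde Y_{\sigma 1} = (\vartheta_2'/\vartheta_2, 0, \dots, 0)^T$ and
\[
\tilde Y\tilde A = \tilde A'\tilde X. \tag{4.48}
\]
By equating $\tilde X_{11}$ (or $\tilde Y_{11}$), we find $|\vartheta_2| = |\vartheta_2'|$. If $\vartheta_2 = \vartheta_2'$, we have
\[
\tilde Y_{1\sigma} = (1, 0, \dots, 0) = (\tilde X_{\sigma 1})^T. \tag{4.49}
\]
Now (4.47), (4.48) and (4.49) have the same structures as (4.35), (4.36) and (4.37), respectively, but with the dimension decreased by 1. Otherwise, if $\vartheta_2 = -\vartheta_2'$, we have $-\tilde Y_{1\sigma} = (1, 0, \dots, 0) = (-\tilde X_{\sigma 1})^T$ and we can rewrite (4.47) and (4.48) as $(-\tilde X)\tilde A = \tilde A'(-\tilde Y)$ and $(-\tilde Y)\tilde A = \tilde A'(-\tilde X)$. Therefore, either $\{\tilde X, \tilde Y, \tilde A, \tilde A'\}$ or $\{-\tilde X, -\tilde Y, \tilde A, \tilde A'\}$ has the same structure and property as $\{X, Y, \bar A, \bar A'\}$, but with the dimension decreased by 1. This procedure can thus be performed recursively, until we finally reach $X = Y = \mathrm{diag}(1, \pm 1, \dots, \pm 1)$ and $|\vartheta_i'| = |\vartheta_i|$ for every $1 \le i \le N_H$.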
Remark 4.2. Theorem 4.2 and Theorem 4.3 indicate that when the system matrix A has periodically repeated structure properties, STA analysis can avoid the curse of dimensionality and provide identifiability results for arbitrary dimension.
4.5 From identifiability to economic identification algorithms
If a system is identifiable, we may develop an appropriate identification algorithm to identify the parameters. In this section, we provide another application of STA and SPT to quantum Hamiltonian identification. Generally the dimension of a quantum system is exponential in the number of qubits. Hence, identification algorithms with polynomial complexity in the system dimension will in essence have exponential computational complexity in the number of qubits, which has been referred to as the exponential problem [104]. To avoid this problem, one method is to design identification algorithms with computational complexity directly depending on quantities that increase much slower than the system dimension. Typical such quantities include the number of qubits in multi-qubit systems, or the number of unknown parameters for special physical systems. We find that STA can be a useful tool to indicate the existence of such efficient algorithms.
4.5.1 An indicator for the existence of economic identification algorithms
We aim to design an identification algorithm whose computational complexity depends only on the number of unknown parameters (or parameters of interest). Suppose we have a $d$-dimensional Hamiltonian $H$ with $N_H$ unknown parameters $\vartheta_i$. In most cases, the $a_i$'s in (4.1) are linear functions of $\vartheta_i$. Hence, we can expand $H$ directly using $\vartheta$,
\[
H = \sum_{i=1}^{N_H} \vartheta_i H_i. \tag{4.50}
\]
Using the procedures in Sec. 4.2.1, we can model the evolution of the state as an $N_L$-dimensional linear system model
\[
\begin{aligned}
\dot x &= Ax + B\delta(t), \quad x(0) = 0, \\
y &= Cx,
\end{aligned} \tag{4.51}
\]
where B = x0 is the initial state and each ϑi multiplied by a coefficient is an element of A. We hope the algorithm can identify one unknown element in A under one set of B and C, with the computational complexity for estimating one unknown parameter as f(NH) (or f(NL)) that is a function of NH (or NL) but not of d. Then the total computational complexity to identify the Hamiltonian is NHf(NH) (or NHf(NL)), which does not directly depend on d. In fact, we can reduce f(NH) (or f(NL)) to O(1) in some appropriate cases.
We start by investigating the identification capability of the fundamental setting of B = Iσi and C = Ijσ. By changing indices, we assume that
B = Iσ2 and
C = I1σ.
In the most general case, there are no special properties for the structure of A. Assume that this system (A, B, C) is already minimal. Then from (4.14) and (4.15) we know the transformation matrix S is
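\[
S = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix},
\]
and (4.13) is now
\[
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}
\begin{pmatrix}
A_{11} & A_{12} & \cdots \\
A_{21} & A_{22} & \cdots \\
\vdots & \vdots & \ddots
\end{pmatrix}
=
\begin{pmatrix}
A_{11}' & A_{12}' & \cdots \\
A_{21}' & A_{22}' & \cdots \\
\vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}. \tag{4.52}
\]
From $\mathrm{LHS}_{12} = \mathrm{RHS}_{12}$ of (4.52), we have $A_{12} = A_{12}'$, which indicates this fundamental setting of $B$ and $C$ has the capability of identifying one parameter for minimal systems. Interestingly, we succeed in extending this conclusion to non-minimal systems using STA.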
Theorem 4.4. Given a linear system (A, B, C), Aij is identifiable (including its sign) if B = Iσj and C = Iiσ.
Proof. Without loss of generality, we can always assume that we are identifying A12 or A11 after appropriately arranging the element order of x .
For the case of identifying $A_{12}$, $C = (1, 0, \dots, 0)$ and $B = (0, 1, 0, \dots, 0)^T$. Without loss of generality, we assume that the system is neither controllable nor observable. We tentatively calculate the first two rows of the observability matrix, which are
\[
\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ * & A_{12} & * & \cdots & * \end{pmatrix}. \tag{4.53}
\]
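Since $A_{12} = 0$ is atypical, it is almost always true that (4.53) has rank two. Assume that the observable subsystem of (4.51) has dimension $m$. We thus have $2 \le m < N_L$. Let
\[
T = \begin{pmatrix}
1 & & & & & \\
 & 1 & & & & \\
-A_{32}/A_{12} & & 1 & & & \\
-A_{42}/A_{12} & & & 1 & & \\
\vdots & & & & \ddots & \\
-A_{N_L2}/A_{12} & & & & & 1
\end{pmatrix}_{N_L\times N_L}
\]
and perform a similarity transformation $\bar x = Tx$. Using Property 4.1, the equivalent system is
\[
\bar A = TAT^{-1} = \begin{pmatrix}
* & A_{12} & * & \cdots & * \\
* & * & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix},
\]
$\bar B = TB = (0, 1, 0, \dots, 0)^T$ and $\bar C = CT^{-1} = (1, 0, \dots, 0)$. The first two rows in the observability matrix $OM$ of the new system $(\bar A, \bar B, \bar C)$ have the same form as (4.53). Since $OM$ has rank $m$, there exists a reordering $(j_3, j_4, \dots, j_{N_L})$ of $(3, 4, \dots, N_L)$ such that the matrix $(OM_{\sigma 1}, OM_{\sigma 2}, OM_{\sigma j_3}, OM_{\sigma j_4}, \dots, OM_{\sigma j_m})$ has full column rank. Let the matrix $U^{-1} = (I_{\sigma 1}, I_{\sigma 2}, I_{\sigma j_3}, I_{\sigma j_4}, \dots, I_{\sigma j_{N_L}})$ and perform a further similarity transformation $\tilde x = U\bar x$. Then the equivalent system is
\[
\tilde A = U\bar AU^{-1} = \begin{pmatrix}
* & A_{12} & * & \cdots & * \\
* & * & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}_{N_L\times N_L}, \tag{4.54}
\]
$\tilde B = U\bar B = (0, 1, 0, \dots, 0)^T$ and $\tilde C = \bar CU^{-1} = (1, 0, \dots, 0)$. Now the observability matrix of the system $\tilde\Sigma = (\tilde A, \tilde B, \tilde C)$ is
\[
\widetilde{OM} = \begin{pmatrix} \tilde C \\ \tilde C\tilde A \\ \vdots \\ \tilde C\tilde A^{N_L-1} \end{pmatrix} = \begin{pmatrix} \bar CU^{-1} \\ \bar C\bar AU^{-1} \\ \vdots \\ \bar C\bar A^{N_L-1}U^{-1} \end{pmatrix} = OM \cdot U^{-1}.
\]
Therefore, the first $m$ columns of $\widetilde{OM}$ are of full rank. We can now employ the SPT method. To perform observability decomposition for the system $\tilde\Sigma$, firstly we select the first two rows and other $m - 2$ rows from $\widetilde{OM}$ to form a full-row-rank matrix $\tilde E_{m\times N_L}$ such that the first $m$ columns of $\tilde E$ are also of full rank. We partition $\tilde E$ as $\tilde E = [\tilde F_{m\times m}\ \ f_{m\times(N_L-m)}]$, and then $\tilde F$ is invertible. The transformation matrix $\begin{pmatrix} \tilde F & f \\ 0 & I \end{pmatrix}$ can decompose the system $\tilde\Sigma$ into observable and unobservable parts. We choose the second transformation matrix as $\tilde F^{-1} \oplus I$. The total transformation is
\[
Q = \begin{pmatrix} \tilde F^{-1} & 0 \\ 0^T & I \end{pmatrix}\begin{pmatrix} \tilde F & f \\ 0^T & I \end{pmatrix} = \begin{pmatrix} I & \tilde F^{-1}f \\ 0^T & I \end{pmatrix},
\]
and its inverse is
\[
Q^{-1} = \begin{pmatrix} I & -\tilde F^{-1}f \\ 0^T & I \end{pmatrix}.
\]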
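Let $\acute x = Q\tilde x$ generate the system $\acute\Sigma = (\acute A, \acute B, \acute C)$:
\[
\begin{aligned}
\dot{\acute x} &= \acute A\acute x + \acute B\delta(t), \quad \acute x(0) = 0, \\
y &= \acute C\acute x.
\end{aligned}
\]
We partition $\tilde A$ as
\[
\tilde A = \begin{pmatrix} \widetilde{UL}_{m\times m} & \widetilde{UR}_{m\times(N_L-m)} \\ \widetilde{DL}_{(N_L-m)\times m} & \widetilde{DR}_{(N_L-m)\times(N_L-m)} \end{pmatrix}.
\]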
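Then we have
\[
\begin{aligned}
\acute A = Q\tilde AQ^{-1} &= \begin{pmatrix} I & \tilde F^{-1}f \\ 0^T & I \end{pmatrix}\begin{pmatrix} \widetilde{UL} & \widetilde{UR} \\ \widetilde{DL} & \widetilde{DR} \end{pmatrix}\begin{pmatrix} I & -\tilde F^{-1}f \\ 0^T & I \end{pmatrix} \\
&= \begin{pmatrix} \widetilde{UL} + \tilde F^{-1}f\widetilde{DL} & *_{m\times(N_L-m)} \\ *_{(N_L-m)\times m} & *_{(N_L-m)\times(N_L-m)} \end{pmatrix},
\end{aligned}
\]
$\acute B = Q\tilde B = (0, 1, 0, \dots, 0)^T$, and $\acute C = \tilde CQ^{-1} = (1, \underbrace{0, \dots, 0}_{m-1}, *, \dots, *)$.
We partition $\acute x = (\grave x^T, *)^T$ where $\grave x$ is $m$-dimensional. Since the second transformation $\tilde F^{-1} \oplus I$ is block-diagonal, we know $\acute\Sigma$ is in the observable canonical form. Therefore, $\grave x$ corresponds to the observable subsystem of $\acute\Sigma$. We denote this $m$-dimensional observable subsystem as $\grave\Sigma = (\grave A, \grave B, \grave C)$ where $\grave A = \widetilde{UL} + \tilde F^{-1}f\widetilde{DL}$, $\grave B = (0, 1, 0, \dots, 0)^T$ and $\grave C = (1, 0, \dots, 0)$. From (4.54) we know $\widetilde{DL}_{\sigma 2} = (0, 0, \dots, 0)^T$, and $\grave A_{\sigma 2} = \widetilde{UL}_{\sigma 2}$. Therefore, $\grave A_{12} = A_{12}$.
Similarly, we can employ the SPT method again to perform a controllability decomposition on $\grave\Sigma$ to finally obtain a $t$-dimensional ($2 \le t \le m$) minimal system $(\check A, \check B, \check C)$ where we still have $\check A_{12} = A_{12}$, $\check B = (0, 1, 0, \dots, 0)^T$ and $\check C = (1, 0, \dots, 0)$.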
For (A,ˇ B,ˇ Cˇ), we can employ the STA method. Using (4.14) and (4.15) we know that the transformation matrix S is
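\[
S = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}_{t\times t},
\]
and (4.13) is now
\[
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}
\begin{pmatrix}
* & A_{12} & * & \cdots & * \\
* & * & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & * & * & \cdots & *
\end{pmatrix}
=
\begin{pmatrix}
* & A_{12}' & * & \cdots & * \\
* & * & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & * & * & \cdots & *
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
* & 1 & * & \cdots & * \\
* & 0 & * & \cdots & * \\
\vdots & \vdots & \vdots & & \vdots \\
* & 0 & * & \cdots & *
\end{pmatrix}. \tag{4.55}
\]
By equating the elements on the first row and second column of both sides of (4.55), we have $A_{12} = A_{12}'$. Thus, $A_{12}$ is identifiable.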
For the case of identifying $A_{11}$, $B^T = C = (1, 0, \dots, 0)$. Its observability matrix is now
\[
OM = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ A_{11} & * & \cdots & * \\ \vdots & \vdots & & \vdots \end{pmatrix}.
\]
If $OM_{2\sigma}$ has non-zero elements other than $A_{11}$, then the first two rows of $OM$ are linearly independent and we can use similar procedures to the case of identifying $A_{12}$ to prove that $A_{11}$ is identifiable. Otherwise if $OM_{2\sigma} = (A_{11}, 0, \dots, 0)$, then $A_{1\sigma} = (A_{11}, 0, \dots, 0)$, which means $(A, B, C)$ now is already of the observable canonical form, where the observable subsystem is 1-dimensional:
\[
\begin{aligned}
\dot x_1 &= \vartheta_1x_1 + 1\cdot\delta(t), \quad x_1(0) = 0, \\
y &= 1\cdot x_1.
\end{aligned}
\]
Hence, $\vartheta_1$ is certainly identifiable, which completes the proof.
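The element-wise comparison used in (4.52) and (4.55) can be illustrated numerically. The sketch below is an illustration only: it draws a generic $A$ and a random transformation $S$ constrained by $SB = B$ and $C = CS$ for $B = I_{\sigma 2}$, $C = I_{1\sigma}$, and checks that the resulting $A' = SAS^{-1}$ keeps the $(1,2)$ element. The dimension, seed and scaling are arbitrary assumptions, and minimality is not examined here.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.normal(size=(n, n))            # generic system matrix, no structure assumed
i, j = 0, 1                            # 0-based indices of the element A_12

# Random S close to the identity (hence invertible), then constrained so that
# row i equals e_i^T (from C = C S) and column j equals e_j (from S B = B).
S = np.eye(n) + 0.1 * rng.normal(size=(n, n))
S[i, :] = 0.0
S[i, i] = 1.0
S[:, j] = 0.0
S[j, j] = 1.0

A_prime = S @ A @ np.linalg.inv(S)     # satisfies S A = A' S
element_kept = abs(A_prime[i, j] - A[i, j]) < 1e-8
others_changed = not np.allclose(A_prime, A)
```

The other elements of `A_prime` generally differ from those of `A`, which is consistent with only $A_{12}$ being fixed by this choice of $B$ and $C$.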
4.5.2 Two economic Hamiltonian identification algorithms
Theorem 4.4 indicates the existence of economic quantum Hamiltonian identification algorithms. A natural follow-up question is whether we can find any specific economic algorithm. In fact, the proof of Theorem 4.4 has already implied how to prepare the initial state of the system and select the observable. Here, we present two such identification algorithms.
We follow the notations in (4.50) and (4.51), and the aim is to estimate some elements of A. Suppose α(p)ϑp = Aa(p)b(p), where α(p) is a coefficient depending on p, a = a(p) and b = b(p) are two indices also depending on p. α, a and b are determined by the model establishment procedures. Then we prepare the system initial value in a state corresponding to B = x0 = Iσb(p) and measure the observable corresponding to C = Ia(p)σ. If a(p) or b(p) is multivalued, then one can choose one value such that the corresponding experiment is straightforward to perform.
We assume that in actual experiments we can sample the system output with a fixed period of time $t_s$ (as assumed in [165]), and the data length is $N_D$. Then the data we obtain is denoted as an $N_D$-dimensional vector $\hat y^{(p)}$. For the true value, the $i$-th element should be
\[
y_i^{(p)} = C\exp(iAt_s)B = [\exp(iAt_s)]_{a(p)b(p)}. \tag{4.56}
\]
From Theorem 4.4, Aa(p)b(p) is identifiable with this experimental setting. Our task is to find a specific algorithm to complete the identification.
We start from the matrix logarithm function. For a square matrix $Z$, we define its logarithm as
\[
\log(Z) = -\sum_{j=1}^{\infty} \frac{(I - Z)^j}{j}. \tag{4.57}
\]
This series is convergent and $\exp[\log(Z)] = Z$ when $\|I - Z\| < 1$ [55]. Furthermore, when $\|I - \exp(J)\| < 1$, we have $\log[\exp(J)] = J$ [135].
First we need to guarantee $\|I - \exp(At_s)\| < 1$ (the reason will be shown later), which intuitively should hold when $t_s$ is small enough. Specifically, suppose we have prior knowledge on $A$, like $\|A\| < F$ where $F$ is given. Since $A$ is antisymmetric, we can employ a unitary matrix $U_A$ to diagonalize it as $A = U_AJ_AU_A^\dagger$, where $J_A = \mathrm{diag}(i\lambda_1, i\lambda_2, \dots, i\lambda_K)$. Here for $j = 1, \dots, K$, $\lambda_j \in \mathbb{R}$ can be zero. We thus know $\max_j |\lambda_j| < F$. Note that
\[
\begin{aligned}
\|I - \exp(At_s)\|^2 &= \|I - U_A\exp(J_At_s)U_A^\dagger\|^2 \\
&= \|I - \exp(J_At_s)\|^2 \\
&= \|\mathrm{diag}(1 - e^{i\lambda_1t_s}, 1 - e^{i\lambda_2t_s}, \dots, 1 - e^{i\lambda_Kt_s})\|^2 \\
&= \sum_{j=1}^{K} |1 - e^{i\lambda_jt_s}|^2 = \sum_{j=1}^{K} [(1 - \cos\lambda_jt_s)^2 + \sin^2\lambda_jt_s] \\
&= \sum_{j=1}^{K} (2 - 2\cos\lambda_jt_s).
\end{aligned}
\]
Suppose we have
\[
t_s < \frac{1}{F}\arccos\frac{2K-1}{2K}, \tag{4.58}
\]
then we know
\[
\cos\lambda_jt_s \ge \cos\left(\frac{\lambda_j}{F}\arccos\frac{2K-1}{2K}\right) > \frac{2K-1}{2K},
\]
and
\[
\|I - \exp(At_s)\|^2 = \sum_{j=1}^{K} (2 - 2\cos\lambda_jt_s) < \sum_{j=1}^{K} \left(2 - \frac{2K-1}{K}\right) = 1.
\]
Therefore, (4.58) is a sufficient condition on the sampling period $t_s$ to guarantee the validity of the matrix logarithm function. We can then continue (4.57) to obtain
\[
\log[\exp(At_s)] = At_s = -\sum_{j=1}^{\infty} \frac{[I - \exp(At_s)]^j}{j}.
\]
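The sufficient condition (4.58) can be checked numerically for a given antisymmetric matrix. The following is a minimal sketch under stated assumptions: the matrix, the choice of the Frobenius norm for $\|\cdot\|$, the bound $F$ and the function names are illustrative, and the matrix exponential is evaluated through the eigendecomposition of the Hermitian matrix $iA$.

```python
import numpy as np

def sampling_period_bound(F, K):
    """Right-hand side of (4.58): (1 / F) * arccos((2K - 1) / (2K))."""
    return np.arccos((2.0 * K - 1.0) / (2.0 * K)) / F

def frob_distance_sq(A, ts):
    """||I - exp(A ts)||_F^2 for real antisymmetric A, via the Hermitian matrix iA."""
    w, V = np.linalg.eigh(1j * A)                      # A = V diag(-1j * w) V^dagger
    expA = (V * np.exp(-1j * w * ts)) @ V.conj().T     # exp(A ts)
    return np.linalg.norm(np.eye(A.shape[0]) - expA, "fro") ** 2

A = np.array([[0.0, 1.0, 0.0, -0.5],
              [-1.0, 0.0, 0.5, 0.0],
              [0.0, -0.5, 0.0, 2.0],
              [0.5, 0.0, -2.0, 0.0]])                  # illustrative antisymmetric matrix
F = 1.1 * np.linalg.norm(A, "fro")                     # assumed prior bound with ||A|| < F
K = A.shape[0]
ts = 0.99 * sampling_period_bound(F, K)                # any ts satisfying (4.58)
dist_sq = frob_distance_sq(A, ts)
```

For any `ts` below the bound, `dist_sq` stays below 1 and agrees with the closed form $\sum_j (2 - 2\cos\lambda_jt_s)$ derived above.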
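We thus have
\[
\begin{aligned}
A_{a(p)b(p)}t_s &= -\sum_{j=1}^{\infty} \frac{1}{j} \{[I - \exp(At_s)]^j\}_{a(p)b(p)} \\
&= -\sum_{j=1}^{\infty} \frac{1}{j} \Big\{\sum_{k=0}^{j} \binom{j}{k} I^{j-k}[-\exp(At_s)]^k\Big\}_{a(p)b(p)} \\
&= -\sum_{j=1}^{\infty} \frac{1}{j} \sum_{k=0}^{j} (-1)^k \binom{j}{k} [\exp(kAt_s)]_{a(p)b(p)} \\
&= -\sum_{j=1}^{\infty} \frac{1}{j} \sum_{k=0}^{j} (-1)^k \binom{j}{k} y_k^{(p)},
\end{aligned} \tag{4.59}
\]
where
\[
\binom{j}{k} = \frac{j!}{k!(j-k)!}.
\]
We then truncate the infinite series in (4.59) to the $N_D$-th term and reconstruct $A_{a(p)b(p)}$. Based on the above discussion, the identification algorithm is designed as follows.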
Algorithm 4.1. Step 1. Establish the system model as (4.50) and (4.51). Using available prior knowledge, choose $t_s$ such that $\|I - \exp(At_s)\| < 1$ (e.g., let $t_s$ satisfy (4.58)). Let $p = 1$.
Step 2. Suppose $\alpha(p)\vartheta_p = A_{a(p)b(p)}$; choose $C = I_{a(p)\sigma}$ and $x_0 = I_{\sigma b(p)}$. Record the sampled data $\hat y^{(p)}$.
Step 3. Reconstruct $A_{a(p)b(p)}$ according to
\[
\hat A_{a(p)b(p)} = -\frac{1}{t_s}\sum_{j=1}^{N_D} \frac{1}{j} \sum_{k=0}^{j} (-1)^k \binom{j}{k} \hat y_k^{(p)}. \tag{4.60}
\]
Then $\hat\vartheta_p = \hat A_{a(p)b(p)}/\alpha(p)$.
Step 4. When $p < N_H$, let $p = p + 1$, change the values of $\alpha(p)$, $a(p)$ and $b(p)$ accordingly and repeat Step 2 and Step 3 to reconstruct all the elements in $\vartheta$. Finally, $\hat H$ can be obtained from (4.50).
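Step 3 of Algorithm 4.1 can be sketched in a few lines. The code below is a hedged illustration on noise-free simulated data, not an implementation used in this thesis: the function names, the test matrix, $t_s$ and $N_D$ are assumptions, the sample with index $k = 0$ (the known value $[I]_{a(p)b(p)}$) is stored explicitly in `y[0]`, and `math.comb` is used for the binomial coefficients.

```python
import numpy as np
from math import comb

def sample_output(A, a, b, ts, ND):
    """Noise-free samples y_k = [exp(k A ts)]_{ab}, k = 0, ..., ND, for antisymmetric A."""
    w, V = np.linalg.eigh(1j * A)                       # A = V diag(-1j * w) V^dagger
    ks = np.arange(ND + 1)
    phases = np.exp(-1j * np.outer(ks, w) * ts)         # shape (ND + 1, dim)
    # Only the (a, b) entry of V diag(phases[k]) V^dagger is needed.
    return (phases * (V[a, :] * V[b, :].conj())).sum(axis=1).real

def reconstruct_element(y, ts):
    """Truncated series (4.60); y[k] is the k-th sample, with y[0] the value at t = 0."""
    ND = len(y) - 1
    total = 0.0
    for j in range(1, ND + 1):
        inner = sum((-1) ** k * comb(j, k) * y[k] for k in range(j + 1))
        total += inner / j
    return -total / ts

A = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.5],
              [0.0, -0.5, 0.0]])     # illustrative antisymmetric system matrix
ts, ND = 0.2, 20                     # illustrative sampling period and data length
y12 = sample_output(A, 0, 1, ts, ND)
A12_hat = reconstruct_element(y12, ts)
A23_hat = reconstruct_element(sample_output(A, 1, 2, ts, ND), ts)
```

The alternating binomial sums in (4.60) involve large cancellations for large $j$, so in floating-point arithmetic a moderate $N_D$ such as the one above is a safer choice than a very long truncation.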
We analyze the computational complexity of Algorithm 4.1, where the time spent in experiments is not considered because it depends on different experiment settings.
The binomial coefficients in (4.60) can be calculated from the recursion relation
\[
\binom{j}{k} = \binom{j-1}{k-1} + \binom{j-1}{k}.
\]