Quantum Error Correction for the Toric Code Using Deep Reinforcement Learning

Quantum error correction for the toric code using deep reinforcement learning Philip Andreasson, Joel Johansson, Simon Liljestrand, and Mats Granath Department of Physics, University of Gothenburg, SE-41296 Gothenburg, Sweden We implement a quantum error correction success recently when applied to problems such as com- algorithm for bit-flip errors on the topologi- puter and board games[5{8]. The super-human perfor- cal toric code using deep reinforcement learn- mance achieved by deep reinforcement learning has rev- ing. An action-value Q-function encodes the olutionized the field of artificial intelligence and opens discounted value of moving a defect to a neigh- up for applications in many areas of science and tech- boring site on the square grid (the action) de- nology. pending on the full set of defects on the torus In physics the use of machine learning has seen a great (the syndrome or state). The Q-function is rep- deal of interest lately[9{13]. The most natural type of resented by a deep convolutional neural net- application of neural networks is in the form of super- work. Using the translational invariance on vised learning where the deep network can capture cor- the torus allows for viewing each defect from relations or subtle information in real or artificial data. a central perspective which significantly simpli- The use of deep reinforcement learning may be less ob- fies the state space representation independently vious in general as the type of topics addressed by RL of the number of defect pairs. The training is typically involve some sort of "intelligent" best strat- done using experience replay, where data from egy search, contrary to the deterministic or statistical the algorithm being played out is stored and models used in physics. used for mini-batch upgrade of the Q-network. In this paper we study a type of problem where arti- We find performance which is close to, and for ficial intelligence is applicable, namely the task of find- small error rates asymptotically equivalent to, ing a best strategy for error correction of a topological that achieved by the Minimum Weight Perfect quantum code; the potential basic building blocks of a Matching algorithm for code distances up to quantum computer. In the field of quantum comput- d = 7. Our results show that it is possible for ing, smart algorithms are needed for error correction a self-trained agent without supervision or sup- of fragile quantum bits[14{17]. Reinforcement learning port algorithms to find a decoding scheme that has been suggested recently as a tool for quantum error performs on par with hand-made algorithms, correction and quantum control[18{20], where an agent opening up for future machine engineered de- learns to manipulate a quantum device that functions coders for more general error models and error in an imperfect environment and with incomplete infor- correcting codes. mation. Under the umbrella term "Quantum Machine Learning" there are also interesting prospects of utiliz- 1 Introduction ing the natural parallelization of quantum computers for machine learning itself[21], but we will be dealing arXiv:1811.12338v3 [quant-ph] 25 Aug 2019 Much of the spectacular advances in machine learning here with the task of putting (classical) deep learning using artificial neural networks has been in the domain and AI at the service of quantum computing. of supervised learning were deep convolutional networks Due to the inherently fragile nature of quantum in- excel at categorizing objects when trained with big an- formation a future universal quantum computer will notated data sets[1{3]. A different but also more chal- require quantum error correction.[14{17] Perhaps the lenging type of problem is when there is no a priori so- most promising framework is to use topological error lution key, but rather a dynamic environment through correcting codes.[22{26] Here, logical qubits consisting which we want to learn to navigate for an optimal out- of a large set of entangled physical qubits are protected come. For these types of problems reinforcement learn- against local disturbances from phase or bit flip errors as ing (RL) [4] combined with deep learning has had great logical operations require global changes. Local stabilizer operators, in the form of parity checks on a group Mats Granath: [email protected] of physical qubits, provide a quantum non-demolition Accepted in Quantum 2019-08-24, click title to verify 1 diagnosis of the logical qubit in terms of violated sta- toric code was implemented. That work focuses on the bilizers; the so-called syndrome. In order for errors not important issue of imperfect syndromes as well as de- to proliferate and cause logical failure, a decoder, that polarizing noise and used an auxiliary "referee decoder" provides a set of recovery operations for correction of to assist the RL decoder. In the present work we con- errors given a particular syndrome, is required. As the sider the simpler but conceptually more direct problem syndrome does not uniquely determine the physical er- of error correction on a perfect syndrome and with only rors, the decoder has to incorporate the statistics of er- bit flip error. Also in contrast to [30] we study the rors corresponding to any given syndrome. In addition actual "toric" code, rather than the code with bound- the syndrome itself may be imperfect, due to stabilizer aries. Clearly the toric code will be harder to implement measurement errors, in which case the decoder must experimentally but nevertheless provides a well under- also take that into account. stood standard model. It also provides a simplification In the present work we consider Kitaev's toric from the fact that on a torus only the relative positions code[22, 23, 25] which is a stabilizer code formulated on of syndrome defects are relevant which reduces the state a square lattice with periodic boundary conditions (see space complexity that decoder agent has to master. By Figure1 and Section 2.1). We will only consider bit-flip focusing on this minimal problem we find that we can errors which correspond to syndromes with one type of make a rigorous benchmark on the RL decoder showing violated stabilizer that can be represented as plaquette near optimal performance. defects on the lattice (see Figure2). The standard de- Finding better performing decoders has been the coder for the toric code is the Minimum Weight Perfect topic of many studies, using methods such as renor- Matching (MWPM) or Blossom algorithm[27{29] that malization group[31, 32], cellular automata[33, 34], and works by finding the pairwise matching of syndrome de- a number of neural network based decoders[19, 35{43]. fects with shortest total distance, corresponding to the The decoder presented in this paper does not outper- minimal number of errors consistent with the syndrome. form state of the art decoders, it's value lies in show- The decoder problem is also conceptually well suited ing that it is possible to use reinforcement learning for reinforcement learning, similar in spirit to a board to achieve excellent performance on a minimal model. game; the state of the system is given by the syndrome, Given that deep reinforcement learning is arguably the actions correspond to moving defects of the syndrome, most promising AI framework it holds prospect for fu- and with reward given depending on the outcome of the ture versatile self-trained decoders that can adapt to game. By playing the game, the agent improves its er- different error scenarios and code architectures. ror correcting strategies and the decoder is the trained The outline of the paper is the following. In the Back- agent that provides step by step error correction. As in ground section we give a brief but self-contained sum- any RL problem the reward scheme is crucial for good mary of the main features of the toric code including performance. The size of the state-action space is also a the basic structure of the error correction and a similar challenge, to provide the best action for each of a myriad summary of one-step Q-learning and deep Q-learning. syndromes, but this is exactly the problem addressed by The following section, RL Algorithm, describes the for- recent deep learning approaches to RL.[5{8] mulation and training of the error correcting agent. In We find that by setting up a reward scheme that en- the Results section we shows that we have trained the courages the elimination of the syndrome in as few op- RL agent up to code distance d = 7 with performance erations as possible within the deep Q-learning (or deep which is very close to the MWPM algorithm. We finally Q-network, DQN)[6,7] formalism we are able to train a conclude and append details of the asymptotic fail rate decoder that is comparable in performance to MWPM. for small error rates as well as the neural network ar- Although the present algorithm does not outperform chitecture and the RL and network hyperparameters. the latter we expect that it has the potential to be more versatile when addressing depolarizing noise (with cor- related bit and phase flip errors), measurement noise 2 Background giving imperfect syndromes, or varying code geometries. 2.1 Toric code Compared to the MWPM algorithm the RL algorithm also has the possible advantage that it provides step The basic construction of the toric code is a square lat- by step correction whereas the MWPM algorithm only 1 tice with a spin- 2 degree of freedom on every bond, the provides information on which defects should be paired, physical qubits, and with periodic boundary conditions, making the former more adaptable to the introduction as seen in Figure1.[22, 23] 1 (An alternative rotated lat- of additional errors.

Load more