TCAS-I 1923 1

Quantum Switching and Quantum Merge Sorting

Sheng-Tzong Cheng and Chun-Yen Wang

Abstract—This paper proposes a quantum switching architecture that dynamically permutes each input quantum datum to its destination port, which avoids the need for a fully connected network. In addition, to reduce the execution time of the quantum switching, an efficient quantum merge sorting (QMS) algorithm that provides parallel quantum computation is developed. The quantum switching uses the QMS algorithm as a subroutine so that the total running time is reduced to polylogarithmic time. Furthermore, to evaluate the feasibility of the quantum switching, we define three kinds of performance factors that estimate the implementation complexity and the execution time delay of quantum instruments. The evaluation results show that the proposed quantum switching is feasible in practice.

Index Terms—Quantum circuits, quantum computation, quantum permutation, quantum sort, quantum switching.

Manuscript received December 10, 2004; revised May 26, 2005; accepted July 16, 2005. The authors are with the Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan (e-mail: [email protected]; [email protected]).

I. INTRODUCTION

Quantum computation and quantum information science (QIS) [1][2], which explores quantum mechanics, provides advanced tools for solving problems that are hard for traditional Turing machines. The fundamental element in the quantum field can be any particle with two coexisting states, such as a photon, atom, electron, or subatomic particle [3]. All of these are below the nanoscale. Thus, QIS is often considered a basis for further research and innovation in the future development of nanotechnology.

In the literature, many outstanding algorithms have been developed to illustrate that quantum computers are more powerful than traditional computers. For example, Shor [4] presents a quantum algorithm that factors a composite integer exponentially faster than the best known classical algorithms. Grover [5] presents a quantum algorithm that searches for an item in an unstructured list with a polynomial speedup over classical algorithms [6]. Owing to its potential impact and significance, QIS has drawn increasing attention in the recent decade.

One of the most important applications of QIS is quantum communication, a mechanism for transferring quantum data from one location to another. The quantum wire [7] is one medium for transporting quantum information. At present, many remarkable research issues based on the quantum wire configuration, such as those in [8] and [9][10], quantum repeaters [11], and quantum networks [12], have been raised. These results have driven the QIS field further toward real-world applications.

Many views of the future of QIS indicate that basic quantum wire components will become more essential as the number of quantum devices grows explosively. As shown in Fig. 1(a), to ensure that any two quantum devices can communicate quantum data with each other, a quantum wire must be employed between them. However, this results in a fully connected quantum network that requires $O(n^2)$ quantum wires, where $n$ is the total number of quantum nodes. In this paper, a novel quantum switching architecture is proposed to avoid the fully connected quantum network. As shown in Fig. 1(b), with the aid of the quantum switching, quantum nodes only need to deliver their quantum data to the quantum switching, which switches the data to its destination port. Thus, each quantum node requires only one quantum wire to connect to the quantum switching, so the number of quantum wires is reduced to $O(n)$.

Fig. 1. The architecture: (a) fully connected; (b) quantum switching.

The proposed quantum switching allows each input quantum datum to be permuted to its destination port dynamically and in parallel. Moreover, to minimize the time delay of the quantum switching, an efficient quantum merge sorting (QMS) algorithm is developed in this paper. Although it has been shown that quantum computers, in general, cannot much outperform classical computers at sorting [13], this paper uses a parallel computation technique [14] to reduce the running time of the QMS algorithm so that it takes only $O(\log^2 n)$ time to sort $n$ quantum elements.

Because the proposed quantum switching uses the efficient QMS algorithm, the total execution time is reduced to polylogarithmic time. In addition, to assess the feasibility of the quantum switching, we define three performance indexes to evaluate the effectiveness of quantum instruments. According to the results, it can be seen that the proposed quantum switching is feasible to put into
practice. This paper represents one of the few attempts in the literature to solve the dynamic quantum switching problem.

The rest of this paper is organized as follows. In Section II, the general concepts of quantum computation are introduced, and related research is surveyed. In Section III, the QMS algorithm that provides parallel quantum computation is presented. In Section IV, the quantum switching architecture is proposed to switch quantum data to its corresponding port. Finally, concluding remarks are drawn in Section V.

II. BACKGROUND AND RELATED WORK

In this section, we introduce the definitions and terminology used to describe the proposed mechanism throughout this paper. In-depth treatments of the notation can be found in the literature [1][15].

A. Quantum States

Analogous to classical bits, the underlying unit in QIS is the quantum bit (qubit). However, unlike a classical bit, which must be in either state 0 or state 1, a qubit can be in both state $|0\rangle$ and state $|1\rangle$ at the same time. This phenomenon is called superposition. In general, a qubit state is represented as a linear combination of these two states,

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{1}$$

where $\alpha$ and $\beta$ are complex numbers and $|\alpha|^2 + |\beta|^2 = 1$.

Note that we cannot determine whether a qubit is in state $|0\rangle$ or $|1\rangle$ by examining the values of $\alpha$ and $\beta$. However, when we measure a qubit in a superposition state, the entire qubit system collapses into one of its basis states (e.g., $|0\rangle$ or $|1\rangle$). Which state we obtain is determined by the absolute square of its coefficient. That is, we get the qubit in state $|0\rangle$ with probability $|\alpha|^2$, or in state $|1\rangle$ with probability $|\beta|^2$. Thus, $\alpha$ and $\beta$ are called probability amplitudes.

Furthermore, a multiple-qubit system can be built up by composing multiple independent qubits. To compose two or more qubit systems, the tensor product operator is adopted. For example, consider a two-qubit system composed of the two single-qubit systems $|\psi\rangle = a|0\rangle + b|1\rangle$ and $|\varphi\rangle = c|0\rangle + d|1\rangle$. Then, the entire two-qubit system is usually represented as

$$|\psi\rangle \otimes |\varphi\rangle = |\psi\varphi\rangle = ac|00\rangle + ad|01\rangle + bc|10\rangle + bd|11\rangle. \tag{2}$$

B. Quantum Unitary Gates

In digital information processing, a basic circuit that combines fundamental logic operators to complete a specific task is called a logic gate, such as the classical NOT gate, AND gate, OR gate, XOR gate, and so on. Analogously, in the quantum realm, a set of operators that manipulates a quantum system to accomplish a specific computation is called a quantum gate. In this section, we introduce four quantum gates that are in common use throughout this paper.

Similar to the classical NOT gate, the quantum NOT gate applied to a single qubit flips its states $|0\rangle$ and $|1\rangle$. As shown in Fig. 2(a), when we send a qubit $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ through the quantum NOT gate, the corresponding output is $|\text{output}\rangle = \beta|0\rangle + \alpha|1\rangle$. The quantum NOT gate is also called the Pauli-X gate. Fig. 2(b) shows the circuit of the Controlled-NOT (CNOT) gate, which processes two qubits. The CNOT gate has two inputs: one is the control qubit and the other is the target qubit. When the control qubit is set to $|0\rangle$, the CNOT gate does nothing. If the control qubit is set to $|1\rangle$, the gate applies the NOT gate to the target qubit. In other words, the CNOT gate exchanges $|10\rangle$ and $|11\rangle$.

Fig. 2. Quantum NOT gate and CNOT gate: (a) NOT gate; (b) CNOT gate.

Another important quantum operator is the quantum SWAP gate. The SWAP gate applied to a two-qubit system performs a transposition. More precisely, it changes a two-qubit system from $|A\rangle|B\rangle$ to $|B\rangle|A\rangle$. As drawn in Fig. 3(a), the SWAP gate can be implemented by three layers of CNOT gates. However, because of its importance to quantum communication, the SWAP gate is often treated as a basic quantum gate.

The last quantum operator we introduce in this subsection is the controlled-swap gate (or Fredkin gate). As depicted in Fig. 3(b), the controlled-swap gate has three input qubits, $|A\rangle$, $|B\rangle$, and $|C\rangle$. The qubit $|C\rangle$ is called the control qubit because it controls what happens to the other two qubits. When the control qubit $|C\rangle$ is set to $|0\rangle$, the gate does nothing. If the control qubit $|C\rangle$ is set to $|1\rangle$, the SWAP gate is applied to the target qubits (i.e., $|A\rangle$ and $|B\rangle$ are swapped).

Fig. 3. Quantum SWAP gate and controlled-swap gate: (a) SWAP gate; (b) controlled-swap gate.

Fig. 4. The comparison gate. The ancillary qubit, initially $|0\rangle$, remains $|0\rangle$ if $|q_1\rangle \le |q_2\rangle$ and becomes $|1\rangle$ if $|q_1\rangle > |q_2\rangle$.

In quantum circuits, time evolution goes from left to right. Furthermore, each quantum operation is defined to be reversible, so in principle it can be performed without consuming any energy [16]. The proposed quantum switching also uses this reversibility property to clean up garbage bits. More details are described in Section IV-B.

|q1 min{|q1, |q2} |q1 quantum field. Thus, it is necessary to apply the so-called black box model to determine the large or small relationship between |q2 Comparison two input quantum numbers. Fig. 4 shows the sketch of the comparison gate that is |q2 max{|q1, |q2} |0 commonly used to define the order relation between two (a) (b) Fig. 5. Quantum circuits for quantum comparator gate quantum numbers [1]. As shown in Fig. 4, the comparison gate has three inputs: two quantum numbers |q1, |q2and an ancillary |q1 |q1 qubit. If |q1is greater than |q2, then the comparison gate flips

|q2 |q2 the state of the ancillary qubit, else it does nothing. For clarity, in this paper we also adopt the thick line as in Fig. 4 to represent |q3 |q3 the quantum number is composed of multiple qubits and use the |q  |q  4 4 thin line to stand for a single qubit. (a) (b) In literature, Farhi et al. [23] proposes a quantum algorithm Fig. 6. Examples for quantum comparator gates that can search an element in a sorted list of n elements as a C. Quantum Digital Switch subroutine in by querying the comparison gates

In literature, I. M. Tsai and S. Y. Kuo [17] present a digital roughly0.526log2 n times. This achieves a quantum computer switching that can switch digital data in the quantum domain. needs 0.526nlog2 n O(n) queries to sort n elements. However, The basis idea of their digital switching is to apply the property is it possible to improve the performance of the quantum sorting that any general cycle can be performed by using six layers of algorithm by decreasing the upper bound for ordered search? CNOT gates [5]. This results in their digital switching can be Unfortunately, this idea cannot be put into practice because the implemented in a constant-time complexity [18]. log n However, the control module in their digital switching was lower bound 2 O(1) for the ordered search problem has 12 absent. The control module is responsible for controlling the been verified by Ambainis [24]. switching configuration so that all the input data in each After that, Høyer et al. [13] shows that any comparison-based individual time can be switched to their destination ports quantum sorting algorithm needs (nlog n) queries to the correctly. In order to dynamically control the switching module comparison gates that is already achieved by classical sorting in each individual time, we propose a new quantum switching algorithms. In other words, it shows that quantum algorithms architecture with the control module in this paper. can only achieve a constant speedup over classical algorithms In addition, M. K. Shukla et al. [19] present a quantum for the sorting problem. Besides, Arul [25] also demonstrates Banyan Network that only requires logarithmic execution time that non-orthogonal quantum states cannot be compared or to switch input data. However, the quantum Banyan Network is sorted. And Klauck et al. [26][27] studies the lower bound on not revertible since it needs to discard some dummy packets. In the time-space tradeoffs for sorting n elements. 
addition, the most serious problem in quantum Banyan Network In this paper, we present a QMS algorithm that provides a is that some input quantum information would be lost. Further, parallel computation for our quantum switching. The QMS in the worst case, it is possible for the quantum Banyan Network algorithm is constructed by the comparator gates (or the to lose n–2 input quantum packets where n is the total number of control-and-swap operation in some publications [1]) defined input packets. According to the no-cloning theorem [20][21], in Fig. 5(a) completely. As drawn in Fig. 5(b), the quantum these lost quantum data cannot be recovered. This would result comparator gate is composed of a comparison gate and in an unreliable quantum network. controlled-swap gates. The former comparison gate compares The proposed quantum switching can overcome these drawbacks. The quantum switching is designed to be revertible. the input elements |q1with |q2and changes the ancilla state if Moreover, no input data would be lost in executing quantum |q1is greater than |q2. The latter controlled-swap gates are switching. The quantum operations for implementing the responsible for permuting |q1and |q2if the ancilla state is set to quantum switching are described clearly in Section IV. Since |1. In other words, the whole comparator gate is the quantum the quantum switching is constructed by applying an efficient operation that can be used to sort two quantum numbers. quantum sorting algorithm, we briefly discuss some of the For simplify, in this paper we use the abbreviated notion C[i, existing issues in quantum sorting in the following. j] to indicate the comparator gate on the qubits |qiand |qj. Thus, the quantum circuits shown in Fig. 6(a) can be formulated as D. Quantum Sorting Algorithm C[1, 3]C[2, 4]. And the quantum circuits depicted in Fig. 6(b) In classical information theory, the traditional sorting can be represented by (C[1, 2]C[3, 4])o(C[1, 4]C[2, 3]). 
problem can be formulated as follows. Given a sequence of n numbers Q=q1, q2, …, qn, output a permutation Q= III. QUANTUM MERGE SORTING

q1, q2,, qnof the input sequence such that the list Q’is in In this section, the QMS(n) algorithm for sorting n quantum non-decreasing order [13][22]. However, unlike the classical numbers is presented. In the beginning, we consider that the bits, there exists a special superposition phenomenon in the number of inputs n is a power of two (i.e., n = 2k). As for a TCAS-I 1923 4

general $n$, we discuss it in Section III-D.

Analogous to the classical merge sorting algorithm [22], the QMS algorithm follows a divide-and-conquer approach and is recursive in structure. In fact, the QMS($n$) algorithm can be defined in the following form:

$$\mathrm{QMS}(n) = \begin{cases} \mathrm{Merge}(n) \circ \left(\mathrm{QMS}(n/2) \otimes \mathrm{QMS}(n/2)\right) & \text{if } n > 1 \\ I & \text{if } n = 1 \end{cases} \tag{3}$$

where $I$ denotes the identity operation.

The quantum circuits for QMS($n$) are depicted in Fig. 7. The QMS algorithm is composed of two phases: the division phase and the merge phase. During the division phase, the QMS algorithm is performed recursively to transform $n$ unsorted quantum numbers into two sorted lists. Afterwards, these two sorted lists are combined into one increasing sequence during the merge phase. In addition, this paper develops a special merge strategy, the quantum bitonic merge, to reduce the running time of the QMS algorithm. The details of constructing the quantum bitonic merge are described in the following.

Fig. 7. The skeleton diagram of the QMS($n$) architecture: a division phase of two QMS($n/2$) blocks followed by a merge phase, Merge($n$).

A. Quantum Bitonic Merge

In this subsection, we present the quantum bitonic merge, which combines two sorted quantum lists into one ordered sequence. The skeleton of the quantum bitonic merge is depicted in Fig. 8. As shown, the quantum bitonic merge can be further split into two components: the Reverse operation and the bitonic sorter. The Reverse operation transforms two sorted quantum sequences into one quantum bitonic sequence, while the bitonic sorter sorts any bitonic sequence [28].

Fig. 8. The skeleton diagram for quantum bitonic merge: two sorted sequences pass through the Reverse operation to form a bitonic sequence, which Bitonic-Sorter($n$) sorts.

A quantum bitonic sequence is defined as a quantum sequence that is either monotonically increasing and then monotonically decreasing, or monotonically decreasing and then monotonically increasing. For example, suppose that the order relation defined by the comparison gate satisfies $|1\rangle \le |2\rangle \le |3\rangle \le |4\rangle \le |5\rangle \le |6\rangle \le |7\rangle \le |8\rangle$. Then the quantum sequences $\{|1\rangle, |3\rangle, |6\rangle, |8\rangle, |2\rangle\}$ and $\{|8\rangle, |5\rangle, |2\rangle, |4\rangle, |6\rangle, |7\rangle\}$ are both bitonic sequences. A monotonically increasing or decreasing sequence is also a bitonic sequence.

The quantum bitonic sorter is designed to sort any quantum bitonic sequence. Fig. 9 shows the quantum circuits for the quantum bitonic sorter with $n$ input elements. The quantum bitonic sorter is constructed by applying $n/2$ comparator gates in parallel and then recursively performing the quantum Bitonic-Sorter($n/2$). Thus, we can express the quantum bitonic sorter in the following mathematical form:

$$\mathrm{Sorter}(n) = \begin{cases} \left(\mathrm{Sorter}(n/2) \otimes \mathrm{Sorter}(n/2)\right) \circ \bigotimes_{i=1}^{n/2} C\left[i,\, i + \tfrac{n}{2}\right] & \text{if } n > 1 \\ I & \text{if } n = 1 \end{cases} \tag{4}$$

Fig. 9. Quantum circuits for quantum Bitonic-Sorter($n$).

Fig. 10. The first stage of quantum Bitonic Sorter: (a) a bitonic input is split into two bitonic halves, with the top half $\le$ the bottom half; (b) an example with eight quantum numbers.

The basic idea of the quantum bitonic sorter is to separate the input bitonic sequence into two bitonic sequences. By recursively performing the quantum bitonic sorter on both in parallel, we ultimately sort the original bitonic sequence. More precisely, as drawn in Fig. 9, the first part of the quantum Bitonic-Sorter($n$) is made up of $n/2$ comparator gates. Those comparator gates compare each input $i$ with input $n/2 + i$ for $i = 1, \ldots, n/2$ in parallel. The purpose of
those comparator gates is to partition the original bitonic sequence into two halves such that 1) both halves are bitonic and 2) any quantum number in the first half is smaller than or equal to all quantum numbers in the second half (see Fig. 10(a)).

A simple example is shown in Fig. 10(b). Consider a quantum sequence $\{|1\rangle, |3\rangle, |6\rangle, |8\rangle, |9\rangle, |7\rangle, |5\rangle, |2\rangle\}$ with eight quantum numbers. Suppose that the order relation defined by the comparison gate satisfies $|1\rangle \le |2\rangle \le \cdots \le |9\rangle$. By the above definition, this input quantum sequence is a bitonic sequence. After performing the quantum circuits in Fig. 10(a), we get the two sequences $\{|1\rangle, |3\rangle, |5\rangle, |2\rangle\}$ and $\{|9\rangle, |7\rangle, |6\rangle, |8\rangle\}$. These two output sequences are both bitonic, and each quantum number in the first half is smaller than every quantum number in the second half. Thus, by repeating the same procedure, we ultimately get a sorted sequence. As a result, we can sort any bitonic sequence by performing the quantum circuits in Fig. 9.

The quantum bitonic sorter accepts only bitonic sequences. However, it can be seen from Fig. 7 that the output of the division phase is not a bitonic sequence but two sorted sequences. Therefore, it is necessary to employ the Reverse operation to reverse the ordering of one sorted sequence. The output then becomes a bitonic sequence.

As shown in Fig. 11(a), two sorted non-decreasing quantum sequences can be transformed into a bitonic sequence by reversing the order of one of them. More precisely, suppose that $\{|q_1\rangle, |q_2\rangle, \ldots, |q_m\rangle\}$ and $\{|r_1\rangle, |r_2\rangle, \ldots, |r_n\rangle\}$ are two non-decreasing quantum sequences. If we reverse the order of the latter sequence, then the new quantum sequence $\{|q_1\rangle, |q_2\rangle, \ldots, |q_m\rangle, |r_n\rangle, |r_{n-1}\rangle, \ldots, |r_1\rangle\}$ is clearly bitonic.

The Reverse operation, which reverses the order of the lower half of the inputs, can be implemented by employing $n/4$ swap gates as shown in Fig. 11(b). Each input $n/2 + i$ is swapped with input $n - i + 1$ for $i = 1, 2, \ldots, n/4$. Note that those $n/4$ swap gates can be executed in parallel by using only three layers of CNOT gates.

Fig. 11. Quantum circuits for quantum reverse operation: (a) two sorted sequences are converted into a bitonic sequence; (b) implementation with swap gates.

Now we can substitute the quantum Reverse operation (Fig. 11) and the bitonic sorter (Fig. 9) into the quantum bitonic merge (Fig. 8) to obtain the quantum bitonic merge circuits depicted in Fig. 12. The two sorted input sequences from the division phase are converted into a bitonic sequence by the Reverse operation, and the quantum bitonic sorter can sort any bitonic sequence. Thus, two sorted quantum sequences can be merged into one ordered list through the quantum bitonic merge in Fig. 12.

Fig. 12. The skeleton diagram for quantum Bitonic-Merge($n$).

Nevertheless, the quantum circuits in Fig. 12 can be further simplified. The left-hand quantum circuits of the quantum Bitonic-Merge($n$) are taken out as in Fig. 13(a). As mentioned above, the quantum circuits shown in Fig. 13(a) transform two sorted quantum sequences into two bitonic sequences such that the output numbers in the top half are at least as small as those in the bottom half. In fact, the quantum circuits in Fig. 13(a) are equivalent to those in Fig. 13(b). Comparing the two, each input $i$ in Fig. 13(a) is compared with input $n/2 + i$, which was swapped from input $n - i + 1$, while each input $i$ in Fig. 13(b) is compared directly with input $n - i + 1$, after which the Reverse operation is applied to conform to the output of Fig. 13(a).

Fig. 13. Quantum circuit simplification: (a) and (b); the Reverse operation in (b) can be omitted.

Fig. 14. The quantum circuits for quantum Bitonic-Merge($n$).

Since the quantum circuits in Fig. 13(a) and Fig. 13(b) are equivalent, the circuits in Fig. 13(b) also convert two sorted quantum sequences into two bitonic sequences such that every
|q1 Bitonic- Layer 1 2 3 4 5 6 |q1 |q2 Merge(2) Bitonic- |q2 |q3 Bitonic- Merge(4) |q3 |q4 Merge(2) Bitonic-Merge(8) |q4 |q5 Bitonic- |q5 |q  Merge(2) 6 Bitonic- |q6 |q  7 Bitonic- Merge(4) |q7 |q  8 Merge(2) |q8 Fig. 15. Recursively expanding the QMS(8) Fig. 17. The entire QMS(8) circuits

Finally, as drawn in Fig. 9, quantum Bitonic-Sorter(n) is also |q1 Bitonic a recursive construction. And it can be performed by applying |q  Sorter(2) 2 Bitonic n/2 comparator gates and two Bitonic-Sorter(n/2) operations in Sorter(4) |q3 Bitonic parallel. Thus, we can substitute Fig. 9 into Fig. 16, and then the Sorter(2) entire QMS(8) circuits can be carried out as shown in Fig. 17. |q4

|q5 C. Performance Evaluation Bitonic Sorter(2) In this subsection, the performance of the QMS algorithm is |q6 Bitonic studied. Let NQ(n), NM(n) and NS(n) denote the number of |q  Sorter(4) 7 Bitonic comparator gates that are needed to implement the QMS(n), Sorter(2) |q8 Merge(n) and Sorter(n) circuits, respectively. Thus, according Fig. 16. Replacing the quantum Bitonic Merge circuits to equations (3), (4), and (5), it is clear that n  , NQ(n) 2NQ NM (n) (6) quantum number in the top half is at least as small as each one in 2  the bottom half. In addition, it can be observed that even if we n  n , NM (n) 2NS  (7) omit the Reverse operation in Fig. 13(b), the outputs still satisfy 2  2 this property. It is because the reversal of any bitonic sequence n  n NS(n) 2NS  (8) is also bitonic. Therefore, we can only perform n/2 comparator 2  2 gates in parallel to substitute the n/4 swap gates and n/2 for n > 1. And NQ(1) = NM(1) = NS(1) = 0. comparator gates in Fig. 12. By applying the Master theorem [22] to the above equations, After replacing the quantum circuits in Fig. 13(a) with the we know that quantum circuits in Fig. 13(b), we can get the new quantum NQ(n) = (nlogn) (9) Bitonic-Merge(n) as depicted in Fig. 14. In quantum Bitonic- From (9), it can be seen that the QMS algorithm still requires Merge(n), each input i is compared with input ni+1 for i = 1, (nlogn) queries to comparison gates that matches the lower 2, …, n/2 to produce two bitonic sequences and then the top half bound by Høyer [13]. However, the purpose of the QMS and bottom half can be sorted in parallel by the quantum bitonic algorithm is not to outperform other quantum sorting algorithm, sorter. As a result, in terms of the mathematic equations, the but to provide an efficient parallel quantum computation to quantum Bitonic-Merge(n) can be written as decrease the whole running time of quantum sorting algorithms. 
 n / 2 (Sorter(n / 2) Sorter(n / 2))  C[i,n i 1] if n 1 Let TQ(n), TM(n) and TS(n) denote the time complexity of Merge(n)    i1 the QMS(n), Merge(n) and Sorter(n) circuits, respectively.   I if n 1 Similarly, based on the equations (3), (4), and (5), it can be (5) known that B. Quantum Merge Sorting (QMS) n  , TQ(n) TQ TM (n) (10) In this subsection, we illustrate the QMS algorithm for 2  n  sorting eight quantum elements. As mentioned above, the QMS TM (n) TS TC(k) , (11) algorithm is a recurrence in structure. Thus, after recursively 2  expanding the QMS(8) algorithm, we can get the infrastructure n  TS(n) TS TC(k) (12) constructed by the quantum bitonic merge as shown in Fig. 15. 2  In addition, as depicted in Fig. 14, for each quantum Bitonic- for n > 1, where TC(k) is the time complexity of the comparator Merge(n), it can be implemented by executing n/2 comparator gate with two k-bit input numbers. And TQ(1)=TM(1)=TS(1)=0. gates and two quantum Bitonic-Sorter(n/2) in parallel. Thus, After expanding the recursive equations, we get that after replacing the quantum Bitonic-Merge by the circuits in Fig. TQ(n) ((log n) 2 TC(k)) . (13) 14, we can obtain the construction as depicted in Fig. 16. Consequently, the QMS(n) algorithm can sort n quantum TCAS-I 1923 7

numbers in parallel and takes only $\Theta((\log n)^2)$ layers of comparator gates. In other words, the QMS algorithm achieves polylogarithmic time complexity, so it can be employed in constructing quantum instruments with critical time delay such as the quantum switching. The details of the quantum operations required to implement the quantum switching are described in Section IV.

D. General Cases

In this subsection, we consider the QMS($n$) algorithm for the case in which $n$ is not exactly a power of two. The first step in constructing such a QMS($n$) operation is to implement the QMS($2^r$) circuits by following the methodology described in Section III-B, where $r = \lceil \log_2 n \rceil$. Afterwards, we omit all the quantum gates performed on the last $2^r - n$ quantum elements. The remaining quantum circuits are the QMS($n$) circuits that sort $n$ quantum numbers.

An example for the QMS(6) circuits is illustrated in Fig. 18. To construct the QMS(6) circuits, the first step is to build the quantum circuits for QMS($2^3$), since $\lceil \log_2 6 \rceil = 3$. Afterwards, we remove all the quantum gates that operate on the last two ($2^3 - 6 = 2$) quantum numbers (the dotted part in Fig. 18). The remaining quantum circuits are the QMS(6) circuits depicted in Fig. 18.

Fig. 18. The quantum circuits for QMS(6) (layers 1 to 6).

The basic idea of the QMS($n$) algorithm for any natural number $n$ is to imagine that $2^r - n$ very large quantum numbers are appended so that the total number of sorted elements is a power of two. By applying the method described in Section III-B, QMS($2^r$) can be implemented. Since those $2^r - n$ appended quantum numbers are large enough that none of the comparator gates performed on them has any effect, we can remove those comparator gates. The remaining circuits are the quantum operator that sorts $n$ quantum elements.

From the complexity point of view, it can be seen from equations (9) and (13) that for a general $n$,

$$N_Q(2^{r-1}) \le N_Q(n) \le N_Q(2^r) \;\Rightarrow\; N_Q(n) = \Theta(n \log n), \tag{14}$$

$$T_Q(2^{r-1}) \le T_Q(n) \le T_Q(2^r) \;\Rightarrow\; T_Q(n) = \Theta\!\left((\log^2 n)\, T_C(k)\right) \tag{15}$$

where $r = \lceil \log_2 n \rceil$. In other words, the general QMS($n$) also consists of $\Theta(n \log n)$ comparator gates in depth $\Theta(\log^2 n)$, the same computational complexity as demonstrated above.

IV. QUANTUM SWITCHING

In this section, the proposed quantum switching, which avoids the fully connected network, is presented. The quantum switching is designed to permute each input datum to its corresponding destination port dynamically. In addition, since qubits cannot be replicated [20][21], multicast service is not supported by the proposed quantum switching. In other words, each input datum is allowed to have exactly one destination port. On the other hand, if two or more input qubits have the same destination port, the result is a so-called output collision. Therefore, to prevent output collisions, we consider only the case in which the input data and output ports have a one-to-one mapping.

Fig. 19 shows a sketch of the proposed quantum switching architecture. As shown, the quantum switching consists of two sectors: the switching sector and the control sector. The switching sector is responsible for permuting all the input qubits to their corresponding output ports. The control sector is responsible for controlling the switching sector so that it accomplishes its task correctly. The data permutation in the switching sector is governed by the control registers. More precisely, consider a qubit $|q_i\rangle$ that is input to the I/O port $P_i$ with destination port $P_j$. To fulfill this switching task, we set the corresponding control register $|C_i\rangle$ to $|j\rangle$. After all the control registers are set, performing the quantum switching switches all the input data to their corresponding destination ports.

Fig. 19. The sketch of the quantum switching architecture. The switching sector carries the input qubits $|q_1\rangle, \ldots, |q_n\rangle$ from input ports $P_1, \ldots, P_n$ to output ports $P_1, \ldots, P_n$ as $|q_1'\rangle, \ldots, |q_n'\rangle$; the control sector maps the control registers $|C_1\rangle, \ldots, |C_n\rangle$ to the output registers $|C_1'\rangle, \ldots, |C_n'\rangle$.

In essence, quantum switching can be viewed as a quantum sorting algorithm that sorts all the input qubits $|q_i\rangle$ into increasing order of their destination port numbers $|C_i\rangle$. As a result, the quantum switching architecture can be constructed by using the QMS algorithm presented in Section III. The details are described as follows.

A. Quantum Switching Circuits

Before we present how quantum switching can be constructed by using the QMS algorithm, we first define a particular partial comparator gate for the quantum switching. As shown in Fig. 20(a), the partial comparator gate compares only the register $|C_1\rangle$ with the register $|C_2\rangle$. If the register $|C_1\rangle$ is smaller
Only when the register $|C_1\rangle$ is larger than the register $|C_2\rangle$ does the partial comparator gate act: it flips the workspace qubit (i.e., changes its state from $|0\rangle$ to $|1\rangle$) and swaps the registers $|C_1\rangle$ and $|C_2\rangle$ as well as the qubits $|q_1\rangle$ and $|q_2\rangle$. In effect, the partial comparator gate can be seen as a comparator gate with an appended controlled-swap gate. Therefore, for simplicity, we adopt the abbreviated drawing in Fig. 20(b) to represent the partial comparator gate defined in Fig. 20(a).

Fig. 20. The sketch of the partial comparator gate for quantum switching.

Quantum switching can be seen as a kind of quantum sorting algorithm that sorts each input qubit by its destination port number. The first step in constructing the quantum switching is to establish the QMS(n) circuits in the control sector. Then, for each comparator gate in the control sector, we append a controlled-swap gate on the corresponding input qubits.

An example of implementing the quantum switching with four I/O ports is illustrated in Fig. 21. As mentioned above, the quantum switching is built by first constructing the QMS(4) circuits in the control sector, as in Fig. 21(a). Afterwards, a controlled-swap gate on the corresponding input qubits is appended to each comparator gate: as drawn in Fig. 21(b), for the comparator gate applied to the registers $|C_i\rangle$ and $|C_j\rangle$, a controlled-swap gate is performed on the input qubits $|q_i\rangle$ and $|q_j\rangle$. After the quantum circuits in Fig. 21(b) are performed, each input qubit is permuted to its corresponding destination port.

Fig. 21. An example for quantum switching circuits with four I/O ports: (a) QMS(4), (b) quantum switching.

B. Garbage-Bit Cleaner

In addition to the switching sector and the control sector, the quantum switching needs a garbage-bit cleaner. As depicted in Fig. 20(a), whenever we employ the partial comparator gate, we require an ancillary workspace qubit to make the partial comparator gate reversible. However, those qubits become garbage bits after the quantum switching. Therefore, to utilize the workspace qubits more efficiently, we implement a garbage-bit cleaner that cleans the garbage bits so that they can be reused in the next round (Fig. 22).

Fig. 22. The entire quantum switching architecture.

As shown in Fig. 22, the garbage-bit cleaner can be roughly seen as the inverse of the control sector. Since the control sector is primarily constructed by the QMS algorithm, the garbage-bit cleaner can be built by implementing the inverse of the QMS algorithm. We employ the following methodology to build the inverse of the QMS operation. The first step is to implement the QMS circuits in reverse order. The second step is to replace each comparator gate by the inverse of the comparator gate.

An example of implementing the inverse of QMS(4) is illustrated here. We define the quantum logic gate drawn in Fig. 23(a) to be the inverse of the comparator gate. As shown in Fig. 23(b), the inverse of the comparator gate can be constructed by applying the comparator circuits in Fig. 5 in reverse order. Now we can build the garbage-bit cleaner. Fig. 24(a) shows the QMS(4) quantum circuits. To construct the inverse of the QMS(4) operation, we first implement the QMS(4) circuits in reverse order, as depicted in Fig. 24(b). After replacing each comparator gate in Fig. 24(b) by the inverse of the comparator gate, we obtain the inverse of QMS(4), as drawn in Fig. 24(c). By substituting the quantum circuits in Fig. 21(b) and Fig. 24(c) into the quantum switching architecture in Fig. 22, we obtain the whole quantum switching circuits, as depicted in Fig. 25.

Fig. 23. The inverse of the comparator gate.

C. Performance Evaluation

In this subsection, the performance of the quantum switching is studied. We define three factors to evaluate the effectiveness of the quantum switching: Space Complexity, Implementation Complexity, and Delayed Time Complexity.
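Before the three factors are quantified, the control sector and the garbage-bit cleaner of Section IV-B can be checked together on ordinary integers. The sketch below is a classical illustration under stated assumptions, not the authors' circuit. All function names are hypothetical, an odd-even transposition network stands in for QMS(n), and the inverse comparator is modelled as "undo the register swap under control of the workspace bit, then recompute the comparison to reset that bit", which is one reading of Fig. 23(b). It also assumes distinct destination values, as the one-to-one mapping requires.

```python
def network(n):
    """Comparator positions, layer by layer (assumed stand-in for QMS(n))."""
    return [(i, i + 1) for layer in range(n) for i in range(layer % 2, n - 1, 2)]


def control_sector(ctrl, data, gates):
    """Forward pass: one workspace bit per partial comparator gate."""
    workspace = []
    for i, j in gates:
        swapped = int(ctrl[i] > ctrl[j])
        if swapped:
            ctrl[i], ctrl[j] = ctrl[j], ctrl[i]
            data[i], data[j] = data[j], data[i]
        workspace.append(swapped)
    return workspace


def garbage_bit_cleaner(ctrl, workspace, gates):
    """Inverse pass: gates in reverse order, each replaced by its inverse."""
    for k in reversed(range(len(gates))):
        i, j = gates[k]
        if workspace[k]:                          # controlled un-swap of the registers
            ctrl[i], ctrl[j] = ctrl[j], ctrl[i]
        workspace[k] ^= int(ctrl[i] > ctrl[j])    # comparison resets the bit to 0


dest = [2, 0, 3, 1]
ctrl, data = list(dest), ["a", "b", "c", "d"]
gates = network(len(dest))
ws = control_sector(ctrl, data, gates)
garbage_bit_cleaner(ctrl, ws, gates)
print(data, ctrl, ws)  # ['b', 'd', 'a', 'c'] [2, 0, 3, 1] [0, 0, 0, 0, 0, 0]
```

In this model the payloads stay at their destination ports, every workspace bit returns to 0 and can be reused in the next round, and the control registers return to their initial values. One workspace bit is consumed per partial comparator gate, which is the quantity the space analysis in this subsection counts.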

Fig. 24. The process to implement the garbage-bit cleaner.

Fig. 25. The entire quantum switching circuits.

Each of these three factors plays an important role in evaluating the practicability of a quantum instrument. Space complexity and implementation complexity refer to the complexity of building the quantum instrument, whereas delayed time complexity represents the running time in execution. In practice, the delayed time complexity is more critical than the space complexity and the implementation complexity, since the proposed quantum switching is designed to be reused. The detailed definitions of these performance indexes are as follows.

Definition: The space complexity, SC(n), is defined as the total number of input qubits for the quantum switching with n I/O ports.

As depicted in Fig. 22, the space complexity SC(n) for the quantum switching comprises n input qubits, n control registers, and extra workspace qubits. Since each control register stores the destination port number of the corresponding input qubit, it requires at least $\lceil \log_2 n \rceil$ qubits. As for the number of workspace qubits, we have to employ an additional workspace qubit for each partial comparator gate, and it can be seen from (9) that the quantum switching requires $\Theta(n \log n)$ comparator gates, so we need $\Theta(n \log n)$ workspace qubits in total. Consequently,

$$SC(n) = n + n\lceil \log_2 n \rceil + \Theta(n \log n) = \Theta(n \log n). \tag{16}$$

On the other hand, a quantum switching with n I/O ports has to be able to handle $n!$ possible cases. Since $\log_2 n! = \Theta(n \log n)$ [22], it is unavoidable for the quantum switching to use at least $\lceil \log_2 n! \rceil = \Theta(n \log n)$ control bits to specify each possible switching case. This matches the space complexity of the proposed quantum switching, so the proposed quantum switching is efficient in space consumption.

Definition: The implementation complexity, IC(n), is defined as the number of quantum operations needed to implement the quantum switching with n I/O ports.

According to Fig. 25 and (9), the quantum switching with n I/O ports is composed of $\Theta(n \log n)$ partial comparator gates and $\Theta(n \log n)$ inverses of comparator gates. In addition, as shown in Fig. 20(a), each partial comparator gate contains a comparison gate and $\lceil \log_2 n \rceil + 1$ controlled-swap gates, and the inverse of the comparator gate can be constructed from $\lceil \log_2 n \rceil$ controlled-swap gates and a comparison gate, as drawn in Fig. 23(b). Furthermore, for the proposed quantum switching, the comparison gate can be constructed with $\Theta(\log n)$ quantum operations [29]. As a result,

$$IC(n) = \big[\Theta(n \log n)\,(\Theta(\log n) + (\log n + 1)) + \Theta(n \log n)\,(\Theta(\log n) + \log n)\big] = \Theta(n \log^2 n). \tag{17}$$

Definition: The delayed time complexity, DT(n), is defined as the running time complexity of the quantum switching with n I/O ports.

It can be seen from Fig. 25 and (13) that the whole quantum switching can be implemented in $\Theta(\log^2 n)$ layers of partial comparator gates, each with a time complexity of $\Theta(\log n)$, and $\Theta(\log^2 n)$ layers of inverses of comparator gates, each with a time complexity of $\Theta(\log n)$. Therefore,

$$DT(n) = \big[\Theta(\log^2 n)\,\Theta(\log n) + \Theta(\log^2 n)\,\Theta(\log n)\big] = \Theta(\log^3 n). \tag{18}$$

In other words, the quantum switching takes only $\Theta(n \log n)$ space and can be implemented with $\Theta(n \log^2 n)$ quantum gates. Furthermore, in terms of time complexity, executing the quantum switching requires only $\Theta(\log^3 n)$ operation time; that is, the quantum switching can be performed in polylogarithmic time. Therefore, the proposed quantum switching architecture is very scalable.

V. CONCLUSION

Because of the many breakthroughs in quantum communication, the demand for establishing communication networks has become more and more urgent. To reduce the cost of building quantum backbone networks, this paper proposes a quantum switching architecture that switches each input quantum datum to its corresponding destination port. The proposed quantum switching employs an efficient QMS algorithm as a subroutine, so the total execution time is reduced to polylogarithmic time. More precisely, the quantum switching requires only $\Theta(n \log n)$ space and can be implemented with $\Theta(n \log^2 n)$ quantum gates. Furthermore, the quantum switching achieves a time complexity of $\Theta(\log^3 n)$. Based on these advantages, the proposed quantum switching is feasible for constructing high-performance quantum networks.

REFERENCES

[1] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, Cambridge, England, 2000.

[2] D. DiVincenzo, "Quantum computation," Science, vol. 270, pp. 255-261, 1995.
[3] S. Lloyd, "A potentially realizable quantum computer," Science, vol. 261, pp. 1569-1571, 1993.
[4] P. Shor, "Algorithms for quantum computation: discrete logarithms and factoring," in Proc. of the 35th Annual IEEE Symposium on the Foundations of Computer Science, pp. 124-134, 1994.
[5] L. Grover, "A fast quantum mechanical algorithm for database search," in Proc. of the 28th Annual ACM Symposium on the Theory of Computing, pp. 212-219, 1996.
[6] I. M. Tsai, S. Y. Kuo, and David S. L. Wei, "Quantum Boolean Circuit Approach for Searching an Unordered Database," in Proc. of the 2002 2nd IEEE Conference on Nanotechnology, pp. 325-318, Aug. 2002.
[7] M. Oskin, F. T. Chong, I. L. Chuang, and J. Kubiatowicz, "Building Quantum Wires: The Long and the Short of it," in Proc. of the 30th Annual International Symposium on the Computer Architecture, pp. 374-385, 2003.
[8] C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres, and W. K. Wootters, "Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels," Phys. Rev. Lett., 70:1895-1899, 1993.
[9] E. Biham, B. Huttner, and T. Mor, "Quantum cryptographic network based on quantum memories," Phys. Rev. A, vol. 54, pp. 2651-2658, Oct. 1996.
[10] C. H. Bennett, G. Brassard, and A. K. Ekert, "Quantum cryptography," Scientific American, pp. 50-57, Oct. 1992.
[11] W. Dür, H. J. Briegel, J. I. Cirac, and P. Zoller, "Quantum repeater based on entanglement purification," Phys. Rev. A, vol. 59, pp. 169-181, 1999.
[12] S. T. Cheng, C.-Y. Wang, and M. H. Tao, "Quantum Communication for Wireless Wide-Area Networks," IEEE Journal on Selected Areas in Communications, pp. 1424-1432, July 2005.
[13] P. Høyer, J. Neerbek, and Y. Shi, "Quantum complexities of ordered searching, sorting and element distinctness," in Proc. of the 28th International Colloquium on Automata, Languages and Programming, pp. 62-73, 2001.
[14] C. Moore and M. Nilsson, "Parallel Quantum Computation and Quantum Codes," arXiv e-print quant-ph/9808027, 1998.
[15] A. Barenco, C. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. Smolin, and H. Weinfurter, "Elementary Gates for Quantum Computation," Phys. Rev. A, 52(5), pp. 3457-3467, 1995.
[16] R. Landauer, "Irreversibility and heat generation in the computing process," IBM J. Res. Dev., vol. 5, pp. 183, 1961.
[17] I. M. Tsai and S. Y. Kuo, "Digital Switching in the Quantum Domain," IEEE Trans. Nanotechnology, vol. 1, pp. 154-164, Sep. 2002.
[18] I. M. Tsai, S. Y. Kuo, S. L. Huang, Y. C. Lin, and T. T. Chen, "Experimental Realization of an NMR Quantum Switch," arXiv e-print quant-ph/0405170, 2004.
[19] M. K. Shukla, R. Ratan, and A. Y. Oruc, "A Quantum Self-Routing Packet Switch," in Proc. of the 38th Annual Conference on Information Sciences and Systems, Mar. 2004.
[20] D. Dieks, "Communication by EPR devices," Phys. Lett. A, 92(6), pp. 271-272, 1982.
[21] W. K. Wootters and W. H. Zurek, "A single quantum cannot be cloned," Nature, vol. 299, pp. 802-803, 1982.
[22] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, MIT Press, Sep. 2001.
[23] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser, "Invariant quantum algorithms for insertion into an ordered list," arXiv e-print quant-ph/9901059, 1999.
[24] A. Ambainis, "A better lower bound for quantum algorithms searching an ordered list," in Proc. of the 40th Annual IEEE Symposium on Foundations of Computer Science, pp. 352-357, 1999.
[25] A. J. Arul, "Impossibility of comparing and sorting quantum states," arXiv e-print quant-ph/0107085, 2001.
[26] H. Klauck, R. Spalek, and R. de Wolf, "Quantum and Classical Strong Direct Product Theorems and Optimal Time-Space Tradeoffs," arXiv e-print quant-ph/0402123, July 2004.
[27] H. Klauck, "Quantum Time-Space Tradeoffs for Sorting," in Proc. of the 35th ACM Symposium on Theory of Computing, pp. 69-76, 2003.
[28] D. Nassimi and S. Sahni, "Parallel Permutation and Sorting Algorithms and a New Generalized Connection Network," Journal of the Association for Computing Machinery, vol. 29, pp. 642-667, July 1982.
[29] K. W. Cheng and C. C. Tseng, "Quantum full adder and subtractor," Electronics Letters, vol. 38, pp. 1343-1344, Oct. 2002.

Sheng-Tzong Cheng received the B.S. (1985) and M.S. (1987) in Electrical Engineering from the National Taiwan University, Taipei, Taiwan. He received the M.S. (1993) and Ph.D. (1995) in Computer Science from the University of Maryland, College Park, Md. He was an Assistant Professor of Computer Science and Information Engineering at National Dong Hwa University, Hualien, Taiwan, in 1995, and became an Associate Professor in 1996. He is currently a Professor in the Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan. His research interests are in design and performance analysis of mobile computing, wireless communications, multimedia, and real-time systems.

Chun-Yen Wang received the B.S. degree in mathematics from National Cheng Kung University, Tainan, Taiwan in 2001. He is currently working toward the Ph.D. degree in computer science and information engineering at National Cheng Kung University. His research interests include wireless communications, mobile computing, quantum computation, and quantum communications.