Solving the Set Cover Problem and the Problem of Exact Cover by 3-Sets In

BioSystems 72 (2003) 263–275 Solving the set cover problem and the problem of exact cover by 3-sets in the Adleman–Lipton model Weng-Long Chang a,1, Minyi Guo b,∗ a Department of Information Management, Southern Taiwan University of Technology, Tainan, Taiwan 701, ROC b Department of Computer Software, The University of Aizu, Aizu-Wakamatsu City, Fukushima 965-8580, Japan Received 15 April 2003; received in revised form 23 July 2003; accepted 23 July 2003 Abstract Adleman wrote the first paper in which it is shown that deoxyribonucleic acid (DNA) strands could be employed towards cal- culating solutions to an instance of the NP-complete Hamiltonian path problem (HPP). Lipton also demonstrated that Adleman’s techniques could be used to solve the NP-complete satisfiability (SAT) problem (the first NP-complete problem). In this paper, it is proved how the DNA operations presented by Adleman and Lipton can be used for developing DNA algorithms to resolving the set cover problem and the problem of exact cover by 3-sets. © 2003 Elsevier Ireland Ltd. All rights reserved. Keywords: Biological computing; Molecular computing; DNA-based computing; The NP-complete problem 1. Introduction shows that the set cover problem and the problem of exact cover by 3-sets can be solved with biological op- Producing 1018 DNA strands that fit in a test tube erations in the Adleman–Lipton model. Furthermore, is possible through advances in molecular biology this work represents obvious evidence for the ability (Sinden, 1994). Adleman wrote the first paper in of DNA-based computing to solve the NP-complete which it is shown that each DNA strand could be ap- problems. plied for solving the NP-complete Hamiltonian path The rest of this paper is organized as follows. In problem (Adleman, 1994). Lipton demonstrated that Section 2, the Adleman–Lipton model is introduced Adleman’s experiment could be used to determine the in detail and the comparison of the model with other NP-complete satisfiability (SAT) problem (the first models is also given. Section 3 introduces a DNA NP-complete problem) (Lipton, 1995). algorithm for solving the set cover problem. Section In this paper, we apply the DNA operations in the 4 describes another DNA algorithms for solving the Adleman–Lipton model to develop two DNA algo- problem of exact cover by 3-sets. In Section 5, the rithms. The main result of the two DNA algorithms experimental result of simulated DNA computing is also given. Conclusions are drawn in Section 6. ∗ Corresponding author. Tel.: +81-242-37-2557; + fax: 81-242-37-2744. 2. DNA model of computation E-mail addresses: [email protected], [email protected] (W.-L. Chang), [email protected] (M. Guo). In Section 2.1, the summary of DNA structure and 1 Tel.: +886-6-2533131x4300; fax: +886-6-2541621. the Adleman–Lipton model is described in detail. 0303-2647/$ – see front matter © 2003 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/S0303-2647(03)00149-7 264 W.-L. Chang, M. Guo / BioSystems 72 (2003) 263–275 In Section 2.2, the comparison of the Adleman– 3. Merge. Given tubes P1 and P2, yield ∪(P1, P2), Lipton model with other models is also introduced in where ∪(P1,P2) = P1 ∪ P2. This operation is to detail. pour two tubes into one, with no change of the individual strands. 2.1. The Adleman–Lipton model 4. Anneal. The operation is to represent all of the operations that combine a test tube of sin- A DNA (deoxyribonucleic acid) is a polymer, gle stranded DNA with other prepared strands which is strung together from monomers called de- and let them anneal together to form double oxyribonucleotides (Sinden, 1994; Paun et al., 1998). strands. Distinct nucleotides are detected only with their 5. Detect. Given a tube P, say ‘yes’ if P includes at bases. Those bases are, respectively, abbreviated as least one DNA molecule, and say ‘no’ if it contains A, G, C and T. Two strands of DNA can form (under none. appropriate conditions) a double strand, if the respec- 6. Append. Given a tube P and a short strand of DNA, tive bases are the Watson–Crick complements of each Z, the operation will append the short strand, Z, other—A matches T and C matches G; also 3 end onto the end of every strand in the tube P. matches 5 end. The length of a single stranded DNA 7. Discard. Given a tube P, the operation will discard is the number of nucleotides comprising the single the tube P. strand. Thus, if a single stranded DNA includes 20 nu- 8. Read. Given a tube P, the operation is used to de- cleotides, it is called a 20 mer. The length of a double scribe a single molecule, which is contained in the stranded DNA (where each nucleotide is base paired) tube P.EvenifP contains many different molecules is counted in the number of base pairs. Thus, if we each encoding a different set of bases, the opera- make a double stranded DNA from a single stranded tion can give an explicit description of exactly one 20 mer, then the length of the double stranded DNA of them. is 20 base pairs, also written as 20 bp (for more dis- cussion of the relevant biological background, refer 2.2. Other related work and comparison with the to Sinden, 1994; Boneh et al., 1996; Paun et al., Adleman–Lipton model 1998). The DNA operations proposed by Adleman and Based on solution space of splint in the Adleman– Lipton (Adleman, 1994, 1996; Lipton, 1995; Boneh Lipton model, their methods (Narayanan and Zorbala, et al., 1996), are described below. These operations 1998; Chang and Guo, 2002a,b,c; Chang et al., will be used for figuring out solutions of the set cover 2002) could be applied towards solving the trav- problem and the problem of exact cover by 3-sets in eling salesman problem, the dominating-set prob- this paper. lem, the vertex cover problem, the clique problem, The Adleman–Lipton model: A (test) tube is a set the independent-set problem, the three-dimensional of molecules of DNA (i.e. a multi-set of finite strings matching problem and the set-packing problem. Those over the alphabet {A, C, G, T}). Given a tube, one methods for the problems resolved have exponentially can perform the following operations: increasing volumes of DNA and linearly increasing time. 1. Extract. Given a tube P and a short single strand Bach et al. (1996) proposed an n1.89n vol- of DNA, S, produce two tubes +(P, S) and −(P, ume, O(n2 + m2) time molecular algorithm for the S), where +(P, S) is all of the molecules of DNA 3-coloring problem and a 1.51n volume, O(n2m2) in P which contain the strand S as a substrand and time molecular algorithm for the independent set −(P, S) is all of the molecules of DNA in P which problem, where n and m are, subsequently, the number do not contain the short strand S. of vertices and the number of edges in the problems 2. Separate. Given a tube P and length L for a double resolved. Fu (1997) presented a polynomial-time al- strand of DNA, generate one tube ∗(P, L), where gorithm with a 1.497n volume for the 3-SAT problem, ∗(P, L) is all of the molecules of DNA in P which a polynomial-time algorithm with a 1.345n volume length is equal to L. for the 3-Coloring problem and a polynomial-time W.-L. Chang, M. Guo / BioSystems 72 (2003) 263–275 265 algorithm with a 1.229n volume for the independent 3. Using the Adleman–Lipton model to solve the set. Though the size of those volumes (Fu, 1997; Bach set cover problem et al., 1996) is lower, constructing those volumes is more difficult and the time complexity to the methods In Section 3.1, the summary of the set-cover prob- is very higher. lem is described. Applying splints to constructing so- Quyang et al. (1997) showed that restriction en- lution space of DNA sequences for the set-cover prob- zymes could be used to solve the NP-complete clique lem is introduced in Section 3.2.InSection 3.3, one problem. The maximum number of vertices that DNA algorithm is proposed to resolving the set-cover they can process is limited to 27 because the size problem. In Section 3.4, the complexity of the pro- of the pool with the size of the problem exponen- posed algorithm is described. tially increases (Quyang et al., 1997). Shin et al. (1999) presented an encoding scheme for decreas- 3.1. Definition of the set-cover problem ing error rate in hybridization. The method (Shin Assume that a finite set S is {s , ..., sA}, where sk et al., 1999) could be employed towards ascertain- 1 is one element in S for 1 ≤ k ≤ A. We denote that |S|, ing the traveling salesman problem for represent- which is equal to A, is the cardinality of S. Suppose ing integer and real values with fixed-length codes. that a collection C is {C , ..., CB}, where Ci is a Arita et al. (1997) and Morimoto et al. (1999), 1 subset in S for 1 ≤ i ≤ B. We denote that |C|isthe proposed new molecular experimental techniques cardinality of C and |C| is equal to B. Assume that m is and a solid-phase method to finding a Hamiltonian a positive integer. A set cover for S is a sub-collection path, respectively.

Solving the Set Cover Problem and the Problem of Exact Cover by 3-Sets In

Selective Maximum Coverage and Set Packing

Solving Sudoku with Dancing Links

Representing Sudoku As an Exact Cover Problem and Using Algorithm X and Dancing Links to Solve the Represented Problem

Exact Algorithms and APX-Hardness Results for Geometric Packing and Covering Problems∗

Bastian Michel, "Mathematics of NRC-Sudoku,"

Posterboard Presentation

Combinatorialalgorithms for Packings, Coverings and Tilings Of

Arxiv:0910.0460V3 [Cs.DS] 3 Feb 2010 Ypsu Ntertclapcso Optrsine21 ( 2010 Science Computer of Aspects Theoretical on Symposium Matching

Reductions and Satisfiability

8. NP and Computational Intractability

Unique Covering Problems with Geometric Sets

Algorithm X, Conflict-Driven-Clause-Learning