Optimizing Molecular Cloning of Multiple Plasmids Thierry Petit, Lolita Petit
Total Page:16
File Type:pdf, Size:1020Kb
Optimizing Molecular Cloning of Multiple Plasmids Thierry Petit, Lolita Petit To cite this version: Thierry Petit, Lolita Petit. Optimizing Molecular Cloning of Multiple Plasmids. International Joint Conference on Artificial Intelligence, 2016, Buenos Aires, Argentina. p. 773-779. hal-01628328 HAL Id: hal-01628328 https://hal.archives-ouvertes.fr/hal-01628328 Submitted on 3 Nov 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Optimizing Molecular Cloning Of Multiple Plasmids∗ Thierry Petit Lolita Petit Foisie School of Business, Gene Therapy Center, Worcester Polytechnic Institute, USA University of Massachusetts Medical School, [email protected] Worcester MA, USA [email protected] [email protected] Abstract The remainder of this paper is organized in the following manner. Section 2 describes the PCP. Section 3 introduces In biology, the construction of plasmids is a rou- a CP model for the PCP and a new propagator for AtMost- tine technique, yet under-optimal, expensive and NVector, theoretically and empirically evaluated. Section 4 time-consuming. In this paper, we model the Plas- presents experiments on real PCP instances. mid Cloning Problem (PCP) in constraint program- ing, in order to optimize the construction of plas- mids. Our technique uses a new propagator for 2 The Plasmid Cloning Problem the AtMostNVector constraint. This constraint al- Construction of plasmids refers to the isolation of an oriented lows the design of strategies for constructing mul- DNA sequence containing a gene of interest, insert, and its in- tiple plasmids at the same time. Our approach rec- sertion into a circular molecule, plasmid, which can be used ommends the smallest number of different cloning to generate multiple copies of the DNA when introduced into steps, while selecting the most efficient steps. It host cells (Fig. 1, next page). Each insert i is typically en- provides optimal strategies for real instances in closed into two different plasmids, z test and z cont (con- gene therapy for retinal blinding diseases. trol), to yield two final plasmids z test[i] and z cont[i]. Inserting the insert into a plasmid is generally performed 1 Introduction using a cut and paste approach (Fig. 1A). First, the insert and Construction of plasmids is one of the most commonly used the plasmid are cut with commercially available restriction techniques in molecular biology. Invented 40 years ago, enzymes (e.g., [Biolabs, 2015c]) at specific restriction sites this technique led to impressive applications in medicine and in order to generate compatible ends. Then, digested DNA biotechnologies, such as the production of large amounts of fragments are mixed together to allow their compatible ends insulin and antibiotics in bacteria, as well as the manipula- to anneal to each other. A DNA ligase is added to the solution tion of genes in host organisms. In most cases, plasmids are to covalently join the two fragments. constructed by cutting and assembling DNA fragments from The most crucial part of the experiment is to determine how different sources [Brown, 2010]. However, the physical as- the insert is introduced into the plasmids, i.e., which pairs of sembly of DNA parts is a time-consuming process for the ex- restriction enzymes will be used to cut the insert and the plas- perimenter, as it requires a multitude of steps. Minimizing the mids to facilitate the cloning. The PCP considers the con- number of different steps that are necessary to build multiple struction of multiple plasmids at the same time. We cannot plasmids at the same time can significantly reduce laboratory select enzymes recognizing restriction sites that are already work time and financial costs. present within the insert or present more than once in the plas- In this paper, we introduce a Constraint Programming (CP) mid. Using an enzyme recognizing any of these sites would approach to the Plasmid Cloning Problem (PCP). The ob- provoke fragmentation of DNAs. The list and positions of re- jective of the PCP is to recommend the smallest number striction sites present within each DNA sequence can be gen- of different cloning steps, while selecting the most efficient erated by several analysis DNA software tools, e.g., [ApE, steps. For this purpose, we design a constraint model and 2015]. Let z be either z test or z cont, and z[i] the result- we introduce a new propagator for the AtMostNVector con- ing final plasmid. Let the integer identifiers x[i]:before and straint [Chabert et al., 2009]. We show that no dominance x[i]:after be the restriction sites located at ends of insert i to properties exist between our technique and state-of-the-art allow its cloning into z. Five constraints apply: propagators. This complementarity is confirmed empirically. (A) Enzymes used to cut the insert i and the plasmid z need We demonstrate the relevance of using an Artificial Intelli- to create identical or equivalent ends, as only physically gence framework for modeling the PCP by successfully solv- compatible ends can be covalently ligated (Fig. 1A). ing real instances in gene therapy for retinal blinding diseases. Therefore, the restriction sites x[i]:before and x[i]:after ∗This work was partially supported by the Fulbright-Fondation should be compatible with the restriction sites of z[i]. Monahan, Fondation de France and AFM Tel´ ethon´ fellowships. The list of enzymes creating compatible ends is freely A NotI PvuII x[i].before x[i].after SpeI EcoRV Digestion Mix Ligation SpeI SpeI + EcoRV Final XbaI Insert i PvuII (292) 1 2 3 Plasmid EcoRV 0 XbaI (297) Digestion z[i] XhoI Plasmid HindIII (312) EcoRV (383) XbaI XhoI z XhoI (499) + EcoRV XhoI (876) Cut Paste B C compatibles D HindIII PvuII EcoRV EcoRV DraI SpeI XbaI PvuII [i].before Inter- SpeI EcoRV Correct XbaI mediate orientation Digestion ✓ EcoRV Insert i y XhoI Insert i EcoRI EcoRV SpeI XbaI PvuII EcoRV SpeI Final EcoRI Incorrect XbaI Incorrect Self HindIII Plasmid orientation orientation ligation Insert i ✗ EcoRV ✗ ✗ z[i] Insert i Figure 1: Construction of plasmids using the digestion/ligation method. (A) General procedure for cloning an insert into a plasmid. Using polymerase chain reaction, restriction sites (SpeI and EcoRV in this example) are added at both ends of an oriented insert i (arrow). The insert and the plasmid are digested by restriction enzymes to generate sticky and complimentary ends. With the addition of DNA ligase, the insert and the plasmid recombine at the sticky end sites. The result is a new plasmid containing the insert. (B)-(D) Orientation and compatibility constraints for the cloning of an insert into a final plasmid, in one step (B), (C), or when an intermediate plasmid is required (D). Restricted sites that are excluded are in grey. Restriction site position in the plasmid is given in parenthesis. provided by suppliers, e.g., [Biolabs, 2015a]. Moreover, 1) digest insert i to generate ends compatible with y, 2) in some cases, ligation of two compatible ends can de- assemble i and y, 3) extract i from y[i] to generate ends stroy both original restriction sites in the hybrid DNA compatible with z and 4) ligate i and z to get z[i]. sequence. These sites are excluded from the list to allow (E) As using an intermediate plasmid is very time consum- the experimenter to easily remove or exchange the insert ing task for the experimenter, we constrain x[i]:before from the final plasmid, should this be required. and x[i]:after to be identical for the construction of both (B) The insert must be enclosed into the plasmid in the right z test and z cont when an intermediate plasmid is re- orientation to allow its expressivity. As restriction sites quired for the construction of both z test and z cont. present in the plasmid z are totally ordered by their posi- tion, the two enzymes used to cut the insert i, x[i]:before Optimization criteria. When multiple plasmids are con- and x[i]:after, should recognize restriction sites with structed at the same time, sharing the same cloning strat- identical orientation on the plasmid z[i] (Fig. 1B). In egy reduces the total step count, associated laboratory work addition, in z[i], the restriction sites should be separated and reagent costs, because plasmid digestion and purification by d (a positive integer), as many restriction enzymes do steps can be done in parallel. Therefore, our primary objec- not cut DNA efficiently at the end of a DNA sequence. tive is to minimize the number of distinct pairs of enzymes Therefore, we must have z[i]:before + d < z[i]:after. (z[i]:before; z[i]:after) used to construct all final plasmids. In (C) Enzymes used to cut the two ends of the insert and plas- addition, not all pairs of restriction enzymes (before; after) mid should raise two different and not compatible ends. are equivalent in practice. Restriction enzyme cost is stem Otherwise, the insert could be assembled to the plasmid from buffer and temperature compatibility, as well as cloning in either the correct or incorrect orientation. Addition- efficiency. In particular, it is highly desirable that at least one ally, the plasmid may self-ligate omitting the insert (Fig. of the restriction enzyme generates a sticky-end DNA frag- 1C).