Applications of site-specific recombination

As the examples discussed above show, the same mechanism of DNA recombination can be used in different biological contexts to bring about integration, excision (deletion) and inversion of DNA segments. In principle, then, one should be able to adapt site-specific recombination systems to direct one or more of these types of DNA rearrangements in selected regions of a genome of interest. This expectation has been fully validated. Site-specific recombination has been used to engineer genetic alterations, both for answering fundamental questions in biology and for developing biotechnological tools.

Since Cre and Flp have extremely simple reaction requirements, these recombination systems have been reconstituted in a variety of organisms, including fungi, plants, nematodes, flies and other animals. Cre and Flp can be placed under regulatable promoters for conditional or tissue-specific expression.

An important first step in applying site-specific recombination in a genetic context of interest is the introduction of the target site or sites at the desired locale(s). Once this has been accomplished, the rest of the experimental steps are quite straightforward.

Tracking cell lineage during development

One of the most useful applications of site-specific recombination in basic biology has been in tracking the lineage of cells during development. The early work was done in Drosophila using the Flp-FRT system. The method has been extended to a variety of developmental systems, helped, in particular, by important technological advances in fluorescence microscopy and the development of multi-color reporter constructs.

The principle of the method is illustrated in the figure below. Here site-specific recombination is used to bring about mitotic recombination between homologous chromosomes. In an organism such as Drosophila, the reporter constructs and the recombination sites can be appropriately placed in the chromosomes to be manipulated.

The FRT sites are placed at identical positions close to the centromeres of a pair of homologous chromosomes. In one chromosome, a GFP (green fluorescent protein) reporter gene lacking a promoter is placed adjacent to the FRT site, as shown in the figure below. In the homologue, a promoter sequence is placed next to the FRT site. At a certain stage in development, as demanded by the experimental objective, the Flp protein is expressed by turning on a conditional promoter that controls the FLP gene.

Consider the state of a cell in which DNA replication has been completed. There are two sister copies of each of the two chromosomes that we are interested in. Each pair of sister chromatids is held together, so that they can be segregated to the daughter cells in a one-to-one fashion. When recombination occurs between FRT sites as diagrammed, one chromosome acquires the promoter from its partner and becomes competent to express GFP. When the cell divides, only one of the two daughter cells will acquire this chromosome; therefore only this cell will express GFP and appear green under the fluorescence microscope. The other will lack fluorescence. All the cells resulting from the division of the fluorescent progenitor cell will also be green. These clones will form a mosaic against the majority of cells, which are non-fluorescent. Thus, by tracking fluorescence, cell lineage can be followed during development.
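The inheritance logic described above can be sketched in a few lines of Python. This is a toy model rather than a simulation of real development: it assumes a single founder cell, synchronous divisions, and one Flp-mediated event in one cell, after which exactly one daughter inherits the GFP-expressing chromosome. The function name and its parameters are hypothetical.

```python
def count_green_cells(total_divisions, recombination_division):
    """Return (green, total) cell counts after `total_divisions` rounds.

    Toy assumptions: one founder cell, synchronous divisions, and a single
    recombination event during division number `recombination_division`
    (1-based), which yields exactly one GFP-expressing daughter cell.
    """
    if not 1 <= recombination_division <= total_divisions:
        raise ValueError("recombination must occur during one of the divisions")
    total = 2 ** total_divisions
    # The single green daughter appears after the recombination division,
    # and its clone doubles with every later division.
    green = 2 ** (total_divisions - recombination_division)
    return green, total


if __name__ == "__main__":
    for when in (2, 5, 9):
        green, total = count_green_cells(10, when)
        print(f"Flp induced at division {when}: {green} green of {total} cells")
```

In this model, the earlier Flp is induced, the larger the green clone, which is why the timing of induction determines how much of the lineage is marked.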

Ablating a gene function during development

One can also delete a particular gene at a given stage of development, and follow the consequence of the deletion during further development. Here the locus of interest is flanked by two copies of FRT in a direct (or head-to-tail) orientation (see the figure above). The Flp protein is induced from a regulated promoter. If one expresses the Flp gene from a tissue-specific promoter, the deletion will occur only in that particular tissue. The effects of removing the gene function in a given tissue or a set of tissues can be followed.

Inducing the expression of a gene at a specific time in development

One can induce the expression of a gene at a desired point in development via site-specific recombination. For example, imagine a gene X engineered in such a way that its promoter is oriented in the non-functional direction (see the figure below). The promoter is flanked by two FRT sites in the inverted (head-to-head) orientation. When the Flp protein is induced at the appropriate time, recombination by Flp will invert the DNA segment containing the promoter. The promoter is thereby turned around into the functional orientation, turning on gene X.


Alternatively, one can introduce a transcription terminator site between the promoter and the gene of interest (see the figure below). The terminator is flanked by a recombination target site at either end, with the sites arranged as direct repeats. With the terminator present, transcription initiated at the promoter cannot get past the terminator and enter the gene. However, when the terminator is removed by recombination, transcription proceeds into the gene, thus activating its function.
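The dependence of the outcome on site orientation can be made concrete with a small Python sketch. It assumes a simplified representation in which a DNA molecule is a list of named elements, each with an orientation of +1 or -1. Two sites in the same orientation (direct repeats) give excision, and two sites in opposite orientations (inverted repeats) give inversion. The function and the element names are hypothetical, chosen only for illustration.

```python
def recombine(segments, site="FRT"):
    """Apply one recombination event between the two target sites.

    `segments` is a list of (name, orientation) tuples with orientation
    +1 or -1. Exactly two entries must be named `site`.
    Returns (new_segments, outcome).
    """
    positions = [i for i, (name, _) in enumerate(segments) if name == site]
    if len(positions) != 2:
        raise ValueError("expected exactly two target sites")
    left, right = positions
    if segments[left][1] == segments[right][1]:
        # Direct repeats: the intervening DNA is excised,
        # leaving a single site behind.
        return segments[:left + 1] + segments[right + 1:], "excision"
    # Inverted repeats: the intervening DNA is reversed in order and
    # every element within it changes orientation.
    inner = segments[left + 1:right]
    flipped = [(name, -orient) for name, orient in reversed(inner)]
    return segments[:left + 1] + flipped + segments[right:], "inversion"


if __name__ == "__main__":
    # A terminator flanked by direct repeats blocks gene X until it is excised.
    blocked = [("promoter", 1), ("FRT", 1), ("terminator", 1),
               ("FRT", 1), ("geneX", 1)]
    print(recombine(blocked))
    # A backwards promoter flanked by inverted repeats is flipped into place.
    backwards = [("FRT", 1), ("promoter", -1), ("FRT", -1), ("geneX", 1)]
    print(recombine(backwards))
```

Both strategies end with a forward-facing promoter reading into gene X. Excision leaves one site behind, whereas inversion leaves both sites in place, so an inverted segment can in principle flip back as long as the recombinase is present.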

Site-specific recombination in biotechnological applications

Once the target site has been inserted into the genomic locus of interest, one can integrate an exogenous DNA sequence into this target site. Since the inserted sequence ends up flanked by two directly repeated (head-to-tail) target sites, it can also be excised from the genome by recombination. These operations can be performed in a regulated fashion by inducible expression of the recombinase. Similarly, if two target sites are inserted on either side of a certain genetic locus, recombination can be used to excise the locus. Removal of integrated harmful viral sequences would be a potentially beneficial application of site-specific recombination.

A more recent application of site-specific recombination in biotechnology is recombinase-mediated cassette exchange (RMCE; see the figure below).

This method uses two recombination events to replace an endogenous chromosomal locus with a new one. Originally, a single recombinase was used for RMCE (shown in A). RTa and RTa* differ slightly in sequence, so that RTa x RTa* recombination cannot occur. However, the recombinase can act on both RTa and RTa* with nearly equal efficiencies. [I shall explain in class how this can be done.] In later versions of RMCE, two recombinases and their respective target sites were employed. The two cross-over events diagrammed cleanly splice out the pre-existing locus and exchange it for the incoming locus.

Expanding the utility of site-specific recombination in biotechnology

As we already saw, one limitation in applying site-specific recombination in biotechnology is that the target site must first be introduced into the genome to be manipulated. The other is the small number of recombinases that can be easily manipulated for genome engineering purposes. For all practical purposes, the number is two: Cre and Flp. For each recombinase, one can generate a small set of variant target sites that are not cross-reactive with each other, but are normal or nearly normal in self-reactions. These target sites are created by changing the sequence of the short strand exchange region without changing the binding sites for the recombinase. The absolute sequence of the strand exchange region is not terribly important for recombination (although some sequences work poorly). However, there has to be perfect homology between the strand exchange regions for recombination to occur between two target sites. In other words, an altered site is reactive with a second copy of itself, but non-reactive with the native site or with another altered site containing a different substitution.

A potentially useful approach to expand the utility of site-specific recombination is to generate recombinases with altered binding (DNA recognition) specificities. That is, change the sequence of the binding elements, and then produce active recombinase variants that have acquired the corresponding new recognition capabilities. Although this idea would seem reasonably straightforward, it is quite difficult to accomplish in practice. Note that present-day Flp and Cre represent the optimization of DNA-protein recognition and catalysis over evolutionary time. It would be quite difficult to redesign that optimization within the time frame of laboratory experiments. Nevertheless, what one tries to do is mimic the process of evolution in the test tube. The approach is called directed in vitro evolution, and is briefly described below.

One can generate a large pool of random mutants of the recombinase gene, preferably by PCR-based mutagenesis. Among the 10^7 or 10^8 mutants generated, there might be one or a tiny number of recombinase variants that have acquired specificity for the new target sequence that we designed. The vast majority will consist of recombinases that have lost function because of the acquired mutations or have not changed their DNA recognition. These are not of interest to us. The problem is how to find the needle in the haystack. For this, one needs a simple and quick genetic or physical screen to track down the clones of interest.

The directed evolution protocol is carried out as follows. First, we transform the mutant library into an E. coli host and collect the transformants, say, 10^7 independent transformants. To screen just one equivalent of this library by standard genetic assays, one would require about 10^4 large petri dishes (about 1000-1500 colonies can be plated out on a dish without overcrowding). Many times more colonies can be analyzed by a physical screen using fluorescence reporters and high-throughput cell sorting machines: 10^9 cells can be sorted in several hours using state-of-the-art sorters.
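The plate estimate follows from a one-line calculation, sketched here in Python. The library size (10^7 transformants) and the plating density (1000-1500 colonies per dish) are the figures quoted above; the function name is hypothetical.

```python
def plates_needed(library_size, colonies_per_plate):
    """Smallest number of plates that holds `library_size` colonies."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-library_size // colonies_per_plate)


if __name__ == "__main__":
    library = 10 ** 7  # independent transformants, as in the text
    for density in (1000, 1500):
        print(f"{density} colonies/plate: {plates_needed(library, density)} plates")
```

At 1000 colonies per dish the answer is 10,000 plates, and at 1500 per dish it is still close to 7000, so "about 10^4" is the right order of magnitude for a single pass through the library.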

Changing the target specificity of Flp

Here is one example of a genetic screen. We want to identify Flp variants with their specificity shifted away from the native FRT site and towards a mutant FRT site, called mFRT. The two sites shown in the figure below differ in one key base pair of the Flp recognition sequence: a change from a CG bp to a GC bp. Two reporter plasmids are constructed, and the assay is carried out in E. coli. The experimental design is to distinguish FRT x FRT recombination from mFRT x mFRT recombination by a colony color assay (see the figure below). In one reporter, the LacZ gene is flanked by two direct copies of the mFRT site. In the other, the gene for RFP (red fluorescent protein) is flanked by two direct copies of FRT. Recombination between two FRT sites will eliminate the RFP gene, and recombination between two mFRT sites will eliminate the LacZ gene. Either one or both of the genes will be expressed only when the relevant recombination event or events fail to occur.

The library of Flp variants is expressed from the P-BAD promoter, which is turned on only in the presence of arabinose. The color of the colonies after addition of arabinose to the growth medium will indicate the occurrence or non-occurrence of the designed recombination events. Colonies expressing LacZ (absence of mFRT recombination) will be blue in the presence of the indicator substrate X-gal. Colonies expressing both LacZ and RFP (absence of both mFRT and FRT recombination) will also appear blue, as the blue color is strong and masks the red color. Thus blue colonies indicate that a given Flp variant is either inactive on mFRT alone or inactive on both FRT and mFRT. Colorless (white) colonies indicate that a Flp variant is active on both FRT and mFRT. Red colonies denote mFRT recombination and the absence of FRT recombination by a Flp variant.

One can collect the red colonies, isolate the Flp plasmids from them, and identify the mutations responsible for the specificity switch. One can pool these variants and subject them to further random mutagenesis and screening to identify more robust variants with strong mFRT recombination activity and high discrimination against FRT.

In panel A of the figure, the design of the Flp expression vector and of the two reporter constructs is shown schematically. In panel B, the expectations and outcomes are summarized.
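The colony color logic of this screen can be written out as a small decision function. The Python sketch below assumes the reporter design described above (LacZ flanked by mFRT sites, RFP flanked by FRT sites) and treats recombination as all-or-none, which is a simplification; the function name is hypothetical.

```python
def colony_color(active_on_frt, active_on_mfrt):
    """Predict the colony color for a Flp variant on X-gal plus arabinose."""
    lacz_present = not active_on_mfrt  # LacZ is lost by mFRT x mFRT recombination
    rfp_present = not active_on_frt    # RFP is lost by FRT x FRT recombination
    if lacz_present:
        return "blue"   # blue masks red, whether or not RFP is present
    if rfp_present:
        return "red"    # the desired class: acts on mFRT but not on FRT
    return "white"      # both reporters lost: acts on FRT and on mFRT


if __name__ == "__main__":
    for frt in (True, False):
        for mfrt in (True, False):
            print(f"FRT={frt!s:5} mFRT={mfrt!s:5} -> {colony_color(frt, mfrt)}")
```

Only one of the four combinations gives a red colony. A variant that behaves like wild-type Flp (active on FRT, inactive on mFRT) and a completely inactive variant both score blue, so the screen cannot tell these two classes apart.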

Changing the target specificity of Cre

To facilitate rapid screening for the desired variants, in this setup bacterial cells containing the library of variants were screened by fluorescence-based cell sorting. The sequence of the native Cre target site (called LoxP) and a mutant site M5 on which wild-type Cre does not act are shown in the figure below.

We wish to identify Cre variants that can recombine M5-LoxP sites. 

The reporter genes here are GFP and YFP (yellow fluorescent protein). The LoxP sites, either native or M5, are arranged in the reporter in inverted orientation (blue arrows). The promoter is placed adjacent to the LoxP site (or M5 site) proximal to the YFP gene. The GFP gene, in the opposite orientation, lacks a promoter. In the absence of Cre-mediated recombination, the cells will express only the YFP protein and will display yellow fluorescence (left panel of the figure below). If recombination occurs between the sites, the DNA sequence between the LoxP (or M5) sites will be inverted, and GFP will come into register with the promoter. Cells will now show both green and yellow fluorescence (right panel of the figure below). Cells with binary fluorescence can be separated from the yellow fluorescent cells, and further enriched by growing them to a suitable cell density and taking them through repeated rounds of recombinase induction, sorting, etc.

The advantage of the sorting method over the genetic screen is that the former is much faster and the cell populations covered are larger in number by two to three orders of magnitude. 

Making the genetic screen more robust

The genetic method can be made more efficient by using a reporter gene that can be selected against. For example, expression of the E. coli GalK gene is lethal in the presence of galactose when the strain carries a mutation in the GalE gene, which acts downstream of GalK. These genes code for enzymes involved in the utilization of galactose. The product of GalK action, galactose-1-phosphate (Gal-1-phosphate), is toxic to the cell if it is not metabolized further. Thus, a reporter carrying the GalK gene bordered by directly repeated recombination target sites offers a means of selecting for cells that have eliminated GalK by recombination. The cells carrying the reporter construct are grown in the presence of glucose (not galactose, which would cause toxicity), and following induction of the recombinase they are plated on galactose plates. Here a million or more cells can be plated on a single plate, as only the few cells that successfully performed recombination will be able to grow. The rest of the cell population will die because of galactose toxicity.

Structure-based mutagenesis

When detailed structural information on recombinase-DNA interaction is available, the directed evolution procedure can be refined by randomizing a selected set of amino acid residues that make direct contact with DNA bases or are located in close proximity to such residues. This approach is much more efficient than subjecting the entire protein to random mutagenesis. A selectively randomized library will have a much more complete representation of different amino acids at the relevant positions than a randomly mutagenized library, and is therefore likely to contain a significantly higher frequency of the desired recombinase variants.
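A rough count shows why restricting randomization to a few positions helps. The Python sketch below assumes that each chosen position is fully randomized over the 20 standard amino acids, and asks how many positions a library of a given size could cover if every combination appeared exactly once. That is an idealized bound: real libraries need several-fold oversampling and are skewed by codon redundancy. The function names are hypothetical.

```python
def protein_variants(positions, alphabet=20):
    """Number of distinct sequences when `positions` residues are randomized."""
    return alphabet ** positions


def max_fully_sampled_positions(library_size, alphabet=20):
    """Largest number of randomized positions whose every combination
    could, at best, be represented once in a library of `library_size`."""
    n = 0
    while alphabet ** (n + 1) <= library_size:
        n += 1
    return n


if __name__ == "__main__":
    for size in (10 ** 7, 10 ** 8):
        n = max_fully_sampled_positions(size)
        print(f"library of {size:.0e}: up to {n} positions "
              f"({protein_variants(n):,} combinations)")
```

Under these assumptions, a library of 10^7 clones can at best enumerate all combinations at five positions (3.2 million variants), which is why the choice of residues near the DNA contacts matters so much.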

Recombinases that utilize native genomic sequences as recombination target sites

The directed evolution strategy has been used with some degree of success to shift the specificity of a recombinase through stepwise changes, leading ultimately to specificity for a sequence that is already present within a genome. Achieving specificity for native sequences eliminates the problem of having to introduce the natural target site of the recombinase into the genome to be manipulated. Computer algorithms can scan through the sequence of a genome and identify sites that resemble the recombinase site in sequence as well as organization. The degree of resemblance can be ranked by a suitable scoring scheme. The high-ranking sites are likely to be more amenable to recognition by the recombinase (after directed evolution) than lower-ranking sites.

Once a potential target site is chosen, the stepwise evolution can be performed. First, one or a few changes corresponding to the native genomic sequence are incorporated into the target site, and recombinase variants specific for the partially altered site are selected. Additional substitutions are then introduced into this altered site, and the evolution-selection schemes are repeated. Thus, through multiple rounds of evolution, a recombinase variant is identified that can accept a genomic sequence as a recombination target site. By subjecting Flp to this evolutionary scheme, it has been possible to obtain a variant that can mediate recombination at a sequence contained within the human IL-10 gene. Similarly, a Cre variant capable of recombining a sequence within the LTR (long terminal repeat) of human immunodeficiency virus (HIV) has been evolved. Such recombinases could potentially be harnessed for cleansing the genome of integrated retroviruses.
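The genome-scanning step can be illustrated with a minimal Python sketch. It assumes a target site organized as two binding arms in inverted orientation flanking a spacer (the strand exchange region), and it scores each genomic window simply by counting arm positions that match the target, ignoring the spacer. Both the scoring scheme and the arm sequence used in the example are hypothetical. The arm is not the real FRT or LoxP sequence, and published scoring schemes may weight positions differently.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def score_site(window, arm, spacer_len):
    """Count arm positions in `window` that match the target organization:
    `arm`, then a spacer of any sequence, then the reverse complement of `arm`."""
    left = window[:len(arm)]
    right = window[len(arm) + spacer_len:]
    matches = sum(a == b for a, b in zip(left, arm))
    matches += sum(a == b for a, b in zip(right, reverse_complement(arm)))
    return matches


def rank_candidate_sites(genome, arm, spacer_len, top=5):
    """Return the `top` windows as (score, start, sequence), best first."""
    site_len = 2 * len(arm) + spacer_len
    scored = []
    for start in range(len(genome) - site_len + 1):
        window = genome[start:start + site_len]
        scored.append((score_site(window, arm, spacer_len), start, window))
    # Highest score first; ties are broken by position in the genome.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


if __name__ == "__main__":
    toy_arm = "GATTC"  # hypothetical 5-bp arm, for illustration only
    toy_genome = "TTTTGATTCAAAGAATCCCCC"
    for score, start, seq in rank_candidate_sites(toy_genome, toy_arm, 3, top=3):
        print(score, start, seq)
```

A window with fewer mismatches in the arms would need fewer rounds of stepwise evolution, which is the rationale for starting from the highest-ranking candidates.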