Journal of Cell Science 105, 481-488 (1993)
Printed in Great Britain © The Company of Biologists Limited 1993

Use of a general purpose mammalian expression vector for studying intracellular targeting: identification of critical residues in the nuclear lamin A/C nuclear localization signal

John V. Frangioni* and Benjamin G. Neel
Molecular Medicine Unit, Beth Israel Hospital, Boston, MA 02215, USA
*Author for correspondence

SUMMARY

We have constructed a general purpose mammalian expression vector for the study of intracellular protein targeting. The vector, p3PK, facilitates construction of N- and/or C-terminal fusions of an amino acid sequence of interest to the normally cytosolic protein chicken muscle pyruvate kinase (CMPK). The vector has been engineered such that any fusion construct can be subcloned into the versatile pJxW family of mammalian expression vectors and into pGEX bacterial expression vectors, for the generation of affinity reagents. In this paper, we demonstrate the general utility of p3PK by redirecting CMPK to mitochondria (using the twelve amino acid pre-sequence of yeast cytochrome c oxidase subunit IV) and to the nucleus (using a putative eight amino acid nuclear localization signal from human nuclear lamins A and C). We also report that, contrary to the predictions of previously published work, substitution of a critical residue in the nuclear lamin A/C nuclear localization signal (the equivalent of lysine 128 in the SV40 large T nuclear localization signal) retains nuclear localization, and discuss how amino acid context might affect targeting to the nucleus.

Key words: intracellular protein targeting, mammalian expression vector, mitochondrial import signal, nuclear localization signal, nuclear lamins

INTRODUCTION

Proper cellular function depends on protein targeting to specific subcellular locations (reviewed in von Heijne, 1990). In most cases, a short sequence of amino acids serves as a signal to direct a protein within the cell. Amino acid sequences that target proteins to the nucleus (Kalderon et al., 1984b; reviewed by Chelsky et al., 1989), the cytosolic side of the endoplasmic reticulum (Frangioni et al., 1992), Golgi apparatus (Swift and Machamer, 1991), lysosomes (Kornfeld, 1990) and mitochondria (Hurt et al., 1985) have been described. To demonstrate that a particular amino acid sequence directs the intracellular targeting of a protein, one must prove that it is both necessary and sufficient. A sequence is considered necessary if its removal or mutation disrupts proper targeting. To be considered sufficient, a sequence must be capable of redirecting a heterologous protein to the subcellular compartment of interest.

Two basic strategies have been employed to prove sufficiency: microinjection of a carrier protein covalently coupled to a peptide containing the putative targeting sequence (Goldfarb et al., 1986; Lanford et al., 1986); and transfection of an expression vector containing the putative targeting sequence fused in-frame to a heterologous protein (Kalderon et al., 1984b). A commonly employed heterologous protein for the latter strategy is chicken muscle pyruvate kinase (CMPK; Lonberg and Gilbert, 1983). This completely soluble, normally cytosolic protein is ideal for such studies, and a number of mammalian expression vectors containing it have been constructed (Kalderon et al., 1984b; Roberts et al., 1987; Dingwall et al., 1988; Gao and Knipe, 1992).

Unfortunately, none of these vectors offers both flexible cloning and versatile control. To circumvent these problems, we have engineered the mammalian expression vector p3PK. This vector contains cloning sites for construction of N- and/or C-terminal fusions of putative targeting sequences to CMPK, and permits subcloning in cassette fashion to expression vectors controlled by different promoters. Since it is often desirable to use an identified targeting sequence as an affinity matrix for purifying antibodies and/or binding proteins, p3PK has also been designed such that targeting sequences fused to the C-terminus of CMPK can be rapidly subcloned into the pGEX bacterial protein purification system (Smith and Johnson, 1988). We demonstrate the versatility of p3PK by targeting CMPK to various organelles, and by using it to study the sequence requirements of the human nuclear lamin A/C nuclear localization signal.

MATERIALS AND METHODS

Chemicals and reagents
Restriction endonucleases and other molecular reagents were purchased from Gibco-BRL. All other reagents were purchased from Sigma. Oligonucleotides were synthesized on a Milligen Cyclone Plus DNA synthesizer.

Construction of p3PK
The vector p3PK was designed to express amino acids 17 to 476 of chicken muscle pyruvate kinase (CMPK; Lonberg and Gilbert, 1983) under control of the SV40 promoter. It was constructed by digesting the plasmid pJ3W (Morgenstern and Land, 1990; ATCC catalog no. 37719) with HindIII and BamHI. A three-way ligation was then performed with the digested plasmid, a double-stranded oligonucleotide (d.s. oligo) adapter, and the BglII/BamHI fragment from the plasmid PK10/8 (Gao and Knipe, 1992). The d.s. oligo adapter contained a HindIII site, a consensus sequence for translation initiation (Kozak/Start site; Kozak, 1986), and a BglII site. It was formed by annealing two overlapping oligonucleotides:

5' AGCTTACCATGGGTA 3'
3' ATGGTACCCATCTAG 5'

The plasmid PK10/8 was a generous gift from Dr Min Gao (Bristol-Myers Squibb) and Dr David M. Knipe (Harvard Medical School). The BglII/BamHI fragment from this plasmid codes for amino acids 17-476 of chicken muscle pyruvate kinase (CMPK; Lonberg and Gilbert, 1983).

An in-frame stop codon for CMPK was added to the recombinant plasmid by digesting with BamHI and EcoRI, and ligating an overlapping d.s. oligo adapter containing a BamHI site, an in-frame stop codon, an XbaI site for screening recombinants, and an EcoRI site:

5' GATCCTGATGATCTAGAG 3'
3' GACTACTAGATCTCTTAA 5'

The resulting plasmid contained two BglII sites, one of which would preclude cloning of N-terminal fusions. To remove the unwanted site, the plasmid was cut with limiting concentrations of BglII, treated with the Klenow fragment of DNA polymerase, and ligated back to itself. All constructions were verified by dideoxynucleotide fluorescence sequencing using an Applied Biosystems Model 373A fluorescence DNA sequencer.

Cloning of putative targeting sequences into p3PK
The plasmid p3PK/MtLS was designed to contain the first twelve amino acids of the presequence of yeast cytochrome c oxidase subunit IV (Hurt et al., 1985) fused to the N-terminus of CMPK. It was constructed by digesting p3PK with HindIII and BglII, and ligating an overlapping d.s. oligo adapter containing (5' to 3'): a HindIII site, an EcoRI site for screening recombinants, a Kozak/Start site, the putative targeting sequence, and a BglII site. The two oligonucleotides used to form the adapter were:

5' AGCTTGAATTCACCATGCTTTCACTACGTCAATCTATAAGATTTTTCAAGA 3'
3' ACTTAAGTGGTACGAAAGTGATGCAGTTAGATATTCTAAAAAGTTCTCTAG 5'

The plasmids p3PK/NL, p3PK/K1→L1 and p3PK/K2→L2 contained wild type and mutant forms of the putative nuclear lamin A/C nuclear localization signal (NLS) fused to the C-terminus of CMPK. They were constructed by digesting p3PK with BamHI and EcoRI and ligating an overlapping d.s. oligo adapter containing (5' to 3'): a BamHI site, the NLS (nucleotides 1456-1479 of human nuclear lamin A; McKeon et al., 1986), an in-frame stop codon, a HindIII site for screening recombinants, and an EcoRI site. Plasmid p3PK/NL codes for the wild type NLS and used the following overlapping d.s. oligos:

5' GATCCACCAAAAAGCGCAAACTGGAGTCCTGAAAGCTTG 3'
3' GTGGTTTTTCGCGTTTGACCTCAGGACTTTCGAACTTAA 5'

Plasmid p3PK/K1→L1, coding for a substitution in the nuclear lamin NLS (see below), used:

5' GATCCACCCTGAAGCGCAAACTGGAGTCCTGAAAGCTTG 3'
3' GTGGGACTTCGCGTTTGACCTCAGGACTTTCGAACTTAA 5'

Plasmid p3PK/K2→L2, coding for a substitution in the nuclear lamin NLS (see below), used:

5' GATCCACCAAACTTCGCAAACTGGAGTCCTGAAAGCTTG 3'
3' GTGGTTTGAAGCGTTTGACCTCAGGACTTTCGAACTTAA 5'

Cell culture and transfection
Cell culture of HeLa cells was carried out as previously described (Frangioni et al., 1992). DNA for transfection was prepared by alkaline lysis and CsCl banding as described (Ausubel et al., 1987). Twenty-four hours prior to transfection, 1.5 × 10^5 cells in 2 ml of medium were plated onto sterile coverslips in 35 mm plastic Petri dishes. Cells were transfected using the modified calcium phosphate precipitation method (Chen and Okayama, 1987). Briefly, 4 µg plasmid DNA (per 35 mm dish) was mixed in 100 µl of 0.25 M CaCl2. Then 100 µl of 2× BBS (50 mM N,N-bis(2-hydroxyethyl)-2-aminoethanesulfonic acid, pH 6.95, 280 mM NaCl, 1.5 mM Na2HPO4) was added to the mixture over 30 seconds, and the precipitate was allowed to form for 20 minutes at room temperature. The total solution (200 µl) was applied directly to the cells, which were incubated for 16 hours at 37°C and 3% CO2, washed 3 times, re-fed with fresh medium, and allowed to incubate for an additional 32 hours at 37°C and 10% CO2, before fixation as described below.

Affinity purification of polyclonal antibodies
Whole serum from rabbits immunized with chicken muscle pyruvate kinase (Sigma, catalog no. P-6406) was the generous gift of Dr Morris Birnbaum (Harvard Medical School). Antibodies specific for CMPK were affinity-purified using a glutathione S-transferase (GST)/CMPK fusion protein covalently coupled to Affi-Gel 15 (Bio-Rad). The details of the purification and Affi-Gel coupling of GST/CMPK are described elsewhere (Frangioni and Neel, 1993). An 8 ml sample of whole rabbit serum was immunodepleted of GST reactive antibodies by incubation with an Affi-Gel 15 GST column (4 mg GST protein, 0.5 ml bed volume) for 1 hour at 4°C, with end-over-end rotation of the column. The flow-through was collected into a GST/CMPK column (1 mg protein, 0.5 ml bed volume) and incubated for 1 hour at 4°C. Bound proteins were washed with 40 bed volumes of PBS (8.4 mM Na2HPO4, 1.9 mM NaH2PO4, pH 7.4, 150 mM NaCl), 10 bed volumes of PBS adjusted to 500 mM NaCl, 20 bed volumes of PBS, 10 bed volumes of 0.1% NP-40 in PBS, and finally 30 bed volumes of PBS. Specific antibody was eluted with 100 mM glycine (pH 2.5), and 250 µl fractions were collected into tubes containing 1/10th volume 1 M Tris, pH 8.5. Pooled IgG fractions were analyzed by SDS-PAGE, supplemented with 100 µg/ml BSA and 0.2% azide, and stored at 4°C. Affi-Gel columns were stripped with 100 mM glycine, pH 2.2, re-equilibrated in PBS and stored at 4°C using 0.2% azide as a preservative.

Indirect immunofluorescence
Cells were washed gently 3 times with PBS at 37°C, fixed with 2% paraformaldehyde in PBS (pH 7.4) for 10 minutes at 37°C, washed once with PBS, quenched for 5 minutes at room temperature (RT) with 50 mM glycine in PBS (pH 7.4), washed once with PBS, and permeabilized with 0.1% NP-40 in PBS. All subsequent incubations and washes were carried out in 0.1% NP-40/PBS to reduce non-specific background. Blocking was

achieved by incubation for 30 minutes at RT with 0.3% bovine serum albumin/0.1% NP-40/PBS. Cells were incubated with primary antibodies for 60 min at RT, washed 4 times with 0.1% NP-40/PBS, incubated with secondary antibodies for 45 min at RT, washed 3 times with 0.1% NP-40/PBS, washed twice with PBS, and mounted on glass slides with Mowiol (Calbiochem) containing 2.5% DABCO (Kodak) as described (Harlow and Lane, 1988). Affinity-purified anti-CMPK antibodies were used at a concentration of 1 µg/ml. Anti-mitochondria antibody mAb1273 (Chemicon, Inc.) was used at a dilution of 1:200. Secondary antibodies, FITC-conjugated goat anti-mouse (Tago) or TRITC-conjugated goat anti-rabbit (Tago), were used at a dilution of 1:350. Cells were photographed on an Olympus BH-2 microscope with Ektar 125 color film (Kodak) as described (Frangioni et al., 1992).

Fig. 1. The p3PK mammalian expression vector. The general structure of the vector is shown at the top of the figure. The DNA sequence for amino acids 17-476 of chicken muscle pyruvate kinase (CMPK) has been inserted into the modified polylinker of the parent vector pJ3W (Morgenstern and Land, 1990). Transcriptional control of p3PK is mediated by the SV40 early promoter, and polyadenylation by an SV40 polyadenylation signal (SV40 Poly A). Note the presence of a splice donor/acceptor pair consisting of the SV40 IVS (Splice donor/acceptor). p3PK also contains an SV40 origin of replication (SV40 ori/Early promoter), a pBR322 origin of replication (pBR ori) and the gene for ampicillin resistance (Ampr). Unique restriction endonuclease sites are shown, as are the detailed sequences of the 5' and 3' polylinkers. Kozak/Start specifies a consensus start sequence (ACCATG; Kozak, 1986). Predicted amino acids are shown by a three letter code, under which are their numbered positions in the wild type CMPK molecule.

RESULTS

The p3PK mammalian expression vector
The p3PK vector, shown in Fig. 1, was constructed from the plasmids pJ3W and PK10/8 as described in Materials and Methods. This 4.9 kb mammalian expression vector expresses amino acids 17 to 476 of chicken muscle pyruvate kinase (CMPK) under the control of the SV40 early promoter. An SV40 polyadenylation site within the parent vector pJ3W provides the signal for polyadenylation. Replication of the plasmid in bacteria is mediated by a pBR origin, and ampicillin resistance is conferred by the Ampr gene. Replication of p3PK in SV40 large T-containing mammalian cells is mediated by the SV40 origin of replication.

Short oligonucleotide adapters were inserted at each end of the CMPK cDNA (Fig. 1) to create multiple cloning sites. These adapters also provide a translation start signal (Kozak/Start site; Kozak, 1986) at the N-terminus of the CMPK cDNA, and a translation stop site at its C-terminus. The additional amino acids generated by these adapters have no effect on the expected subcellular localization of CMPK (see below). The multiple cloning sites can be used to generate fusions of a putative targeting sequence with the N- and/or C-terminus of CMPK. Although three restriction sites (XbaI, EcoRI, and ClaI) are available for use as the 3' cloning site, we strongly recommend that the EcoRI site be used whenever possible in order to preserve the cassette features of p3PK (see below). Fig. 1 also shows the relative position of a unique SalI restriction site (amino acids 215-216 of CMPK), which can be used to insert targeting sequences into the middle of the CMPK molecule (discussed below).

Cassette structure of p3PK
CMPK/targeting sequence (TS) fusion genes constructed with p3PK can be subcloned in one step into several other useful vectors. For example, Fig. 2A demonstrates how different combinations of fusion genes can be subcloned into any of the pJxW (Morgenstern and Land, 1990) family of mammalian expression vectors. The pJxW vectors (ATCC catalog nos 37719-37724) differ only in the promoter/enhancer combination controlling expression of the gene of interest (Fig. 2A, inset). To subclone from p3PK into a pJxW vector, the CMPK/TS fusion gene is excised from p3PK as a single unit by digestion with HindIII and EcoRI and ligated to a similarly digested pJxW family member. This swapping procedure can be accomplished for TS fused at the N- and/or C-terminus of CMPK. We use the simple nomenclature of pxPK, where x is the pJxW vector into which the p3PK fusion gene has been cloned (e.g. CMPK alone swapped into pJ4W creates p4PK, CMPK/TS swapped into pJ5eW creates p5ePK/TS, etc.).

C-terminal CMPK fusions can be directly subcloned into the pGEX-1λT and pGEX-2T bacterial fusion protein expression vectors (Smith and Johnson, 1988). As shown in Fig. 2B, the TS sequence alone can be cloned directly

Fig. 2. Cassette structure of p3PK. (A) Putative targeting sequences (TS) can be cloned at the N- and/or C-termini of CMPK. Shown are the possible combinations of TS and CMPK. The resultant fusion gene can be excised by digestion with HindIII and EcoRI and then directly subcloned into any one of the pJxW mammalian expression vectors. The pJxW vectors and their corresponding promoters are shown in the framed insert. The thick arrow shows the direction of transcription/translation. (B) A targeting sequence (TS) fused in-frame to the C-terminus of CMPK can be directly subcloned into the bacterial expression vectors pGEX-1λT or pGEX-2T. This is accomplished by digesting the pGEX plasmid with BamHI and EcoRI, and digesting the p3PK/TS plasmid with BglII and EcoRI. Ligation of these two fragments yields the recombinant pGEX plasmid. Alternatively, the TS alone (flanked by BamHI and EcoRI ends) can be subcloned into the pGEX plasmid. *T denotes a thrombin proteolytic cleavage sequence in the adapter between GST and inserted amino acid sequence (see text). The thick arrow shows the direction of transcription/translation.
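The BglII-to-BamHI step in (B) works because the two enzymes leave identical cohesive ends. The following is a minimal Python sketch of that reasoning, not a description of any software used in this work. The recognition sites and cut positions are the standard ones for these enzymes; the function names and the dictionary layout are illustrative assumptions.

```python
# Recognition sequences (5'->3', top strand) and the index after which the
# enzyme cuts that strand. All four sites are palindromic, so the bottom
# strand is cut at the mirror-image position.
ENZYMES = {
    "HindIII": ("AAGCTT", 1),
    "BglII": ("AGATCT", 1),
    "BamHI": ("GGATCC", 1),
    "EcoRI": ("GAATTC", 1),
}


def overhang(enzyme):
    """Return the single-stranded 5' overhang left by the enzyme."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """Two ends can anneal when their overhangs match.

    Plain equality is sufficient here only because each of these overhangs
    is its own reverse complement (true for the palindromic sites above).
    """
    return overhang(enzyme_a) == overhang(enzyme_b)


def ligated_junction(vector_enzyme, insert_enzyme):
    """Top-strand sequence formed where a vector end meets an insert end."""
    vector_site, vector_cut = ENZYMES[vector_enzyme]
    insert_site, insert_cut = ENZYMES[insert_enzyme]
    return vector_site[:vector_cut] + insert_site[insert_cut:]


if __name__ == "__main__":
    for name in ENZYMES:
        print(f"{name:8s} 5'-{overhang(name)}-3'")
    print("BglII/BamHI compatible:", compatible("BglII", "BamHI"))
    # pGEX is cut with BamHI, the p3PK/TS insert with BglII (Fig. 2B).
    junction = ligated_junction("BamHI", "BglII")
    recut = [n for n, (site, _) in ENZYMES.items() if site == junction]
    print("junction:", junction, "| recut by:", recut or "none of the above")
```

In this sketch the hybrid junction is no longer a recognition site for either enzyme, which is worth keeping in mind when planning later screening digests.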

into the pGEX vectors at the same time it is cloned into p3PK, by digesting the pGEX plasmid with BamHI and EcoRI. Alternatively, the entire CMPK/TS fusion gene can be cloned into the pGEX plasmid by digesting the p3PK construct with BglII and EcoRI, and ligating the excised fusion gene to pGEX digested with BamHI and EcoRI. Subcloning from p3PK into pGEX retains the thrombin proteolytic site between GST and the insert. This allows the TS to be cleaved away from GST and purified separately.

Redirection of CMPK to mitochondria
To test the utility of p3PK in studying intracellular protein targeting, we asked whether a sequence previously shown to be a mitochondrial import signal in yeast could function in mammalian cells. DNA coding for the twelve amino acid pre-sequence from yeast cytochrome c oxidase subunit IV was cloned into p3PK as described in Materials and Methods. The resultant plasmid, which encodes a yeast mitochondrial import signal (MLSLRQSIRFFK) fused to the N-terminus of CMPK, was named p3PK/MtLS. HeLa cells, transiently transfected with p3PK or p3PK/MtLS, were fixed 48 hours post-transfection and processed for indirect immunofluorescence as described in Materials and Methods. Cells overexpressing CMPK were easily distinguished from untransfected, non-overexpressing cells due to the high signal to background ratio. Fig. 3 (column 1, row 1) displays the typical immunofluorescence pattern seen when CMPK alone was overexpressed in HeLa cells. Comparison of the fluorescence micrograph to the phase contrast image of the same field (Fig. 3, column 1, row 3) revealed that the signal was diffusely distributed throughout the cytoplasm and excluded the nucleus. However, when the putative twelve amino acid mitochondrial localization signal fused to the N-terminus of CMPK was overexpressed in HeLa cells, CMPK was seen to have a restricted subcellular localization (Fig. 3, column 2, row 1). The staining pattern excluded the nucleus, and within the cytoplasm had a tubular and punctate appearance consistent with localization to mitochondria. Fig. 3 (column 2, row 2) displays the same field stained with a mitochondria-specific monoclonal antibody and confirms localization of CMPK to the mitochondria. Control cells were transfected with p3PK/MtLS and incubated with each primary antibody separately, and both secondary antibodies together. Under these conditions, there was no detectable cross-reactivity of the secondary antibodies (data not shown).

Redirection of CMPK to the nucleus
A putative 8 amino acid nuclear localization signal from human nuclear lamins A and C was cloned into p3PK as described in Materials and Methods. The resultant vector, containing the wild type sequence TKKRKLES fused to the C-terminus of CMPK, was designated p3PK/NL. HeLa cells transiently transfected with p3PK/NL were fixed at 48 hours post-transfection and processed for indirect immunofluorescence as described in Materials and Methods. Fig. 3 (column 3, row 1) displays the resulting immunofluorescence pattern. By comparison of this fluorescence micrograph to the phase contrast image of the same field (Fig. 3, column 3, row 3), it was evident that CMPK localized

exclusively to the nucleus. Only in extremely high level overexpressors was any CMPK signal detectable in the cytoplasm (data not shown).

Fig. 3. Use of p3PK for characterizing putative intracellular targeting sequences. HeLa cells were fixed at 48 hours after transfection with the indicated plasmids and processed for indirect immunofluorescence as described in Materials and Methods. Transfected plasmids are shown at the top of each column and are labeled p3PK (CMPK protein only), p3PK/MtLS (the twelve amino acid pre-sequence of yeast cytochrome c oxidase subunit IV, MLSLRQSIRFFK, fused in-frame to the N-terminus of CMPK) and p3PK/NL (a putative eight amino acid nuclear localization sequence for human nuclear lamins A/C, TKKRKLES, fused in-frame to the C-terminus of CMPK). The top row displays the staining pattern observed with rabbit anti-CMPK affinity purified antibody (α-CMPK). The middle row displays the staining pattern observed with mouse anti-mitochondria monoclonal antibody 1273 (mAb1273). The bottom row displays the phase-contrast image of the same field (Phase).

Mutations affecting nuclear localization
Based on a previously proposed consensus sequence for nuclear localization signals (Chelsky et al., 1989), single point mutations were made in presumed critical residues of the nuclear lamin A/C nuclear localization signal (see below for discussion). One mutation substituted the first lysine with a leucine residue, changing the wild type sequence from TKKRKLES to TLKRKLES. The plasmid containing this mutant NLS fused to the C-terminus of CMPK was constructed as described in Materials and Methods and was labeled p3PK/K1→L1. As shown in Fig. 4 (column 2, row 1), this mutation did not prevent accumulation of CMPK in the nucleus, although a very slight additional cytoplasmic staining was seen in some cells. For comparison, the staining pattern seen with the wild type NLS/CMPK fusion is shown (Fig. 4, column 1, row 1). A second mutant NLS, in which the second lysine was substituted with leucine (TKLRKLES), was cloned into p3PK to generate the plasmid p3PK/K2→L2. When this construct was overexpressed in HeLa cells (Fig. 4, column 3, row 1), CMPK no longer localized to the nucleus, and exhibited a staining pattern identical to that seen with CMPK alone (compare to Fig. 3, column 1, row 1).

DISCUSSION

Using the mammalian expression vector p3PK, putative targeting sequences can be cloned as in-frame fusions to the N- and/or C-termini of the normally cytosolic protein chicken muscle pyruvate kinase (CMPK). There are two general strategies for constructing the desired fusion gene: PCR; and overlapping double-stranded oligonucleotide (d.s. oligo) adapters. The use of PCR in generating CMPK fusion genes, flanked by appropriate restriction sites, has been detailed previously (Frangioni et al., 1992). In general, PCR is the method of choice to clone sequences longer than 25 amino acids, while d.s. oligo adapters offer rapid cloning of shorter sequences. In designing the insert, one must be sure to add a HindIII site, a Kozak/start sequence, and a BglII site for N-terminal fusions, and a BamHI site, a stop codon, and an EcoRI site for C-terminal fusions (see Fig. 1). For PCR, these sequences are engineered into the primers. For both PCR and d.s. oligos, it is helpful to

Fig. 4. Critical amino acid residues in the human nuclear lamin A/C nuclear localization signal. HeLa cells were fixed at 48 hours after transfection and processed for indirect immunofluorescence as described in Materials and Methods. Mutations in the nuclear localization signal (NLS) for human nuclear lamins A and C were constructed as described in the text. Transfected plasmids are shown at the top of each column and are labeled p3PK/NL (wild type nuclear lamin NLS, TKKRKLES), p3PK/K1→L1 (first lysine changed to leucine, TLKRKLES) and p3PK/K2→L2 (second lysine changed to leucine, TKLRKLES). The top row displays the staining pattern observed with rabbit anti-CMPK affinity purified antibody (α-CMPK). The bottom row displays the phase-contrast image of the same field (Phase).
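As a rough cross-check of the sequences named in this caption, the sketch below translates the top strands of the three C-terminal adapters listed in Materials and Methods and compares the resulting peptides with the consensus K R/K X R/K of Chelsky et al. (1989). It is only an illustration under stated assumptions: the reading frame is assumed to start immediately after the five-base BamHI overhang (GATCC), the set of residues treated as hydrophobic in the XKRK pattern is an illustrative choice, and neither pattern is a validated predictor of localization.

```python
import re
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate 5'->3' DNA up to (not including) the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Top strands of the adapters as printed in Materials and Methods.
ADAPTERS = {
    "p3PK/NL": "GATCCACCAAAAAGCGCAAACTGGAGTCCTGAAAGCTTG",
    "p3PK/K1->L1": "GATCCACCCTGAAGCGCAAACTGGAGTCCTGAAAGCTTG",
    "p3PK/K2->L2": "GATCCACCAAACTTCGCAAACTGGAGTCCTGAAAGCTTG",
}
FRAME_OFFSET = 5  # assumption: coding frame begins right after GATCC

# K R/K X R/K with X = K, R, P, V or A (Chelsky et al., 1989).
CHELSKY = re.compile(r"K[RK][KRPVA][RK]")
# Hypothetical XKRK pattern; the hydrophobic set below is an assumption.
HYDROPHOBIC_KRK = re.compile(r"[LIVMFA]KRK")

PEPTIDES = {name: translate(seq[FRAME_OFFSET:]) for name, seq in ADAPTERS.items()}

if __name__ == "__main__":
    for name, peptide in PEPTIDES.items():
        print(
            f"{name:12s} {peptide}",
            "| Chelsky consensus:", bool(CHELSKY.search(peptide)),
            "| XKRK:", bool(HYDROPHOBIC_KRK.search(peptide)),
        )
```

Under these assumptions only the wild type peptide matches the Chelsky consensus, while the K1→L1 peptide matches the hypothetical XKRK pattern and the K2→L2 peptide matches neither, in line with the localizations reported in Table 1.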

Table 1. Comparison of human nuclear lamin A/C wild type and mutant nuclear localization sequences with SV40 large T and proposed consensus sequences

Type                       Amino acids   Sequence      Localization
SV40 large T antigen       127-134       KKKRKVED      N
Human nuclear lamin A/C    416-423       TKKRKLES      N
K1→L1 mutant lamin A/C     -             TLKRKLES      N
K2→L2 mutant lamin A/C     -             TKLRKLES      C
Consensus sequence*                      K R/K X R/K

Amino acids specify position within the wild type molecule. Residues conserved in wild type nuclear lamin and SV40 large T sequences are shown in boldface. Intracellular localization is either N (nuclear) or C (cytoplasmic). *Consensus sequence for a nuclear localization signal as proposed by Chelsky et al. (1989), where X is either K, R, P, V or A.

include an additional restriction site (e.g. EcoRI for N-terminal fusions, HindIII for C-terminal fusions) to assist with the screening of recombinants (see above for examples). Without such a screening site, it may be difficult to resolve short inserts.

Although not used in this paper, p3PK contains a unique SalI site which permits cloning of targeting sequences into the middle of the CMPK molecule. The SalI site, corresponding to amino acids 215-216 of CMPK, is contained within a slightly hydrophobic region of the molecule (data not shown). Negative results obtained using internal fusions, or for that matter N- or C-terminal fusions, must be interpreted with caution since flanking amino acids can affect the functioning of a targeting sequence (Roberts et al., 1987; Gao and Knipe, 1992).

The mammalian expression vector p3PK has been engineered to permit one step subcloning into the pJxW (Morgenstern and Land, 1990) family of expression vectors, thus offering a total of six different promoter/enhancer combinations for controlling CMPK expression. Expression levels of CMPK from p3PK (SV40 promoter) may not be high enough for immunodetection in some cell types. More often, though, overexpression of a fusion protein will saturate the subcellular compartment of interest, resulting in ambiguous results. Expression from the vectors pJ5W and pJ5eW is under control of the glucocorticoid-inducible mouse mammary tumor virus (MMTV) promoter. pJ5eW offers slightly higher induction than pJ5W since it contains, in addition, a murine sarcoma virus (MSV) enhancer. By swapping the fusion gene of interest from p3PK into pJ5W (or pJ5eW), protein expression levels can be modulated by glucocorticoid treatment. Although saturation was not encountered in this study, we expect that it will be a common problem which can potentially be solved by proper promoter choice. When interpreting results, though, one must also consider that immunodetection requires a threshold concentration of CMPK molecules. Although we have shown that CMPK can be used to visualize mitochondria (Fig. 3, column 2), the cytosolic face of the endoplasmic reticulum (Frangioni et al., 1992), and the nucleus (Fig. 3, column 3), saturation of other subcellular compartments may occur before the critical CMPK concentration is reached.

p3PK has also been designed to permit one step subcloning into the pGEX bacterial fusion protein vectors (Smith and Johnson, 1988). The pGEX plasmids express protein sequences of interest as C-terminal fusions to glutathione S-transferase (GST), which can be purified from crude bacterial lysates by glutathione agarose chromatography (Smith and Johnson, 1988; Frangioni and Neel, 1993). The GST fusion proteins can then be used to purify antibodies against the protein sequence of interest, and/or for probing mammalian cellular lysates for targeting sequence binding proteins (Kaelin et al., 1991). A GST fusion protein containing just the targeting sequence (TS) has the advantages of being structurally compact and potentially soluble. A GST fusion protein containing the entire CMPK/TS fusion gene has the advantage that the TS is expressed in the same context (CMPK) as that which functioned intracellularly. A significant disadvantage, however, is that GST/CMPK fusion proteins are extremely insoluble when expressed in bacteria and require special conditions for solubilization and purification (Frangioni and Neel, 1993). For studies utilizing GST fusions to isolate target sequence binding proteins, a non-functioning point mutant (e.g. K2→L2 mutation, see also Kaelin et al., 1991) is helpful in identifying proteins that bind to the targeting sequence but which do not serve a targeting function.

In this study, we have shown the general usefulness of p3PK by redirecting CMPK to various intracellular compartments. The first twelve amino acids of the yeast cytochrome c oxidase subunit IV gene have previously been shown to be sufficient for import of a heterologous protein (dihydrofolate reductase) into the yeast mitochondrial matrix (Hurt et al., 1985). Our data suggest that this pre-sequence can, at the very least, be recognized by the translocation machinery of mammalian mitochondria, and cause co-localization of CMPK with mitochondria. However, it remains to be seen whether the yeast sequence is sufficient to direct translocation of CMPK across the outer and inner membranes of mammalian mitochondria.

We also targeted CMPK to the nucleus by constructing a C-terminal fusion with a putative 8 amino acid nuclear localization signal (NLS) from human nuclear lamins A/C. This sequence (TKKRKLES) contains a high degree of homology to the SV40 NLS (Table 1), and deletion of this region prevents nuclear localization of the lamins (Loewinger and McKeon, 1988). We provide the first direct proof that this sequence is sufficient for nuclear localization of a heterologous fusion protein. Moreover, a longer sequence from nuclear lamins A/C (SVTKKRKLE), when conjugated to human serum albumin, has recently been shown to promote nuclear import in an assay used to identify cytosolic factors involved in this process (Moore and Blobel, 1992).

Table 1 displays a comparison of the wild-type human nuclear lamin A/C nuclear localization signal with that of SV40 large T. Previous work has shown that substitution of lysine 128 of the SV40 large T NLS with non-basic, non-hydrophobic amino acids abolishes nuclear localization (Lanford et al., 1986; Kalderon et al., 1984a; Lanford et al., 1988). A previous analysis of published NLS sequences concluded that nuclear localization signals conform to the consensus sequence K R/K X R/K, where R/K specifies either R or K at that position, and X is either K, R, P, V, or A (Chelsky et al., 1989; Table 1). However, we found indirect evidence in the literature that substitution of a hydrophobic amino acid at the critical first lysine of the sequence KKRK might permit at least partial functioning of a nuclear localization signal (Loewinger and McKeon, 1988).

To test the hypothesis that a hydrophobic substitution of the first lysine would retain CMPK nuclear targeting, we fused the mutant sequence (TLKRKLES; Table 1) to the C-terminus of CMPK. As shown above (Fig. 4, column 2), this mutant sequence was virtually indistinguishable from the wild type sequence in terms of its nuclear targeting. Changing the second lysine to leucine (TKLRKLES) resulted in complete loss of nuclear localization (Fig. 4 and Table 1). It will now be possible to test the hypothesis that an alternative minimal nuclear localization sequence has the form XKRK, where X is a hydrophobic amino acid. Due to the importance of protein context on the functioning of NLSs (see above and Roberts et al., 1987; Gao and Knipe, 1992), results such as those obtained using isolated targeting sequences must ultimately be confirmed by mutation of the whole parent molecule.

In this paper, we have described strategies that should facilitate the identification of both amino acid residues critical for targeting and proteins that interact with targeting sequences. As more investigators use p3PK in their studies, we hope that a collection of p3PK vectors that target CMPK to various intracellular compartments will be formed. These vectors, all expressing the same heterologous protein (CMPK), will then form a resource for the characterization of new putative targeting sequences.

We thank Dr Min Gao (Bristol-Myers Squibb) and Dr David M. Knipe (Harvard Medical School) for their generous gift of plasmid PK10/8 and for many helpful discussions. We thank Dr Morris Birnbaum for the generous gift of anti-CMPK whole rabbit serum, and Dr Lan Bo Chen (Dana-Farber Cancer Institute) for mAb1273. We thank Drs Brian Burke, Frank McKeon and Morris Birnbaum (Harvard Medical School) for critical reading of this manuscript. We thank Ms Maureen Magane and Ms Celia Mokalled for administrative assistance. This work was funded by NIH Grant no. R01-CA-49152 (BGN) and a grant from the Rowland Foundation (JVF). JVF is a Markey Fellow in Developmental Biology at Harvard Medical School.

The vector p3PK has been deposited with the ATCC, and is available as catalog no. 77314.

REFERENCES

Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. D., Seidman, J. G. and Smith, J. A. (1987). Current Protocols in Molecular Biology. Boston: Massachusetts General Hospital.
Chelsky, D., Ralph, R. and Jonak, G. (1989). Sequence requirements for synthetic peptide-mediated translocation to the nucleus. Mol. Cell. Biol. 9, 2487-2492.

Chen, C. and Okayama, H. (1987). High-efficiency transformation of mammalian cells by plasmid DNA. Mol. Cell. Biol. 7, 2745-2752.
Dingwall, C., Robbins, J., Dilworth, S. M., Roberts, B. and Richardson, W. D. (1988). The nucleoplasmin nuclear location sequence is larger and more complex than that of SV-40 large T antigen. J. Cell Biol. 107, 841-849.
Frangioni, J. V., Beahm, P. H., Shifrin, V., Jost, C. A. and Neel, B. G. (1992). The nontransmembrane tyrosine phosphatase PTP-1B localizes to the endoplasmic reticulum via its 35 amino acid C-terminal sequence. Cell 68, 545-560.
Frangioni, J. V. and Neel, B. G. (1993). Solubilization and purification of enzymatically active glutathione S-transferase (pGEX) fusion proteins. Anal. Biochem. (in press).
Gao, M. and Knipe, D. M. (1992). Distal protein sequences can affect the function of a nuclear localization signal. Mol. Cell. Biol. 12, 1330-1339.
Goldfarb, D. S., Gariepy, J., Schoolnik, G. and Kornberg, R. D. (1986). Synthetic peptides as nuclear localization signals. Nature 322, 641-644.
Harlow, E. and Lane, D. (1988). Antibodies: A Laboratory Manual. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory.
Hurt, E. C., Pesold-Hurt, B., Suda, K., Oppliger, W. and Schatz, G. (1985). The first twelve amino acids (less than half of the pre-sequence) of an imported mitochondrial protein can direct mouse cytosolic dihydrofolate reductase into the yeast mitochondrial matrix. EMBO J. 4, 2061-2068.
Kaelin, W. G., Pallas, D. C., DeCaprio, J. A., Kaye, F. J. and Livingston, D. M. (1991). Identification of cellular proteins that can interact specifically with the T/E1A-binding region of the retinoblastoma gene product. Cell 64, 521-532.
Kalderon, D., Richardson, W. D., Markham, A. F. and Smith, A. E. (1984a). Sequence requirements for nuclear location of simian virus 40 large-T antigen. Nature 311, 33-38.
Kalderon, D., Roberts, B. L., Richardson, W. D. and Smith, A. E. (1984b). A short amino acid sequence able to specify nuclear localization. Cell 39, 499-509.
Kornfeld, S. (1990). Lysosomal enzyme targeting. Biochem. Soc. Trans. 18, 367-374.
Kozak, M. (1986). Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292.
Lanford, R. E., Kanda, P. and Kennedy, R. C. (1986). Induction of nuclear transport with a synthetic peptide homologous to the SV40 T antigen transport signal. Cell 46, 575-582.
Lanford, R. E., White, R. G., Dunham, R. G. and Kanda, P. (1988). Effect of basic and nonbasic amino acid substitutions on transport induced by simian virus 40 T-antigen synthetic peptide nuclear transport signals. Mol. Cell. Biol. 8, 2722-2729.
Loewinger, L. and McKeon, F. (1988). Mutations in the nuclear lamin proteins resulting in their aberrant assembly in the cytoplasm. EMBO J. 7, 2301-2309.
Lonberg, N. and Gilbert, W. (1983). Primary structure of chicken muscle pyruvate kinase mRNA. Proc. Nat. Acad. Sci. USA 80, 3661-3665.
McKeon, F. D., Kirschner, M. W. and Caput, D. (1986). Homologies in both primary and secondary structure between nuclear envelope and intermediate filament proteins. Nature 319, 463-468.
Moore, M. S. and Blobel, G. (1992). The two steps of nuclear import, targeting to the nuclear envelope and translocation through the nuclear pore, require different cytosolic factors. Cell 69, 939-950.
Morgenstern, J. P. and Land, H. (1990). A series of mammalian expression vectors and characterisation of their expression of a reporter gene in stably and transiently transfected cells. Nucl. Acids Res. 18, 1068.
Roberts, B. L., Richardson, W. D. and Smith, A. E. (1987). The effect of protein context on nuclear location signal function. Cell 50, 465-475.
Smith, D. B. and Johnson, K. S. (1988). Single-step purification of polypeptides expressed in Escherichia coli as fusions with glutathione-S-transferase. Gene 67, 31-40.
Swift, A. and Machamer, C. (1991). A Golgi retention signal in a membrane-spanning domain of coronavirus E1 protein. J. Cell Biol. 115, 19-30.
von Heijne, G. (1990). Protein targeting signals. Curr. Opin. Cell. Biol. 2, 604-608.

(Received 22 December 1992 - Accepted, in revised form, 1 March 1993)