Louisiana State University LSU Digital Commons

LSU Historical Dissertations and Theses Graduate School

1996 The yC toplasmic N,n'-Diacetylchitobiase Gene From Vibrio Parahaemolyticus: Sequence Analysis, Protein Secretion, and Secretion System Development. Michael Hongli Wu Louisiana State University and Agricultural & Mechanical College

Follow this and additional works at: https://digitalcommons.lsu.edu/gradschool_disstheses

Recommended Citation Wu, Michael Hongli, "The yC toplasmic N,n'-Diacetylchitobiase Gene From Vibrio Parahaemolyticus: Sequence Analysis, Protein Secretion, and Secretion System Development." (1996). LSU Historical Dissertations and Theses. 6171. https://digitalcommons.lsu.edu/gradschool_disstheses/6171

This Dissertation is brought to you for free and open access by the Graduate School at LSU Digital Commons. It has been accepted for inclusion in LSU Historical Dissertations and Theses by an authorized administrator of LSU Digital Commons. For more information, please contact [email protected]. INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand comer and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book.

Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6” x 9” black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order. UMI A Bell & Howell Information Company 300 North Zeeb Road, Ann Arbor MI 48106-1346 USA 313/761-4700 800/521-0600

THE CYTOPLASMIC N,iV-DLACETYLCHITOBIASE GENE FROM VIBRIO PARAHAEMOLYTICUS: SEQUENCE ANALYSIS, PROTEIN SECRETION, AND SECRETION SYSTEM DEVELOPMENT

A Dissertation

Submitted to the Graduate Faculty of the Louisiana State University and Agricultural and Mechanical College in partial fulfillment of the requirements for the degree of Doctor of Philosophy

in

The Department of Biochemistry

by Michael Hongli Wu B.S., Beijing Forestry University, P. R. China, 1982 M.S., University of California, Riverside, CA, 1989 May 1996 UMI Number: 9628325

Copyright 1996 by Wu, Michael Honglr All rights reserved.

UMI Microform 9628325 Copyright 1996, by UMI Company. All rights reserved.

This microform edition is protected against unauthorized copying under Title 17, United States Code.

UMI 300 North Zeeb Road Ann Arbor, MI 48103 ©Copyright 1996 Michael Hongli Wu All rights reserved ACKNOWLEDGMENTS

I am extremely grateful to my major professor, Dr. Roger A. Laine, for his understanding, patience, encouragement, and general support dur­ ing this study and the preparation of manuscripts and this dissertation. He never seems to run out of new ideas that sometimes, somewhere, will shine some light on someone's project in the lab. I am also indebted to other members of my advisory committee, Drs. William H. Daly (Chemistry), Donal F. Day (Audubon Sugar Institute), Ding S. Shih (Biochemistry), Patrick J. DiMario (Biochemistry), Jan L. VanSteenhouse (Veterinary Pathology, School of Veterinary medicine) for their valuable suggestions, corrections, and other efforts during re­ viewing this dissertation. In addition, I also highly appreciate Drs. Simon H. Chang (Biochemistry) and Jesse M. Jaynes (Formerly Biochemistry) for their general support and technical assistance to my research program while on my committee. My gratitude is extended to Dr. Betty C. R. Zhu for her assistance and valuable suggestions that are always available. Last, but certainly not the least, I would like to express my gratitude to my wife, Kathy G. Wu. Without her understanding, patience and support, this study would not have been completed. I'm also in debt to my two sons, Peter and Clifford, with whom I should have played more as I al­ ways promised. TABLE OF CONTENTS ACKNOWLEDGMENTS...... iii LIST OF FIGURES...... v ABSTRACT...... vi CHAPTER 1 REVIEW OF THE LITERATURE...... 1 1.1 INTRODUCTION...... 1 1.2 CH ITIN...... 2 1.3 CHITIN UTILIZATION BY MARINE BACTERIA...... 3 1.4 PROTEIN SECRETION...... 8 1.5 OVEREXPRESSION OF CLONED GENES...... 29 1.6 PCR AND RECOMBINANT DNA...... 36 CHAPTER 2 SEQUENCING AND SEQUENCE ANALYSIS...... 48 2.1 INTRODUCTION...... 48 2.2 MATERIALS AND METHODS...... 48 2.3 RESULTS...... 50 2.4 DISCUSSION...... 65 2.5 SUMMARY...... 69 CHAPTER 3 PROTEIN SECRETION...... 71 3.1 INTRODUCTION...... 71 3.2 MATERIALS...... 72 3.3 EXPERIMENTAL PROCEDURES AND RESULTS...... 73 3.4 DISCUSION...... 91 3.5 SUMMARY...... 95 CHAPTER 4 CONCLUSIONS...... 98 REFERENCES...... 101 APPENDIXES...... 127 A.l SEQUENCING AND PCR PRIMERS...... 127 A.2 SIGNAL SEQUENCES USED IN THIS STUDY...... 128 A.3 SECRETION VECTOR SEQUENCES...... 129 A.4 GENOTYPES OF SOME HOST STRAINS...... 131 A.5 STRUCTURES OF RELEVENT COMPOUNDS...... 132 A.6 LIST OF ABBREVIATIONS...... 137 VITA ...... 138 LIST OF FIGURES

Fig. 1 Chi tin degradation pathways in vibrios ...... 6 Fig. 2 Construction of the tac promoter ...... 30 Fig. 3 Vectorette and its application...... 40 Fig. 4 Construction of a possible universal vectorette ...... 41 Fig. 5 Restriction map and nucleotide sequencing strategy ...... 52 Fig. 6 Nucleotide sequence of the chitobiase gene ...... 53 Fig. 7 Amino acid alignment of three vibrio chitobiases ...... 56 Fig. 8 Amino acid sequence pileup ...... 60 Fig. 9 Hydrophobicity analysis ...... 67 Fig. 10 In-frame connection by recombinant PCR ...... 75 Fig. 11 Two clones for the secretion of cytoplasmic chitobiase ...... 77 Fig. 12 Secretion and protein leakage ...... 79 Fig. 13 Silent mutagenesis ...... 84 Fig. 14 Construction of pVerSec and pVerSec-SE ...... 85 Fig. 15 Application of secretion vectors ...... 87 Fig. 16 Effects of EDTA on protein protection and host growth ...... 88 Fig. 17 Chitobiase activity and EDTA concentration ...... 89

v ABSTRACT

The nucleotide sequence of the gene encoding the cytoplasmic N ,N '-di- acetylchitobiase from Vibrio parahaemolyticus has been determined. The deduced peptide sequence surprisingly has minimum evolutionary relationship to two other reported iV,iV'-diacetylchitobiases from vibrios, except for highly conserved regions which are also homologous with lysosomal beta-hexosaminidases from eukaryotes including humans. In contrast, the other two sequenced chitobiases from vibrios are much more closely related to each other. This 85 kDa cytoplasmic protein, as revealed from the sequence, appears to be a unique protein, lacking a signal sequence and genetically distant from other known enzymes of similar function. This is consistent with its limited substrate specificity to small iV-acetylglucosamine terminated oligosaccharides. The signal peptide of an extracellular endochitinase from V". p a ra ­ haemolyticus causes the mature chitinase to be efficiently secreted through the double membranes of both Gram negative V. p a ra ­ haemolyticus, and of Escherichia coli JM101 when the gene is cloned therein. By using recombinant PCR, this signal sequence was fused in frame to the chitobiase coding sequence, and active chitobiase was found in the E. coli culture medium. Two secretion vectors were developed during this study for the secretion of wild type proteins. PCR fragment of a structural gene can be inserted in frame with the signal sequence, which is under the strong trc pro­ moter. The secretion system has been tested using the cytoplasmic chito­ biase gene as a model system. Restriction sites and complete DNA se­ quence are not required for the structural gene to be cloned. The chitobiase and the original chitinase as cloned for secretion were stabilized by EDTA added to the medium. This observation may prove generally useful for protecting cloned and secreted proteins in E. coli. CHAPTER 1 REVIEW OF THE LITERATURE

1.1 INTRODUCTION 1.1.1 Historical Overview Chitin is a polymer of beta-D-2V-acetyl-glucosamine (GlcNAc). Its chemi­ cal structure, natural abundance and biological function resemble those of cellulose found in plants. The primary source of this insoluble particu­ late is the marine environment, where many organisms use chitin as their major skeletal or cell wall components. Recycling of chitin for car­ bon and nitrogen sources is a complex biochemical process, involving many proteins. The Gram negative bacteria, vibrios, play a major role in the degradation of chitin. Two parallel pathways in vibrios have been postulated for chitin catabolism (273). Chemotactic proteins act as receptors for trehalose and

GlcNAc released from dead crustaceans; lectins help the bacteria adhere to the chitin (77, 274), and extracellular chitinase (123) and periplasmic chitodextrinase sequentially degrade the substrate into iV,iV'-diacetylchi- tobiose (15). The glycosidase/PTS system cleaves N,V-diacetyl chitobiose to GlcNAc in the periplasm via a membrane bound chitobiase (15, 220, 265) after which GlcNAc is transported and phosphorylated by the bacte­ rial phosphoenolpyruvate:glycosephosphotransferase system (PTS) (15). In the alternative system, an N,N '~diacetylchitobiose permease translo­ cates this substrate to the cytoplasm, where it is cleaved by a cytoplasmic chitobiase (278) and phosphorylated by an ATP-dependent iV-acetyl-D- glucosamine (6, 12, 15). The cytoplasmic system works indepen­ dently of the PTS (15). The two pathways both produce Fru-6-P from chitin, which enters glycolysis for further metabolism.

1 2

1.1.2 The Objectives of This Study The cloning and characterization of the extracellular chitinase (123) and cytoplasmic chitobiase (278) from V. parahaem olyticus have shed light on the chitinoclastic pathway involved in this organism. The objectives of this project include:

1. Nucleotide sequence determination of the gene encoding the cytoplas­ mic chitobiase from V. parahaem olyticus to reveal the cytoplasmic fea­ tures of the enzyme and its relationship to other chitobiases and beta- hexosaminidases. 2. The secretion of chitobiase by construction of a gene for a fusion protein using the signal peptide of chitinase gene to demonstrate that a 21 amino acid leader peptide is responsible for secretion of mature protein across membranes, and that the gene product in the supernatant of E. coli cul­ ture is not simply the consequence of leakage or cell lysis. 3. Development of vectors for production of extracellular proteins. 4. Solve the problem of instability of the secreted chitinase. 1.2 CHITIN

Chitin is a homopolysaccharide composed of beta-(l,4)-linked 2-ac- etamido-2-deoxy-D-pyranosyl residues. Considering the similarities in structure and function between chitin and cellulose, chitin can be con­ sidered as a derivative of cellulose in which the C2 hydroxyl groups of the building blocks have been replaced by acetamido residues (67). After cel­ lulose, it is one of the most abundant organic compounds in nature. Chitin sediments produced as copepod exoskeletons alone have a esti­ mated productivity of several billion tons annually (166). 3

The earliest description of this polysaccharide (29, 67) was in 1811 by Braconnot (26) who extracted this alkali resistant polysaccharide from several species of higher fungi ( Agaricus volvaceus, A. acris, Hydnum repandum, H. hybridum, Merulius cantharellus, A. cantharellus and Boletus viscidus) and named it ‘fungine’. The name chitin (Greek, tunic or covering) was first proposed by Odier in 1923 to describe the substance that he isolated from the May beetles elytra (67). It is the major exoskele­ ton component of insect, crustacean and arachnoids. The molecular weight of this polysaccharide has been determined to be 1.0 x 106 Da for chitin purified from crabs, and higher than 1.8 x 106 Da from other crus­ taceans (30, 210). Because of the insoluble nature and its association with other macromolecules, isolation and purification of chitin frequently re­ quire drastic methods that partially degrade chitin (67). The purified chitin used in determining its molecular weight may be a degradation product of natural chitin. 1.3 CHITIN UTILIZATION BY MARINE BACTERIA 1.3.1 Vibrio Parahaemolyticus V. parahaem olyticus is a member of the Gram negative, rod shaped bac­ terial genus, V ibrio. At least 11 vibrio species have been reported pathogenic to man. Three of them, namely V. cholera, V. parahaem olyti­ cus and V. vulnificus, are of medical concern. The symptoms associated w ith V. parahaem olyticus infection include mild diarrhea, abdominal cramps, nausea, vomiting and fever (167). This was believed to be related to the high incidence of gastroenteritis and extraintestinal infections in coastal areas during warmer month of the year when the organism is prevalent. 4

1.3.2 The Bacterial PTS and Sugar Uptake The bacterial phosphoenolpyruvate:glycose system (PTS) is a complex system with diverse functions in bacterial cells. The PTS is widely distributed among most species of bacteria (149). At least 15 species of vibrios, including V. parahaemolyticus and V. vulnificus, con­ tain a PTS (150). Several proteins in the PTS have been characterized. The general proteins of the pathway consist of Enzyme I (El) (40, 118, 199, 253, 255) and HPr (20, 40, 253, 254). The sugar specific proteins in the PTS re­ quire specific sugar substrate. Known examples of sugar specific pro­ teins characterized from S. typhimurium include: IIBMan for m annose (119), IIIGlc for Glucose (151), and IINae for GlcNAc and methyl-GlcNAc (83, 252). Some sugar specific proteins have more than one sugar sub­ strate. On the other hand, some sugars {e.g., glucose) are substrates of several different proteins. The system works by translocation and phosphorylation of a particular sugar through a series of membrane bound and cytosolic proteins. The phosphoryl group is transferred sequentially from phosphoenolpyruvate (PEP) to El, to HPr, and then to a membrane bound sugar specific pro­ tein, depending on the sugar substrate. In the case of glucose (Glc), for example, the sugar specific protein would be either IIIGlc or IIIMan. The phosphorylated sugar specific protein(s) then transfer the phosphoryl group to the sugar concomitant with translocation of the sugar across the membrane (15,149).

The significance of this system is best understood from its essential func­ tions. In addition to phosphorylation and concomitant translocation of its sugar substrates (PTS sugars), it has regulatory functions on non-PTS 5 sugar transport. PTS-mediated repression of non-PTS sugar utilization operons is at the metabolic level. The product of err gene, which encodes the IIIGlc enzyme (200), is the key to understanding the underlying mechanism: a err mutation simultaneously relieves four non-PTS oper­ ons from PTS-mediated regulation. It has been shown that IIIGlc in­ hibits adenylase cyclase activity and thereby reduces the level of cAMP and cAMP-CAP level in the cell. IIIGlc or phospho-IIIGlc was also found to interact with permeases of lactose, glycerol, melibiose and maltose (149). The chemotaxis of bacterial cells towards sugar is known to be promoted by the PTS (140). There is evidence that the PTS regulates the transcription of operons required for the uptake and catabolism of non- PTS sugars, such as the E. coli lac operon for lactose utilization. 1.3.3 Chitinoclastic Pathways in Vibrios Two parallel pathways have been postulated in marine vibrios for catabolism of chitin (Fig. 1), each possibly containing as many as 6-10 en­ zymes and a number of chemotactic proteins (273). In the common part of both pathways, chitin-binding proteins adhere to the substrate (77, 274), and extracellular chitinase (123) and periplasmic chitodextrinase work together to produce iV^AT-diacetylchitobiose (15). The glycosidase/PTS sys­ tem cleaves A/^iV'-diacetyl chitobiose to GlcNAc in the periplasmic space using a membrane bound chitobiase (15, 220, 265) after which GlcNAc is transported and phosphorylated (15). The second, parallel perme- ase/glycosidase system resembles the E. coli lac permease/beta-galac- tosidase system (275, 276), utilizing an N,N'-diacetylchitobiose permease for transport of this substrate to the cytoplasm. The transported (G1cNAc)2 is cleaved by the cytoplasmic chitobiase reported here (278) 6

Outer Membrane

Inner Membrane

PERIPLASM CYTOPLASM

[G1cNA c ]>12 ■ ^ G lc N A c -6 -PGlcNAc C hn C hb

[G1cNA c ]2 K in ase ATP GlcNAc C hd

C hb

[G1cNA c ]2 [G1cNA c ]2

Ac- GlcNAc-6-P GlcNH2-6-PX m W F rn-6-P Deacetylase Deaminase

G lycolysis

Fig. 1 Chitin degradation pathways in vibrios. 1. Production of GlcNAc-6-P from chitin. Chitin is represented [GlcNAc] > 1 2 , where the number indicates chain length in GlcNAc. Inner membrane proteins for the translocation of GlcNAc and [GlcNAc] 2 , respectively, are labeled as PTS and P, for permease. 2. GlcNAc metabolism. Chn, chitinase; Chd, chitodextranase; Chb, chitobiase. Adapted from Bassler et al. (15). 7 and phosphorylated by a ATP-dependent IV-acetyl-D-glucosamine kinase (6, 12, 15). This cytoplasmic system works independently of the PTS (15). The cytoplasmic iV,iV'-diacetylchitobiase (EC 3.2.1.14) from V. p a ra ­ haemolyticus (ATCC #27969) has been characterized and the gene cloned into Escherichia coli (278). A novel 85 kDa cytoplasmic glycosyl hydrolase of restricted specificity participates in the utilization of chitin-derived 2- deoxy-2-acetamido-,D-glucose (GlcNAc) as one of two parallel pathways for metabolism of iV,Af'-diacetylchitobiose (15). The chitobiose is then cleaved in the cytoplasm to yield GlcNAc which is then phosphorylated by a constitutively expressed, ATP-dependent iV-acetyl-D-glucosamine kinase (6,12,15). The two pathways, glycosidase/PTS and permease/glycosidase, yield a common product from chitin degradation, GlcNAc-6-P, which is further deacetylated by Af-acetylglucosamine deacetylase (112, 196), followed by deamination of the intermediate by glucosamine-6-P deaminase, result­ ing in Fru-6-P, acetate, and ammonia (49, 233). As a substrate of phospho- (PFK), Fru-6-P enters the glycolysis pathway to provide en­ ergy (ATP) and reducing power (NADH) for the organism. The phospho­ ryl group of GlcNAc-6-P from the two parallel pathways come from dif­ ferent sources, phosphoenolpyruvate or ATP, depending on the pathway where it was produced. Chitin-degrading enzymes have also been reported in organisms as di­ verse as bacteria, fungi, plants, insects and vertebrates (25, 80, 279), and the pathways involved in the degradation of chitin are as diverse as the distribution. It is clear that, at least in Vibrio, secretion of the first cat­ alytic enzyme to the environment is necessary to initiate the process, 8 since the outer membrane of the cell is not permeable to this particulate polymer. Subsequent enzymes in the pathway are chitodextrinases and/or chitobiases. The substrates of these enzymes are chitinase prod­ ucts, which gain entry into either the periplasm or the cytoplasm, de­ pending on the organism, for further degradation. Other enzymes than the chitinases are either secreted into the periplasm or stay in the cyto­ plasm. Exceptions to this scheme do exist: chitobiase of the fungus, Trichoderma harzianum, is reported to be extracellular (235). 1.4 PROTEIN SECRETION 1.4.1 The Signal Hypothesis The initial clue that some proteins require an amino terminal (N-termi- nal) peptide sequence for secretion through the membrane was obtained during the in vitro synthesis of an immunoglobulin chain by cell ex­ tracts. It was found that synthesis in the absence (but not the presence) of membrane yields a product with a ca. 3,000 Da extra segment at the N- terminus (158). This highly hydrophobic segment, the signal peptide, is intrinsic to the nascent secretory polypeptide possessing the information necessary and sufficient for targeting. It is the membrane-bound signal peptidase in the membrane fraction, but not the membrane per se, th a t plays the critical role in this process. The signal hypothesis was proposed by Blobel and Dobberstein (24): (i) mRNAs encoding proteins for secretion through the ER membrane contain signal codons, unique sequences of codons immediately following the initiation codon; (ii) Translation of the mRNAs is initiated by ribosomes that are not attached to the membrane; (iii) A signal sequence receptor(s) on the membrane binds the nascent polypeptide, and facilitates the initial penetration into the endoplasmic 9 reticulum (ER) membrane; (iv) There exists in the membrane a pro- teinaceous channel, formed in association with the signal sequence, that provides an aqueous environment for the transfer of hydrophilic peptide. It is this short sequence that signals the attachment of nascent polypep­ tide chain, thus the ribosome, to the membrane to produce the appear­ ance of 'rough ER.1 It is clear that the hypothesis assumed a cotransla- tional translocation mechanism. Wickner (257, 258, 260) proposed an alternative hypothesis for membrane protein assembly known as the membrane trigger hypothesis. It was suggested in the hypothesis that some membrane proteins synthesized on soluble ribosomes can assume two conformations, one more stable in aqueous solutions and the other triggered by contact with the hydropho­ bic environment of the membrane. The soluble precursor diffuses to the membrane and undergoes a conformational change as it self-inserts into the bilayer. The precursors with triggered conformation are thought to be stabilized by cleavage of the signal peptide at the N-terminus. This hy­ pothesis did not assume membrane attachment of the ribosome synthe­ sizing the precursor. Membrane protein insertion is a self-assembly pro­ cess.

The membrane trigger hypothesis could explain the secretion and inte­ gration of integral membrane proteins. Membrane protein secretion is not a favorite subject of the signal hypothesis proponents since a majority of these proteins are made without a cleaved leader sequence (39, 60, 188, 204, 263, 272). The membrane trigger hypothesis, however, is not without problems: there have been too few reported examples that fit into this model, probably due to the lack of practical testing systems. One 10 extensively studied protein that has been shown to follow this model is the M13 coat protein. This lipid soluble, 50 residue peptide, is synthesized as a precursor with a signal peptide 23 residues long. It becomes integrated into the membrane during the infection stage of the M13 bacteriophage life cycle (108). Insertion into the membrane also requires sequence at the carboxy terminus (C-terminus) of the mature protein. In vitro studies (261) demonstrated post-translational processing of this protein. Signal peptidase, membrane electrochemical potential and phospholipid are the only components required to mediate binding, processing and insertion. The loop model proposed by Inouye (104) was based on the functional and structural relationship of the signal peptides (105). The positively charged N-terminal region and the extremely hydrophobic core region of these short leader peptides were believed to form a loop. The basic region interacts with the acidic surface of cytoplasmic membrane and the hy­ drophobic core inserts into the lipid bilayer. The model system used was the signal peptide of E. coli major outer membrane lipoprotein. 1.4.2 Cell Compartmentation and Protein Targeting The chromosome of a bacterium occupies a certain area in the cell but is not membrane bounded. However, the bacterial cell is also compart- mented and protein targeting is essential for the organism to survive. The periplasm of a Gram negative bacterium, sometimes called the periplasmic space, is surrounded by the inner (cytoplasmic) and outer (lipopolysaccharide and protein) membranes. In Salmonella ty- p h im u riu m and E. coli, the volume of this compartment comprises 20 to 40% of the total cell volume and is isoosmotic to the cytoplasm (222). This 11 compartment contains at least 4% of total cell proteins (175) with up to 50 distinct polypeptide species (50). The outer membrane is a coarse molecu­ lar sieve that allows simple diffusion of hydrophilic and hydrophobic molecules up to a molecular weight of about 800 Da for E. coli and even higher for members of Pseudomonas and N eisseria (7) whereas the cy­ toplasmic membrane excludes almost all hydrophilic substances. The cell wall of a Gram negative bacterium includes both the outer mem­ brane and a thin layer of murein or mucopeptide (peptidoglycan) as os­ motic pressure barrier between the two membranes. Cell envelope is an­ other term that includes everything outward from the cytoplasmic mem­ brane. Proteins synthesized in the cytoplasm of the cell have different destina­ tions. In addition to integral membrane proteins that have hydrophobic regions to span the lipid bilayer, there are proteins that are targeted and functional in compartments other than the cytoplasm. It has been esti­ mated that 25% of proteins synthesized in the bacterial cytoplasm are targeted either to the inner membrane, the periplasm, or to the outer membrane. The periplasm contains proteases, phosphatases and sugar binding proteins for active transport. Some toxins are secreted from the cell into the medium (8). Some enzymes in polymer utilization systems are secreted out to the external milieu. These include amylases from the Gram positive Bacillus subtilis (270), B. amyloliquefacens (180, 225), B. stearothermophilus (223), and the Gram negative Aeromonas hydrophila (42). Chitinases from V. harveyi (219, 220) and V. parahaemolyticus (123), and pullulanases from Klebsiella oxytoca (51, 191) are known to be se­ creted across the double-membranes of these Gram-negative bacteria to 12 digest chitin and branched-starch, respectively. Other examples include the plant pathogens Erwinia chrysanthemi (134) and Erwinia carotovora (194) that secrete cell-wall degrading enzymes such as pectate lyase, polygalacturonase and cellulase. 1.4.3 Bacterial Signal Sequences Secretory proteins of the Gram negative bacteria are those targeted to the periplasm, the external milieu and the inner and outer membranes. Not all secreted proteins require a signal sequence. For those that need one, it does not have to be at the N-terminus. For those that do have an N-termi- nal signal sequence, it is the underlying basis of protein targeting. Several lines of evidence from genetics and recombinant DNA demon­ strated that it is this short piece of amino acid sequence at the N-termi- nus of the precursor that dictates the transport of the mature protein to the location where it is functional. The bacterial signal peptides, like eukaryotic ones, have three distinct re­ gions, a positively charged N-terminus (n-region), a hydrophobic core (h- region), and a hydrophilic C-terminus (c-region) (245). The n-region varies in length and has an overall positive charge. The contribution of this region to the secretion process is not well understood although it has been speculated that it may be responsible for binding to the negatively charged phospholipid head groups of the cytoplasmic surface of the membrane. The h-region is rich in hydrophobic residues (Phe, Leu, Val, lie, Met, Trp). Often found in this region are sequences with potentials for forming alpha-helices, but the presence of some helix-disrupting amino acids is not unusual (242, 245). What is critical for translocation is its length (minimum of 7-8 residues) and overall hydrophobicity as a 13 continuous region. A consensus sequence to known signal sequences does not seem to exist even within the same organism and the analyses of structure and function are largely based on the classes of amino acids present in each of the three regions that contribute to the secondary structures. The amino acid sequence of the c-region, generally around six residues in length, demonstrates a higher degree of stringency at the cleavage site of signal peptidase. Based on the signal sequences available, a '-1, -3' rule was proposed (243, 244) which states that the sequence at the cleavage site conforms to the formula A-X-B, where B contributes the carboxy group to the peptide bond that is cleaved and is Ala, Gly or Ser. The ‘A’ in the above formula at position -3 is any of B or Leu, Val or lie. The -2 position is somewhat less restrictive where larger residues and charged residues are generally found. The signal peptidase prefers certain amino acids for their steric configuration that probably dictates the position for prote­ olytic cleavage. It is worth noting that these preferential residues at their perspective positions are for peptidase cleavage only. Mutations of these residues to bulky or charged residues prevent cleavage but not transloca­ tion. Extensive site-directed mutagenesis and genetics studies in these regions of signal peptides from OmpA, MPB, LamB, Staphlokinase, and beta-Lactamase agreed with the structural analysis above (71). A net pos­ itive charge at the N-terminus of the signal peptide is required for opti­ mal processing and translocation. It is the overall hydrophobicity rather than any specific amino acid sequence that is essential for export func­ tion. It is also noteworthy from these studies that, for cleavage by signal peptidase, the N-terminal residue at +1 can be virtually any residue as 14 long as it does not disturb secondary structure, suggesting that this residue is not recognized by the enzyme. The specificity of the enzyme seems to rely on the signal peptide, which is not part of the functional ex­ ported protein. Restrictions on the N-terminal residue may limit the number of functional proteins translocated. Attempts to elucidate the mechanistic contributions of the n-region and c-region during translocation using 2D-NMR (two dimetional nuclear magnetic resonance), CD (Circular dichroism) and infra-red measure­ ments suggested that the peptide may be in beta-sheet structure when in­ teracting electrostatically on the aqueous surface of the membrane, and in alpha-helix when inserted (28). Care should be taken for interpreta­ tions of such data since the principal assumption is that the translocated proteins interact with lipid bilayer. That secretory proteins get across the membrane does not signify interaction between the polypeptides and the lipid during translocation. Integral membrane proteins interact with the membrane and many of them are secretion products. However, there has been no conclusive evidence that integration is a direct consequence of translocation. In contrast, it has been demonstrated that the E. coli MalF protein, which spans the cytoplasmic membrane eight times, can be properly inserted into the membrane in the absence of its N-terminal ex­ port signal that also serves as the first of the eight transmembrane seg­ ments (61). It would be difficult to envision that these proteins simply stalk on the membrane in the middle of translocation and 'become' membrane inserted, without jamming the secretion pathway, especially for those that span the membrane many times, such as SecY, SecE (41, 56) as well as MalF (61). 15

Regardless of the underlying mechanism, the importance of signal se­ quences in protein secretion is obvious based on the fact that many signal peptides secrete foreign proteins, making this complex process visible and practical for cloning, expression and commercial production of pro­ teins. It has been demonstrated that, in E. coli, the signal sequences are inter­ changeable between the outer membrane porin PhoE and a similar outer membrane protein OmpF, and both functional proteins are localized in the outer membrane (231, 232). Similar to this is the substitution of periplasmic TEM-beta-lactamase signal sequence for that of the PhoA, also periplasmic, and the mature beta-lactamase is secreted into the periplasm (98, 142). The signal sequence from the alpha-amylase of the Gram positive Bacillus amyloliquefaciens was found capable of secret­ ing periplasmic TEM-beta-lactamase of Gram negative bacteria into the growth medium of Bacillus subtilis (181, 236). This location can be con­ sidered as equivalent to the beta-lactamase destination in E. coli since Bacillus does not have an outer membrane and thus lacks a periplasm. Prokaryotic signal sequences can be used to secrete proteins from eu­ karyotes. Examples include the expression and secretion of human growth hormone into the E. coli periplasm using signal sequences from E. coli alkaline phosphatase gene (177) and E. coli heat-labile enterotoxin gene (74, 161). This signal sequence was also used in the secretion of ri- bonuclease T1 from Aspergillus oryzae (69). Prokaryotic secretion ma­ chinery can also recognize, process and translocate eukaryotic precur­ sors, as demonstrated in the case of rat preproinsulin gene that was 16 expressed, the precursor processed, and the proinsulin localized in E. coli periplasm (226, 227). Other examples in light of signal sequence compatibility could be seen in numerous eukaryotic cellular systems. These include, but are not limited to: (i) Saccharomyces cerevisiae that secretes human epidermal growth factor using a synthetic leader sequence (45), human lysozyme using chicken lysozyme signal peptide (234), leech ( Hirudo medicinalis) hirudin using the yeast invertase signal sequence (97), human lysozyme using synthetic signal sequences (267, 268), and the secretion of human lipocortin-1 by S. diastatcus using STA1 signal sequence from this yeast species (170); (ii) Insect cells ( Spodoptera frugiperda, host cell line of bac- ulovirus expression system) that secrete papain (a plant cystein protease, EC 3.4.22.2) using honeybee mellitin signal peptide (228), and human HIV-1 gp120 using signal peptide of honeybee mellitin or murine inter­ leukin 3 (133). (iii) Mammalian cells (Chinese hamster ovary cells, CHO) that secrete endoglucanase of Clostridium thermocellum using signal sequences either from the pre-endoglucanase or from the human growth hormone gene (86), human granulocyte-colony stimulating factor (hG- CSF) using the conserved portion between hG-CSF and C-GSF (of the mouse) signal sequences (COS cells) (117), and neutral endopeptidase us­ ing pro-opiomelanocortin signal peptide (COS cells) (131); (iv) Transgenic tobacco cells that secrete the bacterial ChiA protein from S e r ra tia marcescens using either the original signal sequence of pre-ChiA or that from the PRlb protein (138).

The above are some examples with respect to protein secretion in differ­ ent systems. The enhanced secretion efficiency of papain and HIV-1 17 gpl20 when their respective signal sequences were substituted for the in- sect-derived signal peptide immediately implies that the Spodoptera frugiperda translocation machinery is more compatible with signal se­ quences of insect origin. However, this has been shown to be not true, as the secretion of human tissue plasminogen activator (t-PA) was not in­ creased by any of the signal sequences tested, most of which are insect- derived, including honeybee prepromellitin (111) that greatly enhanced the secretion of papain (228). 1.4.4 Secretion of beta-Galactosidase The establishment of the signal hypothesis (22, 23, 24) and subsequent work on secretory systems led to partial understandings of the expres­ sion and secretion of several prokaryotic and eukaryotic proteins across the E. coli cytoplasmic membrane (18). One interesting protein that played a pivotal role in the early elucidation of the sec-dependent secre­ tion pathway is beta-galactosidase. The monomer of this homo- tetrameric enzyme is encoded by the tricistronic mRNA of the E. coli lac operon (275, 276). Attractive features of this well-known enzyme as a se­ cretion reporter include its ease of bioassay and its cytoplasmic location. Unfortunately, efforts in secreting this cytoplasmic protein have not been successful. The signal sequences used include those from the m a lE (periplasmic) (13, 14), phoA (periplasmic) (156), phoE (outer membrane) (230), om pF (outer membrane) (217), and lam B (outer membrane) (63). These fusion proteins could not be found in their expected locations. Instead, they were incorporated into the cytoplasmic membrane where they appear to block the secretory potential, causing accumulation of pre­ cursors of most other would-be-exported proteins in the cell. The problem 18 had been noted earlier by von Heijne (241) who proposed that incompe­ tence of beta-galactosidase in passing through the membrane is due to the presence of amino acid sequences within it. Genetic studies further specified that the nonexportability of the enzyme is due to folding of por­ tion of the polypeptide in the cytoplasm. This model was further sup­ ported by the finding that overexpression of the GroEL chaperone protein and DnaK facilitate transfer of a LamB-beta-galactosidase hybrid protein (186,205,277). Another interesting feature of these systems is that these chimeric genes are lethal to the cell when over expressed. Careful investigation of malE- lacZ and lam B -lacZ demonstrated that the inhibition effect is propor­ tional to the amount of chimeric proteins produced (107). Both of these chimeras can be induced by maltose since LamB and MalE proteins are involved in maltose metabolism. As the concentration of the fusion pro­ tein in the cytoplasmic membrane increases by maltose induction, there is an increase inhibition of export of cell envelope proteins. The precur­ sors of these proteins accumulate in the cell but the mature proteins are barely found in their proper locations. Accumulation of the hybrid itself in the membrane severely affects the cells. A large fraction of the cells eventually lysed. The hybrids thus render the cell sensitive to maltose. A similar phenomenon was also observed in other fusions. The phoA - lacZ on a multicopy plasmid is lethal to a partially constitutive phoR E. coli strain, whereas the same gene on a single copy plasmid is not (156), another line of evidence implying the sensitivity of the cell to the amount of hybrid protein produced. 19

The inhibition effects may be due to the direct disruption of membrane organization by the hybrid protein, or indirectly through the blockage of secretion channels for naturally secreted proteins that have essential functions to the cell. The maltose induction assay system used by these investigators is rather ingenious since the beta-galactosidase activity of the hybrid protein is not measurable even after full induction with mal­ tose (lam B-lacZ) (87) or considerably lower than expected without induc­ tion but is close to normal level upon induction ( m alB-lacZ) (13, 14). The inhibition effect itself is thus a better quantitative assessment to the hy­ brid than the enzyme activity. 1.4.5 The Bacterial Secretion Machinery The pleiotropic detrimental effects of the LacZ fused with the N-terminal portions of secretory proteins led to the elucidation of bacterial protein se­ cretion machinery. Mutations in E. coli carrying these chimeras mapped in loci that are components of the secretory machinery. These include secA(prlD), secB, secE (prlG), secY (prlA ), secD, secF, and the two genes for leader peptidases, lepB (coding for leader peptidase) and IspA (coding for lipoprotein signal peptidase). Translocation and processing of precursors that require at least some of these proteins (i.e., SecA and leader pepti­ dase) are said to be sec-dependent, those that do not require any of these proteins are sec-independent. In both cases, translocation and process­ ing which occur before chain elongation is complete are said to be co- translational, and those that occur after elongation is complete are said to be post-translational. In E. coli, a known example of sec-independent protein transport is the 107 kDa haemolysin (HylA) that is secreted inde­ pendently of an N-terminal signal sequence and the secA gene product 20

(8, 99). There exist extensive reviews on the genetics and enzymology of these genes and gene products (18, 205, 259) and this section only briefs the current understanding of what is known about the system in E. coli. On the basis of functional and structural interactions, these sec proteins are now believed to form a multisubunit enzyme, translocase, on the cy­ toplasmic membrane (256). The integral membrane domain of the en­ zyme is a complex of secY and secE, which can be co-purified under nondenaturing conditions and cross-immunoprecipitated (1, 37, 38). Both SecY (49 kDa) and SecE (13.6 kDa) proteins are highly hydrophobic with 10 and 3 membrane spanning regions, respectively (41, 56). The cyto­ plasmic SecA is associated with the membrane through its affinity for

SecY/E complex (90). The ATPase activity of SecA is believed to catalyze cycles of precursor binding and release (206) and successive cycles of in­ sertion into the membrane translocator (178). Both SecA and its ATPase activity are essential for sec-dependent translocation of proteins. SecB is a 16 kDa soluble protein that binds to both the leader and the mature por­ tion of nascent chains of secretory proteins such as MBP, OmpA and PhoE (128, 193). It has been demonstrated in vitro that SecB binds only to denatured molecules and the binding prevents refolding (47, 128, 136). The homo-tetrameric SecB is required for translocation but not transla­ tion of precursors. E. coli supernatant depleted of SecB through an anti- SecB affinity chromatography column retains full translation activity but is unable to support translocation of preMBP into inverted vesicles (251). The products of secD and secF are homologous in their membrane spanning regions. These two proteins are required for the proton electrochemical gradient stimulation of preprotein translocation (5) and 21 are not considered as the subunits of the multisubunit translocase enzyme. Proteins need to be unfolded prior to translocation and fully folded pre­ cursors are not secretion competent (192). The co-translational transloca­ tion process requires nothing but the translocase enzyme. This is consis- tant with the vectorial transfer as proposed in the signal hypothesis (24). In post-translational translocation, molecular chaperones play roles in maintaining precursor polypeptide ready for translocation. In addition to chaperones that are components of the translocase (SecB and SecA), GroEL and GroES are also known to have functions that maintain secre­ tory precursors in translocation-competent conformations (144). These chaperonins, belonging to a subclass of chaperones, are the major 'heat shock' proteins but are essential for growth of E. coli at all temperatures. Mutations in both groEL and groES are defective in the export of pre-beta- lactamase but not pro-OmpA and pre-MBP. The folding property of the mature portion or the folding status of the precursor affects the secretion of maltose binding protein (MPB), and rapid folding of the precursor in the cytoplasm of secB' cells results in a conformation incompatible with translocation across the membrane (192). Amino acid changes in the mature portion that suppress defects in the signal sequence decrease the rate of MPB folding (135). Mutations in secB slow the rate of MPB transport and also increase the amount of folded, translocation-incompetent preMPB in the cytoplasm. On the other hand, the defects in MPB export in secB' cells were suppressed by amino acid changes that slowed the folding rate of mature MPB (47), or by muta­ tions in the signal sequence that increase the rate of export (48). Based on 22

these studies, Schatz and Beckwith (205) suggested that secB binds to preMBP during or soon after its synthesis and stabilizes the conforma­ tion to enable its association with later components of the secretion ma­ chinery. The signal sequence per se also plays a role, independently of chaperons, in slowing down the rate of precursors folding (173). The ef­ fect of precursor folding on protein secretion has been applied to practice. In one example, the production of active secreted protein is increased by 16 fold through the control of protein folding (224). The mature protein found in the targeting location is the product of pro­ teolytic cleavage by peptidases. There are two peptidases associated with protein secretion, leader peptidase and lipoprotein signal peptidase, each showing different substrate specificites. The lipoprotein signal peptidase is an endopeptidase that recognizes glycyl glyceride-cystein or alanyl glyceride-cystein as cleavage site [L-A(S)-G(A)-C-]. It cleaves lipid modi­ fied pro-lipoproteins into lipoproteins, which are further modified by N- acylation to form mature lipoproteins that attach to the outer membrane of Gram negative bacterial cells (191). The protein, also known as signal peptidase II, is the gene product of IspA (106). The lepB gene product, leader peptidase (also known as signal peptidase or leader peptidase I), cleaves the signal sequences from precursors other than that of pre­ lipoproteins (52). Leader peptidase is a cytoplasmic membrane protein with its active site facing the periplasm. This 323 residue polypeptide (36 kDa) has three hydrophobic segments near the N-terminus with the first two spanning the cytoplasmic membrane and the third residing in the periplasmic space, as demonstrated by protease mapping studies (130, 160, 263). Studies on a B. subtilis signal sequence fused to the mature 23 portion of the TEM-beta-lactamase (238) showed that a 25-fold overproduction of this protease results in enhanced processing rate and improved release of mature beta-lactamase into the periplasm of E. coli. Since these effects were not detected on the wild-type pre-beta-lactamase and other secretory proteins, the author argued that the availability of this signal peptidase may be a limiting factor in protein export in E. coli, in particular with those showing low processing efficiencies. For proteins that export from the cytoplasm of Gram negative bacterial cell to the external milieu, an important but poorly addressed question is how the translocated protein is inserted into or exported across the outer membrane. Three mechanisms have been proposed for this process (9): through intermediate vesicles, direct translocation through the periplasm, and direct crossing from the inner to the outer membrane via Bayers Patches where the two membranes are in close vicinity. None of these were able to explain what dictates the final destination of these pro­ teins. Recent studies on the genetics of extracellular protein export have shed light on this problem. It has been shown that this step alone, from the periplasm to the external milieu of the organism, involves more than 13 genes, collectively called the out genes (in Erw inia) (134, 194) or pul genes (in Klebsiella) (51, 191). Molecular cloning and characterization of these genes from several Gram negative bacteria (51, 134, 191, 194) showed that they are highly conserved and are specific in function for the transloca­ tion of proteins targeting to the outside of the cell. Mutations in these genes do not affect periplasmic protease translocation (194). The extracel­ lular starch-degrading enzyme pullulanase from Klebsiella oxytoca is 24 one of the best-studied examples, in which a p u lS gene and a 13-gene operon ( p ulC through pulO )were shown to be necessary and sufficient for the secretion of the pullulanase enzyme across the outer membrane (51, 191). More recently, the out genes have been cloned and sequenced in two independent studies from the plant pathogens Erwinia chrysan- them i (134) and Erwinia carotovora (194) that secrete plant cell-wall de­ grading enzymes including pectate lyase, polygalacturonase and cellu- lase. Homologs of the p u l or out genes have also been cloned from other Gram negative bacteria including Pseudomonas aeruginosa (10, 11, 55, 65), Xanthomonas campestris (57, 102), and Aeromonas hydrophila (113). This suggests that these systems are widely distributed in Gram nega­ tive bacteria. The gene arrangement and amino acid content of their ORFs are similar (194). Further investigation is needed to understand how the system works. Little is known, for example, about the signals in the extracellular proteins that target them to the secretion pathway, and how they interact with the components of the pathway in the periplasm. Although these out or p u l genes are highly conserved in their gene ar­ rangement, nucleotide and amino acid sequences and their biological functions, evidence from these works seemed to indicate that these gene clusters are species specific (134). Known exceptions are the p u l and out clusters of K. oxytoca and E. chrysanthemi, respectively, that permit E. coli to secrete homologous proteins by a Sec-dependent pathway (93, 134, 190). In addition, the P. aeruginosa xcp gene cluster, another homolog of the p u lgene system, complemented the protein secretion capacity of a X. co m p estris secretion mutant while the gene cluster from the latter 25 organism restored the secretion capacity of P. aeruginosa xcp m u tan t only to a low degree (55). Together, these findings are significant in that they are the starting point for understanding and for further probing the mechanism that dictates the final destination of proteins, since for almost two decades after the es­ tablishment of the signal hypothesis, little was known about how proteins transport beyond the cytoplasmic membrane in Gram negative bacteria. 1.4.6 Protein Secretion by Eukaryotes Some examples are briefly mentioned in this section to show that protein targeting is a general phenomenon of life and that the bacterial system is only a part of it. Proteins encoded by nuclear genes in eukaryotic cells are synthesized in the cytosol and are targeted to specific locations where they are functional. Depending on the types of cells, the destinations in­ clude the nuclei, mitochondria, chloroplasts, lysosomes, as well as the lumen of endoplasmic reticulum (ER). The signals contained in the pre­ cursors play important roles in these processes. Targeting of a nascent secretory protein to the ER lumen is initiated by the high affinity binding of newly emerged signal sequence to the signal recognition particle (SRP) (198). SRP is a cytosolic ribonucleoprotein complex consisting of six polypeptide chains and a 7S RNA (248). Its high affinity for signal sequence, ribosome, and SRP receptor makes it a criti­ cal factor in the association between the precursor and the ER. The SRP receptor, also known as docking protein (DP) (78, 94), was originally iso­ lated from the canine pancreatic rough microsome, and was found to be required for translocation across the ER membranes in cell free assays (153, 154). The binding of SRP to the signal sequence halts chain 26 elongation until the complex interacts with the DP on the ER surface (155, 246, 247, 249). Precursor-synthesizing ribosomes bound to the ER gives rise to the 'rough ER' appearance. Two further events are mediated by the binding of SRP to DP: stable interaction between the ribosome and the ER membrane, and release of nascent polypeptide chain from the SRP. This process requires that DP be in the GTP-bound form. The bound GTP is hydrolyzed during the dissociation of SRP and DP. It is postulated that GTP hydrolysis provides a molecular proofreading mechanism to assure correct association of the nascent polypeptide with the ER membrane (78, 94). As with the bacterial secretion system, insertion of polypeptide into the ER membrane remains unclear, although several mechanisms have been postulated (260). One common model is that there exists a translocation complex on the ER membrane comprised of a 43 kDa signal binding unit and several other proteins. The roles of the complex include binding to the signal sequences and forming an aqueous transmembrane tunnel (8, 214). The signal sequence is then cleaved by the signal peptidase and the mature protein released into the ER lumen for further processing and sorting. The resistance of microsome peptidase activity to alkali treatment and trypsin digestion indicated that signal peptidase is an integral membrane complex with its active site positioned on the lumenal side of the ER membrane (137, 250). There are extensive reviews on the import of proteins into organelles (mitochondria (91, 185), chloroplasts (208), and the nuclei (213)). Unlike that of ER, translocation of proteins into these organelles is posttransla- tional. Although similarities do exist among these and the ER/bacterial 27 systems, protein traffic into these organelles is complicated by further compartmentation of the organelles. Proteins that are nuclear-encoded can reside in one of several suborganellar compartments: four for a mi­ tochondrion (Outer and inner membranes, intermembrane space, and matrix), and six for a chloroplast (Outer and inner membranes, inter­ membrane space, stroma, thylakoid membrane, and thylakoid lumen). The nuclear envelop is constructed of two lipid bilayers with a perinu­ clear space in between. Accordingly, proteins targeted to their destina­ tions require specific signals, in addition to unique translocation com­ plex on each compartment. Presequences of the mitochondria and chloroplasts (also called transit sequences) seem to accommodate this specific targeting. Hurt et al. (103) and Horwich et al. (101) were the first to demonstrate that the mouse cy­ tosolic enzyme dihydrofolate reductase (DHFR) can be directed into the mitochondrial matrix by the 25 N-terminal presequence of cytochrome oxidase IV, an inner membrane protein. The first 12 residues of the pre­ sequence were found to be necessary and sufficient for the transport and subsequent processing in the mitochondria. On the other hand, even when 28 more N-terminal residues of the mature cytochrome oxidase IV were included, the fusion protein was still not found in the inner mem­ brane, suggesting that information for intramitochondrial traffic must reside in the remaining amino acids of the mature protein. Another ex­ ample is the presence of isozymes encoded by the same nuclear gene. Two isozymes of the yeast histidine tRNA synthetase are the products of a gene with dual transcription start sites (171). The longer mRNA is translated into a mitochondrial precursor, whereas the shorter one 28 produces a cytosolic isozyme. Mutation at the first start site leads to the lack of mitochondrial isozyme, resulting in respiratory deficiency, with no effect on the cytosolic isozyme. The small subunit of Rubisco (Ribulose-l,5-bisphosphate carboxylase) is a nuclear gene product and the 40-residue transit sequence is cleaved in the stroma (207). The bacterial protein neomycin phosphotransferase was imported into chloroplast when fused to this transit sequence, show­ ing that the transit sequence contains sufficient information to target mature proteins into the chloroplast (237). For proteins targeting to the nucleus, basic residue stretches called nuclear localization sequences (NLS), serve as targeting signals. Mutations in the transit sequence of nuclear proteins (e.g., SV40 T-antigen) cause cytoplasmic localization and exclusion of the protein from the nucleus (124, 125). Fusions between SV40 T-antigen and cytoplasmic proteins such as and beta-galactosidase localize the chimeras to the nuclei (114). Non-nuclear proteins coupled chemically to synthetic nuclear targeting peptides and microinjected into cells are also shown to be imported into the nucleus (126). Recently, a 60,000 Da protein called 'importin' that is essential for the first step of nuclear protein import has been isolated from Xenopus eggs (81). The current understandingis that importin binds NLS-contain- ing protein and directs it to the nuclear pore proteins, followed by a en­ ergy dependent translocation process facilitated by the Ran/TC4 protein (189).

Like bacterial signal sequences, very little similarities have been found among presequences for ER lumen and organelles. Particularly notewor­ thy are nuclear localization sequences. Although most of them are short 29

(5-7 residues) and are rich in basic amino acids, a consensus sequence is also lacking. 1.5 OVEREXPRESSION OF CLONED GENES 1.5.1 Strong Promoters The tac and trc promoters employed in the pKK expression vectors {e.g., pKK233-2/Zrc and pKK223-3/Zac) are strong promoters derived from the E. coli lac (76, 109) and trp (187) operons. The E. coli bacteriophage T5 early promoters are naturally strong. Although this project does not involve these promoters, the rmBTlT2 transcriptional terminators were first in­ troduced down stream of the T5 early gene promoters (73) and subse­ quently of the tac and trc promoters in the pKK expression vectors. A brief introduction to T5 promoters will thus follow. The tac and the trc promoters are hybrids composed of -10 region of the ZacUV5 promoter and the -35 region of the trp promoter (see below). These promoters have been successfully used to drive the overproduction of several prokaryotic and eukaryotic genes including the dehydro- quinate synthase from E. coli (68), the lipocortin I from rat (211), and the alpha-tubulin from human (266). The dehydroquinate synthase yield from E. coli K12 (MM294), for example, is 1000 foldhigher when driven by the tac promoter than in the wild type E. coli K12, and 50 times more en­ zyme was produced than that from a clone in which the gene is driven by its own promoter (68).

During the production of human growth hormone (HGH) in E. coli, De Boer et al. (53, 54) used both trp, lacUV5 and a hybrid tandem trp-lac promoter they constructed {tac promoter). It was demonstrated that the promoter is 5 to 10 times as efficient as the lacU VB promoter and is 30

1

-35 -10 +1 Ptrp GGCTGTTGACAATTAATCAT CGAACTAGTTAACTAG TACGCAAGTTC. . . hgh

-35 -10 +1 P tac GGCTGTTGACAATTAATCAT CGGCTCGTATAATGT GTGGAATTGT. . . hgh

-35 -10 +1 P lac CCAGGCTTTACACTTTATGCTTCCGGCTCGTATAATGT GTGGAATTGT.. .hgh (UV5)

2

The consensus sequences:

-10 Region: 51-TAtAaT-31 -35 Region: 5'-TTGACa-3'

Fig. 2 Construction of the ta c promoter. 1. The lac, trp and tac pro­ moter sequences (53, 54). The hgh gene was served as a reporter in the promoter strength assay and was labeled as the gene product, HGH. The distances between the -35 and -10 of the three promoters are 18 bp, 17 bp, 16 bp for lac, trp and tac, respectively. 2. Consensus sequences of the two hexamers of prokaryotic promoters (92, 132, 197). 31 controlled by the lac repressor. The lacUV5 is a UV-induced up mutant of the wild-type lac promoter. In the absence of cAMP-CAP complex, the wild type lac promoter is only 2% of full level (17). The GT in the -10 re­ gion of lac promoter is changed to an AA sequence, resulting in lacU V5 that restores 50% of full promotion (75). The trp -35 and lacU V5 -10 re ­ gions are identical to those of the consensus sequences, respectively (92, 132, 197). By bringing these two sequences together, these investigators were able to construct a hybrid promoter (Fig. 2) that is stronger than ei­ ther of the two promoters, as demonstrated by the expression levels of HGH under the control of these promoters. Expression vectors bearing the tac promoter are now commercially available and are widely used in revised forms that facilitate cloning simplicity and the capability of expressing genes from both prokaryotes and eukaryotes (2, 3, 4). The expression vector pKK223-2, constructed by Amamm and Brousius (2), carries a slightly modified form of tac. The promoter, known as trc, has the consensus distance, or consensus spacer length of 17 bp between the -10 and -35 regions instead of 16 bp in the orig­ inal tac. It had been shown previously in vitro th a t trc is the strongest promoter characterized so far and confirmed that the consensus se­ quence, in terms of the two hexamers and spacer length in between, is the "best" (163). The expression levels of genes directed by tac and trc, however, are similar in vivo (2, 3, 33). The major improvement is the in­ troduction of lacZ ribosome binding site (RBS) followed by an ATG trans­ lation initiation codon at an appropriate distance from the RBS. The ATG codon is exposed for insertion of incoming structural genes upon diges­ tion w ith Nco I. The tac bearing plasmid pKK223-3 (34) contains a variety 32 of restriction sites for the insertion of genes to be expressed at high level between the regulatable tac and the rrnB transcriptional terminators (see below). In addition to those mentioned above, it is also the driving force of cloned genes on vectors such as GST gene fusion vectors (pGEX series) (215), pTrc99 A (pTrc series) (4) and pKK232-8, a promoter selection vector (31). Both pKK223-3 and pKK233-2 plasmids were used in this study. The trc promoter carried on pKK233-2 is the one that direct the expression of se­ creted chitobiase. The secretion systems developed during this study (pVerSec and pVerSec-SE) also use trc for high level expression and se­ cretion of chitobiase and proteins from other sources. 1.5.5 The T5 Early Gene Promoters The E. coli bacteriophage T5 and related phages have been extensively reviewed (146, 147). This coliphage is unique compared with other bacte­ riophages. It injects 7.9% of its 121.3 kbp chromosomal DNA into host E. coli upon attachment to the cell surface, with the remaining 92% DNA injected 3-4 minutes later. Promoters of its early genes are naturally strong. Using filter binding, von Gabain and Bujard (239) measured both the rel­ ative rate of formation and stability of RNA -promoter com­ plexes and demonstrated that the transcriptional activities of promoters, in vitro and in vivo, correlate with the rate of polymerase-promoter com­ plex formation, but not with the stability of the complex. It has also been shown (240) that promoters of T5 early genes are very strong, stronger than those of pre-early and late promoters. Furthermore, T5 early gene promoters are stronger than other promoters tested, including those of 33

bacteriophage X, T7, fd, and of plasmid pML21 and pSClOl. By setting up

a promoter probing system (a plasmid used to select promoters by func­ tion), Gentz and Bujard (72) were able to select 11 highly efficient promot­ ers from T5. It should be pointed out that the structural aspects of promoters have been, and may only be, used to predict, but not to foresee, their functions. The ultimate biological activities of this important control element of a gene may not correlate well enough with these parameters. A synthetic prokaryotic promoter with consensus -10, -35 and optimal spacer length, for example, is only a moderate promoter (216) instead of a strong one as expected, although biologically functional. 1.5.6 Strong Transcription Termination Signals Transcription of a prokaryotic gene by RNA polymerase begins by bind­ ing to the promoter. Once initiated, the newly synthesized transcript is elongated until the enzyme reaches a transcription termination signal, or a terminator. In general, the DNA from the promoter through the terminator defines a transcription unit. Strictly speaking, the transcrip­ tion unit is what is meant by the word 'gene.1 The open reading frame (ORF) is often called a structural gene to signify that it can not be natu­ rally active as a gene because it lacks control elements such as the pro­ moter, RBS and, at least for some genes, a proper terminator. Some genes have other elements such as enhancers and/or operators. In these cases, the transcription unit itself is active, but the biological and biochemical behaviors of these genes are not necessarily retained when mobilized through gene cloning. 34

The terminator is a base-paired hairpin on the newly synthesized RNA molecule. It is formed by GC rich self-complimentary sequences followed by a string of U residues. Nascent RNA spontaneously dissociates from RNA polymerase at the terminator. This prevents read-through of distal region on the DNA template by the RNA polymerase. Some terminators require an ancillary proteinaceous factor called rho and are defined as r h o-dependent terminators. Those without the need of rh o for termination are r/io-independent terminators. Failure to terminate properly is obviously a disadvantage to the organism since it would be a waste of energy. It may have other negative effects if critical elements (origin of replication, promoters, etc.) that happened to be located nearby. The strong T5 early gene promoters, as justified by in vitro assay (239, 240), are attractive. Early attempts in cloning these promoters were un­ successful when the promoter is carried on short inserts. These promot­ ers were successfully cloned, however, when longer fragments that carry both the promoter and a strong transcription termination signal. Based on these and other observations, Gentz et al (73) constructed plasmids bearing two reporter genes ( la cZ ' and tet) with a strong terminator (the fd terminator) in between. Several T5 strong promoters were cloned and screened based on read-through of the terminator region and thus the expression of distal reporter gene tet. 1.5.7 A Strong Tandem Terminator A frequently seen termination signal used is a tandem terminator of the ribosomal RNA (rRNA) operon B of E. coli, or 7twBT1T2, occasionally la­ beled as T1T2 on plasmids. Expression vectors bearing this terminator include the above mentioned pKK series, pGEX series and the pTrc 35 series. On all of these systems, the promoter and terminator are arranged such that the inserted structural gene directed under the promoter is proximal to the terminator. The rrn B operon is one of the seven separate E. coli rRNA operons, named rrnA through rrnG. These operons are not linked on the E. coli chromosome, although the encoded 16S, 22S and 5S rRNAs are homolo­ gous. In addition to rRNAs, these operons also encode transfer RNA (tRNAs) located between these rRNAs. All of these rRNAs and tRNAs are intracistronic, driven by the same tandem promoters PI and P2 (separated by 80 bps) and terminated at a dual distal tandem terminator. Processing of the 30S primary transcript by RNAse III yields the rRNAs and tRNAs (132). The rrnB operon was characterized and completely sequenced in the early 1980's (32, 35). The organization of the operon is similar to the other 6 operons, with gene products located on the operon in the following or­ der: 16S rRNA, tRNAGlu, 23S rRNA, and 5S rRNA followed by rrnBTlT2. This dual terminator is located distal to the rRNA genes. T1 is adjacent to the 3' terminus of the 5S rRNA gene, and T2 is 175 nt further down­ stream. Both T1 and T2 are strongly homologous to known r/io-indepen- dent terminators. Features of the T1T2 include a potential hairpin struc­ ture that ends with stretch of T residues (32). Understanding the operon organization is helpful in practice when some of the elements are in­ volved in a system to be designed. One clear example is the 5S rRNA gene that always precedes T1T2 in expression systems. Since the 5S rRNA re­ gion has no function in terms of transcription termination, it can be deleted, partial or completely, without detrimental effect on the 36 expression of genes upstream, as long as the T1T2 themselves are retained. This idea has been applied to the secretion system construction as part of this project (chapter 4) and proven to be applicable.

1.6 PCR AND RECOMBINANT DNA The concept of the polymerase chain reaction (PCR) was first formulated and reduced to practice in the mid-1980s by and co-workers (164, 165) using the of E. coli DNA polymerase I and was applied to the amplification of human genomic DNA by a group in the Human Genetics Department at Cetus Laboratory (62, 201, 202). This ingenious invention was soon evolved into a powerful, yet simple, tech­ nique along with the discovery of the thermostable Taq DNA polymerase (70) and the introduction of thermocycling equipment (179). The operation of PCR since then became simple enough for end-users equipped with minimal background of molecular biology. The power of this revolution­ ary breakthrough to the recombinant DNA technology could be envi­ sioned from several aspects of the technique. In addition to its simplicity, the reaction is extremely sensitive. One single copy of a target sequence buried in a mass of chromosomal DNA could be amplified. Another as­ pect is that the sequences to be amplified do not need to be known. Sequences flanking the target sequence is all that is required. As is seen below, with slight modifications of the procedure, PCR can also be used to amplify sequences flanking a known region. Since polymerase chain re­ action tolerates mismatches in the primer binding regions away from the 3' ends of primers, mutations could be introduced to essentially any­ where in a given gene. 37

1.6.1 PCR and Beyond The powerful, yet easy to use, polymerase chain reaction has revolution­ ized the world of molecular biology as did the discovery of restriction en­ donucleases. The amplification of specific DNA sequences is only the ba­ sis, but not the final solution, to solve real problems. Many modifications to the original invention have been made that greatly increased the effi­ ciency of the reaction. Simple tagging of primers with restriction sites and control elements has been described in the original publication (164), and is thus considered in this writing as 'common knowledge.' This sec­ tion only concerns major modifications thereafter. Inverse PCR Originally presented by Ochman and co-workers (176), in­ verse PCR was the first method designed to amplify unknown DNA se­ quences flanking a known region. This is a rather theoretical comple­ mentation to the PCR technique, which was designed to amplify se­ quence between known sequences. The method is straightforward. Large DNA molecules such as chromosomal DNA is digested with a restriction enzyme followed by ligation of the digestion mixture. Circular molecules containing the known sequence are the template for the polymerase chain reaction. Primers are designed based on the known region such that they are extended away from each other, and thus the name 'Inverse PCR.' V ectorette PCR Vectorette PCR can be practically considered as a fur­ ther improvement to the Inverse PCR method to increase the specificity and efficiency of the reaction. In amplifying unknown DNA segments adjacent to regions with known sequences, an obvious idea would be to ligate digested chromosomal DNA with a synthetic linker. Two primer 38 binding sites, one in the known region of interest and the other in the linker region, can be used to amplify the unknown region. Since the link­ ers will ligate to both ends of any fragment of the digestion mixture, the resulting PCR products would be a smear because all the chromosomal fragments have the potential to ligate to the linker, and the linker can ligate to both ends of each fragment. Therefore, a primer sequence must be placed on the ends of DNA molecule in such a way that only those fragments containing the target sequence can be amplified. The design of 'vectorette' as a linker (58, 195) in this process solved the problem (Fig. 3). Two oligoes are synthesized such that they base-pair on both ends but not in the central region about 30 nt in length. One of the oligoes is longer than the other such that, upon base-pairing, an over-hang or a sticky end is formed, which is compatible with that produced with a known restric­ tion enzyme that cut the chromosomal DNA (Fig. 3-1). This looped struc­ ture, termed a 'vectorette,' serves as a linker that ligates to the digested chromosomal DNA mixture. The ligation mixture, referred to as a vec­ torette library, is to be used as the template for subsequent PCR. Primers used are based on the target sequence (primer 1) and the central region of the vectorette (primer 2). Primer 2 sequence is the same as one of the two strands and it will not bind the other strand since the two strands are not complementary to each other. The first cycle of PCR using the ligation mixture as templates could only extend from the 3' end of primer 1 that binds to the target sequence (Fig. 3-2). Primer 2 will then bind to the PCR product from the first cycle, and a normal PCR is assumed (Fig. 3- 3). Ligated molecules without the target sequence will not be amplified since neither of the two primers will be able to bind to them. 39

This technique has been applied to several projects, including the order­ ing of overlapping cosmid clones containing the YAC (yeast artificial chromosome) insert sequences (46, 195), and the analysis of chromoso­ mal breakpoints (157). Vectorette PCR is very easy to use and is powerful in increasing reaction specificity and efficiency by preventing amplifica­ tion of nonspecific sequences. One possible reason that it is not widely used today is that for each restriction site to be used for ligation, a separate vectorette needs to be synthesized, which contains two oligonu­ cleotides each longer than 60 bases. In practice, several restriction en­ zymes need to be used to find a proper one. A potential solution is to amplify or digest a multicloning sequence (MCS) from a commonly used cloning vector ( i. e., pBS or pUC18) and lig­ ate it to the synthesized vectorette (Fig. 4). This modified vectorette carry may unique restriction sites and is universal in that digestion of this vec­ torette with one of these restriction enzymes compatible with the one used for the chromosomal DNA has the same output as synthesizing an­ other vectorette. In addition to reduced cost on oligonucleotide synthesis, a major advantage of this vectorette is that several vectorette libraries can be constructed simultaneously and used as templates in PCR's that can be carried out at the same time.

For projects with more demanding usage of this structure, and for com­ mercial purposes, two pairs of perfectly matched oligonucleotides can be synthesized such that, upon annealing, one strand from one pair will base pair with one strand from the other pair to form a vectorette. The two oligonucleotides within each pair are complementary and anneal to form a double stranded form. Appropriate unidirectional ligations of 40

Compatible Overhangs Mismatches / \ Target ^ \ Sequence

Chromosomal Vectorette DNA Fragments

Primer 2 Primer 1 in target Mismatches sequence

Mismatches

Vectorette Vectorette

Primer 1

Primer 2

Fig. 3 Vectorette and its application. 1. Ligation of vectorette and chromosomal DNA fragments. 2. The vectorette library as templates for the first cycle PCR. 3. Subsequent PCR cycles in the same reaction [Adapted from Eales and Stamps (58) and Riley et al. (195)]. 41

Mismatches Kpnl Overhang Kpnl&SacI Fragment X of pBLueScrip KS+, or PCR Fragment

Vectorette

Ligation

Mismatches MCS

a. £ £ ^ S. 5 o O o O LU LU CO

Universal Vectorette

Fig. 4 Construction of a possible universal vectorette. The MCS in this example is a Kpn I & Sac I fragment of pBlueScript KS. The Kpn I overhang on the vectorette is to designed as part of the two oligonu­ cleotides for the vectorette structure. 42 these two fragments into pBS SK+ and pBS SK", respectively, would allow production of a specific strand from each recombinant phagemid from E. coli upon rescue with a helper phage. Annealing of the two ss-DNA re­ sults in large amount of universal vectorette. Since PCR is not sensitive to sequences upstream to the 5' ends of primers, the length of the vec­ torette should not be of concern with regards to efficiency of the method.

PCR Using Degenerate Primer(s) A cloning project very often starts from DNA sequences deduced either from the N-terminal sequence of a protein or from a conserved region of another gene, presumably in the same gene family. In the first case, DNA sequence thus obtained is de­ generate since the genetic code is redundant for all amino acids except methionine and tryptophan. In the second case, identical DNA sequence can not be expected since proteins with same function does not signify identity in their DNA (or protein) sequence. Degenerate primer is a mix­ ture of oligoes that represents all possible sequences. The degeneracy can be reduced to some extent by observing codon usage of other genes in the same organism and further by substituting ambiguous bases with ino- sine (115, 182). The proportion of'true' primer is still very low in the mix­ ture, and addition of large amount, up to 10 pmol in place of the usual 20- 100 pmol, has been used (79). However, since PCR tolerates mismatches in primer binding, some of the sequences in the mixture, although not necessarily perfect, can be used as primers to amplify prospective genes, directly or indirectly. This idea has seen some successful applications even though variability and uncertainty always remain. A DNA frag­ ment generated this way was successfully used as a specific hybridiza­ tion probe for porcin urate oxidase (129). Their degenerate primers were 43 deduced from within a 32 amino acid at the N-terminus of the protein. The short PCR product was then confirmed by sequencing and used suc­ cessfully to isolate a full-length cDNA sequence from a porcine liver cDNA library. Careful PCR parameter determination (annealing temperature and primer sequences in particular) are of critical importance for this kind of amplification since they are based on target sequences that are not avail­ able. Considerable time investment (trial and error) can be alleviated by using a gradient temperature thermocycler. 1.6.2 Recombinant PCR Strictly speaking, recombinant PCR is a derivative of the original poly­ merase chain reaction (164) that results in chimeric DNA molecules. The purpose of an experimental design is often beyond what the ordinary PCR provides, although the underlying principle is the same. The resul­ tant product of the reaction depends on particular problems but in almost all cases this is achieved through careful design of primers. The possibility of combining two DNA fragments through overlapping regions using PCR was pointed out by Mullis et al. (165). The idea was to connect synthetic oligonucleotides to build up longer and longer synthetic DNA molecules than direct chemical synthesis. The same principle was applied in the development of phage displayed single chain variable re­ gion (scFv) technology in which the heavy and light chains for a func­ tional antigen binding fusion protein is displayed on the surface of re­ combinant bacteriophage (100, 162, 262). The sources of the mRNAs for the cDNA templates were either from a hybridoma (43) or from unimmu­ nized human lymphocytes (43, 143). Sequences of Joint-genes and leader 44 regions were used to prime both heavy and light chain variable-genes

(143). A small linker sequence encoding (Gly 4S er)3 (148) has sequences complimentary to relevant ends of the PCR products and was used in subsequent PCRs to assemble the heavy and light chains. The recombinant PCR protocols were introduced with several application potentials by Higuchi (95, 96). Base substitution is a matter of amplifying the fragment separately into two PCR products with overlapping se­ quences. This is achieved by using two overlapping primers: one as the lower primer in the first PCR, the other as the upper primer of the sec­ ond PCR. These overlapping primers are often called the 'inside primers,' or 'internal primers' as they are positioned between the two primers at the ends of the initial fragment. The other two primers are similarly called the 'outside primers.' The two purified PCR products can be extended, when mixed together, in the absence of any primer be­ cause each fragment serves as the primer on the other one that is now the template. These inside primers are where manipulations can be in­ troduced, thanks to the fact that primer binding tolerates mismatches away from the 3' end. In addition to point mutations, desired length of DNA sequence can be deleted using the above protocol with minor modifications. With the 5' ends of inside primers located at the starting points of the sequences to be retained, homologous sequences to the 5' portion of the lower primer of the other PCR can be added. When the amplified products are self-as­ sembled, the homologous region is where connection occurs and the se­ quence between the two inside primers is deleted. 45

Several bases can be added to the middle of the inside primers, and these are base insertions to the original DNA fragment. Since the added bases are not complementary to the template, they will loop put during initial binding. This does not happen during subsequent cycles since newly syn­ thesized DNA molecules are the templates and are complementary to the primers, including the added region. The efficiency of this kind of PCR is generally not a concern. The two PCR products to be connected do not need to be from the same longer fragment. A homologous region added to the 5' ends of the inside primers, sometimes called "add-on" (95), is all that is needed to combine the two PCR fragments. For example, one fragment could be a promoter- less gene from one organism while the other could be the promoter re­ gion from another organism. It should also be clear that the power of re­ combinant PCR is to rearrange DNA sequences. The ultimate expression of a molecule is dependent on the genetic problem itself. One obvious advantage of recombinant PCR is that there is no prerequi­ site, such as restriction site availability, to the manipulations discussed above. The mutations, in a broader sense, can be introduced into virtually anywhere on any DNA molecule, and the connections can be precisely controlled by careful primer design. The disadvantage is intrinsic to the Taq DNA polymerase itself: the fidelity of PCR is considerably lower than th a t of E. coli holoenzyme (the E. coli DNA-dependent DNA polymerase). The in vitro PCR has higher mutation rate than the in vivo amplification of a DNA fragment inserted into a replicon that is replicated by the E. coli holoenzymes. The problem can be alleviated by using that have proof-reading capabilities, such as the V ent DNA polymerase. 46

Functional screening, however, is still necessary unless the entire re­ combinant PCR product is to be confirmed by sequencing. There is also a limit of add-on’s in length. Although add-on's as long as 45 bases have been reported (209), chemical synthesis of longer primer results in, be­ sides higher cost, progressively smaller portion of full length products. Forty-five bases add-on plus the 3' portion (which is the only sequence responsible for primer binding) will add up to more than 60 bases, ap­ proaching the upper limit of chemical DNA synthesis. 1.6.3 PCR Cloning and the Choice of Polymerases With the polymerase chain reaction in mind, essentially any fragment of DNA can be amplified and cloned into anywhere in a vector without con­ cerning the availability of restriction sites. This has made many difficult experimental designs easier and is becoming indispensable to re­ searchers familiar with this technology. One major issue deserves care­ ful consideration if a PCR product is to be cloned: the fidelity of the poly­ m erase. is the first one that introduced the PCR revolu­ tion to average molecular biology laboratories but it is the very one that lacks 3' to 5' exonuclease (proof-reading) activity (70). Primers can be ex­ tended more reliably by other thermostable polymerases with 3'-5' proof­ reading capability, such as the Vent DNA polymerase from Thermococcus litoralis (184), and the P fu DNA polymerase from Pyrococcus furiosus (139), both available commercially. The Vent DNA polymerase was purified from the Thermococcus litoralis, iso­ lated from a submarine thermal vent (19) and capable of growth at up to 98°C (184). 47

By measuring opal codon reversion frequency, it has been shown that the error rate of DNA synthesis by Taq polymerase is 3/10,000 bases, 10 fold higher that observed for V ent polymerase that is in the range of 3- 5/100,000 bases (59, 120, 145). With the average gene size of 2,000 bp, Tag- amplified PCR product does not seem to be a choice for cloning purposes. Even with Ve/iZ-amplified DNA fragments, functional screening is nec­ essary after cloning. CHAPTER 2 SEQUENCING AND SEQUENCE ANALYSIS

2.1 INTRODUCTION Two parallel pathways have been postulated in marine vibrios for catabolism of chitin, possibly containing as many as 6-10 enzymes and a number of chemotactic proteins (273). In the common part of the path­ way, chitin-binding proteins adhere to the substrate (77, 274), and extra­ cellular chitinase (123) and periplasmic chitodextrinase work together to produce jV,iV'-diacetylchitobiose (15). The glycosidase/PTS system cleaves N,N ''-diacetyl chitobiose to GlcNAc in the periplasmic space via a mem­ brane bound chitobiase (15, 220, 265) after which GlcNAc is transported and phosphorylated by the PTS (15). The second, parallel perme- ase/glycosidase system, resembling the E. coli lac permease/beta-galac- tosidase system (275, 276), utilizes an N,N '-diacetylchitobiose permease for transport of this substrate to the cytoplasm. The transported (G1cNAc)2 is cleaved by the cytoplasmic chitobiase reported here (278) and phosphorylated by a ATP-dependent N-acetyl-D-glucosamine kinase (6, 12, 15). The cytoplasmic system works independently of the PTS (15). This report gives the nucleotide sequence and the deduced polypeptide sequence of the gene encoding the cytoplasmic chitobiase (EC 3.2.1.14) from V. parahaem olyticus and shows an ancient evolutionary diver­ gence for this unique beta-hexosaminidase when compared with other chitobiases. 2.2 MATERIALS AND METHODS 2.2.1 Host Bacterial Strains, Vectors and Phages E. coli strains DH5a and DH5aF' were purchased from Gibco Bethesda

Research Laboratories (Gaithersberg, MD). E. coli strain JM101,

48 49 phagemid vectors pBS II KS+ and pBS II SK+ and the interference resis­ tant helper phages VCSM13 ( ka n r) and R408 were from Stratagene (La

Jolla, CA). The plasmid harboring the V. parahaemolyticus chitobiase gene, PC120, was constructed in this laboratory and has been described previously (278). 2.2.1 Enzymes, Chemicals and Antibiotics Restriction endonucleases, T4 DNA ligase, T4 DNA polymerase were purchased from Gibco Bethesda Research Laboratories (Gaitherberg, MD), New England Biolabs, Inc. (Beverly, MA) or United States Biochemicals Corp. (Cleveland, OH). T7 DNA polymerase DNA sequenc­ ing kit (Sequenase 2.0) was from United States Biochemicals Corp. (Cleveland, OH). 35S-deoxyadenosine 5'-[alpha-thio]-triphosphate was from Du Pont Co.-NEN (Boston, MA). Ampicilin was from Sigma

Chemical, Co. (St. Louis, MO). Oligonucleotides used as sequencing primers were synthesized on an automated DNA/RNA synthesizer (Applied Biosystems, Model abi 394) in GeneLab, School of Veterinary Medicine, Louisiana State University (Baton Rouge, LA). 2.2.2 Construction of Subclones for Sequencing The chitobiase gene harboring plasmid PC 120 (278) was digested with the restriction enzymes Pst I, Sac I and H ind III, respectively. The 1.6 kbp, 2.1 kbp and 3.5 kbp fragments were gel purified and cloned into appropri­ ate sites of pBS II SK+ vector for single strand DNA production. Insert orientations of the clones were identified by digestion with appropriate restriction enzymes that cut the inserts asymmetrically (H ind III for the 2.1 kbp fragment and EcoR V for the 3.5 kbp fragment), and the clones thus obtained were named as SKS162, SKP21-1, SKP21-2, SKH35-1 and 50

SKH35-2 (Fig. 5). The protocol for rescuing recombinant phagemid using VCSM13 was from Stratagene (La Jolla, CA) (except that E. coli D H 5aF' instead of XLl-Blue, Luria Bertanic medium (LB medium) instead of Super-Broth, were used). 2.2.3 Sequence Analysis Sequencing gel data were assembled and analyzed using Staden’s algo­ rithm (221), an integrated part of the GCG software, Unix® version 8.0 (Genetic Computing Group, Madison, WI). The identified chitobiase open reading frame (ORF) and the predicted polypeptide sequence were then used as primary query sequences to search available nucleic acid and protein depository databases. 2.3 RESULTS 2.3.1 Nucleotide Sequence Determination1 The sequencing clones and the orientations of the inserts are shown in Fig. 5-1. The two phagemids, SK21-1 and SK21-2, contain the 2.1 kbp Pst I fragments but in opposite orientation. Both SKS162 and SKH351 overlap with the above two clones to facilitate assembly of sequencing gel data. The restriction map of PC 120 as determined previously (278) and from the sequence data is depicted in Fig. 5-2. Using the Staden algorithm (221), a single ORF was identified from sequencing data from the clones shown in Fig. 5. The area sequenced and the coding region of V. para- haemolyticus chitobiase are also shown. The 47 amino acid residues at the N-terminus of the polypeptide are located between the Pst I and Sac I restriction sites, indicating that expression of the chitobiase gene on

1 The DNA sequence presented here has been deposited with the GenBank/EMBL (21) sequence data bank and is available under accession number U24658 (VPCHB). 51

PC 120 is driven by the V. parahaemolyticus chitobiase gene promoter in­ stead of the lac promoter on pUC18, the parental cloning vector. The nu­ cleotide sequence and the deduced amino acid sequence of the gene are shown in Fig. 6. The N-terminal sequence of the polypeptide from Edman degradation (278) perfectly matched that deduced from the DNA se­ quence, suggesting the lack of a signal peptide at the N-terminus. This agrees with the cytoplasmic localization of the enzyme. By attaching a signal peptide upstream of the first initiator AUG me­ thionine codon, this enzyme has been shown to secrete across the two membranes of the Gram negative E. coli (Chapter 3), further indicating that the wild type chitobiase of V. parahaem olyticus is cytoplasmic. The predicted molecular weight of the deduced 741 amino acid polypeptide is 85 kDa, agreeing with that determined by SDS-PAGE (278). 2.3.2 Homology with Other Chitobiases The deduced polypeptide sequence of V. parahaem olyticus was used as a query sequence to search the GenBank/EMBL (21) and Swiss-Prot genetic data bases using GCG program (FastA) based on the algorithm of Pearson and Lipman (183). Limited homologies, aligning in a 60 amino acid area of a composite map from residues 341-400, were found between this enzyme and those of V. harveyi, an outer membrane protein (110, 220) and V. vulnificus (218). Except for lysosomal chitobiases from hu­ man and rabbit which cleave (GlcNAc )2 from the reducing end and are related to each other (66), a remarkable homology was found among 31 amino acids out of 930 to all sequenced beta-hexosaminidases in the composite map locations 296-325, 341-359, 380-400 and 358-465 (Fig. 7). Particularly interesting in this region is position 359, at which an 52

^ " SKS162 SKH351 ''

SKP21-1

SK P21-2

Sc H P S I i i ______I m m m : i I l PC120 500 bp

Fig. 5 Restriction map and nucleotide sequencing strategy of the sequenced region containing the AT,iV-diacetylchitobiase gene from V. parahaemolyticus. 1. Inserts in subclones used to gen­ erate single stranded DNA for sequencing. Arrows represent insert ori­ entation relative to the chromosomal DNA fragment in PC120 (278). 2. Restriction sites used to generate the inserts shown above and the se­ quencing strategy. Arrows below the restriction map are individual se­ quencing gel data used in sequence assembly. Sequence presented in this paper is boxed. The solid portion indicates the chitobiase ORF. Only re­ striction sites used for subcloning are shown. P, Pst I; S, S a l I; Sc, Sac I; H, H ind III. 53

TGGAGCTCACGCGTGCGCCGCTCTAGACTAGTGG -177

ATCCCCGGCTGCAGCTGATATAGAAATGTTTCTCGTGGCATTAAACAAGCGCGGCGCAACGCGTATTTGTTTAATGGGC -9 8

-35 AGCATTGCTGAACGTATTTTACGTTGGCTTTCGCACCAGTTCAGCAATGGGTTGTTGAGCACAGTTTGACGCGATTGGG -19

-10 +1 SD CGC AAT TAT GTT GCA GGT AAC TGA CAT ACT TTA CTC GCT CAG TGC GGC GTA GGA GTA GAT 42

ATG GAA TAT CGT GTT GAT CTC GTC GTC CTA TCA GAA CAA AAG CAA AAC TGC CGT TTC GGA 102 Met Glu Tvr Ara Val Asd Leu V al Val Leu Ser Glu Gin Lvs. Gin Asn Cvs Ara Phe G lv 20

CTG ACT TTC CAT AAT TTG AGC GAT CAA GAT CTC CAC AAT TGG AGC CTG ATT TTT GCT TTT 162 Ih ?U T hr Phe His Asn Leu S e r Asd G in a sp Leu H is Asn T m S er Leu l i e Phe A la Phe 40

GAT CGC TAC ATT CTG CCG GAT AGT ATT TCG AAT GGT CAG CTC AAG CAA ATT GGC AGC TAC 222 Asd A ra T vr H e Leu P ro Asd S e r l i e S er Asn G ly G in Leu Lys G in l i e G ly S er T yr 60

TGC ACC CTC AAA CCA GAA GGG TTG GTG CTG GCA GCT AAC CAT CAT TTT TAC TGC GAG TTC 282 Cys T hr Leu Lys P ro G lu G ly Leu V al Leu A la A la Asn H is H is Phe T yr Cys G lu Phe 80

AGT ATT GGT TCG AAC CCA TTC CGT TAC TAT TCT GAT GGA TTC AAT GAA GCC TTG GTC AAC 342 S e r H e G ly S e r Asn P ro Phe A rg T yr T yr S er Asp G ly Phe Asn G lu A la Leu V al Asn 100

TTT GAA GTC AAC GGC AAC CTT CAG CGA GCT CAA GTC GAT GTC ACG CCG ATC GTA TTG GCT 402 Phe G lu V al Asn G ly Asn Leu G in Arg A la G in V al Asp V al T hr P ro l i e V al Leu A la 120

TCA CCG TAC CGT GAA CGT AGT GAG ATC CCT TCC AGC TTG ACG CAT GCG CAC GCT TTG TTG 462 S e r P ro T yr A rg G lu A rg S e r G lu l i e Pro S er S er Leu T hr H is A la H is A la Leu Leu 140

CCA AAA CCA AAC CAT ATA GAA GTC AGC GAT CAC TGC TTT AGC TTT AAT CAT CAC GCC GGC 522 P ro Lys P ro Asn H is l i e G lu V al S er Asp H is Cys Phe S er Phe Asn H is H is A la G ly 160

GTT GCG GTT TAT TCA AAC CTA GCC AAT TCA GCT AAA GAG TGG TTA CTT GAA GAG CTT AAG 582 V al A la V al T yr S er Asn Leu A la Asn S er A la Lys Glu T rp Leu Leu G lu G lu Leu Lys 180

CGC ATT CAT CAA TTT GAG TTC GCA TCA GAC AAT GGC AGT CAG ATC ATC TTC AAA GGC AAC 642 A rg l i e H is G in Phe G lu Phe A la S er Asp Asn G ly S er G in l i e l i e Phe Lys G ly Asn 200

CCA ACC TTG GAT GAA GGC GCT TAC AAG CTG AAG GTA GCA GAA GAG TCG ATC AAA ATT GAA 702 P ro T hr Leu Asp G lu G ly A la T yr Lys Leu Lys V al A la Glu G lu S er l i e Lys l i e G lu 220

GCA GGC TCT TCG TCT GGT TTT ACC CAT GCT TGT GCA ACG TTG TTG CAA CTG ATC AAA GTT 762 A la G ly S e r S er S e r G ly Phe T hr H is A la Cys A la T hr Leu Leu G in Leu l i e Lys V al 240

Fig. 6 Nucleotide sequence of the chitobiase gene from V. p a r a ­ haemolyticus. Presented in the figure is the coding strand (non-tem­ plate strand) of the DNA. Putative regulatory elements and the N-termi- nal sequence of the predicted polypeptide as determined by Edman Degradation (278) are underlined. Numbering of nucleotides is based on the putative transcription start site (+1). -35 and -10: -35 and -10 regions of the promoter, respectively; SD: Shine-Dalgarno site or ribosome binding site (RBS). Note the three consecutive translation Opal stop codons (TGA) at the end of the ORF.

(Fig. con'd) 54

GGC GAT CAA CCA GCC TCA ATG GAA GTG GTT TGC TGT TCA ATC AAA GAC CGA CCA CGT TTT 822 G ly Asp G in P ro A la S e r Met G lu V al V al Cys Cys S er l i e Lys Asp Arg P ro Arg Phe 260

CGT TAC CGC GGT ATG ATG CTA GAT TGT GCT CGC CAT TTT CAC TCC GTT GAG CAA GTC AAA 882 A rg T yr A rg G ly Met Met Leu Asp Cys A la Arg H is Phe H is S er V al G lu G in V al Lys 280

CGT TTG ATC AAC CAG TTG GCT CAC TAC AAG TTC AAT ACA TTC CAT TGG CAC CTT ACC GAT 942 A rg Leu l i e Asn G in Leu A la H is T yr Lys Phe Asn T hr Phe H is T rp H is Leu T hr Asp 300

GAT GAA GGT TGG CGA ATT GAG ATC AAA TCA TTG CCT CAA CTA ACC GAT ATT GGC GCA TGG 1002 Asp G lu G ly T rp Arg l i e G lu l i e Lys S er Leu Pro G in Leu T hr Asp l i e G ly A la T rp 320

CGT GGG TTG GAT GAA ACC AAT GAG CCA CAG TAC TCG CAC CTT GCT GAG CGG TTA CGG CGG 1062 A rg G ly Leu Asp G lu Thr Asn G lu Pro Gin T yr S er H is Leu A la G lu A rg Leu A rg A rg 340

TTT TTA CAC TCA AGA AGA CAT CAA AGA CGT GGT TGC CTT TGC TTC GAA ACG AGG CAT CAC 1122 Phe Leu H is S e r Arg Arg H is G in Arg Arg G ly Cys Leu Cys Phe G lu T hr A rg H is H is 360

TGT TAT CCT GAA ATC GAT GTA CCA GGG CAC TGC CGA GCT GCC ATC AAG TCG TTA CCA CAC 1182 Cys T yr P ro G lu l i e Asp V al P ro G ly H is Cys Arg A la A la l i e Lys S er Leu Pro H is 380

CTA TTG GTA GAA GCA GAA GAT ACC ACC GAA TAC CGC AGC ATT CAG CAT TAC AAC GAC AAC 1242 Leu Leu V al G lu A la G lu Asp T hr T hr Glu Tyr Arg S er l i e G in H is Tyr Asn Asp Asn 400

AAA GTT TTG GAA GAA GTC TCG CGT TGT GTC ATT AAC CCA GCT CTG CCG GGG AGC TAT GAG 1302 V al l i e Asn P ro A la Leu P ro G ly S er Tyr G lu Phe l i e Asp Lys V al Leu G lu G lu V al 420

TTT ATC GAT TCC CTG CCC CTT ATG TTC ATA TCG GTG CGG ATA AGT ACT AAC GGC GTA TGG 1362 S e r A rg Cys S e r Leu Pro Leu Met Phe l i e S er V al Arg l i e S er T hr Asn G ly V al T rp 440

TCA AAA AGC CCT GCA TGC CAA GCA CTA ATG GAA CAA CTG GGT TAC AGC GAC TAC AAA GAG 1422 S e r Lys S e r P ro A la Cys G in A la Leu Met Glu G in Leu G ly T yr S e r Asp T y r Lys G lu 460

TTA CAA GGG CAC TTC TTG CGT CAT GCC GAA GAC AAA CTG CGC AAA CTT GGC AAG CGC ATG 1482 Leu G in G ly H is Phe Leu Arg H is A la G lu Asp Lys Leu Arg Lys Leu G ly Lys A rg M et 480

CTG GGT TGG GAA GAA GCA CAG CAT GGC GAC AAA GTC AGC AAA GAC ACA GTG ATC. TAT TCG 1542 Leu G ly T rp G lu G lu A la G in H is G ly Asp Lys V al S er Lys Asp T hr V al l i e T yr S e r 500

TGG TTA AGC GAA GAA GCG GCG TTG AAC TGC GCG CGC CAA GGT TTC GAT GTG GTG CTA CAA 1602 T rp Leu S e r G lu G lu A la A la Leu Asn Cys A la Arg G in G ly Phe Asp V al V al Leu G in 520

CCT GCG CAA ACC ACC TAC TTA GAT ATG ACC CAA GAT TAC GCA CCA GAA GAA CCG GGC GTG 1662 P ro A la G in T hr T hr T yr Leu Asp Met Thr G in Asp T yr A la Pro G lu G lu P ro G ly V al 540

GAT TGG GCT AAC CCA TTG CCG CTA GAA AAA GCT TAC AAC TAT GAA CCA CTC GCT GAA GTG 1722 Asp T rp A la Asn Pro Leu P ro Leu G lu Lys A la T yr Asn T yr G lu P ro Leu A la G lu V al 560

CCA GCC GAC GAT CCA ATA CGT AAA CGC ATT TGG GGC ATT CAA ACA GCA TTG TGG TGC GAA 1782 P ro A la Asp Asp Pro H e Arg Lys Arg l i e Trp G ly l i e G in T hr A la Leu T rp Cys G lu 580

ATC ATC AAC AAC CAG TCT CGT ATG GAC TAC ATG GTC TTC CCG CGC TTA ACC GCA ATG GCA 1842 l i e l i e Asn Asn G in S er Arg Met Asp Tyr Met V al Phe P ro Arg Leu T hr A la Met A la 600

GAA GCA TGT TGG ACA GAC AAG CAA CAC CGA GAC TGG ACC GAC TAT TTA TCA CGT TTG AAA 1902 G lu A la Cys T rp T hr Asp Lys G in H is Arg Asp T rp T hr Asp T yr Leu S e r A rg Leu Lys 620

GGA CAC CTA CCG CTG CTT GAT TTG CAG GGA GTG AAT TAC CGT AAC CGT GGA AGT AAT ACA 1962 G ly H is Leu P ro Leu Leu Asp Leu G in G ly V al Asn T yr Arg Asn Arg G ly S er Asn T hr 640

GAG CAT TGT AGT AGA AGC ATC ACG CTT GAA GAG TTT TTA AAT TTT GGC TGC AGA CGC AGC 2022 G lu H is Cys S e r A rg S er l i e T hr Leu G lu G lu Phe Leu Asn Phe G ly Cys A rg Arg S e r 660

TTT GTA AAA AGG AAT ACA CAA ATG AAA TAC GGC TAT TTC GAT AAC GAG AAT CGT GAA TAC 2082 Phe V al Lys A rg Asn T hr G in Met Lys Tyr G ly T yr Phe Asp Asn G lu Asn A rg G lu T yr 680 (Fig. con'd) 55

GTC ATT ACT CGC CCT GAT GTA CCT GCT CCT TGG ACC AAC TAC CTA GGT ACA GAA AAA TTC Val He Thr Arg Pro Asp Val P ro A la P ro Trp Thr Asn Tyr Leu Gly Thr G lu Lys Phe 700

TGT ACC GTT ATC TCG CAT AAC GCA GGT GGC TAT TCG TTC TAC AAC TCT CCA GAA TAC AAC 2202 Cys T hr V al lie Ser His Asn A la G ly G ly Tyr S er Phe T yr Asn S er Pro G lu T yr Asn 720

CGT GTT ACT AAG TTC CGT CCA AAT GCG ACA TTT CGA TCG CCC AGG ACA CTA CGT TTA CCT 2262 Arg Val Thr Lys Phe Arg P ro Asn A la T hr Phe Arg Ser Pro Arg Thr Leu Arg Leu P ro 740

ACG TGA TGA TGA GACGGGAGATTACGGTCAATCTCTTGGCAACCAGTTGCAAAAGCCTAGACGAAGCGAACTACG 2337 Thr 1ES XEE 2EE 741

AAGTTCGTCATGGTTTGTCGTACTCTAAATTCAAGTGTGAATACAGCGGCATTAGCGCAACCAAAACGCTCTTTGTACC 2416

AAAAGGCGAAGATGCAGAAATTTGGGATGTGGTCATCAAGAACACCTCTGACAAACCGCGTACGATCAGTGCATTCTCA 2495

TTTGTTGAGTTCTCGTTCAGCCACATTCAATCAGATAACCAAAACCACCAGATGTCTCTCTACTCTGCTGGTACGTCAT 2574

ACAACACAGGCGTGTTGAATACGACCTGTACTACAACACTAACGATTCGAAGGCTTCTACTACCTAGCGTCAACGTTTG 2653

ATCCAGATTCATACGACGGTCCAACGTGATAGCTTCTAGGTCTATACCGCGACGAAGCAAACCCACTAGCAGAGTAGAA 2732

CAGGTAAGTGTTTACACGCACAACGTGTTACACCACTGTGGCTTTGCACGCATT 2786 56

80 90 100 110 120 13 0 v p c h b . gYCEFSIGSNPFRYYSDGFNEALgNFEVNGiSjQSAQVraVTPIVLSSPYRjpRSipIPSSLTHJlP£

v h c h b , gV^SAPNAEPKMIASLNTEDVASFSTGLEGN^aSKj^TPpmrcT .T?f;Mh^^lrtTPri(ilKrmrPAN2 7SRFij iDLATQDV 150 160 170 180 190 200

140 150 160 170 180 190 v p c h b . AHAgBKBNiDlSvSDHCFSFNHHA^ySVYSNLANSAKEWLLEELKRIHQFEFASiSjNGSQH

v h c h b STTj2B,il3MisM3AGKG- -KVDIAD0l2LPKDAFDATQFAAIQDRAEWGVDVRGi2LPVs2 210 220 230 240 250 260

200 210 220 230 240 250 v p c h b . !tf®NPTLDEjp^LKVAEES5

v h c h b . TWPADS t S e LA-KS^^|EMSIKGDGUVHK^FDQASAFY2 v QS I FGj^VD- SQNAD^bPQL 270 280 290 300 310 320

26 0 27 0 280 290 300 310 v p c h b . VT:nWRLII^L^tWFi^lTFi^ i^ » l» l^ ^ li^:E v h c h b . 'kdailatldS iaa52L!33pG 35 0 36 0 3 7 0 38 0

3 2 0 33 0 340 3 5 0 3 6 0 v p c h b . pQ2i5DIS3i^i}3L!jlETNEP-QYSH3AE--RLRRj3LHS--RRHQRRGCipF-ETijjK-HCY0!

v h c h b . 3E3SsvySN^CF!»}TOEKSCLLPOifl3SGPTTDNaGSGYFSKADYVEIHKYAKAt3NIEVlB 390 - 400 410 420 430 440

370 380 390 400 410 v p c h b . EM3315ci2 p IK§ L PH3LVgA--EDTT[35i3SIQ-HYND22|lNPALPGSYEaSDKV L: 1: 1 Ill: :r l: ::: ilu :: : : : : 11: v h c h b . 2Mj3AaA22SvV@MEARYDRii5MEiSGKEAEANiaSLMDPQDTS22TTVQFYNKQSi3|!

420 4 3 0 440 450 4 6 0 4 7 0 v p c h b . LfjjjEVSSCSLPLMFISVRISTNGWSKSPACQALMjgQLGYSDYKELgjGHFLRHAEDKLRKLiYKEI

v h c h b . mQ s s t jIjfvdkvisevaahhqeagaplttwhfggd S akn t k l g a g f Hdvnaedkvswkgti 510 520 530 540 550 560

Fig. 7 Amino acid alignment of three vibrio chitobiases Shown are regions displayed by the FastA program. Numbers above or beneath sequences indicate positions of amino acids in the original polypeptides, starting from the amino termini. 1. vpchb vs. vhchb, 28.6% identity in 336 amino acid overlap; 2. vpchb vs. wchb, 31.7% identity in 284 amino acid overlap; 3. vhchb vs. wchb, 34.4% identity in 785 amino acid overlap, vhchb: V”. harveyi chitobiase (220); wchb: V. vu ln ificu s beta-hex- osaminidase (218); vpchb: V. parahaemolyticus chitobiase, this study. (Fig. con'd) 57

2

80 90 100 110 120 130 v p c h b . FYCEFSIGSNPFRYYSDGFNEALWFEVNGNfflQjjjjAQVM^SlVLS-SPjJjRERgEIPSSLT

w c h b . RI^ILNTVPIDAVHXTEEVSGFTTGIKHTPNQ2Ki2TAN5lLLgAAT^TTR3EQYgKVKE'VKDLGA 9 040 1 5 0 160 170 180 19 014

140 150 160 170 180 190 v p c h b . H2HALLPKjjj!'IKirayHD3CF^Fi|iHKAgVAVYp\rLANSAKEWLLSEi|^:-raiHQFEFASDI\lg-

w c h b . D^SAHILSTPlJsrHVSjEGHljSlAQ^INIVHDALPADQ VgP^FgFETLGVNTGTEv 200 210 220 230 240 250

200 210 220 2 3 0 2 4 0 2 5 0 v p c h b . SQIIFjjGNPTLDEEAQKj^^AEESgJsgEAGSSSjjjjFTHACATjjLQiJlKijgDQPASMEgVC2IIFjj3NPTLDEEAMK3K^AEES[3K{

w c h b . pvxjvtiSadsskksbs2t[Jd2tssgBr6vgvdka0afygvqs!“ FYG\ ^GVQSn^GjSVTi jKDTIN-QQ-- 260 270 280 29029C 3 0 0

2 7 0 2 8 0 2 9 0 3 0 0 ______3 1 0 v p c h b . ’Hwr-I?Ti ») ;11 faflK sii

w c h b . MtwsfitoiuKCteialT .TC^rhFT,i7Wt^Ar^l^iKr5r?!Fi;n^reTS73TeiT7^T ,E ML ta. 3^ 3 6 0 ■ 4G2

320 330 340 350 360 v p c h b . ■jQ23Dl[pWjpL[|;-ip,iSEPQYSHS--AER[pRRFLHS--R3HQRRGC3CFETRH--HCY^E

w c h b . aEiEaV^HgCHEvjaQiJiKCMMPQiiGSGAEjJPNNGSGYYTgEDYKEllJ^YASARNIQVlBs 370 380 390 400 410 420

370 380 390 400 410 420 v p c h b . l5M ^CRj22l2^PHLLVEAEDTTEYRS I QHgNDNVlgPALPGSYEF IDKVLEEVSRCS w c h b . MjSMS2|3SL2Sv|ggMEARYRKFMAEGDVVKAEMaLLSDpSDTTQYYSIQHYQDNTINPCME 430 440 450 460 470 480

(Fig. con'd) 58

30 40 50 60 70 80 v h c h b . VNSLADNLDIQYEVTONHGANEGLACQDMGgEffiASgNKKNMTBvSjQEEAVDSKDW; w c h b . IDQKDVDYAAKNLKjafrsLVANKPKDCPPEgpJ^A^R^II^SlIfiSKSLNENVEl 10 20 30 40 50 60

90 1 0 0 110 120 130 140 v h c h b . S 3 r l i 0 dvdn S o S 3 i s r :e w l p l v g e y E q l f e t E w c h b . RTfflGSKS :TKSFQVDFMNjJjlVSNS__ 110 120 vhchb. FESAPN w ch b . YQASEHLgGRNgLNTVPI IKHTPj5Q22j BLLPA 13 0 140 v h c h b w c h b . YS (?LNFSF

2 6 0 v h c h b . DVRGDLGJS s I w c h b . NTGTG IK ADSS TVG 2 5 0 260 29 0 30 0

32 0 33 0 340 350 360 370 v h c h b . NAJBSLPHl I 2H^DAILATTOTSS5H2LiipL^ w chb. K-RriNK MHMj^Sj3S3iS5ELVFRFiij[3«ii£ES3kMtl!!Fu3F 310 320 330 340 350

3 8 0 390 v h c h b . 3pEE» FRTOEKSHLlJ3»Weg!«|PTTD'SF w c h b . jv e q n SESsaelp TRES 3 6 0 380

4 4 0 v h c h b . ISHe EFE e i ! ^il3»^rV4DRD^Et5glKEAEPWF •0SNVTTVWF0NKC w c h b . [_ DWKSEMQLBSSMS^QYYS Ip 4 2 0

50 0 510 530 540 550 vhchb. SFHJSSSI^Er 'WgiFHGjpAKNIKLGAGFQDVNAEDKH = i n mi 11 wchb. T-g2j2»j2jaEHF jINKLig! DYffllHAffraTAGAW— GDSPECR-KMFffl 4 8 0 490 500 510 520 530

(Fig. con'd) 59

56 0 570 580 590 600 610 v h c h b SWKGTIDLSf5QDKPFAQSPQCQT[j|lTDB-ijjlVSDFAHLP^i!FgEE\ (IVAEKGIP- -NF w c h b APESGVKNAjgDINGYFINRISHljflM IGAWNDGLSjteSLDAHSLAGNPPKAWVWGT 540 550 560 570 580 590

620 630 640 650 660 v h c h b QAjjjQDGLKjSSD - GE0AF ATEN^RVN— FW^OoH-WGGTSSVEjEffiSKKGYDVIVSNPDYVY w c h b MFgjGGVDQaNSFANjgGYDVwSPPDAYYFjijMPaENDPEERGBYBiATR-FNDTKKVFSFMP 600 610 620 630 640 650

670 680 690 700 710 720 v h c h b . MDM0YE2iDPKERGYYWATRgnDTRipMFGraAPENMPQNA{p3sVDRDGNGFTGKGEIE2KPF w c h b . ENYlWBEWMTDRMGAKISEvSTGEOTHDraLGVOGALWStkraiRTDAOTOyMVLPRMlflVAE 660 670 680 690 700 710

730 740 750 760 770 780 v h c h b , YELSAQL3is[3TVRNDEQHEYM^FPR VLAAAQR0WHRSp5|jENDYKVGVE^SQNSIS w c h b . RBWHKASi!iEf3|EHKEGITfcsNKDGHEGTTHLNDNlPTRDBBBiAH-F.qNTT,GWKEMPKrtl-rt! 720 730 740 750 760 770

790 800 810 820 830 840 v h c h b , IffiS LMOD YNRFANi?iLGOREff AKLEKSG0DSR0PVPGAKVEgGKLAMNHQFPGVTLQY w c h b , S22gityrlpvlga S i k n n i [2d w t e f h g v a S q 2 s 2 dgktwhky @d t k k p q S s t k a l v r s v 780 790 800 810 820 830

850 860 870 880 v h c h b , gLDgENWLTYADNgRPNVTGEVFIRSVSATGEKVSRITSVK w c h b rNiaRTGRAVEVLSKi 840 60

1 ------hschb

rnchb

'hshxa

■mmhxa

•hshxb

•mmhxb

■ddhxa

•vhchb

•wchb

•vpchb

Fig. 8 Amino acid sequence pileup of chitobiases and beta-hex- osaminidases from 10 organisms. 1. Dendrogram of aligned amino acid sequences. Distance along the horizontal axis is proportional to the difference between sequences. 2. Amino acid sequence alignment. Numbers above the sequences represent positions in the alignment, not for individual polypeptide. Highlighted residues are identical to those in vpchb. The boxed arginine residues at position 359 include R178 of hshxa and R211 of hshxb, respectively (36). The arginine position in the chitobi­ ase polypeptide is 271. hschb: H. sapiens chitobiase (66); rnchb: R a ttu s norvegcus chitobiase (66); hshxa and hshxb: alpha- and beta-polypeptides, respectively, of H. sapiens beta-hexosaminidase (116); mmhxa and mmhxb: alpha- and beta-polypeptide, respectively, of Mus musculus beta- hexosaminidase (16, 269); ddhxa: Dictyostelium discoideum beta-hex­ osaminidase (82); vhchb: V. harveyi chitobiase (220); wchb: V. vulnificus beta-hexosaminidase (218); vpchb: V". parahaemolyticus chitobiase, this study.

(Fig. con'd) 61

50 h s c h b r n c h b h s h x a mmhxa h s h x b mmhxb d d h x a v h c h b MLKHSLIAAS VITTLAGCSS LQSSEQQWN SLADNLDIQ0 e Q l TNHGANE w c h b ...... MAS D I ...... DQKDVD YAAKNL. . . . KLTTSLVANK v p c h b ...... MEj3 rQDLWLSEQ

51 100 h s c h b r n c h b h s h x a mmhxa h s h x b mmhxb d d h x a v h c h b DMGAE WASCNKVNM2 LVjjQGEAVDS KDWAIYFjjjSI R20LDV0NEQ w c h b PK iPP.EA P WGACYfjjVEIN LeStG SKSLN EN V EIY FSSI HRTLGSKSEE v p c h b KQNJj ...... [gJf F g G l I.nS FH'Sf h S tl. sc d q d ...... l SJnw s ;H f a f [3r y i

101 1 5 0 h s c h b r n c h b h s h x a mmhxa h s h x b • MEL mmhxb d d h x a v h c h b FK lgR ' HKLEPTDKFD GFAAGjgEVkgg PLVGEYWQLHj ETDF; w c h b FKVSHIi HKgTTTEKFjg GLKGGKTKSF QVDFMNWIVS NSDF! v n c h b LPDglS; KQ^GSYCTLa P0GL52 AANHHFYCE0 SIGSN0F.

151 200 h s c h b . MSR r n c h b h s h x a ... .MTSSRL WFSLLLAAAF AGR...... A mmhxa . . . . MAGCRL WVSLLLAAAffl ACL...... A h s h x b CGLGLPRPPM LLALLLATL3 AAMLALLTQV ALWQVAEAA [jEJPSVSAKPG mmhxb . . . . MPQSPR SAPGLL. . Lg QALVSLVS . L ALV...... APARLQ d d h x a .MIKKIIL FFAVLIAIVI G ...... QQP v h c h b gAPNAgPKMI ...... A S0 NTEDHASFVT GLE . . GP w c h b ASEHLgGRNI LNTVPIDAVH ITEgSsgFTT GIKHTPN q E | P: v p c h b gDGFNQA...... 2 VNF|= wQ' IV

201 2 5 0 h s c h b PQLRRWSLVg SPPSGVPGL2 LLAjgLALLAL RLAAGTDCPC PEPELCRPIR r n c h b MALSDL0ELT LLjaabPLLE. RLgAE.DCPC SEjiJSLCRPIR h s h x a TgLWQWPQNF QTSDQRYVLY PNNFQFQYDV SSAAQPGCSV l d e a f q r Q rd mmhxa TSLWjWPQYI QTYHRRY |3l y PNNFQFRY0V SSAAQGGCW LDEAFRRpRgj h s h x b PgLW; LPLSV KMTg lS PENFYISHSP NSTAGPSCTL LEEAFRR?HG mmhxb PgLW;FPRSV QMF§ IS AEDFSIDHSP NSTAGPSCSL LQEAFRR?Y® d d h x a ^ P Q Q V S0GTCVIPV2 P G S IL IE S N 0 SSAT...... FSVSMDRghS v h c h b KN. EDLA0QD v s t t l l T IWiEAGKGKVD IADglSLPKD w c h b KV. KDLGADjjJ VSAHIL ;J3 IAQ@INI\ v p c h b E0.gSS|53H2 HAj2gPK HP

(Fig. con'd) 2 5 1 3 0 0 h s c h b HHPDFEVF.. . . . VFDVGQK TWKSYDWSQI TTVATFGKYD r n c h b HHRDFEVF.. . . .VFDVGQK TWKSYDWSQI TTVAVFGKYD h s h x a 0 l :FGSGSjjfjPR PY0TGKRHTL EKNVLW SW TP0CNQj3PTL ESVENQTjjTI mmhxa Bl:FGSGSgPR PSFSNKQ0TL GKNXLW SW TAECNEFPNL ESI h s h x b y x f g f y k |Shh 0PAEFQAKTQ VQQLLVP QSECDAFPNI SSDES? mmhxb YVFGFYKRHH GPARFRAEAQ LQKLLV? ESECESFPSL SSDET? d d h x a 0FFPFSN ESE PSSN ...... E^FLLg YSDDE02Q. L GIDES? v h c h b AFDATQFAAI QDRAEWGVD VRGQLPVSj w p a d f t g T w c h b ALPADQVEAfl NFRFETLGVN TGTGVPVNVT Igi v p c h b 2Jansake 2]l3 0eE2krih 0fe FAg2NG§Qg] Fiv

3 0 1 3 5 0 h s c h b SgLMCYAHSK GARWLKGDV Sij D. PA FRggWIAQKL NLAKTQYMD. r n c h b SgLMCYAHSK GARWLKGDV frjS w ia q k v ALAKAQHMD■ h s h x a NDDQCLLLSE VWKS AEGTFFINKT mmhxa NDDQCLLASE RGLE VWKS AEGTFFINKT h s h x b KgPVAVLKj RGLE SYGTFTINES mmhxb q B p v a v l k I RGLE SFGTFTINES d d h x a EQGgYQLKgT GLE len Sysi EBv v h c h b kgdgBvOksf YjjJVQ S IF ADSLPQL... w c h b TSSG' YGVQ S ■DTINQS - • • v p c h b gSSgFTH2CA qpJSmeEBSc

4 0 0 h s c h b IEQE VNCLSPEYDA LT ETTD E I.E G S QVTFDVAWSH r n c h b IEQE VDCSSPEYEA LT E I.E G S QVTFDVAWSF h s h x a ' YLPLSSILDT LD SFPY gSFTF mmhxa YLPLSSILDT LD SFPYQSFTF h s h x b YLPflKIILKT LD SFPY Q SITF mmhxb LLp S k TIFK T LD SFPYQSTTF d d h x a YIPKNMILHM ■V AFPV0STTYL v h c h b DAILAT LD jEgLfj w c h b ^L S F S F LD v p c h b ______2qBkHL ™ i!5wKsn

4 0 1 4 5 0 h s c h b KNlRRRCYN...... YTGIAD r n c h b KGinjjKRCYN ...... YTGIAD h s h x a ISY NP V ...... THIYTAQ DVKEVISYAR mm hxa gSFNP V ...... THIYTAQ DVKEVIEYAR h s h x b jSY . S [J...... SHVYTPN DVRMVIEYAR mmhxb ’.S 3 ...... SHVYTPN DVSMV0EYAR d d h x a jjFSP ...... SATFSHD DIQEWAYAK v h c h b “ jTQEKSCLL PQgGSGPTTD NFGSGYFSKA DYVElfflKYAK w c h b •jjjKCMM PQgGSGAEL? NNGSGYYTgE DYKEIHAYAS v p c h b SjEPQY SH0AERLRRF LHSR. . . .gH QREGC0CFET

451 500 h s c h b ACDFLFVMSY 2QSQIWSEC IAAAN...... r n c h b ACDFLFVMSY 2QSQIWSEC IAAAN...... h s h x a LRGIRVLAgF TLSWG P G lgG L L ...... mmhxa L R G IR '” CLSWG PGAgGLL...... h s h x b LRGIRVL JTLSWG gGQKDLL...... mmhxb LRGIRVI TQSWG ^GQKNLL...... d d h x a TYGIRVI ig y Q e l v ...... A v h c h b ARNIEVI VgMEARYDRL MEEGKEAEAN EYRRMDPQR- w c h b ARNIQVI eS m e a r y r k f m a e g d w k a e my S h s d p : v p c h b RH. . HCY m m ......

(Fig. con1 63

501 55 0 h s c h b . . APYjgQI oy 0 k m s i NPKKLVMGVP WYGYDYTCLN r n c h b .. apySqi YGDYLRMGI SPRXLVMGXP WYGYDYICLN h s h x a p c Q s g s e p s g TFGPvffigS n t SH m s t f f DF ...... YLH mmhxa p c Q s g s h l s g T F G P V p DF ...... YLH h s h x b PCgSRQNKLD S F G P g TTFF DQ...... FIH mmhxb PCfcjNQKTKTQ VFGPV1 _ NTFF DQ...... FIH d d h x a EJCPDYAANVgj NI.PLDISN0 T iJIA PL FI DN...... YFH v h c h b SNVTTVfflFWi KOSFtTO5CME EAGAPfjjTTWH w c h b ME MINKLHK EGGQPgTDYH v p c h b . . . CS0PLMF

551 60 0 h s c h b LgEDHVCTIA KVPFRGAPCS DAAGRQVPYK TIMKQINSSI SGNL. . .WDK r n c h b LgKDDVCAIA KVPFRGAPCS DAAGHQVPYR VIMKQVNSSV SGSQ. . . WNQ h s h x a LGGDEVD...... FTCjJjK mmhxa LGGDEVD...... FTC®K h s h x b LGGDEVE...... FKCgE mmhxb LGGDEVE ...... FQ cS a d d h x a TGGDELV...... TECfflL v h c h b FGGDEAKjSlK LGAGFQDVNA EDKVSWKGTI DLSKQDKPFA w c h b 03ADETAG...... AffjG v p c h b SE v r i s t B ...... Evffis

601 650 h s c h b DQRAPf DPAWiiaHOVW YD. . NPQ SISLKATYIQ NYRLRGIGMW r n c h b DQ0APt DPTgRLHQVW YD. .NPjjJ SISLKAAFVK HYGLRGIGMW h s h x a kkgf E. QjjESFYIQTL LDIVSSYGKG YV. VWQEVFD NKV....KIQ mmhxa KKGFT. .IjjFljj Q0ESFYIQTL LDIVSDYDKG YV. VWQEVFD NKV. . KVR h s h x b q k g f S. t 5 f 3 ISFYIQKV l d i i SJt in k g SI.VWQEVFD DKA. .KLA mmhxb RKGFg.gSFR ;SFYIKKI l e i i s s l S kn SI.VWQEVFD DKV. . ELQ d d h x a KMGFS.TT.. DAFQYgENNL DVTMKSINRT ijjl -TWNDPID YGV. QLN v h c h b . TDGTVgj]FA h 2 p s J[*3a e e v skiv S^kgip NFQAWQDgLK YSD. (jjEEAFA w c h b APES0VKNAJ5 d i n @y B i h r i SHILDAKGgT lWNDHLS KKALDAS S LA v p c h b .eSlS e s S e 3 q S i3 l r h . .... KRMlHwE eaqh Ed2vsk

651 70 0 h s c h b NANCLDYSGD AVgKQQTEEM WEVLKPKLLQ R r n c h b NANCLDYSDD AL2REQTEEM WGALj^PRL h s h x a p d t FJiqt DIPVgjYMKEL ELVTF LjflSAP. mmhxa p d t h iq t [jjMPVEYMLEM QDITg L g S A ?. h s h x b p g t Bvevw kd S . . . AYPEEL SRVTASl I g S A ? . mmhxb PGTWEVWKS j=j. . . HYSYEL KQVTGS iSsAP. d d h x a PETLVQVWgS G ...... SDL QGIVNsj LVSFfJ. QNPDNNIH v h c h b TENTRVNFWD VLYWGGTSSV YEWSK IVSNPD w c h b GNPPKAffiVWG TMFWGGVDQY NSF| HVTSPDA ijMIPYE v p c h b dtv 0ys SlS3 S3a 2l 2]c ...... TODY

701 7 5 0 h s c h b r n c h b h s h x a ISY. DWKDF ...... F EGTPEQKALV mmhxa VKY. DWKD ...... F HGTPEQKALV h s h x b ISY ...... GQ ...... F GGTQKQKQLF mmhxb ISY ...... GQ DV ...... F EGSEKQKQLV d d h x a YEW...... QD TWQDFg ...... N ISTNAEN. v h c h b PRAT DTF 'NAET SVDRDGNGFT GKGEIEAQ p F w c h b 2TRFN DTP ;W MTDRMGAKIS ATTGEKTHDF v p c h b JPLjjJ .leSaQn ’. |d ...... d p i r E rQ

(Fig. con'd) 751 800 h s c h b r n c h b h s h x a Y V . DNTNLVP RL' •G2VS mmhxa YV. DSTNLVP RL1 .GgVjJ h s h x b YV. DATNLTP RL' S W G mmhxb F V . DATNL0S .S2VG d d h x a qH I G l g v h c h b TVRgpEQYEi g A g Q ...... RAW HRAJSwENDy Q w c h b t RRTDAQVE] I g V g g ...... RGW HKASWEEEHg v p c h b i Bn ^Ss: LTgMg gACWTDKQHR DWTBy i .s I ^

801 850 h s c h b r n c h b h s h x a s n k l t s SJ t f AYEgLSHFRC jjC-VQAQ PLNVGFWEQE fgEQT. mmhxa SSNLTTNIDF AFKgLSHFRC QAQ PISVGYWEQE § E Q T . h s h x b SSKDVRgMDD AYDgLTRHRC RMVE PLYAGYWNHE NM. . . mmhxb SPKTVrffi^EN AYKgLAVHRC PLYTGYMNYE N K I . . d d h x a SAQSVNSVSL A L P^IG H F T C DLSRjjjC plfpdySpmq DDLVF0MKPN v h c h b VGVEYSQN.. SNLVD K .AgLNQDYN iSVLGQijE h?jA2 w c h b EGXTYTSNVD GHEGTTHLgD NIATgDADWA HgSglLGYKE MPg.... v p c h b g h l p l l 2 2 q g v n y Q n r g s Sjt gHC^SsQTLE eElSfgSrSs SvSrn Sq MKY

851 900 h s c h b r n c h b h s h x a mmhxa h s h x b mmhxb d d h x a TKLSKSEXKL ILNK ...... v h c h b . . LEKSGXDg r l p v Q g a k v e DGKLAMNVQg p g Ht l q y s l d GENWLTYAD jS w c h b . .LgKAGITQ RLPVLGAVIK N N IL D W T E y h g R a i q y s l d g k t w k k y d d t v p c h b GYPgNENREO VITRgjDVPAP WTNYLGTEKQ c t ^ is k n a g g YSFYNSPEY^]

901 929 h s c h b r n c h b h s h x a mmhxa h s h x b mmhxb d d h x a v h c h b ARPNVTGEVF ig^VSATGEK VSRITSVK. w c h b KKPQVSTKgL VggVSgNGRT GRAVEVLAX v p c h b RVTKFRPN2T F^PRBLRLP T ...... 65 arginine residue is conserved for all the enzymes listed except for lysosomal chitobiases from human and rabbit. It has been shown (36) that Arg178 and Arg211 (aligned at position 359 in Fig. 8) in the alpha- and beta-subunits of human beta-hexosaminidase, respectively, are "active" residues. They are part of the catalytic sites, but they do not participate in substrate binding. Fig. 6 also shows that periplasmic chitobiases from V. harveyi and V. vulnificus have extensive homology with each other while homology of either of these with the cytoplasmic chitobiase from V. para­ haemolyticus is much lower. This implies that cytoplasmic chitobiases from vibrios took a very different line of evolution than periplasmic chito­ biases and these signal-sequence-containing enzymes are more closely related to beta-hexosaminidases from higher organisms. The structural gene and the deduced amino acid sequence of the V. parahaemolyticus chitobiase were progressively piled up (64, 172) to those of chitobiases and beta-hexosaminidases from other organisms in­ cluding other vibrios and higher organisms. The results are shown as a dendrogram (Fig. 8-1) as well as sequence alignment (Fig. 8-2). Identical relationships among these organisms were obtained using either the DNA or the amino acid sequences for comparison (only the amino acid data are shown). The clustering relationships show the uniqueness of the cytoplasmic chitobiase from V. parahaem olyticus among the chitobi­ ases and beta-hexosaminidase from all three vibrios. 2.4 DISCUSSION Rosem an et al. showed that V. furnissii possessed a separate N ,N '-d i- acetylchitobiase with cytoplasmic localization which depended on a

(G1cNAc)2 permease (15, 273, 274). We isolated such an N,N'-diacetyl 66 chitobiase of V. parahaem olyticus from the cytoplasm (278). The lack of signal peptide at the N-terminus in the gene confirms this assignment. It appears that a hydrophobic patch is absent throughout the entire length of the polypeptide sequence, as shown in Fig. 9. In addition, a naturally secreted endo-chitinase has been characterized and the gene cloned from V. parahaem olyticus (123). The major product of chitin cleavage by this chitinase is iV)iV'-diacetylchitobiose, the substrate of the periplasmic and cytoplasmic iV,iV'-diacetylchitobiases. A chitin binding protein to facilitate V. parahaemolyticus to adhere to chitin has also been reported (77). The presence of a permease-like protein to translocate chitobiose across the cytoplasmic membrane of V. parahaem olyticus is thus obligatory. The involvement of a permease is supported by in vitro data from V. furnisii (15). Complementation analysis (219) showed that a lac permease defective E. coli strain (LE392, lacY~ mutant) carrying plasmids with chromosomal DNA fragment from V". harveyi were able to hydrolyze o-nitrophenyl-beta-Z)-galactoside (ONPG), implying that the proposed putative chitopermease is equivalent in function to the la cY gene product and is able to translocate lactose across the cytoplasmic membrane of E. coli. Identification and characterization of the permease protein in V. parahaemolyticus will establish the complete chitinoclastic pathway in this organism. Although it has been shown that the PTS exists in V. parahaem olyticus (150) and all vibrios tested (149, 150), the establishment of an alternative pathway for the utilization of chitin by V. V. parahaemolyticus (150) re ­ quires further investigation, particularly the presence of a second N,N'- diacetyl-chitobiase that is periplasmic. 67

200 400 600 800

AMINO ACID

Fig. 9 Hydrophobicity analysis of chitobiases from two vibrios according to Kyte and Doolitle (122). 1. V. parahaem olyticus (this study); 2. V. harveyi (220). Parameters used for the plots: average window size of 11 amino acids, compression factor of 4. The highly hydrophobic N-terminal region (the signal peptide) in V. harveyi prechitobiase is indicated by arrow. 68

It is worth noting that sequences in the -10 and -35 region of the V. para­ haemolyticus chitobiase gene promoter are close to the prokaryotic con­ sensus promoters sequences (5’-TATAAT-3’ for -10 and 5’-TTGACA-3’ for -35) (92, 197), especially in the -35 region where the only difference is th a t the V. parahaem olyticus chitobiase gene has a G instead of an A as in the consensus sequence, implying high expression of the gene driven by its own promoter. This agrees with the high yield of the enzyme using PC 120, the clone with the chromosomal insert from V. parahaemolyti­ cus. On the other hand, the chitobiase level in V". parahaemolyticus itself is at least 30 fold lower (Zhu, B. C., personal communication). One possi­ bility is that the expression of chitobiase gene is regulated in the original organism but is not when cloned into E. coli, although other alternatives such as higher gene dosage in E. coli carrying PC 120 may also play a role. In contrast to the chitin degradation systems of V. harveyi and V. vulnificus in which catalytic and non-catalytic proteins are organized as simple c/ii-operons (219, 265), the chitinase and chitobiase genes for the two major catalytic enzymes in V. parahaemolyticus do not seem to be in the same cistron, since both have their own promoters and ribosome binding sites; nor are they in close vicinity, since sequences 1000 nt up­ stream and 500 nt downstream of the chitobiase ORF do not overlap with the chitinase gene (data not shown). At this point, we do not yet have enough data to conclude the presence or the absence of a chi- operon in this organism. The regulatory effect involved in the chitobiase gene of V. parahaemolyticus may be in trans. Two highly conserved regions were revealed from Fig. 8-2 among all chi- tobiases and hexosaminidases except for human and rabbit lysosomal 69 chitobiases which are exoglycosidases that split the GlcNAc-beta-D-(l- 4)GlcNAc chitobiose core of asparagine-linked glycoproteins from the re­ ducing ends of the dimeric sugar (66, 121). It is highly likely that amino acid residues in one or both of these regions participate in the catalysis of their respective substrate, especially the region from position 341-400 that includes the catalytic arginine residues in the alpha- and beta-subunits of human beta-hexosaminidase (36). This report shows the uniqueness of the cytoplasmic chitobiase gene of V. parahaemolyticus when compared to other chitobiase genes cloned from vibrios or other organisms. A cytoplasmic chitobiase activity was pro­ posed by Roseman's group (15) and cloned and isolated by Zhu et al. (278). The sequence and homology comparisons in this report establish an evo­ lutionary relationship among similar enzymes and further show the ex­ tent of genetic investment in chitin degradation by vibrios. 2.5 SUMMARY The cytoplasmic iV,AT'-diacetylchitobiase (EC 3.2.1.14) from Vibrio para­ haemolyticus has been characterized and the gene cloned into E. coli [Zhu, B. C. et al. (1992) J. Biochemistry (Tokyo) 112, 163-167]. The nu­ cleotide sequence of the gene encoding this unusual beta-hex­ osaminidase has now been determined, and the deduced peptide se­ quence surprisingly has minimum evolutionary relationship to two other reported iVjAf'-diacetylchitobiases from vibrios, except for highly conserved regions which are also homologous with lysosomal beta-hex- osaminidases from eukaryotes including humans. In contrast, the other two beta-hexosaminidases with reported sequences from vibrios are much more closely related to each other. This novel 85 kDa cytoplasmic 70 glycosyl hydrolase of restricted specificity participates in the high level utilization of chitin-derived 2-deoxy-2-acetamido-Z)-glucose (GlcNAc) by vibrios as one of two parallel pathways for metabolism of N,N'-diacetyl- chitobiose [Bassler, B. L., Yu, C., Lee, Y. C. & Roseman, S. (1991) J. Biol. Chem. 266, 24276-24286]. This complex chitin utilization system contains as m any as 6-10 enzymes and a num ber of chemotactic proteins [Yu, C., Bassler, B. & Roseman, S. (1993) J. Biol. Chem. 268, 9405-9409]. These pathways use chitin-binding proteins for the adherence of the bacterial chitinase to the substrate [Yu, C., Lee, A. M., Bassler, B. L. & Roseman, S. (1991) J. Biol. Chem. 266, 24260-24267; Gildemeister, O., Zhu, B. & Laine, R. (1994) Glycoconjugate J. 11, 518-526], and extracellular chitinase and periplasmic chitodextrinase to produce A^iV'-diacetylchitobiose. At this point the pathways branch. On one hand, a periplasmic, membrane an­ chored 2V,./V'-diacetyl-chitobiase cleaves (GlcNAc )2 and the liberated GlcNAc is simultaneously transported to the cytoplasm and phosphory- lated by the bacterial phosphoenolpyruvate: glycose phosphotransferase system (the PTS). The alternate pathway utilizing the cytoplasmic N,N'- diacetyl chitobiase requires a permease for N,N'-diacetyl-chitobiose, and is mechanistically equivalent to the lactose permease of E. coli. The V. parahaemolyticus cytoplasmic N,N'~diacetyl-chitobiase appears to be an unique protein, lacking a signal sequence and genetically distant from other known chitinoclastic beta-iV-diacetyl-hexosaminidases. This is consistent with its limited substrate specificity to small GlcNAc termi­ nated oligosaccharides. CHAPTER 3 PROTEIN SECRETION

3.1 INTRODUCTION To facilitate collection and purification of recombinant proteins, several signal sequences have been fused to proteins from various sources and cloned into jB. coli (69, 74, 85,141,161, 177, 224), yeast (45, 89, 97, 170, 174, 234, 267, 268), insect cells (S f cell line, a baculovirus host) (133, 228), transgenic tobacco plant cells (138), and mammalian cells (CHO and COS cells) (86, 117, 131). In E. coli, the secreted protein often accumulates in the periplasmic space (18, 85, 141, 224). Signal sequences have been used to construct vectors for protein secretion from cells of yeast (27, 159, 174), insect (228), and E. coli (84, 141, 224). In the E. coli secretion systems, some of which are commercially available (84, 141, 224), collection of gene products requires disruption of the cell wall, because the target location of the recombinant proteins is the periplasm. Gram negative bacteria do not secrete many proteins. To our knowledge, no secretion vector has been available for the harvest of wild type cytoplasmic proteins in E. coli culture medium. Chitinase is a glycohydrolase whose target polymer must be degraded outside the periplasm of vibrios. Resultant oligosaccharides are pro­ cessed by periplasmic chitodextrinases and chitobiases or transported into the cytoplasm for degradation (15). Cloning of the extracellular chitinase into E. coli resulted in high level secretion of the enzyme (123), which apparently has the correct architecture to pass the protein through the two membranes of Gram negative bacteria. To test the uni­ versality of this chitinase signal sequence (MIRFNLCAAGVALALSG AAVA), we fused the DNA fragment encoding this peptide to the

71 72 cytoplasmic chitobiase structural gene, using recombinant polymerase chain reaction. This report shows highly efficient secretion of the active chitobiase into E. coli culture medium and the construction of two secre­ tion vectors that could be potentially useful for the secretion of other pro­ teins. 3.2 MATERIALS 3.2.1 Host Strains and Plasmids Escherichia coli strains DH5a and DH5aF' were purchased from Gibco

Bethesda Research Laboratories (Gaithersberg, MD). E. coli strain JM101 was from New England Biolabs, Inc. (Beverly, MA). The phagemid vec­ tors pBS SK+ was from Stratagene (La Jolla, CA). The expression vectors pKK2 3 3-2 (7rcJ and pKK223-3(£ac) were from Pharmacia Biotech Inc. (Alameda, CA). The plasmid harboring the Vibrio parahaemolyticus chitobiase gene, PC120, has been described previously (278). The plasmid pKKAl carrying the V. parahaemolyticus chitinase gene and its signal sequence was from the V. parahaem olyticus chitinase gene was con­ structed previously (123). The chitobiase structural gene was amplified from the plasmid PC 120 (278), based on the nucleotide sequence reported previously (Wu and Laine, submitted), from this laboratory. 3.2.2 Enzymes, Chemicals and Antibiotics Restriction endonucleases, T4 DNA ligase, T4 DNA polymerase, T4 polynucleotide kinase and calf intestine phosphatase were purchased from Gibco Bethesda Research Laboratories (Gaitherberg, MD), New England Biolabs, Inc. (Beverly, MA) or United States Biochemicals Corp. (Cleveland, OH). T7 DNA polymerase DNA sequencing kit (Sequenase 2.0) was from United States Biochemical Corp. (Cleveland, OH). 35S-ATP 73 was from Du Pont Co.-NEN (Boston, MA). Vent DNA polymerase, deoxyribonucleotides and MgSC >4 solution were from New England Biolabs, Inc. (Beverly, MA). The polymerase chain reactions were carried out as previously described (164, 165). Ampicilin and p-nitrophenyl derivatives were purchased from Sigma Chemical, Co. (St. Louis, MO). Oligonucleotides used as sequencing and PCR primers were synthesized on an automated DNA/RNA synthesizer (Applied Biosystems, Model abi 394) in GeneLab, School of Veterinary Medicine, Louisiana State University (Baton Rouge, LA). 3.3 EXPERIMENTAL PROCEDURES AND RESULTS 3.3.1 In-frame Connection of DNA Fragments To fuse the V. parahaemolyticus chitinase signal peptide to the N-termi- nus of chitobiase, two PCR products were connected, using the recombi­ nant PCR method modified from that by Higuchi (95, 96). Briefly, the two pairs of primers were used to amplify the regions (PCR1 and PCR2) en­ compassing the signal sequence and the chitobiase structural gene, us­ ing pKKAl and PC120 as templates, respectively (Fig. 10). The 10-base long sequence at the 5' end of the lower primer for PCR1 was from the 5' end of the chitobiase structural gene. Similarly, the 10 base strech at the 5' end of the upper primer for PCR2 was from the 3' end of the chitinase signal sequence, as indicated by the raised portion of the two arrows in Fig. 10-2. These two oligonucleotides thus have a perfect match region of 20 bases in length and are designated as "inside primers.” The other two primers are similarly designated as "outside primers.” The two inside primers were designed in order to fuse the N-terminal methionine to the carboxy terminal of the first 6 amino acids of the 74 mature chitinase, immediately following the signal peptide (Fig. 10-1). The two PCR products, PCR1 (annealing: 56°C, 1 min.; polymerization: 72°C, 25 sec.; denaturation: 94°C, 1 min.; Vent DNA polymerase: 2 unit; Mg++: 16 mM; 25 cycles) and PCR2 (polymerization: 2.3 min.), were then gel-purified and used as PCR primers and templates to each other, and the reaction allowed for 10-12 cycles (polymerization: 2 min.) in a 100 pi volume. Two units of Vent DNA polymerase and 2 pi of 20 pM outside primer mixture were then added and the reaction proceeded for 25 cycles under the same conditions (Fig. 10-3). The products of the 3 PCR reaction were monitored by agarose gel electrophoresis (Fig. 10-4). In addition to the signal sequence and chitobiase structural gene, PCR3 also contains the strong trc promoter (33, 163) and the E. coli lacZ Shine- Dalgarno ribosome binding site (S.D.). The 3' end of PCR3 encompasses the 3 consecutive translational stop codons (TGA TGA TGA) of the chito­ biase gene. PCR3 is therefore an intact transcriptional unit that, when carried on a plasmid, is expected to express and secrete the recombinant chitobiase. 3.3.2 Construction of Chitobiase Secretion Clones For convenience of cloning, the upper primer for PCR1 was designed such that the Bam HI site was at the 5' end of PCR1 (and PCR3). PCR3 was blunt-ended with T4 DNA polymerase and inserted into the .EcoRV site of pBS SK+, and the ligation mixture was used to transform E. coli JM101. White colonies were incubated at 37°C for 20 hrs. The super­ natant and cell pellet suspension were used to test the activity of chitobiase, using p-NP-GlcNAc as substrate. One of the colonies was found to express and secrete chitobiase according to the assay, and was 75

1

Sail H in d lll PstI ori Amp 'Amp

'on pKKAl PC 120 7235 bp 8986 bp chn

A val trc chb SacI BamHI PstI Tagged PCR Tagge PCR

B ch b PCR 1 (404 bp) PCR 2 (2263 bp)

Fig. 10 In-frame connection by recombinant PCR. 1. Internal primer sequences and their arrangement during self-priming. Also shown in the top panel is the fusion of proteins between the serine from chitinase gene and the initiator Met of chitobiase. 2. Amplification of signal sequence from pKKAl and chitobiase structural gene from PC 120. Primers used are indicated by arrows on the plasmids and PCR products (PCR1 and PCR2) are shown with their respective sizes. 3. Connection of the fragments. Shown are the two PCR products that serve as both tem­ plates and primers during self-priming for the extension of the two frag­ ments. The resulting product, PCR3, is reamplified by adding outside primers. 4. Agarose gel stained with ethidium bromide showing the 3 PCR products. Five pi was used from each of 100 pi PCR reaction. M: 1 kb DNA ladder; PI, P2, P3: PCR1, PCR2 and PCR3, respectively.

(Fig. con'd) 76

Ftrc S nhh PCR 1 PCR 2 Self-Priming & Reamplification

B p trc s T ______chb

PCR 3 (2647 bp)

PCR2 Upper Primer w GCT CCG ACC GCA CCA AGT ATG GAA TAT CGT GTT GAT CT APTAPSMEYRVDL CGA GGC TGG CGT GGT TCA TAC CTT ATA GCA CAA CTA GA ^ PCR1 Lower Primer

PCR1 Product 5 '-GCT CCG ACC GCA CCA AGT ATG GAA TAT C -3' 3 '-G CGT GGT TCA TAC CTT ATA GCA CAA CTA GA-5' PCR2 Product

M P1 P2 P3 M

3054 bp 3054 bp 2036 bp 2036 bp 1636 bp 1636 bp 1018 bp 1018 bp 506 bp 506 bp 77

SspI SspI Seal N ael N ael Seal { EcoRV BamHI Amp .Amp tr c pBS-SK pSKPCR3 2961 bp 5608 bp ori

chb BamHI ori

BamHI Ligation of PCR3 into k

EcoRV Site of pBS-SK

BamHI BamHI BamHI trc tac

pKK223-3 4584 bp ori

Amp, an Amp

BamHI BamHI Digestion of pKK223-3 &

SKPCR3. Gel Purify and Ligate

Fig. 11 Two clones for the secretion of cytoplasmic chitobiase. 1. The PCR3 fragment is cloned into the Eco RV site of pBS II SK+. The B am HI fragment from the resultant plasmid pSKPCR3 is ligated with the larger B a m HI fragment of pKK223-3, resulting in pK3PCR3. 2. Nucleotide and amino acid sequences at the juncture between signal se­ quences and the chitobiase structural gene sequences. Arrow indicates signal peptidase cleavage.

(Fig. con'd) 78

ATG ATT CGA TTT AAC CTA TGT GCA GCT GGG GTT GCT TTA Met lie Arg Phe Asn Leu Cys Ala Ala Gly Val Ala Leu

GCG CTA TCA GGT GCT GCA^GTC GCA GCT CCG ACC GCA CCA Ala Leu Ser Gly Ala Ala Val Ala Ala Pro Thr Ala Pro

AGT ATG GAA TAT CGT GTT GAT CTC GTC CTA ... Ser Met Glu Tyr Arg Val Asp Leu Val Val ...

BamHI g BamHI

pK3PCR3 pSKPCR3 6983 bp 5608 bp

\ BamHI Seal BamHI 79

1.4 —O— pK3PCR3 —O— PC120 1.2

o

0.8

0.6

0.4

0.2

0.0 0 5 10 15 20 TIME (MIMUTES)

Fig. 12 Secretion and protein leakage. E. coli JM101 cells trans­ formed with pK3PCR3 and PC 120 were incubated in LB medium at 37°C for 18 hours. Cultures were spun for 3 minutes and 100 |il of the super­ natants was added to a cuvette containing 880 pi TE buffer, pH 8.0 and 20 pi p-NP-GlcNAc. Reactions were allowed for 20 minutes in a spectropho­ tometer (RT) and the release of p-nitrophenol (A420) was measured at time intervals indicated. The spectrophotometer was calibrated at 420 nm with pK3PCR3 supernatant reaction mixture without p-NP-GlcNAc. Inocula used were overnight colonies. 80

TABLE 1 Specificity of recombinant chitobiase. The recombinant chitobiase from the supernatant fraction of an overnight culture harbor­ ing pK3PCR3 was precipitated with 50% (NH 4)2 SC>4 (w/v) and incubated with the compounds listed. Activity towards each of the p-nitrophenyl derivatives was measured as the release of p-nitrophenol (A420). Reaction conditions (in 0.5 ml): 50 mM phosphate buffer, pH 7.0, 50 pM p- NP sugar substrates, 37°C, 30 min. Reactions were promptly stopped by adding 2 mM N a 2 C0 3 . The data for wild type chitobiase purified from V. parahaemolyticus were previously determined (278). nd: not determined.

______Chitobiase Activities ______Substrates Secreted Wild Type p-NP-p-A-GlcNAc 100 100 p-NP-a-A-GlcNAc 0 0 p-NP-p-A-GalNAc 29 18 p-NP-oc-iV-GalNAc 0 0 p-NP-P-A,iV'-diacetylchitobiose 60 66 p-NP-P-Glu 0 0 p-NP-oc-Glu 0 0 p-NP-P-Man 0 0 p-NP-oc-Man 0 0 p-NP-p-Gal 0 nd p-NP-a-Gal 0 0 p-NP-P-iV-acetyl-l-thio-GlcN 14 0

designated pSKPCR3. The orientation of the insert in pSKPCR3 was de­ termined by KpnVSacI digestion followed by agarose gel electrophoresis, and is indicated by arrow in Fig. 11. Digestion of pSKPCR3 with BamHI results in the release of PCR3 and the fragment is used to replace the small BamHI fragment of pKK223-3. This results in pK3PCR3. The chitobiase activity assay and the insert ori­ entation determination used were the same as those for the pSKPCR3. Although pKK223-3 was used as a vector, the promoter used in pK3PCR3 81 for the chitobiase gene is trc instead of tac. Both tac and trc have similar promoter strength (2, 3, 33,163). For both pSKPCR3 and pK3PCR3, the DNA and amino acid sequence at the juncture between signal sequence and the chitobiase structural gene are the same, as depicted in Fig. 11-2. The six amino acids (APTAPS) at the N-terminus do not seem to have any effect on the enzyme, since the secreted recombinant chitobiase specificity towards several p-nitrophenyl derivatives (Table I) is the same compared with the wild type chitobiase purified from V. parahaemolyticus (278). 3.3.3 Secretion and Leakage To rule out the possibility that the chitobiase in the supernatant is due to cell lysis instead of secretion, the plasmid PC 120 was used to transform JM101 and the resultant clone was incubated in parallel with pK3PCR3 for 18 hr. at 37°C. Chitobiase activities from the supernatants were as­ sayed using p-NP-GlcNAc. Absorbance at 420 nm (A420) was measured and the results are shown in Fig. 12. No chitobiase activity was detected in 20 minutes from the supernatant of PC 120 clone. This was expected since no signal sequence is present at the N-terminus. It is clear that cell lysis, which is possible and will result in protein leakage, does not con­ tribute to the detection of chitobiase in the supernatant, and the appear­ ance of chitobiase in the supernatant of pK3PCR3 is due to the function of signal peptide. This is also the evidence that the chitinase gene product in the supernatant is a result of secretion. 82

3.3.4 Construction of Secretion Vectors2 Although the secreted chitobiase using either pSKPCR3 and pK3PCR3 does not seem to be different in function from that purified from V. para­ haemolyticus, it is still desirable to collect the wild type enzyme from the supernatant. The six amino acids at the N-terminus of the chitobiase can be removed using PCR. It is also possible to construct expression vectors that could secrete wild type chitobiase, which have the added advantage of secrete proteins from other sources. The signal peptidase cleavage site between the two Ala residues (Fig. 13-1) can be mutated to create a Nae I site in the DNA without changing the amino acid sequence. Digestion of the plasmid thus exposes the 3' blunt end of the signal sequence that can be used to fuse in frame the incoming structural gene PCR product, pro­ vided that the Nae I site thus created is unique on the vector. Another possibility is the existence of a potential S fi I site near the 3' end of the signal sequence. Silent mutations can be introduced such that the ORF is maintained. DNA sequence from the S fi I site to the 3' end of the signal sequence, inclusive, could be added to the 5' end of the upper primer so that the PCR product of the structural gene to be cloned could be tagged w ith S fi I site, which could be used to fuse to the signal peptide through ligation (Fig. 13-2). To eliminate other Nae I sites on the plasmid, pKK223-3 was digested w ith Nae I and Pvu II, both blunt end cutters, and the large fragment is self-ligated, resulting in a modified version of the vector, pK3-NP (Fig. 14- 1, a). The B am H I fragment from pK3PCR3 was then used to replace the

2 Materials in sections 3.3.4 and 3.3.5 have been submitted to the Office of Technology Transfer, Louisiana State University and A&M College for disclosure to the university patent attorneys. 83

small BamHI fragment of pK3-NP (Fig. 14-1, b). The resulting plasmid is the same as pK3PCR3 except that sequence between Nae I and Pvu II in pKK223-3 has been removed. The lower primer used for pVerSec was tagged with N ae I. Self-ligation of the PCR product results in pVerSec with a unique Nae I site which, upon digestion, will expose the 3' end of the signal sequence for in frame fusion to the structural gene of interest Fig. 14-1, c). Since the structural gene is promoterless, only half of the lig­ ation products could be functional. By using a different lower primer tagged with S fi I site, the same was done using pK3NPCR3 as template to produce pVerSec-SE. The upper primer for pVerSec-SE was also tagged with a Eco RV site (Fig. 14, d), so that the ligation of the structural gene into the vector is unidirectional. Detailed DNA and the deduced amino acid sequences around the structural gene insertion sites are shown in Fig. 14-2. Both vectors have been used to express and secrete the wild type chitobi­ ase. The lower primer used was that used in the amplification of PCR2. For cloning into pVerSec, the 5' end of the upper primer starts from the translation initiator methionine codon, ATG. Ligation of the PCR prod­ uct results in pVS-CHB (Fig. 15-1). For cloning into pVerSec-SE, a 10 base sequence was attached to the 5' end of the upper primer. The PCR prod­ uct was trimmed with T4 DNA polymerase followed by digestion with S fi I. Ligation of this processed PCR product gives rise to pVSSE-CHB (15-2). The juncture amino acid sequences between the signal peptide and the structural gene are the same using either of the two vectors. M

ATGATTCGA TTT AAC * * BamHI Met H e Arg Phe Asn

TGTGCAGCTGGGGTT Cys Ala Ala Gly Val

TTAGCGCTATCAGGT PvuII Leu Ala Leu Ser Gly

GCAGTCGCJTGCTCCG 6983 bp Ala Val Ala Ala Pro

GCA CCAAGTATGGAA Ala Pro Ser Met Glu

CGTGTTGATCTCGTC Arg Val Asp Leu Val Seal BamHI

WT Signal Sequence: ATGATTCGATTTAACCTATGTGCAGCTGGGGTTGCTTTAGCGCTATCAGGTGCTGCAGTCGCAGCTCCG TACTAAGCTAAATTGGATACACGTCGACCCCAACGAAATCGCGATAGTCCACGACGTCAGCGTCGAGGC MIRFNLCAAGVALALSGAAVAAP

pV erSec: Nae I ATGATTCGATTTAACCTATGTGCAGCTGGGGTTGCTTTAGCGCTATCAGGTGCTGCAGTCGCcGacCCGAC TACTAAGCTAAATTGGATACACGTCGACCCCAACGAAATCGCGATAGTCCACGACGTCAGCGaCcaGGCTG MIRFNLCAAGVALALSGAAVA GP

pVerSec-SE: Sfi I ATGATTCGATTTAACCTATGTGCAGCTGGGGTTGCTTTAGCGCTATCAGGaGCcGCAGTaGCcGCTCCG TACTAAGCTAAATTGGATACACGTCGACCCCAACGAAATCGCGATAGTCCcCGaCGTCAcCGaCGAGGC MIRFNLCAAGVALALSGAAVAAP

Fig. 13 Silent mutagenesis. 1. pK3PCR3 with juncture sequence in which mutagenesis is to be made. The 4 Nae I sites are indicated by stars (*). 2. Wild type and mutated forms of signal sequences. Arrows indicate lower primers for secretion vector constructions. Tagged restriction sites within the primers are also indicated and their recognition sequences underlined. Lowercase bases were mutated. 85

N ael N ael BamHP Nael | ^ BamHI EcoRI BamHI BamHI N aels

pKK223-3 pK 3-N P 4584 bp 2919 bp

PvuII

Ndel

Ndel BamHI N del BamHI

Nael SacI

f t p V erS e c /A pK3NPCR3 \ 2975 bp A r 5318 bp chbj T \ \Amp I j \ 1 \------j Amp S 7 \ ------BamHI

Fig. 14 Construction of pVerSec and pVerSec-SE. 1. The four Nae I sites were removed from pKK223-3 by cutting with Nae I and Pvu II and self-ligation of the large fragment (a). The small B am HI fragment of pK3-NP is replaced by PCR3 (from pSKPCR3) (b). The fragment encompassing ori were amplified, and the products self-ligated, resulting in pVerSec (c). 2. pVerSec-SE construction by PCR using primers tagged w ith S fi I and EcoR V. 3. Juncture sequences in pVerSec and pVerSec- SE. lowercase indicate base substitution.

(Fig. con'd) 86

BamHI BamHI

SacI /- S f iI o n ^rcVsV/EcoRV

o n A5S

pK3NPCR3 pVerSec-SE 5318 bp chb 2990 bp A m p

A m p

BamHI

BamHI ATGATTCGATTTAACCTA Met lie Arg Phe Asn Leu

TGTGCA GCT GGGGTT GCT 7a Cys Ala Ala Gly Val Ala TTA GCG CTATCA GGT GCT Leu Ala Leu Ser Gly Ala

N a e l GCA GTC GCc Gac TGATACA Ala Val Ala Gly

GATTAAATCAGAAC

BamHI ATG ATT CGA TTT AAC CTA Met lie Arg Phe Asn Leu

TGT GCA GCT GGG GTT GCT Ala Ala 7a Cys Gly Val Ala TTA GCG CTA TCA GGcr GCc Leu Ala Leu Ser Gly Ala 2990 bp Sfi I GCA GTcr GCc GCT CCGATG Ala Val Ala Ala Pro Met

EcoRV AGA GAA GAT aTc CAG Arg Glu Asp lie Gin 87

1

N del BamHI N del BamHI

N ael trc SacI trc 'bri N ael 'ori A5S p V e rS e c pV S-CH B 5228 bp 2975 bp chb :Amp

Am p Seal A5S Seal

2

BamHI Ndel Ndel BamHI .Sfi I Sfil Sac I EcoRV trc trc 'ori ( Nae I ri A5S pV erS ec-S E pV SSE-CH B 2990 bp 5228 bp chb\ A.mp

Seal A m p A5S Seal

Fig. 15 Application of secretion vectors. 1. Cloning into pVerSec. Insertion of amplified chitobiase structural gene into the N ae I site re­ sults in pVS-CHB. 2. Cloning into pVerSec-SE. The amplified chitobiase structural gene was digested with Sfi I and ligated into pVerSec-SE. 88

1.2 -f- —-o —A. Secreted Chb -- 1.2 ■—o — B. Retained Chb —-O— C. Culture A600 o 1.0 - - 1 o © O o CO <

0.8 -- 0.8 Q m ac 0.6 0.6 3 I- U1 DC 0.4 -- 0.4 3 !j 3 O

0.2 -- 0.2

0.0 0.0 0.20.4 0.6 0.8 EDTA IN CULTURE MEDIUM (MM)

Fig. 16 Effects of EDTA on protein protection and host growth. A. Activity of chitobiase from the supernatant. B. Activity of chitobiase from the bacterial cell pellet washed with, and then suspended in, LB medium prior to assay. Overnight cultures supplemented with known concentrations of EDTA at inoculation were centrifuged in an Eppendorf tube for 5 min., and 20 pi of 5 mM p-NP-beta-GlcNAc solution was added to 200 pi each of the supernatant fractions. Reactions were proceeded at 25°C for 5 min., after which the cell pellets were suspended in 200 pi LB media and treated the same as the supernatant fractions (except that in­ cubations were increased to 10 min.). Residual chitobiase in all reactions was inactivated by heating at 65°C for 10 min. C. Growth of E. coli carry­ ing the rcombinant chitobiase gene in the presence of EDTA. 200 pi of overnight cultures from corresponding tubes were diluted in 800 pi LB medium and turbidity was measured as A600. Optical densities of E. coli culture were adjusted based on Toennies and Gallant (229). 89

1.6

1.4

1.2 o

0.8

0.6

O 0.4

0.2

0.0 0.0 0.2 0.4 0.6 0.8 1.0 EDTA (mM)

Fig. 17 Chitobiase activity and EDTA concentration. Five micro­ liters of chitobiase concentrated from the supernatant with (NH 4)2 SC>4 was added to 995 pi of water containing 100 pM p-NP-beta-GlcNAc and 0.0 mM to 1.0 mM EDTA. Reactions were stopped at 5 min. by heating at 65°C for 5 min. 90

3.3.5 Protection of Secreted Proteins Colonies of JM101 harboring pK3PCR3 grown overnight (18 hours) in LB medium showed chitobiase activity in the supernatant fraction (Fig. 12) but subsequent cultures using previous overnight culture as inoculum lead to dramatic declines in active chitobiase in the supernatant. For each of the colonies tested, no chitobiase activity could be detected after three sequential overnight cultures using immediate previous culture as inoculum. Incubation longer than overnight resulted in a decrease in chitobiase activity regardless of inoculum used (data not shown). Secreted chitobiase was probably being degraded during stationary growth of the culture since a plateau of chitobiase activity in the super­ natant fraction would be expected otherwise, even when the expression and secretion terminate. When an appropriate concentration of EDTA was added into the medium, overnight cultures inoculated with the same culture showed similar chitobiase activity as that inoculated with a colony, suggesting a protective effect. LB medium was supplemented with EDTA from 0.0 mM to 1.0 mM and inoculated with culture that no longer showed detectable chitobiase activity. Cells from the overnight cultures were pelleted and 200 |il supernatant fraction from each assayed for p-NP-beta-GlcNAc hy­ drolysis at 100 pM. The reaction mixtures were incubated for 5 minutes and 800 pi of water were added after reactions were stopped by incubation for 10 minutes at 65°C. As shown in Figure 16, optimal EDTA concentra­ tion was 0.6 mM. Less than 0.3 mM EDTA did not protect, and high concentrations of EDTA inhibited cell growth, and thus the production and secretion of the enzyme. No chitobiase activity was detected at 1.0 91 mM EDTA. Unexpectedly, chitobiase from the cell pellets also showed a similar response to EDTA under the same assay condition (except that the incubation time was 10 minutes instead of 5 minutes) (Fig. 16), implying that chitobiase degradation is not limited to the results of host cell lysis. The decline of chitobiase activity above 0.6 mM of EDTA is not from inhi­ bition of the enzyme which tolerates EDTA up to 1.0 mM (Fig. 17). Higher levels of EDTA affect the level of secreted enzyme by inhibiting growth of the host bacteria (Fig. 16). EDTA levels of 0.4 mM to 0.8 mM inhibited bac­ terial growth by ca. 25% (Fig. 16). Above 0.8 mM of EDTA dramatically in­ hibited bacterial growth, with complete inhibition at 1.0 mM. EDTA has a slight inhibitory effect on E. coli growth at moderate concen­ trations (0.4 mM-0.8 mM). At 0.6 mM of EDTA during E. coli growth, se­ creted chitobiase was best protected probably due to balance between cell growth and proteolysis protection. The observation that stability of the en­ zyme retained in the cells has a similar response to EDTA in the culture medium implies that degradation of chitobiase is not host cell lysis de­ pendent. The culprit is probably a metalloprotease released by E. coli. This observation may prove generally useful for protection of cloned se­ creted proteins in the supernatant fraction of E. coli and other bacterial cultures. 3.4 DISCUSION A stretch of six amino acids (-Ala-Pro-Thr-Ala-Pro-Ser-), located at the extreme N-terminus of mature chitinase, is noteworthy. The peculiar ar­ rangement of two potential turn-making residues in this short peptide suggested a possible function for the signal peptidase. The first Ala 92

residue is where proteolysis occurs. It is for this reason that some clones for the secretion of cytoplasmic chitobiase retained these residues, and the secretion product is thus a fusion protein containing them. Clones with or without these residues, however, did not show any differences in protein secretion. In addition, these residues attached to the chitobiase do not have detectable effects on the function of the protein. These amino acid residues are therefore ordinary components of the mature chitinase and have no contribution to the nature of the signal peptide immediately preceding it. Several protein secretion vectors have been reported to secrete fusion pro­ teins from organisms such as E. coli yeast, and higher eukaryotic cells. In E. coli, as all Gram negative bacteria, the existence of two membranes (outer and inner membranes) seems to have complicated the secretion process. Collection of gene products often requires disruption of the cell wall, which makes subsequent purification difficult. Genes to be cloned need to have unique restriction sites at certain locations, which is not al­ ways available. In addition, proteins obtained this way are often recombi­ nants, with additional amino acids attached to the final products, which are obviously not desirable. For example, gene products directed by the m alE signal peptide (on the pMAL-p2 vector) target to the periplasm of E. coli (84, 141, 224). Periplasmic extract prepared by osmotic shock is used to purify fusion proteins. Attached peptides to the proteins of interest are then cleaved by Factor Xa at a specific site (127, 168, 169), leaving some residues at the amino termini of the resultant protein, depending on re­ striction sites used during cloning. In addition, the quality of proteins also depend on other factors, including the efficiency of Factor Xa 93 cleavage and the presence of additional Factor Xa cleavage sites in the mature proteins. The secretion vectors reported here do not require restriction site nor nucleotide sequence information except sequences flanking the structural gene for primer binding. These vectors also have the added advantage of producing wild type proteins independent of the restriction sites on the gene to be cloned. As reviewed in Chapter 1, the placement of a strong transcription termi­ nation signal distal to a strong promoter is believed to be necessary. The evidence against this hypothesis is provided in this study. The PCR frag­ ment, PCR3, is an intact transcription unit with a signal sequence placed in frame with the chitobiase ORF. It contains the trc promoter and the SD sequence of E. coli. A terminator-like stretch of sequence down stream of the chitobiase gene translation stop codons was not present (Chapter 2). Further more, the lower primer for the chitobiase structural gene amplification is in the stop codon region, ruling out the possibility that a unidentified termination signal could be carried over with the PCR prod­ uct. Insertion of PCR3 into pBS SK+ results in pSKPCR3. The fragment was then excised from pSKPCR3 with Bam H I, and ligated into pKK223-3, re­ sulting in pK3PCR3. The level of expression and secretion of chitobiase from the two clones, pSKPCR3 and pK3PCR3, are similar. It is unlikely that sequences on pBS SK+ could serves as a transcription terminator of the chitobiase gene. The orientation of the chitobiase gene transcription unit is opposite to the lacZ' gene. The stop codons of chitobiase gene is very close (<500 bp) to the origin of replication of the plasmid (ori). No ter­ minator-like sequence was evident in this region. The E. coli strains used 94 may play a role but is unlikely in this system since DH5aF' does not sup­ port neither clones whereas JM101 supports both. A by-product of this project is the observation that plasmids containing the signal sequence followed by a ORF can not survive in DH5aF' but replicate and express faithfully in JM101. Less than 1.0 ng of miniprepped plasmid DNA pSKPCR3 or pK3PCR3 were routinely used to successfully transform JM101 competent cells whereas no colonies of DH5aF' were observed with up to 1.0 |ig of the same DNA samples of ei­ ther plasmid. When the chitobiase ORF is removed, the plasmid can sur­ vive in both DH5aF' and JM101. Examples of such plasmids from this study include pVerSec and pVerSec-SE and several others. As a matter of fact, only DH5aF' was used for large preparation of plasmid DNA to eliminate contamination by plasmid with the chitobiase ORF. Another reason that DH5aF' is preferred to JM101 is that it grows slower than

JM101, and thus the plasmid DNA is relatively cleaner. All single-strand DNAs for sequencing were prepared from DH5aF' except for secretion clones that can only transform JM101. Both strains of E. coli (DH5aF' and JM101) are commercial products and their genotypes are known from various sources (see Appendices). However, these laboratory E. coli strains, as well as others, are the re­ sults of extensive mutagenesis. Different strains may carry different, so far undiscovered, mutations that have unknown effects to a particular system. The secretion plasmids in this study seem to have unfortunately fallen into this situation. The known genotypes (mutations) of these strains include, among others, plasmid protection ( el4(McrAMcrBC'), mrr, hsdRS, recABCFJ, etc.), colony identification (A lacZ, §80(lacM15, 95 etc.), strain maintenance ( proAB, etc.) and bacteriophage infection for ss- DNA production (F', carrying the pili genes). These genotypes are obviously useful for general cloning purposes but less so for analysis in gene expression and secretion. The observation that the secretion plasmids can survive in JM101 but not in DH5aF' has been very advantageous in this project, regardless of the

underlying mechanism. During pVerSec and pVersec-SE developments, there are several steps that require amplification of vector portion fol­ lowed by self-ligation of the PCR product. High fidelity amplification de­ mands lower cycle number whereas high yield PCR product requires larger amount of template. Since the secretion vectors are the results of stepwise chitobiase structural gene removal and insertion, prevention of template-carry-over is critical if sequencing of each intermediate is to be avoided. One way to alleviate the problem is to gel-purify PCR products before ligations. This did not work well because the PCR products co-mi- grate with the supercoiled form of the template plasmid DNA. Although restriction enzyme digestion before gel electrophoresis would solve the problem, PCR products need to be purified before digestion. The alterna­ tive solution, which has been used in this study, is to transform DH5aF'

with the ligation mixture. Since the template DNA can not survive in this strain, PCR product can be ligated and directly used for transformation. DH5aF' is also used to propagate these vectors to eliminate contamina­

tion. Considerable time and effort were saved by using this E. coli strain. 3.5 SUMMARY

The signal peptide of an extracellular endochitinase from the marine bacterium Vibrio parahaemolyticus causes the mature chitinase to be 96 efficiently secreted through both membranes of its Gram negative bacterial host, and of Escherichia coli JM101 when cloned. By using recombinant PCR, this signal sequence, including the N-terminal six amino acids of the mature chitinase, was fused in frame to the gene encoding a cytoplasmic chitobiase [Zhu, et al., 1992, J. Biochem. (Tokyo), 112(1): 163-167]. The construct caused the cytoplasmic chitobiase secreted efficiently into the E. coli culture medium, suggesting that this signal sequence could be useful for extracellular production of other proteins. To facilitate ease of cloning, two secretion vectors were developed. PCR fragment of a structural gene can be inserted into the vectors such that it is in frame with the signal sequence. The secretion system has been tested using the cytoplasmic chitobiase gene as a model system for the secretion of wild type proteins. Secreted chitobiase was assayed both on the transformation plate and in E. coli culture supernatant. The signal sequence on the two vectors, together with the incoming structural genes when fused, is driven by the strong trc promoter. PCR fragment of the structural gene of interest can be used directly, or following simple restriction enzyme digestion, for ligation into the vectors for secretion. Restriction sites and complete DNA sequence are not required for the structural gene to be cloned. EDTA was shown to have a protective effect on cloned proteins secreted into the medium of host cell E. coli JM101. The protective concentration was rather narrow at 0.5-0.7 mM EDTA due to inefficiency at lower concentrations and cell growth inhibition at higher levels. EDTA was also shown effective in protecting proteins inside E. coli cells, implying that cytoplasmic proteins cloned for secretion have limited stability in the cytoplasm. This observation may 97 prove generally useful for protecting proteins cloned for secretion in E. coli and other bacterial cultures. CHAPTER 4 CONCLUSIONS

The 85 kDa cytoplasmic N,N'-diacetylchitobiase of restricted specificity participates in the high level utilization of chitin-derived 2-deoxy-2-ac- etamido-Z)-glucose (GlcNAc) by vibrios as one of two parallel pathways for metabolism of N,N'-diacetylchitobiose. This chitin utilization system contains many enzymes and chemotactic proteins. The system is obvi­ ously complex in general, and is probably so in Vibrio parahaemolyticus. While this study has answered some questions, many remain for further studies. The nucleotide sequence and the predicted amino acid data revealed that the cytoplasmic N,N '~diacetylchitobiase appears to be a unique protein, lacking a signal sequence and genetically distant from other known chitinoclastic beta-iV-diacetyl-hexosaminidases, consistent with its lim­ ited substrate specificity to small GlcNAc terminated oligosaccharides. Sequence analysis and comparison with other published sequences also implied that Arg271 (Fig. 8-2) of the enzyme may be involved in catalysis (36).

Of pivotal importance is the cytoplasmic nature of chitobiase as one of the major catalytic proteins in the chitinoclastic pathway. In particular, the involvement of a chitopermease is based on the properties of cytoplasmic chitobiase. These two proteins together determine which one of the two pathways is involved in chitin degradation by a particular organism. The fusion of chitinase signal peptide to the chitobiase coding sequence provided several lines of evidence, (i) Consistent with the general belief, it is now clear that the chitinase signal sequence is required and sufficient to target the mature chitinase to the external milieu of E. coli, a Gram

98 99

negative bacterium. The mature polypeptide does not seem to have any function in this regard, (ii) Although the DNA sequence and the physical location of chitinase strongly argue that the enzyme is secreted, it was not possible to rule out the possibility that the presence of the enzyme in the culture supernatant is a consequence of cell lysis. The secretion of the cytoplasmic chitobiase by this signal sequence provided evidence that cell lysis does not contribute to the presence of enzymes in the supernatant, since chitobiase activity was not detected without a signal peptide, (iii) The secretion of chitobiase by fusion to the signal peptide, together with the lack of a signal peptide as revealed from the sequence data of chitobi­ ase, also proved that the chitobiase is cytoplasmic. The hydrophobicity analysis demonstrated that the polypeptide does not interact with the membrane. All of these points to the notion that the enzyme is not in­ volved in any reaction in the periplasm. This is important because the presence of a chitopermease is necessary to translocate the substrate of chitobiase to the cytoplasm across the cytoplasmic membrane. The artificial secretion of cytoplasmic chitobiase also suggests that the chitinase signal sequence can be used to produce other proteins extracel- lularly. This idea has been translated into the construction of secretion vectors during this study. While there is no way to predict the secretabil- ity of a particular protein, the same is true for other systems. The advan­ tages of this system are: (i) Easy to use for cloning; (ii) Facile collection of gene products; (iii) Does not require restriction site and sequence infor­ mation except for the partial sequences at both ends of the coding se­ quence of interest; (iv) The secreted gene products are of wild type instead of fusion proteins. Of course, the unfortunate lack of glycosyltransferase 100 systems in E. coli puts a barrier to the secretion of functional glycopro­ teins. EDTA was shown to have a protective effect on cloned proteins secreted into the medium of host cell E. coli JM101. The protective concentration was rather narrow at 0.5-0.7 mM EDTA due to inefficiency at lower con­ centrations and cell growth inhibition at higher levels. This observation may prove generally useful for protecting proteins cloned for secretion in E. coli and other bacterial cultures. REFERENCES

1. Akimaru, J., Matsuyama, S. I., Tokuda, H., and Mizushima, S. (1991). Reconstitution of a protein translocation system containing purified secY, secE, and secA from Escherichia coli. Proc. Natl. Acad. Sci. USA. 88, 6545-6549.

2. Amann, E., and Brosius, J. (1985).'ATG vectors' for regulated high-level expression of cloned genes in Escherichia coli. Gene. 40, 183-190.

3. Amann, E., Brosius, J., and Ptashne, M. (1983).Vectors bear­ ing a hybrid trp-lac promoter useful for regulated expression of cloned genes in Escherichia coli. Gene. 25, 167-178.

4. Amann, E., Ochs, B., and Abel, K. (1988).Tightly regulated tac promoter vectors useful for the expression of unfused and fused proteins in Escherichia coli. Gene. 69, 301-315.

5. Arkowitz, R. A., and Wickner, W. (1994).SecD and secF are re­ quired for the proton electrochemical gradient stimulation of pre­ protein translocation. EMBO J. 13(4), 954-963.

6. Asensio, C., and Ruiz-Amil, M. (1966).N-Acety\-D_- glucosamine kinase: II. Escherichia coli. Methods Enzymol. 9, 421- 425.

7. A tlas, R. M. (1988). Microbiology: Fundamentals and Applications (2nd Ed.). New York: MacMillan Publishing Company.

8. Austen, B. M., and Westwood, O. M. R. (1991). P ro tein Targeting and Secretion. New York: Oxford University Press.

9. Baker, K., Mackman, N., and Holland, I. B. (1987).Genetics and biochemistry of the assembly of protein into the outer mem­ brane of E. coli. Prog. Biophys. Molec. Biol. 49, 89-115. 10. Bally, M., Ball, G., Badere, A., and Lazdunski, A. (1991). Protein secretion in Pseudomonas aeruginosa: the xcpA gene en­ codes an integral inner membrane protein homologous to Klebsiella pneumoniae secretion function protein PulO. J. Bacteriol. 173, 479-486. 11. Bally, M., Filloux, A., Akrim, M., Ball, G., Lazdunski, A., and Tommassen, J. (1992).Protein secretion in Pseudomonas aeruginosa: characterization of seven xcp genes and processing of secretory apparatus components by prepilin peptidase. M ol. Microbiol. 6,1121-1131.

101 102

12. Barkulis, S. S. (1966).iV-Acetyl-Zl-glucosamine kinase: I. Streptococcus pyogenes. Methods Enzymol. 9, 415-420.

13. Bassford, P., and Beckwith, J. (1979).Mutants of Escherichia coli which accumulate the precursor of a secreted protein in the cy­ toplasm. Nature. 277, 538-541.

14. Bassford, P., Silhavy, T., and Beckwith, J. (1979).Use of gene fusion to study secretion of maltose-binding protein in Escherichia coli periplasm. J. Bacteriol. 139,19-31.

15. B assler, B. L., Yu, € ., Lee, Y. C., and Roseman, S. (1991). Chitin utilization by marine bacteria: degradation and catabolism of chitin oligosaccharides by Vibrio furnissii. J. Biol. Chem. 266(36), 24276-24286.

16. B eccari, T., Hoade, J., Orlacchio, A., and Stirling, J. L. (1992). Cloning and sequence analysis of a cDNA encoding the al- pha-subunit of mouse beta-N-acetylhexosaminidase and compari­ son with the human enzyme. Biochem. J. 285(Pt 2), 593-596.

17. Beckwith, J., Grodzicker, T., and Arditti, R. (1972). Evidence for two sites in the lac promoter region. J. Mol. Biol. 69(1), 155-160.

18. Beckwith, J., and Perro-Novick, S. (1986).Genetic studies on protein export in bacteria. In H. Wu & P. Tai (e d s .), C urr. Top. Microbiol. Immunol, (pp. 5-27). Berlin, Heidelberg: Spring-Verlag.

19. Belkin, S., and Jannasch, H. W. (1985). A new extremely ther­ mophilic, sulfur-reducing heterotrophic, marine bacterium. Arch. Microbiol. 141(3), 181-186. 20. Beneski, D., Nakazawa, A., Weigel, N., Hartman, P. E., and Roseman, S. (1982). Sugar transport by the bacterial phospho­ system: Isolation and characterization of a phosphocar- rier protein HPr from wild type and mutants of Salmonella ty- phimurium. J. Biol. Chem. 257(23), 14492-14498.

21. Bilofsky, H. S., and Burks, C. (1988). The GenBank® genetic se­ quence data bank. Nucl. Acids Res. 16, 1861-1864.

22. Blobel, G. (1977). Mechanisms for the intracellular compartmen- tation of newly synthesized proteins. In B. Clark,H. Klenow, & J. Zeuthen (eds.), FEBS 11th Meeting (Copenhagen, 1977) (pp. 99-108). Pergamon Press, Oxford.

23. Blobel, G. (1977). Synthesis and segregation of secretoty proteins: The signal hypothesis. In B. Brinkey & K. Porter (e d s.), 103

International Cell Biology 1976-77 (pp. 318-325). New York: The Rockefeller University Press.

24. Blobel, G., and Dobberstein, B. (1975). Transfer of proteins across membranes: I. Presence of proteolytically processed and un­ processed nascent immunoglobulin light chain on membrane- bound ribosomes of murine myoloma. J. Cell Biol. 67,835-851.

25. Boiler, T. (1973). Hydrolytic enzymes in plant disease resistance. In T. Kosuge & E. W. Nester {eds.), Plant-microbe interactions: molecular and genetic perspectives (pp. 223-231). New York: M acM illan.

26. Braconnot, M. H. (1811). Sur la nature des champignons [The na­ ture of mushrooms]. Ann. Chi. Phys. 79,265-304.

27. Brake, A. J. (1989).Secretion of heterologous proteins directed by yeast alpha-factor leader. In P. Barr,A. Brake, & P. Valenzuela {eds.), Yeast Genetic Engineering (pp. 269-280). Stoneham: Butterworths. 28. Briggs, M. S., Cornell, D. G., Dluhy, R. A., and Gierasch, L. M. (1986).Comformations of signal peptides induced by lipids sug­ gest initial steps in protein export. Science. 233(4760), 206-208.

29. Brine, C. (1984).Chitin: Accomplishments and perspectives. In J. Zikakis {eds.), Chitin, Chitosan and Related Enzymes (pp. xvii-xxiv). Academic Press, Inc.

30. Brine, C. J., and Austin, P . R. (1981). Chitin variability with species and method of preparation. Comp. Biochem. Physiol. 69B, 283-286.

31. Brosius, J. (1984).Plasmid vectors for the selection of promoters. Gene. 2 7 ,151-160. 32. Brosius, J., Dull, T. J., Sleeter, D. D., and Noller, H. F. (1981). Gene organization and primary structure of a ribosomal RNA operon from Escherichia coli. J. Mol. Biol. 148, 107-127.

33. Brosius, J., Erfle, M., and Storella, J. (1985).Spacing of the -10 and -35 regions in the tac promoter. J. Biol. Chem. 260, 3539-3541.

34. Brosius, J., and Holy, A. (1984).Regulation of ribosomal RNA promoters with a synthetic lac operator. Proc. Natl. Acad. Sci. USA. 81,6929-6933. 35. Brosius, J., Ullrich, A., Raker, M. A., Gray, A., Dull, T. J., Gutell, R. R., and Noller, H. F. (1981). Construction and fine 104

mapping of recombinant plasmid containing the rrnB ribosomal RNA operon of E. coli. Plasmid. 6, 112-118.

36. Brown, C. A., and M ahuran, D. J. (1991). Active arginine residues in beta-hexosaminidase: identification through studies of the B1 variant of Tay-Sachs disease. J. Biol. Chem. 266(24), 15855- 15862.

37. Brundage, L., Fimmel, C. J., Mizushima, S., and Wickner, W. (1992). SecY, secE, and band I from the membrane-embedded domain of Eschericjia coli preprotein translocase. J. Biol. Chem. 267, 4166-4170.

38. Brundage, L., Hendrick, J. P., Schiebel, E., Driessen, A. J. M., and W ickner, W. (1990). The purified E. coli integral mem­ brane protein secY/E is sufficient for reconstitution of secA-depen- dent precursor protein translocation. Cell. 62, 649-657.

39. Brusilow, W. S. A., Gunsalis, R. P., Hardeman, E. C., Decker, K. P., and Simoni, R. D. (1981). In vitro synthesis of the F0 and FI components of the protein translocating ATPase of Escherichia coli. J. Biol. Chem. 256, 3141.

40. Byrne, C. R., Monroe, R. S., Ward, K. A., and Kredch, N. M. (1988). DNA sequence of the c y sK regions of Salmonella ty- p h im u r iu m and Escherichia coli and the linkage of the cysK to ptsH. J. Bacteriol. 170, 3150-3157.

41. Cerretti, D. P., Dean, D., Davis, G. R., Bedweil, D. M., and Nomura, M. (1983). The spc ribosomal operon of Escherichia coli: sequence and transcription of the ribosomal protein genes and a protein export gene. Nucl. Acids Res. 11, 2599-2616.

42. Chang, M. C., Chang, J. C., and Chen, J. P. (1993). Cloning and nucleotide sequence of an extracellular alpha-amylase gene from Aeromonas hydrophila MCC-1. J. Gen. Microbiol. 139, 3215- 3223.

43. Chaudhary, V., Batra, J., Gallo, M., Willingham, M., FitzGerald, D., and Pastan, I. (1990). A rapid method of cloning functional variable-region antibody gene in Escherichia coli as sin­ gle-chain immunotoxins. Proc. Natl. Acad. Sci. USA. 87, 1066-1070.

44. Chou, P. Y., and Fasman, G. D. (1988). Prediction of the sec­ ondary structure of proteins from their amino acid sequence. A dv. Enz. 47,45-147.

45. Clements, J. M., Catlin, G. H., Price, M. J., and Edwards, R. M. (1991). Secretion of human epidermal growth factor from 105

Saccharomyces cerevisiae using synthetic leader sequences. Gene. 106,267-272.

46. Coffey, A. J., Roberts, R. G., Green, E. D., Cole, C. G., Butler, R., Anand, R., Gianneli, F., and Bentley, D. R. (1992). Construction of a 2.6 Mb contig in yeast artificial chromosomes spanning the human dystrophin gene using a STS-based approach. Genomics. 12, 474-484.

47. Collier, D. N., Bankaitis, V. A., Weiss, J. B., and Bassford, P. J., Jr. (1988). The antifolding activity of secB promotes the export of E. coli maltose binding protein. Cell. 53, 273-283.

48. Collier, D. N., and Bassford, P. J., Jr. (1989). Mutations that improve export of maltose-binding protein in secB~ cells of Escherichia coli. J. Bacteriol. 171, 4640-4647.

49. Comb, D. G., and Roseman, S. (1958). Glucosamine metabolism: IV. Glucosamine-6-phosphate deaminase. J. Biol. Chem. 232, 807-827.

50. Copeland, B. R., Richter, R. J., and Furlong, C. E. (1982). Renaturation and identification of periplasmic proteins in two dimmentional gels of Escherichia coli. J. Biol. Chem. 257, 15065- 15071.

51. d'Enfert, C., Reyss, I., Wanderman, C., and Pugsley, A. P. (1989). Protein secretion by Gram-negative bacteria. Characterazation of two membrane proteins required for pullu- lanase secretion by E. coli K12. J. Biol. Chem. 264(29), 17462-17468.

52. Dalbey, R. E. (1994). Leader peptidase. In J. Rothblatt,P. Novick, & T. Stevens (eds.), Guidebook to the secretory pathway (pp. 17-18). New York: Oxford University Press, Inc.

53. de Boer, H. A., Comstock, L. J., and Vasser, M. (1983). The tac promoter: a functional hybrid derived from the trp and lac promoters. Proc. Natl. Acad. Sci. USA. 80, 21-25.

54. de Boer, H. A., Comstock, L. J., Yansura, D. G., and Heyneker, H. L. (1982). Construction of a tandem trp -la c promoter and a hybrid trp-lac promoter for efficient and controlled expression of the human growth hormone gene in Escherichia coli. In R. L. Rodriguez & M. J. Cham berlin (eds.), Promoters: Structure and function (pp. 462-481). Praeger.

55. de Groot, A., Filloux, A., and Tommassen, J. (1991). Conservation of xcp genes, involved in the two-step protein secretion 106

process, in different Pseudomonas species and other gram- negative bacteria. Mol. Gen. Genet. 229, 278-284. 56. Downing, W. L., Sullivan, S. L., Gottesman, M. E., and Dennis, P. P. (1990).Sequence and transcription pattern of the es­ sential Escherichia coli secE-nusG operon. J. Bacteriol. 172, 1621- 1627.

57. Dums, F., Dow, J. M., and Daniels, M. J. (1991).Structural characterization of protein secretion genes of the bacterial phy­ topathogen Xanthomonas campestris pathovar campestris: related­ ness to secretion systems of other gram-negative bacteria. Mol. Gen. Genet. 229, 357-364.

58. Eales, R. A., and Stamps, A. C. (1993).Polymerase chain reac­ tion (PCR): the technique and its applications. R. G. Landes Company.

59. Eckert, K. A., and Kunkel, T. A. (1991).DNA polymerase fidelity and the polymerase chain reaction. PCR Methods Appl. 1, 17-24. 60. Ehring, R., Beyreutther, K., Wright, J. K., and Overath, P. (1980).In vitro and in vivo products of E. coli lactose permease gene are identical. Nature. 283, 537.

61. Ehrmann, M., and Beckwith, J. (1991). Proper insertion of a complex membrane protein in the absence of its amino-termial ex­ port signal. J. Biol. Chem. 266(25), 16530-16533. 62. Embury, S. H., Scharf, S. J., Saiki, R. K., Gholson, M. A., Golbus, M., Arnheim, N., and Erlich, H. A. (1987).Rapid pre­ natal diagnosis of sickle cell anemia by a new method of DNA anal­ ysis. N. Engl. J. Med. 316, 656-661.

63. Emr, S. D., and Silhavy, T. J. (1982).The molecular components of the signal sequence that function in the initiation of protein ex­ port. J. Cell Biol. 95, 689-696.

64. Feng, D. F., and Doolittle, R. F. (1987).Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25(4), 351-360. 65. Filloux, A., Bally, M., Ball, G., Akrim, M., Tommasen, J., and Lazdunski, A. (1990). Protein secretion in Gram-negative bacteria: transport across the outer membrane involves common mechanisms in different bacteria. EMBO J. 9(13), 4323-4329. 107

66. Fisher, K., and Aronson, N. J. (1992).Cloning and expression of the cDNA sequence encoding the lysosomal glycosidase di-N- acetylchitobiase. J. Biol. Chem. 267(27), 19607-19616.

67. Foster, A. B., and Webber, J. M. (1960).Chitin. Adv. Carbohydr. Chem. 15, 371-393. 68. Frost, J. W., Bender, J. L., Kodonaga, J. T., and Knowles, J. R. (1984). Dehydroquinate synthase from Escherichia coli: purification, cloning, and construction of overproducers of the enzyme. Biochem. 23, 4470-4475. 69. Fujimura, T., Tanaka, T., Ohara, K., Morioka, H., Uesugi, S., Ikehara, M., and Nishikawa, S. (1990). Secretion of recombinant ribonuclease T1 into the periplasmic space of Escherichia coli with the aid of the signal peptide of alkaline phosphatase. FEBS Lett. 265(1,2), 71-74.

70. Gelfand, D. H. (1989). Taq DNA polymerase. In H. A. Erlich (eds.), PCR Technology: Principles and Applications for DNA Amplification (pp. 17-22). New York: Stockton Press.

71. Gennity, J., Goldstein, J., and Inouye, M. (1990).Signal pep­ tide mutants of Escherichia coli. J. Bioenerget. Biomemb. 22(3), 233- 268.

72. Gentz, R., and Bujard, H. (1985). Promoters recognized by Escherichia coli RNA polymerase selected by function: highly effi­ cient promoters from bacteriophage T5. J. Bacteriol. 164(1), 70-77. 73. Gentz, R., Langner, A., Chang, A. C. Y., Cohen, S. N., and Bujard, H. (1981). Cloning and analysis of strong promoters is made possible by the downstream placement of a RNA termination signal. Proc. Natl. Acad. Sci. USA. 78(8),4936-4940.

74. Ghorpade, A., and Carg, L. (1993).Efficient processing and ex­ port of human growth hormone by heat labile enterotoxin chain B signal sequence. FEBS Lett. 330(1), 61-65.

75. Gilbert, W. (1976).Starting and stopping sequences for the RNA polymerase. In R. Losick & M. Cham berlin (eds.), RNA Polymerase (pp. 193-205). Cold Spring Harbor, New York: Cold Spring Harbor Laboratory.

76. Gilbert, W., and Muller-Hill, B. (1966).Isolation of the lac re­ pressor. Proc. Natl. Acad. Sci. USA. 5 6 ,1891-1898. 108

77. Gildemeister, O., Zhu, B., and Laine, R. (1994). Chitovibrin: a chitin-binding lectin from Vibrio parahaemolyticus. Glycoconjugate 11,518-526.J.

78. Gilmore, R., and Kellaris, K. V. (1992).Translocation of pro­ teins across and integration of membranes into the rough endo­ plasmic erticulum. Ann. N. Y. Acad. Sci. 674,27-37. 79. Girgis, S. I., Alivizaki, M., Denny, P., Ferrier, G. J., and Legon, S. (1988).Generation of DNA probes with highly degenerate codons using mixed primer PCR. Nucl. Acids Res. 16, 10371.

80. Gooday, G. W. (1986).Chitinase activities in animals, fungi and bacteria. In K. Muzzarelli.C. Jeuniaux, & C. W. Gooday (eds.), Chitin in nature and technology (pp. 241-261). New York: Plenum Press. 81. Gorlich, D., Prehn, S., Laskey, R. A., and Hartmann, E. (1994).Isolation of a protein that is essential for the first step of nu­ clear protein import. Cell. 79,767-778. 82. Graham, T. R., Zassenhaus, H. P., and Kaplan, A. (1988). Molecular cloning of the cDNA which encodes beta-lV-acetylhex- osaminidase A from Dictyostelium discoideum: complete amino acid sequence and homology with the human enzyme. J. Biol. Chem. 263,16823-16829.

83. Grenier, F. C., Waygood,E. B., and Saier, M. H . J. (1986).The bacterial phosphotransferase system: kinetic characterization of the glucose, innocitol, glucitol, and AT-acetylglucosamine systems. J. Cell. Biochem. 31, 97-105.

84. Guan, C., Li, P., Riggs, P. D., and Inouye, H. (1987).Vectors that facilitate the expression and purification of foreign peptides in Escherichia coli by fusion to maltose-binding protein. Gene. 67, 21- 30.

85. Guan, Y. Q., Pong, H. L., He, Y. S., Chen, J. M., and Ren, K. Q. (1989).Preparation of high purity IgG and its application in screening for clones positive for Vibrio succinogenes L- asparaginase gene. Chin. J. Biotechnol. 5(1), 19-26.

86. Hall, J., Hazlewood, G. P., Surani, M. A., Hirst,B. H., and Gilbert, H. J. (1990).Eukaryotic and prokaryotic signal peptides direct secretion of a bacterial endoglucanase by mammalian cells. J. Biol. Chem. 265(32), 19996-19999. 109

87. Hall, M. N., Schw art, M., and Silhavy, T. J. (1982).Sequence information within the lam B gene is required for proper routing of the bacteriophage lambda receptor protein to the outer membrane of Escherichia coli K-12. J. Mol. Biol. 156, 93-112.

88. Hanahan, D. (1983).Studies on transformation of Escherichia coli with plasmids. J. Mol. Biol. 166(4), 557-580. 89. Harmsen, M. H., Langedijk, A. C., van Tuinen, E., Geerse, R. H., Raue, H. A., and Maat, J. (1993).Effect of a pmrl disruption and different signal sequences on the intracellular processing and secretion of Cyamopsis tetragonoloba alpha-galactosidase by Saccharomyces cerevisiae. Gene. 125, 115-123. 90. Hartl, F.-U., Lecker, S., Schiebel, E., Henrick, J. P., and Wickner, W. (1990).The binding cascade of secB to secA to secE/Y mediattes protein targeting to the E. coli plasma membrane. Cell. 63,269-279. 91. Hartl, F.-U., Pfanner, N., Nicholson, D. W., and Neupert, W. (1989).Mitochondrial protein import. Biochim. Biophys. Acta. 988, 1-45.

92. Hawley, D., and McLure, W. (1983).Compilation and analysis of Escherichia coli promoter DNA sequences. Nucl. Acids Res. 11(8), 2237-2255.

93. He, S. Y., Lindeberg, M., Chattejee, A.K., and Collmer, A. (1991).Cloned Erwinia chrysanthemi out genes enable Escherichia coli to selectively secrete a diverse family of heterologous proteins to its milieu. Proc. Natl. Acad. Sci. USA. 88, 1079-1083.

94. High, S., and Doberstein, B. (1992).Membrane protein insertion into the endoplasmic reticulum: signals, machinery and mecha­ nism s. In W. N eupert & R. Lill (eds.), Membrane Biogenesis and Protein Targeting (pp. 105-118). Amsterdam: Elsevier.

95. Higuchi,R. (1989).Rcombinant PCR. In H. A. Erlich (eds.), PCR Protocols: A Guide to Methods and Applications (pp. 177-183). New York: Academic Press, Inc.

96. Higuchi,R. (1989).Using PCR to engineer DNA. In H. A. Erlich (eds.), PCR Technology: Principles and Applications for DNA Amplification (pp. 61-70). New York: Stckton Press.

97. Hinnen, A., Meyhack, B., and Heim, J. (1989).Heterologous gene expression in yeast. In P. Barr,A. Brake, & P. Valenzuela (eds.), Yeast Genetic Engineering (pp. 193-214). Stoneham: Butterworths. 110

98. Hoffman, C., and Wright, A. (1985).Fusions of secreted proteins to alkaline phosphotase: an approch for studing protein secretion. Proc. Natl. Acad. Sci. USA. 82, 5107-5111.

99. Holland, I. B. (1989). Protein secretion in Escherichia coli w ith particular refrence to haemolysin. Biochem. Soc. Trans. 17(5), 828- 830.

100. Hoogenboom, H. R., and Winter, G. (1992). By-passing immu­ nisation: human antibodies from synthetic repertoires of germline Vh gene segments rearranged in vitro. J. Mol. Biol. 227(2), 381-388.

101. Horwich, A. L., Kolousek, F., and Rosenberg, L. E. (1985).A leader sequence is sufficient to direct mitochondrial import of a chimeric protein. EMBO J. 4, 1129-1135. 102. Hu, N.-T., Hung, M.-N., Chiou, S.-J., Tang, F., Chiang, D.-C., H uang, H.-Y., and Wu, C.-Y. (1992).Cloning and characteriza­ tion of a gene required for the secretion of extracellular enzymes across the outer membrane by Xanthemonas campestris pv. campestris. J. Bacteriol. 174, 2679-2787.

103. Hurt, E. C., Pesold-Hurt, B., and Schatz, G. (1984). The cleav- able prepiece of an imported mitochondrial protein is sufficient to direct cytosolic dihydrofolate reductase into the mitochondrial ma­ trix. FEBS Lett. 178, 306-310.

104. Inouye, M., and Halegoua, S. (1980). Secretion and membrane localization of proteins in Escherichia coli. CRC Crit. Rev. Biochem. 7(4), 339-371. 105. Inouye, S., Wang, S. S., Sekizawa, J., Halegua, S., and Inouye, M. (1977).Amino acid sequence for the peptide extension on the prolipoprotein of the Escheritchia coli outer membrane. Proc. Natl. Acad. Sci. USA. 74(3), 1004-1008.

106. Isaki, L., and Wu, H. C. (1994). Lipoprotein signal peptidase. In J. Rothblatt,P. Novick, & T. Stevens (eds.), Guidebook to the secretory pathway (pp. 18-20). New York: Oxford University Press, Inc.

107. Ito, K., Bassford, P. J. J., and Beckwith, J. (1981).P rotein localization in E. coli. Is there a common step in the secretion of periplasmic and outer membrane proteins? Cell. 24, 707-714.

108. Ito, K., Date, T., and Wickner, W. (1980).Synthesis, assembly into the cytoplasmic membrane, and proteolytic processing of the precursor of coliphage M13 coat protein. J. Biol. Chem. 255, 2123- 2130. I l l

109. Jacob, F., and Monod, J. (1961).Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 3, 318-356. 110. Jannatipour, M., Soto-Gil, R. W., Childers, L. C., and Zyskind, J. W. (1987).Translocation of Vibrio harveyi N,N'-di- acetylchitobiase to the outer membrane of Escherichia coli. J . Bacteriol. 169(8),3785-3791. 111. Jarvis, D., Summers, M., Gacia, A., Jr., and Bohlmeyer, D. (1993).Influence of different signal peptides and prosequences on expression and secretion of tissue plasminogen activator in the bac- ulovirus system. J. Biol. Chem. 268(22), 16754-16762.

112. Jeu n iau x, C. (1959).Action consecutive de deux enzymes dif- ferents au cours de l'hydrolyse complete de la chitin [Consecutive action of two different enzymes central to the complete hydrolysis of chitin]. Arch. Intern. Physiol. Biochem. 67,115-116.

113. Jiang, B., and Howard, S. P. (1992).The Aeromonas hydrophila exeE gene, required both for protein secretion and normal outer membrane biosynthesis, is a member of a general secretion family. Mol. Microbiol. 6,1351-1361. 114. Kalderon, D., Roberts, B. L., Richardson, W. D., and Smith, A. E. (1984). A short amino acid sequence able to specify nuclear location. Cell. 39, 499-509.

115. Knoth, K., Roberds, S., Poteet, C., and Tamkun, M. (1988). Highly degenerate, inosine-containing primers specifically amplify rare cDNA using the polymerase chain reaction. Nucl. Acids Res. 16,10932.

116. Korneluk, R., Mahuran, D., Neote, K., Klavins, M., O'Dowd, B., Tropak, M., Willard, H., Anderson, M., Lowden, J., and Gravel, R. (1986).Isolation of cDNA clones coding for the alpha- subunit of human beta-hexosaminidase. J. Biol. Chem. 261(18), 8407-8413.

117. Kuga, T., Komatsu, Y., Mizukami, T., Sato, M., and Itoh, S. (1990). Effect of N-terminal deletion of signal peptide on the secre­ tion of human granulocyte-colony stimulating factor in mammalian cells. Biotechnol. Lett. 12(2), 87-92. 118. Kukuruzinska, M., Harrington, W., and Roseman, S. (1982). Sugar transport by the bacterial phosphotransferase system: Studies on molecular weight and association of enzyme I. J. Biol. Chem. 257(23), 14470-14476. 112

119. Kundig, W., and Roseman, S. (1971).Sugar transport: II. Characterization on constitutive membrane-bound enzymes II of the Escherichia coli phosphotransferase system. J. Biol. Chem. 246(5), 1407-1418. 120. Kunkel, T. A., Sabatino, R. D., and Bambara, R. A. (1987). Exonucleolytic proofreading by calf thymus DNA polymerase-delta. Proc. Natl. Acad. Sci. USA. 84,4865-4869.

121. Kuranda, M. J., and Aronson, N. N., Jr. (1986).A di-iV-acetyl- chitobiase activity is involved in the lysosomal catabolism of as­ paragine-linked glycoprotein in rat liver. J. Biol. Chem. 261(13), 5803-5809.

122. Kyte, J., and Doolittle, R. F. (1982). A simple method for displaying the hydrophobic character of a protein. J. Mol. Biol. 157, 105-132. 123. Laine, R. A., Ou, C.-Y., and Jaynes, J. M. (Oct. 4, 1994). Molecular clone of a chitinase gene from Vibrio parahaemolyticus. U. S. patent #5,352,607.

124. Lanford, R. E., and Butel, J. S. (1982).Intracellular transport of SV40 large tumor antigen: a mutation which abolishes migration to the nucleus does not prevent association with the cell surface. Virology. 119,169-184.

125. Lanford, R. E., and Butel, J. S. (1984). Construction and characterization of an SV40 mutant defective in nuclear transport of T-antigen. Cell. 37, 801-813. 126. Lanford, R. E., White, R. G., Dunham, R. G., and Kunda, P. (1988). Effect of basic and nonbasic amino acid substitutions on transport induced by simian virus 40 T-antigen synthetic peptide nuclear transport signals. Mol. Cell Biol. 8, 2722-2729.

127. Lauritzen, C., Tiichsen, E., Hansin, P. E., and Skovgaard, O. (1991).BPTI and N-terminal extended analogs generated by Factor X a cleavage and cathepsin C trimming of a fusion protein ex­ pressed in Escherichia coli. Prot. Expr. Purif. 2, 372-378. 128. Lecker, S., Lill, R., Ziegelholffer, T., Geougopoulos, C., Bassford, P. J., Jr., Kumamoto, C. A., and Wickner, W. (1989). Three pure chaperon proteins of Escherichia coli—secB, trig­ ger factor and groEL—form soluble complexes with precursor pro­ teins in vitro. EMBO, J. 8, 2703-2709.

129. Lee, C. C., Wu, X., Gibbs, R. A., Cook, R. G., Muzny, D. M., and Caskey, C. T. (1988).Generation of cDNA probes directed by 113

amino acid sequence: cloning of urate oxidase. Science. 239, 1288- 1291.

130. Lee, J. I . , Kuhn, A., and Dalbey, R. E. (1992). Distinct domains of an oligotopic membrane protein are Sec-dependent and Sec-inde­ pendent for membrane insertion. J. Biol. Chem. 267(2), 938-943.

131. Lemay, G., Waksman, G., Roques, B. P., Crine, P., and B oileau, G. (1989). Fusion of a cleavable signal peptide to the ectodomain of neutral endopeptidase (EC 3.4.24.11) results in the se­ cretion of active enzyme in COS-1 cells. J. Biol. Chem. 264(26), 15620-15623.

132. Lewin, B. (1990). Genes IV. Oxford University Press.

133. Li, L., Luo, L., Thomas, D., and Kang, C. (1994).Control of ex­ pression, glycosylation, and secretion of HIV-1 gpl20 by homologous and heterologous signal sequences. Virology. 204, 266-278.

134. Lindeberg, M., and Collmer, A. (1992). Analysis of eight ou t genes in a cluster required for pectic enzyme secretion by E rw inia chrysanthemi: sequence comparison with secretion genes from other Gram-negative bacteria. J. Bacteriol. 174(22), 7385-7397. 135. Liu, G., Topping, T. B., Cover, W. H., and Radall, L. L. (1988).Retardation of folding as a possible means of suppression of a mutation in the leader sequence of an exported protein. J. Biol. Chem. 263, 14790-14793. 136. Liu, G., Topping, T. B., and Randall, L. L. (1989). Physiological role during export for the retardation of folding by the leader peptide of maltose binding protein. Proc. Natl. Acad. Sci. USA. 86,9213-9217.

137. Lively, M. O., and Walsh, K. A. (1983).Hen oviduct signal pepti­ dase is an integral membrane protein. J. Biol. Chem. 258, 9488-9495.

138. Lund, P ., and Dunsmuir, P. (1992). A plant signal sequence en­ hances the secretion of bacterial ChiA in transgenic tobacco. P lant Molecular Biology. 18, 47-53.

139. Lundberg, K. S., Shoem aker, D. D., Adams, M. W., Short, S. M., Sarge, J. A., and Mathur, E. J. (1991).High fidelity amplifi­ cation using a thermostable DNA polymerase isolated from Pyrococcus furiosus. Gene. 108, 1-6.

140. M acnab, R. M. (1987). Motility and chemotaxis. In F. C. Nerdhardt,J. L. Ingraham,K. B. Low,B. Magasanik,M. Schaechter, & H. E. U m barger (eds.), Escherichia coli and Salmonella ty- 114

p h im u r iu m : Cellular and molecular biology (pp. 732-759). Washington, D. C.: American Society for Microbiology. 141. M aina, C. V., R iggs, P. D., Grandea, A. G. I., Slatko, B. E., Moran, L. S., Tagliamonte, J. A., McReynolds, L. A., and Guan, C. (1988).A vector to express and purify foreign proteins in Escherichia coli by fusion to, and separation from, maltose binding protein. Gene. 74, 365-373.

142. Manoil, C., and Beckwith, J. (1985).TnPhoA: a tranposon probe for protein export signals. Proc. Natl. Acad. Sci. USA. 82, 8129-8133. 143. Marks, J. D., Hoogenboom, H. R., Bonnert, T. P., McCafferty, J., Griffiths, A. D., and Winter, G. (1991).By­ passing immunization. Human antibodies from V-gene libraries displayed on phage. J. Mol. Biol. 222(3), 581-597.

144. Martin, J., and Hartl, F.-U. (1994).GroEL and GroES. In J. Rothblatt,P. Novick, & T. Stevens (eds.), Guidebook to the secretory pathway (pp. 9-10). New York: Oxford University Press, Inc. 145. Mattila, P., Korpela, J., Tenkanen, T., and Pitkanen, K. (1991).Fidelity of the DNA synthesis by the Thermococcus litoralis DNA polymerase-an extremely heat stale enzyme with proofreading activity. Nucl. Acids Res. 19, 4967-4973.

146. McCorquodale, D. (1975).The T-odd Bacteriophages. In A. Caskin & H. Lachevalier (eds.), Crit. Rev. Microbiol, (pp. 101-159). CRC Press.

147. McCorquodale, D., and Warner, H. (1988). Bacteriophage T5 and related phages. In R. Calendar (eds.), The bacteriophages (pp. 439-475). Plenum Press.

148. McLafferty, M. A., Kent, R. B., Ladner, R. C., and Markland, W. (1993). M13 bacteriophage displaying disulfide-constrained mi­ croproteins. Gene. 128(1), 29-36.

149. Meadow, N. D., Fox, D. K., and Rosem an, S. (1990). The bacte­ rial phosphoenol-pyruvate:glycose phosphotransferase system. Annu. Rev. Biochem. 59, 497-542. 150. Meadow, N. D., Revuelta, R., Chen, V. N., Colwell, R. R., and Roseman, S. (1987). Phosphoenolpyruvate:glycose phosphotransferase system in species of Vibrio, a widely distributed marine bacterial genus. J. Bacteriol. 169(11),4893-4900.

151. Meadow, N. D., and Roseman, S. (1982).Sugar transport by the bacterial phosphotransferase system: Isolation and 115

characterization of a glucose-specific phosphocarrier protein (HlGlc) from Salmonella typhimurium. J. Biol. Chem. 257(23), 14526-14537.

152. Messing, J. (1979). A multipurpose cloning system based on sin­ gle-stranded DNA bacteriophage M13. Recomb. DNA Tech. Bull. 2(2), 43.

153. Meyer, D. I., and Doberstein, B. (1980). Identification and char­ acterization of a membrane component essential for the transloca­ tion of nascent proteins across the membrane of the endoplasmic reticulum. J. Cell Biol. 87, 503-508.

154. Meyer, D. I., and Doberstein, B. (1980). A membrane component for vectorial translocation of nascent proteins across the endoplasmic reticulum: requiement for its extraction and reassociation with the membrane. J. Cell Biol. 87, 498-502.

155. Meyer, D. I., Krause, E., and Doberstein, B. (1982). Secretory translocation across membranes: the role of the "docking protein". Nature. 297, 647-650.

156. Michaelis, S., Guarente, L., and Beckwith, J. (1983). In vitro construction and characterization of p h o A -la cZ gene fusions in Escherichia coli. J. Bacterial. 154, 356-365.

157. Mills, K. I., Sproul, A. M., Ogilvie, D., Elvin, P., Leibowitz, D., and Burnett, A. K. (1992). Amplification and sequencing of genomic breakpoints located within the M-bcr region by Vectorette- mediated polymerase chain reaction. Leukemia. 6, 481-483.

158. Milstein, C., Brownlee, G., Harrison, T., and Mathews, M. (1972). A possible precursor of immunoglobulin light chains. Nature New Biol. 239(91), 117-120.

159. Miyajima, A., and Arai, K. (1989). Use of a cDNA expression- cloning vector and a secretion vector for mammalian gene expres­ sion in Saccharamyces cerevisiae. In P. Barr,A. Brake, & P. Valenzuela (eds.), Yeast Genetic Engineering (pp. 281-304). Stoneham: Butterworths.

160. Moore, K. E., and M iura, S. J. (1992). A small hydrophobic do­ main anchors leader peptidase to the cytoplasmic membrane of Escherichia coli. J. Biol. Chem. 262(18), 8806-8813.

161. Morioka-Fujimoto, K., Marumoto, R., and Fukuda, T. (1991). Modified enterotoxin signal sequences increases secrettion level of the recombinant human epidermal growth factor in Escherichia coli. J. Biol. Chem. 266(3), 1728-1732. 116

162. Morrison, S. L. (1992).In vitro antibodies: Strategies for produc­ tion and application. Annu. Rev. Immunol. 10, 239-265. 163. Mulligan, M., Brosius, J., and McClure, W. (1985). Characterization in vitro of the effect of spacer length on the activity of Escherichia coli RNA polymerase at the tac promoter. J. Biol. Chem. 260(6), 3529-3538.

164. Mullis, K. B., and Falooda, F. A. (1987).Specific synthesis of DNA in vitro via a polymerase-catalized chain reaction. M ethods Enzymol. 155(21), 335-350. 165. Mullis, K. B., Falooda, F. A., Scharf, F. J., Saiki, R. K., Horn, G. T., and Erlich, H. A. (1986). Specific enzymatic reaction of DNA in vitro: the polymerase chain reaction. C old Spring Harbor Symp. Quant. Biol. 51, 263-273.

166. M uzzarelli, R. A. A. (1973).Chitin. Pergamon: Oxford.

167. Myatt, D. C., and Davis, G. H. G. (1989).Extracellular and sur­ face-bound biological activities of Vibrio fluvialis, Vibrio furnissii and related species. Med. Enzym. Immunol. 178,279-287.

168. Nagai, K., and Thogersen, H. C. (1984).Generation of beta- globin by sequence-specific proteolysis of a hybrid protein produced in Escherichia coli. Nature. 309, 810-812.

169. Nagai, K., and Thogersen,H . C. (1987). Synthesis and sequence-specific proteolysis of hybrid proteins produced in Escherichia coli. Methods in Enzymol. 153, 461-481. 170. Nam, S. W., Kim, B. M., Chung, B. H., Kang, D. O., and Ahn, J. S. (1994).Expression and secretion of human lipocortin-1 by promoter and signal sequence of STA1 form Saccharomyces di- astaticus. Biotechnol. Lett. 16(9), 897-902.

171. Natsoulis, G., Hilger, F., and Fink, G. R. (1986).The HTS1 gene encodes both the cytoplasmic and mitochondrial histidine tRNA synthetase of S. cerevisae. Cell. 46, 235-243.

172. Needleman, S., and Wunsch, C. (1970).A general method ap­ plicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48(3), 443-453.

173. Nilsson, B., and Anderson, S. (1991).Proper and improper fold­ ing of proteins in the cellular environment. Annu. Rev. Microbiol. 45,607-635. 117

174. Nishizawa, M., Ozawa, F., and Hishinuma, F. (1989). Construction of yeast secretion vectors designed for production of mature proteins using the signal sequence of yeast invertase. A ppl Microbiol Biotechnol. 32, 317-322.

175. Nossal, N. G., and Hepel, L. A. (1966).The release of enzymes by osmotic shock from Escherichia coli in exponential phase. J. Biol. Chem. 241, 3055-3062.

176. Ochman, H., Gerber, A. S., and Hartl, D. L. (1988).Genetic ap­ plications of an inverse polymerase chaine reaction. Genetics. 120, 621-623. 177. Oka, T., Sakamoto, S., Miyoshi, K., Fuwa, T., Yoda, K., Yamasaki, M., Tamura, G., and Miyake, T. (1985).Synthesis and secretion of human epidermal growth factor by Escherichia coli. Proc. Natl. Sci. USA. 82, 7212-7216.

178. Oliver, D. B. (1993).SecA protein: autoregulated ATPase catalysing preprotein insertion and translocation across the Escherichia coli inner membrane. Mol. Microbiol. 7(2), 159-165.

179. O ste, C. (1989).PCR automation. In H. A. Erlich ( e d s .), PCR Technology: Principles and Applications for DNA Amplification (pp. 23-30). New York: Stockton Press. 180. Palva, I., Pettersson, R. F., Kalkkinen, N., Lehtovaara, P., Sarvas, M., Soederlund, H., Takkinen, K ., an d Kaeaeriaeinen, L. (1981). Nucleotide sequence of the promoter and NH2-terminal signal peptide region of the alpha-amylase gene from Bacillus amyloliquefaciens. Gene. 15(1), 43-51. 181. Palva, I., Sarvas, M., Lehtovaara, P., Shakow, M., and Kaariainen, L. (1982). Secretion of Escherichia coli beta-lacta- m ase from Bacillus subtilis by the aid of alpha-amylase signal se­ quence. Proc. Natl. Acad. Sci. USA. 79,5582-5586.

182. Patil, R. V., and Dekker, E. E. (1990). PCR amplification of an Escherichia coli gene using mixed primers containing deoxyinosine in degenerate amino acid codons. Nucl. Acids Res. 18, 3080.

183. Pearson, W., and Lipman, D. (1988).Improved tools for biologi­ cal sequence comparison. Proc. Natl. Acad. Sci. USA. 85(8), 2444- 2448.

184. Perler, F. B., Comb, D. G., Jack, W. E., Moran, L. S., Qiang, B., Kucera, R. B., Benner, J., Slatko, B. E., Nwankwo, D. O., Hempstead, S. K., and al, e. (1992). Intervening sequences in 118

the Archea DNA polymerase gene. Proc. Natl. Acad. Sci. USA. 89(12), 5577-5581.

185. Pfanner, N., and Neupert, W. (1990).The Mitochondrial Protein Import Apparatus. Ann.. Rev. Biochem. 59, 331-353.

186. Phillips, G. J., and Silhavy, T. J. (1990).Heat-shock proteins DnaK and GroEL facilitate export of LacZ hybrid proteins in E. coli. Nature. 344, 882-884.

187. Platt, T. (1978).Regulation of gene expression in the tryptophan operon of Escherichia coli. In J. M iller & W. Reznikoff ( e d s .), The Operon (pp. 263-302). Cold Spring Harbor Laboratory.

188. Poulis, M. I., Shaw, D. C., Campbell, H. D., and Young, I. G. (1981).In vitro synthesis of the respiratory NADH dehydrogenase of Escherichia coli: role of UUG as initiation codon. Biochemistry. 20(14), 4178-4185.

189. Powers, M. A., and Forbes, D. J. (1994).Cytosolic factors in nu­ clear transport: what's importin? Cell. 79,931-934.

190. Pugsley, A., Kornacker, M., and Ryter, A. (1990).Analysis of the subcellular location of pullulanase produced by E. coli carrying the p u lA gene from Klebsiella pneumoniae strain UNF5023. Mol. Microbiol. 4(1), 59-72. 191. Pugsley, A. P., d'Enfert, C., Reyss, I., and Kornacker, M. G. (1990). Genetics of extracellular protein secretion by Gram- Negative bnacteria. Annu. Rev. Genet. 24, 67-90.

192. Randall, L. L., and Hardy, S. J. S. (1986).Correlation of compe­ tence for export with lack of tertiary structure of the mature species: a sudy in vivo of maltose binding protein in E. coli. Cell. 46, 921-928.

193. Randall, L. L., Topping, T. B., and Hardy, S. J. S. (1990).No specific recognition of leader peptide by secB, a chaperon involved in protein export. Science. 248, 860-863. 194. Reeves, P. J., Whitcombe, D., Wharam, S., Gibson, M., Allison, G., Bunce, N., Barallon, R., Douglas, P., Mulholland, V., Stevens, S., Walker, D., and Salmond, G. P. C. (1993).Molecular cloning and characterization of 13 out genes from Erwina carotovora subspecies carotovora: genes encoding members of a general secretion pathway (GSP) widespread in Gram-negative bacteria. Mol. Microbiol. 8(3), 443-456. 119

195. Riley, J., Butler, R., Ogilvie, D., Finnier, R., Jenner, D., Powell, S., Anand, R., Smith, J. C., and Markham, A. F. (1990). A novel, rapid method for the isolation of terminal sequences from yeast artifical chromosome (YAC) clones. N ucl. Acids Res. 18, 2887-2890.

196. Roseman, S. (1957).Glucosamine metabolism: I. iV-acetylglu- cosamine deacetylase. J. Biol. Chem. 226, 115-124.

197. Rosenberg, M., and Court, D. (1975).Regulatory sequences in­ volved in the promotion and termination of RNA transcription. Ann. Rev. Genet. 13, 319-353.

198. Rothblatt, J. (1994).Protein translocation and maturation in the mammalian ER. In J. Rothblatt,P. Novick, & T. Stevens ( e d s .), Guidebook to the secretory pathway (pp. 65-67). New York: Oxford University Press, Inc. 199. Saffen, D. W., Presper, K. A., Doering, T. L., and Roseman, S. (1987).Sugar transport by the bacterial phosphotransferase system, molecular cloning and structural analysis of the Escherichia coli ptsH, ptsl, and err genes. J. Biol. Chem. 262(33), 16241-16253.

200. Saier, M. H., Jr., and Roseman, S. (1976).Sugar transport: the err mutation: its effect on repression of enzyme synthesis. J. Biol. Chem. 251(21), 6598-6605.

201. Saiki, R. K., Bugawan, T. L., Horn, G., Mullis, K. B., and Erlich, H. A. (1986). Analysis of enzymatically amplified beta- globin and HLA-DQa DNA with allel-specific oligonucleotide probes. Nature. 324, 163-166.

202. Saiki, R. K., Scharf, S. J., Faloona, F., Mullis, K., Horn, G., Erlich, H. A., and Arnheim, N. (1985).Enzymatic amplification of beta-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science. 230(4732), 1350-1354. 203. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). Molecular Cloning: a laboratory manual (2 E d .). Cold Spring Harbor Laboratory Press.

204. Santos, E., Kung, H., Young, I. G., and Kaback, H. R. (1982). In vitro synthesis of the membrane-bound D-lactate dehydrogenase of Escherichia coli. Biochemistry. 21(9), 2085.

205. Schatz, P. J., and Beckwith, J. (1990).Genetic analysis of pro­ tein export in Escherichia coli. Annu. Rev. Genet. 24, 215-248. 120

206. Schiebel, E., Driessen, A. J., Hartl, F.-U., and Wickner, W. (1991).Delta mu H+ and ATP function at different steps of the cat­ alytic cycle of preprotein translocase. Cell. 64(5),927-939. 207. Schmidt, G. W., Devilliers-Thiery, A., Desruisseaux, H., Blobel, G., and Chua, N.-H. (1979).NH2-terminal amino acid sequences of precursor and mature forms of the small subunit of ribulose 1,5-bisphosphate carboxylase from Chlamydomonas rein- hardtii. J. Cell Biol. 83, 615-622.

208. Schm idt, G. W., and M ishkind, M. L. (1986).The transport of proteins into chloroplasts. Annu. Rev. Biochem. 55, 879-912. 209. Sheffield, V. C., Cox, D. R., Lerman, L. S., and Myers, R. M. (1989).Attachment of a 40-basepair G+C rich sequence (GC Clamp) to genomic DNA fragments by the polymerase chain reaction results in improved detection of single-base changes. Proc. Natl. Acad. Sci. USA. 86, 232-236.

210. Shimahara, K., and Takiguchi, Y. (1988).Preparation of crus­ tacean chitin. Methods Enzymol. 161, 417-423. 211. Shimizu, Y., Takabayashi, E., Yano, S., Shimizu, N., Yamada, K., and Gushima, H. (1988).Molecular cloning and expression in Escherichia coli of the cDNA coding for rat lipocortin I (calpactin II). Gene. 65, 141-147.

212. Silber, K. R., Keiler, K. C., and Sauer, R. T. (1992). Tsp: a tail- specific protease that selectively degrades proteins with nonpolar C termini. Proc. Natl. Acad. Sci. USA. 89(1), 295-299.

213. Silver, P., and Goodson, H. (1989).Nuclear protein transport. Crit. Rev. Biochem. and Mol. Biol. 24, 419-435.

214. Singer, S. J., Maher, P. A., and Yaffe, M. P. (1987).On the translocation of proteins across membranes. Proc. Natl. Acad. Sci. USA. 84,1015-1019.

215. Smith, D. B., and Johnson, K. S. (1988).Single-step purification of polypeptides expressed in Escherichia coli as fusions with glu­ tathione S-transferase. Gene. 67(31), 31-40. 216. Soberon, X., Rossi, J. J., Larson, G. P., and Itakura, K. (1982). A synthetic, consensus sequence, prokaryotic promoter is functional. In R. L. Rodriguez & M. J. Chamberlin ( e d s .), Promoters: Structure and function (pp. 407-431). Praeger. 121

217. Sodergren, E. J., Davidson, J., Taylor, K. K., and Silhavy, T. J. (1985).Selection of mutants altered in the expression or export of outer membrane porin OmpF. J. Bacteriol. 162, 1047-1053.

218. Somerville, C. C., and Colwell, R. R. (1993).Sequence analysis of the beta-N-acetylhexosaminidase gene of Vibrio vulnificus: evi­ dence for a common evolutionary origin of hexosaminidases. Proc. Natl. Acad. Sci. USA. 90(14), 6751-6755.

219. Soto-Gil, R. W., and Zyskind, J. W. (1984).Cloning of V ibrio harveyi chitinase and chitobiase genes in Escherichia coli. In J. P. Zikakis (eds.), Chitin, chitosan, and related enzymes (pp. 209-223). Academic Press, Inc.

220. Soto-Gil, R. W., and Zyskind, J. W. (1989).7V,iV'-diacetylchito- biase of Vibrio harveyi: primary structure, processing, and evolu­ tionary relationships. J. Biol. Chem. 264(25), 14778-14783.

221. Staden, R. (1980). A new computer method for the storage and manipulation of DNA gel reading data. Nucl. Acids Res. 8(16), 3673- 3694.

222. Stock, J. B., Rauch, B., and Roseman, S. (1977). Periplasmic space in Salmonella typhimurium and Escherichia coli. J. Biol. Chem. 252(21), 7850-7861.

223. Suominen, I., Karp, M., Lautamo, J., Knowles, J. K. C., and Mantsaelae, P . (1987).Thermostable alpha amylase of B a cillu s stearothermophilus: Cloning, expression, and secretion by Escherichia coli. In J. Chaloupka & V. Krumphanzl (ed s.), Extracellular Enzymes of Microorganisms (pp. 129-137). New York: Plenum Press.

224. Takagi, H., Morinaga, Y., Tsuchiya, M., Ikemura, G., and Inouye, M. (1988).Control of folding of proteins secreted by a high expression secretion vector, pIN-III-ompA: 16-fold increase in pro­ duction of active subtilisin E in Escherichia coli. Bio / technology. 6, 948-950.

225. Takkinen, K., Pettersson, R. F., Kalkkinen, N., Palva, I., Soederlund, H., and Kaeaeriaeinen, L. (1983).Amino acid se­ quence of alpha-amylase from Bacillus amyloliquefaciens deduced from the nucleotide sequence of the cloned gene. J. Biol. Chem. 258, 1007-1013.

226. Talmadge, K., Kaufman, J., and Gilbert, W. (1980).B acteria mature preproinsulin to proinsulin. Proc. Natl. Acad. Sci. USA. 77, 3988-3992. 122

227. Talmadge, K., Stahl, S., and Gilbert, W. (1980).Eukaryotic signal sequence transport insulin antigen in Escherichia coli. Proc. Natl. Acad. Sci. USA. 77,3369-3373.

228. Tessier, D. C., Thomas, D. Y., Khouri, H. E., Laliberte, F., and Vernet, T. (1991).Enhanced secretion from insect cells of a foreign protein fused to the honeybee melittin signal peptide. Gene. 98177-183. ,

229. Toennies, G., and Gallant, L. (1947).The relation between pho­ tometric turbidity and bacterial concentration (bacterimetric stud­ ies. IV). Growth. 13, 7-20. 230. Tommassen, J., Leunissen, J., van Damme-Jongsten, M., and Overduin, P. (1985).Failure of E. coli K-12 to transport PhoE- LacZ hybrid protein out of the cytoplasm. EMBO J. 4, 1041-1047. 231. Tommassen, J., Pugsley, A. P., Korteland, J., Verbakel, J., and Lugtenberg, B. (1984).Gene encoding a hybrid OmpF-PhoE pore protein in the outer membrane of Escherichia coli K12. Mol. Gen. Genet. 197,503-508. 232. Tommassen, J., van Tol, H., and Lugtenberg, B. (1983). The ultimate localization of a OM protein of Escherichia coli K12 is not determined by the signal sequence. EMBO J. 2 , 1275-1279. 233. Trumbly, R., Robbins, P., Belfort, M., Ziegler, F., Maley, F., and Trimble, R. (1985).Amplified expression of Streptomyces Endo-beta-N-acetylglucosaminidase H in Escherichia coli and characterization of the enzyme product. J. Biol. Chem. 260, 5683- 5690. 234. Tsuchiya, Y., Fujisawa, H., Nakayama, K., Nagahora, H., and Jigami, Y. (1993).Effect of chicken lysozyme signal peptide alterations on secretion of human lysozyme in Saccharomyces cere- visiae. Biochem. Cell Biol. 71, 401-405.

235. Ulhoa, C., and Peberdy, J. (1991). Purification and Characterization of an extracellular chitobiase from Trichoderma harzianum. Current Microbiology. 23, 285-289.

236. Ulmanen, I., Lundstron, K., Lehtovaara, P., Sarvas, M., Ruohonen, M., and Palva,I. (1985).Transcription and transla­ tion of foreign genes in Bacillus subtilis by the aid of a secretion vector. J. Bactriol. 162,176-182. 237. Van den Broeck, G., Timko, M. P., Kausch, A. P., Cashmore, A. R., Van Montagu, M., and Herra-Estrella, L. (1985). Targeting of a foreign protein to chloroplast by fusion to the transit 123

peptide from the small subunit of ribulose 1,5-bisphosphate carboxy­ lase. Nature. 313, 358-363.

238. van Dijl, J. M., de Jong, A., Smith, H., Bron, S., and Venema, G. (1991). Signal peptidase I overproduction results in increase efficiencies of export and maturation of hybrid secretory proteins in E. coli. Mol. Gen. Genet. 227, 40-48.

239. von Gabain, A., and Bujard, H. (1977). Interaction of E. coli RNA polymerase with promoters of coliphage T5: the rates of com­ plex formation and their correlation with in vitro and in vivo tra n ­ scriptional activity. Mol. Gen. Genet. 157, 301-311.

240. von Gabain, A., and Bujard, H. (1979). Interaction of Escherichia coli RNA polymerase with promoters of several col­ iphage and plasmid DNAs. Proc. Natl. Acad. Sci. USA. 76(1), 189- 193.

241. von Heijne, G. (1980). Trans-membrane localization of proteins: a detailed physico-chemical analysis. Eur. J. Biochem. 103, 176-182.

242. von Heijne, G. (1981). On the hydrophobic nature of signal se­ quences. Eur. J. Biochem. 116, 419-422.

243. von Heijne, G. (1983). Patterns of amino acids near signal-se- quence cleavage sites. Eur. J. Biochem. 133(1), 17-21.

244. von Heijne, G. (1984). How signal sequences maintain cleavage specificity. J. Mol. Biol. 173, 243-251.

245. von Heijne, G. (1988). Transcending the impenetrable: how pro­ tein come to terms with membranes. Biochem. Biophys. Acta. 947, 307-333.

246. Walter, P., and Blobel, G. (1981). Translocation of proteins across the endoplasmic reticulum. II. signal recogition protein (SRP) mediates the selective binding to micosomal membranes of m-ui^ro-assembled polysomes synthesizing secrettory protein. J. Cell Biol. 91, 551-556.

247. Walter, P., and Blobel, G. (1981). Translocation of proteins across the endoplasmic reticulum. III. signal recogition protein (SRP) causes signal sequence-dependent and site-specific arrest of chain elongation that is released by microsomal membrane. J. Cell Biol. 91,557-561.

248. Walter, P., and Blobel, G. (1982). Signal recognition particle contains a 7S RNA essential for protein translocation across the endoplasmic reticulum. Nature. 299, 691-698. 124

249. Walter, P., Ibrahimi, I., and Blobel, G. (1981).Translocation of proteins across the endoplasmic reticulum. I. signal recogition pro­ tein (SRP) binds to in-ufrro-assembled polysomes synthesizing se- crettory protein. J. Cell Biol. 91, 545-550.

250. Walter, P., Jackson, R. C., M arcus, M. M., Lingappa, V. R., and Blobel, G. (1979).Tryptic dissection and reconstitution of translocation activity for nacent presecretory proteins across micro­ somal membranes. Proc. Natl. Acad. Sci. USA. 76, 1795-1799.

251. Watanabe, M., and Blobel, G. (1989). SecB functions as a cytoso­ lic signal recognition factor for protein export in E. coli. Cell. 58, 695-705. 252. Waygood, E. B., Mattoo, R. L., and Peri, K. G. (1984). Phosphoproteins and the phosphoenolpyruvate:sugar phosphotransferase system in Salmonella typhimurium a n d Escherichia coli: evidence for Hpnannose^ jjjfructose^ jjjglucito^ and the phosphorylation of enzyme n mannito1 and enzyme II^‘ acetylglucosamine j CeU Bioche1rlt 25(3), 139-159.

253. Weigel, N., Kukuruzinska, M. A., Nakazawa, A., Waygood, E. B., and Roseman, S. (1982). Sugar transport by the bacterial phosphotransferase system: Phophoryl transfer reactions catalized by enzyme I of Salmonella typhimurium. J. Biol. Chem. 257(23), 14477-14491.

254. Weigel, N., Powers, D. A., and Roseman, S. (1982). Sugar transport by the bacterial phosphotransferase system: primary structure and active site of a general phosphocarrier protein (HPr) from Salmonella typhimurium. J. Biol. Chem. 257(23), 14499-14509. 255. Weigel, N., Waygood, E. B., Kukuruzinska, M. A., Nakazawa, A., and Roseman, S. (1982).Sugar transport by the bacterial phosphotransferase system: Isolation and characterization of enzyme I from Salmonella typhimurium. J. Biol. Chem. 257(23), 14461-14469.

256. Wichner, B. (1994).The E. coli preprotein translocase. In J. Rothblatt,P. Novick, & T. Stevens (eds.), Guidebook to the secretory pathway (pp. 5-6). New York: Oxford University Press, Inc.

257. Wickner, W. (1979).The assembly of proteins into biological membranes: the membrane trigger hypothesis. A n n u . Rev. Biochem. 48, 23-45.

258. Wickner, W. (1980). Assembly of proteins into membranes. Science. 210, 861. 125

259. Wickner, W., Diressen, A., and Hartl, F. (1991). The enzymol- ogy of protein translocation across the Escherichia coli plasm a membrane. Annu. Rev. Biochem. 60, 101-124.

260. Wickner, W. T., and Lodish, H. F. (1985). Multiple Mechanisms of protein insertion into and across membranes. Science. 230, 400-407.

261. Wiech, H., Saggstetter, M., Muller, G., and Zimmermann, R. (1987). The ATP requiring step in assembly of M13 precoat protein into microsomes is related to preservation of transport competence of the precursor protein. EMBO J. 6(4), 1011-1016.

262. Winter, G., and Milstein, C. (1991). Man-made antibodies. Nature. 349, 293-299.

263. Wolfe, P. B., Wickner, W., and Goodman, J. M. (1983). Sequence of the leader peptidase gene of Escherichia coli and the orientation of leader peptidase in the bacterial envelope. J. Biol. Chem. 258, 12073.

264. Woodlock, D. M., Crother, P. J., Doherty, J., Jefferson, S., DeCruz, E., Noyer-Weidner, M., Smith, S. S., Michael, M. Z., and Graham, M., W. (1989). Quantitative evaluation of Escherichia coli host strains for tolerance to cytosine methylation in plasmid and phage recombinants. Nucl. Acids Res. 17(9), 3469- 3478.

265. Wortman, A. T., Somerville, C. C., and Colwell, R. R. (1986). Chitinase determinants of Vibrio vulnificus: gene cloning and ap­ plications of a chitinase probe. Appl. Environ. Microbiol. 52(1), 142- 145.

266. Yaffe, M. B., Levison, B. S., Szasz, J., and Sternlicht, H. (1988). Expression of a human alpha-tubulin: properties of the iso­ lated subunit. Biochemistry. 27, 1869-1880.

267. Yamamoto, Y., Taniyama, Y., and Kikuchi, M. (1989). Important role of the proline residue in the signal sequence that di­ rect the secretion of human lysozyme in Saccharomyces cerevisiae. Biochemistry. 28, 2728-2732.

268. Yamamoto, Y., Taniyama, Y., Kikuchi, M., and Ikehara, M. (1987). Engineering of the hydrophobic segment of the signal se­ quence for efficient secretion of human lysozyme by Saccharomyces serevisiae. Biochem. Biophys. Res. Commun. 149(2), 431-436. 126

269. Yamanaka, S., Johnson, O. N., Norflus, F., Boles, D. J., and Proia, R. L. (1994). Structure and expression of the mouse beta- hexosaminidase genes, Hexa and Hexb. Genomics. 21(3), 588-596.

270. Yang, M., Galizzi, A., and Henner, D. J. (1983). Nucleotide se­ quence of the amylase gene from Bacillus subtilis. Nucl. Acids Res. 11,237-249.

271. Yanisch-Perron, C., Viera, J., and Messing, J. (1985). Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mpl8 and pUC19 vectors. Gene. 33(1), 103-119.

272. Yazyu, H., Shiota-Niiya, S., Shimamoto, T., Kanazawa, H., Futai, M., and Tsuchiya, T. (1984). Nucleotide sequence of the m elB gene and characteristics of duduced amino acid sequence of the melibiose carrier in Escherichia coli. J. Biol. Chem. 259(7), 4320-4326.

273. Yu, C., Bassler, B., and Roseman, S. (1993). Chemotaxis of the marine bacterium Vibrio furnissii to sugars: a potential mechanism for initiating the chitin catabolic cascade. J. Biol. Chem. 268(13), 9405-9409.

274. Yu, C., Lee, A. M., Bassler, B. L., and Roseman, S. (1991). Chitin utilization by marine bacteria: a physiological function for bacterial adhesion to immobilized carbohydrates. J. Biol. Chem. 266(36), 24260-24267.

275. Zabin, I. (1980). Beta-galactosidase and the lactose operon. In D. S. Sigman & M. A. B. Brazier (eds.), The evolution of protein structure and function (pp. 49-62). Academic Press.

276. Zabin, I . , and Fowler, A. V. (1980). Beta-galactosidase, the lac­ tose permease protein, and thiogalactoside transacetylase. In J. H. Miller & W. S. Reznikoff (eds.), The Operon (pp. 89-122). Cold Spring Harbor Laboratory.

277. Zeilstra-Ryalls, J., Fayet, O., and Georgopoulos, C. (1991). The universally conserved GroE (Hsp60) chaperonins. Annu. Rev. Microbiol. 45, 301-325.

278. Zhu, B. C., Lo, J. Y., Li, Y. T., Li, S. C., Jaynes, J. M., Gildemeister, O. S., Laine, R. A., and Ou, C. Y. (1992). Thermostable, salt tolerant, wide pH range novel chitobiase from Vibrio parahaemolyticus: isolation, characterization, molecular cloning, and expression. J. Biochem. 112(1), 163-167.

279. Zikakis, J. P. (E d .). (1984). Chitin, Chitosan, and Related Enzymes. Orlando: Harcourt Brace Jovanovich. APPENDIXES

A.1 SEQUENCING AND PCR PRIMERS3 A. 1.1 Sequencing Primers p U C -F O R -P 5 ' -CGC CAG GGT TTT CCC AGT CAC GAC--3 ' p U C -R E V -P 5 ' -TCA CAC AGG AAA CAG CTA TGA C -3 1 M 1 3 -F O R -4 0 5 ' -G TT TTC CCA GTC ACG A C -3 ' KSP21-P1 5’-TTG CAG GTA AAC CTG A A C -3 ’ K S P 2 1 -P 2 5 ' -T T C AAT GAA GCC TTG G T T -3 ' K S P 2 1 -P 3 5 ' -GCC AAT TAA GCT AAA GA G -3 ' S K S 1 6 1 -P 1 5 ' -CAC GTT GTG GTG GTG A A T -3 1 S K S 2 1 -P 1 5 1-TCG TCC AGT CTC GGT G T T -3 1 SKS21-P2 5'-CTT TTT CTA GCG GCA A T G -3 ' S K S 2 1 -P 3 5 ' -GGG AAC AAC GCG AGA C T T - 3 1 S K S 2 1 -P 4 5 1-GAC TCA ATT CGC CAA C C T -3 1 SKH3 5-P1 51-TTG ATT TGC AGG GAG T G A -3 ' SKH35-P2 5'-ACG GGA GAT TAC GGT C A A -3 ' K B P -1 5 ' -CGC CGC GCT TGT TTA A T G -3 ' KBP-2 5'-CTA TCC GGC AGA ATG T A G -3 ' K B P -3 5 ' -ATC GCT GAC TTCTAT A T G -3 ' K B P -4 5 ' -GGC TGG TTG ATC GCC A A C -3 1 KBP-10 5'-CCA AGG AGC AGG TAC ATC- 3 ' K B P -1 1 5 1-GCT AAT GCC GCT GTA T T C -3 ' K B P -1 2 5 ' -GAC CGT CGT ATG AAT CTG-3' S F P -5 5 ’ -TGC TCG CCA TTT TCA C T C - 3 1 S F P -6 5 ’ -CGT GGT TGC CTT TGC T T C -3 ' S F P -7 5 1-GGA ACA ACT GGG TTA C A G -3 1 S F P -8 5 ' -CGC CAA GGT TTC GAT G T G -3 ' S F P -9 5 ' -AAC GCA TTT GGG GCA T T C -3 ' CHNSIG-F 5 ' -C C T ATG TGC AGC TGG GGT TGC TTT AGC--3 ' A. 1.2 P C R P rim e rs P C R -P U l 5 1-GTA GAG GAT CCG GAA TTC TCA-■3 ' P C R -P L 1 5 ' -GAT ATT CCA TAC TTG GTG CGG TCG GAG C -3 ' P C R 2-P U 1 5 ' -CGC ACC AAG TAT GGA ATA TCG TGT TGA T C T -3 1 P C R 2 -P L 1 5 ' -GAT TGA CCG TAA TCT CCC GTC TCA TCA TCA-3' p V S -p U 5 ' -GGC TGA TAC AGA TTA AAT CAG AAC--3 ' p V S -p L 2 5 1-GTC GGG CCG GCG ACT GCA GCA CCTGAT A G C -3 ' p V S -S E -U 5 ' -ATG AGA GAA GAT ATC CAG CCT GAT ACA G A T -3 ' p V S -S E -L 5 ’ -CGG AGC GGC CAC TGC GGC CCC TGA TAG C G C -3 ' P C R 4-pU 5 ' -ATG GAA TAT CGT GTT GAT CTC- 3 1 P C R 5-pU 5 ' -GGG GCC GCA GTG GCC ATG GAA TAT CGT GTT GAT PCRBGAL-U 5 ' -GGG GCC GCA GTG GCC ATG AGC GAA AAA TAC ATC PCRBGAL-L 51-CGA AAT ACG GGC AGA CAT G G -3 '

3 Not all primers in the chitobiase gene region perfectly match the template since they were based on primary gel data. Some of the primers listed are commercial products.

127 A.2 SIGNAL SEQUENCES USED IN THIS STUDY A.2.1 Wild Type Signal Sequence4

ATG ATT CGA TTT AAC CTA TGT GCA GCT GGG GTT GCT TTA GCG CTA TCA GGT M I RFNLCAAGVALALSG

GCT GCA GTC GCA GCT CCG ACC GCA CCA AGT ... A A V A A P T A P S ... A.2.2 Signal Sequence Used in Secretion Vectors

pVerSec:

ATG ATT CGA TTT AAC CTA TGT GCA GCT GGG GTT GCT TTA GCG CTA TCA GGT MIRFNLCAAGVALALSG

GCT GCA GTC GCc A A V A

pVerSec-SE5:

ATG ATT CGA TTT AAC CTA TGT GCA GCT GGG GTT GCT TTA GCG CTA TCA GGq MIRFNLCAAGVALALSG

GCc GCA GTa GCc A A V A

4 Underlined is the N-terminal sequence of the mature chitinase (123). The last Ala not underlined donates the carboxy group of the signal peptide.

5 Underlined is the minimal add-on to the upper primer of structural genes to be amplified. Lowercase bases in A.22 and A.23 represent silent mutations. 129

A.3 SECRETION VECTOR SEQUENCES pVerSec (2975 bp):

GGCCACGATG CGTCCGGCGT AGAGGATCCG GAATTCTCAT GTTTGACAGC TTATCATCGA 60 CTGCACGGTG CACCAATGCT TCTGGCGTCA GGCAGCCATC GGAAGCTGTG GATTGGCTGT 12 0 GCAGGTCGTA AATCACTGCA TAATTCGTGT CGCTCAAGGC GCACTCCCGT TCTGGATAAT 18 0 GTTTTTTGCG CCGACATATA AACGGTTCTG GCAAATATTC TGAAATGAGC TGTTGACAAT 240 TAATCATCCG GCTCGTATAA TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA 3 00 CAGACCATCG ATAAGCTTGA TGAGAGAAGT TATGATTCGA TTTAACCTAT GTGCAGCTGG 3 60 GGTTGCTTTA GCGCTATCAG GTGCTGCAGT CGCCGGCCCG ACGGCTGATA CAGATTAAAT 42 0 CAGAACGCAG AAGCGGTCTG ATAAAACAGA ATTTGCCTGG CGGCAGTAGC GCGGTGGTCC 480 CACCTGACCC CATGCCGAAC TCAGAAGTGA AACGCCGTAG CGCCGATGGT AGTGTGGGGT 540 CTCCCCATGC GAGAGTAGGG AACTGCCAGG CATCAAATAA AACGAAAGGC TCAGTCGAAA 6 00 GACTGGGCCT TTCGTTTTAT CTGTTGTTTG TCGGTGAACG CTCTCCTGAG TAGGACAAAT 660 CCGCCGGGAG CGGATTTGAA CGTTGCGAAG CAACGGCCCG GAGGGTGGCG GGCAGGACGC 72 0 CCGCCATAAA CTGCCAGGCA TCAAATTAAG CAGAAGGCCA TCCTGACGGA TGGCCTTTTT 780 GCGTTTCTAC AAACTCTTTT GTTTATTTTT CTAAATACAT TCAAATATGT ATCCGCTCAT 840 GAGACAATAA CCCTGATAAA TGCTTCAATA ATATTGAAAA AGGAAGAGTA TGAGTATTCA 900 ACATTTCCGT GTCGCCCTTA TTCCCTTTTT TGCGGCATTT TGCCTTCCTG TTTTTGCTCA 9 60 CCCAGAAACG CTGGTGAAAG TAAAAGATGC TGAAGATCAG TTGGGTGCAC GAGTGGGTTA 102 0 CATCGAACTG GATCTCAACA GCGGTAAGAT CCTTGAGAGT TTTCGCCCCG AAGAACGTTT 108 0 TCCAATGATG AGCACTTTTA AAGTTCTGCT ATGTGGCGCG GTATTATCCC GTGTTGACGC 114 0 CGGGCAAGAG CAACTCGGTC GCCGCATACA CTATTCTCAG AATGACTTGG TTGAGTACTC 1 2 0 0 ACCAGTCACA GAAAAGCATC TTACGGATGG CATGACAGTA AGAGAATTAT GCAGTGCTGC 12 60 CATAACCATG AGTGATAACA CTGCGGCCAA CTTACTTCTG ACAACGATCG GAGGACCGAA 132 0 GGAGCTAACC GCTTTTTTGC ACAACATGGG GGATCATGTA ACTCGCCTTG ATCGTTGGGA 13 8 0 ACCGGAGCTG AATGAAGCCA TACCAAACGA CGAGCGTGAC ACCACGATGC TGTAGCAATG 1440 GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA 1500 TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 1 5 6 0 GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT 162 0 GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT 1680 CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG 1740 CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT 18 00 TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 18 6 0 TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT 192 0 TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA 19 80 GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 2 04 0 AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC 210 0 AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 2160 GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT ACCGGATAAG 222 0 GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC 22 8 0 TACACCGAAC TGAGATACCT ACAGCGTGAG CATTGAGAAA GCGCCACGCT TCCCGAAGGG 2340 AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG 2 4 0 0 CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 2460 GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA CGCCAGCAAC 2 5 2 0 GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG CTCACATGTT CTTTCCTGCG 2580 TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG AGTGAGCTGA TACCGCTCGC 2 64 0 CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG AAGCGGAAGA GCGCCTGATG 2 7 0 0 CGGTATTTTC TCCTTACGCA TCTGTGCGGT ATTTCACACC GCATATGGTG CACTCTCAGT 27 60 ACAATCTGCT CTGATGCCGC ATAGTTAAGC CAGTATACAC TCCGCTATCG CTACGTGACT 2 82 0 GGGTCATGGC TGCGCCCCGA CACCCGCCAA CACCCGCTGA CGCGCCCTGA CGGGCTTGTC 28 80 TGCTCCCGGC ATCCGCTTAC AGACAAGCTG TGACCGTCTC CGGGAGCTGC ATGTGTCAGA 2 940 GGTTTTCACC GTCATCACCG AAACGCGCGA GGCAG 3 0 0 0 130

pVerSec-SE (2990 bp):

GGCCACGATG CGTCCGGCGT AGAGGATCCG GAATTCTCAT GTTTGACAGC TTATCATCGA 60 CTGCACGGTG CACCAATGCT TCTGGCGTCA GGCAGCCATC GGAAGCTGTG GATTGGCTGT 12 0 GCAGGTCGTA AATCACTGCA TAATTCGTGT CGCTCAAGGC GCACTCCCGT TCTGGATAAT 18 0 GTTTTTTGCG CCGACATATA AACGGTTCTG GCAAATATTC TGAAATGAGC TGTTGACAAT 240 TAATCATCCG GCTCGTATAA TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA 3 00 CAGACCATCG ATAAGCTTGA TGAGAGAAGT TATGATTCGA TTTAACCTAT GTGCAGCTGG 3 60 GGTTGCTTTA GCGCTATCAG GgGCcGCAGT gGCcGCTCCG ATGAGAGAAG ATaTcCAGCC 4 2 0 TGATACAGAT TAAATCAGAA CGCAGAAGCG GTCTGATAAA ACAGAATTTG CCTGGCGGCA 4 80 GTAGCGCGGT GGTCCCACCT GACCCCATGC CGAACTCAGA AGTGAAACGC CGTAGCGCCG 540 ATGGTAGTGT GGGGTCTCCC CATGCGAGAG TAGGGAACTG CCAGGCATCA AATAAAACGA 60 0 AAGGCTCAGT CGAAAGACTG GGCCTTTCGT TTTATCTGTT GTTTGTCGGT GAACGCTCTC 66 0 CTGAGTAGGA CAAATCCGCC GGGAGCGGAT TTGAACGTTG CGAAGCAACG GCCCGGAGGG 72 0 TGGCGGGCAG GACGCCCGCC ATAAACTGCC AGGCATCAAA TTAAGCAGAA GGCCATCCTG 780 ACGGATGGCC TTTTTGCGTT TCTACAAACT CTTTTGTTTA TTTTTCTAAA TACATTCAAA 84 0 TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT GAAAAAGGAA 9 00 GAGTATGAGT ATTCAACATT TCCGTGTCGC CCTTATTCCC TTTTTTGCGG CATTTTGCCT 96 0 TCCTGTTTTT GCTCACCCAG AAACGCTGGT GAAAGTAAAA GATGCTGAAG ATCAGTTGGG 102 0 TGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT AAGATCCTTG AGAGTTTTCG 10 8 0 CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT CTGCTATGTG GCGCGGTATT 1140 ATCCCGTGTT GACGCCGGGC AAGAGCAACT CGGTCGCCGC ATACACTATT CTCAGAATGA 12 00 CTTGGTTGAG TACTCACCAG TCACAGAAAA GCATCTTACG GATGGCATGA CAGTAAGAGA 12 60 ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG GCCAACTTAC TTCTGACAAC 13 2 0 GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC ATGGGGGATC ATGTAACTCG 13 80 CCTTGATCGT TGGGAACCGG AGCTGAATGA AGCCATACCA AACGACGAGC GTGACACCAC 1 4 4 0 GATGCTGTAG CAATGGCAAC AACGTTGCGC AAACTATTAA CTGGCGAACT ACTTACTCTA 1500 GCTTCCCGGC AACAATTAAT AGACTGGATG GAGGCGGATA AAGTTGCAGG ACCACTTCTG 15 6 0 CGCTCGGCCC TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG TGAGCGTGGG 162 0 TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT CGTAGTTATC 16 8 0 TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC TGAGATAGGT 1 7 4 0 GCCTCACTGA TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT ACTTTAGATT 18 00 GATTTAAAAC TTCATTTTTA ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC 186 0 ATGACCAAAA TCCCTTAACG TGAGTTTTCG TTCCACTGAG CGTCAGACCC CGTAGAAAAG 192 0 ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT CTGCGCGTAA TCTGCTGCTT GCAAACAAAA 19 80 AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG AGCTACCAAC TCTTTTTCCG 2 040 AAGGTAACTG GCTTCAGCAG AGCGCAGATA CCAAATACTG TCCTTCTAGT GTAGCCGTAG 2100 TTAGGCCACC ACTTCAAGAA CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG 2160 TTACCAGTGG CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA 222 0 TAGTTACCGG ATAAGGCGCA GCGGTCGGGC TGAACGGGGG GTTCGTGCAC ACAGCCCAGC 22 8 0 TTGGAGCGAA CGACCTACAC CGAACTGAGA TACCTACAGC GTGAGCATTG AGAAAGCGCC 2340 ACGCTTCCCG AAGGGAGAAA GGCGGACAGG TATCCGGTAA GCGGCAGGGT CGGAACAGGA 2 4 0 0 GAGCGCACGA GGGAGCTTCC AGGGGGAAAC GCCTGGTATC TTTATAGTCC TGTCGGGTTT 24 60 CGCCACCTCT GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG GAGCCTATGG 2 5 2 0 AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT TTTGCTGGCC TTTTGCTCAC 2 58 0 ATGTTCTTTC CTGCGTTATC CCCTGATTCT GTGGATAACC GTATTACCGC CTTTGAGTGA 2640 GCTGATACCG CTCGCCGCAG CCGAACGACC GAGCGCAGCG AGTCAGTGAG CGAGGAAGCG 27 00 GAAGAGCGCC TGATGCGGTA TTTTCTCCTT ACGCATCTGT GCGGTATTTC ACACCGCATA 2 7 6 0 TGGTGCACTC TCAGTACAAT CTGCTCTGAT GCCGCATAGT TAAGCCAGTA TACACTCCGC 2 82 0 TATCGCTACG TGACTGGGTC ATGGCTGCGC CCCGACACCC GCCAACACCC GCTGACGCGC 2 880 CCTGACGGGC TTGTCTGCTC CCGGCATCCG CTTACAGACA AGCTGTGACC GTCTCCGGGA 2940 GCTGCATGTG TCAGAGGTTT TCACCGTCAT CACCGAAACG CGCGAGGCAG 3000 131

A.4 GENOTYPES OF SOME HOST STRAINS6

JM100 Series

JM 101: F' traD36 laclq A(lacZ)Ml5 proA+B+ /supE thi A(lac-proAB) (271). Supports the growth of vectors carrying amber muta­ tions (152).

JM 103: F' traD36 lacP A(lacZ)M15 proA+B+/endAl supE sbcBC thi- 1 rpsL (Strr) A(lac-pro) (PI) (rk+ mk+ mp l+) (88, 271).

JM 105: F' traD36 laclqA(lacZ)M15 proA+B+ /thi rpsL (Strr) endA sbcB15 sbcC? hsdR4 (rf mk+) Ailac-proAB) (271). Supports the growth of vectors carrying amber mutations (152) and will modify but not restrict transfected DNA (271).

JM 107: F ' traD36 laclqA(lacZ)M15 proA+B+/ e 14'(McrA~) Ailac- proAB) thi gyrA96 (N aP) endAl rpsR17 (r^ m^+) relAl suE44 (271). Supports the growth of vectors carrying amber muta­ tions (152) and will modify but not restrict transfected DNA (271).

JM109: JM107 recAl (271).

Other Strains

D H 5a: supE44 A(lacZYA-argF)U169 (§80 dlacA(lacZ)M15) hsdR17 recAl gyrA96 thi-1 relAli203).

DH5aF': F1/ endAl hsdR17 (rk' mk+) supE44 thi-1 recAl gyrA (NaP) relAl A(lacZYA-argF)U169 deoR (§80 dlacA(lacZ)M15) (264).

KS1000: F'lacIqlacI+pro+ tara A(lac-pro) nalA argl(am) rif thi-1 A(tsp)::Kanr eda-51::TnlO (tetr). E. coli tsp mutant. The tsp gene (tail specific protease) encodes a periplasmic protease that may degrade secreted or cytoplasmically overexpressed proteins after cell lysis (212).

6 Listed are known, but not all, genotypes. Since these strains are the results of extensive mutagenesis, phenotypes can not be predicted based on these genotypes. Additional notes on some of these strains can be found in several laboratory manuals (8, 189). 132

A.5 STRUCTURES OF RELEVENT COMPOUNDS A.5.1 Monosaccharides7

Glucose Galactose

n h 2 Mannose Glucosamine

A^-Acetylglucosamine

7 All sugars, monomeric or as components of other compounds, are beta anomers of D-sugars unless otherwise indicated. A.5.2 Disaccharides

O Lactose

1,6-Allolactose

Maltose

Cellobiose

o o Chitobiose H—N—C—CH3 H -N -C —CH3 oII 3 o II A.5.3 Polysaccharides

O O

m

Cellulose

NHNH NH NH c=o C = 0 c=o c=o CH CH

Chitin (n=4500-8000, Mr*1.0-1.8xl06 Da) 135

A.5.4 Sugar Analogs

X-Gal (5-bromo-4-chloro-3-indolyl-beta-Z)-galactopyranoside)

Cl Br

H

IPTG (Isopropyl-beta-ZMhiogalactopyranoside)

CH S—C -H

p-NP-GlcNAc (p-Nitrophenyl-beta-Zl-iV-acetylglucosamine)

NO NH c=o I c h 3 136

cx-35S-dATP (35S-Deoxyadenosine 5'-(alpha-thio)triphosphate); FW: 507.2; Half-life: 87.4 days; Catalog #: NEG-034H, from DuPont-NEN, Sequencing grade for use with Sequenase 2.0.

NH

O P O P O. CH OH OH OH

OH H 137

A.6 LIST OF ABBREVIATIONS8 ATCC American type culture collection cAMP 3',5'-Cyclic adenosine monophosphate bp base-pair (of DNA) CAP Catabolite activator protein CHO Cell line derived from Chinese hamster ovary C-terminal Carboxy terminal ds-DNA Double stranded DNA ER Endoplasmic reticulum EDTA Ethylene diamine tetraacetic acid, disodium salt Fru-6-P Fructose-6-phosphate Gal Galactose GalNAc N-acetylgalactosamine GCG Genetics Computing Group, GCG software vender Glc Glucosyl GlcN Glucosamine GlcNAc iV-acetylglucosamine GlcNAc-6-P Glucose-6-phosphate Glu Glucose IPTG Isopropyl-beta-D-thiogalactopyranoside kDa Kilodalton LB Luria-Bertani medium Man Mannose MPB Maltose binding protein N-terminal Amino terminal nt (Deoxy)ribonucleotide (as a unit of RNA or DNA) ORF Open reading frame p-NP para-nitrophenyl p-NP-GlcNAc jPara-nitrophenyl-iV-acetylglucosamine pBS II pBluescript II PCR Polymerase chain reaction PTS Bacterial phosphoenolpyruvate:glycose phosphotrans­ ferase system recPCR Recombinant polymerase chain reaction rrreBTlT2 rRNA operon B transcriptional tandem terminator S.D. Shine-Dalgarno sequence (= RBS) SDS-PAGE Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis ss-DNA single stranded DNA Tris Tris(hydroxymethyl)aminomethane X-gal 5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside

8 Not included are those fully defined whenever used. Names of genes (products), operons, mutants, amino acids and nucleotides follow standard notations. VITA

Michael Hongli Wu was born on April 27, 1956 in Hebei province, People’s Republic of China. He went to Beijing Forestry University in 1978, where he received his Bachelor of Science in Agriculture in 1982. After working for four years in the Forestry Institute of Shaanxi province in Northwest China, he came to the University of California at Riverside in 1986, where he worked on baculovirus genetics under professor Brian A. Federici and graduated with a Master of Science in Entomology in 1989. He started his doctoral program in the Department of Biochemistry, Louisiana State University, in the summer of 1991, under the direction of professor Roger A. Laine, and will receive the doctor of philosophy degree in the spring of 1996. His project is on DNA sequencing and protein secretion of the cytoplasmic chitobiase gene from the marine bacterium, Vibrio parahaemolyticus. In addition to his proposed project, he is also interested in the development of cloning vectors for the production of extracellular proteins cloned from other sources. The major techniques used during his study in professor Laine's lab include gene cloning, DNA sequencing, and recombinant polymerase chain reaction (recPCR).

138 DOCTORAL EXAMINATION AND DISSERTATION REPORT

Candidate: MICHAEL HONGLI WU

Major Field: BIOCHEMISTRY

Title of Dissertation: THE CYTOPLASMIC N , N ' -DIACETYLCHITOBIASE GENE FROM VIBRIO PARAHAEMOLYTICUS: SEQUENCE ANALYSIS, PROTEIN SECRETION, AND SECRETION SYSTEM DEVELOPMENT

Approved: y ■ * c ■c— -O Major Pr^J&ssor and Chairman"

Dean of the Graduate School

EXAMINING-COMMITTEE

Date of Examination:

OCT. 31, 1995