Western Michigan University ScholarWorks at WMU

Dissertations Graduate College

6-2007

Structure-Function Studies of Self-Assembling Flagellin Proteins from Salmonella typhimurium and Aquifex pyrophilus

Venkata Raghu Ram Malapaka Western Michigan University

Follow this and additional works at: https://scholarworks.wmich.edu/dissertations

Part of the Biochemistry, Biophysics, and Structural Biology Commons, and the Biology Commons

Recommended Citation: Malapaka, Venkata Raghu Ram, "Structure-Function Studies of Self-Assembling Flagellin Proteins from Salmonella typhimurium and Aquifex pyrophilus" (2007). Dissertations. 893. https://scholarworks.wmich.edu/dissertations/893

This Dissertation-Open Access is brought to you for free and open access by the Graduate College at ScholarWorks at WMU. It has been accepted for inclusion in Dissertations by an authorized administrator of ScholarWorks at WMU. For more information, please contact [email protected].

STRUCTURE-FUNCTION STUDIES OF SELF-ASSEMBLING FLAGELLIN PROTEINS FROM SALMONELLA TYPHIMURIUM AND AQUIFEX PYROPHILUS

by

Venkata Raghu Ram Malapaka

A Dissertation Submitted to the Faculty of The Graduate College in partial fulfillment of the requirements for the Degree of Doctor of Philosophy, Department of Biological Sciences. Dr. Brian C. Tripp, Advisor

Western Michigan University Kalamazoo, Michigan June 2007

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. UMI Number: 3265919

INFORMATION TO USERS

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleed-through, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

® UMI

UMI Microform 3265919 Copyright 2007 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company 300 North Zeeb Road P.O. Box 1346 Ann Arbor, MI 48106-1346

Copyright by Venkata Raghu Ram Malapaka 2007

ACKNOWLEDGEMENTS

I would like to acknowledge Dr. Brian C. Tripp, a wonderful mentor I had

during my tenure at Western Michigan University. Dr. Tripp always valued my

creative spirit and had a sense of faith in me. His guidance in terms of designing a

research project and writing manuscripts has been invaluable. He is always willing

to try novel ideas and never afraid of going in a new direction. His critical analysis of

my thoughts gave me tremendous confidence in pursuing various ideas during my

graduate study. I sincerely believe the various skills that I have learned under him

will help me launch my scientific career in a successful way.

I would also like to thank my committee: Dr. Bruce Bejcek, Dr. John Geiser,

and Dr. Yirong Mo. Their commitment towards their students and colleagues is

exemplary. Their suggestions and thoughtful insights have given me an opportunity to

critically think about my work and approach it in a rational way. I would like to

especially thank Dr. Subra Muralidharan for his financial and intellectual support

during my study at WMU. I would like to thank all my lab members for maintaining a

healthy work environment in the lab.

I would like to thank my father, Mr. Subrahmanyam, for providing me with

the best possible means to fulfill my dreams. He has successfully played the role of

both the parents after my mother’s demise and made numerous sacrifices for us. I

would like to thank my wife Sandhya, for being a wonderful and patient companion.

She has been a pillar of strength for me during my study. I would like to acknowledge

my beloved sisters Vandana and Shilpa, without whom my life wouldn’t be

complete.

Finally, I would like to dedicate this work to my mother Suseela and my

grandmother Mahalakshmi, who are not here today to share my happiness, but will

always be there in our thoughts.

Venkata Raghu Ram Malapaka

TABLE OF CONTENTS

ACKNOWLEDGEMENTS...... ii

LIST OF TABLES...... x

LIST OF FIGURES ...... xi

CHAPTER

I. INTRODUCTION TO THE DISSERTATION...... 1

References...... 13

II. INVESTIGATION OF THE HYPERVARIABLE DOMAIN REGION OF THE SALMONELLA TYPHIMURIUM FLAGELLIN PROTEIN IN MOTILITY ...... 20

Abstract...... 20

Introduction ...... 21

Materials and Methods ...... 25

Bacterial strains ...... 25

Plasmids and construction of flagellin internal deletion variant library ...... 26

Motility assays ...... 31

Colony PCR ...... 32

DNA sequence determination and analysis...... 33

Flagellin protein expression, flagella fiber purification and SDS-PAGE ...... 33

Electron microscopy ...... 34

Light microscopy and swimming velocity analysis ...... 35

Measurement of power required to stall cells using optical tweezer ...... 35

Structure analysis and homology modeling ...... 38

Results ...... 38

Description of flagellin hypervariable domain structural features...... 38

Identification of functional fliC internal deletion variants ...... 39

Analysis of functional deletion variant polypeptide sequences ...... 42

Purification and transmission electron microscopy analysis of flagella ...... 70

Swarming agar motility assay ...... 74

Video analysis of swimming motility in dilute aqueous media ...... 77

Measurement of stall force in viscous aqueous media by optical trapping ...... 78

Comparison of motility results determined by different methods ...... 80

Discussion ...... 81

Effects of internal deletions on cellular motility and stability of flagella ...... 81

Implications for folding, export and assembly ...... 83

Implications for design of flagellin fusion proteins ...... 85

GenBank accession numbers ...... 85

References...... 88

Appendices ...... 94

A. DNA sequences of the fliC gene deletion variants ...... 94

B. A multiple sequence alignment of all the 39 unique internal deletion variants of FliC ...... 132

III. A THEORETICAL MODEL OF AQUIFEX PYROPHILUS FLAGELLIN: IMPLICATIONS FOR ITS THERMOSTABILITY...... 133

Abstract...... 133

Introduction ...... 134

Computational Methods ...... 138

Results and Discussion ...... 141

Sequence comparison ...... 142

Structure comparison ...... 154

Conclusions ...... 169

References...... 170

IV. INVESTIGATION OF INSERTION OF GREEN FLUORESCENT PROTEIN IN SALMONELLA TYPHIMURIUM FLAGELLIN PROTEIN TO CREATE A FUNCTIONAL BIONANOTUBE...... 177

Introduction ...... 177

Materials and Methods ...... 183

Bacterial strains...... 183

fliC and GFP transposon plasmids ...... 184

Transposase reaction ...... 185

Removal of XmaI restriction site from pTH890 plasmid ...... 187

Screening of the GFP inserts ...... 189

Removal of KanR gene ...... 190

Screening for functional insertions using bacterial motility assay ...... 190

Results...... 191

Transposase reaction ...... 191

Screening for the fluorescence ...... 192

Removal of KanR gene ...... 193

Motility screening ...... 195

Discussion ...... 195

References...... 198

V. EXPERIMENTAL INVESTIGATION OF THE HYPOTHETICAL S. TYPHIMURIUM FLAGELLIN MECHANICAL SWITCH BY INSERTION OF A DISULFIDE BRIDGE...... 205

Introduction ...... 205

Materials and Methods ...... 207

Strains and plasmid ...... 207

Construction of cysteine disulfide bond ...... 208

Partial purification of fibers and SDS-PAGE ...... 208

Ellman’s assay ...... 209

Motility assay ...... 210

Video microscopy and swimming analysis ...... 210

Results and Discussion ...... 212

References...... 213

VI. HIGH-THROUGHPUT SCREENING FOR ANTIMICROBIAL COMPOUNDS USING A 96-WELL FORMAT BACTERIAL MOTILITY ABSORBANCE ASSAY ...... 215

Abstract...... 215

Introduction ...... 216

Materials and Methods ...... 217

Results and Discussion ...... 219

References...... 232

VII. RATIONAL DESIGN OF METAL BINDING SITES AND TESTING THE FEASIBILITY OF CATALYTIC SITES IN FLAGELLIN ...... 233

Introduction ...... 233

Materials and Methods ...... 237

Bacterial strains...... 237

Design of the metal binding sites ...... 238

Site-directed mutagenesis ...... 238

Partial purification of fibers and SDS-PAGE ...... 241

Dialysis of mutants ...... 241

Enzyme assay ...... 242

Results and Discussion ...... 242

References...... 243

VIII. FUTURE STUDIES...... 246

Study of the conformational state of Salmonella typhimurium flagellin protein during type III export and assembly into flagella... 246

Self-assembly of GFP-flagellin fusion proteins in vitro ...... 248

Conformational study of various flagellin deletion variants using circular dichroism and fluorescence spectroscopy ...... 249

Engineering of catalytic sites in flagellin protein ...... 249

Insertion of single domain antibodies in flagellin hypervariable region ...... 250

APPENDICES...... 251

A. Recombinant DNA Biosafety Approval ...... 251

B. Letters and Permission ...... 258

LIST OF TABLES

2.1 Strains and plasmids used in this study ...... 48

2.2 Sequence parameters of functional internal deletion variants ...... 51

2.3 Summary of deleted secondary structures internal deletion variants 59

2.4 Motility of S. typhimurium SJW134 cells expressing FliC and deletion variants as determined by different methods ...... 75

2.5 The deletion variant plasmids and their GenBank accession numbers ...... 86

3.1 Amino acid composition of the full length FliC and FlaA ...... 145

3.2 Computed parameters of FlaA and FliC flagellin proteins and their N-terminal, C-terminal and middle hypervariable domain regions ...... 150

3.3 Amino acid composition (percentages) of N-terminus, C-terminus and middle domains of FliC and FlaA ...... 151

3.4 Solvent accessible surface area (SASA) calculations for conserved D0 and D1 domains of FlaA and FliC ...... 163

5.1 Ellman’s assay of the wild-type and cysteine mutants ...... 212

6.1 Complete list of antibacterial compounds detected in screen of Genesis Plus library with offset motility detection method ...... 228

7.1 Parameters used for the construction of metal binding sites in FliC protein ...... 236

7.2 Mutagenesis primers ...... 240

LIST OF FIGURES

1.1 General structure of eubacterial flagella ...... 2

1.2 A cartoon representation of S. typhimurium FliC ...... 9

2.1 An Electron Micrograph of Salmonella typhimurium ...... 22

2.2 A pictorial representation of pTH890 plasmid ...... 27

2.3 Construction of fliC gene deletion library ...... 30

2.4 A schematic representation of a laser tweezer ...... 37

2.5 DNA agarose gel of the pTH890 deletion library ...... 41

2.6 Motility assay ...... 42

2.7 Colony PCR of the deletion variants ...... 43

2.8 Multiple alignment of FliC and internal deletion variants ...... 50

2.9 Homology models of flagellin variants ...... 56

2.10 Plot of normalized relative cell motility ...... 67

2.11 TEM images of flagella fibers ...... 72

2.12 SDS-PAGE of the FliC and deletion variant proteins ...... 73

2.13 Motility assay of internal deletion library ...... 75

3.1 A pairwise sequence alignment of FliC and FlaA ...... 143

3.2 A comparison of 3-D structures of FliC and FlaA ...... 153

3.3 Ramachandran plot of the FlaA model ...... 155

3.4 A 3-D structural alignment of FliC with the model of FlaA ...... 156

3.5 Alignment of the secondary structures of FliC and FlaA ...... 161

3.6 Structure of a flagellin pentamer ...... 167

4.1 Crystal structure of a dimeric GFP (PDB: 1GFL) ...... 181

4.2 A map of the pTH890 plasmid ...... 186

4.3 A map of the pBNJ 11.7 plasmid ...... 186

4.4 Insertion of GFP using transposase enzyme ...... 188

4.5 Colony PCR of the fluorescent inserts ...... 192

4.6 Gel picture of the removal of Kanamycin gene ...... 194

4.7 Ligation of the fliC-GFP inserts ...... 194

5.1 A pictorial depiction of the mutated Cys residues ...... 206

5.2 SDS-PAGE and motility analysis of the wild-type flagellin and two cysteine mutation variants ...... 211

6.1 Comparison of motile and non-motile bacterial growth patterns in 96-well plate-formatted motility agar ...... 222

6.2 Comparison of average absorbance readings for motile and non-motile S. typhimurium bacterial cells...... 223

6.3 Growth curves for motile and non-motile S. typhimurium bacterial cells inoculated in 96-well plate-formatted motility agar ...... 226

7.1 Location of the three rationally designed metal binding sites in the flagellin (FliC) structure (PDB 1UCU) ...... 235

7.2 Pictorial depiction of three rationally designed metal binding sites in the FliC protein ...... 239

CHAPTER I

INTRODUCTION TO THE DISSERTATION

The ability to provide movement in living organisms plays an important role

in terms of fulfilling their need for nutrition, evading harmful conditions and

colonization of new environments. Specialized membrane-bound appendages have

evolved in prokaryotes and eukaryotes in the form of flagella and cilia. These motility

structures must have evolved in conjunction with other sensor components involved

in sensing the environment, such as chemotaxis, thermotaxis, magnetotaxis, geotaxis,

etc.

The flagellum is the primary organelle responsible for motility in eubacteria 1,2,3,4,5. It is a complex structure composed of a basal body, a trans-membrane rotary motor 3,6,7,8, a universal joint-like hook structure and a hollow helical flagellar

filament (Figure 1.1). The flagella of E. coli and S. typhimurium have been

extensively studied and serve as models for the study of eubacterial flagella. The

flagella filament has an outer diameter of 12-25 nm, a 2-3 nm diameter inner channel

9,10 and can be 1-15 micrometers or more in length 8,11. Flagella are rotated by the

membrane rotary motor to enable movement of the bacterium, in a process that is

often regulated by chemotaxis 1. Peritrichously arranged flagella either associate into

a helical bundle of multiple flagella 12 that generates a net thrust along its axis or they

separate into individual fibers, depending on the direction of rotation and resulting

supercoiled state of the fiber. Figure 1.1 shows the basic structure of typical

eubacterial flagella.

[Figure 1.1 diagram: General structure of flagella. Labels, from tip to base: filament cap (FliC monomer folds with the help of the filament cap); channel; filament (transport of FliC through the central channel); hook-filament junction; hook; exterior; L ring; P ring; outer membrane; cell wall; MotA/B; periplasm; cell membrane; rod; MS ring; C rod; cytoplasm; C ring; FliS attached to the C-terminus of FliC; FliS is released before the export.]

Figure 1.1 General structure of eubacterial flagella. The MS (membrane/supramembrane), L (lipopolysaccharide) and P (peptidoglycan) rings anchor the structure into the cell wall. The MotA-MotB studs represent the site of the motor complexes which drive flagellar rotation. The C (cytoplasmic) ring complex is involved in controlling the direction of flagellar rotation. The C (cytoplasmic) rod may control export of flagellar proteins. The filament cap and hook-filament junction sites contain minor proteins not mentioned in the text. The FliC (flagellin) protein that forms the bulk of the filament is synthesized in the cytoplasm; the FliS protein binds to the C-terminus of the monomers to prevent degradation and premature polymerization of the protein. The FliC protein is exported through the central channel to the tip, where it folds into its proper structure with the help of the FliD/Hap2 filament cap. (The above figure has been adapted and modified from an Encyclopedia of Life Sciences article titled “Bacterial Flagella”.)

Depending on their location, the flagellar ring complexes have been divided into the MS (membrane/supramembrane), L (lipopolysaccharide) and P (peptidoglycan) rings, which anchor the structure into the cell wall 13. The motor complexes are represented by the MotA and MotB proteins, and the cytoplasmic ring complex (C) is involved in controlling export of flagellar proteins and the direction of rotation (Figure 1.1) 6,9,10,13.

The C component comprises a peripheral ring and an axial rod. The FliG

protein forms the cytoplasmic component of the switch complex that also includes

FliM and FliN proteins 14,15,16. The MS ring forms a connection between the C

component and the cell wall component and is predominantly made up of several

copies of the FliF protein. FliI is the ATPase of the export apparatus that aids in the export of other type III proteins. A recent study shows structural similarities between FliI and F1-ATPase despite the apparently different functions of these proteins 17. It is central to the structure and function of the flagella 9. The MS ring forms an axial rod that emerges through the outer wall and is connected to the external hook. Due to the presence of the outer membrane in Gram-negative bacteria, the L and P ring

structures serve as bushings while rotation occurs 18.

The rotary motor is made up of two components, a stator and a rotor 9. The MotA and MotB proteins constitute the stator and the FliG protein constitutes the rotor 19,20,21. The motor is driven by movement of either protons or sodium ions across the membrane 6; 22. The rotary motor of flagella is the most powerful of the known macromolecular machines, generating torques of ~4000 pN nm near stall and rotating at ~300 Hz 6; 22. The sodium-driven motor rotates at up to ~1700 Hz 22. For each motor revolution, ~1200 protons pass through it 22. The flagellar hook, formed by the FlgE protein, plays an important role in the function and assembly of the filament. The FlgD protein forms the hook cap structure, which forms prior to the assembly of the hook. The FliK/Hap1 protein functions as a molecular ruler that controls the length of the hook 23,24. The FlgK and FlgL/Hap3 proteins form the hook-filament junction.
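
To give a sense of scale for the motor figures quoted above, mechanical power can be estimated as torque times angular velocity, P = τ · 2πf. The short Python sketch below is illustrative only. It pairs the near-stall torque with the ~300 Hz rotation rate, which the text does not state occur at the same time, so the power value should be read as an upper-bound order of magnitude. The function names are hypothetical and the unit conversion (1 pN nm = 1e-21 J) is the only fact added beyond the values in the text.

```python
import math

PN_NM_TO_J = 1e-21  # 1 pN * 1 nm = 1e-12 N * 1e-9 m = 1e-21 J


def motor_power_watts(torque_pn_nm, freq_hz):
    """Mechanical power P = torque * angular velocity, with omega = 2*pi*f."""
    return torque_pn_nm * PN_NM_TO_J * 2.0 * math.pi * freq_hz


def proton_flux_per_second(protons_per_rev, freq_hz):
    """Protons crossing the motor per second at a given rotation rate."""
    return protons_per_rev * freq_hz


# Values quoted in the text: ~4000 pN nm, ~300 Hz, ~1200 protons per revolution.
power = motor_power_watts(4000, 300)
flux = proton_flux_per_second(1200, 300)
print(f"power ~ {power:.2e} W, proton flux ~ {flux:.1e} per second")
```

Under these assumptions the estimate comes to roughly 7.5e-15 W and about 3.6e5 protons per second per motor.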

A striking feature of the different prokaryotic and eukaryotic flagella is that

they don’t appear to have a common evolutionary origin. Eukaryotic flagella differ

remarkably from their prokaryotic counterparts. The basic unit of eukaryotic flagella

and ciliary systems, the axoneme, is predominantly composed of microtubules,

dynein proteins and up to 250 associated proteins 18,22. The basic scheme of flagella

architecture is represented as 9+2 (2 central microtubules and 9 peripheral tubes

connected by spokes) 18,22,25. The dynein proteins hydrolyze ATP molecules to drive

rotary motion of the axoneme 18.

At first inspection, eubacterial and archaeal flagella appear to be more similar

in structure and function. However, they are actually not similar in the makeup of

their genetic components and are not derived from the same type of secretion system;

bacterial flagella are based on a type III export system, similar to that used by

pathogens such as Yersinia pestis to deliver toxin proteins to their host cells. In

contrast, archaeal flagella appear to have more sequence similarity with pili, a type IV

secretion system 2; 26; 27; 28. The archaeal flagellar filament has significant structural

differences when compared to the eubacterial flagella. The archaeal flagella has been

found to be extensively glycosylated, which could be a stabilizing factor under the

extreme conditions that many archaea live in 29. Furthermore, the archaeal flagella filament is a nearly solid structure, whereas the eubacterial filament is hollow with a distinct central channel. The archaeal filament is also narrower and ranges

from 10-14 nm in diameter, compared to 18-24 nm diameter for the eubacterial

counterpart. The protein components of the flagella follow a different mode of export,

similar to that of type IV pili of eubacteria. There is also growing evidence about the

similarities between the ATPases and archaeal flagella motor proteins. The archaeal

flagellin proteins have an N-terminal signal peptide similar to that of pili that is

cleaved by a membrane-bound peptidase. Recently, pili have also been shown to be

the source of propulsion in twitching motility of bacterial cells 30,31.

This study primarily involves understanding the structural and functional

aspects of the S. typhimurium flagellin protein that forms the flagella filament. The

flagella filament in S. typhimurium is primarily composed of up to 30,000 copies of a

single globular protein called flagellin (FliC), which self-assembles via noncovalent

forces to form a helical fiber composed of 11 protofilaments 9. The primary (phase-1)

flagellin gene in S. typhimurium is designated as fliC, a secondary (phase-2) flagellin

gene is named fljB, while other bacterial flagellin genes are named flaA, hag, or flaF. Flagellin molecular masses can range from 20-77 kDa 33,34,35,36,37, corresponding to amino acid sequences ranging in size from under 300 amino acids to almost 700 amino acids (e.g., Pseudomonas putida has 688 residues), with the majority of flagellins containing 500 amino acids 13. The N- and C-termini of flagellin proteins are conserved across various species 38. These size differences are primarily

due to high genetic variation in a “hypervariable” middle region of the protein

encoding outer domains, D2 and D3 38. Flagellins have long been characterized with

respect to their serotype antigenicity in eliciting production of antibodies in mammals

and are classified as H-antigens.34,39,40,41,42 The outer domains are the primary

immunological determinant of this protein; the sequence of the middle region has

long been correlated with variations in the H antigenicity.42,43 The flagellin proteins

are so variable that the Salmonella sp. has at least 70 different serological strains 44.
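
The size range quoted earlier in this paragraph can be sanity-checked with a back-of-the-envelope conversion from sequence length to molecular mass. The sketch below assumes an average residue mass of about 110 Da, a common textbook approximation that is not taken from this dissertation; the function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: textbook average mass per amino acid residue


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from the number of residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Sequence lengths mentioned in the text: under 300, about 500, and 688 residues.
for n in (300, 500, 688):
    print(n, "residues ->", round(approx_mass_kda(n), 1), "kDa")
```

With this approximation, a 688-residue flagellin comes out near 76 kDa, in line with the upper end of the 20-77 kDa range.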

The export of flagellin monomers is facilitated by the cytosolic chaperone

FliS, which binds to FliC and inhibits its polymerization 45,46. FliS binds specifically

to the C-terminal 40 amino acid component of the disordered D0 domain in a 1:2

ratio, which prevents self-assembly and formation of flagellin fiber aggregates within

the cell 45,46,47. Without FliS binding, the C-terminus is degraded. However, it is not

clear what the conformational states of the N-terminus and middle hypervariable

regions of the FliC are prior to assembly and during export through the flagella

central pore. No study has been performed to understand the interactions of these

domains before they are exported through the flagellar central channel. A new report

published by Muskotal et al.48 suggests that the FliS protein binds to the C-terminal

region of the FliC in the ratio of 1:1. The report suggests that FliS does not act as an

unfolding factor. Instead it helps in the formation of alpha-helical secondary structure

in the region of FliC, where it binds 48.

Eubacterial flagellins are known to undergo several post-translational

modifications including glycosylation and N-methylation of lysines, yet no functional

significance has been attributed to them 49; 50; 51; 52; 53. Flagellin export in bacteria is very similar to the type III secretion system (TTSS) of several pathogenic Gram-negative bacteria, including Yersinia enterocolitica 54,55. The TTSS also employs a chaperone (e.g., SycE) that prevents the folding of the pathogenic proteins (Yops) before secretion 56. The secretion apparatus, i.e., the “injectisome”, includes ~25 proteins and forms a needle-like projection that penetrates the host cytoplasmic membrane to release the pathogenic factors 56,57,58. Several studies have been performed to determine the extent of the folding of the secreted proteins, as the needle diameter is only 2-3 nm 54,55. A recent study involving a fusion protein of YopE-DHFR suggests that the proteins can be folded before export, but they are exported with the help of an unfoldase activity 54. In flagella, the assembly of exported flagellin at the filament tip is aided by the flagellar cap protein, termed FliD or HAP2 9; 10; 59; 60; 61; 62.

[Figure 1.2 diagram labels:
D3 domain. Structure: mainly β-sheet. Outer part of dispensable, solvent accessible “hypervariable” region, extracellular display region. Function: unknown; propulsion, host immune evasion.
D2 domain. Structure: mainly β-sheet. Inner part of dispensable, solvent accessible “hypervariable” region, extracellular display region. Function: folding, linker, conformational flexibility.
D1 domain. Structure: mainly α-helical. Function: folding, self-assembly.
D0 domain. Structure: α-helical. Function: type III export signal, FliS chaperonin binding and flagellar self-assembly via intermolecular coiled-coil interactions.
C-terminus; N-terminus.]

Figure 1.2 A cartoon representation of S. typhimurium flagellin protein (PDB: 1UCU). The D0 domain is formed by the N- and C-terminal regions and is composed of α-helices. The D1 domain consists mostly of α-helices and a β-hairpin. The D2 domain consists mostly of β-strands and falls in the hypervariable region of the protein. The D3 domain forms the solvent accessible hypervariable outer region and consists of a unique β-folium structure.

The partial and complete structures of S. typhimurium FliC were determined in 2001 11 and 2003 63. The complete FliC structure (PDB 1UCU) shows four distinct globular domains in the protein, termed D0, D1, D2 (with subdomains D2a and D2b) and D3. Domain D0 forms the inner core of the filament; D1 forms the outer core and domains D2 and D3 form a knob-like projection on the filament surface. Extensive biophysical studies performed in vitro on S. typhimurium flagellin monomers, using proton nuclear magnetic resonance (NMR), Fluorescence Resonance Energy Transfer (FRET) and Circular Dichroism (CD), indicate that monomeric flagellin contains regions both of low mobility (broad resonances) and of high mobility (sharp peaks) 64, 65, 66. These high mobility regions are generally considered to be highly disordered

and can be assigned to specific residues in the N- and C-terminal regions of the

sequence. CD studies show that flagellin contains a mixture of α-helix, β-structure and random coil and appears to form two discrete folding domains as measured using microcalorimetry. Upon polymerization, the amount of α-helix increases, while the amount of random coil decreases. Sequence analysis work based on the E. coli and Salmonella species indicated that the N- and C-terminal regions of flagellins are more highly conserved than the interior sequences. Among the eubacterial flagellins, ~180 N-terminal and ~100 C-terminal residues are highly conserved and form the D0 and D1 domains 67. The region between the two conserved termini is highly variable, both in

length and sequence. Our study of internal deletions of FliC protein of S. typhimurium

indicates that the hypervariable region can be deleted to a large extent without

interfering with the export and assembly of the flagellin 68. A similar study that was

performed earlier on E. coli indicated that the minimal sequence capable of forming

filaments contained little more than these conserved terminal regions 69. The 272

residue flagellin of Bacillus is slightly smaller than this ‘minimum-length flagellin’,

and may represent the minimum residues required for filament formation and

function.

Structural features of the D2 and D3 domain regions of the S. typhimurium phase 1 flagellin PDB structure (1UCU), as defined by Samatey et al. 70, are summarized here. The central, hypervariable region of S. typhimurium flagellin that forms the D2 and D3 domains is composed of residues Lys177-Ala401. Subdomain D2a consists of the N-terminal residues Lys177-Gly189 and the C-terminal residues Ala284-Ile344. Subdomain D2b consists of residues Asn345-Ala401. Domain D3 is formed by residues Tyr190-Val283. The structure has nine helical structural motifs, with an α-helical region and short adjoining 3₁₀ helical region, collectively termed helix 6 (Leu288-Ala298), present in subdomain D2a. A second, largely helical region that is neither in 3₁₀ nor α-helical form 11,63 is present in domain D3 (residues Asn199-Gly210) and was named helix 4-5 in accordance with the PDB structure notation. The wild-type structure also has 20 defined β-strands, of which 18 are present in the outer D2 and D3 domains. The Cα backbone of the D2 and D3 domains contains three examples of a unique folding motif, termed the β-folium, comprising a series of four successive β-strands that form three β-hairpins all pointing away from one another 70. The three β-folium structural motif regions are composed of residues 220-260 in domain D3, residues 308-345 in subdomain D2a and residues 345-383 in subdomain D2b. Due to the large structure of the flagellin and its solvent-exposed, dispensable hypervariable region, flagellin serves as an excellent scaffold for protein and peptide display.
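
The domain boundaries listed above lend themselves to a simple lookup table. The following Python sketch encodes only the residue ranges stated in the text for the hypervariable region (residues 177-401 of PDB 1UCU) and checks that the four segments tile that region without gaps. The names `HYPERVARIABLE_SEGMENTS`, `domain_of` and `segment_lengths` are hypothetical helpers introduced for illustration, not part of any published tool.

```python
# Inclusive residue ranges for the hypervariable region, as stated in the text.
HYPERVARIABLE_SEGMENTS = [
    (177, 189, "D2a"),  # N-terminal part of subdomain D2a
    (190, 283, "D3"),
    (284, 344, "D2a"),  # C-terminal part of subdomain D2a
    (345, 401, "D2b"),
]


def domain_of(residue):
    """Return the (sub)domain containing a residue number, or None if outside 177-401."""
    for start, end, name in HYPERVARIABLE_SEGMENTS:
        if start <= residue <= end:
            return name
    return None


def segment_lengths():
    """Total number of residues assigned to each (sub)domain."""
    totals = {}
    for start, end, name in HYPERVARIABLE_SEGMENTS:
        totals[name] = totals.get(name, 0) + (end - start + 1)
    return totals


print(domain_of(200), domain_of(300), domain_of(350))
print(segment_lengths(), sum(segment_lengths().values()))
```

A table of this kind makes it straightforward to report which domain a given deletion or insertion position falls in; the segment lengths sum to 225 residues, matching the span Lys177-Ala401.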

Flagella display, similar to phage display, has evolved as an alternative method

of peptide display, in part due to the ease of purification and tolerance for insertion of

foreign peptides 71 ’72,73,74,75’76. Flagella display is based on the genetic fusion of

foreign loop peptides into the surface-exposed non-essential “hypervariable” central

region of flagellin, which is the major subunit present in thousands of copies per

flagella filament. Previous reports have demonstrated the utility of genetically

inserted fusion peptides and small proteins displayed on the immunologically

reactive, solvent accessible, middle domain of mesophilic E. coli flagellin71.

Expression of these constructs in flagellin-deficient host strains results in

hybrid flagella carrying the heterologous peptides in thousands of copies displayed in

a regular array on the flagella surface. This approach has been used successfully to

display foreign peptides and proteins on flagella. A versatile variant

of flagellar display is the FliTrx hybrid display system created by the insertion of the

entire E. coli thioredoxin gene into the central region of flagellin 75; 77. Peptide

libraries have been genetically introduced into a disulphide loop of thioredoxin,

creating a conformationally constrained library that is readily accessible on the

flagellar surface 77. This FliTrx random peptide display library system has been used

with great success for epitope-mapping purposes 75; 77. Direct flagella display has also

proven applicable in bacterial adhesion technology, as large fragments, up

to 302 amino acid residues in length, of bacterial adhesins can be functionally

expressed as fusions to flagellin 78. Several studies indicate that tolerance to internal

domain fusions may be a general property of many proteins 79,80,81,82 and represents

another means by which protein structures may evolve 83. A number of foreign

peptides have been successfully inserted into the hypervariable region of flagellin 1,14‘

21, including simultaneous insertion of 115 and 302 amino acid peptides 84. These

engineered hybrid peptide-flagellin proteins are most frequently used as

immunological reagents to generate vaccine antibodies against the inserted peptides

when injected into animals 85; 86; 87. Only two known reports describe the insertion of

full-length proteins into flagellin, the FliTrx system first described in 1995 88 and a

second 1998 publication 89. Hybrid flagella are easily purified and can readily be

analyzed for binding to various targets, such as immobilized proteins, tissue sections,

as well as cell cultures 71; 78. Flagella have also been used as a scaffold for composite

materials and for the insertion and display of other fusion peptides and proteins as a

“bionanotube”, as demonstrated in recent publications by Kumara et al. 90; 91; 92; 93.

One specific aim of the research described in the following chapters was to

understand the functional and structural role of the hypervariable domains of flagellin

and to what extent they can be removed. We have also modeled and studied the

structure of a thermostable flagellin protein from Aquifex pyrophilus, with the goal of

engineering this thermostable protein for use as a thermostable flagella nanotube in

peptide display and nanotechnology applications. Because motility is central to the

virulence of many pathogens, we have also designed and tested a high-throughput

screening assay to find bacterial motility inhibitors.

References

1. Armitage, J. P. (1992). Bacterial motility and chemotaxis. Sci Prog 76, 451-77.
2. Bardy, S. L., Ng, S. Y. & Jarrell, K. F. (2003). Prokaryotic motility structures. Microbiology 149, 295-304.
3. Berry, R. M. & Armitage, J. P. (1999). The bacterial flagella motor. Adv Microb Physiol 41, 291-337.

4. Metlina, A. L. (2004). Bacterial and archaeal flagella as prokaryotic motility organelles. Biochemistry (Mosc) 69, 1203-12.
5. Moens, S. & Vanderleyden, J. (1996). Functions of bacterial flagella. Crit Rev Microbiol 22, 67-100.
6. Berg, H. C. (2003). The rotary motor of bacterial flagella. Annu Rev Biochem 72, 19-54.
7. DeRosier, D. J. (1998). The turn of the screw: the bacterial flagellar motor. Cell 93, 17-20.
8. Jones, C. J. & Aizawa, S. (1991). The bacterial flagellum and flagellar motor: structure, assembly and function. Adv Microb Physiol 32, 109-72.
9. Macnab, R. M. (2003). How bacteria assemble flagella. Annu Rev Microbiol 57, 77-100.
10. Macnab, R. M. (2004). Type III flagellar protein export and flagellar assembly. Biochim Biophys Acta 1694, 207-17.
11. Samatey, F. A., Imada, K., Nagashima, S., Vonderviszt, F., Kumasaka, T., Yamamoto, M. & Namba, K. (2001). Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331-7.
12. Flores, H., Lobaton, E., Mendez-Diez, S., Tlupova, S. & Cortez, R. (2005). A study of bacterial flagellar bundling. Bull Math Biol 67, 137-68.
13. Morgan, D. G. & Khan, S. (2001). Bacterial Flagella. Nature Encyclopedia of Life Sciences, 1-8.
14. Park, S.-Y., Lowder, B., Bilwes, A. M., Blair, D. F. & Crane, B. R. (2006). Structure of FliM provides insight into assembly of the switch complex in the bacterial flagella motor. PNAS 103, 11886-11891.
15. Francis, N. R., Irikura, V. M., Yamaguchi, S., DeRosier, D. J. & Macnab, R. M. (1992). Localization of the Salmonella typhimurium flagellar switch protein FliG to the cytoplasmic M-ring face of the basal body. Proc Natl Acad Sci USA 89, 6304-8.
16. Francis, N. R., Sosinsky, G. E., Thomas, D. & DeRosier, D. J. (1994). Isolation, characterization and structure of bacterial flagellar motors containing the switch complex. J Mol Biol 235, 1261-70.
17. Imada, K., Minamino, T., Tahara, A. & Namba, K. (2007). Structural similarity between the flagellar type III ATPase FliI and F1-ATPase subunits. Proc Natl Acad Sci USA 104, 485-90.
18. Wemmer, K. A. & Marshall, W. F. (2004). Flagellar motility: all pull together. Curr Biol 14, R992-3.
19. Lloyd, S. A., Tang, H., Wang, X., Billings, S. & Blair, D. F. (1996). Torque generation in the flagellar motor of Escherichia coli: evidence of a direct role for FliG but not for FliM or FliN. J Bacteriol 178, 223-31.
20. Marykwas, D. L. & Berg, H. C. (1996). A mutational analysis of the interaction between FliG and FliM, two components of the flagellar motor of Escherichia coli. J Bacteriol 178, 1289-94.

21. Tang, H., Braun, T. F. & Blair, D. F. (1996). Motility protein complexes in the bacterial flagellar motor. J Mol Biol 261, 209-21.
22. Oster, G. & Wang, H. (2003). Rotary protein motors. Trends Cell Biol 13, 114-21.
23. Moriya, N., Minamino, T., Hughes, K. T., Macnab, R. M. & Namba, K. (2006). The type III flagellar export specificity switch is dependent on FliK ruler and a molecular clock. J Mol Biol 359, 466-77.
24. Minamino, T., Gonzalez-Pedrajo, B., Yamaguchi, K., Aizawa, S. I. & Macnab, R. M. (1999). FliK, the protein responsible for flagellar hook length control in Salmonella, is exported during hook assembly. Mol Microbiol 34, 295-304.
25. Mastronarde, D. N., O'Toole, E. T., McDonald, K. L., McIntosh, J. R. & Porter, M. E. (1992). Arrangement of inner dynein arms in wild-type and mutant flagella of Chlamydomonas. J Cell Biol 118, 1145-62.
26. Faguy, D. M., Jarrell, K. F., Kuzio, J. & Kalmokoff, M. L. (1994). Molecular analysis of archael flagellins: similarity to the type IV pilin-transport superfamily widespread in bacteria. Can J Microbiol 40, 67-71.
27. Thomas, N. A., Bardy, S. L. & Jarrell, K. F. (2001). The archaeal flagellum: a different kind of prokaryotic motility structure. FEMS Microbiol Rev 25, 147-74.
28. Ng, S. Y., Chaban, B. & Jarrell, K. F. (2006). Archaeal flagella, bacterial flagella and type IV pili: a comparison of genes and posttranslational modifications. J Mol Microbiol Biotechnol 11, 167-91.
29. Trachtenberg, S. & Cohen-Krausz, S. (2006). The archaeabacterial flagellar filament: a bacterial propeller with a pilus-like structure. J Mol Microbiol Biotechnol 11, 208-20.
30. Varga, J. J., Nguyen, V., O'Brien, D. K., Rodgers, K., Walker, R. A. & Melville, S. B. (2006). Type IV pili-dependent gliding motility in the Gram-positive pathogen Clostridium perfringens and other Clostridia. Mol Microbiol 62, 680-94.
31. Kaiser, D. (2000). Bacterial motility: How do pili pull? Current Biology 10, R777-R780.
32. Mattick, J. S. (2002). Type IV pili and twitching motility. Annu Rev Microbiol 56, 289-314.
33. Kondoh, H. & Hotani, H. (1974). Flagellin from Escherichia coli K-12: polymerization and molecular weight comparison with Salmonella flagellins. Biochim. Biophys. Acta 336, 117-139.
34. Joys, T. M. (1985). The covalent structure of the phase-1 flagellar filament protein of Salmonella typhimurium and its comparison with other flagellins. J Biol Chem 260, 15758-61.
35. Lagenaur, C. & Agabian, N. (1976). Physical characterization of Caulobacter crescentus flagella. J Bacteriol 128, 435-44.
36. Joys, T. M. (1988). The flagellar filament protein. Can J Microbiol 34, 452-8.

37. Andrade, A., Giron, J. A., Amhaz, J. M., Trabulsi, L. R. & Martinez, M. B. (2002). Expression and characterization of flagella in nonmotile enteroinvasive Escherichia coli isolated from diarrhea cases. Infect Immun 70, 5882-6.
38. Beatson, S. A., Minamino, T. & Pallen, M. J. (2006). Variation in bacterial flagellins: from sequence to structure. Trends Microbiol 14, 151-5.
39. Prager, R., Strutz, U., Fruth, A. & Tschape, H. (2003). Subtyping of pathogenic Escherichia coli strains using flagellar (H)-antigens: serotyping versus fliC polymorphisms. Int J Med Microbiol 292, 477-86.
40. Dauga, C., Zabrovskaia, A. & Grimont, P. A. D. (1998). Restriction Fragment Length Polymorphism Analysis of Some Flagellin Genes of Salmonella enterica. J. Clin. Microbiol. 36, 2835-2843.
41. Reid, S. D., Selander, R. K. & Whittam, T. S. (1999). Sequence Diversity of Flagellin (fliC) Alleles in Pathogenic Escherichia coli. J. Bacteriol. 181, 153-160.
42. Wang, L., Rothemund, D., Curd, H. & Reeves, P. R. (2003). Species-Wide Variation in the Escherichia coli Flagellin (H-Antigen) Gene. J. Bacteriol. 185, 2936-2943.
43. Kuwajima, G. (1988). Construction of a minimum-size functional flagellin of Escherichia coli. J Bacteriol 170, 3305-9.
44. Mortimer, C. K., Gharbia, S. E., Logan, J. M., Peters, T. M. & Arnold, C. (2006). Flagellin gene sequence evolution in Salmonella. Infect Genet Evol.
45. Ozin, A. J., Claret, L., Auvray, F. & Hughes, C. (2003). The FliS chaperone selectively binds the disordered flagellin C-terminal D0 domain central to polymerisation. FEMS Microbiol Lett 219, 219-24.
46. Auvray, F., Thomas, J., Fraser, G. M. & Hughes, C. (2001). Flagellin polymerisation control by a cytosolic export chaperone. J Mol Biol 308, 221-9.
47. Evdokimov, A. G., Phan, J., Tropea, J. E., Routzahn, K. M., Peters, H. K., Pokross, M. & Waugh, D. S. (2003). Similar modes of polypeptide recognition by export chaperones in flagellar biosynthesis and type III secretion. Nat Struct Biol 10, 789-93.
48. Muskotal, A., Kiraly, R., Sebestyen, A., Gugolya, Z., Vegh, B. M. & Vonderviszt, F. (2006). Interaction of FliS flagellar chaperone with flagellin. FEBS Lett 580, 3916-20.
49. Doig, P., Kinsella, N., Guerry, P. & Trust, T. J. (1996). Characterization of a post-translational modification of Campylobacter flagellin: identification of a sero-specific glycosyl moiety. Mol Microbiol 19, 379-87.
50. Logan, S. M., Trust, T. J. & Guerry, P. (1989). Evidence for posttranslational modification and gene duplication of Campylobacter flagellin. J Bacteriol 171, 3031-8.
51. Logan, S. M., Kelly, J. F., Thibault, P., Ewing, C. P. & Guerry, P. (2002). Structural heterogeneity of carbohydrate modifications affects serospecificity of Campylobacter flagellins. Mol Microbiol 46, 587-97.

52. Schirm, M., Kalmokoff, M., Aubry, A., Thibault, P., Sandoz, M. & Logan, S. M. (2004). Flagellin from Listeria monocytogenes is glycosylated with beta-O-linked N-acetylglucosamine. J Bacteriol 186, 6721-7.
53. Voisin, S., Houliston, R. S., Kelly, J., Brisson, J. R., Watson, D., Bardy, S. L., Jarrell, K. F. & Logan, S. M. (2005). Identification and characterization of the unique N-linked glycan common to the flagellins and S-layer glycoprotein of Methanococcus voltae. J Biol Chem 280, 16586-93.
54. Wilharm, G., Lehmann, V., Neumayer, W., Trcek, J. & Heesemann, J. (2004). Yersinia enterocolitica type III secretion: evidence for the ability to transport proteins that are folded prior to secretion. BMC Microbiol 4, 27.
55. Wilharm, G., Lehmann, V., Krauss, K., Lehnert, B., Richter, S., Ruckdeschel, K., Heesemann, J. & Trulzsch, K. (2004). Yersinia enterocolitica type III secretion depends on the proton motive force but not on the flagellar motor components MotA and MotB. Infect Immun 72, 4004-9.
56. Yip, C. K., Kimbrough, T. G., Felise, H. B., Vuckovic, M., Thomas, N. A., Pfuetzner, R. A., Frey, E. A., Finlay, B. B., Miller, S. I. & Strynadka, N. C. (2005). Structural characterization of the molecular platform for type III secretion system assembly. Nature 435, 702-7.
57. Cornelis, G. R. (2006). The type III secretion injectisome. Nat Rev Microbiol 4, 811-25.
58. Tampakaki, A. P., Fadouloglou, V. E., Gazi, A. D., Panopoulos, N. J. & Kokkinidis, M. (2004). Conserved features of type III secretion. Cell Microbiol 6, 805-16.
59. Ikeda, T., Oosawa, K. & Hotani, H. (1996). Self-assembly of the filament capping protein, FliD, of bacterial flagella into an annular structure. J Mol Biol 259, 679-86.
60. Maki-Yonekura, S., Yonekura, K. & Namba, K. (2003). Domain movements of HAP2 in the cap-filament complex formation and growth process of the bacterial flagellum. Proc Natl Acad Sci USA 100, 15528-33.
61. Maki, S., Vonderviszt, F., Furukawa, Y., Imada, K. & Namba, K. (1998). Plugging interactions of HAP2 pentamer into the distal end of flagellar filament revealed by electron microscopy. J Mol Biol 277, 771-7.
62. Furukawa, Y., Imada, K., Vonderviszt, F., Matsunami, H., Sano, K., Kutsukake, K. & Namba, K. (2002). Interactions between bacterial flagellar axial proteins in their monomeric state in solution. J Mol Biol 318, 889-900.
63. Yonekura, K., Maki-Yonekura, S. & Namba, K. (2003). Complete atomic model of the bacterial flagellar filament by electron cryomicroscopy. Nature 424, 643-50.
64. Ishima, R., Akasaka, K., Aizawa, S. & Vonderviszt, F. (1991). Mobility of the terminal regions of flagellin in solution. J Biol Chem 266, 23682-8.
65. Novikov, V. V., Metlina, A. L. & Poglazov, B. F. (1994). A study on the mechanism of polymerisation of Bacillus brevis flagellin. Biochem Mol Biol Int 33, 723-8.

66. Gugolya, Z., Muskotal, A., Sebestyen, A., Dioszeghy, Z. & Vonderviszt, F. (2003). Interaction of the disordered terminal regions of flagellin upon flagellar filament formation. FEBS Lett 535, 66-70.
67. Murthy, K. G., Deb, A., Goonesekera, S., Szabo, C. & Salzman, A. L. (2004). Identification of conserved domains in Salmonella muenchen flagellin that are essential for its ability to activate TLR5 and to induce an inflammatory response in vitro. J Biol Chem 279, 5667-75.
68. Malapaka, R. R., Adebayo, L. O. & Tripp, B. C. (2007). A deletion variant study of the functional role of the Salmonella flagellin hypervariable domain region in motility. J Mol Biol 365, 1102-16.
69. Kuwajima, G. (1988). Construction of a minimum-size functional flagellin of Escherichia coli. J. Bacteriol. 170, 3305-3309.
70. Samatey, F. A., Imada, K., Nagashima, S., Vonderviszt, F., Kumasaka, T., Yamamoto, M. & Namba, K. (2001). Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331-7.
71. Westerlund-Wikstrom, B. (2000). Peptide display on bacterial flagella: principles and applications. Int J Med Microbiol 290, 223-30.
72. Brown, C. K., Modzelewski, R. A., Johnson, C. S. & Wong, M. K. (2000). A novel approach for the identification of unique tumor vasculature binding peptides using an E. coli peptide display library. Ann Surg Oncol 7, 743-9.
73. Hynonen, U., Westerlund-Wikstrom, B., Palva, A. & Korhonen, T. K. (2002). Identification by flagellum display of an epithelial cell- and fibronectin-binding function in the SlpA surface protein of Lactobacillus brevis. J Bacteriol 184, 3360-7.
74. Tanskanen, J., Korhonen, T. K. & Westerlund-Wikstrom, B. (2000). Construction of a multihybrid display system: flagellar filaments carrying two foreign adhesive peptides. Appl Environ Microbiol 66, 4152-6.
75. Tripp, B. C., Lu, Z., Bourque, K., Sookdeo, H. & McCoy, J. M. (2001). Investigation of the 'switch-epitope' concept with random peptide libraries displayed as thioredoxin loop fusions. Protein Eng 14, 367-77.
76. Fedorov, O. V. & Efimov, A. V. (1990). Flagellin as an object for supramolecular engineering. Protein Eng 3, 411-3.
77. Lu, Z., Murray, K. S., Van Cleave, V., LaVallie, E. R., Stahl, M. L. & McCoy, J. M. (1995). Expression of thioredoxin random peptide libraries on the Escherichia coli cell surface as functional fusions to flagellin: a system designed for exploring protein-protein interactions. Biotechnology (NY) 13, 366-72.
78. Westerlund-Wikstrom, B., Tanskanen, J., Virkola, R., Hacker, J., Lindberg, M., Skurnik, M. & Korhonen, T. K. (1997). Functional expression of adhesive peptides as fusions to Escherichia coli flagellin. Protein Eng 10, 1319-26.
79. Betton, J. M., Jacob, J. P., Hofnung, M. & Broome-Smith, J. K. (1997). Creating a bifunctional protein by insertion of beta-lactamase into the maltodextrin-binding protein. Nature Biotechnology 15, 1276-1279.

80. Doi, N., Itaya, M., Yomo, T., Tokura, S. & Yanagawa, H. (1997). Insertion of foreign random sequences of 120 amino acid residues into an active enzyme. FEBS Letters 402, 177-180.
81. Doi, N. & Yanagawa, H. (1999). Insertional gene fusion technology. FEBS Lett 457, 1-4.
82. Collinet, B., Herve, M., Pecorari, F., Minard, P., Eder, O. & Desmadril, M. (2000). Functionally accepted insertions of proteins within protein domains. Journal of Biological Chemistry 275, 17428-17433.
83. Scalley-Kim, M., Minard, P. & Baker, D. (2003). Low free energy cost of very long loop insertions in proteins. Protein Science 12, 197-206.
84. Tanskanen, J., Korhonen, T. K. & Westerlund-Wikstrom, B. (2000). Construction of a multihybrid display system: flagellar filaments carrying two foreign adhesive peptides. Appl. Environ. Microbiol. 66, 4152-4156.
85. Westerlund-Wikstrom, B. (2000). Peptide display on bacterial flagella: principles and applications. Int. J. Med. Microbiol. 290, 223-230.
86. Newton, S. M., Jacob, C. O. & Stocker, B. A. (1989). Immune response to cholera toxin epitope inserted in Salmonella flagellin. Science 244, 70-2.
87. Wu, J. Y., Newton, S., Judd, A., Stocker, B. & Robinson, W. S. (1989). Expression of immunogenic epitopes of hepatitis B surface antigen with hybrid flagellin proteins by a vaccine strain of Salmonella. Proc Natl Acad Sci USA 86, 4726-30.
88. Lu, Z., Murray, K. S., Van Cleave, V., LaVallie, E. R., Stahl, M. L. & McCoy, J. M. (1995). Expression of thioredoxin random peptide libraries on the Escherichia coli cell surface as functional fusions to flagellin: a system designed for exploring protein-protein interactions. Biotechnology (NY) 13, 366-372.
89. Ezaki, S., Tsukio, M., Takagi, M. & Imanaka, T. (1998). Display of heterologous gene products on the Escherichia coli cell surface as fusion proteins with flagellin. Journal of Fermentation and Bioengineering 86, 500-503.
90. Kumara, M. T., Muralidharan, S. & Tripp, B. C. (2006). Generation and characterization of inorganic and organic nanotubes on bioengineered flagella of mesophilic bacteria. Journal of Nanoscience and Nanotechnology, accepted.
91. Kumara, M. T., Srividya, N., Muralidharan, S. & Tripp, B. C. (2006). Bioengineered Flagella Protein Nanotubes with Cysteine Loops: Self-Assembly and Manipulation in an Optical Trap. Nano Lett. 6, 2121-2129.
92. Kumara, M. T., Tripp, B. C. & Muralidharan, S. (2007). Self-assembly of Metal Nanoparticles and Nanotubes on Bioengineered Flagella Scaffolds. Chemistry of Materials, accepted.
93. Kumara, M. T., Tripp, B. C. & Muralidharan, S. (2007). Exciton Energy Transfer in Self-assembled Quantum Dots on Bioengineered Bacterial Flagella. J. Phys. Chem. C, accepted.

CHAPTER II

INVESTIGATION OF THE HYPERVARIABLE DOMAIN REGION OF THE SALMONELLA TYPHIMURIUM FLAGELLIN PROTEIN IN MOTILITY

Abstract

The eubacterial flagellum is a complex structure with an elongated extracellular

filament that is primarily composed of a single protein termed flagellin. The highly

conserved N- and C-termini of flagellin are important in its export and assembly,

whereas the middle “hypervariable” region is highly variable in size across different

species and can be deleted to a large extent. In Salmonella typhimurium phase 1

flagellin, this hypervariable region encodes two solvent-exposed D2 and D3 domains

that form a knob-like feature on flagella fibers. The functional role of this structural

feature in motility remains unclear. We investigated the structural and physiological

role of the hypervariable region in flagellar assembly, stability and cellular motility.

A library of random internal deletion variants of S. typhimurium phase 1 flagellin was

constructed and screened for functional variants using a swarming agar motility

assay. The relative cellular motility and propulsive force of ten representative

functional variants were determined in semi-solid and liquid media using swarming

motility assays, video microscopy and optical trapping of single cells. All ten variants

investigated exhibited diminished motility, with varying extents of partial motility

observed for internal deletions less than 75 residues and nearly complete loss of

motility for deletions greater than 100 residues. The mechanical stability of flagella

fibers also decreased with increasing size of deletions. Analysis of deletion variant

sequences with respect to the wild-type structure indicated that all deletions involved

both loss of hydrophobic core residues and loss of both partial and complete segments

of β-strand secondary structure in the D2 and D3 domains. Homology modeling also

predicted disruptions of secondary structures in each variant. The hypervariable

region middle domain appears to have an important role in stabilizing the folded

conformation of the flagellin protein, and thereby in maintaining the mechanical stability

of the flagella fibers and the resulting propulsion.

Introduction

The eubacterial flagellum is a helical protein filament that is rotated by a

membrane-bound motor to propel the cell through liquid environments.1,2,3,4,5 Other flagellar

biological functions include host adhesion, colonization and virulence.5,6 The

flagellum is a complex structure composed of more than 20 component proteins 2,7,8,9,10

and has several major features: a basal body, a trans-membrane motor, a hook

structure and an elongated helical filament. The filament has an outer diameter of

12-25 nm, a 2-3 nm diameter inner channel 10,11 and can be 1-15 µm or more in length.9,12

It is composed of as many as 30,000 subunits of a single flagellin protein.13,14
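As a rough consistency check on the figures above, the subunit count can be converted into a filament length. The sketch below is a back-of-the-envelope estimate, not a measurement: the axial rise per subunit (about 0.47 nm) is an assumed value that does not appear in this text, and the function names are hypothetical.

```python
# Assumed axial rise per flagellin subunit along the filament axis, in nm.
# This value is NOT stated in the text; it is an assumption for illustration.
AXIAL_RISE_NM = 0.47


def filament_length_um(n_subunits, rise_nm=AXIAL_RISE_NM):
    """Estimate filament length in micrometers from the number of subunits."""
    return n_subunits * rise_nm / 1000.0


def subunits_for_length(length_um, rise_nm=AXIAL_RISE_NM):
    """Estimate how many subunits are needed for a filament of a given length."""
    return round(length_um * 1000.0 / rise_nm)


longest = filament_length_um(30_000)   # upper subunit count quoted in the text
shortest = subunits_for_length(1.0)    # subunits in a 1 micrometer filament
```

Under this assumption, 30,000 subunits correspond to roughly 14 micrometers, which falls near the upper end of the micrometer-scale length range quoted above.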

Salmonella enterica serovar Typhimurium, commonly named Salmonella

typhimurium, is a peritrichously flagellated (6-8) bacterial pathogen that serves as a

model organism for the investigation of bacterial flagellar structure and function

(Figure 2.1). The mature S. typhimurium phase 1 flagellin protein, encoded by the fliC

gene, consists of 494 amino acids; the first methionine residue is removed post-translationally.

Figure 2.1 An Electron Micrograph of Salmonella typhimurium. An electron micrograph of a wild-type Salmonella typhimurium strain SJW1103, negatively stained with phosphotungstic acid. Scale bar: 1 µm. The figure shows six peritrichously arranged flagella.

The structure of S. typhimurium flagellin was recently determined

(PDB 1UCU) 12,15 and shows four distinct globular domains in the protein, termed

D0, D1, D2 (with subdomains D2a and D2b) and D3. Domain D0 forms the inner

core of the filament; D1 forms the outer core and domains D2 and D3 form a

knob-like projection on the filament surface. The N-terminal polypeptide chain starts at

domain D0, proceeds through domains D1 and D2 and reaches domain D3, and then

returns through domains D2 and D1, with the C-terminal chain ending in domain

D0.15

The D0 and D1 domain regions are formed from the N- and C-terminal

regions and are highly conserved across different bacterial species.16,17,18,19

These terminal domain regions are essential for flagellar self-assembly and are

largely composed of α-helical coiled-coils that further associate to form helical

bundles in the interior of the filament. The middle region of the flagellin sequence

encodes the surface-exposed D2 and D3 domains that are largely composed of

β-sheet structures. This “hypervariable” sequence region can vary greatly in

sequence composition and number of residues across different strains and species,

resulting in observed flagellin molecular masses that range from 20-77 kDa.20,21,22,23,24

The D2 and D3 middle domains have been credited with functions such as

increasing traction while swimming and evading host defenses through their

sequence variability.25,26,27 Flagellins have long been characterized with respect to

their serotype antigenicity in eliciting production of antibodies in mammals and are

classified as H-antigens.21,28,29,30,31 The outer D2 and D3 domains are the primary

immunological determinant of this protein; the sequence of the middle region has

long been correlated with variations in the H antigenicity.31,32 Large portions of the

D2 and D3 hypervariable domain regions can be deleted without

completely disabling the export and self-assembly functions of flagellin.18,32,33,34

Furthermore, the deletion tolerance of the D2 and D3 domains has been exploited by

a number of researchers to insert peptides and proteins in this extracellular protein for

use as a peptide display tool for selection of constrained peptides with affinity for

other ligands and immunogenic epitope display.35,36,37,38,39

The biophysical function of the middle domain of flagellin in propulsion

remains unclear. Further understanding of the evolution and function of the

hypervariable region of the flagellin protein and its resulting physiological impact on

cellular motility would be useful, in light of the recently determined structure.

Structural information about the protein folding rules that allow deletion of segments

of the middle region may also be useful in developing rational approaches for

insertion and display of other fusion peptides and proteins on flagella as a

“bionanotube”, as demonstrated in other recent publications by Kumara et al.40,41,42,43

We investigated the function of the hypervariable region of flagellin in

cellular motility by deleting portions of the outermost D2 and D3 domains and

screening for variants with functional motility. Three complementary biophysical

methods were used to characterize the motility of cells expressing the wild-type

flagellin and ten representative internal deletion variants, in environmental conditions

that ranged from viscous agar to dilute aqueous solution. The structures of these

variants and two previously described internal deletion variants were analyzed with

respect to the wild-type protein structure and predictive homology models were

developed for most of these variants.

Materials and Methods

Bacterial strains

E. coli XL-1 Blue electrocompetent cells (Stratagene, La Jolla, CA) were used for the

preparation of plasmid libraries of fliC internal deletion variants. These cells have a

transformation efficiency greater than 1 x 10^10 transformants/µg of DNA, are

tetracycline resistant, endonuclease (endA) deficient, which greatly improves the

quality of miniprep DNA, and recombination (recA) deficient, which improves insert

stability. The S. typhimurium strains used for the experiments were kindly provided

by the late Dr. Robert Macnab (Yale University). S. typhimurium strain SJW1103,44,45,46

a derivative of serovar Typhimurium LT2 that can only express the phase 1 fliC

flagellin (i.e., fliC stable), was used as a wild-type flagellar motility control for the

studies of modified flagellin proteins and motility assays in solution and on agar

plates. S. typhimurium strain SJW134 (ΔfliC and ΔfljB), which was derived from

parent strain SJW806, is wild-type except for the deletion of the phase 1 fliC and

phase 2 fljB flagellin genes.47,48 This strain is non-motile unless a functional flagellin

gene is introduced, e.g., on an expression plasmid.47 Thus, it serves as an ideal strain

to screen engineered or mutated fliC genes, present in the introduced plasmids, for

functional flagellin protein expression, folding, export and flagellar fiber assembly, as

indicated by a simple agar microbial motility assay. S. typhimurium strain JR501, a

stable restriction-deficient (r-), methylation-proficient (m+) galE strain,49,50 was used

to convert E. coli-grown plasmids to S. typhimurium compatibility. Electrocompetent

cells of S. typhimurium strains JR501 and SJW134 were prepared using standard lab

protocols.51,52

Plasmids and construction of flagellin internal deletion variant library

The pTrc series of plasmids were constructed using plasmid pKK233-2, for the

regulated expression of genes in Escherichia coli.53, 54 These vectors carry a strong

hybrid trp/lac, isopropyl-β-D-thiogalactopyranoside (IPTG) inducible promoter, the

lacZ ribosome-binding site (RBS), the multiple cloning site of pUC18 and the rrnB

transcription terminators. The pTrc plasmids have been frequently used for cloning

purposes. The pTH890 plasmid is a derivative of pTrc99A plasmid with the S.

typhimurium fliC phase 1 flagellin gene cloned into the XbaI/HindIII site (Figure 2.2).

The pTH890 plasmid was kindly provided by the late Dr. Robert Macnab (Yale

University). This plasmid has a unique BsrGI restriction site at the nucleotide position

843, located approximately in the middle of the fliC gene. We exploited this

fortuitous unique site to generate the internal deletion library. Unless otherwise noted,

all restriction enzymes and PCR reagents were obtained from New England Biolabs

(Beverly, MA). First, the pTH890 plasmid was cut at the unique BsrGI site with the

corresponding restriction enzyme to generate a linear plasmid with intact 5’ and 3’

ends of the fliC gene. This was followed by time-course digestion with slow Bal31

exonuclease for 30 s, 60 s, 90 s, 120 s, 150 s, 180 s, 210 s, 240 s, 300 s and

20 min. The varied time periods were used to generate diversity in the length of

deletions in the fliC gene present at both ends of the linear plasmid. After each time

interval equal volumes of the reaction mixture were transferred into the

phenol:chloroform:isoamyl alcohol (25:24:1) mixture to denature Bal31 nuclease and

stop the reaction. The DNA was precipitated using 5 µl of 3 M sodium acetate and

150 µl of 100% ethanol (Figure 2.3).

Figure 2.2 A pictorial representation of pTH890 plasmid. The pTH890 plasmid is a derivative of the pTrc99A plasmid with the S. typhimurium fliC phase 1 flagellin gene cloned into the XbaI/HindIII site. This plasmid has a unique BsrGI restriction site at the nucleotide position 843, located approximately in the middle of the fliC gene. A PvuI restriction site is present in the middle of the β-lactamase gene.

The Bal31 nuclease creates rough, single-stranded ends with overhangs. These

rough ends were repaired by DNA polymerase I Large Fragment (Klenow Fragment),

in the presence of deoxynucleotide triphosphates (dNTPs). This enzyme either

digests or fills in any overhangs on the ends of the linearized plasmid DNA fragment.

To generate additional diversity in the fliC internal deletion library, i.e., allow short 5’

fragments of fliC to recombine with long 3’ fliC gene fragments and vice-versa, a

unique PvuI restriction site at nucleotide position 419 in the plasmid-encoded

β-lactamase gene (ampR) was cut following variable digestion with the slow Bal31

nuclease, and the resulting cut was used to shuffle the plasmid fragments. The resulting

plasmid fragments were ligated at 16 °C overnight with T4 DNA ligase, thus

producing a circular plasmid library of fliC genes with internal deletions of variable

length. The ligated plasmid library was then digested a second time with BsrGI

restriction endonuclease before transformation into SJW134 cells. This procedure was

designed to linearize plasmids containing the unique restriction site present only in

the wild-type fliC gene, thus preventing their transformation into the intermediate E.

coli cloning strain. A proportion of the library was transformed into E. coli XL-1 Blue

electrocompetent cells and all the transformants were inoculated into 50 ml of LB

media (Shelton Scientific, Peosta, IA) with 50 µl of 100 µg/ml ampicillin (Shelton

Scientific) for overnight growth. The direct transformation of the midiprep plasmid

library obtained from E. coli XL-1 Blue cells into the non-motile S. typhimurium

strain SJW134 was highly inefficient. The intermediate step of transformation into S.

typhimurium strain JR501, plasmid recovery and subsequent transformation into

SJW134 strain yielded good results. After the SJW134 cells were electroporated with

the plasmid library, the plates were incubated overnight at 37°C. The colonies were

picked using a blunt ended toothpick and smeared onto the surface of the motility

agar and incubated at 30 °C in a humidified incubation chamber for 6-8 h. The

plasmid library from the cell culture was isolated using a QIAprep Plasmid Midi Kit

(Qiagen Inc., Valencia, CA). The library was transformed into S. typhimurium strain

JR501 for methylation of the plasmids and the transformants were inoculated into 50

ml of LB media with ampicillin for overnight growth. The plasmid library was

isolated using a QIAGEN Plasmid Midi Kit.
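The effect of the PvuI shuffling step on library diversity can be illustrated with a rough sketch. It assumes that both ends of the linear plasmid are digested to the same extent at a given time point and that the time points are pooled before ligation. The digestion rate `ASSUMED_RATE_BP_PER_S` is a hypothetical placeholder, not a measured value; only the time points are taken from the procedure described above.

```python
from itertools import product

# Bal31 time points from the procedure, in seconds (20 min = 1200 s).
TIME_POINTS_S = [30, 60, 90, 120, 150, 180, 210, 240, 300, 1200]

# HYPOTHETICAL digestion rate (bp removed per second from each end).
# This value is an assumption for illustration only.
ASSUMED_RATE_BP_PER_S = 0.5


def end_deletion_bp(seconds, rate=ASSUMED_RATE_BP_PER_S):
    """Approximate number of bp removed from one end after `seconds`."""
    return round(seconds * rate)


def paired_deletions(times):
    """Without shuffling: the 5' and 3' halves come from the same time point."""
    return {(end_deletion_bp(t), end_deletion_bp(t)) for t in times}


def shuffled_deletions(times):
    """With PvuI shuffling: any 5' half may religate with any 3' half."""
    return {
        (end_deletion_bp(a), end_deletion_bp(b))
        for a, b in product(times, repeat=2)
    }


if __name__ == "__main__":
    print(len(paired_deletions(TIME_POINTS_S)), "paired combinations")
    print(len(shuffled_deletions(TIME_POINTS_S)), "shuffled combinations")
```

Under these assumptions, ten time points give ten paired end combinations without shuffling and one hundred with it.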

Figure 2.3 Construction of fliC gene deletion library. a) The pTH890 plasmid was cut with BsrGI restriction enzyme. b) The linear plasmid was digested using Bal31 nuclease for varying time periods and the ends were repaired with Klenow fragment. c) The digested fragments were further cut with PvuI restriction endonuclease. d) The fragments were ligated using T4 DNA ligase and transformed into XL-1 Blue E. coli cells.

The identified motile colonies were streaked on fresh LB ampicillin plates and

incubated overnight at 37 °C to obtain single isolated colonies. Colony PCR was

performed with primers that flanked the full length gene, to identify the extent of

deletion within each fliC gene that conferred motility to the cells. The PCR results

were analyzed on 2% agarose gels. A control PCR reaction on full-length fliC formed

a band just above the 1.5 kb DNA marker. Colony PCR performed on the variably

motile cells indicated fliC genes with molecular masses equal to or smaller than the

full length control. After screening for motile cultures using a motility assay, colony

PCR was performed to identify the approximate extent of deletion in each fliC gene.

The results were initially analyzed using agarose gel electrophoresis, followed by

DNA sequencing of the isolated plasmids. The sequencing results were verified using

nucleotide BLAST and the extent of deletion was analyzed. The plasmids were

named in the order in which the clones were identified and hence carry the suffix C

followed by a number.

Motility assays

S. typhimurium strain SJW134 electrocompetent cells were transformed with the

pTH890 fliC internal deletion variant plasmid library or purified pTH890 plasmids of

interest and plated on LB agar (1.5%, wt/vol) for overnight growth at 37 °C. Motility

agar media was composed of tryptone broth (10 g/liter tryptone, EM Science, 5 g/liter

NaCl, Fisher Scientific) that contained 0.3% (wt/vol) agar (Sigma-Aldrich, St. Louis,

MO). For the initial screening, agar plates were inoculated with bacteria from an

overnight culture in LB agar (1.5%, wt/vol) plates at 37 °C with a sterile, blunt-ended

toothpick. The plates were then incubated at 30 °C for 6-8 h in a humidified incubator

to prevent dehydration of the media. The low density of the agar allowed the bacteria

to move within the agar, forming a visible halo of growth around the point of

inoculation. For further analysis of swarming diameters of the 10 deletion mutants,

the motility agar was inoculated with 2 µl of overnight culture at an OD550 of 0.5.

Optical imaging measurements were taken after 6 h with a Kodak DC290 digital

camera gel documentation system. A controlled motility assay was performed using

the deletion mutants and the negative and positive control strains SJW134 (non-

motile) and SJW1103 (wild-type). Five different sets of measurements were

performed on each clone and the average swarming diameter was calculated.

Colony PCR

Colony PCR was performed on motile S. typhimurium colonies using a Techne

Touchgene Gradient Thermal Cycler (Techne, Inc., Burlington, NJ). PCR reagents

were obtained from NEB. Bacterial colonies were picked using 10 µl micropipette

tips and added to 50 µl of PCR master mix (0.25 µl Taq DNA Polymerase (NEB), 5

µl 10× standard PCR buffer, 5 µl of 2.5 mM dNTP mix, 5 µl of 10 µM forward and

reverse primers and 30 µl of DI H2O). The PCR reaction conditions used were: an

initial cycle of 94 °C for 10 min to perform cell lysis, followed by 30 cycles of 94 °C

for 1 min; 55 °C for 1 min; 72 °C for 1.5 min, followed by a final extension step of 72

°C for 10 min. A forward oligonucleotide primer with DNA sequence

5’-AATTAATCCGGCTCGT-3’ started at nucleotide position 197 of the pTH890

plasmid. A reverse oligonucleotide primer with DNA sequence

5’-ATTTAGTCTTGCGTCTTCGC-3’ started at nucleotide position 1958. These two

primers were designed to completely flank the fliC gene in the pTH890 plasmid. The

PCR products were analyzed by electrophoresis on 2% agarose gels.
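The expected size of a colony PCR product and the final reagent concentrations in the master mix can be estimated with a short calculation. The sketch below assumes that the two stated primer positions (197 and 1958) mark the outer ends of the amplicon, that each deletion is in frame (3 bp per residue), and that the stated reagent volumes are diluted into a 50 µl reaction; the function names are illustrative only.

```python
FORWARD_START = 197    # forward primer position, from the text
REVERSE_START = 1958   # reverse primer position, from the text
REACTION_VOLUME_UL = 50.0


def final_concentration(stock_conc, stock_volume_ul, total_volume_ul=REACTION_VOLUME_UL):
    """Dilution relation C1 * V1 = C2 * V2, solved for C2 (same unit as stock_conc)."""
    return stock_conc * stock_volume_ul / total_volume_ul


def expected_amplicon_bp(deleted_residues=0):
    """Expected PCR product size for an in-frame deletion of `deleted_residues`.

    Assumes the primer positions are the inclusive outer ends of the product.
    """
    full_length = REVERSE_START - FORWARD_START + 1
    return full_length - 3 * deleted_residues


if __name__ == "__main__":
    print("dNTP (mM):", final_concentration(2.5, 5))     # 2.5 mM stock, 5 µl
    print("primer (µM):", final_concentration(10.0, 5))  # 10 µM stock, 5 µl
    for n in (0, 12, 177):
        print(n, "residues deleted:", expected_amplicon_bp(n), "bp")
```

With these assumptions the full-length product is 1762 bp, which is consistent with a control band just above the 1.5 kb marker.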

DNA sequence determination and analysis

Plasmids containing functional fliC genes with internal deletions were purified using

a QIAprep plasmid miniprep kit (Qiagen, Valencia, CA). A forward sequencing

primer (5’-TCCATCCAGGCTGAAATC-3’) was designed starting at fliC nucleotide

position 330 to analyze the extent of deletion. DNA sequences were analyzed with the

BLASTN software 55 using the default parameters, to identify the extent of deletion

within each gene. The ORF (Open Reading Frame) finder tool at the NCBI server

was used to convert the nucleotide sequences into protein sequences. Multiple

sequence alignment of the mutant flagellin proteins was performed using the

ClustalW software package.56

Flagellin protein expression, flagella fiber purification and SDS-PAGE

Purified pTH890 plasmids encoding fliC internal deletions of interest were

transformed into competent SJW134 S. typhimurium cells and plated on LB-Amp

agar. Colonies were picked and inoculated into 10 ml LB broth with 5 µl of 100

mg/ml ampicillin and allowed to grow overnight at 37 °C. 10 ml of the bacterial

cultures were transferred to a 15 ml polypropylene tube and placed on a heat block at

60 °C for 10 min. The cells were pelleted at 6000 x g and the supernatant was

transferred to 15 ml Amicon Ultra centrifugal filtration tubes (Millipore Corporation,

Billerica, MA) for concentrating the flagella fibers. The concentrates were resolved

by SDS-PAGE on 10% (wt./vol.) polyacrylamide gels with power settings of 30 mA

and 180 V and visualized by staining with Coomassie brilliant blue R-250.

Isolation of flagella fibers from bacteria for TEM studies was performed by an

established technique17,32,57 of vortexing 1.0 ml of cells for 30 s in a 1.5 ml

microcentrifuge tube, followed by selective pelleting of the cells by centrifugation at

5,000 x g for 5 min. 0.5 ml of supernatant containing suspended flagella was

transferred to an Amicon Centricon centrifugal filter (Millipore, Billerica, MA) with a

molecular weight cutoff of 30 kDa and centrifuged at 15,000 x g for 30 min. The

concentrated solution was collected into the retentate vial by brief centrifugation.

Partial purification of flagella by this method yielded a significant amount of protein

even with a small amount of culture.

Electron microscopy

Flagellar fiber samples were prepared for TEM imaging by shearing and

concentration in the manner described above. For TEM imaging of bacterial cells, 20

µl of overnight grown culture samples were applied to carbon coated copper grids and

negatively stained with 4% aqueous phosphotungstic acid (pH 5.2). Both flagella

attached to the bacterial cells as well as detached flagella were observed. TEM

micrographs were taken using a JEM-1230EXII electron microscope (JEOL, Tokyo,

Japan), at an accelerating voltage of 80 kV. Digital images were captured as TIFF

files by a 1 megapixel Gatan Bioscan digital camera (Gatan Inc., Pleasanton, CA).

Light microscopy and swimming velocity analysis

A Nikon Eclipse E600 epifluorescence stereo microscope (Mager Scientific Inc.,

Dexter, MI) with 10X, 40X and 100X (oil immersion) Planfluor objectives,

transmission and darkfield condenser light sources and a QImaging Fast 1394 cooled

mono (greyscale) CCD digital camera was used to record visible images and movies

of non-motile S. typhimurium SJW134 bacteria transformed with original or modified

pTH890 expression plasmids encoding wild-type and internal deletion variant

flagellins. The video images of swimming bacteria were captured at 13 fps speed

using a 100X oil immersion lens. Bacterial cultures were grown to mid-log phase in

LB media without shaking to minimize breakage of potentially brittle flagella fibers. 30

µl of each culture at an OD550 of 0.05 was loaded onto a concave glass slide and covered

with a glass cover slip. The Manual Tracking plug-in of the ImageJ software (NIH,

Bethesda, MD) (Version 1.36, http://rsb.info.nih.gov/ij/) was used to calculate the

velocity with which the bacteria were moving. The swimming speeds of the cells

were measured for 20 frames. The swimming speeds of 20 individual bacterial cells

were determined and averaged for wild-type and each deletion variant. In general, the

more the cells tumbled and changed direction, the slower was the observed cell speed.

In order to measure the approximate swimming speed, cells were chosen for analysis

that stopped and changed direction the least in two dimensions.
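The speed calculation performed on the manually tracked positions can be sketched as follows. This is not the ImageJ plug-in itself, only a minimal illustration of the arithmetic: path length between consecutive frames divided by elapsed time. The frame rate is taken from the text; the pixel calibration `UM_PER_PIXEL` is a hypothetical value that would have to be replaced by the calibration of the actual objective and camera.

```python
import math

FRAME_RATE_FPS = 13.0   # video frame rate, from the text
UM_PER_PIXEL = 0.1      # HYPOTHETICAL calibration, for illustration only


def track_speed_um_per_s(points_px, fps=FRAME_RATE_FPS, um_per_pixel=UM_PER_PIXEL):
    """Mean speed along one track.

    points_px: list of (x, y) pixel positions of one cell in consecutive frames.
    """
    if len(points_px) < 2:
        raise ValueError("a track needs at least two positions")
    # Sum of straight-line steps between consecutive frames.
    path_px = sum(math.dist(a, b) for a, b in zip(points_px, points_px[1:]))
    elapsed_s = (len(points_px) - 1) / fps
    return path_px * um_per_pixel / elapsed_s


def mean_speed_um_per_s(tracks, **kwargs):
    """Average of the per-cell speeds, e.g. over 20 cells per variant."""
    speeds = [track_speed_um_per_s(t, **kwargs) for t in tracks]
    return sum(speeds) / len(speeds)


if __name__ == "__main__":
    # A synthetic straight track: 10 pixels per frame over 20 frame intervals.
    demo_track = [(10 * i, 0) for i in range(21)]
    print(track_speed_um_per_s(demo_track))
```

Because the path length is summed step by step, a cell that tumbles and reverses contributes a long path but little net displacement, which is why straight-swimming cells were preferred for the measurement.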

Measurement of power required to stall cells using optical tweezer

A BioRyx 200 optical trap using a 1064 nm (Arryx, Chicago, IL) high power 2 Watt

laser source and proprietary computer-controlled holographic optical trapping

software was used to trap and monitor the propulsive forces generated by bacterial

cells (Figure 2.4). The software can create up to 200 independently maneuverable

optical traps in three dimensions and can trap objects from 150 nm to 300 µm in size.

A Nikon Eclipse TE2000U inverted epifluorescence microscope with 4X, 10X and

60X (oil immersion) objectives, fitted with a digital 640 x 480 pixel resolution CCD

digital camera was used to record visible images and movies of trapped S.

typhimurium bacteria with different flagellin mutant plasmids. Cell cultures were

grown to the mid-log phase, where they produce maximum number of flagella. The

cells were grown without shaking in order to reduce potential breakage of flagella.

The cultures were then diluted to an OD550 of 0.05. Cells observed in the LB growth

media were difficult to follow as their movement was very rapid, especially the wild-

type, and also due to the limitations of the laser power. Addition of 6.25% glycerol to

the media considerably reduced the speed of the swimming cells and enabled their

trapping with relative ease. Individual cells were initially trapped using the maximum

laser power of 2.0 W.

The laser power was then decreased from 2.0 W to 0.2 W in 0.1 W increments and

the power at which cells started to move out of the trap was noted. The minimum

power of the laser was 0.2 W. The average stall force for each variant was calculated

from measurements performed on 20 different single cells, with similar dimensions,

from the same culture preparation. The stall force of each cell was measured multiple

times for consistency.
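The step-down procedure can be summarized in a short sketch. The quantity recorded here is the laser power at which a cell first leaves the trap; no conversion from power to force is attempted, since none is described in this section. The callable `cell_escapes_at` is a hypothetical stand-in for the observation made at the microscope.

```python
def power_steps(start_w=2.0, stop_w=0.2, step_w=0.1):
    """Laser powers from start_w down to stop_w (inclusive), in watts."""
    n_steps = round((start_w - stop_w) / step_w)
    return [round(start_w - i * step_w, 1) for i in range(n_steps + 1)]


def escape_power(cell_escapes_at, steps=None):
    """Highest power (W) at which the cell leaves the trap, or None.

    cell_escapes_at: callable taking a power in watts and returning True
    if the cell was seen to move out of the trap at that power.
    """
    for power_w in (steps if steps is not None else power_steps()):
        if cell_escapes_at(power_w):
            return power_w
    return None  # cell remained trapped down to the minimum power


def mean_escape_power(powers_w):
    """Average over the cells that escaped (None entries are skipped)."""
    observed = [p for p in powers_w if p is not None]
    return sum(observed) / len(observed) if observed else None


if __name__ == "__main__":
    # Synthetic example: a cell that escapes once the power reaches 0.7 W.
    print(escape_power(lambda p: p <= 0.7))
```

Cells that never leave the trap at 0.2 W are reported as `None`, so that the lower limit of the laser is not mistaken for a measured value.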

Figure 2.4 A schematic representation of a laser tweezer. a) The trapping laser is an Nd-YAG continuous-wave laser operated at 1064 nm. The laser power can be varied from 0.2 W to 2 W. The tweezer is optimized to work with the 60X objective. b) The optical tweezers system has been constructed around an inverted microscope (Nikon TE2000U) with a high numerical aperture of 1.4 and a working distance of 0.21 mm (oil immersion objective).

Structure analysis and homology modeling

The deleted regions of the flagellin protein were visualized and analyzed with respect

to end-to-end distances using the SwissPDBViewer software version 3.7.58 The

structures of different flagellin protein sequences were modeled based on the

available 3D structure of S. typhimurium flagellin 1UCU at the RCSB Protein Data

Bank (PDB).59 The MODELLER (6v1)60 software module of the Accelrys (San

Diego, CA) InsightII software package was used to generate the models and

analyze them. The Accelrys modeling software was run on an IBM IntelliStation M

Pro PC workstation with the Red Hat Enterprise Linux WS 3 operating system.

Results

Description of flagellin hypervariable domain structural features

Structural features of the D2 and D3 domain regions of S. typhimurium phase 1

flagellin PDB structure (1UCU), as defined by Samatey, et al.,n are summarized here

for reference in the following analysis. The central, hypervariable region of S.

typhimurium flagellin that encodes the D2 and D3 domains is composed of residues

Lys177-Ala401. Subdomain D2a consists of the N-terminal residues Lys177-Gly189

and the C-terminal residues Ala284-Ile344. Subdomain D2b consists of residues

Asn345-Ala401. Domain D3 is formed by residues Tyr190-Val283. The structure

has nine helical structural motifs, with an α-helical region and short adjoining 3₁₀

helical region, collectively termed helix 6 (Leu288-Ala298), present in subdomain

D2a. A second, largely helical region that is neither in 3₁₀ nor α-helical form15,61 is

present in domain D3 (residues Asn199-Gly210) and was named helix 4-5 in

accordance with the PDB structure notation. The wild-type structure also has 20

defined β-strands, of which 18 are present in the outer D2 and D3 domains. These

were numbered as: β-strand 3, Lys179-Ala184; β-strand 4,

Ala191-Thr193; β-strand 5, Ile195-Leu197; β-strand 6, Asp213-Asp217; β-strand

7, Leu220-Asp223; β-strand 8, Lys228-Thr236; β-strand 9, Gly243-Asp250;

β-strand 10, Glu255-Ala259; β-strand 11, Glu276-Val283; β-strand 12, Ser305-

Asp313; β-strand 13, Lys317-Val327; β-strand 14, Asp330-Asn337; β-strand 15,

Ser341-Asn344; β-strand 16, Tyr349-Ala351; β-strand 17, Thr355-Lys357;

β-strand 18, Asn361-Gly364; β-strand 19, Thr370-Ile375; β-strand 20, Lys378-

Ala382. Subdomain D2a includes the five β-strands 3 and 12-15. Subdomain D2b

includes the five β-strands 16-20. Domain D3 includes the eight β-strands 4-11.

The Cα backbone of the D2 and D3 domains contains three examples of a unique

folding motif, termed the β-folium, comprising a series of four successive β-strands

that form three β-hairpins all pointing away from one another.12 The three β-folium

structural motif regions are composed of residues 220-260 in domain D3, residues

308-345 in subdomain D2a and residues 345-383 in subdomain D2b.
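The domain boundaries summarized above lend themselves to a simple lookup that reports how many residues of a given deletion fall in each part of the hypervariable region. The sketch below encodes only the residue ranges stated in this section; the function name and data layout are illustrative, and residues outside Lys177-Ala401 are simply not counted.

```python
# Residue ranges (inclusive) of the hypervariable region, as stated in the text.
DOMAINS = {
    "D2a": [(177, 189), (284, 344)],
    "D3": [(190, 283)],
    "D2b": [(345, 401)],
}


def deleted_per_domain(first, last):
    """Number of deleted residues in each (sub)domain.

    first, last: first and last deleted residue, both inclusive.
    """
    counts = {}
    for name, segments in DOMAINS.items():
        overlap = sum(
            max(0, min(last, seg_end) - max(first, seg_start) + 1)
            for seg_start, seg_end in segments
        )
        if overlap:
            counts[name] = overlap
    return counts


if __name__ == "__main__":
    # Example ranges: a small deletion (282-293) and a larger one (250-304).
    print(deleted_per_domain(282, 293))
    print(deleted_per_domain(250, 304))
```

For a deletion of residues 250-304, for example, this gives 34 residues in domain D3 and 21 in subdomain D2a.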

Identification of functional fliC internal deletion variants

An internal deletion library of the S. typhimurium phase 1 flagellin gene, fliC, was

constructed by a procedure involving restriction digestion and variable nuclease

digestion of the fliC middle region in a pTrc expression plasmid (Table 2.1, Materials

and Methods). Agarose gel analysis of the purified plasmid library yielded a broad

smear, indicating the presence of a diverse range of deletions in the fliC gene (Figure

2.5). This plasmid library was then transformed into non-motile SJW134 S.

typhimurium cells and screened for functional internal deletion variants on 0.3%

swarming motility agar. The colonies were initially classified into non-motile, low,

medium and strongly motile (Figure 2.6). The initial colony motility screens yielded a

ratio of roughly 1:10 motile to non-motile cells. PCR and DNA sequencing analysis

of a preliminary set of selected colonies showed that most of the strongly motile cells

had an intact fliC gene without any internal deletions. Therefore, the plasmid library

was again digested with BsrGI restriction endonuclease to remove any remaining

uncut, wild-type plasmids before transformation into SJW134 cells. This step

considerably reduced the number of false positive results and the ratio of motile vs.

non-motile cells was about 1:30. Approximately 12,000 single colonies were

manually screened by this method and approximately 200 functional deletion variants

were identified.

Figure 2.5 DNA agarose gel of the pTH890 deletion library. DNA agarose gel (1%) depicting the pTH890 plasmid and its deletion library. The plasmid library was constructed as described in the materials and methods section.

The relatively small number of functional flagellin variants isolated in the screen, ca.

2%, suggests that internal deletion and recombination events that result in a

functional flagellin protein are relatively rare. Colony PCR indicated that all isolated

fliC genes had molecular masses equal to or smaller than the full length fliC control

(Figure 2.7). A total of 46 functional deletion variants with more substantial deletions

were further characterized by DNA sequencing, yielding 39 unique clones (Table

2.2).

Figure 2.6 Motility assay. A picture depicting variation in the swarming abilities of different deletion variants on a 0.3% motility agar media. The deletion variants were initially classified as strong swimmers, medium-strong swimmers, weak swimmers and non-motile.

Analysis of functional deletion variant polypeptide sequences

All deletions in functional variants were centered on the location of the initial BsrGI

restriction site in the fliC gene, corresponding to a cut in the wild-type peptide

sequence after residue 281 in the subdomain D2a α-helix 6. These results are a

consequence of the molecular biology procedures and this may not be the most

permissive deletion region in the D2 and D3 domains. Conversely, two regions were

conserved in all variants in this study: the β-strand 3 segment (Lys179-Ala184)

connecting domain D1 to D3 and the two successive β-strands 4 and 5 on domain

D3; residues Ala191-Thr193 and Ile195-Leu197. This sequence region is furthest

from the BsrGI restriction site, and may represent another bias in the molecular

biology procedures, rather than an intrinsic stability of this region of the protein.

Figure 2.7 Colony PCR of the deletion variants. A 2% agarose gel of colony PCR products of several deletion variants. Lane 1: 100 bp Marker; Lane 2: pTH890 fliC gene PCR; Lanes 3-24: Colony PCR of different fliC gene deletion variants.

The extent of functional deletions in the flagellin fliC gene ranged from 12-

177 amino acids, although the majority of functional deletions were in the range of

40-80 amino acids, with an average value of 43.0 ± 36.3 deleted residues. The

average N-terminal start position of the deleted polypeptide region was 260.2 ± 18.4

residues and the average C-terminal end position was 304.3 ± 23.3 residues. This

indicates a preference in the position of functional deletions that is centered towards

the C-terminal region of the hypervariable region, especially for larger deletions. This

result is consistent with prior analysis of flagellin sequence homology. For example,

the Pfam Protein Families Database62,63 entries for the bacterial flagellin N-terminus

and C-terminus, pfam00669 and pfam00700, indicate a more extended region of

amino acid sequence conservation in the N-terminal region encompassing residues

28-163 (136 residues) than for the conserved C-terminal sequence encompassing

residues 401-472 (72 residues).

It was theoretically possible for deletions to occur in all four domains of

flagellin. However, no functional variants were isolated that contained any deletions

of the largely α-helical D0 and D1 domains essential for self-assembly. This result

was consistent with numerous prior observations indicating that the highly conserved

sequences encoding the D0 and D1 domains are responsible for the majority of

inter-subunit interactions in the filament, vs. the dispensable D2 and D3 domains.15,64

Smaller functional internal deletions were more likely to be found in the outermost

D3 domain and the D2a subdomain region; more extensive deletions extended further

into the D2b subdomain region. However, as discussed below, more extensive

deletions only resulted in marginally functional proteins, as indicated by swarming

agar motility measurements.

Analysis of the sequence data in Table 2.2 also indicated that the start and

stop positions of the deleted peptides were not randomly distributed. Many different

variants had deletions that started or ended at the same residue position, suggesting

the presence of deletion-tolerant “hot spots” in the flagellin sequence and structure.

Deletions that started at residue Asp250, located at the C-terminal end of D3 domain

β-strand 9, were the most frequent, with five variants encoding deletions that started

at this position: C37, C39, C79, C88, and C89. Four variants, C108, C118, C120 and

C170, had deletions starting at residue Asp277 near the N-terminal end of D3 domain

β-strand 11. Three variants, C21, C30 and C42, had deletions starting at residue

Leu266, a D3 loop region; three other variants, C27, C33 and C115, had deletions

starting at residue Lys279, in the middle of D3 β-strand 11. Six other pairs of

variants had deletions that started in various positions in the D3 domain at residues

Val247 (C61, C167), Thr257 (C14, C85), Gly261 (C4, C35), Gly269 (C110, C112),

Thr275 (C48, C116) and Asn280 (C47, C174). Deletions that ended at residues

Ala293 and Gly299, located in the middle and C-terminal end of helix 6, were the

most frequent. Four variants each had deletions ending at these two residue positions:

C3, C4, C39 and C167 (Ala293); C6, C11, C14 and C21 (Gly299). Two groups of three variants encoded deletions

ending at residues Leu288 (C42, C55, C112) and Ala304 (C33, C37, C62), which are

located at the N-terminal start region of subdomain D2a helix 6 and the N-terminal

start region of β-strand 12. Five other pairs of variants had deletions ending at

identical residues: Val281 (C61, C70) and Gln282 (C5, C30), located in domain D3

C-terminal β-strand 11; Ala291 (C9, C182) and Ala298 (C89, C118), located in

subdomain D2a helix 6; and Thr312 (C116, C174), located near the C-terminal end of

subdomain D2a β-strand 12. To summarize, all of the most frequent functional

deletion start positions were located in the outermost D3 domain and nearly all of the

most frequent deletion end positions were located near the C-terminal end of the D3

domain or the N-terminal start region of the D2a subdomain.
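The identification of shared start and end positions amounts to grouping the variants by residue number. A minimal sketch of that grouping is given below, using only a subset of variants for which both the first and the last deleted residue are stated in the text above; the dictionary and function names are illustrative.

```python
from collections import defaultdict

# (first deleted residue, last deleted residue) for a subset of variants
# whose start and end positions are both given in the text.
VARIANTS = {
    "C4": (261, 293),
    "C14": (257, 299),
    "C21": (266, 299),
    "C37": (250, 304),
    "C39": (250, 293),
    "C42": (266, 288),
    "C89": (250, 298),
    "C167": (247, 293),
}


def shared_positions(variants, index, min_count=2):
    """Group variants by deletion start (index=0) or end (index=1) residue.

    Returns only the positions shared by at least `min_count` variants.
    """
    groups = defaultdict(set)
    for name, span in variants.items():
        groups[span[index]].add(name)
    return {pos: names for pos, names in groups.items() if len(names) >= min_count}


shared_starts = shared_positions(VARIANTS, 0)
shared_ends = shared_positions(VARIANTS, 1)

if __name__ == "__main__":
    for label, table in (("start", shared_starts), ("end", shared_ends)):
        for position in sorted(table):
            print(label, position, sorted(table[position]))
```

Applied to the complete set of sequenced clones, the same grouping would reproduce the start and end positions listed above.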

The deleted sequence regions of the functional variants in Table 2.2 were

similar to those previously observed by Kuwajima32 for a series of eight internal

deletion variants of a homologous, 497 amino acid residue E. coli flagellin protein.

The largest deletion in the E. coli flagellin encoded 187 residues (194-380) in the

middle region, compared with 177 deleted residues (227-403) for the C150 S.

typhimurium flagellin variant in this study. It is difficult to make further inferences

without more detailed structural information about the homologous E. coli flagellin

protein, which may have a number of sequence substitutions, insertions and deletions

relative to the S. typhimurium flagellin, as indicated by a Blast sequence homology

search (data not shown).

There were also some similarities between the deletion variants isolated in this

study and two S. typhimurium flagellin deletion variants previously described by

Yoshioka, et al.33 Selection for S. typhimurium motility in the presence of an antibody

specific for the hypervariable region yielded two spontaneous deletion variants, one

containing a deletion of 89 residues encompassing Ala204-Lys292 (SJW46 strain)

and a second deletion of 97 residues encompassing Thr183-Lys279 (SJW61 strain).

The starting point in the deleted sequence for SJW46 (Ala204) was similar to the

deletion start residue Thr201 for the C131 variant in this study (Table 2.2), which

corresponds to the N-terminal helix 4-5 of the D3 domain. The SJW46 deletion

ending at Lys292 was identical or similar to the deletion end hot spot at residues

291-293 in the C3, C4, C9, C35, C39, C167 and C182 variants in this study (Table

2.2), corresponding to the D2a subdomain helix 6. The N-terminal sequence region

for the SJW61 variant deletion included part of the interdomain β-strand 3 region

between the D2 and D3 domains and was 18 residues earlier in the sequence than any

of the variants isolated in this study. Thus, no obvious similarities were apparent

between the N-terminal deletion start region of the SJW61 variant and the 39 unique

variants in this study. The end of SJW61 deletion region was similar to that observed

in the C61 and C70 variants in this study (Table 2.2).

A subset of ten representative deletion variants with a wide range of internal deletions

(12-177 residues) were chosen for more detailed structural analysis, predictive

modeling and experimental characterization of physiological motility function (Table

2.1). The published sequences of the two SJW46 and SJW61 variants were also

included in the theoretical structural analysis. All twelve variant amino acid

sequences were analyzed with respect to the wild-type flagellin structure

(PDB 1UCU), to determine deleted hydrophobic core residues, secondary structures

and domain regions and deduce possible structural trends. The deleted regions in the

primary sequence of each variant are depicted in Figure 2.8. The deletions were

predominantly in the D3 domain with some deletions extending into the D2 domain;

more of the D2b subdomain was deleted in variants with the most extensive deletions.

Table 2.1 Strains and plasmids used in this study.

A. Strains

Strain | Description | Source / Reference

E. coli
XL1-Blue | Cloning strain recA1 endA1 gyrA96 thi-1 hsdR17 supE44 relA1 lac {F′ proAB lacIqZΔM15 Tn10 (Tetr)} | Stratagene

S. typhimurium
SJW1103 | LT2 derivative that is wild-type for motility and chemotaxis, phase 1 flagellin stable (fliC gene) | 44; 45; 46
SJW134 | Non-motile, flagellin deficient strain, ΔfliC and ΔfljB, derived from parent strain SJW806 | 45; 47; 48
JR501 | Stable restriction-deficient (r−), methylation-proficient (m+) galE cloning strain for providing restriction compatibility with E. coli plasmids | 49; 50

B. Plasmids

Plasmid | Description | Source / Reference

pTH890 | pTrc99A-derived Amp expression plasmid for wild-type S. typhimurium fliC gene | 53; 54
pTH890-C3 | Derivative of pTH890, expressing FliC Δ282-293 | This study
pTH890-C4 | Derivative of pTH890, expressing FliC Δ261-293 | This study
pTH890-C9 | Derivative of pTH890, expressing FliC Δ245-291 | This study
pTH890-C11 | Derivative of pTH890, expressing FliC Δ231-299 | This study
pTH890-C12 | Derivative of pTH890, expressing FliC Δ220-328 | This study
pTH890-C37 | Derivative of pTH890, expressing FliC Δ250-304 | This study
pTH890-C79 | Derivative of pTH890, expressing FliC Δ250-324 | This study
pTH890-C89 | Derivative of pTH890, expressing FliC Δ250-298 | This study
pTH890-C131 | Derivative of pTH890, expressing FliC Δ201-368 | This study
pTH890-C150 | Derivative of pTH890, expressing FliC Δ227-403 | This study

The deleted regions in the wild-type flagellin tertiary structure are shown in

Figure 2.9 and key structural features of the deleted regions are summarized in

Table 2.3. All representative variants in this study had deletions of hydrophobic core

residues that would otherwise contribute to the stability of the outer D2 and D3

domain structures (Table 2.3). For example, the C3 variant encoded removal of two

residues (Ala286 and Ala291) from the hydrophobic core of the D2a domain, the

moderately functional C9 variant had additional deletions of eight core residues from

the D3 domain, and the non-functional C12 variant had deletions of 11 D3 domain

core residues and 8 D2a subdomain core residues. Thus, it is not surprising that the

deletion variant flagella structures were progressively less likely to fold and assemble

properly, with negative consequences for the resulting oligomeric flagella structures,

as more residues were deleted from the D2 and D3 domains.

[Figure 2.8: schematic of wild-type flagellin (residue scale 1-495) aligned with the internal deletion variants, ending with C150. Legend: Domain D0, Domain D1, Domain D2a, Domain D2b, Domain D3.]

Figure 2.8 Multiple alignment of FliC and internal deletion variants. The five different domains and subdomains are represented with different patterns. The dotted line indicates the region of the protein deleted in each variant. Domain D0 comprises the N-terminal residues from Ala1 to Ser32 and the C-terminal segment of residues from Ala459 to Arg494; Domain D1 comprises the N-terminal residues from Asn44 to Gln176 and C-terminal residues from Thr402 to Arg450. Subdomain D2a consists of the N-terminal residues from Lys177 to Gly189 and the C-terminal residues from Ala284 to Ile344. Subdomain D2b consists of residues from Asn345 to Ala401. Domain D3 is formed by residues Tyr190 to Val283.
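The domain boundaries listed in the Figure 2.8 caption can be used to tally how many residues a given deletion removes from each domain. The following is a minimal sketch, assuming the boundaries exactly as transcribed from the caption; the function name `deleted_per_domain` and the "unassigned" label for residues outside the listed ranges are illustrative choices, not part of the original analysis.

```python
# Domain boundaries transcribed from the Figure 2.8 caption (mature-protein
# numbering, initiator Met excluded). Residues that fall outside these
# ranges are reported under the hypothetical label "unassigned".
DOMAINS = {
    "D0": [(1, 32), (459, 494)],
    "D1": [(44, 176), (402, 450)],
    "D2a": [(177, 189), (284, 344)],
    "D2b": [(345, 401)],
    "D3": [(190, 283)],
}


def deleted_per_domain(first, last):
    """Count deleted residues per domain for an inclusive deletion first..last."""
    counts = {}
    assigned = 0
    for name, segments in DOMAINS.items():
        n = 0
        for lo, hi in segments:
            overlap = min(last, hi) - max(first, lo) + 1
            if overlap > 0:
                n += overlap
        if n:
            counts[name] = n
            assigned += n
    unassigned = (last - first + 1) - assigned
    if unassigned:
        counts["unassigned"] = unassigned
    return counts


print("C3  ", deleted_per_domain(282, 293))
print("C150", deleted_per_domain(227, 403))
```

With these boundaries, the 12-residue C3 deletion (282-293) straddles the D3/D2a junction, whereas the 177-residue C150 deletion (227-403) reaches two residues into the C-terminal segment of domain D1.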

A naive prediction was that most functional deletions should start and end at

flexible loop regions or at the ends of secondary structures, rather than in the middle

of extensively hydrogen-bonded α-helix and β-strand secondary structures. Analysis

of the variant deleted sequences with respect to the wild-type flagellin structure

indicated that a number of deleted peptide regions started and stopped near the ends

of secondary structures (Figure 2.8, Table 2.3). The C4 peptide deletion started

immediately after the end of a D3 domain β-strand. The C11 deletion encoded

complete removal of the D2a α-helix and ended in a loop region. The C37, C79 and

C89 variants had deletions that started at an apparent “hot-spot” (Val249) near the

C-terminal end of a D3 β-strand and ended at or near the N-terminal end of several D2a

Table 2.2 Sequence parameters of functional internal deletion variants. a

Clone | Deleted nucleotides b | Number of deleted amino acid residues c | Deleted peptide N- and C-terminal residues d

C5 | 36 | 12 | 271-282

C3 | 36 | 12 | 282-293

C108 | 39 | 13 | 277-289

CSS | 48 | 16 | 273-288

C30 | 51 | 17 | 266-282

C110 | 54 | 18 | 269-286

C115 | 54 | 18 | 279-296

C70 | 54 | 18 | 264-281

C112 | 60 | 20 | 269-288

C118 | 66 | 22 | 277-298

C42 | 66 | 22 | 266-288

C182 | 75 | 25 | 267-291

C27 | 75 | 25 | 279-303

C48 | 78 | 26 | 275-300

C33 | 78 | 26 | 279-304

C62 | 87 | 29 | 276-304

C120 | 90 | 30 | 277-306

C35 | 96 | 32 | 261-292

C4 | 99 | 33 | 261-293

C174 | 99 | 33 | 280-312

C21 | 103 | 34 | 266-299

C61 | 105 | 35 | 247-281

C47 | 114 | 38 | 280-317

C116 | 114 | 38 | 275-312

C6 | 123 | 41 | 259-299

C14 | 129 | 43 | 257-299

C39 | 132 | 44 | 250-293

C88 | 138 | 46 | 250-295

C167 | 141 | 47 | 247-293

C9 | 141 | 47 | 245-291

C89 | 147 | 49 | 250-298

C170 | 162 | 54 | 277-330

C37 | 165 | 55 | 250-304

C85 | 186 | 62 | 257-318

C11 | 207 | 69 | 231-299

C79 | 225 | 75 | 250-324

C12 | 327 | 109 | 220-328

C131 | 504 | 168 | 201-368

C150 | 531 | 177 | 227-403

a Functional flagellin variants were identified in a swarming agar motility screen with non-motile S. typhimurium SJW134 cells expressing randomly generated internal deletion variants of phase 1 flagellin (Swiss-Prot ID number P06179). The amino acid sequence numbering system does not include the first N-terminal methionine that is post-translationally removed and ends at residue 494.

b Total number of deleted nucleotides in the mutated fliC gene encoded in each modified pTH890 plasmid.

c Total number of internally deleted amino acid residues in each corresponding flagellin internal deletion variant that is encoded by the mutated fliC gene.

d Sequence identity of N- and C-terminal amino acid residues in the internally deleted peptide region of flagellin.

β-strands. The larger C12 deletion both started and ended at or near loop regions in

the D3 and D2a domains. The larger C131 deletion started after a D3 β-strand, at the

start of the D3 helical region and stopped at the N-terminal end of a D2b β-strand.

The largest C150 variant deletion started near the N-terminal end of a D3 β-strand

and stopped in a loop region near the N-terminal end of a D1 domain α-helical

region.

However, examples of deletions that started or ended in the middle of α-helix

and β-strand secondary structures were also observed. The C3, C4 and C9 variants,

with relatively small deletions, had only partial removal of the subdomain D2a

α-helix. The C3 and C11 variants had significant partial deletions of β-strands in the

deletion start regions. The C9 variant deletion started in the N-terminal region of a

β-strand following a loop region in the D3 domain. The C79 and C89 deletions

started in the C-terminal region of a D3 β-strand. The SJW46 variant deletion started

in the middle of the D3 domain helix 4-5 and ended in the middle of the subdomain

D2a helix 6. The SJW61 variant deletion started in the middle of subdomain D2a

β-strand 3 and ended in the middle of domain D3 β-strand 11. Thus, some of the

functional deletions started and ended in peptide regions with extensive hydrogen

bonding.
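The sequence parameters in Table 2.2 are internally related: an in-frame deletion should remove three nucleotides per residue, and the number of deleted residues should equal the length of the listed residue range. The sketch below, which reproduces only a subset of the table rows and uses a hypothetical helper `check_row`, shows one way to cross-check the tabulated values.

```python
# (clone, deleted nucleotides, deleted residues, first residue, last residue)
# Values are copied from Table 2.2 as printed; only a subset of rows is shown.
ROWS = [
    ("C3", 36, 12, 282, 293),
    ("C4", 99, 33, 261, 293),
    ("C9", 141, 47, 245, 291),
    ("C42", 66, 22, 266, 288),
    ("C21", 103, 34, 266, 299),
    ("C12", 327, 109, 220, 328),
    ("C150", 531, 177, 227, 403),
]


def check_row(clone, nucleotides, residues, first, last):
    """Return a list of inconsistencies found in one table row."""
    problems = []
    if nucleotides != 3 * residues:
        problems.append("nucleotides != 3 x residues")
    if last - first + 1 != residues:
        problems.append("range length != residues")
    return problems


flagged = {row[0]: check_row(*row) for row in ROWS if check_row(*row)}
for clone, problems in flagged.items():
    print(clone, problems)
```

With the values as printed, the C42 entry (range 266-288 spans 23 residues, listed as 22) and the C21 entry (103 nucleotides for 34 residues) are flagged; the table alone does not indicate which of the listed numbers is in error.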

[Figure 2.9: structure images. Recoverable panel labels: 1UCU, C89, C37, C79, C12, SJW46, SJW61.]

Figure 2.9 Homology models of flagellin variants. S. typhimurium wild-type phase 1 flagellin structure (1UCU) and homology model structures of representative flagellin internal deletion variants developed with Modeler software (6v1). Structures are color coded by secondary structure; α-helices are red, β-strands are yellow and random coils are green. The insets show the hypervariable D2 and D3 domain region of wild-type flagellin with the deleted sequence for each variant colored in black.

It should also be noted that in some variants, adjacent anti-parallel β-strands

were removed in pairs, e.g., the D2a domain β-hairpin turn motif in the C12 and C79

variants and another pair of β-strands in the D3 domain of the C9 variant. The C3

and C4 variants did not have any disruption of the β-folium motifs (Figure 2.8, Table

2.3). The C9, C89, C37 and C11 variants had partial deletions of the D3 domain

β-folium. The C79 variant had deletions of parts of both the D3 and D2a β-folium

motifs. The C12 variant had complete deletion of the D3 β-folium and partial deletion

of the D2a β-folium. The C131 variant had complete deletion of the D3 and D2a

β-folium motifs and partial deletion of the D2b β-folium. The C150 variant had

Table 2.3 Summary of deleted secondary structures in internal deletion variants. a

Deletion variant | Deleted residues (sequence) b | End-to-end distance (Å) c | Deleted secondary structure d | Deleted core hydrophobic residues in each domain e

C3 | 12 (282-293) | 21.4 | D3: C-terminal segment of β-strand 11. D2a: N-terminal segment of α-helix 6. | D2a: A287, A292

C4 | 33 (261-293) | 36.2 | D3: C-terminal extended loop region, β-strand 11. D2a: N-terminal segment of α-helix 6. | D3: A263, V279, V281. D2a: A286, A291

C9 | 47 (245-291) | 36.2 | D3: β-strands 9-11. D2a: N-terminal segment of α-helix 6. | D3: V247, V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291

C89 | 49 (250-298) | 32.0 | D3: Short segment of β-strand 9, β-strands 10-11. D2a: Entire α-helix 6. | D3: V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291, L295, L298

C37 | 55 (250-304) | 25.2 | D3: Short segment of β-strand 9, β-strands 10-11. D2a: Entire α-helix 6, C-terminal loop region. | D3: V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291, L295, L298, V300, A304

C11 | 69 (231-299) | 38.4 | D3: Long segment of β-strand 8, β-strands 9-10. D2a: Entire α-helix 6, short loop segment. | D3: A231, V233, V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291, L295, L298, V300

C79 | 75 (250-324) | 28.2 | D3: Short segment of β-strand 9, β-strands 10-11. D2a: Entire α-helix 6, β-strand 12, long segment of β-strand 13. | D3: V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291, L295, L298, V300, A304, V306, M309

C12 | 109 (220-328) | 44.1 | D3: β-strands 7-11. D2a: Entire α-helix 6, β-strands 12-13. | D3: L220, A231, V233, V247, V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291, L295, L298, V300, A304, V306, M309

C131 | 168 (201-368) | 50.9 | D3: Majority of D3 domain, including β-strands 6-11 and most of helix 4-5. D2a: Majority of subdomain, including α-helix 6, β-strands 12-15. D2b: β-strands 16-18. | D3: F202, A206, I216, L220, A231, V233, V247, V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291, L295, L298, V300, A304, V306, M309, A334, I344

C150 | 177 (227-403) | 54.0 | D3: β-strands 8-11. D2a: Most of subdomain, including α-helix 6, β-strands 12-15. D2b: Entire subdomain including β-strands 16-20, up to start of D1 domain C-terminal α-helix. | D3: A231, V233, V247, V249, V256, L258, A259, A262, V278, V281. D2a: A231, V233, V247, V249, V256, L258, A259, A262, V278, V281. D2b: I375, V373, A382, A385, F390, G376

SJW46 | 89 (204-292) | 35.4 | D3: Majority of D3 domain, including β-strands 6-11 and part of helix 4-5. D2a: N-terminal segment of α-helix 6. | D3: A206, I216, L220, A231, V233, V247, V249, V256, L258, A259, A262, V278, V281. D2a: A286, A291

SJW61 | 97 (183-279) | 33.6 | D3: Majority of D3 domain, including β-strands 4-10 and segment of β-strand 11. D2a: Segment of β-strand 3. | D3: F202, A206, I216, L220, A231, V233, V247, V249, V256, L258, A259, A262, V278. D2a: A185, V187

a Secondary structure regions are defined in the Results section. S. typhimurium phase 1 flagellin internal deletion variants include ten representative variants identified in this study and two variants (SJW46, SJW61) previously described by Yoshioka, et al.33

b Total number of deleted amino acid residues in each internal deletion variant and N- and C-terminal ends of deleted region (in parentheses).

c End-to-end distances between the α-carbons of remaining N- and C-terminal residues flanking internal deletions of each deletion variant in the wild-type phase 1 flagellin structure (PDB 1UCU), as measured with the SwissPDBViewer 3.7 software distance measurement tool.

d Description of deleted regions of secondary structure in D2 and D3 domain regions of each flagellin variant, corresponding to inset regions in Figure 2.9.

e Deleted core hydrophobic residues in D2 and D3 domains of each variant.

partial, almost complete deletion of the D3 β-folium and complete deletion of the

D2a and D2b β-folium motifs. The SJW46 and SJW61 variants both had complete

removal of the D3 β-folium, with no disruption of the other D2a and D2b β-folium

motifs.
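Footnote c of Table 2.3 defines the end-to-end distance as the separation between the α-carbons of the two residues that flank a deletion in the wild-type structure. In this study the distances were measured interactively in SwissPDBViewer; the sketch below is only an illustrative alternative that reads α-carbon coordinates from the fixed-column ATOM records of a PDB file. The helper names, the chain identifier "A", and the two coordinate lines are hypothetical and are not taken from PDB 1UCU.

```python
import math


def ca_coordinates(pdb_lines, chain="A"):
    """Map residue number -> (x, y, z) of its alpha-carbon (PDB fixed columns)."""
    coords = {}
    for line in pdb_lines:
        if (line.startswith("ATOM") and line[12:16].strip() == "CA"
                and line[21] == chain):
            resseq = int(line[22:26])
            coords[resseq] = (float(line[30:38]), float(line[38:46]),
                              float(line[46:54]))
    return coords


def end_to_end(coords, first_deleted, last_deleted):
    """Distance between the residues flanking an inclusive deletion range."""
    return math.dist(coords[first_deleted - 1], coords[last_deleted + 1])


def atom_line(serial, resname, resseq, x, y, z, chain="A"):
    """Build a minimal alpha-carbon ATOM record (illustrative helper only)."""
    return (f"ATOM  {serial:>5}  CA  {resname:>3} {chain}{resseq:>4}    "
            f"{x:>8.3f}{y:>8.3f}{z:>8.3f}")


# Made-up coordinates and residue names for the residues flanking a
# 282-293 deletion (residues 281 and 294); NOT the real 1UCU values.
toy_lines = [
    atom_line(1, "ALA", 281, 0.0, 0.0, 0.0),
    atom_line(2, "GLY", 294, 3.0, 4.0, 12.0),
]
toy_distance = end_to_end(ca_coordinates(toy_lines), 282, 293)
print(f"{toy_distance:.1f} angstroms")
```

To apply the same functions to a real structure, the lines of the downloaded coordinate file would be passed in place of `toy_lines`, with the chain identifier adjusted to match that file.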

Larger deletions in the marginally functional C12, C131 and C150 variants

encoded complete removal of the D2a helix 6, most of the D3 domain and large

sections of the D2a (C12) and D2b subdomains (C131, C150) (Table 2.3). Variant

C12 had deletion of subdomain D2a helix 6 and β-strands 12 and 13 in a β-hairpin

motif, similar to the C79 variant, but also had an extensive deletion of most of the

outermost D3 domain, except for an N-terminal segment comprising D3 domain

β-strands 4, 5, the helix 4-5 and β-strand 6. The C131 variant had deletions of

additional β-strands on both the D2 and D3 domains, with almost complete removal

of the outermost D3 domain, except for three successive β-strands 3-5 that extend

from subdomain D2a into domain D3. Most of the D2a subdomain and part of the

D2b domain were also removed, leaving only a partial D2b subdomain including

β-strands 19 and 20. The C150 variant had nearly complete deletion of the D2a and

D2b domains; the remaining peptide included subdomain D2a β-strand 3 and part of

the D3 domain that included β-strands 4-7 and helix 4-5. None of the ten

representative variants in this study had complete deletion of the D3 domain and

corresponding preservation of the D2 domain. However, the previously described

SJW46 mutant had most of the outermost D3 domain deleted (Figure 2.8), similar to

the C131 variant, but with only a minimal deletion of the D2a domain, including

partial deletion of the D2a α-helix. The SJW61 variant had the most complete

deletion of the D3 domain and a relatively minimal disruption of the D2 domain

region, with complete preservation of the subdomain D2a α-helical region (Figure

2.8). Both of these spontaneous mutations resulted in deletions that appear to clip off

the majority of the D3 domain and part of the D2a subdomain, with little disruption to

the D2b subdomain. It is straightforward to envision how the remaining D3 domain

β-strand 11 segment could flip over to join the other end of the remaining β-strand 3

segment in the SJW61 mutant, with minimal disruption to the D2 domain region, as

observed in the predictive model. These two prior “domain-clipping” mutations

suggest a rational engineering approach to deleting and inserting other proteins in the

hypervariable region of flagellin; however, this approach remains to be

experimentally validated.

The end-to-end distances of remaining, undeleted peptide segments, with

respect to the wild-type flagellin structure, were determined for the 12 variants (Table

2.3). These measurements can be considered to be a rough estimate of the possible

degree of structural distortion and rearrangement required to join together the two

remaining peptide segments to yield an intact protein. The end-to-end distances of the

seven relatively motile variants (C3, C4, C9, C11, C37, C79 and C89) ranged from

21.4-38.4 Å, with an average value of 31.1 ± 6.4 Å. The end-to-end distances for the

motile SJW46 and SJW61 variants were also similar, 35.4 Å and 33.6 Å. It is also

interesting to note that the average end-to-end distance is similar to the end-to-end

distance of 26 Å for the N- and C-termini (Ser1-Ala108) of E. coli thioredoxin,

previously inserted in an E. coli flagellin internal deletion variant to yield the hybrid

FliTrx protein.37 The three relatively non-motile C12, C131 and C150 variants,

encoding the largest deletions, had correspondingly larger end-to-end distances of

44-54 Å (Table 2.3). Thus, a rough correlation was apparent for the end-to-end

distance between remaining residues in the original wild-type structure and the

number of deleted residues (Figure 2.10); larger internal deletions tended to result in

larger end-to-end distances. However, this correlation was not entirely linear; the C4

and C9 variants had a local maximum in their end-to-end distances of 36 Å that

coincided with a relatively minimal change in the motility for these same variants. A

second local maximum of 38 Å in the end-to-end distance was apparent for the C11

variant. Variation in the deletion start and stop positions and intrinsic topological

constraints of the folded protein structure probably account for these variations. For

example, removal of an entire β-hairpin could result in a smaller end-to-end distance

between remaining peptide segments than removal of only one strand of this same

structure.
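The summary statistics quoted above can be recomputed directly from the end-to-end distances in Table 2.3. The following sketch assumes that the reported spread is a sample standard deviation (n - 1 denominator), which is the choice that reproduces the quoted 31.1 ± 6.4 Å; the Pearson coefficient is added here only as one possible way to quantify the "rough correlation" and was not reported in the original analysis.

```python
import math
import statistics

# variant: (number of deleted residues, end-to-end distance in angstroms),
# values copied from Table 2.3
DATA = {
    "C3": (12, 21.4), "C4": (33, 36.2), "C9": (47, 36.2), "C89": (49, 32.0),
    "C37": (55, 25.2), "C11": (69, 38.4), "C79": (75, 28.2),
    "C12": (109, 44.1), "C131": (168, 50.9), "C150": (177, 54.0),
    "SJW46": (89, 35.4), "SJW61": (97, 33.6),
}
MOTILE = ["C3", "C4", "C9", "C11", "C37", "C79", "C89"]

motile_distances = [DATA[v][1] for v in MOTILE]
mean_d = statistics.mean(motile_distances)
sd_d = statistics.stdev(motile_distances)  # sample SD (n - 1 denominator)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


deleted = [n for n, _ in DATA.values()]
distance = [d for _, d in DATA.values()]
r = pearson_r(deleted, distance)

print(f"motile variants: {mean_d:.1f} +/- {sd_d:.1f} angstroms")
print(f"Pearson r over all 12 variants: {r:.2f}")
```

A rank-based coefficient could be substituted for `pearson_r` if the non-linearity noted for the C4, C9 and C11 variants is a concern.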

Given that deletions should alter the remaining protein conformation, we

attempted to develop predictive structural models of the twelve deletion variants.

Homology modeling of all 12 variants was performed with the MODELER software

package, using the S. typhimurium flagellin PDB structure (1UCU) as the template

(see Supplemental Data for the model PDB coordinate files). We were able to model

structures of the ten C3, C4, C9, C11, C12, C37, C79, C89, SJW46 and SJW61

variants, often with reasonable accuracy (Figure 2.9), but were unable to model the

C131 and C150 variant structures that contained extensive deletions of both the D2

and D3 domains. Thus, one preliminary result from this in silico experiment was that

the ability of this software algorithm to predict a compact folded domain structure

decreased as the size of the internal deletion increased.
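Each variant sequence submitted for homology modeling is the wild-type sequence with one internal residue range removed. A minimal sketch of that bookkeeping is given below, assuming the numbering convention of Table 2.2 (residue 1 is the first residue after the removed initiator methionine); the function name and the ten-letter toy string are hypothetical and stand in for the 494-residue mature flagellin sequence.

```python
def delete_internal(mature_seq, first, last):
    """Return mature_seq with residues first..last (1-based, inclusive) removed."""
    if not 1 <= first <= last <= len(mature_seq):
        raise ValueError("deletion range lies outside the sequence")
    return mature_seq[: first - 1] + mature_seq[last:]


# Toy sequence only; a real run would use the mature flagellin sequence.
toy = "ABCDEFGHIJ"
variant = delete_internal(toy, 4, 6)  # removes D, E and F
print(variant, len(toy) - len(variant))
```

Applied to the real sequence, `delete_internal(seq, 282, 293)` would give the C3 variant sequence, 12 residues shorter than wild-type.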

[Figure 2.10: plot. X-axis: Number of Deleted Residues. Legend: Normalized Swarming Diameter; Normalized Swimming Speed (offset); Normalized Optical Trapping Force (offset).]

Figure 2.10 Plot of normalized relative cell motility. Plot of normalized relative cell motility of wild-type and ten internal deletion variants of S. typhimurium phase 1 flagellin and corresponding end-to-end distances of remaining peptide segments in the wild-type structure (PDB 1UCU). Normalized motility values and end-to-end distances (Tables 2.3 and 2.4) were plotted as a function of the number of deleted residues for each variant. Three types of cellular motility were determined by agar swarming motility, video microscopy swimming velocity and optical trapping stall force measurements, as described in Materials and Methods. The motility data in Table 2.4 were normalized by computing the ratio of each value relative to the baseline value determined for the SJW134 strain (transformed with pTH890) expressing wild-type flagellin. The larger of the two normalized motility values for zero deleted residues is for the wild-type motile SJW1103 strain. The plotted normalized swimming velocity and trapping force values were vertically offset by 0.4 and 0.8 for clarity.

Conservation of wild-type flagellin secondary structures and hydrophobic

cores in the outer domains of the ten converged variant models was analyzed using

the Swiss-PdbViewer software.65 All ten of the deletion variant models had an intact

D2b domain with respect to its secondary structures. Helix 4-5 in the D3 domain was

intact in most of the models, except for the C12 variant model, where it assumed

more of a loop structure. The model of the C3 variant, with only 12 deleted residues,

indicated that the subdomain D2a α-helix 6 was completely replaced by an extended

loop connecting the D3 domain and D2a subdomain. The corresponding β-strands 4,

5 and 11 in the D3 domain were replaced by loops. The subdomain D2a β-strands 3,

12 and 13 were also replaced by loops. The model of the C4 variant, with 33 residues

deleted primarily from the D3 domain (261-293), showed disruption of the

hydrophobic core region in the D3 domain. The remaining D3 domain β-strands 4, 5,

10 and 11 were disrupted, while the D2a domain secondary structures remained

intact. The C9 variant model, with a deletion of 47 residues (244-291), showed

disruption of β-strands 4, 5 and 11 in the D3 domain and disruption of β-strands 3,

12 and 13 and α-helix 5 in the D2a domain. The C89 variant model with 49 deleted

residues (250-298) had disruption of β-strands 4, 5, 9, 10 and 11 in the D3 domain

and α-helix 6 in the D2a domain. The C37 variant model (55 deleted residues;

249-304) had disruption of all remaining β-strands in the D3 domain and β-strands

3, 12 and 13 and α-helix 6 in the D2a domain. The C11 variant model with 69

deleted residues (230-299) had deletions that were predominantly in the D3 domain,

causing either disruption of the secondary structures or their complete elimination.

The β-strands in the D2a domain appeared to be intact but α-helix 6 was replaced by

a loop. The C79 variant, with 75 residues deleted (249-324), had deletions that

occurred in most of the D3 and D2 domains, disrupting the hydrophobic cores in both

domains. The C79 variant model indicated that β-strands 4, 5, 10 and 11 were

disrupted in the D3 domain and no secondary structures were present in the D2a

domain. The model of the C12 variant did not have preservation of any secondary

structures in the D3 domain while β-strands 12 and 13 and α-helix 6 were disrupted

in the D2a domain.

The remaining outer domain peptides in the C11, C12, and C37 models were

predicted to form large segments of extended, random-coil loops, with minimal

hydrogen bonding. These loop regions also contained significant numbers of solvent-

exposed hydrophobic side chains that did not collapse to form buried hydrophobic

cores. The lack of compact globular structure in the outer domain regions of these

three models suggests that they may not be very accurate. Thus, it cannot be ruled out

that these predicted loop regions may assume other conformations that would result in

more compact, domain-like structures. The model of the SJW46 variant indicated

complete removal of the D3 domain region. The β-strands 12-15 and α-helix 6 in the

D2a region and β-strands 16-20 in the D2b region were preserved. The SJW61 model

also indicated complete removal of the D3 domain region and preservation of the

β-strands 12-15 and α-helix 6 in the D2a region and β-strands 16-20 in the D2b

region.

These preliminary structural models suggest that even a deletion of a few

residues in the outer domains can perturb the folded structure, resulting in decreased

protein and fiber stability and consequently, lowered cellular motility. The modeling

predictions involving extended loop regions and the failure to model the C131 and

C150 variants were probably due to the lack of an appropriate template protein

structure; the MODELER algorithm is very dependent on existence of a known

protein structure to predict unknown structures. More detailed modeling studies with

a different structure prediction algorithm that is not dependent on a single template

are underway.

Purification and transmission electron microscopy analysis of flagella

Although the models and sequence analysis were revealing, we also pursued

experimental measurement of the macroscopic, physiological effects of the deletions

on oligomeric flagellar structure and motility for the ten representative flagellin

internal deletion variants. The presence of intact flagella fibers that were still attached

to cells was verified using transmission electron microscopy (TEM) of both S.

typhimurium cells and media samples in which cells were grown. TEM images of the

wild-type and deletion variant flagella revealed subtle differences (Figure 2.11). TEM

imaging indicated that the C3, C4 and C9 flagella fibers exhibited near wild-type

mechanical stability in cell culture. However, the other variants were substantially

more brittle than the wild-type protein and were prone to breakage. When SJW134

cells expressing the C3, C4, C9, C89, C37, C11 and C79 flagellin variants were

grown with vigorous shaking (225 rpm), flagella were mostly found in the media as

detached, isolated fibers, indicating that they were more readily sheared off the cells

than the wild-type protein. The C12 and C131 variants formed shorter fibers which

were also mostly found as isolated fibers in the media. However, all variants formed

flagella that were still attached to the cells, similar to the wild-type, under shear-free

growth conditions. Flagella filaments with larger internal deletions have previously

been observed to be less mechanically stable than the wild-type flagellin,12; 33; 34 e.g.,

the SJW46, SJW61 and SJW46S S. typhimurium flagellin internal deletion variants.33; 34

Large internal deletions may destabilize the conformation of flagellin, leading to

weaker interactions between flagellin subunits and between flagellins and other

flagellar hook proteins.32 No significant aggregation of the C3, C4, C9, C89, C37,

C11 and C79 variants into bundles was observed, in contrast to the previously

described SJW46 and SJW61 variants, which aggregated into flagellar bundles in the

presence of more than 0.1 M NaCl.33

Figure 2.11 TEM images of flagella fibers. Electron micrographs of flagella fibers or protein aggregates isolated from S. typhimurium SJW134 cells: (a) wild-type cells; (b) C79 variant; (c) and (d) C12 variant; and (e) and (f) C150 variant. The scale bars indicated in the images are as follows: (a) 100 nm, (b) 100 nm, (c) 100 nm, (d) 100 nm, (e) 200 nm and (f) 2 μm.

Flagellin protein monomers and flagellar fibers of each variant were isolated from

bacterial cells by heating or shearing, as described in Materials and Methods and were

characterized by SDS-PAGE and TEM imaging. Analysis by SDS-PAGE revealed

similar extracellular amounts of each flagellin variant (Figure 2.12), with different

molecular masses corresponding to the smaller PCR products for each internally

deleted fliC gene. These results indicated that similar amounts of each flagellin

deletion variant were exported for all ten variants. TEM images of the C12 and C131

variant flagella (Figure 2.11(c), (d)) indicated sharper edges compared to the

wild-type and other variants (Figure 2.11(a), (b)), suggesting a deletion of most of the

outer domains. Electron microscopy of the shear-isolated C150 variant did not

indicate intact flagella fibers but instead revealed large amorphous aggregates of

protein (Figure 2.11(e), (f)) that were not observed for the other nine variants. These

aggregates, along with the SDS-PAGE results, indicate that correct export of the

smallest C150 variant was followed by aggregation and failure to assemble into

fibers, resulting in accumulation in the media. This result is consistent with a previous

result in which expression and secretion of E. coli flagellin internal deletion variants

were similar to wild-type, but assembly into filaments was minimal for the smallest

variant.32

[Figure 2.12: SDS-PAGE gel image. Recoverable lane labels: pTH890, C3, C4, C9, C89, C37, C11, C79, C12, C131, C150. Molecular mass markers: 55.4 kDa and 36.5 kDa.]

Figure 2.12 SDS-PAGE of the FliC and deletion variant proteins. A 10% SDS-PAGE of FliC proteins from wild-type SJW1103 Salmonella strain, pTH890 plasmid and other deletion variants.

Swarming agar motility assay

The physiological function of the ten flagellin internal deletion variants was first

characterized by a timed swarming motility assay performed on dilute 0.3% agar

(Table 2.4). The swarming diameter of each variant expressed in SJW134 S.

typhimurium cells was determined after 6 h; both wild-type SJW1103 and non-motile

SJW134 S. typhimurium cells transformed with the pTH890 plasmid were measured

as controls (Figure 2.13). The measured motility for both controls was very similar,

suggesting that the rate of flagellin expression from the pTH890 plasmid was not a

significantly rate-limiting factor in the production of functional flagella in the

non-motile SJW134 strain. The seven C3, C4, C9, C89, C37, C11 and C79 variants

showed a detectable swarming radius while the C12, C131 and C150 variants did not

show any significant swarming motility within 6 h, similar to the non-motile

SJW134 strain. These latter three variants only showed weak motility after longer

periods of growth, e.g., 12-16 h. All seven of the relatively motile variants showed

significant decreases of 23-58% in their swarming diameters relative to

SJW134+pTH890 cells expressing the wild-type flagellin (Table 2.4, Figure 2.9).

Thus, a detectable loss of function was observed even for the least perturbed C3

variant with a small deletion of 12 residues. Furthermore, the swarming radius was

largely proportional to the extent of deletion (Figure 2.13), although the C4 and C9

variants had similar motilities, despite deletion of an additional 14 residues for the C9

variant. These results were largely consistent with a prior study of E. coli flagellin

deletion variants.32
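The percentage decreases quoted above follow from the mean swarming diameters listed in Table 2.4. The short Python sketch below reproduces that arithmetic. The dictionary and function names are illustrative and not part of the original analysis, and only the table means are used (the standard deviations are ignored).

```python
# Mean swarming diameters (mm) after 6 h, copied from Table 2.4.
SWARM_DIAMETER_MM = {
    "C3": 26.5, "C4": 22.5, "C9": 22.5, "C89": 19.5,
    "C37": 19.2, "C11": 17.2, "C79": 14.5,
}
# Reference value: SJW134 cells carrying pTH890 (wild-type flagellin).
CONTROL_MM = 34.4


def percent_decrease(value, reference):
    """Return the decrease of `value` relative to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


# Percent decrease in swarming diameter for each motile variant.
swarm_decrease = {
    name: percent_decrease(diameter, CONTROL_MM)
    for name, diameter in SWARM_DIAMETER_MM.items()
}

for name, pct in swarm_decrease.items():
    print(f"{name}: {pct:.0f}% smaller swarming diameter than the control")
```

The same helper can be applied to the swimming speed and stall force columns of Table 2.4 by swapping in the corresponding means and control value.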

[Swarming plate images. Panel labels: SJW134, SJW1103, pTH890, C89, C37, C11, C79, C12, C131, C150, SJW134.]

Figure 2.13 Motility assay of internal deletion library. Images of swarming motility agar assays for wild-type controls and selected internal deletion variants of S. typhimurium flagellin. Images were collected after 6 h growth at 30 °C.

Table 2.4 Motility of S. typhimurium SJW134 cells expressing FliC and deletion variants as determined by different methods.

Variant a             Number of deleted   Swarming diameter   Swimming speed   Optical trapping
                      residues b          (mm) c              (µm/s) d         stall force (W) e

SJW1103 (wild-type)   n.a.                36.5 ± 6.5          13.7 ± 3.4       >2.0
SJW134 + pTH890       n.a.                34.4 ± 4.3          11.2 ± 3.0       1.7 ± 0.3
C3                    12                  26.5 ± 1.9          10.5 ± 4.3       1.3 ± 0.3
C4                    33                  22.5 ± 2.6          9.0 ± 2.8        1.2 ± 0.3
C9                    47                  22.5 ± 1.7          10.1 ± 3.8       1.3 ± 0.3
C89                   49                  19.5 ± 1.9          9.3 ± 3.6        1.2 ± 0.3
C37                   55                  19.2 ± 1.7          8.3 ± 3.6        1.0 ± 0.3
C11                   69                  17.2 ± 4.7          9.0 ± 3.0        0.8 ± 0.2
C79                   75                  14.5 ± 2.1          6.0 ± 2.6        0.6 ± 0.3
C12                   109                 9.5 ± 1.7           n.d.             n.d.
C131                  168                 9.2 ± 1.0           n.d.             n.d.
C150                  177                 9.2 ± 0.5           n.d.             n.d.
SJW134 (control)      n.a.                8.2 ± 1.0           n.a.             n.a.

a The non-motile S. typhimurium strain SJW134 was transformed with pTH890

expression plasmids encoding either wild-type phase 1 flagellin or internal deletion

variants shown in Figure 12. The untransformed SJW134 strain was used as a

negative control for swarming agar motility measurements. Experiments in liquid

media were performed using cell cultures grown to mid-log growth phase without

shaking.

b The number of deleted amino acid residues in each flagellin internal deletion

variant, corresponding to the sequence schematic shown in Figure 7.


c Swarming diameter of S. typhimurium SJW134 cells expressing the wild-type and

internal deletion variants of phase 1 flagellin on 0.3% agar media at 30 °C for 6

hours. Each value is the average and standard deviation of five replicates.

d Swimming velocity in aqueous LB media of SJW134 cells expressing wild-type

and internal deletion variants of phase 1 flagellin. Values are the average and

standard deviation of video microscopy measurements performed on 20 different

cells, as described in Materials and Methods.

e Minimum stall force required to optically trap individual SJW134 cells expressing

wild-type and internal deletion variants of phase 1 flagellin in 6.25% glycerol LB

media, as described in Materials and Methods. Values are the average and standard

deviation of 20 measurements.

n.a. Not applicable.

n.d. No data.

Video analysis of swimming motility in dilute aqueous media

A second experimental technique of video microscopy was used to measure the

effects of internal deletions on the swimming ability of cells in aqueous solution. The

swimming rates of both wild-type SJW1103 and non-motile SJW134 S. typhimurium

cells expressing wild-type flagellin and the ten representative variants were

determined in liquid LB media (Table 2.4). For any given population of cells the

swimming rates can vary considerably from individual to individual and the

swimming speed of the same species of bacteria can vary depending on different

conditions.66,67,68 Hence, the average velocity was determined for 20 different cells

expressing each variant under otherwise identical media conditions. Stereotypical

motility behavior was observed for individual cells, with periods of smooth

swimming in one direction interrupted by tumbling, followed by a change in the

direction of movement. The SJW1103 wild-type cells had an average swimming

speed of 13.7 ± 3.2 µm/s, while the SJW134 cells transformed with the pTH890

expression plasmid encoding wild-type flagellin had an average swimming speed of

11.2 ± 3.0 µm/s. The deletion variants showed significant reductions in swimming

speed, with measured values ranging from 10.5 µm/s for C3 (6.3% reduction in

velocity) to 6.0 µm/s for C79 (46% reduction in velocity), relative to the SJW134

cells expressing wild-type flagellin (Table 2.4, Figure 2.9). The C12, C131 and C150

variants did not show any measurable swimming ability in liquid media apart from

random Brownian motion, in agreement with the previous swarming agar motility

studies.

Measurement of stall force in viscous aqueous media by optical trapping

A third technique of optical trapping was employed to measure the relative propulsive

force generated by cells expressing the flagellin deletion variants. Optical traps

(“laser tweezers”) allow manipulation of individual cells through the forces exerted

on a particle by a focused laser beam, as first demonstrated by Ashkin.69,70 Optical

traps have been employed in understanding different bacterial motions and

quantitation of the forces generated by their motility structures such as type IV pili

and endoflagella,71,72,73 and the torque generated by a single flagellum in E. coli.74

Optical traps have also been employed to measure the force generated by motile

eukaryotic cells such as human sperm cells75 and Chlamydomonas sp. algae.76 One of

the major concerns while using optical traps on living systems is the extent of

photodamage to the specimen. The high energy laser light source can cause damage

to the specimens and could eventually lead to their death, a process termed

“opticution”.69 However, studies conducted on living systems have shown that the

near-IR (790-1064 nm) wavelengths are relatively harmless for trapping live cells.69,76,77

Neuman et al. studied the effects of near-IR lasers of different wavelengths on E.

coli77 and concluded that the most damaging wavelength was 870 nm and the least

damaging was 970 nm. Thus, the wavelength of 1064 nm used in this study is

considered to be relatively safe for use with live specimens.

The relative propulsive forces exerted by wild-type S. typhimurium SJW1103

cells and SJW134 cells expressing wild-type flagellin and each of the seven motile

deletion variants (C3, C4, C9, C11, C37, C79 and C89) were characterized by optical

trapping (Table 2.4). These experiments were based on the assumption that all cells

expressing flagellin deletion variants, and grown without shaking, produce similar

numbers and lengths of flagella, as indicated by the TEM images and the similar

amounts of extracellular flagellin protein indicated by SDS-PAGE of the isolated

flagella (Figure 2.10). The average stall force required to stop movement of the cells

in media containing 6.25% glycerol was measured by decreasing the power of the

laser until cells were able to escape the optical trap. Once the cells were optically

trapped, they aligned vertically with their long axis perpendicular to the image plane,

parallel to the laser beam, and rapidly rotated about the Z-axis. As the laser power

was reduced, the motile cells started to move in different directions. Repeated

trapping and releasing did not appear to damage the cells, as indicated by consistent

stall forces measured over time for the same cell. The majority of the wild-type

SJW1103 cells were difficult to trap, even in the relatively viscous glycerol media

and had an average stall force of more than 2.0 W (Table 2.4). The SJW134 cells

transformed with the pTH890 plasmid had an average trapping force of 1.7 W.

Conversely, the non-motile SJW134 cells that lack flagella were not able to escape

the trap even after reducing the laser power to the lowest setting of 0.2 W. Cells

expressing the functional internal deletion variants exhibited decreases in their

measured stall forces ranging from 24% for the C3 variant to 65% for the C79 variant, relative to

the SJW134 cells expressing wild-type flagellin (Table 2.4). A steep initial decrease

in the stall force was followed by a slower rate of decrease with respect to number of

deleted residues (Figure 2.9); the stall force was relatively constant for the C3, C4, C9

and C89 variants despite deletions ranging from 12-49 residues. These observations

were consistent with the prior swimming velocity and agar motility measurements.
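The shape of the stall force trend described above can be checked with simple arithmetic on the Table 2.4 means. The sketch below computes the average change in stall force per deleted residue over three intervals. It treats the SJW134 + pTH890 control as zero deleted residues, which is an assumption made here for illustration. The names `STALL_FORCE_W` and `mean_slope` are hypothetical and not part of the original analysis.

```python
# (construct, deleted residues, mean stall force in W) from Table 2.4.
# The control is assigned 0 deleted residues (an assumption for this sketch).
STALL_FORCE_W = [
    ("SJW134 + pTH890", 0, 1.7),
    ("C3", 12, 1.3),
    ("C4", 33, 1.2),
    ("C9", 47, 1.3),
    ("C89", 49, 1.2),
    ("C37", 55, 1.0),
    ("C11", 69, 0.8),
    ("C79", 75, 0.6),
]


def mean_slope(start, end):
    """Average change in stall force (W) per deleted residue between two constructs."""
    _, n0, f0 = start
    _, n1, f1 = end
    return (f1 - f0) / (n1 - n0)


initial = mean_slope(STALL_FORCE_W[0], STALL_FORCE_W[1])  # control -> C3
plateau = mean_slope(STALL_FORCE_W[1], STALL_FORCE_W[4])  # C3 -> C89
late = mean_slope(STALL_FORCE_W[4], STALL_FORCE_W[7])     # C89 -> C79

for label, slope in (("control to C3", initial),
                     ("C3 to C89", plateau),
                     ("C89 to C79", late)):
    print(f"{label}: {slope:+.4f} W per deleted residue")
```

With these means, the first interval is much steeper than the C3 to C89 interval, which matches the initial drop and plateau described in the text. The comparison uses means only and does not account for the ± 0.2-0.3 W spread of the individual measurements.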

Comparison of motility results determined by different methods

A strong correlation was observed between the number of deleted residues and the

cellular motility, for variants with deletions less than 100 residues, as determined by

three different experimental methods (Figure 2.9). All three methods yielded

comparable results for relative decreases in cellular motility as a function of deleted

residues, although optical trapping appeared to be slightly more sensitive to changes

in the protein-encoded motility function than the other two methods. For example, the

following changes were observed for the least perturbed C3 variant vs. the control

SJW134 strain expressing wild-type flagellin: a 23% decrease in agar swarming

diameter, a 6.3% decrease in the aqueous swimming speed and a 24% decrease in the

minimum stall force measured by optical trapping in 6.25% glycerol media. The C79 variant, with a much

larger deletion of 75 residues, had decreases of 58% in agar swarming diameter, 46%

in the aqueous swimming speed, and 65% in the optical trapping stall force.

Following the large initial decrease in motility observed for the C3 variant, the

motility values were relatively constant for the C3, C4 and C9 variants with deletions

of 12-47 residues (Table 2.4, Figure 2.9). This analysis suggests that the C9 variant

may have a more optimal folded protein structure than the other variants with slightly

smaller or larger deletions. The motility decreased much more rapidly as a function of

number of deleted residues for the C89, C37, C11 and C79 variants encoding

deletions of more than 47 residues. Deletion of more than 100 residues in the C12,

C131 and C150 variants resulted in nearly complete loss of cellular motility.
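One rough way to put a number on the correlation between deletion size and motility is a Pearson coefficient computed from the Table 2.4 means of the seven motile variants. The sketch below is an illustration under that assumption, not the analysis used in this study: it ignores the measurement uncertainties, uses only seven points per assay, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt

# Table 2.4 means for the seven motile variants (deletions of fewer than 100 residues),
# in the order C3, C4, C9, C89, C37, C11, C79.
DELETED_RESIDUES = [12, 33, 47, 49, 55, 69, 75]
ASSAYS = {
    "swarming diameter (mm)": [26.5, 22.5, 22.5, 19.5, 19.2, 17.2, 14.5],
    "swimming speed (um/s)": [10.5, 9.0, 10.1, 9.3, 8.3, 9.0, 6.0],
    "stall force (W)": [1.3, 1.2, 1.3, 1.2, 1.0, 0.8, 0.6],
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Correlation of each motility measure with the number of deleted residues.
correlations = {
    assay: pearson_r(DELETED_RESIDUES, values) for assay, values in ASSAYS.items()
}

for assay, r in correlations.items():
    print(f"{assay}: r = {r:.2f}")
```

A negative coefficient for each assay indicates that motility falls as more residues are deleted. Because the C3, C4 and C9 values form a plateau, a single linear coefficient understates the structure in the data and should be read only as a summary.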

Discussion

Effects of internal deletions on cellular motility and stability of flagella

The remarkable sequence variations in the hypervariable region of bacterial flagellins

make this class of protein an ideal candidate for investigating the structure-function

relationships of internal domains. Previous studies have shown that the hypervariable

middle sequence region of E. coli and S. typhimurium flagellin proteins can be

deleted to varying extents without complete loss of Type III export, folding, self-

assembly and motility functions. The recent availability of a complete structure of the

S. typhimurium flagellin protein15 enabled a more detailed molecular approach to

investigating the structure-motility function relationships of flagellin internal deletion

variants. This study demonstrated that many different functional internal deletion

variants are possible, as long as the deletions are confined to regions of the solvent-

exposed D2 and/or D3 domains, which are not critical for export and self-assembly.

Although this was not an exhaustive study of all possible functional variants, the D0

and D1 domains, including the putative β-switch region in the D1 domain, were not

altered in any of the functional sequences identified. This result is in agreement with

previous studies indicating that the conserved N- and C-terminal D0 and D1 domain

regions are critical for export and flagellar assembly.

Three different physiological measurements indicated that the motility of cells

generally decreased as the size of the outer domains decreased, under a wide range of

viscosity conditions, suggesting that the D2 and D3 domains are important in

stabilizing the proper conformation of the flagellin protein and contributing to its role

in propulsion. Removal of even small peptides from the hypervariable domain

resulted in a measurable decrease in motility, although deletions of up to 75 residues,

or 15% of the entire sequence, still resulted in partial motility. Lowered motility

resulting from internal deletions may be a permissible evolutionary compromise

between colonization of the environment and evasion of host immune responses to

the wild-type flagellin.

The mechanical stability of the flagella fibers decreased as more of the D2 and

D3 domain regions were deleted. The lowered mechanical stability observed may

be due both to decreased stability of the folded monomer and to decreased strength of

intermolecular interactions between the D2 domains in the oligomeric flagella fibers.

It should also be noted that thermostable flagellins from the extremophilic Aquifex

pyrophilus and A. aeolicus eubacterial species78,79,80 have similar numbers of amino

acid residues (~500) to the wild-type S. typhimurium phase 1 flagellin. These

thermostable flagellins have hypervariable regions that may also encode structurally-

stabilizing domains. It can be concluded that the outermost D3 domain “knob” is

important for protein stability and contributes to mechanical stability of oligomeric

flagella and consequently, motility function, under all conditions. Depending on the

species, the outer D2 and D3 domain sequence regions can be greatly reduced in size.

It remains to be determined what structural mechanisms other flagellins with smaller

hypervariable regions use for stability and how they compare in motility relative to

the S. typhimurium flagellin.

Implications for folding, export and assembly

While large deletions significantly destabilized flagellins and their final assembly into

oligomeric flagella, they did not appear to perturb efficient extracellular export. The

formation of large extracellular aggregates instead of flagellar fibers observed for the

C150 variant with the most extensive deletions also raises the question of possible

molecular interactions between the outer D2 and D3 domains and the terminal

FliD/Hap2 chaperone complex during assembly in vivo.

The globular D2a, D2b and D3 domains have hydrophobic cores at their

centers which contribute to the folded stability of the protein.81 Removal of protein

segments also removes portions of these hydrophobic cores, contributing to lowered

stability. The variant end-to-end distances in the wild-type structure, on the order of

the outer diameter of the D3 domain (30 Å), clearly indicate that some form of

conformational rearrangement is required for the remaining polypeptide ends to be

joined together to form a functional protein. Thus, in addition to disruption or

complete loss of hydrophobic core regions, joining the ends of remaining peptide

segments may introduce strain into the variant structures. This may also lower the

folding stability of flagellin monomers and may also interfere with the packing of the

D1 and any remaining D2b domain segments in the oligomeric flagella fiber.

The wild-type D2 and D3 domains have a large proportion of β-strand-rich

structures that may be relatively flexible. These domains probably rearrange from

some type of precursor structure, after export through the 20 Å diameter inner pore of

the flagellum, to fold and assemble into the final oligomeric wild-type structure.

Given the tendency of β-strands to associate, some collapse and rearrangement

of formerly distant segments may occur to give new β-sheet structures. It is also well

known that some β-sheet proteins can undergo conformational rearrangements, e.g.,

silk proteins82 and β-amyloid proteins.83 Thus, some of the remaining peptide

segments may also assume alternative conformations. Rearrangements of the

remaining peptide may also result in generation of relatively unstructured loops that

may further destabilize the folded state. However, it is not obvious from the

preliminary simulated structures for many of the variants with larger deletions how

the remaining segments in each D2 and D3 domain fold, or whether there is even a

single folding solution for these modified domains.

Implications for design of flagellin fusion proteins

Given the extracellular display of flagellins, the hypervariable region would be a

logical place to insert other protein domains of interest, as previously demonstrated

with thioredoxin. Some of the variants analyzed in this study suggest possible

locations for rational insertion of foreign proteins as N- and C-terminal fusions in the

hypervariable region of flagellin. One obvious approach would be to remove the D3

domain and genetically insert other proteins at the exposed ends of the D2a domain.

Other approaches would involve insertion between segments of the D3 domain and

D2a or D2b subdomains. Recovery of motility with otherwise non-functional variants

could potentially be used as a simple selection method to identify functional protein

insertion variants.

GenBank accession numbers

The sequences of the 39 unique internal deletion variants of the S. typhimurium fliC

gene have been deposited in the GenBank database (Table 2.5). The corresponding

DNA sequences can be found in Appendix I and a multiple sequence alignment of

all 39 unique clones can be found in Appendix II.

Table 2.5 The deletion variant plasmids and their GenBank accession numbers.

Plasmid         GenBank Acc. Number

pTH890-C5       EF057752
pTH890-C3       EF057753
pTH890-C108     EF057754
pTH890-C55      EF057755
pTH890-C30      EF057756
pTH890-C110     EF057757
pTH890-C115     EF057758
pTH890-C70      EF057759
pTH890-C112     EF057760
pTH890-C118     EF057761
pTH890-C42      EF057762
pTH890-C182     EF057763
pTH890-C27      EF057764
pTH890-C48      EF057765
pTH890-C33      EF057766
pTH890-C62      EF057767
pTH890-C120     EF057768
pTH890-C131     EF057789
pTH890-C35      EF057769
pTH890-C4       EF057770
pTH890-C174     EF057771
pTH890-C21      EF057772
pTH890-C61      EF057773
pTH890-C47      EF057774
pTH890-C116     EF057775
pTH890-C6       EF057776
pTH890-C14      EF057777
pTH890-C39      EF057778
pTH890-C88      EF057779
pTH890-C167     EF057780
pTH890-C9       EF057781
pTH890-C89      EF057782
pTH890-C170     EF057783
pTH890-C37      EF057784
pTH890-C85      EF057785
pTH890-C11      EF057786
pTH890-C79      EF057787
pTH890-C12      EF057788
pTH890-C150     EF057790

References

1. Armitage, J. P. (1992). Bacterial motility and chemotaxis. Sci Prog 76, 451-77.
2. Berry, R. M. & Armitage, J. P. (1999). The bacterial flagella motor. Adv Microb Physiol 41, 291-337.
3. Bardy, S. L., Ng, S. Y. & Jarrell, K. F. (2003). Prokaryotic motility structures. Microbiology 149, 295-304.
4. Metlina, A. L. (2004). Bacterial and archaeal flagella as prokaryotic motility organelles. Biochemistry (Mosc) 69, 1203-12.
5. Moens, S. & Vanderleyden, J. (1996). Functions of bacterial flagella. Crit Rev Microbiol 22, 67-100.
6. Josenhans, C. & Suerbaum, S. (2002). The role of motility as a virulence factor in bacteria. Int J Med Microbiol 291, 605-14.
7. Berg, H. C. (2003). The rotary motor of bacterial flagella. Annu Rev Biochem 72, 19-54.
8. DeRosier, D. J. (1998). The turn of the screw: the bacterial flagellar motor. Cell 93, 17-20.
9. Jones, C. J. & Aizawa, S. (1991). The bacterial flagellum and flagellar motor: structure, assembly and function. Adv Microb Physiol 32, 109-72.
10. Macnab, R. M. (2004). Type III flagellar protein export and flagellar assembly. Biochim Biophys Acta 1694, 207-17.
11. Macnab, R. M. (2003). How bacteria assemble flagella. Annu Rev Microbiol 57, 77-100.
12. Samatey, F. A., Imada, K., Nagashima, S., Vonderviszt, F., Kumasaka, T., Yamamoto, M. & Namba, K. (2001). Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331-7.
13. Yonekura, K., Maki, S., Morgan, D. G., DeRosier, D. J., Vonderviszt, F., Imada, K. & Namba, K. (2000). The bacterial flagellar cap as the rotary promoter of flagellin self-assembly. Science 290, 2148-52.
14. Yonekura, K., Maki-Yonekura, S. & Namba, K. (2002). Growth mechanism of the bacterial flagellar filament. Res Microbiol 153, 191-7.
15. Yonekura, K., Maki-Yonekura, S. & Namba, K. (2003). Complete atomic model of the bacterial flagellar filament by electron cryomicroscopy. Nature 424, 643-50.
16. Hyman, H. C. & Trachtenberg, S. (1991). Point mutations that lock Salmonella typhimurium flagellar filaments in the straight right-handed and left-handed forms and their relation to filament superhelicity. J Mol Biol 220, 79-88.
17. Kanto, S., Okino, H., Aizawa, S. & Yamaguchi, S. (1991). Amino acids responsible for flagellar shape are distributed in terminal regions of flagellin. J Mol Biol 219, 471-80.
18. Trachtenberg, S. & DeRosier, D. J. (1988). Three-dimensional reconstruction of the flagellar filament of Caulobacter crescentus. A flagellin lacking the outer domain and its amino acid sequence lacking an internal segment. J Mol Biol 202, 787-808.
19. Murthy, K. G., Deb, A., Goonesekera, S., Szabo, C. & Salzman, A. L. (2004). Identification of conserved domains in Salmonella muenchen flagellin that are essential for its ability to activate TLR5 and to induce an inflammatory response in vitro. J Biol Chem 279, 5667-75.
20. Kondoh, H. & Hotani, H. (1974). Flagellin from Escherichia coli K-12: polymerization and molecular weight comparison with Salmonella flagellins. Biochim. Biophys. Acta 336, 117-139.
21. Joys, T. M. (1985). The covalent structure of the phase-1 flagellar filament protein of Salmonella typhimurium and its comparison with other flagellins. J Biol Chem 260, 15758-61.
22. Lagenaur, C. & Agabian, N. (1976). Physical characterization of Caulobacter crescentus flagella. J Bacteriol 128, 435-44.
23. Joys, T. M. (1988). The flagellar filament protein. Can J Microbiol 34, 452-8.
24. Andrade, A., Giron, J. A., Amhaz, J. M., Trabulsi, L. R. & Martinez, M. B. (2002). Expression and characterization of flagella in nonmotile enteroinvasive Escherichia coli isolated from diarrhea cases. Infect Immun 70, 5882-6.
25. Andersen-Nissen, E., Smith, K. D., Strobe, K. L., Barrett, S. L., Cookson, B. T., Logan, S. M. & Aderem, A. (2005). Evasion of Toll-like receptor 5 by flagellated bacteria. Proc Natl Acad Sci USA 102, 9247-52.
26. Honko, A. N. & Mizel, S. B. (2005). Effects of flagellin on innate and adaptive immunity. Immunol Res 33, 83-101.
27. Trachtenberg, S., Fishelov, D. & Ben-Artzi, M. (2003). Bacterial flagellar microhydrodynamics: Laminar flow over complex flagellar filaments, analog archimedean screws and cylinders, and its perturbations. Biophys J 85, 1345-57.
28. Prager, R., Strutz, U., Fruth, A. & Tschape, H. (2003). Subtyping of pathogenic Escherichia coli strains using flagellar (H)-antigens: serotyping versus fliC polymorphisms. Int J Med Microbiol 292, 477-86.

29. Dauga, C., Zabrovskaia, A. & Grimont, P. A. D. (1998). Restriction Fragment Length Polymorphism Analysis of Some Flagellin Genes of Salmonella enterica. J. Clin. Microbiol. 36, 2835-2843.
30. Reid, S. D., Selander, R. K. & Whittam, T. S. (1999). Sequence Diversity of Flagellin (fliC) Alleles in Pathogenic Escherichia coli. J. Bacteriol. 181, 153-160.
31. Wang, L., Rothemund, D., Curd, H. & Reeves, P. R. (2003). Species-Wide Variation in the Escherichia coli Flagellin (H-Antigen) Gene. J. Bacteriol. 185, 2936-2943.
32. Kuwajima, G. (1988). Construction of a minimum-size functional flagellin of Escherichia coli. J Bacteriol 170, 3305-9.
33. Yoshioka, K., Aizawa, S. & Yamaguchi, S. (1995). Flagellar filament structure and cell motility of Salmonella typhimurium mutants lacking part of the outer domain of flagellin. J Bacteriol 177, 1090-3.
34. Mimori-Kiyosue, Y., Yamashita, I., Fujiyoshi, Y., Yamaguchi, S. & Namba, K. (1998). Role of the outermost subdomain of Salmonella flagellin in the filament structure revealed by electron cryomicroscopy. J Mol Biol 284, 521-30.
35. Brown, C. K., Modzelewski, R. A., Johnson, C. S. & Wong, M. K. (2000). A novel approach for the identification of unique tumor vasculature binding peptides using an E. coli peptide display library. Ann Surg Oncol 7, 743-9.
36. Hynonen, U., Westerlund-Wikstrom, B., Palva, A. & Korhonen, T. K. (2002). Identification by flagellum display of an epithelial cell- and fibronectin-binding function in the SlpA surface protein of Lactobacillus brevis. J Bacteriol 184, 3360-7.
37. Lu, Z., Murray, K. S., Van Cleave, V., LaVallie, E. R., Stahl, M. L. & McCoy, J. M. (1995). Expression of thioredoxin random peptide libraries on the Escherichia coli cell surface as functional fusions to flagellin: a system designed for exploring protein-protein interactions. Biotechnology (N Y) 13, 366-72.
38. Tripp, B. C., Lu, Z., Bourque, K., Sookdeo, H. & McCoy, J. M. (2001). Investigation of the 'switch-epitope' concept with random peptide libraries displayed as thioredoxin loop fusions. Protein Eng 14, 367-77.
39. Westerlund-Wikstrom, B. (2000). Peptide display on bacterial flagella: principles and applications. Int J Med Microbiol 290, 223-30.
40. Kumara, M. T., Muralidharan, S. & Tripp, B. C. (2006). Generation and characterization of inorganic and organic nanotubes on bioengineered flagella of mesophilic bacteria. Journal of Nanoscience and Nanotechnology, accepted.
41. Kumara, M. T., Srividya, N., Muralidharan, S. & Tripp, B. C. (2006). Bioengineered Flagella Protein Nanotubes with Cysteine Loops: Self-Assembly and Manipulation in an Optical Trap. Nano Lett. 6, 2121-2129.
42. Kumara, M. T., Tripp, B. C. & Muralidharan, S. (2007). Self-assembly of Metal Nanoparticles and Nanotubes on Bioengineered Flagella Scaffolds. Chemistry of Materials, accepted.
43. Kumara, M. T., Tripp, B. C. & Muralidharan, S. (2007). Exciton Energy Transfer in Self-assembled Quantum Dots on Bioengineered Bacterial Flagella. J. Phys. Chem. C, accepted.
44. Yamaguchi, S., Fujita, H., Ishihara, A., Aizawa, S. & Macnab, R. M. (1986). Subdivision of flagellar genes of Salmonella typhimurium into regions responsible for assembly, rotation, and switching. J Bacteriol 166, 187-93.
45. Yamaguchi, S., Fujita, H., Sugata, K., Taira, T. & Iino, T. (1984). Genetic analysis of H2, the structural gene for phase-2 flagellin in Salmonella. J Gen Microbiol 130, 255-65.
46. Yamaguchi, S., Fujita, H., Taira, T., Kutsukake, K., Homma, M. & Iino, T. (1984). Genetic analysis of three additional fla genes in Salmonella typhimurium. J Gen Microbiol 130, 3339-42.
47. Williams, A. W., Yamaguchi, S., Togashi, F., Aizawa, S. I., Kawagishi, I. & Macnab, R. M. (1996). Mutations in fliK and flhB affecting flagellar hook and filament assembly in Salmonella typhimurium. J Bacteriol 178, 2960-70.
48. Tallant, T., Deb, A., Kar, N., Lupica, J., de Veer, M. J. & DiDonato, J. A. (2004). Flagellin acting via TLR5 is the major activator of key signaling pathways leading to NF-kappa B and proinflammatory gene program activation in intestinal epithelial cells. BMC Microbiol 4, 33.
49. Ryu, J. & Hartin, R. J. (1990). Quick transformation in Salmonella typhimurium LT2. BioTechniques 8, 43-5.
50. Tsai, S. P., Hartin, R. J. & Ryu, J. (1989). Transformation in restriction-deficient Salmonella typhimurium LT2. J Gen Microbiol 135, 2561-7.
51. Dower, W. J., Miller, J. F. & Ragsdale, C. W. (1988). High efficiency transformation of E. coli by high voltage electroporation. Nucleic Acids Res 16, 6127-45.
52. Song, S., Zhang, T., Qi, W., Zhao, W., Xu, B. & Liu, J. (1993). Transformation of Escherichia coli with foreign DNA by electroporation. Chin J Biotechnol 9, 197-201.
53. Amann, E. & Brosius, J. (1985). 'ATG vectors' for regulated high-level expression of cloned genes in Escherichia coli. Gene 40, 183-90.
54. Amann, E., Ochs, B. & Abel, K. J. (1988). Tightly regulated tac promoter vectors useful for the expression of unfused and fused proteins in Escherichia coli. Gene 69, 301-15.
55. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389-402.
56. Chenna, R., Sugawara, H., Koike, T., Lopez, R., Gibson, T. J., Higgins, D. G. & Thompson, J. D. (2003). Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res 31, 3497-500.
57. Kamiya, R. & Asakura, S. (1976). Helical transformations of Salmonella flagella in vitro. J Mol Biol 106, 167-86.

58. Kaplan, W. & Littlejohn, T. G. (2001). Swiss-PDB Viewer (Deep View). Brief Bioinform 2, 195-7.
59. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). The Protein Data Bank. Nucleic Acids Res 28, 235-42.
60. Sali, A. & Blundell, T. L. (1993). Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol 234, 779-815.
61. Samatey, F. A., Imada, K., Nagashima, S., Vonderviszt, F., Kumasaka, T., Yamamoto, M. & Namba, K. (2001). Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331-7.
62. Bateman, A., Coin, L., Durbin, R., Finn, R. D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E. L., Studholme, D. J., Yeats, C. & Eddy, S. R. (2004). The Pfam protein families database. Nucleic Acids Res 32, D138-41.
63. Finn, R. D., Mistry, J., Schuster-Bockler, B., Griffiths-Jones, S., Hollich, V., Lassmann, T., Moxon, S., Marshall, M., Khanna, A., Durbin, R., Eddy, S. R., Sonnhammer, E. L. & Bateman, A. (2006). Pfam: clans, web tools and services. Nucleic Acids Res 34, D247-51.
64. Yonekura, K., Maki-Yonekura, S. & Namba, K. (2005). Building the atomic model for the bacterial flagellar filament by electron cryomicroscopy and image analysis. Structure (Camb) 13, 407-12.
65. Guex, N. & Peitsch, M. C. (1997). SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18, 2714-23.
66. Minamino, T., Imae, Y., Oosawa, F., Kobayashi, Y. & Oosawa, K. (2003). Effect of intracellular pH on rotational speed of bacterial flagellar motors. J Bacteriol 185, 1190-4.
67. Togashi, F., Yamaguchi, S., Kihara, M., Aizawa, S. I. & Macnab, R. M. (1997). An extreme clockwise switch bias mutation in fliG of Salmonella typhimurium and its suppression by slow-motile mutations in motA and motB. J Bacteriol 179, 2994-3003.
68. Mittal, N., Budrene, E. O., Brenner, M. P. & Van Oudenaarden, A. (2003). Motility of Escherichia coli cells in clusters formed by chemotactic aggregation. Proc Natl Acad Sci USA 100, 13259-63.
69. Ashkin, A., Dziedzic, J. M. & Yamane, T. (1987). Optical trapping and manipulation of single cells using infrared laser beams. Nature 330, 769-71.
70. Ashkin, A. & Dziedzic, J. M. (1987). Optical trapping and manipulation of viruses and bacteria. Science 235, 1517-20.
71. Block, S. M., Blair, D. F. & Berg, H. C. (1989). Compliance of bacterial flagella measured with optical tweezers. Nature 338, 514-8.
72. Brehm-Stecher, B. F. & Johnson, E. A. (2004). Single-cell microbiology: tools, technologies, and applications. Microbiol Mol Biol Rev 68, 538-59, table of contents.

73. Maier, B., Potter, L., So, M., Long, C. D., Seifert, H. S. & Sheetz, M. P. (2002). Single pilus motor forces exceed 100 pN. Proc Natl Acad Sci USA 99, 16012-7.
74. Berry, R. M. & Berg, H. C. (1997). Absence of a barrier to backwards rotation of the bacterial flagellar motor demonstrated with optical tweezers. Proc Natl Acad Sci USA 94, 14433-7.
75. Konig, K., Svaasand, L., Liu, Y., Sonek, G., Patrizio, P., Tadir, Y., Berns, M. W. & Tromberg, B. J. (1996). Determination of motility forces of human spermatozoa using an 800 nm optical trap. Cell Mol Biol (Noisy-le-grand) 42, 501-9.
76. McCord, R. P., Yukich, J. N. & Bernd, K. K. (2005). Analysis of force generation during flagellar assembly through optical trapping of free-swimming Chlamydomonas reinhardtii. Cell Motil Cytoskeleton 61, 137-44.
77. Neuman, K. C., Chadd, E. H., Liou, G. F., Bergman, K. & Block, S. M. (1999). Characterization of Photodamage to Escherichia coli in Optical Traps. Biophys. J. 77, 2856-2863.
78. Malapaka, V. R. & Tripp, B. C. (2006). A theoretical model of Aquifex pyrophilus flagellin: implications for its thermostability. J Mol Model (Online) 12, 481-93.
79. Deckert, G., Warren, P. V., Gaasterland, T., Young, W. G., Lenox, A. L., Graham, D. E., Overbeek, R., Snead, M. A., Keller, M., Aujay, M., Huber, R., Feldman, R. A., Short, J. M., Olsen, G. J. & Swanson, R. V. (1998). The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature 392, 353-8.
80. Behammer, W., Shao, Z., Mages, W., Rachel, R., Stetter, K. O. & Schmitt, R. (1995). Flagellar structure and hyperthermophily: analysis of a single flagellin gene and its product in Aquifex pyrophilus. J Bacteriol 177, 6630-7.
81. Dill, K. A. (1990). Dominant forces in protein folding. Biochemistry 29, 7133-55.
82. Bini, E., Knight, D. P. & Kaplan, D. L. (2004). Mapping domain structures in silks from insects and spiders related to protein assembly. J Mol Biol 335, 27-40.
83. Ross, C. A. & Poirier, M. A. (2004). Protein aggregation and neurodegenerative disease. Nat Med 10 Suppl, S10-7.

Appendix A. DNA sequences of the fliC gene deletion variants.

>pTH890-C5

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggacaagttgcaaatgctgatttgacagaggct GGATSPLTGGQVANADLTEA
841 aaagccgcattgacagcagcaggtgttaccggcacagcatctgttgttaagatgtcttat KAALTAAGVTGTASVVKMSY
901 actgataataacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattac TDNNGKTIDGGLAVKVGDDY
961 tattctgcaactcaaaataaagatggttccataagtattaatactacgaaatacactgca YSATQNKDGSISINTTKYTA
1021 gatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaa DDGTSKTALNKLGGADGKTE
1081 gttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaa VVSIGGKTYAASKAEGHNFK
1141 gcacagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaatt AQPDLAEAAATTTENPLQKI
1201 gatgctgctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgt DAALAQVDTLRSDLGAVQNR
1261 ttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagc FNSAITNLGNTVNNLTSARS
1321 cgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctg RIEDSDYATEVSNMSRAQIL
1381 cagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctct QQAGTSVLAQANQVPQNVLS
1441 ttactgcgttaa 1452 LLR*
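
The one-letter protein sequence printed beside each nucleotide row in this appendix can be cross-checked by translating the row with the standard genetic code. The following is a minimal Python sketch of such a check; it is not part of the original work, and the names `CODON_TABLE`, `translate`, `first_row` and `last_row` are illustrative assumptions. It assumes the standard (NCBI table 1) code and that each row starts in frame, which holds for these listings because every row is a multiple of three nucleotides long.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in t, c, a, g order at each of the three positions.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = "".join(dna.lower().split())  # drop whitespace, normalise case
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First and last nucleotide rows of the pTH890-C5 listing above.
first_row = "atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa"
last_row = "ttactgcgttaa"

print(translate(first_row))
print(translate(last_row))
```

A stop codon is reported as `*`, matching the notation used at the end of each listing.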

>pTH890-C3

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggatgtgaaa GGATSPLTGGLPATATEDVK
841 aatgtagcattgacagcagcaggtgttaccggcacagcatctgttgttaagatgtcttat NVALTAAGVTGTASVVKMSY
901 actgataataacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattac TDNNGKTIDGGLAVKVGDDY
961 tattctgcaactcaaaataaagatggttccataagtattaatactacgaaatacactgca YSATQNKDGSISINTTKYTA
1021 gatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaa DDGTSKTALNKLGGADGKTE
1081 gttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaa VVSIGGKTYAASKAEGHNFK
1141 gcacagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaatt AQPDLAEAAATTTENPLQKI
1201 gatgctgctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgt DAALAQVDTLRSDLGAVQNR
1261 ttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagc FNSAITNLGNTVNNLTSARS
1321 cgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctg RIEDSDYATEVSNMSRAQIL
1381 cagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctct QQAGTSVLAQANQVPQNVLS
1441 ttactgcgttaa 1452 LLR*

>pTH890-C108

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggaggctaaa GGATSPLTGGLPATATEEAK
841 gccgcattgacagcagcaggtgttaccggcacagcatctgttgttaagatgtcttatact AALTAAGVTGTASVVKMSYT
901 gataataacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattactat DNNGKTIDGGLAVKVGDDYY
961 tctgcaactcaaaataaagatggttccataagtattaatactacgaaatacactgcagat SATQNKDGSISINTTKYTAD
1021 gacggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagtt DGTSKTALNKLGGADGKTEV
1081 gtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagca VSIGGKTYAASKAEGHNFKA
1141 cagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgat QPDLAEAAATTTENPLQKID
1201 gctgctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttc AALAQVDTLRSDLGAVQNRF
1261 aactccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgt NSAITNLGNTVNNLTSARSR
1321 atcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcag IEDSDYATEVSNMSRAQILQ
1381 caggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctcttta QAGTSVLAQANQVPQNVLSL
1441 ctgcgttaa 1449 LR*

>pTH890-C55

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagaggctaaagccgcattg GGATSPLTGGLPATEAKAAL
841 acagcagcaggtgttaccggcacagcatctgttgttaagatgtcttatactgataataac TAAGVTGTASVVKMSYTDNN
901 ggtaaaactattgatggtggtttagcagttaaggtaggcgatgattactattctgcaact GKTIDGGLAVKVGDDYYSAT
961 caaaataaagatggttccataagtattaatactacgaaatacactgcagatgacggtaca QNKDGSISINTTKYTADDGT
1021 tccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctatt SKTALNKLGGADGKTEVVSI
1081 ggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgat GGKTYAASKAEGHNFKAQPD
1141 ctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttg LAEAAATTTENPLQKIDAAL
1201 gcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgct AQVDTLRSDLGAVQNRFNSA
1261 attaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagat ITNLGNTVNNLTSARSRIED
1321 tccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggt SDYATEVSNMSRAQILQQAG
1381 acctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1440 TSVLAQANQVPQNVLSLLR*

>pTH890-C30

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccggttgcaaatgctgatttgacagaggctaaagccgcattgaca GGATSPVANADLTEAKAALT
841 gcagcaggtgttaccggcacagcatctgttgttaagatgtcttatactgataataacggt AAGVTGTASVVKMSYTDNNG
901 aaaactattgatggtggtttagcagttaaggtaggcgatgattactattctgcaactcaa KTIDGGLAVKVGDDYYSATQ
961 aataaagatggttccataagtattaatactacgaaatacactgcagatgacggtacatcc NKDGSISINTTKYTADDGTS
1021 aaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggt KTALNKLGGADGKTEVVSIG
1081 ggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctg GKTYAASKAEGHNFKAQPDL
1141 gcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggca AEAAATTTENPLQKIDAALA
1201 caggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctatt QVDTLRSDLGAVQNRFNSAI
1261 accaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattcc TNLGNTVNNLTSARSRIEDS
1321 gactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacc DYATEVSNMSRAQILQQAGT
1381 tccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1437 SVLAQANQVPQNVLSLLR*

>pTH890-C110

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtgatttgacagaggctaaagccgcattgacagca GGATSPLTGDLTEAKAALTA
841 gcaggtgttaccggcacagcatctgttgttaagatgtcttatactgataataacggtaaa AGVTGTASVVKMSYTDNNGK
901 actattgatggtggtttagcagttaaggtaggcgatgattactattctgcaactcaaaat TIDGGLAVKVGDDYYSATQN
961 aaagatggttccataagtattaatactacgaaatacactgcagatgacggtacatccaaa KDGSISINTTKYTADDGTSK
1021 actgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggt TALNKLGGADGKTEVVSIGG
1081 aaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcg KTYAASKAEGHNFKAQPDLA
1141 gaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacag EAAATTTENPLQKIDAALAQ
1201 gttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctattacc VDTLRSDLGAVQNRFNSAIT
1261 aacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattccgac NLGNTVNNLTSARSRIEDSD
1321 tacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctcc YATEVSNMSRAQILQQAGTS
1381 gttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1434 VLAQANQVPQNVLSLLR*

>pTH890-C115

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggatgtggca GGATSPLTGGLPATATEDVA
841 gcaggtgttaccggcacagcatctgttgttaagatgtcttatactgataataacggtaaa AGVTGTASVVKMSYTDNNGK
901 actattgatggtggtttagcagttaaggtaggcgatgattactattctgcaactcaaaat TIDGGLAVKVGDDYYSATQN
961 aaagatggttccataagtattaatactacgaaatacactgcagatgacggtacatccaaa KDGSISINTTKYTADDGTSK
1021 actgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggt TALNKLGGADGKTEVVSIGG
1081 aaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcg KTYAASKAEGHNFKAQPDLA
1141 gaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacag EAAATTTENPLQKIDAALAQ
1201 gttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctattacc VDTLRSDLGAVQNRFNSAIT
1261 aacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattccgac NLGNTVNNLTSARSRIEDSD
1321 tacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctcc YATEVSNMSRAQILQQAGTS
1381 gttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1434 VLAQANQVPQNVLSLLR*

>pTH890-C70

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgactcaagttgcaaatgctgatttgacagaggctaaagccgcattgacagca GGATQVANADLTEAKAALTA
841 gcaggtgttaccggcacagcatctgttgttaagatgtcttatactgataataacggtaaa AGVTGTASVVKMSYTDNNGK
901 actattgatggtggtttagcagttaaggtaggcgatgattactattctgcaactcaaaat TIDGGLAVKVGDDYYSATQN
961 aaagatggttccataagtattaatactacgaaatacactgcagatgacggtacatccaaa KDGSISINTTKYTADDGTSK
1021 actgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggt TALNKLGGADGKTEVVSIGG
1081 aaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcg KTYAASKAEGHNFKAQPDLA
1141 gaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacag EAAATTTENPLQKIDAALAQ
1201 gttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctattacc VDTLRSDLGAVQNRFNSAIT
1261 aacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattccgac NLGNTVNNLTSARSRIEDSD
1321 tacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctcc YATEVSNMSRAQILQQAGTS
1381 gttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1434 VLAQANQVPQNVLSLLR*

>pTH890-C112

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtacagaggctaaagccgcattgacagcagcaggt GGATSPLTGTEAKAALTAAG
841 gttaccggcacagcatctgttgttaagatgtcttatactgataataacggtaaaactatt VTGTASVVKMSYTDNNGKTI
901 gatggtggtttagcagttaaggtaggcgatgattactattctgcaactcaaaataaagat DGGLAVKVGDDYYSATQNKD
961 ggttccataagtattaatactacgaaatacactgcagatgacggtacatccaaaactgca GSISINTTKYTADDGTSKTA
1021 ctaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggtaaaact LNKLGGADGKTEVVSIGGKT
1081 tacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcggaagcg YAASKAEGHNFKAQPDLAEA
1141 gctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacaggttgac AATTTENPLQKIDAALAQVD
1201 acgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctattaccaacctg TLRSDLGAVQNRFNSAITNL
1261 ggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattccgactacgcg GNTVNNLTSARSRIEDSDYA
1321 accgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctccgttctg TEVSNMSRAQILQQAGTSVL
1381 gcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1428 AQANQVPQNVLSLLR*

>pTH890-C118

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgagggtgttacc GGATSPLTGGLPATATEGVT
841 ggcacagcatctgttgttaagatgtcttatactgataataacggtaaaactattgatggt GTASVVKMSYTDNNGKTIDG
901 ggtttagcagttaaggtaggcgatgattactattctgcaactcaaaataaagatggttcc GLAVKVGDDYYSATQNKDGS
961 ataagtattaatactacgaaatacactgcagatgacggtacatccaaaactgcactaaac ISINTTKYTADDGTSKTALN
1021 aaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgct KLGGADGKTEVVSIGGKTYA
1081 gcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcggaagcggctgct ASKAEGHNFKAQPDLAEAAA
1141 acaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacgtta TTTENPLQKIDAALAQVDTL
1201 cgttctgacctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaac RSDLGAVQNRFNSAITNLGN
1261 accgtaaacaacctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaa TVNNLTSARSRIEDSDYATE
1321 gtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcag VSNMSRAQILQQAGTSVLAQ
1381 gcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1422 ANQVPQNVLSLLR*

>pTH890-C42

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgttgacagaggctaaagccgcattgacagcagcaggtgttacc GGATSPLTEAKAALTAAGVT
841 ggcacagcatctgttgttaagatgtcttatactgataataacggtaaaactattgatggt GTASVVKMSYTDNNGKTIDG
901 ggtttagcagttaaggtaggcgatgattactattctgcaactcaaaataaagatggttcc GLAVKVGDDYYSATQNKDGS
961 ataagtattaatactacgaaatacactgcagatgacggtacatccaaaactgcactaaac ISINTTKYTADDGTSKTALN
1021 aaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgct KLGGADGKTEVVSIGGKTYA
1081 gcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcggaagcggctgct ASKAEGHNFKAQPDLAEAAA
1141 acaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacgtta TTTENPLQKIDAALAQVDTL
1201 cgttctgacctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaac RSDLGAVQNRFNSAITNLGN
1261 accgtaaacaacctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaa TVNNLTSARSRIEDSDYATE
1321 gtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcag VSNMSRAQILQQAGTSVLAQ
1381 gcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1422 ANQVPQNVLSLLR*

>pTH890-C182

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttaaagccgcattgacagcagcaggtgttaccggcacagca GGATSPLKAALTAAGVTGTA
841 tctgttgttaagatgtcttatactgataataacggtaaaactattgatggtggtttagca SVVKMSYTDNNGKTIDGGLA
901 gttaaggtaggcgatgattactattctgcaactcaaaataaagatggttccataagtatt VKVGDDYYSATQNKDGSISI
961 aatactacgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactgggt NTTKYTADDGTSKTALNKLG
1021 ggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaa GADGKTEVVSIGGKTYAASK
1081 gccgaaggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaaccacc AEGHNFKAQPDLAEAAATTT
1141 gaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgac ENPLQKIDAALAQVDTLRSD
1201 ctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaac LGAVQNRFNSAITNLGNTVN
1261 aacctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaac NLTSARSRIEDSDYATEVSN
1321 atgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccag MSRAQILQQAGTSVLAQANQ
1381 gttccgcaaaacgtcctctctttactgcgttaa 1413 VPQNVLSLLR*

>pTH890-C27

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggatgtggca GGATSPLTGGLPATATEDVA
841 tctgttgttaagatgtcttatactgataataacggtaaaactattgatggtggtttagca SVVKMSYTDNNGKTIDGGLA
901 gttaaggtaggcgatgattactattctgcaactcaaaataaagatggttccataagtatt VKVGDDYYSATQNKDGSISI
961 aatactacgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactgggt NTTKYTADDGTSKTALNKLG
1021 ggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaa GADGKTEVVSIGGKTYAASK
1081 gccgaaggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaaccacc AEGHNFKAQPDLAEAAATTT
1141 gaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgac ENPLQKIDAALAQVDTLRSD
1201 ctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaac LGAVQNRFNSAITNLGNTVN
1261 aacctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaac NLTSARSRIEDSDYATEVSN
1321 atgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccag MSRAQILQQAGTSVLAQANQ
1381 gttccgcaaaacgtcctctctttactgcgttaa 1413 VPQNVLSLLR*

>pTH890-C48

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK
61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS
121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG
181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG
241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA
301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL
361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ
421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL
481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK
541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN
601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD
661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT
721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA
781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaaccggcacagcatct GGATSPLTGGLPATATGTAS
841 gttgttaagatgtcttatactgataataacggtaaaactattgatggtggtttagcagtt VVKMSYTDNNGKTIDGGLAV
901 aaggtaggcgatgattactattctgcaactcaaaataaagatggttccataagtattaat KVGDDYYSATQNKDGSISIN
961 actacgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggc TTKYTADDGTSKTALNKLGG
1021 gcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagcc ADGKTEVVSIGGKTYAASKA
1081 gaaggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaa EGHNFKAQPDLAEAAATTTE
1141 aacccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctg NPLQKIDAALAQVDTLRSDL
1201 ggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaac GAVQNRFNSAITNLGNTVNN
1261 ctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatg LTSARSRIEDSDYATEVSNM
1321 tctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggtt SRAQILQQAGTSVLAQANQV
1381 ccgcaaaacgtcctctctttactgcgttaa 1410 PQNVLSLLR*

>pTH890-C33

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc L TQASRN ANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQS DLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTI QVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA 781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggatgtgtct GGATSPLT GGLPATATEDVS 841 gttgttaagatgtcttatactgataataacggtaaaactattgatggtggtttagcagtt VVKMSYTDNNGKTIDGGLAV 901 aaggtaggcgatgattactattctgcaactcaaaataaagatggttccataagtattaat KVGDDYY SATQNKDGS I S IN 961 actacgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggc T TKYTADDGTS KTALNKLGG 1021 gcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagcc ADGKTEVVSIGGKTY AASKA 1081 gaaggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaa EGHNFKAQPDLAEAAATTTE 1141 aacccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctg NPLQKIDA ALAQVDTLRSDL 1201 ggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaac GAVQNRFN SAI TNLGNTVNN 1261 ctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatg LTSARSRIEDSDYAT EVSNM 1321 tctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggtt SRAQ I 
LQQAGTSVLAQANQV 1381 ccgcaaaacgtcctctctttactgcgttaa 1410 PQNVLSLLR*


>pTH890-C62

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQV I NTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc S Q S ALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSI QAEITQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctqaccatccagqttggtgccaacgacqqtgaaactatcgatatcqatctq DNTLT I QVGANDGET I D I DL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDT LNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAA TVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATG LGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDT TGKY YAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA 781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaacttctgttgttaag GGAT S P LTGG L PATATSVVK 841 atgtcttatactgataataacggtaaaactattgatggtggtttagcagttaaggtaggc MSYTDNNGKTIDGGLAVKVG 901 gatgattactattctgcaactcaaaataaagatggttccataagtattaatactacgaaa DDYYSATQNKDGSISINTTK 961 tacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggc YTADDGTSKTALNKLGGADG 1021 aaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcac KTEVVSIGGKTYAASKAEGH 1081 aactttaaaqcacagcctqatctqqcqqaaqcqqctqctacaaccaccqaaaacccqctq NFKAQPDLAEAAATTTENPL 1141 cagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggtg'cggta Q K I DAAL AQ VDTLRSDLGAV 12 01 cagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttct QNRFNSAI TNLGNT VNNLTS 12 61 gcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcg ARSRIEDSDYATEVSNMSRA 1321 cagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaac 
QILQQAGTSVLAQANQVPQN 1381 gtcctctctttactgcgttaa 1401 V L S L L R *

>pTH890-C120

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag N EIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQT LGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTN GEVTLA 781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggttaagatg GGATS PLTGGLPAT ATEVKM 841 tcttatactgataataacggtaaaactattgatggtggtttagcagttaaggtaggcgat SYTDNNGKTIDGG LAVKVGD 901 gattactattctgcaactcaaaataaagatggttccataagtattaatactacgaaatac DYYSATQNKDGSISINTTKY 961 actgcagatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaa TADDGTSKTALNKLGGADGK 1021 accgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaac T EVVS IGGKTYAA SKAEGHN 1081 tttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcag FKAQPDLAEA AATTTENPLQ 1141 aaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggtgcggtacag KIDAALAQVDTLRSDLG AVQ 12 01 aaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcc NRFN SAI TNLGNTVNN LTSA 1261 cgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcag
RSRIEDSDYATEVSNMSRAQ 1321 attctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtc I L QQAGTSVLAQANQVPQNV 1381 ctctctttactgcgttaa 1398 L S L L R *

>pTH890-C35

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa

MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASR NANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNE INNNLQRV RELAVQSA 301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg N S T N SQSDLD S I QAE I TQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVS VDKTNGEVTLA 781 ggcgccgcattgacagcagcaggtgttaccggcacagcatctgttgttaagatgtcttat GAALTAAGVTGTASVVKMSY 841 actgataataacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattac TDNNGKTIDGGLAVKVGDDY 901 tattctgcaactcaaaataaagatggttccataagtattaatactacgaaatacactgca YSATQNKDGSISINTTKYTA 961 gatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaa DDGTSKTALNKLGGADGKTE 1021 gttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaa V V S I GGKTYAASKAE GHNFK 1081 gcacagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaatt AQPDLAEAAATTTENPLQKI 1141 gatgctgctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgt DAALAQVDTLRSDLGAVQNR 12 01 ttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagc F N S A I T NLGNTVNNLTSARS 1261 cgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctg RIEDSDYATEVSNMSRAQIL 1321
cagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctct QQAGTSVLAQANQVPQNVLS 1381 ttactgcgttaa 1392 L L R *

>pTH890-C4


1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLR INS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt A KD DA AGQA I AN R F TAN I KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI SIAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg N S TN SQSDLDS I Q A E I TQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag N E IDRV SGQTQFNGVKVLAQ 421 gacaacaccctgaccatcGaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat V SDTAATVTGYADTT IALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAK VTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA 781 ggcgcattgacagcagcaggtgttaccggcacagcatctgttgttaagatgtcttatact GALTAAGVTGTASVVKMSYT 841 gataataacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattactat DNNGKTIDGGLAVK VGDDYY 901 tctgcaactcaaaataaagatggttccataagtattaatactacgaaatacactgcagat S ATQNKDGSI S INTTKYTAD 961 gacggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagtt DGTSKTALNKLGGADGKTEV 1021 gtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagca VSIGGKTYAASKAEGHNFKA 1081 cagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgat QPDLA EAAATTTENPLQKID 1141 gctgctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttc AALAQVDTLRSDLGAVQNRF 12 01 aactccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgt NSAIT NLGNTVNNLTSARSR 12 61
atcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcag IEDSDYATE VSNMSRAQILQ 1321 caggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctcttta QAGTSVLAQAN.QVPQNVLSL 1381 ctgcgttaa 1389 L R *

>pTH890-C174

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa

MA QVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct A LNE INNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLT I QVGANDGE T I D I DL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA 781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggatgtgaaa GGAT S PLTGG L PATATEDVK 841 gataataacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattactat DNNGKTIDGGLAVKVGDDYY 901 tctgcaactcaaaataaagatggttccataagtattaatactacgaaatacactgcagat SATQ NKDGS I S INTTKYTAD 961 gacggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagtt DGTS KTALNKLGGADGKTEV 1021 gtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagca VSIGGKTYAASKAEGHNFKA 1081 cagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgat QPDLAEAAATTTENPLQK ID 1141 gctgctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttc AALAQVDTLRSDLGAVQNRF 12 01 aactccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgt NSAITNLGNTVNNLTSARSR 12 61 atcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcag IEDSDYATEVSNMSRAQILQ 1321
caggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctcttta QAGTSVLAQANQVPQNVLSL 1381 ctgcgttaa 1389 L R *


>pTH890-C21

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc S QS ALGTAIERLSSGLRI NS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIA .NRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg N STN SQSDLDS I QAE I TQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETI. DIDL 481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATV TGYADTT IALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA 781 ggcggtgcgacttccccggttaccggcacagcatctgttgttaagatgtcttatactgat GGATSPVTGTASVVKMSYTD 841 aataacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattactattct NNGKTIDGGLAVKVGDDYY S 901 gcaactcaaaataaagatggttccataagtattaatactacgaaatacactgcagatgac ATQNKDGS I S I NTTKYTA DD 961 ggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtt GTS KTALNKLGGA DGKTEVV 1021 tctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacag SIGGKTYAASKAEGHNFKAQ 1081 cctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgct PDLAEAAATTTENPLQKI DA 1141 gctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaac ALAQVDTL RSDLGAVQNRFN 1201 tccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatc SAITNLGNTVNNLTSARSRI 12 61 gaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcag EDSDYAT EVSNMSRAQILQQ 1321 gccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactg AG 
TSVLAQANQVPQN VLSLL 1381 cgttaa 1386 R *

>pTH890-C61

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLS LLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc L T QASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NE I D RV S GQ TQ FN GVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTL GLDTL NVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTG YADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacgggggqaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaacaagttgcaaatgctgatttgacagaggctaaagccgca GKDGYYEQVANADLTEAKAA 781 ttgacagcagcaggtgttaccggcacagcatctgttgttaagatgtcttatactgataat LTAAGVT GTASVVKMSYTDN 841 aacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattactattctgca NGKTIDGGLAVKVGDDYYSA 901 actcaaaataaagatggttccataagtattaatactacgaaatacactgcagatgacggt TQNKDG S I S INTTKYTADDG 961 acatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttct T S KTALNKLGGADGK TEVV S 1021 attggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcct IGGKTYAASKAEGHNFKAQP 1081 gatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgct DLAEAAATTTENPLQKI DAA 1141 ttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactcc LAQVDTLRSDLGAVQNRFN S 12 01 gctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaa AITNLGNTVNNLTSARSRIE 12 61
gattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggcc DS'DYATEVSNMSRAQILQQA 1321 ggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgt GTSVLAQANQVPQNVLSLLR 1381 taa 1383 *

>pTH890-C47

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa


MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc S Q S A LGTAIER LSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNE INNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGA NDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTL GLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTG YADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA 781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgaggatgtgaaa GGATSPLTGGLPATATEDVK 841 actattgatggtggtttagcagttaaggtaggcgatgattactattctgcaactcaaaat T I DGGLAV KVGDDYY SATQN 901 aaagatggttccataagtattaatactacgaaatacactgcagatgacggtacatccaaa KDGSISINTTKYTADDGTSK 961 actgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggt TALNKL GGADGKTEVVS I G G 1021 aaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctqqcg KTYA ASKAEGHNFKAQPDL A 1081 gaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacag EAAATTTENPLQKIDAALAQ 1141 gttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctattacc VDTLRSDLGAVQNRFNSAIT 12 01 aacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattccgac N L G NTVNNLTSARSRIEDSD 1261 tacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctcc YATEVSNMSRAQIL QQAGTS 1321
gttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 13 74 V.LAQANQVPQNVLSLLR*

>pTH890-C116

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQN NL NK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS

121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAG QAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNAND GI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag N E I DRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLT I QVGANDGE T I D I DL 4 81 aaqcagatcaactctcaqaccctgqqtctqqatacqctqaatqtqcaacaaaaatataaq KQINSQTLGLDTLNVQQK YK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKI DGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT- 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDGYYEVSVDKTNGEVTLA 781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcagataataacggtaaa GGATS PLTGGLPATADNNGK 841 actattgatggtggtttagcagttaaggtaggcgatgattactattctgcaactcaaaat TIDGGLAVKVGDDYYSATQN 901 aaagatggttccataagtattaatactacgaaatacactgcagatgacggtacatccaaa KDGS I S INTTKYTADDGTSK 961 actgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggt T'ALNKLGGADGKTEVV S I GG 1021 aaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcg KTYAASKAEGHNFKAQPDLA 1081 gaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacag EAAAT TTENPLQKIDAAL AQ 1141 gttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctattacc VDTLRS DLGAVQNRFNSAIT 12 01 aacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattccgac NLGNTVNNLTSARSRI EDSD 12 61 tacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctcc YATEVSNMSRAQILQQAGTS 1321 gttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1374 VLAQANQVPQNVLSLLR *

>pTH890-C6

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQ NNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQ A IANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQ ASRNANDGI S IAQTTEG


241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINN NLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEI D RVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg D N TLTIQVGANDGETIDIDL 4 81 aagcaqatcaactctcagaccctgggtctggatacqctqaatgtgcaacaaaaatataaq KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgtt GKDGYYEVSVDKTNGEVTLV 781 accggcacagcatctgttgttaagatgtcttatactgataataacggtaaaactattgat TG TASVVKMSYTDNNGKTID 841 ggtggtttagcagttaaggtaggcgatgattactattctgcaactcaaaataaagatggt GGLAVKVGDDYY SAT QNKDG 901 tccataagtattaatactacgaaatacactgcagatgacggtacatccaaaactgcacta S I S INTTKYTADDGTSKTAL 961 aacaaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttac NKLGGADGKTEVVSIGGKTY 1021 gctgcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcggaagcggct AASKAEGHNFKAQPDLAEAA 1081 gctacaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacg ATTTENPLQKI DAALAQVDT 1141 ttacgttctgacctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggc LRSDLGAVQNRFNSAITNLG 12 01 aacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattccgactacgcgacc NTVNNLTSARSRIEDSDYAT 1261 gaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcg EVSNMSRAQILQQAGTSVLA 1321 caggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1365 QANQVPQNVLSLLR*

>pTH890-C14

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAA GQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQ ASRN ANDGI S.I AQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNE INNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg

N S TN.SQSDLDS I QAE I TQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVL AQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtggttaccggc GKDGYYEVSVDKTNGEVVTG 781 acagcatctgttgttaagatgtcttatactgataataacggtaaaactattgatggtggt TASVVKMSYTDNNGKTIDGG 841 ttagcagttaaggtaggcgatgattactattctgcaactcaaaataaagatggttccata LAVK VGDDYY SATQNKDGS I 901 agtattaatactacgaaatacactgcagatgacggtacatccaaaactgcactaaacaaa S INTTKYTADDGTSK TALNK 961 ctgggtggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgca LGGADGKTEVVS IGGKTYAA 1021 agtaaagccgaaggtcacaactttaaagcacagcctgatctggcggaagcggctgctaca SKAEGHNFKAQPDLAEAAAT 1081 accaccgaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgt TTENPLQKI DAALAQVDTLR 1141 tctgacctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaacacc SDLGAVQNRFNSAITNLGNT 12 01 gtaaacaacctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtt VNNLTSARSRIEDSDYATEV 1261 tccaacatgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcg S NM S RAQ I LQ QAGT S VLAQA 1321 aaccaggttccgcaaaacgtcctctctttactgcgttaa 1359 NQVP Q NVLSLLR*

>pTH890-C39

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S I AQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNE INNNLQRV RELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSI QAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEID RVSGQT QFNGVKVLAQ

421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGG TDQKI DGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgcattgacagcagcaggtgttaccggcaca GKDGYYEVSVALTAAGV TGT 781 gcatctgttgttaagatgtcttatactgataataacggtaaaactattgatggtggttta ASVVKMSYTDNNGKTIDGGL 841 gcagttaaggtaggcgatgattactattctgcaactcaaaataaagatggttccataagt AV KV GDDYYSATQNKDGS I S 901 attaatactacgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactg INTTKYTADDGTS KTALNKL 961 ggtggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagt GGADGKTEVVS I GGKTYAAS 1021 aaagccgaaggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaacc KAEGHNFKAQPDLA EAAATT 1081 accgaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttct TENPLQKIDAALAQVDTLRS 1141 gacctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaacaccgta DLGAVQNRFNSAITNLGNTV 12 01 aacaacctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttcc NNLTSARSRIEDSDYATEVS 12 61 aacatgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaac NMSRAQILQQAGTSVLAQAN 1321 caggttccgcaaaacgtcctctctttactgcgttaa 1356 QVP .QNVLSLLR*

>pTH890-C88

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQ VINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSA LGTAI ERLS SGLRI NS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSD LDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NE I DRVSGQTQFNGV KVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag

KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGG TDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttacagcagcaggtgttaccggcacagcatct GKDGYYEVSVTAAGVTGTAS 781 gttgttaagatgtcttatactgataataacggtaaaactattgatggtggtttagcagtt VVKMSYTDNNGKTIDGGLAV 841 aaggtaggcgatgattactattctgcaactcaaaataaagatggttccataagtattaat KVGDDYY SATQNKDGS I S IN 901 actacgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggc TTKYTADDGTSKTALNKLGG 961 gcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagcc ADGKTEVVSIGGKTYAASKA 1021 gaaggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaa EGHNFKAQPDLAEAAATTTE 1081 aacccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctg NPLQKIDAALAQVDTLRSDL 1141 ggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaac GAVQNRFNSAITNLGNTVNN 12 01 ctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatg LTSARSR IED SDYATEVSNM 1261 tctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggtt SRAQILQQAGTSVLAQANQV 1321 ccgcaaaacgtcctctctttactgcgttaa 1350 PQNVLSLLR*

>pTH890-C167

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQ SA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTL GLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN

601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDD TTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagcattgacagcagcaggtgttaccggcacagcatctgtt G KDGYYEALTAAG VTGTASV 781 gttaagatgtcttatactgataataacggtaaaactattgatggtggtttagcagttaag VKMSY TDNNGKTIDGGLAVK 841 gtaggcgatgattactattctgcaactcaaaataaagatggttccataagtattaatact VGD DYY SATQNKDGS I S INT 901 acgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggcgca TKYTADDGTSKTALNKLGGA 961 gacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaa DGKTEVVSIGGKTYAA SKAE 1021 ggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaaaac GHNFKAQPDLAEAAATTTEN 1081 ccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggt PLQKI DAALAQVDTLRSDLG 1141 gcggtacagaaccgtttcaactc.cgctattaccaacctgggcaacaccgtaaacaacctg AVQN RFNSAITNLGNTVNNL 12 01 acttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtct TSARSRIEDSDYATEVSNMS 1261 cgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccg RAQILQQAGTSVLAQANQVP 1321 caaaacgtcctctctttactgcgttaa 1347 QNVLSLLR*

>pTH890-C9

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggeaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTE G 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQS A 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDS I Q A E I TQR L 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg D NTLTIQVGANDGETIDIDL 481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQ QKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDT AATVTGYADTTIALDN 6 01 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctataaagccgcattgacagcagcaggtgttaccggcacagcatctgtt GKDGYKAALTAAGVTGTASV 781 gttaagatgtcttatactgataataacggtaaaactattgatggtggtttagcagttaag VKMSYTDNNGKTIDGG L AVK 841 gtaggcgatgattactattctgcaactcaaaataaagatggttccataagtattaatact V GDDYYSATQNKDGSISINT 901 acgaaatacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggcgca TKYTADDGTSKTALNKLGGA 961 gacggcaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaa DGKTEVVSIGGKTYAASKAE 1021 ggtcacaactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaaaac GHNFKAQPDLAEAA ATTTEN 1081 ccgctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggt PLQKIDAALAQVDTLRSDLG 1141 gcggtacagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaacctg AVQNRFNSAITNLGNTVNNL 12 01 acttctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtct TSARSRIEDSDYATEVSNMS 12 61 cgcgcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccg RAQILQQAGTSVLAQANQVP 1321 caaaacgtcctctctttactgcgttaa 1347 QNVLSLLR*

>pTH890-C89

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataaoctgaacaaa M A Q VINTNS LSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQAS RNANDGI S IAQTT EG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNE I NNNLQ RVRELAVQ SA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNS QSDLDSIQ. AEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact L K FDDTTGKYYAKVTVT GGT 721 ggtaaagatggctattatgaagtttccgttggtgttaccggcacagcatctgttgttaag GKDGYY EVSVGVTGTASV.VK

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 781 atgtcttatactgataataacggtaaaactattgatggtggtttagcagttaaggtaggc MSYTDNNGKTIDGGLAVKVG 841 gatgattactattctgcaactcaaaataaagatggttccataagtattaatactacgaaa DDYY SA TQNKD GS I S INTTK 901 tacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggc YTADDGTSKTALNKL GGADG 961 aaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcac KTEVV S I GGKTYAAS KAEGH 1021 aactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctg NPKAQPDLAEAAATTTENPL 1081 cagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggtgcggta Q K I D AALAQVDTLRSDLGAV 1141 cagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttct QNRFN SAI TNL GNTVNNLTS 12 01 gcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcg AR S R I E D S DYAT.EV S NM S RA 1261 cagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaac Q I LQQAGTSVLAQ ANQVPQN 1321 gtcctctctttactgcgttaa 1341 V L S L L R *

>pTH890-C170

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLT QNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTA NI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTT EG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg N STN SQS DLDS I QAE I TQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag N EIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541. gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASAT GLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgactcttgct GKDG YYEV SVDKTNGEVTLA 781 ggcggtgcgacttccccgcttacaggtggactacctgcgacagcaactgagtactattct GGAT S P LTGGL PA T AT E YY S 841 gcaactcaaaataaagatggttccataagtattaatactacgaaatacactgcagatgac

ATQNKDGSISINTTKYTADD 901 ggtacatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtt GTSKTALNKLGGADGKTEVV 961 tctattggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacag SIGGKTYAASKAEGHNFKAQ 1021 cctgatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgct PDLAEAAATTTENPLQKIDA 1081 gctttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaac ALAQVDTLRSDLGAVQNRFN 1141 tccgctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatc SAITNLGNTVNNLTSARSRI 1201 gaagattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcag EDSDYATEVSNMSRAQILQQ 1261 gccggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactg AGTSVLAQANQVPQNVLSLL 1321 cgttaa 1326 R*

>pTH890-C37

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLN K 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLR INS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgGagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQIN S QTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAA TVTGYADTT IALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFK ASATGL GGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgtttctgttgttaagatgtcttatactgataat GKDGYYEVSVSVVKMSYTDN 781 aacggtaaaactattgatggtggtttagcagttaaggtaggcgatgattactattctgca NGKTI DGGLAVKVGDDYY SA 841 actcaaaataaagatggttccataagtattaatactacgaaatacactgcagatgacggt TQNKDGS I S INTTKYTADDG 901 acatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttct TSKTALNKLGGADGKTEVVS 125

961 attggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcct IGGKTYAASKAEGHNFKAQP 1021 gatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgct DLAEAAATTTENPLQKIDAA 1081 ttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactcc LAQVDTLRSDLGAVQNRFNS 1141 gctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaa AITNLGNTVNNLTSARSRIE 1201 gattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggcc DSDYATEVSNMSRAQILQQA 1261 ggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgt GTSVLAQANQVPQNVLSLLR 1321 taa 1323 *

>pTH890-C85

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTAN I KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDS I QAE I TQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGET IDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLD TLNVQ QKY K 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDD TTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgataagacgaacggtgaggtgattgatggt GKDGYYEVSVDKTNGEVIDG 781 ggtttagcagttaaggtaggcgatgattactattctgcaactcaaaataaagatggttcc GLAVKVGDDYYSATQNKDGS 841 ataagtattaatactacgaaatacactgcagatgacggtacatccaaaactgcactaaac ISINTTKYTADDGTSKTALN 901 aaactgggtggcgcagacggcaaaaccgaagttgtttctattggtggtaaaacttacgct KLGGADGKTEVVSIGG KTYA 961 gcaagtaaagccgaaggtcacaactttaaagcacagcctgatctggcggaagcggctgct ASKAEGHNFKAQPDLAEAAA 1021 acaaccaccgaaaacccgctgcagaaaattgatgctgctttggcacaggttgacacgtta 126

TTTENPLQKIDAALAQVDTL 1081 cgttctgacctgggtgcggtacagaaccgtttcaactccgctattaccaacctgggcaac RSDLGAVQNRFNSAITNLGN 1141 accgtaaacaacctgacttctgcccgtagccgtatcgaagattccgactacgcgaccgaa TVNNLTSARSRIEDSDYATE 1201 gtttccaacatgtctcgcgcgcagattctgcagcaggccggtacctccgttctggcgcag VSNMSRAQILQQAGTSVLAQ 1261 gcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 1302 ANQVPQNVLSLLR*

>pTH890-C11

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 3 61 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NE I DRVS GQTQ FNGVKV LAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DN TLTIQVGANDGETI DIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQ QKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgttaccggcacagcatctgttgttaag LKF DDTTGKYYVTG TASVVK 721 atgtcttatactgataataacggtaaaactattgatggtggtttagcagttaaggtaggc MSYTDNNGKTIDGGLAVKVG 781 gatgattactattctgcaactcaaaataaagatggttccataagtattaatactacgaaa D DYYSATQNKDGSISINTTK 841 tacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggc YTADDGTS KTALNKLGGADG 901 aaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcac KTEVVS IGGKTY AASKAEGH 961 aactttaaagcacagcctgatctgqcggaaqcggctgctacaaccaccqaaaacccqctq NFKAQPDLAEAAATTTENPL 1021 cagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggtgcggta QKIDAALAQVDTLRSDLGAV 1 0 8 1 cagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttct QNRFNSAITNLGNTVNNLTS 1141 gcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcg ARSRIEDSDYATE. VSNMSRA 127

1201 cagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaac QILQQAGTSVLAQANQVPQN 1261 gtcctctctttactgcgttaa 1281 VLSLLR*

>pTH890-C79

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGISIAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLTIQVGANDGETIDIDL 481 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactggaaaatattacgccaaagttaccgttacggggggaact LKFDDTTGKYYAKVTVTGGT 721 ggtaaagatggctattatgaagtttccgttgttaaggtaggcgatgattactattctgca GKDGYYEVSVVKVGDDYYSA 781 actcaaaataaagatggttccataagtattaatactacgaaatacactgcagatgacggt TQNKDGSISINTTKYTADDG 841 acatccaaaactgcactaaacaaactgggtggcgcagacggcaaaaccgaagttgtttct TSKTALNKLGGADGKTEVVS 901 attggtggtaaaacttacgctgcaagtaaagccgaaggtcacaactttaaagcacagcct IGGKTYAASKAEGHNFKAQP 961 gatctggcggaagcggctgctacaaccaccgaaaacccgctgcagaaaattgatgctgct DLAEAAATTTENPLQKIDAA 1021 ttggcacaggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactcc LAQVDTLRSDLGAVQNRFNS 1081 gctattaccaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaa AITNLGNTVNNLTSARSRIE 1141 gattccgactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggcc DSDYATEVSNMSRAQILQQA 1201 ggtacctccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgt GTSVLAQANQVPQNVLSLLR 1261 taa 1263 *

>pTH890-C12

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLL TQNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGL RINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTAN I KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQR VRELAVQSA 301 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg N S T NSQSDLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLT IQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTT IALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggcgat STFKASATGLGGTDQKIDGD 661 gatgattactattctgcaactcaaaataaagatggttccataagtattaatactacgaaa DDYY SATQNKDGS I S I NTTK 721 tacactgcagatgacggtacatccaaaactgcactaaacaaactgggtggcgcagacggc Y T A DDGTSKTALNKLGGADG 781 aaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggtcac KTEVVSIGGKTYAASKAEGH 841 aactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaaaacccgctg NFKAQPDLAEAAATTTENPL 901 cagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggtgcggta Q K IDAALAQVDTLRSDLGAV 961 cagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaacctgacttct QNRFNSAITNLGNTVNNLTS 1021 gcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgcgcg ARSRIEDSDYATEVSNMSRA 1081 cagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaaaac QILQQAGTSVLAQANQVPQN 1141 gtcctctctttactgcgttaa 1161 V L S L L R *

>pTH890-C131

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLTQNNL NK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQ AIANRFTANI KG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQASRNANDGI S IAQTTEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ALNEIN NNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTNSQSDLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQT QFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNT L T I QVGANDGET I D I DL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag KQ INSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTI ALDN 601 agtaaaaccgaagttgtttctattggtggtaaaacttacgctgcaagtaaagccgaaggt SKTEVVS IGGKTYAASKAEG 661 cacaactttaaagcacagcctgatctggcggaagcggctgctacaaccaccgaaaacccg HNFKAQPDLAEAAATTTENP 721 ctgcagaaaattgatgctgctttggcacaggttgacacgttacgttctgacctgggtgcg LQKIDAA LAQVDTLRSDLGA 781 gtacagaaccgtttcaactccgctattaccaacctgggcaacaccgtaaacaacctgact VQNRFNSAITNLGNTVNNLT 841 tctgcccgtagccgtatcgaagattccgactacgcgaccgaagtttccaacatgtctcgc SARSRIEDSDYATEVSNMSR 901 gcgcagattctgcagcaggccggtacctccgttctggcgcaggcgaaccaggttccgcaa AQILQQAGTSVLAQANQVPQ 961 aacgtcctctctttactgcgttaa 984 NVLSLL R*

>pTH890-C150

1 atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa MAQVINTNSLSLLT QNNLNK 61 tcccagtccgctctgggcaccgctatcgagcgtctgtcttccggtctgcgtatcaacagc SQSALGTAIERLSSGLRINS 121 gcgaaagacgatgcggcaggtcaggcgattgctaaccgttttaccgcgaacatcaaaggt AKDDAAGQAIANRFTANIKG 181 ctgactcaggcttcccgtaacgctaacgacggtatctccattgcgcagaccactgaaggc LTQAS RNANDGI S IAQT TEG 241 gcgctgaacgaaatcaacaacaacctgcagcgtgtgcgtgaactggcggttcagtctgct ALNEINNNLQRVRELAVQSA 3 01 aacagcaccaactcccagtctgacctcgactccatccaggctgaaatcacccagcgcctg NSTN SQSDLDSIQAEITQRL 361 aacgaaatcgaccgtgtatccggccagactcagttcaacggcgtgaaagtcctggcgcag NEIDRVSGQTQFNGVKVLAQ 421 gacaacaccctgaccatccaggttggtgccaacgacggtgaaactatcgatatcgatctg DNTLT IQVGANDGETIDIDL 4 81 aagcagatcaactctcagaccctgggtctggatacgctgaatgtgcaacaaaaatataag K QINSQTLGLDTLNVQQKYK 541 gtcagcgatacggctgcaactgttacaggatatgccgatactacgattgctttagacaat VSDTAATVTGYADTTIALDN 601 agtacttttaaagcctcggctactggtcttggtggtactgaccagaaaattgatggegat STFKASATGLGGTDQKIDGD 661 ttaaaatttgatgatacgactaccgaaaacccgctgcagaaaattgatgctgctttggca LKFDDTTTENPLQKIDAALA

721 caggttgacacgttacgttctgacctgggtgcggtacagaaccgtttcaactccgctatt QVDTLRSDLGAVQNRFNSAI 781 accaacctgggcaacaccgtaaacaacctgacttctgcccgtagccgtatcgaagattcc TNLGNTVNNLTSARSRIEDS 841 gactacgcgaccgaagtttccaacatgtctcgcgcgcagattctgcagcaggccggtacc DYATEVSNMSRAQILQQAGT 901 tccgttctggcgcaggcgaaccaggttccgcaaaacgtcctctctttactgcgttaa 957 SVLAQANQVPQNVLSLLR*
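The pairing of each 60-nucleotide row with its 20-residue translation in the listings above can be checked with the standard genetic code. The following is a minimal sketch, not part of the original analysis; the function name `translate` and the table construction are illustrative assumptions, and only the two test rows are taken from the listings.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# for the first, second and third base.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence in frame 1; '*' marks a stop codon."""
    seq = "".join(dna.split()).upper()  # drop whitespace, normalise case
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First row (positions 1-60) and final row shared by the listed variants.
first_row = "atggcacaagtcattaatacaaacagcctgtcgctgttgacccagaataacctgaacaaa"
last_row = "caaaacgtcctctctttactgcgttaa"
print(translate(first_row))  # MAQVINTNSLSLLTQNNLNK
print(translate(last_row))   # QNVLSLLR*
```

Applying the same check row by row is one way to locate scanning errors in the nucleotide text, since a misread base usually changes the translated residue.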


Appendix B. A multiple sequence alignment of all 39 unique internal deletion variants of FliC.

[Figure: multiple sequence alignment of the FliC internal deletion variants. Variant labels legible in the scan include C108, C118, C182, C116, C167 and C150; the legend marks domains D0, D1, D2a, D2b and D3.]


CHAPTER III

A THEORETICAL MODEL OF AQUIFEX PYROPHILUS FLAGELLIN: IMPLICATIONS FOR ITS THERMOSTABILITY

Abstract

Aquifex pyrophilus is a flagellated hyperthermophilic eubacterial species that grows

optimally at 85 °C. The thermostable A. pyrophilus flagellar filament is primarily

composed of a single protein called flagellin (FlaA). The N- and C-terminal sequence

regions of FlaA are important for self-assembly and share high sequence similarity

with mesophilic bacterial flagellins. We have developed a predictive 3-D structure of

FlaA, using the published structure of mesophilic Salmonella typhimurium flagellin

(FliC) as a template and analyzed it with respect to possible determinants of

thermostability. A sequence comparison of FlaA and FliC revealed a +7.0% increase

in FlaA hydrophobic residues, a +0.6% increase in charged residues and a

corresponding decrease of -6.0% in polar residues. The FlaA N- and C-termini also

have higher proportions of hydrophobic and charged residues at the expense of polar

residues and higher non-polar surface areas. Thus, a predominant stabilizing factor in

FlaA appears to be increased hydrophobicity, which often confers greater rigidity to

proteins. Fewer intramolecular ion pairs were observed in FlaA than FliC although an

increase in the positive charge potential of the FlaA D0 and D1 domains was also

observed; increased intermolecular salt bridges may also contribute to thermal

stability of the oligomeric flagellar fiber.


Introduction

The flagellum is the primary organelle responsible for motility in prokaryotes 1;

2; 3; 4; 5. It is a complex structure composed of a basal body, a trans-membrane rotary

motor 3; 6; 7; 8, a universal joint-like hook structure and a hollow helical flagellar

filament. This filament has an outer diameter of 12-25 nm, a 2-3 nm diameter inner

channel 9; 10 and can be 1-15 µm or more in length 8; 11. Flagella are rotated by the

membrane rotary motor to enable movement of the bacterium, in a process that is

often regulated by chemotaxis *. Peritrichously arranged flagella either associate into

a helical bundle of multiple flagella 12 that generates a net thrust along its axis or they

dissociate into individual fibers, depending on the direction of rotation and resulting

supercoiled state of the fiber.

Salmonella typhimurium is a bacterial pathogen that also frequently serves as

a model organism for the investigation of bacterial flagellar structure and function.

The flagellar fibers in S. typhimurium are primarily composed of up to 20,000 copies

of a single globular protein called flagellin, which self-assembles via noncovalent

forces to form a helical fiber composed of 11 protofilaments. The primary (phase-1)

flagellin gene in S. typhimurium is designated as fliC, a secondary (phase-2) flagellin

gene is named fljB; other bacterial flagellin genes are sometimes named as flaA, hag

or flaF. Flagellin proteins are synthesized in the cytoplasm as soluble monomers and

are prevented from self-assembly and protected from proteolytic degradation by

binding of the FliS chaperone protein 13. Flagellin is exported by a flagella-specific

type III secretion system through the central pore of the flagellar fiber in a partially


unfolded conformation, followed by complete folding and oligomer assembly at the

distal end of the fiber 9; 10; 14; 15. This terminal assembly process is aided by the

flagellar chaperone cap protein, termed FliD or HAP2 9; 10; 16; 17; 18; 19.

Aquifex pyrophilus is a motile, hyperthermophilic, microaerophilic,

chemolithoautotrophic, gram-negative, rod-shaped bacterium that was originally

isolated from the hydrothermal system at the Kolbeinsey Ridge north of Iceland 20. It

grows at temperatures between 67-95 °C, with an optimum temperature of 85 °C.

Although it is an extremophile, A. pyrophilus is a eubacterial microbial species, not

an archaeal species and has been placed in the class . Like many other

species of motile eubacteria, A. pyrophilus is flagellated, with a polytrichous

arrangement of up to eight flagella fibers on the cell surface, as previously described

by Behammer, et al. 21. The diameter of A. pyrophilus flagella is 19 nm, which

conforms to the general morphology of the mesophilic bacterial flagella 21. A.

pyrophilus flagella are primarily composed of a single flagellin protein that is

encoded by the flaA gene 21. The corresponding FlaA protein has 501 residues (53.9

kDa), with the first N-terminal Met residue removed post-translationally 21. The

sequence length of FlaA is nearly identical to that of some mesophilic bacteria, such

as the 495 residue Salmonella typhimurium FliC protein (51.4 kDa). However, A.

pyrophilus flagella are unusual in that they are functional, i.e., remain self-assembled

into fibers, at temperatures higher than 100 °C and at low pH values 21. This

functional temperature range is much higher than that typically observed for the

flagella of well-characterized mesophilic bacteria such as S. typhimurium and


Escherichia coli, which typically dissociate at temperatures higher than 60-65 °C 22;

23, a property often exploited in their purification for biophysical studies 19. The A.

pyrophilus flagellar fibers are prone to breakage during low-speed centrifugation,

unlike the flagella of S. typhimurium, and thus are more rigid than their mesophilic

counterparts at room temperature 21. Henceforth, we refer to the S. typhimurium

flagellin protein as FliC and the A. pyrophilus flagellin protein as FlaA.

The partial and complete structures of mesophilic S. typhimurium FliC were

determined in 2001 11 and 2003 24. The complete FliC structure (PDB 1UCU) shows

four distinct globular domains in the protein, termed D0, D1, D2 (with subdomains

D2a and D2b) and D3. Domain D0 forms the inner core of the filament; D1 forms the

outer core and domains D2 and D3 form a knob-like projection on the filament

surface. Various sequence analysis studies have shown that the N- and C-termini of

FliC that form the D0 and D1 domains are essential for the export and assembly into

flagella and are conserved across most bacterial species 25. In contrast, the D2 and D3

domains in the middle are highly variable in sequence and size across different

species. Flagellin molecular masses can range from 20-77 kDa 26; 27; 28; 29; 30,

corresponding to amino acid sequences ranging in size from under 300 amino acids to

almost 700 amino acids (e.g., Pseudomonas putida has 688 residues), with the

majority of flagellins containing ~500 amino acids 31. These size differences are

primarily due to high genetic variation in a "hypervariable" middle region of the

protein encoding outer domains D2 and D3. These hypervariable domain regions can


be deleted to a large extent without affecting the self-assembly properties of the

inner D0 and D1 domains 32.

The mechanisms of FlaA flagellar thermostability are fundamentally

interesting, as this is a self-assembling structural protein, rather than a thermostable

enzyme. Thermostable flagellin protein fibers based on Aquifex sp. flagellins may

also have potential applications in biotechnology for the extracellular display of

thermostable proteins and peptides with sensor or catalytic activity and in

nanotechnology as templates for biomineralization and as bionanotubes 33 for the

assembly of novel hybrid nanostructures. Previous reports have demonstrated the

utility of genetically inserted fusion peptides and small proteins displayed on the

immunologically reactive, solvent accessible, middle domain of mesophilic E. coli

flagellin 34; 35; 36; 37. Thus, it is possible that FlaA could be adapted for use as a similar

peptide and protein display system in applications that require high

temperature and chemical stability.

The complete amino acid sequence of the thermostable A. pyrophilus FlaA

flagellin is available in the Swiss-Prot database. It has a reasonable overall sequence

homology of ~30% with the structurally characterized Salmonella FliC flagellin

protein. Many hyperthermophilic and mesophilic enzymes and proteins typically have

very similar secondary and tertiary structures and functional mechanisms 38. Thus, the

mesophilic FliC structure represents a possible folded structural template for the

corresponding thermostable FlaA protein. These observations suggested that it would

be possible to develop a predictive structural model of the A. pyrophilus FlaA protein


using standard computational homology modeling approaches. This 3-D structural

model could then be analyzed with respect to its structural determinants of

thermostability and might also serve as a basis for further protein engineering of

thermostable flagella fibers for use in peptide display and other nanomaterial

applications. Thus, we have used well-established comparative modeling software

including MODELLER 39 to generate a hypothetical 3-D structure for the FlaA

protein and standard model evaluation software such as PROCHECK 40 to validate

the theoretical model.
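A figure such as the ~30% quoted above is commonly obtained as percent identity over the aligned positions of a pairwise alignment. The sketch below illustrates that calculation only; the convention of ignoring columns that contain a gap is an assumption (the text does not state which convention was used), and the two short aligned strings are hypothetical toy data, not the FlaA/FliC alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumed convention: gapped columns are not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment (14 columns, two of them gapped).
toy_a = "MAQVINTNSL-SLL"
toy_b = "MAQIINTN-LQSLL"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other denominators (the full alignment length or the shorter sequence length) give lower values for the same alignment, so the convention should be reported with the number.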

Computational Methods

The amino acid sequences of A. pyrophilus flagellin (FlaA: P46210) and S.

typhimurium phase-1 flagellin (FliC: P06179) were obtained from the Swiss-Prot database 41;

42; 43; 44; 45 of the Swiss Institute of Bioinformatics (SIB). The refined crystal

structure coordinates of S. typhimurium flagellin FliC (PDB file 1UCU) were

obtained from the RCSB Protein Data Bank 46; 47. The ProtParam 48; 49 tool at the

Expert Protein Analysis System (ExPASy) Proteomics Server 50 was used to compute

various physical and chemical parameters for these two proteins from their 3-D

coordinate files and amino-acid sequences. The amino acid composition of the

proteins, aliphatic index and grand average of hydropathicity (GRAVY) were of

particular interest. The aliphatic index of a protein is defined as the relative volume

occupied by aliphatic side chains (Ala, Val, Ile and Leu). It may be regarded as a

positive factor for the increase of thermostability of globular proteins. The GRAVY

value for a peptide or protein is calculated as the sum of hydropathy values 51 of all

the amino acids, divided by the number of residues in the sequence. We analyzed the

overall amino acid flexibility of the proteins using the Average Flexibility Calculator

52 option in the ProtScale tool 48 at the ExPASy Proteomics Server. We also analyzed

the propensity of residues in the proteins to be disordered using the DisEMBL 53

protein disorder prediction tool and the GlobPlot 54 protein

disorder/order/globularity/domain predictor tool at the ExPASy Proteomics Server.
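The two sequence-derived indices described above can be sketched in a few lines. This is an illustrative reimplementation, not the ProtParam code: GRAVY follows the definition given in the text using the Kyte-Doolittle hydropathy scale, and the aliphatic index uses the commonly cited weighting X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)] on mole percentages, which is stated here from general knowledge rather than from this chapter. The function names are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: summed hydropathy / sequence length."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    counts = Counter(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * counts[aa] / len(seq)

    # Weights 2.9 and 3.9 scale Val and Ile/Leu side-chain volumes to Ala.
    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# First 20 residues of FliC as listed in the sequence appendix (fragment only).
fragment = "MAQVINTNSLSLLTQNNLNK"
print(f"GRAVY = {gravy(fragment):.3f}")
print(f"Aliphatic index = {aliphatic_index(fragment):.1f}")
```

Values computed on a fragment are only a demonstration; the comparison in this chapter concerns the full-length FlaA and FliC sequences.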

The homology models of A. pyrophilus FlaA were generated using the

computer program MODELLER (6v1) 39; 55; 56 of the Accelrys (San Diego, CA)

InsightII software package. The input to the program is an alignment of the target

sequence with the related three-dimensional structure of FliC. The InsightII

visualization environment was used to analyze the structures and perform other

structure related operations.

The Solvent Accessible Surface Area (SASA) 57 was measured using the

Solvation module of InsightII. The DelPhi module of InsightII was used to investigate

the electrostatic charge distribution 58; 59 on the surfaces of the proteins. The protein

model was validated using PROCHECK 40. All Accelrys modeling software was run

on an IBM IntelliStation M Pro PC workstation with the Red Hat® Enterprise Linux

WS 3 operating system. The multimer model of FlaA was constructed using the

coordinates provided by Dr. Koji Yonekura (UCSF, San Francisco) and Dr. Keiichi

Namba (Osaka University). The helical parameters are 1 = 66 n + 361 m and the

repeat distance is 1698.8 Å. The FlaA monomers were aligned with the FliC subunits

using the Magic Fit option in Swiss-PdbViewer 60,61. We have calculated the

molecular surfaces of each subunit to visualize the surface interactions.

The contribution of electrostatic pairing of ionized side chains, i.e., salt

bridges/ion pairs, to the stability of FlaA was evaluated using the criteria that a salt

bridge is present if two oppositely charged atoms of each neighboring side chain are

closer than 6 Å. The residues Arg, Lys, His, Asp and Glu were all considered in the

salt bridge calculations 38. Another school of thought is that proteins gain electrostatic

stabilization by minimizing the number of repulsive contacts between like charged

residues rather than by creating salt bridges 62. Thus, we have also investigated the

occurrence of like charged groups (+, + and -, -) within a 6 Å distance and compared

this parameter for both proteins. Protein α-helices have a net dipole moment

resulting from the alignment of the peptide backbone hydrogen bonds.
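The 6 Å distance criterion for salt bridges and for like-charged contacts can be sketched as follows. This is a minimal illustration under stated assumptions: the charged side-chain atoms have already been extracted from the coordinate file into simple records, each residue pair is counted once, and the record layout, function name and toy coordinates are hypothetical rather than part of the InsightII workflow.

```python
from math import dist


def charge_contacts(atoms, cutoff=6.0):
    """Count residue pairs with charged atoms closer than `cutoff` (in Å).

    `atoms` holds (residue_label, charge_sign, (x, y, z)) records for the
    charged side-chain atoms. Returns (ion_pairs, like_charge_pairs).
    """
    ion_pairs, like_pairs = set(), set()
    for i, (res_a, sign_a, xyz_a) in enumerate(atoms):
        for res_b, sign_b, xyz_b in atoms[i + 1:]:
            # Skip atoms of the same residue and anything at or beyond the cutoff.
            if res_a == res_b or dist(xyz_a, xyz_b) >= cutoff:
                continue
            pair = frozenset((res_a, res_b))
            if sign_a != sign_b:
                ion_pairs.add(pair)      # oppositely charged: salt bridge
            else:
                like_pairs.add(pair)     # same sign: repulsive contact
    return len(ion_pairs), len(like_pairs)


# Toy coordinates for illustration only.
toy_atoms = [
    ("Asp10", -1, (0.0, 0.0, 0.0)),
    ("Lys14", +1, (3.0, 4.0, 0.0)),    # 5.0 Å from Asp10
    ("Glu20", -1, (0.0, 0.0, 5.5)),    # 5.5 Å from Asp10
    ("Arg50", +1, (30.0, 0.0, 0.0)),   # far from everything
]
n_ion_pairs, n_like_pairs = charge_contacts(toy_atoms)
```

Counting both quantities with one cutoff makes it straightforward to compare attractive and repulsive contacts for the two proteins on the same footing.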

It has been observed that negatively charged (Asp and Glu) and positively

charged (Arg, Lys and His) residues are preferentially found at the N- and C-terminal

ends of α-helices, respectively 63,64,65; 66. This charge helps to counteract and stabilize the charge

dipole of an α-helix, resulting in increased stability of the folded structure. The N-

capping box (Ncap) is a local motif that acts as a stop signal and gives stability to the

folded protein. A similar Ccap motif was also identified at the C-terminus of

α-helices 67. The side chain of the residue at N3/C3 is hydrogen bonded to the amide

group of Ncap/Ccap and the side chain of Ncap/Ccap is hydrogen bonded to the

amide group of N3/C3. Possible N-capping boxes are characterized by the presence

of any one of the following residues at both the Ncap and N3 positions: Ser, Thr, Asp,

Asn, His, Glu or Gln. Similarly, C-capping boxes are characterized by the presence of

any one of the following residues at both the Ccap and C3 positions: Ser, Thr, Asp,

Asn, His, Glu or Gln. The cap regions where the above-mentioned residues are

present but no hydrogen bonding was observed were designated as “potential”

capping boxes. The FlaA model α-helices were evaluated for possible charge-

α-helix dipole interactions in terms of the number of Asp and Glu residues at the

Ncap, N1, N2 and N3 positions and the number of Arg, Lys and His residues at the

C3, C2, C1 and Ccap positions.
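The sequence part of the capping criteria above can be sketched as a simple screen. This is only an illustration: it checks residue identity at the cap positions and counts charged residues, and it does not test for the hydrogen bonds, so every hit corresponds to a “potential” capping box in the sense used here. The helix boundaries are assumed to be known from the model; the function name and the toy sequence are hypothetical.

```python
CAP_RESIDUES = set("STDNHEQ")   # Ser, Thr, Asp, Asn, His, Glu, Gln
ACIDIC = set("DE")              # Asp, Glu
BASIC = set("RKH")              # Arg, Lys, His


def screen_helix(seq, ncap, ccap):
    """Screen one helix; ncap and ccap are 0-based indices of the cap residues."""
    n_term = seq[ncap:ncap + 4]       # Ncap, N1, N2, N3
    c_term = seq[ccap - 3:ccap + 1]   # C3, C2, C1, Ccap
    return {
        # Candidate boxes need a listed residue at both positions.
        "n_box_candidate": n_term[0] in CAP_RESIDUES and n_term[3] in CAP_RESIDUES,
        "c_box_candidate": c_term[3] in CAP_RESIDUES and c_term[0] in CAP_RESIDUES,
        # Residues that could interact favorably with the helix dipole.
        "acidic_at_n_term": sum(r in ACIDIC for r in n_term),
        "basic_at_c_term": sum(r in BASIC for r in c_term),
    }


# Toy sequence for illustration only: Ncap = Ser at index 1, Ccap = Asn at index 13.
toy = "GSAAEALLKALKHNG"
result = screen_helix(toy, ncap=1, ccap=13)
```

A structural check of the reciprocal side chain to backbone hydrogen bonds would still be needed to promote a candidate to a confirmed capping box.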

Results and Discussion

The properties of thermophilic proteins have been examined extensively over the past

two decades. Many structural determinants for thermostable proteins have been

postulated, as reviewed by Petsko 68, Vieille and Zeikus38, Scandurra et al. 69; 70,

Fields 71 and Sterner and Liebl72. Sequence analysis of hyperthermophilic bacterial

genomes has detected some preferences of thermophilic proteins for particular amino

acids but general design principles are not fully defined 70,73. Factors that have been

reported to increase thermal stability of proteins include tighter internal packing of

hydrophobic residues 69, increased structural rigidity 38; 71, increased numbers of

hydrophobic residues with branched side chains 74, additional prolines 75, fewer

glycines 76, deletion of flexible surface loops, fewer Asn and Gln residues 75; 77, fewer

histidine residues 69 and greater numbers of charged residues and consequently,

increased hydrogen bonds and salt bridges (ion pairs) on the protein surface, at the

expense of uncharged polar residues 62; 69; 74; 78; 79; 80. Other forms of electrostatic

stabilization forces include increased stabilization of α-helices by intrahelical salt

bridges, an increase in negative charge at the N-terminus (helix dipole stabilization) 80

and increased cation-pi interactions 80. Protein oligomerization has also been noted as

a factor in increased protein stability81,82, in part due to decreased exposed surface

area. Confinement of proteins in a small inert volume, i.e., an “Anfinsen cage”, is

also known to stabilize the folded state of globular proteins83. Flagella, which are

directly exposed to the external solvent environment, are a rather unique and ideal

system to investigate several aspects of protein thermostability, in the context of a

self-assembling, oligomeric structural system, rather than a catalytic enzyme system.

We have systematically compared the sequences and structures of FliC and FlaA to

elucidate the factors responsible for the thermostability of FlaA.

Sequence comparison

Sequence alignment for modeling was done using the pairwise-alignment tool of

InsightII. The alignment of the proteins shows a 29.1% overall sequence similarity

between the thermophilic and mesophilic flagellins, FlaA and FliC (Figure 3.1).

The protein sequences were divided into three separate regions, based on

their similarity and structure, for purposes of further sequence analysis: the N-

terminus, the middle domain and the C-terminus. The FlaA N-terminus includes

amino acid residues 1-180, the FlaA middle domain includes residues 181-398 and

the FlaA C-terminus includes residues 399-500. The FliC protein was similarly

divided into an N-terminal region of residues 1-175, a middle domain composed of

residues 176-394 and a C-terminal region comprising residues 395-494. Both the N-

and C-terminal regions have a higher sequence identity than the overall proteins; the

FlaA and FliC N-termini have 54% similarity whereas the corresponding C-termini

have 55% similarity. Conversely, the middle domain hypervariable regions have only

16% similarity. This is in contrast to an earlier estimate of 27% similarity for 246

residues, starting at residue 173, by Behammer, et al. 21.

Figure 3.1 A pairwise sequence alignment of FliC and FlaA. The overall sequence identity of the alignment was 29.1%. The sequence identity at the N- and C-termini was much greater (~55% identity) than the middle domain (16% identity). Residue coloring: AVFPMILW, red (small + hydrophobic); DE, blue (acidic); RHK, magenta (basic); STYHCNGQ, green (hydroxyl + amine + basic - Q). Symbols: * indicates identical residues; : indicates conserved substitutions; . indicates semi-conserved substitutions. [Residue-level alignment not reproduced.]

The highly conserved N- and C-terminal regions are essential for both

recognition by the type III export apparatus and subsequent self-assembly into the

flagellar fiber. Sequence analysis studies have shown that the N- and C-termini of

flagellins are conserved across most bacterial species 25. This sequence conservation

also confers functional conservation; flagellins from different mesophilic bacterial

species can be recombined into heterogeneous flagella in vitro 8, 84, 85; 86, due to

complementary interactions between their highly conserved N- and C-terminal

domains. Thus, it is not surprising that a relatively high degree of similarity (~55%) is

observed for the N- and C-terminal domains of FlaA and FliC. However, it remains to

be experimentally determined whether Aquifex flagellin will interact in a functional manner

with mesophilic flagellins and mesophilic FliS chaperone proteins, and whether it can be expressed

and self-assembled into flagella in mesophilic E. coli or Salmonella bacteria.

Table 3.1 Amino acid composition (percentages) of the full length FliC and FlaA.

Amino acid   FlaA a   FliC b   FlaA - FliC c   FlaA middle domain d   FliC middle domain e   FlaA - FliC middle domain f
Ala (A)      10.8%    12.3%    -1.5%            7.3%                  12.8%                  -5.5%
Arg (R)       4.0%     2.8%     1.2%            2.3%                   0.0%                   2.3%
Asn (N)       6.8%     8.5%    -1.7%            6.4%                   4.6%                   1.8%
Asp (D)       7.4%     7.5%    -0.1%            7.3%                   9.1%                  -1.8%
Cys (C)       0.0%     0.0%     0.0%            0.0%                   0.0%                   0.0%
Gln (Q)       5.0%     6.5%    -1.5%            1.8%                   2.3%                  -0.5%
Glu (E)       3.8%     3.4%     0.4%            3.2%                   2.7%                   0.5%
Gly (G)       7.4%     8.7%    -1.3%           10.1%                  13.2%                  -3.1%
His (H)       0.4%     0.2%     0.2%            0.5%                   0.5%                   0.0%
Ile (I)       9.6%     5.1%     4.5%            9.6%                   2.7%                   6.9%
Leu (L)       9.0%     8.5%     0.5%            5.5%                   5.0%                   0.5%
Lys (K)       4.6%     5.7%    -1.1%            4.1%                  10.0%                  -5.9%
Met (M)       1.6%     0.4%     1.2%            0.5%                   0.5%                   0.0%
Phe (F)       2.0%     1.2%     0.8%            2.8%                   1.4%                   1.4%
Pro (P)       1.8%     1.0%     0.8%            3.2%                   1.4%                   1.8%
Ser (S)       6.8%     7.7%    -0.9%            9.2%                   5.9%                   3.3%
Thr (T)       9.6%    11.5%    -1.9%           14.7%                  15.1%                  -0.4%
Trp (W)       0.4%     0.0%     0.4%            0.9%                   0.0%                   0.9%
Tyr (Y)       3.6%     2.4%     1.2%            2.8%                   5.0%                  -2.2%
Val (V)       5.6%     6.5%    -0.9%            7.8%                   7.8%                   0.0%

The original investigation of A. pyrophilus flagella and FlaA sequence by

Behammer et al. in 1995 21 suggested some possible sequence and structural

determinants for its thermostability. These authors noted increased levels of

hydrophobic residues, including aromatic residues and proline in several heat-stable

flagellins, with a respective decrease in polar hydrophilic residues. This increase in

average hydrophobicity is frequently accompanied by a decrease in average chain

flexibility at lower temperatures 87, 88, as possibly indicated by fragility of the

thermostable FlaA flagellar filaments at room temperature. They also proposed that

thermostable flagellins form compact monomer structures and large interfaces

between subunits (excluding water molecules) in the helical polymer. Comparison of

the amino acid compositions using the ProtParam tool was performed for the

complete FlaA and FliC polypeptides, their more conserved N-termini and C-termini

and their hypervariable middle domains. Residues were classified as hydrophobic:

{Ala(A), Val(V), Ile(I), Leu(L), Met(M), Phe(F), Tyr(Y), Trp(W) and Pro(P)}; polar:

{Asn(N), Gln(Q), Ser(S), Thr(T) and Cys(C)}; charged: {Arg(R), Lys(K), His(H),

Asp(D) and Glu(E)}; and aromatic: {Phe(F), Trp(W) and Tyr(Y)}.

A comparison of the FlaA and FliC residue proportions (percentages) for

the total proteins shows a +7.0% increase in the total hydrophobic residues, a

6.0% decrease in the polar residues and a +0.6% increase in the charged residues in FlaA

(Table 3.1). The FlaA aromatic residues increased by 2.4%. According to one

analysis, more charged residues (+3.2%), fewer polar residues (-5.0%) and slightly

more hydrophobic residues are found in hyperthermophilic proteins, on average 38.

FlaA seems to be an exception to this rule as there is only a minor increase of +0.6%

in the number of charged residues and a very high increase of +7.0% in the

hydrophobic residues. The greatest contribution to the increase of hydrophobic

residues in FlaA is from the branched hydrophobic residue, isoleucine (+4.5%).
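The class totals quoted above follow directly from the full-length percentages in Table 3.1 and the residue classes defined in the text. The short sketch below reproduces that arithmetic; the dictionary and function names are hypothetical, and the numbers are simply the tabulated percentages.

```python
# Full-length compositions (%) transcribed from Table 3.1.
FLAA = {"A": 10.8, "R": 4.0, "N": 6.8, "D": 7.4, "C": 0.0, "Q": 5.0, "E": 3.8,
        "G": 7.4, "H": 0.4, "I": 9.6, "L": 9.0, "K": 4.6, "M": 1.6, "F": 2.0,
        "P": 1.8, "S": 6.8, "T": 9.6, "W": 0.4, "Y": 3.6, "V": 5.6}
FLIC = {"A": 12.3, "R": 2.8, "N": 8.5, "D": 7.5, "C": 0.0, "Q": 6.5, "E": 3.4,
        "G": 8.7, "H": 0.2, "I": 5.1, "L": 8.5, "K": 5.7, "M": 0.4, "F": 1.2,
        "P": 1.0, "S": 7.7, "T": 11.5, "W": 0.0, "Y": 2.4, "V": 6.5}

# Residue classes as defined in the text.
CLASSES = {
    "hydrophobic": "AVILMFYWP",
    "polar": "NQSTC",
    "charged": "RKHDE",
    "aromatic": "FWY",
}


def class_differences(first, second):
    """Per-class difference (first minus second) in percentage points."""
    return {
        name: round(sum(first[aa] - second[aa] for aa in members), 1)
        for name, members in CLASSES.items()
    }


delta = class_differences(FLAA, FLIC)
```

Here `delta` holds +7.0 for hydrophobic, -6.0 for polar, +0.6 for charged and +2.4 for aromatic residues, matching the values in the text. The largest single contribution to the hydrophobic total is the Ile term (9.6% vs. 5.1%).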

A number of individual amino acid compositions were analyzed in further

detail. Isoleucine has greater thermostability at temperatures above 100 °C 38. The

relative percentages of Ala, Asn, Asp, Gln, Lys, Ser, Thr and Val decreased in FlaA

while those of Arg, Glu, His, Met and Pro increased, which conforms to the general

trend observed among hyperthermophilic proteins. Several properties of Arg residues

suggest that they are better suited to functioning at higher temperatures than Lys

residues. The Arg δ-guanido moiety has a reduced chemical reactivity due to its high

pKa and significant resonance stabilization 38. Arg also forms more stable ion pair

interactions at high temperatures 38. The Arg/Lys ratio in FlaA is 0.9 and in FliC is

0.5. Hyperthermophilic proteins can have an increase in the total number of Pro

residues in loop regions; Pro decreases the conformational entropy by conferring

rigidity to loops. Thermophilic proteins can also have a decreased proportion of Gly

residues because Gly has an opposite effect of Pro in terms of increasing

conformational entropy of the polypeptide38. There are 9 Pro residues in FlaA and 5

Pro residues in FliC, an increase of +0.8%. Most of the Pro residues in FlaA are

located in the loop regions of the D2 and D3 domains except for residue Pro 106,

which is located in the α-helix of the D1 domain. There is also a 1.3% decrease in

the Gly residue percentage in FlaA. The observed increase of Pro residues and

decrease of Gly residues help to explain the thermostability and rigidity of the

protein. Apart from the Gly and Pro composition, the flexibility of a protein also

depends on its primary structure. By using the ProtScale tool of ExPASy, we have

calculated the overall amino acid flexibilities of FliC and FlaA and for their N- and

C-termini and middle domains. The entire protein, N-terminus, C-terminus and

middle region (D2 and D3 domains) have average flexibility values of 0.447, 0.446,

0.438 and 0.452 for FlaA and 0.450, 0.455, 0.444 and 0.450 for FliC. These values

indicate a marginally higher relative flexibility for the entire FliC protein (+0.67%),

N- and C-terminal regions (+2.0% and +1.4%) and a slightly lower flexibility

for the middle domain (-0.44%). DisEMBL predicted that FliC has 6% more

residues than FlaA that may be disordered, whereas GlobPlot predicted a 3% increase

in the disordered residues in FliC. Thus, all three algorithms yielded a consistent

prediction of a slightly higher flexibility for mesophilic FliC than for thermostable

FlaA, although the differences were statistically minor. Both FlaA and FliC proteins

lack Cys; Cys is not observed in any wild-type flagellin sequences identified, to date.

The overall aliphatic indexes of FlaA and FliC proteins with their N-terminal

Met residues removed are 99.6 and 84.0 (Table 3.2). These aliphatic index values

indicate a higher overall aliphatic character for FlaA than FliC, a common sequence

feature of thermostable proteins. The GRAVY values are -0.121 for FlaA and -0.400

for FliC (Table 3.2), which strongly indicates a much higher hydrophobic character

for the FlaA sequence. Overall, both FlaA and FliC have very similar numbers of

charged residues (Table 3.2), with 56 and 54 negative residues and 43 and 42 positive

residues, respectively. However, the distribution of charged residues was very

different in these two proteins, as discussed below.

Table 3.2 Computed parameters of FlaA and FliC flagellin proteins and their N-terminal, C-terminal and middle hypervariable domain regions.

Protein / Domain a      Aliphatic index b   GRAVY value   pI     Negative residues d   Positive residues e
FlaA protein             99.6               -0.121        4.82   56                    43
FliC protein             84.0               -0.400        4.79   54                    42
FlaA N-terminus f       105.4               -0.266        8.12   19                    20
FliC N-terminus f        99.3               -0.428        4.65   18                    13
FlaA C-terminus g       111.9                0.002        4.50   14                     9
FliC C-terminus g        97.7               -0.269        4.61   10                     7
FlaA middle domain h     89.0               -0.061        4.46   23                    14
FliC middle domain h     65.6               -0.437        5.04   26                    22

Table 3.3 Amino acid composition (percentages) of N-terminus, C-terminus and middle domains of FliC and FlaA.

Protein / Domain        Hydrophobic residues (%) a   Polar residues (%) b   Charged residues (%) c
FlaA N-terminus d       45.2%                        30.7%                  17.4%
FliC N-terminus d       35.5%                        40.9%                  17.8%
FlaA C-terminus e       50.6%                        24.3%                  22.4%
FliC C-terminus e       43.0%                        37.0%                  17.0%
FlaA middle domain f    40.4%                        32.1%                  17.4%
FliC middle domain f    36.6%                        27.9%                  22.3%

a The hydrophobic amino acids include Ala(A), Val(V), Ile(I), Leu(L), Met(M), Phe(F), Tyr(Y), Trp(W) and Pro(P).

b The polar residues include Asn(N), Gln(Q), Ser(S), Thr(T) and Cys(C).

c The charged residues include Arg(R), Lys(K), His(H), Asp(D) and Glu(E).

d The N-terminal domain regions were defined as residues 1-180 out of 500 total amino acid residues for FlaA and residues 1-175 out of 494 for FliC.

e The C-terminal domain regions were defined as residues 399-500 out of 500 for FlaA and residues 395-494 out of 494 for FliC.

f The FlaA hypervariable middle region encompassing domains D2 and D3 is defined as residues 181-398 out of 500 total residues. The FliC hypervariable middle region is defined as residues 176-394 out of 494 total residues.

The sequence compositions of FlaA and FliC were further analyzed at three

different regions, as discussed above. The aliphatic indexes of the N-terminal regions

of FlaA and FliC are 105.4 and 99.3 and the corresponding GRAVY values are

-0.266 and -0.428 (Table 3.2). The FlaA N-terminus has an increase of +9.7% in

hydrophobic residues, a decrease of 0.4% in charged residues and a decrease of

10.2% in polar residues (Table 3.3). The increase in the number of charged residues was due only to

seven additional positive residues (20 vs. 13, Table 3.2), as the numbers

of negative residues were similar for both

proteins, 19 vs. 18. The aliphatic index values of the FlaA and FliC C-terminal

regions are 111.9 and 97.7 and the corresponding GRAVY values are -0.002 and

-0.269 (Table 3.2). The FlaA C-terminal region has an increase of +7.6% in

hydrophobic residues and a +5.4% increase in charged residues, relative to FliC

(Table 3.3). The polar residues correspondingly decreased by 12.7% in FlaA. In

contrast with the N-terminal region, 40% more negative residues were found in the

C-terminal region of FlaA (14 vs. 10, Table 3.2), although two additional positive

residues were also observed (9 vs. 7). The extensive increase in hydrophobic residues

in the C- and N-terminal regions of FlaA leads to increased hydrophobicity and

should contribute to its higher thermal stability. Thus, in spite of high sequence

similarity between the FliC and FlaA N- and C-terminal regions, significant

differences are apparent in the proportions of hydrophobic and polar amino acids. The

structural implications of these differences are discussed in the next section.

Figure 3.2 A comparison of 3-D structures of FliC and FlaA. 2a) Model 3-D structure of thermostable A. pyrophilus FlaA flagellin. 2b) 3-D structure of mesophilic S. typhimurium FliC flagellin. α-helices are shown in red, β-sheets are shown in yellow, turns are shown in blue and coils are shown in green. Both the structures show domains D0, D1, D2 and D3. 2c, 2d) Stereo view of surface charge distribution of FlaA. 2e, 2f) Stereo view of surface charge distribution of FliC. The legend shows the scale for coloring of the charge spectrum. Blue represents positive charge, red represents negative charge and white represents neutral.

The hypervariable middle domains of FlaA and FliC have aliphatic index

values of 89.0 and 65.6, while the corresponding GRAVY values are -0.061 and

-0.437 (Table 3.2). The individual amino acid compositions for the middle domains of

FlaA and FliC are given in Table 3.1. The middle domain of FlaA has an increase of

3.8% in the hydrophobic side chains and a 4.9% decrease in the charged residues.

There is an increase of 4.2% in the polar residue composition (Table 3.3). The FlaA

middle domain has significantly fewer positively charged residues (14 vs. 22) and

slightly fewer negatively charged residues (23 vs. 26). This indicates that this domain

region is more acidic in FlaA and that it is not stabilized by formation of additional

salt bridges, but rather by hydrophobic interactions.
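The statement that the FlaA middle domain is more acidic can be illustrated with a formal net charge computed from the residue counts in Table 3.2. This is a rough sketch that takes the tabulated counts at face value and ignores partial ionization; the data structure and function names are hypothetical.

```python
# (negative, positive) residue counts transcribed from Table 3.2.
CHARGE_COUNTS = {
    "FlaA": {"protein": (56, 43), "N-terminus": (19, 20),
             "C-terminus": (14, 9), "middle": (23, 14)},
    "FliC": {"protein": (54, 42), "N-terminus": (18, 13),
             "C-terminus": (10, 7), "middle": (26, 22)},
}


def formal_net_charge(negative, positive):
    """Positive minus negative residue count (no pKa correction)."""
    return positive - negative


def net_charges(protein):
    """Formal net charge for every region of one protein."""
    return {region: formal_net_charge(*counts)
            for region, counts in CHARGE_COUNTS[protein].items()}


flaa_net = net_charges("FlaA")
flic_net = net_charges("FliC")
```

With these counts the middle domain carries a formal net charge of -9 in FlaA and -4 in FliC, while the N-terminal regions are +1 and -5, respectively, which is in line with the pI values listed in Table 3.2.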

Structure comparison

The 3-D model for the FlaA protein was built using the coordinate file 1UCU from

the Protein Data Bank as a structural basis using the MODELLER software.

Altogether, 10 models were generated with a high level of loop optimization and

structurally aligned with FliC. The best-fit model had the lowest probability density

function violations; its free energy and loops were further refined. The secondary

structures were visualized using the Kabsch-Sander 89 method in InsightII (Fig. 3.2a

and b). The FlaA secondary structure comprises 9 α-helices and 20 β-sheets similar

to the FliC structure. The three dimensional model was verified using PROCHECK.

The coordinates of this model have been deposited in the RCSB PDB theoretical

model online database, under the ID 1XGX. A Ramachandran plot (Fig. 3.3) shows

that 87.2 % of the total residues of the FlaA model occupy the most favored regions,

9.7% occupy additional allowed regions, 1.3% occupy generously allowed regions

and only 1.1% fall in the disallowed region. Figure 3.4 shows the 3-D structure

alignment of the FlaA model and the experimentally determined FliC structure, PDB

1UCU. The average root-mean-square (rms) deviation at Cα positions between the

FlaA model and PDB 1UCU is 1.9 Å.
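For reference, the rms deviation over matched Cα positions can be sketched as below. This is a minimal illustration that assumes the two structures are already superimposed and that the Cα atoms have been paired one to one; it does not perform the fitting step carried out by the modeling software, and the function name and toy coordinates are hypothetical.

```python
from math import sqrt


def rms_deviation(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å)
    between two equally long lists of (x, y, z) positions."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Toy coordinates for illustration only: every atom is displaced by 2 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
toy_rmsd = rms_deviation(model, reference)
```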

Figure 3.3 Ramachandran plot of the FlaA model (axes: phi and psi, in degrees). 87.2% of the FlaA model residues occupy the most favored regions, 9.7% occupy additional allowed regions, 1.3% occupy generously allowed regions and only 1.1% are in the disallowed region.

Figure 3.4 A 3-D structural alignment of FliC with the model of FlaA. The FliC structure is represented as a red chain and the FlaA model as a blue chain. The average root-mean-square (rms) deviation at Cα positions between the FlaA model and the FliC structure (PDB 1UCU) is 1.9 Å.

The above observations indicate that the modeled FlaA structure is

conformationally correct. However, it should be noted that the probability of the

middle domain region having an alternate conformation is higher, due to the high

variability of sequences observed in this region for bacterial flagellins. We have also

modeled the structure using the SWISS-MODEL server 90 and Rosetta algorithms 91,

92 implemented on the Robetta comparative modeling server 93. SWISS-MODEL

failed to build a comparative model, whereas Robetta gave a 3-D model (data not

shown) comparable to the one generated by MODELLER. The gross topologies of

the FlaA model and the FliC structure are very similar, as expected for a homology

model. The α-helical coiled-coil conformations at the N- and C-termini of the FlaA

model were conserved as these regions form the intricate intersubunit interactions in

the flagellar filament structure.

The C-terminal region of the D0 domain, which forms the inner surface of

the channel, consists of mainly polar amino acids, as in the case of the mesophilic

homolog, FliC. The polar nature of the surface is thought to help the diffusion of the

unfolded monomers, because unfolded proteins with hydrophobic side chains exposed

will be trapped on a hydrophobic surface 24. It is interesting to consider that the C-

terminus of FlaA that forms the inner core of the flagellum channel has a higher

aliphatic index, 111.9 (Table 3.2), which may retard monomer movement in the

channel. A conserved C-terminal Arg residue is present in both proteins, Arg500 in

FlaA and Arg494 in FliC. This residue projects into the flagellar central channel

formed by FliC oligomers and was previously hypothesized to have a functional role

in transport of the unfolded flagellin monomers through the channel 24.

The proportions of residues in regular secondary structures were determined

from the ratio of total number of helical, strand and loop residues to the total number

of residues in FliC and FlaA. Loop residues include turns, bends and other irregular

secondary structures. The proportion of residues in helical, strand and loop regions

are 38.5%, 19.6% and 41.9% for FliC compared to 42.5%, 18.2% and 39.3% in FlaA.

Although both proteins have the same number of secondary structures, the proportion of

residues falling in the secondary structures in FlaA is greater, especially in the

α-helical region. There is a 4.0% increase in the α-helical residues, a decrease of

1.5% in the β-strand residues and a 2.6% decrease in the loop residues in the FlaA

model. The predicted increase of α-helical secondary structure content and decrease

in the loop residues in the FlaA proteins should contribute to the thermal stability 80.

Several studies have demonstrated that β-branched residues are

α-helix-destabilizing due to their reduced conformational freedom in α-helices 38. Thus, the

proportions of β-branched residues Ile, Val and Thr were analyzed in the FlaA

α-helices. We found results contrary to the general trend as there is an overall

increase in the proportion of the β-branched residues in the FlaA α-helices. The

overall percentages of Ile, Val and Thr in the α-helices are 13.0%, 4.8% and 7.9%,

with a total content of 25.7%. The corresponding Ile, Val and Thr percentages in FliC

α-helices are 4.0%, 5.2% and 6.1%, for a total content of 15.3%. Thus, there is a

total increase of +10.4% in the β-branched residues in α-helices in FlaA. This

suggests that the extensive increase in the hydrophobic β-branched residues leads to

better packing of the α-helices, resulting in a more extensive hydrophobic

environment in the coiled-coil structural motif and more stability in the D0 and D1

domains.

domains.
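The β-branched content of the helices can be tallied with a few lines of code. The sketch below assumes that the percentages are expressed relative to the number of helical residues and that helix boundaries are supplied as 1-based inclusive residue ranges; the function name, the toy sequence and the toy ranges are hypothetical.

```python
BETA_BRANCHED = set("IVT")   # Ile, Val, Thr


def beta_branched_percent(seq, helices):
    """Percentage of helical residues that are Ile, Val or Thr.

    `helices` is a list of (start, end) residue ranges, 1-based and inclusive.
    """
    helical = "".join(seq[start - 1:end] for start, end in helices)
    if not helical:
        raise ValueError("no helical residues supplied")
    return 100.0 * sum(res in BETA_BRANCHED for res in helical) / len(helical)


# Toy sequence and helix ranges for illustration only.
toy_seq = "MAIVTLLKAEIG"
toy_helices = [(2, 6), (9, 11)]      # "AIVTL" and "AEI"
toy_percent = beta_branched_percent(toy_seq, toy_helices)
```

Running the same tally over the helix ranges of the model and of the template gives the per-residue percentages that are compared above.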

In contrast with the highly conserved N- and C-terminal regions, the middle

D2 and D3 domains of FlaA have relatively low homology with the same region in

FliC. Given that the numbers of amino acids in each protein are similar and

minimally acceptable sequence homology is apparent, the FlaA model was built with

secondary structures and domain folds for the hypervariable D2 and D3 middle

domains that are similar to the FliC structure (Fig 3.2a). These conserved secondary

structures include the three beta-folium secondary structures observed for residues

220-260 in FliC domain D3; residues 308-345 in FliC domain D2a and residues

345-383 in FliC domain D2b n . The FlaA model shows beta-folium structures at the

same residues in the D2a, D2b and D3 domains.

Another conserved structural feature is the beta-hairpin structure in the

conserved N-terminal Dl domain region of FlaA that extends from residues 140-160,

with beta strands from residues 141-146 and 154-159 and a beta turn in residues 151-

152. This secondary structure motif was hypothesized to function as a two-state

switch that abruptly changes conformation and as a result, changes the packing

distance between monomers and alters the supercoiled state of the flagellar fiber. This

conformational change occurs in response to a change in the direction of flagellar

rotation and is ultimately responsible for the ability of motile bacteria such as S.

typhimurium to change their swimming mode between running (moving) and

tumbling (stationary) modes n . The previous study by Behammer, et al. did not

present any data that A. pyrophilus bacteria can alternate their direction of flagellar

rotation, with resultant differences in their flagellar helical supercoiling and monomer

spacing. However, it is not unreasonable to postulate that these thermophilic bacteria

also use the same means of reversing flagellar rotation to either swim or remain

stationary by forming or disassembling flagella into larger bundles, to enable

sampling of temperature and chemical gradients over time, as observed for

mesophilic S. typhimurium bacteria.

As discussed by Honda, et al., 94 the secondary structural elements in

flagellin are well segregated radially, with inner domains that are rich in α-helices

and outer domains that are rich in β-strands. Given the extremely harsh extracellular

environment in which Aquifex flagella function, e.g., high temperature and low pH, it

is not unreasonable to assume that the same secondary structural features that

stabilize mesophilic FliC against proteases and denaturing chemicals could also be

adapted for increased thermal stability in FlaA. Thus, it is also not unreasonable for

the FlaA outer D2 and D3 domains to be largely composed of β-strands, as observed

in the analogous FliC protein. These structures can be relatively rigid because of

cooperative hydrogen-bonding networks. When combined with increased residue

hydrophobicity and resulting tighter packing of the hydrophobic core regions, they

can also be more stable against proteolysis due to minimal size β-turn loop regions

that do not present themselves as flexible substrates for proteases.

Figure 3.5 Alignment of the secondary structures of FliC and FlaA. The α-helices are represented as red cylinders, β-strands as yellow arrows and random coils as green lines.

Figure 3.5 shows an alignment of the secondary structures of the FlaA model

and the FliC structure. The above observations indicate that the modeled FlaA

structure is conformationally correct. However, it should be noted that the probability

of the middle domain region having an alternate conformation is higher, due to the

high variability of sequences observed in this region for bacterial flagellins. A

separate analysis of the secondary structure of FlaA indicates that the secondary

structures in the FlaA model, including those in the hypervariable middle domain, are

largely consistent with standard secondary structure prediction algorithms (e.g.,

PSIPRED).

Thus, in addition to having consistent structural similarity with FliC, the FlaA

structure does not appear to violate any of the secondary structure predictions that

were independently derived by standard prediction algorithms. However, the

relatively low degree of sequence homology (16%) observed for the FlaA and FliC

hypervariable middle domains suggests that alternative conformations are possible for

the D2 and D3 domains in the FlaA model. The contribution of this middle domain to

the thermostability of the flagellar fiber remains to be investigated; this region also

has very low sequence homology with that of thermostable Aquifex aeolicus flagellin.


Table 3.4 Solvent accessible surface area (SASA) calculations for conserved D0 and D1 domains of FlaA and FliC ᵃ

Protein domain        Total SASA ᵇ (Å²)   Polar SASA ᶜ (Å²)   Non-polar SASA ᵈ (Å²)
FlaA D0 and D1 ᵉ      1,917               638                 1,279
FliC D0 and D1 ᶠ      1,673               718                 955
FlaA - FliC ᵍ         244                 -80                 324

ᵃ All SASA calculations were performed using the Solvation module of InsightII software (Accelrys).
ᵇ Total computed solvent accessible surface area for conserved D0 and D1 domains.
ᶜ Computed solvent accessible surface area for polar surface regions of D0 and D1 domains.
ᵈ Computed solvent accessible surface area for non-polar surface regions of D0 and D1 domains.
ᵉ The D0 and D1 domains of FlaA include the more conserved N-terminal residues 1-180 and C-terminal residues 399-500 out of 500 total amino acids.
ᶠ The D0 and D1 domains of FliC include the more conserved N-terminal residues 1-175 and C-terminal residues 395-494 out of 500 total amino acids.
ᵍ Difference between the corresponding SASA values of D0 and D1 domains of FlaA and FliC.


The hydrophobic effect is considered to be one of the most important of the various molecular forces that determine the tertiary structure of proteins and thus their stability 95, and the strength of this effect increases with increasing temperature. Thus, the greater the magnitude of the hydrophobic effect, the more stable the protein 38. An inverse correlation has been observed between the flexibility of thermostable proteins and their hydrophobicity; they may be significantly flexible at elevated temperatures, allowing proper function, but may be too rigid, i.e., “solid” or “wax-like”, to function at typical mesophilic temperatures (10-45 °C) 69; 96. The increased rigidity previously noted for FlaA flagellar filaments at mesophilic temperatures is likely a result of their increased hydrophobic character and resulting increased compactness.

Summation of the non-polar solvent accessible surface areas (SASA) of a folded chain yields a measure of the potential hydrophobic effect. The SASA values were computed using the Solvation module of InsightII. SASA values were calculated only for the more conserved D0 and D1 domains, which include the N- and C-terminal regions, because the lower sequence homology in the middle region may not give an accurate measurement (Table 3.4). The total SASA of the FliC D0 and D1 domains is 1,673 Å², which includes a polar surface area of 718 Å² and a non-polar surface area of 955 Å². The FlaA monomer model D0 and D1 domains have a total SASA of 1,917 Å², which includes a polar surface area of 638 Å² and a non-polar surface area of 1,279 Å². Thus, the predicted total surface area of the more conserved regions of FlaA is 15% greater than that of FliC, suggesting that more interfacial area is available for interaction between flagellin subunits in the thermostable helical flagellar fiber. There is also a decrease of 11% in the polar surface area and a corresponding large increase of 34% in the non-polar surface area. This is a strong indicator that increased hydrophobic interactions can form between FlaA monomers, resulting in a greater hydrophobic effect upon burial from the aqueous solvent and thus leading to a tighter packing of the protein.
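The percentage differences quoted above follow directly from the values in Table 3.4. As a minimal sketch of that arithmetic (the dictionary layout and the function names are illustrative choices made here, not part of the original InsightII analysis), the relative changes can be recomputed as follows:

```python
# SASA values in square angstroms for the conserved D0 and D1 domains (Table 3.4).
SASA = {
    "FlaA": {"total": 1917, "polar": 638, "nonpolar": 1279},
    "FliC": {"total": 1673, "polar": 718, "nonpolar": 955},
}


def percent_difference(flaa_value, flic_value):
    """Return the FlaA - FliC difference as a percentage of the FliC value."""
    return 100.0 * (flaa_value - flic_value) / flic_value


def sasa_comparison(sasa=SASA):
    """Return {category: (absolute difference, rounded percent difference)}."""
    result = {}
    for category in ("total", "polar", "nonpolar"):
        flaa = sasa["FlaA"][category]
        flic = sasa["FliC"][category]
        result[category] = (flaa - flic, round(percent_difference(flaa, flic)))
    return result


if __name__ == "__main__":
    for category, (delta, pct) in sasa_comparison().items():
        print(f"{category:>8}: {delta:+5d} A^2 ({pct:+d}%)")
```

Expressing each difference relative to the FliC value matches the way the text reports the changes (FlaA compared against the mesophilic reference).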

A surface charge spectrum was constructed for each protein using a DelPhi grid to visualize the relative charge distributions on the surface of the proteins (Fig. 3.2c, d). The D0 and D1 domains of the FlaA model have a higher positive charge distribution on the surface. This is because the number of negatively charged residues in the N-terminal D0 and D1 domains is the same for both FlaA and FliC, whereas FlaA has 6 more positively charged residues than FliC in these regions. The C-terminal D0 and D1 domains of FlaA have 13 negatively charged and 9 positively charged residues, while those of FliC have 8 negatively charged and 7 positively charged residues. The D0 and D1 α-helical coiled-coil domains interact with other subunits and are thought to stabilize the fiber through these inter-subunit interactions 24. The increase in positively charged residues in the D0 and D1 domains of FlaA could be significant for increasing the stability of the fiber through the formation of inter-subunit salt bridges.
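Residue counts of this kind can be tallied directly from a one-letter amino acid sequence. The following is a minimal sketch, assuming that Asp and Glu are counted as negatively charged and Arg, Lys and His as positively charged (the residue classes named elsewhere in this chapter); the function name and the short peptide are hypothetical and are not the FlaA or FliC sequences.

```python
NEGATIVE = set("DE")   # Asp, Glu (assumed negatively charged)
POSITIVE = set("RKH")  # Arg, Lys, His (assumed positively charged)


def count_charged(sequence, start=1, end=None):
    """Count (negative, positive) residues in positions start..end (1-based, inclusive)."""
    region = sequence[start - 1:end]
    negative = sum(residue in NEGATIVE for residue in region)
    positive = sum(residue in POSITIVE for residue in region)
    return negative, positive


if __name__ == "__main__":
    toy_peptide = "MKDELRHAGK"  # hypothetical sequence, for illustration only
    print(count_charged(toy_peptide))        # whole peptide
    print(count_charged(toy_peptide, 3, 6))  # residues 3-6 only
```

Applying the same function to the N-terminal and C-terminal residue ranges listed in the footnotes of Table 3.4 would give per-region counts comparable to those discussed above.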

Yonekura et al. 24 have shown that most of the intersubunit interactions found in the outer tube of the S. typhimurium flagellar filament are polar-polar or charge-polar and that contributions of hydrophobic interactions are relatively small, whereas those found within the inner tube and between the inner and outer tubes are mostly hydrophobic, contributing to the high stability of the filament structure. A pentamer of the FlaA subunit (Figure 3.6) was constructed based on the FliC heptamer coordinates provided by the authors of the complete FliC structure 24, Dr. Koji Yonekura (UCSF, San Francisco) and Dr. Keiichi Namba (Osaka University).

The multimer was visualized in Swiss-PdbViewer 60; 61. The subunits were numbered according to the scheme shown in Figure 3.6a. The subunit numbered 1 and rendered as a blue ribbon is used as a reference for analyzing the intersubunit interactions. Subunits 2, 3, 4 and 5 form the most extensive interactions with subunit 1 along each direction of the major helical array and are rendered in different ribbon colors. The lateral interactions of subunit 1 with subunits 2 and 3 can be seen in Figure 3.6b. As noted earlier, there is an increase in hydrophobic and charged residues at the expense of polar residues in the N- and C-terminal regions of FlaA. The predominant lateral interactions between the D0 and D1 domains of subunits 1, 2 and 3 appear to be hydrophobic in nature. Many hydrophobic residues are exposed on the surface of the D0 domain and, to a lesser extent, on the D1 domain. Figure 3.6c shows the interactions of subunit 1 with subunits 4 and 5, which lie immediately above and below subunit 1 in the protofilament.

Figure 3.6 Structure of a flagellin pentamer. a) Stereo diagram of 5 FlaA subunits, viewed from inside of the flagellar filament. The subunit numbered 1 and rendered as a blue ribbon structure is used as a reference for analyzing the intersubunit interactions. Subunits 2, 3, 4 and 5 form the most extensive interactions with subunit 1 in the filament and are rendered in different ribbon colors. b) Interaction of subunit 1 with subunits 2 and 3, which form its immediate neighbors in the filament. The molecular surface of each subunit is shown. Blue represents positive potential, red negative potential and white zero charge potential. c) Interaction of subunit 1 with subunits 4 and 5, which lie immediately above and below subunit 1 in the flagellar protofilament. These images were generated with Swiss-PdbViewer.


In this case both subunits show a mixture of hydrophobic and polar-charged interactions. Our prediction is that the FlaA subunits in the filaments are tightly packed due to stronger hydrophobic interactions around the D0 and D1 domains, which make them rigid at room temperature and stable at higher temperatures.

Salt bridges (ion pairs) were defined using the distance criterion that a salt bridge exists if two oppositely charged atoms of adjacent side chains are located closer than 6 Å. Only the amino acids Arg, Lys, His, Asp and Glu were considered for salt bridge calculations. Based on these criteria, FlaA has a total of 15 salt bridges, of which 12 are located in the α-helical regions of the D0 and D1 domains. In contrast, FliC has 24 salt bridges, of which 9 are in the α-helical regions. These results suggest that intramolecular ion pairs do not play an important role in the increased thermostability of the FlaA monomer relative to mesophilic FliC, in agreement with the previous results of the SASA calculations of polar surface area. However, these results do not rule out a role for intermolecular ion pairs between FlaA monomers in stabilizing the polymeric flagellar fiber; as noted above, the D0 and D1 domains have a positively charged region. We also visually inspected the structures for like-charged groups within the 6 Å cutoff distance. FliC has 3 potentially destabilizing interactions involving like-charged groups, whereas FlaA has 10 such interactions. This result does not agree with the previously stated theory that minimizing the number of repulsive contacts between like-charged residues will increase the thermal stability of proteins 62.
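The distance criterion described above can be sketched in a few lines. This is a simplified illustration rather than the procedure actually used for FlaA and FliC: it assumes one representative charged atom per residue, treats His as positively charged, and uses invented coordinates; the function name and input layout are hypothetical.

```python
from itertools import combinations
from math import dist  # available in Python 3.8 and later

# Formal charge sign assumed for each residue type considered in the analysis.
CHARGE = {"ARG": +1, "LYS": +1, "HIS": +1, "ASP": -1, "GLU": -1}
CUTOFF = 6.0  # angstroms


def classify_charge_pairs(atoms, cutoff=CUTOFF):
    """Split residue pairs closer than the cutoff into salt bridges and like-charge pairs.

    atoms: list of (residue_name, residue_number, (x, y, z)) tuples, with one
    representative charged side-chain atom per residue (a simplifying assumption).
    """
    salt_bridges, like_pairs = [], []
    for (name_a, num_a, xyz_a), (name_b, num_b, xyz_b) in combinations(atoms, 2):
        if dist(xyz_a, xyz_b) >= cutoff:
            continue  # not "closer than" the cutoff
        if CHARGE[name_a] * CHARGE[name_b] < 0:
            salt_bridges.append((num_a, num_b))  # opposite charges
        else:
            like_pairs.append((num_a, num_b))    # potentially repulsive contact
    return salt_bridges, like_pairs


if __name__ == "__main__":
    # Invented coordinates, for illustration only.
    toy_atoms = [
        ("ASP", 10, (0.0, 0.0, 0.0)),
        ("LYS", 14, (3.0, 4.0, 0.0)),
        ("GLU", 20, (0.0, 0.0, 5.0)),
        ("ARG", 50, (30.0, 0.0, 0.0)),
    ]
    bridges, repulsive = classify_charge_pairs(toy_atoms)
    print("salt bridges:", bridges)
    print("like-charge contacts:", repulsive)
```

A fuller treatment would measure distances between all charged side-chain atoms of each residue pair and would read the coordinates from the model's PDB file; the single-atom simplification is used here only to keep the example short.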

We visually inspected the charge dipoles of the FliC and FlaA α-helices to determine whether any of them are further stabilized by the presence of negatively charged (Asp and Glu) and positively charged (Arg, Lys and His) residues at the N- and C-terminal ends. Neither the FliC nor the FlaA α-helices showed any preferential occurrence of these charged residues. FliC has four potential N-capping motifs and four actual C-capping boxes. FlaA has five potential N-capping boxes and two actual C-capping boxes. Thus, there is little apparent difference in the contribution of helix dipole stabilization factors in FliC and FlaA.

As previously noted, protein oligomerization is another means of increasing the thermostability of proteins, and flagellin functions in vivo by forming extended “20,000-mer” helical fibers. It has only recently been observed that the terminal FliD chaperonin cap complex contributes to the thermal stability of S. typhimurium flagellar filaments 97; thus, this cap complex represents an additional potential stabilizing factor in thermostable Aquifex sp. flagella that may not be directly encoded in the FlaA protein itself.

Conclusions

In this work, a 3-D structure of the A. pyrophilus FlaA protein was modeled with reasonable accuracy based on the crystal structure of S. typhimurium FliC. The overall secondary, tertiary and quaternary structure comparisons of FlaA and FliC indicate that electrostatic interactions do not play a major role in FlaA monomer thermostability, in contrast to some other thermophilic proteins. However, this does not preclude increased strength and numbers of intermolecular electrostatic interactions in the oligomeric flagellar fiber. The predominant stabilizing force appears to be the hydrophobic effect, which leads to tighter packing of the fiber and its resulting rigidity.

References

1. Armitage, J. P. (1992). Bacterial motility and chemotaxis. Sci Prog 76, 451-77.
2. Bardy, S. L., Ng, S. Y. & Jarrell, K. F. (2003). Prokaryotic motility structures. Microbiology 149, 295-304.
3. Berry, R. M. & Armitage, J. P. (1999). The bacterial flagella motor. Adv Microb Physiol 41, 291-337.
4. Medina, A. L. (2004). Bacterial and archaeal flagella as prokaryotic motility organelles. Biochemistry (Mosc) 69, 1203-12.
5. Moens, S. & Vanderleyden, J. (1996). Functions of bacterial flagella. Crit Rev Microbiol 22, 67-100.
6. Berg, H. C. (2003). The rotary motor of bacterial flagella. Annu Rev Biochem 72, 19-54.
7. DeRosier, D. J. (1998). The turn of the screw: the bacterial flagellar motor. Cell 93, 17-20.
8. Jones, C. J. & Aizawa, S. (1991). The bacterial flagellum and flagellar motor: structure, assembly and function. Adv Microb Physiol 32, 109-72.
9. Macnab, R. M. (2003). How bacteria assemble flagella. Annu Rev Microbiol 57, 77-100.
10. Macnab, R. M. (2004). Type III flagellar protein export and flagellar assembly. Biochim Biophys Acta 1694, 207-17.
11. Samatey, F. A., Imada, K., Nagashima, S., Vonderviszt, F., Kumasaka, T., Yamamoto, M. & Namba, K. (2001). Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331-7.
12. Flores, H., Lobaton, E., Mendez-Diez, S., Tlupova, S. & Cortez, R. (2005). A study of bacterial flagellar bundling. Bull Math Biol 67, 137-68.

13. Ozin, A. J., Claret, L., Auvray, F. & Hughes, C. (2003). The FliS chaperone selectively binds the disordered flagellin C-terminal D0 domain central to polymerisation. FEMS Microbiol Lett 219, 219-24.
14. Blocker, A., Komoriya, K. & Aizawa, S. (2003). Type III secretion systems and bacterial flagella: insights into their function from structural similarities. Proc Natl Acad Sci U S A 100, 3027-30.
15. Minamino, T. & Namba, K. (2004). Self-assembly and type III protein export of the bacterial flagellum. J Mol Microbiol Biotechnol 7, 5-17.
16. Ikeda, T., Oosawa, K. & Hotani, H. (1996). Self-assembly of the filament capping protein, FliD, of bacterial flagella into an annular structure. J Mol Biol 259, 679-86.
17. Maki-Yonekura, S., Yonekura, K. & Namba, K. (2003). Domain movements of HAP2 in the cap-filament complex formation and growth process of the bacterial flagellum. Proc Natl Acad Sci U S A 100, 15528-33.
18. Maki, S., Vonderviszt, F., Furukawa, Y., Imada, K. & Namba, K. (1998). Plugging interactions of HAP2 pentamer into the distal end of flagellar filament revealed by electron microscopy. J Mol Biol 277, 771-7.
19. Furukawa, Y., Imada, K., Vonderviszt, F., Matsunami, H., Sano, K., Kutsukake, K. & Namba, K. (2002). Interactions between bacterial flagellar axial proteins in their monomeric state in solution. J Mol Biol 318, 889-900.
20. Huber, R., Wilharm, T., Huber, D., Trincone, A., Burggraf, S., Konig, H., Rachel, R., Rockinger, I., Fricke, H. & Stetter, K. O. (1992). Syst Appl Microbiol 15, 340-351.
21. Behammer, W., Shao, Z., Mages, W., Rachel, R., Stetter, K. O. & Schmitt, R. (1995). Flagellar structure and hyperthermophily: analysis of a single flagellin gene and its product in Aquifex pyrophilus. J Bacteriol 177, 6630-7.
22. Asakura, S. (1970). Polymerization of flagellin and polymorphism of flagella. Adv Biophys 1, 99-155.
23. Vonderviszt, F., Kanto, S., Aizawa, S. & Namba, K. (1989). J Mol Biol 209, 127-33.
24. Yonekura, K., Maki-Yonekura, S. & Namba, K. (2003). Complete atomic model of the bacterial flagellar filament by electron cryomicroscopy. Nature 424, 643-50.
25. Murthy, K. G., Deb, A., Goonesekera, S., Szabo, C. & Salzman, A. L. (2004). Identification of conserved domains in Salmonella muenchen flagellin that are essential for its ability to activate TLR5 and to induce an inflammatory response in vitro. J Biol Chem 279, 5667-75.
26. Kondoh, H. & Hotani, H. (1974). Flagellin from Escherichia coli K-12: polymerization and molecular weight comparison with Salmonella flagellins. Biochim Biophys Acta 336, 117-139.
27. Joys, T. M. (1985). The covalent structure of the phase-1 flagellar filament protein of Salmonella typhimurium and its comparison with other flagellins. J Biol Chem 260, 15758-61.

28. Lagenaur, C. & Agabian, N. (1976). Physical characterization of Caulobacter crescentus flagella. J Bacteriol 128, 435-44.
29. Joys, T. M. (1988). The flagellar filament protein. Can J Microbiol 34, 452-8.
30. Andrade, A., Giron, J. A., Amhaz, J. M., Trabulsi, L. R. & Martinez, M. B. (2002). Expression and characterization of flagella in nonmotile enteroinvasive Escherichia coli isolated from diarrhea cases. Infect Immun 70, 5882-6.
31. Morgan, D. G. & Khan, S. (2001). Bacterial Flagella. Nature ELS, 1-8.
32. Kuwajima, G. (1988). Construction of a minimum-size functional flagellin of Escherichia coli. J Bacteriol 170, 3305-3309.
33. Martin, C. R. & Kohli, P. (2003). The emerging field of nanotube biotechnology. Nature Reviews Drug Discovery 2, 29-37.
34. Lu, Z., Murray, K. S., Van Cleave, V., LaVallie, E. R., Stahl, M. L. & McCoy, J. M. (1995). Expression of thioredoxin random peptide libraries on the Escherichia coli cell surface as functional fusions to flagellin: a system designed for exploring protein-protein interactions. Biotechnology (N Y) 13, 366-72.
35. Westerlund-Wikstrom, B. (2000). Peptide display on bacterial flagella: principles and applications. Int J Med Microbiol 290, 223-30.
36. Lu, Z., Tripp, B. C. & McCoy, J. M. (1998). Displaying libraries of conformationally constrained peptides on the surface of Escherichia coli as flagellin fusions. Methods Mol Biol 87, 265-80.
37. Tripp, B. C., Lu, Z., Bourque, K., Sookdeo, H. & McCoy, J. M. (2001). Investigation of the 'switch-epitope' concept with random peptide libraries displayed as thioredoxin loop fusions. Protein Eng 14, 367-77.
38. Vieille, C. & Zeikus, G. J. (2001). Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability. Microbiol Mol Biol Rev 65, 1-43.
39. Sali, A. & Blundell, T. L. (1993). Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol 234, 779-815.
40. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst 26, 283-291.
41. Bairoch, A., Apweiler, R., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M. J., Natale, D. A., O'Donovan, C., Redaschi, N. & Yeh, L. S. (2005). The Universal Protein Resource (UniProt). Nucleic Acids Res 33, D154-9.
42. Bairoch, A., Boeckmann, B., Ferro, S. & Gasteiger, E. (2004). Swiss-Prot: juggling between evolution and stability. Brief Bioinform 5, 39-55.
43. Apweiler, R., Bairoch, A., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M. J., Natale, D. A., O'Donovan, C., Redaschi, N. & Yeh, L. S. (2004). UniProt: the Universal Protein knowledgebase. Nucleic Acids Res 32, D115-9.

44. Apweiler, R., Bairoch, A. & Wu, C. H. (2004). Protein sequence databases. Curr Opin Chem Biol 8, 76-80.
45. Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M. C., Estreicher, A., Gasteiger, E., Martin, M. J., Michoud, K., O'Donovan, C., Phan, I., Pilbout, S. & Schneider, M. (2003). The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 31, 365-70.
46. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res 28, 235-42.
47. Deshpande, N., Addess, K. J., Bluhm, W. F., Merino-Ott, J. C., Townsend-Merino, W., Zhang, Q., Knezevich, C., Xie, L., Chen, L., Feng, Z., Green, R. K., Flippen-Anderson, J. L., Westbrook, J., Berman, H. M. & Bourne, P. E. (2005). Nucleic Acids Res 33, D233-7.
48. Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M. R., Appel, R. D. & Bairoch, A. (2005). Protein Identification and Analysis Tools on the ExPASy Server. In The Proteomics Protocols Handbook (Walker, J., ed.), pp. 571-607. Humana Press, Totowa, NJ.
49. Gill, S. C. & von Hippel, P. H. (1989). Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem 182, 319-26.
50. Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R. D. & Bairoch, A. (2003). ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res 31, 3784-8.
51. Kyte, J. & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J Mol Biol 157, 105-32.
52. Bhaskaran, R. & Ponnuswamy, P. K. (1988). Positional flexibilities of amino acid residues in globular proteins. Int J Peptide Protein Res 32, 241-255.
53. Linding, R., Jensen, L. J., Diella, F., Bork, P., Gibson, T. J. & Russell, R. B. (2003). Protein disorder prediction: implications for structural proteomics. Structure (Camb) 11, 1453-9.
54. Linding, R., Russell, R. B., Neduva, V. & Gibson, T. J. (2003). GlobPlot: Exploring protein sequences for globularity and disorder. Nucleic Acids Res 31, 3701-8.
55. Eswar, N., John, B., Mirkovic, N., Fiser, A., Ilyin, V. A., Pieper, U., Stuart, A. C., Marti-Renom, M. A., Madhusudhan, M. S., Yerkovich, B. & Sali, A. (2003). Tools for comparative protein structure modeling and analysis. Nucleic Acids Res 31, 3375-80.
56. Marti-Renom, M. A., Stuart, A. C., Fiser, A., Sanchez, R., Melo, F. & Sali, A. (2000). Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct 29, 291-325.
57. Shrake, A. & Rupley, J. A. (1973). Environment and exposure to solvent of protein atoms. Lysozyme and insulin. J Mol Biol 79, 351-71.
58. Honig, B. & Nicholls, A. (1995). Classical electrostatics in biology and chemistry. Science 268, 1144-9.

59. Rocchia, W., Sridharan, S., Nicholls, A., Alexov, E., Chiabrera, A. & Honig, B. (2002). Rapid grid-based construction of the molecular surface and the use of induced surface charge to calculate reaction field energies: applications to the molecular systems and geometric objects. J Comput Chem 23, 128-37.
60. Kaplan, W. & Littlejohn, T. G. (2001). Swiss-PDB Viewer (Deep View). Brief Bioinform 2, 195-7.
61. Guex, N. & Peitsch, M. C. (1997). SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis 18, 2714-23.
62. Karshikoff, A. & Ladenstein, R. (2001). Ion pairs and the thermotolerance of proteins from hyperthermophiles: a "traffic rule" for hot roads. Trends Biochem Sci 26, 550-6.
63. Ptitsyn, O. B. (1969). Statistical analysis of the distribution of amino acid residues among helical and non-helical regions in globular proteins. J Mol Biol 42, 501-10.
64. Richardson, J. S. & Richardson, D. C. (1988). Amino acid preferences for specific locations at the ends of alpha helices. Science 240, 1648-52.
65. Chou, P. Y. & Fasman, G. D. (1978). Prediction of the secondary structure of proteins from their amino acid sequence. Adv Enzymol Relat Areas Mol Biol 47, 45-148.
66. Shoemaker, K. R., Kim, P. S., York, E. J., Stewart, J. M. & Baldwin, R. L. (1987). Tests of the helix dipole model for stabilization of alpha-helices. Nature 326, 563-7.
67. Harper, E. T. & Rose, G. D. (1993). Helix stop signals in proteins and peptides: the capping box. Biochemistry 32, 7605-9.
68. Petsko, G. A. (2001). Structural basis of thermostability in hyperthermophilic proteins, or "there's more than one way to skin a cat". Methods Enzymol 334, 469-78.
69. Scandurra, R., Consalvi, V., Chiaraluce, R., Politi, L. & Engel, P. C. (1998). Protein thermostability in extremophiles. Biochimie 80, 933-41.
70. Scandurra, R., Consalvi, V., Chiaraluce, R., Politi, L. & Engel, P. C. (2000). Protein stability in extremophilic archaea. Front Biosci 5, D787-95.
71. Fields, P. A. (2001). Review: Protein function at thermal extremes: balancing stability and flexibility. Comp Biochem Physiol A Mol Integr Physiol 129, 417-31.
72. Sterner, R. & Liebl, W. (2001). Thermophilic adaptation of proteins. Crit Rev Biochem Mol Biol 36, 39-106.
73. van den Burg, B. & Eijsink, V. G. (2002). Selection of mutations for increased protein stability. Curr Opin Biotechnol 13, 333-7.
74. Kumar, S. & Nussinov, R. (2001). How do thermophilic proteins deal with heat? Cell Mol Life Sci 58, 1216-33.
75. Sriprapundh, D., Vieille, C. & Zeikus, J. G. (2000). Molecular determinants of xylose isomerase thermal stability and activity: analysis of thermozymes by site-directed mutagenesis. Protein Eng 13, 259-65.

76. Panasik, N., Brenchley, J. E. & Farber, G. K. (2000). Distributions of structural features contributing to thermostability in mesophilic and thermophilic alpha/beta barrel glycosyl hydrolases. Biochim Biophys Acta 1543, 189-201.
77. Farias, S. T. & Bonato, M. C. (2003). Preferred amino acids and thermostability. Genet Mol Res 2, 383-93.
78. Cambillau, C. & Claverie, J. M. (2000). Structural and genomic correlates of hyperthermostability. J Biol Chem 275, 32383-6.
79. Szilagyi, A. & Zavodszky, P. (2000). Structural differences between mesophilic, moderately thermophilic and extremely thermophilic protein subunits: results of a comprehensive survey. Structure Fold Des 8, 493-504.
80. Chakravarty, S. & Varadarajan, R. (2002). Elucidation of factors responsible for enhanced thermal stability of proteins: a structural genomics based study. Biochemistry 41, 8152-61.
81. Dams, T., Auerbach, G., Bader, G., Jacob, U., Ploom, T., Huber, R. & Jaenicke, R. (2000). The crystal structure of dihydrofolate reductase from Thermotoga maritima: molecular features of thermostability. J Mol Biol 297, 659-72.
82. Goodsell, D. S. & Olson, A. J. (2000). Structural symmetry and protein function. Annu Rev Biophys Biomol Struct 29, 105-53.
83. Zhou, H. X. & Dill, K. A. (2001). Stabilization of proteins in confined spaces. Biochemistry 40, 11289-93.
84. Asakura, S. & Iino, T. (1972). Polymorphism of Salmonella flagella as investigated by means of in vitro copolymerization of flagellins derived from various strains. J Mol Biol 64, 251-68.
85. Kamiya, R., Asakura, S. & Yamaguchi, S. (1980). Formation of helical filaments by copolymerization of two types of 'straight' flagellins. Nature 286, 628-30.
86. Lawn, A. M. (1977). Comparison of the flagellins from different flagellar morphotypes of Escherichia coli. J Gen Microbiol 101, 112-30.
87. Karplus, P. A. & Schulz, G. E. (1985). Prediction of chain flexibility in proteins. Naturwissenschaften 72, 212-213.
88. Vihinen, M. (1987). Relationship of protein flexibility to thermostability. Protein Eng 1, 477-80.
89. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577-637.
90. Schwede, T., Kopp, J., Guex, N. & Peitsch, M. C. (2003). SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res 31, 3381-5.
91. Rohl, C. A., Strauss, C. E., Chivian, D. & Baker, D. (2004). Modeling structurally variable regions in homologous proteins with rosetta. Proteins 55, 656-77.

92. Rohl, C. A., Strauss, C. E., Misura, K. M. & Baker, D. (2004). Protein structure prediction using Rosetta. Methods Enzymol 383, 66-93.
93. Kim, D. E., Chivian, D. & Baker, D. (2004). Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res 32, W526-31.
94. Honda, S., Uedaira, H., Vonderviszt, F., Kidokoro, S. & Namba, K. (1999). Folding energetics of a multidomain protein, flagellin. J Mol Biol 293, 719-32.
95. Dill, K. A. (1990). Dominant forces in protein folding. Biochemistry 29, 7133-55.
96. Zavodszky, P., Kardos, J., Svingor & Petsko, G. A. (1998). Adjustment of conformational flexibility is a key event in the thermal adaptation of proteins. Proc Natl Acad Sci U S A 95, 7406-11.
97. Dioszeghy, Z., Zavodszky, P., Namba, K. & Vonderviszt, F. (2004). Stabilization of flagellar filaments by HAP2 capping. FEBS Lett 568, 105-9.

CHAPTER IV

INVESTIGATION OF INSERTION OF GREEN FLUORESCENT PROTEIN IN SALMONELLA TYPHIMURIUM FLAGELLIN PROTEIN TO CREATE A FUNCTIONAL BIONANOTUBE

Introduction

Over the last decade several bacterial surface proteins have been used to display

foreign peptides as fusion proteins. Flagella display, similar to phage display, has

evolved as an alternative method of peptide display, in part due to the ease of

purification and tolerance for insertion of foreign peptides 1; 2; 3; 4; 5; 6. Flagella display

is based on the genetic fusion of foreign loop peptides into the surface-exposed, non-essential “hypervariable” central region of flagellin, which is the major subunit present in thousands of copies per flagellar filament. Previous reports have demonstrated the utility of genetically inserted fusion peptides and small proteins displayed on the immunologically reactive, solvent accessible, middle domain of mesophilic E. coli flagellin.

Expression of these constructs in flagellin-deficient host strains results in

hybrid flagella carrying the heterologous peptides in thousands of copies displayed in

a regular array on the flagella surface. This approach has been successfully used for

expression of foreign peptides/proteins to be displayed on flagella. A versatile variant

of flagellar display is the hybrid display system created by the insertion of the entire

E. coli thioredoxin gene into the central region of flagellin 5; 1. Peptide libraries have been genetically introduced into a disulphide loop of thioredoxin, creating a

conformationally constrained library that is readily accessible on the flagellar surface

1. This FliTrx random peptide display library system has been used with great success

for epitope-mapping purposes 5; 1. Direct flagella display has also proven to be applicable in bacterial adhesion technology, since large fragments, up to 302

amino acid residues in length, of bacterial adhesins can be functionally expressed as

fusions to flagellin 8. Several studies indicate that tolerance to internal domain

fusions may be a general property of many proteins 9; 10; 11; 12 and represents another

means by which protein structures may evolve 13. A number of foreign peptides have

been successfully inserted into the hypervariable region of flagellin 1>14'21j including

simultaneous insertion of 115 and 302 amino acid peptides 14. These engineered

hybrid peptide-flagellin proteins are most frequently used as immunological reagents

to generate vaccine antibodies against the inserted peptides when injected into

animals 15,16,17. Only two known reports describe the insertion of full-length proteins

into flagellin, the FliTrx system first described in 1995 18 and a second 1998

publication 19. Hybrid flagella are easily purified and can easily be analyzed for

binding to various targets, such as immobilized proteins, tissue sections, as well as

cell cultures 1; 8. To explore the feasibility of using flagella as a bio-nanotube,

we randomly inserted the Green Fluorescent Protein (GFP) into flagellin.

GFP is one of the most widely used biological reporter proteins today. The

wild-type GFP was isolated from the jellyfish Aequorea victoria, which dwells in

the Pacific Ocean off the North American coast 20; 21. Its biological role is to

transduce, by energy transfer, the blue chemiluminescence of another protein,

aequorin, into green fluorescent light 21. GFP-like proteins with high sequence

homology to GFP have also been identified in several Anthozoans (corals) 22; 23.

These proteins are homologous to GFP and the spectral varieties are due to minor

variations in the chromophore and surrounding region in the folded protein 24. Since

its initial cloning and expression, GFP has revolutionized several fields of biology.

GFP is fusion tolerant on both its N- and C-termini and has been used as a fusion tag

in various applications including protein targeting 25; 26, plant cellular transport 27,

Drosophila cellular studies 28, gene expression studies in bacteria 29, Fluorescence

Resonance Energy Transfer (FRET) 26; 30, Bioluminescence Resonance Energy

Transfer (BRET)31, Fluorobodies 32, etc. GFP has also been used as a biosensor in

numerous applications by mutating the residues in the vicinity of the chromophore 33; 34; 35.

GFP is a protein with 238 residues that forms a “beta-can” motif structure, as

described by Yang et al. 36. GFP forms a monomer in solution, although other

fluorescent proteins have been identified in other related organisms (corals) that are

present as stable dimers and tetramers. The crystal structure of the protein

(PDB:1GFL) shows a cylindrical structure formed by 11 beta strands enclosing an

alpha helix at the center and some short helices at the periphery (Figure 4.1). The

functional fluorophore is formed at the central α-helix by the three residues Ser65-

Tyr66-Gly67. The fluorophore is generated by an auto-cyclization process that requires no

co-factors or other components. It occurs by a rapid cyclization of Ser65 and Gly67

followed by a slower oxidation of Tyr66 on a timescale of hours 21. The Ser65 and

Gly67 form a 4-(p-hydroxybenzylidene)imidazolidin-5-one structure. The

glycine residue plays an important role in the formation of the fluorophore and

cannot be replaced by any other residue 21; 37. The deprotonated phenolate form of

residue Tyr66 provides the major contribution to the fluorescence of GFP36. The GFP

has a major excitation peak at a wavelength of 395 nm and a minor one at 475 nm,

with extinction coefficients of roughly 30,000 and 7,000 M⁻¹ cm⁻¹, respectively. The

emission peak is at 509 nm resulting in the observation of green light. Mutational

studies have shown that the N- and C- termini are important for the formation of a

functional fluorophore and the minimal domain length required is from the residues

7-229 38; truncation of more than 7 amino acid residues from the C-terminus or more

than the N-terminal Met residue leads to total loss of fluorescence.

Figure 4.1 Crystal structure of a dimeric GFP (PDB:1GFL). a) The crystal structure of the protein shows a cylindrical structure formed by 11 β-strands enclosing an α-helix at the center and some short helices at the periphery. b) The functional fluorophore is formed at the central α-helix by Ser65-Tyr66-Gly67.

GFP is a conformationally rigid protein due to the strong interactions between its

secondary structures. The fluorophore residues have been mutated to produce various

spectral and enhanced variants of GFP 21; 39; 40.

It has been demonstrated that GFP protein insertion in various proteins

preserves the function of the host protein. Studies have also shown that GFP will fold

and form a fluorophore when inserted into virtually any domain of another protein

and is relatively indifferent to N- or C- terminal fusions 41; 42;43. A complementation

system was recently developed in which an N-terminal fragment of the Enhanced

GFP (EGFP), a weak fluorophore, becomes brightly fluorescent upon

complementation with the corresponding small, C-terminal EGFP fragment44; 45. The

ability of the EGFP system to respond quickly to DNA hybridization is useful for

detecting the pairwise interactions of oligonucleotides in vitro and in vivo 45.

Apart from its natural fluorescence, GFP has been found to be functional in

almost all cell types 21. The protein does not require an extensive post-translational

modification to fluoresce, hence it can be used in various cell types including

prokaryotic systems 21. With these desirable characteristics, GFP serves as an

excellent protein with which to test the insertion tolerance of the

flagellin protein. We have used a GFP transposon system created by Sheridan et al. to

randomly insert the protein into a Salmonella typhimurium flagellin gene. We used a

wild-type flagellin gene and a flagellin internal deletion variant, pTH890-C12 46

(constructed as described in the previous chapter) for the creation of fusion proteins.

Preliminary experiments with this system have failed to yield any functional GFP-

flagellin fusion variants. It is not yet known if GFP can be exported when inserted

into flagellin; the high stability, low flexibility and larger physical size of GFP may

prevent its export as a fusion protein. One study suggests that the type III secretion

process can unfold previously folded proteins during the secretion process 47 and

another recent report describes the secretion of GFP using a modified flagellar type

III secretion apparatus 48. Wild-type mouse dihydrofolate reductase (DHFR; 21.6

kDa) was not exported by another type III system 49, although a destabilized variant

of DHFR was 50.

Materials and Methods

Bacterial strains

E. coli XL-1 Blue electrocompetent cells (Stratagene, La Jolla, CA) were used

for the preparation of plasmid libraries of fliC internal deletion variants. These cells

have a transformation efficiency greater than 1 × 10¹⁰ transformants/µg of DNA, are

tetracycline resistant, are endonuclease (endA⁻) deficient, which greatly improves the

quality of miniprep DNA, and are recombination (recA⁻) deficient, which improves insert

stability.

The S. typhimurium strains used for the experiments were kindly provided by

the late Dr. Robert Macnab (Yale University). S. typhimurium strain SJW1103 51; 52; 53,

a derivative of serovar Typhimurium LT2 that can only express the phase 1 fliC

flagellin (i.e., fliC stable), was used as a wild-type flagellar motility control for the

studies of modified flagellin proteins and motility assays in solution and on agar

plates. S. typhimurium strain SJW134 (ΔfliC and ΔfljB), which was derived from

parent strain SJW806, is wild-type except for the deletion of the phase 1 fliC and

phase 2 fljB flagellin genes 54; 55. This strain is non-motile unless a functional flagellin

gene is introduced, e.g., on an expression plasmid 54. Thus, it serves as an ideal strain

to screen engineered or mutated fliC genes, present in the introduced plasmids, for

functional flagellin protein expression, folding, export and flagellar fiber assembly, as

indicated by a simple agar microbial motility assay. S. typhimurium strain JR501, a

stable restriction-deficient (r⁻), methylation-proficient (m⁺) galE strain 56; 57, was used

to convert E. coli grown plasmids to S. typhimurium compatibility. Electrocompetent

cells of S. typhimurium strains JR501 and SJW134 were prepared using standard lab

protocols 58; 59.

fliC and GFP transposon plasmids

The pTrc series of plasmids were constructed using plasmid pKK233-2, for the

regulated expression of genes in Escherichia coli 60; 61. These vectors carry a strong

hybrid trp/lac, isopropyl-β-D-thiogalactopyranoside (IPTG) inducible promoter, the

lacZ ribosome-binding site (RBS), the multiple cloning site of pUC18 and the rrnB

transcription terminators. The pTrc plasmids have been frequently used for cloning

purposes. The pTH890 plasmid is a derivative of pTrc99A plasmid with the S.

typhimurium fliC phase 1 flagellin gene cloned into the XbaI/HindIII site (Figure 4.2).

The pTH890 plasmid was kindly provided by the late Dr. Robert Macnab (Yale

University).

The Enhanced GFP (EGFP) transposon plasmids were kindly provided by Dr.

Douglas L. Sheridan (Yale University). The EGFP-pBNJ11.7 transposon plasmids

(Figure 4.3) were constructed using different spectral variants (Green, Cyan, Yellow

and Venus), as in the expression plasmid 41; 62. The GFP gene has been

incorporated into an EZ::TN™ transposon system, which also carries a kanamycin-

resistance gene (KanR) at the 3’ end of the GFP gene for selection of the inserts. The KanR

gene is flanked by SrfI restriction endonuclease sites so that it can be removed after the

selection of the inserts.

Transposase reaction

The transposons with gfp and KanR gene inserts were amplified from their host

plasmids by performing PCR with a single 19 bp long oligonucleotide primer

complementary to the Tn5 ME sequence (5'-CTGTCTCTTATACACATCT-3'). The reaction contained 1 µl

PfuTurbo® DNA Polymerase, 1 µg of pBNJ11.7 plasmid DNA, 5 µl 10x PCR buffer,

1 µl of 2.5 mM dNTP mix, 2.5 µl of 100 ng/µl primer and 36 µl of DI H2O. The

PCR product was purified using a QIAquick PCR Purification Kit (QIAGEN Inc.,

Valencia, CA). The transposition reaction was performed in vitro (1 µl of 200 ng/µl

pTH890 DNA, 1 µl of 200 ng/µl GFP transposon, 1 µl of 10x EZ::TN™ buffer, 1 µl

of EZ::TN™ Transposase enzyme and 6 µl of DI H2O) using an EZ::TN™

transposase enzyme (Epicentre Biotechnologies, Madison, WI).

Figure 4.2 A map of the pTH890 plasmid (5.8 kb). The pTH890 plasmid for expression of S. typhimurium flagellin fliC was constructed by inserting the fliC gene into the pTrc99A plasmid.

Figure 4.3 A map of the pBNJ11.7 plasmid (5.4 kb). Plasmid pBNJ11.7 carries the Enhanced Green Fluorescent Protein (EGFP) gene along with a kanamycin resistance gene. The EGFP and KanR genes are flanked by Tn5 ME transposon sequences.

The transposase reaction mixture was incubated at 37 °C for 2 hrs and stopped by

heating it at 70 °C for 15 minutes. Electrocompetent XL-1 blue Gold E. coli

(Stratagene, La Jolla, CA) were transformed with 0.5 µL of the transposition reaction

and plated on LB agar with ampicillin (100 µg/mL) and kanamycin (50 µg/mL).

Removal of XmaI restriction site from pTH890 plasmid

Removing the KanR gene from the transposon with the 8-cutter SrfI restriction enzyme, after

identification of the mutants, proved to be very expensive. Hence, we used

XmaI endonuclease, a relatively inexpensive enzyme. SrfI is a type II restriction

enzyme isolated from a Streptomyces species with an 8 bp long recognition sequence

(5'-GCCCGGGC) that produces blunt ends 63. The XmaI restriction endonuclease has

a 6 bp cut site (5'-CCCGGG) that falls within the SrfI site. Analysis of the pTH890

plasmid sequence shows that there is only one XmaI restriction site (at bp 286),

which does not fall in any of the open reading frames. Hence, we mutated this site to

enable the use of XmaI endonuclease to remove the KanR gene after selection. The

oligonucleotide primers used for site-directed mutagenesis are as follows: 5’-GAGCTCGGTACCCGGAGATCCTCTAGAAATAATTTTG-3’

and 5’-CAAAATTATTTCTAGAGGATCTCCGGGTACCGAGCTC-3’. Site-directed mutagenesis

of the pTH890 plasmid (1 µl PfuTurbo® DNA Polymerase, 50 ng of pTH890

plasmid DNA, 5 µl 10x PCR buffer, 1 µl of 2.5 mM dNTP mix, 1.25 µl of 100 ng/µl

forward and reverse primers and 36 µl of DI H2O) was performed using the

QuikChange® Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA).
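
The design logic of the mutagenic primer pair can be checked with a few lines of Python. The sketch below is illustrative only and is not part of the original protocol: it confirms that the two primers quoted in the text are exact reverse complements of each other, that the XmaI site lies inside the SrfI site, and that neither primer still contains an XmaI site. The helper names (reverse_complement, has_site) are hypothetical.

```python
# Illustrative check of the mutagenic primers; helper names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

SRFI_SITE = "GCCCGGGC"  # 8 bp SrfI recognition sequence (from the text)
XMAI_SITE = "CCCGGG"    # 6 bp XmaI recognition sequence (from the text)

# Primer sequences as given in the text, written without spaces.
FORWARD = "GAGCTCGGTACCCGGAGATCCTCTAGAAATAATTTTG"
REVERSE = "CAAAATTATTTCTAGAGGATCTCCGGGTACCGAGCTC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def has_site(seq: str, site: str) -> bool:
    """True if the site occurs on either strand of the given sequence."""
    seq = seq.upper()
    return site in seq or site in reverse_complement(seq)


if __name__ == "__main__":
    print("primers are reverse complements:", reverse_complement(FORWARD) == REVERSE)
    print("XmaI site nested in SrfI site:", XMAI_SITE in SRFI_SITE)
    print("XmaI site left in forward primer:", has_site(FORWARD, XMAI_SITE))
```

Because the XmaI site is nested in the SrfI site, the SrfI sites flanking KanR in the transposon remain cleavable by XmaI once the plasmid's own XmaI site has been removed.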

Figure 4.4 Insertion of GFP using transposase enzyme. a) PCR product of the EGFP gene with the kanamycin resistance gene. b) The transposon was mixed with the pTH890 plasmid along with the transposase enzyme for random insertion, giving the fliC gene with an EGFP and KanR insert. c) The transposon insert plasmids were isolated, cut with SrfI or XmaI to remove the KanR gene, and ligated, giving the fliC gene with only the EGFP insert.

Screening of the GFP inserts

After overnight growth, the transformants were left at 4 °C for 24 hrs and screened

for fluorescence under UV light. The fluorescent colonies were then

streaked onto fresh LB plates with ampicillin and kanamycin to confirm the

fluorescence. Colony PCR was performed on the fluorescent colonies using a Techne

Touchgene Gradient Thermal Cycler (Techne, Inc., Burlington, NJ). PCR reagents

were obtained from New England Biolabs (Beverly, MA). Bacterial colonies were

picked using 10 µl micropipette tips and added to 50 µl of PCR master mix (0.25 µl

Taq DNA Polymerase (NEB), 5 µl 10x standard PCR buffer, 5 µl of 2.5 mM dNTP

mix, 5 µl of 10 µM forward and reverse primers and 30 µl of DI H2O). The PCR

reaction conditions used were: an initial cycle of 94°C for 10 min to perform cell

lysis, followed by 30 cycles of 94 °C for 1 min; 55°C for 1 min; 72 °C for 1.5 min,

followed by a final extension step of 72 °C for 10 min. A forward oligonucleotide

primer with DNA sequence 5’-AATTAATCCGGCTCGT-3’ started at nucleotide

position 197 of the pTH890 plasmid. A reverse oligonucleotide primer with DNA

sequence 5’-ATTTAGTCTTGCGTCTTCGC-3’ started at nucleotide position

1958. These two primers were designed to completely flank the fliC gene in the

pTH890 plasmid. The PCR products were analyzed by electrophoresis on 2% agarose

gels. A control PCR reaction on full-length fliC formed a band just above the 1.5 kb

DNA marker. The transposon inserts should yield PCR products with molecular

masses significantly larger than the wild-type PCR product.
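
The expected band positions can be estimated from the primer coordinates. The following is a rough sketch under stated assumptions, not a calculation taken from the original work: it assumes that both quoted positions (197 and 1958) are plasmid coordinates marking the outer ends of the amplicon, and that the C12 variant differs from wild-type only by the in-frame loss of 107 codons described in this chapter. The function names are hypothetical.

```python
# Rough amplicon-size estimate; coordinate convention is an assumption.
FORWARD_PRIMER_START = 197    # position quoted in the text
REVERSE_PRIMER_START = 1958   # position quoted in the text
C12_DELETED_RESIDUES = 107    # residues deleted in the C12 variant


def amplicon_length(forward_start: int, reverse_start: int) -> int:
    """Length in bp, assuming both coordinates are the outer amplicon ends."""
    return reverse_start - forward_start + 1


def with_insert(base_length: int, insert_length: int) -> int:
    """Expected product size once an insert of the given length is present."""
    return base_length + insert_length


wild_type_bp = amplicon_length(FORWARD_PRIMER_START, REVERSE_PRIMER_START)
c12_bp = wild_type_bp - 3 * C12_DELETED_RESIDUES  # 3 bp per deleted codon

if __name__ == "__main__":
    print(f"wild-type fliC product: ~{wild_type_bp} bp")
    print(f"C12 deletion product:   ~{c12_bp} bp")
```

Under these assumptions the wild-type product is about 1.76 kb, consistent with a band just above the 1.5 kb marker, and any transposon insert shifts the product upward by the length of the inserted DNA.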

Removal of KanR gene

The E. coli colonies with fliC-gfp inserts identified using colony PCR were grown as

5 ml cultures and the plasmids were extracted via a miniprep procedure. The isolated

plasmids were cut with either SrfI (Stratagene, La Jolla, CA) or XmaI (NEB)

restriction endonucleases to remove the KanR gene from the inserts. Gel

electrophoresis was performed with 2% agarose gels to confirm that the KanR gene

was removed from the plasmid. The linear plasmids from the gel were isolated using

a QIAquick PCR Purification Kit (Qiagen Inc., Valencia, CA) and ligated at 16 °C

overnight with T4 DNA ligase (NEB), thus producing circular plasmids. These

plasmids were then transformed into the S. typhimurium JR501 cells to provide

methylation that is compatible with the restriction-modification genes present in the

other S. typhimurium strains. The methylated plasmids were isolated from the JR501

cells and the S. typhimurium SJW134 non-motile cells were transformed with the

methylated form of the plasmids.

Screening for functional insertions using bacterial motility assay

The positive fliC-transposon insert plasmids were purified using a QIAprep Plasmid

Miniprep kit (Qiagen Inc., Valencia, CA). First, S. typhimurium strain JR501

electrocompetent cells were transformed with the fliC-transposon insert plasmids for methylation, before

transformation into the SJW134 strain. The transformed SJW134 cells

were streaked on motility agar media to test for motility. Motility agar media was

composed of tryptone broth (10 g/liter tryptone, EM Science, 5 g/liter NaCl, Fisher

Scientific) that contained 0.3% (wt/vol) agar (Sigma-Aldrich, St. Louis, MO). The

plates were then incubated at 30 °C for 6-8 hrs in a humidified incubator to prevent

dehydration of the media. The low density of the agar allowed the bacteria to move

within the agar, forming a visible halo of growth around the point of inoculation.

Results

Transposase reaction

We have performed a transposon reaction on the pTH890 plasmid with the wild-type

fliC gene and the fliC internal deletion variant pTH890-C12 (please refer to the

previous chapter for a description of the construction of the C12 deletion variant). The

variant pTH890-C12 has 107 amino acids deleted from its middle domain region and

hence serves as a good model to study the tolerance of large protein insertions in a

smaller functional flagellin variant. Transposon insertion is a widely used genetic

technique to insert, knock out and knock in genes in various organisms 64; 65; 66; 67.

Transposons serve as carriers of DNA fragments for both homologous and heterologous

insertions. The GFP transposon system developed by Sheridan et al. 62 serves as an

excellent method to randomly insert the gene with an effective selection mechanism.

The in vitro transposase reaction performed on the pTH890 plasmid yielded

good results, as indicated by colony PCR screening (Figure 4.5). Due to the random

nature of the transposition reaction, the transposon can be incorporated into virtually any site in the

plasmid. The insertions in the ampicillin resistance gene will be eliminated due to the

loss of their resistance to the antibiotic on the LB plates. We did not expect any GFP

expression unless it was inserted in-frame with one of the other two open reading frames

(ORFs), the lacR and fliC genes. After the in vitro transposase reaction, the E. coli

XL-1 blue cells were electroporated with the transposed plasmids.
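
The in-frame requirement can be made concrete with a small enumeration. This is a simplified sketch under stated assumptions, not a description of the actual transposon design: it assumes insertions land uniformly in any of the three reading frames and in either orientation, that GFP is translated only when it shares the orientation and frame of the host ORF, and it ignores any target-site preference of the transposase. The names gfp_translated and frame_offset are hypothetical.

```python
# Simplified enumeration of frame/orientation outcomes; all names hypothetical.
from itertools import product


def gfp_translated(frame_offset: int, same_orientation: bool) -> bool:
    """True if the inserted GFP reads in the host ORF's frame and direction.

    frame_offset is the insertion position relative to the required frame,
    taken modulo 3 (0 means in-frame under the convention assumed here).
    """
    return same_orientation and frame_offset % 3 == 0


# Three possible frame offsets times two orientations.
outcomes = list(product(range(3), (True, False)))
productive = [o for o in outcomes if gfp_translated(*o)]
productive_fraction = len(productive) / len(outcomes)

if __name__ == "__main__":
    print(f"{len(productive)} of {len(outcomes)} frame/orientation "
          f"combinations give a translated GFP fusion")
```

Under these assumptions only about one in six insertions within an ORF would be expected to give a fluorescent fusion, which is why the fluorescence screen is an efficient first filter.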

Screening for the fluorescence

After overnight incubation, the XL-1 blue transformants were transferred to the 4 °C

refrigerator and left there for 24 hrs to allow the proper folding of GFP inside the cell.

A long wave ultraviolet (UV) light (365 nm) was held over the LB plates with

colonies to screen for fluorescence. The fluorescent colonies were picked and freshly

streaked onto new LB Amp+Kan plates. The colony PCR, performed as described in

the Materials and Methods section, shows the presence of inserts in the fliC gene of

the plasmid.

Figure 4.5 Colony PCR of the fluorescent inserts. A 2% agarose gel of the colony PCR of the fluorescent transposon inserts, with size markers at 1.5 kb and 1.2 kb. Lane 1 is the control PCR reaction product obtained with the C12 plasmid. Lanes 2, 3, 6, 7 and 13 show fliC-GFP inserts. Other lanes do not show any inserts in the fliC gene.

The fluorescent cells that do not show any insert in the PCR product must carry

the insertion in the lacR gene. The fluorescence of the GFP inserts suggests

that the GFP protein does form a fluorophore in the large fusion protein. Figure 4.5

shows a 2% agarose gel of the PCR products of the colonies with and without

inserts in the fliC gene.

Removal of KanR gene

The E. coli colonies with fliC-gfp inserts identified using colony PCR were grown as

5 ml cultures and the plasmids were extracted. The isolated plasmids were cut with

SrfI/XmaI restriction endonucleases to remove the KanR gene from the inserts

(Figure 4.6). An agarose gel was run to confirm that the KanR gene has been removed

from the plasmid. The linear plasmids from the gel were isolated, ligated (Figure 4.7)

and transformed into the S. typhimurium JR501 cells. The plasmids were isolated

from the JR501 cells and the S. typhimurium SJW134 non-motile cells were

transformed.

Figure 4.6 Gel picture of the removal of the kanamycin resistance gene. A 2% agarose gel of the fliC-GFP insert plasmids treated with SrfI/XmaI restriction endonucleases. Lane 1 has a C12 plasmid. Lanes 2-8 show the small and large fragments corresponding to the cut kanamycin resistance gene and the rest of the linearized plasmid.

Figure 4.7 Ligation of the fliC-GFP inserts. A 2% agarose gel with the linear plasmids ligated using T4 DNA ligase. Lane 1 shows the 1 kb marker, lane 2 shows the linear plasmid, lanes 3-6 show the ligation products.

Motility screening

The SJW134 cells with the fliC-gfp gene inserts were screened for the motility

on motility media. We have used S. typhimurium wild-type strain SJW1103 as a

positive control and untransformed S. typhimurium SJW134 cells as a negative

control for the motility experiments. Due to the low percentage of agar in the motility

media, the cells producing functional flagella can swarm across the media, creating a

halo around the inoculation zone. Along with the motility we also tested for

fluorescence of expressed, folded GFP with a long wave UV light (365 nm). We did

not identify any fliC-gfp inserts with swarming ability among either the wild-type or

the C12 inserts. Altogether, ~100 wild-type flagellin-GFP insertion variants and

~200 C12 flagellin-GFP insertion variants were screened for motility, with no

positive identification of any functional internal GFP insertions.
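
A screen with zero positives still puts a bound on how common functional insertions could be. The sketch below is an illustrative calculation that is not part of the original analysis: it assumes each screened variant is an independent draw with the same probability p of being motile, so that observing no positives among n variants has probability (1 - p)^n. Independence is an assumption that may not hold for colonies derived from a single transposition reaction, and the function name is hypothetical.

```python
# Illustrative upper bound for a zero-positive screen; assumes independence.


def upper_bound_zero_successes(n: int, confidence: float = 0.95) -> float:
    """Largest p for which seeing 0 positives in n trials is still plausible.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of screened variants")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    for label, n in (("wild-type", 100), ("C12", 200)):
        bound = upper_bound_zero_successes(n)
        print(f"{label}: 0 of ~{n} motile, 95% upper bound on p = {bound:.3f}")
```

Under these assumptions the screen excludes functional-insertion frequencies above roughly 3% for the wild-type background and 1.5% for the C12 background, but it cannot exclude rarer functional insertions.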

Discussion

GFP is a highly stable protein; the purified wild-type form is still functionally

fluorescent in aqueous buffer solution after years of storage at 4 °C. GFP serves as a

very effective reporter protein for the expression of the fusion protein in many types

of cells. Due to its rigid core structure, GFP folds into its native structure irrespective

of the length and conformation of the attached fusion protein. Flagellin is a relatively

large bacterial protein with 494 amino acid residues. The flagellin export process is

similar to other type III bacterial pathogen export systems (e.g., Yersinia sp. Yop

proteins) that release protein to the extracellular environment through a protein tube

68; 69; 70. Following translation, the C-terminal region of FliC flagellin protein is bound

to the FliS protein, in a 1:2 ratio, which prevents self-assembly and formation of

flagellin fiber aggregates within the cell 71; 72; 73. A new report published by Muskotal

et al. 74 suggests that the FliS protein binds to the C-terminal region of the FliC in

the ratio of 1:1. The report suggests that FliS does not act as an unfolding factor.

Instead it helps in the formation of alpha-helical secondary structure in the region of

FliC where it binds 74. The FliC protein is exported to the distal end of flagella fibers

through the interior channel of the flagella fiber, which has a diameter of only 2 nm

(20 Å). In order to pass through the flagella tube, the flagellin monomers are

thought to be in a semi-folded or completely unfolded state, due to the severe size

constraints posed by the pore diameter. None of the ~100 wild-type and ~200 C12

variant GFP inserts with detectable fluorescence showed any motility,

indicating a failure of the hybrid flagellin-GFP variants to be exported and assembled

into functional flagella. However, the observed fluorescence of the transformed E.

coli cells suggests that the inserted GFP properly folded as an internal fusion protein

in flagellin.

There are several possible explanations for these results. The GFP protein,

when folded in a native state, is about 50 Å long and 30 Å wide. Thus, the inserted

GFP domain may simply be too large to fit through the flagella pore and export

apparatus in its native, folded state. Furthermore, GFP may be too stable to be

unfolded by the FliI export protein and other proteins in the type III export complex

that did not evolve to unfold GFP. The cytoplasmic FliS chaperonin protein only

binds one C-terminal flexible end of the FliC protein, and should have minimal, if

any, effect on the folding or subsequent prerequisite unfolding of the inserted GFP

domain in the middle of the FliC protein. Because the channel diameter through the

filament is only 2 nm, proteins to be exported must be largely unfolded for entry into

and translocation through the channel. The flagellar type III protein export apparatus

is a complex made of six membrane proteins, FlhA, FlhB, FliO, FliP, FliQ, and FliR,

and three soluble proteins, FliH, FliI, and FliJ 75; 76; the FliI protein is thought to be an

ATP-dependent chaperonin that exports the FliC protein. Salmonella InvC, the FliI

homolog of the virulence type III secretion system, has been shown to induce

chaperone release from and unfolding of the cognate protein to be secreted in an

ATPase-dependent manner 77, suggesting that FliI may have a similar function and

mechanism 78; 79.

Although the negative results of functional screening for motility were not

based on an exhaustive study, this study indicates that GFP does not readily serve as a

good model protein for insertion into flagellin proteins to enable the creation of a

flagella nanotube with genetically encoded fluorescence. Anecdotal evidence indicates

that other attempts to export GFP through some types of trans-membrane export

systems may have also resulted in failure, although publications that describe these

failed attempts are not available. There is evidence suggesting an efficient intra- and

intercellular transport of GFP fusion proteins in eukaryotic cells 80; 81; 82. This could

be due to more efficient unfolding and export systems present in the eukaryotic

cells and their membrane bound organelles. A discussion with a visiting scientist,

Prof. Guy Cornelis from Biozentrum, University of Basel, indicated that GFP has

been tried as a fusion tag in other type III export systems by various researchers, with

no success. Furthermore, a recent seminar at a 2006 Protein Society meeting given by

Prof. S. Choe, Salk Institute, indicated that GFP-fusion proteins were not always

successfully exported from one cellular compartment to another. This study also

suggests that, due to the high stability and structural constraints of the GFP protein, it

does not serve as a good internal fusion tag for labeling and possible sensor functions

in flagellin and other type III export systems. One study suggests that the type III

secretion process can unfold previously folded proteins during the secretion process 47

and other recent reports describe the secretion of GFP using a modified flagellar type

III secretion apparatus 48 and of a destabilized variant of dihydrofolate reductase 50.

Perhaps a destabilized form of GFP, the opposite of the recently described

“superfolder” GFP variant 83, could yield a feasible method for developing GFP

fusions that are more amenable to functional insertion and export across various

cellular membranes.

References

1. Westerlund-Wikstrom, B. (2000). Peptide display on bacterial flagella: principles and applications. Int J Med Microbiol 290, 223-30.

2. Brown, C. K., Modzelewski, R. A., Johnson, C. S. & Wong, M. K. (2000). A novel approach for the identification of unique tumor vasculature binding peptides using an E. coli peptide display library. Ann Surg Oncol 7, 743-9.

3. Hynonen, U., Westerlund-Wikstrom, B., Palva, A. & Korhonen, T. K. (2002). Identification by flagellum display of an epithelial cell- and fibronectin-binding function in the SlpA surface protein of Lactobacillus brevis. J Bacteriol 184, 3360-7.

4. Tanskanen, J., Korhonen, T. K. & Westerlund-Wikstrom, B. (2000). Construction of a multihybrid display system: flagellar filaments carrying two foreign adhesive peptides. Appl Environ Microbiol 66, 4152-6.

5. Tripp, B. C., Lu, Z., Bourque, K., Sookdeo, H. & McCoy, J. M. (2001). Investigation of the 'switch-epitope' concept with random peptide libraries displayed as thioredoxin loop fusions. Protein Eng 14, 367-77.

6. Fedorov, O. V. & Efimov, A. V. (1990). Flagellin as an object for supramolecular engineering. Protein Eng 3, 411-3.

7. Lu, Z., Murray, K. S., Van Cleave, V., LaVallie, E. R., Stahl, M. L. & McCoy, J. M. (1995). Expression of thioredoxin random peptide libraries on the Escherichia coli cell surface as functional fusions to flagellin: a system designed for exploring protein-protein interactions. Biotechnology (N Y) 13, 366-72.

8. Westerlund-Wikstrom, B., Tanskanen, J., Virkola, R., Hacker, J., Lindberg, M., Skurnik, M. & Korhonen, T. K. (1997). Functional expression of adhesive peptides as fusions to Escherichia coli flagellin. Protein Eng 10, 1319-26.

9. Betton, J. M., Jacob, J. P., Hofnung, M. & Broome-Smith, J. K. (1997). Creating a bifunctional protein by insertion of beta-lactamase into the maltodextrin-binding protein. Nature Biotechnology 15, 1276-1279.

10. Doi, N., Itaya, M., Yomo, T., Tokura, S. & Yanagawa, H. (1997). Insertion of foreign random sequences of 120 amino acid residues into an active enzyme. FEBS Letters 402, 177-180.

11. Doi, N. & Yanagawa, H. (1999). Insertional gene fusion technology. FEBS Lett 457, 1-4.

12. Collinet, B., Herve, M., Pecorari, F., Minard, P., Eder, O. & Desmadril, M. (2000). Functionally accepted insertions of proteins within protein domains. Journal of Biological Chemistry 275, 17428-17433.

13. Scalley-Kim, M., Minard, P. & Baker, D. (2003). Low free energy cost of very long loop insertions in proteins. Protein Science 12, 197-206.

14. Tanskanen, J., Korhonen, T. K. & Westerlund-Wikstrom, B. (2000). Construction of a multihybrid display system: flagellar filaments carrying two foreign adhesive peptides. Appl. Environ. Microbiol. 66, 4152-4156.

15. Westerlund-Wikstrom, B. (2000). Peptide display on bacterial flagella: principles and applications. Int. J. Med. Microbiol. 290, 223-230.

16. Newton, S. M., Jacob, C. O. & Stocker, B. A. (1989). Immune response to cholera toxin epitope inserted in Salmonella flagellin. Science 244, 70-2.

17. Wu, J. Y., Newton, S., Judd, A., Stocker, B. & Robinson, W. S. (1989). Expression of immunogenic epitopes of hepatitis B surface antigen with hybrid flagellin proteins by a vaccine strain of Salmonella. Proc Natl Acad Sci U S A 86, 4726-30.

18. Lu, Z., Murray, K. S., Van Cleave, V., LaVallie, E. R., Stahl, M. L. & McCoy, J. M. (1995). Expression of thioredoxin random peptide libraries on the Escherichia coli cell surface as functional fusions to flagellin: a system designed for exploring protein-protein interactions. Biotechnology (N Y) 13, 366-372.

19. Ezaki, S., Tsukio, M., Takagi, M. & Imanaka, T. (1998). Display of heterologous gene products on the Escherichia coli cell surface as fusion proteins with flagellin. Journal of Fermentation and Bioengineering 86, 500-503.

20. Brooks, S. (2005). The discovery of aequorin and green fluorescent protein. Journal of Microscopy 217, 1-2.

21. Zimmer, M. (2002). Green fluorescent protein (GFP): applications, structure, and related photophysical behavior. Chem Rev 102, 759-81.

22. Matz, M. V., Fradkov, A. F., Labas, Y. A., Savitsky, A. P., Zaraisky, A. G., Markelov, M. L. & Lukyanov, S. A. (1999). Fluorescent proteins from nonbioluminescent Anthozoa species. Nat Biotechnol 17, 969-73.

23. Labas, Y. A., Gurskaya, N. G., Yanushevich, Y. G., Fradkov, A. F., Lukyanov, K. A., Lukyanov, S. A. & Matz, M. V. (2002). Diversity and evolution of the green fluorescent protein family. Proc Natl Acad Sci U S A 99, 4256-61.

24. Kelmanson, I. V. & Matz, M. V. (2003). Molecular basis and evolutionary origins of color diversity in great star coral Montastraea cavernosa (Scleractinia: Faviida). Mol Biol Evol 20, 1125-33.

25. Phillips, G. J. (2001). Green fluorescent protein - a bright idea for the study of bacterial protein localization. FEMS Microbiology Letters 204, 9-18.

26. Sekar, R. B. & Periasamy, A. (2003). Fluorescence resonance energy transfer (FRET) microscopy imaging of live cell protein localizations. J Cell Biol 160, 629-33.

27. Lucas, W. J. & Lee, J. Y. (2004). Plasmodesmata as a supracellular control network in plants. Nat Rev Mol Cell Biol 5, 712-26.

28. Hazelrigg, T. & Mansfield, J. H. (2006). Green fluorescent protein applications in Drosophila. Methods Biochem Anal 47, 221-51.

29. Carroll, J. A., Stewart, P. E., Rosa, P., Elias, A. F. & Garon, C. F. (2003). An enhanced GFP reporter system to monitor gene expression in Borrelia burgdorferi. Microbiology 149, 1819-28.

30. Takanishi, C. L., Bykova, E. A., Cheng, W. & Zheng, J. (2006). GFP-based FRET analysis in live cells. Brain Res 1091, 132-9.

31. Pfleger, K. D. & Eidne, K. A. (2006). Illuminating insights into protein-protein interactions using bioluminescence resonance energy transfer (BRET). Nat Methods 3, 165-74.

32. Morino, K., Katsumi, H., Akahori, Y., Iba, Y., Shinohara, M., Ukai, Y., Kohara, Y. & Kurosawa, Y. (2001). Antibody fusions with fluorescent proteins: a versatile reagent for profiling protein expression. J Immunol Methods 257, 175-84.

33. Galietta, L. J., Haggie, P. M. & Verkman, A. S. (2001). Green fluorescent protein-based halide indicators with improved chloride and iodide affinities. FEBS Lett 499, 220-4.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 34. Goh, Y. Y., Frecer, V., Ho, B. & Ding, J. L. (2002). Rational design of green fluorescent protein mutants as biosensor for bacterial endotoxin. Protein Eng 15, 493-502. 35. Barondeau, D. P., Kassmann, C. J., Tainer, J. A. & Getzoff, E. D. (2002). Structural chemistry of a green fluorescent protein Zn biosensor. J Am Chem Soc 124, 3522-4. 36. Yang, F., Moss, L. G. & Phillips, G. N., Jr. (1996). The molecular structure of green fluorescent protein. Nat Biotechnol 14, 1246-51. 37. Sniegowski, J. A., Lappe, J. W., Patel, H. N., Huffman, H. A. & Wachter, R. M. (2005). Base catalysis of chromophore formation in Arg96 and Glu222 variants of green fluorescent protein. J Biol Chem 280, 26248-55. 38. Li, X., Zhang, G., Ngo, N., Zhao, X., Kain, S. R. & Huang, C. C. (1997). Deletions of the Aequorea victoria green fluorescent protein define the minimal domain required for fluorescence. J Biol Chem 272, 28545-9. 39. Miyawaki, A., Nagai, T. & Mizuno, H. (2005). Engineering fluorescent proteins. Adv Biochem Eng Biotechnol 95, 1-15. 40. Ellenberg, J., Lippincott-Schwartz, J. & Presley, J. F. (1999). Dual-colour imaging with GFP variants. Trends Cell Biol 9, 52-6. 41. Sheridan, D. L., Berlot, C. H., Robert, A., Inglis, F. M., Jakobsdottir, K. B., Howe, J. R. & Hughes, T. E. (2002). A new way to rapidly create functional, fluorescent fusion proteins: random insertion of GFP with an in vitro transposition reaction. BMC Neurosci 3, 7. 42. Biondi, R. M., Baehler, P. J., Reymond, C. D. & Veron, M. (1998). Random insertion of GFP into the cAMP-dependent protein kinase regulatory subunit from Dictyostelium discoideum. Nucleic Acids Res 26, 4946-52. 43. Shaner, N. C., Campbell, R. E., Steinbach, P. A., Giepmans, B. N., Palmer, A. E. & Tsien, R. Y. (2004). Improved monomeric red, orange and yellow fluorescent proteins derived from Discosoma sp. red fluorescent protein. 
Nat Biotechnol 22, 1567-72. 44. Demidov, V. V., Dokholyan, N. V., Witte-Hoffmann, C., Chalasani, P., Yiu, H. W., Ding, F., Yu, Y., Cantor, C. R. & Broude, N. E. (2006). Fast complementation of split fluorescent protein triggered by DNA hybridization. Proc Natl Acad Sci U SA 103, 2052-6. 45. Demidov, V. V. & Broude, N. E. (2006). Profluorescent protein fragments for fast bimolecular fluorescence complementation in vitro. Nat. Protocols 1, 714-719. 46. Malapaka, R. R., Adebayo, L. O. & Tripp, B. C. (2006). A Deletion Variant Study of the Functional Role of the Salmonella Flagellin Hypervariable Domain Region in Motility. J Mol Biol. 47. Wilharm, G., Lehmann, V., Neumayer, W., Trcek, J. & Heesemann, J. (2004). Yersinia enterocolitica type III secretion: evidence for the ability to transport proteins that are folded prior to secretion. BMC Microbiol 4, 27. 48. Majander, K., Anton, L., Antikainen, J., Lang, H., Brummer, M., Korhonen, T. K. & Westerlund-Wikstrom, B. (2005). Extracellular secretion of

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. polypeptides using a modified Escherichia coli flagellar secretion apparatus. Nat Biotechnol 23, 475-81. 49. Lee, V. T. & Schneewind, O. (2002). Yop fusions to tightly folded protein domains and their effects on Yersinia enterocolitica type III secretion. J Bacteriol 184, 3740-5. 50. Feldman, M. F., Muller, S., Wuest, E. & Cornells, G. R. (2002). SycE allows secretion of YopE-DHFR hybrids by the Yersinia enterocolitica type III Ysc system. Mol Microbiol 46, 1183-97. 51. Yamaguchi, S., Fujita, H., Ishihara, A., Aizawa, S. & Macnab, R. M. (1986). Subdivision of flagellar genes of Salmonella typhimurium into regions responsible for assembly, rotation, and switching. J Bacteriol 166, 187-93. 52. Yamaguchi, S., Fujita, H., Sugata, K., Taira, T. & lino, T. (1984). Genetic analysis of H2, the structural gene for phase-2 flagellin in Salmonella. J Gen Microbiol 130, 255-65. 53. Yamaguchi, S., Fujita, H., Taira, T., Kutsukake, K., Homma, M. & lino, T. (1984). Genetic analysis of three additional fla genes in Salmonella typhimurium. J Gen Microbiol 130, 3339-42. 54. Williams, A. W., Yamaguchi, S., Togashi, F., Aizawa, S. I., Kawagishi, I. & Macnab, R. M. (1996). Mutations in fliK and flhB affecting flagellar hook and filament assembly in Salmonella typhimurium. J Bacteriol 178, 2960-70. 55. Tallant, T., Deb, A., Kar, N., Lupica, J., de Veer, M. J. & DiDonato, J. A. (2004). Flagellin acting via TLR5 is the major activator of key signaling pathways leading to NF-kappa B and proinflammatory gene program activation in intestinal epithelial cells. BMC Microbiol 4, 33. 56. Ryu, J. & Hartin, R. J. (1990). Quick transformation in Salmonella typhimurium LT2. BioTechniques 8 , 43-5. 57. Tsai, S. P., Hartin, R. J. & Ryu, J. (1989). Transformation in restriction- deficient Salmonella typhimurium LT2. J Gen Microbiol 135, 2561-7. 58. Dower, W. J., Miller, J. F. & Ragsdale, C. W. (1988). 
High efficiency transformation of E. coli by high voltage electroporation. Nucleic Acids Res 16, 6127-45. 59. Song, S., Zhang, T., Qi, W., Zhao, W., Xu, B. & Liu, J. (1993). Transformation of Escherichia coli with foreign DNA by electroporation. Chin J Biotechnol 9, 197-201. 60. Amann, E. & Brosius, J. (1985). "ATG vectors' for regulated high-level expression of cloned genes in Escherichia coli. Gene 40, 183-90. 61. Amann, E., Ochs, B. & Abel, K. J. (1988). Tightly regulated tac promoter vectors useful for the expression of unfused and fused proteins in Escherichia coli. Gene 69,301-15. 62. Sheridan, D. L. & Hughes, T. E. (2004). A faster way to make GFP-based biosensors: two new transposons for creating multicolored libraries of fluorescent fusion proteins. BMC Biotechnol 4, 17.

202

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 63. Simcox, T. G., Marsh, S. J., Gross, E. A., Lernhardt, W., Davis, S. & Simcox, M. E. (1991). SrfI, a new type-II restriction endonuclease that recognizes the octanucleotide sequence. Gene 109, 121-3. 64. Leal, S., Acosta-Serrano, A., Morris, J. & Cross, G. A. (2004). Transposon mutagenesis of Trypanosoma brucei identifies glycosylation mutants resistant to concanavalin A. JB iol Chem 279, 28979-88, 65. Bonetta, D. T., Facette, M., Raab, T. K. & Somerville, C. R. (2002). Genetic dissection of plant cell-wall biosynthesis. Biochem Soc Trans 30, 298-301. 6 6 . Funk, N., Becker, S., Huber, S., Brunner, M. & Buchner, E. (2004). Targeted mutagenesis of the Sap47 gene of Drosophila: flies lacking the synapse associated protein of 47 kDa are viable and fertile. BMC Neurosci 5, 16. 67. Westphal, C. H. & Leder, P. (1997). Transposon-generated 'knock-out' and 'knock-in' gene-targeting constructs for use in mice. Curr Biol 7, 530-3. 6 8 . Desvaux, M., Hebraud, M., Henderson, I. R. & Pallen, M. J. (2006). Type III secretion: what's in a name? Trends Microbiol 14, 157-60. 69. Blocker, A., Komoriya, K. & Aizawa, S. (2003). Type III secretion systems and bacterial flagella: insights into their function from structural similarities. Proc Natl Acad Sci U SA 100, 3027-30. 70. Mota, L. J. & Cornelis, G. R. (2005). The bacterial injection kit: type III secretion systems. Ann Med 37, 234-49. 71. Evdokimov, A. G., Phan, J., Tropea, J. E., Routzahn, K. M., Peters, H. K., Pokross, M. & Waugh, D. S. (2003). Similar modes of polypeptide recognition by export chaperones in flagellar biosynthesis and type III secretion. Nat Struct Biol 10, 789-93. 72. Ozin, A. J., Claret, L., Auvray, F. & Hughes, C. (2003). The FliS chaperone selectively binds the disordered flagellin C-terminal DO domain central to polymerisation. FEMS Microbiol Lett 219, 219-24. 73. Auvray, F., Thomas, J., Fraser, G. M. 
& Hughes, C. (2001). Flagellin polymerisation control by a cytosolic export chaperone. JM ol Biol 308, 221- 9. 74. Muskotal, A., Kiraly, R., Sebestyen, A., Gugolya, Z., Vegh, B. M. & Vonderviszt, F. (2006). Interaction of FliS flagellar chaperone with flagellin. FEBS Lett 580, 3916-20. 75. Macnab, R. M. (2003). How bacteria assemble flagella. Annu Rev Microbiol 57, 77-100. 76. Minamino, T. & Macnab, R. M. (1999). Components of the Salmonella Flagellar Export Apparatus and Classification of Export Substrates. J. Bacteriol. 181, 1388-1394. 77. Fan, F. & Macnab, R. M. (1996). Enzymatic characterization of Flil. An ATPase involved in flagellar assembly in Salmonella typhimurium. J Biol Chem 271, 31981-8. 78. Minamino, T., Imada, K., Tahara, A., Kihara, M., Macnab, R. M. & Namba, K. (2006). Crystallization and preliminary X-ray analysis of Salmonella Flil,

203

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. the ATPase component of the type III flagellar protein-export apparatus. Acta Crystallograph Sect F Struct Biol Cryst Commun 62, 973-5. 79. Imada, K., Minamino, T., Tahara, A. & Namba, K. (2007). Structural similarity between the flagellar type III ATPase Flil and FI-ATPase subunits. PNAS104, 485-490. 80. Rodriguez-Escudero, I., Hardwidge, P. R., Nombela, C., Cid, V. J., Finlay, B. B. & Molina, M. (2005). Enteropathogenic Escherichia coli type III effectors alter cytoskeletal function and signalling in Saccharomyces cerevisiae. Microbiology 151, 2933-45. 81. Stegmayer, C., Kehlenbach, A., Toumaviti, S., Wegehingel, S., Zehe, C., Denny, P., Smith, D. F., Schwappach, B. & Nickel, W. (2005). Direct transport across the plasma membrane of mammalian cells of Leishmania HASPB as revealed by a CHO export mutant. J Cell Sci 118, 517-27. 82. Ma, D., Zerangue, N., Lin, Y.-F., Collins, A., Yu, M., Jan, Y. N. & Jan, L. Y. (2001). Role of ER Export Signals in Controlling Surface Potassium Channel Numbers. Science 291, 316-319. 83. Pedelacq, J. D., Cabantous, S., Tran, T., Terwilliger, T. C. & Waldo, G. S. (2006). Engineering and characterization of a superfolder green fluorescent protein. Nat Biotechnol 24, 79-88.

204

CHAPTER V

EXPERIMENTAL INVESTIGATION OF THE HYPOTHETICAL S. TYPHIMURIUM FLAGELLIN MECHANICAL SWITCH BY INSERTION OF A DISULFIDE BRIDGE

Introduction

The eubacterial flagellum, a helical membrane-bound protein filament, helps propel cells through their liquid environment 1; 2. The flagellum is a protein complex composed of at least 20 different proteins 2; 3; 4; 5; 6 and has several major features: a basal body, a trans-membrane motor, a hook structure and an elongated helical filament. The filament has an outer diameter of 12-25 nm and a 2-3 nm diameter inner channel 6; 7, and can be 1-15 µm or more in length 5; 8. It is composed of as many as 30,000 subunits of a single flagellin protein 9; 10. The mature S. typhimurium phase 1 flagellin protein, encoded by the fliC gene, consists of 494 amino acids; the first methionine residue is removed post-translationally. The structure of S. typhimurium flagellin was recently determined (PDB 1UCU) 8; 11 and shows four distinct globular domains in the protein, the conserved D0 and D1 domains and the variable D2 and D3 domains (Figure 5.1). Previous studies show that the helical flagellar filament exists in two supercoiled forms, the L- and R-forms. The transition of the filament between these supercoiled forms leads to the tumbling and running motions of bacteria 12; 13. Electron cryomicroscopy and X-ray fiber diffraction studies have shown that the flagellin subunit repeat distances of the L- and R-forms of the fibers are 52.7 Å and 51.9 Å, a difference of 0.8 Å 13; 14. Simulation studies performed by Samatey et al., in which the top molecule was fixed and the bottom molecule was pulled in 0.1 Å steps, predicted that an abrupt downward shift of the beta-hairpin turn (residues 140-160) in the D1 domain leads to the change in the repeat distance between the L- and R-form protofilaments. In order to test this hypothesis, we mutated residues Asn56 and Ala149 to cysteine to allow formation of a covalent disulfide bond. This disulfide cross-link could effectively lock the hairpin turn into place at one location, and thereby prevent any conformational changes in the flagellin protein that are predicted to occur upon conversion between the L- and R-forms. Any observed differences in the running and tumbling motility behavior of S. typhimurium bacteria expressing the engineered disulfide loop variant would help to confirm or disprove the original hypothesis that this loop region changes conformation.

Figure 5.1 A pictorial depiction of the mutated Cys residues. A three-dimensional figure of PDB: 1UCU showing the conserved D0 and D1 domains and the variable D2 and D3 domains. The inset shows the mutated cysteine residues and the disulfide bridge. Swiss-Pdb viewer (http://www.expasy.org/spdbv/) software was used to visualize the structure.
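The scale of the proposed switch can be put in perspective with a short calculation. The sketch below uses only the two repeat distances and the simulation step size quoted in the Introduction; the variable names are illustrative and are not taken from the cited studies.

```python
# Subunit repeat distances quoted in the Introduction (angstroms).
L_FORM_REPEAT = 52.7
R_FORM_REPEAT = 51.9

# Step size of the pulling simulation described in the Introduction (angstroms).
SIMULATION_STEP = 0.1

# Absolute and relative difference between the two protofilament forms.
difference = L_FORM_REPEAT - R_FORM_REPEAT
relative_change = difference / L_FORM_REPEAT

# Number of 0.1 angstrom steps equal to the L/R difference.
steps_to_span = round(difference / SIMULATION_STEP)

print(f"Repeat distance difference: {difference:.1f} A")
print(f"Relative change: {relative_change:.1%}")
print(f"0.1 A steps spanning the difference: {steps_to_span}")
```

The difference amounts to roughly 1.5% of the subunit repeat, which illustrates how small a conformational change the engineered disulfide bond is intended to block.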

Materials and Methods

Strains and plasmid

The S. typhimurium strain and the fliC plasmid used for the experiment were kindly provided by the late Dr. Robert Macnab (Yale University). S. typhimurium strain SJW134 (ΔfliC and ΔfljB), which was derived from parent strain SJW806, is wild-type except for the deletion of the phase 1 fliC and phase 2 fljB flagellin genes 15; 16. This strain is non-motile unless a functional flagellin gene is introduced 15. The pTH890 plasmid is a derivative of the pTrc99A plasmid with the S. typhimurium fliC phase 1 flagellin gene cloned into the XbaI/HindIII site.

Construction of cysteine disulfide bond

The cysteine disulfide bond was selected using the program Disulfide by Design 17. Site-directed mutagenesis of the pTH890 plasmid (1 µl PfuTurbo® DNA Polymerase, 50 ng of pTH890 plasmid DNA, 5 µl 10x PCR buffer, 1 µl of 2.5 mM dNTP mix, 1.25 µl of 100 ng/µl forward and reverse primers and 36 µl of DI H2O) was performed using the QuikChange® Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA). The primers for generating the N56C mutation in the fliC gene were 5'-TAA CCG TTT TAC CGC GTG CAT CAA AGG TCT GAC TC-3' and 5'-GAG TCA GAC CTT TGA TGC ACG CGG TAA AAC GGT TAG-3'. The primers for generating the A149C mutation were 5'-CCA TCC AGG TTG GTT GCA ACG ACG GTG AAA C-3' and 5'-GTT TCA CCG TCG TTG CAA CCA ACC TGG ATG G-3'. The pTH890 plasmid carrying the N56C mutation is designated pTH890-Cys1, and the plasmid carrying both cysteine mutations (N56C+A149C) is designated pTH890-Cys2.
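Mutagenic primer pairs of this kind are normally reverse complements of each other, which can be checked directly on the printed sequences. The following is a minimal sketch; the helper names are illustrative, and the sequences are copied as they appear in the text without any correction.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def clean(primer: str) -> str:
    """Remove the spaces that group the printed primer into triplets."""
    return primer.replace(" ", "").upper()


# Primer sequences exactly as printed in the text (5' to 3').
n56c_fwd = clean("TAA CCG TTT TAC CGC GTG CAT CAA AGG TCT GAC TC")
n56c_rev = clean("GAG TCA GAC CTT TGA TGC ACG CGG TAA AAC GGT TAG")
a149c_fwd = clean("CCA TCC AGG TTG GTT GCA ACG ACG GTG AAA C")
a149c_rev = clean("GTT TCA CCG TCG TTG CAA CCA ACC TGG ATG G")

# Exact reverse-complement checks for each pair.
a149c_is_complementary = reverse_complement(a149c_fwd) == a149c_rev
n56c_is_complementary = reverse_complement(n56c_fwd) == n56c_rev

# Weaker check: does the reverse primer begin with the reverse complement?
n56c_overlap = n56c_rev.startswith(reverse_complement(n56c_fwd))

print("A149C pair exact reverse complements:", a149c_is_complementary)
print("N56C pair exact reverse complements:", n56c_is_complementary)
print("N56C reverse primer starts with revcomp(forward):", n56c_overlap)
print("N56C length difference:", len(n56c_rev) - len(n56c_fwd))
```

As printed, the A149C pair are exact reverse complements, whereas the N56C reverse primer carries one additional 3' base relative to the reverse complement of the forward primer. Whether this reflects the original primer design or a transcription slip cannot be determined from the text alone.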

Partial purification of fibers and SDS-PAGE

To test whether the mutated flagellin subunits are exported and self-assembled into flagella fibers, the flagella fibers were partially purified. Bacterial cultures were inoculated into 15 ml LB broth with 15 µl of 100 mg/ml ampicillin and allowed to grow overnight at 37 °C. A 15 ml volume of each bacterial culture was transferred into a 15 ml polypropylene tube and placed in a heat block at 60 °C for 10 min to dissociate the flagella into monomers. The heated cells were pelleted by centrifugation at 6000 x g and the supernatant was transferred to 15 ml Amicon Ultra centrifugal filter units with a 30 kDa MWCO membrane (Millipore Corp., Billerica, MA) for concentrating the flagella fibers. The concentrates were then resolved by reducing 10% SDS-PAGE, followed by staining with Coomassie Brilliant Blue R-250 dye.

Ellman’s assay

The Ellman’s reagent assay 18 was performed on the mutated flagellin protein to test for the presence of unreacted cysteine residues in the flagellin variants. Generation of the yellow reaction product that is characteristic of reaction with free thiols in their reduced state would indicate the presence of a single cysteine residue in either of the N56C and A149C single mutation variants. Furthermore, formation of a disulfide bond between the two cysteine residues in the N56C+A149C double mutation variant should result in no reaction with DTNB; conversely, reaction with DTNB to form a significant quantity of yellow reaction product would indicate a failure of the two introduced cysteine residues to form the desired disulfide bond.

The flagella fibers from the wild-type plasmid and both cysteine variants were partially purified as described above. Ellman’s reagent, 5,5'-dithio-bis(2-nitrobenzoic acid) (DTNB) (Sigma-Aldrich, St. Louis, MO), was dissolved in 50 mM sodium acetate solution to yield a 2 mM stock, and a stock solution of 1 M Tris-HCl (pH 8.0) was prepared and stored at 4 °C. The assay was performed by addition of 100 µl of 1 M Tris-HCl and 100 µl of 2 mM Ellman’s reagent to 800 µl of 1 mg/ml protein solution at 25 °C. The assay mixture was incubated for 5 minutes and the absorbance at 412 nm (A412) was measured at 0 minutes and 5 minutes with a SpectraMax384 spectrophotometer.
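The final concentrations in the assay mixture follow from the stated volumes by a plain dilution calculation (C1V1 = C2V2). The sketch below assumes that the three listed components are the only contributions to the volume; the function and dictionary names are illustrative.

```python
def final_concentration(stock_conc: float, stock_volume_ul: float,
                        total_volume_ul: float) -> float:
    """Concentration after dilution, in the same units as stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# (stock concentration, volume added in microlitres), as stated in the text.
components = {
    "Tris-HCl (M)": (1.0, 100.0),
    "DTNB (mM)": (2.0, 100.0),
    "protein (mg/ml)": (1.0, 800.0),
}

# Total assay volume, assuming no other additions.
total_ul = sum(volume for _, volume in components.values())

final = {
    name: final_concentration(conc, volume, total_ul)
    for name, (conc, volume) in components.items()
}

print(f"Total volume: {total_ul:.0f} ul")
for name, value in final.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the 1 ml mixture contains 0.1 M Tris-HCl, 0.2 mM DTNB and 0.8 mg/ml protein.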

Motility assay

SJW134 strain S. typhimurium cells were transformed by electroporation with the wild-type fliC control plasmid pTH890, pTH890-Cys1 or pTH890-Cys2, and were grown overnight at 37 °C. Volumes of 2 µl of each overnight culture, diluted with LB to an OD550 of 0.5, were inoculated on motility agar media (10 g/liter tryptone (EM Science), 5 g/liter NaCl (Fisher Scientific) and 0.3% (wt/vol) agar (Sigma-Aldrich, St. Louis, MO)) to test for variations in cellular motility. The plates were then incubated at 30 °C for 8 hr in a humidified incubator to prevent dehydration of the media.

Video microscopy and swimming analysis

A Nikon Eclipse E600 epifluorescence stereo microscope (Mager Scientific Inc., Dexter, MI) with 10X, 40X and 100X (oil immersion) Planfluor objectives, transmission and darkfield condenser light sources and a QImaging Fast 1394 cooled mono (grayscale) CCD digital camera was used to record visible images and movies of non-motile S. typhimurium SJW134 bacteria transformed with pTH890 expression plasmids encoding either wild-type or cysteine mutation variants of flagellin. The video images of swimming bacteria were captured at 13 fps using a 40X oil immersion lens. Bacterial cultures were grown to mid-log phase in LB media without shaking to minimize breakage of potentially brittle flagella fibers. A 30 µl volume of each culture at OD550 0.05 was loaded onto a concave glass slide and covered with a glass cover slip.

[Figure 5.2a gel labels: molecular weight markers at 55.4 kDa and 36.5 kDa; lanes: SJW134 + pTH890, SJW134 + pTH890-Cys1, SJW134 + pTH890-Cys2]

Figure 5.2 SDS-PAGE and motility analysis of the wild-type flagellin and two cysteine mutation variants. a) A 10% reducing SDS-PAGE protein gel of partially purified flagella fibers. Lane 1 has the sample from the pTH890-transformed cells, lane 2 from the pTH890-Cys1-transformed cells and lane 3 from the pTH890-Cys2-transformed cells. b) A picture depicting the swarming of SJW134 cells transformed with different plasmids.
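The text does not describe how swimming behavior was quantified from the 13 fps recordings. One possible approach, sketched below purely as an illustration, is to compute a mean swimming speed from the tracked position of a cell in consecutive frames. The track coordinates are hypothetical, and the assumption that positions have already been converted from pixels to micrometres is ours, not the author's.

```python
import math

FRAME_RATE_FPS = 13.0  # capture rate stated in the text
frame_interval_s = 1.0 / FRAME_RATE_FPS


def mean_speed_um_per_s(track_um, frame_rate_fps=FRAME_RATE_FPS):
    """Mean speed along a track of (x, y) positions, one per consecutive frame."""
    if len(track_um) < 2:
        raise ValueError("a track needs at least two positions")
    # Sum the straight-line distances between successive positions.
    path_length = sum(math.dist(p, q) for p, q in zip(track_um, track_um[1:]))
    # N positions span N - 1 frame intervals.
    elapsed_s = (len(track_um) - 1) / frame_rate_fps
    return path_length / elapsed_s


# Hypothetical track in micrometres (illustrative values, not measured data).
example_track = [(0.0, 0.0), (1.5, 2.0), (3.0, 4.0), (4.5, 6.0)]
speed = mean_speed_um_per_s(example_track)

print(f"Frame interval: {frame_interval_s * 1000:.1f} ms")
print(f"Mean speed of example track: {speed:.1f} um/s")
```

At 13 fps each frame interval is about 77 ms, so abrupt changes of direction during a tumble would span only a few frames; this limits how finely run and tumble events can be resolved with such recordings.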

Results and Discussion

SDS-PAGE analysis of the fibers from pTH890, pTH890-Cys1 and pTH890-Cys2 shows that the cysteine mutant flagella fibers were efficiently exported (Figure 5.2a). Furthermore, both the single and double cysteine mutation variants were as functional as the wild-type in providing motility, as indicated by the swarming agar motility assay. It has been previously demonstrated with the E. coli FliTrx flagellin that disulfides can be tolerated to some extent, allowing for successful export and assembly of flagellins containing a thioredoxin domain 19; 20. The results of Ellman’s assay (Table 5.1) indicate the successful formation of a disulfide bond in the pTH890-Cys2 variant.

Table 5.1 Ellman’s assay of the wild-type and cysteine mutants. 1, 2

Time         Wild-type flagellin   pTH890-Cys1   pTH890-Cys2
0 minutes    0.140                 0.167         0.145
5 minutes    0.163                 0.221         0.152

1 ... section.
2 The numerical values indicate the measured OD412 values of each protein variant after the given time periods.
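The change in absorbance over the 5 minute incubation is the quantity that reflects reaction of DTNB with free thiols. It can be read off Table 5.1 with a short calculation; the dictionary layout below is illustrative, and the values are those reported in the table.

```python
# A412 readings from Table 5.1: (0 minutes, 5 minutes).
a412 = {
    "Wild-type flagellin": (0.140, 0.163),
    "pTH890-Cys1": (0.167, 0.221),
    "pTH890-Cys2": (0.145, 0.152),
}

# Increase in absorbance during the 5 minute incubation.
delta_a412 = {
    name: round(t5 - t0, 3)
    for name, (t0, t5) in a412.items()
}

for name, delta in delta_a412.items():
    print(f"{name}: delta A412 = {delta:.3f}")
```

The single-cysteine variant shows the largest increase (0.054), while the double-cysteine variant shows the smallest (0.007), below that of the wild-type protein (0.023). This ordering is what the text interprets as evidence that the two introduced cysteines in pTH890-Cys2 are paired in a disulfide bond.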

The swarming diameters measured for the three cultures were almost identical (pTH890, 31 ± 2 mm; pTH890-Cys1, 30 ± 1.3 mm; pTH890-Cys2, 31 ± 2.2 mm), suggesting that the formation of a disulfide bond did not alter the swarming ability of bacteria in a relatively viscous medium (Figure 5.2b). Video microscopy analysis of the swimming behavior of cells expressing the wild-type protein or the cysteine variants did not show any variation in the swimming ability of the mutants. These results suggest that formation of the disulfide bond at the proposed mechanical switch region does not alter the transition of the fibers from the L-form to the R-form (or vice versa).
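The statement that the swarming diameters are almost identical can be illustrated by checking whether the reported ranges overlap. The sketch below is not a statistical test: the text does not state whether the ± values are standard deviations or standard errors, nor how many plates were measured, so the intervals are treated simply as mean ± reported spread.

```python
# Swarm diameters in mm as reported in the text: (mean, reported +/- value).
swarm_mm = {
    "pTH890": (31.0, 2.0),
    "pTH890-Cys1": (30.0, 1.3),
    "pTH890-Cys2": (31.0, 2.2),
}


def interval(mean, spread):
    """Return the (low, high) range covered by mean +/- spread."""
    return (mean - spread, mean + spread)


def overlaps(a, b):
    """True when two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


wild_type = interval(*swarm_mm["pTH890"])

# Compare each cysteine variant with the wild-type control.
overlap_with_wild_type = {
    name: overlaps(wild_type, interval(*values))
    for name, values in swarm_mm.items()
    if name != "pTH890"
}

for name, result in overlap_with_wild_type.items():
    print(f"{name} range overlaps wild-type range: {result}")
```

Both variant ranges overlap the wild-type range, consistent with the conclusion drawn in the text.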

References

1. Armitage, J. P. (1992). Bacterial motility and chemotaxis. Sci Prog 76, 451-77.
2. Berry, R. M. & Armitage, J. P. (1999). The bacterial flagella motor. Adv Microb Physiol 41, 291-337.
3. Berg, H. C. (2003). The rotary motor of bacterial flagella. Annu Rev Biochem 72, 19-54.
4. DeRosier, D. J. (1998). The turn of the screw: the bacterial flagellar motor. Cell 93, 17-20.
5. Jones, C. J. & Aizawa, S. (1991). The bacterial flagellum and flagellar motor: structure, assembly and function. Adv Microb Physiol 32, 109-72.
6. Macnab, R. M. (2004). Type III flagellar protein export and flagellar assembly. Biochim Biophys Acta 1694, 207-17.
7. Macnab, R. M. (2003). How bacteria assemble flagella. Annu Rev Microbiol 57, 77-100.
8. Samatey, F. A., Imada, K., Nagashima, S., Vonderviszt, F., Kumasaka, T., Yamamoto, M. & Namba, K. (2001). Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling. Nature 410, 331-7.
9. Yonekura, K., Maki, S., Morgan, D. G., DeRosier, D. J., Vonderviszt, F., Imada, K. & Namba, K. (2000). The bacterial flagellar cap as the rotary promoter of flagellin self-assembly. Science 290, 2148-52.
10. Yonekura, K., Maki-Yonekura, S. & Namba, K. (2002). Growth mechanism of the bacterial flagellar filament. Res Microbiol 153, 191-7.
11. Yonekura, K., Maki-Yonekura, S. & Namba, K. (2003). Complete atomic model of the bacterial flagellar filament by electron cryomicroscopy. Nature 424, 643-50.
12. Asakura, S. (1970). [Polymorphism of bacterial flagella]. Tanpakushitsu Kakusan Koso 15, 1372-81.
13. Mimori-Kiyosue, Y., Vonderviszt, F., Yamashita, I., Fujiyoshi, Y. & Namba, K. (1996). Direct interaction of flagellin termini essential for polymorphic ability of flagellar filament. Proc Natl Acad Sci U S A 93, 15108-13.
14. Yamashita, I., Hasegawa, K., Suzuki, H., Vonderviszt, F., Mimori-Kiyosue, Y. & Namba, K. (1998). Structure and switching of bacterial flagellar filaments studied by X-ray fiber diffraction. Nat Struct Biol 5, 125-32.
15. Williams, A. W., Yamaguchi, S., Togashi, F., Aizawa, S. I., Kawagishi, I. & Macnab, R. M. (1996). Mutations in fliK and flhB affecting flagellar hook and filament assembly in Salmonella typhimurium. J Bacteriol 178, 2960-70.
16. Tallant, T., Deb, A., Kar, N., Lupica, J., de Veer, M. J. & DiDonato, J. A. (2004). Flagellin acting via TLR5 is the major activator of key signaling pathways leading to NF-kappa B and proinflammatory gene program activation in intestinal epithelial cells. BMC Microbiol 4, 33.
17. Dombkowski, A. A. (2003). Disulfide by Design: a computational method for the rational design of disulfide bonds in proteins. Bioinformatics 19, 1852-3.
18. Ellman, G. L. (1959). Tissue sulfhydryl groups. Arch Biochem Biophys 82, 70-7.
19. Lu, Z., Murray, K. S., Van Cleave, V., LaVallie, E. R., Stahl, M. L. & McCoy, J. M. (1995). Expression of thioredoxin random peptide libraries on the Escherichia coli cell surface as functional fusions to flagellin: a system designed for exploring protein-protein interactions. Biotechnology (N Y) 13, 366-72.
20. Kumara, M. T., Srividya, N., Muralidharan, S. & Tripp, B. C. (2006). Bioengineered Flagella Protein Nanotubes with Cysteine Loops: Self-Assembly and Manipulation in an Optical Trap. Nano Lett 6, 2121-2129.

CHAPTER VI

HIGH-THROUGHPUT SCREENING FOR ANTIMICROBIAL COMPOUNDS USING A 96-WELL FORMAT BACTERIAL MOTILITY ABSORBANCE ASSAY

Abstract

There is a pressing need to develop new antimicrobial drugs, due to the increasing resistance of pathogenic bacteria to existing antibiotics. The preliminary development and validation of a novel methodology for the high-throughput screening of antimicrobial compounds and inhibitors of bacterial motility is described. This method uses a bacterial motility swarming agar assay combined with both centered and offset inoculation of the wells in a standard clear 96-well plate to enable rapid screening of compounds for potential antibiotic and anti-motility properties with a standard absorbance microplate reader. Thus, the methodology should be compatible with the 96-well and 384-well plate laboratory automation technology typically used in drug discovery and chemical biology studies. To validate the screening method, we screened a structurally diverse library of 960 biologically active compounds, the Genesis Plus compound library, against a motile strain of the Gram-negative bacterial pathogen Salmonella typhimurium. A collection of 60 compounds with well-known antibiotic properties was successfully identified using this assay.

Introduction

Antibiotic resistance in bacteria is currently a growing, major public health problem; bacteria may soon evolve resistance to all known antibiotics.1,2,3 Commercial research in this area has waned over the last several decades and, consequently, few new classes of antibiotics have recently been developed.1,3,4 Bacterial signal transduction components, efflux pumps and genes involved in motility and biofilm formation represent some of the possible targets for antibiotic drug discovery research.5,6 Bacterial motility frequently involves the chemotaxis-controlled rotation of extracellular protein fibers called flagella,7,8 which are also involved in evasion of host immunity,9 host colonization and biofilm formation.10 Thus, the bacterial flagellar system represents a possible target for antimicrobial drugs.

Motility inhibitors could potentially be used with other antimicrobial compounds to prevent the spread of infectious pathogens in the environment or the host organism. A high-throughput screening (HTS) assay for detecting inhibition of flagellar motility or other types of bacterial motility, e.g. pili-mediated motility, could be used for screening compounds for antimicrobial activity and also in fundamental structure-function studies of genes and proteins involved in motility. Here, we report a simple screening method for the identification of small molecules that inhibit either bacterial growth or bacterial motility.

Materials and Methods

Two basic techniques were combined to enable rapid screening for motility

inhibitors; a bacterial swarming agar motility assay and the use of 96-well

microplates common to HTS assay methods. Initial proof of concept was

demonstrated with two Salmonella typhimurium strains: S. typhimurium strain

SJW1103,11 which is wild-type for flagellar motility and non-motile S. typhimurium

strain SJW134,12 which has both the fliC andfljB flagellin genes deleted and cannot

produce flagella fibers. A volume of 200 pi of freshly prepared motility agar,

composed of 1% tryptone (EM Science, Gibbstown, NJ), 0.5% NaCl (Sigma-Aldrich,

St. Louis, MO) and 0.3% agar (Calbiochem, La Jolla, CA), was pipetted into each

well of a sterile 96-well U-bottom plate (Linbro, Flow Laboratories McLean, VA)

using a Matrix Impact2 P1250 programmable 8-channel pipettor (Matrix

Technologies Corp., Hudson, NH) and allowed to solidify at 4 °C overnight. This low

agar media remains in a liquid state until the temperature is lowered to approximately

35 °C; hence, it can be effectively mixed with thermolabile screening compounds.

Liquid 50 ml culture volumes of motile SJW1103 and non-motile SJW134 S.

typhimurium strains were grown to saturation (overnight) in Luria-Bertani (LB)

media at 30 °C, with shaking at 150 rpm. Cultures were poured into a sterile lid

designed for 96-well microtitre plates. A plastic sterile disposable 96-pin tool (V&P

Scientific, Inc., San Diego, CA) was first dipped into the liquid culture, removed, and

then carefully lowered into the agar media to inoculate it. The inoculated agar plates

were incubated for 12-15 hrs at 30 °C in a humidified incubator.

A slight modification of this screening method was used to screen the Genesis

Plus (GenPlus) library of 960 biologically active, structurally diverse compounds

(MicroSource Discovery Systems, Inc., Gaylordsville, CT) for inhibitors of bacterial

growth and motility. A 3 µl volume of each 2 mM stock compound in 100% DMSO

was pipetted into each well of a 96-well tissue culture plate with a CCS Packard

PlateTrak robotic pipetting station (Perkin Elmer, Shelton, CT), followed by addition

of 197 µl of liquefied motility agar media with a Matrix Impact2 P1250

programmable 8-channel pipettor. This yielded a final screening compound

concentration of 30 µM, within the typical range used for screening assays [13]. The

compound and media were then mixed by pipetting up and down 5-6 times to ensure

even distribution of the compound within the well and allowed to solidify overnight.

Bacterial cultures (50 ml) of motile and non-motile S. typhimurium strains were

grown to saturation overnight in LB media at 30-37 °C, with shaking at 225 rpm.

Each culture was pipetted into a sterile 96-well plate and a sterile plastic disposable

96 pin tool (V&P Scientific, San Diego, CA) was dipped into the plate containing

each culture and then used to inoculate the motility agar media in each well of the 96-

well microplates. The plates were incubated for 12-15 hrs at 30 °C in a humidified

incubator. Plates were then inoculated with motile S. typhimurium SJW1103 cells and

incubated overnight for 12-15 hrs at 30 °C under humid conditions.
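
The final compound concentration quoted above follows from a simple dilution calculation. The sketch below reproduces that arithmetic using the volumes and stock concentration given in the text; the function names are our own, and the DMSO percentage is a derived value that is not stated in the original protocol.

```python
def final_concentration_um(stock_mm: float, stock_ul: float, diluent_ul: float) -> float:
    """Concentration (µM) after adding stock_ul of stock to diluent_ul of medium."""
    total_ul = stock_ul + diluent_ul
    return stock_mm * 1000.0 * stock_ul / total_ul  # 1 mM = 1000 µM


def solvent_percent(stock_ul: float, diluent_ul: float) -> float:
    """Percent (v/v) of the stock solvent in the final well volume."""
    return 100.0 * stock_ul / (stock_ul + diluent_ul)


# Values from the text: 3 µl of 2 mM stock plus 197 µl of motility agar.
compound_um = final_concentration_um(stock_mm=2.0, stock_ul=3.0, diluent_ul=197.0)
dmso_pct = solvent_percent(stock_ul=3.0, diluent_ul=197.0)
print(f"final compound concentration: {compound_um:.1f} µM")
print(f"final DMSO content: {dmso_pct:.1f}% (v/v)")
```

The same two functions can be reused to check other stock concentrations or well volumes before a screen is set up.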

The initial agar growth and motility screening procedure was further modified

by varying the position where each well of the agar motility plates was inoculated.

Overnight liquid cultures of motile SJW1103 and non-motile SJW134 S. typhimurium

control strains in LB media were transferred into sterile 25 ml disposable reservoirs

and 200 µl was pipetted into each sample or control well of a sterile 96-well tissue

culture plate with a Matrix Impact2® P250 8-channel pipettor. The PlateTrak

instrument was equipped with a P250 (250 µl × 96 tip capacity) pipetting head with

horizontal X- and Y-axis plate offset capability for dispensing liquids into 384-well

plates. The sterile pipette tips were first lowered into the 96-well source plate

containing the overnight liquid cultures and then raised. A 96-well plate containing

sterile motility agar was then placed under the pipetting station and laterally offset by

the PlateTrak to allow the 96 pipette tips to access the upper right quadrant of each

well, corresponding to one fourth of the wells in a 384-well plate. The tips were

lowered approximately halfway into the agar in each well of the motility agar plate to

inoculate the media and then removed. The plates were then incubated for 12-15 hrs

at 30 °C in a humidified incubator, as described previously.

Results and Discussion

The wild-type S. typhimurium motile cultures spread in all directions of the

motility media within the wells after overnight incubation, while the non-motile

cultures only grew in the immediate vicinity of the inoculation zone, leaving clear

surroundings (Figure 6.1). Thus, the difference in the motile vs. non-motile cells was

readily apparent to the human eye; furthermore, this method might be readily adapted

for simple high throughput screens of potent antibacterial compounds that completely

inhibit growth of the inoculated cells. However, questions remained about the

practicality of adapting this method for high-throughput screening of compounds

against bacterial strains with a microplate spectrophotometer to identify more subtle

differences in inhibition of growth and motility. Therefore, the absorbance of each

well at 550 nm wavelength (OD550) was measured with a SpectraMax Plus384

spectrophotometer (Molecular Devices Corporation, Sunnyvale, CA). However,

differences between the motile and non-motile S. typhimurium cells were not readily

discernible by absorbance measurements with the plate reader, as all OD550 readings

yielded similar high values. This was due to the growth of cells in the inoculated

center region of each well to high densities, regardless of motility. Thus, the results of

this first method were not amenable to differentiating between motile and non-motile

strains with a spectrophotometer (Figure 6.2).

Given that visual inspection was successful in identifying hits, a “high-content”

screening digital camera image analysis system could probably be adapted to detect

hits with this assay format; however, this approach may require a complex and

expensive system that may not be available in many smaller academic and

commercial labs. Visual inspection of the plates, a simple form of high content

screening, yielded a total of 64 hit compounds that significantly inhibited bacterial

growth. All of these compounds possessed known or suspected antimicrobial activity,

e.g., amoxicillin, ciprofloxacin and rifampin. Furthermore, several different types of

growth patterns in the wells were observed: a complete spread of the cells in the well,

a total absence of growth, and very faint growth at the region of inoculation but no

spreading of the cells (Figure 6.1B). One well indicated some growth but incomplete

swarming of the cells and contained the compound hexachlorophene (Figure 6.1B), a

chlorinated bisphenol compound with known bacteriostatic action against

gram-positive organisms [14], but less effective against gram-negative organisms [15].

We also attempted to perform a more quantitative, HTS compatible analysis

of the center-inoculated plates using a plate reader. Following overnight growth, the

absorbance (OD550) was measured for each of two duplicate plates with the

SpectraMax Plus384 plate reader. The plate reader failed to detect significant differences in

the OD550 values of the controls, which were all close to a value of 1.0, as noted

above with the motile and non-motile controls. For example, the average OD550

values of the 80 compound sample wells in two duplicate plates of the GenPlus plate

2 were 0.88 ± 0.22 and 0.90 ± 0.22. The signal to noise ratios for the averaged motile

negative controls vs. non-motile positive controls were 0.89 and 0.97. The statistical

assay quality was determined by calculating the Z factor [16], given in Eq. 1:

$$Z' = 1 - \frac{3\sigma_{c+} + 3\sigma_{c-}}{\mu_{c+} - \mu_{c-}} \tag{1}$$

[Figure 6.1 image panels: (A) wild-type and non-motile wells; (B) wild-type, wild-type + hexachlorophene, and non-motile wells]

Figure 6.1 Comparison of motile and non-motile bacterial growth patterns in 96-well plate-formatted motility agar. (A) Image of entire 96-well plate containing motility agar that was inoculated with motile and non-motile strains of S. typhimurium bacteria and enlarged images of two example wells. A 96-well, U-bottom plate containing sterile Luria-Bertani (LB) 0.3% agar media was inoculated with non-motile SJW134 and wild-type SJW1103 S. typhimurium strains in the upper right quadrant of each well and incubated at 30 °C for 12 hours. Columns 1-6 were inoculated with motile SJW1103 cells and columns 7-12 contained non-motile SJW134 cells. The non-motile cells grew only around the region of inoculation, well away from the center of each well where optical absorbance would typically be measured by a 96-well microplate reader. In contrast, the wild-type cells swarmed across the whole well. (B) Enlarged images showing three types of growth patterns observed for S. typhimurium cells inoculated in a 24-well tissue culture plate. Left image: wild-type SJW1103 cells in plain LB agar. Center image: wild-type SJW1103 cells grown in the presence of 30 µM hexachlorophene, showing partial inhibition of growth and motility. Right image: non-motile SJW134 cells grown in plain LB agar.

[Figure 6.2 bar chart: Average absorbance values of Salmonella controls; series: Motile (centered), Non-motile (centered), Motile (offset), Non-motile (offset)]

Figure 6.2 Comparison of average absorbance readings for motile and non-motile S. typhimurium bacterial cells. When the cells were inoculated in the center of each well the absorbance readings always yielded values close to 1.0. However, when the location of inoculation was offset, i.e., moved to the upper right, the spectrophotometer was able to discriminate between the two strains by large differences in the absorbance readings.

where $\sigma_{c+}$ is the standard deviation of the positive control for the assay, $\sigma_{c-}$ is

the standard deviation for the negative control, and $\mu_{c+}$ and $\mu_{c-}$ are the mean values

for the positive and negative controls. Analysis of the positive and negative control

data for the GenPlus plate 2 OD550 data yielded anomalously high Z values of 5.4 and

38.6. If screening strictly for antimicrobial growth compounds, the Z values would be

greatly improved by using an antibiotic for the positive control instead of the non-

motile strain, using this center-well inoculation method.
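
To make the behavior of Eq. 1 concrete, the following sketch computes the statistic from two lists of control readings. It follows Eq. 1 as printed, with a signed difference of means in the denominator; the optional `absolute` flag uses the absolute difference, which is the form given by Zhang et al. [16]. The control readings are hypothetical numbers invented for illustration and are not data from this study, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_prime(positive, negative, absolute=False):
    """Z' = 1 - (3*sd_pos + 3*sd_neg) / (mean_pos - mean_neg).

    With absolute=True the denominator is |mean_pos - mean_neg|.
    """
    spread = 3 * stdev(positive) + 3 * stdev(negative)
    separation = mean(positive) - mean(negative)
    if absolute:
        separation = abs(separation)
    return 1 - spread / separation


# Hypothetical OD550 control readings, invented for illustration only.
non_motile_positive = [0.10, 0.12, 0.11, 0.09]
motile_negative = [0.95, 1.00, 0.98, 1.02]

z_signed = z_prime(non_motile_positive, motile_negative)
z_abs = z_prime(non_motile_positive, motile_negative, absolute=True)
print(f"signed denominator:   Z' = {z_signed:.2f}")
print(f"absolute denominator: Z' = {z_abs:.2f}")
```

With the absolute difference the statistic cannot exceed 1. With a signed difference it exceeds 1 whenever the positive-control mean is lower than the negative-control mean, and it grows rapidly as the two means approach each other, as they do for center-inoculated controls.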

Although the positive controls did not provide a useful reference, we were able

to statistically analyze the absorbance data with respect to the deviation of individual

sample readings from the average value of each plate. The average ($\bar{x}$) and standard

deviation ($\sigma$) of the OD550 readings were computed for each of the two duplicate

plates using a spreadsheet; the absorbance data in each well were analyzed with

respect to the difference between each well and the average plate value. Strong

inhibitors were defined by the criterion that the OD550 of a well containing a strong

inhibitor was less than the mean plate value by two times the standard deviation

($OD_{550} < \bar{x} - 2\sigma$). This cut-off criterion yielded a total of 60 compounds that

significantly inhibited the growth of the bacteria, or 93.8% of those identified by the

prior visual inspection method. Although no previously unknown antibiotic

compounds were identified among the hits, these results indicate the general validity

of the screening method for detecting antimicrobial compounds, although a further

improvement was possible, as demonstrated below.
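
The cut-off criterion described above can be sketched in a few lines. This is an illustration of the rule OD550 < mean − 2 × standard deviation rather than the spreadsheet actually used; the well names and readings are hypothetical, and the choice of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def call_strong_inhibitors(od_by_well, n_sd=2.0):
    """Return (cutoff, wells) where wells have OD below mean - n_sd * sd."""
    values = list(od_by_well.values())
    cutoff = mean(values) - n_sd * stdev(values)
    hits = sorted(well for well, od in od_by_well.items() if od < cutoff)
    return cutoff, hits


# Hypothetical OD550 readings for ten wells of one plate (illustration only).
plate = {
    "A1": 0.95, "A2": 0.98, "A3": 1.01, "A4": 0.97, "A5": 0.99,
    "A6": 1.00, "A7": 0.96, "A8": 0.10, "A9": 0.98, "A10": 1.02,
}

cutoff, hits = call_strong_inhibitors(plate)
print(f"cutoff OD550 = {cutoff:.3f}; strong inhibitors: {hits}")
```

Because the mean and standard deviation are computed from the whole plate, a plate that contains many strong inhibitors raises the standard deviation and lowers the cut-off, which may explain why a few visually identified hits are missed by this rule.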

The modified “offset inoculation” method was used to inoculate one quadrant

of each well instead of the center well position in motility agar plates. Using this

method, we were able to visually differentiate between the wells inoculated with

motile and non-motile cells after 9-10 hrs of incubation; visual inspection of the

plates readily discriminated between the motile wild-type control and non-motile

controls (Figure 6.1A), with 100% repeatability. The wild-type cultures spread in all

directions of the motility media in the wells, while non-motile cultures only grew in

the immediate vicinity of the inoculation zone. This resulted in clear media in the

center of each well for non-motile cells; this location is where absorbance

measurements by a 96-well plate reader would normally occur, a key advantage for

using automated screening approaches. To demonstrate compatibility with HTS

laboratory automation, growth curves in each well were monitored after inoculation

by absorbance measurements at 550 nm (OD550), using a SpectraMax Plus 384

spectrophotometer. This approach consistently detected a significant difference

between the growth of motile and non-motile S. typhimurium strains (Figure 6.3).

[Figure 6.3 plot: Average Growth of Motile vs. Non-motile Salmonella; x-axis: Time Elapsed (Hours)]

Figure 6.3 Growth curves for motile and non-motile S. typhimurium bacterial cells inoculated in 96-well plate-formatted motility agar. S. typhimurium cells, either wild-type for motility (strain SJW1103) or non-motile (strain SJW134), were inoculated in the upper right quadrant of each well, as shown in Figure 6.1. The time-dependent absorbance readings (OD550) from two identical 96-well plates containing six columns (48 wells) for each bacterial strain were averaged to generate each data point; the error bars represent the computed standard deviation of each set of measurements. The difference in a non-motile strain vs. a motile strain and the rate of growth of the motile strain can be readily determined with this assay method.

The hit compounds previously identified in the visual screen were used to

prepare a master hit plate and retested for inhibition of growth and motility by this

offset inoculation method. The absorbance at 550 nm of each well (OD550) was

measured with the SpectraMax® Plus384 spectrophotometer. The reproducibility of

the assay controls was determined by calculating the Z' factor; the average Z' value

was found to be 0.80 ± 0.08, an acceptable value. Both the type of controls (motile

and non-motile) and the location of inoculation contributed to the quality of the assay.

Because of the different strains of Salmonella used as controls, the assay was able to

consistently detect a wide range of antibiotic activity (Table 6.1).

Encouraged by the initial results of the offset inoculation method with

previously identified hits, the entire 960 compound GenPlus library was rescreened

in duplicate, using the offset inoculation method, to verify the consistency of the

results. The hits identified in this more quantitative assay were 100% identical to the

previous screening methods. The average Z' value for all 12 compound plates was

0.62 ± 0.19, an acceptable value for HTS format assays. These preliminary screening

results, obtained with a model compound library, indicate that the HTS-compatible

“offset inoculation” method can be used to screen compounds for bactericidal,

bacteriostatic, and motility inhibition activity. Alternatively, a microplate

spectrophotometer might be programmed to read a 96-well plate in 384-well

detection mode, with a subsequent statistical analysis of the four offset quadrants in

each well to monitor well-to-well changes in absorbance resulting from bacterial growth

and spreading.
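
Reading a 96-well plate in 384-well detection mode relies on the fact that each 96-well position covers a 2 × 2 block of 384-well read positions. The sketch below shows one way to compute that mapping; the function names are hypothetical, and which read position corresponds to the "upper right" quadrant depends on the plate orientation in the reader, so the labels here are an assumption.

```python
def quadrant_positions(row96: int, col96: int) -> dict:
    """Map a 0-based 96-well (row, col) to its four 0-based 384-well positions."""
    r, c = 2 * row96, 2 * col96
    return {
        "upper_left": (r, c),
        "upper_right": (r, c + 1),
        "lower_left": (r + 1, c),
        "lower_right": (r + 1, c + 1),
    }


def well_name(row: int, col: int) -> str:
    """Convert 0-based (row, col) to a plate label such as 'A1' or 'P24'."""
    return f"{chr(ord('A') + row)}{col + 1}"


# 96-well position A1 is row 0, column 0; H12 is row 7, column 11.
for label, (row, col) in (("A1", (0, 0)), ("H12", (7, 11))):
    names = {quad: well_name(*pos) for quad, pos in quadrant_positions(row, col).items()}
    print(label, names)
```

Grouping the 384-mode readings by quadrant in this way would allow the inoculated quadrant and the three uninoculated quadrants of each well to be compared directly.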

Table 6.1 Complete list of antibacterial compounds detected in screen of Genesis Plus library with offset motility detection method.

No. Compound name                 Functional classification  Antibiotic effectiveness (a, b)
1.  Cephaloridine                 antimicrobial              106.13% (±5.66%)
2.  Ofloxacin                     antimicrobial              105.68% (±5.41%)
3.  Tetracycline hydrochloride    antibacterial              105.24% (±5.01%)
4.  Lomefloxacin hydrochloride    antimicrobial              105.21% (±4.30%)
5.  Flumequine                    antibacterial              105.04% (±5.36%)
6.  Methacycline hydrochloride    antibacterial              104.94% (±5.73%)
7.  Ciprofloxacin                 antibacterial, fungicide   104.92% (±5.12%)
8.  Thimerosal                    anti-infective             104.90% (±5.23%)
9.  Cefuroxime sodium             antimicrobial              104.72% (±5.22%)
10. Cefamandole nafate            antimicrobial              104.60% (±6.13%)
11. Bacampicillin hydrochloride   antibiotic                 104.39% (±3.85%)
12. Trimethoprim                  antibacterial              104.28% (±6.82%)
13. Ceftriaxone sodium            antibiotic                 104.14% (±1.91%)
14. Polymyxin b sulfate           antibacterial              103.80% (±2.81%)
15. Doxycycline hydrochloride     antibacterial              103.73% (±3.03%)
16. Chloroxine                    chelating agent            103.14% (±2.26%)
17. Broxyquinoline                antiseptic; disinfectant   103.07% (±2.50%)
18. Enoxacin                      antibacterial              102.72% (±2.06%)
19. Minocycline hydrochloride     antibacterial              102.53% (±1.74%)
20. Cephapirin sodium             antibacterial              102.31% (±1.11%)
21. Cefoperazone sodium           antimicrobial              102.28% (±4.61%)
22. Meclocycline sulfosalicylate  antibacterial              101.97% (±1.30%)
23. Norfloxacin                   antibacterial              101.82% (±0.07%)
24. Phenylmercuric acetate        fungicide                  101.81% (±1.69%)
25. Furazolidone                  antimicrobial              101.63% (±0.32%)
26. Alexidine hydrochloride       antibacterial              101.57% (±0.86%)
27. Nalidixic acid                antibacterial              101.46% (±0.16%)
28. Oxytetracycline               antibacterial              101.34% (±0.31%)
29. Hetacillin potassium          antibacterial              101.28% (±1.02%)
30. Bergaptene                    antipsoriatic              101.12% (±0.83%)
31. Mitomycin C                   antineoplastic             101.04% (±1.72%)
32. Hexachlorophene               topical anti-infective     100.79% (±1.15%)
33. Rifampin                      antibacterial              100.77% (±2.84%)
34. Aureomycin                    antibacterial              100.38% (±3.24%)
35. Nitrofurantoin                antibacterial              100.20% (±1.97%)
36. Zidovudine                    antiviral                  99.92% (±5.42%)
37. Cefotaxime sodium             antibacterial              99.62% (±4.81%)
38. Chloramphenicol               antibacterial              99.36% (±1.74%)
39. Oxolinic acid                 antimicrobial              98.76% (±3.39%)
40. Amoxicillin                   antibacterial              98.74% (±3.23%)
41. Cefazolin sodium              antibacterial              98.44% (±1.26%)
42. Pyrithione zinc               antibacterial; antifungal  98.27% (±5.00%)
43. Cephalothin sodium            antibacterial              97.57% (±5.34%)
44. Moxalactam disodium           antibacterial              97.48% (±4.70%)
45. Piperacillin sodium           antibacterial              97.37% (±3.75%)
46. Chlorhexidine                 topical antibacterial      97.23% (±2.60%)
47. Demeclocycline                antibacterial              94.96% (±8.16%)
48. Erythromycin                  antibacterial              94.77% (±2.13%)
49. Cinoxacin                     antibacterial              93.86% (±4.32%)
50. Penicillin g potassium        antibiotic                 91.30% (±2.84%)
51. Fluorouracil                  antineoplastic             89.50% (±0.59%)
52. Erythromycin ethylsuccinate   antibacterial              83.04% (±15.13%)
53. Cetylpyridinium chloride      topical anti-infective     78.99% (±5.13%)
54. Thiamphenicol                 antibacterial              71.83% (±2.69%)
55. Benzalkonium chloride         topical anti-infective     60.32% (±55.55%)
56. Pipemidic acid                antimicrobial              59.08% (±59.54%)
57. Floxuridine                   antineoplastic             57.74% (±61.87%)
58. Metampicillin sodium          antimicrobial              52.90% (±71.02%)

a The percentage of motility inhibition for each compound, as normalized by comparison with motile and non-motile S. typhimurium controls on each plate.

b Only compounds with at least 50% inhibition of the motility are listed.
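
The table footnote states that each value was normalized against the motile and non-motile controls on the same plate, but the exact formula is not given. One plausible form of such a normalization is sketched below as an assumption: the motile control defines 0% inhibition and the non-motile control defines 100%. The readings used are hypothetical and are not data from the screen.

```python
def percent_inhibition(od_sample: float, od_motile: float, od_nonmotile: float) -> float:
    """Scale a well reading so motile control = 0% and non-motile control = 100%."""
    return 100.0 * (od_motile - od_sample) / (od_motile - od_nonmotile)


# Hypothetical center-of-well OD550 readings (illustration only).
od_motile_control = 1.00      # cells have spread across the well
od_nonmotile_control = 0.20   # clear medium at the well center
od_compound_well = 0.15       # clearer than the non-motile control

value = percent_inhibition(od_compound_well, od_motile_control, od_nonmotile_control)
print(f"motility inhibition: {value:.2f}%")
```

Under this assumed scaling, a compound well that reads lower than the non-motile control yields a value above 100%, which would be consistent with the values slightly above 100% in Table 6.1.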

References

1. Fishman, N. (2006). Antimicrobial stewardship. Am. J. Med. 119, S53-61; discussion S62-70.
2. Tenover, F. C. (2006). Mechanisms of antimicrobial resistance in bacteria. Am. J. Med. 119, S3-10; discussion S62-70.
3. Yoneyama, H. & Katsumata, R. (2006). Antibiotic resistance in bacteria and its future for novel antibiotic development. Biosci. Biotechnol. Biochem. 70, 1060-75.
4. Payne, D. J., Gwynn, M. N., Holmes, D. J. & Pompliano, D. L. (2007). Drugs for bad bugs: confronting the challenges of antibacterial discovery. Nat. Rev. Drug Discov. 6, 29-40.
5. Danese, P. N. (2002). Antibiofilm approaches: prevention of catheter colonization. Chem. Biol. 9, 873-80.
6. Pages, J. M., Masi, M. & Barbe, J. (2005). Inhibitors of efflux pumps in Gram-negative bacteria. Trends Mol. Med. 11, 382-9.
7. Bardy, S. L., Ng, S. Y. & Jarrell, K. F. (2003). Prokaryotic motility structures. Microbiology 149, 295-304.
8. Macnab, R. M. (2004). Type III flagellar protein export and flagellar assembly. Biochim. Biophys. Acta 1694, 207-17.
9. Andersen-Nissen, E., Smith, K. D., Strobe, K. L., Barrett, S. L., Cookson, B. T., Logan, S. M. & Aderem, A. (2005). Evasion of Toll-like receptor 5 by flagellated bacteria. Proc. Natl. Acad. Sci. USA 102, 9247-52.
10. Costerton, J. W., Stewart, P. S. & Greenberg, E. P. (1999). Bacterial biofilms: a common cause of persistent infections. Science 284, 1318-22.
11. Yamaguchi, S., Fujita, H., Sugata, K., Taira, T. & Iino, T. (1984). Genetic analysis of H2, the structural gene for phase-2 flagellin in Salmonella. J. Gen. Microbiol. 130, 255-65.
12. Williams, A. W., Yamaguchi, S., Togashi, F., Aizawa, S. I., Kawagishi, I. & Macnab, R. M. (1996). Mutations in fliK and flhB affecting flagellar hook and filament assembly in Salmonella typhimurium. J. Bacteriol. 178, 2960-70.
13. Walters, W. P. & Namchuk, M. (2003). Designing screens: how to make your hits a hit. Nat. Rev. Drug Discov. 2, 259-266.
14. Haley, C. E., Marling-Cason, M., Smith, J. W., Luby, J. P. & Mackowiak, P. A. (1985). Bactericidal activity of antiseptics against methicillin-resistant Staphylococcus aureus. J. Clin. Microbiol. 21, 991-2.
15. Maris, P. (1991). [Resistance of 700 gram-negative bacterial strains to antiseptics and antibiotics]. Ann. Rech. Vet. 22, 11-23.
16. Zhang, J. H., Chung, T. D. & Oldenburg, K. R. (1999). A Simple Statistical Parameter for Use in Evaluation and Validation of High Throughput Screening Assays. J. Biomol. Screen. 4, 67-73.

CHAPTER VII

RATIONAL DESIGN OF METAL BINDING SITES AND TESTING THE FEASIBILITY OF CATALYTIC SITES IN FLAGELLIN

Introduction

Computer simulations play an increasingly important role in our

understanding of protein folding, stability, activity, and the specificity of protein-

ligand interactions 1; 2; 3; 4; 5. The methodologies being developed play a significant

role in understanding the basics of protein architecture as well as drug development.

Design methods are being developed which can be used to rationally modify the

structure and function of a protein 3; 6. A great challenge today is to rationally design

specific properties for proteins that can be experimentally put to test. Significant

progress has been made at enhancing the catalytic properties of existing enzymes;

however, the design of proteins with novel catalytic properties has met with relatively

limited success. These are being applied to building proteins with novel metal centers

and the rational manipulation of their chemical reactivity *’4.

Several software programs such as Rosetta 7; 8, ORBIT 6 etc. have been

successfully put to use in designing proteins with novel functions. We believe that

flagellin protein can play a potential role in enhancing the understanding of the basics

of protein catalytic functions that can be tested experimentally with relative ease.

The current study describes a preliminary engineering of metal binding sites and

catalytic sites to modify flagellin, a structural protein, into a functional protein.

Flagella, due to their large size, abundance on the cell surface, and ease of

purification, can serve as a model for protein engineering.

Carbonic anhydrases (CAs) are metalloenzymes that catalyze the reversible

conversion of carbon dioxide to bicarbonate (Equation 1) 9; 10; 11; 12; 13.

There are 5 known CA structural families: the α, β, and γ classes, and the recently

discovered δ and ε classes 9. The primary activity of the enzyme in humans is to

maintain the acid-base balance in blood and other tissues and to help transport carbon

dioxide out of tissues.

$$\mathrm{CO_2 + H_2O \rightleftharpoons HCO_3^- + H^+} \qquad \text{(Equation 1)}$$

The human CA II crystal structure was determined by Eriksson et al. 14 (PDB

1CA2). The active site of the enzyme consists of the three histidine residues

coordinately bound to the Zn2+ metal ion. The zinc ion is ligated to three histidyl

residues and one water molecule in a nearly tetrahedral geometry. The neighboring

residue Glu106 is deprotonated and the Oγ1 atom of Thr199 donates its proton to the

Oε1 atom of Glu106 and can function as a hydrogen bond acceptor only in additional

hydrogen bonds. As a preliminary study we have only considered the three active site

histidine residues for the design of metal binding regions in the flagellin protein. The

different parameters for the active site of the CA II histidine residues are given in

Table 7.1.

The structure of Salmonella typhimurium flagellin protein (FliC) is described

in detail in Chapter I. We have designed metal binding sites with 3 histidines,

similar to the CA II site, in three different regions of the flagellin protein, labeled Site

I, Site II and Site III (Figure 7.1). Site I is located in an outer loop region of the D2b

domain. Site II is located at the interface of D2 and D3 domains, and Site III is

located at the interface of the D1 and D2b domain (Figure 7.1). These locations were

predicted to have active site-like features by the Computed Atlas of Surface

Topography of proteins (CASTp) software server http://sts.bioengr.uic.edu/castp/,

which analyzes protein structures for structural pockets and cavities that are often

potential binding sites and active sites of proteins 15; 16. As a preliminary experiment,

two of the three potential sites (Site II and Site III) were individually constructed in FliC

by site-directed mutagenesis and tested for metal binding ability with the

transition metal ion Co2+ and for catalytic activity with both Zn2+ and Co2+ ions.

Figure 7.1 Location of the three rationally designed metal binding sites in the flagellin (FliC) structure (PDB 1UCU). Site I is present in the loop region of D2b domain, Site II is located at the interface of D2 and D3 domains and Site III is located at the interface of the D1 and D2b domain.

Table 7.1 Parameters used for the construction of metal binding sites in FliC protein.

Distances between Zn2+ ion-coordinated N atoms     Side chain angles
in histidine imidazole side chains

CA II parameters (PDB 1CA2)
His94 NE2 - His119 ND1: 3.27 Å                     His119: 30.51 degrees
His94 NE2 - His96 NE2: 3.31 Å                      His94: 35.30 degrees
His96 NE2 - His119 ND1: 3.11 Å                     His96: 35.88 degrees

FliC Site I
His396 NE2 - His393 NE1: 3.00 Å                    His396: 32.29 degrees
His396 NE2 - His388 NE2: 3.18 Å                    His388: 33.19 degrees
His393 NE1 - His388 NE2: 4.00 Å                    His393: 32.27 degrees

FliC Site II
His190 ND2 - His282 NE1: 2.71 Å                    His190: 32.65 degrees
His190 ND2 - His222 NE1: 4.19 Å                    His282: 32.66 degrees
His282 NE1 - His222 NE1: 4.82 Å                    His222: 32.29 degrees

FliC Site III
His116 NE2 - His120 ND1: 2.45 Å                    His116: 32.65 degrees
His116 NE2 - His384 NE2: 3.83 Å                    His384: 32.65 degrees
His384 NE2 - His120 ND1: 3.97 Å                    His120: 32.66 degrees

Materials and Methods

Bacterial strains

E. coli XL-1 Blue electrocompetent cells (Stratagene, La Jolla, CA) were used for the

transformation of the mutant plasmids. These cells have a transformation efficiency

greater than 1 × 10^10 transformants/µg of DNA, are tetracycline resistant,

endonuclease (endA−) deficient, which greatly improves the quality of miniprep DNA,

and are recombination (recA−) deficient, improving insert stability. S. typhimurium

strain SJW134 (ΔfliC and ΔfljB), which was derived from parent strain SJW806, is

wild-type except for the deletion of the phase 1 fliC and phase 2 fljB flagellin genes 17;

18. This strain is non-motile, unless a functional flagellin gene is introduced, e.g., on

an expression plasmid 17. Thus, it serves as an ideal strain to screen engineered or

mutated fliC genes, present in the introduced plasmids, for functional flagellin protein

expression, folding, export and flagellar fiber assembly, as indicated by a simple agar

microbial motility assay. S. typhimurium strain JR501, a stable restriction-deficient (r−),

methylation-proficient (m+) galE strain 19; 20, was used to convert E. coli grown

plasmids to S. typhimurium compatibility. Electrocompetent cells of S. typhimurium

strains JR501 and SJW134 were prepared using standard lab protocols 21; 22. The

pTH890 plasmid is a derivative of pTrc99A plasmid with the S. typhimurium fliC

phase 1 flagellin gene cloned into the XbaI/HindIII site (Figure 7.2). The pTH890

plasmid was kindly provided by the late Dr. Robert Macnab (Yale University). This

plasmid was used for the mutagenesis.

Design of the metal binding sites

The metal binding sites were designed by using the PDB structure of CA II (PDB 1CA2) as a

template. The metal-coordinating histidine residues have the following parameters.

Distances between Zn2+ ion-coordinated N atoms in histidine imidazole side chains

are as follows: His94 NE2 - His119 ND1 atoms: 3.27 Å; His94 NE2 - His96 NE2

atoms: 3.31 Å; His96 NE2 - His119 ND1 atoms: 3.11 Å. Distances between the Zn2+ ion

and Zn2+ ion-coordinated N atoms in histidine imidazole side chains are as follows:

Zn2+ - His94 NE2 atoms: 1.99 Å; Zn2+ - His96 NE2 atoms: 2.10 Å; Zn2+ - His119

ND1 atoms: 1.91 Å. The side chain angles of CA II are as follows: His119: 30.51 degrees; His94:

35.30 degrees; His96: 35.88 degrees (Table 7.1). Using these structural data,

three possible sites were designed in the FliC protein. The SwissPDBViewer software

version 3.7 23 was used to manually mutate different residues in the predicted regions

to histidine and select the side chain angles with least steric hindrance for the design

of the sites.
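
As a rough way to see how closely each designed site approaches the CA II template, the N-N distances listed in Table 7.1 can be compared numerically. The sketch below is not a step described in this study: it sorts the three distances of each site and of the template and reports the largest difference, which is a crude, assumed measure that ignores which histidine pairs correspond to one another.

```python
# N-N distances (Å) between coordinating histidine nitrogens, from Table 7.1.
TEMPLATE_CA2 = [3.27, 3.31, 3.11]
DESIGNED_SITES = {
    "Site I": [3.00, 3.18, 4.00],
    "Site II": [2.71, 4.19, 4.82],
    "Site III": [2.45, 3.83, 3.97],
}


def max_deviation(site, template):
    """Largest absolute difference between sorted distance triplets (Å)."""
    return max(abs(s - t) for s, t in zip(sorted(site), sorted(template)))


for name, distances in DESIGNED_SITES.items():
    print(f"{name}: max deviation from CA II = {max_deviation(distances, TEMPLATE_CA2):.2f} Å")
```

A fuller comparison would superimpose the atomic coordinates of the coordinating nitrogen atoms; the sorted-distance measure is used here only because it needs no data beyond the table.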

Site-directed mutagenesis

Site-directed mutagenesis of pTH890 plasmid (1 µl PfuTurbo® DNA polymerase, 50

ng of pTH890 plasmid DNA, 5 µl 10× PCR buffer, 1 µl of 2.5 mM dNTP mix, 1.25

µl of 100 ng/µl forward and reverse primers and 36 µl of diH2O) was performed

using the QuikChange® Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA).

Please refer to Table 7.2 for the primers.

[Figure 7.2 image panels: CA active site (His94, His96, His119); Site I (His388, His393, His396); Site II (His190, His222, His282); Site III (His116, His120, His384)]

Figure 7.2 Pictorial depiction of three rationally designed metal binding sites in the FliC protein. Please refer to Table 7.1 for the description of the parameters for each metal binding site.

Table 7.2 Mutagenesis primers.

Mutation  Primer Sequence

Q393H     5' CACAACTTTAAAGCACACCCTGATCTGGCGGAAG 3'
          5' CTTCCGCCAGATCAGGGTGTGCTTTAAAGTTGTG 3'

L396H     5' CTTTAAAGCACAGCCTGATCATGCGGAAGCGGCTGCTACAAC 3'
          5' GTTGTAGCAGCCGCTTCCGCATGATCAGGCTGTGCTTTAAAG 3'

Y190H     5' GCTGCAACTGTTACAGGACATGCCGATACTACGATTGC 3'
          5' GCAATCGTAGTATCGGCATGTCCTGTAACAGTTGCAGC 3'

F222H     5' GATGGCGATTTAAAACATGATGATACGACTGG 3'
          5' CCAGTCGTATCATCATGTTTTAAATCGCCATC 3'

Q282H     5' GAGGATGTGAAAAATGTACACGTTGCAAATGCTG 3'
          5' CAGCATTTGCAACGTGTACATTTTTCACATCCTC 3'

T116H     5' CTCCATCCAGGCTGAAATCCACCAGCGCCTGAACGAAATC 3'
          5' GATTTCGTTCAGGCGCTGGTGGATTTCAGCCTGGATGGAG 3'

N120H     5' CTGAAATCACCCAGCGCCTGCATGAAATCGACCGTGTATCCG 3'
          5' CGGATACACGGTCGATTTCATGCAGGCGCTGGGTGATTTCAG 3'

K384H     5' GGTAAAACTTACGCTGCAAGTCACGCCGAAGGTCACAAC 3'
          5' GTTGTGACCTTCGGCGTGACTTGCAGCGTAAGTTTTACC 3'
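Mutagenesis primer pairs of this kind are normally exact reverse complements of each other, which makes transcription errors in a primer table easy to detect. The sketch below checks this for the pairs listed in Table 7.2. The helper names are illustrative, and treating the first sequence of each pair as the forward primer is an assumption, since the table does not label the strands.

```python
# Complement map for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer pairs as listed in Table 7.2 (first sequence assumed to be the forward primer).
PRIMERS = {
    "Q393H": ("CACAACTTTAAAGCACACCCTGATCTGGCGGAAG",
              "CTTCCGCCAGATCAGGGTGTGCTTTAAAGTTGTG"),
    "L396H": ("CTTTAAAGCACAGCCTGATCATGCGGAAGCGGCTGCTACAAC",
              "GTTGTAGCAGCCGCTTCCGCATGATCAGGCTGTGCTTTAAAG"),
    "Y190H": ("GCTGCAACTGTTACAGGACATGCCGATACTACGATTGC",
              "GCAATCGTAGTATCGGCATGTCCTGTAACAGTTGCAGC"),
    "F222H": ("GATGGCGATTTAAAACATGATGATACGACTGG",
              "CCAGTCGTATCATCATGTTTTAAATCGCCATC"),
    "Q282H": ("GAGGATGTGAAAAATGTACACGTTGCAAATGCTG",
              "CAGCATTTGCAACGTGTACATTTTTCACATCCTC"),
    "T116H": ("CTCCATCCAGGCTGAAATCCACCAGCGCCTGAACGAAATC",
              "GATTTCGTTCAGGCGCTGGTGGATTTCAGCCTGGATGGAG"),
    "N120H": ("CTGAAATCACCCAGCGCCTGCATGAAATCGACCGTGTATCCG",
              "CGGATACACGGTCGATTTCATGCAGGCGCTGGGTGATTTCAG"),
    "K384H": ("GGTAAAACTTACGCTGCAAGTCACGCCGAAGGTCACAAC",
              "GTTGTGACCTTCGGCGTGACTTGCAGCGTAAGTTTTACC"),
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_pairs(primers):
    """Map each mutation name to True if its two primers are reverse complements."""
    return {name: reverse_complement(fwd) == rev for name, (fwd, rev) in primers.items()}


CHECKS = check_pairs(PRIMERS)

if __name__ == "__main__":
    for name, ok in CHECKS.items():
        print(f"{name}: {'complementary' if ok else 'MISMATCH'}")
```

A mismatch reported by such a check would point to a typographical error in one of the two sequences rather than to a design problem.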

Partial purification of fibers and SDS-PAGE

Engineered active-site flagellins, in the self-assembled form of flagella fibers, were partially purified by shearing. The successful isolation of flagellins by this method indicated that the mutated flagellin subunits were exported and assembled to form functional flagella fibers. Bacterial cultures were inoculated into 15 ml of LB broth containing 15 µl of 100 mg/ml ampicillin and grown overnight at 37 °C. The 15 ml cultures were transferred to 15 ml polypropylene tubes and vortexed for 30 seconds to 1 minute. The cells were pelleted at 6000 x g, and the supernatant was transferred to 15 ml Amicon Ultra centrifugal filtration tubes (30 kDa cutoff; Millipore Corp., Billerica, MA) to concentrate the flagella fibers. The concentrates were resolved by 10% SDS-PAGE. The protein concentration was standardized at 2 mg/ml.

Dialysis of the mutants

The partially purified flagella fibers were dialyzed overnight against EDTA buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to remove any preexisting bound or unbound metal ions. The overnight dialysis was performed with a buffer volume 50 times the protein solution volume, using Slide-A-Lyzer™ dialysis cassettes with a 30 kDa molecular weight cutoff (MWCO) (Pierce, Rockford, IL). The dialysis step was repeated with a buffer change for 6 hrs. The fibers were then dialyzed for 24 hrs against 25 mM MOPS (pH 7.5) containing 150 mM KCl and 2 mM CoCl2, followed by dialysis for 4 hrs against metal-free MOPS-KCl buffer to remove unbound metal ions. After dialysis, the absorbance of the protein samples was scanned from 200 to 800 nm with a Beckman DU-530 spectrophotometer.
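The effect of the 50-fold buffer excess can be estimated with a simple dilution calculation. The sketch below assumes that a freely diffusing solute reaches full equilibrium across the membrane in every round, which is an idealization not stated in the text. It ignores metal that remains bound to the protein, and the function name is illustrative.

```python
def residual_fraction(buffer_to_sample_ratio, rounds):
    """Fraction of a freely diffusing solute left in the sample after dialysis.

    Assumes complete equilibration in each round, so one round dilutes the
    solute by 1 / (1 + buffer_to_sample_ratio).
    """
    per_round = 1.0 / (1.0 + buffer_to_sample_ratio)
    return per_round ** rounds


if __name__ == "__main__":
    # Two rounds at a 50:1 buffer-to-sample volume ratio, as in the EDTA steps.
    for n in (1, 2):
        print(f"after {n} round(s): {residual_fraction(50, n):.2e} of the free solute remains")
```

Under these assumptions, two rounds at a 50:1 ratio leave well under 0.1% of the originally free solute in the sample. Shorter steps, such as the 4 hr wash, may not reach equilibrium, so this figure should be read as a best case.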

Enzyme assay

The catalytic esterase activity of the engineered flagellins was monitored by the hydrolysis of the nonphysiological but commercially available ester 4-nitrophenyl acetate (4-NPA) (Sigma-Aldrich, St. Louis, MO). The products of the hydrolysis reaction are acetate and 4-nitrophenol, which ionizes to give the bright yellow 4-nitrophenolate anion that is detected by measuring its absorbance at 348 to 400 nm.
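Absorbance readings from such an assay are usually converted to a hydrolysis rate with the Beer-Lambert law, after subtracting the uncatalyzed background hydrolysis measured in a protein-free blank. The sketch below illustrates that arithmetic only. The function names are hypothetical, the blank subtraction is common practice rather than a step described in the text, and the extinction coefficient must be supplied by the user for the wavelength and pH actually used.

```python
def slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def hydrolysis_rate(times_min, absorbances, epsilon_per_m_cm, path_cm=1.0):
    """Product formation rate in mol/L per minute from an absorbance time course.

    Beer-Lambert: A = epsilon * path * concentration, so dC/dt = (dA/dt) / (epsilon * path).
    """
    return slope(times_min, absorbances) / (epsilon_per_m_cm * path_cm)


def net_rate(sample_rate, blank_rate):
    """Rate attributable to the protein after subtracting the blank."""
    return sample_rate - blank_rate


if __name__ == "__main__":
    # Hypothetical readings and a placeholder extinction coefficient, for illustration only.
    t = [0, 1, 2, 3]
    sample = hydrolysis_rate(t, [0.100, 0.112, 0.124, 0.136], epsilon_per_m_cm=5000.0)
    blank = hydrolysis_rate(t, [0.100, 0.110, 0.120, 0.130], epsilon_per_m_cm=5000.0)
    print(f"net rate: {net_rate(sample, blank):.2e} M/min")
```

A net rate that is indistinguishable from zero, as reported for the engineered fibers, means that the sample and blank time courses have the same slope within experimental error.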

Results and Discussion

The parameters of the designed metal binding sites are listed in Table 7.1 and depicted in Figure 7.2. Spectroscopic scans were taken of the native flagella fibers and of the engineered FliC Site II and FliC Site III fibers. No differences were observed in the absorbance wavelengths or values between the pre- and post-dialysis samples. This preliminary result suggests that the metal ions did not coordinate with the histidine residues. The enzyme assay performed with the cobalt chloride-dialyzed native and mutated fibers also showed no increase in esterase activity.

The above results suggest that the cobalt ion did not coordinate with the designed histidine residues, indicating a failure of this simple preliminary approach to active site design in a protein. The anticipated distances and geometries of the histidine metal ligands may not have been attained. In a related example, a histidine residue was predicted to participate in transition metal binding in a rationally designed “histidine-patch” thioredoxin mutant. However, this histidine residue, His6, failed to do so for reasons that are not well understood (24).

One of the major impediments to predicting the conformation of protein structures is limited computational power. The side chain structures were designed using software that considers only the rigid crystal structure. Amino acid side chains in solution can assume numerous conformations that may not be accurately described by the rotamers available in the software used in this study. A more detailed rational design approach to this problem, involving the use of protein design software, might yield a functional metal binding site. Alternatively, random mutagenesis and screening techniques, common to “directed evolution” methods of enzyme engineering (25), might be another route to creating a functional flagellin protein with metal binding and catalytic activity.

References

1. Dwyer, M. A., Looger, L. L. & Hellinga, H. W. (2003). Computational design of a Zn2+ receptor that controls bacterial gene expression. Proc Natl Acad Sci USA 100, 11255-60.
2. Yang, W., Jones, L. M., Isley, L., Ye, Y., Lee, H. W., Wilkins, A., Liu, Z. R., Hellinga, H. W., Malchow, R., Ghazi, M. & Yang, J. J. (2003). Rational design of a calcium-binding protein. J Am Chem Soc 125, 6165-71.
3. Dwyer, M. A., Looger, L. L. & Hellinga, H. W. (2004). Computational design of a biologically active enzyme. Science 304, 1967-71.
4. Allert, M., Rizk, S. S., Looger, L. L. & Hellinga, H. W. (2004). Computational design of receptors for an organophosphate surrogate of the nerve agent soman. Proc Natl Acad Sci USA 101, 7907-12.
5. Yang, W., Wilkins, A. L., Ye, Y., Liu, Z. R., Li, S. Y., Urbauer, J. L., Hellinga, H. W., Kearney, A., van der Merwe, P. A. & Yang, J. J. (2005). Design of a calcium-binding protein with desired structure in a cell adhesion molecule. J Am Chem Soc 127, 2085-93.
6. Bolon, D. N. & Mayo, S. L. (2001). Enzyme-like proteins by computational design. Proc Natl Acad Sci USA 98, 14274-9.
7. Kuhlman, B., Dantas, G., Ireton, G. C., Varani, G., Stoddard, B. L. & Baker, D. (2003). Design of a novel globular protein fold with atomic-level accuracy. Science 302, 1364-8.
8. Dantas, G., Kuhlman, B., Callender, D., Wong, M. & Baker, D. (2003). A large scale test of computational protein design: folding and stability of nine completely redesigned globular proteins. J Mol Biol 332, 449-60.
9. Iyer, R., Barrese, A. A., 3rd, Parakh, S., Parker, C. N. & Tripp, B. C. (2006). Inhibition profiling of human carbonic anhydrase II by high-throughput screening of structurally diverse, biologically active compounds. J Biomol Screen 11, 782-91.
10. Silverman, D. N. (1991). The catalytic mechanism of carbonic anhydrase. Can. J. Bot. 69, 1070-1078.
11. Lindskog, S. (1997). Structure and Mechanism of Carbonic Anhydrase. Pharmacol. Ther. 74, 1-20.
12. Tripp, B. C., Smith, K. & Ferry, J. G. (2001). Carbonic anhydrase: new insights for an ancient enzyme. J Biol Chem 276, 48615-8.
13. Tripp, B. C., Tu, C. & Ferry, J. G. (2002). Role of Arginine 59 in the gamma-Class Carbonic Anhydrases. Biochemistry 41, 669-678.
14. Eriksson, A. E., Jones, T. A. & Liljas, A. (1988). Refined structure of human carbonic anhydrase II at 2.0 Å resolution. Proteins 4, 274-82.
15. Binkowski, T. A., Naghibzadeh, S. & Liang, J. (2003). CASTp: Computed Atlas of Surface Topography of proteins. Nucleic Acids Res 31, 3352-5.
16. Dundas, J., Ouyang, Z., Tseng, J., Binkowski, A., Turpaz, Y. & Liang, J. (2006). CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues. Nucleic Acids Res 34, W116-8.
17. Williams, A. W., Yamaguchi, S., Togashi, F., Aizawa, S. I., Kawagishi, I. & Macnab, R. M. (1996). Mutations in fliK and flhB affecting flagellar hook and filament assembly in Salmonella typhimurium. J Bacteriol 178, 2960-70.
18. Tallant, T., Deb, A., Kar, N., Lupica, J., de Veer, M. J. & DiDonato, J. A. (2004). Flagellin acting via TLR5 is the major activator of key signaling pathways leading to NF-kappa B and proinflammatory gene program activation in intestinal epithelial cells. BMC Microbiol 4, 33.
19. Ryu, J. & Hartin, R. J. (1990). Quick transformation in Salmonella typhimurium LT2. BioTechniques 8, 43-5.
20. Tsai, S. P., Hartin, R. J. & Ryu, J. (1989). Transformation in restriction-deficient Salmonella typhimurium LT2. J Gen Microbiol 135, 2561-7.
21. Dower, W. J., Miller, J. F. & Ragsdale, C. W. (1988). High efficiency transformation of E. coli by high voltage electroporation. Nucleic Acids Res 16, 6127-45.
22. Song, S., Zhang, T., Qi, W., Zhao, W., Xu, B. & Liu, J. (1993). Transformation of Escherichia coli with foreign DNA by electroporation. Chin J Biotechnol 9, 197-201.
23. Kaplan, W. & Littlejohn, T. G. (2001). Swiss-PDB Viewer (Deep View). Brief Bioinform 2, 195-7.
24. Lu, Z., DiBlasio-Smith, E. A., Grant, K. L., Warne, N. W., LaVallie, E. R., Collins-Racie, L. A., Follettie, M. T., Williamson, M. J. & McCoy, J. M. (1996). Histidine Patch Thioredoxins. J. Biol. Chem. 271, 5059-5065.
25. Altamirano, M. M., Blackburn, J. M., Aguayo, C. & Fersht, A. R. (2000). Directed evolution of new catalytic activity using the alpha/beta-barrel scaffold. Nature 403, 617-22.

CHAPTER VIII

FUTURE STUDIES

Study of the conformational state of Salmonella typhimurium flagellin protein during type III export and assembly into flagella

The specific aim of this study is to determine the extent of folding and the conformational state(s) of the Salmonella typhimurium phase 1 flagellin protein, both in vivo, after complete polypeptide synthesis and prior to export and assembly, and while it is being transported through the central 20 Å diameter channel of the flagella fiber. The phase 1 flagellin protein does not contain any cysteine (Cys) or tryptophan (Trp) residues. These two residues can be introduced by site-directed mutagenesis of the corresponding pTH890 fliC expression plasmid. A Trp residue can be used as an intrinsic fluorescent probe, or as a fluorescence energy donor paired with an introduced fluorescence acceptor, e.g. IAEDANS dye, in Förster fluorescence resonance energy transfer (FRET) studies of protein folding.
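FRET reports on folding because the transfer efficiency depends steeply on the donor-acceptor distance r, following the Förster relation E = 1 / (1 + (r/R0)^6), where R0 is the distance at which transfer is 50% efficient. The sketch below evaluates this relation and its inverse. The function names are illustrative, and the R0 of 22 Å used in the example is an assumed, typical literature value for a Trp-IAEDANS pair, not a number taken from this work.

```python
def fret_efficiency(r, r0):
    """Forster transfer efficiency for donor-acceptor distance r and Forster radius r0."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(efficiency, r0):
    """Invert the Forster relation to estimate distance from a measured efficiency (0 < E < 1)."""
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    R0 = 22.0  # angstroms; assumed illustrative value for a Trp-IAEDANS pair
    for r in (11.0, 22.0, 33.0, 44.0):
        print(f"r = {r:4.1f} A -> E = {fret_efficiency(r, R0):.3f}")
```

Because of the sixth-power dependence, the efficiency changes most sharply for distances near R0, so probe positions are best chosen such that the folded and unfolded states place the pair on opposite sides of that distance.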

Cys residues can also be introduced; these can either spontaneously cross-link to form covalent bonds in oxidizing environments or provide a uniquely reactive point of attachment for synthetic fluorescent dye molecules, for use as acceptor probes in FRET studies of the folded state of the protein. Therefore, one or more Cys and/or Trp residues can be introduced into any desired location of the flagellin protein. One proposed set of experiments will involve introducing pairs of Cys residues into various locations throughout the four domains, D0, D1, D2 and D3. Assuming that they are appropriately positioned next to each other in the folded or unfolded flagellin structure, these pairs of Cys residues should spontaneously form covalent disulfide bridges in the oxidizing environment of the bacterial periplasmic space. Previous in vitro studies have shown that flagellin proteins are highly disordered as monomers and assume a globular shape, with an α-helical coiled-coil conformation at the N- and C-termini, upon polymerization. These interactions can be further studied by designing disulfide bridges in different domains of the flagellin protein and testing the resulting variants

for successful export and assembly. The hypothesis is that disulfide bonds that are

formed prior to export.

The disulfide bridges would be designed using Disulfide by Design, a software application for the rational design of disulfide bonds in proteins, created by Dr. Alan Dombkowski, Wayne State University (Detroit, MI). The algorithm takes a Protein Data Bank (PDB)-format protein structure (e.g., determined by X-ray diffraction or NMR) as input. All possible pairs of amino acid residues are then rapidly assessed for proximity and geometry consistent with disulfide formation, assuming that the residues were mutated to cysteines. The output lists all residue pairs that meet the appropriate criteria. The Disulfide by Design algorithm has been used successfully in disulfide engineering to study protein dynamics and interactions. The PDB structure of S. typhimurium flagellin, 1UCU, was used as the input for the Disulfide by Design program. The algorithm identified 36 potential amino acid pairs that, when substituted with Cys residues, have a strong potential for forming disulfide bonds. As an example of the proposed preliminary experiments, five substitution pairs located in different domains of the protein were selected from the set of 36. The selected substitutions are L17C and L479C (D0 domain); A50C and I453C (D1 domain); T171C and A400C (D2a domain); Y178C and Y311C (D2b domain); and G189C and G340C (D3 domain).
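The proximity part of such a pairwise screen can be illustrated with a much simpler filter. The sketch below is not the Disulfide by Design algorithm, which also evaluates bond geometry. It only lists residue pairs whose beta-carbon atoms lie within a distance cutoff. The 4.5 Å cutoff, the minimum sequence separation, the function name and the toy coordinates are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def candidate_pairs(cb_coords, max_cb_distance=4.5, min_seq_separation=2):
    """List residue pairs whose C-beta atoms are close enough to suggest a disulfide.

    cb_coords maps residue number -> (x, y, z) of the C-beta atom, in angstroms.
    Returns tuples of (residue_a, residue_b, distance rounded to 0.01 A).
    """
    hits = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(cb_coords.items()), 2):
        if res_b - res_a < min_seq_separation:
            continue  # skip sequence neighbours, which are always close in space
        d = dist(xyz_a, xyz_b)
        if d <= max_cb_distance:
            hits.append((res_a, res_b, round(d, 2)))
    return hits


# Hypothetical coordinates, not taken from PDB entry 1UCU.
TOY_CB = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    50: (0.0, 4.0, 0.0),
    90: (20.0, 0.0, 0.0),
}

if __name__ == "__main__":
    for res_a, res_b, d in candidate_pairs(TOY_CB):
        print(f"residues {res_a} and {res_b}: C-beta distance {d} A")
```

In practice the coordinates would be parsed from the PDB file, and pairs that pass the distance filter would still need the geometric checks that the dedicated software performs.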

Self-assembly of GFP-flagellin fusion proteins in vitro

As described in Chapter IV, approximately 100 flagellin-GFP fusion proteins, with GFP as an internal fusion, were successfully generated by transposon insertion, as identified by screening colonies for green fluorescence and confirmed by colony PCR screening. However, none of the GFP-flagellin variants identified in this study yielded functional flagella, as indicated by negative motility assays. Although the functional screening for motility was not exhaustive, these results indicate that GFP does not readily serve as a good model protein for insertion into flagellin proteins to create a functional flagella nanotube with genetically encoded fluorescence. The limiting factor in this case could be the pore size of the flagella export apparatus. Purified monomeric flagellin proteins from mesophilic bacteria can rapidly polymerize in vitro, especially if incubated with small flagella fiber seed fragments. The export hurdle could therefore potentially be overcome by isolating the flagellin-GFP fusion proteins from the cells and polymerizing them in vitro. The proteins can be purified after over-expression in Salmonella or E. coli cells, and standard flagellin polymerization methods can be used to assemble the fibers. Fiber formation can then be verified using transmission electron microscopy (TEM), light scattering and, possibly, gel filtration chromatography with a large pore size resin.

Conformational study of various flagellin deletion variants using circular dichroism and fluorescence spectroscopy

Chapter II describes the creation of an internal deletion variant library of Salmonella typhimurium flagellin and the predictive modeling of some of the structures. Although the models predict the possible conformational states of each deletion variant, they need to be experimentally verified. Here, we propose to study at least five different variants using circular dichroism (CD) spectroscopy to examine secondary structure formation and content, and fluorescence spectroscopy to examine global folding of the flagellin structures. The chemical denaturation and thermal melting properties of the variant proteins can also be studied using these methods; fluorescent probes could also be attached, using the previously described methods of site-directed mutagenesis to introduce reactive Cys residues into the flagellin variant sequences.

Engineering of catalytic sites in flagellin protein

Even though our preliminary study of engineering catalytic sites was not successful, this approach can be explored further by designing sites with better algorithms and by testing a larger and more diverse set of active sites. A directed evolution approach could also be used to generate a catalytic flagellin protein.

Insertion of single domain antibodies in flagellin hypervariable region

In our previous study we demonstrated that the flagellin hypervariable region can be removed to a large extent, creating shorter functional flagellin variants. These short flagellins can be used as scaffolds for the insertion of single domain antibodies, which could have implications for diagnostics, vaccine research, etc.

Appendix A Recombinant DNA Biosafety Approval

Western Michigan University

Recombinant DNA Biosafety Committee

Registration for Recombinant Research - 2007

This form must be submitted for all research involving recombinant DNA molecules. Renewal of approval is required annually.

[illegible]: Guidelines for Research Involving Recombinant DNA Molecules (NIH Guidelines).

General Registration Information

Principal Investigator: Brian C. Tripp, PhD.
Office Phone: [illegible]
Laboratory Phone: [illegible]
Department: [illegible]
Electronic Mail Address: [illegible]
Office Address: Room [illegible], Building: Wood Hall

Laboratory Where Research is to be Conducted: Building: Haenicke Hall and WMU Engineering Campus. Room: [illegible]

Project Title: Engineering and Display of Enzymes and Proteins on Bacterial Flagella Fibers

Project Start Date: [illegible]

Current status (check one):
[ ] Initiated. Date:
[ ] Will be initiated. Date:
[X] Continuing (no changes). Current Project Number: Bfi-04-BTa
[ ] Continuing (modifications). Current Project Number:
[ ] Will not be initiated or will be discontinued
[ ] Completed. Date:

If Part of a Grant Proposal, List Agency/Agencies: NASA, U.S. Army, [illegible]
WMU Proposal Tracking Number (if applicable):

Revised WMU RDBC. All other copies obsolete.

The experiments covered under this registration form involve the use of several strains of the Risk Group [illegible]

The section entitled Appendix B-II - Risk Group 2 (RG2) Agents of the April 2002 version of the NIH Guidelines gives the definition of RG2 agents: "RG2 agents are associated with human disease which is rarely serious and for which preventive or therapeutic interventions are often available."

Appendix B-II-A of the NIH Guidelines lists Risk Group 2 (RG2) - Bacterial Agents Including Chlamydia, which includes the statement "Salmonella including S. arizonae, S. cholerasuis, S. enteritidis, S. gallinarum-pullorum, S. meleagridis, S. paratyphi, A, B, C, S. typhi, S. typhimurium."

B. Non-Pathogenic [illegible]

Other experiments in these recombinant DNA protocols involve the use of standard, non-pathogenic E. coli K-12 host-plasmid systems such as DH5-alpha and BL21-DE3 strains, which are considered exempt from the NIH Guidelines. As stated in the NIH Guidelines, Appendix C-II, Escherichia coli K-12 Host-Vector Systems: "Experiments which use Escherichia coli K-12 host-vector systems, with the exception of those experiments listed in Appendix C-II-A, are exempt from the NIH Guidelines provided that (i) the Escherichia coli host does not contain conjugation proficient plasmids or generalized transducing phages; or (ii) lambda or lambdoid or Ff bacteriophages or non-conjugative plasmids (see Appendix C-VII, Footnotes and References of Appendix C, Footnotes and References of Appendix C) shall be used as vectors. However, experiments involving the insertion into Escherichia coli K-12 of DNA from prokaryotes that exchange genetic information (see Appendix C-VII, Footnotes and References of Appendix C, Footnotes and References of Appendix C) with Escherichia coli may be performed with any Escherichia coli K-12 vector (e.g., conjugative plasmid). When a non-conjugative vector is used, the Escherichia coli K-12 host may contain conjugation-proficient plasmids either autonomous or integrated, or generalized transducing phages. For these exempt laboratory experiments, Biosafety Level (BL) 1 physical containment conditions are recommended. For large-scale fermentation experiments, the appropriate physical containment conditions need be no greater than those for the host organism unmodified by recombinant DNA techniques; the Institutional Biosafety Committee can specify higher containment if deemed necessary."

C. Bacterial Cell Culture Volumes.

The experimental procedures described in this protocol involve bacterial cell culture of Exempt or potential RG1 E. coli K-12 strains and RG2 Salmonella typhimurium cells in volumes ranging from 1 ml to 5 liters. As noted in Appendix C-II-A Exceptions of the Guidelines, "The following categories are not exempt from the NIH Guidelines: ...(ii) large-scale experiments (e.g., more than 10 liters of culture)". Thus, the culture procedures in this protocol do not use volumes of cell culture that exceed more than 50% of the volumes considered to be in the category of "large-scale" by the NIH Guidelines.

Description of Recombinant DNA Experiments

1. Summary of project (1 or 2 paragraphs). Include purpose and manner in which recombinant DNA will be used in the project. Please define acronyms and abbreviations.

I am continuing a research project that involves the investigation of the export and folding pathways of bacterial flagella fibers, with the goal of displaying foreign enzymes and proteins as genetic fusions to flagellin proteins attached to the surface of bacteria, for possible use as sensors and in nanotechnology applications. This flagellin-oriented research also involves a collaboration with Dr. Subra Muralidharan, WMU Chemistry Dept., on the construction of novel nanomaterials using flagellin proteins as a novel nanomaterial fiber to bind or induce precipitation of other inorganic nanomaterials. Some gene fusion experiments involve the well-characterized green fluorescent protein originally cloned from Aequorea victoria, Pacific jellyfish, which will be inserted in various ways into Salmonella, E. coli and Aquifex pyrophilus flagellin proteins. Additional cloning experiments will involve the same techniques and approaches applied to the extracellular display of human carbonic anhydrase II on bacterial flagellin. Carbonic anhydrase II is a well characterized enzyme with easily detected catalytic activity. Another area of current research involves the engineering and optimization of a particular structural class of bacterial left-handed beta helical acyltransferase and carbonic anhydrase enzymes to perform novel catalytic functions, and the catalytic engineering of flagellins to a) mimic the carbonic anhydrase active site and catalytic function and b) contain an internal fusion of the human carbonic anhydrase enzyme. These areas of research will involve investigating the effect of various mutations, foreign gene insertions and gene rearrangements on function of the corresponding proteins. The catalytic activity of mutated enzymes, level of recombinant protein production in E. coli, extracellular protein export, and the flagella-mediated bacterial motility of E. coli and Salmonella typhimurium strains will be studied as a function of mutations in the corresponding genetic DNA.

Thus, recombinant DNA will be used as a template for the introduction of amino acid and polypeptide substitutions into the corresponding proteins. Genes encoding various proteins and enzymes will individually be subjected to random mutagenesis, site-directed mutagenesis, gene insertion and gene shuffling. Standard molecular biology techniques, such as polymerase chain reaction (PCR), will be used to introduce mutations in the DNA, while other cutting and ligation procedures will be performed with restriction enzymes and T4 ligase enzyme. Molecular biology and cell culture procedures are performed in the PI's laboratory in 4007 and 4009 Haenicke Hall. Other physiological experiments involve the use of a laser tweezer-optical microscope instrument to measure the motility, i.e. ability to swim, of E. coli and Salmonella bacteria. This instrument is located as part of a W.M. Keck Nanosciences facility in room [illegible] in the WMU Engineering Campus building.

An interrelated enzymology research project involves the investigation of catalytic function and screening small molecules for inhibition of mammalian, bacterial and archaeal carbonic anhydrases and several structurally related left-handed beta helical acyltransferase enzymes. Cloning experiments will involve structure-function studies of the relationships between catalytic rate, substrate recognition and binding by various types of inhibitors. Initial studies involve human carbonic anhydrase II, which is a well characterized enzyme with easily detected catalytic activity. Another part of this current research involves the engineering and optimization of alpha and gamma-class carbonic anhydrase enzymes to perform novel catalytic functions. These areas of research will involve investigating the effect of various mutations, foreign gene insertions and gene rearrangements on function of the corresponding proteins. The catalytic activity of wild type and mutated enzymes, inhibition by small molecules, and the level of recombinant protein production in E. coli will be studied as a function of mutations in the corresponding genetic DNA. Thus, recombinant DNA will be used as a template for the introduction of amino acid and polypeptide substitutions into the corresponding proteins. Genes encoding various carbonic anhydrase and related enzymes may individually be subjected to random mutagenesis, site-directed mutagenesis, gene insertion and gene shuffling. Standard molecular biology techniques, such as polymerase chain reaction (PCR), will be used to introduce mutations in the DNA, while other cutting and ligation procedures will be performed with restriction enzymes and T4 ligase enzyme.

2. Host strain(s) used, including genus, species, parent strains, and class of each agent.

Only prokaryotic bacterial microorganisms will be used in this research: E. coli and Salmonella enterica serovar Typhimurium LT2 (Salmonella typhimurium). Host strains to be used include: E. coli K-12 strains (exempt class) XL-1 Red, XL-1 Blue, DH5-alpha, BL21(DE3) Gold (Stratagene, La Jolla, CA); TOP10, GI724, GI828, BL21(DE3)star, BL21(DE3)pLysS, BL21(DE3)pLysE, BL21(DE3) AI (Invitrogen, Carlsbad, CA); NEB 5-alpha (New England Biolabs, Beverly, MA). Three Salmonella typhimurium strains (Class III-D, risk group 2, BL-2) will also be used: SJW1103 (wild-type for flagellar-mediated motility and chemotaxis); SJW134, which has the two genes for flagellin deleted (fliC and fljB) and is therefore non-motile unless foreign flagellin genes are added via plasmid transformation; and [illegible], which is used to modify E. coli-grown plasmids for compatibility in other Salmonella strains. These bacterial strains will be used for host-vector plasmid production and also for bacterial protein expression.

I certify that I have read and understood the Western Michigan University Policy for Recombinant DNA Biosafety including the description of my role and responsibilities as Principal Investigator on this project. I agree to abide by the Guidelines for Research Involving Recombinant DNA Molecules (NIH Guidelines) in conducting all work using recombinant DNA molecules.

Principal Investigator Signature    Date

I have read the description of this proposed research involving recombinant DNA molecules. I have determined that the facilities and procedures proposed in this project registration form are adequate for the safe conduct of this research and the safety of other personnel - faculty, staff, and visitors using the facilities within which this research is to be conducted.

Department Chair/Unit Head Signature    Date

Submit the original plus 7 copies to the research compliance

Recombinant DNA Biosafety Committee

Project Approval Certification

For rDNA Biosafety Committee Use Only

Project Title: Engineering and Display of Enzymes and Proteins on Bacterial Flagella Fibers

Principal Investigator: Brian Tripp

IBC Project Number: 06-BTa

Date Received by the rDNA Biosafety Committee: 11/15/06

[X] Reviewed by the rDNA Biosafety Committee

[X] Approved    [ ] Approval not required

Chair of rDNA Biosafety Committee Signature    Date: 12/13/06

Appendix B Letters and Permission

Copyright question 2 messages

Raghu Ram Malapaka, 7 Mar 2007 at 10:50 PM
To: [email protected]

Dear Dr. Enders, I have a question regarding the copyright permission for the article published in the Journal of Molecular Modeling, titled

"A theoretical model of Aquifex pyrophilus flagellin: implications for its thermostability" Volume 12, Number 4 / March, 2006 Pages 481-493 Authors V. RaghuRam Malapaka and Brian C. Tripp

I am the first author of the paper and currently I am writing my Doctoral Dissertation. I will need the copyright permission to use the figures and article as a whole in my Dissertation. If you could please guide me about how to obtain the copyright permission, I will be grateful. Thanking you, RaghuRam V. Malapaka.

Enders, Peter, Springer DE

Dear Mr. Malapaka, Springer hereby grants you the permission requested. We wish you much success with your PhD thesis and are looking forward to receiving more interesting papers from you. Yours sincerely,

Peter Enders Springer Senior Editor Chemistry

Tiergartenstrasse 17 | 69121 Heidelberg | Germany, tel +49 (0) 6221 487-8541, fax +49 (0) 6221 487-8139, [email protected], www.springer.com

Venkata Raghu Ram Malapaka

From: Duncan James [[email protected]] on behalf of Permission Requests UK [[email protected]]
Sent: Tuesday, March 13, 2007 6:41 AM
To: Venkata Raghu Ram Malapaka
Subject: Re: Copyright permission

Dear Raghu Malapaka

Thank you for your reply. Our grant of permission is below. Where indicated, the word "Copyright" may be replaced with the usual symbol if desired.

Permission is hereby granted for the use requested.

If any of the material in question appears within our work with credit to another source, authorisation from that source must be obtained.

This permission does not include the right to grant others permission to photocopy or otherwise reproduce this material except for versions made by non-profit organisations for use by the blind or handicapped persons.

We are waiving our fee on this occasion.

Proper credit must be given to our publication.

Credit must include the following components: Title of the Work, Author(s) and/or Editor(s) Name(s). Copyright year. Copyright [symbol] John Wiley & Sons Limited. Reproduced with permission.

Yours sincerely

Duncan James
Permissions Coordinator
John Wiley & Sons, Ltd.
1 Oldlands Way
Bognor Regis
West Sussex
PO22 9SA
UK
Tel: +44 (0) 1243 843356
Fax: +44 (0) 1243 770620
Email: [email protected]

Wiley Bicentennial: Knowledge for Generations 1807-2007

Venkata Raghu Ram Malapaka

Dear Duncan,

I have the article details below and the figure number.

Article title: "Bacterial Flagella". Copyright (c) 2001 John Wiley & Sons, Ltd

Authors: David Gene Morgan, Shahid Khan

Figure 3. Schematic of the bacterial flagellar base.

Thank you,

Raghu Malapaka.

----- Original Message -----

From: Permission Requests UK

Date: Friday, March 2, 2007 7:42 am

Subject: Re: Copyright permission

> Dear Raghu Malapaka
>
> Thank you for your enquiry.
>
> In order to process your request, we require details of the article containing the figure, and the figure number from within the article.
>
> Please include this message and the previous correspondence below when replying.
>
> Yours sincerely
>
> Duncan James
> Permissions Coordinator
> John Wiley & Sons, Ltd.
> 1 Oldlands Way
> Bognor Regis
> West Sussex
> PO22 9SA
> UK
> Tel: +44 (0) 1243 843356
> Fax: +44 (0) 1243 770620
> Email: [email protected]
>
> Wiley Bicentennial: Knowledge for Generations
> 1807-2007
>
> Dear Editor,
>
> I am a PhD student and am in the process of completing my dissertation. My work is primarily based on the eubacterial flagellum structure. I find the Figure 1 of the article in ELS titled "Bacterial Flagella" to be extremely relevant to my thesis. I would like to know if it is possible to get the copyright permission to use the particular figure in my thesis.
>
> Thank you,
>
> Raghu Malapaka

> The information contained in this e-mail and any subsequent correspondence is private and confidential and intended solely for the named recipient(s). If you are not a named recipient, you must not copy, distribute, or disseminate the information, open any attachment, or take any action in reliance on it. If you have received the e-mail in error, please notify the sender and delete the e-mail.
>
> Any views or opinions expressed in this e-mail are those of the individual sender, unless otherwise stated. Although this e-mail has been scanned for viruses you should rely on your own virus check,
>
> Registered office address: The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ.
>
> ...

The information contained in this e-mail and any subsequent correspondence is private and confidential and intended solely for the named recipient(s). If you are not a named recipient, you must not copy, distribute, or disseminate the information, open any attachment, or take any action in reliance on it. If you have received the e-mail in error, please notify the sender and delete the e-mail.

Any views or opinions expressed in this e-mail are those of the individual sender, unless otherwise stated. Although this e-mail has been scanned for viruses you should rely on your own virus check, as the sender accepts no liability for any damage arising out of any bug or virus infection.

John Wiley & Sons Limited is a private limited company registered in England with registered number 641132.

Registered office address: The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ.

Re: Copyright Permission [REF:78624918961] 1 message

supp

Dear Dr Malapaka, Thank you for your e-mail.

We wish to advise that you do not need to obtain a copyright permission if you will use the article or a part of the article in your dissertation.

As an author, you retain rights for a large number of author uses, including use by your employing institute or company. These rights are retained and permitted without the need to obtain specific permission from Elsevier, including the right to include the article in full or in part in a thesis or dissertation (provided that this is not to be published commercially). Please ensure that the reference number remains in the subject line if responding to this e-mail.

Please feel free to contact us with any additional questions or concerns at the e-mail address or telephone numbers shown below. Yours sincerely, Meats Msgante, Elsevier Customer Support. Helping you get published: http://www.elsevier.com/authors. Global telephone support is available 24/7: For The Americas: +1 888 834 7287 (toll-free for US & Canadian customers). For Asia & Pacific: +81 3 5561 5832. For Europe & rest of the world: +353 61 709 190

If you have any feedback on our service, please e-mail [email protected].

To ensure delivery of our e-mails to your inbox (not bulk or junk folders), please add [email protected] to your address book or safe senders list. Copyright 2005 Elsevier Limited. All rights reserved.

----- Original Message ----- From: [email protected] Sent: 07-Mar-2007 3:58:26 To: authorsupport Subject: Copyright Permission

A Deletion Variant Study of the Functional Role of the Salmonella Flagellin Hypervariable Domain Region in Motility, Journal of Molecular Biology, Volume 365, Issue 4, 26 January 2007, Pages 1102-1116, RaghuRam V. Malapaka, Leslie O. Adebayo and Brian C. Tripp

I am the first author of the paper and currently I am writing my Doctoral Dissertation. I will need the copyright permission to use the figures and article as a whole in my Dissertation. If you could please guide me about how to obtain the copyright permission, I will be grateful.

Thanking you,

Raghu Ram V. Malapaka.
