structural and Functional Analysis of

Muskelin

by

Soren Prag Hansen

A Thesis submitted to the University College London for the degree of PhD

September 2003

Department of Biochemistry and Molecular Biology

University College London

Gower Street

London

UK ProQuest Number: U643396

All rights reserved

INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted.

In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. uest.

ProQuest U643396

Published by ProQuest LLC(2016). Copyright of the Dissertation is held by the Author.

All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. Microform Edition © ProQuest LLC.

ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 Abstract

Muskelin is an intracellular, kelch-repeat that is functionally involved in cell spreading on thrombospondin-1. The aim of this thesis project was to

investigate the role of muskelin domains in the subcellular distribution and the

regulation of the protein using cellular, biochemical, and bioinformatic approaches.

Using bioinformatics to compare muskelin orthologues from Mus musculus,

Homo sapiens, Rattus norvegicus, Danio rerio, Drosophila meianogaster, and

Anopheles gambiae, highly conserved regions within muskelin were identified, with the combination of a discoidin domain, a LisH motif, a C-terminal to LisH

motif, and six kelch repeats. A further bioinformatic analysis of the kelch repeat in whole genomes from Homo sapiens, Drosophila meianogaster, and

Anopheles gambiae, demonstrated that muskelin has a unique molecular architecture amongst the large family of kelch repeat proteins.

EGFP-tagged forms of muskelin were prepared and used to study its cellular localisation and to explore possible regulatory mechanisms. EGFP-MK had a cytoplasmic distribution and also formed mobile, granular inclusion bodies in

25% of the transfected cells. Formation of the inclusion bodies was a specific property of EGFP-wild-type muskelin, because EGFP alone or EGFP-muskelin mutated at key residues within kelch repeat 4 (Y488AA/495A or G474S/G475S) or mutations in the discoidin domain (K132A/I133AA/134A) were equivalently expressed yet had a uniformly diffuse distribution. Because of the complex domain structure of muskelin, a rational view of domain boundaries has proven essential for successful expression of truncated forms. Truncated forms composed of the discoidin or the kelch repeats alone did not form inclusion bodies. In biochemical pull-down assays, muskelin was found to self-associate

3 by a mechanism dependent on both the discoidin domain and the fourth kelch

repeat. Mutations in kelch repeat 4 altered the sensitivity to limited proteolysis

by proteinase K, suggesting that the protein conformation is altered. To begin to explore physiological factors that regulate the formation of inclusion bodies, the possible roles of five putative protein kinase C (PKC) phosphorylation sites that are conserved between muskelin species orthologues were evaluated in the cellular and biochemical assays. Activation of PKC decreased the formation

of the inclusion bodies and resulted in phosphorylation of muskelin. Using a

site-directed mutagenesis approach, two PKC phosphorylation sites critical for this activity were identified. Both of these sites are predicted to lie on the same face of the p-propeller structure composed by the six kelch-repeats. Together, these studies have developed new evidence that muskelin is a multidomain

protein with multiple protein-protein interaction sites. Acknowledgements

Firstly, I would to thank the Danish Research Academy, The Wellcome Trust, and Cleveland Clinic Foundation for the financial help during this project. Also, I would to thank Dr. Ana Giannini, Nina Kureishy, Dr. N. Anilkumar, Alexia

Zaromytidou, and Mark Shipman and people at the LMCB, for materials and discussions and tutoring during my stay in London. From Cleveland, I would like to thank Dr. Michael K inter and Dr. Thomas Weimbs and staff at the Lerner

Research Institute, for all your help and for letting me use your equipment. Also,

I would to thank Ray Monk for all your help and your friendly support during the year we stayed together in Cleveland. Also, I would like to thank Dr. Tony

Stadler and Dr. Rune Hartmann for your scientific advice and friendly support in

Cleveland. In addition, I very much would like to say thanks to Olga Morals for an incredible patience with me while we were apart and over the years.

Furthermore, I would to thank my family for all your support over the years. And thanks to Dr. Tony Ng and the rest of his group for all your support on my return to London.

Finally, a very big thanks to Dr. Josephine Adams for all the support you have given me over the years. Thanks for your patience and extraordinary focus and drive to keep this project going and everything you have taught me during this project. Table of Contents

Abstract ...... 3

Acknowledgements ...... 5

Table of Contents ...... 6

List of Figures ...... 12

List of Tables ...... 15

Abbreviations ...... 16

Introduction...... 18

Extracellular Matrix Proteins ...... 18

Collagen and Laminin ...... 19

Fibronectin ...... 20

Cell-Substratum Contacts ...... 21

Integrins ...... 21

Focal Adhesions ...... 24

Development of Cell Contractility ...... 27

Cell-Basement Membrane Adhesion ...... 28

Cellular Functions of ECM proteins ...... 29

ECM-Mediated DNA Transcription ...... 31

Regulation of Cell Migration ...... 32

Thrombospondins ...... 37

Molecular Structures of Thrombospondins ...... 37

Localisation of Thrombospondins ...... 39

TSP-1 and TSP-2 Knock-out Mice ...... 41

Properties of Thrombospondins ...... 42

Molecular Interactions of TSP-1 ...... 43

6 Interaction with ECM proteins ...... 43

TGFp ...... 45

MMP-2...... 46

Cell-Surface Receptors ...... 46

Cellular Signalling mediated by TSP-1 ...... 47

Formation of Lamellipodia and Microspikes ...... 48

Vascular Smooth Muscle Cell Proliferation and Migration ...... 49

Disassembly of Focal Adhesions ...... 50

Kelch-repeat proteins ...... 52

Actin-Binding Kelch-Repeat proteins ...... 55

Kelch in Ring Canals ...... 55

Spermatogenesis ...... 58

Kelch-Repeat proteins in Neurons ...... 60

Other Actin-Binding Kelch-Repeat proteins ...... 61

Kelch-Repeat proteins and Microtubles...... 62

Kelch proteins in Transcriptional Regulation ...... 64

Rag-2...... 67

NS1-BP...... 68

Kelch Proteins and the Cell Cycle ...... 69

Extracellular Kelch-repeat proteins ...... 70

Self-association of Kelch-Repeat Domain and Secondary Domains ...... 71

Identification of Muskelin as a Kelch-Repeat Protein ...... 73

Project Objectives ...... 76

Chapter 2: Materials and Methods...... 77

BLAST Database Searches and Identification of Domain Architectures....77

Calculation of Ka/Ks Ratios...... 81

7 Prediction o f Secondary Structure, p-Propeller Closure Mechanisms, and

Putative Phosphorylation Sites ...... 82

Multiple Sequence Alignments ...... 83

Three-Dimensional Reconstruction ...... 84

Cell Culture ...... 85

Preparation of muskelin expression constructs ...... 85

Cloning of EGFP-Muskelin ...... 85

Ligation and Transformation into Bacteria: ...... 86

Cloning of MK-EGFP ...... 87

Cloning of V5-MK ...... 88

Cloning of EGFP-Muskelin Discoidin Domain ...... 88

Cloning of EGFP-MKA85 ...... 88

Cloning of EGFP-MK Point Mutations Constructs ...... 89

Site-Directed Mutagenesis of Putative PKC Sites in Muskelin ...... 89

Cloning and Purification of GST-Discoidin Domain ...... 91

Transient Transfection of Mammalian Cells ...... 92

Immunfluorescent Stainings ...... 93

Dil staining, TPA Treatment, and Proteasome Inhibition ...... 94

Real-Time Epifluorescence Imaging ...... 95

Pull Down Experiments and Immunoblotting ...... 95

2-Dimensional Isoelectric Focusing and SDS-PolyAcrylamide Gel

Electrophoresis ...... 97

Proteinase K Proteolysis Assay ...... 98

Chapter 3: Bioinformatic Analysis of Muskelin...... 99

Introduction ...... 99

Bioinformatics Applied to Single Proteins ...... 99

8 Comparative Bioinformatics on Whole Genomes ...... 101

Synonymous and Nonsynonymous Substitution Rates ...... 101

Rationale of My Studies ...... 103

Results ...... 104

Sequence Alignment of Muskelin Orthologues ...... 104

Conservation of Domains Calculated by Ka/Ks Ratios...... 111

Conclusion from Multiple Sequence Alignments of Muskelin ...... 116

Analysis of Muskelin Domain Architecture ...... 116

Discoidin Domain ...... 117

LisH and C-terminal to LisH ...... 122

The Kelch-Repeats ...... 123

Phosphorylation Sites in Muskelin ...... 127

Discussion ...... 133

Discoidin Domain ...... 133

Kelch-Repeat Domain ...... 134

Chapter 4: Bioinformatic Analysis of the Kelch-Repeat Protein Superfamiiy 136

Introduction ...... 136

Results ...... 137

Identification of Kelch-Repeat Proteins in the ...... 137

Chromosomal Localisation of Human Kelch-Repeat Proteins ...... 141

Domain Architecture of Human Kelch-Repeat Proteins ...... 141

Structural Relationships of Human BTB/Kelch Proteins ...... 144

The Kelch-Repeat Protein Family in Invertebrates ...... 150

Kelch-repeat Proteins encoded in yeast genomes ...... 154

Analysis of Kelch-Repeat proteins in Lower Organisms ...... 155

Discussion...... 159

9 BTB/Kelch-repeat proteins ...... 160

The discoidin domain/kelch-repeat proteins ...... 161

Chapter 5: The Role of Muskelin Domains in the Cellular Distribution and

Self-association of Muskelin ...... 164

Introduction ...... 164

Results ...... 165

Expression of Mouse Muskelin Fusion Proteins ...... 165

Cellular localisation of EGFP-muskelin ...... 169

Characterisation of muskelin particles ...... 171

The Role of Muskelin Domains in Its Cellular Distribution ...... 178

Muskelin self-association depends on the N-terminal domain and the kelch

repeats ...... 181

Loss of self-association capacity is associated with altered sensitivity to

proteolysis ...... 186

Discussion ...... 189

Self-association of muskelin ...... 189

Formation of particles ...... 192

Nuclear localisation of EGFP-MKA85 ...... 195

Chapter 6: Regulation of Muskelin Self-association by Protein Kinase C ...... 198

Introduction ...... 198

Results ...... 201

Identification of Muskelin as a Protein Kinase C-Regulated Protein 201

Muskelin is Not Post-T ranslationally Modified upon Endothelin-1

Stimulation ...... 203

Study of Muskelin Post-Translational Modifications with EGFP-Fusion

Proteins ...... 204

10 Formation of Muskelin Particles is Down-Regulated by Activation of Protein

Kinase C ...... 207

Initial Identification of Protein Kinase C Target Sites in Muskelin ...... 210

Muskelin Self-Association is Independent of the Phosphoregulable

Serine/Threonine Sites ...... 215

Discussion ...... 218

Identification of Muskelin as a Target for PKC ...... 218

Analysis of Muskelin by Two-Dimensional Electrophoresis ...... 219

PKC Activation and the Formation of Muskelin Particles ...... 221

General Discussion ...... 224

Reference list ...... 235

Supplementary Data:...... CD-ROM:

Movie A: EGFP-MK expressed in Cos-7

Movie B: EGFP-MKY488AA/495A expressed in Cos-7

Movie C: EGFP-MK expressed in Cos-7 plus TPA

Movie D: EGFP-MKS324A/T515A expressed in Cos-7

11 List of Figures

Introduction:

Figure 1 Molecular Organisation of Focal adhesions 25

Figure 2 Hemidesmosomes and Dystrophin-Glycoprotein ...... 30 Complex Figure 3 Mitogen-Activated Protein Kinase Signalling 33 Pathways. Figure 4 Molecular Structures of Thrombospondins 38

Figure 5 Molecular Interactions of TSP-1 44

Figure 6 The Kelch Repeat and the p-Propeller 54

Figure 7 Drosophila meianogaster Kelch Protein in Ring ...... 57 Canals

Chapter 3:

Figure 1. Multiple sequence alignment of Muskelin ...... 105 Orthologues Figure 2 ESTs of Muskelin Orthologues 112

Figure 3 Domain Distribution in Muskelin 118

Figure 4 Prediction of Secondary Structure in Muskelin ...... 120

Figure 5. Discoidin Domain of Muskelin 121

Figure 6. LisH motif in Muskelin 124

Figure 7 Kelch-repeats in Muskelin. 126

Figure 8 Prediction Phosphorylation Sites in Muskelin. 128

Figure 9 Predicted Tertiary Localisation of Conserved 130 Residues and Motifs in Muskelin

12 Chapter 4:

Figure 1 Relationships between BTB/kelch proteins in 145 Homo sapiens Figure 2 Multiple Sequence Alignments of Subgroup 1 and ...... 147 2 of human BTB/kelch proteins Figure 3 Consensus sequence of Subgroup 1 and 2 of ...... 149 human BTB/kelch proteins and Prediction of Tertiary Structure Figure 4 Phylogenetic Comparisons of Muskelin and ...... 163 Muskelin-like Proteins

Chapter 5:

Figure 1 Outline of Muskelin Constructs 166

Figure 2 Cellular Expression of Muskelin and Muskelin 168 Constructs Figure 3 Cellular Localisation of Muskelin Constructs ...... 170

Figure 4 Cellular Localisation of Muskelin Particles 172

Figure 5 Localisation of Muskelin and Subcellular Markers ...... 174

Figure 6 Dynamics of Muskelin Particles 177

Figure 7 Cellular Localisation of Muskelin Constructs ii ...... 179

Figure 8 Self-association of Muskelin ...... 183

Figure 9 Characterisation of Self-association of Muskelin ...... 185

Figure 10 Proteolytic Analysis of Muskelin and Muskelin ...... 187 Mutant Figure 11 Schematic Model of Muskelin Self-association ...... 191

13 Chapter 6:

Figure 1 Two-Dimensional Electrophoresis of Endogenous ...... 202 Muskelin Figure 2 Two-Dimensional Electrophoresis of Exogenous ...... 206 Muskelin Figure 3 Effects of PKC Activity of Muskelin Particles 209

Figure 4 Expression of Muskelin Putative Phosphorylation ...... 211 Site Mutants Figure 5 Cellular Localisation of Muskelin Putative 212 Phosphorylation Site Mutants Figure 6 Quantification of Particles in Muskelin Putative ...... 213 Phosphorylation Site Mutants Figure 7 Self-association of Muskelin Putative 217 Phosphorylation Site Mutants

General Discussion:

Figure 1 Overview of Structural and Functional Capabilities ...... 227 of Muskelin Figure 2 Schematic Model of Self-Association of Muskelin ...... 229

14 List of Tables

Introduction:

Table 1 Localisation of Thrombospondins ...... 40

Table 2 Kelch-repeat Proteins ...... 53

Materials and Methods:

Table 1 List of Expressed Sequence Tags of Muskelin Orthologues ...... 78

Table 2 Oligonucleotides for Site-Directed Mutagenesis of ...... 90 Muskelin Table 3 List of Antibodies ...... 94

Chapter 3:

Table 1 Amino Acid Identity between Muskelin Orthologues. 106

Table 2 Ka/Ks Ratios of Muskelin Orthologues 114

Chapter 4:

Table 1 The Kelch-Repeat Superfamiiy in Homo sapiens 139

Table 2 The Kelch-Repeat Superfamiiy in D. meianogaster an6 ...... 151 A. gambiae Table 3 The Kelch-Repeat Superfamiiy in C.eiegans 153

Table 4 The Kelch-Repeat Superfamiiy in S. pombe and S ...... 156 cerevisiae

15 Abbreviations:

BIM Bisindolylmaleimide I

BLASTP Basic Local Alignment Search Tool for Proteins

BTB/POZ Broad-Complex, Tramtrack, and Bric-a-Brac/Poxvirus and

Zincfinger

CCD Conserved Domain Database

CTLH C-terminal to LisH-motif

DAG Diacylglycerol

ECM Extracellular matrix

EGFP Enhanced green fluorescent protein

ENC-1 Ectoderm and Neural Cortex-1

ERK Extracellular Regulated Kinase

FAK Focal adhesion kinase

FRNK FAK-related nonkinase

GST Glutathione-S-transferase

HCF Host Cell Factor lAP Integrin-Associated Protein

Keapi Kelch protein ECH-Associated protein 1

LisH Lissencephaly-1 homology motif

LRP Low-Density Receptor-Related Protein

MAPK Mitogen-activated protein kinase

MIND Muskelin Identity in N-terminal Domain

MK Muskelin

MKDD Muskelin discoidin domain

16 MLC Myosin light chain

MMP Matrix Metalloproteinase

NES Nuclear export sequence

ORF Open-reading frame

Pfam Protein Families database of Alignments

models

PKC Protein kinase 0

PLO Phospholipase 0

PIP Phosphatidylinositol 4-Phosphate

PIP2 Phosphatidylinositol 4,5-bisphosphate

Rag Recombination Activating

SDS-PAGE SDS-PolyAcrylamid Gel Electrophoresis

Spe-26 Spermatogenesis defective-26 protein

SMART Simple Modular Architecture Research Tool

TGF-p Transforming Growth Factor-p

TPA 12-0-T etradecanoylphorbol 13-acetate

TSP Thrombospondin

TSR Thrombospondin type 1 repeats

VSMC Vascular Smooth Muscle Cell

17 Introduction

Multicellular organisms require mechanisms for cell adhesion in order to coordinate tissue organisation and the intercellular communication. In this chapter, I summarise the current understanding of extracellular communication with regards to cell-matrix adhesion, and the cellular responses connected to these processes. The focus of my project has been on muskelin, a member of the kelch-repeat superfamiiy of cytoskeletal and cell signalling proteins and I summarise the current knowledge of the molecular basis and biological functions of this growing family in relation to inter- and intracellular responses.

Extracellular Matrix Proteins

The ECM is a highly organised three-dimensional network of heterogeneous macroproteins that provides physical strength and the environmental cues for cellular and tissue organisation in multicellular organisms. For instance, the mineralisation of the cartilage ECM proteins, collagen XI and collagen II, in the bone, provide a mechanical support, whereas the ECM proteins, collagen IV, laminins, perlecan, and fibronectin are involved in assembling basement membranes in order to support polarisation of epithelial cells and contribute to compartmentation and selective barriers between cells and tissue layers.

The complex meshwork of ECM fibers consists of four main categories of components, namely collagens, glycoproteins (e.g. fibronectin, tenascin, thrombospondin, and laminin), proteoglycans (e.g. perlecan, agrin, and syndecan) and glycosaminoglycans (e.g. chondroitin sulphate, hyaluronate, and heparan sulphate). These molecules interact with each other and link to cell

18 surface receptors by multiple binding regions. Besides the multiple binding domains, most ECM protein are assembled as multimers, which increases the

affinity or avidity of their binding interactions.

Collagen and Laminin

Collagen and laminin are both composed of three polypeptides. Collagen is the

most abundantly expressed component of the ECM proteins and consists of at

least 21 different members (for recent review see Boot-Handford and Tuckwell,

2003). The three polypeptides are cotranslationaly assembled together either as

homotrimers or as a mixture of two or three distinct a-chains. On the structural

basis, collagens are categorised according to their morphological behaviour in the ECM, namely as fibril-forming collagens (e.g. types I, II, III, V and IX), or the

microfilamentous (e.g. type VI), or as network-forming collagens (e.g. type IV,

VIII, and X). The fibril and filamentous forming collagens assemble into long tight elastic bundles mainly located in bones and surrounding muscles, whereas the network-forming collagens assemble into large sheets, which are one of the

main components in the basement membrane.

Laminin is assembled from three subunits (a, p, and y) and so far 12 different laminin heterotrimers, assembled from five a-chains, three p-chains and three y- chains, have been identified in mammals and two different laminins have been described in Drosophila meianogaster and Caenorhabditis elegans (Mutter et al., 2000; Hynes and Zhao, 2000). The ECM proteins collagen IV, laminin, and perlecan together assemble into high-order structures that constitute the basement membrane and are expressed in the early stages of the development of the embryo (reviewed by Sannes and Wang, 1997). The basement

19 membrane is formed as a dense sheet, which supports the epithelium as well as physically creating a barrier, separating the outer surface of the epithelium from the underlying mesenchymal cells. Collagen IV, laminin, and perlecan are

highly conserved among vertebrates and invertebrates and therefore considered the fundamental components for the early basement membrane (for

reviews see Olsen, 1995; Hynes and Zhao, 2000).

Fibronectin

Fibronectin exists in solution as a d imer. I n contrast to collagen and Iaminin, fibronectin is expressed from only one , but alternative splicing of the

mRNA gives rise to more then 20 different variants of fibronectin (review see ffrench-Constant 0, 1995). Fibronectin consists of three types of domains,

namely fibronectin type I, II, and III. These domains are repeated throughout the

molecule and account for approximately 90% of the total sequence. Also, all three types of fibronectin repeats have been found in other molecules,

suggesting that fibronectin arose from exon shuffling during evolution (Patel et

al., 1987). Intriguingly, fibronectin appears to be missing in the C. elegans and

Drosophila meianogaster genomes, although fibronectin type III repeats has

been detected in other molecules, including the Drosophila meianogaster

homologue of the vertebrate neural cell adhesion receptor LI, named

neuroglian (Bieber et al., 1989; Hynes and Zhao, 2000). Because its wide

distribution in the ECM throughout life, fibronectin is one of the best-

characterised ECM proteins (for recent review see Pankov and Yamada, 2002).

Fibronectin can interact with heparan sulphate proteoglycans, collagens, fibrin,

and members of the integrin receptor family, as further described in the next

section. In addition to the high abundance of fibronectin in ECM, high

20 concentration is also found in the plasma (300 pg/ml) as a soluble, but inactive form. The transition of soluble, inactive form of fibronectin to matrix assembly is

a cell-dependent process. Initially, monomeric soluble fibronectin binds to the

cell surface via integrins and molecules of large apparent molecular weight

(LAMMs; review see Magnusson and Mosher, 1998). The cell-bound fibronectin

is then converted to multimers by formation of intermolecular disulfide bridges.

The fibronectin matrix assembly is also dependent on cell contractility and

cytoskeletal rearrangements regulated by the signalling pathways downstreams

of the integrins. The tension created by the cytoskeletal rearrangements and

cell contractility thus exposes the matrix assembling sites within the fibronectin

molecule.

Cell-Substratum Contacts

The initial mechanisms by which the cell adheres to ECM proteins are mainly

mediated by the integrin family of cell-surface receptors and also depends on transmembrane proteoglycans. Since a lot of the signalling pathways and

recruitment of scaffolding proteins promoted by ECM proteins are general to the

cellular responses to ECM proteins such as fibronectin, collagen, laminin, or thrombospondin, the main focus on cell-substratum contacts in the field has

been on fibronectin, and will therefore be described here in more detail.

Integrins

The integrin family of cell-surface heterodimeric receptors facilitate the link between the ECM and the intracellular actin cytoskeleton, and specific integrins are recruited dependent on the ECM protein. Extensive studies of fibronectin- integrin interactions have so identified at least 12 different integrin receptors

21 involved in fibronectin-mediated cell adhesion, including the classic fibronectin

receptor aSpi (for recent reviews see Schwartz, 2001 ; Moiseeva, 2001 ; Plow et

al., 2000). Fibronectin contains a cell-binding domain in the fibronectin type III

repeat 9 and 10. This domain mediates the interaction to a5p1, in which a

synergy site in repeat 9 interacts with a5 and an RGD sequence in repeat 10

interacts with aSpi dimer. Several integrins recognise the RGD sequence

including a4p1, aSpi, aVpi, aVpS, aVp5 and allbpS, but these receptors can

still discriminate among the RGD-containing ECM proteins. This suggests that

the structural context in which the RGD sequence is presented determines

whether an interaction can occur.

Recently, the crystal structure of the aVp3 extracellular part was resolved alone

(Xiong et al., 2001) and in the presence of the RGD sequence (Xiong et al.,

2002; for review see Giancotti, 2003, Humphries et al., 2003b; Liddington and

Ginsberg, 2002; Shimaoka et al, 2002). The overall structure revealed a

globular “head” composed of the aV and the p3 subunits, and two “leg”-like

extensions connecting the “head” to the cell-surface. But the most important

observation was the mapping of the binding sites for the ECM proteins,

especially the RGD sequence. The crystal structure revealed that a groove is formed between a p-propeller domain in aV and a von Willebrand factor A

domain in the p3 subunit. This binding pocket is located in close proximity to the

metal-ion-dependent adhesion site, MIDAS. It is speculated that the ECM

protein binds to the synergy site within the p-propeller and folds down into the

RGD-binding pocket (Humphries et al., 2003a).

22 The most intriguing observation was a bending in the “legs”. Some questioned this observation as an artefact due to the crystallisation, but there is evidence to suggest that the bending is a way of holding the integrin in an inactive conformation. Introducing cysteine residues in the a and the p-subunit, creates an artificial disulfide bond between the p-propeller of aV or allb and the “leg” of

p3, which holds the heterodimer in a bent conformation (Takagi et al., 2001).

This suggests that these two domains are in close enough proximity (5.4 A, as determined in the crystal structure) to assemble a disulfide bond, when expressed at the cell-surface, and thus able to adapt to the bent conformation.

Further evaluation of this construct showed that holding the integrin aVp3 or allbp3 in the bent conformation inhibited the interaction with the ligand fibrinogen (Takagi et al., 2001). In conjunction with these observations, electron

microscopy of purified proteins has shown that aVp3 is in a bent conformation

in the presence of the inactivating calcium ions, and an extended conformation

in the presence of manganese, which binds to the MIDAS and promotes a high-

affinity conformation. Thus the protein has an inherent flexibility (Takagi et al.,

2002).

A third interesting observation revealed from the crystal structure was the

reorganisation of the two “legs”. Epitope mapping of the “leg” of the p2 subunit combined with aL, revealed that in a closed low-affinity state, the two legs of the

integrin heterodimer is held tightly together, and upon activation by manganese or phorbol esters, the legs move apart exposing unique epitopes within the p-

“leg” (Lu et al., 2001). Furthermore, electron microscopy has showed that for

recombinant a5p1, this separation was approximately 14 nm (Tagaki et al.,

23 2001). This huge conformational change could be a mechanism for transmitting

the extracellular signal to the cytoplasm (outside-in), or opposite, transmitting an

intracellular signalling in order to activate the integrin (inside-out).

The crystal structure of the aVpS was lacking the transmembrane and the

cytoplasmic domains. But NMR-studies of the allbpS have shown that the two

cytoplasmic domains are associated when the complex is in a low-affinity state

(Vinogradova et al., 2002). This interaction between the two domains could be

disrupted in the presence of the intracellular protein talin, which activates

allbps. These data on short peptides corresponds to the previously described

“unclasping” of the “legs” in the whole molecule electron microscopy studies

and therefore strengthen the hypothesis of the separation of the “legs” as a functional mechanism for an inside-out activation of integrins (Vinogradova et

al., 2002).

Focal Adhesions

Cell contact with ECM proteins can lead to the formation of many different forms

of cell-matrix contacts (for review see Adams, 2001a). Of these, the focal

adhesion has been most widely studied in cell culture assays.

Upon integrin activation, the association between integrins and the actin

cytoskeleton promotes a reorganisation of integrin localisation and recruitment of the integrins to the focal contacts. Extensive studies have focused on the

hierarchy by which the scaffolding proteins and signalling molecule gets

recruited to the focal adhesion (Figure 1). The initial contact between the cell and fibronectin promotes clustering of the a5p1 integrins in a Rho-dependent manner (Figure lA ^ B ; Hotchin and Hall, 1995). Clustering of integrins by ECM

24 B

FN, Col, LN, VN FN, Col, LN, VN

Extracellular m m ? M W K

Intracellular PIP5K PKCPKC Talin FAK Talin S yn d ecan -4 FAK Vinculii â a -a c tin in P IP -2 Paxilin a-actin in

feTP; [CTP: Actin Stress Fiber R acl Rho

MLC I MLC R ho-kinase MHC

Filapodia Lamellapodia

Figure 1. Formation of focal adhesions. (A) Cells unexposed to extracellular ligands (i.e. cells in suspension) has uncoulpled a and p integrin subunits. (B) Exposure to extracellular matrix proteins such as fibronectin, FN, collagen. Col, laminin, LN, or vitronectin, VN, activates the integrins and recruits focal adhesion kinase, FAK, and talin to the focal contact. The multimerisation of FAK, activates the kinase function. Additionally, activation of the small GTPases, Cdc42, Rac and Rho, promotes the formation of filapodia, lamellapodia and stress fibers formation, respectively. The GTP-bound forms of Rac and Rho activates the phosphatidylinositol 4-phosphate (PIP) 5-kinase (PIP5K), which phosphorylates PIP and produces phosphatidylinositol 4,5-bisphosphate (PIP-2). PIP-2 binds directly to vinculin and promotes conformational changes, which allows binding of talin and actin. Paxilin and a-actinin gets recruited to the forming focal adhesion and stabilises the formation of actin stress fibers. The coreceptor syndecan-4 binds directly to ECM proteins and via a-actinin, links to the actin cytoskeleton. The localisation of syndecan-4 to focal adhesion recruits and activates protein kinase C (PKC). The GTP-bound form of Rho activates the myosin light chain kinase (MLCK), which phosphorylates the myosin light chain in the myosin complex. This promotes binding between myosin and actin filaments and the formation of contractile actomyosin stress fibers.

25 proteins (or antibodies), leads to a subsequently coclustering of the focal

adhesion kinase (FAK) associated with a5pi, and activation of tyrosine kinase

activity by the auto-phosphorylation at FAK-Y397 (Miyamoto et al., 1995a, b).

This leads to an increase in Cdc42 and Rac activity, which stimulate membrane

ruffling and formation of filopodia and lamellipodia, respectively (Del Pozo et al.,

2000, 2002; Nobes and Hall, 1995; Clark et al., 1995). Also, a rapid and transient decrease in Rho activity is observed during the initial cell spreading

(Ren et al., 2000). The inhibition of Rho activity is dependent on FAK and is

important for the initial formation of focal adhesion. The activation of Rac and the delayed activation of Rho are important for the translocation of

phosphatidylinositol 4-phosphate (PIP) 5-kinase (PIP5K) from a perinuclear distribution to the membrane (Chatah and Abrams, 2001), and a subsequently activation of the kinase, which catalyses the production of phosphatidylinositol

4,5-bisphosphate (PIP 2) from PIP. Vinculin can bind to PIP 2 and the binding site is located on a hinge region connecting the head, which folds up and interacts with the tail of vinculin (Johnson and Craig, 1995a,b). Upon binding to PIP 2 , conformation changes in vinculin dissociate the head to tail interaction and unmask binding sites for both talin and actin within vinculin, and a later recruitment of paxillin to the focal adhesion (Figure 1B->C; Gilmore and

Burridge, 1996). A recruitment of the actin-bundling protein, a-actinin to the mature focal adhesions promotes stabilising and maintaining of attachment between the actin filament and the focal adhesions (Pava I ko and Burridge,

1991). PIP2 is the substrate for phospholipase C (PLC), which catalyses the formation of inositol trisphosphates (lnsP3) and diacylglycerol (DAG). Inositol trisphosphate functions as a second messenger to control the release of internal calcium (reviewed by Taylor, 2002; Putney et al., 2001), whereas the

26 best characterised function of DAG is the recruitment and activation of protein

kinase Cs (PKC) at the plasma membrane. Also, PKCa is associated directly

with the pi integrin subunit, as detected in fluorescence resonance energy

transfer microscopy and coimmunoprecipitation (Ng et al., 1999). A further

description of the cellular functions of PKC will be addressed in the Introduction

for Chapter 6.

Development of Cell Contractility

Establishment of contractile actin stress fibers in smooth muscle cells involves

activation of the myosin light chain (MLC) by the Ca^'^-sensitive MLC-kinase

(MLCK). The phosphorylation of MLC increases the actin-dependent ATPase

activity of the myosin heavy chains, which drives the movement of the actin filament (Sellers and Kachar, 1990; for review see Tan et al., 1992). The GTP-

bound form of Rho activates the Rho-associated kinase, which, similar to

MLCK, phosphorylates MLC (Matsui et al., 1996).

A second target for the Rho-associated kinase is phosphorylation of the myosin-

binding subunit of the MLC phosphatase (Kawano et al., 1999). This

phosphorylation inactivates the phosphatase activity and thereby increases the concentration of phosphorylated MLC. So an activation of the Rho-kinase and a concurrent inactivation of the myosin phosphatase lead to rapid increase in

myosin 11 ATPase activity, and a subsequently actin association and filament contractility.

The integrin-ligand interactions are sufficient for cellular attachment and spreading, but additional signalling is needed for complete focal adhesion

27 formation. In the case of fibronectin, a concurrent addition of the heparin-

binding domain together with the cell-binding domain is required (Woods and

Couchman, 1994). The transmembrane heparan sulphate proteoglycan,

syndecan-4 binds to the heparin-binding domain and acts as a coreceptor for

integrins (Figure 1C; review see Yoneda and Couchman, 2003). The ubiquitous

expression of syndecan-4 has been shown to promote cellular adhesion

mediated by pi and p3 containing integrins on ECM substrates of fibronectin,

laminin, vitronectin and collagen I. The location of syndecan-4 to focal

adhesions is associated to a direct binding to a-actinin (Greene et al., 2003),

and this interaction links syndecan-4 indirectly to the actin cytoskeleton.

Furthermore, a subsequently multimerisation of syndecan-4 in focal adhesions

recruits and activates PKCa (Oh et al., 1997, 1998; Horowitz and Simons,

1998; Lim et al., 2003)

Cell-Basement Membrane Adhesion

The cellular response between epithelial, skeletal or vascular smooth muscle

cells and laminin in the basement membrane promotes the formation of two

specialised receptor complexes, namely the hemidesosomes and the

dystrophin-glycoprotein complex (review see Moiseeva, 2001; Borradori and

Sonnenberg 1999, 1996). Hemidesmosomes are the major adhesion contact

between the epithelium and the underlying basement membrane, especially

laminin-5. The main adhesion receptor in the hemodesmosomes are the a6p4

integrin, and in contrast to focal adhesions and the dystrophin-glycoprotein

complexes, the hemidesmosomes links to intermediate filament. This interaction

is mediated intracellular by plectin and the bullous pemphigoid antigen 230

(BP230) linked to the transmembrane coreceptor bullous pemphigoid antigen-

28 180 (BP180) (Figure 2A; review see Borradori and Sonnenberg 1999, 1996). It

is noteworthy to mention that hemidesmosomes are considered stable cellular contacts, which enable cells to resist deformation from mechanical stress and counteract cellular migration.

The dystrophin-glycoprotein complex forms between the basement membrane

protein and vascular or skeletal muscle cells and consists of the extracellular

proteoglycan, a-dystroglycan, which links to the integral membrane

proteoglycan, p-dystroglycan, and the p-, 5-, s-sarcoglycans and sarcospans.

Intracellular, p-dystroglycan binds to the actin-binding protein, dystrophin,

linking the ECM to the actin cytoskeleton (Ervasti and Campbel, 1993; review

see Moiseeva, 2001; Henry and Campbell, 1996). a lp l and a2pi also acts as

laminin receptors in conjunction to the dystrophin-glycoprotein complex (Figure

2B). The dystroglycan-null mice show defects in basement membrane organisation during early development, indicating a major role of the dystrophin-glycoprotein complex in the assembly of basement membranes in

non-muscle cells (Williamson et al., 1997).

Cellular Functions of ECM proteins

The cellular response to extracellular matrix proteins includes proliferation,

differentiation, cell migration, and inflammation, and is determined by several factors including the composition of the ECM, the cell surface receptors

expressed by the cell, and the expression of intracellular signalling molecules

(review see Adams, 2001a; Giancotti and Ruoslahti, 1999; Schwartz et al.,

1995; Adams and Watt, 1993; Hynes RO, 2002; 1992). As described in the

previous section, the initial contact between the cell and the ECM engage

29 Hemidesmosome P Dystrophin-glycoprotein complex

Lam inin-5

a-dystroglycan f\ Extrace u ar

ÏÏÏÏ ÏÏ ÏÏ ÏÏ ÏÏ ÏÏ ÏÏÏÏ ÏÏ ÏÏÏÏ ÏÏ ÏÏ ÏÏ IT

Intracellular I (^^-dystroglycany

Plectrin I BP230 1 ^

Intermediate filaments

Figure 2. Schematic overview of the molecular architecture of hemidesosomes and the dystrophin- glycoprotein complex. A ) The cellular contact between epithelium and basement membrane proteins such as laminin-5, induce the formation of hemidesosomes. The main integrin involved is the a6(34, via the scaffolding protein plectrin, is linked to the intermediate filaments. B) The cellular contact between muscle cells and basement membrane proteins such as laminin, is mediated by the extracellular protein a- dystroglycan, which binds to the transmembrane receptor p-dystroglycan and the coreceptors p-, 5-, e- sarcoglycans and sarcospans. Intracellular, b-dystroglycan links to the actin cyoskeleton via the scaffolding protein, dystrophin. The integrins alpl and a2pi participate as co-receptors to the dystrophin-glycoprotein complex.

30 various receptors including integrins and heparan sulphate proteoglycans at the

cell surface, these receptors then provides an intracellular scaffold for further

engagement of intracellular signalling molecules transmitting the extracellular

information provided by the ECM to the intracellular milieu.

Multiple intracellular signalling molecules are stimulated downstream of integrin-

mediated cell adhesion. Besides the previous described scaffolding proteins,

several signalling molecules are targeted to the focal adhesions include

members of the mitogen-activated protein kinase (MARK), non-receptor tyrosine

kinases (i.e. FAK) and members of the Src family of tyrosine kinases, and the

phosphatidyl-inositol 3-kinase (PI-3 kinase) and PKC (for review see Burridge

and Chrzanowska-Wodnicka, 1996; Schwartz et al., 1995). Here I focus on two

phenotypes, DMA transcription and cell migration, where the ECM-dependent signalling pathways have been well-studied.

ECM~Mediated DNA Transcription

In mammals, the MAPKs consist of four related groups, namely the extracellular signal-related kinases (ERK)-1/2, the Jun N-terminal kinases/stress-activated

protein kinases (JNK1/2/3), the p38 proteins (p38a/p/y/ô) and ERK5 (review see

Pearson et al., 2001). As shown in Figure 3, each of these MAPKs is activated

by specific MARK kinases (MAPKK, or MEK), which in turn are activated by one or more MAPKK kinases (MAPKKK or MEKK). That the specific MEKs can be activated by one or more MEKKs increases the complexity and diversity of

MARK signalling pathways. One of the main functions of MARK signalling pathways is the regulation of in response to ECM stimuli

(review see Treisman, 1995). Of all the known MARK signalling transduction pathways, the best-characterised cascade is the Ras-^Raf-

31 1(MEKK)^MEK1/2->ERK1/2. An elevated ERK1/2 activity leads to a

phosphorylation and altered activity of a subclass of transcription factors, which

includes the AP-1 family (activated protein-1), the TCFs (ternary complex

factors) and the STATs (signal transducers and activators of transcription). The

AP-1 family of leucine zipper proteins consist of jun, fos, and ATF-2 (activating

transcription factor), which forms homo- and heterodimers (review see Karin et

al., 1997; Treisman, 1995). The TCFs mediate transcription from serum

response elements (SREs), including the AP-1 transcription factor, fos. In

contrast to the AP-1 transcription factors and the TCP, inactive STATs are

located in the cytoplasm. Following tyrosine phosphorylation by ERK1/2, the

STATs dimerise, and translocate to the nucleus. Inside the nucleus, STATs bind

to various promoter regions, which contains the STAT-response element

(review see Horvath and Darnell, 1 997). I n general, activation o f the ERK1/2

has been linked to cell survival and proliferation, whereas activation of JNK and

p38 is linked to induction of apoptosis (Xia et al., 1995). In contrast, the

activation of JNK2 leads to an increased phosphorylation of the transcription factors, c-Jun and Nuclear Factor of Activated T-cell involved in apoptosis

(Behrens et al., 2001). Altogether, the MAPK directly induces transcription factors, which leads to either cell proliferation or apoptosis, dependent on the

MEK activation.

Regulation of Cell Migration

The regulation of cellular migration is a complex process, which includes intracellular-mediated assembly and disassembly of cell protrusions and focal adhesions for regulation of cell-substratum adhesive strength, and polarised rearrangement of actin cytoskeleton for a unidirectional translocation of the cell

32 ECM

Extracellular

Intracellular Adhesion Ras complex

Raf K l-4 Ask MEKKSMEK I / I \ I I MEKl/2 MEK3/6 MEK4/7 MEK5 I I 1 1 ERKl/2 p38a/p/y/0 JNKl/2/3 ERK5

STAT

Nucleus

STAT TCP AP-1 NF-AT c-Jun

F ig u re 3. Schematic overview of MAP-kinase regulated DNA transcription induced by ECM proteins. Adhesion complexes formed upon cell-substratum contact induces intracellular MAP-kinase signalling pathways dependent on the integrin-ECM protein engaged. The activation of Ras induces the kinase activity of the MEKK, Raf. Activate Raf phosphorylates and activates the MAP-kinase kinase, MEKl or 2, which then activates the MAP-kinase. Activated MAP-kinase promotes the formation of the transcription factor complexes AP-1, IGF and STATs, which induces DNA transcription from specific promotes. The activated JNKl/2/3 leads to an activation of the DNA transcription factors c-Jun and the Nuclear Factor of Activated T cell (NF-AT)

33 body. The adhesive strength by which cells bind to the ECM regulates cell

migration. On weakly adhesive substrates the interaction cannot provide the

traction for the translocation, whereas on strongly adhesive substrates, the cell

is immobilized (Palecek et al., 1997; Dimilla et al., 1993). The avidity and affinity

of the integrins is one site of regulating cellular adhesive strength. Cells

exposed to different concentrations of either fibronectin or collagen IV, exhibit a

maximum cellular velocity at intermediate adhesive strength (DiMilla et al.,

1993). At the same time, changes in expression levels of the aVp3 integrin also

affect cell migration, in which a maximum velocity is achieved at intermediate

expression levels (Palecek et al., 1997).

A second site for the regulation of cellular adhesive strength, is the assembly

and disassembly of focal adhesions. One of the pivotal molecules in regulating

focal adhesions is the focal adhesion kinase, FAK. FAK-deficient cells shows an

increase in number of focal adhesion, which suggests that FAK regulates the

cycle of assembly and disassembly of focal contacts, rather than being required

specifically for their assembly (Ilic et al., 1995). FAK plays a role in regulating

signals to the small GTPases Rac and Cdc42, as well as PI-3 kinase. These

molecules are closely involved in lamellipodia and filapodia formation, and

cortical actin filament assembly. All three processes are essential for cellular

migration (Clark et al., 1998; Nobes and Hall, 1992). The activity of FAK is

modulated by expression of an endogenous FAK inhibitor named FRNK (FAK-

related nonkinase), which is expressed as an independent protein and consists of the carboxyl-terminal noncatalytic domain of FAK, and therefore lacks the

kinase activity associated with the N-terminal region of FAK (Richardsons and

Parsons, 1996). FRNK can compete with FAK in focal adhesion localisation and

by this, decreases FAK activation and tyrosine phosphorylation in focal

34 adhesions (Richardsons and Parsons, 1996; Taylor at a!., 2001). Where FAK is ubiquitously expressed in mammalian cells, the expression of FRNK is selectively restricted to smooth muscle cells (Taylor et al., 2001).

Overexpression of FRNK in rat aortic smooth muscle cells or adenocarcinoma cells inhibited PDGF- or EGF-induced cellular migration in Boyden chamber, respectively (Hauck et al., 2001; Taylor et al., 2001). Whereas FRNK expression in Src-transformed NIH 3T3 fibroblasts has no effect on their motile behaviour, but showed decrease expression and secretion of matrix

metalloproteinase-2 (Hauck et al., 2002), which could also be a plausible explanation for the previous reported decrease in cellular migration observed in

Boyden chamber.

Calpain is a calcium-dependent protease localised to the focal adhesions

(Beckerle et al., 1987; Du et al., 1995). An intracellular increase in calcium

concentration activates calpain, which then partially cleaves focal adhesion

components including integrin subunit p3 (Du et al., 1995), FAK

(Schoenwaelder et al., 1997; Carragher et al., 1999), and talin (Inomata et al.,

1996). This calpain-induced disassembling of focal adhesion is an alternative

mechanism for regulation of cell migration (Huttenlocher et al., 1997).

Furthermore, the small GTPases are also involved in focal adhesion formation, where Rac is essential for the formation of new adhesion sites at the cell front,

and Rho is required for the maturation of focal contacts to focal adhesions, as

previously described (Kiosses et al., 2001; Rottner et al., 1999; del Pozo et al.,

1999). Concurrently, the Rho-induced formation of contractile actomyosin filament assembles the motile capability of cells.

35 In conclusion, the ECM determines cellular behaviour by regulating gene expression, as well as motile behaviour. In addition, the selective expression of protein in different cell types adds to the complexity and diversity in the cellular responses to ECM. The ECM o f vertebrates contains many components that are not yet well-understood. Knowledge of the molecular interactions and biological regulation of signalling molecules will help for a further understanding of the underlying mechanisms for the cellular responses to ECM proteins.

36 Thrombospondins

The cellular responses and behaviour towards the ECM proteins, thrombospondins has been the central interest in Dr. Josephine Adams’ laboratory. The main focus has been to elucidate signalling transduction pathways towards functional regulation of the actin-bundling protein, fascin, and its role in actin reorganisation, cell protrusions and cell migration. These investigations lead to the discovery of muskelin. Therefore, the structures, cellular responses to and functional role of thrombospondins will be described in more detail.

Molecular Structures of Thrombospondins

Insects contain a single thrombospondin, whereas in vertebrates the thrombospondins are a small family of five separate gene products (Adams et al., 2003). As shown in Figure 4, the five vertebrate thrombospondins are divided into two subgroups according to the overall molecular properties and architecture. TSP-1 and TSP-2, composing subgroup A, are trimeric proteins, whereas subgroup B, which comprises TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5) forms pentamers.

Furthermore, subgroup A have a distinct domain architecture, a procollagen homology domain, and three thrombospondin type 1 repeats (TSR), whereas subgroup B contains a unique N-terminal domain and an additional thrombospondin type 2 repeat. So the overall molecular characteristics of the family of thrombospondins include the thrombospondin type 2 and 3 repeats, and the C-terminal domain.

37 TSP-1 __ -00(X>00-U- 1 1 |-CZ>':°°” Subgroup A TSP-2 __ hOOOOOOii 1 1 1 KZZ^COOH

TSP-3 ""2xH>^ 111 1 1 | - C I > “ OH

TSP-4 11 1 1 1 K >COOH Subgroup B

TSP-5/C0MP 1 1nil -CZ>“°«

Amino Type 2 > terminal < ) repeats domain o

Procollagen Type 3 homology mil repeats domain

Carboxy Type 1 term inal repeats 0 domain

Coiled-coil region Coiled-coil region A trimer capacity ☆ pentamer capacity

Figure 4. Schematic diagram of the molecular architecture of the five members of the vertebrate thrombospondin gene family. The five thrombospondins are divided into two subgroups, subgroup A (TSP-1 and TSP-2) and subgroup B (TSP-3, TSP-4 and COMP/TSP-5). Domains shared between the members are indicated by colours.

38 Localisation of Thrombospondins

Thrombospondins are components of the ECM conserved between insects and vertebrates (Adams et a!., 2003), and have mostly been studied in vertebrates.

In the central nervous system, a selective expression of the ECM proteins, thrombospondin-1, -2, -3, and -4, has been reported (Table 1; review see

Adams and Tucker, 2000; Adams 2001). Antibodies against TSP-1 immunostain specific sites for neurogenesis and process outgrowth, as well as peripheral nerves and central nerve tracts capable of nerve regeneration in diverse organism including mice, goldfish and Xenopus (Hoffman et al., 1994;

Hoffman and O’Shear, 1999).

In s/Yu-hybridisation of avian embryos indicated a specific expression of TSP-1 mRNA in the ependymal layers in the midbrain and the hindbrain that corresponds to regions of active neurogenesis (Tucker et al., 1997). In contrast to TSP-1 mRNA, a delayed expression of TSP-3 mRNA were detected in regions with post-mitotic neurons, indicating individual functions for TSP-1 and -

3 in neural differentiation and neurite outgrowth, respectively (Table 1; Tucker et al., 1997). In the myelinated regions of the central nervous system, TSP-1, but not TSP-2 or TSP-3, is produced by the astrocytes and promotes oligodendrocyte precursor migration (Scott-Drew and ffrench-Constant 1997).

TSP-1 expression has been detected in other tissues including the cartilage, bronchi, kidneys, and skeletal muscles (O’Shear and Dixie, 1988, Tooney et al.,

1998). When comparing the TSP-1 and TSP-2 expression in the developing mouse, an overlap of the two proteins in certain tissues or organs can be detected, though the time of expression of TSP-2 in mice has not been detected

39 Table 1. Major Expression Sites of Thrombospondins in vertebrates

Protein Site of expression Cellular Process Reference

TSP-1 Widespread, muscle, heart, Neurogenesis, process Tucker et al., 1997; O’Shear and lung, nervous system, outgrowth Dixit, 1988; Lawler et al., 1998 kidney TSP-2 Cartilage, bone, muscle Collagen fibriollogenesis, Melnick et al., 2000; Kyriakides et wound healing al., 1998, 1999. Yang et al., 2000 TSP-3 Post-mitotic neurons, lung, Process differentiation Tucker et al., 1997; Urry et al., bone 1998 TSP-4 Nervous system, muscle Neurite outgrowth Urry et al., 1998; Arber and Caroni, 1995. TSP-5 Cartilage Matrix interaction and Adams, 2001 assembly

40 before embryonic day 14 as compared to embryonic day 10 for TSP-1 (Tooney et al., 1998). Similar to the TSP-1 expression, the TSP-2 expression can be detected in the cartilage and the developing bone, but in contrast to TSP-1, the

expression of TSP-2 has not been detected in either lung or heart (Table 1;

Tooney et al., 1998).

During Drosophila melanogaster embryogenesis thrombospondin mRNA is detected in a repetitive pattern corresponding to the ectoderm and mesoderm

giving rise to the neural tissue and musculoskeleton, respectively (Adams et al.,

2003). In later stages of the Drosophila melanogaster embryogenesis thrombospondin is located in the apodeme cells corresponding to the middle

region supporting the legs and the wings, in the imaginai discs for both the wings and the legs, and a subset of muscle cells (Adams et al., 2003).

TSP-1 and TSP-2 Knock-out Mice

The biological roles of two TSPs, TSP-1 and TSP-2, have been extensively

studied from the development of knock-out mice (Lawler et al., 1998; Kyriakides

et al., 1998). In the TSP-1 deficient mice, the lungs develop normally at birth,

whereby after 4 weeks after birth, they develop an acute and chronic

pneumonia (Lawler et al., 1998). The pneumonia could partially be restored by

treatment with a transforming growth factor-p (TGF-pl) activating peptide

derived from TSP-1 (Crawford et al., 1998). This acute or chronic pneumonia in the lungs is not observed in the TSP-2 deficient mice (Kyriakides et al., 1998).

The TSP-2 deficient mice show an abnormal collagen fibrillogenesis causing

reduced tensile strength and fragile skin. Also, a decreased cellular attachment of skin fibroblasts induced by increased matrix metalloproteinase-2 activity has

41 been observed in TSP-2 deficient mice (Yang et al., 2000). A third aberrant

phenotype is an increased neovascularisation with accelerated wound healing

in TSP-2 deficient mice (Kyriakides et al., 1998; 1999). This phenotype in TSP-2

deficient mice is in contrast to an observed decrease in wound healing repair by

a delayed re-epithelialisation when TSP-1 expression is downregulated by

antisense depletion (DiPietro et al., 1996). Intriguingly, the double TSP-1/TSP-2

mice showed a decrease in wound healing responses, which is therefore dominated by the TSP-1 deficiency, but otherwise exhibited a phenotype

representing the sum of the abnormalities observed in the single-deficient mice

(Agah et al., 2002). Interestingly, the expression of TSP-1 in the TSP-2-deficient

mice, and of TSP-2 in the TSP-1 deficient mice, were indistinguishably from wild-type mice (Agah et al., 2002), These data indicate an interesting point that

neither TSP compensates for the absence of the other by an altered expression.

Properties of Thrombospondins

The thrombospondins binds to other matrix components, including fibronectin, collagens, and laminin (Holden et al., 2000; Rosenberg et al., 1998; Sipes et al.,

1993; for recent review see Adams, 2001b), but so far the only molecular

interaction in common throughout the family, is the ability to bind a large

number of calcium ions. These interactions have been assigned to the seven thrombospondin type 3 repeats, which are conserved between all the thrombospondins (see Figure 4). Biochemical analysis using recombinant thrombospondin type 3 repeats containing polypeptides, showed that these repeats alone bind 14-20 (Misenheimer et al., 2003). The thrombospondin type 3 repeats in conjunction with either the C-terminal domain or the last

42 thrombospondin type 2 repeat shows only slight increase in calcium binding, whereas the thrombospondin type 3 repeats together with the thrombospondin type 2 repeat and the C-terminal domain binds 22-27 Ca^‘" (Misenheimer et al.,

2003). Single nucleotide polymorphism giving raise to a single amino acid

substitution in the first thrombospondin type 2 repeat is associated with familial

premature coronary heart disease (Topol et al., 2001). Interestingly, this

missense variant also lead to a decrease in the concentration of thrombospondin-1 in the plasma. Biochemical analysis of this mutation

identified local conformational changes and changes in Ca^^ sensitivity (Hannah

et al., 2003). Together, these data suggest that the binding of calcium induces

conformational changes affecting the physical properties. These conformational

changes lead to changes in cell-attachment activity and sensitivity of TSPs to

proteolysis (Lawler et al., 1988; Slane et al., 1988; Dardik and Lahav, 1999).

TSP-1 has been the most intensively studied TSP and the major TSP studied in

Adams’ laboratory. Therefore I focus on the properties of TSP-1

Molecular Interactions of TSP-1

TSP-1 interacts with ECM proteins, fibronectin, cytokines, proteases, growth factors, and a large number of cell-surface receptors (Figure 5).

Interaction with ECM proteins

The interaction of TSP-1 with the extracellular matrix proteins fibronectin,

collagens, and laminin, changes the cellular attachment and spreading activity towards the matrix proteins (Mumby et al., 1984; Sipes et al., 1993; Canfield and Shour, 1995; Aho and Ditto, 1998). For instance, addition of a small

43 Oligomerisation

A m in o Procollagen C a rb o x y te rm in a l homology Type 1 Type 2 T y p e 3 te rm in a l d o m a in domain repeats repeats re p e a ts d o m a in ]""SSS~^^CX -COOH / \ / \ / / \ Molecular Syndecan-l «3P1 C D 3 6 TGF-p-LAP Calcium «v Pb ' IAP/CD47 Interactions: Fibronectin allbps Collagen V Laminin

Cellular Cell attachment Anti-angiogensis Activation of TGF-p Cell attachment response: Cell migration Induction of apoptosis Cell migration Regulation of Regulation of proliferation proliferation

Figure 5. Schematic overview of molecular interaction of Thrombospondin-1 and cellular responses. TSP-1 consists of multiple domains. The oligomerisation domain promotes trimer assembly of TSP-1 monomers. Well-characterised and mapped receptors and molecular interactions of TSP-1 are indicated beneath their associated domains shown for a single TSP-1 monomer. The many cellular responses to TSP-1 assigned to various receptors and molecular interactions are indicated. For further details see text.

44 hexapeptide corresponding to the fibronectin-binding site in thrombospondin type 1 repeat in TSP-1, inhibits fibronectin-mediated cell adhesion of human

breast carcinoma cells to collagen I (Sipes et al., 1993). Also, transition of

resting bovine endothelial cells to a sprouting phenotype when plated on a

complex matrix containing laminin, is inhibited by the addition of TSP-1

(Canfield and Shour, 1995). These properties have lead to the categorisation of

TSPs as “matricellular” proteins, whose functions as bridges or modulators of the interactions between matrix proteins and cell surface receptors (Bornstein,

1995, 2001). Furthermore, it suggests a functional role of soluble TSP-1, which will be described in detail later.

TGFp

Together with integrins and proteases, TSP-1 is one of main activators of transforming growth factor-p (TGF-p) in vivo (for review see Koli et al., 2001).

The second TSR of TSP-1 binds to the complex containing TGF-p and the

Latency-Associated Peptide (LAP), whereupon a KRFK motif in the same TSR

interacts with LAP. The interaction between TSP-1 and LAP possibly induces a conformational change in LAP, which subsequently releases and activates

TGF-p (Ribeiro et al., 1999). Furthermore, a continuous interaction between

TSP-1 and LAP prevents a reformation of the inactive TGF-p/LAP complex

(Ribeiro et al., 1999). TSP-2 lacks KRFK motif, but contains the initial “docking”

motif and therefore binds but unable to activate TGF-p (Schultz-Cherry et al.,

1995).

45 MMP-2

Both TSP-1 and TSP-2 binds matrix metalloproteinase (MMP)-2 and the related

MMP-9 (Bein and Simons, 2000; Yang et al., 2000). TSP-2 deficient mice show an increase in MMP-2 concentration causing adhesive defects of dermal fibroblasts due to reduced cellular endocytosis of the TSP-2/MMP-2 complex by the Low Density Lipoprotein-related Scavenger receptor (Yang et al., 2000,

2001). Correspondingly, an overexpression of TSP-2 in colon carcinoma cells

causes a downregulation of the expression of MMP-2 and MMP-9 as detected

by cDNA microarray (Kamochi et al., 2003). In contrast, the addition of TSP-1 to

endothelial or human gastric adenocarcinoma cells upregulates the production

and activation of MMP-9 (Qian et al., 1997; Albo et al., 2002), which has been

correlated with a more aggressive tumour phenotype with increased tumour

angiogenesis and cell invasion (Albo et al., 2002).

Cell-Surface Receptors

The thrombospondins support cellular attachment through interactions with a

variety of cell-surface receptors (review see Adams, 2001b). The N-terminal

domain of TSP-1 and -2 contains a BBXB motif, which interacts with the heparin

sulphate glycosaminoglycans modifications on proteoglycans (Sun et al., 1989;

Kaesberg et al., 1989). Multiple GAG binding sites has also been identified in

the TSR (Guo et al., 1992) and some of the specific proteoglycans has been

characterised as regulating cell attachment and spreading including the ECM

protein, decorin, and cell-surface receptor, syndecan-1 (Merle et al., 1997;

Adams et al., 2001). The glycoprotein CD36 binds to the second TSR of TSP-1

and TSP-2, where this interaction mediates the anti-angiogenic effects of TSP-1

on endothelial cells (Asch et al., 1987, 1992; Dawson et al., 1997). Various

46 types of integrins binds to different domains within TSP-1; the N-terminal domain of TSP-1 binds to activated aSpi (Krutzsch et al., 1999;

Chandrasekaran et al., 1999) and to a4p1 in T-cells (Li et al., 2002), the TSR contains a binding site for pi integrin (Yabkowitz et al., 1993), and the aVp3

receptor binds to the RGD-motif in the thrombospondin type 3-repeats (Lawler and Hynes, 1989). The integrin-associated protein, lAP or CD47, binds to the C- terminal domain of TSP-1 (Gao et al., 1996b). The interaction between lAP and

TSP-1 affects the binding affinity of various integrins, including the aVp3

receptors on human melanoma (Gao et al., 1996a), the a2p1 on smooth muscle

cells (Wang and Frazier, 1998), and the platelet allbpi receptors (Chung et al.,

1997). The lAP-mediated activation of aVp3 by TSP-1 is inhibited by pertussis toxin suggesting that the signalling pathway by which lAP promotes aVp3-

mediated adhesion is inside-out mediated by heterogeneous Gi-proteins (Gao

et al., 1996a). Together, the repertoire of various cell-surface receptors

expressed by various cell-types specifies the cellular response towards TSP-1

(Adams and Lawler, 1993, 1994).

Cellular Signalling mediated by TSP-1

Three major signalling transduction pathways have been characterised

regarding cellular responses to TSP-1. The best characterised morphological

response is the formation of lamellipodia and fascin-containing microspikes in

cell protrusions. A second well-characterised mechanism is the induction of

vascular smooth muscle migration and proliferation in response to soluble TSP-

1, and the third mechanism described, is the disassembly of focal adhesion

mediated by soluble TSP-1.

47 Formation of Lamellipodia and Microspikes

Cells adherent to TSP-1 form large arrays of fascin-containing actin microspikes

located in the cell margins and the lamellipodia (Adams, 1995, 1997, review see

Kureishy et al., 2002). Fascin binds directly to actin and is important for the

organisation and rearrangement of actin-based structures that function in cell

adhesion and migration (review see Kureishy et al., 2002). The inhibition of fascin-actin interaction by antibodies against the actin-binding domain of fascin

completely prevents cell spreading and migration on TSP-1, suggesting that

fascin is a primary mediator of cellular spreading and migration in response to

TSP-1 (Adams and Schwartz, 2000). Formation of fascin microspikes is

dependent on the engagement of the proteoglycan syndecan-1 (Adams et al.,

2001). Interestingly, the extracellular glycoaminoglycans (GAG) sidechains and the cytoplasmic part of syndecan-1 have distinct roles in promoting cell

spreading and fascin microspike organisation in response to TSP-1. Mutations

of the two GAG-sites, serine45 and serine47, reduce cell spreading when plated

on TSP-1, but increased the number of fascin microspikes per cell and the

length of the individual microspikes, whereas deletion of the cytoplasmic part of

syndecan-1 both reduced the cell spreading on TSP-1 and inhibits formation of

fascin microspikes (Adams et al., 2001).

Formation of fascin microspikes in response to TSP-1 is also dependent on the

small GTPases, Rac and Cdc42 (Adams and Schwartz, 2000). Cell adhesion to

TSP-1 activates Rac and Cdc42 and their downstream effector, p21-activated

kinase (PAK) over an extended period of time compared to the transient

activation occurring when cells adhere to fibronectin (Adams and Schwartz,

2000). The expression o f either constitutive or dominant negative versions of

48 Rac or Cdc42 blocks the spreading and the formation of fascin microspikes on

TSP-1, suggesting that the TSP-1-mediated regulation of the activity of Rac and

Cdc42 is important for a proper spatial organisation of fascin microspikes and

lamellipodia formation on TSP-1 (Adams and Schwartz, 2000).

Cells adherent on fibronectin, vitronectin or collagen IV, have a diffuse

distribution of fascin-1 that is non-coincident with actin stress fibers (Adams,

1995, 1997). This change in fascin distribution is induced by the activation of

protein kinase C-a (PKC-a) in response to these ECM proteins. The activation

of PKC-a induces a phosphorylation of fascin at serine39 and thus inhibits its

interaction with actin filament (Yamakita et al., 1996; Ono et al,. 1997). In

contrast, fascin is kept in a non-phosphorylated form when plated on TSP-1

(Adams et al., 1 999). Correspondingly, expression of a non-phosphorylatable

mutant of fascin (Fascin S39A) inhibits cell spreading and formation of focal

adhesion in fibronectin-adherent cells. In contrast, introducing a constitutive

negative change on serine39 (S39D) mimicking a phosphorylated form of

fascin, promotes spreading on fibronectin. These data indicates an active role of

fascin in cellular responses to both TSP-1 and fibronectin.

Vascular Smooth Muscle Cell Proliferation and Migration

An increased expression and localisation of TSP-1 in blood vessels walls has

been related to development of arterosclerosis and intimai hyperplasia, of which

one characteristic is an increase in vascular smooth muscle cell migration and

proliferation (Steninaet al., 2003; Roth et a I., 1998). This increase in TSP-1

expression is related to increased concentration of glucose in the blood in

diabetes (Stenina et al., 2003). The signalling transduction events by which

49 TSP-1 induces vascular smooth muscle cell (VSMC) migration and DNA synthesis involve a number of different pathways, including activation of tyrosine kinases, focal adhesion kinase (FAK), mitogen-activated protein kinase

(MAPK/ERK), phosphatidylinositol 3-kinase, and protein kinase C (Patel et al.,

1997, Guo et al., 1998; Gahtan et al., 1999; Lymn et al., 1999; Willis et al.,

2000). The addition of soluble TSP-1 induces chemotactic migration and proliferation of human VSMC as detected in Boyden Chamber migration assays, which can be abolished by pharmacological inhibition of either tyrosine kinase activity, PKC, or phosphatidylinositol 3-kinase (Patel et al., 1997; Guo et al., 1998; Lymn et al., 1999; Willis et al., 2000). Furthermore, a biphasic phosphorylation of phosphatidylinositol 3-kinase occurs in VSMC in response to

TSP-1 (Lymn et al., 1999). Also, phosphorylation of FAK has been detected in both melanoma cells and VSMC in response to TSP-1, where in the VSMC, the phosphorylation of FAK was shown to be dependent on phosphatidylinositol 3- kinase (Gao et al., 1996a; Lymn et al., 1999).

Disassembly of Focal Adhesions

Fully spread bovine aortic endothelial cells plated on fibronectin or fibrinogens are able to form focal adhesions, in contrast to partial spreading and absence of focal adhesions when plated on TSP-1. Addition of soluble TSP-1 to prespread bovine aortic endothelial cells plated on fibronectin stimulating disassembly of preformed focal adhesion or inhibited the dynamic reformation of focal adhesion, preferentially in the central regions of the cells, whereas focal contacts in the periphery remained unaffected (Murphy-Ullrich and Hook, 1989).

Interestingly, the TSP-1 mediated focal adhesion disassembly had no effect on cell spreading. Further molecular and cellular analysis of TSP-1 mediated focal

50 adhesion disassembly identified a complex consisting of the low-density

receptor-related protein (LRP, also known as CD91 or the aa-macroglobulin) and, interestingly, the calcium-binding protein calreticulin as a TSP-1 receptor

involved in the disassembly of focal adhesions (Orr et al., 2003; Goicoechea et al., 2002, 2000). Calreticulin is mainly an intracellular protein, which primarily function as a chaperone in the endoplasmatic retriculum and as a regulator of calcium homeostasis (Mery et al., 1996; review by Michalak et al., 1999).

Overexpression of calreticulin in mouse fibroblasts increases the level of vinculin and subsequently an increase in cellular adhesion and decrease in migration (Opas et al., 1996). Upon binding to TSP-1, the calreticulin/LRP complex activates phosphoinositide 3-kinase and ERK in a pertussis toxin (G- protein) dependent pathway (Orr et al., 2002).

The LRP has previously been shown to mediate endocytic degradation of several extracellular components including TSP-1 (Mikhailenko et al., 1997;

Godyna et al., 1995). This has been suggested to be mechanism by which LRP regulates the concentration of soluble TSP-1. Together, these data indicate highly cell-type specific signalling responses to TSP-1, possibly due to the variation in the expression of cell surface receptors or to different modes of activation of cell signalling pathways in different cell types. Thus, the study of cell signalling response to TSPs poses special challenges.

51 Kelch-repeat proteins

In search of novel intracellular regulators of cellular attachment and spreading in response to TSP-1, a cDNA library from the TSP-1 responsive myoblast cell line C2C12 was screened in the non-responsive Cos-7 cells for the acquisition of the ability to attach to the C-terminal domain of TSP-1 (Adams et al., 1998).

This cell-attachment assay identified a novel protein, muskelin. Muskelin belongs to the family of kelch-repeat containing proteins, which motivated me for a further analysis of the function and regulation this family of proteins.

The kelch motif was first described in Drosophila melanogaster Kelch (Xue and

Cooley, 1993) and is an ancient and evolutionarily-widespread sequence motif of 44-56 amino acids in length. The kelch motif occur as groups of five to seven repeats and has been identified in proteins of otherwise distinct molecular architecture, termed the kelch-repeat superfamily (review see Adams et al.,

2000). Currently, over 28 kelch-repeat proteins have been sequenced and functionally characterised in diverse organisms including viruses, plants, fungi and mammals (Table 2). The sequence identity between the individual kelch

repeats is low, where only eight residues within the motif is highly conserved and constitute the consensus sequence (see Figure 6A; Adams et a I., 1 998;

Bork and Doolittle, 1994). The crystal structure of the fungal galactose oxidase

has revealed that the individual kelch motif forms a four-stranded antiparrallel twisted p-sheet fold, which comprises into a "blade" (Figure 6 8). The sets of

blades formed by the repeats all together forms a p-propeller structure (Figure

6C; I to et al., 1991, 1994). Intra- and inter-blade loops of varying lengths

protrude above, below, or at the sides of the p-sheets and contribute to the

52 Table 2. The Molecular Organisation and Functional Subgroups of Kelch-Repeat Proteins

Functional Molecular Localisation Size/Organism Refs grouping/Name Organisation

Actin-associated Kelch BTB-kelch Ring Canals 688aa/Drosohlla Xue and Cooley, 1993; Robinson and Cooley, 1997 Keap1 BTB-Kelch Ectoplasmic 624aa/Mouse Vellchkova et al., 2002 Specialisations a-scruin Kelch-kelch Acrosomal bundles 918aa/horseshoe crab Way et al., 1995a,b Calicin BTB-Kelch Calyx of sperm 588aa/Human, Bovine von Bülow et al., 1995 spe-26 Unique-kelch Spermatogonial cells 570aa/C.elegans Varkey et al., 1995 ENC-1 BTB-Kelch Neuroectodermal 589aa/Human, Mouse Hernandez et al., 1997 regions Actinfilin BTB-kelch Dendrites 640aa/Rat Chen et al., 2002 Mayven BTB-kelch Neuron 593aa/Human, mouse Soltysik-Espanola et al., 1999 Actin-fragmin kinase Unique-kelch 737aa/Sllme mold FuruhashI and Hatano, 1992a,b; Gettermans et al., 1992 Nd1 BTB-kelch Ubiquitously 642aa/Mouse Sasagawa et al., 2002

Microtubules- associated gigaxonin BTB-Kelch Neurons 597aa/Human Bomont et al., 2000 teal Kelch-coiled coil - 1147aa/S.pombe Verde et al., 1995

Transcription- associated HCF-1 Kelch-FNIII - 2035aa/Human, mouse Kristle and Sharp, 1990, 1993 HCF-2 Kelch-FNIll - 792aa/Human, mouse Johnson et al., 1999x% Rag-2 Kelch-PHD T-cells 527aa/Human, mouse Chang et al., 1995

Cell morphology Muskelin DD-kelch - 735aa/Human, mouse Adams et al., 1998 KeUp - 1164aa/S. cere v/s/ae Philips and Herskowitz, 1998 Kel2p - 882aa/S.cerevisiae Philips and Herskowitz, 1998 Ral2p Kelch-unique - 611aa/S.pombe Fukul et al., 1988 P-scruin Kelch-kelch Acrosomal vesicles 916aa/Horseshoe crab Way et al., 1995a,b

Extracellular Attractin/Mahogany CUB-Kelch T-cells 1198-1429aa/Human, Duke-Cohan et al., 1998, mouse, C.elegans Gunn et al., 1998 Galactose oxidase DD-Kelch 728aa/H.rosellus McPherson et al., 1992

53 B Intra-blade loops Q

pi N-term

Inter-blade^ C-term loops

Intra-blade loop

1

Figure 6. The kelch motif and the kelch-repeat p-propeller fold, and a seven bladed kelch-repeat p- propeller domain. A ) Consensus sequence of the kelch motif as described in Adams et al., 2000 and Bork and Doolittle, 1994. The kelch motif is shown in relation to the four p-sheets of a propeller blade structure, as determined for the kelch motifs of fungal galactose oxidase (Ito el al., 1991). In the consensus sequence, G^glycine, Y=tyrosine, W=tryptophan, s^small residue; l = large residue; h=hydrophobic residue. B) Crystal structure of a kelch-repeat propeller blade. A single blade from the crystal structure of fungal galactose oxidase (IGOF) is shown. The p-sheets 1-4 are colored as in panel A. The consensus sequence from panel A is mapped onto the kelch-repeat fold. The N- and C-terminal loops join adjacent blades. As shown, the most highly-conserved residues are located in the p-sheets and the intervening loop sequences are variable. C) Crystal structure of the kelch-repeat p-propeller domain of the fungal galactose oxidase (IGOF). This p-propeller contains seven blades. The p-sheets 1-4 are colored as in panel A and B. The p4 strand of blade 7 is derived from the amino-terminus of the domain and thus closes the circular structure using N-terminal closure mechanism.

54 variability of the binding properties of the individual p-propellers (review see

Fülop and Jones, 1999; Adams et al., 2000; Jawad and Paoli, 2002).

Actin-Binding Kelch-Repeat proteins

The current view of the functions of kelch-repeats proteins have been compiled from studies of individual kelch-repeat proteins from diverse organisms

(Table 2). Several members of the kelch protein family have been associated with actin filaments, where for most of them, the kelch repeats mediate the direct binding to actin. These proteins include Kelch expressed in Drosophila melanogaster (Xue and Cooley, 1993; Robinson and Cooley, 1997), the mammalian ENC-1, a-scruin from Limulus polyphemus (Way et al., 1995a), the mammalian calicin, the actin-fragmin kinase in Physarum polycephalum

(Furuhashi and Hatano, 1992), and the mammalian IPP (VanHouten et al.,

2001). For these kelch proteins, the interaction with actin filaments mediates a number of diverse cellular function including neuronal differentiation (ENC-1), oocyte maturation (Kelch) and acrosomal processing (a-scruin, calicin), which I will discuss below.

Kelch in Ring Canals

In the maturing Drosophila melanogaster egg, Kelch is located in the ring canal constituting the intercellular bridge between nurse cells and the oocyte (Xue and Cooley, 1993; Robinson and Cooley, 1997). During oogenesis, mRNA and nutrients are transported from the nurse cells to the oocyte through the ring canals (Figure 7A). These are intercellular bridges derived from arrested contraction rings, and consists of several proteins including F-actin, filamin (Li

55 MG et al., 1999), Src tyrosine kinases (Dodson et al., 1998), and Kelch protein, that line the submembranous regions of the canals (Figure 7B). As cytoplasmic transport proceeds, the ring canals expands from >1pm to 10-12 pm in

diameter. The expansion of the ring canals is initiated by the nucléation of new

actin filaments, after which the Kelch protein gets recruited to the ring canals

and a rearrangement of the actin filaments begins. Thus, Kelch functions as an

actin-crosslinking protein that promote sliding of actin filaments. In the kelch

mutants, the initial assemble of the ring canals appears normal, whereas the

ring canals becomes disorganised during the expansion (Robinson et al., 1994).

This disorganisation leads to an impaired cytoplasmic transport between the

nurse cells and the oocyte, which causes small immature eggs and results in female sterility. Kelch is a multidomain protein composed of a Broad-Complex,

Tramtrack, and Bric-a-Brac/Poxvirus and Zincfinger (BTB/ROZ) domain and six

kelch repeats. In vitro assays have shown that Kelch interacts directly with F- actin in an Src tyrosine kinase-dependent manner. A tyrosine residue in the fifth

kelch repeat of Kelch has been identified as a putative phosphorylation site

(Kelso et al., 2002). Site-directed mutagenesis of tyrosine-627 to alanine had no effect on the binding to F-actin, whereas phosphorylation-mimicking substitution to either aspartic acid or glutamic acid disrupted the interaction with actin filaments (Kelso et al., 2002). In vivo localisation studies of Drosophila melanogaster egg chambers showed that KelchY627A was recruited to the ring canals, but the subsequently expansion of ring canals was impaired. This phenotype is similar to the src mutant, in which the ring canals appear smaller and an incomplete transportation of the nurse cell cytoplasm between the egg chambers (Dodson et al., 1998). These data suggest that after Kelch gets recruited to the ring canals, Src-dependent tyrosine phosphorylation disrupts

56 Nurse cells Oocyte

Ring Canals

Src Kinase activation

L u m en

L u m en

Figure 7. Schematic Overview of Drosophila melanogaster Ring Canal Expansion A) A drawing of a Drosophila melanogaster egg chamber showing the position of the nurse cells and the oocyte. A cytoplasmic flow of nutrients and mRNA into the oocyte goes through the ring canals, which are connecting the nurse cells with the oocyte. B) A schematic overview of the molecules connecting the actin cytoskeleton to the plasma membrane. The actin-bundling protein, Kelch, forms dimer through the B IB domain located in the N-terminal region, and binds to actin filaments via the C-terminal kelch-repeat domain. Filamin bridges actin filaments to the plasma membrane. C) Activation of Scr kinase promotes the disruption of the interaction between Kelch and actin filaments via phosphorylation of the Kelch protein at tyrosine-627 located in the kelch-repeat domain.

57 the binding between Kelch and the actin filaments and this regulatory

phoshorylation is important for the reorganisation of actin filaments during the expansion the ring canals (Kelso et al., 2002).

Sperm a togenesis

Ectoplasmic Specialisations (ES) are intercellular junctions between developing sperm cells and the Sertoli cells in mammals (for review see VogI et al., 2000).

The Sertoli cells, or the accessory somatic cells, provide nutrients and growth factor via the ES to the germ cells. ES are composed of actin filaments, integrin aePi (Salanova et al., 1998; Mulholland et al., 2001), vinculin (Pfeiffer and VogI,

1991), and myosin-Vlla (Hasson et al., 1997). The kelch-protein Keapi colocalises with the ES structures in mice testis and interacts with the myosin-

Vlla protein as detected in a yeast two-hybrid screen (Velichkova et al., 2002).

Keapi is still associated with ES in myosin-Vlla null mice {shaker-1 mice),

suggesting that Keapi might interact directly with other ES proteins such as actin filaments (Velichkova et al., 2002). Also, Keapi has been detected in focal adhesions in a subset of epithelial MDCK cells, where it colocalises with vinculin

and so far the only example of a kelch containing protein localised to focal

adhesions (Velichkova et al., 2002).

Some reports indicate important roles of other kelch-repeat proteins in spermatogenesis. The extension of the acrosomal process in Limulus polyphemus (horseshoe crab) sperm involves the release of a ‘spring’ composed of coiled actin filaments bundles. Upon activation, the actin bundles convert from coiled to straight bundles and extrude through the nucleus, which gives rise to the acrosomal process extension (Tinley, 1975; DeRosier et al.,

58 1982). a- and p-scruin both contains two six-bladed kelch domains (Way et al.,

1995a,b). Electron m icroscopy of acrosomal bundles shows that the a-scruin molecule is organised into two globular structures, i.e. the kelch domains, and each structure is in close proximity to an actin subunit (Schmid et al., 1994). In

vitro studies showed that a-scruin binds and cross-links actin filaments through the two six-bladed kelch-repeat domains. One of the binding sites has been

mapped to the C837 positioned in the loop connecting the fifth and sixth blade

(Sun et al., 1997). Also, a-scruin binds calmodulin and the binding site has been

mapped to the hinge region connecting the two kelch-repeat domains (Sanders et al., 1996). It has been speculated that binding of calcium to calmodulin

induces a conformational change in a-scruin, which effectively modulates the cross-linking of actin bundles to bring about a subsequently change in the actin organisation from coiled coil to the straight and extended bundles (Sanders et al., 1996).p-scruin is 67.4% identical to a-scruin, but even though the high

homology, p-scruin is targeted to the acrosomal vesicles, which are located at the anterior of sperm and does not contain any actin (Way et al., 1995b). Upon fertilisation, the hydrolytic enzymes in the acrosomal vesicles get exocytosed in order to penetrate the outer membrane of the egg. The role of p-scruin in the

acrosomal vesicles in still unknown.

Calicin was isolated from the calyx of bull sperm heads (von Bülow et al.,

1995). Earlier controversial discussions on the occurrence of actin in these

structures were revived after the determination of calicin as an actin-binding

protein in vitro (Heid et al., 2002), and immnuofluorescent stainings of calicin

had showed a diffuse distribution in boar testis, which is similar to the diffuse distribution of actin in these cells. Northern blot analysis of Calicin mRNA from

59 various tissues showed an exclusive expression in testis (von Bülow et al.,

1995).

Mutations in the Caenorhabditis elegans kelch-protein spe-26 inhibit

spermatogenesis resulting in male sterility (Varkey et al., 1995). The

spermatogenesis proceeds normally through the formation of the primary

spermatocytes, but subsequently fail to complete meiosis to form spermatids

(Varkey et al., 1995). Six individual mutations in SPE-26 have been reported to

induce this phenotype, where five of these mutations have been mapped to the

kelch repeats. On the molecular level, all the mutations disrupted

segregation and showed mislocalisation of the actin filaments to the periphery

of the cell. This phenotype coincides with defects in actin reorganisation

suggesting a functional role of spe-26 as an actin binding protein.

Kelch-Repeat proteins in Neurons

The Ectoderm and Neural Cortex-1 protein (ENC-1) has also been described as

the nuclear restricted protein/brain (NRP/B) (Hernandez et al., 1997; Kim et al.,

1998). The expression is highly limited to the in neuroectodermal regions and in

the nervous system, where it colocalises with actin filaments (Hernandez et al.,

1997; Kim et al., 1998). In vitro cosedimentation assays shown direct interaction

between ENC-1 and actin (Hernandez et al., 1997). The expression of ENC-1 is

upregulated during retinoic acid or dibyutyryl cAMP stimulated neuronal cell

differentiation and overexpression of ENC-1 in Neuro 2A cells induces neural

process formation (Kim et al., 1998). Correspondingly, depletion of ENC-1

expression in PCI2 cells using antisense expression inhibits the neural process

formation in response to the growth factor nerve growth factor (Kim et al.,

1998).

60 The cystic fibrosis transmembrane conductance regulator-associated protein-70

(CAP70) is a scaffolding protein composed of four PDZ-domains and is involved in regulation of the chloride channel activity of the ABC transporter, cystic fibrosis transmembrane conductance regulator (Wang et al., 2000). A yeast two-hybrid screen of CAP70 identified the kelch-protein Actinfilin as a member of CAP70-interacting proteins (Chen et al., 2002). Actinfilin is specifically expressed in the brain, and most commonly associated with dendrites and subcellular localised to the postsynaptic densities. A further molecular characterisation of actinfilin identified a C-terminal BTB/POZ-domain promoting homodimerisation and a kelch-repeat domain mediating binding to actin filaments (Chen et al., 2002). The role of the actinfilin/CAP70 association has not been characterised any further.

Other Actin-Binding Kelch-Repeat proteins

Mayven was discovered as an ENC-1 related protein (Soltysik-Espanola et al.,

1999). Mayven colocalises with actin filaments and in vitro assays identified the kelch-repeats of Mayven as the binding site for actin (Soltysik-Espanola et al.,

1999). The kelch protein Actin-fragmin kinase of the slime mold Physarum polycephalum complexes with actin and fragmin (Furuhashi and Hatano,

1992a,b). The actin-fragmin kinase inhibits the nucléation and capping capacity of the actin-fragmin complex by a direct phosphorylation of the actin subunit.

This inhibition of the actin-fragmin complex promotes the formation of longer actin stress fibers (Gettermans et al., 1992, 1995; Furuhashi and Hatano,

1992b).

The Nd1 protein {Neural crest homeobox downstream gene 1) was isolated in a representational difference analysis of A/cx-deficient mice (Sasagawa et al.,

61 2002). Among the isolated cDNAs, a single gene encoding a short 25kDa form

(Nd1-S) containing a BTB/POZ domain and a longer 72kDa form (Nd1-L) containing an additional six kelch-repeats were further characterised. The nucleotide sequence for Nd1-L is identical to the previous described Ns1- binding protein (Wolff et al., 1998), where the latter contains a single nucleotide deletion resulting in premature termination lacking the sixth kelch repeat (my observation). Ndl-L colocalise and co-sediments with actin filaments and overexpression of Ndl-L in NIH3T3 cells protects actin filaments from

depolymerisation induced by cytochalasin D (Sasagawa et al., 2002). The

specific function of Ndl-L down regulation in the A/cx-deficient mice is still

uncertain.

Kelch-Repeat proteins and Microtubles

The cytoskeletal network is composed of three major components, which,

besides actin filaments, consists of microtubules and intermediate filaments.

Both of these cytoskeletal components are also involved in a number of cellular

functions including intracellular transport, mitoses and differentiation (review

see Keating and Borisy, 1999; Kirfel et al., 2003; Paramio and Jorcano, 2002).

Kelch-repeat proteins have also been linked to the function of microtubules.

Giant axonal neuroparthy (GAN) is a severe neuropathy disease affecting the

central nervous system and peripheral nerves and the disease symptoms

includes clumsiness, muscle weakness, and loss of reflexes, and occurs in

early childhood and usually follows by death in adolescence (for review see

Ouvrier, 1989). On the cellular level, GAN is characterised by abnormal

microtubule networks and aberrant intermediate filament accumulations, which

results in giant axonal swellings. One cause of the disease has been located to

62 mutations in the kelch protein gigaxonin (Bomont et al., 2000). Of the nine missense mutations reported, four were located in the BTB/POZ domain, three located in the kelch repeats, one in the intervening region, and one in the C- terminal region. Wild-type gigaxonin protein colocalises with microtubules as observed in immunofluorescent stainings and coimmunoprecipitates with the microtubule-associated protein 1B light chain (MAP-1 B-LC; Ding et al., 2002).

This interaction can be disrupted by a mutation in the kelch repeats previously identified in GAN patients (Ding et al., 2002). MAP-1 B-LC forms a complex with

MAP-1 B-heavy chain and is involved in microtubules dynamics and stability

(review see Gordon-Weeks and Fischer, 2000). A possible interaction between

MAP-1 B-LC and actin filaments suggests that MAP-1 B-LC could function as a cytoskeletal cross-linking protein by interconnecting microtubules with actin filaments (Togel et al., 1996). Gigaxonin competes with microtubulin-associated

protein-1 B-heavy chain for binding to MAP-1 B-LC and inhibits the formation of the MAP-1 B complex. An association between gigaxonin and MAP-1 B-LC

protects the microtubules from depolymerisation upon colchicine treatment

indicating a functional role for gigaxonin in microtubulin stabilisation and

reorganisation (Ding et al., 2002).

Two kelch-repeat proteins in Schizosaccharomyces pombe are involved in the control of cell morphology. In search of members of the Ras signalling transduction pathway, Schizosaccharomyces pombe were visually screened for

mutants showing similar phenotypes (Verde et al., 1995; Fukui et al., 1988,

1989). These screens identified two kelch-repeat proteins. Teal (tip elongated

abberant-1) and Ral2p (Ras like protein). Teal mutants show defects in growth

by dislocating the elongation point, which gives rise to a T-shaped morphology

(Verde et al., 1995). In feaf-deficient strains, the microtubules showed a curled

63 morphology in the cell ends. As teal is targeted to the ends of the cell in a microtubule-dependent manner, it has been suggested that microtubules require teal to recognise the cell ends in order to terminate elongation and sustain an antipodal growth (Mata and Nurse, 1997; Behrens and Nurse, 2002).

Teal is phosphoregulated by the p21-activated kinase homologue, Shkl (Kim et al., 2003). This indicates that Teal is a downstream target for the signalling pathway involving Rasi and Cdc42 in Schizosaccharomyces pombe. The similar phenotypes of Ral2p and Rasi defective Schizosaccharomyces pombe

mutants suggested that Ral2p is as an activator of Rasi (Fukui et a I., 1988,

1989). These cells showed deformed cell morphology and deficiency in conjugation and decreased sporulation. Overexpression of Rasi restores the

mating ability of Ral2 deficient strains indicating that Rasi is a downstream target of Ral2.

Kelch proteins in Transcriptional Regulation

Some evidence for kelch-repeat proteins involved in mRNA splicing and

processing has been reported. The kelch protein Host cell factor-1 (HCF-1)

protein was first identified as a cofactor required for the formation of transcriptional activator VP16-complex from the herpes simplex virus (Kristie

and Sharp, 1990, 1993). HCF-1 is mainly located to the nucleus due to a

nuclear localisation sequence in the C-terminal region. The exact normal

cellular function of HCF-1 is still uncertain, but upon herpes virus infection, the virus protein VP16 binds to the kelch repeats of HCF-1. This complex

associates with the transcription factor Oct-1 and induces the expression of the five viral immediate-early genes, which are important for the lytic replication of

herpes simplex virus. HCF-1 is expressed in a wide range of tissues as a

64 2036-residue precursor protein that subsequently is digested into two subunits,

HCF-1 N and HCF-1 c, which remain noncovalently associated (Wilson et al.,

1995). The six kelch repeats located in the N-terminal region of HCF-1 are

responsible for the interaction to the VP16 protein. A mutation of Proline134

located in the loop connecting the second and the third kelch repeat disrupts the

interaction with VP16 (Wilson et al., 1997). In the nucleus, HCF-1 colocalises

with small nuclear ribonucleoproteins (snRNP) responsible for mRNA splicing

and co-immunoprecipitation experiments indicate a direct binding between

HCF-1 and the snRNP subunits U1 and U5 (Ajuh et al., 2002). The mutation of

Proline134 also disrupts the interaction between HCF-1 and snRNPs and

inhibits the assembly of spliceosomes. Recently, a novel HCF-1 binding protein was isolated, which binds to the same Asp-His-Pro-Tyr motif as the VP16 in the

p-propeller of HCF-1 (Mahajan et al., 2002). This HCF-1 p-gropeller interacting protein, HPIP, shuttles between the nucleus and the cytoplasm, and

overexpression of HPIP redirects the localisation of HCF-1 from the nucleus to

the cytoplasm (Mahajan et al., 2002). The role of HPIP and the function of HCF-

1 shuttling between the nucleus and the cytoplasm are still unclear. In human, a

HCF-1 homologue has been identified called HCF-2 (Johnson et al., 1999). The

HCF-2 binds and stabilises the VP16/0ct-1 complex, but in contrast to HCF-1

fails to induce transcriptional activation (Lee and Herr, 2001). HCF-2 lacks the

nuclear localisation sequence present in C-terminal subunit of HCF-1 and therefore distributes uniformly between the nucleus and the cytoplasm (Johnson

et al., 1999).

HCF orthologues has been identified in Caenorhabditis elegans and Drosophila

me/anogasfer (Wysocka et al., 2001; Mahajan et al., 2002). The CeHCF is more

related to the mammalian HCF-2 than HCF-1. CeHCF is a 782 residue long

65 polypeptide compared to 2035aa and 792aa for mammalian HCF-1 and HCF-2, respectively. Furthermore, similar to HCF-2, CeHCF is not a subject to the proteolytic processing for activation of HCF-1 (Johnson et al., 1999). But

CeHCF contains four phosphorylation sites, which are not conserved in neither of the mammalian HCFs nor DmHCF. Phosphorylation of CeHCF correlates with the developmental stages of Caenorhabditis elegans, where embryonic

CeHCF is hyperphoshorylated and CeHCF in the larva is hypophosphorylated

(Wysocka et al., 2001). Phosphorylation of CeHCF also occurred when expressed in mammalian cells, suggesting a conserved kinase is involved.

The primary sequence of DmHCF is most similar to the mammalian HCF-1, though further sequence analysis indicated a lack of proteolytic processing sites in the central region. On assessing whether DmHCF undergo proteolytic processing, two opposing analysis has been reported (Mahajan et al., 2002;

Izeta et al., 2003). Izeta and colleagues detected a full-length dmHCF when transiently expressed in BHK21 hamster cells, whereas Mahajan and colleagues detected two separate fragments, which corresponded to specific C- and an N-terminal tagged versions of DmHCF when expressed in the

Drosophila me/a/iogasfer cell line SL2. The behaviour of endogenous dmHCF is still to be revealed.

Both CeHCF and DmHCF are capable of supporting VP16-induced complex formation (Liu et al., 1999; Mahajan et al., 2002). This suggests that HCF posses a highly conserved function targeted by VP16 upon herpes virus infection.

66 Rag-2

The combinatorial assembly of T-cell receptors and immunoglobulin genes in B- and I - cells relies on a recombinatory genomic process known as Varible (V),

Diversity (D), and Joining (J) recombination (for recent review see Gellert,

2002). The recombinant V(D)J gene segments gives rise to the variable regions responsible for specific antigen binding of immunglobulins and T-cell receptors.

The initial step in V(D)J recombination is the cleavage of precisely defined regions, the recombination signal sequences (RSSs), adjacent to the gene segments and is primarily mediated by the recombination activating genes

(Rag)-1 and -2. The binding of Rag1 to the RSS recruits Rag2. This complex alone is responsible for the initial V(D)J recombination and is only expressed in the lymphoid cells that are responsible for the immunoglobulin and T-cell receptor expression (review see Gellert, 2002). Subsequently, the cloning of these proteins, identified Rag2 as a kelch-repeat protein (Callebaut and

Mornon, 1998)

Mutations d isrupting the recombination a ctivity of the Rag complex lead to a human disease named severe combined immunodeficiency (SCID) also known as Omen Syndrome, in which the maturation o f both T cells and B cells are perturbed (Chang et al., 1995). The disease is ultimately fatal and has been one of the primary targets for gene therapy (review see Fischer et al., 2002).

Various genetic screens of SCID patients have characterised mutations in either

Ragi or Rag2. The mutations in Ragi disrupt the binding to the RSS and the cleavage of DNA, whereas mutations in the kelch repeats of Rag2 inhibits binding to Ragi and the assembly of the Rag complex (Schwarz et al., 1996;

Callebaut et al., 1998; Aidinis et al., 2000; Gomez et al., 2000). Studies using truncated Ragi and Rag2 have identified that the C-terminal part of Ragi

67 (aa384-1008) and the kelch-repeats of Rag2 (aa1-383) are necessary and sufficient for in vitro cleavage of DNA. Rag1 contains the putative catalytic site for DNA cleavage including the binding site for a divalent metal ion (Landree et al., 1999; Kim et al., 1999). Since both the binding and the cleavage of DNA are functionally allocated to Ragi, the role of Rag2 in V(D)J recombination is less direct. Rag2 is required for all catalytic steps and some data suggest that Rag2 is important for proper targeting and recognition of the RSS, positing a role as a cofactor causing conformational changes in Ragi (Akamatsu and Oettinger,

1998; Mo et al., 1999; Qiu et al., 2001).

NS1-BP

As previously described, Nsl-binding protein is the truncated form of the Ndl-

L. The NS1 protein of influenza A viruses is implicated in several regulatory cellular functions including splicing, translation, and nucleocytoplasmic transport of mRNA. The Nsl-binding protein was isolated in a yeast two-hybrid screen where the NS1 protein bound specifically to the kelch repeats of NS1-binding

protein (Wolff et al., 1998). This truncated form of Ndl-L protein is subcellular targeted to the interchromatin granules and the perichromatin fibrils in the

nucleus and colocalises with the spliceosome assembly factor SC35,

suggesting a role for Nsl-binding protein in the cellular splicing machinery. After

infection with influenza vira, NS1-binding protein redistributes throughout the

nucleoplasm together with the NS-1 protein.

The transcription factor Nuclear-factor erythroid 2p45 related factor 2, Nrf2, is

important for the induction of phase 2 genes in response to electrophil and oxidative toxicity (review see Nguyen et al., 2002). Under normal conditions

68 binding to the kelch protein Keapi in the cytoplasm sequester Nrf2 from the

nucleus (Itoh et al., 1999). Upon induction from oxidative reagents, the Keapi-

Nrf2 complex dissociates and Nrf2 translocates to the nucleus and, in

association with other transcription factors, binds to the antioxidative response

element regions upstream of phase 2 genes and induces their transcription.

Keapi contains several cysteines in the intervening region between the

BTB/POZ domain and the kelch domain. These cysteines are highly reactive to oxidative reagents and the interaction with various inducers disrupts the

interaction between Keapi and Nrf2, and therefore thought as the cellular

sensors for detecting electrophil and oxidative toxicity (Dinkova-Kostova et al.,

2002).

Kelch Proteins and the Cell Cycle

A few reports have implicated certain kelch proteins in cell cycle regulation. In

unstimulated cells, ENC-1 (Nrp/B) associates w ith the retinoblastoma protein, whereas the association between ENC-1 and retinoblastoma protein is decreased during neuronal differentiation (Kim et al., 1998). ENC-1 is regulated

by tyrosine phosphorylation, but the specific phosphorylation site is still unknown. The phosphorylation of ENC-1 is regulated through cell cycle progression with a high increase of the tyrosine phosphorylation of ENC-1 in cells in S-phase (Kim et al., 1998). These data suggest a possible role of ENC-1 in cell cycle progression.

The overexpression of Ndl-L in NIH3T3 cells delays cell growth and induces polyploid cells (Sasagawa et al., 2002). The colocalisation of Ndl-L and actin filaments in the dividing cells suggests that overexpression of Ndl-L inhibits cytokinesis by stabilising actin filaments during the reorganisation necessary for

69 formation of the cleavage furrow. Similar effects have been reported for the mutations in HCF-1. A single mutation in the kelch repeats of HCF-1 protein causes a temperature-sensitive G 0/G1 arrest, but a subset of cells are polyploid, suggesting a defect in the cytokinesis upon exit from the mitosis (Reilly and

Herr, 2002). The mutation in the fifth kelch-repeat causes dissociation between

HCF-1 and chromatin at increased temperatures (Wysocka et al., 2001). The expression of simian virus 40 large T antigen rescued cell proliferation and cytokinetic defects induced by the mutation in HCF-1 without restoring the HCF-

1-chromatin association (Reilly et al., 2002). The ability of SV40 large T antigen to suppress the HCF-1 defects in cell proliferation correlated with their ability to inactive the retinoblastoma protein by hyperphosphorylation. These data suggests that HCF-1 is involved in regulating the levels of retinoblastoma protein phosphorylation through the cell cycle, and the inability to activate retinoblastoma protein (i.e. dephosphorylate) in G 0/G1, caused by the mutation in HCF-1, may explain the cytokinetic defects and the arrest in cell cycle in these cells (Reilly et al., 2002).

Extracellular Kelch-repeat proteins

Attractin/mahogany is a family of transmembrane glycoprotein that is expressed from a single gene (Gunn et al., 1999). Attractin is expressed by T-cells and after cleavage released extracellularly, where it promotes cellular spreading and clustering of monocytes (Duke-Cohan et al., 1998). Mahogany is a receptor expressed in several cell types, and is involved in suppression of obesity (Nagle et al., 1999), where deletion of the gene in mice is associated with increase in body weight when fed on high-fat diet.

70 Galactose oxidase is a secreted enzyme expressed in the fungi Hypomyces rosellus. This copper-containing enzyme catalyses the oxidation of alcohols to aldehydes (Baron et al., 1994). The crystal structure of galactose oxidase

revealed that the Cu(ll)-ion is located at bottom of the p-propeller structure formed by the seven kelch repeats (Ito et al., 1991). Galactose oxidase contains a discoidin domain located in the N-terminal region (Ito et al., 1991). This domain coordinates the binding of substrate to the enzyme as well as activation of the enzyme (Ito et al., 1994).

Self-association of Kelch-Repeat Domain and Secondary Domains

Eight out of the s ixteen kelch-repeat p roteins c haracterised i n Homo sapiens contains a BTB/POZ domain (Adams et al., 2000). This domain is evolutionally conserved and is found in actin binding proteins as well as nuclear DNA binding

proteins. Members of this family are involved in wide range of cellular functions

including regulation of gene expression and actin reorganisation by formation of

homo- or heteromeric complexes (review see Albagli, 1995). The crystal structure of the BTB/POZ domain of promyelocytic leukaemia zinc-finger protein

reveals an obligate homodimer interlocked through a hydrophobic interface composed of a-helices and p-sheets (Li et al., 1997; 1999; Ahmad et al., 1998).

Homodimerisation of BTB/POZ domains has also been reported for kelch-

repeat proteins (Kelch, Keapi, Actinfilin, Ndl) suggesting a general characteristic for this domain, though heteromeric kelch-repeat protein complexes has not been reported (Robinson and Cooley, 1997a; Zipper and

Mulcary, 2002; Chen et al., 2002; Sasagawa et al., 2002)

71 The BTB/POZ domain in the Kelch protein mediates homodimerisation, where this domains interlocks two Kelch proteins together and thus allows cross- linking of ring canal actin filaments through an actin-binding site in the fifth kelch repeat of Kelch (Robinson and Cooley, 1997). A mutant lacking all the six kelch repeats is still recruited to the ring canals when expressed in Drosophila melanogaster oocytes, where expression of the same mutant in a kelch mutant background causes a diffuse distribution. Thus, the BTB/POZ mediates intermolecular dimérisation and promote bundling of the actin filaments. Similar molecular cross-linking mechanism has been suggested for other actin-binding kelch proteins that contain BTB/POZ domain, including Mayven and actinfilin. In vitro cross-linking experiments with Mayven demonstrated that the BTB/POZ domain dimerised in solution (Soltysik-Espanola et al., 1999). Biochemical co- immunoprecipitation assays have shown that the BTB/POZ domain of actinfilin is important for the self-association (Chen et al., 2002; Soltysik-Espanola et al.,

1999).

Similarly, the BTB/POZ mediated homodimerisation of Keapi is important for cytoplasmic sequestering of Nrf2 (Zipper and Mulcahy, 2002). A single mutation of Serine104 in the BTB/POZ domain of Keapi disrupts homodimerisation, resulting in impaired Nrf2 interaction, but did not affect the kelch-mediated binding of actin filaments (Zipper and Mulcahy, 2002). So far only the BTB/POZ members of the kelch proteins has been identified as homodimers.

One other mechanism for dimérisation of kelch-repeat proteins has been reported. The Kellp and Kel2p identified in Saccharomyces cerevisiae does not contain BTB/POZ domain, but have a unique ability to dimerises through C- terminal coiled-coil domains (Philips and Herskowitz, 1998). Both of these proteins are involved in cell fusion during mating of S.cerevisiae.

72 The studies of kelch-repeat proteins indicate multiplicity of functional roles in all

cell compartments. Thus, presence of kelch-repeats in a protein cannot lead directly to a hypothesis as to its function. So far the presence of kelch-repeats is only a general indication of a protein-protein interaction domain.

Identification of Muskelin as a Kelch-Repeat Protein

Overexpression of muskelin in mouse myoblasts increases cellular attachment to TSP-1, but did not change attachment when plated on the ECM proteins fibronectin or laminin (Adams et al., 1998). Correspondingly, when myoblasts

cells were stably transfected with antisense cDNAs, which suppresses the

expression of muskelin, a decrease in cell spreading on TSP-1 was observed, whereas no changes was observed in cell spreading in response to fibronectin

(Adams et al., 1998). Mouse 02012 skeletal myoblasts selected for stable

overexpression of muskelin showed qualitative alterations in cytoskeletal

organisation with a decrease in fascin containing actin microspikes and an

intriguingly increase in vinculin-containing focal adhesions when plated on

TSP-1 (Adams et al., 1998). This effect is not apparent in stable muskelin-

overexpressing 02012 cells adherent to fibronectin. A muskelin-depleted

02012 cell line showed a downregulation in the formation of F-actin

microspikes, and did not form focal contacts when plated on TSP-1. This

specificity in cellular response suggests that muskelin is linked to cell signalling

pathways that are differentially regulated by EOM components.

Muskelin is encoded by a single gene in the mouse genome and highly-

conserved orthologues have been identified in Homo sapiens (Adams and

Zhang, 1999), and Drosophila melanogaster (Adams, 2002). Bioinformatic

73 analysis of these orthologues has shown that muskelin is a multidomain protein

(Adams, 2002). The highly-conserved amino-terminal half of the protein contains two recently-characterised sequence motifs, the Lissencephaly-1

homology motif (LisH-motif) (SMART # SM0667) (Emes and Ponting, 2001) and the C-terminal to LisH-motif (CTLH) (SMART # SM0668). The LisH-motif is an a-helical region firstly described in the amino-terminus of Lisi where it contributes to an interaction with the a-subunit of platelet-activating factors acetyl hydrolase and interactions with the microtubule motor protein, dynein

(Cahana et al., 2001 ; Smith et al., 2000). The motif is also found in proteins that are not associated with microtubules (reviewed by Emes and Ponting, 2001).

The CTLH motif is another alpha-helical segment of unknown function that is

present in several LisH-containing proteins (Emes and Ponting, 2001). The car boxy-terminal half of muskelin contains six kelch-repeats and a short carboxy-terminal region.

A previously published report has described novel muskelin interactions as

identified in yeast two-hybrid screens (Hasegawa et al., 2000). The family of G-

protein coupled prostaglandin EP3 receptors are distinguished according to sequence specificity in the C-terminal intracellular tail (review see Narumiya et al., 1999). A yeast two-hybrid screen of the C-terminal tail of the three isoforms of the prostaglandin EP3 receptor identified muskelin-binding specifically to the a-isoform, and not with either the p- or y-isoform (Hasegawa et al., 2000).

Previous attempts i n our I aboratory to d iscover putative muskelin i nteractions using muskelin as bait in the yeast two-hybrid system were unsuccessful

(Collet, 1999). Instability and degradation of the yeast expression plasmid

74 containing full-length muskelin was detected when transiently expressed In yeast. Furthermore, expression of the full-length muskelin protein from alternative expression vectors could not be detected and an approximately 50% decrease In growth rate of transfected cells was observed suggesting that expression of full-length muskelin In yeast Is highly toxic. A following library screen gave no colonies emphasising the Importance of stable, non-toxic expression of bait protein for a successful yeast two-hybrid screen. Expression of a C-termlnal deletion mutant of muskelin could be detected In yeast, but a following library screen was not performed. This suggests that one approach for evaluating muskelin Interactions In cells could be by expressing truncated or mutated versions of muskelin. In view of the relatively low expression of the deleted protein. It was decided to achieve a g reater u nderstanding o f protein domains and seek alternate approaches to studying biochemical properties of muskelin. These studies have been the focus of my project.

75 Project Objectives

Muskelin, as a kelch-protein, encourage me to study the complexity of this

superfamily of proteins by analysing the complement of kelch-repeat proteins

encoded in the human genome and to make phylogenetic comparisons to the

kelch-repeat proteins a nd m uskelin-like proteins encoded in other sequenced genomes. These data can suggest logical experimental approaches to a

structural and functional analysis of muskelin and other kelch-repeat proteins.

Using various biochemical and cellular biological approaches, my aim was then to identify the role of muskelin domains in the spatial and temporal cellular

localisation and the cellular regulation of muskelin, in order to obtain further

information on the cellular properties and function(s) of muskelin.

76 Materials and Methods

BLAST Database Searches and Identification of Domain Architectures.

A) For muskelin.

The BLASTP database (http://www.ncbi.nlm.nih.gov/BLAST) was searched using full-length sequences of previously identified muskelin orthologues,

M.musculus (NP 038819; Adams et al., 1998) H.sapiens (NP 037387, Adams and Zhang, 1999), D.rerio (AAN32664; direct submission Adams and

Zaromytidou, 2001), D.melanogaster (NP_610801; Adams, 2002), S.pombe

(NP_594297; Adams et a I., 1 998) at parameter values of Expect-value o f 10 and the number of descriptions at 100. Using these high e-values gave a high number of output entries, including several kelch-repeat proteins, but was selected in order to enable detection of remotest muskelin orthologues. Direct analysis on the whole genomes of C.elegans (The C. slogans Sequencing

Consortium, 1998), and S.cerevisiae (Goffeau et al., 1996) was performed at http://www.ncbi.nlm.nih.gov/BLAST/Genome/NematodeBlast.html and http://www.ncbi.nlm.nih.gov/BLAST/Genome/YeastBlast.html to confirm that muskelin or muskelin-like proteins are not identifiable in these organisms

(Adams, 2002).

Expressed Sequence Tags were analysed using the translated BLAST search engine (TBLASPN) with est_others set as the Chosen Database parameter.

This database is a translated nucleotide-based search engine using protein sequences as queries and was searched in a similar manner as the BLASTP database as previously described. The following genebank accession entries were identified in Gallus gaiius, Canis familiaris, Bos taurus and Silurana tropicalis as muskelin orthologues:

77 Specie Genebank Accession #

Gallus gallus AJ449615, AJ451619, BU234199, BG712626,

BU117343, BU228515, BG709924,

BU203113, BU448071, BU229245, BU207319

Canis familiaris BM541029, BM538798

Bos taurus AW670217, BM483312, BI540863

Silurana tropicalis AL874034, AL966473, AL846781, AL652943,

AL653531, AL801392, AL960225, AL629578

Table 1. List of Expressed Sequence Tags of muskelin orthologues identified in Gallus gallus, Canis familiaris, Bos taurus, Silurana tropicalis

All nucleotide sequences were translated to amino acid sequences in all three

reading frames and peptides corresponding to the correct muskelin orthologues were isolated. Peptide sequences from the individual species were aligned

using ClustalW multiple sequence alignment software (Thompson et al., 1994) at the San Diego Supercomputer Center Biology Workbench

(http://workbench.sdsc.edu) at standard parameters. These alignments were

processes for identification of overlapping sequences, which were excised from the individual alignments in order to compile single muskelin orthologues from the individual species. These compiled orthologues were subsequently

realigned with M. musculus muskelin (NP 038819) using ClustalW and shaded

using Boxshade (http://workbench.sdsc.edu).

78 B) For kelch-repeat superfamily

For the bioinformatic analysis of kelch-repeat protein, the whole human genome

(http://www.ncbi.nlm.nih.gov/genome/seq/HsBlast.html) was searched with the

46 residue consensus sequence for the Kelch motif

(PRSGAGVWVGGKIYVIGGFDGSQSLSSVEVYDPETNTWEKLPSMP;

CDD543) by the basic local alignment search tools BLASTP and PSI-BLAST

(Altschul et al., 1990; 1997) at standard parameters, with a default expectation value of 10 and reporting up to 500 protein sequences. The database was also searched with the kelch repeats from human host cell factor-1 (aa1-aa357;

Wilson et al., 1997), mouse leucine-zipper-like transcriptional regulator (aa49- aa369; Kurahashi et al., 1995), rat Actinfilin (aa331-aa616; Chen et al., 2002), human Rag2 (aa1-aa348; Callebaut et al., 1998), human muskelin (aa 249-636;

Adams and Zhang, 1999) and the complete sequences of all the previously- characterised kelch-repeat proteins (Adams et al., 2000). The listings of kelch proteins in Pfam 9.0 (Wellcome Trust-Sanger Institute; Bateman et al., 2002) and SMART (EMBL; Schultz et al., 1998) domain databases were also reviewed and any apparent additional human kelch-repeat proteins searched against GenBank. From these searches, I found that the species listings in

Pfam and SMART contained redundant and duplicate entries including partial

ORFs for the same protein and did not contain all the kelch-repeat proteins identified in the BLASTP searches. Each identified protein sequence was reciprocally searched against the non-redundant protein database of GenBank and Conserved Domain Database (Marchler-Bauer et al., 2002; which includes the Pfam and SMART databases) to confirm the identification of kelch repeats and to identify additional domains. Novel hypothetical ORFs within the human genome were used to query the database of human expressed sequence tags

79 (ESTs) to determine whether corresponding translation products could be identified. Hits that appeared ambiguous as putative incomplete open reading frames in the initial search were searched against the mouse and rat genome databases at NCBI, and included as incomplete. Characterisation of domain architecture was based on the orthologues identified in mouse or rat. Similar search methods were used to identify kelch-repeat proteins in the genome databases of D.melanogaster (Adams MD et al., 2000), A.gambiae (Holt et al.,

2002), C.elegans (The C. elegans Sequencing Consortium, 1998), S.cerevisiae

(Goffeau et al., 1996), S.pombe (Wood et al., 2002) and A.thaliana (Arabidopsis genome initiative, 2000) at NCBI. A.gambiae proteins are not listed in the Pfam or SMART species trees. The searches were carried out in the period March 18 to April 20 2003. A second set or searches were carried out between 14‘^ June and 20^^ July 2003. Each identified sequence from these species was used in

BLAST searches of the human genome to identify the closest human

homologue (ie, an ORE with significant identity to the entire protein sequence,

not only to the kelch repeats). Previously unknown proteins identified in these searches were also used to query the human genome database to determine if additional human kelch-repeat proteins could be identified.

Entries in the human proteome, which appeared to be partial sequences,

included GenBank XP_054631, which was 91% identical to the C3IP1 Kelch-

protein (Accession NP_067646). Comparison of the two sequences at the

nucleotide level showed that a single nucleotide deletion

( AC A GT GG^ACATGG) in the XP 054631 entry generated an additional start- codon, in the equivalent position to valinelOO in C3IP1. Additionally, a single

mutation in XP_054631 (GTCCGACG->GTCTGACG) generates a stopcodon

80 and consequently a truncated open reading frame. Thus, the XP_054631 and

NP_067646 entries appeared to be derived from the same gene and only

NP 067646 (#11) was included in the analysis. The entry XP 209285 (#13) encoded an open reading frame of 174 amino acids, containing 3 kelch-repeats with h ighest sequence identity to rat actinfilin, NP_663704, length 640 amino acids (Chen et al., 2002). Additional searches with rat actinfilin did not identify a full-length human sequence, so the rat orthologue was used in the analysis. The entry NP_689579 encoded a 263 residue polypeptide containing 2 kelch-

repeats. I identified orthologues in both Mus musculus and Rattus norvegicus of

303 amino acids and 238 amino acids, respectively. Alignments of all three orthologues indicate a highly conserved segment identified in all three orthologues, a segment only present in human, and a segment only present in mouse. These discrepancies may arise from alternate predictions of intron-exon boundaries during genome annotation. The entry BAB67814 appeared to be a partial sequence because it lacked a start codon. All these entries were included in Table 1 and included in statistic sequence analyses, but excluded from the subgrouping of the BTB/Kelch repeat proteins.

Calculation ofK a/Ks Ratios

The calculations of Ka/Ks ratios were preformed on cDNA sequences from

M.musculus (NM_013791) H.sapiens (NM_013255), D.rerio (AF418017) using the YNOO software in the Phylogentic Analysis by Maximum Likelihood (PAML) software package (ftp://abacus.gene.ucl.ac.uk/pub/paml/), which estimates synonymous and nonsynonymous substitution rates based on pairwise comparison of protein-coding DNA sequences as described in Yang and

Nielsen, 2000. As these calculations are based on pairwise comparison of

81 protein-coding DNA sequences, the query sequences have to be of equal length. So for the calculation of Ka/Ks ratios of D. melanogaster muskelin, an initial alignment between the amino acid sequences of D. melanogaster

(NP_6108017) and M. musculus (NP_038819) was performed using ClustalW multiple sequence alignment software and gaps in the individual orthologues were identified and excised. For D. melanogaster muskelin-like protein, this included aa1-45, aa156-157, aa477-482, aa535-542, aa582-585, aa612-621, aa740-787 and aa806-823, and for M. musculus muskelin Aaa1-13, Aaa405 were excised. The corresponding 2160-bp nucleotide sequences of the remaining sequences were isolated and applied in the Ka/Ks ratios calculations using the YNOO software.

Prediction of Secondary Structure, p-Propeiier Closure Mechanisms, and

Putative Phosphorylation Sites

The secondary structure was predicted for M.musculus muskelin (NP_038819)

using the Jpred software (Cuff and Barton, 2000;

http://www.compbio.dundee.ac.uk/~www.jpred). The jpred output data is the consensus sequence based on five independent prediction algorithms, Jalign,

Jfreq, Jhmm, Jnet, and Jpssm (Cuff and Barton, 2000) and therefore utilized in this analysis.

The closure mechanisms in the kelch-repeat p-propeller of the human kelch- repeat proteins were predicted by use of Jpred secondary structure prediction, in order to identify p-sheet secondary structure either amino- or carboxy- terminal to the kelch motifs in each polypeptide sequence. Each sequence was also examined manually for the presence of a tryptophan residue amino- terminal to the first motif or carboxy-terminal to the last motif, as a second

82 predictor of amino-terminal or carboxy-terminal p-propeller closure mechanisms, respectively, (Adams et al., 2000).

Prediction of putative protein kinase A and C phosphorylation sites in individual muskelin orthologues and muskelin-like proteins was performed using NetPhos

2.0 (Blom et al., 1999; http://www.cbs.dtu.dk/databases/PhosphoBase/). The individual putative phosphorylation sites were highlighted in a multiple sequence alignments of muskelin and muskelin-like proteins using ClustalW and Adobe

Illustrator 9.0.

Multiple Sequence Alignments

Estimations of percentage identity between muskelin and muskelin-like proteins were calculated and aligned by the ClustalW multiple sequence alignment software (Thompson et al., 1994) at the San Diego Supercomputer Center

Biology Workbench (http://workbench.sdsc.edu) at standard parameters. The multiple sequence alignments were shaded using BoxShade

(http://workbench.sdsc.edu).

All the human BTB/kelch protein sequences predicted to form six-bladed propellers (37 sequences) were aligned by ClustalW, either as full-length sequences, kelch-repeats alone, or BTB domains alone, and presented in neighbourhood-joining trees (or rooted dendograms) using the Phylip Drawgram

3.5c at the Biology Workbench (http://workbench.sdsc.edu) using standard parameters. These analyses identified two major robustly-related subgroups within the kelch/BTB proteins. Multiple sequence alignments of the two subgroups were used to derive a 50% identity consensus sequence for each subgroup. The 50% identity consensus sequence derived from each set of six

83 kelch motifs were re-aligned with each other to prepare an averaged single motif consensus at 50 % identity for each subgroup using ClustalW.

Three-Dimensional Reconstruction

The discoidin domain of muskelin was aligned with the previously identified discoidin domains from Human Coagulation Factor Va chain A (1CZTA, aa1-

160), Human Coagulation Factor VIII Chain M (1D7PM; aal-159), Galactose

Oxidase (AAAI 6228; aa41-198), Mammalian X-ray cross-complementing group

1 Xrcd (1XNT_A; aal-183), Anaphase promoting complex subunit 10

APC10/DOC1 (1JHJA, aal-171) and Sialidase (1EUU; aa557-603) using

ClustalW in conjunction with a previously published multiple sequence alignment of discoidin domains (Macedo-Ribeiro et al., 1999). The position of the sequences that aligned with the previously described MIND sequence of muskelin (aa130-151; Adams, 2002) were identified in the crystal structures of

Galactose Oxidase (PDB: 1G0F), Sialidase (PDB: 1EUU), Human Coagulation

Factor Va (PDB: 1CZT), Human Coagulation Factor VIII (PDB: 1D7P), and the

Anaphase promoting complex subunit 10, APC10/DOC1 (PDB: 1JHJ) (Ito et al,

1991, Gaskell et al., 1995, Macedo-Ribeiro et al, 1999, Pratt et al, 1999, Wendt et al, 2001) and the localisation were visualised using Swiss-PDB viewer v3.7

(Guex and Peitsch, 1997; http://www.expasy.org/spdbv/) with annotations using

Adobe Illustrator 9.0. In order to depict the localisation of serine-44 and serine-49 of mouse muskelin, similar three-dimensional reconstruction procedure was performed using the crystal structure of Human Coagulation

Factor VIII (PDB: 1D7P).

To examine the localisation of putative phosphorylation sites in the p-propeller, the kelch-repeat domain of muskelin (aa250-624) was aligned with the kelch-

84 repeat domain of Galactose Oxidase (aa201-574) using ClustalW and manual adjusted with regards to conserved residues within the kelch repeats. From this alignment, the corresponding residues in galactose oxidase to muskelin serine-

324, serine-350, serine-355, serine-396, threonine-515, and threonine-519 were identified and highlighted on the three-dimensional structure of Galactose

Oxidase (PDB: 1G0F) as previously described.

Cell Culture

02012 murine skeletal myoblasts were grown in Dulbeco’s modified Eagle’s media (DMEM, Gibco) containing 20% heat-inactivated fetal calf serum (FOS).

Rat Vascular Smooth Muscle Cells (SMC), 293T cells, and Cos-7 green monkey kidney cells were grown in DMEM containing 10% FOS. All cell lines were maintained at 37°C in a water saturated incubator with an atmosphere containing 5% CO 2 and were routinely replated every third day in standard tissue culture plastic products.

Preparation of muskelin expression constructs

Cloning of EGFP-Muskelin

Mouse muskelin was subcloned from pDNA3.1 construct containing a cDNA of full-length mouse muskelin (Adams et al., 1998). 5pg of pDNA3.1/MK and 5 pg of pEGFP-C1 plasmid (Clontech) were digested with 1pl of the restriction enzyme EcoRI (20U/pl; NEB) at 37°C for 3 hours. 1pl of alkaline phosphatase

(lU/pl; Roche) was added to the pEGFP-C1 digestion for the last hour. Both reactions were gel-purified on 1% agarose gel using Qiagen Gel Purification kit

(Qiagen) according to the protocol provided by the manufacture and the 2.3 kb

85 fragment of digested muskelin (insert) and the 4.7kb fragment of pEGFP-C1

(vector) were isolated.

Ligation and Transformation into Bacteria:

The concentration of the insert and vector were analysed on 1 % agarose gels, and then diluted in total volume of 17|il with 3:1 molar excess of muskelin insert to pEGFP-C1 plasmid. 2|Lil of 14 DNA ligase buffer and 1pl of 14 DNA ligase

(20U/pl; NEB) were added and the reaction were incubated for 1h at room temperature. 5pl of ligation reaction were gently mixed with 50pl of competent

BL21 E.co//and incubated 40 minutes on ice. The transformed BL21 E.co//were heat-shocked at 42°C for 2 minutes and 500pl of LB-broth were added. After 1 hour incubation at 37° on a shaker table, 150pl of bacteria were plated on LB- agarose plates containing lOOpg/ml of kanamycin. Next day, 10 single colonies were picked and grown overnight in 7ml of LB media containing lOOpg/ml of kanamycin. Plasmid from the ten colonies were purified using Qiagen Mini Prep kit (Qiagen) according to the protocol provided by the manufacture before analysis for the correct insert and its orientation by digestion using the restriction enzymes EcoRI and BamHI. cDNA of mouse muskelin contains a

BamHI restriction enzyme digestion site at nucleotide 1950-1955 and the pEGFP-C1 contains a BamHI restriction enzyme site in the MGS, which thus allows an analysis of the direction of insert using this enzyme. Six of the ten colonies contained an insert of similar size as mouse muskelin and four were in the correct orientation. A single colony was selected and analysed by manual sequencing for verification of the correct reading frame, in frame with the EGFP and for identification of mouse muskelin. Manual sequencing was performed on

5pg of purified plasmid with the pEGFP-C sequencing primer (Clontech) using

86 the Sequenase Version 2.0 DNA Sequencing Kit (USB) according to the protocol provided by the manufacture. The construct was full-length sequenced automatically at the Molecular Biotechnology Core at Cleveland Clinic

Foundation.

Cloning of MK-EGFP

The stop codon in full-length muskelin was excluding by PCR using following primers:

Forward: 5’-GATCATGAATTCGGGTGCTGACAAGATGGCG-3’ and

Reverse: 5-GACTGCAGAATTCGCAGTGTGATCAGGTC-3'. Both primers contains a EcoRI site. The reverse primer contains a mutation in the stop-codon of muskelin allowing C-terminal fusion to EGFP-N1. The PCR was achieved under the following conditions (Standard PCR reaction): 70ng of pEGFP-mMK plasmid (template), 0.1 pM forward primer, 0.1 pM reverse primer, 2 mM dNTP,

IxPCR reaction buffer containing MgC^, 2.5U Taq DNA polymerase in a lOOpI reaction with the following cycles: an initial extended denaturing at 94°C for 2 minutes, followed by 30 cycles: denaturing at 94°C for 1 minute, annealing at

55°C for 2 minutes, elongation at 72°C for 3.5 minutes and a final elongation cycle at 72°C for 10 minutes.

The PCR fragment at 2.3kb were isolated by gel purification and digested by restiction enzyme EcoRI for 3 hours at 37°C. Concurrently, 5pg of pEGFP-N1 were digested with EcoRI for 3 hours at 37°C, treated with alkaline phosphatase for 1 hours at 37°C and the purified on 1% agarose gel. The digested PCR fragment was ligated into the pEGFP-N1 according the procedure previous described. The 3’ terminal part of muskelin was manual sequenced for verification of correct reading frame with EGFP, using the EGFP-N sequencing

87 primer. Constructs were full-length sequenced automatically at the Molecular

Biotechnology Core at Cleveland Clinic Foundation.

Cloning of V5-MK

The V5-MK construct was previously prepared in the laboratory and was stable expressed in an insect expression system based on HighS cells. Briefly, the cDNA of mouse muskelin was subcloned in frame into the insect expression vector plZA/5-His (Invitrogen) and transfected into the insect cell line HighS using Lipofectin (Invitrogen) according the protocol provided by the manufacture. A stable cell line expressing V5-MK were selected using 500pg/ml

Zeocin grown in Insect Express Media (Invitrogen).

Cloning of EGFP-Muskelin Discoidin Domain

10pg of pEGFP-MK were double digested with 1pl of restriction enzyme EcoRV

(20U/pl; NEB) and 1|il of restriction enzyme Smal (20U/|il; NEB) in 1xNEB4

buffer for 2h at 25°C and 2h at 37°C, and gel purified on 1% agarose gel using

Qiagen Gel Purification kit (Qiagen) according to the protocol provided by the

manufacture. Both of these enzymes produce blunt ends. After purification, the digested 5.2 kb DNA fragment was ligated using 14 ligase as previously described. This construct was manual sequenced for a verification of the correct

religation using the EGFP-C sequencing primer, and full length sequenced automatically at the Molecular Biotechnology Core at Cleveland Clinic

Foundation.

Cloning of EGFP-MKA85

10^g of pEGFP-MK were digested with 1pl of restriction enzyme BamHI

(10U/pl; NEB) and gel purified on 1% agarose gel. The cDNA of muskelin

88 contains a digestion site for the BamHI restriction enzyme at nucleotide 1950-

1955 and the multiple cloning site of pEGFP-C1 plasmid contains a BamHI restriction enzyme site 12 nucleotides upstream of an internal stop codon. After purification of a 6.7 kb fragment, the digested DNA was ligated using T4 ligase as previously described. This construct was full length sequenced automatically at the Molecular Biotechnology Core at Cleveland Clinic Foundation.

Cloning of EGFP-MK Point Mutations Constructs

The triple mutation 1137AA/138A/P139A, and the two double mutations

G478S/G479S and Y488AA/495A in the pDNAS.I-mMuskelin backbone has previously been prepared in the laboratory. All three constructs were prepared by a similar method as to the EGFP-MK wild type construct by digestion with

EcoRI a nd I igated i nto t he p EGFP-C1 expression v ector ( Clontech). AII t hree constructs were full length sequenced automatically at the Molecular

Biotechnology Core at Cleveland Clinic Foundation.

Site-Directed Mutagenesis of Putative PKC Sites in Muskelin

Five putative protein kinase C phosphorylation sites, S44, S315, S350, S396,

T515, were changed to alanines (A) or aspartic acid (D) using the QuikChange

Site-Directed Mutagenesis Kit (Stratagene).

Table 2 enlists the PAGE-purified primers (synthesised by Sigma-Genosys) used for the mutations of the indicated amino acids, where the mismatched nucleotides are indicated in bold.

S44A Forward 5 ' - gtggataaaccaaatgatcaggcttccagatggtcttc - 3 '

Reverse 5 ' - gaagaccatctggaagcctgatcatttggtttatccac - 3 '

S324A Forward 5 ' - gagaaagagaatggtccggctgccagatcgtgtc - 3 '

Reverse s ' - gacacgatctggcagccggaccattctctttctc - 3 '

89 S350A Forward 5 ' - gccgttacttggattccgctgtgaggaacagc - 3 '

Reverse 5 ' - gctgttcctcacagcggaatccaagtaacggc - 3 '

S396A Forward 5 ' - catcagatgtgtatggacgcagaaaaacatatgatctacacg - 3 '

Reverse 5 ' - cgtgtagatcatatgtttttctgcgtccatacacatctgatg - 3 '

S515A Forward 5 ' -ggttccaatgaccggattcgcacagagagcaac- 3 '

Reverse 5 ' - gttgctctctgtgcgaatccggtcattggaacc - 3 '

S324D Forward 5 ' - gagaaagagaatggtccggatgccagatcgtgtc - 3 '

Reverse 5 ' - gacacgatctggcatccggaccattctctttctc - 3 '

Table 2. Oligonucleotides synthesised for site-directed mutagenesis of putative

phosphorylation sites, S44, S315, S350, S396, T515 in vertebrate muskelin. A indicate

mutation to alanine, D indicate mutation to aspartic acid. Letters in bold indicate position

of mismatched nucleotides.

The mutagenesis were performed under the following conditions: 100ng of pEGFP-mMK plasmid, 0.1-1pg forward primer, 0.1-1 pg reverse primer, Ipl dNTP mix, 1xreaction buffer, 2.5U Pfi/Turbo DNA polymerase in a 50pl reaction with the following reaction cycles: 1 cycle: denaturing at 95°C for 2 minutes, 14 cycles: denaturing at 95°C for 30 seconds, annealing at 55°C for 1 minute, elongation at 68°C for 18 minutes. Reactions were cooled to 4°C before adding

10U of Dpn I restriction enzyme and incubated for 1 hour at 37°C. All samples were transformed into competent E.co// (XL-1 Blue) following standard procedure as previously described. Ten clones of each construct were selected and grown in 7ml LB-broth overnight at 37°C for purification of plasmid using

Qiagen MiniPrep Kit (Qiagen). All clones were manual sequenced for identification of site-directed mutation and verified by full-length sequencing at the Molecular Biotechnology Core at Cleveland Clinic Foundation. The double

90 mutation of S324A/S515A was prepared as the S324A mutation using pEGFP-

MKS515A as template in the mutagenesis reaction.

Cloning and Purification of GST-Discoidin Domain

The discoidin domain was cloned from wild type muskelin with PCR using following primers: forward primer 5’-GCTACCTTAGGATCCATGGCGGCTGGC-

3’ and reverse primer 5’-ATGCATGGATCCTAGATATCAGGATCATCAATGC-

3’. The PCR was done under the standard conditions. PCR fragment was purified on 1% agarose gel and restriction enzyme digested for 4 hours using

BamHI restriction enzyme (NEB) at 37°C. pGEX-1 plasmid (Amersham) was enzyme digested for 4 hours at 3 7°C using 1pl of BamHI restriction enzyme

(20U/pl; NEB), and treated for 1 hours at 37°C with alkaline phosphatase before purified on 1% agarose gel. 100ng of digested plasmid was mixed with 200ng of digested insert and ligated using Quick-Ligase (NEB) for 30 minutes at room temperature and transformed into E.Coli-BL21 under selection by ampicillin

(100pg/ml). Ampicillin resistant clones were screened for expression of

Glutathione-S-transferase (GST)-fusion protein by SDS containing

Polyacrylamid gel electrophoresis (SDS-PAGE). 5ml of LB-broth was inoculated by 500pl of overnight cultures and grown for 2 hours, before induction for 3 hours by the addition of 5pl of 1M Isopropyl p-D-1 -thiogalactopyranoside (IPTG;

Sigma-Aldrich). Bacteria were harvested by centrifugation at 13.000 rpm for 1 minute and resuspended in 50pl of 2xSDS-PAGE sample buffer. Samples were resolved on 15% SDS-PAGE and analysed by Coomassie staining. A single colony with high expression of a protein with apparent molecular weight of 47- kDa was selected. This construct was designated GST-MKDD.

91 For large scale purification, the expression of GST-MKDD in E.Coli-BL21 was induced by treating 500ml of midlog liquid cultures (OD(600)~0.6-08) with

0.1 mM IPTG for 3 hours at 30°C. The cultures were harvested by centrifugation and resuspended in 20ml ice-cold Phosphate Buffered Saline (PBS) containing

IXprotease inhibitor cocktail (Roche) and lysed by sonication. After centrifugation, 1ml of 50% GST-conjugated agarose beads (Sigma) was added to the suspension and incubated for 4 hours at 4°G. The beads were collected by centrifugation and washed extensively in ice-cold PBS. GST-MKDD was eluted from the beads by resuspending in 7 5 0 |liI of lOmM reduced glutathione

(Sigma) for 1 hour on ice and the suspension was collected. Purified protein was dialysed overnight at 4°C against PBS. GST was purified by similar procedures. The concentration of protein in both samples was measured against a Bovine Serum Albumin (BSA; Sigma) standard on Coomassie stained

SDS-PAGE gels. Purified GST-MKDD and GST were kept at -80°C and gently thawed on ice when used.

Transient Transfection of Mammaiian Ceils

Cells for transient transfection were plated at 30-40% confluency and transfected with either SuperFect or PolyFect according to the protocols

provided by the manufacture (Qiagen). Briefly, cells transfected with SuperFect were plated overnight on coverslips in 4x10^ cells/35mm dish. 2pg of plasmid

DNA were diluted in lOOpI serum-free DMEM and 15pl SuperFect were added

and vortexed for 10 seconds. DNA and SuperFect were incubated for 10

minutes in order to form complexes. 1ml of DMEM containing 10% FCS were

mixed with the complexes and added to the cells. For transient transfection

using PolyFect, cells were plated at 8x10^ cells/60mm dishes. 2.5pg of plasmid

92 DNA were diluted in 150|jl serum-free DMEM. 15pl PolyFect were added to the

DNA solution and mixed by vortexing for 10 seconds. Samples were incubated for 10 minutes at room temperature. Cells were washed once in PBS and 3ml

DMEM containing serum were added. 1ml serum-containing DMEM were mixed with the transfection samples and added to the cells.

Immunfluorescent Stainings

Cos-7 cells were plated and transfected as describe above for 36 hours on coverslips. Coverslips were washed once in PBS before fixation in 3.7% formaldehyde in PBS for lOmin. After washing in PBS, cells were permeabilised using 0.5% TritonX-100 in PBS for 10 minutes. For immunofluorescent stainings of pericentrin, cells were fixed and permeabilised with 100% ice-cold methanol for 10 minutes at -20°C. Primary antibodies or TRITC-conjugated phalloidin was diluted in PBS in final concentration as listed in Tabel 3. 30pl of the diluted antibody was added to the 12mm coverslips and incubated for 1 hour at room temperature. Secondary antibodies were diluted in PBS for final concentrations as listed in Tabel 3. Coverslips were washed in PBS before incubated for 1 hour with secondary antibodies. After washing extensively in PBS and once in water, coverslips were mounted on glass slides in 3pl Vectashield (Vector

Laboratories) containing 4'6-diamidino-2-phenylindole-2HCI (DAPI) for fluorescent nuclear staining and allowed to airdry overnight.

Samples were examined by epifluorescence using a Leica DM LB microscope equipped with 100X/1.30 PH3 oil immersion objective. Images were captured using cooled charged-coupled device camera (Optronics) and MagnaFire digital imaging capture software (Optronics) at 1280x1024 pixels resolution and processed using Adobe PhotoShop 7.0. Confocal images in XY and Z were

93 obtained using a BioRad MRC-1000 laser confocal microscope equipped with

63x oil immersion objective.

Antigen Fixation/permeabilisation Antibody 1 ° cone 2° cone

Cy3-conjugated 3.7% formaldehyde/ TUB-2.1 1:200

p-tubulin 0.5% TritonX-100 (Sigma)

y-tubulin 3.7% formaldehyde/ GTU-88 1:200 1:50

Mouse 0.5% TritonX-100 (Sigma)

Pericentrin Methanol PRB-432C 1:200 1:50

Rabbit (Covance)

Texas-Red 3.7% formaldehyde/ (Sigma) 1:100

Phalloidin 0.5% TritonX-100

FITC-V5 3.7% formaldehyde/ (Abeam) 1:25

0.5% TritonX-100

Tabel 3. Concentration and fixation methods of various antibodies used in immunofluorescence stainings.

D/7 staining, TP A Treatment, and Proteasome Inhibition

Untransfected or transfected Cos-7 cells were trypsinised, pelleted by centrifugation, and resuspended in 1.T-dihexadecyl-3,3,3',3'- tetramethylindodicarbocyanine perchlorate (Dil)-staining solution (5pg/ml PBS;

Molecular Probes) and incubated for 10 minutes at 37°C. After extensively washing in DMEM containing 10% FCS, cells were plated on coverslip. 12 hours after Oil staining, cells were either fixed (transfected cells) or transfected with expression plasmids using Polyfect procedures and examined microscopically 24 hours after transfection.

Cos-7 cells were plated on coverslips at 4x10"^ cells/cm^ in 6 multiwell plates and transfected with indicated expression plasmids using Polyfect as transfection agent. 36 hours after transfection, cells were treated with 12-0-

Tetradecanoylphorbol 13-acetate (TPA), 4-a-phorbol, or dimethyl sulfoxide

94 (DMSO) as solvent control, for indicated concentrations and periods of time at

37°C. After washing in PBS, cells were fixed in 3.7% formaldehyde in PBS for

10 minutes at room temperature and washed in PBS and water before mounted on glass slides using Vectashield. From each coverslip, >100 transfected cells were counted and the percentages of cells containing spots were estimated.

For proteasome inhibition, transfected cells were treated with lOOpM Z-lle-

Glu(OtBu)-Ala-Leu-CHO (PSI, Calbiochem) or 5 pM (Benzyloxycarbonyl)-Leu-

Leu-phenylalaninal (Z-LLF-CHO, Calbiochem) for 1, 6 or 12 hours at 37 °C.

Real-Time Epifluorescence Imaging

Cos-7 cells were plated and transfected as describe above for 36 hours on coverslips. For real-time imaging as Z stacks of XY images, cells were monitored over 1 hour periods in a CO 2- and thermostatically-controlled chamber mounted on an inverted Leica DM IRE2 epifluorescence microscope with a motorised stage. Cells were viewed with a 63x/1.30 oil immersion objective and images captured with a cooled charged-coupled device camera

(Hamamatsu Orca) with an automatic shutter and focusing drive, using a 1pm stepsize for the Z stack. Images at 640x512 pixels resolution were captured by time-lapse recording software (Improvision Openlab 3.0.9) at 1 minute intervals.

Movement distances and velocity were measured in OpenLab by manually tracking individual particles on calibrated images. Statistics were calculated using Excel spread sheets. Image sequences were processed in Adobe

ImageReady for QuickTime movie presentation.

Puii Down Experiments and Immunoblotting

Confluent 100mm tissue culture plates of 293T cells were transiently transfected as described with the constructs indicated and lysed in 1ml of lysis-

95 buffer containing 1% Triton X-100, 1% glycerol, 150mM NaCI, 50mM Tris.HCI, pH 7.5 and protease inhibitor cocktail (Roche). Lysates were centrifuged for 10 minutes at 13.000 g and lOpg of either GST-MKDD or GST were added together with 50 pi of 60% glutathione-sepharose 4B beads (Amersham) in lysis buffer. Samples were incubated with rotation overnight at 4°G. Beads were pelleted by centrifugation and washed 5 times in 1ml of lysis buffer and resuspended in 40pl of SDS-PAGE sample buffer containing lOOmM dithiothreitol (DTT), lOOmM Tris.HCI pH 6.8, 4% SDS and 20% glycerol and resolved on 12.5% SDS-polyacrylamide gels. In experiments to determine protein expression levels, whole cell lysates were prepared from transfected cells by lysis in 2XSDS-PAGE sample buffer containing lOOmM DTT. All samples were separated on 10% SDS-PAGE gels and transferred electrophoretically to polyvinylidene difluoride (PVDF) membranes (Millipore) in buffer containing 25mM Tris.HCI pH 9.4, 40mM glycine, and 10% methanol by semidry blotting (Owl Separation Systems). Blots were blocked in 0.5% casein or 2.5% non-fat milk before being probed with JOD-2 rabbit antibodies to muskelin (1:100; Adams et al., 1998), mouse monoclonal antibody to GPP

(1:5000; Roche), Protein Kinase C (PKC) antibody sampler kit (BD Bioscience), or mouse monoclonal antibody to y-tubulin (clone GTU-88, Sigma) followed by

ECL detection using alkaline phosphatase (AP)-conjugated secondary antibodies (Tropix) and Hybond ECL film (Amersham).

For the GFP pull-down assays, 293T cells were transiently transfected with

EGFP-MK or EGFP as previously described. Lysates was incubated with 5pg of mouse monoclonal antibodies against GFP (Roche) for 1 hour, before addition of 50 pi of 60% protein A-sepharose 4B beads (Amersham) in lysis buffer and incubated with rotation for 2 hours at 4°C. Beads were pelleted by centrifugation

96 and washed 5 times in 1ml of lysis buffer before resuspended in 293T cell lysates transiently transfected with pcDNA3.1-MK as previously described and further processed similar to the GST pull-down assay. The V5-tagged muskelin was precipitated using 50pl of 50% nickel-charged resin beads (Qiagen) in lysate buffer from HighS cell lysates previously prepared in the laboratory and resuspended in 293T cell lysates transiently transfected with EGFP-MK and further processed similar to GST pull down assays.

2-Dimensional Isoelectric Focusing and SDS-PolyAcrylamide Gel

Electrophoresis

SMC cells or 02012 cells were lysed in buffer containing 0.5% SDS, 25mM

Tris-HOI, pH 7.8, 2.5mM MgOb with protease inhibitor cocktail (Roche), phosphatase inhibitor mix (Sigma), 10 pg/ml DNase I (Sigma) and 10 pg/ml

RNase I (Sigma). Protein concentration was quantified using Advanced Protein

Assay Reagent (Oytoskeleton) according to the protocol provided by the manufacture. Protein was precipitated with 6 volumes of acetone and resuspended at a final protein concentration of 4 mg/ml in a solution containing

7M urea, 2M thiourea, 1% DTT, 1% OHAPS, 1% ampholytes (pH 3-10, BioRad) and 1% Triton X-100. 750pg or 75pg of each sample was added to Immobilized pH Gradient (IPG) strips, (11cm, pH 5-8, BioRad), and allowed to actively rehydrate overnight using the PROTEAN lEF system (BioRad) run at 50V. IPG- strips were then focused for 24 hours (>63,000 Vhr). After focusing, the IPG- strips were reduced for 15 minutes in lOOmM DTT in equilibration buffer (6M urea, 20% glycerol, 2% SDS) and alkylated for 15 minutes in lOOmM iodoacetamide in equilibration buffer before being processed in the second

97 dimension using Crition 10% IPG precast gels (BioRad), followed by immunoblotting as described above.

For the activation of PKC, SMC or C2C12 cells were treated with lOOnM TPA,

50nM bisindolylmaleimide I (BIM; Calbiochem), or both simultaneously for 1 hour at 37°C prior addition of lysis buffer. Endothelin-1 treatment was performed by serum-starving SMC for 2 hours prior the addition of 50nM Endothelin-1

(Sigma) diluted in DMEM for 15 minutes at 37°. The treatment with Endothelin-1 was terminated by washing in PBS and addition of lysis buffer to the cells.

Proteinase K Proteolysis Assay

Mouse muskelin or MKY488AA/495A in the pcDNA3.1 vector were transcribed and translated in vitro using the eukaryoticTNT-Coupled Reticulocyte Lysate

System (Promega) with Transcend Non-Radioactive Translation Detection for biotinylated lysine (Progema) according to manufacturer’s instructions. Equal aliquots of the translated protein were mixed with 0-50pg of Proteinase K

(Fisher Scientific) in proteolysis assay buffer (20mM Tris.HCI pH 7.4, 80mM

KAc, ImM MgAc and 2mM cycloheximide) and incubated for 10 minutes on ice.

Proteolysis was terminated with ImM phenylmethanesulfonyl fluoride (PMSF;

Sigma) and 4x SDS-PAGE sample buffer. Samples were separated on 10% polyacrylamide gels under reducing conditions and processed for western blotting as previous described, using horse radish peroxidase (HRP)-conjugated

Streptavidin (1:7500, Promega) to detect biotin-lysine or JOD-2 antibodies

(1:100, Adams et al., 1998) to detect the amino-terminus of muskelin.

98 Chapter 3

Bioinformatic Analysis of Muskelin

Introduction

Bioinformatics is a novel field combining computer science, information technology, statistics and biology. Over the last decade the field of bioinformatics has expanded enormously, with public and searchable databanks of published scientific literature as well as databanks encompassing a broad range of biological data including gene and protein sequences, gene expression patterns and protein structures. I was interested to apply bioinformatic methods to the study of muskelin. First, I discuss here the various methods applied to studies of proteins and proteomes.

Bioinformatics Applied to Single Proteins

The use of bioinformatics allows you to identify previous uncharacterised genes and proteins, and can provide various predictions of domain architecture, cellular distribution, regulation and function from comparative analysis and multiple sequence alignments. Comparative alignments can suggest orthologues of a known protein in other species, and also identify known domains and motifs within families of related proteins. This allows predictions not only of cellular localisation, regulation and putative interacting proteins, but also the three-dimensional structure. Sander and Schnieder quantified that when two sequences have at least 25-30% homology for a length of 80 residues or more, there was a very high probability that the two sequences have the same three dimensional structure within the matching regions (Sander and

Schneider, 1991; Ouzounis et al., 1993). As the biological function of a protein

99 is dependent on the three-dimensional structure, comparisons of protein orthologues between species identify conserved regions that are important for structure and therefore can raise suggestions for the protein function. Large proteins have a modular architecture of smaller recognisable domains, which occur in other protein and therefore allow subdivision of proteins into domain families. As each individual domain can have individual functions, multidomain proteins combine individual functions and the specific combination of domains can enable predicts of its precise function. Mutations occur randomly throughout the whole sequence, but will only be passed on to the next generation if they occur in regions that do not substantially disrupts the natural function of the protein.

In parallel, the contrast to divergent evolution, in which two individuals (i.e. structures) derives from a single ancestor and become more and more dissimilar, is convergent evolution, where two individuals (i.e. structures) become more and more similar in appearance (orfunction) as they adapt to specific functional constraints. For instance, the p-propeller structure is a widespread and universal protein domain fold, as revealed from crystal structures of diverse proteins (Fulop and Jones, 1999; Jawad et al., 2002;

Andrade et al., 2001a). Many unrelated primary sequences could adopt the stereotypical topology of a p-propeller; however, several sequence repeat motifs have been identified in relation to known crystal structures that are predictors of propeller domains. These include the WD motif, the regulator of chromosome condensation 1 (RCC1) motif and the tachylectin-2 repeat and the kelch-repeat. This indicates a convergent evolution, where the stability in a certain structural motif can be adapted from diverse sequences.

100 Comparative Bioinformatics on Whole Genomes

The power of bioinformatics has been greatly increased by the completion of genome sequencing projects that allows the entire predicted proteome of an organism to be analysed. The completion of the Humane Genome Project

(Lander et al., 2001) was the first step, but with the completion of the Mouse

Genome Project (Waterston et al., 2002) the opportunity of comparative analysis with another mammal gave key information for a better understanding of the human per se. Predictions and assignment of novel genes has been the biggest task after completing the human genome and the comparison between human and mouse genomes has strengthened the predictions of assigning coding and non-coding regions. Currently (GeneSweep, May 2003), the human genome contains an estimated 24.847 transcripts and some predict that the final number will be in the range of 30.000 (Waterston et al., 2002). It has been shown that 99% of the mouse genes have an identifiable homologue in the human genome. From the estimated 22.011 genes in the mouse genome, 72% of the proteins contained at least one known domain (Waterston et al., 2002).

As both genomes contain several examples of putative gene duplications giving rise to gene families, so far only 12.845 of the mouse protein coding genes have a single orthologue in the human genome.

Synonymous and Nonsynonymous Substitution Rates

Comparing protein orthologues by multiple sequence alignment between human and mouse allows quantitative assessments of Darwin’s theories of positive selection. Two numbers, which can be quantified from these alignments, are of great interest: the percentage of amino acid identity and the rate of nucleotide substitutions causing a change in amino acid compared to the rate of nucleotide

101 substitutions causing no change in amino acid (silent mutations). This ratio between the two rates is also known as non-synonymous mutations per non­ synonymous site (Ka) compared to synonymous per synonymous site (Ks). A change in amino acid will be subject to selection, whereas a silent mutation is considered neutral, so the Ka/Ks reflects the amount of selection imposed on a certain gene or region within a gene. Highly conserved genes or regions with few changes in the amino acid will have low KA/Ks-values, whereas genes or regions within g enes u nder I ess s election which can a How c hanges i n a mino acids without disturbing the function of the protein, will have higher KA/Ks- values. If the rate of non-synonymous substitutions exceeds the rate of synonymous substitutions, a Darwinian positive selection or advantageous substitution has occurred. This has been shown for the antigen recognition site in the major histocompatibility complex class I and II (Hughes and Nei, 1988;

1989), where a comparison between the non-synonymous substitution rate in the nucleotide coding the putative antigen recognition site was higher then the synonymous substitution rate, in contrast to other regions in the genes coding major histocompatibility complex class I and II genes in both human and mouse.

Comparing the 12.845 mouse-human orthologues, the overall median amino acid identity is 78.5% and the median KA/Ks-value is 0.115. When comparing only the domain-containing regions (as identified in the Protein Families database of alignments and hidden Markov models (Pfam, Bateman et al.,

2002) and Simple Modular Architecture Research Tool (SMART, Schultz et al.,

2000, Letunic et al., 2002) the median amino acid identity is 93.5% and the median KA/Ks-value is 0.061, whereas regions showing no homology to any known domain structure gives a median amino acid identity of 71.1% and a

102 median KA/Ks-value of 0.155 (Waterston et a!., 2002). These data emphasises that selection and conservation occurs on independent domains within regions of individual proteins and strengthen the hypothesis that proteins are build as discrete units.

Rationale of My Studies

An ultimate goal in bioinformatics is the prediction of cellular functions and regulation solely from the nucleotide sequence of novel proteins. One approach for this purpose involves an identification of domains based on sequence alignments of orthologues and related proteins. This allows a classification of related protein into families and suggestions for putative function and regulation.

In considering how the function of muskelin might be related to its structure, I wished to obtain a clearer view of its domain architecture.

Previous work in the laboratory has identified and cloned muskelin orthologues in Homo sapiens (nucleotide id: AF047489; Adams and Zhang, 1 999), Danio rerio (nucleotide id: AF418017, Zaromytidou and Adams, direct submission).

Drosophila melanogaster (protein id: CG8811/AAL38905; Adams, 2002), and additionally identified a muskelin-like protein in Schizosaccharomyces pombe

(protein id: NP_594297; Adams et al., 1998, Adams, 2002). On the basis of these known orthologues, I wanted to identify more species orthologues and analyse the conservation between species to identify the composition of domain architecture. Secondly, comparison to related proteins with similar domains and the larger kelch-repeat superfamily were made to obtain indications of putative function and regulation of muskelin. Results from this chapter and Chapter 5 form the basis of a paper in preparation.

103 Results

Sequence Alignment of Muskelin Orthologues

In order to identify muskelin orthologues in other species, the protein sequence of Mus musculus muskelin was used a query for sequence searches using the protein-based Basic Local Alignment Search Tool for Proteins, BLASTP

(http://www.ncbi.nlm.nih.gov/BLAST). Besides the known orthologues in Homo sapiens and Danio rerio, this search identified orthologues in Rattus norvegicus

(protein id: NP_112649), Anopheies gambiae (protein id: EAA09711), and a muskelin-like protein in the microsporidia Encephaiitozoon cunicuii (protein id:

NP_584592). Sequence alignment between the Mus musculus and Homo sapiens, and Mus musculus and Rattus norvegicus muskelin orthologues shows

98% and 99% identity respectively, when comparing the full-length muskelin

(Figure 1 and Table 1). Recently a second version of Homo sapiens muskelin

(protein id: Q9UL63) was released in the Human Genome Database

(http://www.ncbi.nlm.nih.gov/genome/seq/HsBlast.html) showing few changes in the nucleotide sequences compared to the muskelin orthologue cloned and sequenced in our laboratory. This second version of Homo sapiens muskelin shows 99% identity compared to Mus musculus muskelin and probably reflects

natural single nucleotide polymorphisms within the muskelin coding sequence.

All the work cited in this work refers to the Homo sapiens muskelin identified in our laboratory. The Danio rerio muskelin orthologue (AF418017_1) recently identified and sequenced in the lab shows 89% sequence identity to Mus musculus muskelin (Figure 1 and Table 1). This high conservation between muskelins from Homo sapiens, Mus musculus, Rattus norvegicus, nd Danio a rerio suggests a unique and important function, but also restricted the ability to

104 f------N-terminal domai Human

1 zebra fish 1 Fruit fly 1 msssstssptdnssh |:sH l ^salnpslnpvsgtntssqkpfslsep 1:î Mosquito S .pombe Microspodia MLRNRRLY

Domain N-terminal domain/Discoidin domain ------<------MIND-...... --)---- 1 73 Mouse 73 Rat 73 zebra fish 67 Fruit fly 105 Mosquito 60 S .pombe 69 Microspodia 67 1 i " JEMWFGkT

Domain -Intervening region/LisH and CTLH 1 r------Kelch d o m a i Human 180 180 180 zebra fish 174 Fruit fly 214 Mosquito 168 #TP S .pombe 179 PKDN Microspodia 170 -----

Doma in Kelch domai Human 290 290 290 zebra fish 264 Fruit fly 318 Mosquito 273 S .pombe 282 MtNEYG-- Microspodia 269 WiF.INIGTR

Domain Kelch domain-- 398 398 398 zebra fish 392 Fruit fly 426 Mosquito 381 S .pombe 388 Microspodia 365 J:. : 'RSTEKEDTYGGLYMFSLGGAFPNYGKEAFSVHHGVEVEKGASRLSEECEACENNPLGRSGQSSGSTVELDL:ÿJ|.n|HEGEE§SNSPV®| t| ---

Domain Kelch domain- Human 435 435 435 zebra fish 429 Fruit fly 462 .’KEHPHHGTIA r; Mosquito 416 ..-REN------S .pombe 416 EAVQHVRKN------Microspodia 472 p p d y i e g s s v B ' B ' I B .'PFP------|DS

Doma in Kelch domain Human 524 524 524 zebra fish 518 Fruit fly 563 ISINNMNi:,- Mosquito 511 I 1 ------NGEN S .pombe 506 ;RN 1. ...H ------[ 01 ;:;r e g n - - e h p c s r y v f g k j .n h l t l n f r Microspodia 564 FCYA-'AB.fli-ffl 1 |l VKPRVEGG------Vfuft v| ertdî

Doma in C-terminal 621 621 621 zebra fish 615 Fruit fly 673 HDGNLDRECNTPALCADADDLETDSDVS-AKIKEELADG-- Mosquito 607 [55 BBSs :[3aiK^Hl^HsDTVDGETEPTSLLTCPLASTASASAPVGGNKHKQQRADDDG S.pombe 609 LDRWGIKH . RQtlJ^FWNALlLDA Microspodia 648 EgK@-TQ LYL d^ r--ei J|JfJch

Domain C-terminal region

Mouse 686 Rat 688 zebra fish 682 Fruit fly 779 SCTKRAR--TDCTPELKNh Mosquito 716 DCSQNQPQQAIDQOFEMKI S .pombe 677 n k I i a q p pB k t f q t i B If- o jn v Microspodia 713 RRlJVDF’iBDICRL

Figure 1. Multiple sequence alignment of Homo sapiens (Human; NP 037387 ), Mus musculus (mouse; NP_038819), Rat norvegicus (Rat; NP_112649), Danio rerio (Zebra fish; AAN32664), Drosophila melanogaster (Fruit fly; NP_610801), Anopheles gambiae (Mosquito; EAA09711), Schizosaccharomyces pombe (S.pombe; NP_594297), and Encephaiitozoon cunicuii (microspodia; NP 584592). Sequence similarities are coloured using BoxShade and displayed as black boxes (identical) and grey boxes (homology).

105 Table 1 Percentage amino acid sequence identity between various regions within muskelin

Specie N-terminal Intervening Kelch C-terminal region region repeats region (aa14-158) (aa171-249) (aa250-624) (aa625-735)

Compared to Mus musculus muskelin

Homo sapiens 98 98 97 99

Rattus norvegicus 99 100 99 100

Danio rerio 87 88 91 86

Drosophila melanogaster 56 47 46 38

Anopheies gambiae 63 43 47 42

Schizosaccharomyces pombe 46 14 29 10

Encephaiitozoon cunicuii 43 19 29 20

Compared to Drosophila melanogaster

Anopheies gambiae 65 56 56 37

Schizosaccharomyces pombe 44 11 27 8

Encephaiitozoon cunicuii 41 11 25 20

Compared to Schizosaccharomyces pombe

Encephaiitozoon cunicuii 39 11 21 12

106 predict the importance of various regions or identification of critical residues and segments within muskelin, because of the great similarity between all the polypeptides along their length. Therefore, I focused on extending the comparisons to the invertebrate muskelins.

Alignment of the previously cloned Drosophila melanogaster muskelin and Mus musculus muskelin showed 46% sequence identity overall (Adams, 2002).

Closer evaluation of the sequence alignment between Mus musculus and

Drosophila melanogaster muskelin showed different levels of conservation between various regions within muskelin. An N-terminal region covering Mus musculus muskelin 14aa-158aa aligned with Drosophila melanogaster muskeWn

46aa-192aa showed 56% and an intervening region covering Mus musculus muskelin 171aa-249aa and Drosophila melanogaster muskelin 206aa-296aa showed a homology of 47%. Alignments of the previous identified kelch-repeats in Mus musculus muskelin (aa250-624) and Drosophila melanogaster muskelin (aa284-675) showed 46% homology (Adams, 2002).

The Drosophila melanogaster muskeWn orthologue contains an additional 108 amino acid insert in the C-terminal region. These data indicated two highly conserved regions in mouse muskelin, namely the very N-terminal region covering aa14-158 and the kelch-repeat domain (aa250-624).

To strengthen the comparisons, I took advantages of the recently published genome sequence at Anopheles gambiae (Holt et al., 2002). A BLASTP search using the highly conserved amino terminal region of Mus musculus muskelin identified an orthologue in Anopheles gambiae (protein id: EAA09711). The predicted open reading frame of the muskelin orthologue in Anopheies gambiae encodes a 797 amino acid long polypeptide with a 48% identity to Mus muscuius muskelin (Figure 1), and with a 53% identity to the Drosophiia

107 melanogaster muskeWn. A further analysis between the Anopheles gambiae and

Drosophila melanogaster muskelin orthologues was carried out. Sequence alignment of the N-terminal regions of Drosophila melanogaster and Anopheles gambiae muskelin showed 65% homology. The Anopheles gambiae muskelin lacks 45 amino acids in the N-terminal region, which could possibly be due to an error in the predictions of regions encoding the Anopheles gambiae muskelin orthologue, as the cDNA identified in Anopheles gambiae genome encoding the muskelin orthologue lacks the starting ATG-codon. The coding sequence in the

Anopheles gambiae genome encoding the muskelin orthologue has so far only been predicted using conceptual translation software. The intervening region and the kelch repeats of Drosophila melanogaster and Anopheles gambiae muskelins both showed 56% sequence homology. Similar to Drosophila melanogaster, Anopheles gambiae muskelin contains the additional 108 residues long insert in the C-terminal region, and a sequence alignment of this region in Drosophila melanogaster and Anopheles gambiae showed 37% identity.

The whole genome sequence of the two fungi Encephaiitozoon cunicuii

(microsporidia) and Schizosaccharomyces pombe have been published

(Katinka et al., 2001 ; Woods et al., 2002). Two proteins with strong homology to muskelin are identified in Encephaiitozoon cunicuii (microsporidia) and

Schizosaccharomyces pombe. Sequence alignment between Mus musculus muskelin and the muskelin-like protein from Schizosaccharomyces pombe shows 28% identity when comparing full-length muskelin (Table 1). Sequence alignments of the four regions previously described shows that the N-terminal region of Schizosaccharomyces pombe muskelin-like protein has 46%

108 sequence identity to Mus musculus muskelin (Table 1). The intervening region shows a low sequence identity between Mus musculus muskelin and

Schizosaccharomyces pombe muskelin-like protein with an estimated 14% sequence identity. The kelch repeats shows 29% identity between Mus musculus muskelin and Schizosaccharomyces pombe muskelin-like protein, and the C-terminal region shows 10% identity (Table 1). Sequence alignment of

Mus musculus muskelin and in Encephaiitozoon cunicuii muskelin-like protein shows 43% sequence identity in the N-terminal domain and a 19% identity in the intervening domain (Table 1). The kelch repeats shows 29% sequence identity, which includes a 73 amino acid long insertion in the Encephaiitozoon cunicuii muskelin-like protein. The fact that galactose oxidase is a seven bladed p-propeller and the mouse muskelin is a predicted six bladed propeller raised the question whether the additional 73 amino acids could possibly assemble a seventh blade in the p-propeller structure the Encephaiitozoon cunicuii muskelin-like protein. The size of the predicted six kelch-bladed p-propeller of mouse muskelin is 375 residues. The seven bladed p-propeller in galactose oxidase is 375 amino acids. So, because of the equal number of residues but the difference in number of blades, the homology score in the alignment is lowered to only 8%. Comparison of the seven kelch repeats in Galactose oxidase and the 405 residues covering the predicted seven kelch repeats in microsporidia muskelin gives an sequence identity of 9%, compared to a 30% identity between the kelch repeats in mouse and microsporidia muskelin.

Sequence alignment of the Encephaiitozoon cunicuii muskelin-like protein and

Mus muscuius muskelin shows that the additional kelch repeat in

Encephaiitozoon cunicuii muskelin-like protein is inserted between the second and the third kelch repeat. A seven-bladed kelch domain for muskelin has so far

109 only been identified in Encephaiitozoon cunicuii. The C-terminal region of

Encephaiitozoon cunicuii muskelin-like protein lacks 26 amino acids when compared to Mus musculus muskelin and shows 20% identity to Mus musculus muskelin. A sequence alignment between in Encephaiitozoon cunicuii muskelin- like protein and muskelin-like protein in Schizosaccharomyces pombe shows

20% identity between the two, indicating that the muskelin-like protein in

Encephaiitozoon cunicuii is as diverse from muskelin-like protein in

Schizosaccharomyces pombe as to Mus musculus muskelin.

The overall median amino acid identity between all identified Mus musculus and

Homo sapiens orthologues is 78.5%, and similar for Drosophila melanogaster and Anopheles gambiae the overall sequence identity for orthologues is 56%

(Zdobnov et al., 2002; Waterston et al., 2002). This shows that muskelin is among the highest conserved proteins when compared to other Mus musculus and Homo sapiens orthologues, but within normal conservation when compared to other Drosophila melanogaster and Anopheles gambiae orthologues.

Furthermore, from these sequence alignments, I also conclude that muskelin and muskelin-like proteins contains two highly conserved regions, namely an N- terminal region and a kelch-repeat domain, whereas the intervening and the 0- termianl region show lower conservation.

For further identification of muskelin orthologues, searches of databases of expressed sequence tags (dbEST (TBLASTN), Boguski et a I., 1 993) with full length Mus musculus muskelin sequence was performed. These searches identified putative muskelin orthologues in Gallus gallus, Canis familiaris, Bos taurus, and Silurana tropicalis (for GeneBank numbers see materials and

110 methods). For a further analysis of sequence homology between these orthologues, the ESTs identified were translated to their putative amino acid sequences and aligned to Mus musculus muskelin in order to identify the equivalent regions. Sequence alignments of these orthologues showed an estimated 95% identity for Gallus gallus muskelin, 94% identity for Canis familiaris muskelin, 97% identity for Bos taurus muskelin, and 92% identity for

Silurana tropicalis muskelin. As shown in Figure 2, the ESTs only covered segments within muskelin, and when compiling ESTs for each species, none of the four orthologues covered a full-length muskelin. Database searches using full-length muskelin or the highly conserved N-terminal domain in the

Saccharomyces cerevisiae genome (http://../Genome/YeastBlast.html), the

Caenorhabditis elegans genome (http://../Genome/NematodeBlast.html) and the

Arabidopsis thaliana genomes (http://../Genome/ara.html) identified several kelch-containing proteins, but none of them showed any identity as muskelin or muskelin-like proteins.

Conservation of Domains CalculatedK bya/K s Ratios

As an alternative method for evaluating the degree of sequence conservation, the rate of non-synonymous versus synonymous substitutions was calculated.

This method is based on an evaluation of nucleotide sequences of the coding regions and estimates the rate of mutations causing a change in amino acid as compared to silent mutations, i.e. the third nucleotide in the codon. If certain domains have been subject for strong selection, the Ka/Ks ratio will accordingly be low (K a /K s =0.015-0.178), whereas regions of less importance will have calculated values of Ka/Ks =0.03-0.275. The most commonly used method now is the recently described method by Yang and Nielsen (Yang and Nielsen,

111 mouse 1 t'YALHKWastSSTÏLl'ENiLVUKFNUUSSKWSSESNVPfyYHL.KLtSKl'AivgBITEGKYEKTHVCNLKl' : cow 1 chicken 1 iTfüKYEKTHVüNLKKtKVtüGMNEENMi dog 1 IT FGKYEKTHVCNLKKFKVFGGMNEENMTI Frog 1 ITFGKYEKTHVCNLKKFKVFGGMNEENMTi mouse 101 SSÜLKNUYNKETFTI 1 chicken 91 3SGLKN[)YNKETFTLKHKIDEQMFPCRFIKIVPLLSWGPSFNFSIWYVELBg IDDPDW0PCLNWYSKYREQEAIRLCLKHFR0HNYTEAFESL0KK"i dog 72 ssglkndynketftlkhkideqmfpcrfikivpllswgpsfnfsiwyve J giddpdwqpclnwyskyreqeairlclkh ______Frog 97 ^SGLKNPYNKETFTLKHKIDEOMFPCRFIKIVPLLSWGPSFNFSIWYVîMillfPDWQPCiro-IYSKYREQEAIRLCLKHFROHNYTEAFESLOKK'i

mouse 201 H'IBimi W r1sH 3 B ~ KüUüELiNKPUMKÜGHyMV 1 UV .jyüWUG'i'yuiAUi- 1 jFGGWDGTODLAPr chicken 191 dog 155 KG,: g . :.r PüMKüühynviuvysEMVYLFGüwuü'ryuLAUh Frog 197 KGDGEDNRPGMRGGHOMVIDVySEMVYLFGGWDGTODLAPl

mouse 301 hyiYTEGh :. : . ;

mouse 701 jHTYAQRTQLFDTLVNFFPDSMTPPKGNLVDLITI cow 266 iHTYAQRTQLFDTLVNFFPDSMTPPKGNLVDLITI chicken 476 jHTYAQRTQL FDTLVN FFPE§MT PPKGNLVDLITI dog 461 ■HTYAQRTOLFDTLVNFFPDSMTPPKGNLVDLITl Frog

Figure 2. Sequence alignments o f compiled ESTs from B os tharus (Cow), Gallus gallus (Chicken), Canis familiaris (Dog), and Silurana tropicalis (Frog). For all the accession numbers see materials and methods. Individual ESTs were translated to corresponding amino acid sequences and aligned against mouse muskelin using ClustalW and coloured using BoxShade and displayed as black boxes (identical) and grey boxes (homology).

112 2000) and was used for my evaluations. As the target for my analysis was to identify conserved regions within muskelin, I compared four regions within muskelin from all species as compared to Mus musculus muskelin, rather then comparing conservation of individual domain between all species, as the diversity from various vertebrates to Mus musculus is much less compared to the diversity between Mus musculus and Drosophila melanogaster muskeWns.

The nucleotide and amino acid sequences encoding the N-terminal region

(aa14-159), the intervening region (aa171-249), the kelch repeats (aa249-624) and the C-terminal region (aa625-735) of muskelin was compared between

Homo sapiens and Mus musculus and the Ka/Ks value for each region was calculated. The N-terminal regions of Homo sapiens and Mus musculus muskelin have a Ka/Ks value of 0.0206 and an amino acid sequence identity of

98% (Table 2). These values indicate not only a very high conservation protein sequence but also a high conservation in the nucleotide sequence. The intervening regions of Homo sapiens and Mus musculus muskelin has an amino acid identity of 98% and a Ka/Ks value similar to the N-terminal region of

0.0207, whereas the kelch repeats shows 97% homology and a calculated

Ka/Ks value of 0.0193. The C-terminal regions of Homo sapiens and Mus musculus muskelin have a Ka/Ks value of 0.0249 and an amino acid identity of

99%. In general, the nucleotide sequence of Homo sapiens and Mus musculus is very highly conserved, indicating that muskelin has been under a very strong selection (in contrast to positive selection) during evolution from mouse to man.

The kelch repeats shows the highest degree of conservation with a Ka/Ks value of 0.0193 and the C-terminal regions has the lowest degree of conservation with a K a / K s of 0.0249.

113 Table 2. Percentages of amino acid identity and Ka/Ks ratios as compared to Mus musculus muskelin

Specie N-termIr a! region Interveni ng region Kelch repeats C-termin al region

(aa1^t-158) (aa17 1-249) (aa250-624) (aa62 5-735)

%aa Dn/ds %aa Dn/ds %aa Dn/ds %aa Dn/ds

Homo sapiens 98% 0.0206 98% 0.0207 97% 0.0193 99% 0.0249

Rattus norvegicus 99% 0.0373 100% N/A 99% 0.0208 100% N/A

Danio rerio 87% 0.0432 87% 0.0737 91% 0.0326 86% 0.0636

Drosophiia 56% 0.1045 47% 0.4677 46% 0.1238 38% 0.4224

melanogaster

114 Calculations of Ka/Ks ratios are based on nucleotide sequences of equal length.

As Drosophila melanogaster muskelin is 852 residues long and Mus musculus muskelin is 735 residues long, an initial sequence alignment of Mus musculus and Drosophila melanogaster muskelin was performed and the additional inserts for the two amino acid sequences was identified. The corresponding nucleotides encoding the additional inserts were excised so the nucleotide sequences for Mus musculus muskelin and Drosophila melanogaster muskelin were of equal length (mouse muskelin: Aaa1-13, Aaa405, Drosophila muskelin:

Aaa1-45, Aaa156-157, Aaa477-482, Aaa535-542, Aaa582-585, Aaa612-621,

Aaa740-787 and Aaa806-823). The N-terminal region of mouse and Drosophila melanogaster muskelin has a Ka/Ks value of 0.1045 with an amino acid homology of 58% (Table 2). This value indicates a relative high conservation in the nucleotide sequence of muskelin between Mus musculus and Drosophila melanogaster. The Ka/Ks value for the intervening sequence of muskelin from

Mus musculus and Drosophila melanogaster (206aa-296aa) was calculated to

0.4677 with an amino acid homology of 46%. The amino acid identity of the kelch-repeats in Drosophila melanogaster and mouse is 46% and the calculated

Ka/Ks value is 0.1238. The C-terminal regions was compared in a similar manner using Drosophila melanogaster (A740-787 and A806-823) compared to mouse (aa624-735) and the Ka/Ks value was calculated to 0.4224 and this region has a 38% amino acid homology. A comparison of these four regions between Mus musculus and Drosophila melanogaster indicate a high conservation in the N-terminal region and in the kelch repeats and less conservation in the intervening region and the C-terminal region. Furthermore, as all the Ka/Ks values were below 1.0 there has not been any positive selection upon the four regions within muskelin.

115 As the sequence similarity between rat and mouse muskelin is so high (99% in overall), the Ka/Ks analysis can give inconclusive results. As listed in Table 2, the 100 percent identity for the Intervening and the C-terminal region does not allow calculation of KA/Ks-values for these regions. A Ka/Ks analysis was performed comparing the less identical Danio rerio muskelin to Mus musculus muskelin. Again, the same hierarchy of conservation was observed, where the

N-terminal region and the kelch repeats had Ka/Ks values of 0.0432 and 0.0326, respectively, and the intervening region and the C-terminal region had Ka/Ks values of 0.0737 and 0.0636 respectively (Table 2).

Calculation of the Ka/Ks values for the muskelin-like proteins from

Schizosaccharomyces pombe and Encephaiitozoon cunicuii muskelin were restricted by the limited sequence relationship to Mus musculus and therefore not included in this study.

Conclusion from Multiple Sequence Alignments of Muskelin

An analysis of conservation in nucleotide sequences in conjunction with amino acid identity was used to identify highly conserved regions in muskelin. Using the nucleotide and amino acid sequences from five species, I have been able to identify regions and domains in muskelin, which have been under strong selection during evolution. This analysis indicated an importance for both the N- terminal region and the kelch repeats suggesting possible functional or regulatory domains in muskelin.

Analysis of Muskelin Domain Architecture

Because the multiple sequence alignment of muskelin has indicated a high conservation of the N-terminal domain and the kelch-repeats, I went on to

116 analyse the domain structure of these regions in more detail. Searching the

Conserved Domain Database (CCD; http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) using full-length Mus musculus muskelin identifies two domains present in Mus musculus muskelin when using an expected threshold value of 0.01. The N-terminal part of muskelin contains a putative F5/F8 Type C domain covering aa24 to aa139 with an e-value of 10'® when 81.4% of the consensus sequence for F5/F8 Type C domain is included in the alignment to Mus musculus muskelin. The F5/F8 Type

C domain has also been described as a discoidin domain (Baumgartner et al,

1998). Searching the SMART database (http://smart.embl-heidelberg.de/) with full-length Mus musculus muskelin indicates a LisH motif covering aa172 to aa204 with an e-value of 4.6x10’^ and an additional C-terminal to LisH motif

(CTLH) covering aa206 to aa258 with an e-value of 5.1x10'^. These domains are located in the intervening region of muskelin located between the N-terminal discoidin domain and the kelch repeats. The kelch repeats in muskelin has been identified by a combination of database searches and manual assignment

(Adams et al., 1998; Adams and Zhang, 1999; Adams, 2002; see also review

Adams et al., 2000).

None of the databases detects any domain or motifs associated with the C- terminal region. The distribution of the domains identified in muskelin using both

CCD and SMART is outlined in Figure 3. I will discuss each domain separately below.

Discoidin Domain

Crystal structures of several discoidin domain-containing proteins have been resolved. These include Galactose Oxidase (PDB: 1G0F), Sialidase (PDB:

117 N-terminal Intervening Kelch-repeat domain C-terminal Discoidin domain/ LisH/CTLH F5/F8 Type C domain

Figure 3. Schematic diagram of the domain distribution in mouse muskelin as identified using Conserved Domain Database and SMART and manual assignment. The boundaries for each domain are indicated in amino acid number as positioned in vertebrate muskelin. The N-terminal domain contains a Discoidin Domain or F5/F8 type C domain (indicated in light grey), which includes the Muskelin Identity in N-terminal Domain (MIND, Adams, 2002). The intervening region contains a LisH (indicated in horizontal lines) and a C-Terminal to LisH motif (CTLH; as indicated in diagonal lines). The Kelch-repeat Domain consists of six kelch repeats (indicated in dark grey). The C-terminal region (indicated in white) contains no identifiable conserved domain.

118 1EUU), Human Coagulation Factor Va (PDB: 1CZT), Human Coagulation

Factor VIII (PDB: 1D7P,), and the Anaphase promoting complex subunit 10,

APC10/DOC1 (PDB: 1JHJ) (Ito et al, 1991, Gaskell et al., 1995, Macedo-

Ribeiro et al, 1999, Pratt et al, 1999, Wendt et al, 2001). The discoidin domain forms a globular structure consisting of 11-19 p-sheets, eight of which form the core structure resembling a p-sandwich, also known as a jellyroll fold. In the coagulation factors, three adjacent loops protruding from the p-sandwich are responsible for the specific interaction with phosphatidyl-L-serine at the plasma membrane (Macedo-Ribeiro et al, 1999, Pratt et al, 1999). In the anaphase promoting complex subunit-10, the very C-terminal part interacts directly with another subunit of the anaphase-promoting complex, Cdc25 (Wendt et al,

2001).

Aligning mouse muskelin to the amino acid sequence of galactose oxidase and coagulation factor Va allows an assignment of the p-sheets structure distribution in the discoidin domain of muskelin. Initially, a secondary structure prediction program was used to predict the p-sheet distribution in muskelin (Figure 4) and this prediction was used in conjunction with sequence alignments to the previously described discoidin domains from coagulation factor V, coagulation factor VIII, galactose oxidase, mammalian X-ray cross-complementing group 1

(Xrcci), Anaphase promoting complex subunit 10 APC10/DOC1, and Sialidase

(Figure 5A). The discoidin domain of muskelin contains the previously described

Muskelin Identity in N-terminal Domain (MIND; Adams, 2002), which consists of a 21-residue long segment with 100% identity between vertebrates and

Drosophila melanogaster. As shown in Figure 5B for the structure of Galactose oxidase, the MIND sequence is included in the seventh and the eight p-sheets

119 1 10 2 0 30 40 Muskelin MAAGGMVAAAPECRLL PYALHKWSSFSSTYLPENI LVDKP

jnet

50 60 70 80 Muskelin NDQSSRWSSESNYPPQYLI LKLERPAI VQNI TFGKYEKTH

jnet

90 100 110 120 Muskelin VCNLKKFKVFGGMNEENMTELLSSGLKNDYYKETFTLKHK

jnet

130 140 150 160 Muskelin I DEQMFPCRFI Kl VPLLSWGPSFNFSI WYVELSGI DDPDI

jnet ] 170 180 190 200 Muskelin VQPCLNWYSKYREQEAI RLCLKHFRQHNYTEAFESLQKKT

jnet

210 220 230 240 M uskelin Kl ALEHPMLTDI HDKLVLKGDFDACEELI EKAVNDGLFNQ

jnet

250 260 270 280 Muskelin Y I SQQEYKPRWSQI I P KSTKGDGE DNRPGMRGGHQMVI DV

jnet

290 300 310 320 M uskelin QTETVSLFGGWDGTQDLADFWAYSVKENQWTCI SRDTEKE

jnet

330 340 350 360 M uskelin NGPSARSCHKMCI D I QRRQI YTLGRYLDSSVRNSKSLKSD

jnet

370 380 390 400 M uskelin FYRYDI DTNTWML L SE DTAADGGP K L VF DHQMCMD5 E KHM

jnet

410 420 430 440 M uskelin I YTFGGRNLTCNGGVDDSRASEPQFSGLFAFNCQCKTWKL

jnet

450 460 470 480 M uskelin LREDSCNAGPEDl QSRI GHCMLFHSKNRCLYVFAGQRSKT

jnet

490 500 510 520 M uskelin YLNDFFSYNELVSTGFVVECGNKKDSGMVPMTGFTQRATI jnet 530 540 550 560 M uskelin DPELNEI HVLSGLSKDKEKREENVRNSFWI YDI VRNSWSC jnet 570 580 590 600 M uskelin VYKNDQAAKDNPTKSLQEEEPCPRFAHQLVYDELHKVHYL jnet 610 620 630 640 M uskelin FGGNPGKSCSPKMRLDDFWSLKLCRPSKDYLLRHCKYLI R jnet 650 660 670 680 M uskelin KHRFEEKAQVDPLSALKYLQNDLYITVDHSDPEETKEFQL jnet

690 700 710 720 M uskelin LASALFKSGSDFTALGFSDVDHTYAQRTQLFDTLVNFFPD jnet

730 Muskelin SMTPPKGNLVDLI TL jnet

Figure 4. Prediction of the secondary structure of mouse muskelin. Full-length mouse muskelin was analysed using the jnet software at Jpred (http://www.compbio.dundee.ac.uk/~www-jpred/). Predicted p-sheets secondary structures are shown as yellow arrows and the predicted a-helices secondary structures are shown as red tubes.

120 mk 9 ------AAPECRLLPYALHKWSS, LVDKP NDOS SNYPPOYLILKt^^i^VONIT-FGKYEKTHWNLKKFKVFGGMNEENMTEL fa5 2065 c. ^ ^ l GMENGKIENKOITAS; PFRARLNAOGRV INKOWLElEL^RTAIITaX'K'SI.|jB|i'VKSYTIHYSEQGVEWKPY faS 2 1 9 2 ------GgMlGMESKAISDAOITAi SKARLHLOGRG PKEWLQvBBlÉBBrGVTTOGiKS I, IH » '.K i; FLISSSQDGHQWTLF GO 41------ASAPIGSAISRN’:WAVTC iCNKAIDGNKDTFWHTFY’G -DPKPPHTYTIDMKTTQNVNGLSMLPRQDGNQNGWIGRHEX'YLSSDGTNWGSP xrccl 1 MPEIKLFPyVSC: .NLLKADTYRKV; TISWLOLEKEEQIHSVDIGNDGSAFVEVLVGSSAGGAGEQDYEVLLVTSS apclO 1 — MTTPNRTPPGADPKQLER '.FGVDOLRDDNLE ISOPHLVKIQFRRKTTVKTLCIYADYKSDESYT PSKISVRVGNN FHNLQEI sialidase 4 57 ------VTVGLLDQiiRMSIADVDSI iWIDGNPSTFWHTEWS--r

103 LSS'-.LKNDYYKETFTLKHKIDEOMFP—FP— -ŒFIKIVPLI^^.'FNFSIWYVELSGIDDPDIVQPCLNWYSKYREO-- faS 2169 RLKSSMVDKÏFMNTtiTKGHVKNFFN'N j##FIRVIP K#M.'ITLR— LELFGC— DIY------faS 2295 FQNGKV - K###NODS FT PWNS1 HJ r YLRI H P - o lf c B . lALR MEVLGCEAQDLY - GO 141 ------VASGSWFADSTTKYSNFf*BwYVRLVAITEANGQPWTSIAEINVFQASSYTAPQPG- xrccl 111 FMSPSESRSGSNPNRVRMFGPDKL IVKIVCSOPYSKDSPFGLSFVRFHSPPDKDEAEAPSQKVTVTKLGQFRVKEEDESAN- apclO 109 ROLEL------VEPSGWIHVPLTDNHKh:^^»1I0IAVLAliHQNGRDTHMRQIKIYTPVEESSIGKFPRCTTIDFMMYRSlR sialidase 557 ------VASGRFTTGLAPQRAVFPARDAEYIRLVAI-SEQTGHKYAAVAELEVEG

- P-sheets 1-8 in P-sandwich I P-sheets not part of the P-sandwich - “-helics I Ps split by a loop

P71 R121

1128

Figure 5. Prediction of tertiary structure of mouse muskelin discoidin domain. A ) The assignment of secondary structure for the discoidin domains were based on the crystal structures of the coagulation factor V (1CZT_A), the coagulation factor VIII (1D7P_M), Galactose Oxidase (IGOF), XRCCI (1XNA_A), APClO/Doc (1JHJ_A), Sialidase (lEUU). Predicted secondary structure of mouse muskelin was based on figure 4 and is highlighted in the alignment. B) Dark blade indicate the positioning of aal21-128 of GO corresponding to the six initial amino acids in the MIND sequence (aal29-135) of mouse muskelin. C ) Dark blade indicate the positioning of aa65-71 of GO corresponding to the highly conserved segment aa71-78 in mouse muskelin.

121 of the p-sandwich and folds back as the centre core of the domain and therefore is possibly involved in stabilising the p-sandwich. Similar localisation was observed for the MIND sequence when mapped to X rc d , APC10/DOC1, and

Sialidase (data not shown).

Another highly conserved segment in muskelin covering aa71-aa93 includes the third and the fourth p-strand in the discoidin domain (Figure 5C). Predictions of the secondary structure of this segment indicate that this sequence consists of two p-strands. An alignment to the coagulation factor V and VIII showed that the corresponding segments in coagulation factor V and VIII forms a p-hairpin loop, which extends from the p-sandwich and is involved in membrane interactions

(Macedo-Ribeiro et al, 1999).

In contrast, two tryptophan resides positioned in the first p-hairpin loop of both coagulation factor V and VIII that are also involved in the membrane interaction are not present in the muskelin discoidin domain, suggesting that the putative p- hairpin loop in muskelin does not participate in membrane interaction. Other possible target(s) might be involved, for example specific partners.

LisH and C-terminal to LisH

No crystal structure has so far been solved for any LisH motif or C-terminal to

LisH (CTLH) motif-containing protein. Previous published data indicated that the

LisH motifs and the CTLH are short motifs each consisting of two a-helices

(Emesand Ponting, 2001). Thus, they may be flexible regions in the protein structure. The predicted secondary structure of mouse muskelin aa160-aa245 indicates four a-helices similar to the predictions of other LisH/CTLH domains

(Figure 4; Emes and Ponting, 2001). This sequence corresponds to the predicted LisH and the CTLH motifs in muskelin.

122 Three mutations in the LisH motif have been associated with loss of function

(Cardoso et al., 2000; Boulton et al., 2000; Wei et al., 2003). A mutation in aaF31S of Lis1 causes severe lissencephaly due to impaired neuronal migration (Cardoso et al., 2000), where a mutation of in the LisH motif of the

Drosophila melanogaster ebi protein (aaL16Q) causes a phenotype with reduced neuronal differentiation (Boulton et al., 2000). The cyclin E/Cdk 2 substrate p220 (or the nuclear protein, ataxia-telangiectasia , NPAT) functions in G1/S-phase transition and activation of histone transcription activity.

Deletion or site-directed point mutations in the LisH motif of p220 inhibits histone activity, without affecting G1/S-phase transition (Wei et al., 2003). The molecular functions disrupted by any of these mutations are still unknown.

Sequence alignment between muskelin, Lis1, ebi, and p220 indicate one amino acid within muskelin conserved among the previously described mutations

(Figure 6). Notable, the LisH motif is localised in the very N-terminal part of a majority of the LisH-containing proteins. As the function of this motif is still unknown, the possible role of this motif in muskelin is also uncertain. I suggest that the LisH motif in muskelin functions a flexible spacer region.

The Kelch-Repeats

The consensus sequence for the kelch repeat is built around four p-strands, and consists of three to four hydrophobic amino acids composing the second p- strand, a double glycine in the loop between the second and the third p-strand, and a tyrosine in the third p-strand, which pairs with a tryptophan in the fourth p- strand (see Introduction, Figure 6A). The crystal structure of the fungal galactose oxidase has revealed that each kelch repeat forms a four-stranded antiparallel p-sheet that represents a blade within a p-propeller or a p-barrel

123 IRAI 3 Y mgESpSH^ 0FT0GI Sir2p 3 Yffl Bc@EMgHEVygLALQD| RanBPM 437 h h B "c a ^ | e /^ RSHDQTV Lisl 6 iAyADHlRSNl jEEAYSvl

ebi 3

p220 2 LG QENLI S | c QT|§I l | S | S D L K

MK 171

Figure 6. Sequence alignment of LisH motifs. Amino acid sequences from 1RA1 {M.musculus, NP_109657), Sif2p [S.cerevisiae, NP_009661), RanBPM (H.sapiens, AAH52781), Lis1 (H.sapiens, NP_000421), Ebi (D.melanogaster, NP_477329), p220 or Nuclear protein ataxia-telangiectasia locus, NPAT (H.sapiens, NP_002510), and muskelin (M.muscuius, NP_038819) were aligned using ClustaW using ClustalW and coloured using BoxShade and displayed as black boxes (identical) and grey boxes (homology). Asterisks indicate reported loss-of-function mutations in Lis1 (F31; Cardoso et al., 2000), Ebi (L16; Boulton et al., 2000), and p220 (V7, L10, V11, L15, F27, E30; Wei et al., 2003). As shown L179 in muskelin is conserved among the reported loss-of-function mutations.

124 (see Introduction, Figure 6B and C, Ito et al., 1991). In the last blade, the ring closure of the p-propeller is achieved by combining p-strands from both the N- terminal and the C-terminal part of the propeller. By this mechanism, the whole structure is held together. The closure can be achieved by three mechanisms, in which either the pi strand is generated from the C-terminus, the p4 strand is generated from the N-terminus, or pi and p2 is generated from the C-terminus and p3 and p4 is generated form the N-terminus (Adams et al., 2000; Fulop and

Jones, 1999). Comparing sequence alignments of the kelch repeats in muskelin with regard to possible C-terminal closure (Figure 7A), N-terminal p-strand closure (Figure 7B), or the N- and C-terminal closure (7C) suggests that muskelin uses an N-terminal closure mechanism (Figure 7B), which is similar to galactose oxidase (see Introduction, Figure 6C). The seven kelch-repeats in the

Encephaiitozoon cunicuii muskelin-like protein were analysed in a similar manner (Figure 7D). These sequence alignments indicated that the kelch- repeat domain of the Encephaiitozoon cunicuii muskelin-like protein contained an N-terminal p-strand, similar to vertebrate muskelin, and therefore utilised a similar closure mechanism for the p-propeller.

Furthermore, the sequence alignment of the kelch repeats in mouse muskelin together with secondary prediction programs, allows p-sheet assignment in the six blades of the kelch domain of mouse muskelin. The conserved residues in the muskelin kelch repeats are clustered around the four p-sheets in each repeat (Figure 7B) and the loops connecting the p-sheets shows less conservation of amino acid composition and length of the loops as previously described (see Introduction, Figure 6B; Adams et al., 2000). The loop between

125 Kelch repeats in mouse muskelin:

A) C-terminal closure (3 + 1 ):

3 3 8 4 BKLvBpSSfficiliSE. ’M 0 HT^gRILTCNGSVDD@RASEPQFS-GL'^A CNAg ----- 4 4 5 3 TOSigTSlflcJllFHSKN B M o R S ------KTYfflN-jl^sBBI *DSI*HVD fP .gHkBIsSmV-- 5 5 1 0 gMTGgTQRATgraPELNEïfî'L^LSK ------DKEKREEn|rNsB|iE VRjg ^ C^YK^DQATKDN ---- 6 5 8 1 Bcpg: SBSgELggVHQ |^NPG ------K|cSPKMR|D-[®S--f|KLCRPg;KDYLLRHCKYL

+ 1 64 0 R K H ® E E K A Q 2@PL

Predic: --pi-- --P 2 -- L o o p 2 - 3 - - P 3 -- - - P 4 ----

B) N-terminal closure (1 + 3 );

IPKgTfflGDËEDNR VDG------TQDl |Eg "LgR YLD------sgVRNSKSgKSgg F Age Ï t I ^ R I LTCNGSVDDgRASEPQFS - GL*^ A R rg iJ -c N ; 4 53 IQsEl|gjcl:|FHSKN^ I q r s ------k t y | n - 1^ s | D S > H V D * ^ k S Is S m v -- 5 1 0 BmtgBtqratïEpeln ffLSgLSK------DKEKREENgRNsBSl r^YK(%DQATKDN--- 5 8 1 Bcpj^ gRjSŸBELiïfSVHB HSSnPG ------K@CSPKMR|D-ig^S--gKLC

P r e d i c : — pi — — P 2 — L o o p 2 - 3 — P 3 — — p 4 ---

C) N and C-terminal closure (2 + 2 ]

KAVNDGLFNQYISQQEYKPR IPKWT EDNR V O T EG MS G V O T EG VDG ------TQ SCfflKlfflC 3t L@ R YL D ------SSVRNSK RILTCNGSVDE 3rasepqfs-gl-:A ''CQC'« 4 5 3 I Q S F H S K N C RS ------k t y B n - :DS|^HVD^^jg : v - - MTGgTQRATmPELNEQH ,LS@LSK--- DKEKREENiRNSj ?v r “ ]C2|YK%DQATKDN------SC P® : 2 B S y S e L Ü V H g . I ^ N P G ------

P r e d i c : — pi - - - - p 2 - - L o o p 2 - 3 - - P 3 - - - "P4----

D) Kelch repeats in microsporidia muskelin:

-1 igTmPRAETPV 1 :EgGGgj - QM m E i g D G C ^ g ------WNGIEEL " " EGE|SEHNIGTRT 2 SGFj^cB-RMfflsHESKnBjp^ ------YVPLSSRKVFSGRS^gv|-- - D G N | N v | n MNDEGG 3 3GNVYdB-QmEv|gDkBc^' ^ ------RSTEKEIt YGGgMgS-LGGAgPNYGKEAFSVHHGVEVEKGA 4 jl,G[^Q-SSGSTVELDJ|.;fflQI LHEGGA e | s NSRYEPSTSSASLFAS r | l P®RSDTMQPPST 5 gSL^RLGHT^LFVPPDg^EgSSYNNCLVIIGGQRNKNCIREIVj^SLD-TDTVgQSgPFPIDSDGKV 6 -IQ^VL-HSTE|v-v( 1 c g R ------EKEGRFgN lg»T|SLIREK|AK|lVKPRVEGG 7 g A P ^ A j J -Q F 0 e | d G R F 0 0 ^ ------NVSERt|r RVNggPgKLEKK

P r e d i c : — P 1 — — P 2 — L o o p 2 - 3 — P 3 — — P 4

Figure 7. Comparison of A) C- terminal (3+1), B) N- terminal (1+3), or C) N- and C-terminal (2+2) closures of the kelch-repeat p-propeller structure in mouse muskelin. The predicted secondary structure shown was detected using Jpred (figure 4) and sequences were aligned using ClustalW and coloured using BoxShade and displayed as black boxes (identical) and grey boxes (homology). From this comparison, the highest degree of conservation is detected in B, suggesting the kelch-repeat p-propeller of mouse muskelin is stabilised by an N-terminal closure mechanism similar to galactose oxidase (see figure 7). D) Kelch- repeats alignment of Microsporidia muskelin shows N-terminal closure mechanism similar to mouse muskelin.

126 the second and the third p-sheet is the most variable region within the muskelin kelch-repeats, both in regard to conservation of residues between the individual kelch-repeats and in the length of the individual loops.

Phosphorylation Sites in Muskelin

Previous analyses of the Homo sapiens, Mus musculus, Rattus norvegicus, and

Drosophila meianogaster muskelin sequences have identified several conserved putative phosphorylation sites for protein kinase C (Adams, 2002). I extended this analysis to the additional new species identified (Figure 1). A sequence analysis was carried out in order to identify putative phosphorylation sites in muskelin. The phosphorylation prediction database

(http://www.cbs.dtu.dk/databases/PhosphoBase/) was searched using muskelin sequences from three groups of muskelin, namely vertebrates {Homo sapiens,

Mus musculus, Rattus norvegicus, Danio rerio), insects {Drosophila meianogaster. Anopheles gambiae), and fungi {Schizosaccharomyces pombe,

Encephalitozoon cunicuii) and the output was aligned to identify conserved phosphorylation sites among these species. The motif for phosphorylation by protein kinase C by this prediction program is X-S/T-X-R/K (Kishimoto et al,

1985). This short motif is identified 14-15 times in the vertebrates, with 11 sites conserved between all the vertebrates (Adams, 2000, and highlighted in red in

Figure 8). Five sites were identified conserved between Drosophila meianogaster and vertebrates (Adams, 2002). The Anopheles gambiae muskelin only contains 7 putative protein kinase C phosphorylation sites. Four of the sites (T76, T352, S438, T554 as numbered in Drosophila meianogaster) are conserved between Anopheles gambiae and Drosophila meianogaster, where T76, T352, T554 overlap with protein kinase C phosphorylation

127 Domain r H-terminal domain Human î ------M.«VAGGM\'AAAPEC------RLLPÏALHKWSSFSSTyLPENIUT-KPHD'iM ’SSBS------MYPPCYLILKLERPAIi'CNrT Mouse 1 ------MAAGGAVAVAPBC------F(LLPYALHKWSSFSSTYLPEHlLVDK?NDoB| ÎSSES------NYPPQYLILKLERPAIVONIT Rat 1 -..... MAAGGAVAAAPEC------RbbPYALHKWSSFSSTYLPENILi'DKPKD^a r 5 g gs------NYPPQYLILKLE RP AI VQNIT zebra fish 1 ------MAVAPDS------RVLS YSVYKWSSY S S TY LPENI Li'DKP N -SSBS------NYPPQYLILKLBRPAIVLSIT Fruit fly 1 MSSSSTSSPTDNSSHGSVALSSALNPSLNPVSGTNTSH|PFSL|Hl NFEIYRYSSYSPNYLPENVLVT)CPNI>M 'SAYT------NSPPQFlgglRRPAIVKKIK Mosquito i ------KKLDYDIYKYSSYSSSYLPENIREDNPSNcH 'SSNT------NTPPQYLVLKLERPAIVTKIK S.pombe 1 ------MERKELSYTIQSWSSYSGAFHPINILIDNPHQANSRWSCNTSICKPNNSEEQYVILKLDKLALVTDIT Microspodia 1 ------MLRNRRLYSVIPY-EÎHSYSTYASWPNNIKRNDPHDPNSRWSTST------NDHLQYIVLKVP-TSLWTIT Highly conserved ------N-terminal demain/Discoidin domain ------(------MIND------)---- 1 f------Human 73 FGKYEKTHVCNLKKFKVFGGMNEENMTELLSSGLKNDYYKETFJBh KIDEQ—MFPCRFIKIVPLLSWGPSFNFSIWYVELSGIDDPDIVQPCLNWYSKYREQEAIRL Mouse 73 FGKYEKTHVCNLKKFKVFG^WEENMTELLSSGLKNDYNKETfM h k IDEQ— MFPCRFIKTVPLLSWGPSFNFSIWYVELSGIDDPD3VQPCLNWYSKYREQEAIRL Rat 73 FGKYEKTHVCNLKKFKVFGGMNEEKMTELLSSGLKNDYNKETF^feKinEO MFPCRFIKIVPLLSWGPSFNFSIWYVELSGinDPDIVQPCLNWYSKYREQEAIRL zebra fish 67 YGKYEKTH’VCNLKKFlC'/FGGHSEENMTELLSSGLKNDFNKETnBlHKIDEO— MFPCRFIKIVPLMSWGPSFNFSIWFIELHGIEDPEWOPCLNW'/SKYREGEAIRL Fruit Ely 105 FGK.FEKSHVCNIKRFRVYGGLDDEHMVLLLEGGLKNDNVPEVFNLRCLTEOGSE-NLPILYLKlVPLLSWGPSFNFSIWYVELHGODDPMYTi^KNYNBBEVElIKL Mosquito 60 FGKYEKTHVCNLRKFKILGGLEEERMVLLFEGGLRNDSVPETFNLKHRTNSG-E-LLPIKFIQIVPLLSWGPSFNFCIWYVELLGIEDSMFILSNLRKYNMLREVEIVRL S.pombe 69 fgkfhtphlcnlkdfivyggreohlmfpllragltne ^ H I etfplcikkynkiensiackyikivploahahefnmsiwyvalhgidsasalsc JB nfa.l h l e k k a l n c Microspodia 67 FGKYNKMHVSNVKKMRISVSRDGVEYREVLCGGLKNDSEMETFNLLVKDEIY FIAQYVKIEPLAAWGVNFSYSLWFVELRGLVD— VEEFVRYNEM’A/FGRBB n

------Intervening region/LisH and CTLH------1 T- -Kelch domain------160 CLKHFRQKNYTEAFESLC'KKTKIALBHPMLTDIHDKLVLKGDFDACEELIB^CAVHDGLFNÛYISC'ÛEYKPP.WSQIIP: SDGEDNRF-GMRGGHOMVIDVQTETVSLFG 180 CLKHFRClHNYTEAFESLOKKTKIALBHPMLTDMHDKLVLKGDFDACBELIEKAVNDGLFNClYlSClQEYKPRWSQlIPi ^DGEDNRPGMRGGHQMVIDVQTETVYLFG Rat 180 CLKHFRCHNYTEAFESLOKKTKIALEHPMLTt^HDKLi^KGDFDACEELIBÎCV.^DGLFNQYISQÛEYKPRWSQIIP]■ DGEDNRPGMRGGHQMVIDVQTETi'YLFG zebra fish 174 CLKHFRQHNYTEAFESLQKKTRIALEHPMLTHLHERLVLRGDFDACEELIDKAVRDGLFNQYISQQDyKPRWSQIIPKC*NKGDGDDNRP(34RGGHQMVIDVQTETVYLFG Fruit fly 214 CLKHFRQQGYLSAFGALQEQTNVQLEHPLISELHKCLVLQGDFEKAERFIVECIDEGLMDEYLHKQDYKHSWHMKHT------ESDIKPGNRGGHQLWDAKRRLIYLYG Mosquito 168 FLKHCROQGYVNSFSALQEETGVKLEDEKMTBLHETLVNRGDFKKTEEMLVGFIENGLMDNYLERQEYKPQWSLQLTP----- ENEPRPGRRGGHQLIIDPMTSTIYLYG s . pombe 179 LLSYF^FKGKU)ICSVILaSYQLSIPHSPLMPMYD?.FCP.L<3nBJriLLFQQELQAEIIEKWKHKLTr.LpHilAT------ELKGlHisGHQH'T'fNEKDNCIYLYG Microspodia 170 CLRFLRDCGFRDIYEILEKRSÉBiEDEWRKIRELLE1--|By EEVEEVLDKLDPRVFQGFIDSSPYTLVWTEIPR------AEIWPCERÛGHQMVEKDG--CLYLÏG

Domain ------Kelch domain GWDGTQDLADFWA-i'™ENQWTCISRDTBKENGF' SCHKMCIDIQRRQIYTLGRÏLDSI dfyrydidtntwmllsedtaadggpklvfdhqmcmm Mouse gwdgtqdladfway B enqwtcisrdtekengp SCHKMCIDIQRRQIYTLGRYLDsj yrydidtntwmllsedtaadggpklvfdhqmcmd H Rat GWDGTQDLADFWAi'^BENQWTCIS RDTEKENGpI schkmcidiqrrqiytlgrylds I dfyrydidtntwmllsedtaadggpklvfdhqmc 'm d B zebra fish gwdgtqdiJiDf w a 'î'H enqhac I s p-dBB e b-ir-j 3CHKMCI D^ B pOI YTLGP.Y LDjj ;SK- 1B:JDFYC-YDIDANTWTLLSBDTSADGGPKLVFDH<^CMdB Fruit fly GWDGYQDLSDLWVFNIEDDQWTLLCERSELSNGE S CHKMVFDPIS ENVFMLGRY LDfi lTSD--YIKSDFYLyDTRAGNWMOICDDTSQVGGPQLWDHÛMCMDBI Mosquito GWDGN EDLSDLWS VDVK^JtfTLI HERS E LLI -PF ' HKM'vYLTkNSQIFMLGRYLDSASRTEE--NINSDFYLYNTITKTWFLICNDTSC>VGGPQLVFDH(;^ClDVD S. pombe GWDGVKTFSDFWIYNVDKDLWIMENBYG IPGERVCHRMVIDTSQQKLYLLGNYF' tETEVPPDARTDFWEFD I ^gvWTCLSYIj0gDGGPAAI FDHGMSVDEK Microspodia GWNGIEELEDFWKYGDGE WSEINIGTRTPGKRSCHRMV--SHESKLYLMGKYVP DIWVFDGNWN VMNMNDEGGPGNVYDHQMWIGD Highly conserved Dciovain ------Kelch domain------KMIY------TFGGRNLTCHGG’VDDSPJ^SEPCrFSGLFAFNOj HMIY------TFGGP.I LTC34GSVnDSPA5BP*QFSGLFAFNCQ Rat ------TFGGRILTCMGSVDDSRASBPQFSGLFAFNCQ zebra fish ------TFGGRILTCNGSVDDGRTSEPQFSGLYAYHCQ Fruit fly |RTIY------VFGGKI -SVNATGAIBPEYSGLFSYHIA Mosquito KQTIY------VFGGRILEPK-NBEET-TGQYRYSGLFSYHIS RGIVY------VSGGCKWDPD------ELVFBGLYAYDTK Microspcdi, KLCVFGGR:?H§EDTY'30LYMFSLGGAFPNYGKEAFSVHHGVEVEKGASRLSBBCEACBHHPU>RSGOSSGSTVELDLLFGGOILHBGGAEESNSRYEPSTS SASLF

Demain ------Kelch domain Human LLKEDSCNAGF----- EDIQSRIGHCMLFHS- --KNRCLYVFAGQ-RSKTYLNDFFSYDVDSDHVDIISDGN------KKDSGMVPMTG: TIDPE LPEDSCNAGP----- EDIQSRIGHCMLFHS- — KNRCLYVFGGÛ- RSKTYLNDFFSYDVDSDHVDIISDG|------TIDPE LREDSCMAGf----- EDIQSRIGHCMLFHS- —KNRCLYVFGGQ-RSKTYLNDFFSYDVDSDHVDIISDgI ------f c s G M ’/rM TIDPE zebra fish AGSWSLLREDSCKAGP----- EDICSRIGHCMLFHT- - - RNRCLYVTGGO- RSKTYLNDFFSYDVDGDHVEI15D]§------Fruit fly TNTWAQI LVDCHHASA^L.-i:'\>«fcRITHOfVFHN- — KHRKLYIYGGQ-RGKEDLDEFLSYD\^TaAISVLNKEHPHHGTIAGNE.AKFEP£. ATIDCE Mosquito TNTWAHILVDCEHPTASSPY V vQ s RVTHSMLFHH- — KHRXlYIFGGC-RGKDYMTDFLIYD-'/NTNELSNMTPEN------NNTDTKNVPOSG TIDC HRIWEQLAVRYEDRQK— HCEFKIERMGHCMEYFP- -DENKLYIFGGQSYDQEFILDMCYIKLETREAVOHVRXN- -DTSQSPSPSFCÛRSÎMDSK Microspodia ASRWLMVRSDTMQPPS Ti jHlYRLGHTMLFVPPDYIEGSSYNNCLVIIGGQ-RNKNCI REIVFYSLDTDTVFQSVPFP- IDSDGKVIQRGVLHS- Highly conserved ------Kelch domain------Human LNEIHVLSGLSKDKBKREEN-—VRNSFWIYDIVRNSWSCVYKNDQAAKD------NPTKSLCîEEEPCPRFAHgLVYDELHKVHYLFGGNFGKSC Rl.DrFwi LNBIHVLSGLSKDKEKREEN---VRNSPWIYDIVRNSWSCVY'KNDC'ATKa)------NLSKSLOEEEPCPRFAHCLVYDELHKVTiYLFGGNPGKSC RLI'C'FiJ LNEIHVLSGLSKTiKEKREEN-— VRNSFWIVDIVRNSWSCvYKNDgAAKE------NLSKSLQEEEPCPRFAHOLVYDELHKVHYLFGGNPGKSC zebra fish LNEIHVLSGLSKDKDKREEN VP.NSFWIYDIARNNWSCVYKNDQAWE------NPTKALgEEEPCPRFAHQLYYDELHKVHYLFGGNPGK£?| R LD P Fw l Fruit fly RDEIYIFSSLSKVKDRRDIQHIDASNSFWV':gmmk-WSRIYYCRHSGQDANSINNMNILGASKGDGTLEPCPRYAHgLVYDDVTNTHYLFGGNPGIC(;XX)gLRLDDFWL Mosquito KDEIYVLTSLSKEKERRDLN INSFWLFSljBEWrrvxKSEHS------NGENCYQKSQ-NTCHEPCPRYAHQIVYDTTNQVHFLFGGNPGTNSN—VRLDDFWV S.pombe NHRIFTMFGFEQRNIHfCVLR---- PSLWIFYITTEEWVKISDINEEEGN--EHPCSRYVFGKKt'NHLTLNFKRFGHAVAADLNRNIIYLMGGNASTHPLRPMKLSDFWK Mictospodi -TEIWLFCYGREKEGRFEN---- IDVYTYSLIREKHAKVIVKPRVEGG------VE1KPAPRSAHQFVEMDGR--FYLFGGN— V ® i ”DRRVND«WM

1 r------C-terminal region------621 ■tLCRrS-KDYLLRHCKYLIRKHRFEEKAQVDPLSALKYLQNDLYITVDHSDPEETKEFQLLASALFK------Mouse 621 ■ l c r p s -kdyllrhckylirkhrfeekaqmdplsalkylqndlyitvdhsdpeetkefqllasalfk ------Rat 621 ■ l .r p s -kdyllrhckylirkhrfeekaqmdplsalkylqndlyitvdhsdpeetkefqllasalfk ------zebra fish 615 B l y RE'S-KBYLLRHCKYLIRKYRFEEKAQTBPLNALKYI/QNDLSLTVDHTDPDETKEFQLLPSALFK------Fruit fly 673 LCUEKPK-REEILKHCRFLVTdCLRYBEMTRODPLKAMQYLRHHlADVIDHSDAEQLNNFHKLASLLFKNHDGl^LDRECNTPALCADADDLETDSDvf-BlKEELADG-- Mosquito 607 LRLEKPN-RGNILRYCKYLLRKQEYEBITKNDPLSAILYLCTKLYDIIDHSDPVQLKEFHKLASLLFKSDTVDGETEPTSLLTCPLASTASASAPVGGNKHKQQRADDDG S .pombe 609 LDILDRWGIîCHCLSNVSLQLHLHRFHELVSENLAKAVEYLQKSMQPQFDKSPLFWNALILDAFSG>§------Microspodia 64 8 FKLEKKT-TQEICHLSKLLVRKHKYLYLLGVDEEKAVEYLRTSILSMWDDDTR--ELFEELCHEVYR------

------C-terminal region------1 Human 688 ------SGSDFTALGFSDVDHTYA------GRTQLFDTLVNFFPDSMTPPKGNLVDLITL-- 688 ------SGSDFTALGFSDVDHTYA------ORTQLFDTLVHFFPDSMTPPKGHLVDLITL— Rat 686 ------SGSDFTALGFSD'/DHTYA------QRTOLFDTL'/ÎIFFPDSMTPPKGNL'/DLITL-- zebra fish 682 ------SSSDFIPLGFSDVDQTYA------QRTQLFDTLWFFPDNMTPPKGNLVDLITL— Fruit fly 779 -PATENLS LESNESMISSGISETASSPSSSSCJBa P---TDCTPELKNKRAFLFNKLVQLMPPSMVQPERNLSDFWM-- Mosquito 716 hkeebnlakvmrve IB sveeadtssissmsssdcsqnopoqaidqqfemkirrcllfnklvdlmpealcopkknisdfvll -- S.pombe 677 ------NKDIAQRRFETFQTIYD------LLPVDENTVAPNESLLNMIBFFT Microspodia 713 ------RRirVDPVEDICRLLlfBEH------

Figure 8. Identification of putative phosphorylation sites in muskelin and muskelin-like orthologues. Alignment of muskelin and musketin-fike sequences were prepared similar to figure 1. Putative phosphorylation sites tor protein kinase C and protein kinase A were predicted using PhosphoBase v2.0 for all muskelin and muskelin-like orthologues. Red highlights predicted phosphorylation sites for protein kinase C, yellow indicates predicted phosphorylation sites for protein kinase A, and green for colocalisation of sites for both protein kinase C and protein kinase A. Asterisks indicate highly conserved putative phosphorylation sites for protein kinase C.

1 2 8 consensus sequences identified in vertebrates. Thus, three of the four identified phosphorylation sites appear most well-conserved between vertebrates and invertebrates. The Schizosaccharomyces pombe and the Encephalitozoon cunicuii muskelin-like proteins contain 9 consensus sequences for phosphorylation by protein kinase C individually, only one of which is conserved between the two fungi. This site overlaps with S350 in vertebrates (Figure 8).

The prediction of putative phosphorylation sites for protein kinase A was also investigated. T en consensus sequences are identified as conserved between vertebrates (highlighted in yellow in Figure 8). Three putative protein kinase A phosphorylation sites are identified as conserved between Drosophila meianogaster and Anopheles gambiae muskelin, where S49 and T519 as numbered in vertebrate muskelin are conserved between insect and vertebrate.

S49 identified in vertebrate muskelin is also a putative phosphorylation site for protein kinase A in yeast, and therefore the only site conserved in all species, whereas S355 was conserved between vertebrates, D. meianogaster, and S. pombe.

To establish whether the putative kinase phosphorylation sites would actually be exposed to the surface in the three-dimensional protein structure, the putative phosphorylation sites for protein kinase C and A were mapped on to the crystal structure of both Human coagulation factor VIII (discoidin domain;

PDB: 1D7P) and galactose oxidase (kelch repeats; PDB: 1G0F). These alignments predict S44 of mouse muskelin as corresponding to K35 in galactose oxidase and S44 in coagulation factor VIII. The positioning of aa44 in factor VIII is indicated on Figure 9A. The S44 residues are located in a loop extending from the p-sandwich, which for the coagulation factor Va and factor

129 5 4 4 S49 V

5350

5355 5324 T519 7 5 1 5

Figure 9. Predicted structural localisation of putative protein kinase C phosphorylation sites In vertebrate muskelin. A ) The crystal structure of the discoldin domain of Human coagulation factor VIII (1D7P_M) with residue S2235 highlighted In black corresponding to the putative protein kinase C phosphorylation site S44 and S2240 highlighted In purple corresponding to the putative protein kinase A phosphorylation site S49 In vertebrate muskelin. The discoldin domains from muskelin and the Human coagulation factor VIII (Fa8) were aligned as In Figure 5 and the residue S2235 In Fa8 corresponding to the S44 In muskelin were Identified. The predicted localisation of S44 shows accessibility to protein kinase C for the addition of a phosphate group. B) The crystal structure of kelch- repeat domain of galactose oxidase (IGOF) with residues Y272, G292, L342, R493 highlighted in black corresponding to the putative protein kinase C phosphorylation site S324, S350, S396, and T515 In vertebrate muskelin, and E296 and S497 highlighted In purple corresponding to the putative protein kinase A phosphorylation sites S355 and T520 In vertebrate muskelin. The five sites S324, S350, S355, T515, and T519 are located In close proximity on one side of the (3-propeller, whereas S396 Is located on the opposite side. All four sites are predicted as accessible for protein kinase C for the addition of phosphate groups, whereas only S355 Is predicted as accessible for protein kinase A. T519 Is predicted as embedded In the propeller structure.

1 3 0 VIII corresponds to the second spike, important for the membrane association

(Pratt et al., 1999). The putative protein kinase A phosphorylation site S49 was mapped on the crystal structure of coagulation factor VIII in a similar manner.

As shown in Figure 9A, the S49 (corresponding to E49) was predicted as located in the same loop as S44, and therefore accessible for the kinase.

The localisation of putative phosphorylation sites for protein kinase C in muskelin kelch domain was assigned to the structure of galactose oxidase

(IGOF). The four sites conserved in the kelch-repeats between vertebrates and

Drosophila meianogaster Included S324, S350, S396 and T515. S324 and T515 are located in loop connecting the fourth and the first p-sheet (Figure 7B). S350 is located in a loop connecting the second and the third p-sheet and S396 is located in the second p-sheet in the third kelch repeat (Figure 9B). These results indicate that S324, S350 and T515 are located in close proximity at the same side of the p-barrel and in a favourable position for a regulation by serine/threonine phosphorylation. The S396 is predictably positioned in a p- sheet, but a possible favourable position for phosphorylation by protein kinase C.

The localisation of the conserved putative protein kinase A phosphorylation sites, S355 and T519 was analysed in a similar manner. As shown in Figure 9B, the S355 was predicted as located in close proximity of the S350, and a favourable position for a regulation by serine/threonine phosphorylation. The

T519 was located in a p-sheet embedded in the p-propeller structure and therefore not in a favourable position for accessibility by the protein kinase A.

131 In conclusion, the bioinformatic analyses on muskelin and muskelin-like proteins have indicated two highly conserved regions within muskelin, namely the N- terminal discoidin domain and the kelch-repeat domain. Crystal structural information on similar domains in other proteins have provided predictions of the tertiary localisation of the conserved MIND sequence and putative phosphorylation sites within muskelin. The identification of muskelin as a multidomain protein suggests the possibilities for multiple interactions.

132 Discussion

In this chapter I described the results of bioinformatic characterisation of muskelin and muskelin-like proteins in vertebrates, insects, yeast, and microsporidia. The c omparison of this I arge n umber of o rthologues p rovide a firm basis for identification of regions and residues highly conserved through evolution and therefore of special interest. In addition to the previously described LisH/CTLH and six kelch repeats, I identified a discoidin domain in the N-terminal region as well as five highly conserved putative phosphorylation sites for protein kinase C and three highly conserved putative phosphorylation sites for protein kinase A.

Discoidin Domain

The identification of an N-terminal discoidin domain in muskelin allowed structural comparisons to previous determined crystal structures of other discoidin domain proteins, including galactose oxidase (1G0F; I to et al, 1991),

Sialidase (1EUU; Gaskell et al., 1995), coagulation factor Va (1CZT;), and VIII

(1D7P; Pratt et al, 1999) and APC10/DOC1 (1JHJ; Wendt et al, 2001). X-ray cross-complementing group 1 (PDB: 1XNT; Marintchev et al., 1999) shows no significant sequence similarities to the discoidin domain, but an NMR-structure of the N-terminal domain of has shown structural similarities between the two domains, indicating that the p-sandwich is an evolutionary-conserved structure.

The p-sandwich fold is also resembles in the immunoglobulin structure, where the composition of the S-type Ig-like fold is most similar to the discoidin domain

(Bork et al., 1994).

From the comparisons to the determined crystal structures, I retrieved strong structural indications of the positioning of p-strands within the muskelin discoidin

133 domain, localisation of the previous identified MIND sequence (Adams, 2002), and positioning of a putative PKC phosphorylation site.

Other analyses of discoidin domains have identified the seventh p-sheet as a particularly highly-conserved region in the discoidin domain and as representing a part of the consensus sequence (Macedo-Ribeiro et al, 1999; Pratt et al.,

1999). Previous studies have shown that a 21-residue peptide corresponding to the seventh and the eight p-strand of the discoidin domain competes with coagulation factor VIII for membrane binding (Gilbert et al., 1995). These data suggested that the peptide competed with the membrane-binding domain of the coagulation factor VIII, whereas the later determination of the crystal structure of coagulation factor VIII suggests a disruption of the p-sandwich core structure

(Pratt et al, 1999). This peptide is analogues to the MIND sequence in muskelin, suggesting a similar role of the MIND sequence in stabilising the discoidin domain.

Kelch-Repeat Domain

By similar techniques I compared the kelch-repeat domain in muskelin with the previous determined crystal structure of the galactose oxidase kelch-repeat domain (Ito et al., 1991). Although muskelin only consists of six kelch repeats compared to seven kelch-repeats in galactose oxidase, I still was able to obtain strong indications of the three-dimensional organisation of the muskelin kelch- repeat domain. These comparisons identified which residues were located in the p-strands and the intra- and inter-connecting loops (Figure 78).

Interestingly, the identification of four putative PKC phosphorylation sites located in the kelch-repeat domain of muskelin and a subsequently structural positioning of these sites, indicated that three of the sites are in close proximity

134 to each (Figure 9B). All three sites are predicted as spatial accessible for interactions with PKC and the addition of phosphate groups. Furthermore, these three sites are positioned near the previous characterised catalytic site of galactose oxidase (Ito et al., 1994). The fourth putative PKC phosphorylation site is located on the opposite side of the p-propeller domain and predicted as accessible for addition of a phosphate group (Figure 9B). Furthermore, the predicted localisation of the PKA phosphorylation sites, S355 and 1519, indicated that one the S355 is accessible for the kinase, whereas the 1519 is located on the first p-sheet in the sixth kelch-repeat within the p-propeller structure, and therefore seems inaccessible.

Galactose oxidase contains an unusual covalent thioether bond between a tyrosine (Y272) and a cysteine (C228), which acts as the catalytic motif promoting oxidation of primary alcohols to their corresponding aldehydes

(Firbank et al., 2001; Ito et al., 1994). Sequence alignments between muskelin and galactose oxidase showed that muskelin does not contain these residues, and therefore most probably lacks a similar enzymatic activity.

These results have provided information on domain architecture of muskelin, which I could use as a basis for my further evaluation of muskelin.

135 Chapter 4

Bioinformatic Analysis of the Kelch-Repeat Protein Superfamily

Introduction

With over 28 kelch-repeat proteins having been sequenced and functionally characterised in diverse organisms including viruses, plants, fungi and mammals, the full extent and complexity of the kelch-repeat family within individual species is still unknown (Adams et al., 2000). Having obtained a full characterisation of the molecular architecture of muskelin, I wished to understand the relationship of muskelin to other proteins within the kelch-repeat superfamily, i.e. is muskelin a unique form of kelch-repeat protein, or do other proteins in the same species have the same domain architecture. This analysis was based on an complete overview of this domain family, built on the availability of complete genome sequence information for H.sapiens,

D.meianogaster, A.gambiae, C.eiegans, S.cerevisiae, S.pombe and A.thaliana.

Based on these data, I wished to make a rational assignment of structural subgroups in order to open up possibilities for prediction of structure/function relationships.

In this chapter, I describe how I addressed these questions through database mining and bioinformatic sequence analyses. The results from these studies form the basis of a published paper (Prag and Adams, 2003).

136 Results

Identification of Kelch-Repeat Proteins in the Human Genome

To date, muskelin has been identified as a single gene in Homo sapiens, Mus musculus, Drosophila meianogaster, and Anopheles gambiae. In order to search for proteins containing a similar molecular architecture, a complete list of all kelch containing protein in the human genome was compiled. The whole human genome (http://www.ncbi.nlm.nih.gov/genome/seq/HsBlast.html) was searched using the 46 residue consensus sequence for the kelch-repeat

(PRSGAGVWVGGKIYVIGGFDGSQSLSSVEVYDPETNTWEKLPSMP) as defined in the Conserved Domain Database (CCD543), Pfam (Pfam1344), and

SMART (00612)). This search identified 57 different kelch-repeat containing proteins and hypothetical proteins in the human genome. It was noted that several of the known kelch-repeat containing proteins (Adams et al., 2000) were not identified using this consensus sequences. Probably because of the low number of conserved residues in the consensus sequences of the kelch motif

(see Introduction, Figure 6A) and also because of the variations in length of the inter- and intra-blade loops between the kelch repeats (Jawad and Massimo,

2002; Adams et al., 2000; Bork and Doolittle, 1994). Therefore further searches were made with the kelch-repeats of all 28 known superfamily members as sequence queries at the human genome BLAST search engine. These kelch- repeat domains identified 18 additional kelch-containing proteins in the human genome. Cross referencing all 75 entries against GenBank identified 9 of the entries as partial sequences and/or duplicate entries for the same protein or hypothetical open-reading frame, ORF, and two of the entries as non-kelch containing proteins. I also cross-referenced the search results to the domain

137 entries for kelch in the Pfam database (http://pfam.wustl.edu/hmmsearch.shtml) and SMART domain databases (http://smart.embl-heidelberg.de/). Many entries were listed in both SMART and Pfam, however a number of the proteins identified were not listed in these databases (indicated in Table 1), even though when searching these polypeptides against SMART or Pfam, kelch-repeats were clearly identified within the sequence. Furthermore, the numbers of kelch- repeat proteins assigned to H. sapiens in the Species tree or Taxbreak links of

Pfam and SMART were misleading because of inclusion of incomplete ORFs and multiple entries for the same polypeptide. Additional searches of GenBank were also carried out with single kelch motifs from the 28 known kelch-repeat proteins that were distinctly longer or shorter than the CDD kelch-motif consensus, in order to search more extensively for proteins containing more divergent repeats. Taking all these evaluations and exclusions into account, at least 72 kelch-repeat proteins encoded in the human genome were identified

(Table 1).

In order to determine the number of kelch-repeats in each protein or hypothetical protein, each entry was evaluated using CCD

(http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) together with manual identification of kelch motifs. An initial evaluation of the number of kelch-repeats identified in the individual kelch domains varied from two to seven. None of the identified kelch-containing proteins in human or mouse contained three-, four or more then seven kelch repeats. Though four-bladed p-propellers have been reported (Li J et al., 1995; Paoli et al., 1999), earlier mathematical predictions of p-propellers indicated preferences for seven-bladed over six- or eight-bladed p- propellers (Murzin AG, 1992). The indication of only two kelch-repeat proteins in

138 Table 1, Homo sapiens kelch-repeat proteins

Architecture Accession Alternative name Amino acids Chromosome Domain organisation Localisation BTB-Keich proteins: 1. BTB-(5) C reflNP 055666.11 KIAA0469 539aa 1g36.23 2. BTB-(5) C reflNP 006054.11 sarcosin 596/606aa 2q31.1 3. BTB-(5) C reflAAH 10437.21 494/558aa 2q31.1 4. BTB-(5) C reflNP 115894.11 TA-KRP, KIAA1842 575aa 3p14 5. BTB-(5) C (a) reflNP 689606.11 Sarcosynapsin 472/*621aa 3p22.1 6. BTB.(5) C reflXP 291224.11 DKFZP566C134 622/623aa 7p14.3 7. 2xBTB-{5) C reflXP 063481.11* 819aa 14q11.1 8. BTB-(5) C reflXP 171687.11 * 458aa 15g21.2 9. BTB-(6) C reflNP 055273.11 Kelch like protein X 609aa 1q24.1-g24.3 10. BTB-{6) C reflNP 006460.2! NSI-binding, Ndl 642aa 1g25.1-q31.1 11. BTB-(6) C reflNP 067646.11 C3IP1 568aa 1q32.1 12. BTB-(6) C reflNP 005888.11 IPP 584aa 1p34-p32 13. BTB-(6) C (b) reflXP 209285.11 * Actinfilin 174aa 1p36.33 14. BTB-(6) C reflBAB67814.1l KIAA1921 545/707aa 2p23.3 15. BTB-(6) C reflXP 292990.11 FLJ37812 649aa 2q37.3 16. BTB-(6) C reflXP 093813.21* 585aa 3q21.3 17. BTB-(6) C reflNP 079286.11 KIAA0795, OK/SW-CL.74 509aa 3p21.31 18. BTB-(6) C reflNP 569713.11 Kelch-like protein 6 610aa 3g27.3 19. BTB-(6) C (c) reflNP 060114.11 FU20059 412aa 3g27.3 20. BTB-(6) C reflNP 057074.2! lymphocyte activation-associated protein 755aa 4p14 Kelch-like 5 21. BTB-(6) C reflNP 009177.21 Kelch-like 2, Mayven 593aa 4g21.2 22. BTB-(6) C reflNP 065854.11 Kelch-like 8, KIAA1378 620aa 4q21.3 23. BTB-(6) C reflNP 003624.11 NrpB, PIG10, or ENC-1 589aa 5q12-q13.3 24. BTB-(6) C reflNP 059111.11 Kelch-like 3, KIAA1129 587aa 5g31 25. BTB-(6) C reflCAC 16284.11 634aa 6p12.1 26. BTB-(6) C reflNP 443136.11 KIAA1900 620aa 6g16.3 27. BTB-(6) C reflNP 061334.21 SBBI26 564aa 7q15.3 28. BTB-(6) C reflNP 055682.11 KIAA0711 623aa 8p23.2 29. BTB-(6) C reflXP 294387.11 * 649aa 8q24.13 30. BTB-(6) C reflNP 005884.11 Calicin 588aa 9p13.1 31. BTB-(6) C reflNP 061335.11 KIAA1354 617aa 9p22 32. BTB-(6) C reflNP 060565.21 FLJ10450 518aa 11p11.12 33. BTB-(6) C reflNP 689646.11 FU30685 608aa 11q22.3 34. BTB-(6) C reflXP 044836.11 KIAA1340 505aa 12g11.23 35. BTB-(6) C reflNP 115514.21 DKFZp434E2318 684aa 13g13.3 36. 2xBTB-(6) C reflNP 065917.11 Kelch-like protein 1, KIAA1490 748aa 13q21 37, BTB-(6) C (c) reflXP 058629.21 * XP 058629 309aa 14g21.1 38. BTB-(6) C reflNP 071925.21 FLJ12587 589aa 15q25.2 39. BTB-(6) C reflNP 079007.11 Chromosome 16 ORF44 616aa 16q23.3 40. BTB-(6) C reflNP 071324.11 gigaxonin aka GAN-1 597aa 16g24.1 41. BTB-(6) C reflNP 689680.11 Kelch-like 10 614aa 17g21.2 42. BTB-(6) C reflNP 060613.11 FLJ10572 708aa 17g21.2 43. BTB-(6) C reflXP 035405.51 * KIAA1384 628aa 18g12.1 44. BTB-(6) C reflNP 060786.11 FLJ11078 615aa 19p13.11 45. BTB-(6) C reflNP 036421.21 Keap 624aa 19p13.2 46. BTB-(6) C reflNP 116164.21 FLJ14360 634aa 22q11.21 47. BTB-(6) C reflNP 476503.11 kelch-like 4 718/720aa Xq21.3 48. BTB-(6) C reflNP 695002.11 FLJ34960 644aa Xp22.13 49. BTB-(6) C reflXP 040383.31 KIAA1677 604aa Xg22.1-g21 50. BTB-(6) C reflNP 277030.11 KIAA1309 604aa Xq23-q24 51. (6)-2xBTB N (c) reflNP 006758.11 LZTR 552aa 22g11.21 Discoidin-Keich: 52. DD-(6) N reflNP 037387.11* Muskelin 735aa 7g32 F-box-Kelch: 53. Fbox-(5) N reflXP 048774.21 KIAA1332 717aa 1p36.23-p36.11 LCM-Kelch 54. LCM-(?) ? reflNP 055608.21 * p21WAF1/CIP1 promoter-interacting protein 686aa 15g15.1 Kelch and PHD 55. (6)-- N reflNP 000527.11* RAG-2 527aa 11p13 Kelch and Muitidomain: 56. (5)-CLECT C reflXP 049349.81 KIAA0534 lOllaa 10q26 57. (6)-FNIII N reflNP 037452.11 host cell factor 2 792aa 12q23.3 58. CUB-(6) ? reflCAC 12966.11 510aa 10q26.11 59. CUB-(6) reflXP 209118.11 MEGF8 1018/1737aa 19q12 60. CUB-(6)-CLECT N reflNP 647537.11* Attractin, mahogany 1198-1429aa 20p13 61. (6)-FNIII N reflNP 005325.11 HCF-1 VP16-binding 1938aa Xq28 Kelch and unique 62. (6)— ? reflNP 060036.21 DKFZp434G0522 520aa 16q24.3 63. - ( 6 ) C reflNP 689588.11 FLJ38753 559aa 1q36.13 64. - ( 5 ) C reflNP 612442.11 BC009980 495aa 22q13.33 Propeller only 65. -(7)- N reflNP 060673.11 FLJ10748 350aa 1q32.1 66. -(7)- N reflNP 775817.11 * 354aa 3p21.31 67. -(2)- ? reflNP 689579.21 FLJ35949 263aa 1q23.1 68. -(6)- C reflXP 045954.21 KIAA0265 442aa 7q32.2 69. -(6)- N reflNP 005824.1! Rab9 effector p40 372aa 9q34.11 70. -(6)- ? reflNP 476502.11 kelch domain containing 3, PEAS, Sortilin-1 389aa 6p21.1 71. -(6)- C reflNP 751943.11 kelch domain containing 1 406aa 14q21.3 72. -(6 )-C reflNP 055130.1! kelch domain containing 2 406aa 14g21.3 Domain distribution based on: a) 621 bp-variant enlisted in BLASTP b) Rattus norvegicus orthologue c) Mus musculus orthologue

139 the h uman g enome w ith s even k elch-repeats w as t hereto re i ntriguing. S o far the only examples of seven-bladed kelch domains in any other species are galactose oxidase and the muskelin-like protein in Encephalitozoon cunicuii (Ito et al., 1994, and Chapter 3, Figure 7D).

Furthermore, as the crystallographic structure of fungal galactose oxidase has established that the kelch-repeats assemble into a larger p-barrel composed of several blades, it appeared unlikely that the entry indicating two-bladed kelch repeat protein (NP_689579.2, #67 in Table 1) corresponded to a complete polypeptide. Also, no Rattus norvegicus or Mus musculus orthologues could be detected and therefore this sequence was excluded from further characterisation. Four other kelch-repeat proteins identified in the human genome were found to be incomplete sequences (Actinfilin, #13; FLJ20059,

#19; XP_058629, #37; LZTR, #51), but orthologues were identified in Rattus norvegicus or Mus musculus and these orthologues were used in further analysis. For one of the entries (p21WAF1/CIP1, #54) the number of kelch repeat could not be determined because the h igh sequence divergence from the consensus sequence. Taking these data into account, the family of kelch- repeat proteins in humans consists only of five, six and seven bladed p- propellers, where 80.3% (57/71) are six-bladed, 15.5% (11/71) are five-bladed, and 2.8% (2/71) are seven-bladed.

The p-propeller structure is closed and stabilised by either C- or N-terminal closures (see Chapter 3, Figure 7 for further detail). Therefore, I examined the primary sequence of each kelch-repeat protein by secondary structure predictions or manual analysis of the sequences. From these evaluations I predicted that 84.8% (56/66) of the kelch-repeat proteins in the human genomes are closed by a C-terminal p-strand (indicated in Table 1).

140 Chromosomal Localisation of Human Kelch-Repeat Proteins

Evaluation of the chromosome distribution of the identified human kelch-repeat proteins showed that these proteins are dispersed throughout the genome, being located on all except chromosome 21 and the Y chromosome (Table 1, column 5). Several instances of genes in physical proximity were noticed, for example NP 006460 and NP_067646 at 1q31.3 and

NP_569713 and NP_060114 at 3q27.3 (Table 1 #2 and #3, and #18 and #19).

However, in the majority of cases these did not correspond to the most closely- related protein sequences as would be expected for recently-duplicated genes.

One exception was NP 055130 and NP-751943 which are both located at

14q21.3 and which were the most closely-related to each other (46 % identity.

Table 1, #71 and #72). Overall, there was no evidence for physical grouping of kelch-protein encoding sequences within the human genome.

Domain Architecture of Human Keich-Repeat Proteins

The previous described kelch-repeat proteins from various organisms were grouped into five structural categories according to the positioning of the kelch repeats within the polypeptide sequence and the presence of other conserved domains (Adams et al., 2000). In order to evaluate the complexity of domain architecture within a single organism, each human kelch-repeat protein was analysed by searching CCD and SMART and then subgrouped according to the presence of additional domains.

71.8% (51/71) of the human kelch proteins contained a Broad-Complex,

Tramtrack, and Bric-a-Brac/ Poxvirus and Zincfinger (BTB/POZ) domain (Table

1, column 1, #1-51). The BTB/POZ domain was first identified as a dimérisation domain of transcription factors and also mediates homodimerisation of kelch

141 and other BTB/kelch domain proteins (Albagli et al., 1995; Zipper and Mulcahy,

2002). In all but one of the proteins, the BTB domain was amino-terminal to the kelch domain. This hypothetical protein, LZTR-1, contained two tandem

BTB/POZ domains. Also, for 98% of the BTB-kelch proteins, the kelch domain is closed by the C-terminal p-strand.

Five kelch-repeat proteins (8%) were identified as very large, multidomain proteins (Table 1, #56-61). Two of these contained an additional CUB domain

(SM00042, PF00431). The CUB domain is found in extracellular proteins involved in developmental processes, including the von Willebrands factor- cleaving protease, ADAMTS-13 (Zheng et al., 2001), and spermadhesins

(reviewed by Bork and Beckmann, 1993). The kelch-repeat proteins

Attractin/mahogany, which are splice variants from a single gene (Duke-Cohan et al., 1998; Gunn et al., 1999; Nagle et al., 1999) and the multiple EGF-like motif containing protein MEGF8 are all over 1000 amino acids long and contained a CUB-domain, six kelch repeats, a C-type lectin domain and EGF- like domains. Diverse functions have been attributed to attractin and mahogany that include a role in T-cell interactions (Attractin, the secreted splice variant;

Duke-Cohan et al., 1998) and obesity regulation in mice (mahogany, the transmembrane splice variant; Gunn et al., 1999; Nagle et al., 1999). MEGF8 was identified in a BLAST search using the EFG-motif as query (Nakayama et al., 1998), but the function is still unknown.

Host cell factor-1 and -2 (HCF-1 and HCF-2) are also large proteins, which contain six amino-terminal kelch-repeats, two fibronectin type III domains, and, for HCF-1, a series of unique HCF repeats. These proteins function as transcriptional coactivators of herpes simplex virus immediate early gene

142 expression (Wilson et al., 1997; Johnson et al., 1999). Furthermore, three hypothetical kelch-repeat proteins containing unrelated unique sequences that did not correspond to recognised structural domains, positioned either amino- or carboxy-terminal to the kelch repeats were identified (Table 1, #62-64).

Rab9 effector p40 (Diaz et al., 1997) and six other kelch-repeat proteins were short polypeptides, ranging from 350-442 amino acids in length, that consisted almost entirely of kelch repeats (Table 1, #65-72). Five of these proteins or hypothetical proteins, including p40, contained six sequence repeats and thus are predicted to form six-bladed p-propellers. Two hypothetical proteins,

NP_060673 and XP_114323, consisted of putative seven-bladed p-propellers.

Four (5.6 %) kelch-repeat proteins contained other conserved domains (Table

1, #52-55). A single kelch-protein, XP_048774 (#53), contained a F-box domain

(CDD9197, Pfam 00646). The F-box is a domain of about forty residues, first identified in cyclin A, that interacts with Skpl to anchor proteins to the ubiquitin- ligase assembly for ubiquitination and targeting to proteosome-mediated degradation (Craig et al., 1999). The combination of F-box and kelch-repeat domains has previously been described in A. thaliana, where at least 67 F- box/kelch-repeat proteins and hypothetical proteins are encoded in the genome

(Andrade et al., 2001b; Kuroda et al., 2002). Several of these function in light- dependent regulation of the circadian clock, but the function of many others are obscure (Andrade et al., 2001; Kuroda et al., 2002; Somers, 2001). To my knowledge, this is the first recognition of a F-box/kelch protein in an animal genome. One predicted kelch-repeat protein, NP_055608 (#54), contained a leucine carboxyl methyltansferase (LCM) domain (CDD9631, Pfam 04072) with

143 34% identity to the LCM domain of protein phosphatase 2 leucine carboxyl methyltransferase (De Baere et al., 1999). Recombination-activating gene-2

(RAG-2, #55) contains a putative PHD finger domain at the carboxy-terminus

(Callebaut and Mornon, 1998). The PHD domain is a zinc-binding domain related to the RING domain, and is found mainly in multicomponent complexes involved in chromatin-mediated transcription activity (review by Aasland et al.,

1995) and members of a novel class of E3 ubiquitin ligases (review by Coscoy and Ganem, 2003).

This subcategorisation showed that muskelin is unique in its combination of kelch-repeats and discoidin domain in the H. sapiens genome (Table 1).

Structural Relationships of Human BTB/Kelch Proteins

The large number of BTB/kelch proteins found in the human genome encouraged me to study this group in more detail, in order to identify related subgroups that might also represent functional subsets. Of the 43 sequences that contained predicted six-bladed p-propellers and BTB/POZ domain, the three non-full length hypothetical proteins identified as Rattus norvegicus and

Mus muscuius orthologues (Actinfilin (#13), FLJ20059 (#19), XP 058629 (#37), and the two hypothetical proteins containing two BTB/POZ domains (KIAA1490

(#36), LZTR (#51)) were excluded. The remaining 38 sequences were aligned according to sequence similarity in ClustalW and viewed as neighbourhood- joining trees. Alignment of the full-length sequences revealed three subgroups of approximately equal size, which I termed subgroups 1 to 3 (Figure 1A). The similar sequence alignment and analysis was performed with the kelch domains alone. This analysis revealed that the same grouping was apparent for subgroup 1 as well as a substantial proportion of subgroup 2, which I therefore

144 A. Full-length BTB/Kelch B. Kelch alone C.BTB alone

NP_065917 NP.065917 NP_476503 NP_115514 ' NP_057074 NP_057074 NP_476503 NP_079286 ' NP_079286 NP_059111 NP_277030 ' NP_065854 NP_009177 * NP_061335 NP_055273 NP_067646 2A • NP_036421 NP_055273 NP_079007 NP_061334 NP_065854 N P_059U 1 NP_036421 NP_009177 NP_005888 * NP_067646 NP_689680 NP_005888 NP_061334 NP_115514 NP_006460 * NP_060565 NP_071324 NP_443136 NP_277030 NP_079007 NP_061335 XP_044836 NP_060786 XP_040383 NP_079007 NP_695002 XP_035405 NP_060786 CAC16284 NP_277030 XP_055299 NP_061335 NP_443136 NP_116164 XP_040383 NP_055682 NP_116164 XP_035405 XP_044836 CAC16284 NP_055682 NP_071324 XP_093813 g NP_689680 NP_115514 NP„060613 NP_060565 XP_093813 NP_060613 NP_006460 NP_005884 * NP_071925 NP_071925 NP_003624 NP_003624 * XP_294387 XP_294387 XP .292990 NP_569713 NP_569713 NP_689646 NP_6B9646 XP_292990 NP_005884

Q Subgroup 1 Figure 1. Relationships between human BTB/kelch proteins. The amino acid sequences of the 38 human BTB/kelch proteins I I Subgroup 2 predicted to contain six-biaded b-propellers were aligned in CLUSTALW and are presented in Phylip Drawgram for A ) fuil- I I Subgroup 3 length sequences; B) kelch-repeat domains only; C) BTB domains only, with shade codes for the three identified subgroups as indicated, where subgroup 1 in red, subgroup 2 in yellow, and subgroup 3 in green. The bracket in panel B indicates the robust grouping of a set of sequences from subgroup 2, termed subgroup 2A. The asterisks in panel A indicate the known actin-binding proteins, Mayven (NP_009177), IPP (NP_005888), Ndl (NP_006460), Caiicin (NP_005884), and ENC-1 U1 (NP_003624). termed subgroup 2A (Figure 1B). An analysis of the 38 BTB domains only, showed that subgroups 1 and 2 were maintained for the majority of sequences

(Figure 1C). I therefore focused on these related kelch-repeat sequences in subgroups 1 and 2A, for a closer analysis of the kelch-repeat domains.

In order to obtain consensus sequences for subgroup 1 and 2A, a ClustalW multiple sequence alignment of the kelch-repeat domains from each subgroup was performed. This demonstrated that the derived consensus sequences from the two subgroups had distinctive features in terms of the most conserved residues (Figure 2A and B). Subsequently, I re-aligned the consensus sequences for each set of six kelch motifs to derive a mean 50% identity consensus sequence for subgroup 1 and subgroup 2A (Figure 3A). These motifs were mapped against the known blade structure of galactose oxidase

(Figure 3B). The consensus motifs included both amino acids of importance to the fold (located within the beta-strands) and amino acids within loops, which might have a functional role in binding interactions. Of note, the average length of the motif was shorter in group 1 than group 2B. The subgroup 2A consensus is predicted to contain a longer loop between beta sheets 2 and 3 (the 2-3 loop.

Figure 3B). In the context of an intact p-propeller domain, the 1-2 and 3-4 loops protrude above one face of the p-sheets and the 2-3 loop protrudes from the opposite face (Figure 3A and B). The 4-1 loop lies either on the same face as the 2-3 loop, or may be positioned more closely to the beta-sheet core of the propeller (Figure 3B). Furthermore, there were distinctions in the number and positioning of charged residues within the loop regions in the two consensus motifs (Figure 3A). The conservation of these charged residues was most pronounced in subgroup 1, where these positions were conserved in the motif

146 NP 059111 KgQTPVSLPKVMIVgEg- -QAPKAIRSj EFEEDRgjDQIgjELP - SR| CRAgKVFMAgHV] NP_009177 iRU?TPMNLPKLMVVn?B- -QAPKAIRSj gFKEERgHQVgELP-SR CRAgMVYMAgLV NP_476503 KQ0 KSTVG- -ALYA^^MDAMKG-TTTI Sl r t n s H l h i g t m n -g r IDNKI NP 057074 JKSTVG- -TLFaJ^MDSTKG-ATSI gL R TNmHt P VgjN MN - G R JLDDKl NP_065854 IKHTAG- -VLFCraaêlRGGSGDPFRSI SSINKNSgFFGPEMN-SR RHV ISVEgKV] NP_055273 iKPIRCGEVLFAPZeB-WCSGDAISSi gQTNEgRMVgSMS - Kr| CGV ISQLD NP 079286 jCCTSIAGLIYAg^LNSAGDSLNVl aiANCfflERCRPMT-T SRV QvNgL: NP_067646 PRTQARLGANEVLLvjSfflFGSQQSPIDVl KgggKTQEgjSFLPS IT - RKI RYVASVSLHDR NP_036421 PTQVMgCRAPKVGRLIYTA^YFRQS - -LSYÜ AgNgSDGTmLRLgDLQ-Vpl SGLAGCfflVGgL RNNSPDGNTD NP_005888 ..... RKYLYAVGGYTRLQÜB r WSDSRALSCI r f B t f s q y S t t v s s l h -q a SGLgQTgLGgMVl EKDS ---- MI NP 689680 --FQNgLTRPRLPYAILPAIgGWSGGSPTNAg AggARADRgVNVTCEEESP AYHgAAYLKgYVi FfflSV---- DY NP_006460 - -YARSGLGTAEMNGKLIAAgGYNREECLRT|_ CQNgHTDHgSFLgPMR-TP ARFQMAjjjLMgQ S ^ H S DD NP_061334 ..... a^RKKHDYRIALFGg ...... SQPQSC:RYFNgKDYSgTDIRCPF-EKl RDAACVFWDNV^ SQLFP ..... NP 071324 PRGYSECIVTVGGEERVSRKPTAAMRCMC:PLgggNRQLglELQPLS-MPINHftKLSAEfêlFfflFVF QjijENK QT consensus --rtkpr l--vm--vgg ------i-sv^e-ydp--n-W--ia-l r R gv-vm- g -lyav G G - d g ------

NP 059111 VRTHDV GVKDQSIT - S iQSSjQER STL: NDLL STGfJAS SYKTNEE NP 009177 VRTWDS kdoHt-sî^nBrdr S T L r.. - - STGHSS AgNIKSNE pSNgl NP 476503 CFN -- gkiHt-vmpRiJsth H G L Hg - - WSY NP 057074 CgN - - - KTKtHs-VMPgJSTH HGL WSY QARQ NP_065854 MFfg— gLTNKgM-MKgSgNTK R G I A L A S -DNTCFigD lESDQ NP 055273 KTNoBsSDgAHTSTC TSVl VSCgTjl KENK NP 079286 _ _ETDtHt - RpGSSjNSK SAMHTVl NSSgfsS ETDK NP_067646 CLgYTADEDGVgY-S|; GLAgATTgGDM: SRRHTS NIDQ ISMLGDgogA NP_036421 SSALDC»iN-- t n q H s -p c E NRiggGgiDBH: ciHHms HLggPgLgR NP 005888 F D C T g C Q a SVTKQjJ SSiNHP CGLpRcflCYi a e i g S t DENK lEVflGNgAVS NP_689680 FNSjJjKF HSR CYV^TftjGNF! ETNQ LIgPgHEQ NP_006460 [Js c g Q f — s n i d d EJi -p E p e l r t n CNAggcAgNgK^ KNCDVF VTKL iTSCgPLNIR NP 061334 IKRHDCWN VVKDSgjYSKLGgP-TP DSLAACAAEHKg J TRTES TKPSgLQQ NP 071324 [JSSGQkJJî] gDANTgT-ALPSgNEA HNFglVEIDgML® lYSKT KQPDLTMV consensus 1 -tve-yd pv Wt--vapm-t-R--lgvavl-g-iYavC -g-t-lnsverydp-t--Ws-va-m-t-r

NP 059111 ggGHVEgK !Y[^A--1 STggQMNgAÎ sgAggGvgsGo ...... PLVRK NP 009177 BK g Q v g S l U V A - - 1 STgRcgNAT? SgAgWGVaNNL HüT)...... PLVRK NP_476503 fflvAjjNNK GgV^ATYNGF .PASNH-CSRjJSD NP 057074 BgAggSgK GgvfflTTWNGL HgAPASNL-TSRgSD NP 065854 EjSVAaVNH aHnM skIUhgc f3 d------ns pffls- NP 055273 ^ A g g G g F KHLïJcAVYQDMIj ..... TTEMS- NP 079286 l^THFEgR JG SK M l Y g g ...... SGFMS - NP_067646 S l v B a s B v JNDH f ™ ...... TAHgS- NP_036421 ^ A g B N R L Jhnc YÎ3 3 ...... QDQgN- NP 005888 gCCEHQ nSNE-- JNDC WNET ..... QDAgH- NP_689680 SATT FNg-- ilAYGEHV^____ F g g ...... ANR 0 R- NP_006460 ^ C E ^ES-- VVj3NGK[l]FVC^ F g g ...... SHAIS NP 061334 gWVEANl LHNNV JJLVFVKDKIF[^^ Q N g ------LGGLD NP 071324 CYAAMKKKI jjgjGSYG- - FgAVACGVAMEgQVF __ VRSREDAQGSEMVTC consensus gv-vl-g-lY avGG - dg -- .-svE-ydp-tn-Wt-vapm-trR-g-gvg-ln--lyavGGIfdg------Is -

NP 059111 jDgiNMCgRN...... A g V C A g N g L -DgrascNjj K TLLPTNMS-- NP_009177 D g N M C n R N ...... A g V C A Q N g L -DifflsCN j JKIJ TVVSSCMS-- NP 476503 Sp l s v p H d a ..... M a v c p l g d k - Y ^ H T Y I QRNEg KESMQELLQN NP 057074 iKTDMgTAgjSSjSISgDA ..... [ ^ C L L G D K -yhB q a y I OgNEg TQVAPLCLGR NP 065854 IrSHkSdySTjALTTpRgG...... ^lATgMgK - H ^ N A Y I v l n r J ELVGSVSHC- NP 055273 1 s pJJvAgjT s rRs g ..... M l avHnHq -FgKTTYI DANTg RLYGGMNY-- NP 079286 fcLIVPgHTRgSR..... gSLVASCgR - yiBqsn I EQQcg TFHAPMACHE NP_067646 gTTgrSgTTPgCY ..... ggATVLRgR AI A» - y T O n s l I EVVTSMGTQR NP 036421 CTFBJPgKHRRsA ..... LglTVHQgR - Y g g H T F ' SEVTRHTSG- NP 005888 gVEjEsSiKVPyAG ..... M C V V A g N g L RSSSHOFg TEIGNMITS- NP_689680 jRTIPT g F N P g S N ..... FglEVgDDL FNGFTTTFN- YDAHDHSIY- NP_006460 SKMMGNjjjTSP:gT S PQS N ..... AglATgGNT I----- SPYTKIFQF- NP 061334 jKMQSPgPWKGVT..... B ^ A A g G S I HILEgMTEgl VANSKVRAFP NP 071324 lYLNDQNLCIPASSSFVYMAVPIGAS _ gREFKRSgC HHTKPLLPS- con sensus -Ydp-tn-W--va-m r-a ----- vgv--v-g-lyvvgg-ydg--yl ------sve-ydp-td-W t - 1 — 1 ----

Figure 2A. Multiple sequence alignment of kelch domain subgroups 1 of human BTB/kelch proteins. The kelch-repeat domains of subgroup 1 BTB/kelch proteins were aligned in CLUSTALW and the consensus sequence are derived at 50 % identity threshold level. Alignments of the repeats are presented in Boxshade: black shading indicates identical amino acids, grey shading indicates similar amino acids and white background indicates unrelated amino acids.

147 NP_277030 QSD ijDTTH VLRQQ ...... ■ - -LVVgKELRMYDEKAH ...... EgjKS NP_061335 DSTH VLRQQ ...... LVVgKELRMYDERAQ ...... e H r S NP_44 313 6 FQSDT KKREVCKVKEL ...... RYFNPVDQENAglAAIA ...... N^SE NP_079007 TNQERLFV EVSERC ...... LELg]DDTCYgDAKSE...... QÿJjVK NP 060786 DVPS TPYTDS ...... DRSVSSKVYQLPEPGAR ...... HFRE CAC16284 GGCRV RPGLTE ...... KSLgRDILYRDPENG ...... - ...... igSK NP 116164 DFOCVPiGFMf IHSTPS ...... TVLgDQAKYfjNPLLG...... e H k H XP 040383 AKPQTTVFR MIG ...... HSMVNSKIL iJl KKPR ...... vHwE NP 695002 - -EQQS P Q T R ILLVg RRAREVVIEEVAAPQRAARGQVAAPEPEEEEEEgEEEEEEEEWELTQNVVAFDVYNHRgRS XP 035405 ;NKKM[jLLVg_ ...... LPPGPDRLPSNgvQYYDDE ------KKTgjKI cons ensu s rt-iRs Iv-lgG ------vs - e --- I d ------wk -

NP 277030 gAPgDgg-RYQ QSNYDTKgKTgPjDTi gTF SALKGYffl NP 0 613 3 5 g A P ^ 3 3 -RYQ qsnydtk SktTOdtiJîf t f SALKGHffl NP 443136 aAPggVG-RSH EVEHASGRTCgWRTAC ______KNC GAMEEYffl NP 079007 EnPl!Hf?lR-RSHi SFSRDNGgDAgSNLLYggggCKQgi j:: a s i e d m B NP 060786 gMEgiEVG-CSHT QHLQYRSgEGggDACYfT^HLgiRgLRLQ^QgS IIQ NVLCGMV CAC16284 ESjEgijEUC-SFNQ ED ------QNDARNQAKH[^HNFCy[j|J|»jjFljlTlJISNFCI%?!Rlp"T" I H L ( g 2NQK[^JTH SVFNGLV NP 116164 mASL>%jRMSNOG VYLI ...... DNNVQgFRgjE S RCwfJWTlHraRS FffllQgLQQEHADLSVCVVGRYI XP 040383 |egpqv3 -lrpdHl! VN! VFLI ...... EELGPDGEFHgSSKgF^^g^QgSgLgMgDgSVPgSEgAVGVIGKFI NP 695002 jQLBTB-LLGBsBCTA[* L^ESPSGSASSPLADDSRVVTAQgH^MgFHAg TEVPAgRRAgAHgwCGAVGERg XP 035405 wliTHYN-SAHRW!VEVE L ^ E D Q ...... WNPNgKHSTNFQSggggFgSg igLPPgQQRljASgYACRLDKHg consensus It-mpap hcvavignFlyvvGGa ------gk-av - - vfRyDPryn - Wmqvasmn ekr - - 1 NP 277030 -AAgE yTgjPjAKMSEPH JYG-! ITHDT ...... FQj3 E NP 0613 3 5 S-AAgE QsQRakmseph TgYG-HLMg ITHDT ...... FQNE NP 443136 ELRQVj Jtfqqsfd^sl Y^AD-gLLV^ T-NTAQ OQNR NP 079007 E N -Ha SsgQAG T I Y K - D F V r HDYQIGP QR^N NP 060786 -RAHS gGgACS RTW s g -Hr l E lYGISVE D K 0 A CAC16284 -AE0 S Sqpktp EVARCC D-MRVLVTgJ Y-IANA ---- n S R S V C NP 116164 D-YHNDg SaQQap TLE-0KMQnTc! R-RGED ---- gL^ E T H XP 040383 T - RDETFYl DITNDK SefQdpypvnk EBTMLN-NKLFQTg ITSSST ..... SgQVCVF NP 695002 iLGAGGEVgAi DLRRDR JTAAGAgpgALH iGDRHVVr KAGRGEGGASSLROgYVL XP 035405 E-T g Y 0 S|__ lNLErariEiflR?giSs"pOPLAA hn-Skii VHN ...... G E QVPWgY cons ensus yavgG r n g-l-svE '-prtn-wtyva-l-r--ygH agav grmy i sgGi ...... -y-k-l-cydp

NP 277030 CgjCTVGER INHFRGT - -SDYDjgggSCgYl NP 0 613 3 5 CgCTVGDK In h f r g t - -SDYDjWpsC@Y| TLL* NP 443136 ISRS g A AVQRK INDLDYN- - ndrilQrhids IDT RCNFNjj NP 079007 EERR SgCSLGDGIgS ISDDNIES- M E R F iîPhlG 3QCN RVgP-iSnHANR NP 060786 ApJJvGAGGRIBA -R M D H V - - DRCF SVSP-MRAgOg CAC16284 q e l p n l s Q p CAVTLSDRVt ISQLGPR - - GERV ATGS EYAQp-govgvg NP 116164 HTL 0 DGPVR GgjATLLNKr -S N N D A - - GYRR _ CTSG ssvcp-Bpahhg XP 040383 EQRTRRTQVVTNCjJ ENQSK KglSYNGKggFg CVILRAS FESQGCPST ILgS-MPIgRg NP 695002 HgAVLRGAVFAFLg -RYEPFSEI RLR-PgPYDRF XP 035405 AR|jQD|^N TLAVMNDRnPiA? jNHLKGF - - - SHLÎÎT2ML ILQTPlQEgRg consensus ------td-W - -kapm 1 1 -RgyH -in - tv- e r lyvi gG - - 1 ------d - -d v l -V p-tdQWt-ia--ll-gqs

NP 277030 -WN R--CMVE] NP 061335 -WNf R--CMVE] NP 443136 HNGF IWTi eplaciqvldvsreg[ 3eE FASNGIAAC NP 079007 T- - AFSKTjj K A lA- - NP 06 07 8 6 NWRLI N VTGIERVQNTDTÜ ESFAGIACA CAC16284 SAfflHGRA WNEGEKK YKKCIgCFSrag^gjjTEDDEgnEATVGV-- NP 116164 DNR R^HNRGS RTGYgHiggMgggCgEEGPQgDNS IS ..... XP 040383 LCYNGHY SDSILTFWÏIDENk H k ED-EYQRMPCKL -- NP 695002 ETALLL 5 LKWRDSRO-VPTRNmVGgTLDLgRQEDIGCALPWAWS ---- XP 035405 _ DDSQQL |WS M G --- AYKSSTICgCgg^GTQTELEGDVAEPLA ---- consensus d-GvavIe-kiyvvGGys n ----- s--vq-ydpekdew-e--dlp ------i ---

Figure 2B. Multiple sequence alignment of kelch domain subgroups 2A of human BTB/kelch proteins. The kelch-repeat domains of subgroup 2A BTB/kelch proteins were aligned in CLUSTALW and the consensus sequence are derived at 50 % identity threshold level. Alignments of the repeats are presented in Boxshade: black shading indicates identical amino acids, grey shading indicates similar amino acids and white background indicates unrelated amino acids.

148 S u b g r o u p 1 I. . R . . . GiV . . . .1. .IÜYAVIGG . DG . . . L . .fVE . YIDP . .

pi P2 P3 P4

Subgroup 2A I H .1 . AVL .1. .ILY . jGGl't J. 7 . . . . .Iv . . ^DP

B 1-2 loop 3-4 loop

4-1 loop

2-3 loop

F igure 3. Relationships of the consensus kelch-motifs to p-propeller blade structure. A ) Alignment of the consensus kelch motifs derived from a 50% identity threshold from the six kelch-repeat consensus sequences from BTB/kelch subgroups 1 and 2A. The derived consensus sequences were assigned to the p- blade structure determined by the crystal structure of galactose oxidase (IGOF), where the position of each b-strand is indicated. B) Side view of single p-propeller blade structure in galactose oxidase (IGOF). The p-strands and the loops are color-coded as in panel A.

149 to t he 7 0% i dentity t hreshold I aval ( data not s hown). I t harafora suggest t hat these distinctions in loop characteristics are indicative of different specificity in protein-protein interactions for the p-propellers o f subgroups 1 and 2 A. With regard to known protein-binding properties, it was observed that the BTB/kelch proteins that bind to actin through their kelch-repeats were split between subgroups 1 and 3 (asterisk indicated on Figure 1A); thus this function did not have a simple relationship to the hereby-described sequence characteristics.

The Kelch-Repeat Protein Family in Invertebrates

The next question was whether this distribution of kelch-repeat proteins was conserved between vertebrates and invertebrates. So a BLAST-analysis, using similar procedure as for the H.sapiens genome, was performed on the genomes of Drosophila meianogaster, Anopheles gambiae, and Caenorhabditis eiegans.

Both the Drosophila meianogaster and the Anopheles gambiae genome contain

18 kelch-repeat proteins (Table 2). Seventeen of these kelch-repeat proteins were orthologues conserved between the two species and one was unique to each species. Thus, an Actinfilin homologue was identified in A. gambiae but not in D. meianogaster and the D. meianogaster genome contained a homologue of NP_116164, which was not present in A. gambiae (Table 2). Only three of the kelch-repeat proteins has previously been functionally characterised in D. meianogaster, namely Kelch (Robinson and Cooley, 1997; Xue and

Cooley, 1993), Muskelin (Adams, 2002) and the D. meianogaster host cell factor (Izeta et al., 2003). Two others, namely diablo and scruin-like at the midline (SLIM), have previously been recognised as kelch-repeat proteins

(Adams MD et al., 2000), though function is still unknown. The average conservation of the seventeen orthologues was 58.9 %, which is slightly above

150 Table 2 Drosophila melanogaster, and Anopheles gambiae kelch-repeat proteins

Architecture GenBank Accession % Identity between Alt. name Amino acids Closest human Domain organisation Dm Ag Dm and Ag (Dm) Dm Ag sequence BTB-Keich proteins: 1. BTB-(6) NP 724095 EAA12712 57 Kelch 689aa/1477aa 1129 Mayven 2. BTB-(6) NP 524989 EAA05692 94 Diablo 623 583 NP 055273 3. BTB-(6) NP 650594 EAA08775 68 744 651 Keapi 4. BTB-(6) NP 727331 EAA12172 67 653 698 KHLH5 5. BTB-(6) NP 650143 EAA05226 70 575 580 NP 079268 6. BTB-(6) NP 609616 EAA14037 66 627 618 IPP 7. BTB-(6) NP 611377 EAA13860 78 620 614 NP 055273 8. BTB-(6) AFF45342 EAA03884 43 474 669 KHLH10 9. BTB-(6) NP 608397 616 NP 060786 10. BTB-(6) EAA05952 873 Ndl 11. 2x BTB-(6) NP 569869 EAA09185 66 975 759 LZTR Discoidin-LisH/CTLH-Kelch: 12. DD-LisH/CTLH-(6) NP 610801 EAA09711 53 Muskelin 853 797 Muskelin F-box-Keich: 13. Fbox-(5) NP 611647 EAA00139 41 667 580 XP 048774 Kelch and Multidomain: 14. CUB-(6)-EGF NP 651571 EAA05420 64 Mahogany-like 1284 1203 Attratctin 15. CUB-(6)-EGF NP 609180 EAA12463 51 2898 2811 MEGF8 16. (6)-FNIII NP 726567 EAA13243 41 Host Cell Factor 1500 1396 HCF Kelch and unique 17. (6)-u NP 648590 EAA10380 54 509 550 NP 060036 Propeller-only 18. (6) NP 725794 EAA10497 44 SUM 627 373 XP 045954 19. (6) NP 572494 EAA05915 56 403 383 PEAS estimated average of 56% identity between orthologues in these two species

(Zdobnov et al., 2002).

95% of the 19 kelch repeat proteins identified in D. melanogaster and A. gambiae genome consisted of six kelch repeats. The single five-bladed kelch- repeat protein identified in D. melanogaster and A. gambiae corresponded to an orthologue of the single F-box/kelch repeat protein identified in the human genome (XP 048774, #13). 56% of the kelch-repeat proteins in the D. melanogaster and A. gambiae genomes are BTB/Kelch proteins (Table 2, #1-

11). Two CUB/kelch repeat proteins in D. melanogaster and A. gambiae were identified as orthologues to the human Attractin/Mahogany and the MEGF8

(Table 2, #14-16). Both D. melanogaster and A, gambiae genomes contain two members of the propeller-only subcategory o f kelch-repeat proteins (Table 2,

#18-19; Dm: NP_725794, NP_572494; Ag: EAA10497, EAA05915), one kelch with unique (Table 2, #17; Dm: NP_648590, Ag: EAA10380). The Drosophila and Anopheles genomes contain only a single member of the discoidin-kelch subgroup, which corresponds to the orthologues of human muskelin (Table 2,

#12). Thus, all of the 19 kelch-repeat proteins identified had homologues in the human genome and the BTB/kelch domain architecture was the most prevalent

(Table 2, column 8).

As no muskelin or muskelin-like orthologues was previously identified in

Caenorhabditis elegans, I performed similar searches on the genome of

Caenorhabditis elegans (The C. elegans Sequencing Consortium, 1998) for kelch-repeat proteins in order to establish whether this species contains any hypothetical protein(s) with similar molecular architecture(s). The C.elegans genome contains 16 kelch-repeat proteins. Of these proteins, only kel-1, spe-26

152 Table 3. Identified Kelch-repeat proteins in C.elegans

Architecture GenBank Alternative Amino acids Closest human Domain organisation Accession name sequence BTB-Keich proteins: 1. BTB-(6) NP 496496 Kell 618 KLHL3 2. BTB-(6) NP 510109 817 Ndl 3. BTB-(6) NP 503729 836 KLHL8 4. BTB-(6) NP 499784 591 NP 079286 5. BTB-(6) NP 499241 518 KLHL10 6. BTB-(6) NP 498380 579 KLHL10 7. BTB-(5) NP 491322 531 NP 055273 F-box-Kelch: 8. Fbox-(4) NP 497184 809 - Cyclin C-Kelch: 9. CvclinC-(6) NP 506605 480 - Ring-Kelch: 10. Ring-(6) NP 506602 570 - Kelch and Multidomain: 11. CUB-(6)-EGF NP 510443 Attractin 1271 Attractin 12. (6)-FNIII NP 501279 CeHCF 782 HCF- Kelch and unique: 13. (4) NP 506607 430 - 14. (5) NP 501919 Spe26 570 - Propeller-only 15. (6) NP 506895 426 PEAS 16. (6) NP 493418 359 -

153 and CeHCF have previously been functionally characterised (Ohmachi et al.,

2001; Varkey et al., 1995; Liu et al., 1999). 43.7 % of the kelch-repeat proteins in the C.elegans genome contains an additional BTB domain, two members of the multidomain kelch-repeat proteins, namely the HCF and an Attractin orthologue, two propeller-only kelch-repeat proteins, a single F-box kelch-repeat protein, and two contained unique sequences outside the kelch repeats (Table

3). Furthermore, two hypothetical proteins in the C.elegans genome contained other domains, namely the NP_506605 which also contained a cyclin carboxy- terminal domain (CDD 7965, Pfam 02984, SMART 00385) and NP_506602, that contained a RING domain (CDD 8941, Pfam 00097, SMART 00184). The cyclin carboxy-terminal domain forms an a-helical fold that may constitute a protein interaction site (Brown et al., 1995). The RING domain is a zinc-binding domain involved in protein-protein interactions as well as transcriptional control, with similarities to the PHD domain. This could suggest a homologous function of NP_506602 in C.elegans and the PHD-kelch repeat protein, Rag-2 in

H.sapiens.

Interestingly, these data indicate that the C.elegans genome contains no muskelin or any alternative member of the discoidin domain/kelch-repeat subgroup.

Kelch-repeat Proteins encoded in yeast genomes

Several kelch-repeat proteins have been studied functionally in budding and fission yeast but none of these correspond to BTB/kelch proteins (Adams et al.,

2000; Mata and Nurse, 1997; Philips and Herskowitz, 1998). I investigated whether the prevalance of the BTB/kelch domain architecture identified in

154 multicellular animals extended to yeast, by analysing the complement of kelch- repeat proteins encoded in the S. pombe and S. cerevisiae genomes (Wood et al., 2002; Goffeau et al., 1996). Each genome encoded a small number of kelch-repeat proteins (five in S.pombe, six in S.cerevisiae), none of which corresponded to a BTB/kelch protein (Table 4). Proteins and hypothetical proteins consisting of an amino-terminal kelch beta-propeller and an extended coiled-coil region (Adams et al., 2000; Mata and Nurse, 1997; Philips and

Herskowitz, 1998) and a protein corresponding to a putative leucine carboxyl methyltransferase (De Baere et al., 1999) were common to S.pombe and

S.cerevisiae. The other encoded kelch-repeat proteins were non-homologous

(Table 4). Muskelin-like 1 protein and Ral-2p were identified in S.pombe but not

S.cerevisiae (Adams, 2002; Fukui et al., 1989). Two proteins with distantly related kelch repeats, Gpbl/Krhl and Gpb2/Krh2, have been characterised functionally as G protein-coupled receptor-binding proteins in S.cerevisiae

(Harashima et al., 2002; Batlle et al., 2003). Homologues of these proteins were not identified in S.pombe. Thus, this analysis showed that the BTB/kelch domain architecture was not identified in these yeasts.

Analysis of Keich-Repeat proteins in Lower Organisms i) Discoidin/Kelch-repeat proteins:

Because the discoidin-kelch repeats domain architecture appeared in all vertebrates analysed so far, and in the S.pombe genome but not in S.cerevisiae and C.eiegans, it was interesting to search other organisms for proteins with similar domain architecture. It has been reported that the poxvirus family of animal virus contain open reading frames of hypothetical kelch-repeat proteins, but none of these contain an additional discoidin domain or Lish/CTLH motifs

155 Table 4. Kelch-repeat proteins of S.cerevisiae and S.pombe

Architecture G enBank Alternative Name Amino Domain organisation Accession acids S.pombe Kelch/coiled coil: K6/coiled-coil NP_588351 Tealp 1147 K6/coiled-coil NP_594099 1125 Discoidin/kelch: DD/K6 NP_594297 Muskelin-like, MKLN1 716 LCM/kelch: LCM/K6 NP_596164* Putative carboxymethyl transferase 681 Kelch and unioue: K5/u NP_596339 Ral-2 611

S.cereviseae Kelch/coiled-coil K6/coiled-coil NP_012028 Kelip 1164 K6/coiled-coil NP_011754 Kel2p 882 K4/coiled-coil NP 011318 1487 K4/coiled-coil NP_011058 1753 LCM/kelch LCM/K6 NP_014500 Carboxymethyl transferase, Pmp2p 695 Kelch and unioue K5/U NP_015060 Kel3p 651 U/K7 NP_015016* Krhip/Gpbl 897 U/K7 NP_009345* Krh2p/Gpb2 847 (Shchelkunov et al., 2002). The Conserved Domain Architecture Retrieval Tool

(GDART) database at NCBI contains 6 entries for discoidin-kelch proteins, which consists of the muskelins from vertebrates (Homo sapiens, Mus musculus, Ratius norvegicus, Danio rerio) and the previously described

Hypomyces rose//as galactose oxidase (Adams et al., 2000; I to et al., 1991).

One novel entry was a n orthologue of the galactose oxidase identified in the fungus Neurospora crassa (CAD79663). This protein has 48% identity to galactose oxidase and 8% identity to mouse muskelin. BLAST genomes searches of the databases of complete or partially-sequenced eukaryotic animal, plant and microbial genomes at NCBI (Cummings et al., 2002), that included the whole genomes of the Apicomplexia, Plasmodium faiciparum

(Gardner et al., 2002), the Microsporidia Encephaiitozoon cuniculi (Katinka et al., 2001), the plant Oryza sativa (rice; (Yu et al., 2002) and the fungus

Neurospora crassa (Galagan et al., 2003) were carried out. Besides using the consensus sequence for the kelch-repeat, searches were performed using either full-length Mus musculus muskelin, full-length Encephaiitozoon cuniculi muskelin, or full-length Hypomyces rosellus galactose oxidase. These searches identified many predicted kelch-repeat-containing proteins, but no novel ORFs that contained a discoidin domain. In conclusion, these data strongly indicate a significant role of muskelin in multicellular organisms and certain fungi, as not only being highly conserved between species (see Chapter 3, Figure 1), but also being significantly unique in domain architecture throughout evolution.

ii) BTB/Kelch-repeat proteins:

As the BTB/kelch domain architecture appeared widespread and made up the largest subgroup in both vertebrates and invertebrates, the searches were

157 extended to other organisms for proteins with similar domain architecture. The reported hypothetical ORFs in the poxvirus family of animal viruses all contain additional BTB domains (Shchelkunov et al., 2002). The CDART database at

NCBI lists 333 entries for BTB/kelch proteins, all of which originate from vertebrates, insects, C.elegans or poxviruses. To date, the BTB domain has only been identified in eukaryotes (Pfam 00651 species tree). In addition to reviewing the SMART and Pfam species trees for the categorisation of the

BTB/kelch domain architecture, BLASTP and TBLASTX searches of the

A.thaliana genome database were performed (Arabidopsis Genome Initiative,

2000) using the 46 residues sequence of consensus kelch-repeats. This search identified 72 protein sequences, the majority of which were F-box/kelch proteins, some of which were serine-threonine phosphatase/kelch proteins

(Andrade et al., 2001b; Kuroda et al., 2002; Arabidopsis Genome Initiative,

2000), and none of which were BTB/kelch proteins. Searches with only the BTB domains of several human or invertebrate kelch-repeat proteins also did not identify BTB/kelch proteins in A.thaliana. The BLAST genome searches identified no hypothetical proteins with BTB/kelch domain architecture.

However, the Apicomplexia species contains two proteins with a K Tetra/kelch domain architecture ( NP_705330 and EAA22466). The K tetra domain (Pfam

02214) is a distant structural relative of the BTB/POZ domain (Bixby et al.,

1999). Overall, these results provide a strong indication that protein-encoding sequences for the BTB/kelch domain architecture have become expanded during the evolution of multicellular animals, compared to Apicomplexia, fungi, plants and other eukaryotes, whereas the muskelin domain architecture is present as a single copy gene.

158 Discussion

This analysis of the molecular organisation and phylogeny of kelch-repeat proteins has provided a new view on the evolution of the kelch-repeat domain and the proteins that share this domain. Some features are characteristic for the kelch-repeat motif, which distinguish them from other sequence motifs that fold as p-propeller structures. Firstly, the consensus sequence of a kelch-repeat motif differs from that of other p-propeller motifs, such as the WD motif, RCC1 motif, tachylectin-2 repeat or YWTD motif (Adams et al., 2000; Fulop and

Jones, 1999; Andrade et al., 2001). Secondly, the majority of the p-propeller domains that have been crystallised are seven- or eight-bladed p-propeller structures (Jawad and Paoli, 2001). A geometric preference for assembly of seven-bladed p-propellers over six or eight bladed-forms has been demonstrated by mathematical modelling (Murzin, 1992) and indeed the majority of WD-motif containing proteins, which is a larger superfamily in metazoa than the kelch-repeat proteins, are predicted to form seven-bladed p- propellers (Andrade et al., 2001). However, this comprehensive evaluation of kelch-repeat proteins in multiple organisms has clearly indicated a preference for predicted six-bladed p-propellers in the kelch-repeat superfamily. Even though the kelch-repeats are found in proteins in many organisms, the diverse domain architectures of kelch-repeat proteins and the differences in the relative representation of the various subgroups encoded in the genomes of animals, fungi and plants suggests a considerable functional diversity and biological specialisation of the members of this protein superfamily. The data presented here also provides new insights into the evolution of the domain organisation in kelch-repeat proteins.

159 BTB/Kelch-repeat proteins

The large representation of BTB/kelch proteins in the metazoan animal genomes, and particular the discovery that 71.8% of all the kelch-repeat proteins in humans contains a BTB domain, present an interesting enigma. Why do animals need so many BTB/kelch proteins? So far, only eight out of the 51 human BTB/kelch proteins are of known function. Although there are several examples of actin-binding BTB/kelch proteins (Adams et al., 2000), the subcategorisation of the human BTB/kelch proteins showed that the previously described actin-binding kelch-repeat proteins did not form a sequence- dependent subgroup (Figure 1).

Instead, the alternate consensus motifs for the two sequence-related subgroups of BTB/Kelch proteins identified indicate distinctions in the specific structural features o f the s econdary o r t ertiary o f these p-propellers. A Ithough i t s eems unlikely that these features specify a common binding partner such as actin, the loop characteristics could be important indications of putative protein-binding sites in the two subgroups of BTB/kelch proteins. This information could be adopted to accelerate rational design of mutations in functional studies of these proteins.

Obviously, it would be interesting to have crystal structural information of a p- propeller from each of the BTB/kelch subgroups as well as for the other domain architecture groups. Interestingly, eight of the nine BTB/kelch proteins in the insects and six out the seven BTB/kelch-repeat proteins in C.e/egans mostly resembled the subgroup 1 human BTB/kelch proteins (Tables 2 and 3,

Figure 1). This suggests that subgroup 1 may represent the BTB/kelch proteins of earliest origin from which the others have diversified during evolution.

160 The discoidin domain/kelch-repeat proteins

The existence of discoidin domain/kelch-repeat architecture in muskelin, the muskelin-like proteins in yeast and invertebrates, and the fungal enzyme galactose oxidase, suggest an earlier evolutionary origin of the discoidin domain-kelch-repeat proteins as compared to the BTB-kelch-repeat proteins.

But the very low sequences identity between muskelin and galactose oxidase suggests that these proteins are not related in function and indeed the active site is not present in muskelin, the similarities in domain architecture are unlikely to be of any significance. Thus, muskelin and galactose oxidase might be an example of convergent evolution. The differences in function is also supported by the fact that g alactose oxidase is a secreted enzyme, whereas muskelin has a cytoplasmic cellular localisation (Ito et al., 1994; Adams et al.,

1998). But more intriguing is the fact that no other proteins with homologous domain architecture have evolved in any of the organisms included in the characterisation. These data suggest a rather late evolutional origin of muskelin, with the common ancestor of fungi and animals. An earlier comparison between

S.pombe, S.cerevisiae and the metazoan C.eiegans identified 67% of the

S.pombe proteins as having homologues in common with both S.cerevisiae and

C.elegans, 16% have homologues were unique to S.cerevisiae, and 3% have homologues in C.elegans and not in S.cerevisiae proteins (Woods et al., 2002).

The remaining 14% of proteins (681) seemed to be unique to S.pombe. A recent phylogenetic investigation has established that the Encephaiitozoon can/cu//evolved from the fungi rather then the historical classification among various parasites (Keeling et al., 2000; review see Keeling and Fast, 2002).

This places the two muskelin-like proteins from E. cuniculi and S.pombe in closer proximity to each other in evolutional terms. Phylogenetic analysis based

161 on multiple sequence alignment of muskelin and muskelin-like protein also indicates an evolutional correlation between E.cuniculi and S.pombe (Figure 4).

The fact that other fungi and animals (Saccharomyces cerevisiae, Neurospora crassa, and Caenorhabditis elegans) do not contain a muskelin probably indicates that the encoding gene was lost during adaptive radiation of certain modern organisms. In conclusion, these studies of whole genomes demonstrated that muskelin have a unique domain architecture among the kelch-repeat proteins of modern animals.

162 Rat

Mouse

Human

Zebra fish

Mosquito

Fruit fly

S.pombe

Microsporidia

F igure 4 . Phylogenetic relationship between muskelin and muskeiin-like protein orthologues. Full- length sequences of Homo sapiens (Human; NP_037387 ), Mus musculus (Mouse; NP_038819), R at norvegicus (Rat; NP_112649), Danlo rerlo (Zebra fish; AAN32664), Drosophila melanogaster (Fruit fiy; NP_610801o), Anopheles gambiae (Mosquito; EAA09711), Schlzosaccharomyces pombe (S.pom be; NP_594297), and Encephaiitozoon cuniculi (Microspodia; NP_584592) were aligned using ClustaW and presented in Phyiip Drawgram.

163 Chapter 5

The Role of Muskelin Domains in the Cellular Distribution and

Self-association of Muskelin

Introduction

Intracellular translocation of proteins is a common procedure for regulation of protein function. This translocation includes a regulated shuttling between different cellular compartments such as the nucleocytoplasm transport across the nuclear envelope or reallocation of cytoplasmic proteins to the cell membrane (review see Schnell and Herbert, 2003; Mattaj, 1998; Kaffman and

O’Shea, 1999). Translocation can be regulated by various mechanisms such as post-translational modifications, and/or changes in the affinity towards associated proteins and phospholipids, that may also involve changes in protein conformation (review see Bogatcheva et al., 2002; Parekh and Rohlff, 1997).

Having identified several structural domains in muskelin, I wished to understand how these domains contribute to the subcellular distribution of muskelin.

Previous indirect immunofluorescent staining of endogenous muskelin in mouse myoblast C2C12 cells showed diffuse distribution with minor increased intensity in the cell periphery and membrane ruffles (Adams et al., 1998; 2000). These studies also showed that muskelin does not significantly codistribute with actin filaments or microtubules. To understand the roles of the muskelin domains in its subcellular distribution, I prepared enhanced green fluorescent protein

164 (EGFP)-tagged versions of wild-type mouse muskelin, N- and C-terminal domain constructs, and point mutations of highly conserved sites in muskelin.

In this chapter, I describe the characteristics of the exogenous expressed proteins, studies of protein localisation in several cell types, and biochemical analysis of interaction between domains.

Results

Expression of Mouse Muskelin Fusion Proteins

In order to study the dynamic cellular localisation of muskelin, the cDNA of mouse muskelin (Adams et al., 1998) was cloned C-terminal to EGFP (EGFP-

MK). To exclude any unforeseen artefacts from the EGFP-fusion, the cDNA of mouse muskelin was also cloned N-terminal to the EGFP (MK-EGFP), and an alternative V5-epitope was fused to the N-terminal of mouse muskelin (V5-MK).

For the characterisation of the roles of the various domains of muskelin, various mutations and deletions of muskelin was made on the EGFP-MK backbone. As outlined in Figure 1, the design of the constructs was based on the previously identified domain distribution (described in Chapter 3), and included the N- terminal discoidin domain of muskelin alone (EGFP-MKDD), the kelch-repeats and the C-terminal region of muskelin (EGFP-MKKelch/C-term), and a deletion of the C-terminal region (EGFP-MKA85). To study the discoidin domain in more detail, a construct containing three mutations of highly conserved residues in the previously identified MIND sequence in the discoidin domain was analysed

(EGFP-MK1137AA/138A/P139A). Furthermore, two individual double mutations

165 ^ LisH CTLH /O' 1/

W ild -typ e MK N-terminal C-term

EGFP and V5 tagged Muskelin constructs:

EGFP-MK N-terminal ^ ■ 1 K2 KM K5 c-term

MK-EGFP m a m m n i i K4 K5 C-term V5-M K ■ST' N-terminal C-term V5-Epitope

B Muskelin domain constructs:

EGFP-MKDD CEDl N-terminal

EGFP-MKA85 ( ^ E G F P ^ N-terminal K5

EGFP-MKkeich/C-term

Muskelin point mutation constructs:

EGFP-MKI137A/V138A/P139A (^^^E G F ^

EGFP-MKY488A/V495A ( ^EGFP^) N-terminal |

*

EGFP-MKG478S/G479S ^EGFP^) N-terminal r7/^a @ 0 b K3 K4 K5 ^ C-term

Figure 1. Schematic overview of the EGFP and V5 tagged muskelin constructs. A) The EGFP and V5 tagged full-length mouse muskelin constructs, where EGFP were tagged N-terminal or C-terminal to muskelin. The V5-epitope were tagged to the N-terminal of muskelin. B) Three deletion mutants of muskelin were designed. The EGFP-MKDD consists of aal-159 of muskelin merged with the EGFP N-terminal to muskelin discoidin domain. The EGFP-MKA85 consists of the discoidin domain, the LisH/CTLH motifs and the six kelch repeats (aal-650) of muskelin with EGFP tagged to the N-terminus. The EGFP-MKkelch/C-term consists of the six kelch-repeats and the C-terminal region (a249-735) of muskelin with EGFP tagged to the N-terminus. C) Three point mutants of conserved residues in muskelin were designed. The EGFP-MKI137A/V138A/P139A contains three mutations in the discoidin domain marked with asterisks. The EGFP-MKY488A/V495A and the EGFP-MKG478S/G479S both contains two mutations in conserved residues in the fourth kelch repeats of muskelin as marked with asterisks. All three constructs were designed with an EGFP tagged to the N-terminus.

166 in highly conserved residues in the fourth kelch-repeat of mouse muskelin were analysed (EGFP-MKG478S/G479S, EGFP-MKY488AA/495A). These sites were chosen from knowledge of natural function-perturbing mutations of kelch- repeats (Adams et al., 2000). The cells chosen for study were C2C12 and SMC, which have abundant endogenous muskelin with an apparent molecular mass of 82 kDa, and Cos-7 and 293T cells, in which endogenous muskelin was not detectable (arrow in Figure 2A and Adams et al., 1998). Two unspecific bands at approximately 40kDa and 35kDa (i.e. not compete by the muskelin peptide used as antigen) were detected by the antibody to muskelin in all three cell lines tested and indicate an equal protein loading (arrowhead in Figure 2A). An initial analysis of the expression levels of the various constructs was performed after transient expression in Cos-7 cells (Figure 2B). All the constructs were expressed in similar amounts when transiently expressed in Cos-7 cells. Thus, differences in cellular distribution were not due to differences in expression levels. Furthermore, I verified that all the constructs were expressed at their predicted molecular weights and no significant degradation took place during the time of observation (Figure 2C). As these constructs were exogenously expressed in Cos-7 cells, I compared the transient expression levels of EGFP-

MK in Cos-7 cells with the levels of endogenous expression of wild-type muskelin in SMC. This was done by quantifying the level of EGFP-MK relative to the endogenous level in SMC by immunoblotting matched protein loads from

SMC and transiently-transfected Cos-7 cells with an antibody against muskelin

(Figure 2C). From this comparison, I estimated a four-fold overexpression of

EGFP-MK when transiently expressed in Cos-7 cells as compared to SMC.

167 101 - Muskelin

56 -

B

125- 1 0 1 -

5 6 -

3 6 -

WCL WCL Whole cell lysate WCL Whole cell lysate WB: GFP WB: V5 WB: GFP WB: GFP WB: GFP

Endogenous MK EGFP-MK Exposure C expressed in SMC expressed in Cos-7 tim e:

1 2 5 - Exogeous EGFP-muskelin

Endogeous muskelin 8 1 - lO m in

Exogeous EGFP-muskelin 1 2 5 -

20min Endogeous muskelin 8 1 -

Protein load (pg) 40 20 10 40 20 10

Figure 2. Expression of endogenous muskelin and exogenous muskelin constructs. A) Endogenous muskelin expression in SMC, Cos-7, and 293T cells as detected by SDS-PAGE and western blotting techniques using antiserum against muskelin. The arrow indicates the expression of a protein with a apparent molecular weight of 82 kDa similar to muskelin in SMC, whereas Cos-7 and 293T cells showed no expression of endogenous muskelin. The arrowhead indicates unspecific binding used as indication of equal protein loading. B) Exogenous expression of muskelin constructs transiently expressed in Cos-7 or 293T cells. Equal amounts of protein load were separated on 12% SDS-PAGE and the expression were detected by western blotting techniques using antibodies against GFP or V5 as indicated. C ) Quantative Detection of exogenous expression levels of EGFP-MK as compared to endogenous expression of muskelin in SMC. The indicated protein loads of whole cell extracts were resolved on a 10% SDS-PAGE and the expression were detected by western blotting techniques using antibodies against muskelin. Two ECL exposure times was taken for quantification of expression using Scion Image (NIH). Pixel intensity per area was quantified at each exposure time (EGFP-MK signal from lOpg of protein load compared to MK signal from 40pg and 20pg of protein load at 20min exposure time) and mean values calculated.

168 Cellular localisation of EGFP-muskelin

Previous biochemical studies demonstrated that the majority of endogenous muskelin is found in the soluble fraction of C2C12 skeletal myoblast cells and smooth muscle cells (Adams et al., 1998, own observations). By indirect immunofluorescent staining of C2C12 skeletal myoblast cells, endogenous muskelin has a predominately cytoplasmic distribution with some concentration in membrane ruffles (Adams et al., 1998, 2000). To investigate the subcellular localisation and dynamics of muskelin in detail, EGFP-MK was transiently expressed in Cos-7 cells. By confocal microscopy, EGFP-MK was found to be located throughout the cytoplasm, with exclusion from membrane-bound organelles, and excluded from the nucleus, similar to previous observations of endogenous mouse muskelin (Figure 3A). Furthermore, the distribution of

EGFP-MK showed no colocalisation with actin filaments when transient expressed in Cos-7 (Figure 3A-C). It was noted that the expression of

EGFP-MK resulted in the formation of distinct, cytoplasmic, punctuate aggregates or particles of EGFP-MK which varied in number and in size between cells (arrowed in Figure 3A). These distributions appeared attributed to muskelin because an unfused EGFP transiently expressed in Cos-7 cells was uniformly distributed between the nucleus and cytoplasm and no particles were observed (Figure 3D). The EGFP-MK particles were detected within 12 hour after transfection and remained in the cells for at least 72 hours after transfection.

To verify that the formation of the EGFP-MK particles was not an artefact of the fusion of muskelin C-terminal to EGFP, the MK-EGFP construct in which

169 Merged

Cos-7

EGFP IE MK-EGFP

Cos-7 Cos-7

G EGFP-MK

Figure 3. Localisation of muskelin constructs in cells. A ) Distribution of transiently-expressed EGFP-MK in Cos-7 cells B) Distribution of F-actin in same cell as panel A, and C) Merged image of panel A and B. D ) Distribution of transiently-expressed EGFP alone in Cos-7 cells. E) Distribution of transiently- expressed the EGFP tagged C-terminal to MK (MK-EGFP) in Cos-7 cells. F) The distribution of V5-MK transiently-expressed Cos-7 cells detected by indirected immunofluorescent using FITCH-conjugated antibodies against V5. G ) The distribution of EGFP-MK transiently expressed in SMC. Arrow indicates formation of particles. Bar lOpma

170 muskelin was fused N-terminal to EGFP was analysed. M K-EGFP transiently expressed in Cos-7 cells were mainly located in the cytoplasm and resulted in formation of MK-EGFP particles in similar numbers and sizes to cells transiently expressing EGFP-MK (Figure 3E), indicating that the formation of these bodies was independent of the design of the fusion protein. Also I examined the localisation of V5-MK when transiently expressed in Cos-7 cells. The FITC- conjugated primary antibody against the V5-epitope specifically stained the V5-

MK transfected cells, and formation of small particles in these cells was observed (arrowed in Figure 3F)

To determine whether the formation of the EGFP-MK particles was a feature of expression in Cos-7 cells, EGFP-MK was transiently expressed in SMC. In these cells, EGFP-MK had a cytoplasmic distribution, and also formed EGFP- muskelin particles (arrowed in Figure 3G). A similar distribution o f EGFP-MK was observed when transiently expressed in C2C12 cells and 293T cells (data not shown).

These data show that when wild-type muskelin fused to EGFP is transiently expressed in either Cos-7 cells or SMC, in at most four-fold over the endogenous levels in SMC, muskelin is located in the cytoplasm and also conjugate into larger particles.

Characterisation of muskelin particles

I undertook further studies to analyse the characteristics and dynamics of the

EGFP-MK bodies in detail. By 36 hours after transfection of Cos-7 cells,

22.4±1.1% of the transfected cells contained particles, with 25% of these cells containing 2-10 particles (n=300, from a total of 3 experiments). The diameter of

171 Figure 4. Characteristics of muskelin particles. A) High magnification view of EGFP-MK particles, showing range of morphologies and particle sizes. Bar l^im. B) Confocal Z-section showing EGFP-MK localisation in Cos-7 cells. Bars 10pm.

172 the particles was measured using high-magnification epifluorescent microscopy and varied between 0.44 pm and 1.33 pm, with a mean diameter of 0.76 ± 0.03 pm s.e.m (n=46). 50 % of the particles appeared as ring-like structures, whereas others appeared more solid and occasionally appeared paired

(examples shown in Figure 4A). From laser confocal optical Z sections of particle-containing cells the positioning of the particles was monitored. These scans showed that the positioning of the particles varied within the cytoplasm

(examples shown in Figure 4B).

Because of the ring-like appearance and perinuclear location of some of the muskelin bodies, a notable resemble to the y-tubulin ring complexes associated to centrosomes was in question. Co-immunofluorescent staining for the two centrosome components y-tubulin and pericentrin showed that the muskelin particles d id n ot c ontain o re olocalise w ith a ny o f t hese c entrosomal proteins

(compared the arrow marking localisation of a particle. Figure 5A-D). Also, the morphology of the centrosomes and the number appeared normal in cells that transiently expressed EGFP-MK (Figure 5B and D). One function of the centrosomes is as a template for microtubule nucléation (for review see

Doxsey, 2001). In order to evaluate whether the EGFP-MK particles were templates for microtubulin nucléation or associated with microtubules, immunostaining of p-tubulin in EGFP-MK expressing Cos-7 cells was performed. As shown in Figure 5E-F, the EGFP-MK particles or cytoplasmic

EGFP-MK did not localise with microtubules or microtubule nucléation sites

(compared the arrow marking localisation of a particle. Figure 5E and F).

173 c EGFP-MK

I D I

E EGFP-MK J DAPI

Figure 5. Muskelin particles does not colocalise with centrosome components, p- or y-tubulin, or intracellular vesicles. A -3) Cos-7 cells transiently expressing EGFP-MK, A -E ) shows distribution of EGFP-MK and F -H ) indirect immunofluorescent staining with TRITC-conjugated secondary antibodies against antibodies to y-tubulin (F), pericentrin (G), and p-tubulin (H ). I) Distribution of intracellular vesicles as detected by DiIC15(3). EGFP-MK were expressed in Cos-7 cells for 24h before staining with DiIC16(3) in suspension and reattached for 12h. E and J) Formation of aggresomes in Cos-7 cells. EGFP-MK were expressed in Cos-7 cells for 24h before addition of lOOpM proteasome inhibitor PSI. After additionally 12h, cells were fixed with 3.7% paraformaldehyde. E) indicates aggresome formation containing EGFP-MK, and J) indicates localisation of nucleus as detected by DAPI. Arrows indicate localisation of particles. Bar 10mm

174 In order to evaluate whether the EGFP-MK particles were associated with any vesicle transportation or subcompartment, the intracellular cell membranes associated with vesicles, Golgi, and endoplasmic reticulum of EGFP-MK expressing cells were stained using the fluorescent lipophilic 1.1’-dihexadecyl-

3,3,3’,3’-tetramethylindodicarbocyanine perchlorate (DilCi6(3)). To ensure complete uptake and recycling of the DilCi6(3)-staining to these compartments, two approaches were undertaken in which EGFP-MK transfected cells were stained with DilCi6(3) or cells were stained with DilCi6(3) prior to transfection with EGFP-MK. Both procedures showed that the EGFP-MK particles were not membrane-bound and did not co-localise with vesicles, Golgi, or endoplasmic reticulum (Figure 5G and H; shown only for initially EGFP-MK transfected Cos-7 cells stained with DilCi6(3)).

A number of unusual cytoplasmic bodies have been described from various cell sources, including differentiation- or disease-related inclusion bodies and aggresomes (reviewed by Goebel and Warlo, 2000, Kopito, 2000, Dickson,

2002). Aggresomes are approximately 5pm large bodies located near the microtubule-organising centre in which misfolded protein oligomers become sequestered when in excess of the capacity of the proteasome degradation pathway (Johnston et al., 1998, Garcia-Mata et al., 1999). The smaller size of the EGFP-MK particles and their lack of association with microtubules or with y- tubulin at the centrosome indicated that these particles did not correspond to aggresomes. To investigate this point in more detail, cells that had expressed

EGFP-MK were incubated with the membrane-permeable proteasome inhibitors

PSI or Z-LLF-CHO for 12 hours. In both conditions, a loss of the diffuse cytoplasmic EGFP signal and accumulation of EGFP-MK in large, amorphous,

175 juxtanuclear aggregates that appeared to correspond to aggresomes were observed (Figure 51-J, shown for PSI only). The small particles formed specifically by EGFP-MK appeared distinct from the aggregates that accumulated under conditions of impaired protein catabolism.

Dynamics of muskelin particles

To investigate whether the EGFP-MK particles were static or dynamic, real-time epifluorescence imaging of EGFP-MK in Cos-7 cells was carried out. Individual cells were filmed for 40 minutes as Z stacks of XY images, using a step size of

1 pm and covering the whole depth of each cell. Individual particles were observed to move around laterally and vertically within the cytoplasm (Figure

6A, particles marker by arrow and arrowhead and see supplementary movie A).

The total distances moved by individual particles varied from 4 pm to 109 pm, with a mean distance of 35.4 ± 6.6pm s.e.m. (n=17, in three individual experiments). This corresponded to velocities ranging from 1.7nm/sec to

45nm/sec (mean velocity 17.2 ± 3.0nm/sec, n=17, from three independent experiments). These observations documented that the particles were randomly motile and dynamic and thus explain why static views of the transfected cells showed major variations in the distribution and number of muskelin particles.

The dynamics of macromolecules inside the cell can be distinguished between passive diffusion and active transportation as the dependency of metabolic energy in form as ATP as the driving force for the movement. Incubation of cells at reduced temperatures affects the production of ATP in the mitochondria and thereby the active transportation of macromolecules. Although lowering the temperature also lowers the rate of passive diffusion, the effect of reduced

176 A

#

B

25n □ 37 °C

^ 25 "C - 2 0 -

g_15

1 0 - e? LU O 5 - u o 3

Figure 6. The dynamics of muskelin particles. A ) Sequence of still images from real-time live cell epifluorescence imaging of EGFP-MK in Cos-7 cells. Images were taken every 5 minutes as stacks of Z sections made at 0.5 mm intervals through the whole thickness of each cell. A movie (supplementary m ovie A) was prepared by tracking the particle indicated by the arrow in XYZ over time. A less motile particle is indicated by the arrowhead. Z=5, etc indicates the spatial relationship of each image panel as Z sections. B) Velocity of particles recorded at 37°C (open bar) and 25°C (hatched bar). Individual cells were recorded for 40min with Smin interval and the distance between particles were measured for consecutive images and the velocity were calculated. The error bars indicates SEM for n = 17 (37°C) and n = 10 (25°C).

177 temperature is much more drastic on the active transport (Carmo-Fonseca et a!., 2002). In order to characterise the energy-dependency of the motile dynamics of the EGFP-MK particles, individual cells were filmed at 25°C. The velocity of EGFP-MK particles transiently expressed in Cos-7 cells at 25°C ranged from 3.4 nm/sec to 9.8 nm/sec with a mean velocity of 6.4 ± 0.7 nm/sec

(Figure 6B; n=10).

The Role of Muskelin Domains in its Ceiiuiar Distribution

A number of natural disease-causing mutations in the kelch-repeat proteins

RAG-2 and gigaxonin involve non-conservative substitutions of single residues within the kelch repeats. These are thought to disrupt the structure of the propeller blades and thereby disrupt protein interactions (Schwarz et al., 1996,

Callebaut and Mornon, 1998, Bomont et al., 2000). As previously described in chapter 1 and 3, the consensus motif in the kelch repeat is defined by eight highly conserved residues (Bork and Doolittle, 1994, Adams et al, 2000, Aravind and Koonin, 2000). To investigate whether the kelch beta-propeller of muskelin had a role in its cytoplasmic localisation or the formation of particles, three constructs were designed and the distribution was evaluated after transient expression in Cos-7 cells. Firstly, the EGFP-construct containing the N-terminal discoidin domain of muskelin (EGFP-MKDD), that lacks the kelch-repeats of muskelin, was transiently expressed in Cos-7 cells. As shown in Figure 7A,

EGFP-MKDD was uniformly distributed between the cytoplasm and the nucleus

(as colocalised with DAP I, Figure 7B) and did not form any particles.

It has previously been reported that proteins with a molecular mass lower then

65kDa can passively diffuse between the nucleus and the cytoplasm

178 EGFP-MKDD I B EGFP-MKDD

EGFP

EGFP-MKA85 I D EGFP-MKA85

EGFP EGFP-MKkelch/ C-term

G EGFP-MKG478S/ I H EGFP-MKY492A/ G479S I ^ V499A

Figure 7. Cellular localisation of muskelin domains and point mutations. Cos-7 cells transiently expressing EGFP-MKDD (A -B ), EGFP-MKA85 (C -D ), EGFP-MKkelch/C-term (E ), EGFP- MKI137A/V138A/P139A (F), EGFP-MKG478S/G479S (G ), or EGFP-MKY492A/V499A (H; see also supplementary movie B). B) DAPI-staining indicating nuclear localisation of EGFP-MKDD (A), D) DAPI- staining indicating nuclear localisation of EGFP-MKA30 (C). Bar 10pm.

179 (Paine et al., 1975). As the EGFP-MKDD has a molecular mass of approximately 48kDa (Figure 2B), it therefore has the potential for passive diffusion into the nucleus. This question was addressed by evaluating a construct lacking only the C-terminal region of muskelin, and thus consisted of the discoidin domain, LisH and CTLH motifs, and all six kelch-repeats (EGFP-

MKA85). When transiently transfected in Cos-7 cells, this construct expressed a polypeptide of 105kDa (Figure 2B) which is therefore too large for passive diffusion through the nuclear pore complex. However, transiently expressed

EGFP-MKA85 is mainly targeted to the nucleus of Cos-7 cells (Figure 7C and

D). These data suggests that muskelin contains a nuclear export signal in the

C-terminal region.

In order to evaluate whether the discoidin domain was necessary for cytoplasmic localisation or the formation of particles, I evaluated the localisation of the construct consisting only of the kelch repeats and the C-terminal domain of muskelin (EGFP-MKKelch/C-term) fused to the EGFP, thus lacking the discoidin domain. EGFP-MKKelch/C-term has a predicted MWr of 85 kDa.

EGFP-MKKelch/C-term showed a specifically cytoplasmic localisation and no formation of particles when transiently expressed in Cos-7 cells (Figure 7E), which suggests that the discoidin domain is not required for the for the normal cytoplasmic localisation, but is involved in the formation of muskelin particles.

To study the function of the specific regions of muskelin in more detail, I examined the localisation of proteins containing site-directed mutations of highly-conserved residues. I examined the localisation of a construct containing mutations of three highly conserved residues in the discoidin domain (EGFP-

180 MKI137AA/138A/P139A). The EGFP-MKI137AA/138A/P139A was located in the cytoplasm and did not form particles when transiently expressed in Cos-7 cells (Figure 7F). This provides further support that the discoidin domain specifically is involved in the formation of particles in Cos-7 cells.

My next goal was to examine the role of the kelch-repeats in cellular localisation. This was done by examining two individual mutations in the kelch repeats, namely a double mutation of the glycine motif of residue 478 and 479 within the fourth kelch repeat substituted by serine residues and mutations of the conserved tyrosine 488 and valine 495 within the same kelch repeat 4 substituted by alanine residues (EGFP-MKY488AA/495A). Both of these double mutations were targeted against the most conserved residues in the kelch- repeat (see Introduction, Figure 6) in order to locally destabilise the kelch domain. EGFP-MKG478S/G479S protein had a uniform, diffuse cytoplasmic distribution and no particles when transiently expressed in Cos-7 cells (Figure

7G). Similar, EGFP-MKY488AA/495A protein also showed a uniform, diffuse cytoplasmic distribution and disrupted formation of any particles (Figure 7H; see also supplementary movie B). These data indicate an essential role of the kelch-repeats in the formation of particles, and that this process is separable from the cytoplasmic localisation of muskelin.

Muskelin self-association depends on the N-terminai domain and the kelch repeats

The specific formation of muskelin particles by EGFP-MK suggested that muskelin has an intrinsic capacity for self-association and multimerisation.

Although the amino-terminal region of muskelin does not contain a BTB/ROZ

181 domain, I hypothesised that the discoidin domain might have an analogous role in mediating the association of EGFP-MK. To test this hypothesis, I characterised the self-association of muskelin biochemically. EGFP-MK or

EGFP alone was expressed in 293T cells and immunoprecipitated using antibodies to GFP bound to protein A-conjugated agarose beads. The semi­ purified EGFP-MK and EGFP were then used as affinity probe to pull down untagged muskelin transiently expressed in 293T cells. The full-length, untagged muskelin bound specifically to the immunprecipitated EGFP-MK and showed no interaction with immunoprecipitated EGFP alone (Figure 8A). As an alternative experimental pull down approach to show that this property was independent of the epitope tag, a HisA/5 tagged muskelin was made and affinity purified from transiently-transfected 293T cells using nickel-charged agarose beads. The semi-purified HisA/5-tagged muskelin bound to the agarose beads was then used as a probe to pull down the EGFP-tagged muskelin transiently expressed in 293T cells. The HisA/5-tagged muskelin bound specifically to

EGFP-MK and showed no affinity towards EGFP alone (Figure 8B).

For a further characterisation of the molecular mechanisms for self-association, a GST-fusion protein encoding the amino-terminal discoidin domain of muskelin

(termed GST-MKDD) was prepared and used as an affinity probe on glutathione-agarose beads to pull-down various forms of EGFP-MK from cell extracts. The full-length, wild type EGFP-MK bound specifically to GST-MKDD- beads, whereas EGFP did not interact (Figure 8C). Thus, GST-fusion protein containing the discoidin domain of muskelin was sufficient for mediating an interaction to the EGFP-MK.

182 1 2 5 - EGFP-MK

100 - Untagged MK

7 5 -

5 0 - EGFP

WCL IP: GFP WB: GFP WB: Muskelin

1 2 5 -

101- mm

5 6 -

3 6 -

WCL IP: His WB: GFP WB: GFP

1 2 5 - 101-

IP: GST-MKDD IP: GST WB: GFP WB: GFP

Figure 8. Biochemical analysis of muskelin self-association. A ) EGFP-MK and EGFP alone were transiently expressed in 293T cells and immunoprecipitated using antibodies against GFP bound to protein A-conjugated agarose beads (lane 1 and 2). These beads were mixed with cell lysates from 293T cells transiently expressing untagged muskelin (lane 3 and 4). All samples were resolved on 12% SDS-PAGE and expression were detected by western blotting techniques using anti-bodies against GFP (lane 1 and 2) or muskelin (lane 3 and 4). Arrows indicate proteins with expected m olecular weight as EGFP, EGFP-MK and untagged muskelin. B) EGFP-MK, EGFP and V5/Flis-MK were transiently expressed in 293T cells. The V5/His-MK were purified using nickel-changed agarose beads and mixed with cell lysates from EGFP-MK and EGFP (lane 1 and 2). Protein bound to the V5/His-MK were resolved on 12% SDS-PAGE and co-precipitated protein were detected by western blotting techniques using antibodies against GFP (lane 3 and 4). C) EGFP-MK and EGFP were transiently expressed in 293T cells. Cell lysates were mixed with GST-MKDD or GST alone, and bound protein were resolved on 12% SDS-PAGE and co-precipitated protein were detected by western blotting techniques using antibodies against GFP. Protein precipitated by GST-MKDD (lane 1 and 2) and protein precipitated by GST alone (lane 4 and 5).

183 To examine which domain(s) of muskelin was mediating an interaction with the discoidin domain, various muskelin domain mutants were tested in the pull down assays. To evaluate whether the discoidin domain was mediating self­ association similarly to the BTB/POZ domain, the EGFP-MK, EGFP-MKDD,

EGFP-MKA85, and EGFP alone was expressed in 293T cells in equal amounts

(Figure 9A, lane 1-4) and tested in the pull down assays. The EGFP-MK bound specifically to the GST-MKDD (Figure 9A lane 5) and no showed to affinity towards the GST alone (Figure 9A lane 9). The EGFP-MKDD, EGFP-MKA85, and EGFP alone showed no binding to GST or GST-MKDD (Figure 9A, lane 6-

8, and 10-12). These data suggests that the discoidin domain is not a homodimerisation domain similar to the BTB/POZ. Furthermore, these data suggest an association between the discoidin domain and the C-terminal region. As the discoidin domain alone did not self-associate, the EGFP-

MKkelch/C-term should be sufficient for the interaction with the discoidin domain. EGFP-MK, EGFP-MK249-735, and EGFP alone were expressed in

293T cells and tested in the pull-down assays against GST-MKDD and GST alone. The EGFP-MK249-735 showed a specific binding to GST-MKDD and no interaction with GST alone, although the expression of the protein in these cells was lower compared to EGFP-MK (Figure 9B).

A further characterisation of the role of the kelch domain in supporting the self­ association was performed by investigating the two kelch-repeat mutants

EGFP-MKY488A/V495A and EGFP-MKG478S/G479S in the pull-down assay against GST-MKDD or GST alone. As previously described both of these two mutations impaired the formation of particles in the cellular assays (Figure 7G and H). Neither EGFP-MKY488A/V495A nor EGFP-MKG478S/G479S showed

184 y ^ y y y y''

56--

Whole cell lysate IP: GST-MKDD IP: GST WB: GFP WB: GFP WB: GFP

125 101

56 —

3 6 - m WCL IP: GST-MKDD IP: GST WB: GFP WB: GFP WB: GFP

y" // - y y y// y y v y'^ y- ^ y y y' y y y

125-

Whole cell lysate IP: GST-MKDD IP: GST WB: GFP WB: GFP WB: GFP

F ig u re 9 . Biochemical analysis of muskelin domain involved in self-association. A ) EGFP-MK, EGFP-MKA85, EGFP-MKDD and EGFP were transiently expressed in 293T cells (lane 1-4). Cell lysates were mixed with GST-MKDD or GST alone, and bound protein were resolved on 12% SDS-PAGE and co-precipitated protein were detected by western blotting techniques using antibodies against GFP. Protein precipitated by GST-MKDD (lane 5-8) and protein precipitated by GST alone (lane 9-12). B ) EGFP-MK, EGFP-MKkelch/C-term and EGFP alone were expressed in 293T cells (lane 1-3). Cell lysates were mixed with GST-MKDD or GST alone, and bound protein were resolved on 12% SDS-PAGE and co-precipitated protein were detected by western blotting techniques using antibodies against GFP. Protein precipitated by GST-MKDD (lane 4-6) and protein precipitated by GST alone (lane 7-9). C ) EGFP-MK, EGFP- MKI137A/V138A/P139A, EGFP-MKY488A/V495A, EGFP-MKG478S/G479S, and EGFP were transiently expressed in 293T cells (lane 1-5). Cell lysates were mixed with GST-MKDD or GST alone, and bound protein were resolved on 12% SDS-PAGE and co-precipitated protein were detected by western blotting techniques using antibodies against GFP. Protein precipitated by GST-MKDD (lane 6-10) and protein precipitated by GST alone (lane 11-15).

185 binding activity to GST-MKDD, whereas both EGFP-MK and EGFP-

MKI137AA/138A/P139A were bound specifically to GST-MKDD (Figure 9C lane

6-10) and no interactions were detected with GST alone (Figure 9C, lane 11-

15).

Loss of self-association capacity is associated with altered sensitivity to proteolysis

The results from the assays of cellular localisation and biochemical interactions have identified several facets of muskelin that contribute to its ability to form particles in cells: the correct structure of the kelch propeller and the discoidin domain. Of these, point mutations in the fourth kelch motif had the most profound effect in completely abolishing the ability to form particles or to self­ associate in the pull-down assay (see Figure 7G and H, and see Figure 9B).

These results led to the hypothesis that the self-association capacity of muskelin might be regulated by alterations to protein structure. To test this, wild type muskelin and MKY488AA/495A (in untagged forms) were labelled with biotinylated lysine by in vitro transcription/translation and then subjected to limited proteolytic digestion by proteinase K in vitro. For both proteins, the amino-terminus was rapidly removed, as demonstrated by loss of the full-length protein with the epitope recognised by JOD-2 antibody (Figure 10A and 0).

However, M KY488AA/495A had a aItered sensitivity to further p roteolysis, as shown by differences in the size and abundance of the largest fragments produced by digestion with 2.5 pg or 5 pg of proteinase K in comparision to the fragments produced by digesting the wild-type protein (unique fragments arrowed in Figure 10B and D).

186 A Wild type MK B W ild type MK IB: anti-MK IB: HRP-Streptavidin 1 3 0 -

8 2 -

^ 0 2.5 5.0

pg/m l ^ 0 2.5 5.0 10 20 30 40 50 Proteinase K pg/ml Proteinase K

c MK-Y488A/V495A MK-Y488A/V495A IB: anti-MK IB: HRP-Streptavidin

1 3 0 - * 7 5 -

8 2 - m 5 0 -

0 2.5 5.0 2 5 -

jjg/m l ^ Proteinase K |jg/ml Proteinase K

Figure 10. Sensitivity of wildtype and Y488A/V495A mutant muskelin to proteolysis by proteinase K. WT-MK (A, B) and MKY488A/V495A (C, D) were in vitro translated/transcribed with incorporation of biotinylated-lysine and samples taken immediately, or after incubation with 0-50 pg/ml proteinase K. All samples were resolved on 10% SDS-polyacrylamide gels, transferred to PVDF membrane and probed with antibodies to muskelin, (A and C), or with streptavidin-HRP (B and D). Arrows in B and D indicate differences between the largest fragments; arrowhead in B and D Indicate the predominant fragment of 60 kDa produced from both proteins; asterisks in B indicate small fragments of 35kDa and 30kDa specific to digestion of WT-MK, asterisk in D indicates minor protease-resistant fragment specific to the MKY488A/V495A digest. Molecular weights are in kDa.

187 With increasing amounts of proteinase K, both proteins were digested down to a protease-resistant core fragment of apparent molecular mass 60 kDa (marked with arrowhead in Figure 10B and D). For wild-type muskelin, digestion to this fragment was essentially complete upon digestion with 10 pg of proteinase K, whereas for MKY488AA/495A, larger fragments were present as minor species in the presence of up to 30pg proteinase K (asterisk in Figure 10D). In addition, digestion of wild-type muskelin with 20pg to 50pg proteinase K produced small fragments of 35 kDa and 30 kDa as minor species (marked with asterisks in

Figure 10B). These fragments were not detected in the digests of

MKY488AA/495A (Figure 10D). Thus, the MKY488AA/495A mutation has a measurable effect on the structural conformation of the whole protein.

188 Discussion

Self-association of muskeiin

One of the characteristics of the family of kelch proteins is the capacity to dimerise both in vivo as well as in vitro (Robinson and Cooley, 1997; Zipper and

Mulcary, 2002; Chen et al., 2002; Sasagawa et al., 2002). The dimérisation of certain BTB/POZ-kelch proteins has been shown to be important for actin bundling, such that BTB/POZ domain homodimerisation of two actin-binding kelch domains facilitate the bundling of actin filaments. My data provide the first indications that muskelin is able to form multimers and self-associate both in vitro and in vivo. But in contrast to the previously described head-to-head homodimerisation of BTB/POZ kelch proteins, muskelin has the capacity for self-association in a head-to-tail manner, in which the discoidin domain is able to interact with the kelch domain. These data are based on analysis of deletions of various domains and site-directed mutagenesis of highly conserved residues.

In summary, deletion of the kelch domain and the C-terminal domain inhibited an interaction with the discoidin domain of muskelin in pull-down assays, where the kelch domain and the C-terminal domain alone still showed specific interactions with the discoidin domain. The specificity of the interaction between the discoidin domain and the kelch repeats were further emphasised by disrupting an interaction by site-directed mutations of conserved residues within the fourth kelch repeat. These data suggest an interface between the discoidin domain and the fourth kelch repeat.

Previous attempts in yeast two-hybrid screen in our laboratory were performed on full-length muskelin and a C-terminal deletion constructs. These experiments were hindered by the lack of plasmid stability and toxicity when expressed in

189 yeast (Collet, 1999). A yeast two-hybrid screen using single domains of muskelin as bait and prey, that could possibly be expressed in yeast, and therefore utilised in a further characterisation of the interaction between the discoidin domain and the kelch-repeat domain. Also, a surface plasmon resonance system, such as BIAcore, on purified single domains of muskelin could be used for a further characterisation of a molecular interaction between the domains, and would also provide an estimate of the binding coefficient of the self-association.

The experimental indications of a capacity for self-association of muskelin raise the question whether the binding between the discoidin and the kelch-repeat domain is intra- or intermolecular, i.e. the self-association constrains muskelin in closed conformation (intramolecular or cis interaction, see Figure 11 A) or promotes homodimerisation or multimerisation of muskelin proteins

(intermolecular or trans interaction, see Figure 11B and C). Two lines of experiments indicate that the self-association of muskelin is principally a trans interaction. Observing formation of muskelin particles when transiently expressed in several cell lines, suggests that intermolecular bridges are taking place between the discoidin domain of one muskelin protein to the kelch-repeat domain of a second muskelin protein, and thus aggregation of muskelin proteins into particles detectable by light microscopy level (see Figure 11C).

Furthermore, a biochemical pull-down assay using full-length V5-muskelin showed interaction with full-length EGFP tagged muskelin. The pull-down of

EGFP-MK could only take place if the association of muskelin is a trans interaction, but these experiments do not distinguish between dimérisation and multimerisation (Figure 11 B and C). One possible way to distinguish between

190 A

B

Figure 11. Schematic model of muskelin self-association. A) Intramolecular interaction between discoidin domain (containing MIND sequence) and the kelch-repeat propeller. B) Homodimerisation of muskelin, in which the discoidin domain in one muskelin interacts intermolecular with the kelch- repeat propeller in another muskelin protein. C) Multimerisation of muskelin.

191 dimérisation and multimerisation would be biochemical analysis of purified muskelin protein when resolved on non-denaturing PAGE. In case of dimérisation, the complex should show an apparent molecular weight of twice the molecular weight of single muskelin (2 x 82 kDa= 164 kDa), whereas multimerisation would show stepwise increase with 82 kDa interval between the complexes, or as a very high molecular weight complex, which would not be resolved on SDS-PAGE, but could be estimated using gel filtration.

Formation of particles

The cellular studies with EGFP-MK demonstrated that the formation of the cytoplasmic particles was independent of the design of the fusion protein and the cell type in which the protein was expressed. Several lines of evidence support the interpretation that the formation of these small (0.76 pm) cytoplasmic bodies at modest levels of protein overexpression resulted from an intrinsic property of muskelin and not as a passive consequence of excess heterologous protein expression. First, the level of expression of EGFP-MK in these experiments was at most four-fold relative to the level of expression in

SMC, which, like 02012 cells, have high expression of endogenous muskelin relative to other cells we have examined. Second, the muskelin bodies were only detected in a fraction of the EGFP-MK-expressing population and were always seen in conjunction with diffusely cytoplasmic EGFP-muskelin, which matched the previously determined localisation of endogenous muskelin

(Adams et al., 1998, 2000). Third, a mutant containing two point mutations in kelch repeat 4, EGFP-MKY488AA/495A, was expressed equivalently to wild- type muskelin and yet did not form the particles.

192 Individual expression of either the N-terminal discoidin domain or the kelch domain and the C-terminal domain alone, disrupted the formation of these particles. Additionally, site-directed mutagenesis towards conserved residues in either the N-terminal discoidin domain or the kelch repeats of muskelin, namely mutations of the double-glycine motif located in the loop between the second and the third (3-sheet of the fourth kelch repeat, or mutations of the Tyrosine and the Valine located in the third and the fourth p-sheet of the fourth kelch repeat, both disrupts the self-association (for positioning of the conserved residues in the kelch-repeat, see chapter 3, Figure 7). Together, these data indicates that both the discoidin and the kelch repeat are involved in the formation of particle suggesting a self-association between these domains in a head-to-tail manner.

Many forms of inclusion bodies or aggregates have been documented in cells, resulting from either excess production of incorrectly-folded proteins or defective protein catabolism. The aggresome is a recently-described cellular body in which misfolded proteins become concentrated and which has been suggested to have a cytoprotective role in sequestering and routing aberrant proteins for degradation (Johnston et al., 1998, reviewed by Kopito, 2000). In contrast, there are many examples of cytopathic aggregates associated with defective protein catabolism and impaired proteasome function. For example, surplus protein myopathies are characterised by accumulations of desmin or actin in cytoplasmic granules or inclusions (reviewed by Goebel and Warlo,

2001). Lewy bodies within neurons are associated with the pathogenesis of

Parkinson’s disease and other neurodegenerative disorders (reviewed by

Dickson, 2002). I demonstrated that inhibition of proteasome function correlated with the formation of large, amorphous aggregates of wild-type MK. These

193 results provide further clarification that the particles specifically formed by wild- type muskelin are distinct entities. It will be interesting to discover if there are diseases in which over-expression of muskelin is associated with the assembly of intracellular particles.

By real-time imaging of live cells I documented that the EGFP-MK particles moved randomly in three dimensions within cells at a mean velocity of 17.5 nm/sec. This is considerably slower than microtubule motor-dependent movements, where the sliding of dynein and kinesin on microtubules have been measured at 400 nm/sec and 800 nm/sec, respectively (Wang and Sheetz,

2000), but is within the same order as the movement of small organelles. For example, saltatory movements of endosomes occur at around 30 nm/sec and protein aggregates move centripetally into the aggresome at 50 nm/sec

(Ichikawa et al., 2000; Garcia-Mata et al., 1999). Lowering the temperature from

37°C to 25°G caused a 63% reduction in the mean velocity of the EGFP-MK particles. This magnitude of decrease corresponds to a previous reported decrease in the energy-dependent diffusion of a poly(A)-binding protein when lowering the temperature from 37°C to 22°G, as compared to a passive diffusion of GFP-dextran, which showed a 20% reduction in diffusion when lowering the temperature from 37°G to 22°G (Galapez et al., 2002; reviewed by Garmo-

Fonseca et al., 2002, Verkman, 2002). These data suggest that the motile dynamics of EGFP-MK is energy-dependent. However, because cytoplasm is crowded with other molecules, protein complexes and organelles, of which the diffusion or energy-dependent movement will also be affected by temperature, it is difficult to distinguish direct or indirect effects on particle mobility by this method (reviewed by Minton, 2001).

194 Nuclear localisation of EGFP-MKA85

The nucleocytoplasmic transport process a large number of substrates, including all the nuclear proteins such as histones and transcription factors imported into the nucleus, as well as RNAs exported out of the nucleus (review see Gorlich and Kutay, 1999). The nuclear pore complex (NPC) facilitates nucleocytoplasmic transport and is a -125 MDa large complex embedded in the nuclear envelope. Molecules of approximately 9nm, which corresponds to a globular protein of approximately 60 kDa, can enter and leave the nucleus through the NPC by passive diffusion, although the efficiency at this size is very low. These data correspond to my observations that the EGFP alone (30 kDa) and the EGFP-MKDD (-48 kDa) could passively enter the nucleus when expressed in Cos-7 cells (Figure 3D and Figure 7A). During an active nucleocytoplasmic transport, conformational changes of the NPC increase the size of the pore to more than 25nm, corresponding to globular proteins of approximately 1.3 MDa, and is largely mediated by two families of proteins, importins and exportins named after the direction in which they promote the transport across the NE via the NPC (review see Gorlich and Kutay, 1999;

Ullman et al., 1997). These proteins binds to specific signalling sequences within the cargo protein, designated the nuclear localisation signal (NLS) recognised by importins, or the nuclear export signal (NES) recognised by exportins. The initial interaction between importin or exportin and the cargo protein is energy-independent and facilitates the docking of the complex at the cytoplasmic side of the NPC, whereas the translocation of the importin/exportin- cargo protein complex across the NE is energy-dependent and involves hydrolysis of GTP by the Ras-related nuclear protein. Ran GTPase (Moore and

Blobel, 1993; Melchior et al., 1993, 1995, Weis et al., 1996).

195 The deletion of the C-terminal domain of muskelin reallocates muskelin from the cytoplasm to the nucleus. This suggests that muskelin contains a nuclear export signal (NES) in the C-terminal domain necessary for an exclusion of muskelin from the nucleus. A less plausible explanation could be that the C-terminal region is masking a nuclear targeting sequence, which then gets exposed by the deletion of the C-terminal region. But as the phenomena of nuclear targeting was never observed for any other construct, whether in a small proportion of full-length muskelin or during the time-lapse recording observing single cells over long period of time, the indication that muskelin contains a strong nuclear export sequence in the very C-terminal region is more likely. As only EGFP-

MKA85 showed nuclear targeting, besides the low-molecular weight EGFP-

MKDD, muskelin must contain a strong nuclear export signalling sequence in the C-terminal region. As NESs are not identified by the various prediction programs, a manual evaluation of the C-terminal part of muskelin identified a putative NES. Amino acids 729-735 (l v d l i t l ) of the mouse muskelin contain a leucine-rich segment, which displays conservation of some of the key hydrophobic residues required for NES function (Wen et al., 1995; Bogerd et al.,

1996; Nix and Beckerly, 1997). This corresponds to other observations in the laboratory, where a deletion of amino acids 710-735 of mouse muskelin localises to the nucleus as well (personal communication Dr. Josephine

Adams). These questions could be further clarified experimentally by studying the cellular localisation of proteins containing site-directed mutations of the lysine residues located in the NES.

To examine whether nuclear localisation of endogenous muskelin can be detected, treating cells with Leptomycin B should lead to an accumulation of endogenous muskelin in the nucleus, which could be detected by indirect

196 immunofluorescent staining. Leptomycin B inhibits the NES-dependent export of nuclear proteins via a direct interaction with the nuclear pore complex protein,

CRM1 (chromosome region maintenance 1), and is generally used as a specific inhibitor of nucleocytoplasmic transport (Kudo et al., 1998; Wolff et al., 1997).

Recently, a second approach to study nuclear export mechanism has been published (Siebrasse et al., 2002). This in vitro system is based on isolated nuclear envelopes from Xenopus oocytes and showed Leptomycin B and hydrolysis of Ran-bound GTP dependent nucleocytoplasmic transport. Thus, studying purified muskelin protein in this in vitro system could establish the molecular mechanisms for muskelin nucleocytoplasmic transport.

197 Chapter 6

Regulation of Muskelin Self-association by Protein Kinase C

introduction

Phosphorylation of proteins is a highly abundant post-translational modification in cells, in which the addition of a phosphate can change the affinity between the protein and its associated target(s). Two groups of regulatory enzymes control protein phosphorylation. Protein kinases catalysis the phosphotransfer reaction to either tyrosine or serine/threonine residues, whereas protein phosphatases catalyses the hydrolysis reaction, both in a highly regulated manner (for review see Hunter, 1995).

The protein kinase 0 (PKC) family of serine/threonine kinases are activated by a number of stimuli including growth factors, antigens, cytokines, and cell attachment to a variety of ECM proteins (for review see Dempsey et al., 2003).

Several in vivo PKC targets have been characterised including cytoskeletal proteins such as fascin, talin, vinculin and paxilin (Adams et al., 1999; Lewis et al., 1996; Burridge and Chrzanowska-Wodnicka, 1996), also proteins involved in receptor-induced signalling events such as the serine/threonine kinase Raf-1, the tyrosine phosphatase SHP1, and the EOF and Insulin receptors (Chen and

Davids, 2003; Corbit et al., 2003; Strack et al., 2002). PKC has been shown to be one of the pivotal signalling molecules in the cellular response to ECM and is involved in cell spreading and motility (Mostafavi-Pour et al., 2003; Vuori and

Ruoslahti, 1993; Danilov and Juliano, 1989). When CHO cells are plated on the

198 ECM protein fibronectin, cell adhesion and spreading mediated by integrin a5p1 is accelerated by the activation of PKC (Vuori and Ruoslahti, 1993; Danilov and

Juliano, 1989). Correspondingly, inhibition of PKC in primary muscle cells or

CHO cells plated on fibronectin reduce cell spreading and adhesion (Vuori and

Ruoslahti, 1993; Danilov and Juliano, 1989; Disatnik and Rando, 1999). The impaired cell spreading due to the inhibition of PKC is associated with impaired a5p1-mediated phosphorylation of focal adhesion kinase (FAK; Vuori and

Ruoslahti, 1993; Disatnik and Rando, 1999). Additionally, primary mouse muscle cells deficient in a5-expression fails to spread and activate FAK in response to cell attachment to fibronectin, whereas upon PKC activation these cells spread when plated on fibronectin. Antibodies against a5pi or allp3 are able to support cell adhesion but incapable of inducing cell spreading, whereas a pharmacological activation of PKC in CHO promotes cell spreading when plated on antibodies against alip3 or a5pi (Defilippi et al., 1997). In contrast, modulation of PKC activity in macrophages plated on fibronectin has no effect on cell adhesion and spreading, whereas an increase in cell adhesion of macrophages plated on laminin in observed upon stimulation of PKC activity

(Mercurio and Shaw, 1 988). When mouse skeletal myoblast C2C12 cells are plated on the ECM protein thrombospondin-1 (TSP-1), the cellular responses includes the formation of actin microspikes promoted by the actin-bundling protein fascin (Adams, 1995; 1997). Pharmacological activation of PKC in

C2C12 cells plated on TSP-1 induces a redistribution of fascin from the actin microspikes to a diffuse cytoplasmic localisation and a concomitant rearrangement of the actin cytoskeleton from cortical fascin-containing microspikes to actin stressfibers (Adams et al., 1999). This morphological rearrangement of the actin cytoskeleton depends on the phosphorylation of 199 fascin on Serine-39 by PKC, because phosphorylation at this site inhibits the actin binding capacity of fascin (Ono et a!., 1997; Adams et a!., 1999; for review see Kureishy et a!., 2002). Together, these data indicate an important role of protein kinase 0 in mediating distinct signalling mechanism critical for various cellular responses in cell-matrix interactions.

The identification of muskelin as a modulator of cell spreading and adhesion in response to TSP-1 as well as the identification of five conserved putative PKC- phosphorylation site in muskelin, encourage me to study whether PKC activity regulates the behaviour of muskelin in its cytoplasmic distribution, the formation of particles, and in protein self-association.

200 Results

Identification of Muskelin as a Protein Kinase C-Regufated Protein

To examine the hypothesis that endogenous muskelin is phosphorylated as a result of protein kinase C activation, I utilised two-dimensional electrophoresis by isoelectric focusing (lEF) in the first d imension followed by SDS-PAGE to compared extracts of control and TPA treated SMC and 02012 cells, which have high endogenous expression of muskelin (see Chapter 3, Figure 1 ; Adams et al., 1998). TPA was used as a strong direct activator of PKO. In extracts of untreated 02012 cells, endogenous muskelin migrated as multiple isoelectric variants with pi values ranging from 7.6 to 6.4 (Figure 1A) which is slightly more basic compared to the calculated pl-value for mouse muskelin of 5.87

(http://us.expasy.org/tools/pi_tool.html). In 02012 cells treated with lOOnM TPA for 1 hour, there was an increase in the relative amount of the isoelectric variants with pi of 6.8, 6.6, 6.5, 6.4 and also an appearance of additional acidic variants at pi 6.3 and 6.2 (Figure ID). This effect of TPA was inhibited when cells were co-treated with 50nM bisindolylmaleimide I (BIM), a specific inhibitor of protein kinase 0 (Figure 10). Treatment of 02012 cells with BIM alone did not affect the distribution of isoelectric variants in comparison to untreated cells

(Figure IB). Thus, the relative increase in the acidic isoelectric variants ranging from pi 6.8 to 6.4 and the appearance of the most acidic variants of muskelin was dependent on protein kinase 0 activity. To clarify that this was a general mechanism for post-translational regulation of muskelin, these experiments were also carried out with SMO. The endogenous muskelin in whole cell extracts from SMC migrated as multiple isoelectric variants of pi from 6.7 to 5.9, with the major variants being of pi 6.5, 6.4, and 6.1 (Figure 1E). When treated

201 C2C12 pH: 7.5 7.0 6.5

A MW, '■ • % U ntreated " *

# B # 4 ’ Bisindolylmaleimide I

*« * » » TPA + C Bisindolylmaleimide I 1111 # D # ## TPA Ih

6.7 6 .3 5.9

U ntreated

Bisindolylmaleimide I

TPA + Bisindolylmaleimide I

TPA Ih

SMC pH:

U ntreated

Serum starved 3h

Serum starved 3h + Endothelin

Figure 1. Effects of PKC activation on muskelin phosphorylation in C2C12 and SMC. C2C12 (A -D ) or SMC (E-K) cells were harvested under control conditions, (A, E, I) , or after treatment with 50 nM BIM for Ih (B, F); 50 nM BIM plus 100 nM TPA for Ih (C, G); 100 nM TPA for Ih (D, H), serum starvation for 3h (J) or serum starvation for 3h and 50nM Endothelin for 30min (K). 750 |ig of urea whole cell extracts were resolved as parallel samples in two dimensions by lEF and SDS-PAGE, transferred to PVDF membranes and probed with muskelin antibodies. Arrows in D and H indicate the acidic isoelectric variants that were specifically increased by TPA treatment, pi values calcuated from the linear pH gradient are indicated above each set of blots. Representative of 3 independent experiments.

202 with lOOnM TPA for 1 hour, the relative amounts of the most acidic isoelectric variants were increased (Figure 1H compared to Figure 1E). This increase in the most acidic spots was not apparent in SMC treated with both 50nM BIM and lOOnM TPA for 1 hour (Figure 1G), or in cells treated with 50nM BIM alone

(Figure 1F), though minor decreases in the most basic spots were observed.

These results indicated that endogenous muskelin exists as a set of isoelectric variants and becomes phosphorylated as a result of protein kinase 0 activation in several cell types. Also, a distinct pattern of isoelectric variants in both SMC and C2C12 cells indicates that muskelin is highly regulated by post-translational modifications that may be differentiation-specific.

Muskelin is Not Post-Translationaliy Modified upon Endotheiin~1

Stimuiation

As an alternative regulatory signalling pathway that involves PKC, I investigated whether muskelin was post-translationally modulated upon stimulation with the mitogenic hormone endothelin-1. Endothelin-1 induces the vasoconstriction of smooth muscle cells through an activation of the G-protein coupled receptors

Endothelin A and B (for review see Leppaluoto and Ruskoaho, 1992). This activation induces tyrosine phosphorylation of several intracellular signalling proteins, including MAPK and FAK, as well as activation of PKC (Feng et al.,

1998; Zachary et al., 1992; Cazaubon et al., 1993, 1994). Since foetal calf serum contains small amounts of endothelin, SMC were grown in serum-free

DMEM for three hours prior addition of 50nM endothelin for either 30 min or 1 hour. As previously described, the endogenous muskelin in whole cell extracts from SMC migrated as multiple isoelectric variants of pi from 6.7 to 5.9, with the major variants being of pi 6.5, 6.4, and 6.1 (Figure II). Upon serum-starvation

203 for three hours, a decrease in the isoelectric variant migrating at pi 6.5 was observed, whereas the relative amounts of the acidic isoelectric variants of endogenous muskelin in serum starved SMC were similar to the endogenous muskelin in untreated SMC (Figure 1J compared to Figure 11). Upon stimulation with 50nM endothelin for 30 min and 1 hour, the distribution of isoelectric variants of endogenous muskelin was similar to that of endogenous muskelin in serum-starved SMC (compare Figure IK to Figure II; only endothelin stimulation for 30 min is shown). These data suggest that endogenous muskelin expressed in SMC is not a target for post-translational modification upon endothelin stimulation. Thus, further experiments would be needed to determine if PKC is a physiological regulator of muskelin. Because direct activation of PKC did regulate muskelin isoelectric variants, I focused on experiments designed to identify which sites are phosphorylated by protein kinase C.

Study of Muskelin Post-Translational Modifications with EGFP-Fusion

Proteins

My next goal was to examine the isoelectric variants produced by exogenously expressed muskelin-constructs. This approach would enable a further characterisation of the putative phosphorylation sites through use of the muskelin domain and point mutation I had prepared and studied for their cellular localisation (see chapter 5, Figure 1), and would enable me to establish a direct functional role for these sites in the putative protein kinase-dependent phosphorylation of muskelin. Several experimental approaches were undertaken to establish a robust assay using two-dimensional electrophoresis to evaluate the isoelectric variants of EGFP-MK transiently expressed in either

Cos-7 or 293T cells. Initially, whole cell lysate from Cos-7 cells transiently

204 expressing EGFP-MK for 36 hours were processed in a similar manner to the previously described two-dimensional electrophoresis assay using whole cell lysate from both SMC and 02012 cells (Figure 1A-K). 750pg of protein from whole cell lysates from Oos-7 cells transiently transfected with EGFP-MK, which in parallel experiments showed a clear single EGFP-MK band on immunoblots of one-dimensional SDS-PAGE, showed an unspecific smearing distribution when processed in two-dimensional electrophoresis (Figure 2A). As this smearing distribution of EGFP-MK could be due to protein overload and being aware of the high specificity and affinity of the antibodies to GFP (for example se Chapter 5, Figure 2B), a second approach was undertaken in which 1/10 of the protein load was tested in two-dimensional electrophoresis. 75pg of protein from whole cell lysate from transiently transfected Oos-7 cells also showed a smearing distribution, indicating that the high protein load was not the reason for the lack of clear isoelectric focusing of the EGFP-MK protein (Figure 2B).

Two of the critical factors for improving isoelectric focusing are removal of salts and nucleotides from the protein samples prior isoelectric focusing. The standard procedure for preparing protein samples for isoelectric focusing

includes addition of nucleases in the lysis buffer to digest DNA to nucleotides, followed by desalting using acetone precipitation of protein and resuspension in a salt-free urea-solution. In order to further minimise the presence of

nucleotides in the initial protein samples, the TritonX-100 soluble fraction

prepared from 293T cells transiently expressing EGFP-MK was analysed, as endogenous muskelin in SMC and 02012 cells and exogenous EGFP-MK transiently expressed in Oos-7 or 293T cells is mainly present in this fraction

(Adams et al., 1998; data not shown). However, two-dimensional electrophoresis of detergent-soluble extracts from 293T cells transiently

205 Cos-7 transiently transfected with EGFP-MK

pH: 5.0 6.0 7.0 8.0

I A MW, 750 pg I

B 75 pg

c TritonX-100 extract #»

Figure 2. Two-dimensional electrophoresis of EGFP-MK. 293T cells transient expressing EGFP-MK were harvested directly in SDS-buffer (A and B) or extracted with TritonX-100 (C). 750 mg (A and C) or 75 mg (B) of sample were resuspended in urea and resolved in two dimensions by lEF and SDS-PAGE and developed by western blotting techniques using antibodies against GFP.

206 expressing E GFP-MK s howed a s imilar s meared d Istrlbution I ndicating t hat a reduced amount of nucleotides in the samples had no effect on the impaired isoelectric focusing of EGFP-MK (Figure 2C).

Because of the limitations of time, I was not able to pursue this question further during this project.

Formation of Muskelin Particles is Down-Regulated by Activation of

Protein Kinase C

Given the initial identification of muskelin as a PKC regulated protein (Figure 1), and the role of both discoidin domain and the kejch-repeat domain in self­ association suggested by pull-down assays (Chapter 5, Figure 9), the next question was to examine the roles of the phophoregulable PKC-sites in cellular localisation and self-association.

The multiple sequence alignment of the animal muskelin orthologues had demonstrated five potential sites for phosphorylation by protein kinase C that have been conserved during evolution (Adams, 2002; Chapter 3, Figure 9). As an initial test of whether muskelin distribution was regulated by protein kinase C activity, I investigated the effect of direct activation of protein kinase C by TPA on the localisation of EGFP-MK in transiently transfected SMC or Cos-7 cells.

Because both the conventional (a, p, and y) and the novel (6, s, and 0) members of protein kinase C family are activated by TPA, I initially determined which forms of PKC were expressed in these cells. SMC expressed conventional PKC-a, -Pi, and the novel PKC-6, and did not express PKC-p 2 , -y,

207 -8, -0. Cos-7 cells expressed conventional PKC-a, and -y and the novel PKC-s, and did not express PKC-p1 ,-p2, -6, or -0 (Figure 3A and data not shown).

Treatment of SMC or Cos-7 cells with TPA did not alter the cytoplasmic localisation of the protein, but did result in a striking decrease in the number of cells that contained muskelin particles (Figure 3B-I). These results were extended and quantified for statistical analysis using transiently transfected

Cos-7 cells. After 36 hour of muskelin expression, Cos-7 cells were treated with a range of TPA concentrations for 1 hour and then scored for number of cells containing particles. lOOnM TPA was found to be optimally effective, whereas there was no decrease in the number of particles in cells treated with lOOnM or

500nM of the inactive 4-a-phorbol (Figure 3J). 13.1 ± 1.1 % of EGFP-MK- expressing Cos-7 cells treated with lOOnM TPA for 1 hour contained particles compared to 22.4 ±1.1 % of the solvent-treated control cells (significant at p<0.001, n=8 experiments). Within this population of particle-positive cells, 28

% contained more than one particle.

These particles moved i n a similar mannerto the particles of u ntreated cells

(see supplementary movie C), however the mean diameter of the particles after

TPA treatment was significantly smaller than in control EGFP-MK-expressing cells; 0.60 pm +/-0.02 pm (n=72, significant p <0.0001), (example cell shown in supplementary movie C). Cos-7 cells transiently transfected with EGFP-MK were also treated with lOOnM TPA for up to 4 hours. No further decrease in the number of cells with muskelin particles was observed after 1 hour of treatment

(Figure 3K). Thus, the self-association capacity of muskelin was modulated but not abolished by protein kinase C-dependent phosphorylation.

208 A SMC Cos-7 PKCa PKCPi PKCÔ PKCa PKCy PKCe

125 — 90 — 100 — » MUjnjlMW «wmw

Cos-7 EGFP-mMK Phalloidin EGFP-mMK Phalioidin

Untreated

TPA Ih

TPA K

• i'. 30-1 4-a

S. 26- S. 25- cr IJ I «- 2c c o 8 16- S "S 1 0 - I

0 100 200 300 400 600 0 60 120 180 240 Concentration (nM) Time of TPA treatment, min

Figure 3, Expression of PKC family members in SMC and Cos-7 and effects of PKC activation on EGFP- MK distribution. A ) Western blots of SMC or Cos-7 whole cell extracts (40 mg per lane) were probed with antibodies to PKC isoforms as indicated. Molecular mass markers are in kDa. B-I, After 36 hour of EGFP-MK expression, SMC (B -E), or Cos-7 (F -I) were treated with solvent, (B, C, F, G), or 100 nM TPA, (D, E, H, I) for 1 hour then fixed and counterstained for F-actin, (C, E, G, I) . Arrows indicate examples of EGFP-MK particles after TPA treatment.See also supplementary m ovie C. Bar = 20pm. J, Concentration-dependence and specificity of the effect of TPA treatment on EGFP-MK particles in Cos-7 cells. K, Time-dependence of the effect of 100 nM TPA on EGFP-MK particles in Cos-7 cells. In both J and K, each box shows mean values and bars indicates.e.m. Data from three independent experiments.

209 Initial Identification of Protein Kinase C Target Sites in Muskelin

To establish which of the conserved consensus phosphorylation sites in muskelin were targets for PKC-dependent phosphorylation, the relevant conserved serine or threonine residues in EGFP-MK were individually mutated to alanines (Figure 4A). The five expression constructs were transfected into

Cos-7 cells and the expression levels were compared by western blotting. All the proteins had an apparent molecular mass of 112 kDa and were expressed at equivalent levels to wild-type EGFP-MK (Figure 4B).

Next, I compared the subcellular distributions of the mutant proteins with that of

EGFP-MK. All the proteins were diffuse in the cytoplasm (Figure 5), however in both Cos-7 cells and SMC the ability of some of the mutant proteins to form particles appeared altered. These initial observations were quantified by scoring

500 Cos-7 cells in 5 replicate experiments. For EGFP-MKS44A, EGFP-

MKS350A, and EGFP-MKS396A the percentage of cells with particles was not significantly different to EGFP-MK-expressing cells (Figure 6). In contrast, the numbers of EGFP-MKS324A and EGFP-MKT515A-expressing cells with particles was markedly decreased relative to EGFP-MK-expressing cells. Only

13.4 ± 1.3% SEM of EGFP-MKS324A cells and 15.4 ± 2.4% SEM of EFGP-MK-

T515A cells contained the particles (difference to wild-type significant at p<0.001 and p<0.02, respectively; Figure 6).

210 A LisH CTLH

N -te r m in a l C -te r m / \ /\ /\ OSSRW PSARS SSVRNDSEKHFTQRA QSSRW PSARSSSVRNDSEKH FTQRA QSSRW PSARS SSVRNDSEKHFTQRA QTSRW PTPRS NSIRNDTDKRYTLRA I QASRW PAPRS SAVRNDAEKHFAQRA

B A

kDa

16 97

55 Blot: GFP

y-tubuiin

Figure 4. Design and expression of the putative phosphorylation site mutation in muskelin. A) Schematic overview of muskelin domain structure and the putative phosphorylation sites in muskelin. The conservation of the five putative phosphorylation sites in muskelin species orthologues and the design of the point mutations is indicated below the structural cartoon. Hs= Homo sapiens, Mm = Mus muscularis, Dr = Danio rerio, Dm = Drosophila melanogaster. B) 40 |ig of extracts of Cos- 7 cells expressing EGFP-MK, EGFP-MK m utants or EGFP as indicated, were resolved on 12.5% SDS- polyacrylamide gels, transferred to PVDF membrane and probed with antibody to GFP. The blot was stripped and reprobed with antibody to y-tubulin to demonstrate equal protein loading.

211 EGFP P hallodin

EGFP-MK

EGFP-MKS44A

EGFP-MKS324A

EGFP-MKS350A

EGFP-MKS391A

EGFP-MKT515A

Figure 5. Cellular localisation of putative phosphorylation site mutants of muskelin. Cos-7 cells transiently expressing EGFP-MK (A and B), EGFP-MKS44A (C and D), EGFP-MKS324A (E and F), EGFP-MKS350A (G and H), EGFP-MKS391A ( I and J), EGFP-MKT515A (K and L) were fixed in 3.7% paraformaldehyde and actin cytoskeleton were fluorescently labelled with TRITC-conjugated phalloidin. Images were captured of EGFP-tagged protein (A, C, E, G, I, K) and TRITC (B, D, F, H, J, L). Bar 10mm

212 30 -1 I I U n tre a te d I S lOOnM TPA, Ih I

I 20- S 8

Figure 6. Requirement for serine 324 and threonine 515 in formation of muskelin particles. Effect of PKC activation on formation of MK particles in Cos-7 cells expressing EGFP-MK wild type or mutant proteins as indicated. Particles were scored from solvent-treated control cells (open boxes) or after treatment with 100 nM TPA for Ih (shaded boxes). Each bar is the mean from 500 cells scored in 5 replicate experiments, bars = s.e.m. See also supplementary movie C.

2 1 3 I next examined the sensitivity of the particles formed by mutant muskelins to protein kinase C activation. Cos-7 cells expressing EGFP-MKS44A, EGFP-

MKS350A, or EGFP-MKS396A showed a significant decrease in the number of particle-positive cells when treated with lOOnM TPA for 1 hour (Figure 6; differences to solvent-treated controls significant at p<0.0001, p<0.02, p<0.002, p<0.05, respectively). Thus, the normal PKC-dependent regulation did not appear impaired for these mutants. In contrast, when EGFP-MK-S324A or

EGFP-MKT515A-expressing cells were treated with lOOnM TPA for 1 hour, there was no significant decrease in the percentage of cells with particles compared to controls (Figure 6). Together, these results suggest that serine-

324 and threonine-515 of muskelin normally participate in the formation of particles and also that these residues are specifically regulated by protein kinase C-dependent phosphorylation that modulates the ability of muskelin to self-associate into oligomers. In both cases, the number of particles was not decreased to zero, suggesting that features other of PKC phosphorylation contribute to particle assembly in whole cells.

I also investigated whether the two phosphorylation sites functioned additively, by mutating both S324 and T515 in muskelin to alanines. EGFP-

MKS324A/T515A was expressed in Cos-7 cells equivalently to EGFP-MK

(Figure 6B). The protein had a cytoplasmic distribution and 12.2 ± 1.5% SEM of the transfected cells contained particles (Figure 6) (decrease compared to

EGFP-MK-expressing cells significant at p<0.001, n=5 experiments) (example cell shown in movie D). Within the particle-positive population, 25.5% of cells contained more than one muskelin body. Morphometric measurements revealed that the diameter of the particles was significantly reduced. Particle diameters were in the range 0.35 pm to 1.23 pm, with a mean diameter of 0.62 ± 0.02 pm

214 (n=56 aggregates, decrease relative to EGFP-MK particles significant at p<0.002). Notably, the number of particles was not further decreased by treatment with lOOnM TPA for 1 hour (Figure 6). Thus, each site appeared to act independently in modulating the ability of muskelin to form cytoplasmic bodies.

I also tested the effect of mutating S324 of muskelin to an aspartic acid in order to mimic a constitutive phosphorylation at this site. The EGFP-MKS324D had a diffuse cytoplasmic distribution and still formed particles similar to the EGFP-

MKS324A (data not shown). Quantitative analysis showed a significant decrease in number of transfected cells containing particles as compared to

EGFP-MK with 10.4+1.6% cells containing particles (Figure 6, n=3 experiments, decrease relative to EGFP-MK particles significant at p<0.001). Treatment of transfected Cos-7 cells expressing EGFP-MKS324D with lOOnM TPA for 1 hour showed no further decrease in percentage of cells containing particles

(11.2±1.3%, Figure 6) as compared to untreated transfected cells.

Muskelin Self‘Association is Independent of the Phosphoregulable

Serine/Threonine Sites

As my previous data has indicated an important role of the kelch-repeat p-propeller in self-association, these results raised the question whether the phosphorylation sites S 324 and the T515, respectively located in the second and the fifth kelch domain, were important for the self-association of muskelin

identified in the pull-down assays (Chapter 5). Both EGFP-MKS324A and

EGFP-MKT515A transiently expressed in 293T cells bound specifically to the

GST-MKDD and showed no binding activity to GST alone (Figure 7).

215 Additionally, the double mutation of both S324 and T515 to alanines, EGFP-

MKS324A/T515A, and the mutation of S324 to aspartic acid, EGFP-MKS324D, mimicking a constitutive phosphorylation of this site, also showed specific interaction with GST-MKDD and no affinity towards GST alone (Figure 7).

The three other putative conserved protein kinase C phosphorylation sites,

EGFP-MKS44A, EGFP-MKS350A, and EGFP-MKS396A, were also characterised for their capacity for self-association. These three proteins also showed specific binding activity to GST-MKDD and showed no interactions with

GST alone (Figure 7). Together, these data indicate that the capacity for self­ association of muskelin is independent of the phosphoregulable sites S324 and

T515. Additionally, two mutations in the kelch repeats (S350 and S396) and a mutation in the discoidin domain (S44) had no effect on the self-association of muskelin, which further emphasise the specificity of the other mutations identified to inhibit this interaction, as described in chapter 5.

216 f jf .

Whole Cell Lysate IP: GST-MKN IP: G ST W B: GFP WB: GFP W B: GFP

Figure 7. Self-association of muskelin is independent of the five putative phosphorylation sites. EGFP-MK and the indicated muskelin mutant proteins were transiently expressed in 293T cells (lane 1-9). Cell lysates were mixed with GST-MKDD or GST alone, and bound protein were resolved on 12% SDS-PAGE and co-precipitated protein were detected by western blotting techniques using antibodies against GFP. Protein precipitated by GST-MKDD (lane 10-18) and protein precipitated by GST alone (lane 19-27).

217 Discussion

Identification of Muskelin as a Target for PKC

My data presented in this chapter provide initial evidence of a phosphoregulatory mechanism for the capacity of muskelin to promote formation of particles. As demonstrated in several cell lines, upon the activation of PKC by TPA, the accumulation of particles, with regard to both size and number, were reduced. In studies with point mutant proteins, I identified two putative PKC-dependent phosphorylation sites that contribute to this effect. In studies of endogenous muskelin, I have demonstrated an increase in the acidic isoelectric variants of muskelin identified in two-dimensional electrophoresis upon PKC activation. Besides PKC, other intracellular targets do get activated upon addition of phorbol esters (Kazanietz, 2000; Brose and Rosenmund,

2002). These proteins contain a C1-domain similar to PKC, which is involved in the binding of diacylglycerol and phorbol esters (Kazanietz, 2000). Addition of the PKC-inhibitor (bisindolylmaleimide I) abolished the effects observed by

TPA-treatment, which strongly suggests that these processes are mediated specifically by PKC. Together, these data provide clear suggestions that muskelin is a target for the activity of protein kinase C.

In the time available it was not possible to complete definite studies of muskelin as a PKC substrate. I discuss here the main conclusions from the experiments carried out and future experiments that would be needed to complete the study.

218 Analysis of Muskelin by Two-DImensionai Electrophoresis

The two-dimensional electrophoresis showed a clear change in isoelectric variants of endogenous muskelin upon activation of PKC in several cell lines.

Differences in pl-values of isoelectric variants of endogenous muskelin were observed in untreated SMC and C2C12. Both cell lines contained approximately

8-10 different isoelectric variants of muskelin. This suggests that muskelin is differentially regulated by post-translational modifications in different cell types, and muskelin were a target for PKC in both cell line.

Since high-level of activation of PKC by TPA did not convert muskelin to a single acidic variants in either cell type, and cells treated with BIM still contained multiple isoelectric variants, it appears that muskelin isoelectric variants dependents on other regulatory events. These events could include phosphorylation by protein kinase A or D, or alternative post-translational modification such as acétylation or 0-linked p-N-acetylglucosamine (0-GlcNAc; for review see Quivy and Lint, 2002; Vosseller et al., 2001). Acétylation has mainly been observed as a mechanism in regulation of DNA transcription including histone acétylation in chromatin remodelling (for review see Neely and

Workman, 2002). The reversible post-translational modification 0-GlcNAc is found on serine/threonine residues of both cytoplasmic proteins, such as talin

(Hagmann et al., 1992) and nuclear transcription factors, such as Serum

Response Factor (Reason et al., 1992) and appears as widespread as phosphorylation (review see Hart, 1997). Most of the 0-GlcNAc modified proteins are also targets for post-translational phosphorylation. This suggests that the function of 0-GlcNAc is considered reciprocal to the phosphorylation of proteins and is involved in alterations in protein-protein interactions (Hart,

1997). Demonstration of these post-translational modifications in cellular

219 proteins is limited and mainly detected by mass spectrometry (Wells et al.,

2002). In order to determine whether muskelin is a target for any of these modifications would require purification of endogenous muskelin and characterisation by mass spectrometry. These issues could be resolved in our collaboration with the Mass Spectrometry Core facilities at the Cleveland Clinic

Foundation under the supervision of Dr. Michael Kintler.

The addition of endothelin-1 for several time-periods showed no effect on the distribution of isoelectric variants of endogenous muskelin in SMC. These data showed that muskelin is not associated with the signalling pathways induced by endothelin-1. Although several reports h ave showed an activation of P KC by endothelin-1 (Piacentini et al., 2000; Feng et al., 1998; Chen, 1994), recent data showed a decrease in membrane translocation and activity of PKC upon addition o f endothelin-1, which suggests a cell-type specific response (Boivin and Allen, 2003), and could explain why no differences in pl-variants for muskelin were observed in my experiments. Thus, it would have been optimal to study other physiological signals that regulate PKC. Of particular interest would be the effect of cellular adhesion to ECM components such as fibronectin that activate PKC.

Several alternative approaches were tested for the analysis of exogenous expressed muskelin-constructs in two-dimensional electrophoresis. This includes different cell types transiently expressing various EGFP-constructs, analysis of immunoprecipitated EGFP-MK protein, a HisA/5-tagged muskelin stablely expressed in the insect expression system High-5, and various concentrations of nucleases and protein load. As none of these experiments

220 gave conclusive results, a current work in progress is to directly identify phosphorylated serine/threonine or other post-translational modifications of muskelin using mass spectrometry on purified EGFP-MK or V5-MK treated with

100 nM TPA for 1 hour. This work is performed in collaboration with the Mass

Spectrometry Core facilities at the Cleveland Clinic Foundation under the supervision of Dr. Michael Kintler.

It would also be optimal demonstrate direct phosphorylation of purified muskelin by purified PKC in vitro. Conditions for such experiments are well-established, with demonstration of the lipid dependency of PKC activity (Padma et al., 1999;

Murakami et al., 1986). If muskelin was phosphorylated by PKC in vitro, the phosphorylated protein could also have been used in mass spectrometry to identify the phosphorylation site(s). Assuming that the these phosphorylation sites would have been mutual with the conserved sites, for which I have prepared point mutations, I would then have continued to test the physiological significance of PKC-dependent phosphorylation of muskelin in whole cell assays.

PKC Activation and the Formation of Muskelin Particles

In cells expressing EGFP-muskelin, the number and size of particles was significantly decreased by PKC activation. Although, as described above, further experiments will be needed to fully establish muskelin as a direct substrate of PKC, the regulatory role of PKC was further substantiated by the identification of two of the conserved putative phosphorylation sites, S324 and

T515, as necessary for the PKC activation-dependent component of particle formation. Strikingly, both of these phosphorylation sites are located in

221 analogous positions within the kelch-repeat p-propeller. By sequence alignment with the kelch-repeat consensus sequence and the known structure of the fungal galactose oxidase propeller, S324 and T515 of muskelin were predicted to be located in the so-called 4-1 loops that join blades 1 and 2, and 5 and 6, respectively, to each other (see Chapter 3, Figure 9B). W ithin the galactose oxidase structure, all the 4-1 loops are exposed on the same outer face of the propeller (Ito et al., 1991, and see Chapter 3, Figure 9 B). There are several precedents for involvement of 4-1 loops in either c/s- or trans- protein interactions. The self-processing of galactose oxidase from an inactive precursor to the active enzyme involves a significant structural shift to the 4-1 loop that contains cysteine 228. Cysteine 228 is needed for c/s-thioester bond formation in the mature, active enzyme (Firbank et al., 2001). In HCF-1, mutation of P I34 within the 4-1 loop between blades 2 and 3 disrupts VP16- binding activity (Johnson et al., 1999). The 4-1 loop between blades 5 and 6 of the second p-propeller of a-scruin corresponds to an actin-binding site and contains a surface-exposed cysteine, cysteine 837 (Sun et al., 1997). I suggest that phosphorylation at either 8324 or T515 within muskelin might modulate a single protein interaction site, which would explain why the two sites were not observed to function additively in the particle assembly assay.

Mutations of the phosphorylation sites 8324 or T515 in muskelin, either individual or together, did not change the affinity towards the discoidin domain of muskelin as characterised in the pull-down experiments. This suggests that the process for homodimerisation of muskelin is independent of phosphorylation at these sites. Given that the formation of particles is only partially dependent on the phosphorylation sites, I suggest that either an associated protein could

222 be involved in mediating an intracellular accumulation of muskelin and the interface between this unknown scaffolding protein and muskelin includes the two phosphorylation-sites, or that a direct c/s interaction between the kelch- repeat domain and the discoidin domain involved additional residues, not all of which modulated by phosphorylation. This could explain the decrease in number and size of the p article i n i ntact c ells a nd the I ack o f effect o f t hese point mutations on binding in the biochemical assay. To take these studies further, a first need would be to fully define muskelin as a PKC substrate, as discussed above.

223 General Discussion

Muskelin is a unique, intracellular member of the kelch-repeat superfamily

(Adams et al., 2000; Prag and Adams, 2003). The identification of kelch repeats and LisH/CTLH motifs has indicated potential for multiple protein-protein interactions. In this thesis, I have expanded this knowledge in a new direction by identification of the amino-terminal region as a predicted discoidin domain, demonstrating that muskelin has a unique domain architecture amongst the kelch-repeat proteins of animals, and by developing molecular, cellular and biochemical evidence for specific self-association properties of muskelin that depend on head-to-tail association involving the discoidin domain and the kejch- repeat p-propeller. These studies also identified features of domain organisation that contribute to the subcellular distribution of muskelin.

The identification of a discoidin domain within the N-terminal region of muskelin by bioinformatic analyses places muskelin in close structural relationship to the previously described discoidin domain/kelch-repeat protein. Galactose Oxidase

(Ito et al., 1994). Though, the differences in cellular distribution, the lack of residues involved in the enzymatic activity of Galactose oxidase and the additional LisH/CTLK motifs in muskelin, suggest differences in cellular functions between these two proteins, the structural data given by the resolved crystal structure of Galactose oxidase gives us predictions about tertiary localisation o f conserved residues, including the MIND sequence in muskelin

(see chapter 3, Figure 5). The predicted localisation of the MIND sequence positioned in the discoidin domain suggests a stabilising role rather than involved in protein-protein interactions. This could be further analysed

224 experimentally in a proteolysis assay comparing the wild-type discoidin domain to MIND mutant (MK-I137AA/138A/P139A) in order to detect conformational changes.

When expressed as a GST-fusion protein, the discoidin domain interacted with the kelch-repeat domain and the C-terminal region of muskelin, and the mutations in MIND sequence inhibited the interaction. So in order to identify the binding interface between these two domains, other interesting residues could be analysed. The resolved crystal structure of the discoidin domain proteins.

Human coagulation factor Va and Villa revealed the localisation of the residues involved in targeting these proteins to the membrane (Pratt et al., 1999;

Macedo-Ribeiro et al., 1999). For the coagulation factor Va, these residues include a double tryptophan (aa2091-2092) and leucine-2144, whereas coagulation factor Villa, these residues include methionine-2199, phenylalanine-2200, and leucine-2252 (Pratt et al., 1999; Macedo-Ribeiro et al.,

1999). The corresponding residues in muskelin include tyrosine-30, leucine-31 and lysine-78, and mutations within these residues could reveal whether the putative protein-protein interaction site is located within this region.

The uniqueness of muskelin was described by a comprehensive bioinformatic analysis of the whole kelch-repeat superfamily in several organisms (Prag and

Adams, 2003). In this analysis, muskelin was identified as the only kelch-repeat protein of vertebrates, D.melanogaster, A.gambiae or S.pombe that also contained a discoidin domain. A comprehensive analysis of gene expression patterns in Schizosaccharomyces pombe has shown that the muskelin-related protein (SPAC15A10.10) is a member of the middle genes, for which the expression peaks at meiosis (Mata et al., 2002). This group contains genes

225 important for mitoses/meiosis including cell-cycie regulators (cdc25 and odd 3), components of the Anaphase Promoting Complex (including apc1, apc5, nuc2, nuc4, nuc20, and nuc23), and kinases (plo1, ark1, and fin1). The sexual reproduction in S.pombe requires meiosis for the production of haploid gametes and is dependent on transcriptional regulation induced by the transcription factor Mei4p (Horie et al., 1998). A genome-wide screen for putative meiotic genes induced by Mei4p identified the muskelin-like protein (SPAC15A10.10) among the group of 9 genes in S.pombe, which contains a Mei4p-binding site named FLEX (Freac-like element of spo 6 ) in the 1-kb region upstream from the initial start codon (Abe and Shimoda, 2000). Comparison of mei4+ and mei4(A) strains showed a specific induction of muskelin-like protein during meiosis in mei4+, but not in mei4D, indicating that the expression of muskelin-related protein in S.pombe is Mei4p-dependent (Abe and Shimoda, 2000).

Furthermore, a gene deletion project in S.pombe has identified the muskelin-like protein (SPAC15A10.10) as a non-essential gene (Decottignies et al., 2003).

These authors identified that all of the essential genes in S.pombe have been conserved among the yeasts and C.elegans indicating a natural selection of genes required for vital functions. The cell-type specific expression of muskelin

(chapter 5, fig. 2) also suggests that muskelin is not an essential gene for cell survival.

The bioinformatic studies provided the basis for development of new muskelin reagents and new questions as to its cellular properties. Using cellular and biochemical assays I have been able to indicate three primary molecular capabilities of muskelin. I have characterised a head-to-tail association mechanism for muskelin homodimerisation that includes the discoidin domain.

226 wild-type MK I M N|m||k3|;k^|M |g|

Formation of Self-association Nuclear Full-length Muskelin construct: particles to DD Localisation

EGFP-MK % isinioiQisiQBsa MK-EGFP % ■aiBminsncBaaaaig V5-MK ■ GKPIPN PLLGLDST ^ | OiSCoWf n % liirsimiaicir VS-Epitope Figure 1. Summary of molecular functions of muskelin and various Muskelin domain construct: mutantions. Four main categorise of muskelin EGFP-MKDD ^^EGFP^^j Discoidin | constructs were analysed, full-length constructs, domain deletions EGFP-MKA85 (^^EGFP^^^ff^scoldln | |Kt||K2|| K3 | | k4| | ks | | Ksj | constructs, point mutations directed

EGFP-MKkelch/C-term EGFP iK l KZ K3 K4 IkSi K6 C-term against conserved residues in specific domains, and point mutations directed against putative protein kinase C Muskelin point mutation constructs: phosphorylation sites. All constructs were analysed for the modulation of

EGFP-MKI137A/V138A/P139A three distinct functions, namely formation of particles in cells, self­ EGFP-MKY488A/V495A association to the discoidin domain detected in biochemical pull down EGFP-MKG478S/G479S assays, and nuclear targeting detected in cells. + indicates a sustained or gain of capacity, - indicates lost or Muskelin putative phosphorylation mutation constructs: unable of the capacity, | indicates decrease in capacity. EGFP-MKS44A IM HIglliMH

EGFP-MKS324A

EGFP-MKS350A

EGFP-MKS396A

EGFP-MKS515A

EGFP-MKS324D

EGFP-MKS324A/S515A K) ro Also, I have identified a capacity of muskelin for promoting formation of intracellular particles in a PKC-dependent manner, and uncovered indications of a nuclear-export signal localised in the C-terminal region. As summarised in

Figure 1, different domains and motifs of muskelin are associated with the different capabilities. The formation of particles is dependent on the discoidin domain and the kelch-repeats, and specifically a stably-folded fourth kelch- repeat. This suggests that the discoidin domain undergoes molecular interactions (which on the basis of the experiments carried out could direct or indirect) with the kelch domain, as confirmed in the pull-down assays. But furthermore, in the pull-down experiments I showed that mutations of the phosphorylation sites S324 or T515, either individually or both together, did not disrupt the binding to the discoidin domain of muskelin, whereas mutation of conserved residues in the kelch-repeat (MKY488AA/495A and

MKG478S/G479S) completely abolished the homodimerisation. These data suggests a model for a two step formation of particles (Figure 2). The first step is a self-association of muskelin. Whether this interaction is in cis or trans is still unclear, but in this model, for simplicity the self-association is outlined as in cis.

The second step is induction of a conformational change exposing a second self-association site or a recruitment of a scaffolding protein, which mediate an accumulation of muskelin protein. The conformational change or the interaction between muskelin and the associated protein is dependent on the self­ association of muskelin. Upon activation of protein kinase C, muskelin becomes phosphorylated at S324 and/or T515. The addition of the phosphate groups changes the second self-association site or destabilises the interaction between muskelin and the associated scaffolding protein, and subsequently leads to destabilisation of the particles. The destabilisation does not completely inhibit

228 Mutation of DD Mutations of Wild type or kelch repeats phosphorylation sites

1. Self-association

Conformational change or unknown binding partner

2. Formation of particles

Protein kinase C

3. Destabilisation of particles

F ig u re 2 . Schematic model of a theoretical relationship between self-association of muskelin and formation of particles in cells. Wild-type muskelin can self-associate and promotes the accumulation of particles in a protein kinase C-dependent manner. The self-association induces conformational changes within the propeller, or recruitment of an unknown factor, which then allows the accumulation of particles. Direct activation of protein kinase C phosphorylates Serine324 and/or ThreonineSlS, which induces conformational changes or decrease the association to the unknown factor leading to a destabilisation of the particles. Mutations of conserved residues in the kelch-repeats or the discoidin domain disrupts the initial self-association and the subsequently inhibits conformational changes or recruitment of the unknown factor and the accumulation of particles. Mutation of the Serine324 or ThreonineSlS does not disrupt the self-association, but inhibits the conformation changes or the recruitment of the unknown factor and a subsequently destabilisation of the particles.

229 the self-association as I still observed particle after treating the cells with TPA, but the reduced number of particles suggest a decrease in the binding affinity between the two proteins. These changes have no effect on the initial self­ association of muskelin. Obviously, this is a very speculative model for the regulation of molecular interaction related to muskelin, but can be used as an initial schedule for further experimental characterisation.

The appearance of EGFP-MKA85 in the nucleus was very surprising, given the clear cytoplasmic distribution of endogenous, EGFP-tagged, and V5-tagged muskelins. This result was observed late in the project period, where it was not possible to undertake a new set of experiments to characterise muskelin in relation to nuclear transport. However, results from another group that were published late in the project period, provided a suggestion as to how muskelin could undergo nuclear shuttling (Umeda et al., 2003). A truncated form of the human Ran-binding protein, RanBPM binds to the Ras-like GTPase Ran

(Nakamura et al., 1998). A yeast two-hybrid screen using the truncated

RanBPM identified three interacting proteins, which including muskelin (Umeda et al., 2003). This interaction was further consolidated in biochemical co­ precipitation assays of transiently expressed GST-tagged muskelin constructs.

Ran is involved in nuclear transport. The nuclear shuttling of Ran is determined by the cycling between GDP- and GTP-bound states, which is regulated by guanosine nucleotide exchange factors (GEF) catalysing the exchange of GDP for GTP, and the GTPase-activating proteins (GAPs) stimulating the intrinsic

GTPase activity (review see Sazer and Dasso, 2000). Whereas Ran and

RanGEF is predominately located in the nucleus, RanGAP is located in

230 cytoplasm. This compartmentation of RanGEF and RanGAP allows a differentiation in the localisation of the GDP- and GTP-bound RanGTPase. The exchange of GDP for GTP in the nucleus facilitates the accumulation of

RanGTP in the nucleus causing a concentration gradient across the NE. This concentration gradient is thought essential for the cycling of Ran between the cytoplasm and the nucleus.

The Ran binding protein, RanBPM, was first characterised as a 55-kDa protein isolated in a yeast two-hybrid screen (Nakamura et al., 1998). Recently, it became obvious that RanBPM55 was a truncated form of a 90-kDa protein, and the two forms were designated RanBPM55 and RanBPM90 (Nishitani et al.,

2001). Indirect immunofluorescent staining of RanBPM55 expressed in Cos-7 cells revealed centrosome localisation and caused ectopic microtubule nucléation sites (Nakamura et al., 1998). In contrast, a GFP-RanBPM90 construct expressed in Cos-7 cells showed diffuse localisation within the cytoplasm and the nucleus, with an increased localisation in perinuclear region

(Nishitani et al., 2001). Furthermore, where RanBPM55 interacts with Ran as showed in pull down assays and yeast two-hybrid screen, co- immunoprecipitation assays of RanBPM90 shows only a very weak interaction with Ran (Nakamura et al., 1998; Nishitani et al., 2001). These discrepancies between RanBPM55 and RanBPM90 behaviour in cellular localisation and interactions with Ran raise the question whether RanBPM90 is actually physiological connected to Ran, and whether it interacts with muskelin as well.

My data suggests that muskelin is not associated with the centrosomes ( see

Chapter 5, Figure 5A-D), where RanBPM55 is specifically located, whereas the diffuse cytoplasmic distribution of RanBPM90 does indeed correspond to the diffuse cytoplasmic localisation o f muskelin (although not provided that these

231 proteins are associated), and therefore could allow a possible Interaction between RanBPMQO and muskelin. RanBPM90 has been Identified In high- molecular weight complex (Nishitani et al., 2001). Many other Interactions of

RanBPM90 have been Identified by yeast two-hybrid screen Including the receptor proteln-tyroslne kinase, MET (Wang, D., et al., 2002), the transcription factor, androgen receptor (Rao et al., 2002), the homeodomaln-lnteracting protein kinase 2 (Wang, Y., et al., 2002), the SlOO-proteIn psorlasin (Emberley et al., 2002), the ubiqultln-speclfic protease USP11 (IdeguchI et al., 2002), and

Twal (Umeda et al., 2003). These data Indicate RanBPM as a scaffolding protein with multiple Interactions, but until all these Interactions have been carefully evaluated by multiple techniques and on endogenous protein and there Is more Information on the function of RanBPM, It remains uncertain, which of these are physiological associations.

Nucleocytoplasmic réallocation has been reported for a number of proteins

Involved In cell-substratum and cell-cell adhesion. The serlne/threonlne kinase

PAK-2 contains a NES In the regulatory domain, and In response to stress stimulants, PAK-2 gets proteolytic degraded to PAK-2q34 and translocated to the nucleus (Jacobi et al., 2 003). Zyxin was Initially Identified located In cell- substratum adhesion In fibroblasts and In cell-cell junctions of epithelial cells and Is not detected In the cell nucleus when cells are In steady-state (Crawford and Beckerle, 1991). Injection of antibodies against chicken zyxIn Into the nuclei of chicken fibroblasts resulted In a nucleocytoplasmic translocation of the antibodies bound to zyxin, and accumulation at sites of cell-substratum adhesion after 16 hours. This suggests that zyxin shuttles between the cytoplasm and the nucleus. The role of zyxin In the nucleus Is still unclear

232 (review Wang and Gilmore, 2003). In vitro reporter gene assays indicate some transactivation activity of zyxin, e ither d irectly o rb ringing transcription factors and activation domains together (Wang and Gilmore, 2001).

Nucleocytoplasmic translocation to tight junction at cell-cell contact sites has been reported for the zonula occludens-1 protein (ZO-1). ZO-1 is nuclear in subconfluent epithelial cells and shifts to the plasma membrane upon the maturation of cell-cell contacts. In the nucleus, ZO-1 interacts with the Y-box transcription factor, YONAB, and regulates the expression of receptor tyrosine kinase, Erb-2 (Gottardi et al., 1996).

Whether muskelin has a function in the nucleus is still to be discovered.

Additionally, the mechanism by which muskelin gets transported across the NE is also uncertain. Possible mechanism to be investigated experimentally, could be that an interaction with Ran or a Ran-interacting protein, such as RanBPM, transports muskelin to the nucleus, where it rapidly recycles back to the cytoplasm. These questions could be addressed by evaluating the nucleus targeted muskelin mutant together with various RanBPM mutants.

The most obvious question still to answer is; how does muskelin promote cellular attachment and spreading in response to TSP-1. The identification of several molecular interactions related to muskelin, suggests that muskelin could function as a scaffolding protein regulated by PKC. Cellular attachment to TSP-

1 does not activate PKC, and should therefore theoretically keep muskelin in an unphosphorylated state. This state was mimicked by the S324A and T515A mutations, with lead to a decrease in formation of particles and a putative open conformation or disrupted association to an unknown factor that could allow

233 binding to other cytoplasmic proteins. The data with the MKA85 mutant also

raise a new possibility, that nuclear-cytoplasmic distribution of muskelin could

be regulated by ECM-adhesion conditions. These events could hypothetically

be the link to the promotion of cellular spreading and attachment to TSP-1.

These possibilities are now under the investigation in the laboratory.

234 Reference list

Aasland.R., Gibson,T.J., and Stewart,A.F. (1995). The PHD finger: implications for chromatin-mediated transcriptional regulation. Trends Biochem. Sci, 20, 56- 59.

Abe,H. and Shimoda,C. (2000). Autoregulated Expression of Schizosaccharomyces pombe Meiosis-Specific Transcription Factor Mei4 and a Genome-Wide Search for Its Target Genes. Genetics, 154, 1497.

Adams,J., Kelso,R., and Cooley,L. (2000). The kelch repeat superfamily of proteins: propellers of cell function. Trends Cell Biol., 10, 17-24.

Adams,J. (2002). Characterization of a Drosophila melanogaster orthologue of muskelin. Gene, 297, 69.

Adams,J.C. and Lawler,J. (1993). Diverse mechanisms for cell attachment to platelet thrombospondin. J. Cell Sci., 104 ( Ft 4), 1061-1071.

Adams, J.C. and Watt,F.M. (1993). Regulation of development and differentiation by the extracellular matrix. Development, 117, 1183-1198.

Adams,J.C. and Lawler,J. (1994). Cell-type specific adhesive interactions of skeletal myoblasts with thrombospondin-1. Mol. Biol. Cell, 5, 423-437.

Adams,J.C. (1995). Formation of stable microspikes containing actin and the 55 kDa actin bundling protein, fascin, is a consequence of cell adhesion to thrombospondin -1 : implications for the anti-adhesive activities of thrombospondin-1. J. Cell Sci., 108 ( Ft 5), 1977-1990.

Adams,J.C. (1997). Characterization of cell-matrix adhesion requirements for the formation of fascin microspikes. Mol. Biol. Cell, 8 , 2345-2363.

Adams,J.C., Seed,B., and Lawler,J. (1998). Muskelin, a novel intracellular mediator of cell adhesive and cytoskeletal responses to thrombospondin- 1 . EMBO J., 17, 4964-4974.

235 Adams,J.C. and Zhang,L. (1999). cDNA cloning of human muskelin and localisation of the muskelin (MKLNI) gene to human chromosome 7q32 and mouse chromosome 6 B1/B2 by physical mapping and FISH. Cytogenet. Cell Genet, 87, 19-21.

Adams,J.C., Clelland,J.D., Gollett,G.D., Matsumura,F., Yamashiro,S., and Zhang,L. (1999). Cell-matrix adhesions differentially regulate fascin phosphorylation. Mol. Biol. Cell, 10, 4177-4190.

Adams,J.C. and Schwartz,M.A. (2000). Stimulation of fascin spikes by thrombospondin-1 is mediated by the GTPases Rac and Cdc42 [see comments]. J. Cell Biol., 150, 807-822.

Adams,J.C. and Tucker,R.P. (2000). The thrombospondin type 1 repeat (TSR) superfamily: diverse proteins with related roles in neuronal development. Dev. Dyn., 218, 280-299.

Adams,J.C., Kureishy,N., and Taylor,A.L. (2001). A role for syndecan-1 in coupling fascin spike formation by thrombospondin-1. J. Cell Biol., 152, 1169- 1182.

Adams,J.C. (2001 )a. Cell-matrix contact structures. Cell Mol. Life Sci., 58, 371- 392.

Adams,J.C. (2001 )b. THROMBOSPONDINS: Multifunctional Regulators of Cell Interactions. Annu. Rev. Cell Dev. Biol., 17, 25-51.

Adams,J.C., Monk,R., Taylor,A.L., Ozbek,S., Fascetti,N., Baumgartner,S., and Engel,J. (2003). Characterisation of Drosophila thrombospondin defines an early origin of pentameric thrombospondins. J. Mol. Biol., 328, 479-494.

Adams,M.D., Celniker.S.E., Holt,R.A., Evans,C.A., Gocayne,J.D., Amanatides,P.G., Scherer,S.E., Li,P,W., et al., (2000). The Genome Sequence of Drosophila melanogaster. Science, 287, 2185-2195.

Agah,A., Kyriakides,T.R., Lawler,J., and Bernstein,P. (2002). The Lack of Thrombospondin-1 (TSP1) Dictates the Course of Wound Healing in Double- TSP1/TSP2-Null Mice. Am J Pathol, 161, 831-839.

236 Ahmad,K.F., Engel,C.K., and Prive,G.G. (1998). Crystal structure of the BTB domain from PLZF. Proc. Natl. Acad. Sci. U. S. A, 95, 12123-12128.

Aho,S. and Uitto,J. (1998). Two-hybrid analysis reveals multiple direct interactions for thrombospondin 1. Matrix Biology, 17, 401-412.

Aidinis,V., Dias,D.C., Gomez,C.A., Bhattacharyya,D., Spanopoulou,E., and Santagata,S. (2000). Definition of minimal domains of interaction within the recombination-activating genes 1 and 2 recombinase complex. J. Immunol., 164, 5826-5832.

Ajuh,P., Chusainow,J., Ryder,U., and Lamond,A.I. (2002). A novel function for human factor 01 (HCF-1), a host protein required for herpes simplex virus infection, in pre-mRNA splicing. EMBO J., 21, 6590-6602.

Akamatsu,Y. and Oettinger.M.A. (1998). Distinct Roles of RAG1 and RAG2 in Binding the V(D)J Recombination Signal Sequences. Mol. Cell. Biol., 18, 4670.

Albagli,0., Dhordain,P., Deweindt,C., Lecocq,G., and Leprince,D. (1995). The BTB/POZ domain: a new protein-protein interaction motif common to DNA- and actin-binding proteins. Cell Growth Differ., 6 , 1193-1198.

Albo,D., Shinohara,T., and Tuszynski,G.P. (2002). Up-regulation of Matrix Metalloproteinase 9 by Thrombospondin 1 in Gastric Cancer. Journal of Surgical Research, 108, 51-60.

Altschul,S.F., Gish,W., Miller,W., Myers,E.W., and Lipman,D.J. (1990). Basic local alignment search tool. J Mol Biol, 215, 403-410.

Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W., and Lipman.D.J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res, 25, 3389-3402.

Andrade,M.A., Perez-lratxeta,C., and Ponting,C.P. (2001 )a. Protein repeats: structures, functions, and evolution. J Struct. Biol, 134, 117-131.

Andrade,M.A., Gonzalez-Guzman,M., Serrano,R., and Rodriguez,P.L. (2001 )b. A combination of the F-box motif and kelch repeats defines a large Arabidopsis family of F-box proteins. Plant Mol. Biol., 46, 603-614. 237 Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature, 408, 796-815.

Aravind,L. and Koonin,E.V. (1999). Gleaning non-trivial structural, functional and evolutionary information about proteins by iterative database searches. J. Mol. Biol., 287, 1023-1040.

Asch.A.S., Barnwell,J., Silverstein,R.L., and Nachman,R.L. (1987). Isolation of the thrombospondin membrane receptor. J Clin. Invest, 79, 1054-1061.

Asch,A.S., Silbiger,S., Heimer,E., and Nachman,R.L. (1992). Thrombospondin sequence motif (CSVTCG) is responsible for CD36 binding. Biochem. Biophys. Res. Commun., 182, 1208-1217.

Baron,A.J., Stevens,C., Wilmot,C., Seneviratne,K.D., Blakeley,V., Dooley,D.M., Phillips,S.E., Knowles,P.P., and McPherson,M.J. (1994). Structure and mechanism of galactose oxidase. The free radical site. J. Biol. Chem., 269, 25095-25105.

Bateman,A., Birney,E., Cerruti,L., Durbin,R., Etwiller,L., Eddy,S.R., Griffiths- Jones,S., et al., (2002). The Pfam Protein Families Database. Nucl. Acids. Res., 30, 276-280.

Batlle,M., Lu,A., Green,D.A., Xue,Y., and Hirsch,J.P. (2003). Krhip and Krh2p act downstream of the Gpa2p G(alpha) subunit to negatively regulate haploid invasive growth. J. Cell Sci., 116, 701-710.

Baumgartner,S., Hofmann,K., Chiquet-Ehrismann,R., and Bucher,P. (1998). The discoidin domain family revisited: new members from prokaryotes and a homology-based fold prediction. Protein Sci., 7, 1626-1631.

Beckerle,M.C., Burridge,K., DeMartino.G.N., and Croall,D.E. (1987). Colocalization of calcium-dependent protease II and one of its substrates at sites of cell adhesion. Cell, %20;51, 569-577.

Behrens,R. and Nurse,P. (2002). Roles of fission yeast tealp in the localization of polarity factors and in organizing the microtubular cytoskeleton. J. Cell Biol., 157, 783-793.

238 Bein.K. and Simons,M. (2000). Thrombospondin Type 1 Repeats Interact with Matrix Metalloproteinase 2. REGULATION OF METALLOPROTEINASE ACTIVITY. J. Biol. Chem., 275, 32167.

Bieber.A.J., Snow,P.M., Hortsch,M., Patel,N.H., Jacobs,J.R., Traquina,Z.R., Schilling,J., and Goodman,O.S. (1989). Drosophila neuroglian: a member of the immunoglobulin superfamily with extensive homology to the vertebrate neural adhesion molecule L I. Cell, 59, 447-460.

Bixby,K.A., Nanao,M.H., Shen,N.V., Kreusch,A., Bellamy,H., Pfaffinger,P.J., and Choe,S. (1999). Zn2+-binding and molecular determinants of tetramerization in voltage-gated K+ channels. Nat. Struct. Biol, 6 , 38-43

Blom,N., Gammeltoft,S., and Brunak,S. (1999). Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J. Mol. Biol., 294, 1351- 1362.

Bogatcheva,N.V., Garcia,J.G.N., and Verin,A.D. (2002). Role of tyrosine kinase signaling in endothelial cell barrier regulation. Vascular Pharmacology, 39, 201- 212.

Bogerd,H.P., Fridell,R.A., Benson,R.E., Hua,J., and Cullen,B.R. (1996). Protein sequence requirements for function of the human T-cell leukemia virus type 1 Rex nuclear export signal delineated by a novel in vivo randomization-selection assay. Mol. Cell. Biol., 16, 4207.

Boguski,M.S., Lowe,T.M., and Tolstoshev,C.M. (1993). dbEST-database for "expressed sequence tags". Nat. Genet., 4, 332-333.

Boivin,B. and Allen,B.G. (2003). Regulation of membrane-bound PKC in adult cardiac ventricular myocytes. Cellular Signalling, 15, 217-224.

Bomont,P., Cavalier,L., Blondeau,F., Hamida,C.B., Belal,S., Tazir,M., Demir,E., Topaloglu,H., et al., (2000). The gene encoding gigaxonin, a new member of the cytoskeletal BTB/kelch repeat family, is mutated in giant axonal neuropathy [In Process Citation]. Nat. Genet, 26, 370-374.

239 Boot-Handford,R.P. and Tuckwell.D.S. (2003). Fibrillar collagen: the key to vertebrate evolution? A tale of molecular incest. Bioessays, 25, 142-151.

Bork.P. and Beckmann,G. (1993). The CUB domain. A widespread module in developmentally regulated proteins. J. Mai. BioL, 231, 539-545.

Bork.P. and Doolittle,R.F. (1994). Drosophila kelch motif is derived from a common enzyme fold. J. Mol. Biol., 236,1277-1282.

Bork,P., Holm,L., and Sander,C. (1994). The immunoglobulin fold. Structural classification, sequence patterns and common core. J. Mol. Biol., 242, 309-320.

Bernstein,P. (1995). Diversity of function is inherent in matricellular proteins: an appraisal of thrombospondin 1. J. Cell Biol., 130, 503.

Bernstein,P. (2001). Thrombospondins as matricellular modulators of cell function. J. Clin. Invest, 107, 929-934.

Borradori,L. and Sonnenberg,A. (1996). Hemidesmosomes: roles in adhesion, signaling and human diseases. Curr. Opin. Cell Biol., 8 , 647-656.

Borradori,L. and Sonnenberg,A. (1999). Structure and Function of Hemidesmosomes: More Than Simple Adhesion Complexes. Journal of Investigative Dermatology, 112, 411-418.

Boulton,S.J., Brook,A., Staehling-Hampton,K., Heitzler,P., and Dyson,N. (2000). A role for Ebi in neuronal cell cycle control. EMBO J., 19, 5376-5386.

Brose,N. and Rosenmund,C. (2002). Move over protein kinase C, you've got company: alternative cellular effectors of diacylglycerol and phorbol esters. J Cell Sci, 115, 4399-4411.

Brown,N.R., Noble,M.E., Endicott,J.A., Carman,E.F., Wakatsuki,S., Mitchell,E., Rasmussen,B., Hunt,T., and Johnson,L.N. (1995). The crystal structure of cyclin A. Structure., 3, 1235-1247.

Burridge,K. and Chrzanowska-Wodnicka,M. (1996). Focal adhesions, contractility, and signaling. Annu. Rev. Cell Dev. Biol., 12:463-518., 463-518.

240 Cahana.A., Escamez.T., Nowakowski.R.S., Hayes,N.L., Giacobini.M., von Holst,A., Shmueli,0., Sapir,T., et a!., (2001). Targeted mutagenesis of Lisi disrupts cortical development and LISI homodimerization. Proc. Natl. Acad. Sci. U. S. A, 98, 6429-6434.

Calapez,A., Pereira,H.M., Calado,A., Braga,J., Rino,J., Carvalho,C., Tavanez,J.P., Wahle,E., et al., (2002). The intranuclear mobility of messenger RNA binding proteins is ATP dependent and temperature sensitive. J. Cell Biol., 159, 795.

Callebaut,l. and Mornon,J.P. (1998). The V(D)J recombination activating protein RAG2 consists of a six-bladed propeller and a PHD fingerlike domain, as revealed by sequence analysis. Cell Mol. Life Sci., 54, 880-891.

Canfield,A.E. and Schor,A.M. (1995). Evidence that tenascin and thrombospondin-1 modulate sprouting of endothelial cells. J Cell Sci, 108, 797.

Cardoso,C., Leventer,R.J., Matsumoto,N., Kuc,J.A., Ramocki,M.B., Mewborn,S.K., Dudlicek.L.L., May,L.F., et al., (2000). The location and type of mutation predict malformation severity in isolated lissencephaly caused by abnormalities within the LISI gene. Hum. Mol. Genet, 9, 3019-3028.

Carmo-Fonseca,M., Platani,M., and Swedlow,J.R. (2002). Macromolecular mobility inside the cell nucleus. Trends Cell Biol., 12, 491-495.

Carragher,N.O., Levkau,B., Ross,R., and Raines,E.W. (1999). Degraded Collagen Fragments Promote Rapid Disassembly of Smooth Muscle Focal Adhesions That Correlates with Cleavage of pp125FAK, Paxillin, and Talin. J. Cell Biol., 147,619-630.

Cazaubon,S., Parker,P.J., Strosberg,A.D., and Couraud,P.O. (1993). Endothelins stimulate tyrosine phosphorylation and activity of p42/mitogen- activated protein kinase in astrocytes. Biochem. J., 293 ( Ft 2), 381-386.

Cazaubon,S.M., Ramos-Morales,F., Fischer,S., Schweighoffer,F., Strosberg,A.D., and Couraud,P.O. (1994). Endothelin induces tyrosine phosphorylation and GRB2 association of She in astrocytes. J. Biol. Chem., 269, 24805-24809.

241 Chandrasekaran.S., Guo.N.h., Rodrigues,R.G., Kaiser,J., and Roberts,D.D. (1999). Pro-adhesive and Chemotactic Activities of Thrombospondin-1 for Breast Carcinoma Cells Are Mediated by alpha 3beta 1 Integrin and Regulated by Insulin-like Growth Factor-1 and CD98. J. Biol. Chem., 274, 11408.

Chang,Y., Bosma.G.C., and Bosma,M.J. (1995). Development of B cells in scid mice with immunoglobulin transgenes: implications for the control of V(D)J recombination. Immunity., 2, 607-616.

Chatah,N. and Abrams,C.S. (2001). G-protein-coupled Receptor Activation Induces the Membrane Translocation and Activation of Phosphatidylinositol-4- phosphate 5-Kinase lalpha by a Rac- and Rho-dependent Pathway. J. Biol. Chem., 276, 34059-34065.

Chen,C.C. (1994). Effects of Ca2+ on the activation of conventional and new PKC isozymes and on TPA and endothelin-1 induced translocations of these isozymes in intact cells. FEBS Lett., 348, 21-26.

Chen,D.B. and Davis,J.S. (2003). Epidermal growth factor induces c-fos and c- jun mRNA via Raf-1/MEK1/ERK-dependent and -independent pathways in bovine luteal cells. Mol Cell Endocrinol., 200,141-154.

Chen,Y., Derin.R., Petralia.R.S., and Li,M. (2002). Actinfilin, a brain-specific actin-binding protein in postsynaptic density. J. Biol. Chem., 277, 30495-30501.

Chung,J., Gao,A.G., and Frazier,W.A. (1997). Thrombspondin acts via integrin- associated protein to activate the platelet integrin alphallbbeta3. J. Biol. Chem., 272, 14740-14746.

Clark,E.A., King.W.G., Brugge,J.S., Symons,M., and Hynes,R.O. (1998). Integrin-mediated Signals Regulated by Members of the Rho Family of

GTPases. J. Ce// 8 /0 /., 142, 573-586.

Corbit,K.C., Trakul,N., Eves,E.M., Diaz,B., Marshall,M., and Rosner,M.R. (2003). Activation of Raf-1 Signaling by Protein Kinase C through a Mechanism Involving Raf Kinase Inhibitory Protein. J. Biol. Chem., 278, 13061-13068.

242 Coscoy.L. and Ganem.D. (2003). PHD domains and E3 ubiquitin ligases: viruses make the connection. Trends in Cell Biology, 13, 7-12.

Craig,K.L. and Tyers.M. (1999). The F-box: a new motif for ubiquitin dependent proteolysis in cell cycle regulation and signal transduction. Prog. Biophys. Mol Biol, 72, 299-328.

Crawford,A.W. and Beckerle,M.C. (1991). Purification and characterization of zyxin, an 82,000-dalton component of adherens junctions. J. Biol. Chem., 266, 5847.

Crawford,S.E., Stellmach,V., Murphy-Ullrich,J.E., Ribeiro,S.M., Lawler,J., Hynes,R.O., Boivin,G.P., and Bouck,N. (1998). Thrombospondin-1 is a major activator of TGF-betal in vivo. Cell, 93, 1159-1170.

Cuff,J.A. and Barton,G.J. (2000). Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins, 40, 502- 511.

Cummings,L., Riley,L., Black,L., Souvorov,A., Resenchuk,S., Dondoshansky,l., and Tatusova,T. (2002). Genomic BLAST: custom-defined virtual databases for complete and unfinished genomes. FEMS Microbiol. Lett., 216, 133-138.

Danilov,Y.N. and Juliano,R.L. (1989). Phorbol ester modulation of integrin- mediated cell adhesion: a postreceptor event. J. Cell Biol., 108, 1925-1933.

Dardik,R. and Lahav,J. (1999). Functional changes in the conformation of thrombospondin-1 during complexation with fibronectin or heparin. Exp. Cell Res, 248, 407-414.

Dawson,D.W., Pearce,S.F., Zhong,R., Silverstein,R.L., Frazier,W.A., and Bouck,N.P. (1997). CD36 mediates the In vitro inhibitory effects of thrombospondin-1 on endothelial cells. J. Cell Biol., 138, 707-717.

De,B., I, Derua,R., Janssens,V., Van Hoof,C., Waelkens,E., Merlevede,W., and Goris,J. (1999). Purification of porcine brain protein phosphatase 2A leucine carboxyl methyltransferase and cloning of the human homologue. Biochemistry, 38, 16539-16547.

243 Decottignies,A., Sanchez-Perez,!., and Nurse,P. (2003). Schizosaccharomyces pombe essential genes: a pilot study. Genome Res., 13, 399-406.

Defilippi,P., Venturino,M., Gulino,D., Duperray,A., Boquet,P., Fiorentini.C., Volpe,G., Palmieri,M., et al., (1997). Dissection of pathways implicated in integrin-mediated actin cytoskeleton assembly. Involvement of protein kinase C, Rho GTPase, and tyrosine phosphorylation. J. Biol. Chem., 272, 21726-21734. del Pozo,M.A., Vicente-Manzanares,M., Tejedor,R., Serrador,J.IVI., and Sanchez-Madrid,F. (1999). Rho GTPases control migration and polarization of adhesion molecules and cytoskeletal ERM components in T lymphocytes. Eur. J Immunol, 29, 3609-3620. del Pozo,M.A., Price,L.S., Alderson,N.B., Ren,X.D., and Schwartz,M.A. (2000). Adhesion to the extracellular matrix regulates the coupling of the small GTPase Rac to its effector PAK. EMBO J., 19, 2008-2014. del Pozo,M.A., Kiosses,W.B., Alderson,N.B., Meller,N., Hahn,K.M., and Schwartz,M.A. (2002). Integrins regulate GTP-Rac localized effector interactions through dissociation of Rho-GDI. Nat. Cell Biol., 4, 232-239.

Dempsey,E.G., Newton,A.C., Mochly-Rosen,D., Fields,A.P., Reyland,M.E., lnsel,P.A., and Messing,R.O. (2000). Protein kinase C isozymes and the regulation of diverse cell responses. Am. J. Physiol Lung Cell Mol. Physiol, 279, L429-L438.

DeRosier,D.J., Tilney,L.G., Bonder,E.M., and Frankl,P. (1982). A change in twist of actin provides the force for the extension of the acrosomal process in Limulus sperm: the false-discharge reaction. J. Cell Biol., 93, 324-337.

Diaz,E., Schimmoller,F., and Pfeffer,S.R. (1997). A novel Rab9 effector required for endosome-to-TGN transport. J. Cell Biol., 138, 283-290.

Dickson,D.W. (2002). Misfolded, protease-resistant proteins in animal models and human neurodegenerative disease. J. Clin. Invest, 110, 1403-1405.

244 DiMilla,P.A., Stone,J.A., Quinn,J.A., Albelda,S.M., and Lauffenburger,D.A. (1993). Maximal migration of human smooth muscle cells on fibronectin and type IV collagen occurs at an intermediate attachment strength. J Cell BioL, 122, 729-737.

Ding,J., Liu,J.J., Kowal,A.S., Nardine,!., Bhattacharya,P., Lee,A., and Yang,Y. (2002). Microtubule-associated protein IB: a neuronal binding partner for gigaxonin. J. Cell BioL, 158, 427-433.

Dinkova-Kostova,A.T., Holtzclaw,W.D., Cole,R.N., ltoh,K., Wakabayashi,N., Katoh,Y., Yamamoto,M., and Talalay,P. (2002). Direct evidence that sulfhydryl groups of Keapi are the sensors regulating induction of phase 2 enzymes that protect against carcinogens and oxidants. Proc. Natl. Acad. Sci. U. S. A, 99, 11908-11913.

DiPietro,L.A., Nissen,N.N., Gamelli,R.L., Koch,A.E., Pyle,J.M., and Polverini,P.J. (1996). Thrombospondin 1 synthesis and function in wound repair. Am J Pathol, 148, 1851-1860.

Disatnik,M.H. and Rando,T.A. (1999). Integrin-mediated muscle cell spreading. The role of protein kinase c in outside-in and inside-out signaling and evidence of integrin cross-talk. J. BioL Chem., 274, 32486-32492.

Dodson,G.S., Guarnieri,D.J., and Simon,M.A. (1998). Src64 is required for ovarian ring canal morphogenesis during Drosophila oogenesis. Development, 125, 2883-2892.

Doxsey,S. (2001). Re-evaluating centrosome function. Nat. Rev. Mol. Cell BioL, 2, 688-698.

Du,X., Saido,T.C., Tsubuki,S., Indig,F.E., Williams,M.J., and Ginsberg,M.H. (1995). Calpain Cleavage of the Cytoplasmic Domain of the Integrin beta(3) Subunit. J. BioL Chem., 270, 26146-26151.

245 Duke-Cohan,J.S., Gu,J., McLaughlin,D.F., Xu,Y., Freeman,G.J., and Schlossman,S.F. (1998). Attractin (DPPT-L), a member of the CUB family of cell adhesion and guidance proteins, is secreted by activated human T lymphocytes and modulates immune cell interactions. Proc. Natl. Acad. Sci. U. S. A, 95, 11336-11341.

Emberley,E., Gietz,R.D., Campbell,J.D., HayGlass,K., Murphy,L., and Watson,P. (2002). RanBPM interacts with psoriasin in vitro and their expression correlates with specific clinical features in vivo in breast cancer. BMC Cancer, 2, 28.

Emes.R.D. and Ponting,C.P. (2001). A new sequence motif linking lissencephaly, Treacher Collins and oral-facial-digital type 1 syndromes, microtubule dynamics and cell migration. Hum. Mol. Genet, 10, 2813-2820.

Ervasti,J.M. and Campbell,K.P. (1993). A role for the dystrophin-glycoprotein complex as a transmembrane linker between laminin and actin. J. Cell Biol., 122, 809-823.

Feng,X., Zhang,J., Barak,L.S., Meyer,T., Caron,M.G., and Hannun,Y.A. (1998). Visualization of Dynamic Trafficking of a Protein Kinase C beta ll/Green Fluorescent Protein Conjugate Reveals Differences in G Protein-coupled Receptor Activation and Desensitization. J. Biol. Chem., 273, 10755-10762.

Ffrench-Constant,C. (1995). Alternative Splicing of Fibronectin-Many Different Proteins but Few Different Functions. Experimental Cell Research, 221,261- 271.

Firbank,S.J., Rogers,M.S., Wilmot,C.M., Dooley,D.M., Halcrow,M.A., Knowles,P.F., McPherson,M.J., and Phillips,S.E. (2001). Crystal structure of the precursor of galactose oxidase: an unusual self-processing enzyme. Proc. Natl. Acad. Sci. U. S. A, 98, 12932-12937.

Fischer,A., Hacein-Bey,S., and Cavazzana-Calvo,M. (2002). Gene therapy of severe combined immunodeficiencies. Nat. Rev. Immunol., 2, 615-621.

246 Fukui.Y. and Yamamoto,M. (1988). Isolation and characterization of Schizosaccharomyces pombe mutants phenotypically similar to rasi-. Mol. Gen. Genet, 215, 26-31.

Fukui.Y., Miyake,S., Satoh,M., and Yamamoto,M. (1989). Characterization of the Schizosaccharomyces pombe ral2 gene implicated in activation of the rasi gene product. Mol. Cell Biol., 9, 5617-5622.

Fulop,V. and Jones,D.T. (1999). Beta propellers: structural rigidity and functional diversity. Curr. Opin. Struct. Bioi, 9, 715-721.

Furuhashi,K. and Hatano,S. (1992)a. Identification of actin kinase activity in purified fragmin-actin complex. FEBS Lett., 310, 34-36.

Furuhashi,K. and Hatano,S. (1992)b. Actin kinase: a protein kinase that phosphorylates actin of fragmin-actin complex. J. Biochem. (Tokyo), 111, 366- 370.

Gahtan,V., Wang,X.J., lkeda,M., Willis,A.L, Tuszynski,G.P., and Sumpio,B.E. (1999). Thrombospondin-1 induces activation of focal adhesion kinase in vascular smooth muscle cells. J Vase. Surg., 29, 1031-1036.

Galagan,J.E., Calvo,S.E., Berkovich,K.A., Selker,E.U., Read,N.D., Jaffe,D., FitzHugh,W., Ma,L.J., et al., (2003). The genome sequence of the filamentous fungus Neurospora crassa. Nature, 422, 859-868.

Gao,A.G., Lindberg,F.P., Dimitry,J.M., Brown,E.J., and Frazier,W.A. (1996)a. Thrombospondin modulates alpha v beta 3 function through integrin- associated protein. J. Cell Biol., 135, 533.

Gao,A.G., Lindberg,F.P., Finn,M.B., Blystone,S.D., Brown,E.J., and Frazier,W.A. (1996)b. Integrin-associated Protein Is a Receptor for the C- terminal Domain of Thrombospondin. J. Biol. Chem., 271, 21.

Garcia-Mata,R., Bebok,Z., Sorscher,E.J., and Sztul,E.S. (1999). Characterization and dynamics of aggresome formation by a cytosolic GFP- chimera. J. Cell Biol., 146,1239-1254.

247 Garner,C.C., Nash.J., and Huganir.R.L. (2000). PDZ domains in synapse assembly and signalling. Trends Cell BioL, 10, 274-280.

Gaskell.A., Crennell,S., and Taylor,G. (1995). The three domains of a bacterial sialidase: a beta-propeller, an immunoglobulin module and a galactose-binding jelly-roll. Structure., 3, 1197-1205.

Gellert,M. (2002). V(D)J recombination: rag proteins, repair factors, and regulation. Annu. Rev. Biochem., 71, 101-132.

Gettemans,J., De Ville,Y., Vandekerckhove,J., and Waelkens,E. (1992). Physarum actin is phosphorylated as the actin-fragmin complex at residues Thr203 and Thr202 by a specific 80 kDa kinase. EMBO J., 11, 3185-3191.

Gettemans,J., De Ville,Y., Waelkens,E., and Vandekerckhove,J. (1995). The actin-binding properties of the Physarum actin-fragmin complex. Regulation by calcium, phospholipids, and phosphorylation. J. Biol. Chem., 270, 2644-2651.

Giancotti,F.G. and Ruoslahti.E. (1999). Integrin signaling. Science, 285, 1028- 1032.

Giancotti,F.G. (2003). A structural view of integrin activation and signaling. Dev. Ceil, 4, 149-151.

Gilmore,A.P. and Burridge,K. (1996). Regulation of vinculin binding to talin and actin by phosphatidyl-inositol-4-5-bisphosphate. Nature, 381, 531-535.

Godyna,S., Liau,G., Popa,l., Stefansson,S., and Argraves,W.S. (1995). Identification of the low density lipoprotein receptor-related protein (LRP) as an endocytic receptor for thrombospondin-1. J. Cell Biol., 129, 1403-1410.

Goebel,H.H. and Warlo,I.A. (2001). Surplus protein myopathies. Neuromuscul. Disord., 11,3-6.

Goffeau,A., Barrell,B.G., Bussey,H., Davis,R.W., Dujon,B., Feldmann,H., Galibert,F., Hoheisel,J.D., et al., (1996). Life with 6000 Genes. Science, 274, 546-567.

248 Goicoechea.S., Orr,A.W., Pallero.M.A., Eggleton.P., and Murphy-Ullrich,J.E, (2000). Thrombospondin Mediates Focal Adhesion Disassembly through Interactions with Cell Surface Calreticulin. J. Biol. Chem., 275, 36358-36368.

Goicoechea.S., Pallero,M.A., Eggleton.P., Michalak,M., and Murphy-Ullrich.J.E. (2002). The Anti-adhesive Activity of Thrombospondin Is Mediated by the N- terminal Domain of Cell Surface Calreticulin. J. Biol. Chem., 277, 37219-37228.

Gomez,C.A., Ptaszek,L.M., Villa,A., Bozzi,F., Sobacchi,C., Brooks,E.G., Notarangelo,L.D., Spanopoulou,E., et al., (2000). Mutations in conserved regions of the predicted RAG2 kelch repeats block initiation of V(D)J recombination and result in primary immunodeficiencies. Mol. Cell Biol., 20, 5653-5664.

Gordon-Weeks,P.R. and Fischer,l. (2000). MAP1B expression and microtubule stability in growing and regenerating axons. Microsc. Res. Tech., 48, 63-74.

Gorlich,D. and Kutay,U. (1999). Transport between the cell nucleus and the cytoplasm. Annu. Rev. Cell Dev. Biol, 15, 607-660.

Gottardi,C.J., Arpin,M., Fanning,A.S., and Louvard,D. (1996). The junction- associated protein, zonula occludens- 1 , localizes to the nucleus before the maturation and during the remodeling of cell-cellacontacts. PNAS, 93, 10779.

Greene,D.K., Tumova,S., Couchman,J.R., and Woods,A. (2003). Syndecan-4 Associates with alpha -Actinin. J. Biol. Chem., 278, 7617-7623.

Guex,N. and Peitsch,M.C. (1997). SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18, 2714- 2723.

Gunn,T.M., Miller,K.A., He,L., Hyman,R.W., Davis,R.W., Azarani,A., Schlossman,S.F., Duke-Cohan,J.S., and Barsh,G.S. (1999). The mouse mahogany locus encodes a transmembrane form of human attractin. Nature, 398, 152-156.

249 Guo,N., Zabrenetzky.V.S., Chandrasekaran,L., Sipes,J.M., Lawler,J., Krutzsch,H.C., and Roberts,D.D. (1998). Differential roles of protein kinase C and pertussis toxin-sensltlve G-blndIng proteins In modulation of melanoma cell proliferation and motility by thrombospondin 1. Cancer Res, 58, 3154-3162.

Guo,N.H., Krutzsch,H.C., Negre,E., Zabrenetzky,V.S., and Roberts,D.D. (1992). Heparln-blnding peptides from the type I repeats of thrombospondin. Structural requirements for heparin binding and promotion of melanoma cell adhesion and chemotaxis. J. BioL Chem., 267, 19349.

Hagmann,J., Grob,M., and Burger,M.M. (1992). The cytoskeletal protein talin Is 0-glycosylated. J. BioL Chem., 267, 14424-14428.

Hannah,B.I., Mlsenhelmer,T.M., Annls,D.S., and Mosher,D.F. (2003). A Polymorphism In Thrombospondln-1 Associated with Familial Premature Coronary Heart Disease Causes a Local Change In Conformation of the Ca2+- blndlng Repeats. J. Bioi. Chem., 278, 8929-8934.

Harashlma,T. and Heltman,J. (2002). The Galpha protein Gpa2 controls yeast differentiation by Interacting with kelch repeat proteins that mimic G beta subunits. Mol. Celi, 10, 163-173.

Hart,G.W. (1997). Dynamic 0-llnked glycosylatlon of nuclear and cytoskeletal proteins. Annu. Rev. Biochem., 6 6 , 315-335.

Hasegawa,H., Katoh,H., Fujlta,H., Morl,K., and Neglshl,M. (2000). Receptor Isoform-speclfic Interaction of prostaglandin EP3 receptor with muskelin. Biochem. Biophys. Res. Commun., 276, 350-354.

Hasson,T., Walsh,J., Cable,J., Mooseker,M.S., Brown,S.D., and Steel,K.P. (1997). Effects of shaker-1 mutations on myosln-Vlla protein and mRNA expression. CeilMotil. Cytoskeleton, 27J ^27-^3Q.

Hauck,C.R., Sleg,D.J., Hsla,D.A., Loftus,J.C., Gaarde,W.A., Monla,B.P., and Schlaepfer,D.D. (2001). Inhibition of Focal Adhesion Kinase Expression or Activity D Isrupts E pidermal G rowth F actor-stimulated S Ignaling P romoting the Migration of Invasive Human Carcinoma Cells. Cancer Res, 61, 7079-7090.

250 Hauck.C.R., Hsia.D.A., Puente,X.S., Cheresh.D.A., and Schlaepfer.D.D. (2002). FRNK blocks v-Src-stimulated invasion and experimental métastasés without effects on cell motility or growth. EMBO J., 21, 6289-6302.

Heid.H., Figge.U., Winter,S., Kuhn,C., Zimbelmann,R., and Franke,W. (2002). Novel actin-related proteins Arp-T 1 and Arp-T2 as components of the cytoskeletal calyx of the mammalian sperm head. Exp. Cell Res., 279, 177- 187.

Henry,M.D. and Campbell,K.P. (1996). Dystroglycan: an extracellular matrix receptor linked to the cytoskeleton. Curr. Opin. Cell Biol., 8 , 625-631.

Hernandez,M.C., Andres-Barquin,P.J., Martinez,S., Bulfone,A., Rubenstein,J.L., and Israel,M.A. (1997). ENC-1: a novel mammalian kelch-related gene specifically expressed in the nervous system encodes an actin-binding protein. J. Neurosci., 17, 3038-3051.

Hoffman,J.R., Dixit,V.M., and O'Shea,K.S. (1994). Expression of thrombospondin in the adult nervous system. J. Comp Neurol., 340,126-139.

Hoffman,J.R. and O'Shea,K.S. (1999). Thrombospondin expression in nerve regeneration II. Comparison of optic nerve crush in the mouse and goldfish. Brain Res. Bull., 48, 421-427.

Holden,P., Meadows,R.S., Chapman,K.L., Grant,M.E., Kadler,K.E., and Briggs,M.D. (2001). Cartilage Oligomeric Matrix Protein Interacts with Type IX Collagen, and Disruptions to These Interactions Identify a Pathogenetic Mechanism in a Bone Dysplasia Family. J. Biol. Chem., 276, 6046-6055.

Holt,R.A., Subramanian,G.M., Halpern,A., Sutton,G.G., Charlab,R., Nusskern,D.R., Wincker,P., Clark,A.G., et al., (2002). The Genome Sequence of the Malaria Mosquito Anopheles gambiae. Sc/ence, 298, 129-149.

Horie,S., Watanabe,Y., Tanaka,K., Nishiwaki,S., Fujioka,H., Abe,H., Yamamoto,M., and Shimoda,C. (1998). The Schizosaccharomyces pombe mei4+ Gene Encodes a Meiosis-Specific Transcription Factor Containing a forkhead DMA-Binding Domain. Mol. Cell. Biol., 18, 2118.

251 Horowitz,A. and Simons,M. (1998). Phosphorylation of the Cytoplasmic Tail of Syndecan-4 Regulates Activation of Protein Kinase Calpha. J. Biol. Chem., 273, 25548-25551.

Horvath,C.M. and Darnell,J.E. (1997). The state of the STATs: recent developments in the study of signal transduction to the nucleus. Curr. Opin. Cell Biol., 9, 233-239.

Hughes,A.L. and Nei,M. (1988). Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature, 335, 167-170.

Hughes,A.L. and Nei,M. (1989). Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection. Proa. Natl. Acad. Sol U. S. A, 86, 958-962.

Humphries,M.J., Symonds,E.J., and Mould,A.P. (2003)a. Mapping functional residues onto integrin crystal structures. Curr. Opin. Struct. Biol., 13, 236-243.

Humphries,M.J., McEwan,P.A., Barton,S.J., Buckley,P.A., Bella,J., and Paul,M.A. (2003)b. Integrin structure: heady advances in ligand binding, but activation still makes the knees wobble. Trends Biochem. Sci, 28, 313-320.

Hunter,T. (1995). Protein kinases and phosphatases: the yin and yang of protein phosphorylation and signaling. Cell, 80, 225-236.

Huttenlocher,A., Palecek,S.P., Lu,Q., Zhang,W., Mellgren,R.L., Lauffenburger,D.A., Ginsberg,M.H., and Horwitz,A.F. (1997). Regulation of Cell Migration by the Calcium-dependent Protease Calpain. J. Biol. Chem., 272, 32719-32722.

Hutter,H., Vogel,B.E., Plenefisch,J.D., Norris,C.R., Proenca,R.B., Spieth,J., Guo,C., Mastwal,S., et al., (2000). Conservation and Novelty in the Evolution of Cell Adhesion and Extracellular Matrix Genes. Science, 287, 989-994.

Hynes,R.O. (1992). Integrins: versatility, modulation, and signaling in cell adhesion. Cell, 69, 11-25.

252 Hynes,R.O. (1999). Cell adhesion: old and new questions. Trends Cell BioL, 9, M33-M37.

Hynes,R.O. and Zhao,Q. (2000). The evolution of cell adhesion. J. Cell Biol., 150, F89-F96.

Hynes,R.O. (2002). Integrins: bidirectional, allosteric signaling machines. Cell, 20;110, 673-687. lchikawa,T., Yamada,M., Homma,D., Cherry,R.J., Morrison,I.E., and Kawato,S. (2000). Digital fluorescence imaging of trafficking of endosomes containing low- density lipoprotein in brain astroglial cells. Biochem. Biophys. Res. Commun., 269, 25-30. ldeguchi,H., Ueda,A., Tanaka,M., Yang,J., Tsuji,T., Ohno,S., Hagiwara,E., Aoki,A., and lshigatsubo,Y. (2002). Structural and functional characterization of the USP11 deubiquitinating enzyme, which interacts with the RanGTP- associated protein RanBPM. Biochem. J, 367, 87-95. llic,D., Furuta,Y., Kanazawa,S., Takeda,N., Sobue,K., Nakatsuji,N., Nomura,S., Fujimoto,J., et al., (1995). Reduced cell motility and enhanced focal adhesion contact formation in cells from FAK-deficient mice. Nature, 377, 539-544. lnomata,M., Hayashi,M., Ohno-lwashita,Y., Tsubuki,S., Saido,T.C., and Kawashima,S. (1996). Involvement of calpain in integrin-mediated signal transduction. Arch. Biochem. Biophys., 328,129-134. lto,N., Phillips,S.E., Stevens,0., Ogel,Z.B., McPherson,M.J., Keen,J.N., Yadav,K.D., and Knowles,P.F. (1991). Novel thioether bond revealed by a 1.7 A crystal structure of galactose oxidase. Nature, 350, 87-90. lto,N., Phillips,S.E., Yadav,K.D., and Knowles,P.F. (1994). Crystal structure of a free radical enzyme, galactose oxidase. J. Mol. Biol., 238, 794-814. ltoh,K., Wakabayashi,N., Katoh,Y., lshii,T., lgarashi,K., Engel,J.D., and Yamamoto,M. (1999). Keapi represses nuclear activation of antioxidant responsive elements by Nrf2 through binding to the amino-terminal Neh2 domain. Genes Dev., 13, 76-86.

253 Izeta.A., Malcomber,S., and O'Hare.P. (2003). Primary structure and compartmentalization of Drosophila melanogaster host cell factor. Gene, 305, 175-183.

Jakobi.R., McCarthy,C.C., KoeppeI.M.A., and Stringer,D.K. (2003). Caspase- activated PAK-2 is regulated by subcellular targeting and proteasomal degradation. J. Biol. Chem., M306494200.

Jawad,Z. and Paoli,M. (2002). Novel Sequences Propel Familiar Folds. Structure, 10, 447-454.

Johnson,K.M., Mahajan,S.S., and Wilson,A.C. (1999). Herpes simplex virus transactivator VP 16 discriminates between HCF-1 and a novel family member, HCF-2. J. Virol., 73, 3930-3940.

Johnson,R.P. and Craig,S.W. (1995). The carboxy-terminal tail domain of vinculin contains a cryptic binding site for acidic phospholipids. Biochem. Biophys. Res Commun., 210, 159-164.

Johnson,R.P. and Craig,S.W. (1995). F-actin binding site masked by the intramolecular association of vinculin head and tail domains. Nature, %19;373, 261-264.

Johnston,J.A., Ward,C.L., and Kopito,R.R. (1998). Aggresomes: a cellular response to misfolded proteins. J. Cell Biol., 143, 1883-1898.

Kaesberg,P.R., Ershler,W.B., Esko,J.D., and Mosher,D.F. (1989). Chinese hamster ovary cell adhesion to human platelet thrombospondin is dependent on cell surface heparan sulfate proteoglycan. J Clin. Invest, 83, 994-1001.

Kaffman,A. and O'Shea,E.K. (1999). Regulation of nuclear localization: a key to a door. Annu. Rev Cell Dev. Biol, 15:291-339., 291-339.

Karin,M., Liu,Z., and Zandi,E. (1997). AP-1 function and regulation. Curr. Opin. Cell Biol., 9, 240-246.

254 Katinka,M.D., Duprat.S., Cornillot.E., Metenier.G., Thomarat.F., Prensier.G., Barbe,V., Peyretaillade,E., at al., (2001). Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature, 414, 450-453.

Kawano,Y., Fukata,Y., Oshiro.N., Amano.M., Nakamura,!., lto,M., Matsumura,F., lnagaki,M., and Kaibuchi,K. (1999). Phosphorylation of Myosin- binding Subunit (MBS) of Myosin Phosphatase by Rho-Kinase In Vivo. J. Cell

8 /0 /., 147, 1023-1038.

Kazanietz,M.G. (2000). Eyes wide shut: protein kinase C isozymes are not the only receptors for the phorbol ester tumor promoters. Mol. Carcinog., 28, 5-1 !

Keating,!.J. and Borisy,G.G. (1999). Centrosomal and non-centrosomal microtubules. Biol Cell, 91, 321-329.

Keeling,P.J. and Fast,N.M. (2002). Microsporidia: biology and evolution of highly reduced intracellular parasites. Annu. Rev. Microbiol., 56, 93-116.

Keeling,P.J., Luker,M.A., and Palmer,J.D. (2000). Evidence from Beta-!ubulin Phylogeny that Microsporidia Evolved from Within the Fungi. Mol Biol Evol, 17, 23-3!

Kelso,R.J., Hudson,A.M., and Cooley,L. (2002). Drosophila Kelch regulates actin organization via Src64-dependent tyrosine phosphorylation. J. Cell Biol., 156, 703-713.

Kim,D.R., Dai,Y., Mundy,C.L., Yang,W,, and Oettinger,M.A. (1999). Mutations of acidic residues in RAG1 define the active site of the V(D)J recombinase. Genes Dev., 13, 3070-3080.

Kim,H., Yang,P., Catanuto,P., Verde,F., Lai,H., Du,H., Chang,F., and Marcus,S. (2003). ! he Kelch repeat protein, !ea1, is a potential substrate target of the p21-activated kinase, Shkl, in the fission yeast, Schizosaccharomyces pombe. J. Biol. Chem., M302609200.

255 Kim.T.A., Lim.J., Ota,S., Raja,S., Rogers,R., Rivnay,B., Avraham,H., and Avrahann,S. (1998). NRP/B, a novel nuclear matrix protein, associates with pllO(RB) and is involved in neuronal differentiation. J. Cell BioL, 141, 553-566.

Kiosses,W.B., Shattil,S.J., Pampori,N., and Schwartz,M.A. (2001). Rac recruits high-affinity integrin alphavbetaS to lamellipodia in endothelial cell migration. Nat. Cell BloL, 3, 316-320.

Kirfel,J., Magin,T.M., and Reichelt,J. (2003). Keratins: a structural scaffold with emerging functions. Cell Mol. Life Sci, 60, 56-71.

Kishimoto,A., Nishiyama,K., Nakanishi,H., Uratsu]i,Y., Nomura,H., Takeyama,Y., and Nishizuka,Y. (1985). Studies on the phosphorylation of myelin basic protein by protein kinase C and adenosine 3':5'-monophosphate- dependent protein kinase. J. BioL Chem., 260, 12492-12499.

Koli,K., Saharinen,J., Hyytiainen,M., Penttinen,C., and Keski-0]a,J. (2001). Latency, activation, and binding proteins of TGF-beta. Microsc. Res. Tech., 52, 354-362.

Kopito,R.R. (2000). Aggresomes, inclusion bodies and protein aggregation. Trends Cell BioL, 10, 524-530.

Kristie,!.M. and Sharp,P.A. (1990). Interactions of the Oct-1 POL) subdomains with specific DMA sequences and with the H SV alpha-trans-activator protein. Genes Dev., 4, 2383-2396.

Kristie,T.M. and Sharp,P.A. (1993). Purification of the cellular 01 factor required for the stable recognition of the Oct-1 homeodomain by the herpes simplex virus alpha- trans-induction factor (VP16). J. BioL Chem., 268, 6525.

Krutzsch,H.C., Choe,B.J., Sipes,J.M., Guo,N., and Roberts,D.D. (1999). Identification of an alpha(3)beta(1) integrin recognition sequence in thrombospondin-1. J Biol Chem., 274, 24080-24086.

256 Kudo,N., Wolff,B., Sekimoto,!., Schreiner,E.P., Yoneda,Y., Yanagida,M., Horjnouchi,S., and Yoshida,M. (1998). Leptomycin B inhibition of signal- mediated nuclear export by direct binding to CRM1. Exp. Cell Res, 242, 540- 547.

Kureishy,N., Sapountzi,V., Prag,S., Anilkumar,N., and Adams,J.C. (2002). Fascina, and their roles in cell structure and function. Bioessays, 24, 350-361.

Kuroda,H., Takahashi,N., Shimada,H., Seki,M., Shinozaki,K., and Matsui,M. (2002). Classification and Expression Analysis of Arabidopsis F-Box-Containing Protein Genes. Plant Cell Physiol, 43, 1073-1085.

Kyriakides,T.R., Zhu,Y.H., Smith,L.T., Bain,S.D., Yang,Z., Lin,M.T., Danielson,K.G., lozzo,R.V., et al., (1998). Mice that lack thrombospondin 2 display connective tissue abnormalities that are associated with disordered collagen fibrillogenesis, an increased vascular density, and a bleeding diathesis. J Cell Biol, 140,419-430.

Kyriakides,T.R., Tam,J.W., and Bernstein,P. (1999). Accelerated wound healing in mice with a disruption of the thrombospondin 2 gene. J Invest Dermatol, 113, 782-787.

Lander,E.S., Linton,L.M., Birren,B., Nusbaum,C., Zody,M.C., Baldwin,J., Devon,K., Dewar,K., Doyle,M., et al., (2001). Initial sequencing and analysis of the human genome. Nature, 409, 860-921.

Landree,M.A., Wibbenmeyer,J.A., and Roth,D.B. (1999). Mutational analysis of RAG1 and RAG2 identifies three catalytic amino acids in RAG1 critical for both cleavage steps of V(D)J recombination. Genes Dev., 13, 3059-3069.

Lawler, J., Weinstein,R., and Hynes,R.O. (1988). Cell attachment to thrombospondin: the role of ARG-GLY-ASP, calcium, and integrin receptors. J. Cell Biol., 107, 2351-2361.

Lawler,J. and Hynes,R.O, (1989). An integrin receptor on normal and thrombasthénie platelets that binds thrombospondin. Blood, 74, 2022-2027.

257 Lee,s. and Herr,W. (2001). Stabilization but not the transcriptional activity of herpes simplex virus VP16-induced complexes is evolutionarily conserved among HCF family members. J. Virol., 75, 12402-12411.

Leppaluoto.J. and Ruskoaho.H. (1992). Endothelin peptides: biological activities, cellular signalling and clinical significance. Ann. Med., 24, 153-161.

Letunic,l., Goodstadt,L., Dickens,N.J., Doerks,T., Schultz,J., Mott,R., Ciccarelli,F., Copley,R.R., et al., (2002). Recent improvements to the SMART domain-based sequence annotation resource. Nucl. Acids. Res., 30, 242-244.

Lewis,J.M., Cheresh,D.A., and Schwartz,M.A. (1996). Protein kinase C regulates alpha v beta 5-dependent cytoskeletal associations and focal adhesion kinase phosphorylation. J Celi Bid, 134, 1323-1332.

Li,J., Brick,P., 0'Hare,M.C., Skarzynski,T., Lloyd,L.F., Curry,V.A., Clark,I.M., Bigg,H.F., et al., (1995). Structure of full-length porcine synovial collagenase reveals a C-terminal domain containing a calcium-linked, four-bladed beta- propeller. Structure., 3, 541-549.

Li,M.G., Serr,M., Edwards,K., Ludmann,S., Yamamoto,D., Tilney,L.G., Field,C.M., and Hays,T.S. (1999). Filamin is required for ring canal assembly and actin organization during Drosophila oogenesis. J. Cell Biol., 146, 1061- 1074.

Li,X., Lopez-Guisa,J.M., Ninan,N., Weiner,E.J., Rauscher lll,F.J., and Marmorstein,R. (1997). Overexpression, Purification, Characterization, and Crystallization of the BTB/POZ Domain from the PLZF Oncoprotein. J. Biol. Chem., 272, 27324.

Li,X., Peng,H., Schultz,D.C., Lopez-Guisa,J.M., Rauscher,F.J., Ill, and Marmorstein,R. (1999). Structure-Function Studies of the BTB/POZ Transcriptional Repression Domain from the Promyelocytic Leukemia Zinc Finger Oncoprotein. Cancer Res, 59, 5275.

258 Li,Z., Calzada.MJ., Sipes,J.M., Cashel,J.A., Krutzsch,H.C., Annis,D.S., Mosher,D.F., and Roberts,D.D. (2002). Interactions of thrombospondins with {alpha}4{beta}1 integrin and CD47 differentially modulate T cell behavior. J. Cell Biol., 157, 509-519.

Liddington,R.C. and G insberg,M.H. (2002). Integrin activation takes shape. J. Cell Biol., 158, 833-839.

Lim,S.T., Longley,R.L., Couchman,J.R., and Woods,A. (2003). Direct binding of syndecan-4 cytoplasmic domain to the catalytic domain of PKCalpha increases focal adhesion localization of PKCalpha. J. Biol. Chem..

Liu,Y., Hengartner,M.O., and Herr,W. (1999). Selected Elements of Herpes Simplex Virus Accessory Factor HCF Are Highly Conserved in Caenorhabditis elegans. Mol. Cell. Biol., 19, 909-915.

Lu,C., Ferzly,M., Takagi.J., and Springer,T.A. (2001). Epitope Mapping of Antibodies to the C-Terminal Region of the Integrin {{beta}}2 Subunit Reveals Regions that Become Exposed Upon Receptor Activation. J Immunol, 166, 5629-5637.

Lymn,J.S., Rao,S.J., Clunn,G.F., Gallagher,K.L., O'Neil,C., Thompson,N.T., and Hughes,A.D. (1999). Phosphatidylinositol 3-Kinase and Focal Adhesion Kinase Are Early Signals in the Growth Factor-Like Responses to Thrombospondin-1 Seen in Human Vascular Smooth Muscle. Arterioscler Thromb Vasa Biol, 19, 2133.

Macedo-Ribeiro,S., Bode,W., Huber,R., Quinn-Allen,M.A., Kim,S.W., Ortel,T.L., Bourenkov,G.P., Bartunik,H.D., et al., (1999). Crystal structures of the membrane-binding C2 domain of human coagulation factor V. Nature, 402, 434- 439.

Magnusson,M.K. and Mosher,D.F. (1998). Fibronectin : Structure, Assembly, and Cardiovascular Implications. Arterioscler Thromb Vase Biol, 18, 1363-1370.

Mahajan,S.S., Little,M.M., Vazquez,R., and Wilson,A.C. (2002). Interaction of HCF-1 with a Cellular Nuclear Export Factor. J. Biol. Chem., 277, 44292-44299.

259 Marchler-BauerA, Panchenko,AR-, Shoemaker,BA, Thiessen,P.A., Geer.L.Y., and Bryant,S.H. (2002). CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res, 30, 281-283.

Marintchev,A., Mullen,M.A., Maciejewski,M.W., Pan,B., Gryk,M.R., and Mullen,G.P. (1999). Solution structure of the single-strand break repair protein

XRCC1 N-terminal domain. Nat. Struct. Bid., 6 , 884-893.

Mata,J. and Nurse,P. (1997). teal and the microtubular cytoskeleton are important for generating global spatial order within the fission yeast cell. Ceii, 89, 939-949.

Mata,J., Lyne,R., Burns,G., and Bahler,J. (2002). The transcriptional program of meiosis and sporulation in fission yeast. Nat. Genet., 32, 143-147.

Matsui,T., Yonemura,S., Tsukita,S., and Tsukita.S. (1999). Activation o f E RM proteins in vivo by Rho involves phosphatidyl-inositol 4-phosphate 5-kinase and not ROCK kinases. Curr. Bid, 9, 1259-1262.

Mattaj,I.W. and Englmeier,L. (1998). Nucleocytoplasmictransport: the soluble phase. Annu. Rev. Biochem., 67, 265-306.

Melchior,P., Paschal,B., Evans,J., and Gerace,L. (1993). Inhibition of nuclear protein import by nonhydrolyzable analogues of GTP and identification of the small GTPase Ran/TC4 as an essential transport factor [published erratum appears in J Cell Biol 1994 Jan;124(1-2):217]. J. Ceii Bid., 123, 1649.

Melchior,F., Guan,T., Yokoyama,N., Nishimoto,T., and Gerace,L. (1995). GTP hydrolysis by Ran occurs at the nuclear pore complex in an early step of protein import. J. Ceii Bid., 131, 571.

Mercurio,A.M. and Shaw,L.M. (1988). Macrophage interactions with laminin: PMA selectively induces the adherence and spreading of mouse macrophages on a laminin substratum. J. Ceii Bid., 107, 1873-1880.

260 Merle,B-, Malaval,L., Lawler,J., Delmas,P., and Clezardin,P. (1997). Decorin inhibits cell attachment to thrombospondin-1 by binding to a KKTR-dependent cell adhesive site present within the N-terminal domain of thrombospondin-1. J Cell Biochem., 67, 75-83.

Mery,L., Mesaeli,N., Michalak,M., Opas,M., Lew,D.P., and Krause,K.H. (1996). Overexpression of Calreticulin Increases Intracellular Ca[IMAGE] Storage and Decreases Store-operated Ca[IMAGE] Influx. J. Biol. Chem., 271, 9332-9339.

Michalak,M., Corbett,E.F., Mesaeli,N., Nakamura,K., and Opas,M. (1999). Calreticulin: one protein, one gene, many functions. Biochem. J., 344 Ft 2, 281- 292.

Mikhailenko,l., Krylov,D., Argraves,K.M., Roberts,D.D., Liau,G., and Strickland,D.K. (1997). Cellular internalization and degradation of thrombospondin -1 is mediated by the amino-terminal heparin binding domain (HBD). High affinity interaction of dimeric HBD with the low density lipoprotein receptor-related protein. J. Biol. Chem., 272, 6784-6791.

Minton,A.P. (2001). The influence of macromolecular crowding and macromolecular confinement on biochemical reactions in physiological media. J. Biol. Chem., 276, 10577-10580.

Misenheimer,T.M., Hannah,B.L., Annis,D.S., and Mosher,D.F. (2003). Interactions among the three structural motifs of the C-terminal region of human thrombospondin-2. Biochemistry, 42, 5125-5132.

Miyamoto,S., Teramoto,H., Coso,O.A., Gutkind,J.S., Burbelo,P.D., Akiyama,S.K., and Yamada,K.M. (1995). Integrin function: molecular hierarchies of cytoskeletal and signaling molecules. J. Cell Biol., 131, 791-805.

Miyamoto,S., Akiyama,S.K., and Yamada,K.M. (1995). Synergistic roles for receptor occupancy and aggregation in integrin transmembrane function. Science, 267, 883-885.

Mo,X., Bailin,T., and Sadofsky,M.J. (1999). RAG1 and RAG2 Cooperate in Specific Binding to the Recombination Signal Sequence in Vitro. J. Biol. Chem., 274, 7025.

261 Moiseeva,E.p. (2001). Adhesion receptors of vascular smooth muscle cells and their functions. Cardiovascular Research, 52, 372-386.

Moore,M.S. and Blobel,G. (1993). The GTP-binding protein Ran/TC4 is required for protein import into the nucleus. Nature, 365, 661-663.

Mostafavi-Pour,Z., Askari,J.A., Parkinson,S.J., Parker,P.J., Ng,T.T.C., and Humphries,M.J. (2003). Integrin-specific signaling pathways controlling focal adhesion formation and cell migration. J. Cell BloL, 161, 155-167.

Mulholland,D.J., Dedhar,S., and Vogl,A.W. (2001). Rat seminiferous epithelium contains a unique junction (Ectoplasmic specialization) with signaling properties both of cell/cell and cell/matrix junctions. BloL Reprod., 64, 396-407.

Mumby,S.M., Raugi,G.J., and Bornstein,P. (1984). Interactions of thrombospondin with extracellular matrix proteins; selective binding to type V collagen. J. Cell BloL, 98, 646.

Murakami,K., Chan,S.Y., and Routtenberg,A. (1986). Protein kinase C activation by cis-fatty acid in the absence of Ca2+ and phospholipids. J. BloL Chem., 261, 15424-15429.

Murzin,A.G. (1992). Structural principles for the propeller assembly of beta- sheets: the preference for seven-fold symmetry. Proteins, 14,191-201.

Nagle,D.L., McGrail,S.H., Vitale,J., Woolf,E.A., Dussault,B.J., Jr., DiRocco,L., Holmgren,L., Montagno,J., et al., (1999). The mahogany protein is a receptor involved in suppression of obesity. Nature, 398, 148-152.

Nakamura,M., Masuda,H., Horii,J., Kuma,K., Yokoyama,N., Ohba,T., Nishitani,H., Miyata,T., et al., (1998). When overexpressed, a novel centrosomal protein, RanBPM, causes ectopic microtubule nucléation simiJar to gamma-tubulin. J. Cell BloL, 143,1041-1052.

Nakayama,M., Nakajima,D., Nagase,T., Nomura,N., Seki,N., and Ohara,O. (1998). Identification of high-molecular-weight proteins with multiple EGF-like motifs by motif-trap screening. Genomics, 51, 27-34.

262 Narumiya.S., Sugimoto.Y., and Ushikubi.F. (1999). Prostanoid receptors: structures, properties, and functions. Physiol Rev., 79, 1193-1226.

Neely,K.E. and Workman,J.L. (2002). Histone acétylation and chromatin remodeling: which comes first? Molecular Genetics and Metabolism, 76,1-5.

Ng,T., Shima,D., Squire,A., Bastiaens,P.I., Gschmeissner,S., Humphries,M.J., and Parker,P.J. (1999). PKCalpha regulates betal integrin-dependent cell motility through association and control of integrin traffic. EMBO J., 18, 3909- 3923.

Nguyen,!., Sherratt,P.J., and Pickett,C.B. (2002). Regulatory Mechanisms Controllong Gene Expression Mediated by the Antioxidant Response Element. Annu. Rev. Pharmacol. Toxicol..

Nishitani,H., Hirose,E., Uchimura,Y., Nakamura,M., Umeda,M., Nishii,K., Mori,N., and Nishimoto,!. (2001). Full-sized RanBPM cDNA encodes a protein possessing a long stretch of proline and glutamine within the N-terminal region, comprising a large protein complex. Gene, 272, 25-33.

Nix,D.A. and Beckerle,M.C. (1997). Nuclear-Cytoplasmic Shuttling of the Focal Contact Protein, Zyxin: A Potential Mechanism for Communication between Sites of Cell Adhesion and the Nucleus. J. Cell Biol., 138, 1139.

Nobes,C.D. and Hall,A. (1995). Rho, rac, and cdc42 GTPases regulate the assembly of multimolecular focal complexes associated with actin stress fibers, lamellipodia, and filopodia. Cell, 81, 53-62.

O'Shea,K.S. and Dixit,V.M. (1988). Unique distribution of the extracellular matrix component thrombospondin in the developing mouse embryo. J. Cell Biol., 107, 2737-2748.

Oh,E.S., Woods,A., and Couchman.J.R. (1997). Multimerization of the cytoplasmic domain of syndecan-4 is required for its ability to activate protein kinase C. J. Biol. Chem., 272, 11805-1181 !

263 Oh,E.S., Woods,A., and Couchman.J.R. (1997). Syndecan-4 proteoglycan regulates the distribution and activity of protein kinase 0. J. Biol. Chem., 272, 8133-8136.

Ohmachi.M., Sugimoto.A., lino.Y., and Yamamoto,M. (1999). kel-1, a novel Kelch-related gene in Caenorhabditis elegans, is expressed in pharyngeal gland cells and is required for the feeding process. Genes to Cells, 4, 325-337.

Olsen,B.R. (1995). New insights into the function of collagens from genetic analysis. Curr. Opin. Cell Biol, 7,720-727.

Ono.S., Yamakita.Y., Yamashiro.S., Matsudaira.P.T., Gnarra.J.R., Obinata.T., and Matsumura.F. (1997). Identification of an actin binding region and a protein kinase C phosphorylation site on human fascin. J. Biol. Chem., 272, 2527-2533.

Opas.M., Szewczenko-Pawlikowski.M., Jass.G.K., Mesaeli.N., and Michalak.M. (1996). Calreticulin modulates cell adhesiveness via regulation of vinculin expression. J Cell Biol, 135, 1913-1923.

Orr.A.W., Pallero.M.A., and Murphy-Ullrich.J.E. (2002). Thrombospondin Stimulates Focal Adhesion Disassembly through Gi- and Phosphoinositide 3- Kinase-dependent ERK Activation. J. Biol. Chem., 277, 20453-20460.

Orr.A.W., Pedraza.C.E., Pallero.M.A., Elzie.C.A., Goicoechea.S., Strickland,O.K., and Murphy-Ullrich.J.E. (2003). Low density lipoprotein receptor-related protein is a calreticulin coreceptor that signals focal adhesion disassembly. J. Cell Biol., 161,1179.

Ouvrier,R.A. (1989). Giant axonal neuropathy. A review. Brain Dev., 11, 207- 214.

Ouzounis.C., Sander.C., Scharf.M., and Schneider,R. (1993). Prediction of protein structure by evaluation of sequence-structure fitness. Aligning sequences to contact profiles derived from three-dimensional structures. J. Mol. Biol., 232, 805-825.

264 Padma.M. and Das.U.N. (1999). Effect of cis-unsaturated fatty acids on the activity of protein kinases and protein phosphorylation in macrophage tumor (AK-5) cells in vitro. Prostaglandins, Leukotrienes and Essential Fatty Acids, 60, 55-63.

Paine,P.L., Moore,L.C., and Horowitz,S.B. (1975). Nuclear envelope permeability. Nature, 254, 109-114.

Palecek,S.P., Loftus,J.C., Ginsberg,M.H., Lauffenburger,D.A., and Horwitz,A.F. (1997). Integrin-ligand binding properties govern cell migration speed through cell-substratum adhesiveness. Nature, 385, 537-540.

Pankov,R. and Yamada,K.M. (2002). Fibronectin at a glance. J Cell Sci, 115, 3861-3863.

Paoli,M., Anderson,B.F., Baker,H.M., Morgan,W.T., Smith,A., and Baker,E.N. (1999). Crystal structure of hemopexin reveals a novel high-affinity heme site formed between two beta-propeller domains. Nat Struct. Biol, 6 , 926-931.

Paramio,J.M. and Jorcano,J.L. (2002). Beyond structure: do intermediate filaments modulate cell signalling? Bioessays, 24, 836-844.

Parekh,R.B. and Rohlff,C. (1997). Post-translational modification of proteins and the discovery of new medicine. Curr. Opin. Biotechnol., 8 , 718-723.

Patel,M.K., Lymn,J.S., Clunn.G.F., and Hughes,A.D. (1997). Thrombospondin-1 Is a Potent Mitogen and Chemoattractant for Human Vascular Smooth Muscle Cells. Arterioscler Thromb Vase Biol, 17, 2107.

Patel,R.S., Odermatt,E., Schwarzbauer,J.E., and Hynes,R.O. (1987). Organization of the fibronectin gene provides evidence for exon shuffling during evolution. EMBO J, 6 , 2565-2572.

Pavalko,F.M. and Burridge,K. (1991). Disruption of the actin cytoskeleton after microinjection of proteolytic fragments of alpha-actinin. J Cell Biol, 114, 481- 491.

265 Pearson,G., Robinson,F., Beers Gibson,!., Xu,B., Karandikar,M., Berman,K., and Cobb,M.H. (2001). Mitogen-Activated Protein (MAP) Kinase Pathways: Regulation and Physiological Functions. EndocrRev, 22, 153-183.

Pfeiffer,D.C. and Vogl,A.W. (1991). Evidence that vinculin is co-distributed with actin bundles in ectoplasmic ("junctional") specializations of mammalian Sertoli cells. Anat. Rec., 231, 89-100.

Philips,J. and Herskowitz,l. (1998). Identification of Kellp, a kelch domain- containing protein involved in cell fusion and morphology in Saccharomyces cerevisiae. J. Cell BioL, 143, 375-389.

Piacentini,L., Gray,M., Honbo,N.Y., Chentoufi,J., Bergman,M., and Karliner,J.S. (2000). Endothelin-1 Stimulates Cardiac Fibroblast Proliferation Through Activation of Protein Kinase 0. Journal of Molecular and Cellular Cardiology, 32, 565-576.

Plow,E.F., Haas,T.A., Zhang,L., Loftus,J., and Smith,J.W. (2000). Ligand binding to integrins. J BioL Chem., 275, 21785-21788.

Prag,S. and Adams,J.C. (2003). Molecular phylogeny of the kelch-repeat superfamily reveals an expansion of BTB/kelch proteins in animals. BMC Bioinformatics., 4, 42.

Pratt,K.P., Shen,B.W., Takeshima,K., Davie,E.W., Fujikawa,K., and Stoddard,B.L. (1999). Structure of the 02 domain of human factor VIII at ! 5 A resolution. Nature, 402, 439-442.

Putney,J.W., Jr., Broad,L.M., Braun,F.J., Lievremont,J.P., and Bird,G.S. (2001). Mechanisms of capacitative calcium entry. J Cell Sci, 114, 2223-2229.

Qian,X., Wang,T.N., Rothman,V.L., Nicosia,R.F., and Tuszynski,G.P. (1997), Thrombospondin-1 Modulates Angiogenesisin Vitroby Up-Regulation of Matrix Metalloproteinase-9 in Endothelial Cells. Experimental Cell Research, 235, 403- 412.

266 QiuJ.X., Kale.S.B., Yarnell,S.H., and Roth.D.B. (2001). Separation-of-function mutants reveal critical roles for RAG2 in both the cleavage and joining steps of V(D)J recombination. Mol. Cell, 7, 77-87.

Quivy.V. and Lint,C.V. (2002). Diversity of acétylation targets and roles in transcriptional regulation: the human immunodeficiency virus type 1 promoter as a model system. Biochemical Pharmacology, 64, 925-934.

Rao,M.A., Cheng,H., Quayle.A.N., Nishitani.H., Nelson,C.C., and Rennie,P.S. (2002). RanBPM, a Nuclear Protein That Interacts with and Regulates Transcriptional Activity of Androgen Receptor and Glucocorticoid Receptor. J. Biol. Chem., 277, 48020.

Reason,A.J., Morris,H.R., Panico,M., Marais,R., Treisman,R.H., Haltiwanger,R.S., Hart,G.W., Kelly,W.G., and Dell,A. (1992). Localization of O- GlcNAc modification on the serum response transcription factor. J. Biol. Chem., 267, 16911-16921.

Reilly,P.T., Wysocka,J., and Herr,W. (2002). Inactivation of the retinoblastoma protein family can bypass the HCF-1 defect in tsBN67 cell proliferation and cytokinesis. Mol. Cell Biol., 22, 6767-6778.

Reilly,P.T. and Herr,W. (2002). Spontaneous reversion of tsBN67 cell proliferation and cytokinesis defects in the absence of HCF-1 function. Exp. Cell Res., 277, 119-130.

Ren,X., Kiosses,W.B., Sieg,D.J., Otey,C.A., Schlaepfer,D.D., and Schwartz,M.A. (2000). Focal adhesion kinase suppresses Rho activity to promote focal adhesion turnover [In Process Citation]. J. Cell Sci., 113 (Ft 20), 3673-3678.

Ribeiro,S.M.aF., Poczatek,M., Schultz-Cherry,S., Vinain,M., and Murphy- Ullrich,J.E. (1999). The Activation Sequence of Thrombospondin-1 Interacts with the Latency-associated Peptide to Regulate Activation of Latent Transforming Growth Factor-beta. J. Biol. Chem., 274, 13586.

Richardson,A. and Parsons,T. (1996). A mechanism for regulation of the adhesion-associated proteintyrosine kinase pp125FAK. Nature, 380, 538-540.

267 Robinson,D.N., Cant,K., and Cooley,L. (1994). Morphogenesis of Drosophila ovarian ring canals. Development, 120, 2015-2025.

Robinson,D.N. and Cooley,L. (1997)a. Drosophila kelch is an oligomeric ring canal actin organizer. J. Cell BioL, 138, 799-810.

Robinson,D.N. and Cooley,L. (1997)b. Examination of the function of two kelch proteins generated by stop codon suppression. Development, 124,1405-1417.

Rosenberg,K., Olsson,H., Morgelin,M., and Heinegard,D. (1998). Cartilage Oligomeric Matrix Protein Shows High Affinity Zinc-dependent Interaction with Triple Helical Collagen. J. Biol. Chem., 273, 20397-20403.

Roth,J.J., Gahtan,V., Brown,J.L., Gerhard,C., Swami,V.K., Rothman,V.L., Tulenko.T.N., and Tuszynski,G.P. (1998). Thrombospondin-1 Is Elevated with both Intimai Hyperplasia and Hypercholesterolemia. Journal of Surgical Research, 74, 11-16.

Rottner,K., Hall,A., and Small,J.V. (1999). Interplay between Rac and Rho in the control of substrate contact dynamics. Curr. Biol., 9, 640-648.

Salanova,M., Ricci,G., Boitani,C., Stefanini,M., De Grossi,S., and Palombi,F. (1998). Junctional contacts between Sertoli cells in normal and aspermatogenic rat seminiferous epithelium contain alpha 6 beta 1 integrins, and their formation is controlled by follicle-stimulating hormone. Biol. Reprod., 58, 371-378.

Sander,C. and Schneider,R. (1991). Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins, 9, 56- 68 .

Sanders,M.C., Way,M., Sakai,J., and Matsudaira.P. (1996). Characterization of the actin cross-linking properties of the scruin-calmodulin complex from the acrosomal process of Limulus sperm. J. Biol. Chem., 271, 2651-2657.

Sannes,P.L. and Wang,J. (1997). Basement membranes and pulmonary development. Exp. Lung Res, 23, 101-108.

268 Sasagawa.K., Matsudo.Y., Kang.M., Fujimura.L., litsuka.Y., Okada.S., Ochiai.T., Tokuhisa.T., and Hatano,M. (2002). Identification of Nd1, a novel murine kelch family protein involved in stabilization of actin filaments. J. Biol. Chem..

Sazer.S. and Dasso,M. (2000). The ran decathlon: multiple roles of Ran. J. Cell Sci., 113 (P t 7), 1111-1118.

Schmid,M.P., Agris,J.M., Jakana,J., Matsudaira.P., and Chiu,W. (1994). Three- dimensional structure of a single filament in the Limulus acrosomal bundle: scruin binds to homologous helix-loop-beta motifs in actin. J. Cell Biol., 124, 341-350.

Schnell.D.J. and Hebert,D.N. (2003). Protein translocons: multifunctional mediators of protein translocation across membranes. Cell, 112, 491-505.

Schoenwaelder,S.M., Yuan,Y., Cooray,P., Salem,H.H., and Jackson,S.P. (1997). Calpain Cleavage of Focal Adhesion Proteins Regulates the Cytoskeletal Attachment of Integrin alpha llbbeta 3 (Platelet Glycoprotein llb/llla) and the Cellular Retraction of Fibrin Clots. J. Biol. Chem., 272, 1694- 1702.

Schultz-Cherry,S., Chen,H., Misenheimer,T.M., and Krutzsch,H.C. (1995). Regulation of Transforming Growth Factor-beta Activation by Discrete Sequences of Thrombospondin 1. J. Biol. Chem., 270, 7304.

Schultz,J., Copley,R.R., Doerks,T., Ponting,C.P., and Bork,P. (2000). SMART: a web-based tool for the study of genetically mobile domains. Nucl. Acids. Res., 28, 231-234.

Schwartz,M.A., Schaller,M.D., and Ginsberg,M.H. (1995). Integrins: emerging paradigms of signal transduction. Annu. Rev. Cell Dev. Biol., 11:549-99., 549- 599.

Schwartz,M.A. (2001). Integrin signaling revisited. Trends Cell Biol., 11, 466- 470.

269 Schwarz,K., Gauss,G.H., Ludwig,L., Pannicke,U., Li,Z., Lindner,D., Friedrich,W., Seger,R.A., et al., (1996). RAG mutations in human B cell- negative SCID. Science, 274, 97-99.

Scott-Drew,S. and ffrench-Constant,C. (1997). Expression and function of thrombospondin-1 in myelinating glial cells of the central nervous system. J. Neurosci. Res., 50, 202-214.

Sellers,J.R. and Kachar,B. (1990). Polarity and velocity of sliding filaments: control of direction by actin and of speed by myosin. Science, 249, 406-408.

Shchelkunov,S., Totmenin,A., and Kolosova,I. (2002). Species-specific differences in organization of orthopoxvirus kelch-like proteins. Virus Genes, 24, 157-162.

Shimaoka,M., Takagi,J., and Springer,T.A. (2002). Conformational regulation of integrin structure and function. Annu. Rev. Biophys. Biomol. Struct, 31:485- 516., 485-516.

Siebrasse,J.P., Coutavas,E., and Peters,R. (2002). Reconstitution of nuclear protein export in isolated nuclear envelopes. J. Cell Biol., 158, 849-854.

Sipes,J.M., Guo,N., Negre,E., Vogel,!., Krutzsch,H.C., and Roberts,D.D. (1993). Inhibition of fibronectin binding and fibronectin-mediated cell adhesion to collagen by a peptide from the second type I repeat of thrombospondin. J. Cell Biol., 121,469.

Slane,J.M., Mosher,D.F., and Lai,O.S. (1988). Conformational change in thrombospondin induced by removal of bound Ca2+. A spin label approach. FEBS Lett., 229, 363-366.

Smith,D.S., Niethammer,M., Ayala,R., Zhou,Y., Gambello,M.J., Wynshaw- Boris,A., and Tsai,L.H. (2000). Regulation of cytoplasmic dynein behaviour and microtubule organization by mammalian L is! Nat. Cell Biol., 2, 767-775.

270 Soltysik-Espanola.M., Rogers,R.A., Jiang,S., Kim,T.A., Gaedigk,R., White,R.A., Avraham,H., and Avraham,S. (1999). Characterization of Mayven, a novel actin- binding protein predominantly expressed in brain. Mol. Biol. Cell, 10, 2361- 2375.

Somers,D.E. (2001). Clock-associated genes in Arabidopsis: a family affair. Philos. Trans. R. Soc. Lond B Biol. Sci., 356,1745-1753.

Stenina,O.I., Krukovets,l., Wang,K., Zhou,Z., Forudi,F., Penn,M.S., Topol,E.J., and Plow,E.F. (2003). Increased Expression of Thrombospondin-1 in Vessel Wall of Diabetic Zucker Rat. Circulation, 107, 3209-3215.

Strack,V., Krutzfeldt,J., Kellerer,M., Ullrich,A., Lammers,R., and Haring,H.U. (2002). The Protein-tyrosine-phosphatase SHP2 is phosphorylated on serine residues 576 and 591 by protein kinase C isoforms alpha, beta 1, beta 2, and eta. Biochemistry, 41, 603-608.

Sun,S., Footer,M., and Matsudaira,P. (1997). Modification of Cys-837 identifies an actin-binding site in the beta-propeller protein scruin. Mol. Biol. Ceii, 8 , 421- 430.

Sun,X., Mosher,D.F., and Rapraeger,A. (1989). Heparan sulfate-mediated binding of epithelial cell surface proteoglycan to thrombospondin. J. Biol. Chem., 264, 2885.

Takagi,J., Erickson,H.P., and Springer,T.A. (2001). C-terminal opening mimics

'inside-out' activation of integrin alpha5beta1. Nat. Struct. Biol., 8 , 412-416.

Takagi,J., Petre,B.M., Walz,T., and Springer,T.A. (2002). Global conformational rearrangements in integrin extracellular domains in outside-in and inside-out signaling. Ceii, 110, 599-11.

Tan,J.L., Ravid,S., and Spudich,J.A. (1992). Control of nonmuscle myosins by phosphorylation. Annu. Rev Biochem., 61:721-59., 721-759.

Taylor,C.W. (2002). Controlling calcium entry. Ceii, 111, 767-769.

271 Taylor,J.M., Mack,C.P., Nolan,K., Regan,C.P., Owens,G.K., and Parsons,J.T. (2001). Selective Expression of an Endogenous Inhibitor of FAK Regulates

Proliferation and Migration of Vascular 8 mooth Muscle Cells. Mol. Cell. Biol., 21, 1565-1572.

The C.elegans Sequencing Consortium (1998). Genome Sequence of the Nematode C. elegans: A Platform for Investigating Biology. Science, 282, 2012-2018.

Thompson,J.D., Higgins,D.G., and Gibson,T.J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22, 4673-4680.

Tilney,L.G. (1975). Actin filaments in the acrosomal reaction of Limulus sperm. Motion generated by alterations in the packing of the filaments. J. Ceil Biol., 64, 289-310.

Togel,M., Wiche,G., and Propst,F. (1998). Novel Features of the Light Chain of Microtubule-associated Protein MAP1B: Microtubule Stabilization, Self Interaction, Actin Filament Binding, and Regulation by the Heavy Chain. J. Cell Biol., 143, 695.

Tooney,P.A., Sakai,T., Sakai,K., Aeschlimann,D., and Mosher,D.F. (1998). Restricted localization of thrombospondin-2 protein during mouse embryogenesis: a comparison to thrombospondin-1. Matrix Biol., 17, 131-143.

Topol,E.J., McCarthy,J., Gabriel,S., Moliterno,D.J., Rogers,W.J., Newby,L.K., Freedman,M., Metivier,J., et al., (2001). Single Nucleotide Polymorphisms in Multiple Novel Thrombospondin Genes May Be Associated With Familial Premature Myocardial Infarction. Circulation, 104, 2641-2644.

Treisman,R. (1996). Regulation of transcription by MAP kinase cascades. Curr.

Opin. Ceil Biol., 8 , 205-215.

Tucker,R.P., Hagios,C., Chiquet-Ehrismann,R., and Lawler,J. (1997). In situ localization of thrombospondin-1 and thrombospondin-3 transcripts in the avian embryo. Dev. Dyn., 208, 326-337.

272 Ullman.K.S., Powers,M.A., and Forbes,DJ. (1997). Nuclear export receptors: from importin to exportin. Cell, %19;90, 967-970.

Umeda,M., Nishitani,H., and Nishimoto,!. (2003). A novel nuclear protein, Twal, and Muskelin comprise a complex with RanBPM. Gene, 303, 47-54.

VanHouten,J.N., Asch,H.L., and Asch,B.B. (2001). Cloning and characterization of ectopically expressed transcripts for the actin-binding protein MIPP in mouse mammary carcinomas. Oncogene, 20, 5366-5372.

Varkey,J.P., Muhlrad,P.J., Minniti,A.N., Do,B., and Ward,S. (1995). The Caenorhabditis elegans spe-26 gene is necessary to form spermatids and encodes a protein similar to the actin-associated proteins kelch and scruin. Genes Dev., 9, 1074-1086.

Velichkova,M., Guttman,J., Warren,C., Eng,L., Kline,K., Vogl,A.W., and Hasson,T. (2002). A human homologue of Drosophila kelch associates with myosin-Vlla in specialized adhesion junctions. Cell Motil. Cytoskeleton, 51, 147- 164.

Verde,P., Mata,J., and Nurse,P. (1995). Fission yeast cell morphogenesis: identification of new genes and analysis of their role during the cell cycle. J Cell Biol, 131, 1529-1538.

Verkman,A.S. (2002). Solute and macromolecule diffusion in cellular aqueous compartments. Trends Biochem. Sci., 27, 27-33.

Vinogradova,O., Velyvis,A., Velyviene,A., Hu,B., Haas,!., Plow,E., and Qin,J. (2002). A structural mechanism of integrin alpha(llb)beta(3) "inside-out" activation as regulated by its cytoplasmic face. Cell, 110, 587-597.

Vogl,A.W., Pfeiffer,D.C., Mulholland,D., Kimel,G., and Guttman,J. (2000). Unique and multifunctional adhesion junctions in the testis: ectoplasmic specializations. Arch. Histol. Cytol., 63, 1-15. von Bulow,M., Heid,H., Hess,H., and Franke,W.W. (1995). Molecular nature of calicin, a major basic protein of the mammalian sperm head cytoskeleton. Exp. Ce//Res., 219, 407-413.

273 Vosseller.K., Wells,L., and Hart.G.W. (2001). Nucleocytoplasmic O- glycosylation: 0-GlcNAc and functional proteomics. Biochimie, 83, 575-581.

Vuori.K. and Ruoslahti,E. (1993). Activation of protein kinase C precedes alpha 5 beta 1 integrin-mediated cell spreading on fibronectin. J. Biol. Chem., 268, 21459-21462.

Wang,D., Li,Z., Messing,E.M., and Wu,G. (2002). Activation of Ras/Erk Pathway by a Novel MET-interacting Protein RanBPM. J. Biol. Chem., 277, 36216.

Wang,S., Yue,H., Derin,R.B., Guggino,W.B., and Li,M. (2000). Accessory protein facilitated CFTR-CFTR interaction, a molecular mechanism to potentiate the chloride channel activity. Celi, 103, 169-179.

Wang,X.Q. and Frazier,W.A. (1998). The Thrombospondin Receptor CD47 (lAP) Modulates and Associates with alpha 2beta 1 Integrin in Vascular Smooth Muscle Cells. Mol. Biol. Cell, 9, 865.

Wang,Y. and Gilmore,T.D. (2001). LIM domain protein Trip 6 has a conserved nuclear export signal, nuclear targeting sequences, and multiple transactivation domains. Biochim. Biophys. Acta, 1538, 260-272.

Wang,Y., Marion,S.E., Li,X., Duttenhofer,l., Debatin,K., and Hug,H. (2002). HIPK2 associates with RanBPM. Biochem. Biophys. Res Commun., 297, 148- 153.

Wang,Y. and Gilmore,T.D. (2003). Zyxin and paxillin proteins: focal adhesion plaque LIM domain proteins go nuclear. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research, 1593, 115-120.

Wang,Z. and Sheetz,M.P. (2000). The C-terminus of tubulin increases cytoplasmic dynein and kinesin processivity. Biophys. J., 78, 1955-1964.

Waterston,R.H., Lindblad-Toh.K., Birney,E., Rogers,J., Abril,J.F., Agarwal,P., Agarwala,R., Ainscough,R., et a I., (2002). I nitial sequencing and comparative analysis of the mouse genome. Nature, 420, 520-562.

274 Way,M., Sanders,M., Garcia,C., Sakai,J., and Matsudaira,P. (1995)a. Sequence and domain organization of scruin, an actin-cross-linking protein in the acrosomal process of Limulus sperm. J. Cell Biol., 128, 51-60.

Way.M., Sanders,M., Chafel,M., Tu,Y.H., Knight,A., and Matsudaira,P. (1995)b. beta-Scruin, a homologue of the actin crosslinking protein scruin, is localized to the acrosomal vesicle of Limulus sperm. J. Cell Sci., 108 ( Pt 10), 3155-3162.

Wei,Y., Jin,J., and Harper,J.W. (2003). The Cyclin E/Cdk2 Substrate and Cajal Body Component p220NPAT Activates Histone Transcription through a Novel LisH-Like Domain. Mol. Cell. Biol., 23, 3669-3680.

Weis,K., Dingwall,0., and Lamond,A.I. (1996). Characterization of the nuclear protein import mechanism using Ran mutants with altered nucleotide binding specificities. EMBO J, 15, 7120-7128.

Wells,L., Vosseller,K., Cole,R.N., Cronshaw,J.M., Matunis,M.J., and Hart,G.W. (2002). Mapping Sites of 0-GlcNAc Modification Using Affinity Tags for Serine and Threonine Post-translational Modifications. Mol. Cell Proteomics., 1, 791- 804.

Wen,W., Meinkoth,J.L., Tsien,R.Y., and Taylor,S.S. (1995). Identification of a signal for rapid export of proteins from the nucleus. Cell, 82, 463-473.

Wendt,K.S., Vodermaier,H.C., Jacob,U., Gieffers,C., Gmachl,M., Peters,J.M., Huber,R., and Sondermann,P. (2001). Crystal structure of the APC10/DOC1 subunit of the human anaphase-promoting complex. Nat. Struct. Biol., 8 , 784- 788.

Williamson,R.A., Henry,M.D., Daniels,K.J., Hrstka,R.F., Lee,J.C., Sunada,Y., Beskrovnaya,0., and Campbell,K.P. (1997). Dystroglycan is essential for early embryonic development: disruption of Reichert's membrane in Dag 1-null mice.

Hum. Mol. Genet., 6 , 831-841.

275 Willis,A.I., Fuse,S., Wang,X.J., Chen,E., Tuszynski,G.P., Sumpio,B.E., and Gahtan,V. (2000). Inhibition of phosphatidylinositol 3-kinase and protein kinase C attenuates extracellular matrix protein-induced vascular smooth muscle cell chemotaxis. J Vase Surg., 31, 1160-1167.

Wilson,A.C., Peterson,M.G., and Herr,W. (1995). The HCF repeat is an unusual proteolytic cleavage signal. Genes Dev., 9, 2445-2458.

Wilson,A.C., Freiman,R.N., Goto,H., Nishimoto,T., and Herr,W. (1997). VP16 targets an amino-terminal domain of HCF involved in cell cycle progression. Mol. Cell Biol., 17, 6139-6146.

Wolff,B., Sanglier,J.J., and Wang,Y. (1997). Leptomycin B is an inhibitor of nuclear export: inhibition of nucleo-cytoplasmic translocation of the human immunodeficiency virus type 1 (HIV-1) Rev protein and Rev-dependent mRNA. Chem. Biol, 4, 139-147.

Wolff,T., O'Neill,R.E., and Palese,P. (1998). NSI-Binding protein (NS1-BP): a novel human protein that interacts with the influenza A virus non structura I NS1 protein is relocalized in the nuclei of infected cells. J. Virol., 72, 7170-7180.

Wood,V., Gwilliam,R., Rajandream,M.A., Lyne,M., Lyne,R., Stewart, A., Sgouros,J., Peat,N., et al., (2002). The genome sequence of Schizosaccharomyces pombe. Nature, 415, 871-880.

Woods,A. and Couchman,J.R. (1994). Syndecan 4 heparan sulfate proteoglycan is a selectively enriched and widespread focal adhesion component. Mol. Biol. Cell, 5, 183-192.

Wysocka,J., Reilly,P.T., and Herr,W. (2001). Loss of HCF-1-chromatin association precedes temperature-induced growth arrest of tsBN67 cells. Mol.

Ce// 8 /0 /., 21, 3820-3829.

Xia,Z., Dickens,M., Raingeaud,J., Davis,R.J., and Greenberg,M.E. (1995). Opposing effects of ERK and JNK-p38 MAP kinases on apoptosis. Science, 270, 1326-1331.

276 Xiong,J.p., Stehle.T., Diefenbach,B., Zhang,R., Dunker,R., Scott, D.L., Joachimiak,A., Goodman,S.L., and Arnaout,M.A. (2001). Crystal structure of the extracellular segment of integrin alpha VbetaS. Science, 294, 339-345.

Xiong,J.P., Stehle,T., Zhang,R., Joachimiak,A., Freeh,M., Goodman,S.L., and Arnaout,M.A. (2002). Crystal structure of the extracellular segment of integrin alpha Vbeta3 in complex with an Arg-Gly-Asp ligand. Science, 296, 151-155.

Xue,F. and Cooley,L. (1993). kelch encodes a component of intercellular bridges in Drosophila egg chambers. Cell, 72, 681-693.

Yabkowitz,R., Dixit,V.M., Guo,N., Roberts,D.D., and Shimizu,Y. (1993). Activated T-cell adhesion to thrombospondin is mediated by the alpha 4 beta 1 (VLA-4) and alpha 5 beta 1 (VLA-5) integrins. J Immunol., 151, 149-158.

Yamakita,Y., Ono,S., Matsumura,F., and Yamashiro,S. (1996). Phosphorylation of human fascin inhibits its actin binding and bundling activities. J. Biol. Chem., 271, 12632-12638.

Yang,Z., Kyriakides,T.R., and Bernstein,P. (2000). Matricellular proteins as modulators of cell-matrix interactions: adhesive defect in thrombospondin 2-null fibroblasts is a consequence of increased levels of matrix metalloproteinase-2. Mol. Biol. Cell, 11, 3353-3364.

Yang,Z. and Nielsen,R. (2000). Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol., 17, 32- 43.

Yao,H., Kristensen,D.M., Mihalek,l., Sowa,M.E., Shaw,C., Kimmel,M., Kavraki,L., and Lichtarge,0. (2003). An accurate, sensitive, and scalable method to identify functional sites in protein structures. J. Mol. Biol., 326, 255- 261.

Yoneda,A. and Couchman,J.R. (2003). Regulation of cytoskeletal organization by syndecan transmembrane proteoglycans. Matrix Biol., 22, 25-33.

277 Yu,J., Hu,S., Wang,J., Wong.G.K., Li,S., Liu,B., Deng,Y., Dai,L., et al., (2002). A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science, 296, 79-92.

Zachary,!., Sinnett-Smith,J., and Rozengurt,E. (1992). Bombesin, vasopressin, and endothelin stimulation of tyrosine phosphorylation in Swiss 3T3 cells. Identification of a novel tyrosine kinase as a major substrate. J. Biol. Chem., 267, 19031-19034.

Zdobnov,E.M., von Mering,C., Letunic,!., Torrents,D., Suyama,M., Copley,R.R., Christophides,G.K., et al.,(2002). Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science, 298, 149-159.

Zheng,X., Chung,D., Takayama.T.K., Majerus,E.M., Sadler,J.E., and Fujikawa,K. (2001). Structure of von Willebrand Factor-cleaving Protease (ADAMTS13), a Metalloprotease Involved in Thrombotic Thrombocytopenic Purpura. J. Biol. Chem., 276, 41059-41063.

Zipper,L.M. and Mulcahy.R.T. (2002). The Keapi BTB/POZ dimerization function is required to sequester Nrf2 in cytoplasm. J. Biol. Chem., 277, 36544- 36552.

278