Structural investigations of proteins involved in the COPI complex

Avital Lahav

Structural investigations of proteins involved in the COPI complex

Research Thesis Submitted In Partial Fulfillment of The Requirements for the Degree of Doctor of Philosophy

Avital Lahav

Submitted to the Senate of the Technion – Israel Institute of Technology

Kislev, 5774 Haifa November 2014

The research was done under the supervision of Prof. Noam Adir in the Schulich Faculty of Chemistry.

First and foremost I would like to thank my supervisor Prof. Noam Adir for his help, guidance and understanding during all my research period.

Second, I would like to thank Prof. Dan Cassel from the Biology Faculty in the Technion for his collaboration and help on this research, and Dr. Anna Parnis from Prof. Dan Cassel's lab for her help and patience during our work.

Third, I would like to thank my friends from the labs of both Prof. Noam Adir and Prof. Dan Cassel, who made the lab a pleasant place to work and from whom I learned a lot.

The generous financial help of the Technion is gratefully acknowledged.

I would like to thank my wonderful and beloved family: Yehuda, Ben and Adam who with one hug make everything right. This thesis is dedicated to them. To David Klein who again, helps me reach the stars.

CONTENTS

Abstract
Abbreviations
1. Introduction
1.1 The COPI system of vesicular traffic
1.1.1 The GTPase-activating proteins, ArfGAPs
1.1.2 The γCad
1.1.3 The MHD of the δ–COPI subunit
1.1.4 The R-based signals recognition site
1.1.5 The ArfGAP1 interaction with δCM
1.2 Single crystal X-ray crystallography
1.2.1 Protein's preparation
1.2.2 Crystal mounting
1.2.3 X-ray source
1.2.4 Basic concepts
1.2.4.1 The unit cell
1.2.4.2 The asymmetric unit
1.2.4.3 The space group
1.2.4.4 The reciprocal lattice
1.2.4.5 Bragg's law and Miller indices
1.2.5 Data processing
1.2.6 Scaling
1.2.7 Solutions to the phase problem
1.2.8 Calculation of the electron density map
1.2.9 Model building
1.2.10 Refinement
1.2.11 Model validation
2. Research goals
3. Research plan and methods
3.1 Research plan
3.2 Methods
3.2.1 Molecular methods
3.2.1.1 Preparation of competent cells
3.2.1.2 PCR- Polymerase Chain Reaction
3.2.1.3 Agarose gel electrophoresis of DNA
3.2.1.4 DNA digestion by restriction enzymes
3.2.1.5 Ligation of PCR product into cloning vector
3.2.1.6 Electro-transformation
3.2.1.7 DNA sequence determination
3.2.2 Protein Isolation
3.2.2.1 Overexpression of the proteins
3.2.2.2 Isolation and primary purification of the proteins
3.2.2.3 SDS-PAGE analysis
3.2.2.4 Chromatographic analysis
3.2.2.5 Protein concentration
3.2.2.6 Determination of protein concentration
3.2.2.7 Mass spectroscopy
3.2.3 Protein analysis
3.2.3.1 SLS measurements
3.2.3.2 ITC measurements
3.2.4 Crystallization of the proteins
3.2.4.1 Improving crystal growth
3.2.4.2 Twinning and implementing twin law
3.2.5 Software used in crystallographic structure determination
3.2.5.1 Data processing, scaling and refinement
3.2.5.1.1 Mosflm
3.2.5.1.2 XDS
3.2.5.1.3 HKL-2000 package
3.2.5.1.4 CCP4
3.2.5.1.5 CNS
3.2.5.1.6 Phenix
3.2.5.2 Molecular graphic tools
3.2.5.2.1 PyMol
3.2.5.2.2 Coot
3.2.5.3 Bioinformatics programs
3.2.5.3.1 BLAST
3.2.5.3.2 Generunner
3.2.5.3.3 Clone Manager
3.2.5.3.4 WebCutter
3.2.5.3.5 PHYRE2
3.2.5.3.6 SWISS-MODEL
3.2.5.3.7 PDBePISA
4. Materials
4.1 General reagents
4.2 Vitamins
4.3 Amino acids
4.4 Biochemical and crystallization kits
4.5 Enzymes
4.6 Other materials
4.7 Growth media
4.8 Cell lines
4.9 Plasmid vectors and oligonucleotides
4.9.1 Vector description
4.9.2 Oligonucleotides
4.10 Buffers and solutions
4.10.1 Solutions for electrophoresis of DNA on agarose gel
4.10.2 Buffers used for protein isolation, purification and biochemical experiments
4.10.2.1 Nickel column buffers
4.10.2.2 Gel filtration buffers
4.10.3 Buffers used for SDS-PAGE
4.11 Gel recipes
4.12 Column material and sources
5. Results
5.1 The interaction between γCad and ArfGAP2
5.1.1 γCad and ArfGAP2 co-crystallization
5.1.1.1 γCad
5.1.1.2 ArfGAP2
5.1.1.3 Co-crystallization trials
5.1.2 The γCad and ArfGAP2 fusion protein
5.1.2.1 Designing the fusion protein
5.1.2.2 The fusion protein interaction experiments
5.1.2.3 The fusion protein crystallization
5.1.2.4 Analysis of the putative ArfGAP2-γCad interaction site
5.2 The δCM
5.2.1 δCM purification
5.2.2 δCM crystallization
5.2.3 δCM data determination
6. Discussion
6.1 The interaction between γCad and ArfGAP2
6.1.1 Adding unstructured protein for crystallization
6.1.2 Findings of interaction
6.1.3 Suggestions for the future
6.2 The δCM
6.2.1 The δCM structure for interpretation of the entire COPI system
6.2.2 Exploration of the δCM interface and assembly
6.2.3 Mutation in δCM involved in Neurodegenerative disorder
7. References

LIST OF FIGURES

Figure Title
1 Model for COPI Assembly
2 Identification of residues in the basic stretch of ArfGAP2/3 that are important for ER localization
3 Potential protein interaction sites on γCad
4 A composite model of F-COPI subcomplex
5 Growing protein crystals by vapor diffusion
6 Crystal organization
7 Conditions that produce strong diffracted rays
8 SDS-PAGE gel of γCad purification
9 Crystal lattice packing of the two γCad crystal structures
10 Structural comparison between γCad structure (1R4X) and the structure determined from the co-crystallization
11 Cloning information of the different lengths of the fusion protein
12 ArfGAP2 competition with γCad, Fusion protein and ArfGAP2-II peptide
13 Calorimetric data for the binding of ArfGAP2-II peptide to the γCad and the "Medium" length Fusion protein
14 Crystals containing the "Medium" length fusion protein
15 Crystal structure of γCad from "Medium" fusion protein overlaid by 2Fo-Fc and Fo-Fc electron density maps showing unoccupied density
16 γCad and ArfGAP2 structure electrostatics
17 Structural comparison between the γCad structure (1R4X) and the structure determined from "Medium" length Fusion protein
18 δCM purification
19 δCM crystal MS analysis
20 δCM diffraction pattern
21 Preliminary model fitting of δCM to the electron density map
22 Model building of δCM
23 Crystal structure of δCM protein
24 Section of the δCM protein crystal structure overlaid by 2Fo-Fc electron density map
25 MHD structural differences
26 Partial model for open conformation of Coatomer
27 Proposed R-based signals binding site visualized in δCM structure
28 Modelling of the I422T point mutation in δCM

LIST OF TABLES

Table Title
1 Oligonucleotides details
2 Crystals obtained from published and novel crystallization conditions
3 Structures solved from crystals containing γCad and ArfGAP-II peptide
4 Data collection and refinement statistics for fusion protein
5 Different crystals obtained from different crystallization conditions
6 Se atoms coordinates for SAD phasing
7 Data collection and refinement statistics for the δCM

LIST OF EQUATIONS

Equation Title
1 Bragg’s law
2 Matthews coefficient
3 Electron density as a Fourier series
4 R-factor

ABSTRACT

The complex set of coordinated and regulated chemical reactions present in all living cells is called metabolism. As the environments of most organisms are constantly changing, the primary role of these processes is to provide the necessary energy to fuel and maintain the organism’s functions, allowing organisms to respond to their environments, grow and reproduce.

To understand cellular processes one needs to delineate how the responses are transmitted and how molecular factors perform specific roles within the cell, through the hierarchy of assemblies and higher structures. One of the ways to elucidate these properties is by the collection of structural data provided by X-ray crystallography. In this research thesis, we studied two proteins that are part of one system; both may serve as a tool for understanding an important cellular process. Determination of three-dimensional structures can provide the template for structure-based drug design for treating human diseases caused by damage to this process.

COPI-coated vesicles mediate retrograde transport from the Golgi back to the ER and intra-Golgi transport. The cytosolic precursor of the COPI coat, called coatomer, creates a lattice on the membrane and concentrates cargo proteins inside the growing vesicle. Disassembly of the lattice occurs following GTP hydrolysis on Arf1, which is controlled by ArfGAPs. This releases Arf1 into the cytoplasm, which in turn allows the release of coatomer from the membrane. The coatomer complex can be thought of as composed of two subcomplexes. The first consists of the β-, γ-, δ- and ζ-COPI subunits. The second consists of the α-, β’- and ε-COPI subunits.

The structure of the carboxyl-terminal region of γ-COPI, referred to as the γ-COPI appendage, was determined previously and showed that it possesses a protein/protein interaction site on its platform subdomain. In mammalian γ-COPI this site most likely binds to the unstructured ArfGAP2 protein. In order to provide insight into the mechanism of COPI function, we are seeking a structural understanding of this interaction. We have successfully cloned a fusion protein containing the γ-COPI appendage protein fused to peptide segments of ArfGAP2 and identified several conditions for the crystallization of the fusion protein. We performed several experiments which demonstrated the binding of the ArfGAP2 peptide to the γ-COPI appendage in these proteins. We managed to collect one data set at 2.8 Å resolution and solved this structure, in which we located a small part of the apparent ArfGAP2 peptide near W776, the previously suspected binding site on the γ-COPI appendage.

The coatomer's δ–COPI subunit is suspected to bind ArfGAP1 and, in addition, contains a sequence suspected to bind arginine-based signals that are important for recognizing proteins that should be retrieved to the ER.

We have successfully identified several conditions for the crystallization of the MHD of the δ–COPI subunit, and for the co-crystallization of this domain with the ArfGAP1 peptide containing the δ–COPI binding site and with several arginine-based signal peptides. We have collected several data sets from the crystals and determined the three-dimensional structure of the MHD of the δ–COPI subunit at 2.15 Å resolution.


Abbreviations

ALPS Amphipathic Lipid Packing Sensor motifs
AP Adaptor Protein
APS Ammonium persulfate
ARF ADP ribosylation factor
ARFGAP ADP ribosylation factor GTPase activating protein
ARFGEF ADP ribosylation factor Guanine nucleotide Exchange Factors
BLAST Basic Local Alignment Search Tool
BT Bis-Tris
CCP4 Collaborative Computational Project 4
CNS Crystallography and NMR system
COOT Crystallographic Object-Oriented Toolkit
COP Coatomer protein complex
DM n-Dodecyl-β-D-Maltoside
DNA Deoxyribonucleic Acid
dNTP Deoxyribonucleotide triphosphate
DTT D,L-dithiothreitol
E. coli Escherichia coli
ER Endoplasmic Reticulum
ESRF European Synchrotron Radiation Facility
GTP Guanosine triphosphate
HEPES N-2-Hydroxyethylpiperazine-N'-2-ethanesulfonic acid
IPTG Isopropyl β-D-thiogalactopyranoside
ITC Isothermal Titration Calorimetry
MAD Multiple wavelength Anomalous Dispersion
MES 2-(N-morpholino) ethanesulfonic acid
MHD Mu Homology Domain
MIR Multiple Isomorphous Replacement
MR Molecular Replacement
MS Mass Spectrometry
NCBI The National Center for Biotechnology Information
NCS Non-crystallographic Symmetry
OD Optical Density
PCR Polymerase Chain Reaction
PDB Protein Data Bank
PEG Polyethylene glycol
Phenix Python-based Hierarchical Environment for Integrated Xtallography
RPM Revolutions Per Minute
SAD Single wavelength Anomalous Dispersion
SDS Sodium Dodecyl Sulfate
SDS-PAGE Sodium Dodecyl Sulfate Poly Acrylamide Gel Electrophoresis
SLS Static Light Scattering
TEM Transmission Electron microscopy
TEMED N,N,N',N'-tetramethylethylenediamine
UV Ultraviolet
XDS X-ray Detector Software
γCad γ-COPI appendage domain
δCM MHD of the δ–COPI subunit


1. Introduction

1.1 The COPI system of vesicular traffic

The eukaryotic secretory pathway consists of a series of membrane-bound compartments that are distinct from each other in terms of molecular composition and function. The establishment and maintenance of the identity of every organelle depends on the correct localization and retention of its resident proteins by selective transport. The sorting of membrane proteins relies on different peptide sorting motifs that are recognized by three main types of vesicle coats, COPI, COPII, and clathrin, and their adaptor complexes [1, 2]. These interactions can capture cargo at various donor membranes (e.g., the ER, different Golgi compartments, or the plasma membrane) and lead to its inclusion into transport vesicles. Each class is responsible for carrying out distinct transport steps within the cell: clathrin mediates post-Golgi traffic, whereas COPII and COPI coats function in the early secretory pathway. The COPI system is responsible for vesicular retrograde trafficking of proteins between Golgi cisternae and from the Golgi to the ER [3, 4]. Coatomer, the coat protein complex of COPI Golgi-derived vesicles, is a soluble 700-kDa protein complex made up of seven subunits, α-, β-, β’-, γ-, δ-, ε- and ζ-COPI [5, 6], which form a stable complex that shuttles between cytosol and membranes [7]. The COPI complex can be reversibly dissociated into two subcomplexes [8]: the F-COPI subcomplex, which is comprised of the β, γ, δ and ζ subunits, and the B-COPI subcomplex, composed of the α, β’, and ε subunits (Figure 1). Each of the subunits is well conserved from yeast to mammals (with the exception of ε-COPI, which appears to play a structure-stabilizing role). No high-resolution structure of the COPI coat complex is available, but sequence homology and partial structures suggest that the F-COPI subcomplex is structurally similar to the clathrin-adaptor core complex, whereas the B-COPI subcomplex is thought to be functionally equivalent to clathrin [9, 10, 11].


Figure 1: Model for COPI Assembly. A proposed model for the COPI complex, based on the conserved structural features with the AP2 complex and clathrin. The figure was adapted from [12].

Coatomer-mediated sorting of proteins is based on the physical interaction between coatomer (COPI) and targeting motifs found in the cytoplasmic domains of membrane proteins. Coatomer and GTP-bound Arf1 together constitute the minimal machinery required to deform a lipid bilayer into vesicles or buds [13, 14, 15]. Upon binding of GTP (which is mediated by guanine nucleotide exchange factors, ArfGEFs), Arf1 undergoes a conformational change that exposes its myristoylated N-terminus, resulting in Arf1 membrane association [16, 17, 18]. Membrane-bound Arf1 recruits the coatomer complex from the cytosolic side of the Golgi, which polymerizes into a lattice on the membrane and sorts cargo ready for return to the ER through its binding to ER retrieval signals. Recognition of transmembrane cargo proteins is mediated by signals on their cytosolic tails, whereas luminal proteins are sorted through transmembrane adaptors that recognize cargo at the luminal side and coatomer at the cytosolic side [19, 20, 21]. The next step is the fission of the bud from the donor membrane, followed by transport of the newly formed vesicular carrier toward the target compartment. The uncoating of COPI vesicles is initiated when Arf1 hydrolyzes its bound GTP, which releases Arf1 and coatomer back into the cytosolic pool, and this ultimately allows fusion of the vesicle with its target membrane to occur [22].

1.1.1 The GTPase-activating proteins, ArfGAPs

From yeast to mammals, two types of GTPase-activating proteins, ArfGAP1 and ArfGAP2/3, control GTP hydrolysis on Arf1 at the Golgi apparatus [23]. All ArfGAPs possess a similar catalytic domain of about 130 amino acids, whereas each subfamily has a distinct noncatalytic part [24], suggesting differential regulation. ArfGAP2 and ArfGAP3 are highly related (58% identity) and show little similarity to ArfGAP1 outside the catalytic domain. Although ArfGAP1 and ArfGAP2/3 display functional interplay, the two types of ArfGAP proteins that reside at the Golgi differ in their regulation and use different combinations of protein–protein and protein–lipid interactions to promote GTP hydrolysis in Arf1-GTP [23]. ArfGAP1 is controlled by membrane distortion imposed by the COPI coat through its amphipathic lipid packing sensor (ALPS) motifs, which are localized in the noncatalytic part and are required for Golgi targeting in vivo [25, 26].

The Golgi localization of ArfGAP2/3 depends on both a central basic stretch of residues and a carboxy-terminal amphipathic motif. The carboxy-terminal amphipathic motif interacts directly with lipid membranes but has a minor role in the regulation of ArfGAP3. The basic stretch that includes positions 215-242 interacts directly with coatomer, which is essential for the catalytic activity of ArfGAP3 on Arf1-GTP. This interaction involves the use of partially redundant KK motifs [23, 27].

The region encompassing residues 204-360 (which interacts very efficiently with the coatomer) consists of residues whose conservation was indicated by multiple alignment of ArfGAP2/3 from different species (Figure 2A). Several residues in this region were identified whose mutation strongly diminished ER localization (Figure 2B).

Figure 2: Identification of residues in the basic stretch of ArfGAP2/3 that are important for ER localization. Panel A. The multispecies alignment shows the high conservation of residues 215-242 (Hs, Homo sapiens; Mm, Mus musculus; Xl, Xenopus laevis; Tn, Tetraodon nigroviridis; Gg, Gallus gallus; Dm, Drosophila melanogaster; Sc, S. cerevisiae). Panel B. The sequence of residues 200-360 of ArfGAP3, with mutations that abrogated ER localization in red, mutations that had no effect in blue, and those with partial effect in orange. The figure was adapted from [23].


The coatomer interacts with ArfGAP2/3 through a binding site on the C-terminal γ-COPI appendage domain [10, 12, 23]. The γ-COPI appendage domain interacts with ArfGAP2 more effectively than with ArfGAP3; therefore, the interaction of ArfGAP2 with the γ-COPI appendage domain through the basic stretch plays a role in coatomer-dependent regulation of ArfGAP2 activity.

1.1.2 The γ-COPI appendage domain

The structure of the γ-COPI appendage domain (residues 608–874 of human γ-COPI) was determined by X-ray crystallography at 1.9 Å resolution [10]. The γ-COPI appendage domain (hereinafter γCad) has a two-subdomain structure. The N-terminal subdomain is an 8-stranded β-sandwich, which possesses a short α-helix at the extreme N-terminal end. The C-terminal or platform subdomain is joined to the N-terminal subdomain by a short linker, and consists of a four-stranded β-sheet flanked by 3 α-helices. The γCad has a similar structure to the appendages of the α- and β2-subunits of the AP2 adaptor protein complex. Structural and functional studies have shown that these subunits, via their C-terminal appendage domains, bind proteins that play accessory/regulatory roles in vesicle formation. Accordingly, it is likely that the platform subdomain in γCad acts as a site for protein interactions. These interactions were proposed to involve binding of accessory proteins in a hydrophobic pocket centered around a pair of aromatic residues (F772 and W776 in γ-COPI). The two aromatic residues are flanked by a cluster of basic amino acids in all three appendage domains, comprising R843 and R845 as well as a third basic residue, R859, in γ-COPI. The conserved basic patch is apparent in the electrostatic potential of the platform domains of the structure, as shown in Figure 3A and 3B [10, 12]. In γCad, binding of ArfGAP2 may occur at this hydrophobic site. Mutation of this site prevents the binding of ArfGAP2 to the γ-COPI appendage both in vitro and in vivo [28]. However, the size of this pocket is unlikely to account for binding of the 37-amino-acid polypeptide that was defined as the minimal γCad binding region in ArfGAP2 [28].
Bioinformatics analysis of the structure of the human γCad for potential protein interaction sites has pointed to a larger hydrophobic pocket, also localized at the platform subdomain of the appendage, that is surrounded by a high negative potential that may be favorable for interaction with the basic stretch of ArfGAP2 and that contains two aromatic residues, F789 and F807 (Figure 3C). Mutation of each of those two amino acids abolished binding to the γCad in vitro [28]. The I804A/V805A/F807A mutation has not been seen to exert any major effect on the conformation of the protein. This correlates with the fact that the F807 side chain is 51% surface exposed, whereas 91% of the side chain of W776 is buried inside the protein core [28]. A structural understanding of the γ-COPI interaction with ArfGAP2 may eventually identify the correct binding site.

Figure 3: Potential protein interaction sites on γCad. In panels A and B the upper panel shows the γCad (PDB code: 1R4X) in ribbon form for orientation, while the lower panel shows the hydrophobic surface potential. Regions of hydrophobicity are indicated in yellow while regions with high hydrophobicity are indicated in green. Panel A. A close-up of the platform subdomain, highlighting the structurally conserved residue W776 and neighboring side chains. W776, highlighted by a black circle, lies within a hydrophobic pocket. Panel B. Overall hydrophobic potential of the γCad. Panel C. Hydrophobic residues at the predicted pockets (orange, left) and electrostatic potential at this face of the appendage domain (red, negative potential; blue, positive potential; right). Panels A and B were adapted from [10]. Panel C was adapted from [28].

1.1.3 The MHD of the δ–COPI subunit

The δ–COPI subunit shows significant homology with the medium (µ) chains of the clathrin-associated adaptor complexes [29, 30]. Together they form a family of proteins composed essentially of a conserved N-terminal domain followed by a more variable C-terminal sequence (the MHD domain). δ–COPI and the µ chains of adaptor complexes recognize similar sorting signals containing an aromatic residue; therefore, the C-terminus of δ–COPI must be in a position within the complex where it can interact with sorting motifs. Moreover, it was suggested that the respective binding sites of δ–COPI and of the µ chains of adaptor complexes might even be structurally related (Figure 4) [30, 31]. However, proteins with endocytosis motifs are evidently transported to the cell surface, while proteins with motifs recognized by COPI are transported to the ER, suggesting that the motifs recognized by COPI or clathrin adaptor complexes are distinct. Although the structures of all the subunits of the B-COPI subcomplex and of two of the subunits of the F-COPI subcomplex have been determined, the structure of the δ–COPI subunit is still unknown (Figure 4).

Figure 4: A composite model of F-COPI subcomplex. Composite model of β-, δ-, γ-, and ζ-COPI subunits bound to membrane via two molecules of Arf1-GTP. The γζ-COPI/Arf1 (PDB code: 3TJZ) crystal structure is colored. The remainder of βδ-COPI, in grey, is modeled based on homology with the crystal structure of the open conformation of AP2 (PDB code: 2XA7) [32]. The figure was adapted from [31].

1.1.4 The R-based signals recognition site

The MHD domain of δ-COPI (hereinafter δCM) interacts with membrane proteins by recognition of R-based signals during the formation of COPI-coated vesicles. In monomeric proteins R-based signals act as ER localization signals, but in assembled multimeric proteins they can be hidden in the complex or rendered inactive by the recruitment of proteins [33, 34, 35]. The region of amino acids 390-412 in bovine δ-COPI, which contributes to the recognition site for R-based signals, is highly conserved across eukaryotic species, consistent with the evolutionarily conserved recognition of R-based sorting motifs [36]. Deletion of this region from the full-length δ–COPI abolishes its ability to interact with those motifs. The homology model of the adaptor-like F-COPI subcomplex, based on the crystal structure of the clathrin adaptor 1 core, suggests that the relevant regions of β- and δ-COPI lie in close proximity and form a single binding site for R-based signals at the subunit interface. The structural comparison between this binding site and the site where endocytic motifs are recognized by clathrin adaptors suggests that the general architecture of the COPI trunk structure has evolved to accommodate completely different cargo-sorting signals [35, 36].

1.1.5 The ArfGAP1 interaction with δCM

The ArfGAP1 carboxyl domain contains several di-aromatic sequences that are reminiscent of motifs known to interact with clathrin adaptors. Within this carboxy-terminal domain, a major coatomer-binding determinant was identified (405AADEGWDNQNW) [26]. This determinant is required for coatomer binding to full-length ArfGAP1, an interaction which is mediated through δ-COPI.


1.2 Single crystal X-ray crystallography

The dynamics within the cell, the mechanisms responsible for the dynamics, and the cell's interactions with the exterior are equally important. To understand them, one needs to delineate how the building-block molecules respond to chemical and physical forces, how the responses are regulated, and how the responses are transmitted through the hierarchy of assemblies and higher structures. Interpretation of X-ray diffraction is the most common experimental means of resolving individual atoms in large molecules. Atomic resolution of proteins cannot be obtained with visible light, whose wavelengths are much longer than the ~1.5 Å distance between the bonded atoms of the molecule. Electromagnetic radiation of this wavelength falls into the X-ray range of 0.1 to 100 Å, allowing X-rays to be diffracted by atoms in molecules, which can result in an atomic-resolution description of the diffracting material [38]. The diffraction is analyzed from crystals and not from a single molecule because most of the X-rays would pass through a single molecule without being diffracted, and the few diffracted beams would be undetectable. A crystal contains huge numbers of molecules in specific, repeating orientations, such that the diffracted beams from all molecules add up to produce a measurable signal. The X-ray scattering is determined by the density of electrons within the crystal. From the electron density it is possible (at high enough resolution) to discern the positions of atoms and from them interpret the geometry of the bonds that make up the molecule being studied.

1.2.1 Protein's preparation

In order to crystallize a protein for X-ray diffraction analysis, crystals of the purified protein must be grown from aqueous solutions that it tolerates; these solutions are called mother liquors. Several factors require consideration, such as protein homogeneity and purity, maintenance of a particular pH, concentration of protein, temperature, and the precipitants that cause the protein to precipitate out of solution. A common feature of nucleation and growth is that both are critically dependent on what is termed the supersaturation of the mother liquor giving rise to the crystals. This is a non-equilibrium condition in which some quantity of the macromolecule in excess of the solubility limit, under specific chemical and physical conditions, is nonetheless present in solution. Two of the most commonly used methods for protein crystallization fall under the category of vapor diffusion [39]. These are known as the hanging drop and sitting drop methods (Figure 5). Both entail a droplet containing purified protein, buffer, and precipitant being allowed to equilibrate with a larger reservoir containing similar buffers and precipitants in higher concentrations. Initially, the droplet of protein solution contains an insufficient concentration of precipitant for crystallization, but as water vaporizes from the drop and transfers to the reservoir, the precipitant concentration and the protein concentration increase to a level optimal for crystallization. Since the system is in equilibrium, these optimum conditions are maintained until the crystallization is complete. The bonds (salt bridges, hydrogen bonds, and hydrophobic interactions) that a molecule forms with its neighbors in the crystal provide the lattice interactions essential for crystal maintenance [40]. The hanging drop method differs from the sitting drop method in the vertical orientation of the protein solution drop within the system. Both methods require a closed system, that is, the system must be sealed off from the outside.
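The concentration changes described above can be illustrated with a minimal sketch. It assumes an idealized drop made by mixing protein stock with reservoir solution, with water as the only species that moves through the vapor phase and a reservoir large enough to keep its concentration constant. The function name, the 1:1 mixing ratio and the numerical values are illustrative assumptions, not conditions used in this work.

```python
def equilibrate_drop(protein_stock, reservoir_precipitant,
                     protein_volume=1.0, reservoir_volume=1.0):
    """Idealized vapor-diffusion drop before and after equilibration.

    Assumptions (illustrative only): water is the only volatile species,
    the reservoir concentration stays constant, and the drop is prepared
    by mixing protein stock with reservoir solution.
    """
    drop_volume = protein_volume + reservoir_volume
    # Mixing dilutes both the protein and the precipitant.
    protein_start = protein_stock * protein_volume / drop_volume
    precipitant_start = reservoir_precipitant * reservoir_volume / drop_volume
    # Water leaves the drop until its precipitant matches the reservoir,
    # so the drop shrinks by this factor (final volume / initial volume).
    shrink = precipitant_start / reservoir_precipitant
    return {
        "protein_start": protein_start,
        "precipitant_start": precipitant_start,
        "final_volume": drop_volume * shrink,
        "protein_final": protein_start / shrink,
        "precipitant_final": reservoir_precipitant,
    }


# Hypothetical example: 10 mg/ml protein stock, 20% precipitant reservoir.
state = equilibrate_drop(protein_stock=10.0, reservoir_precipitant=20.0)
print(state)
```

Under these assumptions a 1:1 drop loses half of its volume, so both the precipitant and the protein end at twice their starting concentration in the drop.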

Figure 5: Growing protein crystals by vapor diffusion. The hanging drop and the sitting drop methods both entail a droplet containing purified protein, buffer, and precipitant. The figure was adapted from [41].

1.2.2 Crystal mounting

Preparation of a crystal for data collection entails placing it in cryoprotected mother liquor for ~10 sec to wash off the old mother liquor, then picking it up in a small circular loop of nylon fiber and immediately freezing it in liquid nitrogen. The droplet of mother liquor keeps the crystal hydrated and the cryoprotectant prevents ice crystals from forming during freezing. The loop is mounted onto the goniometer between an X-ray source and a detector, which records the position and intensity of the diffracted X-rays. There it is held in a stream of cold nitrogen gas coming from a reservoir of liquid nitrogen to maintain a temperature of -100°C. The advantage of collecting data at very low temperature is that it increases the molecular order in the crystal, improves diffraction and reduces radiation damage to the crystal, allowing collection of more data [38].

1.2.3 X-ray source

In the conventional source, used in in-house diffractometers, a heated filament produces electrons that are accelerated by an electric field. The electrons bombard a metal target, most commonly copper, where a high-energy electron displaces an electron from a low-lying orbital of a target metal atom. An electron from a higher orbital then drops into the resulting vacancy, emitting its excess energy as an X-ray photon [38, 42]. Synchrotron radiation is the electromagnetic radiation emitted by electrons (or any charged particles) when they are forced into curved motion. In synchrotron accelerators the particles are driven by energy from radio-frequency transmitters and maintained in circular motion by powerful magnets. The X-rays produced by the source are next conditioned by a number of optical elements installed between the source and the sample position, delivering an optimized beam at the crystal. Three types of devices are commonly used; among them, a monochromator system selects a single wavelength from the source spectrum. During data collection, the direct beam is blocked just beyond the crystal by a metal beam stop. The power of X-rays emitted in synchrotron beamlines is hundreds to thousands of times greater than that of a diffractometer source [42]. A diffraction data set may take days to collect using an in-house diffractometer as the source, while the same information is obtainable in minutes at a synchrotron facility. In addition, a synchrotron is a tunable X-ray source, which is a requirement in cases where sufficient intensity at a particular wavelength is needed.
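Because beamline wavelengths are often quoted as photon energies, a short conversion sketch may help when comparing a copper source with a tunable synchrotron beam. It uses the standard relation E = hc/λ; the constant, the function names and the copper Kα wavelength of 1.5418 Å are supplied here for illustration and are not taken from this work.

```python
# Planck constant times the speed of light, expressed in keV * angstrom.
HC_KEV_ANGSTROM = 12.39842


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for a wavelength given in angstrom."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def wavelength_angstrom(energy_kev):
    """Wavelength in angstrom for a photon energy given in keV."""
    return HC_KEV_ANGSTROM / energy_kev


# Copper K-alpha radiation from an in-house source (assumed 1.5418 angstrom).
cu_energy = photon_energy_kev(1.5418)
print(f"Cu K-alpha: {cu_energy:.2f} keV")
```

At a synchrotron the same functions can be used in the other direction, to find the wavelength that corresponds to a chosen monochromator energy.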

1.2.4 Basic concepts

1.2.4.1 The unit cell

The unit cell is the smallest translationally repeating unit that makes up the crystal. Its dimensions are given as three lengths: a, b, and c, and three angles: α, β and γ. The dimensions of the unit cell determine the spot spacing on the diffraction image: it is a reciprocal relation, so the larger the cell the more spots present for each unit area. The positions and symmetry of the diffraction spots can be used to determine the crystal system.
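As an illustration of how the six cell parameters define the cell, the volume of a general (triclinic) unit cell can be computed from a, b, c, α, β and γ with the standard formula. The sketch below is only an example; the function name and the cell dimensions are hypothetical and are not taken from this work.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (in Å^3) of a general unit cell from its lengths (Å) and angles (degrees)."""
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle)) for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    return a * b * c * math.sqrt(
        1 - cos_a**2 - cos_b**2 - cos_g**2 + 2 * cos_a * cos_b * cos_g
    )


# Hypothetical orthorhombic cell: all angles are 90 degrees, so V reduces to a*b*c.
volume = unit_cell_volume(50.0, 60.0, 70.0, 90.0, 90.0, 90.0)
print(f"Unit cell volume: {volume:.0f} Å^3")
```

For cells with 90° angles the square-root term equals 1 and the volume is simply the product of the three lengths.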

1.2.4.2 The asymmetric unit

The asymmetric unit is the minimal arrangement of molecules in the unit cell that can be reproduced by symmetry operations. This is the smallest repeating component (Figure 6).


Figure 6: Crystal organization. The figure is adopted from http://www.rcsb.org/pdb/home/

1.2.4.3 The space group

The space group is determined by the symmetry relationship of the asymmetric units within the unit cell. A total of 230 space groups exist. They are combinations of the following symmetry operations: 2-, 3-, 4- and 6-fold rotation and screw axes, mirror planes, glide planes and centers of symmetry. However, due to the chirality of biological molecules, the space groups that contain a center of symmetry, a mirror plane or a glide plane are eliminated, leaving only 65 space groups applicable to biological macromolecules [38].

1.2.4.4 The reciprocal lattice

The diffraction pattern consists of reflections (spots) in an orderly array on the film. The lattice formed by the reflections on the film is called the "reciprocal lattice" because of its inverse relationship to the real lattice, which is composed of the unit cells in the crystal. Because the real lattice spacing is inversely proportional to the spacing of the reflections, the dimensions of the unit cell of a crystal can be calculated from the spacing of the reciprocal lattice on the X-ray film.

1.2.4.5 Bragg’s law and Miller indices

W.L. Bragg showed that a set of parallel planes with indices h, k, l and interplanar spacing d_hkl produces a diffracted beam when X-rays of wavelength λ impinge upon the planes at an angle θ and are reflected at the same angle (Figure 7), only if:


Equation 1: Bragg's law.

$2 d_{hkl} \sin\theta = n\lambda$, where n is an integer.

For other angles of incidence θ' that do not fulfill this condition, waves emerging from successive planes are out of phase, so they interfere destructively and no beam emerges at that angle. Whenever a crystal is placed in an X-ray beam, it has to be rotated, because at any one orientation only a subset of planes fulfills Bragg's law. That is the reason why a diffraction experiment involves the collection of diffraction images at different orientations of the crystal. The directions of reflection, as well as the number of reflections, depend only on the unit cell dimensions and not upon the contents of the unit cell.

Figure 7: Conditions that produce strong diffracted rays. The dots represent two parallel planes of lattice points with interplanar spacing d. Two rays are reflected from them at angle θ. The figure is adopted from www.microscopy.ethz.ch

Miller indices h, k, l identify a particular set of equivalent, parallel planes (as described by Bragg's law) which cut the x, y, z axes of the unit cell into h, k and l parts, respectively. Each such set of planes corresponds to one point in the reciprocal lattice.
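Bragg's law can be rearranged to give either the diffraction angle for a known interplanar spacing or the spacing that corresponds to a measured angle. The following sketch illustrates both directions. The function names are hypothetical, and the wavelength of 1.5418 Å (copper Kα radiation) is an assumed example value, not a parameter reported in this work.

```python
import math

CU_K_ALPHA_ANGSTROM = 1.5418  # assumed example wavelength (copper K-alpha)


def bragg_angle_deg(d_spacing, wavelength=CU_K_ALPHA_ANGSTROM, n=1):
    """Angle theta (degrees) at which planes of spacing d (Å) satisfy 2 d sin(theta) = n lambda."""
    sin_theta = n * wavelength / (2.0 * d_spacing)
    if sin_theta > 1.0:
        raise ValueError("No diffraction possible: n*lambda exceeds 2d")
    return math.degrees(math.asin(sin_theta))


def d_spacing_from_angle(theta_deg, wavelength=CU_K_ALPHA_ANGSTROM, n=1):
    """Interplanar spacing d (Å) that diffracts at angle theta (degrees)."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


theta = bragg_angle_deg(2.0)
print(f"Planes 2.0 Å apart diffract at theta = {theta:.2f} degrees")
print(f"Back-calculated spacing: {d_spacing_from_angle(theta):.3f} Å")
```

The check on sin θ reflects the fact that spacings smaller than λ/2 cannot satisfy the equation, which is why the wavelength sets a limit on the attainable resolution.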

1.2.5 Data processing

A full data set may consist of hundreds of separate images taken at different orientations of the crystal. Each spot on the image is assigned an index in order to determine the highest-symmetry lattice of the crystal and its space group.


1.2.6 Scaling

After the unit cell and the space group (the highest-symmetry lattice consistent with the data) have been determined, the first stage of data processing involves scaling. Intensities of diffracted spots vary from one image to the next as a result of variability in both the diffracting power of the crystal and the intensity of the X-ray beam, as well as the mosaic nature of a protein crystal. The accuracy of the measurement of the intensities is of paramount importance. A scale factor must therefore be assigned so that the intensities of all the images in the data set are on a consistent scale. The next stage is the determination of the number of molecules in the unit cell using the Matthews coefficient, Vm, which gives the ratio between the volume of the unit cell (Å³) and the total weight of protein in the unit cell (Dalton) as follows:

Equation 2: Matthews coefficient.

$V_m = \dfrac{V}{MW \cdot Z \cdot X}$

V = unit cell volume.
MW = molecular weight of the protein.
Z = number of asymmetric units in the unit cell.
X = number of molecules in the asymmetric unit.

Analysis of globular protein crystals showed that the fraction of the crystal volume occupied by solvent ranges from 27% to 78%, with the most common value being around 43%. Matthews empirically determined that the most probable values of X are those which give Matthews coefficients between 1.68 Å³/Dalton and 3.53 Å³/Dalton. The most commonly observed value of Vm is near 2.15 Å³/Dalton [43].
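The way the Matthews coefficient is used to choose X can be sketched as a short calculation: Vm is evaluated for each candidate X, and only the values falling in the empirical range of 1.68-3.53 Å³/Dalton are kept. In the sketch below X is taken as the number of molecules per asymmetric unit. The cell volume, molecular weight and Z are hypothetical example values, and the solvent fraction uses Matthews' approximation 1 - 1.23/Vm, which is an added assumption that is not part of Equation 2.

```python
VM_MIN, VM_MAX = 1.68, 3.53  # empirical range of Vm in Å^3/Dalton [43]


def matthews_coefficient(cell_volume_a3, mol_weight_da, z, x):
    """Vm = V / (MW * Z * X), in Å^3 per Dalton."""
    return cell_volume_a3 / (mol_weight_da * z * x)


def solvent_fraction(vm):
    """Approximate solvent fraction (assumption: Matthews' 1 - 1.23/Vm relation)."""
    return 1.0 - 1.23 / vm


# Hypothetical crystal: 300,000 Å^3 cell, 30 kDa protein, 4 asymmetric units per cell.
CELL_VOLUME, MOL_WEIGHT, Z = 300_000.0, 30_000.0, 4

plausible = []
for x in range(1, 7):
    vm = matthews_coefficient(CELL_VOLUME, MOL_WEIGHT, Z, x)
    if VM_MIN <= vm <= VM_MAX:
        plausible.append(x)
        print(f"X = {x}: Vm = {vm:.2f} Å^3/Da, solvent ~ {solvent_fraction(vm):.0%}")
```

When more than one value of X falls inside the range, the Matthews coefficient alone cannot decide between them and additional evidence is needed.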

The mathematical relationship between a molecule's electron cloud (real space: ρ(x,y,z)) and its diffraction pattern (reciprocal space: F(h,k,l)) [38] can be described by a Fourier transform of the structure factors. The structure factor is a wave created by the superposition of many individual waves, each resulting from diffraction by an individual atom in the unit cell.

Each structure factor, being a wave, possesses three characteristic features: amplitude, frequency and phase. In the case of a diffraction pattern, the frequency is that of the X-ray source used to create the diffraction. The amplitude is obtainable from the intensity of reflection h,k,l, since $|F_{hkl}| \propto (I_{hkl})^{1/2}$, but the phase of each reflection is not recorded on the film. This is known as the phase problem.


1.2.7 Solutions to the phase problem

The phase angle can be determined in a variety of ways. The simplest technique is known as molecular replacement (MR) [44] and requires knowledge of the determined structure of a homologous protein. This molecule serves as an initial search model from which a first estimate of the phase angles can be calculated. In order to determine the orientation and position of the unknown molecule within the unit cell, the model is positioned within the unit cell of the unknown molecule using three rotational and three translational variables.

In cases where such information isn't available, multi-wavelength anomalous dispersion (MAD) can be used. This technique requires an atom with a significant anomalous scattering signal in the protein. Selenium atoms are often used in the form of selenomethionine, which is incorporated into a recombinant protein in place of methionine by growing bacteria in specific media. In MAD, data are collected from one crystal using 3-4 X-ray wavelengths, one of which is close to the absorption edge of the scatterer (when only one data set, collected at this wavelength, is used to solve the phase problem, the technique is called SAD) [45, 46]. The position of the anomalous scatterer is determined from the differences in the intensities of the reflections between data sets collected at different wavelengths. Due to a mathematical relationship between the position of the anomalous scatterer and the positions of the other atoms in the structure, this information is used for deriving the positions and therefore the phases for all the reflections.

Another method that can be used for solving the phase problem is multiple isomorphous replacement (MIR). This technique requires heavy atoms inserted into the protein crystal. In MIR, the diffraction data of the native protein crystal and of at least two heavy atom derivatives are collected (when only one heavy atom derivative set is collected, the technique is called SIR) [47]. From the differences in intensities between the reflections of the heavy atom derivatives and those of the native crystal, the positions of the heavy atoms can be derived and, from them, the phases of the native crystal reflections. Each derivative yields two possible solutions for the phase of each reflection; the correct solution is the one that is equal for all of the derivatives, and therefore usually more than two derivatives are required.

1.2.8 Calculation of the electron density map

After retrieving the estimated phases, they are combined with the scaled intensities for calculation of the structure factors that are used in Fourier transformation for the building of the electron density map. Because the Fourier transform operation is reversible, the electron density is in turn the transform of the structure factors as follows:


Equation 3: Electron density as a Fourier series.

$\rho(x,y,z) = V^{-1} \sum_h \sum_k \sum_l |F(hkl)| \, e^{-2\pi i (hx + ky + lz - \alpha_{hkl})}$

This equation shows the electron density ρ(x,y,z) as a function of the known amplitudes |F(hkl)| and the unknown phases αhkl of each reflection, represented by its Miller indices h, k, l. V is the volume of the unit cell. Unlike the disordered solvent atoms, the protein atoms, which occupy symmetry-related positions in the unit cell, contribute to the diffraction pattern, and so the protein's outline can be calculated. Density modification can be applied in order to yield a more interpretable electron density map, leading towards resolving the 3-dimensional structure.
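The Fourier synthesis can be illustrated with a one-dimensional toy "crystal". In the sketch below, structure factors are first computed from two point atoms, then split into amplitude and phase, and finally summed back into a density whose highest peak falls on the heavier atom. Everything here is a simplified assumption for illustration: the atom positions and weights are invented, the cell length stands in for the volume V, and the phases are handled in radians.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for atoms given as (fractional x, weight f)."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def electron_density(factors, x, cell_length=1.0):
    """rho(x) = (1/L) * sum_h |F(h)| * exp(-2*pi*i*h*x + i*phase(h))."""
    total = 0.0
    for h, factor in factors.items():
        amplitude, phase = cmath.polar(factor)  # what is measured / what is lost
        total += (amplitude * cmath.exp(-2j * math.pi * h * x + 1j * phase)).real
    return total / cell_length


# Hypothetical 1D cell with a lighter atom at x = 0.20 and a heavier one at x = 0.65.
atoms = [(0.20, 6.0), (0.65, 8.0)]
factors = structure_factors(atoms, h_max=15)

grid = [i / 200 for i in range(200)]
density = [electron_density(factors, x) for x in grid]
peak_x = grid[density.index(max(density))]
print(f"Highest density peak at x = {peak_x:.3f}")
```

Because both the amplitude and the phase enter each term of the sum, the same amplitudes combined with wrong phases would place the peaks elsewhere, which is the practical consequence of the phase problem.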

1.2.9 Model building

The analysis of the electron density map involves building the known amino acid sequence into the density whilst preserving the stereochemical features of both the backbone and its side chains. If the structure has been solved by MR, there is little ambiguity in the analysis of the electron density map. However, if the structure has been solved by MAD or MIR, then aromatic amino acids and heavy atoms such as selenium can be identified as the starting point for the analysis. The analysis is easier when the map is sharper. This depends on the quality of the X-ray data, such as the resolution, and on the flexibility of the protein.

1.2.10 Refinement

Refinement is carried out to optimize the atomic coordinates of the atoms in the starting structure. This procedure is monitored by the R-factor which gives a comparison of the experimentally observed structure factor amplitudes (Fobs) and the hypothetical structure factors calculated from the model (Fcalc). The R-factor is defined as:

Equation 4: R-factor.

$R = \dfrac{\sum \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum |F_{obs}|}$

The smaller the R-factor, the better the agreement between the experimental and the model-derived diffraction patterns, and thus the higher the accuracy of the model [38]. Another factor that is followed during the refinement procedure is the R-free. In this methodology, approximately 10% of the data is set aside. The structure factors from this data are calculated at each refinement cycle, but are not used in the refinement or in the R-factor calculation. A good indication that the process is headed in the proper direction is that the R-free is similar to the R-factor and both are less than 30%.
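The R-factor and R-free calculations can be sketched in a few lines, using the absolute difference between observed and calculated amplitudes. The amplitudes below are invented numbers, and assigning every tenth reflection to the free set is a simplification chosen for the example; refinement programs normally select the free set at random.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(abs(obs) - abs(calc)) for obs, calc in zip(f_obs, f_calc))
    denominator = sum(abs(obs) for obs in f_obs)
    return numerator / denominator


def split_work_free(values, every=10):
    """Assign every 'every'-th reflection to the free set (simplified, non-random choice)."""
    work = [v for i, v in enumerate(values) if i % every != every - 1]
    free = [v for i, v in enumerate(values) if i % every == every - 1]
    return work, free


# Hypothetical structure factor amplitudes for ten reflections.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 50.0, 70.0, 90.0, 30.0, 10.0]
f_calc = [90.0, 85.0, 55.0, 45.0, 25.0, 50.0, 60.0, 95.0, 30.0, 12.0]

obs_work, obs_free = split_work_free(f_obs)
calc_work, calc_free = split_work_free(f_calc)

r_work = r_factor(obs_work, calc_work)
r_free = r_factor(obs_free, calc_free)
print(f"R-factor = {r_work:.3f}, R-free = {r_free:.3f}")
```

Since the free reflections never influence the model, a large gap between the two values suggests that the model has been over-fitted to the working data.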

1.2.11 Model validation

As the model is refined, the phases improve, followed by improvement of the electron density maps. Water molecules and other ligands present in the crystallization mixture may then be seen in the electron density and built into the model. The model should be checked for accuracy in coordinates, side-chain rotamers, peptide bond flips, etc. Such errors are normally a function of resolution, which indicates the accuracy of the determined atomic positions and follows from the amount of detail visible in the electron density. Other important model statistics include completeness, mosaicity (the degree of order in the crystal) and B factors (an indication of the more flexible regions of the molecule).


2. Research Goals

In order to provide insight into the mechanism of COPI function, we seek a structural understanding of the connection between γ-COPI and Arf-GAP2 and of the δCM. The lack of structural data required to test the γCad and Arf-GAP2 connection coupled with the existence of unfolded regions of the Arf-GAP2 and the lack of any structural data about the δ–COPI subunit present significant obstacles to further analysis of COPI-mediated trafficking events.

The specific goals of our research project were:

1. To decipher the 3D structural determinants of γCad that facilitate Arf-GAP2 interaction and to detail, on the molecular level, the amino acids in γCad involved in the binding to the suggested Arf-GAP2 segment.
2. To decipher the 3D structure of the δCM and to detail, on the molecular level, its functional determinants such as the interaction interfaces with ArfGAP1 and with Arginine signal peptides.
3. To establish the functions of the proteins that are part of a bigger metabolism system.


3. Research plan and Methods

3.1 Research plan

Obtaining large amounts of the subcomponents of the COPI complex from mammalian cells for structural and functional experiments is extremely difficult. For this reason, the basic experimental plan for both the γCad (including variants that were fused to different ArfGAP2 segments) and the δCM protein is to clone them into expression vectors, express them in E. coli and purify large amounts of homogeneous, clean and correctly folded protein. These isolated proteins are to be used for interaction experiments or to be crystallized (alone or with different synthetic interaction peptides) for high resolution structure determination by X-ray crystallography.


3.2 Methods

3.2.1 Molecular methods

3.2.1.1 Preparation of competent cells

In this method, competent cells of the E. coli strains BL-21(DE3) pLysS, B834 (DE3) and XL1-blue were prepared. A 25 ml starter culture of bacteria was grown overnight at 37°C. The starter was used to inoculate 0.5 liter of LB and the cells were grown at 37°C until OD600~0.8. The cells were chilled on ice for 15-30 min and pelleted by centrifugation (5000 rpm, 10 min, 4°C). The pellet was resuspended in 750 ml of sterile, cold double-distilled water and recovered by another centrifugation (5000 rpm, 10 min, 4°C). The pellet was then resuspended in 10 ml of 10% glycerol solution, centrifuged, and resuspended again in 2 ml of 10% glycerol solution. The suspension was divided into small aliquots and frozen at -80°C.

3.2.1.2 PCR- Polymerase Chain Reaction

PCR was used for enzymatically synthesizing and amplifying the DNA sequences encoding the Human γCad together with different segments of Rat ArfGAP2 (first the Human γCad was cloned, and then the different segments of the Rat ArfGAP2 gene were added to the clone). The reaction uses two oligonucleotide primers designed to hybridize to opposite strands of the DNA that is to be amplified. The cycling protocol consisted of an initial denaturation at 98°C for 30 seconds, followed by 30 cycles of three temperatures: strand denaturation at 98°C, primer annealing at 60°C and primer extension by Phusion DNA polymerase at 72°C. After these cycles the reaction ended with a final extension at 72°C for 3 minutes. The PCR reaction mixture contained 0.5-1 μl of each oligonucleotide (~100 ng/μl), 10 μl buffer, 5 μl dNTPs from a 2 mM stock, 1.5 μl DNA and 0.5 μl Phusion DNA polymerase, and was made up to a final volume of 50 μl.
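The composition of the reaction mixture can be checked with simple dilution arithmetic. The sketch below takes the listed volumes, assumes 1 μl of each primer (the upper end of the stated range) and assumes that the mixture is made up to 50 μl with water, then derives the remaining volume and the final concentrations. The variable and function names are hypothetical.

```python
FINAL_VOLUME_UL = 50.0

# Volumes in microliters, as listed for the reaction mixture.
# Assumption: 1 µl of each primer (the text gives a range of 0.5-1 µl).
components_ul = {
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "buffer": 10.0,
    "dNTPs": 5.0,
    "template DNA": 1.5,
    "Phusion DNA polymerase": 0.5,
}


def diluted(stock_concentration, volume_ul, final_volume_ul=FINAL_VOLUME_UL):
    """Final concentration after diluting 'volume_ul' of stock into the final volume."""
    return stock_concentration * volume_ul / final_volume_ul


# Assumption: the remaining volume is water.
water_ul = FINAL_VOLUME_UL - sum(components_ul.values())
dntp_final_mM = diluted(2.0, components_ul["dNTPs"])
primer_final_ng_per_ul = diluted(100.0, components_ul["forward primer"])
buffer_fold = FINAL_VOLUME_UL / components_ul["buffer"]

print(f"Water to add: {water_ul:.1f} µl")
print(f"Final dNTP concentration: {dntp_final_mM:.2f} mM")
print(f"Final primer concentration: ~{primer_final_ng_per_ul:.1f} ng/µl")
print(f"Buffer is diluted {buffer_fold:.0f}-fold")
```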

3.2.1.3 Agarose gel electrophoresis of DNA

Electrophoresis in agarose gel was used for separating DNA fragments on the basis of their molecular weights. The DNA fragments were resolved on 1% agarose gels. The gel was prepared by boiling 0.8 g of agarose in 80 ml 1x TAE, cooling the mixture and adding 4 µl ethidium bromide to a final concentration of 0.5 μg/ml. The gel was run in 1x TAE. 2.5 μl of PCR product was mixed with 1.5 μl of DNA loading dye and loaded into the gel. Ethidium bromide intercalates into DNA and fluoresces when illuminated by UV light.

3.2.1.4 DNA Digestion by Restriction Enzymes

Restriction enzymes were used according to the manufacturer's instructions. In general, plasmid DNA was cut with a Fast digest enzyme for 10 minutes at 37°C.

3.2.1.5 Ligation of PCR Product into Cloning Vector

T4 DNA ligase enzyme was used to combine the DNA insert fragments with the linearized cloning vector. The ligation mixture was incubated for one hour at 25°C, followed by dialysis to purify the plasmid.

3.2.1.6 Electro-transformation

In this method, the plasmid DNA was inserted into competent cells by an electric shock that causes modifications in the permeability of the competent cell membrane. A 60 μl aliquot of cell suspension was mixed with 6 μl of DNA. The Pulser apparatus was set at 25 μF and 2.5 kV. The Pulse Controller was set to 200 Ω. The mixture of cells and DNA was transferred to a 0.2 cm electroporation cuvette. Immediately after applying a single pulse, the cells were resuspended in 400 μl of LB. For recovery, the tube was placed at 37°C for 30 minutes. Following this stage, 10-100 μl of the cells were plated on Petri dishes containing LB and the appropriate antibiotics.
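The instrument settings above translate into two quantities that characterize the pulse: the electric field strength across the cuvette gap and the nominal RC time constant. The following sketch derives both from the stated values. It is only an estimate, since the time constant actually measured also depends on the resistance of the sample itself, and the variable names are hypothetical.

```python
# Settings taken from the protocol described above.
VOLTAGE_KV = 2.5
CUVETTE_GAP_CM = 0.2
CAPACITANCE_F = 25e-6
RESISTANCE_OHM = 200.0

# Field strength across the cuvette gap.
field_kv_per_cm = VOLTAGE_KV / CUVETTE_GAP_CM

# Nominal time constant of the exponential decay pulse, tau = R * C.
# Assumption: the sample resistance is high enough not to shorten the pulse.
time_constant_ms = RESISTANCE_OHM * CAPACITANCE_F * 1e3

print(f"Field strength: {field_kv_per_cm:.1f} kV/cm")
print(f"Nominal time constant: {time_constant_ms:.1f} ms")
```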

3.2.1.7 DNA sequence determination

Sequence determination of DNA plasmid was performed by Macrogen, Korea (http://dna.macrogen.com/eng/service/seq/standard/standardseq.jsp).

3.2.2 Protein Isolation

3.2.2.1 Overexpression of the Proteins

The Human γCad together with different segments of Rat ArfGAP2 genes were cloned into the PKM260 plasmid which contains an N terminal His tag in its sequence.

100 μl of each of the transformed E. coli cells were grown in 4 ml of LB supplemented with the appropriate antibiotics (100 mg/ml ampicillin and 34 mg/ml chloramphenicol) overnight at 37°C. This starter culture was used to inoculate 500 ml of LB medium with the same antibiotic concentrations. The culture was shaken at 37°C until the mid-logarithmic growth phase was reached at OD600~0.8. At that point 0.25-0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG) was added, and the culture was shaken for a further 3-4 hours at 37°C. Cells were harvested by centrifugation (7000 rpm, 10 min, 4°C) and frozen at -20°C until protein purification was performed.

3.2.2.2 Isolation and primary purification of the proteins:

Frozen cells were resuspended in lysis buffer and broken by French Press at 1500 atm. The lysate was then centrifuged (17,000, 30min, 4°C) to separate the pellet and supernatant. The supernatant was loaded on a Nickel column for purification of 6xHis-tagged proteins by gravity-flow chromatography. The column was then washed with 30 ml wash buffer and the protein was eluted with 6 ml elution buffer in 6 fractions.

3.2.2.3 SDS-PAGE analysis

The fractions containing protein were identified by SDS-PAGE analysis, which separates proteins by electrophoresis [48]. The anionic detergent SDS allows separation under an electric field so that protein movement in a denaturing gel containing SDS depends only on the protein's molecular weight, without any influence of charge. The SDS-PAGE gel contains a resolving gel that is first topped with water; some minutes after the polymerization is complete, the water is replaced by the stacking gel mixture. Protein samples of 5-10 μl were incubated for 3 min at 80°C in sample buffer containing DTT, centrifuged for 2 min at 13Kxg, and placed into the gel slots. The apparatus was filled with running buffer containing 0.1% SDS. Gels were run at room temperature at 25 mA per plate. After completion of electrophoresis, the gels were placed in the staining solution and shaken gently. After 5 min the staining solution was poured off and the excess dye was removed with a destain solution until the gel background was clear.

3.2.2.4 Chromatographic analysis

Gel filtration analysis was performed using an AKTA™ laboratory-scale chromatography system (GE Healthcare) with a Superdex-200 column, which separates proteins according to their size. The column was equilibrated with the appropriate buffer until a stable baseline was achieved. The protein was detected by monitoring the change in absorbance at 280 nm over time.

Gel filtration analysis was performed at the laboratories of Prof. Dan Cassel and Prof. Alian Akram, the faculty of Biology, Technion.


3.2.2.5 Protein concentration

The protein was concentrated and further purified using Amicon Ultra centrifugal filter units. Centrifugal filter concentrators can fractionate proteins on the basis of molecular weight (MW). The centrifugal filter unit was filled with up to 2 ml of protein mixture and centrifuged until the volume was reduced to 100 µl-1 ml.

3.2.2.6 Determination of protein concentration

The determination of protein concentration was done using the Bradford assay. This assay is based on an absorbance shift of the dye Coomassie Brilliant Blue G-250: upon binding to the assayed protein, the red form of the dye is converted into its blue form. The increase in absorbance measured at 595 nm, the absorption maximum of the blue form, is proportional to the amount of bound dye, and thus to the concentration of protein present in the sample.
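Because the absorbance at 595 nm is proportional to the protein concentration, an unknown sample can be quantified against a standard curve fitted to samples of known concentration. The sketch below fits a straight line by least squares and inverts it for a sample reading. The standard concentrations and absorbance values are invented for illustration and do not come from measurements in this work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


# Hypothetical standards (mg/ml) and their absorbance at 595 nm.
standards_mg_per_ml = [0.0, 0.25, 0.5, 1.0]
standards_a595 = [0.05, 0.175, 0.30, 0.55]

slope, intercept = fit_line(standards_mg_per_ml, standards_a595)


def concentration_from_a595(a595):
    """Invert the standard curve to estimate protein concentration (mg/ml)."""
    return (a595 - intercept) / slope


sample_concentration = concentration_from_a595(0.35)
print(f"Estimated protein concentration: {sample_concentration:.2f} mg/ml")
```

Readings that fall outside the range covered by the standards should be repeated on a diluted or concentrated sample rather than extrapolated.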

3.2.2.7 Mass spectroscopy

MS was performed to identify the protein in a suspected SDS-PAGE gel band or in the protein crystal. The proteins are digested into tryptic fragments by the protease trypsin, which specifically cleaves at the C-terminal side of the amino acids lysine and arginine. The sample is loaded onto the MS instrument, where the fragments are ionized and their mass-to-charge ratios are calculated from the motion of the ions as they pass through electromagnetic fields. The peptide sequence map of the molecule is deciphered from the set of fragment masses and compared against known sequence databases to identify the protein.
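The cleavage rule and the mass-to-charge calculation can be illustrated with a small in-silico digest. The sketch below cuts a sequence after every lysine and arginine, exactly as stated above, and computes the monoisotopic m/z of each peptide. It is a simplified illustration: it ignores missed cleavages and any sequence-context exceptions, the example sequence is invented, and the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565
PROTON_MASS = 1.007276


def tryptic_digest(sequence):
    """Cleave after every K or R (simplified rule, no missed cleavages)."""
    peptides, start = [], 0
    for index, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mass_to_charge(peptide, charge=1):
    """Monoisotopic m/z of a peptide carrying 'charge' protons."""
    neutral_mass = sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical sequence, used only to demonstrate the procedure.
for peptide in tryptic_digest("MAKGLSRTEK"):
    print(f"{peptide}: m/z = {mass_to_charge(peptide):.3f} (1+)")
```

The resulting list of peptide masses is the kind of fingerprint that is compared against sequence databases to identify the protein.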

Mass spectroscopy was performed at the Smoler Proteomics Center, Technion using Tandem RPLC-ESI-MS-MS apparatus.

3.2.3 Protein analysis

3.2.3.1 SLS measurements

For the static light scattering (SLS) assay, the uncoating of the coatomer from liposomes was monitored using a spectrofluorometer. The liposomes (0.1 mM) were incubated in KHM buffer (150 mM KCl, 40 mM Hepes, pH 7.4, 1 mM MgCl2, 1 mM DTT) in the presence of 1 μM Arf1, 0.1 mM GTP and 0.2 μM coatomer, followed by Arf1 activation, which was initiated by the addition of 2 mM EDTA. Activation was stopped by the addition of 4 mM MgCl2, and ArfGAPs were subsequently added [28, 49].

3.2.3.2 ITC measurements

In nano isothermal titration calorimetry (ITC), the binding affinity (KD), the stoichiometry (number of binding sites), the enthalpy (ΔH) and the entropy (ΔS) of the interaction between Human γCad or its fusion form (with an ArfGAP2 segment) and the ArfGAP2 peptide were measured. An excess of 100 μM ArfGAP2 peptide was injected into the cell containing 11 μM Human γCad or its fusion form, increasing the peptide concentration within the cell with each injection. With each injection the protein's binding sites become more saturated and less peptide binds. Thermodynamic measurements taken from the reaction provide information on conformational changes, hydrogen bonding, hydrophobic interactions and charge-charge interactions, and the measured heat signal is proportional to the reaction's rate [50, 51].
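The thermodynamic quantities obtained from ITC are linked by two standard relations: ΔG = RT ln(KD), with KD expressed in molar units, and ΔG = ΔH - TΔS. The sketch below shows how ΔG and ΔS follow once KD and ΔH are known. The KD, ΔH and temperature values are hypothetical examples and are not results of this work.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def binding_free_energy(kd_molar, temperature_k):
    """Delta G of binding (J/mol) from the dissociation constant: RT * ln(KD)."""
    return GAS_CONSTANT * temperature_k * math.log(kd_molar)


def binding_entropy(delta_h, delta_g, temperature_k):
    """Delta S (J/(mol*K)) from Delta G = Delta H - T * Delta S."""
    return (delta_h - delta_g) / temperature_k


# Hypothetical example values.
kd = 1e-6              # 1 µM dissociation constant
delta_h = -40_000.0    # J/mol, exothermic binding
temperature = 298.15   # K

delta_g = binding_free_energy(kd, temperature)
delta_s = binding_entropy(delta_h, delta_g, temperature)

print(f"Delta G = {delta_g / 1000:.2f} kJ/mol")
print(f"Delta S = {delta_s:.1f} J/(mol*K)")
```

In this example the binding is driven by enthalpy, and the negative entropy term partly offsets it; a measured data set could just as well show the opposite balance.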

ITC measurements were performed using ITC200 from GE healthcare.

3.2.4 Crystallization of the proteins

Crystallization experiments were performed using the hanging drop vapor-diffusion method and the sitting drop vapor-diffusion method in a 24-well tissue culture plate at 20ºC (293K). For the hanging drop vapor-diffusion method, crystallization drops were formed by mixing 2 µl protein with 2 µl precipitant above 1 ml of reservoir well solution. For the sitting drop vapor-diffusion method, the crystallization drops were formed by mixing 7 µl protein with 7 µl precipitant above 1 ml of reservoir well solution.

3.2.4.1 Improving crystal growth

Seeding was performed to induce nucleation and better crystal growth, using small crystals of the protein under investigation as seeds [52-54]. The crystals were either crushed and diluted, or small crystals were taken, partially dissolved in the precipitant solution and added to the protein/precipitant solution. In another method of decoupling nucleation and growth, the cover slips holding the hanging drops were transferred over reservoirs of decreasing concentration.


3.2.4.2 Twinning and implementing twin law

Twinning is a situation in which, during crystal growth, nuclei come together in different orientations, so that both their direct and reciprocal lattices overlap, violating the crystal symmetry. The result is what may appear to be a single crystal but is, in fact, countless crystals grown together in apparently non-specific clusters, composed of two or more crystalline domains whose orientations differ in some special way. In this case, a twin law is implemented as a set of twin operations that map the individuals of a twin onto each other. It is obtained by decomposition of the point group of the twin lattice with respect to the intersection group of the point groups of the individuals in their respective orientations [55, 56].
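In practice a twin operation can be written as a 3x3 matrix that maps the Miller indices of a reflection from one domain onto the overlapping reflection of the other domain, and each measured intensity is then a weighted sum of the two. The sketch below illustrates this with the operator (k, h, -l), chosen only as a common example and not necessarily the twin law relevant to the crystals in this work. The twin fraction and the intensities are hypothetical.

```python
# Illustrative twin operator (h, k, l) -> (k, h, -l), written as a matrix.
TWIN_LAW = (
    (0, 1, 0),
    (1, 0, 0),
    (0, 0, -1),
)


def apply_twin_law(hkl, law=TWIN_LAW):
    """Map Miller indices to the twin-related reflection."""
    return tuple(sum(law[row][col] * hkl[col] for col in range(3)) for row in range(3))


def twinned_intensity(intensities, hkl, twin_fraction):
    """Observed intensity as a mix of a reflection and its twin mate."""
    mate = apply_twin_law(hkl)
    return (1.0 - twin_fraction) * intensities[hkl] + twin_fraction * intensities[mate]


# Hypothetical untwinned intensities for a twin-related pair of reflections.
true_intensities = {(1, 2, 3): 100.0, (2, 1, -3): 20.0}

observed = twinned_intensity(true_intensities, (1, 2, 3), twin_fraction=0.25)
print(f"Twin mate of (1, 2, 3): {apply_twin_law((1, 2, 3))}")
print(f"Observed intensity with 25% twinning: {observed:.1f}")
```

Applying the operator twice returns the original indices, which is the expected behavior for a two-fold twin operation.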

3.2.5 Software used in Crystallographic structure determination

3.2.5.1 Data processing, scaling and Refinement

3.2.5.1.1 Mosflm

Mosflm (http://www.mrc-lmb.cam.ac.uk/imosflm/) is a program intended to process single crystal diffraction data collected from area detectors using the oscillation method by dividing the process into a series of steps, which are normally carried out sequentially. We used Mosflm for indexing, unit cell and space group determination and data integration.

3.2.5.1.2 XDS

XDS- X-ray Detector Software (http://xds.mpimf-heidelberg.mpg.de/) is a program intended to process single crystal diffraction data recorded by the rotation method and collected from area detectors. We used XDS for indexing, unit cell and space group determination and data integration.

3.2.5.1.3 HKL-2000 package

HKL-2000 (http://www.hkl-xray/hkl-2000.com) is a program package for the analysis of X-ray diffraction data collected from single crystals. It is based on extended versions of Denzo for automatic data indexing, XdisplayF for visualization of the diffraction pattern and Scalepack for merging and scaling of the intensities obtained by Denzo and for refinement of crystal parameters.


3.2.5.1.4 CCP4

CCP4 (http://www.ccp4.ac.uk/ccp4i_main.php) is a program package for experimental determination and analysis of protein structure. We used CCP4 for scaling, Matthews coefficient calculation and molecular replacement.

3.2.5.1.5 CNS

CNS, the Crystallography and NMR System (http://cns.csb.yale.edu/v1.1/), is a program package for experimental determination and analysis of protein structure. We used CNS for calculation of omit maps.

3.2.5.1.6 Phenix

Phenix, the Python-based Hierarchical ENvironment for Integrated Xtallography (http://www.phenix-online.org/), is a software suite for the automated determination of macromolecular structures using X-ray crystallography and other methods. We used Phenix for refinement.

3.2.5.2 Molecular graphic tools

3.2.5.2.1 PyMol

The structure files were displayed with the molecular graphics program, PyMol (http://pymol.sourceforge.net/). This program allowed visual analysis of proteins in order to infer structural alignments, compare functionally important sites and to visualize electrostatic potential of proteins.

3.2.5.2.2 COOT

COOT, the Crystallographic Object-Oriented Toolkit (http://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/), was used for macromolecular model building, completion and validation.


3.2.5.3 Bioinformatics programs

3.2.5.3.1 BLAST

The Basic Local Alignment Search Tool from NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) was used to identify homologous proteins by comparing nucleotide or protein sequences and finding local similarity between them.

3.2.5.3.2 Generunner

Generunner program was used for analyzing DNA or amino acid sequences, finding sequence motifs, translation of DNA and designing Oligonucleotides.

3.2.5.3.3 Clone Manager

Clone Manager (http://www.scied.com/pr_cmbas.htm) is a comprehensive software package with an integrated set of tools. It was used to simulate virtual cloning.

3.2.5.3.4 WebCutter

The WebCutter on-line program (http://rna.lundberg.gu.se/cutter2/) was used to help plan the restriction mapping of DNA sequences.

3.2.5.3.5 PHYRE2

The PHYRE2 is an automatic fold recognition server, (http://www.sbg.bio.ic.ac.uk/phyre2/). It was used for predicting the structure and/or function of protein sequence.

3.2.5.3.6 SWISS-MODEL

SWISS-MODEL is an automated protein structure homology-modelling server (http://swissmodel.expasy.org/). It was used for protein modelling based on a desired template.

3.2.5.3.7 PDBePISA

PDBePISA is an interactive tool (http://www.ebi.ac.uk/msd-srv/prot_int/) for the structural and chemical exploration of macromolecular surfaces and interfaces as well as properties and probable assemblies.


4. Materials

4.1 General reagents

Reagent                                         Manufacturer
Acetic acid                                     FRUTAROM
Acrylamide                                      SIGMA
Acrylamide/bis-Acrylamide 40%                   SIGMA
Agar                                            DIFCO Laboratories
Agarose (electrophoresis grade)                 SIGMA
Ampicillin sodium crystalline                   SIGMA
Ammonium Chloride (NH4Cl)                       MERCK
Ammonium persulfate (APS)                       SIGMA
Bacto-tryptone                                  Becton, Dickinson and Co.
Bacto yeast extract                             DIFCO Laboratories
Bio-Safe Coomassie Stain                        BIO-RAD
Bradford Reagent                                SIGMA
Bromphenol blue                                 SIGMA
Chloramphenicol                                 SIGMA
DNA ladder                                      NORGEN
D,L-Dithiothreitol (DTT)                        SIGMA
Glycerol (Glycerin)                             J.T.Baker
Hepes (C8H17N2O4SNa)                            SIGMA
Hydrochloric acid 37% (HCl)                     Carlo Erba Reagents
Imidazole (C3H4N2)                              FLUKA
Magnesium Chloride hexahydrate (MgCl2)          Carlo Erba Reagents
Manganous chloride (MnCl2)                      Carlo Erba Reagents
Magnesium Sulfate (MgSO4)                       FRUTAROM
Methanol                                        FRUTAROM
Molecular weight markers for SDS-PAGE           SIGMA, Bio-Rad Laboratories
Polyethylene glycol (PEG)                       Hampton Research Corp.
Potassium chloride (KCl)                        MERCK
Potassium dihydrogen phosphate (KH2PO4)         MERCK
di-Sodium hydrogen phosphate (Na2HPO4)          MERCK
Sodium chloride (NaCl)                          FRUTAROM
Sodium dodecyl sulfate (SDS)                    Biochemical
Sodium hydroxide (NaOH)                         FRUTAROM
N,N,N',N'-tetramethylethylenediamine (TEMED)    BIO-RAD
Tris (C4H11NO3)                                 SIGMA

4.2 Vitamins

Vitamin                     Manufacturer
Thiamine                    SIGMA
Riboflavin                  MERCK
Pyridoxine hydrochloride    B.D.H (British Drug Houses Ltd.)

4.3 Amino acids

Amino acid          Manufacturer
Alanine             SIGMA
Arginine            SIGMA
Asparagine          FLUKA
Aspartic acid       SIGMA
Cysteine            SIGMA
D-glucose           SIGMA
Glutamic acid       SIGMA
Glutamine           SIGMA
Glycine             SIGMA
Histidine           SIGMA
Isoleucine          SIGMA
Leucine             FLUKA
Lysine              SIGMA
Methionine          FLUKA
Phenylalanine       MERCK
Proline             MERCK
Selenomethionine    Anatrace
Serine              FLUKA
Threonine           SIGMA
Tryptophan          FLUKA
Tyrosine            SIGMA
Valine              SIGMA

4.4 Biochemical and Crystallization Kits

Solution                                Manufacturer
CryoPro                                 Hampton Research Corp.
Crystal Screen™                         Hampton Research Corp.
Crystal Screen 2™                       Hampton Research Corp.
Gel/PCR DNA fragments extraction kit    HiYield
Index™                                  Hampton Research Corp.
Plasmid DNA purification- miniprep      Promega

4.5 Enzymes

Enzyme                                  Manufacturer
Phusion High-Fidelity DNA Polymerase    Thermo Scientific
Fast digest enzymes                     Thermo Scientific

4.6 Other materials

Material                         Manufacturer
Centricon 10K                    Amicon
Electroporation cuvettes         Cell projects
Petri dishes                     Miniplast
Sterile needles 0.8mm*40mm       Becton Dickinson
Sterile syringe 10mm             Becton Dickinson

4.7 Growth media

• Luria Bertani (LB) media: 1% Bacto tryptone, 0.5% Bacto-Yeast extract, 1% NaCl.
• LB-Agar (for plates): 1% Bacto tryptone, 0.5% Bacto-Yeast extract, 1% NaCl, 2% Agar.
• Growth media for the production of Se-Met labeled protein:

Buffer M9*2: 2g/L NH4Cl, 6g/L KH2PO4, 12g/L Na2HPO4 500mg/L MgSO4, 25mg/L FeCl3, 0.5µg/ml of each vitamin, 0.16mg/ml Se- Methionine, 80µg/ml of each amino acid, 100µg/ml ampicillin, 5mg/ml D-glucose [55].

*The amino acids were supplied by Onit Alaluf from the laboratory of Prof. Yuval Shoham, the faculty of Biotechnology and Food Engineering, Technion.

4.8 Cell lines

• E. coli strain XL1-Blue (Stratagene) was used as host for the PKM260 vector.
• E. coli strain BL-21 pLysS (Novagen) was used as host for the overexpression vector PKM260.
• E. coli strain B834 DE3 (Novagen) was used as host for the overexpression vector PKM260 for the production of Se-Met labeled protein.

* E.coli strain B834 DE3 was supplied by Onit Alaluf from the laboratory of Prof. Yuval Shoham, the faculty of Biotechnology and Food Engineering, Technion.


4.9 Plasmid vectors and oligonucleotides

4.9.1 Vector description

Proteins were expressed in E. coli using the PKM260 vector, which introduces an amino-terminal hexahistidine tag. The vector was modified by insertion, from 5' to 3', of NotI, SmaI, XhoI and Acc65I restriction sites between the NheI and BamHI sites.

4.9.2 Oligonucleotides

The oligonucleotides were supplied by SIGMA. They were designed so that their sequences contain around 60% G-C and so that the added restriction enzyme recognition sites do not appear in the gene sequence. The primer sequences are as follows:

• F-hgamma608-Nhe1 5'- CTTTGCTAGCACCAGGCAGGAGATCTTCCAG -3'
• R-hgamma874-BamHIstop 5'- GCGGATCCTTATCCCACAGATGCCAAGATGATG -3'
• F-hgamma1-608-Xho1 5'- CGGCTCGAGTACCAGGCAGGAGATCTTCCAGG -3'
• F-gap2_177-Nhe1 5'- CTTAGCTAGCCCAGCCCCGTCTACAGAGAGCAG -3'
• F-gap2_202-Nhe1 5'- CTTAGCTAGCCCCAAAGCCTCACTGGAACTG -3'
• F-gap2_222-Nhe1 5'- CTTAGCAAGAAAGGGCTGGGTGCCAAG -3'
• R-gap2_262-Xho1 5'- CGGCTCGAGCCTCCACCTCCGGCATCGGCTGCCTGCCTGCTGCTC -3'
• F- bdeltaXho1aasense 5'- CGGCTCGAGCTATGATGGTGCTGTTGGCAGCAGC -3'
• R - bdeltaCOP511Bamas 5'- GCGGATCCTACAGAATTTCATACTTATCCACTAGG -3'
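The two design criteria stated above (G-C content of around 60% and added restriction sites that are absent from the gene sequence) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the recognition sequences are the standard ones for NheI, BamHI and XhoI, and the `gene` string is a made-up placeholder, not a sequence used in this work.

```python
# Illustrative primer checks; the helper names are hypothetical.

# Standard recognition sequences of the enzymes named in the primer labels.
# All three sites are palindromic, so scanning one strand is sufficient.
RESTRICTION_SITES = {
    "NheI": "GCTAGC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def sites_present(seq, sites=RESTRICTION_SITES):
    """Return the names of the enzymes whose recognition site occurs in seq."""
    seq = seq.upper()
    return [name for name, site in sites.items() if site in seq]


# Primer F-hgamma608-Nhe1 from the list above.
primer = "CTTTGCTAGCACCAGGCAGGAGATCTTCCAG"
print(len(primer), round(gc_fraction(primer), 2), sites_present(primer))

# Placeholder insert (an assumption, not real data): the added sites
# should not occur inside the gene itself.
gene = "ACCAGGCAGGAGATCTTCCAG"
print(sites_present(gene))
```

Running the same check over every primer and over the full coding sequence would flag any added site that also occurs inside the gene.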

More details of these oligonucleotides are in the following table:


Table 1: Oligonucleotides details.

Oligonucleotide name | Length (bp) | Comments
F-hgamma608-Nhe1 | 31 | Cloning of the gene encoding the human γCad into the PKM260 vector with His-tag on the N-terminal (N-terminal, Forward)
R-hgamma874-BamHIstop | 33 | Cloning of the gene encoding the human γCad into the PKM260 vector with His-tag on the N-terminal (C-terminal, Reverse)
F-hgamma1-608-Xho1 | 32 | Cloning of the gene encoding the human γCad into the PKM260 vector with His-tag on the N-terminal (N-terminal, Forward)
F-gap2_177-Nhe1 | 33 | Cloning of the gene encoding the Rat ArfGAP2 peptide into the PKM260 vector containing the human γCad gene, with His-tag on the N-terminal (N-terminal, Forward)
F-gap2_202-Nhe1 | 31 | Cloning of the gene encoding the Rat ArfGAP2 peptide into the PKM260 vector containing the human γCad gene, with His-tag on the N-terminal (N-terminal, Forward)
F-gap2_222-Nhe1 | 31 | Cloning of the gene encoding the Rat ArfGAP2 peptide into the PKM260 vector containing the human γCad gene, with His-tag on the N-terminal (N-terminal, Forward)
R-gap2_262-Xho1 | 41 | Cloning of the gene encoding the Rat ArfGAP2 peptide into the PKM260 vector containing the human γCad gene, with His-tag on the N-terminal (C-terminal, Reverse)
F- bdelta_267-Xho1aasense | 34 | Cloning of the gene encoding the Bovine δCM into the PKM260 vector with His-tag on the N-terminal (N-terminal, Forward)
R - bdeltaCOP_511-BamHIas | 35 | Cloning of the gene encoding the Bovine δCM into the PKM260 vector with His-tag on the N-terminal (C-terminal, Reverse)

* The cloning of the Human γCad gene was done by Dr. Lena Lifshitz from Dan Cassel's laboratory at the faculty of Biology, Technion.

* The cloning of the Bovine δCM gene was done by Dr. Anna Parnis from Dan Cassel's laboratory at the faculty of Biology, Technion.

4.10 Buffers and solutions

4.10.1 Solutions for electrophoresis of DNA on agarose gel

• TAE running buffer: 40mM Tris-base, 40mM acetic acid, 1mM EDTA.
• Agarose gel: 1% agarose in TAE buffer.

4.10.2 Buffers used for protein isolation, purification and biochemical experiments

• Lysis buffer: 25mM Tris/Hepes pH=7.5.
• KHM buffer for SLS: 150mM KCl, 40mM Hepes pH=7.5, 1mM MgCl2, 1mM DTT.

4.10.2.1 Nickel column buffers

• Wash buffer: 25mM Tris/Hepes pH=7.5, 200mM NaCl, 20mM Imidazole, 2mM MTG.
• Elution buffer: 25mM Tris/Hepes pH=7.5, 200mM NaCl, 200mM Imidazole, 2mM MTG.


4.10.2.2 Gel filtration buffer

• FPLC Buffer: 25mM Tris/Hepes pH=7.5, 200mM NaCl, 1mM DTT.

4.10.3 Buffers used for SDS-PAGE

• Resolving buffer x 8: 3M Tris (pH=8.8).
• Stacking buffer x 4: 0.5M Tris (pH=6.8).
• Running buffer x 5: 125mM Tris, 960mM Glycine.
• Sample buffer x 5: 6% Lauryl sulphate lithium salt, 0.15M DTT, 0.5M Tris (pH=6.8), 30% glycerol, 0.01% bromophenol blue.
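The SDS-PAGE buffers are listed as concentrated stocks (x 8, x 4, x 5). The following minimal sketch shows the dilution arithmetic needed to obtain the working (x 1) concentrations and the volume of stock required. The helper names are hypothetical and the 1000 ml final volume is an arbitrary example.

```python
# Dilution arithmetic for concentrated stock buffers (hypothetical helpers).

def working_concentration(stock_conc, fold):
    """Concentration after diluting an 'x fold' stock to x 1."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return stock_conc / fold


def stock_volume(final_volume, fold):
    """Volume of 'x fold' stock needed for a given final volume at x 1."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return final_volume / fold


# Running buffer x 5 as listed above: 125 mM Tris, 960 mM glycine.
running_stock_mM = {"Tris": 125.0, "Glycine": 960.0}
running_1x_mM = {name: working_concentration(conc, 5)
                 for name, conc in running_stock_mM.items()}
print(running_1x_mM)            # working concentrations in mM

# Stock volume (ml) for an example 1000 ml of x 1 running buffer.
print(stock_volume(1000.0, 5))
```

The x 1 running buffer obtained this way (25 mM Tris, 192 mM glycine) matches the commonly used Tris-glycine composition.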

4.11 Gel recipes

• Resolving gel: 14% Acrylamide, 0.4% bis-Acrylamide, 1% APS, 0.1% TEMED in resolving buffer.
• Stacking gel: 6% Acrylamide, 0.15% bis-Acrylamide, 2% APS, 0.2% TEMED in stacking buffer.

4.12 Column material and sources

• Nickel column: Ni-NTA Agarose, QIAGEN.
• Gel filtration column: Superdex 200.


5. Results

5.1 The interaction between γCad and ArfGAP2

5.1.1 γCad and ArfGAP2 co-crystallization

5.1.1.1 γCad

100μl of the E. coli BL-21(DE3) pLysS (Novagen) host strain containing the vector encoding the human γCad was grown in Luria-Bertani (LB) medium supplemented with the appropriate antibiotics. Cells were harvested by centrifugation (7000 rpm, 10 min, 4°C) and frozen at -20°C. Frozen cells were lysed using a French pressure cell in 10mM HEPES (pH7.5), 200mM NaCl (buffer A) plus protease inhibitors, and the protein was purified by nickel affinity chromatography. The protein was further purified by gel-filtration chromatography in buffer A containing 5mM DTT (where it eluted as a monomer) and concentrated to 7-10 mg/ml (Figure 8).

Figure 8: SDS-PAGE gel of γCad purification. Lane 1 is the γCad protein as eluted from Superdex 200 gel-filtration chromatography, which was used to separate the protein from aggregates; the protein was then concentrated. The gel was overloaded in order to reveal small amounts of impurities that could affect crystallization. The impurities were estimated to be less than 5% of the sample.

5.1.1.2 ArfGAP2

We used two synthetic peptides that represent different segments of the human ArfGAP2 protein for co-crystallization with the γCad protein.

The first peptide comprises residues 212-262 and carries an N-terminal fluorescein molecule (enabling tracking by fluorescence); it is referred to in this thesis as ArfGAP2-I. The second comprises residues 222-258 and is referred to in this thesis as ArfGAP2-II.


5.1.1.3 Co-crystallization trials.

We tried to co-crystallize the γCad together with each of the two ArfGAP2 peptides. Previous studies indicate that there is only one binding site in γCad; thus, an initial molar ratio of 1:4 (protein : peptide) was used in the γCad-ArfGAP2 crystallization trials (to try to ensure high occupancy), but we also tried other molar ratios with an excess of ArfGAP2. First, we tried to crystallize the γCad together with ArfGAP2-I, which carries the N-terminal fluorescein. The fluorescein tag was used so that we could detect by confocal microscopy whether the crystal contains ArfGAP2-I. We crystallized the proteins using the previously published crystallization conditions of the γCad [10] and obtained crystals of our purified protein after a day. We then screened variations containing the same buffer and precipitant over a broad range of concentrations. Crystals grew overnight in all the conditions tested, and their shapes were similar to those of the previously published crystals. In all of the crystals we could visually detect the presence of fluorescein. However, confocal microscopy later showed that the fluorescein was present as an adsorbed layer of the peptide on the surface of the crystal (i.e. we could not detect the peptide inside the crystal) [57]. Diffraction from the different γCad crystals was recorded on the ID14-1 beamline at the European Synchrotron Radiation Facility (ESRF). Crystals diffracted X-rays to a maximum resolution of ~2.5Å, and all crystals, whether incubated with peptide or not, belonged to the P212121 space group with one molecule in the asymmetric unit, similar to what was previously found for the γCad [10]. We could not locate ArfGAP2-I in any of the structures solved from these crystals.
Due to the tight crystal packing seen in the γCad from a human source (PDB code: 1R4X), we sought different crystallization conditions, such as those from which crystals of the bovine γCad were grown (PDB code: 1PZD) (Figure 9) [12].


Figure 9: Crystal lattice packing of the two published γCad crystal structures. Symmetry-related molecules in the crystal lattice (Blue) within 8Å of the monomer in the asymmetric unit (Green) were generated using the PyMOL viewer for the comparison of the two packing arrangements. From the first amino acid (T608) of the human γCad structure (PDB code: 1R4X) the sequences of the proteins from the human and bovine sources are 100% identical. Panel A. Human γCad structure; the crystal from which the data was collected grew against a reservoir containing 0.2M magnesium acetate, 0.1M cacodylate pH6.5, 20% PEG 8000. This packing is tighter and the suspected ArfGAP2 binding site (F772 colored Orange and W776 colored Magenta) is concealed within the packing. Panel B. Bovine γCad structure (PDB code: 1PZD); the crystal from which the data was collected grew against a reservoir containing 1.4–1.7 M lithium sulfate, 0.1M sodium citrate pH6. The alignment between the structures gave a small RMS (RMS=0.438), but this packing is looser and the suspected ArfGAP2 binding site (F772 colored Orange and W776 colored Magenta) is more accessible to the peptide. This was the motivation for testing different crystallization conditions.

For additional crystallization trials we used Crystal Screen Kits (Hampton Research) and the robotic systems available for dispensing the initial sitting-drop screening conditions at the Israel Structural Proteomics Center (ISPC), Department of Structural Biology, Weizmann Institute of Science. Trials with ArfGAP2-I resulted in crystals from several new conditions, some with a morphology different from that of the previous ones (Table 2). In parallel, we used the shorter ArfGAP2-II. In this case crystals also grew in the new crystallization conditions, within two days to two weeks, and some crystals also had shapes different from those obtained in the initial trials (Table 2).


Table 2: Crystals obtained from published and novel crystallization conditions.

Crystal number | Protein | Crystallization condition | Crystal image
1 | γCad and human ArfGAP2-I, residues 212-262, containing N-terminal fluorescein. | The crystals grew overnight, by sitting drop vapor diffusion against a reservoir containing 0.1-0.4M magnesium acetate, 0.05-0.2M cacodylate pH6.5, 15-20% PEG 8000. | [image] The image was taken using confocal microscopy.
2 | γCad and human ArfGAP2-I, residues 212-262, containing N-terminal fluorescein. | The crystals grew after a week, by sitting drop vapor diffusion against a reservoir containing 1.4–1.7 M lithium sulfate, 0.1M sodium citrate pH6. | [image]
3 | γCad and human ArfGAP2-I, residues 212-262, containing N-terminal fluorescein. | The crystals grew after two weeks, by sitting drop vapor diffusion against a reservoir containing 0.2M ammonium acetate, 0.1M Hepes pH7.5, 25% PEG 3350. | [image]
4 | γCad and human ArfGAP2-II, residues 222-258. | The crystals grew after two weeks, by sitting drop vapor diffusion against a reservoir containing 0.2M magnesium chloride, 0.1M bis-tris propane pH6, 25% PEG 3350. | [image]
5 | γCad and human ArfGAP2-II, residues 222-258. | The crystals grew after two weeks, by sitting drop vapor diffusion against a reservoir containing 0.2M potassium/sodium tartrate, 20% PEG 3350. | [image]
6 | γCad and human ArfGAP2-II, residues 222-258. | The crystals grew overnight, in the ISPC, by sitting drop vapor diffusion against a reservoir containing 0.2M potassium thiocyanate, 0.1M bis-tris propane pH6.5, 20% PEG 3350. | [image]

The best crystals grew under crystallization condition number 2, whether co-crystallized with ArfGAP2-II or ArfGAP2-I. They diffracted X-rays to a maximum resolution of ~2.5Å. All of the crystals belonged to the P212121 space group with one molecule in the asymmetric unit (Table 3). Unfortunately, we could not locate the ArfGAP2 peptide in any of the structures solved from those crystals. All the structures were similar (RMS~0.4) to the one solved from γCad crystallized without the ArfGAP2-II peptide (Figure 10) and had unit cell dimensions (a~50Å; b~80Å; c~100Å) similar to those previously described [10, 12]. Our inability to identify the ArfGAP2-II peptide could be the result of a lack of binding to the γCad in the presence of the crystallization liquors. It should be recalled that, in order to obtain crystals, the proteins are incubated in a mixture of crystallization components that may impose conditions that are not conducive to binding. A second possibility is that the peptide was indeed bound to the γCad, but that its unfolded, dynamic nature prevented us from identifying clear electron density representing its position. This is a problem that we knew we might face; however, we had hoped that at least the binding region on the γCad would be rigid enough for us to locate the amino acids involved in this interaction.


Table 3: Structures solved from crystals containing γCad and ArfGAP2-II peptide.

Structure number | The crystallization conditions | Resolution | R/Rfree (%) | Structure
1 | The crystals grew after 2 weeks, by sitting drop vapor diffusion against a reservoir containing 1.4M lithium sulfate, 0.1M sodium citrate pH6. | 2.5 Å | 22.5/26.7 | [image]
2 | The crystals grew after 2 weeks, by sitting drop vapor diffusion against a reservoir containing 1.7M lithium sulfate, 0.1M sodium citrate pH6. | 2.5 Å | 23.5/27.3 | [image]
3 | The crystals grew after 2 weeks, by sitting drop vapor diffusion against a reservoir containing 1.4M lithium sulfate, 0.1M sodium citrate pH6. | 2.7 Å | 23.2/29.2 | [image]
4 | The crystals grew after a week, by sitting drop vapor diffusion against a reservoir containing 0.2M K/Na tartrate, 0.1M sodium citrate, 2M ammonium sulfate. | 2.8 Å | 22.6/26.3 | [image]
5 | The crystals grew after a week, by sitting drop vapor diffusion against a reservoir containing 1.4M lithium sulfate, 0.1M sodium citrate pH6. | 2.9 Å | 24/30 | [image]


Figure 10: Structural comparison between the γCad structure (1R4X) and the structure determined from the co-crystallization. Alignment of the previously determined γCad structure (1R4X) (Green) with the structure determined at 2.5Å from the co-crystallization of γCad with the ArfGAP2-II peptide (Red) gave a low RMS of 0.3 for all Cα atoms. Apart from a few minor side-chain movements there are no major differences. On the basis of these findings, and the fact that we could not locate unoccupied electron density into which the peptide could be fitted, the peptide most probably did not bind.

5.1.2 The γCad and ArfGAP2 fusion protein

5.1.2.1 Designing the fusion protein.

In order to increase the chances of obtaining a complex between the γCad protein and ArfGAP2 peptides in our crystals [58], we designed and expressed three fusion proteins containing the γCad protein fused to different peptide segments of ArfGAP2, separated by a short GGGGS (4x glycine + serine) linker. The fusion proteins differ in the number of N-terminal ArfGAP2 residues and hence in length; they are referred to here as "Small" (starting from ArfGAP2 residue 222), "Medium" (starting from ArfGAP2 residue 202), and "Large" (starting from ArfGAP2 residue 177) (Figure 11).


Figure 11: Cloning information for the fusion proteins of different lengths. All the clones share the basic architecture of 6 histidines at the N-terminus, followed by a segment from the ArfGAP2 protein, then a linker containing 4x glycine and a serine, ending with the gamma appendage domain (residues 608 to 874) at the C-terminus. The clones differ in the number of amino acids in the ArfGAP2 segment: the "Large" fusion protein contains amino acids 177 to 262, the "Medium" fusion protein contains amino acids 202 to 262 and the "Small" fusion protein contains amino acids 222 to 262.

We designed the three fusion proteins in the fashion described above in order to try and utilize all of the suspected γCad binding sites located in ArfGAP2, while keeping the ArfGAP2 as short as possible. We hoped that in this fashion we would enable binding.

5.1.2.2 The fusion protein interaction experiments.

We used two additional experimental methods to try and ascertain the binding of the ArfGAP2 peptide to the human γCad in the proteins used for crystallization.

a. Using a static light scattering (SLS) assay we monitored the liposome uncoating that results from GTP hydrolysis on Arf1 [28, 49]. The SLS signal generated upon Arf1-dependent coatomer recruitment was reversed following the addition of ArfGAP2, which results in liposome uncoating. The addition of the ArfGAP2-II peptide before ArfGAP2 inhibited the uncoating reaction, owing to the interaction of the ArfGAP2-II peptide with the coatomer. This interaction did not itself cause liposome uncoating because ArfGAP2-II lacks the catalytic part. Its outcome is that fewer sites are available for ArfGAP2 interaction, without liposome uncoating being generated. The addition of γCad before ArfGAP2 caused the same inhibition because γCad binds the available ArfGAP2, leaving less ArfGAP2 to bind the coatomer and cause uncoating. The addition of the "Medium" length fusion protein before ArfGAP2 also caused inhibition, owing to its interactions either with the coatomer (through its ArfGAP2 segment) or with ArfGAP2 (through its γCad) (Figure 12).


Figure 12: ArfGAP2 competition with γCad, fusion protein and ArfGAP2-II peptide. Liposomes were incubated with Arf1, GTP and coatomer, followed by the addition of EDTA (indicated). The addition of EDTA initiates nucleotide exchange on Arf1, resulting in its activation. When Arf1 is activated the coatomer is recruited to the membrane, which increases the light scattering. Nucleotide exchange was stopped by the indicated addition of excess magnesium ions. Addition of 5nM ArfGAP2 (Blue line) decreased the light scattering, owing to the binding of ArfGAP2 (which contains a catalytic region) to the coatomer: this binding activates GTP hydrolysis on Arf1, resulting in liposome uncoating and hence the decrease in light scattering. When either 10μM ArfGAP2-II peptide (Light Blue line), 10μM γCad (Red line) or 10μM "Medium" length fusion protein (Purple line) was added before ArfGAP2 for competition, the decrease in light scattering was more moderate. They thus inhibit the uncoating reaction by binding either to the coatomer or to ArfGAP2, leaving less of these proteins free for binding and activating hydrolysis. These findings indicate that ArfGAP2 interacts with the recombinant fusion protein and with the recombinant γCad [28, 49].

b. Isothermal Titration Calorimetry (ITC) was also used to measure the binding affinity and thermodynamics of the binding events [50, 51]. The heat released or absorbed during the binding of the ArfGAP2-II peptide to the γCad, and during its binding to the "Medium" length fusion protein, was measured directly, giving information on the relative binding affinity (KD) and enthalpy (ΔH). When we titrated the ArfGAP2-II peptide into the γCad, progressively less of the injected peptide was bound as the γCad became saturated, and the reaction was exothermic with a KD of 0.19µM. When we titrated the ArfGAP2-II peptide into the "Medium" length fusion protein, the reaction was endothermic and of lower affinity, with a measured KD of 0.54µM (Figure 13).

Figure 13: Calorimetric data for the binding of ArfGAP2-II peptide to the γCad and the "Medium" length fusion protein. The titration plots were derived from the integrated heats of binding, corrected for heats of dilution. The line represents the nonlinear best fit to the data assuming a single-site binding model. The data allow the calculation of the relative binding affinity (KD), stoichiometry (n), enthalpy (ΔH), and entropy (ΔS), as given in each panel. Panel A. Titration plot derived from the binding of ArfGAP2-II peptide to the γCad. As seen from the titration plot, the enthalpy (ΔH) decreases as the reaction continues and more of the γCad is saturated; the reaction is exothermic owing to the binding of the ArfGAP2-II peptide to the γCad. Panel B. Titration plot derived from the binding of ArfGAP2-II peptide to the "Medium" length fusion protein. As seen from the titration plot, the enthalpy (ΔH) increases as the reaction continues and less of the γCad is saturated, making the reaction endothermic. This is attributed to the fusion protein already being self-bound, since it contains both the γCad part and the ArfGAP2 segment it binds; this bond must open before the injected ArfGAP2-II peptide can bind the fusion protein. Nevertheless, the higher KD in titration plot B implies not only that the fusion protein binds the ArfGAP2-II peptide (as inferred from the SLS experiment), but also that it binds the peptide with lower affinity because the fusion protein is already self-bound, leaving fewer binding sites available for the peptide.
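To put the two measured dissociation constants on a common energy scale, they can be converted to standard binding free energies with ΔG° = RT ln(KD), taking KD relative to a 1 M standard state. The sketch below is illustrative only: the temperature of 298.15 K is an assumption (the measurement temperature is not stated here) and the function name is hypothetical.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J/(mol*K)


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant.

    Uses dG = R*T*ln(KD) with KD in mol/L (1 M standard state).
    The default temperature is an assumption, not a reported value.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_GAS * temperature_k * math.log(kd_molar) / 1000.0


dg_gcad = binding_free_energy_kj(0.19e-6)    # peptide binding to γCad
dg_fusion = binding_free_energy_kj(0.54e-6)  # peptide binding to fusion protein
print(round(dg_gcad, 1), round(dg_fusion, 1), round(dg_fusion - dg_gcad, 1))
```

Under this assumption the roughly threefold difference in KD corresponds to a free energy difference of only a few kJ/mol, so both interactions remain in the sub-micromolar range.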


5.1.2.3 The fusion protein crystallization.

We purified all three fusion proteins (in the same fashion used for γCad) and each one was used for crystallization screening. In size exclusion chromatography performed prior to the crystallization trials, all of the proteins behaved as monomers in solution. We identified several crystallization conditions for the "Medium" and "Small" fusion proteins. No crystals were obtained from the "Large" fusion protein, suggesting that this length of additional peptide disturbs the formation of the crystal lattice. Unfortunately, the crystals were needle-shaped (Figure 14). All attempts to improve crystal quality, using several types of seeding techniques [52-54] and the addition of chemical additives, were unsuccessful.

Figure 14: Crystals containing the "Medium" length fusion protein. The crystals containing the "Medium" length fusion protein grew after 3 weeks, by hanging drop vapor diffusion against a reservoir containing 0.2M sodium chloride, 0.1M Bis-Tris pH6.5, 25% polyethylene glycol 3350. The needle-shaped crystals were ~0.05 mm long.

We did however manage to collect a complete data set to 2.8Å resolution from a "Medium" size fusion protein crystal. This structure was determined using molecular replacement to solve the phase problem. As the search model, we used the previously determined structure of human γCad, PDB code 1R4X. We used the programs Phenix, CNS and Coot, for refinement and model building. After many iterative rounds of refinement (using twin law: k,h,-l due to the presence of twinning), the Rwork and Rfree converged to 22.24% and 27.7%, respectively (Table 4).
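The twin law k,h,-l used in refinement swaps the h and k indices and inverts l. The sketch below shows how such an operator maps one reflection onto its twin-related mate. The observation that the cell edges a and b in Table 4 are nearly equal (50 Å and 51.2 Å), which is the usual situation in which this operator can mimic lattice symmetry in an orthorhombic cell, is our reading and is not stated in the text; the function names are hypothetical.

```python
# Minimal illustration of the twin operator k,h,-l (hypothetical helpers).

def apply_twin_law(hkl):
    """Map (h, k, l) to its twin-related index (k, h, -l)."""
    h, k, l = hkl
    return (k, h, -l)


def axis_mismatch(a, b):
    """Relative difference between two cell edges."""
    return abs(a - b) / max(a, b)


reflection = (3, 1, 4)
twin_mate = apply_twin_law(reflection)
print(reflection, "->", twin_mate)

# The operator is a two-fold: applying it twice restores the original index.
print(apply_twin_law(twin_mate) == reflection)

# Cell edges a and b taken from Table 4.
print(round(axis_mismatch(50.0, 51.2), 3))
```

In a twinned data set each measured intensity is a weighted sum of the intensities of a reflection and its twin mate, which is why the refinement programs need the operator explicitly.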


Table 4: Data collection and refinement statistics for the fusion protein.

Data Collection

Space group P212121

Cell dimensions

a, b, c (Å) 50, 51.2, 98

α, β, γ (°) 90, 90, 90

Resolution(Å) 49-2.8

Rmerge*,† 0.059

Mean I/sigma(I) 17.18 (9.02)

Completeness (%)* 98.63 (100)

Multiplicity* 4.4

Refinement

No. of Measured reflections 28832

Rwork/Rfree (%)‡ 22.24/27.7

No. of non- hydrogen atoms 1990

Protein residues 249

Water 34

Average B-factor (Å2) 32.5

Root mean square deviations

Bond lengths (Å) 0.008

Bond angles (°) 1.52

*Values in parentheses are for the last shell.
†Rmerge = Σhkl Σi |Ii(hkl) − ⟨I(hkl)⟩| / Σhkl Σi Ii(hkl), where Ii(hkl) is the observed intensity and ⟨I(hkl)⟩ is the mean value of I(hkl).
‡Rwork/Rfree = Σhkl ||Fobs| − |Fcalc|| / Σhkl |Fobs|, where Rwork and Rfree are calculated using the working and test reflections, respectively. The test reflections (5%) were held aside and not used during the entire refinement process.
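The R-factor defined in the footnote above can be computed directly from lists of observed and calculated structure factor amplitudes. The following is a minimal sketch, not the procedure implemented in the refinement programs: the function names, the random 5% split and the toy amplitudes are all assumptions made for illustration.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two equally long, non-empty lists")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_work_test(n_reflections, test_fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the test set."""
    indices = list(range(n_reflections))
    n_test = round(n_reflections * test_fraction)
    test = set(random.Random(seed).sample(indices, n_test))
    work = [i for i in indices if i not in test]
    return work, sorted(test)


# Toy amplitudes, made up for illustration only.
f_obs = [100.0, 200.0, 150.0, 80.0]
f_calc = [90.0, 210.0, 150.0, 100.0]
print(r_factor(f_obs, f_calc))
```

Rwork is obtained by applying `r_factor` to the working reflections only and Rfree by applying it to the test reflections, which must stay excluded from refinement throughout.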

The medium fusion protein crystallized with a monomer in the asymmetric unit. We located a significant patch of unoccupied electron density, suspected to belong to a small part of the ArfGAP2 peptide, near one of the previously suspected binding sites of the γCad, adjacent to W776 (Figure 15). The density was flanked by two negatively charged γCad amino acids, D781 and E784, and by the positively charged amino acids R843 and R859 (Figure 15D). An additional nearby residue was the negatively charged E675 from a neighboring molecule in the crystal lattice. Given the length of the ArfGAP2 segment in the crystallized "Medium" fusion protein, the diffraction resolution, and the fact that we could identify electron density accounting for only three of its 61 amino acids, we could not unequivocally identify which specific ArfGAP2 amino acids contribute to the binding to γCad. We were unable to fit buffer molecules (Bis-Tris, found in the crystallization liquor, or Hepes, which was used during purification) into the un-modelled density. Furthermore, placing one of the buffer molecules into the un-modelled density increased the R-factor and R-free, ruling out the possibility that the un-modelled density belongs to one of the buffer molecules. We managed to fit three alanine residues into the un-modelled density (Figure 15C and Figure 15D) without raising the R-factor and R-free.


Figure 15: Crystal structure of γCad from the Medium fusion protein overlaid with 2Fo-Fc and Fo-Fc electron density maps showing unoccupied density. The γCad structure is colored purple, while a second, symmetry-related molecule in the crystal lattice is colored gray. Panel A. A section of the 2Fo-Fc electron density map (yellow mesh). The electron density map is contoured at σ = 1.5, and the un-modelled density is located near residue 776 (black arrow). Panel B. A section of the Fo-Fc difference map. The un-modelled density, indicating atoms missing from the model, is colored orange and contoured at σ = 2.5. The large un-modelled density suspected to correspond to a part of ArfGAP2 is marked by the black arrow and can be seen between W776 and E675 from a symmetry-related molecule. Panels C-D. Two orientations of the same section of the 2Fo-Fc density map shown in panel A, after positioning three alanine residues and recalculating the map with them included.


5.1.2.4 Analysis of the putative ArfGAP2-γCad interaction site

Having identified a significant patch of density in the crystal structure that could represent the strongest area of interaction between the peptide and the γCad protein (at least in the crystal), we used this position to try to model the most critical segment of ArfGAP2 (with respect to binding to γCad) into the density. As can be seen in Figure 16, a surface representation of the γCad shows a significant groove that could accommodate ArfGAP2. This groove was already proposed to accommodate ArfGAP2 on the basis of a previously solved structure [28]. We calculated the surface electrostatic potentials of the protein using PyMOL and compared the potentials of the contact areas of the γCad protein and the selected peptide (Figure 16).


Figure 16: γCad and ArfGAP2 structure electrostatics. Positive patches are in blue, negative patches in red and hydrophobic patches in white. Panel A. The fusion structure electrostatics with the Rat ArfGAP2 critical segment (KKGLGAKKGLGAKV) in green. ArfGAP2 was modelled using the Phyre server, with the template structure PDB code 1b25 chain A [59, 60]. Part of the ArfGAP2 peptide was fitted in the unoccupied electron density patch (circled) near a niche containing a hydrophobic pocket. Panel B. The fusion structure electrostatics compared with the ArfGAP2 critical segment electrostatics, shown in 180° mirror inversion. The ArfGAP2 peptide binding area of the unoccupied electron density patch is circled (the arrow indicates the position of W776 in this hydrophobic niche). This area contains a mainly hydrophobic patch compatible with the corresponding hydrophobic surface of ArfGAP2.


Structural comparison between the previously determined γCad structure (1R4X) and the structure determined from the "Medium" length fusion protein of γCad with the ArfGAP2-II segment gave an RMS of 1.44. This is higher (indicating more differences) than the RMS obtained from the alignment between the γCad structure (1R4X) and the structure determined from the co-crystallization of γCad with the ArfGAP2-II peptide (Figure 17).

Figure 17: Structural comparison between the γCad structure (1R4X) and the structure determined from the "Medium" length fusion protein. Alignment of the previously determined γCad structure (1R4X) (Green) with the structure determined at 2.8Å from the "Medium" length fusion protein of γCad with the ArfGAP2-II segment (Red) gave an RMS of 1.44. (The position of ArfGAP2-II was not identified.)
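The RMS values quoted for the structural alignments are root-mean-square deviations over matched atoms. As a hedged illustration of the quantity itself (not of the alignment software used, which also performs the superposition), the sketch below computes the deviation for two lists of coordinates that are assumed to be already superposed and listed in the same atom order. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, superposed atom lists.

    Each list holds (x, y, z) tuples in Angstrom; atom i of coords_a must
    correspond to atom i of coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two equally long, non-empty coordinate lists")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates, made up for illustration only.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(model_a, model_b))
```

Because the value depends on which atoms are included and on the superposition, two RMS figures are comparable only when they were computed over the same atom selection.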

5.2 The δCM

5.2.1 δCM purification

100μl of the E. coli BL-21(DE3) pLysS (Novagen) host strain containing the vector encoding bovine δCM (amino acids 267-511) was grown in Luria-Bertani (LB) medium supplemented with the appropriate antibiotics. Cells were harvested by centrifugation (7000 rpm, 10 min, 4°C) and frozen at -20°C. Frozen cells were lysed using a French pressure cell in 10mM Tris (pH7.5), 200mM NaCl (buffer A) plus protease inhibitors, and the protein was purified by nickel affinity chromatography. The protein was further purified by size exclusion chromatography in buffer A containing 5mM dithiothreitol (DTT), where it eluted at a volume corresponding to a dimer (based on a standard curve of proteins of known molecular weight). The protein was then concentrated to ~8 mg/ml (Figure 18).


Figure 18: δCM purification. Panel A. Gel filtration chromatogram showing the elution profile of the purified δCM. The peak indicates a clean, homogeneous, dimeric form of δCM. Panel B. SDS-PAGE gel of δCM. Lane 1 contains the δCM fractions that eluted from the gel-filtration chromatography after 30-34 minutes and were taken for concentration. The star indicates the δCM band between 25kDa and 37kDa. The gel was overloaded in order to reveal small amounts of impurities.

5.2.2 δCM crystallization

The δCM was screened for potential crystallization conditions. We also designed 3 short synthetic peptides for co-crystallization trials: an R-based sorting motif peptide containing the sequence PLRKRSV (referred to in this thesis as R1); a strongly interacting R-based sorting motif peptide containing the sequence KLRRRRI (referred to in this thesis as R2) [36]; and an ArfGAP1 coatomer-binding determinant containing the sequence AADEGWDNQNW [37] (referred to in this thesis as AG1). Previous studies indicate that there is only one binding site in the δCM; thus an initial molar ratio of 1:4 (protein : peptide) was used in the crystallization trials of the δCM with each of the peptides, but we also tried other molar ratios with each peptide (always in excess). We obtained crystals in five different conditions. The crystals grew within a week, with a rectangular shape and a length of approximately 0.1mm (Table 5).


Table 5: Different crystals obtained from different crystallization conditions.

In order to confirm that the crystals from the purified recombinant protein which were obtained in the purification and crystallization process are indeed the δCM protein, we performed analysis by mass spectrometry (MS). The MS resulted in peptides from the δCM protein which together represent 82% of the whole δCM protein sequence (Figure 19).

267-
PPINMESVHMKIEEKITLTCGRDGGLQNMELHGMIMLRISDDKFGRIRLHV
ENEDKKGVQLQTHPNVDKKLFTAESLIGLKNPEKSFPVNSDVGVLKWRLQ
TTEESFIPLTINCWPSESGNGCDVNIEYELQEDNLELNDVVITIPLPSGVGAP
VIGEIDGEYRHDSRRNTLEWCLPVIDAKNKSGSLEFSIAGQPNDFFPVQVSF
ISKKNYCNIQVTKVTQVDGNSPVRFSTETTFLVDKYEIL
-511

Figure 19: δCM crystal MS analysis. The identified peptides of the δCM sequence are colored Red. Peptides unresolved in the MS are colored in Black.
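The coverage figure quoted for the MS analysis can be illustrated with a minimal sketch. It assumes that coverage is simply the fraction of residues that fall inside at least one identified peptide. The function name and the two example peptides are hypothetical and are not the peptides actually identified by MS; the example sequence is the first 51 residues of the δCM sequence shown in Figure 19.

```python
def sequence_coverage(sequence, peptides):
    """Return the fraction of residues in `sequence` covered by any peptide."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    covered = [False] * len(sequence)
    for peptide in peptides:
        if not peptide:
            continue  # an empty string carries no coverage information
        start = sequence.find(peptide)
        while start != -1:  # mark every occurrence, not only the first
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = sequence.find(peptide, start + 1)
    return sum(covered) / len(sequence)


# First 51 residues of the deltaCM sequence (residues 267-317 in Figure 19).
fragment = "PPINMESVHMKIEEKITLTCGRDGGLQNMELHGMIMLRISDDKFGRIRLHV"
# Hypothetical peptides, chosen only to illustrate the calculation.
example_peptides = ["IEEKITLTCGR", "ISDDKFGR"]
print(f"coverage: {sequence_coverage(fragment, example_peptides):.0%}")
```

Overlapping peptides are counted once per residue, so the result cannot exceed 100%.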

Most of the crystals produced data sets. We tried to solve the structure from a crystal co-crystallized with the R2 peptide, grown in 0.4 M sodium chloride, 0.1 M HEPES pH 7.5 and 18% polyethylene glycol 3350. This crystal, referred to here as "the native protein crystal", diffracted to 2.1 Å resolution (Figure 20).


Figure 20: δCM diffraction pattern. A 1° oscillation diffraction pattern from a frozen δCM crystal, recorded with an ADSC Quantum 4R detector on beamline ID14-4 at the ESRF, Grenoble, France.

5.2.3 δCM data determination

We processed the data using Mosflm, and the Matthews coefficient suggested that the asymmetric unit most likely contains two protein molecules. We searched for previously determined structures of homologous proteins with the Phyre2 search engine, based on amino acid sequence homology [59]. We could not find a homologous structure with sufficiently high sequence homology (the sequence homology with previously determined MHD structures was 10-20%). Nevertheless, we made numerous molecular replacement trials using different search models (usually taken from the equivalent µ2 subunit of AP2), but none of them yielded a solution.
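How the Matthews coefficient points to the number of molecules in the asymmetric unit can be sketched as follows. The cell dimensions and space group are those reported for the native crystal in Table 7. The molecular weight of 28,000 Da is an assumption made only for illustration (the text states only that δCM runs between 25 kDa and 37 kDa), and the function names are hypothetical. The solvent fraction uses the common approximation 1 - 1.23/V_M.

```python
def matthews_coefficient(a, b, c, n_asu_per_cell, n_molecules, mw_da):
    """V_M in cubic Angstrom per Dalton for a cell with all angles equal to 90 degrees."""
    cell_volume = a * b * c  # valid only for alpha = beta = gamma = 90 degrees
    return cell_volume / (n_asu_per_cell * n_molecules * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction of the crystal from V_M."""
    return 1.0 - 1.23 / vm


A, B, C = 42.09, 110.3, 145.79   # cell dimensions from Table 7 (Angstrom)
N_ASU = 4                        # P2(1)2(1)2(1) has four asymmetric units per cell
ASSUMED_MW_DA = 28000.0          # assumption, not a value given in the text

for n in range(1, 5):
    vm = matthews_coefficient(A, B, C, N_ASU, n, ASSUMED_MW_DA)
    print(f"{n} molecule(s): V_M = {vm:.2f} A^3/Da, solvent = {solvent_fraction(vm):.0%}")
```

Values of V_M for protein crystals are most often found roughly between 1.7 and 3.5 Å³/Da, so the printed values can be compared against that range to judge which number of molecules is plausible.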

In an attempt to obtain an X-ray anomalous scattering signal suitable for complete X-ray analysis, the recombinant δCM was expressed in a Met(-) strain of E. coli in the presence of seleno-methionine and crystallized under the same conditions used to obtain the native protein crystals [61]. The seleno-methionine protein produced crystals suitable for complete SAD or MAD structural analysis. We collected a full 2.8 Å resolution SAD data set at the Se anomalous diffraction peak wavelength of 0.9791 Å. This data set was collected from a crystal grown in 0.2 M sodium citrate tribasic dihydrate and 20% polyethylene glycol 3350. We processed the data using XDS and, assuming the presence of two protein molecules in the asymmetric unit, located 10 Se atoms (Table 6).
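The peak wavelength can be converted to photon energy with E = hc/λ, which makes it easier to compare with tabulated absorption edges. The sketch below is only a unit conversion; the constant is hc expressed in keV·Å and the function name is hypothetical.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV*Angstrom


def wavelength_to_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


print(f"{wavelength_to_energy_kev(0.9791):.3f} keV")
```

For 0.9791 Å this gives about 12.66 keV, which lies at the Se K absorption edge.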

Table 6: Se atoms coordinates for SAD phasing

We used the programs Phenix and Coot for experimental phasing and model building; however, these programs managed to place only ~60% of the amino acids of both chains into the electron density (Figure 21).


Figure 21: Preliminary model fitting of δCM to the electron density map. Panel A. A 2Fo-Fc electron density map (colored blue) of the asymmetric unit at 1.5σ, obtained after experimental phasing using the program Phenix at 2.8 Å. The electron density can accommodate two δCM monomers. Panel B. An initial model (colored yellow) was obtained by placing several peptides from the δCM into the electron density using Phenix. The program managed to place ~60% of the two monomers, mostly as polyalanine peptides, with one monomer exhibiting more order than the other.


Numerous trials were made, using various programs, in order to place the rest of the protein's amino acids:

1. We noticed that the more organized monomer, referred to as "molecule A", contained peptides that were missing from the less organized "molecule B", and vice versa. Using PyMOL we reassigned each such peptide to the other chain, resulting in more organized and unified monomers in the asymmetric unit (Figure 22A).
2. We used the more organized monomer, chain A, as a search model in a molecular replacement experiment using PhaserMR (as implemented in CCP4) against the processed data from the native protein crystal. This raised the resolution of the solved structure to 2.1 Å, which made it easier to identify the missing amino acids in the electron density.
3. Since the protein crystallized as a dimer in the asymmetric unit, and each molecule contained ordered peptides that the other was missing, we superposed molecule A on molecule B using PyMOL and tried to place the missing peptides of each molecule using the equivalent, aligned peptide from the other. The structures of the two molecules were not identical; however, this procedure helped us identify the general outline of each chain (Figure 22B).
4. Using the program CNS, we computed OMIT maps for different areas in the asymmetric unit. An OMIT map is an electron density map calculated with phases derived from a model from which a specific region has been omitted, so that the density appearing in that region is not biased by the model; it is used to detect errors in the atomic model during model building.
5. The resulting solution was submitted to refinement using Phenix, but the R-factor and R-free did not improve beyond 36.5% and 39.5%, respectively, suggesting a problem with the X-ray data set or a wrong solution. Furthermore, after all these attempts, molecule B was still not as well defined in its electron density maps as molecule A, and only ~90% of the protein was covered (Figure 22C).


Figure 22: Model building of δCM. Panel A. The δCM structure is colored blue. The peptides that were initially placed in the opposite chain are circled in red. Each peptide was added to the opposite chain from the nearest asymmetric unit. Panel B. Alignment of chain A (colored red) on chain B (colored green) gave an RMS of 3.636. Panel C. A cartoon representation of the structure of δCM at 2.1 Å, showing that after many manipulations only ~90% of the protein was recovered and some segments remained undefined (black arrows).

6. The X-ray data from the native protein crystal were reprocessed using HKL-2000 (indexing, integration and scaling). The manipulated molecule A from the last resulting solution was then used as a search model in a molecular replacement experiment using PhaserMR (as implemented in Phenix). A clear and unique solution was obtained with two protein molecules in the asymmetric unit, as expected. This solution was submitted to several refinement cycles, starting with auto-building (using simulated annealing to 5000 K followed by slow-cooling refinement), followed by refinement of individual coordinates and isotropic temperature factors for each atom, addition of water molecules, and modeling of residues with multiple conformations where necessary. Throughout the process, the program Coot was used for graphical inspection, correction, modification and validation of the protein molecules in their weighted electron density maps (2Fo-Fc and Fo-Fc), as well as for manual addition and/or removal of solvent molecules. A single FMT (formic acid) molecule was added near R435 and N463 of molecule A, replacing what was at first believed to be two water molecules that were, however, too close to each other (Figure 23). The final structure has an R-factor and R-free of 18.2% and 22.8%, respectively (Table 7). In the process of rebuilding and refinement of the protein structure, we were greatly assisted by Dr. Haim Rozenburg from the Weizmann Institute of Science. The structure of δCM at 2.15 Å was submitted to and accepted by the PDB under the code 4O8Q.


Figure 23: Crystal structure of the δCM protein. The two molecules are indicated, as well as the differences between them: a longer N-terminus in molecule A and an internal flexible loop in molecule B. The FMT molecule near molecule A is also indicated. PDB code 4O8Q.


Table 7: Data collection and refinement statistics for the δCM

Data collection (native)
Space group                    P2₁2₁2₁
Cell dimensions
  a, b, c (Å)                  42.09, 110.3, 145.79
  α, β, γ (°)                  90, 90, 90
Resolution (Å)                 19.8-2.15
Rmerge*,†                      0.055
Mean I/sigma(I)                2.7
Completeness (%)*              87.72 (75.59)
Redundancy*                    2.8

Refinement
No. of unique reflections      43250
Rwork/Rfree (%)‡               18.2/22.88
No. of non-hydrogen atoms      4049
Protein residues               484
Water                          229
Average B-factor (Å²)          42.1
Root mean square deviations
  Bond lengths (Å)             0.008
  Bond angles (°)              1.16

*Values in parentheses are for the last shell.
†$R_{merge} = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_{i} I_i(hkl)$, where $I_i(hkl)$ is the observed intensity and $\langle I(hkl)\rangle$ is the mean value of $I$.
‡$R_{work}/R_{free} = \sum_{hkl} ||F_{obs}| - |F_{calc}|| / \sum_{hkl} |F_{obs}|$, where $R_{work}$ and $R_{free}$ are calculated using the working and test reflections, respectively. The test reflections (5%) were held aside and not used during the entire refinement process.
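The two footnote formulas can be written out as a minimal sketch. The function names and the toy numbers in the usage lines are hypothetical; real refinement programs also apply a scale factor between |Fobs| and |Fcalc|, which is omitted here.

```python
def r_merge(observations):
    """R_merge from a dict mapping each hkl tuple to its list of measured intensities."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """Sum of ||Fobs| - |Fcalc|| divided by the sum of |Fobs| over matched reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(abs(fo) for fo in f_obs)


# Toy data, for illustration only.
toy_observations = {(1, 0, 0): [10.0, 12.0], (0, 1, 0): [5.0, 5.0]}
print(f"R_merge = {r_merge(toy_observations):.4f}")
print(f"R = {r_factor([10.0, 20.0], [9.0, 22.0]):.3f}")
```

Under this sketch, R_work is `r_factor` evaluated over the working reflections and R_free is the same function evaluated over the 5% of reflections held aside as the test set.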

The two molecules in the crystal structure differ in some respects: molecule A has a longer resolved N-terminus (residues 267-269, containing the amino acids SPI) (Figure 24A), while molecule B has an apparently complete flexible loop at positions 384-388, containing the amino acids ESGNG (Figure 24B). No interpretable electron density could be obtained for this flexible loop in molecule A.


Figure 24: Sections of the δCM crystal structure overlaid with the 2Fo-Fc electron density map. The δCM structure is colored purple and the electron density map is shown as a yellow mesh. Panel A. A section of the 2mFo-DFc density map of molecule A. At σ = 1.5, the resolved N-terminus (residues 267-269, containing the amino acids SPI) is well defined. Panel B. A section of the 2Fo-Fc density map of molecule B. At σ = 1.3, the flexible loop at positions 384-388, containing the amino acids ESGNG, is defined. No interpretable electron density could be obtained for the side chain of N387, but its backbone is clearly defined in the density map.

Like other solved MHD structures, the δCM structure is characterized by parallel β-sheets, but alignment with the equivalent MHD structure of the µ subunit of AP2 revealed a few general structural differences (Figure 25).


Figure 25: MHD structural differences. The N-terminus and C-terminus of each structure are labeled "N-ter" and "C-ter", respectively. Panel A. Alignment of the highest-scoring δCM model created by PHYRE, based on the structure of the MHD of the µ subunit of AP2 (14% sequence homology), PDB code 2JKR (green), on δCM molecule B (red) [62]. The RMS is 2.857. The RMS of the alignment of the δCM model on δCM's molecule B (not presented) is 2.499. The significant differences are indicated with the black arrow. Panel B. Alignment of the MHD of the µ subunit of AP2, PDB code 2XA7 (green), on δCM molecule B (red). The RMS is 6.302. The RMS of the alignment of the MHD of the µ subunit of AP2 on δCM's molecule B (not presented) is 5.906. There are differences in the general locations of the helices and β-sheets. The µ subunit from the crystal structure of the open conformation of AP2 (minor sequence homology), PDB code 2XA7, was previously used to create the model of the δ-COPI structure when a model for membrane recruitment of coatomer was constructed [31, 32].
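The RMS values quoted in the figure legends are root-mean-square deviations between equivalent atoms of two superposed structures. A minimal sketch of that last step is given below. It assumes the two coordinate lists are already superposed and paired atom by atom (for example Cα atoms), and it does not perform the superposition itself. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must contain the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates, for illustration only: the second atom is displaced by 2 along z.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(f"RMSD = {rmsd(structure_1, structure_2):.3f}")
```

Because the value depends on which atoms are paired and on how the superposition was optimized, RMS values from different alignment procedures are not directly comparable.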


6. Discussion

6.1 The interaction between γCad and ArfGAP2

6.1.1 Adding unstructured protein for crystallization.

We had difficulty locating the ArfGAP2 peptide during crystal structure determination, even after using a smaller peptide or different crystallization conditions. This may be due to the dynamic nature of the ArfGAP2 protein [63] or to weak binding. Unstructured proteins have multiple conformations that are separated by low free-energy barriers, and thus their structures fluctuate between different states. When bound to their biological targets, some unstructured proteins gain a folded structure while others retain much flexibility, forming 'fuzzy' complexes [64]. To increase the chance of co-crystallization, and perhaps the odds of the ArfGAP2 peptide gaining a folded structure, we cloned a fusion protein composed of the two segments. Fusion alone was not enough to stabilize the peptide in place, probably because it does not bind well. This is supported by the Kd being in the μM range and by the confocal microscopy, which showed absorption of the peptide by the crystal.

6.1.2 Findings of interaction.

Using the SLS assay we showed that not only do the γCad and our synthetic ArfGAP2 peptide interact with ArfGAP2 and the coatomer, respectively, but also that the fusion protein has sites available for interaction either with the coatomer (through its ArfGAP2 segment) or with ArfGAP2 (through its γCad). In addition, we showed by ITC that the fusion protein binds the ArfGAP2 peptide with lower affinity than γCad does, because fewer binding sites are available to the ArfGAP2 peptide when the fusion protein can interact with itself. This is supported by the ITC result showing that binding of the fusion protein to the ArfGAP2 peptide is an endothermic reaction, as the fusion protein needs to unbind from itself before binding the ArfGAP2 peptide. We were then able to crystallize the fusion protein and solve its structure, and managed to locate a patch of unmodeled electron density (which cannot be attributed to the surrounding modeled γCad). This patch fits 3 amino acids in symmetrically repeated positions near the previously suspected site of W776. This site was previously suggested to be important for binding, based on an unexpected structural similarity of the γCad to the carboxyl-terminal appendage domain of the beta subunit of the AP2 adaptor protein of clathrin-coated vesicles. Even though the two share low sequence identity (but high sequence similarity), the structural conservation exhibited by the γCad, coupled with functional data, supported a model of function parallel to the clathrin/AP2 system, a model which is reinforced by our findings [10, 12, 65]. Even though we were unable to locate the entire ArfGAP2 peptide (due to its dynamic nature) or the exact amino acids involved in its binding to γCad, this is the first experimental evidence for the general location of the binding site (where part of the ArfGAP2 peptide became more rigid upon γCad binding). Based on the location of the patch and the chemical character of the peptide, we propose that the positively charged ArfGAP2 peptide is attracted to this site by the negatively charged amino acids surrounding this position in the crystal (D781, E784 and E675 from a different asymmetric unit).

6.1.3 Suggestions for the future.

The attempts described here illuminate the problems of adding an unfolded peptide to a structured protein for crystallization: on the one hand, we needed to add a long peptide to assist its folding as well as its binding, but on the other hand, the length of the peptide makes co-crystallization more difficult. It might be more beneficial to add the peptide as many small pieces and to focus on regions that are predicted from the protein/peptide sequence not to be intrinsically unfolded [63, 66]. In this scenario, one could perhaps identify short linear sequences with elevated affinity towards the γCad, resulting in successful co-crystallization. Our attempts to add chemical cross-linkers to ensure binding and co-crystallization of the γCad and the ArfGAP2 peptide resulted in amorphous aggregates instead of crystals. However, chemical cross-linking coupled with MS/MS analysis [67, 68] may be used to find the interaction site without crystallization: the protein complex is formed and cross-linked, then digested with a protease, and the cross-linked peptides are analyzed by MS/MS, resulting in the identification of specific linked peptides between interacting subunits. These interacting peptides can then be mapped onto the entire structures and used as anchors or distance constraints in protein-protein docking models.

6.2 The δCM

6.2.1 The δCM structure for interpretation of the entire COPI system.

Each of the subunits in the F-COPI subcomplex exhibits low but significant sequence identity to subunits in the AP2 complex. This, together with other biochemical data, has led to a model in which the F-COPI and B-COPI subcomplexes are functionally analogous to the AP2 complex and clathrin, respectively [9, 12, 69-71]. This analogy model of COPI provides an important framework for understanding COPI function; however, the lack of the structural data for the entire COPI complex that are required to test its predictions presents a significant obstacle to further analysis of COPI-mediated trafficking events. The δ-COPI subunit shows sequence similarity to the μ subunit of AP2 but lacks recognizable sequence identity (<30%). It is one of two subunits (β-COPI and δ-COPI), out of the seven subunits composing the coatomer, for which there were to date no crystallographic data at all. We have successfully solved the structure of δCM using a combined SAD-MR method, without a previously determined search model, as no model had high enough sequence identity (molecular replacement failed). The structure of δCM can now in turn serve as a search model for solving the undetermined structure of β-COPI, which is known to interact tightly with δ-COPI in the trunk domain [36, 70, 71] and can be purified with it as a fusion protein. The resulting structure of this fusion protein would reveal their exact binding site, enabling further analysis of COPI-mediated trafficking events. Meanwhile, we can use the crystal structure to try to predict the binding sites for β-COPI and for other COPI subunits, using programs that calculate the electrostatics and the amino acids composing the δCM interface, in order to understand the structure of the entire COPI (Figure 26). Such a model can narrow the options for targeting point mutations to specific amino acids in δCM in order to abolish the binding to other subunits. In addition, the δCM structure can help in the process of solving the structure of the entire coatomer complex by TEM [72].


Figure 26: Partial model for the open conformation of coatomer. A ribbon diagram of a partial composite model in the open conformation, oriented with respect to the membrane. The coatomer is bound to the membrane via two molecules of Arf1-GTP (not presented). The composite model contains δCM (PDB code 4O8Q, red), γ-COPI (PDB code 3TJZ, green), ζ-COPI (PDB code 3TJZ, cyan) and a part of the β-COPI subunit (amino acids 25-464, gray). The β-COPI model was built using the SWISS-MODEL modeling server [73-76] based on its homology with chain B from the crystal structure of the open conformation of the AP2 adaptor complex (PDB code 2XA7). The core of adaptor complexes (including the coatomer tetrameric complex) may exist in open and closed conformations, and thus the published structures may represent only one of these conformations [77]. We aligned the positions of the coatomer subunits using the open conformation of the AP2 adaptor complex as the template (PDB code 2XA7).


The general outline of the δCM structure obtained in the present study is similar to that of the previously determined MHD-containing structures (including the ones from the µ subunit of AP2). However, it does possess a few significant differences (Figure 25), which are the most likely reason that the available models could not serve as straightforward search models for molecular replacement. These differences make the δCM a unique and interesting MHD structure. The δCM structure served as a search model for molecular replacement to solve the rest of the data sets we collected from the co-crystallization of the δCM with the R-based signal peptides and with the ArfGAP1 peptide. Although all the crystals grown with the ArfGAP1 peptide had a significantly different unit cell (a=55, b=119, c=125) from the ones grown without it (a=42, b=112, c=146), in the same space group, we could not locate the added peptides in any of the structures. However, we are able to show, for the first time, the suspected R-based signal binding site [36] (residues 390-412) on an actual structure (instead of on a model, as was done so far), and to note that it is located near the flexible loop (residues 384-388), which may have physiological significance for cargo binding (Figure 27).

Figure 27: Proposed R-based signals binding site visualized in the δCM structure. The putative R-based signals binding site (residues 390-412) is presented as blue sticks on both δCM molecules A and B. The proposed site is shown on the structure to be near the flexible loop (residues 384-388) that is missing from molecule A (due to lack of interpretable electron density).


6.2.2 Exploration of the δCM interface and assembly

The δCM was purified as a dimer, and its structure was determined with two molecules in the asymmetric unit. There is, however, no evidence that δCM exists as a dimer in vivo. Using PDBePISA we calculated the interface, and although each monomer is indeed folded into an energetically favorable state (ΔG<0 for both molecules A and B), the analysis of the protein interfaces did not reveal any specific interactions that could result in the formation of a stable quaternary structure. Most probably the molecules do not form a dimer in solution. This is logical, as in the coatomer complex (according to the model [12]) each subunit appears only once, as a monomer.

6.2.3 Mutation in δCM involved in Neurodegenerative disorder.

Mutations in δ-COPI orthologs (as well as in other COPI subunits) in other species, such as yeast, Drosophila and C. elegans, all result in lethality [71, 78, 79]. However, nur17 mutant mice carry a T-to-C missense mutation in the archain 1 (Arcn1) gene, which encodes δ-COPI. This nucleotide transition causes an amino acid change from Ile to Thr at position 422 (Figure 28), which is located in the δCM domain, and results in partial loss of protein function [79]. In nur17 mice the efficiency of protein trafficking through the ER and Golgi may be affected. It is possible that a perturbation of the glycosylation process in the Golgi of melanocytes of nur17 mice leads the mutant mice to exhibit both coat-color dilution and ataxia due to Purkinje cell degeneration in the cerebellum. Moreover, the mutants share common characteristics of neurodegenerative disorders, such as abnormal protein accumulation, ER stress and neurofibrillary tangles. This shows a direct association between δ-COPI and neurodegeneration. Impairment of intracellular trafficking has been implicated in the pathogenesis of various neurodegenerative disorders, such as Alzheimer's disease [80, 81], Huntington's disease [82-84] and Parkinson's disease [85]. This missense mutation raises the possibility that δ-COPI has multiple roles in vesicle trafficking. δ-COPI may also exist unbound to β-COP in mammalian cells; thus, the protein may also have a function independent of β-COP or the COPI complex. The mutation does not cause RNA instability or improper processing of the protein, as the expression and localization of ARCN1 are not altered in nur17 melanocytes. This leads to the assumption that the defect could be structural, harming vital protein binding sites or even leading to misfolding of the protein. This assumption can be tested by introducing the I422T point mutation, crystallizing the mutant protein and determining its crystal structure. This is the reason our research on δCM does not end with our results but is the beginning of further research.
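The T-to-C transition behind the Ile-to-Thr change can be checked against the standard genetic code. The sketch below enumerates every single T-to-C substitution in the three isoleucine codons. The source does not state which codon or codon position is affected in Arcn1, so nothing here is specific to that gene, and the function name is hypothetical. Only the six codons needed for the check are included in the table.

```python
# Subset of the standard genetic code (DNA, coding strand).
CODON_TO_AA = {
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile",
    "ACT": "Thr", "ACC": "Thr", "ACA": "Thr",
}
ILE_CODONS = ("ATT", "ATC", "ATA")


def t_to_c_outcomes(codons=ILE_CODONS):
    """List (codon, 1-based position, mutant codon, amino acid) for each single T-to-C change."""
    outcomes = []
    for codon in codons:
        for index, base in enumerate(codon):
            if base != "T":
                continue
            mutant = codon[:index] + "C" + codon[index + 1:]
            outcomes.append((codon, index + 1, mutant, CODON_TO_AA[mutant]))
    return outcomes


for codon, position, mutant, amino_acid in t_to_c_outcomes():
    print(f"{codon} -> {mutant} (position {position}): {amino_acid}")
```

Every substitution at the second codon position gives Thr, whereas the third-position change in ATT is silent, which is consistent with a missense mutation at the second position of an isoleucine codon.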


Figure 28: Modeling of the I422T point mutation in δCM. The right panel shows the WT δCM with amino acid I422, while the left panel shows the suggested point mutation I422T, both in green [79]. The mutation is in the vicinity of the cargo-binding site of δCM, which recognizes R-based ER localization signals (residues 390-412 are presented in blue), and near the indicated flexible loop. The structures of the two amino acids do not differ greatly, although Ile is slightly longer. However, the mutation changes the hydrophobic Ile to the polar, uncharged Thr. This, in addition to the close proximity to the suspected R-based signal binding site, may be the cause of the mutant phenotype.


7. References

1. Bonifacino JS, Lippincott-Schwartz J. (2003) Coat proteins: shaping membrane transport. Nat Rev Mol Cell Biol. 4: 409–414.
2. Kirchhausen T. (2000) Three ways to make a vesicle. Nat Rev Mol Cell Biol. 1: 187–198.
3. Lee, M. C., Miller, E. A., Goldberg, J., Orci, L., and Schekman, R. (2004) Bi-directional protein transport between the ER and Golgi. Annu Rev Cell Dev Biol 20: 87-123.
4. Bethune, J., Wieland, F., and Moelleken, J. (2006) COPI-mediated transport. J Membr Biol 211: 65-79.
5. Lowe, M., and Kreis, T.E. (1998). Regulation of membrane traffic in animal cells by COPI. Biochim. Biophys. Acta 1404: 53–66.
6. Nickel, W., Brugger, B., and Wieland, F.T. (2002). Vesicular transport: the core machinery of COPI recruitment and budding. J. Cell Sci. 115: 3235–3240.
7. Rothman, J. E., and Orci, L. (1992) Molecular dissection of the secretory pathway. Nature 355: 409-15.
8. Fiedler, K., Veit, M., Stamnes, M.A., and Rothman, J.E. (1996). Bi-modal interaction of coatomer with the p24 family of putative cargo receptors. Science 273: 1396–1399.
9. Schledzewski, K., Brinkmann, H., Mendel, RR. (1999). Phylogenetic analysis of components of the eukaryotic vesicle transport system reveals a common origin of adaptor protein complexes 1, 2, and 3 and the F subcomplex of the coatomer COPI. J Mol Evol 48: 770–778.
10. Peter J. Watson, Gabriella Frigerio, Brett M. Collins, Rainer Duden and David J. Owen (2004) γ-COP Appendage Domain – Structure and Function. Traffic. 5: 79–88.
11. Langer, J.D., E.H. Stoops, J. Bethune, and F.T. Wieland. (2007) Conformational changes of coat proteins during vesicle formation. FEBS Lett. 581: 2083–2088.
12. Gregory R. Hoffman, Peter B. Rahl, Ruth N. Collins, and Richard A. Cerione. (2003) Conserved Structural Motifs in Intracellular Trafficking Pathways: Structure of the γ-COP Appendage Domain. Molecular Cell. 12: 615–625.
13. Rothman JE, Wieland FT. (1996) Protein sorting by transport vesicles. Science. 272: 227–234.
14. Orci L, Palmer DJ, Ravazzola M, Perrelet A, Amherdt M, Rothman JE. (1993) Budding from Golgi membranes requires the coatomer complex of non-clathrin coat proteins. Nature. 362: 648–652.


15. Spang A, Matsuoka K, Hamamoto S, Schekman R, Orci L. (1998) Coatomer, Arf1p, and nucleotide are required to bud coat protein complex I-coated vesicles from large synthetic liposomes. Proc Natl Acad Sci USA. 95: 11199–11204.
16. Jackson, C. L., and Casanova, J. E. (2000) Turning on ARF: the Sec7 family of guanine-nucleotide-exchange factors. Trends Cell Biol. 10: 60-67.
17. Franco, M., Chardin, P., Chabre, M., and Paris, S. (1995) Myristoylation of ADP-ribosylation factor 1 facilitates nucleotide exchange at physiological Mg2+ levels. J Biol Chem. 270: 1337-1341.
18. Goldberg, J. (1998) Structural basis for activation of ARF GTPase: mechanisms of guanine nucleotide exchange and GTP-myristoyl switching. Cell. 95: 237-248.
19. Cosson, P., Lefkir, Y., Demolliere, C., and Letourneur, F. (1998) New COP1-binding motifs involved in ER retrieval. EMBO J. 17: 6863-6870.
20. Jackson, M. R., Nilsson, T., and Peterson, P. A. (1990) Identification of a consensus motif for retention of transmembrane proteins in the endoplasmic reticulum. EMBO J. 9: 3153-3162.
21. Semenza, J. C., Hardwick, K. G., Dean, N., and Pelham, H. R. (1990) ERD2, a yeast gene required for the receptor-mediated retrieval of luminal ER proteins from the secretory pathway. Cell. 61: 1349-57.
22. Rothman JE. (2002) Lasker Basic Medical Research Award. The machinery and principles of vesicle transport in the cell. Nat Med. 8: 1059–1062.
23. Lena Kliouchnikov, Joelle Bigay, Bruno Mesmin, Anna Parnis, Moran Rawet, Noga Goldfeder, Bruno Antonny, and Dan Cassel (2009) Discrete Determinants in ArfGAP2/3 Conferring Golgi Localization and Regulation by the COPI Coat. Molecular Biology of the Cell. 20: 859–869.
24. Inoue, H., and Randazzo, P. A. (2007) Arf GAPs and their interacting proteins. Traffic 8: 1465-1475.
25. Parnis, A., Rawet, M., Regev, L., Barkan, B., Rotman, M., Gaitner, M., and Cassel, D. (2006) Golgi localization determinants in ArfGAP1 and in new tissue-specific ArfGAP1 isoforms. J. Biol. Chem. 281: 3785–3792.
26. Levi, S., Rawet, M., Kliouchnikov, L., Parnis, A., and Cassel, D. (2008). Topology of amphipathic motifs mediating Golgi localization in ArfGAP1 and its splice isoforms. J. Biol. Chem. 283: 8564–8572.
27. Kliouchnikov Lena. (2008) Dissection of function of ArfGAP2/3 proteins in the COPI trafficking machinery. Research proposal for the degree of Doctor of Philosophy.


28. Pevzner, I., Strating, J., Lifshitz, L., Parnis, A., Glaser, F., Herrmann, A., Brügger, B., Wieland, F., Cassel, D. (2012). Distinct role of subcomplexes of COPI coat in regulation of ArfGAP2 activity. Traffic 13(6): 849-856.
29. Cosson, P., Démollière, C., Hennecke, S., Duden, R. and Letourneur, F. (1996) δ- and ζ-COP, two coatomer subunits homologous to clathrin associated proteins, are involved in ER retrieval. EMBO J. 15: 1792–1798.
30. Cosson, P., Lefkir, Y., Demolliere, C., and Letourneur, F. (1998) New COP1-binding motifs involved in ER retrieval. EMBO J. 17: 6863-6870.
31. Yu, X., Breitman, M., and Goldberg, J. (2012). A structure-based mechanism for Arf1-dependent recruitment of coatomer to membranes. Cell 148: 530-542.
32. Jackson, L.P., Kelly, B.T., McCoy, A.J., Gaffry, T., James, L.C., Collins, B.M., Honing, S., Evans, P.R., and Owen, D.J. (2010). A large-scale conformational change couples membrane recruitment to cargo binding in the AP2 clathrin adaptor complex. Cell 141: 1220-1229.
33. Mrowiec, T, Schwappach, B. (2006). 14-3-3 proteins in membrane protein transport. Biological chemistry, 387(9): 1227-36.
34. Yuan, H., K. Michelsen, and B. Schwappach. (2003) 14-3-3 dimers probe the assembly status of multimeric membrane proteins. Curr. Biol. 13: 638–646.
35. Michelsen, K, Mrowiec, T, Duderstadt, KE, Frey, S, Minor, DL, Mayer, MP, Schwappach, B. (2006). A multimeric membrane protein reveals 14-3-3 isoform specificity in forward transport in yeast. Traffic, 7(7): 903-16.
36. Michelsen, K, Schmid, V, Metz, J, Heusser, K, Liebel, U, Schwede, T, Spang, A, Schwappach, B. (2007). Novel cargo-binding site in the beta and delta subunits of coatomer. The Journal of cell biology, 179(2): 209-17.
37. Rawet, M., Levi-Tal, S., Szafer-Glusman, E., Parnis, A., and Cassel, D. (2010). ArfGAP1 interacts with coat proteins through tryptophan-based motifs. Biochemical and biophysical research communications 394: 553-557.
38. Rhodes, G. (2000) Crystallography Made Crystal Clear. San Diego: Academic Press, 2nd edition.
39. Smyth, M.S. and Martin, J.H.J. (2000) X ray crystallography. J Clin Pathol: Mol Pathol 53: 8–14.
40. McPherson, A. (2004) Introduction to protein crystallization. Methods 34: 254–265.
41. Kanteev, M. (2007) M.Sc. thesis.
42. Mitchell, E., Kuhn, P. and Garman, E. (1999) Demystifying the synchrotron trip: a first time user’s guide. Structure 7: R111–R121.

43. Matthews, B.W. (1968) Solvent content of protein crystals. J. Mol. Biol. 33: 491-497.
44. Rossmann, M. G., D. M. Blow. (1962) The detection of sub-units within the crystallographic asymmetric unit. Acta Crystallogr. 15: 24-31.
45. Hendrickson, W.A. (1991) Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science 254: 51-58.
46. Krishna, M.H.M., Hendrickson, W.A., Orme-Johnson, W.H., Merritt, E.A., and Phizackerley, R.P. (1988) Crystal Structure of Clostridium acidi-urici Ferredoxin at 5Å Resolution Based on Measurements of Anomalous X-ray Scattering at Multiple Wavelengths. J. Biol. Chem. 263: 18430-18436.
47. Kartha, G. and Parthasarathy, R. (1965) Combination of multiple isomorphous replacement and anomalous dispersion data for protein structure determination. I. Determination of heavy-atom positions in protein derivatives. Acta Cryst. 18: 745-749.
48. Laemmli, UK (1970) Nature, 227: 680–685.
49. Bigay J., Antonny B. (2005) Real-time assays for the assembly-disassembly cycle of COP coats on liposomes of defined size. Methods Enzymol 404: 95–107.
50. Ababou, A., Ladbury JE. (2006) Survey of the year 2004: literature on applications of isothermal titration calorimetry. J Mol Recognit 19(1): 79-89.
51. Olsen, S.N. (2006) Applications of isothermal titration calorimetry to measure enzyme kinetics and activity in complex solutions. Thermochimica Acta 448(1): 12-18.
52. Thaller, C. et al. (1981) Repeated seeding technique for growing large single crystals of proteins. J Mol Biol. 147: 465-469.
53. Stura, E.A. & Wilson, I.A. (1990) Analytical and production seeding techniques. Methods 1: 38-49.
54. McPherson, A. & Shlichta, P. (1988) Heterogeneous and Epitaxial Nucleation of Protein Crystals on Mineral Surfaces. Science 239: 385-387.
55. http://reference.iucr.org/dictionary/Main_Page. The Online Dictionary of Crystallography, maintained by the Commission for Crystallographic Nomenclature of the International Union of Crystallography.
56. Yeates, T.O. and Fam, B.C. (1999) Protein crystals and their evil twins. Structure. 7: R25-R29.
57. David, L., Marx, A., Adir, N. (2011) High-Resolution Crystal Structures of Trimeric and Rod Phycocyanin. Journal of Molecular Biology 405(1): 201-213.
58. Kent, HM., Evans, PR., Schäfer, IB., Gray, SR., Sanderson, CM., Luzio, JP., Peden, AA., Owen, DJ. (2012) Structural Basis of the Intracellular Sorting of the SNARE VAMP7 by the AP3 Adaptor Complex. Dev Cell. 22(5): 979-88.

59. Kelley, L.A., Sternberg, M.J.E. (2009) Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 4: 363–371.
60. Hu, Y., Faham, S., Roy, R., Adams, M.W., Rees, D.C. (1999) Formaldehyde ferredoxin oxidoreductase from Pyrococcus furiosus: the 1.85 Å resolution crystal structure and its mechanistic implications. J. Mol. Biol. 286: 899–914.
61. Mechaly, A., Teplitsky, A., Belakhov, V., Baasov, T., Shoham, G., Shoham, Y. (2000) Overproduction and characterization of seleno-methionine xylanase T-6. J. Biotechnol. 78(1): 83-86.
62. Kelly, B.T., McCoy, A.J., Späte, K., Miller, S.E., Evans, P.R., Höning, S., Owen, D.J. (2008) A structural explanation for the binding of endocytic dileucine motifs by the AP2 complex. Nature 456(7224): 976-979.
63. Varadi, M., Kosol, S., Lebrun, P., Valentini, E., Blackledge, M., Dunker, A.K., Felli, I.C., Forman-Kay, J.D., Kriwacki, R.W., Pierattelli, R., Sussman, J.L., Svergun, D.I., Uversky, V.N., Vendruscolo, M., Wishart, D., Wright, P.E., Tompa, P. (2014) pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins. Nucleic Acids Res. 42: D326-D335.
64. Tompa, P., Fuxreiter, M. (2008) Fuzzy complexes: polymorphism and structural disorder in protein-protein interactions. Trends Biochem. Sci. 33: 2–8.
65. Edeling, M.A., Mishra, S.K., Keyel, P.A., Steinhauser, A.L., Collins, B.M., Roth, R., Heuser, J.E., Owen, D.J., Traub, L.M. (2006) Molecular switches involving the AP-2 beta2 appendage regulate endocytic cargo selection and clathrin coat assembly. Dev. Cell 10: 329–342.
66. Prilusky, J., Felder, C.E., Zeev-Ben-Mordehai, T., Rydberg, E., Man, O., Beckmann, J.S., Silman, I., Sussman, J.L. (2005) FoldIndex©: a simple tool predicts whether a given protein is intrinsically disordered. Bioinformatics 21: 3435-3438.
67. Kalisman, N., Adams, C.M., Levitt, M. (2012) Subunit order of eukaryotic TRiC/CCT chaperonin by cross-linking, mass spectrometry, and combinatorial homology modeling. Proc. Natl. Acad. Sci. USA 109(8): 2884-2889.
68. Tal, O. (2014) Investigations of the interactions leading to phycobilisome assembly. Ph.D. Thesis.
69. Boehm, M., Bonifacino, J.S. (2001) Adaptins: the final recount. Mol. Biol. Cell 12: 2907–2920.
70. Eugster, A., Frigerio, G., Dale, M., Duden, R. (2000) COP-I domains required for coatomer integrity, and novel interactions with ARF and ARF-GAP. EMBO J. 19: 3905–3917.

71. Faulstich, D., Auerbach, S., Orci, L., Ravazzola, M., Wegehingel, S., et al. (1996) Architecture of coatomer: molecular characterization of delta-COP and protein interactions within the complex. J. Cell Biol. 135: 53–61.
72. Fultz, B., Howe, J.M. (2007) Transmission Electron Microscopy and Diffractometry of Materials (Third Edition). Springer, Heidelberg.
73. Biasini, M. et al. (2014) SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. doi: 10.1093/nar/gku340.
74. Arnold, K., Bordoli, L., Kopp, J., Schwede, T. (2006) The SWISS-MODEL Workspace: A web-based environment for protein structure homology modelling. Bioinformatics 22: 195-201.
75. Kiefer, F., Arnold, K., Künzli, M., Bordoli, L., Schwede, T. (2009) The SWISS-MODEL Repository and associated resources. Nucleic Acids Res. 37: D387-D392.
76. Guex, N., Peitsch, M.C., Schwede, T. (2009) Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: A historical perspective. Electrophoresis 30(S1): S162-S173.
77. Ren, X. et al. (2013) Structural Basis for Recruitment and Activation of the AP-1 Clathrin Adaptor Complex by Arf1. Cell 152(4): 755–767.
78. Xu, X., Kedlaya, R., Higuchi, H., Ikeda, S., Justice, M.J., et al. (2010) Mutation in Archain 1, a Subunit of COPI Coatomer Complex, Causes Diluted Coat Color and Purkinje Cell Degeneration. PLoS Genet. 6(5): e1000956.
79. Hamamichi, S., Rivas, R.N., Knight, A.L., Cao, S., Caldwell, K.A., et al. (2008) Hypothesis-based RNAi screening identifies neuroprotective genes in a Parkinson's disease model. Proc. Natl. Acad. Sci. USA 105: 728–733.
80. Annaert, W., De Strooper, B. (2002) A cell biological perspective on Alzheimer's disease. Annu. Rev. Cell Dev. Biol. 18: 25–51.
81. Uemura, K., Kuzuya, A., Shimohama, S. (2004) Protein trafficking and Alzheimer's disease. Curr. Alzheimer Res. 1: 1–10.
82. DiFiglia, M., Sapp, E., Chase, K., Schwarz, C., Meloni, A., et al. (1995) Huntingtin is a cytoplasmic protein associated with vesicles in human and rat brain neurons. Neuron 14: 1075–1081.
83. Gil, J.M., Rego, A.C. (2008) Mechanisms of neurodegeneration in Huntington's disease. Eur. J. Neurosci. 27: 2803–2820.
84. Strehlow, A.N., Li, J.Z., Myers, R.M. (2007) Wild-type huntingtin participates in protein trafficking between the Golgi and the extracellular space. Hum. Mol. Genet. 16: 391–409.

85. Cooper, A.A., Gitler, A.D., Cashikar, A., Haynes, C.M., Hill, K.J., et al. (2006) Alpha-synuclein blocks ER-Golgi traffic and Rab1 rescues neuron loss in Parkinson's models. Science 313: 324–328.


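Structural investigations of proteins involved in the COPI complex

Avital Lahav

Structural investigations of proteins involved in the COPI complex

Research Thesis

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Avital Lahav

Submitted to the Senate of the Technion – Israel Institute of Technology

Kislev 5774 Haifa November 2014

The research was done under the supervision of Prof. Noam Adir in the Schulich Faculty of Chemistry.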

First, I thank Prof. Noam Adir for his great help, guidance and patience throughout the research.

Second, I thank Prof. Dan Cassel of the Faculty of Biology at the Technion for his collaboration and help with the research, and Dr. Anna Parnis of Prof. Cassel's laboratory for her help and patience.

I would also like to thank my friends in the laboratories of both Prof. Noam Adir and Prof. Dan Cassel, who made the lab a pleasant place to work and from whom I learned a great deal.

I thank the Technion – Israel Institute of Technology for its generous financial support of my studies.

I thank my beloved family: Yehuda and Ben, a single hug from whom makes everything look good. This thesis is dedicated to them.

To David Klein, who once again helps me reach new heights.

Abstract

The complex system of coordinated and regulated chemical reactions found in all living cells is called metabolism. Because the environment of most organisms changes constantly, the main role of this system of processes is to supply the energy needed to fuel and maintain the organisms' activities, which allow them to respond to their environment, to grow and to reproduce.

To understand the processes in the cell, one must describe how reactions are relayed and how molecular factors carry out specific roles within the cell, through the hierarchy of assemblies and higher-order structures. These characteristics and predictions can be tested with structural data obtained by X-ray crystallography. In this study we investigated two proteins that are part of a single system and that serve as tools for understanding important intracellular processes. Determining the three-dimensional structure of these proteins can provide the template for understanding the mechanism of this system, as well as of systems homologous to it, in order to design drugs for treating human diseases caused by damage to an important cellular process.

The vesicular system called COPI mediates the retrograde transport of proteins between Golgi compartments and from the Golgi to the endoplasmic reticulum (ER). The COPI coat protein complex is called coatomer. Coatomer is located on the coat of the vesicle membrane and sorts the proteins that must return into the forming vesicle. Release of the coat proteins from the vesicle depends on hydrolysis of the GTP molecule bound to Arf1. This hydrolysis is activated by auxiliary proteins called ArfGAPs. The hydrolysis releases Arf1 into the cytoplasm, which causes the release of coatomer from the membrane. Coatomer is composed of two subcomplexes: the first consists of the subunits β-, γ-, δ- and ζ-COPI, and the second consists of α-, β'- and ε-COPI.

The structure of the carboxy-terminal region of the γ-COPI subunit, termed the γ-COPI appendage (hereafter γCad), was previously determined by X-ray crystallography and shown to have a protein-protein interaction site on its "platform" part. In mammalian γ-COPI, this region was proposed to bind the flexible protein ArfGAP2. To understand how the COPI system functions, we sought to determine the three-dimensional structure of this interaction and to use it to examine which amino acids contribute to it. To this end, we found several crystallization conditions for co-crystallizing human γCad with the shortest fragment of the hydrolysis-activating auxiliary protein ArfGAP2 from rat that apparently still contains the region important for γCad binding. We collected several crystallographic data sets from crystals grown under different crystallization conditions and solved the three-dimensional structures of the protein from these data. However, in none of the structures we solved could we locate the ArfGAP2 peptide. The reasons may be the flexible nature of the ArfGAP2 peptide, or that the peptide did not bind and was never incorporated into the crystal. To ensure that the ArfGAP2 peptide would be part of any crystal that formed, we cloned several proteins containing γCad fused to ArfGAP2 segments of different lengths, located at the amino terminus. Fusion proteins of different sizes were cloned because two opposing requirements had to be balanced: on the one hand, the peptide should be short so that its flexible nature does not interfere with crystal formation; on the other hand, it should be long enough to reach the binding site. In the end we obtained three fusion proteins of different sizes, which we named "short", "medium" and "long". We carried out two experiments that demonstrated binding of the ArfGAP2 peptide to γCad within the "medium" fusion protein. In the SLS experiment, which follows light scattering from molecules and relies on the intact complex scattering more light than its parts, we showed that when the fusion protein of γCad with the ArfGAP2 peptide is added before full-length ArfGAP2, the light scattering decreases more gradually. This indicates that the removal of coatomer from the liposomes through hydrolysis on Arf1 occurred at a slower rate, because the ArfGAP2 peptide part of the fusion protein binds coatomer without the catalytic ability to accelerate hydrolysis, and because the γCad part of the fusion protein binds free ArfGAP2. In both cases the result is fewer free sites for binding to coatomer, and therefore the rate of coatomer removal from the liposomes also decreases. In the ITC experiment, which indicates the release or uptake of heat resulting from protein binding or dissociation, respectively, we showed that the γCad part of the fusion protein binds the ArfGAP2 part within it, based on an endothermic reaction that took place when the ArfGAP2 peptide was added to the measurement cell containing the fusion protein. Moreover, the affinity of the ArfGAP2 peptide for the fusion protein was lower than its affinity for γCad, because fewer sites were free for binding (owing to the binding of the γCad part of the fusion protein to its own ArfGAP2 part). We found several different crystallization conditions for the "medium" fusion protein, but the crystals obtained were needle-shaped and as such problematic for X-ray data collection. Nevertheless, we succeeded in collecting a single data set, from which we solved the three-dimensional structure of the protein at 2.8 Å resolution. In this structure we located, for the first time, electron density arising from a small part of the ArfGAP2 peptide adjacent to the binding site at amino acid W776 of γCad. This electron density was small and fit only three amino acids of the peptide, since the peptide is very flexible and probably only the amino acids involved in binding were immobilized enough to produce it. We therefore could not determine exactly which amino acids were bound there and placed three alanine residues into this density, but we showed that this site lies on the "platform" part of the γ-COPI appendage, which had previously been proposed as a candidate ArfGAP2-binding site on the basis of homology to the clathrin adaptor complex.

The δ-COPI subunit of coatomer is one of the two coatomer subunits for which no structural information existed. This subunit contains a region called the Mu Homology Domain (MHD), which binds ArfGAP1 for an unknown purpose; in addition, this region contains an area suspected of binding exposed arginine-based sequences for the recognition and sorting of proteins that must return to the ER in COPI vesicles. No homolog sufficiently close in sequence to the MHD of δ-COPI (hereafter δCM) could be found to solve the structure by molecular replacement, so an anomalous X-ray signal was needed to solve the structure by the SAD method. To this end, δCM was expressed in an E. coli strain lacking the amino acid methionine, in the presence of selenomethionine. We found several crystallization conditions for δCM, as well as for co-crystallization of δCM with the shortest designed peptide from ArfGAP1 that still contains the δCM-binding sequence, and with two designed peptides containing arginine sequences. We collected several data sets from the crystals and solved the three-dimensional structure of δCM at 2.15 Å resolution. The structure was deposited in the PDB under the code 4O8Q. Although in none of the structures we solved could we find electron density corresponding to the peptides co-crystallized with δCM, we were able to show for the first time, on the structure of δCM, the putative binding site of the arginine-based peptides. In addition, the positions of site-directed mutations in this region that cause physiological defects can now be shown. More generally, this work adds a previously unknown piece of the COPI structure, which can serve as a model for solving the structure of the β-COPI protein, which is bound to δ-COPI, and can aid in solving the entire coatomer complex by TEM, with the aim of understanding how the whole complex functions.
