
Single Proteins under the Microscope: Conformations, Dynamics and Medicinal Therapies

by

Baoxu Liu

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of Physics University of Toronto

© Copyright by Baoxu Liu 2013

Single Proteins under the Microscope: Conformations, Dynamics and Medicinal Therapies

Baoxu Liu

Doctor of Philosophy, 2013

Graduate Department of Physics University of Toronto

Abstract

We applied single-molecule fluorescence (SMF) methods to probe the properties of individual fluorescent probes and to characterize the proteins of interest to which these probes were attached. One remarkable advantage of SMF spectroscopy is the ability to investigate heterogeneous subpopulations, which are hidden by ensemble averaging in other measurements. Another advantage is the ability to follow the entire sequence of transitions of a single molecule between different conformational states.

To enable extended observation of single molecules while maintaining their native nanoscale surroundings, we developed an improved vesicle preparation method for encapsulating scarce biological samples. SMF investigations revealed that molecules trapped in vesicles exhibit nearly ideal single-emitter behavior, which recommends vesicle encapsulation for reproducible and reliable SMF studies.

Hyperactive Signal-Transducer-and-Activator-of-Transcription 3 (STAT3) contributes significantly to human cancers, such as leukemia and lymphoma. We have proposed a novel therapeutic strategy by designing a cholesterol-based protein membrane anchor (PMA) to tether STAT3 to the cell membrane and thus inhibit unwanted transcription at the nucleus. We designed in vitro proof-of-concept experiments by encapsulating STAT3 and PMAs in phospholipid vesicles. The efficiency and the stability of STAT3 anchoring in the lipid membrane were interrogated via quantitative fluorescence imaging and multiparameter SMF spectroscopy. Our in vitro data paved the way for the in vivo demonstration of STAT3 inhibition in live cells, thus demonstrating that PMA-induced protein localization is a conceptually viable therapeutic strategy.

The recent discovery of intrinsically disordered proteins (IDPs) highlights important exceptions to the traditional structure-function paradigm. SMF methods are well suited for probing the properties of such highly heterogeneous systems. We studied in detail the effects of electrostatics on the conformational disorder of an IDP, Sic1 from yeast, and found that electrostatic repulsion is a major factor controlling the dimensions of Sic1. Based on our data we also conclude that a rod-like shape seems a better candidate than a random Gaussian chain to describe and predict the behavior of Sic1.


Acknowledgements

Here I would like to sincerely thank all the people who supported, helped and assisted me in the past six years, on my way to obtaining my PhD degree. Because of you I had a wonderful time during my study at the University of Toronto, harvesting not only academic achievements, but also friendship and happiness.

My special thanks go to Prof. Claudiu Gradinaru. It was great to have Claudiu as supervisor, as he wished to teach me in detail the equipment and the methods used in the lab. His deep scientific knowledge, his common touch with students, his wise and sharp guidance and his continuous encouragement have made my progress and this work possible. I have to thank him again here for being my “free” English teacher, by spending a generously large amount of time correcting the writing of my papers and thesis and improving my oral presentation skills.

I have to admit that having Amir Mazouchi as a friend and colleague was something truly enjoyable. His decision to join the group in 2007 made me no longer the “lonely wolf” in the lab. He was always there when you needed him, no matter how busy he was with his own stuff. I want to thank him again for sharing the “pain” of optimizing the optical microscope and of taking the daily shuttle bus to the lab.

I would like to thank Dr. Sergei Musikhin for introducing me to the equipment and helping with building the first microscope in the lab. Sergey, your excellent guidance on everything optics is highly appreciated.

I would like to thank all the previous and current lab members in the Gradinaru lab, for creating a nice working atmosphere in the lab, and for sharing the lab tasks. Particularly, I would like to thank Sangkyu Lee, Masood Samim, Abdullah Bahram, Daniel Badali, Jinkyue Ree and Darius Chia, for assisting me in various research projects. All your hard work and contributions in effort and time are highly appreciated.
I would also like to acknowledge the other members of our group: Douglas Goh, Maxim Prigozhin, Denys Marushchak, Susanne Scharf, Subin Yoo, Daniel Jacobs, Yuchong Li, Zhenfu Zhang, Jordan Rebelo, Anthony Pugliese and Amanda Leung. For all of you, there is one saying: “it’s a fate to meet”. It was my good fortune to get to know so many nice people like you.

I would also like to express my heartfelt gratitude to our research collaborators. I thank Professors Patrick Gunning, Peter Macdonald and Julie Forman-Kay, and Neil MacKinnon, Miriam Avadisian, Veronika Csizmok and Qasim Saleem: without your constructive suggestions and the valuable materials and samples you provided, the research progress presented here would not have been possible. I would like to thank Professors Virginijus Barzda and David McMillen for kindly agreeing to be on my supervisory committee in the past five years, and for providing important guidance all through my PhD years. I would like to thank the Department of Physics at the University of Toronto, and the “CIHR Training Program in Protein Folding and Interaction Dynamics: Principles and Diseases” for the financial support. I would like to thank my former supervisors Professor Zengming Zhang and Professor Zejun Ding at the University of Science and Technology of China. Thank you for your guidance and patience in my undergraduate thesis writing, and for providing opportunities for my first research experience in your lab. I would also like to thank Professor Bicai Pan (and Professor Zengming Zhang, Professor Zejun Ding again) for providing me with reference letters when I was applying for graduate school at the University of Toronto. Last but not least, I would like to thank my family and friends for their continuous support over the past six years, particularly my mother Yueling Chen, my father Xinan Liu and my wife Zijuan Zhang. Without your enormous and selfless love and support, I would not be where I am today. Finally I would like to thank God for making a great arrangement for me to meet so many nice people. Support from all of them defines what I am today.


Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Thesis Outline

Chapter 1: Introduction
1.1. Proteins
1.2. Fluorescence Spectroscopy & Microscopy in Protein Research
1.3. Principles of Fluorescence Spectroscopy
1.3.1. Fundamentals of Fluorescence
1.3.2. Fluorescence Spectra
1.3.3. Fluorescence Lifetime and Quantum Yield
1.3.4. Confocal Fluorescence Microscopy
1.3.5. Total Internal Reflection Fluorescence Microscopy
1.3.6. Förster Resonance Energy Transfer
1.3.7. Fluorescence Correlation Spectroscopy
1.3.8. Fluorescence Polarization and Anisotropy
References

Chapter 2: Detection and Analysis of Single-Molecule Fluorescence
2.1. Confocal Laser Scanning Microscope
2.1.1. Optical Layout of the Microscope
2.1.2. Signal Detection: The Photon Counting Apparatus
2.2. Total Internal Reflection Fluorescence Microscope
2.3. Data Acquisition and Analysis Software
2.3.1. Hardware Connection Flow Diagram
2.3.2. Data Acquisition Software
2.4. Fluorophore and Filter Selections
2.5. Coverslip Cleaning and Surface Modifications
2.5.1. Plasma Cleaning Protocol
2.5.2. Surface Coating with BSA(-biotin)
2.5.3. PEG Surface Coating
2.6. Protein Labeling and Purification
2.6.1. Amine Group Reactive Cross-Linking Chemistry
2.6.2. Cysteine Reactive Cross-Linking Chemistry
2.6.3. Size Exclusion Chromatography
2.6.4. Degree of Labeling Measured by Mass Spectroscopy
2.7. Single Molecule Fluorescence Spectroscopy
2.7.1. Single Molecule FRET Burst Measurements
2.7.2. Single Molecule Measurements using Encapsulation
References

Chapter 3: A New Liposome Preparation Method for Biomolecule Encapsulation
3.1. Introduction
3.2. Materials and Methods
3.2.1. Materials
3.2.2. Liposome Preparation Methods in Literature
3.2.3. Liposome Preparation Protocol
3.2.4. Liposome Size Measurements
3.2.5. Fluorescence Polarization
3.3. The Size Distribution of Liposomes
3.4. Effective Encapsulation Method for Scarce Biological Samples
3.5. Single Molecules: Vesicle Trapping vs. Direct Immobilization
3.5.1. Number of Emitting States
3.5.2. Variance of Single-Molecule Signals: Noise Analysis
3.5.3. Multiparameter Histograms
3.6. FT-Induced Changes of Fluorescence
3.7. FT Effects on Protein Functionality
3.8. Conclusions
References

Chapter 4: New Cancer Therapies: Membrane Anchorage of the STAT3 Protein
4.1. STAT3 Protein Structure
4.2. STAT3 Pathway and Inhibition
4.3. Materials and Methods
4.3.1. Materials
4.3.2. Synthesis of Protein-Membrane-Anchor
4.3.3. Labeling STAT3 with a Fluorescent Dye
4.3.4. Liposome Preparation
4.3.5. Supported Lipid-Bilayer
4.3.6. Fluorescence Setup
4.3.7. STAT3 Fluorescence Polarization (FP) Binding Assays
4.4. PMA Structures and STAT3-Binding Characterization
4.4.1. The Structures of PMA Derivatives
4.4.2. Binding of PMAs to the STAT3 Protein
4.4.3. Quantitative Fluorescence Labeling of STAT3
4.4.4. Characterization of F-NAc Marker
4.5. STAT3 Distribution inside Large Liposomes
4.6. Direct Observation of STAT3 Anchoring on a Supported Lipid Bilayer
4.7. STAT3-Inhibitor Binding Observed at SM Level
4.8. STAT3 Inhibition In Cells
4.9. Summary
References

Chapter 5: Electrostatics Tweaks the Conformational Properties of Intrinsically Disordered Proteins
5.1. Introduction
5.2. Materials and Methods
5.2.1. Materials
5.2.2. Quantitative Labeling of Sic1
5.2.3. SMF Spectroscopy Experiments
5.3. Information about Sic1 Obtained from Bulk Measurements
5.3.1. Multi-Phosphorylation Dependent Sic1-Cdc4 Recognition
5.3.2. Sic1 Conformers: ENSEMBLE Calculations
5.4. smFRET Measurements
5.4.1. Sic1 Labeling and Quantification
5.4.2. Sic1 Size Distribution is Broad Under Physiological Conditions
5.4.3. Sic1 Conformations in KCl
5.4.4. Sic1 Conformations in GuHCl
5.4.5. Mechanisms of Sic1 Charge Screening by KCl and GuHCl
5.4.6. The Nature of Sic1 in KCl and GuHCl
5.5. Time-Resolved Anisotropy: Sic1 is Rod-Like
5.6. Sic1 Conformational Fluctuations
5.7. Summary and Future Work Plans
References

Statement of Contribution

List of Publications


List of Tables

Chapter 3

Table 1: Number of Emitting States of Single TMR-DNA Molecules Encapsulated in Vesicles

Table 2: Noise Analysis of Intensity-Time Trajectories of Single TMR-DNA Molecules Encapsulated in Vesicles vs. Directly Immobilized on Surface

Chapter 5

Table 1: Sic1 Anisotropy Decay Fitting Results


List of Figures

Chapter 1

Figure 1.1-1: The Four Levels of Molecular Organizations of STAT3 Protein
Figure 1.2-1: Sub-Millisecond Kinetics of Cyt. c
Figure 1.2-2: Live Cell Imaging of Subcellular Distribution of STAT3-CY Fusion Proteins in HepG2 Cells
Figure 1.2-3: Single-Molecule Energy Transfer Efficiency Histograms of CspTm in the Presence of Chemical Denaturant Guanidinium Hydrochloride
Figure 1.3-1: The Jablonski Diagram
Figure 1.3-2: Excitation and Emission Spectra of Alexa555
Figure 1.3-3: Schematic Representation of TIRFM
Figure 1.3-4: Interdye Distance Dependence of the Energy Transfer Efficiency
Figure 1.3-5: Application of FRET-FCS in Quantifying Structural Dynamics of α-Synuclein
Figure 1.3-6: The Principles of Photoselection and Fluorescence Depolarization

Chapter 2

Figure 2.1-1: Optical Layout of the CLSM Setup
Figure 2.1-2: PicoHarp 300 and Router Connections used in the CLSM
Figure 2.2-1: Optical Layout of the TIRFM Setup
Figure 2.2-2: Wide-Field and Total-Internal-Reflection Mode of the TIRFM
Figure 2.3-1: Flow-Diagram of Apparatus used in the CLSM
Figure 2.3-2: The User Interface of the Data Acquisition Software
Figure 2.4-1: Front Panel of the SpectraViewer Application
Figure 2.6-1: Coupling Principle of Amine-NHS and Thiol-Maleimide Cross-Linking Chemistries
Figure 2.6-2: Principle of Size Exclusion Chromatography
Figure 2.7-1: smFRET Burst Experiments
Figure 2.7-2: Graphic User Interface of the Burst Identification and Analysis Software

Chapter 3

Figure 3.3-1: Wide-Field Images and Size Distributions of Vesicles Prepared with and without Freeze-Thaw Cycles
Figure 3.4-1: Liposome Preparation Scheme
Figure 3.4-2: Average Number of Molecules Trapped per Vesicle as a Function of the Amount of Sample
Figure 3.5-1: Intensity-Time Trajectories of Single TMR-DNA Molecules
Figure 3.5-2: STD/SRM of Single Molecule Intensity-Trajectory as Function of Binning Time
Figure 3.5-3: Multi-Parameter Histograms from Single-Molecule Data Acquired for TMR-DNA
Figure 3.6-1: Single-Molecule FP Distribution Measured on Single Cy3-ssDNA
Figure 3.7-1: FT Effects on STAT3 Functionality

Chapter 4

Figure 4.1-1: Domain Structure of the STAT3 Monomer
Figure 4.2-1: Signaling Pathway and Proposed Inhibition Scheme of STAT3
Figure 4.4-1: Design of PMA Molecules
Figure 4.4-2: Chemical Structures of PMA Derivatives
Figure 4.4-3: STAT3-Peptide Binding Measured by FCS and FA
Figure 4.4-4: FP Binding Measurements of STAT3-PMA 3 and STAT3-PMA 1
Figure 4.4-5: Quantitative Fluorescence Labeling of STAT3
Figure 4.4-6: 2D Chemical Structure of F-NAc
Figure 4.4-7: pH Dependence of Emission Spectra of F-NAc
Figure 4.4-8: Photobleaching Kinetics of F-NAc Compared with FITC
Figure 4.5-1: STAT3 Distribution inside Liposomes
Figure 4.5-2: Time-Lapse Imaging of STAT3 inside Liposomes
Figure 4.6-1: TIRF Image of Adsorbed STAT3 on a Supported Lipid Bilayer
Figure 4.7-1: Apparent Size Distribution of Vesicles Prepared via Extrusion Through a 0.1 μm Porous Membrane
Figure 4.7-2: Polarized Intensity-Time Trajectory Recorded for Single Pair of PMA and STAT3 Trapped Inside a Surface Immobilized Liposome
Figure 4.8-1: EMSA Analysis of the Inhibition of STAT3-STAT3 Dimerization by PMA
Figure 4.8-2: Inhibition of the EGF-Induced Nuclear Translocation of STAT3 by PMA
Figure 4.8-3: Induction of STAT3 Membrane Anchorage and Inhibition of the Nuclear Translocation of STAT3 by PMA

Chapter 5

Figure 5.1-1: Sic1 in Budding Yeast Cell Life Cycle
Figure 5.4-1: MS Spectrum of Sic1 and Fluorescently Labelled Sic1
Figure 5.4-2: REE Distribution of Sic1 Conformers from ENSEMBLE Calculations and Expected/Measured EFT Histogram in smFRET Measurements
Figure 5.4-3: Sic1 Conformations in KCl
Figure 5.4-4: Sic1 Conformations Measured Using a Different FRET Pair
Figure 5.4-5: EFT Correlation Map
Figure 5.4-6: Dependence of the End-To-End Distance and the Hydrodynamic Radius on KCl Concentration
Figure 5.4-7: smFRET Histograms of Sic1 at Various Concentrations of GuHCl
Figure 5.4-8: Dependence of the End-To-End Distance and the Hydrodynamic Radius on GuHCl Concentration
Figure 5.4-9: smFRET Histograms of End-Labelled Sic1 Obtained in the Presence of SDS
Figure 5.4-10: Dependence of Gyration Radius of Sic1 Fitted to a Polyelectrolyte Binding Model
Figure 5.4-11: Ratio of Rh over REE for Cluster #3 of Sic1 as a Function of KCl or GuHCl Concentrations
Figure 5.5-1: Conformational Properties of Sic1 Further Investigated by Time-Resolved Fluorescence Anisotropy
Figure 5.6-1: Conformational Dynamics of Sic1 Investigated by FRET-FCS


List of Abbreviations

2D Two-dimensional
3D Three-dimensional
AOTF Acousto-optical tunable filter
CCD Charge coupled device
CF Carboxyfluorescein
CLSM Confocal Laser Scanning Microscope
CPS Counts per second
CW Continuous wave
DMPE 1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine
DNA Deoxyribonucleic acid
E Efficiency
EFT Fluorescence resonance energy transfer efficiency
EMSA Electrophoretic mobility shift assay
FA Fluorescence anisotropy
FCS Fluorescence correlation spectroscopy
FP Fluorescence polarization
FRET Förster/Fluorescence resonance energy transfer
FT Freeze-thaw
FWHM Full-width at half-maximum
GFP Green fluorescent protein
GdmHCl Guanidinium hydrochloride
IRF Instrument response function
KPS Potassium persulfate
MS Mass spectroscopy
NA Numerical aperture
nm Nanometer
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate buffered saline
PC Personal computer
PDB Protein data bank
PMA Protein membrane anchor
PMT Photomultiplier tube
POPC 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine
PSF Point spread function
REE End-to-end distance
Rg Gyration radius
Rh Hydrodynamic radius
ROI Region of interest
SAXS Small angle X-ray scattering
SH2 Src Homology 2
SHG Second harmonic generation
SMD Single molecule detection
SMF Single molecule fluorescence
smFRET Single molecule FRET
SPAD Single photon avalanche diode
SYNC Synchronization
TCSPC Time-correlated single photon counting
TIR Total internal reflection
TIRF Total internal reflection fluorescence
TIRFM Total Internal Reflection Fluorescence Microscope
TRFA Time-resolved fluorescence anisotropy
TTL Transistor-transistor logic
TTTR Time-tagged time-resolved
UV Ultraviolet
WF Wide field
YFP Yellow fluorescent protein


Thesis Outline

Chapter 1 first describes the fundamental concepts of protein molecules, and the importance of understanding the structure-function relationship in protein research to shed light on a number of diseases. Next, an overview of fluorescence spectroscopy and microscopy is given, with applications in protein research, including intrinsic fluorescence and protein folding investigated at bulk and at single-molecule levels. To help the reader further understand the applications, the basic principles of fluorescence spectroscopy are presented in some detail.

Chapter 2 introduces the instruments built in the Gradinaru lab and the custom-written data acquisition and analysis software for single-molecule studies. First, the optical configuration of the two fluorescence microscopes, a confocal laser scanning microscope and a total internal reflection microscope, is described in detail. Second, the data acquisition and analysis software that I developed to control all the hardware components and to analyze and fit the data is discussed. This section also presents several LabView applications, which I wrote to facilitate the selection of fluorophores and fluorescence filters, and the calculation of the Förster radius and of the correction factors needed for the energy transfer efficiency. Third, the chapter describes the experimental protocols used in this thesis, such as coverslip cleaning and surface modifications, protein labelling and purification, and mass spectroscopy. The last part describes the experimental procedures for performing single molecule fluorescence spectroscopy, including the “burst” mode and the “immobilization” mode. For single molecule FRET (smFRET) experiments, a LabView application was programmed to perform burst identification and to analyze the data of each burst. The strategy and software interface of this application are discussed in detail.

Chapter 3 describes a novel method for vesicle encapsulation of valuable biological materials for single molecule immobilization experiments. First, the size distribution of vesicles prepared with and without freeze-thaw is characterized. A relatively high encapsulation efficiency is achieved with our method using much less biological material. Second, the behavior of single molecules encapsulated in vesicles is quantified in detail, showing near-ideal single-emitter behavior compared to molecules directly immobilized on the coverslip surface. Finally, protein denaturation and fluorophore photophysical changes caused by freeze-thaw are identified and characterized.

In Chapter 4, inhibition of an oncogenic protein is investigated using fluorescence spectroscopy and imaging. The structure of this protein, signal transducer and activator of transcription 3 (STAT3), is first described, followed by a description of its role in signalling pathways in both normal and oncogenic cells. To inhibit the oncogenic functionality of the STAT3 protein, a novel inhibition scheme using protein membrane anchorage is proposed and discussed. Several peptide-based protein membrane anchors (PMAs) are synthesized and their binding affinities for the target protein STAT3 are quantified by fluorescence techniques. A novel fluorescent linker used in the fluorescent PMA is specially designed for fluorescence imaging applications and its photophysical properties are carefully characterized. Next, the in vitro demonstration of the PMA-based inhibition scheme is discussed, using a liposome model system that I designed. STAT3 anchorage on a lipid bilayer containing various types of PMA molecules is observed at the single molecule level, and the inhibition potency of the PMAs is quantified. Single molecule fluorescence polarization experiments further probe the strength of the STAT3-PMA interactions. The last section describes the in vivo demonstration of STAT3 membrane anchorage using PMA in cells, showing that PMA-induced protein localization is a conceptually viable therapeutic strategy. A more quantitative SMF investigation of the inhibitor-STAT3 interaction using both the vesicle encapsulation and the supported lipid bilayer approaches is currently being performed in the Gradinaru lab by a new graduate student. Further studies to determine the biochemical and biological utility of more drug-like, nonphosphorylated PMAs are also ongoing.

In Chapter 5, single molecule fluorescence spectroscopy techniques are used to probe the conformational properties of an intrinsically disordered protein, Sic1. The chapter first describes the conformational properties of Sic1 obtained from NMR and other bulk spectroscopy methods, and indicates the need for independent approaches to probe these conformational properties at the single molecule level. Fluorescent labeling of Sic1 and quantification of the degree of labeling using mass spectroscopy are then discussed. The following sections describe how smFRET burst measurements are used to investigate the conformational properties and chain flexibility of Sic1 under varying concentrations of KCl and GuHCl. The nature of charge screening in the presence of KCl and GuHCl is then investigated by comparison with SDS, whose protein-interacting characteristics have been studied thoroughly in the literature. Time-resolved fluorescence anisotropy measurements on Sic1 are then performed at appropriate salt concentrations to further probe Sic1’s chain flexibility. In the last section, FRET-FCS analysis on Sic1 is used to probe the time constants of the Sic1 intrachain conformational fluctuations, which occur on both ultrafast (10-100 ns) and slow (10-100 ms) time scales. The slow conformational fluctuations will be further investigated in single molecule immobilization experiments, to obtain valuable information such as interchanging pathways and timescales between Sic1 conformers. Future smFRET and FRET-FCS studies on pSic1 and pSic1-Cdc4 complex systems are also ongoing, to understand the impact of multi-phosphorylation on Sic1’s conformational properties and interactions with Cdc4.

Introduction


1.1 Proteins

Proteins are the material basis of life and are vital to all organisms, whether animals, plants or bacteria. Approximately 80% of the dry mass of a cell is made of protein1, and almost all life activities are carried out by numerous types of proteins. From bacteria to higher organisms, from metabolism to reproduction, protein activities are involved everywhere. Ever since proteins were found to be fundamental to life forms in the middle of the 20th century, numerous efforts have been made to answer fundamental questions about proteins, including their composition2, structure3, function and, more importantly, their sequence-structure4-6 and structure-function7, 8 relations. The composition of proteins is rather simple: all natural proteins are made of combinations of up to 20 α-amino acids. In contrast, the structures and functions of proteins are highly diverse and complicated. The human body alone is estimated to contain over two million types of proteins9, involved in most life activities. From a physical point of view, proteins are just molecules, and it should be possible to probe all of their physical properties using physical methods. As early as the 1930s, J.D. Bernal at Cambridge University in the UK pointed out that protein folding was a physical process and could be carried out in vitro10. The groundwork for protein folding kinetics was laid 35 years later, when the famous experiments of C.B. Anfinsen et al. showed that a protein’s complicated conformational/functional information is encoded in its primary amino acid sequence6. Anfinsen’s discovery gave scientists confidence that predicting protein structures from their amino acid sequence (primary sequence) is possible.
Together with the discovery that a specific structure of a globular protein is critical for its biological function7, 11-13, this suggested that all physical and biological properties of a protein could be derived solely from its amino acid sequence. Problems such as how proteins avoid collapsing into an enormous number of useless folds, how exactly a sequence of amino acids determines a specific 3D topology, and how uniquely this structure is linked to biological activity have remained the most essential questions in protein science ever since. Over the last century, with contributions from many scientists, protein sequence identification has become a standard lab technique. However, despite theoretical developments and increases in computational power, protein structure prediction based on the amino acid sequence has made only limited progress, and the prediction of protein function even less. Because of continuous improvements in techniques and instruments, such as X-ray crystallography, NMR, electron diffraction, AFM, etc., more than 83,000 protein structures (as of August 2012) have been documented in the Protein Data Bank. Based on these solved protein structures, a traditional protein structure paradigm has been established14, in which proteins are described as having four levels of structure and a stable and specific 3D conformation that defines their function. The four levels are named primary structure, secondary structure, tertiary structure and quaternary structure, respectively, and the functional 3D fold is called the native state. As an example, the four levels of molecular organization for the Signal Transducer and Activator of Transcription 3 (STAT3) protein are summarized in Figure 1.1-1.

Figure 1.1-1. The four levels of molecular organization illustrated for the STAT3 protein. (A) The primary structure is simply a linear sequence of amino acids joined via peptide bonds; (B) Part of STAT3’s secondary structure, containing a few α-helices and β-sheets which are connected via turns or loops; (C) Spatial organization of secondary structures makes a tertiary structure containing domains of specific functions as described in the text; (D) Quaternary structure of STAT3, which contains two STAT3 monomers interacting with each other at the top in the Src Homology 2 (SH2) domains.

The primary structure is simply a linear polymer made of amino acids which are connected by peptide bonds (polypeptide). Anfinsen’s measurements indicate that the primary structure encodes the information necessary for the three more complicated levels of structure that follow. Hydrogen bonds formed among the backbone peptide groups and hydrophobic interactions involving side chain groups cause proteins to assume local, highly ordered patterns which make up the secondary structure (e.g. helices). Secondary structure comes in many forms, with two predominant types, the α-helix and the β-sheet, usually connected by turns or loops. The assembly of secondary structures in a typical 3D arrangement makes up the tertiary structure, which is believed to determine the protein’s specific function. In proteins such as STAT3, the tertiary structure (Figure 1.1-1 C) clearly shows domains consisting of specific arrangements of secondary structures, such as the 4-helix domain and a following DNA-binding domain. Each domain may have a different purpose and overall they work together to fulfill the specific functions of the protein. In brief, the helical bundle structure is found to be essential for the interaction of STAT3 with several other proteins15, in which its extended dimension may facilitate protein recognition. The DNA binding domains in a STAT3 dimer behave like a clamp, contacting double-stranded DNA in the cell nucleus on both the major and minor grooves. The function of the connector domain in STAT3 still remains unknown, although the overall structure shows high similarity with calcium-binding domains as found for example in troponin C.16 The SH2 domain and a tyrosine at position 702 (Y702) are the most essential components for STAT3 dimerization and its subsequent role in signal transduction and transcription.
Interaction of the phosphorylated tyrosine with the SH2 domain on another STAT3 leads to STAT3 dimerization and subsequent nuclear translocation and signal transcription16. At an even higher level of complexity, many proteins function upon the assembly of several (non)identical subunits. The resulting architecture is called the quaternary structure. In the case of STAT3, a homodimer (Figure 1.1-1 D) is the functional unit that transmits extracellular information from membrane receptors to the cell nucleus. There, the STAT3 dimer clamps the DNA at specific recognition sites and activates transcription, an action which cannot be performed by a STAT3 monomer. Given the complexity of their structures, it is not surprising that proteins do not always fold correctly17. Incorrect folding and “disorder states” in proteins usually lead to protein malfunctions and even diseases18. For instance, in recent years, an increasing number of diseases have been recognized to be caused by protein misfolding and disorders18, including Parkinson’s disease, Alzheimer’s dementia and Huntington’s chorea. Therefore, having a clear picture of the structure not only helps in understanding the proper function of a protein in biological processes but, more importantly, helps in identifying the causes of protein-related diseases and provides clues for designing drugs to disrupt the disease-causing function19. In general, disease-related proteins could potentially be targeted at various domains which are involved in different disease stages. Take the cancer-related STAT3 protein for example. One could potentially target its SH2 domain, the DNA-binding domain or the transactivation domain to inhibit its role in cancer. A study of STAT3 inhibition by targeting the SH2 domain will be discussed in Chapter 4, based on a thorough understanding of STAT3’s structural properties and the associated functions of each domain.
The above-mentioned protein structure-function paradigm seems reasonable and is supported by examples20, 21 involving diverse physiological processes. However, the recent discovery of intrinsically disordered proteins (IDPs) highlights some important exceptions14, 22. Unlike traditionally defined globular proteins, IDPs lack stable tertiary structures under physiological conditions. Although proteins containing unstructured sequences or domains under physiological conditions had been identified as early as the late 1970s by X-ray crystallography23, 24 and NMR spectroscopy25, the presence of intrinsic disorder in proteins was unfortunately ignored for a long time26. Scientists instead preferred other interpretations of the unusual observations, such as “artifacts in experiment”26. With the appearance of more high-performance NMR machines, more and more proteins and/or protein domains were found to be disordered, and over the last decade scientists have begun to seriously investigate the properties and functions of IDPs as a distinct category27. Despite the lack of stable structure, IDPs are found to be very well represented in eukaryotic organisms and involved in numerous cellular processes such as protein-protein binding28, molecular recognition29, signalling30, etc. Interestingly, many IDPs are found to undergo disorder-to-order transitions28 upon binding to folded partner proteins in order to perform biological functions. To date, due to their extraordinarily broad conformational heterogeneity and complex functionality, the biology of most IDPs is still poorly understood. Their structural flexibility and plasticity pose both challenges and motivations for scientists to uncover their role in biological processes.

1.2 Fluorescence Spectroscopy & Microscopy in Protein Research

Fluorescence spectroscopy is a very effective biophysical method for probing protein conformational properties and functionalities31. High sensitivity and selectivity, as well as the ability to use very small sample amounts, make it very competitive. Unlike other methods such as X-ray crystallography32, in which the protein sample has to be crystallized, fluorescence spectroscopy probes proteins in aqueous conditions that resemble their native environment, which limits artifacts due to environmental changes. It provides several physical parameters31, such as the emission spectrum, quantum yield, lifetime and anisotropy, which are sensitive indicators of protein conformational states and of their variations under specific experimental conditions. Long after the phenomenon of fluorescence was first observed by Nicolás Monardes on wood extracts in the 16th century31, the first fluorescence microscope was constructed in 1911 by Otto Heimstädt33, who successfully observed bacteria by making use of their autofluorescence. At that time, due to the reliance on the autofluorescence of the imaged object and the difficulties in separating fluorescence from excitation light, even Heimstädt was not convinced how and when the fluorescence microscope would broaden the applications of microscopic imaging33. Fluorescence microscopy has nevertheless improved extraordinarily over the past 100 years. The discovery of secondary fluorescence and the development of the epi-fluorescence microscope33 removed any doubt about the feasibility of fluorescence microscopy applications. Later developments of the laser34, dichroic optical coatings, advanced signal detectors and the concepts of confocal35, total internal reflection36 and two-photon excitation microscopy37 have made the fluorescence microscope a ubiquitous and versatile instrument in the life sciences. 
In parallel, various spectroscopy techniques were developed in combination with advanced fluorescence microscopy, including fluorescence correlation spectroscopy (FCS)38, 39, Förster Resonance Energy Transfer (FRET)31, Fluorescence Anisotropy (FA)40, etc. In the study of proteins, fluorescence spectroscopy was first applied to the intrinsic fluorescence of proteins, which was used to probe their conformational status and its variation in a given environment. Intrinsic fluorescence mainly derives from the naturally fluorescent amino acid tryptophan (Trp), to a lesser extent from tyrosine (Tyr), and even less from phenylalanine (Phe). Trp and Tyr are generally excited near 280 nm and fluoresce at longer wavelengths, from about 300 nm (Tyr) to about 350 nm (Trp in water). These three aromatic amino acids have different quantum yields (and therefore different brightness), and their emission, that of Trp in particular, is sensitive to the local environment: the Trp spectrum is typically blue-shifted when the residue is buried in the hydrophobic interior of the folded protein and red-shifted when it is exposed to water. Therefore the amplitude and the profile of the fluorescence emission spectrum can serve as a probe of the folded (functional) state of the protein. This principle is also widely applicable in probing rates of conformational changes. Consider cytochrome (cyt.) c as an example: native cyt. c has a sole Trp59 residue located within 1 nm of a heme iron, which acts as a quencher41. When cyt. c transitions from the unfolded to the folded state, its intrinsic Trp fluorescence is quenched by the heme group, to an extent that depends on the overall compactness of the protein scaffold. Figure 1.2-1 shows fluorescence data obtained by using a continuous-flow method to probe the folding-unfolding time constants of cyt. c41. The cyt. c sample is first prepared in acidic conditions (pH 2.0, unfolded state). The sample is then rapidly diluted to pH 4.5, which triggers the folding of cyt. c.

Figure 1.2-1: Sub-millisecond kinetics of cyt. c with an initial state (pH = 2, 22 °C) diluted to pH = 4.5. The recorded continuous-flow folding kinetics are shown on a linear time plot. Fitting the curve with a bi-exponential decay function yields two time constants: 57 μs and 454 μs. Reproduced with permission from reference41, Figure 14.

One drawback of intrinsic protein fluorescence is that Phe, Tyr and Trp have relatively low extinction coefficients for absorption and low quantum yields for emission. Their fluorescence spectra span only a narrow ultraviolet range, where most modern detectors generally have a poor detection efficiency, and where excitation sources are limited to arc lamps. The invention of bright, photostable fluorescent labels (fluorescent proteins and organic dyes) has greatly expanded fluorescence micro-spectroscopy applications by spanning the visible and infrared spectrum and enabling the visualization of non-fluorescent substances. Labeling a non-fluorescent protein with these fluorescent labels is an important and valuable way to visualize its structural and functional properties. To date, extrinsic fluorescent labels have been widely used in numerous protein studies, e.g. to probe folding/unfolding dynamics42, to characterize protein-protein interaction and aggregation43, to determine conformational states and dynamics, and to monitor enzymatic activities. The extrinsic fluorescent labels mainly fall into two categories: fluorescent proteins (FPs), which can be genetically co-expressed in living cells, and fluorescent chemical compounds, which usually contain aromatic ring groups or several conjugated π bonds in their structure. The most famous FP is the green fluorescent protein (GFP), which was first cloned from the jellyfish Aequorea victoria44, 45 in 1992. The Nobel Prize in Chemistry 2008 was awarded jointly to Osamu Shimomura, Martin Chalfie and Roger Y. Tsien for the discovery and development of GFP, including a library of GFP variants which now spans the entire visible spectrum46. It is hard to overstate the importance of the first cloning of GFP45 in 1992 and of its subsequent use as a marker for in vivo labelling47 in 1994. 
Compared to other extrinsic fluorescent tags such as dyes and quantum dots, FPs are much better suited for in vivo imaging experiments. They are biocompatible and non-invasive to living cells and even entire organisms such as mice, and they can be genetically co-expressed with target proteins in living cells. Nowadays, GFP and its variants are widely used in a variety of applications related to various aspects of living systems44, e.g. protein localization, protein translocation, promoter tracking, etc. Figure 1.2-2 shows an example of using FPs to study the nucleocytoplasmic shuttling of the STAT3 protein48. In HepG2 cells, STAT3 is co-expressed with two C-terminal FPs, a cyan fluorescent protein (CFP) and a yellow fluorescent protein (YFP). Initially, prior to cell stimulation, STAT3 is distributed quasi-uniformly in the cytoplasm, as shown by the YFP signal in red. After being triggered by extracellular signals (IL-6, in this study), STAT3 translocates into the cell nucleus within the next 15-30 minutes. This series of confocal fluorescence images shows that STAT3 plays an important role in transducing extracellular signals from cytokines and growth factors, by shuttling from the cytoplasm into the cell nucleus to activate the transcription of genes responsive to these signals.

Figure 1.2-2: Live cell imaging of the subcellular distribution of STAT3-CY fusion proteins in HepG2 cells upon 20 ng/ml IL-6 stimulation. Confocal images of YFP fluorescence were taken at the time points indicated after IL-6 stimulation. The fluorescence intensities are encoded by the colors shown in the scale bar. Reproduced with permission from reference48, Figure 2.

Although FPs are advantageous for in vivo applications, synthetic fluorophores such as organic dyes have advantages in terms of brightness, extended photostability and other improved or specifically designed photophysical properties43. For in vitro applications, synthetic fluorophores are usually preferable, since their storage is rather simple and long-lasting. Currently, most fluorophores are commercially available in various reactive forms, selected to target proteins or other biomolecules at specific sites using specific chemistry. Their small molecular size makes them suited for tracking the diffusion of the host molecule and for measuring intra- and inter-molecular distances via FRET. In contrast, the size of FPs is similar to that of the host protein47, which makes experiments aimed at measuring molecular distances via FRET less accurate. In general, FPs are also less bright and less (photo)stable than synthetic fluorophores, which do not require special sample handling and are robust under a wide range of environmental conditions49. Furthermore, environment-sensitive (polarity, pH, ions, etc.) dyes50, 51 were developed to probe biologically relevant variations around the host molecules. The use of bright, photostable fluorophores enables the observation of single molecules under normal physiological conditions. The possibility of measuring signals from single molecules was suggested by theory and has long attracted scientists31. However, due to obstacles in achieving a sufficient signal-to-noise ratio, reproducible single-molecule identification was for a long time not possible under ambient conditions. In 1993, the first repetitive imaging of individual fluorophores at room temperature was successfully performed by Eric Betzig and Robert J. Chichester52. Their findings triggered an avalanche of studies focused upon developing and applying single-molecule optical imaging and spectroscopy techniques. 
Up to now, most of the knowledge about biological systems and processes has been obtained from macroscopic/bulk measurements; however, it is hard to overstate the recent impact of the single-molecule approach on experimental biology. As its name implies, single-molecule fluorescence (SMF) spectroscopy probes the spectroscopic properties of individual molecules in the biological sample of interest17, 53-55. One of the most remarkable advantages of SMF spectroscopy is the ability to investigate heterogeneous subpopulations of the ensemble, which are buried in ensemble averaging in other measurements. Other advantages include the ability to probe the mechanical properties of a single biomolecule, and to probe entire dynamic sequences of a single molecule transitioning between different conformational states56-58. The principle of achieving conditions for SMF observation is actually simpler than it appears. First of all, the sample needs to be diluted until, most of the time, there is at most one molecule in the detection volume. Also, the excitation/observation volume has to be minimized by focusing the excitation beam through a high NA objective, to minimize the background signal from the solvent. A sub-femtoliter observation volume can easily be achieved using modern high NA oil objectives. A subnanomolar concentration therefore satisfies the conditions that are necessary for single-molecule observation. The experimental details of achieving single-molecule observation will be addressed in Chapter 2. Investigations at the single-molecule level remove the ensemble averaging and thus provide a unique means to uncover the heterogeneity of states and behaviour that often characterizes biological systems. As a result of their versatility, in the past two decades a variety of single-molecule techniques, both fluorescence-based59 and force-based56-58 (e.g. AFM), have been applied to various fields in biophysics and biochemistry. 
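The dilution condition described above can be checked with a short calculation. The following is a minimal sketch, not the procedure used in this thesis: it assumes an illustrative detection volume of 1 fL (an upper bound on the sub-femtoliter volume mentioned in the text) and Poisson statistics for the occupancy of the volume. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def mean_occupancy(concentration_molar, volume_liters):
    """Average number of molecules present in the detection volume."""
    return concentration_molar * AVOGADRO * volume_liters


def prob_more_than_one(mean_n):
    """Poisson probability of finding two or more molecules at once."""
    return 1.0 - math.exp(-mean_n) * (1.0 + mean_n)


volume = 1e-15  # litres (1 fL), assumed for illustration
for conc in (1e-8, 1e-9, 1e-10):  # 10 nM, 1 nM, 100 pM
    n = mean_occupancy(conc, volume)
    print(f"{conc:.0e} M: <N> = {n:.3f}, P(N >= 2) = {prob_more_than_one(n):.4f}")
```

Under these assumptions, a 1 nM sample gives on average about 0.6 molecules in the volume, and going to subnanomolar concentrations makes simultaneous occupancy by two molecules increasingly rare.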
The role of the single-molecule approach for discoveries in life sciences and medicine will continue to increase. The work of Schuler and coworkers on a small, fast-folding cold-shock protein, CspTm, is shown in Figure 1.2-3 as an example demonstrating the application of single-molecule (sm) FRET54 to protein folding. The data highlight a bimodal distribution of the smFRET efficiency, interpreted by the authors as evidence of two states (folded and unfolded) in the presence of a denaturant. The smFRET data also show increased broadening of the unfolded state distribution at higher concentrations of the denaturant, indicating an overall expansion of the unfolded states. Such features cannot be extracted from ensemble experiments, where only an average FRET efficiency (E) value would be obtained at each denaturant concentration, similar to that in Figure 1.1-1. It should also be noted that the population at E ≈ 0 originates from molecules labelled either with two donors or with a donor and an inactive acceptor. This artifact is very common in FRET-based experiments, since protein labeling is a stochastic process. For ensemble experiments, additional control experiments need to be performed to determine the contributions of these (and other) artifacts. Often, this is problematic, as not only the fraction but also the brightness of the different species must be taken into account. In single-molecule experiments, however, the artifact contribution can be removed straightforwardly, as shown in the distributions below (gray area).

Figure 1.2-3: Single-molecule energy transfer efficiency histograms of CspTm in the presence of the chemical denaturant Guanidinium Hydrochloride (GdmHCl). The high E population corresponds to the folded protein, the lower E to the unfolded protein. The signal at E ≈ 0 for each sample originates from molecules without (or with inactive) acceptor. Reprinted with permission from reference54, Figure 2.

For IDPs the folding landscape is generally more complicated, since they are known to display broad conformational heterogeneity and complex binding-induced conformational changes60. Observing IDPs at the single-molecule level could therefore reveal important information about rapid conformational fluctuations, rare intermediate subpopulations and pathways, all of which are almost impossible to measure in ensemble experiments61. One example is Sic1, the stoichiometric inhibitor of the cyclin-dependent kinase (CDK) in the budding yeast Saccharomyces cerevisiae. The lack of stable tertiary and secondary structure limits the ability of NMR and X-ray crystallography to resolve the structure of Sic1. An ensemble of Sic1 conformers was obtained in previous reports60 by performing iterative calculations based on experimental constraints from NMR, small-angle X-ray scattering (SAXS), etc. Due to the limitations of the experimental constraints (many of which also suffer from ensemble averaging), the ensemble of conformers calculated by ENSEMBLE (a program developed in Professor Julie Forman-Kay’s lab to determine and analyze ensembles of structures of intrinsically disordered proteins; refer to Chapter 5 for details) may not be unique. Information obtained from single-molecule spectroscopy, such as distributions of the end-to-end distance and the hydrodynamic radius under appropriate conditions, will further enrich knowledge of the Sic1 protein. Applications of powerful SMF data analysis to explore the properties of the yeast Sic1 will be addressed in Chapter 5. Before going deeper into the details of the biophysical applications of fluorescence in this thesis, the basic principles of fluorescence will be presented in section 1.3, followed by an introduction to the SMF spectroscopy techniques used.

1.3 Principles of Fluorescence Spectroscopy

1.3.1. Fundamentals of Fluorescence

Fluorescence is a phenomenon that occurs in optically active molecules called fluorophores31. Fluorescence is the result of a multi-stage process, in which the fluorophore first absorbs a photon and then, within a very short time (~10⁻⁹ s), emits a photon with a longer wavelength. The microscopic picture of the more general phenomenon of photoluminescence (including fluorescence and phosphorescence) can be simply illustrated using the Jablonski diagram31 (Figure 1.3-1), in which the various electronic states involved in this process are represented by simple energy levels.

Figure 1.3-1: The Jablonski diagram illustrates the principles behind the fluorescence process. S0, S1 and S2 represent the ground, first and second (electronic) singlet excited states respectively. T1 represents the triplet excited state. Excitation (green lines), emission (red lines), intersystem crossing (blue dot line), internal conversion and vibrational relaxation (colored dash lines), quenching and other non-radiative processes (grey lines) are indicated by lines connecting appropriate energy levels in the Jablonski diagram.

At room temperature, the fluorophore occupies the lowest vibrational level of the ground electronic state S0. When a photon is provided by an external light source (lamp, laser, the Sun, etc.), the fluorophore absorbs it and one electron is promoted to create an excited electronic singlet state. This process is called excitation and is represented by the solid green lines in Figure 1.3-1. Depending on the energy of the photon, the molecule may reach any of the vibrational levels of S1 or S2. Although the energy levels appear discrete in Figure 1.3-1, in most fluorophores the closely spaced vibrational energy levels, and the even finer rotational energy levels on top of each vibrational level (not shown in the diagram), when coupled with thermal motion, produce a wide range of possible photon energies. Therefore it is practically impossible to resolve each transition, and continuous excitation and emission spectra are observed. The excited fluorophore resides in the higher vibrational levels only very briefly. It relaxes quickly to the lowest vibrational level of S1, as shown by the dashed lines, within a time of the order of 10⁻¹² s or less. These processes are known as internal conversion (IC) and vibrational relaxation (VR). The excess vibrational energy is converted into heat, which is absorbed by neighboring solvent molecules upon colliding with the excited-state fluorophore. Molecules almost always undergo complete vibrational relaxation before emitting a fluorescence photon, because VR occurs on a timescale (~10⁻¹² s) that is normally three orders of magnitude shorter than the excited-state lifetime (~10⁻⁹ s).

From the lowest vibrational level of S1, the molecule can return to any vibrational level of S0 by a variety of competing pathways, including fluorescence, phosphorescence after intersystem crossing to the triplet excited state, or non-radiative relaxation due to collisional quenching or resonance energy transfer. In the case of fluorescence, as shown by the solid red lines, the molecule relaxes back to S0 by spontaneous emission of a fluorescence photon. Since the electron in the excited singlet state has a spin opposite to that of the electron remaining in the ground-state orbital, returning to the ground state is spin allowed. The average time spent by the molecule in the excited state S1 before returning to the ground state S0 is called the fluorescence lifetime. For most fluorophores this lifetime is of the order of 10⁻⁹ s. Occasionally the molecule may undergo a spin conversion into a triplet state, a process known as intersystem crossing (blue dot line). Returning from the triplet excited state to the ground state, either by phosphorescence or by other non-radiative processes, is spin forbidden and therefore much slower, occurring on the order of milliseconds to seconds.

1.3.2. Fluorescence Spectra

As stated above, due to the presence of much finer rotational energy levels on each vibrational level, the excitation and emission spectra of fluorophores are continuous, as shown for the dye Alexa-555 in Figure 1.3-2. Also, the profile of the emission spectrum of a fluorophore closely follows the mirror image of its excitation spectrum. The origin of “the mirror rule” is immediately apparent in the Jablonski diagram31: the spacing of the vibrational levels in the ground and excited states is similar, since absorption of a photon does not significantly perturb the shape of the fluorophore31.

Figure 1.3-2: Excitation and emission spectra of Alexa-555 in aqueous buffer (50 mM Tris, pH 7.4).

The excitation and emission pathways depicted in the Jablonski diagram also point to another important spectroscopic feature of fluorophores: fluorescence emission always occurs at a longer wavelength than excitation, due to energy dissipation in the process of internal conversion. Overall the emission spectrum of a fluorophore is “redder” than its excitation spectrum. The difference in wavelength between the two maxima is defined as the Stokes shift. A large Stokes shift is always beneficial in fluorescence measurements: the fluorescence photons are more efficiently isolated from the excitation photons, and therefore the signal to noise ratio (SNR) could be improved.

1.3.3. Fluorescence Lifetime and Quantum Yield

The fluorescence lifetime, τ, measures the average dwell time of the fluorophore in the first excited state before relaxing to the ground state31. If we use kf to represent the rate of fluorescence radiation, and ki to represent the rates of all the other non-radiative relaxations (internal conversion, intersystem crossing, collisional quenching, energy transfer, etc.), mathematically, τ can be expressed using the following equation31:

$$\tau = \frac{1}{k_f + \sum_i k_i} \qquad (1.3\text{-}1)$$

The lifetime is a very important characteristic of a fluorophore. Its value depends both on the spectroscopic properties of the fluorophore and on the molecular surroundings, providing a way to probe the surrounding conditions. One way to measure the lifetime is to use ultra-short excitation pulses to excite the fluorophore62 and measure the delay time between each detected photon and the preceding excitation pulse. By creating a histogram of many delay times, an exponential distribution is obtained which upon fitting yields the lifetime of the fluorophore. The fluorophore does not always emit a photon on its way back to the ground state after being excited. The ratio of photons emitted through fluorescence to photons absorbed is defined as the fluorescence quantum yield (QY). QY measures the probability of the excited state being deactivated by fluorescence rather than by non-radiative mechanisms. Using the rates of the excited state decay (refer to Eq. 1.3-1), the quantum yield Φ is expressed as:

$$\Phi = \frac{k_f}{k_f + \sum_i k_i} \qquad (1.3\text{-}2)$$

The QY depends on both the chemical nature of the fluorophore and its molecular surroundings. On one hand, the QY can be used to probe environmental changes in proteins. On the other hand, care has to be taken when operating experiments or calculations in which the QY is essential. One such example is the Förster radius calculation described in section 1.3.6.
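The relations between the decay rates, the lifetime and the quantum yield can be illustrated numerically. The sketch below is only an idealized illustration: the rate constants are hypothetical, the lifetime and quantum yield are taken as 1/(kf + Σki) and kf/(kf + Σki), and the delay-time measurement is simulated without background or instrument response. Instead of fitting a histogram, it uses the mean delay time, which is the maximum-likelihood estimate of the lifetime for a purely mono-exponential decay.

```python
import random


def lifetime(k_f, k_nonrad):
    """Excited-state lifetime from radiative and non-radiative rates (s)."""
    return 1.0 / (k_f + sum(k_nonrad))


def quantum_yield(k_f, k_nonrad):
    """Fraction of excited states that decay by emitting a photon."""
    return k_f / (k_f + sum(k_nonrad))


# Hypothetical rate constants in s^-1, chosen only for illustration.
k_f = 2.0e8
k_nonrad = [1.0e8, 5.0e7]

tau = lifetime(k_f, k_nonrad)
phi = quantum_yield(k_f, k_nonrad)

# Simulate photon delay times after each excitation pulse (ideal case).
rng = random.Random(0)
delays = [rng.expovariate(1.0 / tau) for _ in range(100_000)]
tau_estimate = sum(delays) / len(delays)

print(f"tau = {tau * 1e9:.3f} ns, QY = {phi:.3f}")
print(f"tau estimated from simulated delays = {tau_estimate * 1e9:.3f} ns")
```

Adding a further non-radiative pathway (for example a quencher) to `k_nonrad` shortens the lifetime and lowers the quantum yield by the same factor, which is why both quantities report on the molecular surroundings.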

1.3.4. Confocal Fluorescence Microscopy

The invention of confocal fluorescence microscopy35 solved the issues of poor resolution and high background encountered in traditional fluorescence microscopy. In a traditional microscope, the excitation light is spread widely over the specimen in fairly uniform illumination. Fluorescence from the specimen is collected by the objective from all depths, and therefore the image resolution in the focal plane is always degraded by out-of-focus information. In confocal microscopy, the specimen is illuminated only on a diffraction-limited spot, so that excitation is confined around this spot both laterally and axially. The spatial resolution is further improved by using a pinhole in front of the detector to reject out-of-focus fluorescence signals. Although confocal microscopy provides improved lateral and axial resolution in sample imaging, it also has limitations because it is an optical system. Ernst Abbe, who developed the theory of image formation through a compound microscope, pointed out that the optical resolution of an optical system is limited by the wavelength (λ) of the incident light and the numerical aperture of the objective35 (NA = n sinθ; n is the refractive index of the medium in which the lens is working and θ is the half angle of the maximum cone of light that can enter or exit the lens). In other words, the best lateral resolution of the diffraction-limited spot on the specimen is given by:

$$d = \frac{\lambda}{2\,NA} \qquad (1.3\text{-}3)$$

In reality it is always hard to achieve the optical resolution predicted by Abbe’s equation. The optical resolution of the confocal microscope is normally measured by imaging a sample of 24 nm fluorescent microspheres (C37261, Invitrogen, Canada) and fitting the fluorescence intensity of a single bead to a 2-D Gaussian distribution. A fluorescence correlation spectroscopy (FCS) experiment on a standard sample, e.g., 5 nM Rhodamine 6G (R6G) dye prepared in phosphate buffer, provides an alternative estimate of the lateral resolution ωo. FCS also determines the axial resolution, which is typically estimated to be 5-7 times larger than the lateral resolution. Larger lateral/axial resolution figures are a sign of poor optical alignment of the microscope. Confocal microscopy provides improved optical resolution and a reduced background level, but it also has disadvantages. A raster scan is necessary to build up the image of an area on the specimen. The insertion of the imaging pinhole before the detector significantly reduces the detected signal from the sample. Therefore either a longer acquisition or a higher illumination intensity is needed to build up an image of sufficient quality. It used to be hard to tightly focus light from traditional light sources, such as arc lamps, but that limitation was removed by the invention of highly collimated laser sources34. Since then, confocal microscopy has been widely used and various types of confocal microscopes have been developed. Various fluorescence spectroscopy methods39, 40, 53 were also developed in combination with confocal microscopy. Presently, confocal microscopes are routine research instruments in most research institutions. The details of constructing a confocal laser scanning microscope (CLSM) in the Gradinaru Lab will be described in Chapter 2.
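As a rough numerical guide to the resolution figures discussed above, the following sketch evaluates the Abbe limit in its common form d = λ/(2 NA), which is assumed here, together with the 5-7 fold ratio between axial and lateral resolution quoted in the text. The excitation wavelength (532 nm) and the numerical aperture (1.4) are illustrative values, not the parameters of the instrument described in Chapter 2.

```python
import math


def numerical_aperture(refractive_index, half_angle_deg):
    """NA = n sin(theta), with theta the half angle of the light cone."""
    return refractive_index * math.sin(math.radians(half_angle_deg))


def abbe_lateral_limit_nm(wavelength_nm, na):
    """Diffraction-limited lateral resolution, assuming d = lambda / (2 NA)."""
    return wavelength_nm / (2.0 * na)


wavelength_nm = 532.0  # assumed excitation wavelength
na = 1.4               # assumed oil-immersion objective

lateral = abbe_lateral_limit_nm(wavelength_nm, na)
axial_low, axial_high = 5.0 * lateral, 7.0 * lateral  # 5-7x rule from the text

print(f"lateral limit: {lateral:.0f} nm")
print(f"expected axial range: {axial_low:.0f}-{axial_high:.0f} nm")
```

With these assumed values the lateral limit is about 190 nm and the corresponding axial extent is of the order of 1 μm; measured values noticeably larger than this would point to imperfect alignment.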

1.3.5. Total Internal Reflection Fluorescence (TIRF) Microscopy

Optical resolution in confocal microscopy is significantly improved compared with conventional epi-fluorescence, especially in the axial direction. Yet the axial resolution is of the order of 1-2 micrometers, which is comparable to the size of E. coli bacteria63, for instance. For near-surface applications, such as the binding of signalling proteins to cell surface receptors and the tracking of secretory granules in intact cells64, the axial resolution of the confocal microscope is not sufficient. Fluorescence from regions deeper within the sample will also be detected and near-surface visualization will be blurred. The invention of the TIRF imaging modality improved the axial resolution by more than an order of magnitude. The idea of using total internal reflection was first proposed by E.J. Ambrose36 as early as 1956. The first total internal reflection fluorescence microscope (TIRFM) was operated by D. Axelrod64 in the 1980s. As its name says, TIRFM utilizes the fact that an evanescent field is produced when an excitation beam is incident on the coverslip-sample interface at an angle larger than the critical angle. The evanescent field intensity decays exponentially with increasing distance from the surface, with a decay length that is only a fraction of the wavelength of the incident light. Therefore only a very thin illumination layer, typically on the order of 100 nm, is created above the glass substrate to probe surface processes.

Figure 1.3-3: Schematic representations of (A) prism-based and (B) objective-based total internal reflection fluorescence microscopy.

Two popular configurations are used to achieve total internal reflection on an inverted microscope64: prism-based and objective-based configurations as shown in Figure 1.3-3. In the prism-based configuration, direct laser illumination is incident through a prism attached to the surface of the coverslip. The fluorescence from the sample excited near the bottom of the prism is collected by the inverted objective and sent to the detector. The prism-based configuration features better background reduction, however it has serious limitations due to geometric constraints on specimen manipulation. In the objective-based configuration, laser illumination is directed to the edge of a high NA objective (typically 1.45) which refracts the beam to a large incident angle at the coverslip/sample interface. The same objective collects the fluorescence induced by the evanescent field. In this configuration many of the sample limitations present in the prism-based configuration are avoided. The details of constructing a TIRFM based on this configuration will be described in Chapter 2.
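The thickness of the evanescent layer can be estimated from the standard expression for the intensity decay length, d = λ / (4π √(n₁² sin²θ − n₂²)), which is not given in the text above and is used here as an assumption. The refractive indices (glass 1.518, aqueous sample 1.33) and the 532 nm wavelength are likewise illustrative values.

```python
import math


def critical_angle_deg(n_glass, n_sample):
    """Incidence angle above which total internal reflection occurs."""
    return math.degrees(math.asin(n_sample / n_glass))


def penetration_depth_nm(wavelength_nm, n_glass, n_sample, angle_deg):
    """1/e decay length of the evanescent field intensity."""
    term = (n_glass * math.sin(math.radians(angle_deg))) ** 2 - n_sample ** 2
    if term <= 0.0:
        raise ValueError("angle is below the critical angle: no evanescent field")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(term))


n_glass, n_sample = 1.518, 1.33  # assumed coverslip and buffer indices
wavelength_nm = 532.0            # assumed excitation wavelength

theta_c = critical_angle_deg(n_glass, n_sample)
# Largest incidence angle an NA 1.45 objective can deliver: n_glass sin(theta) = NA.
theta_max = math.degrees(math.asin(1.45 / n_glass))

print(f"critical angle: {theta_c:.1f} deg, maximum angle (NA 1.45): {theta_max:.1f} deg")
for angle in (63.0, 65.0, 70.0):
    depth = penetration_depth_nm(wavelength_nm, n_glass, n_sample, angle)
    print(f"{angle:.0f} deg -> decay length {depth:.0f} nm")
```

Under these assumptions the usable angular window between the critical angle and the limit set by the objective is only about ten degrees, which is one reason why objectives with a very high NA are preferred for the objective-based configuration.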

1.3.6. Förster Resonance Energy Transfer (FRET)

FRET is a very important photophysical process involving a donor molecule in an excited state and an acceptor molecule in the ground state31. Through a long-range dipole-dipole interaction, the donor molecule decays non-radiatively to the ground state by transferring its energy to the acceptor molecule. What makes FRET extremely useful is the fact that, for a given FRET pair, the energy transfer efficiency has a steep dependence on the distance between the donor and acceptor molecules (Eq. 1.3-4 and Figure 1.3-4)17, 65. The energy transfer efficiency EFRET is expressed by the following equation65:

$$E_{FRET} = \frac{1}{1 + (R/R_o)^6} \qquad (1.3\text{-}4)$$

where R is the distance between the donor and acceptor molecules, and Ro is called the Förster radius of the donor-acceptor pair. The strong dependence of EFRET on R is advantageous in probing accurate values of R in a given system, though only over a typical range of 0.6 Ro - 1.6 Ro (3 - 8 nm), corresponding to ~95% - 5% in EFRET. Outside this range, as shown in Figure 1.3-4, when R changes from 1.6 Ro to 3 Ro (8 - 15 nm) and from 0.1 Ro to 0.6 Ro (0.5 - 3 nm), EFRET only changes from 5% to 0% and from 100% to 95% respectively. For these two cases, the determination of the inter-dye distance R from EFRET is not accurate.

Fortunately, Ro of commonly used FRET pairs is of the order of a few nanometres65, which is typically the size of a protein and of many other biomolecules. Although in confocal and TIRF microscopy the resolution is bound to the optical diffraction limit, FRET can be used to bypass the diffraction limit by providing accurate intra- and inter-molecular distance information on the nanometer scale in protein research.

Figure 1.3-4: Interdye distance dependence of the energy transfer efficiency. The curve is simulated assuming a Förster radius of 5 nm. The shaded area indicates the distance range within which the energy transfer efficiency can be probed experimentally with reasonable precision (5% - 95%).
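The distance dependence of the transfer efficiency, and the loss of distance sensitivity outside the 5% - 95% window, can be checked numerically. The sketch below assumes the standard sixth-power Förster relation and the 5 nm Förster radius used for Figure 1.3-4; the function names and the 0.03 efficiency uncertainty are illustrative choices only.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Transfer efficiency for inter-dye distance r_nm and Forster radius r0_nm."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def distance_from_efficiency(e: float, r0_nm: float) -> float:
    """Invert the Forster relation; only defined for 0 < e < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / e - 1.0) ** (1.0 / 6.0)


def distance_uncertainty_nm(e: float, sigma_e: float, r0_nm: float) -> float:
    """First-order error propagation, using |dE/dR| = 6 E (1 - E) / R."""
    r = distance_from_efficiency(e, r0_nm)
    return r * sigma_e / (6.0 * e * (1.0 - e))


R0_NM = 5.0       # Forster radius assumed for Figure 1.3-4
SIGMA_E = 0.03    # hypothetical uncertainty of a measured efficiency

for ratio in (0.6, 1.0, 1.6):
    r = ratio * R0_NM
    print(f"R = {r:.1f} nm -> E = {fret_efficiency(r, R0_NM):.3f}")

for e in (0.50, 0.98):
    print(f"E = {e:.2f} -> sigma_R = {distance_uncertainty_nm(e, SIGMA_E, R0_NM):.2f} nm")
```

The same uncertainty in efficiency translates into a much larger distance uncertainty near E = 1 (or E = 0) than near E = 0.5, which is the reason the usable range is limited to distances around the Förster radius.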

From equation 1.3-4, EFRET also depends critically on the Förster radius of the selected fluorophore pair. Any uncertainty in the Ro determination will be magnified by the sixth-power dependence. Ro is described by the following equation in units of Å66:

$$R_o = 0.211 \left[ \kappa^2\, n^{-4}\, Q_D\, J \right]^{1/6}$$

where

κ2: orientation factor, $\kappa^2 = (\cos\theta_T - 3\cos\theta_D \cos\theta_A)^2$. θT is the angle between the donor emission and acceptor absorption transition dipoles, while θD and θA are the angles that these dipoles make with the line connecting them.

n: refractive index of the medium (usually assumed to be that of the solvent)

QD: quantum yield of the donor in the absence of the acceptor

J: spectral overlap between the donor emission spectrum FD(λ) (normalized to unit area) and the acceptor absorption spectrum εA(λ): $J = \int F_D(\lambda)\, \varepsilon_A(\lambda)\, \lambda^4\, d\lambda$. The wavelength λ is expressed in nm and J is in units of M-1cm-1nm4.

A complication in determining Ro is that the orientation factor depends on the orientation of the donor and acceptor dipoles66. Normally an average value of 2/3 can be used for κ2, assuming both fluorophores rotate freely and rapidly through all possible orientations during energy transfer. Unfortunately the assumption is not always valid in FRET applications66, 67, because the fluorophores labelled on the protein surface usually experience motional restrictions. Also the overlap integral J has to be calculated using the donor and acceptor spectra measured when the fluorophores are attached to the protein. Spectra of most fluorophores change upon protein labelling, and using spectra of donor and acceptor in solution may not yield the correct J determination.
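To make the dependence of Ro on its ingredients concrete, the following sketch evaluates the orientation factor, the overlap integral and the Förster radius numerically, using the standard expression with the 0.211 prefactor (Ro in Å, J in M-1cm-1nm4). The Gaussian spectra, the donor quantum yield of 0.9 and the peak extinction coefficient of 10^5 M-1cm-1 are hypothetical placeholders; in practice the spectra measured on the labelled protein would be used.

```python
import numpy as np


def kappa_squared(theta_t: float, theta_d: float, theta_a: float) -> float:
    """Orientation factor from the three dipole angles (radians)."""
    return float((np.cos(theta_t) - 3.0 * np.cos(theta_d) * np.cos(theta_a)) ** 2)


def trapezoid(y: np.ndarray, x: np.ndarray) -> float:
    """Trapezoidal integration, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def overlap_integral(wl_nm: np.ndarray, donor_emission: np.ndarray,
                     acceptor_epsilon: np.ndarray) -> float:
    """Spectral overlap J in M^-1 cm^-1 nm^4; the donor spectrum is area-normalized first."""
    f_d = donor_emission / trapezoid(donor_emission, wl_nm)
    return trapezoid(f_d * acceptor_epsilon * wl_nm ** 4, wl_nm)


def forster_radius_angstrom(kappa2: float, n: float, q_d: float, j: float) -> float:
    """Forster radius in Angstrom for J given in M^-1 cm^-1 nm^4."""
    return 0.211 * (kappa2 * n ** -4 * q_d * j) ** (1.0 / 6.0)


# Hypothetical Gaussian spectra standing in for measured ones.
wl = np.linspace(400.0, 700.0, 3001)                        # wavelength grid, nm
donor = np.exp(-0.5 * ((wl - 520.0) / 15.0) ** 2)           # donor emission, arbitrary units
acceptor = 1.0e5 * np.exp(-0.5 * ((wl - 560.0) / 20.0) ** 2)  # acceptor extinction, M^-1 cm^-1

j = overlap_integral(wl, donor, acceptor)
r0_isotropic = forster_radius_angstrom(2.0 / 3.0, 1.33, 0.9, j)
r0_collinear = forster_radius_angstrom(kappa_squared(0.0, 0.0, 0.0), 1.33, 0.9, j)
print(f"J = {j:.3e} M^-1 cm^-1 nm^4")
print(f"R0 (kappa^2 = 2/3) = {r0_isotropic:.1f} A, R0 (kappa^2 = 4) = {r0_collinear:.1f} A")
```

Because Ro scales with the sixth root of κ2, even the extreme change from 2/3 to 4 alters Ro by only about 35%; the resulting error in R is nevertheless significant when nanometer accuracy is required.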

1.3.7. Fluorescence Correlation Spectroscopy (FCS)

FCS was developed in the early 1970s68, 69 to measure chemical kinetic constants and diffusion coefficients in equilibrium states. Later improvements of the technique using laser sources and ultrasensitive avalanche photodiodes (APDs)33 in a confocal microscope have made it a standard spectroscopy tool for dilute samples. The theory of FCS and its biophysical applications are discussed widely in the literature38, 63, 70, 71. FCS is based on the analysis of intensity fluctuations of fluorescent molecules diffusing through a (sub)femtoliter detection volume. Because this volume is so small, the measurements are optimal for concentrations of 1-10 nM63, which ensure that on average only one or a few molecules are present in the detection volume. FCS maps the timescales of the fluctuations and the local concentration onto the decay and the amplitude of the correlation curves, respectively. The analysis uses the temporal fluctuations of the fluorescence signal (δF) recorded while fluorescently labelled molecules diffuse through the detection volume38, 39. The signal fluctuations arise from rotational and translational diffusion of individual fluorophores through the detection volume and from reversible photodynamics, such as singlet-state quenching and triplet-state blinking. The fluorescence intensity is commonly split into two or more channels, and the normalized auto- and cross-correlation functions are defined as63:

$$G_{ij}(\tau) = \frac{\langle \delta F_i(t)\, \delta F_j(t+\tau) \rangle}{\langle F_i(t) \rangle \langle F_j(t) \rangle}$$

where i and j are detection channels (i = j for the autocorrelation), δF = F − ⟨F⟩, and the brackets denote time averaging of the fluorescence from the corresponding detection channels. The detection volume is determined by the product of the excitation laser beam profile and the collection efficiency function of the confocal pinhole. When the projection of the pinhole diameter on the sample plane is of the order of the excitation beam waist, the detection volume can be approximated by a 3D Gaussian profile38. Under these assumptions, the intensity correlation function for free Brownian diffusion of a single fluorescent species with triplet blinking is38:

$$G(\tau) = \frac{1}{N_{eff}} \left[ 1 + \frac{f_t}{1 - f_t}\, e^{-\tau/\tau_t} \right] \left( 1 + \frac{\tau}{\tau_d} \right)^{-1} \left( 1 + \frac{\tau}{s^2 \tau_d} \right)^{-1/2}$$

Here Neff is the number of molecules in the effective detection volume ($V_{eff} = \pi^{3/2} w_0^2 z_0$) and represents the inverse of the diffusion-related amplitude ($G_d(0) = 1/N_{eff}$). s is the aspect ratio of the axial and the lateral radii of the detection volume (s = z0/w0). τd is the diffusion time, τt is the triplet lifetime and ft is the average fraction of molecules in the triplet state38. The diffusion coefficient D (calculated from $\tau_d = w_0^2/4D$) is linked to the solvent’s temperature T and viscosity η and to the solute’s hydrodynamic radius Rh through the Stokes-Einstein equation63:

$$D = \frac{k_B T}{6 \pi \eta R_h}$$

where kB is the Boltzmann constant.
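A minimal numerical sketch of the diffusion model with triplet blinking, and of the conversion from a fitted diffusion time to a hydrodynamic radius, is given here. It assumes the standard 3D Gaussian form of the correlation function with the triplet term normalized so that the diffusion amplitude equals 1/Neff. The beam waist, diffusion time, temperature and viscosity are hypothetical values chosen only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def fcs_diffusion_triplet(tau: float, n_eff: float, tau_d: float, s: float,
                          f_t: float = 0.0, tau_t: float = 1e-6) -> float:
    """Correlation amplitude G(tau) for 3D free diffusion with triplet blinking (times in s)."""
    triplet = 1.0 + f_t / (1.0 - f_t) * math.exp(-tau / tau_t)
    diffusion = 1.0 / ((1.0 + tau / tau_d) * math.sqrt(1.0 + tau / (s * s * tau_d)))
    return triplet * diffusion / n_eff


def diffusion_coefficient(w0_m: float, tau_d_s: float) -> float:
    """D in m^2/s from the lateral beam waist and the diffusion time: tau_d = w0^2 / (4 D)."""
    return w0_m ** 2 / (4.0 * tau_d_s)


def hydrodynamic_radius(d_m2_s: float, temperature_k: float, viscosity_pa_s: float) -> float:
    """Stokes-Einstein relation solved for the hydrodynamic radius (m)."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * d_m2_s)


# Hypothetical fit results and experimental conditions.
W0 = 0.25e-6       # lateral radius of the detection volume, m
TAU_D = 60e-6      # diffusion time, s
T = 293.15         # temperature, K
ETA = 1.0e-3       # viscosity of water near 20 C, Pa*s

d = diffusion_coefficient(W0, TAU_D)
r_h = hydrodynamic_radius(d, T, ETA)
print(f"D = {d * 1e12:.0f} um^2/s, Rh = {r_h * 1e9:.2f} nm")
print(f"G(tau_d) = {fcs_diffusion_triplet(TAU_D, 2.0, TAU_D, 5.0):.4f} for Neff = 2, s = 5")
```

In an actual analysis these functions would serve as the model passed to a least-squares fit of the measured correlation curve; the fitting step is omitted here.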

In FCS, any fluctuating signal can be subjected to time-correlation analysis to derive the underlying time constants and kinetics70. Just as molecular diffusion leads to intensity fluctuations, FRET efficiency variations lead to fluctuations in how the intensity is partitioned between the donor and acceptor channels. Because EFRET variations are usually caused by conformational fluctuations of the host molecule, it is attractive to combine FRET with FCS to study the dynamics of protein folding and protein interactions72. For example, in order to probe conformational dynamics in an intrinsically disordered protein (IDP), α-Synuclein61, 73, Ferreon and colleagues performed FRET-FCS measurements under various conditions in which α-Synuclein is known to preferentially occupy different conformations (Figure 1.3-5). The authors performed a simultaneous analysis of the auto- and cross-correlation of signals measured with three independent detectors on the donor-acceptor-labelled protein. Since all detectors record data from the same molecules, the contributions of molecular diffusion to the two correlation functions (green: GDD and red: GDA) are identical. Thus, by simply taking the ratio of the two correlation curves, a correlation function that contains only the FRET fluctuations is obtained (yellow: GDD/GDA). The yellow curves in Figure 1.3-5 show conformational fluctuations on the ns-μs timescales (indicated on the curves) that occur in different protein folding states.

Figure 1.3-5: Application of FRET-FCS in quantifying structural dynamics of α-Synuclein under conditions where the indicated protein conformations were preferentially populated. The correlation analysis shows that structural fluctuations (on up to 3 different timescales) are present in these protein conformations. Reprinted with permission from reference 73, Figure 2C.

1.3.8. Fluorescence Polarization and Anisotropy

As mentioned in section 1.3.6, the absorption and emission processes have associated transition dipole moments. If the incident light is linearly polarized, the probability that a molecule will be excited is proportional to cos²θ, where θ is the angle between the absorption transition dipole and the electric field vector of the light. This is called photoselection: when a population of randomly oriented molecules is illuminated with polarized light, the molecules oriented close to the direction of the electric field are preferentially excited. In most cases, the absorption and emission dipoles are (nearly) collinear. The fluorescence will then be preferentially polarized along the same direction, unless the orientation of the fluorophore changes before the photon is emitted. In essence, anisotropic excitation by linearly polarized light causes fluorescence with unequal intensities along different polarization axes, a phenomenon called fluorescence polarization (FP) or fluorescence anisotropy (FA). Mathematically, FP and FA are defined as the ratios40:

$$P = \frac{I_{\parallel} - I_{\perp}}{I_{\parallel} + I_{\perp}}, \qquad r = \frac{I_{\parallel} - I_{\perp}}{I_{\parallel} + 2 I_{\perp}}$$

where I// and I⊥ represent the intensities measured along the axes parallel and orthogonal to the polarization of the incident light, respectively. FP and FA contain virtually the same information, especially for samples containing a single emitting species or at the single-molecule level. They can be interconverted using the following equation31:

$$r = \frac{2P}{3 - P}$$
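The relation between the two quantities can be verified with a few lines of code. The sketch below uses the standard definitions of polarization P and anisotropy r in terms of the parallel and orthogonal intensities; the intensity values are arbitrary illustrative numbers.

```python
def polarization(i_par: float, i_perp: float) -> float:
    """Fluorescence polarization P from parallel and orthogonal intensities."""
    return (i_par - i_perp) / (i_par + i_perp)


def anisotropy(i_par: float, i_perp: float) -> float:
    """Fluorescence anisotropy r; the denominator is the total emitted intensity."""
    return (i_par - i_perp) / (i_par + 2.0 * i_perp)


def anisotropy_from_polarization(p: float) -> float:
    return 2.0 * p / (3.0 - p)


def polarization_from_anisotropy(r: float) -> float:
    return 3.0 * r / (2.0 + r)


# Illustrative intensities: a 3:1 ratio corresponds to immobile, collinear dipoles.
I_PAR, I_PERP = 3.0, 1.0
p = polarization(I_PAR, I_PERP)
r = anisotropy(I_PAR, I_PERP)
print(f"P = {p:.2f}, r = {r:.2f}, r from P = {anisotropy_from_polarization(p):.2f}")
```

A 3:1 intensity ratio gives P = 0.5 and r = 0.4, the limiting values for a photoselected ensemble of immobile fluorophores with collinear absorption and emission dipoles.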

A fluorophore undergoes rotational movements during its random walk in solution. Changes in the orientation of the emission dipole during the lifetime of the excited state lead to a decrease in anisotropy. In other words, the r or P value decreases to a degree that depends on both the fluorescence lifetime and the rotational correlation time of the fluorophore74. The smaller the molecule, the faster it tumbles in solution, and the smaller the measured anisotropy value. Figure 1.3-6 shows the depolarization in the two extreme cases of fast and slow rotation of the fluorophores in solution, corresponding to anisotropy values of 0 and 0.4, respectively. In reality, the measured anisotropy always falls between these two limiting cases, providing a direct measure of the rotational correlation time of the molecule, which in turn provides information on the effective molecular volume of the fluorophore (see below). Therefore, FA and FP are well suited for studying protein-ligand binding40, 75, by labelling the ligand (small molecule) with a fluorescent marker. The free and bound states of the small molecule have low and high FA/FP values, respectively, due to their difference in mass. The measured FA or FP value is then a weighted average of the two, from which the fraction of the bound state can be derived. Apart from probing the rotational freedom of molecules and measuring ligand-binding affinities, polarization-sensitive detection is also important for estimating nanoscale directionality in FRET experiments, as the orientation of the molecular dipoles involved in energy transfer enters the Förster radius calculation65.

Figure 1.3-6: The principle of photoselection and fluorescence depolarization in two extreme cases: slow and fast tumbling of the fluorophores in solution. θrot: average time of a full tumbling. τ: fluorescence lifetime.
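The interplay between fluorescence lifetime and rotational correlation time, and the use of anisotropy in binding assays, can be sketched numerically. The example below uses the Perrin equation, r = r0 / (1 + τ/θrot), which is the standard steady-state result for a single spherical rotor and is introduced here as an assumption rather than taken from the text. The bound-fraction formula additionally assumes that the free and bound ligand are equally bright; all numerical values are hypothetical.

```python
def perrin_anisotropy(r0: float, lifetime_ns: float, theta_rot_ns: float) -> float:
    """Steady-state anisotropy of a spherical rotor (Perrin equation)."""
    return r0 / (1.0 + lifetime_ns / theta_rot_ns)


def bound_fraction(r_measured: float, r_free: float, r_bound: float) -> float:
    """Fraction of bound ligand from a weighted-average anisotropy.

    Assumes equal brightness of the free and bound states.
    """
    if r_bound == r_free:
        raise ValueError("free and bound anisotropies must differ")
    return (r_measured - r_free) / (r_bound - r_free)


R0 = 0.4            # limiting anisotropy for collinear dipoles
LIFETIME_NS = 4.0   # hypothetical fluorescence lifetime

r_free = perrin_anisotropy(R0, LIFETIME_NS, 0.2)    # small ligand, fast tumbling
r_bound = perrin_anisotropy(R0, LIFETIME_NS, 20.0)  # ligand bound to a protein
print(f"r_free = {r_free:.3f}, r_bound = {r_bound:.3f}")
print(f"bound fraction at r = 0.15: {bound_fraction(0.15, r_free, r_bound):.2f}")
```

If the brightness changes upon binding, the simple weighted average no longer holds and the intensities of the two states must be included as weights.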

Typically, anisotropy calculations use time-integrated signals in the parallel and orthogonal channels and the steady-state value is obtained via equation 1.3-5. If the parallel and the orthogonal polarization components are time-resolved on the (sub)nanosecond scale, a time-resolved fluorescence anisotropy (TRFA) curve r(t) can be constructed76:

$$r(t) = \frac{I_{\parallel}(t) - I_{\perp}(t)}{I_{\parallel}(t) + 2 I_{\perp}(t)} = \sum_i r_{0i}\, e^{-t/\theta_{rot,i}}$$

The TRFA curve is a sum of contributions, with amplitudes r0i, from species with different rotational correlation times (θrot,i), as shown in the above equation. For instance, for a dye in solution or for a quasi-spherical fluorescent protein, only one component enters the sum in eq. 1.3-12. However, for more complicated samples, e.g., a mixture of free dye and dye-labelled protein, or a cylindrically shaped fluorescent protein, two or more components are needed for a better physical description. In principle, the time-resolved FP contains virtually the same information. However, as FP is not an additive property, the decay of time-resolved FP is not a single exponential even for a spherical molecule. Therefore TRFA is preferable in time-resolved applications; in Chapter 5, for example, TRFA is used to probe shape information of the Sic1 protein. Similar to translational diffusion (see above), the Stokes-Einstein equation76 relates the rotational correlation time θrot to the hydrodynamic radius:

$$\theta_{rot} = \frac{\eta V}{k_B T} = \frac{4 \pi \eta R_h^3}{3 k_B T}$$

where η and T are the viscosity and the temperature of the solvent, Rh is the hydrodynamic radius of the fluorescent particle and V is its hydrodynamic volume. Note that the rotational correlation time scales linearly with the molecular volume (i.e., with the third power of Rh), whereas the translational diffusion time scales only linearly with Rh; therefore FA is more sensitive to molecular size than FCS.
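The cubic dependence of the rotational correlation time on the hydrodynamic radius, and the multi-exponential form of the anisotropy decay, can be illustrated as follows. The sketch assumes a spherical rotor (hydrodynamic volume 4πRh³/3) and uses hypothetical radii, amplitudes and solvent conditions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def rotational_correlation_time(radius_m: float, viscosity_pa_s: float,
                                temperature_k: float) -> float:
    """Rotational correlation time (s) of a sphere of hydrodynamic radius radius_m."""
    volume = 4.0 / 3.0 * math.pi * radius_m ** 3
    return viscosity_pa_s * volume / (K_B * temperature_k)


def trfa_decay(t_ns: float, components) -> float:
    """Anisotropy r(t) as a sum of exponentials; components = [(r0_i, theta_i_ns), ...]."""
    return sum(r0_i * math.exp(-t_ns / theta_i) for r0_i, theta_i in components)


ETA = 1.0e-3   # viscosity of water near 20 C, Pa*s (assumed)
T = 293.15     # temperature, K (assumed)

theta_protein = rotational_correlation_time(2.0e-9, ETA, T)   # hypothetical 2 nm protein
theta_dye = rotational_correlation_time(0.5e-9, ETA, T)       # hypothetical 0.5 nm dye
print(f"theta_rot: protein {theta_protein * 1e9:.1f} ns, dye {theta_dye * 1e9:.2f} ns")

# Two-component decay: fast local dye motion plus slow overall protein tumbling.
mixture = [(0.1, theta_dye * 1e9), (0.3, theta_protein * 1e9)]
for t in (0.0, 1.0, 10.0):
    print(f"r({t:.0f} ns) = {trfa_decay(t, mixture):.3f}")
```

A fourfold increase in radius lengthens the rotational correlation time 64-fold, whereas the translational diffusion time measured by FCS would increase only fourfold.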


References:
1. Hartmann-Goldstein, I.J. and D.J. Goldstein, Effect of temperature on dry mass of polytene nuclei in Drosophila. J Cell Sci, 1979. 38: p. 405-16.
2. Pattee, H.H., On the origin of macromolecular sequences. Biophys J, 1961. 1: p. 683-710.
3. Kauzmann, W., The Three Dimensional Structures of Proteins. Biophys J, 1964. 4: p. SUPPL43-58.
4. Guzzo, A.V., The influence of amino-acid sequence on protein structure. Biophys J, 1965. 5(6): p. 809-22.
5. Dunnill, P., How proteins acquire their structure. Sci Prog, 1965. 53(212): p. 609-19.
6. Anfinsen, C.B., Principles that govern the folding of protein chains. Science, 1973. 181(4096): p. 223-30.
7. Frieden, E., S. Osaki, and H. Kobayashi, Copper proteins and oxygen. Correlations between structure and function of the copper oxidases. J Gen Physiol, 1965. 49(1): p. Suppl:213-52.
8. Liberles, D.A., et al., The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci, 2012. 21(6): p. 769-85.
9. HGP. Human Genome Project Information. 2012 [cited 2012 Aug. 07, 2012]; Available from: www.ornl.gov/hgmis.
10. Brown, A. and A.L. Mackay, J D Bernal and the replication of the genetic material--hindsight on foresight. J Biosci, 2005. 30(4): p. 407-9.
11. Hess, G.P. and J.A. Rupley, Structure and function of proteins. Annu Rev Biochem, 1971. 40: p. 1013-44.
12. Pfister, K. and P. Christen, [Structure and function of enzymes. A brief introduction]. Med Lab (Stuttg), 1974. 27(6): p. 143-52.
13. Foucault, G., et al., Structure-function relationship in rabbit muscle glyceraldehyde-3-phosphate dehydrogenase. Trinitrophenylation of the lysine residues. Eur J Biochem, 1974. 46(1): p. 43-57.
14. Dunker, A.K., et al., Intrinsic disorder and protein function. Biochemistry, 2002. 41(21): p. 6573-82.
15. Levy, D.E. and J.E. Darnell, Jr., Stats: transcriptional control and biological impact. Nat Rev Mol Cell Biol, 2002. 3(9): p. 651-62.
16. Becker, S., B. Groner, and C.W. Muller, Three-dimensional structure of the Stat3beta homodimer bound to DNA. Nature, 1998. 394(6689): p. 145-51.
17. Schuler, B. and W.A. Eaton, Protein folding studied by single-molecule FRET. Curr Opin Struct Biol, 2008. 18(1): p. 16-26.
18. Lee, C. and M.H. Yu, Protein folding and diseases. J Biochem Mol Biol, 2005. 38(3): p. 275-80.
19. Fletcher, S., et al., Molecular disruption of oncogenic signal transducer and activator of transcription 3 (STAT3) protein. Biochem Cell Biol, 2009. 87(6): p. 825-33.
20. Watson, J.D., et al., Towards fully automated structure-based function prediction in structural genomics: a case study. J Mol Biol, 2007. 367(5): p. 1511-22.
21. Redfern, O.C., B. Dessailly, and C.A. Orengo, Exploring the structure and function paradigm. Curr Opin Struct Biol, 2008. 18(3): p. 394-402.
22. Tompa, P. and A. Fersht, eds. Structure and Function of Intrinsically Disordered Proteins. 1 ed. 2010, Chapman and Hall/CRC. 359.
23. Bloomer, A.C., et al., Protein disk of tobacco mosaic virus at 2.8 A resolution showing the interactions within and between subunits. Nature, 1978. 276(5686): p. 362-8.
24. Bode, W., P. Schwager, and R. Huber, The transition of bovine trypsinogen to a trypsin-like state upon strong ligand binding. The refined crystal structures of the bovine trypsinogen-pancreatic trypsin inhibitor complex and of its ternary complex with Ile-Val at 1.9 A resolution. J Mol Biol, 1978. 118(1): p. 99-112.
25. Aviles, F.J., et al., The conformation of histone H5. Isolation and characterisation of the globular segment. Eur J Biochem, 1978. 88(2): p. 363-71.
26. Dunker, A.K., et al., Intrinsically disordered protein. J Mol Graph Model, 2001. 19(1): p. 26-59.
27. Tompa, P., Intrinsically disordered proteins: a 10-year recap. Trends Biochem Sci, 2012.
28. Uversky, V.N., C.J. Oldfield, and A.K. Dunker, Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys, 2008. 37: p. 215-46.
29. Dunker, A.K., et al., Flexible nets. The roles of intrinsic disorder in protein interaction networks. FEBS J, 2005. 272(20): p. 5129-48.
30. Uversky, V.N. and A.K. Dunker, Understanding protein non-folding. Biochim Biophys Acta, 2010. 1804(6): p. 1231-64.
31. Lakowicz, J.R., Principles of Fluorescence Spectroscopy. 4 ed. Vol. 1. 2006: SpringerLink.
32. Drenth, J. (2007) Principles of Protein X-Ray Crystallography. 332 DOI: http://apps.webofknowledge.com/full_record.do?product=UA&search_mode=Refine&qid=4&SID=1B@45@jNOd767A99chk&page=2&doc=12.
33. Milestones, Milestones. Nature Milestones, Light Microscopy, 2009: p. S6-S23.
34. Ben-Tuvim, N., Laser-Light Amplification by Stimulated Emission of Radiation. Harefuah, 1963. 65: p. 191-4.
35. Pawley, J., Handbook of Biological Confocal Microscopy. 2006, Springer.
36. Ambrose, E.J., A surface contact microscope for the study of cell movements. Nature, 1956. 178(4543): p. 1194.
37. Denk, W., J.H. Strickler, and W.W. Webb, Two-photon laser scanning fluorescence microscopy. Science, 1990. 248(4951): p. 73-6.
38. Schwille, P. and E. Haustein, Fluorescence Correlation Spectroscopy: An Introduction to its Concepts and Applications, in The Biophysics Textbook Online. 2002, www.biophysics.org/education/schwille.pdf.
39. Krichevsky, O. and G. Bonnet, Fluorescence correlation spectroscopy: the technique and its applications. Rep. Prog. Phys., 2002. 65: p. 251-97.
40. Gradinaru, C.C., et al., Fluorescence anisotropy: from single molecules to live cells. Analyst, 2010. 135(3): p. 452-9.
41. Roder, H., K. Maki, and H. Cheng, Early events in protein folding explored by rapid mixing methods. Chem Rev, 2006. 106(5): p. 1836-61.
42. Yang, L.L., et al., Observation of protein folding/unfolding dynamics of ubiquitin trapped in agarose gel by single-molecule FRET. Eur Biophys J, 2012. 41(2): p. 189-98.
43. Hawe, A., M. Sutter, and W. Jiskoot, Extrinsic fluorescent dyes as tools for protein characterization. Pharm Res, 2008. 25(7): p. 1487-99.
44. Chudakov, D.M., et al., Fluorescent proteins and their applications in imaging living cells and tissues. Physiol Rev, 2010. 90(3): p. 1103-63.
45. Prasher, D.C., et al., Primary structure of the Aequorea victoria green-fluorescent protein. Gene, 1992. 111(2): p. 229-33.
46. Miyawaki, A., A. Sawano, and T. Kogure, Lighting up cells: labelling proteins with fluorophores. Nat Cell Biol, 2003. Suppl: p. S1-7.
47. Chalfie, M., et al., Green fluorescent protein as a marker for gene expression. Science, 1994. 263(5148): p. 802-5.
48. Pranada, A.L., et al., Real time analysis of STAT3 nucleocytoplasmic shuttling. J Biol Chem, 2004. 279(15): p. 15114-23.
49. Liu, B., A. Mazouchi, and C.C. Gradinaru, Trapping single molecules in liposomes: surface interactions and freeze-thaw effects. J Phys Chem B, 2010. 114(46): p. 15191-8.
50. Han, J. and K. Burgess, Fluorescent indicators for intracellular pH. Chem Rev, 2010. 110(5): p. 2709-28.
51. Mohr, G.J. and O.S. Wolfbeis, Optical sensing of anions via polarity-sensitive dyes: a bulk sensor membrane for nitrate. Analytica Chimica Acta, 1995. 316: p. 239-246.
52. Betzig, E. and R.J. Chichester, Single molecules observed by near-field scanning optical microscopy. Science, 1993. 262(5138): p. 1422-5.
53. Ritort, F., Single-molecule experiments in biological physics: methods and applications. J Phys Condens Matter, 2006. 18(32): p. R531-83.
54. Borgia, A., P.M. Williams, and J. Clarke, Single-molecule studies of protein folding. Annu Rev Biochem, 2008. 77: p. 101-25.
55. Selvin, P.R. and T. Ha, Single Molecule Techniques: A Laboratory Manual. Cold Spring Harbor Laboratory Press. Vol. 2. 2008, New York: CSHL Press. 6.
56. Ashkin, A., et al., Observation of a single-beam gradient force optical trap for dielectric particles. Opt Lett, 1986. 11(5): p. 288.
57. Lipman, E.A., et al., Single-molecule measurement of protein folding kinetics. Science, 2003. 301(5637): p. 1233-5.
58. Zhuang, X. and M. Rief, Single-molecule folding. Curr Opin Struct Biol, 2003. 13(1): p. 88-97.
59. Plakhotnik, T., E.A. Donley, and U.P. Wild, Single-molecule spectroscopy. Annu Rev Phys Chem, 1997. 48: p. 181-212.
60. Mittag, T., et al., Structure/function implications in a dynamic complex of the intrinsically disordered Sic1 with the Cdc4 subunit of an SCF ubiquitin ligase. Structure, 2010. 18(4): p. 494-506.
61. Ferreon, A.C., et al., Single-molecule fluorescence studies of intrinsically disordered proteins. Methods Enzymol, 2010. 472: p. 179-204.
62. Liu, B., et al., A Photostable, pH-Invariant Derivative for Single-Molecule Microscopy. J. Fluoresc., 2009. 19(5): p. 915-920.
63. Mazouchi, A., et al., On the performance of bioanalytical fluorescence correlation spectroscopy measurements in a multiparameter photon-counting microscope. Anal Chim Acta, 2011. 688(1): p. 61-9.
64. Axelrod, D., Total internal reflection fluorescence microscopy in cell biology. Traffic, 2001. 2(11): p. 764-74.
65. Roy, R., S. Hohng, and T. Ha, A practical guide to single-molecule FRET. Nat Methods, 2008. 5(6): p. 507-16.
66. Badali, D. and C.C. Gradinaru, The effect of Brownian motion of fluorescent probes on measuring nanoscale distances by Forster resonance energy transfer. J Chem Phys, 2011. 134(22): p. 225102.
67. Dale, R.E., J. Eisinger, and W.E. Blumberg, The orientational freedom of molecular probes. The orientation factor in intramolecular energy transfer. Biophys J, 1979. 26(2): p. 161-93.
68. Bopp, K.P. and W. Magdeburg, [Possibilities and limitations of cortisone therapy in internal diseases]. Z Allgemeinmed, 1972. 48(2): p. 43-53.
69. Magde, D., E.L. Elson, and W.W. Webb, Fluorescence correlation spectroscopy. II. An experimental realization. Biopolymers, 1974. 13(1): p. 29-61.
70. Wohland, T., R. Rigler, and H. Vogel, The standard deviation in fluorescence correlation spectroscopy. Biophys J, 2001. 80(6): p. 2987-99.
71. Chen, H., E.R. Farkas, and W.W. Webb, Chapter 1: In vivo applications of fluorescence correlation spectroscopy. Methods Cell Biol, 2008. 89: p. 3-35.
72. Price, E.S., M. Aleksiejew, and C.K. Johnson, FRET-FCS detection of intralobe dynamics in calmodulin. J Phys Chem B, 2011. 115(29): p. 9320-6.
73. Ferreon, A.C., et al., Interplay of alpha-synuclein binding and conformational switching probed by single-molecule fluorescence. Proc Natl Acad Sci U S A, 2009. 106(14): p. 5645-50.
74. Koshioka, M., K. Sasaki, and H. Masuhara, Time-Dependent Fluorescence Depolarization Analysis in Three-Dimensional Microspectroscopy. Applied Spectroscopy, 1995. 49: p. 224-228.
75. Nasir, M.S. and M.E. Jolley, Fluorescence polarization: an analytical tool for immunoassay and drug discovery. Comb Chem High Throughput Screen, 1999. 2(4): p. 177-90.
76. Schroder, G.F., U. Alexiev, and H. Grubmuller, Simulation of fluorescence anisotropy experiments: probing protein dynamics. Biophys J, 2005. 89(6): p. 3757-70.

Single Molecule Fluorescence 35

Detection and Analysis of Single-Molecule Fluorescence


This chapter describes the experimental setups used in the Gradinaru lab to measure single-molecule fluorescence (SMF) signals, the custom-built data acquisition and analysis software, and the commonly used experimental procedures. Sections 2.1 and 2.2 describe two optical instruments that were designed and built entirely in our lab (and to which I contributed significantly)1, 2: a Confocal Laser Scanning Microscope (CLSM) and a Total Internal Reflection Fluorescence microscope (TIRFM). Section 2.3 describes how the different hardware components are synchronized with each other and controlled by LabView code that I wrote. This section also covers the single-molecule measurement modalities that I developed, as well as the data acquisition and analysis methods that I designed and implemented within a user-friendly interface. Section 2.4 describes commonly used experimental protocols in single-molecule biophysics research, such as protein labeling and purification, microscope coverslip cleaning and surface modification, as well as the final sample preparation procedures for observing single molecules on the fluorescence microscopes.

2.1. Confocal Laser Scanning Microscope (CLSM)

2.1.1. Optical Layout of the Microscope

The schematic optical layout of our custom-built confocal system, which enables multiparameter spectroscopic measurements of single molecules as well as multiple imaging modalities3, is shown in Figure 2.1-1. A Millennia Pro laser head (Spectra Physics) outputs up to 15 W of continuous-wave (CW) power at 532 nm to pump a Ti:sapphire laser (Tsunami HP, Spectra Physics). The Tsunami laser outputs nearly Fourier-transform-limited femtosecond pulses, with a center wavelength tunable across a wide spectrum, from 700 nm to 1100 nm. These near-infrared pulses are frequency-doubled by a nonlinear β-BBO crystal to produce light pulses in the visible spectrum. The time interval between the pulses is around 12.5 ns; for applications requiring a longer relaxation of the molecules between excitation events, a pulse picker (Conoptics, USA) is used to increase the inter-pulse interval by up to 100-fold.

The most commonly used fluorophores in single-molecule research are excited either by blue (~480 nm) or by green (~530 nm) light4-6; therefore the most frequent Tsunami outputs in our setup are centered at 960 nm and 1055 nm, respectively. To make it easy to switch between the two, we designed two parallel paths, one for 960 nm and the other for 1055 nm. In Figure 2.1-1, the infrared beam is divided and sent to two independent paths for Second-Harmonic Generation (SHG) to provide blue (480 nm) and green (528 nm) excitation beams, respectively. A rotating half-wave plate (P1) followed by a polarization cube (PC1) is used to control the amount of infrared input power for each frequency-doubling path. This arrangement makes the optical alignment more reproducible when switching the excitation colors for our samples. For each path, the infrared (fundamental) beam has to be focused in the nonlinear crystal (denoted SHG in Figure 2.1-1) to reach a high doubling efficiency (~10% at ~300 mW input power). The generated visible beam is diverging and astigmatic; therefore we use a pair of cylindrical lenses (CL1 & CL2 for green, CL3 & CL4 for blue) to circularize and (approximately) collimate it. Blue glass filters (BG-39, Edmund Optics, USA) are then inserted to block the fundamental wavelength in the output beam. Finally, a “blue-reflective” dichroic mirror (DM1, 47267, Edmund Optics, USA) is used to collinearly combine the green and blue light beams.

The excitation beam is spatially filtered through a 20 μm pinhole (Pin1, Thorlabs, Canada), which is placed at the focus of a 1:3 telescope (L1 and L2). The pinhole creates a circular, Gaussian-like mode of the excitation beam. At the same time, the pinhole is the common “point” for all laser sources used for excitation, and we use the transmission ratio through the pinhole as an indicator of alignment robustness. The telescope also magnifies the diameter of the excitation beam to 8-10 mm, in order to overfill the back aperture of the microscope objective. After the telescope, a set of optical density (OD) filters and a mechanical shutter (S) are installed. Twelve OD filters on a double-wheel mount are used to selectively and finely tune the excitation intensity over 7 orders of magnitude, as needed for each sample. For instance, diffusing and immobilized single molecules require different levels of excitation, and both differ from those used in confocal imaging applications. The mechanical shutter blocks the excitation beams in between measurements to avoid photobleaching of the sample. It also protects the sensitive optical detectors by shutting the excitation beam off when the signal reaches a set level.
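The numbers quoted for the excitation path follow from simple relations, which the sketch below collects: frequency doubling halves the wavelength, the repetition rate is the inverse of the pulse interval, and stacked OD filters attenuate by ten to the power of minus their summed optical density. The particular OD combination (3 + 4) is a hypothetical example; the helper names are illustrative.

```python
def shg_wavelength_nm(fundamental_nm: float) -> float:
    """Second-harmonic generation doubles the frequency, i.e. halves the wavelength."""
    return fundamental_nm / 2.0


def repetition_rate_mhz(pulse_interval_ns: float) -> float:
    """Repetition rate in MHz for a given interval between pulses in ns."""
    return 1.0e3 / pulse_interval_ns


def od_transmission(*optical_densities: float) -> float:
    """Transmission of stacked neutral-density filters: T = 10^(-sum of ODs)."""
    return 10.0 ** (-sum(optical_densities))


for fundamental in (960.0, 1055.0):
    print(f"{fundamental:.0f} nm -> {shg_wavelength_nm(fundamental):.1f} nm after SHG")

print(f"repetition rate: {repetition_rate_mhz(12.5):.0f} MHz")
print(f"with 100-fold pulse picking: {repetition_rate_mhz(12.5 * 100):.1f} MHz")
print(f"SHG output at ~10% efficiency and 300 mW input: {0.10 * 300:.0f} mW")
print(f"transmission of OD 3 + OD 4 (hypothetical combination): {od_transmission(3, 4):.0e}")
```

Two filter wheels in series reach the quoted seven orders of magnitude because optical densities add while transmissions multiply.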

Figure 2.1-1: Optical layout of the CLSM setup used in this thesis. M7 is a mirror used to direct excitation beams from other (external) laser sources into the confocal microscope. The bottom right image shows the multiparameter information (smFRET trajectory as an example) obtained from the sample.

Section 2.4 will describe in more detail the selection of dichroic, long-pass and band-pass filters for optimal SMF measurements. The excitation dichroic mirror (DM2, LPD01-532RS or FF560-Di01, Semrock, USA) reflects the excitation beam, incident at 45˚, vertically through the microscope objective. The 1.4 NA/100X Plan-Apochromat objective (420790-9900, Carl Zeiss, Canada) focuses the excitation to a diffraction-limited spot on the coverslip-mounted sample. The objective is infinity-corrected, so that the fluorescence signal collected from the focal spot propagates along the optical axis of the microscope as a (quasi)parallel bundle of rays. This feature is advantageous for detection path design and alignment. For instance, it makes installing the emission pinhole (Pin2, 50 or 75 μm depending on application, Thorlabs, Canada) rather simple; we placed it at the focus of the 1:1 telescope (L3 and L4) in the emission path. The epi-detected fluorescence is thus filtered spatially by the pinhole, which removes out-of-focus fluorescence, and spectrally by long-pass and band-pass filters (HQ600/80, D705/80 or HQ535/50, Semrock, USA), which remove Rayleigh and Raman scattering.

In a typical CLSM, either the laser beam or the sample needs to be scanned during data acquisition. We chose to mount the sample on a piezo scanning stage and keep the laser beam fixed, since single-molecule experiments do not need ultra-fast scans. The three-dimensional translational piezo scanner (T225, MadCity Labs, USA) has a scanning range of 200 μm in the XY-plane and 50 μm along the Z-axis, with nanometer resolution. Normally the sample is first raster-scanned in the XY-plane, and then the molecules of interest are moved into the focus of the excitation beam for long-time observation. Section 2.3 will describe in detail how the scanner works and how it is controlled and integrated within the data acquisition software.

The fluorescence is split into two or more separate pathways as needed, using dichroic mirrors (DM3, DCXR640, Chroma or FF560, Semrock) and/or polarization cubes (PC2 and PC3). The dichroic mirrors and polarization cubes are all mounted on interchangeable magnetic bases, which make the alignment robust, reproducible and easy to change. Each resulting beam is focused onto a single-photon avalanche diode (SPAD) (PD5CTC, MPD, Italy). These detectors are designed for time-resolved single-photon counting applications, with an instrument response function of ~48 ps (FWHM, NIM output). Their high detection efficiency (~52% peak at 550 nm) and low dark count (< 20 cps) make them ideal for low-signal detection from single molecules. For Förster resonance energy transfer (FRET) experiments (Chapter 5), a red-sensitive SPAD (COUNT, Laser Components, USA) replaced one of the MPD detectors in the acceptor channel. Although it has a poorer timing response, the COUNT detector has an improved photon detection efficiency in the 600-1000 nm range (~73% peak at 660 nm), which optimizes the detection of (red) fluorescence from acceptor dyes in smFRET.
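The raster scan that precedes single-molecule observation can be sketched as a list of stage positions visited row by row. The function below is a hypothetical illustration only: it is not the LabView implementation described in Section 2.3, nor the scanner manufacturer's interface, and the serpentine option and the 200 μm range check are assumptions added for the example.

```python
from typing import List, Tuple


def raster_scan_positions(x_range_um: float, y_range_um: float, n_x: int, n_y: int,
                          serpentine: bool = False,
                          max_range_um: float = 200.0) -> List[Tuple[float, float]]:
    """Return (x, y) stage positions in micrometers for an XY raster scan.

    Rows are scanned along x; with serpentine=True every second row is reversed
    so that the stage does not fly back to x = 0 between rows.
    """
    if n_x < 2 or n_y < 2:
        raise ValueError("need at least two points per axis")
    if x_range_um > max_range_um or y_range_um > max_range_um:
        raise ValueError("requested range exceeds the travel of the scanner")
    step_x = x_range_um / (n_x - 1)
    step_y = y_range_um / (n_y - 1)
    positions = []
    for row in range(n_y):
        cols = range(n_x)
        if serpentine and row % 2 == 1:
            cols = reversed(cols)
        for col in cols:
            positions.append((col * step_x, row * step_y))
    return positions


scan = raster_scan_positions(200.0, 200.0, 5, 5)
print(f"{len(scan)} pixels, first {scan[0]}, last {scan[-1]}")
```

After such a scan, the coordinates of bright pixels would be used to move individual molecules into the laser focus for extended observation.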

2.1.2. Signal Detection: The Photon Counting Apparatus

PicoHarp 300 (PicoQuant, Germany) is a stand-alone single-photon counting apparatus7 with two independent input channels, each with a time resolution of up to 4 ps. Its USB interface makes it easy to use and capable of high data throughput. The manufacturer provided a library of basic LabView VIs, which I incorporated into a data acquisition program that controls all the hardware components of the confocal setup.

Figure 2.1-2: PicoHarp 300 and router connections used in the confocal setup.

PicoHarp 300 has three working modes, two of which are routinely used in our experiments and are described here: the Time-tagged (TTR) mode and the Time-tagged Time-resolved (TTTR) mode7. The TTR mode uses a built-in clock to register the received TTL/NIM pulses corresponding to individual photons. In this mode, both input channels can be used as signal channels. The TTTR mode requires an external synchronization signal to register the received photon events. The (constant) frequency of the external synchronization pulses is used to calculate the arrival time of each photon. This mode may seem disadvantageous compared to the TTR mode, since no external synchronization is perfectly constant. However, TTTR is best suited for fluorescence lifetime measurements. Using the excitation pulses as the reference signal, PicoHarp 300 can perform time-correlated single photon counting (TCSPC) with 4 ps resolution, by measuring the delay time between each arriving photon and the previous excitation pulse. For the femtosecond laser source used in the Gradinaru lab (Tsunami HP, Spectra Physics), the typical variation of the period between the pulses is < 0.001% within a typical measurement time. This makes high accuracy in time-tagging possible.

PicoHarp 300 has only two independent channels, one of which is reserved for the reference (synchronization) signal in the TTTR mode. This is limiting for multiparameter measurements, in which at least two fluorescence channels need to be measured simultaneously. In addition, PicoHarp 300 by itself offers no option for synchronization with other hardware. To accommodate these needs, a router module (PHR800, PicoQuant, Germany) is used to extend the number of accessible channels8. PHR800 has four input channels that accept fluorescence signals from four detectors. Unlike PicoHarp 300, which only accepts NIM (negative) pulses, each router channel has an independently adjustable input polarity and trigger threshold. The router also has four "marker" channels, which can receive synchronization signals from other hardware, such as the piezo stage controller. However, it should be noted that the four input channels of the router are not fully independent for photon counting, as they share the same time-to-digital converter. This implies that the dead time of the router limits the auto- and cross-correlation analysis of these channels to delays longer than ~80 ns.

2.2. Total Internal Reflection Fluorescence Microscope (TIRFM)

The optical layout of our custom built, inverted TIRFM2 is shown in Figure 2.2-1. It is similar to that of CLSM, but with some differences:

Figure 2.2-1: Optical layout of the TIRFM custom-built in the Gradinaru Lab. Three highly reflective mirrors (M1-M3) and two dichroic mirrors (DM1-DM2) are used to co-linearly align beams from three CW lasers. A focusing lens (L3) is used to focus the excitation beam on the back focal plane of the objective. Fluorescence collected by the same objective is captured by an EMCCD camera.

(a). In CLSM, the excitation beam is collimated before entering the objective to create a tightly focused spot on the sample. In TIRFM, it has to be pre-focused on the back focal plane of the objective to create a quasi-collimated illumination beam on the sample (Figure 2.2-2). In other words, the confocal microscope illuminates and probes a (diffraction-limited) spot on the sample, while the TIRFM images a large area of the sample.

Figure 2.2-2: Depending on the incidence angle of the excitation beam on the sample, the same microscope setup can be operated: (A) in the wide-field mode, or (B) in the total-internal-reflection mode.

(b). The microscope we built can work in two different imaging modes: wide-field (WF) and total internal reflection (TIR) mode (Figure 2.2-2). In the WF mode, the illumination beam enters the objective at the center and illuminates the sample with a large penetration depth. WF mode is commonly used for imaging large objects (> 0.5 μm)2, and for single-molecule tracking in cell membranes far from the coverslip surface9. In the TIR mode, the illumination beam is totally reflected at the coverslip-sample interface, and it creates an evanescent field in the sample with a typical depth on the order of 100 nm10. Therefore, the TIR mode is well suited for near-surface imaging studies11, such as those of immobilized proteins, supported lipid bilayers and cell membranes. Due to the small illumination volume in the sample space, the background signal is minimized in TIR mode and a high signal-to-noise ratio (SNR) is achieved. TIR mode is the most popular method for high-throughput SMF spectroscopy studies.

(c). A highly sensitive CCD camera is used instead of a SPAD detector for collecting the fluorescence signal (Figure 2.2-1). The laser illuminates an area of the sample, and the CCD consists of a large array of detectors that can preserve that spatial information. The detection area of the SPAD (~ 50 μm) is too small to capture the entire image, as the size of the fluorescence image after L4 is of the order of mm. More importantly, the SPAD is not pixelated, so even if we managed to focus the fluorescence beam on it, it would integrate the signal from the entire area detected at a given time and wash out the features of the illuminated area. The camera, while capable of capturing images of the entire illuminated area, does so by integrating signals over relatively long (10-100 ms) exposure times and it does not preserve the timing information on a single-photon basis.

Diode or diode-pumped solid-state lasers of different wavelengths (473 nm, 532 nm and 635 nm) were co-aligned and combined before an acousto-optical tunable filter (AOTF, Gooch and Housego, USA). The AOTF is used to simultaneously select the wavelength and modulate the intensity of the combined laser beams. After the AOTF, the selected laser lines are directed into the 1.45 NA/60x oil objective (Olympus, USA) to illuminate the sample. As discussed in section 1.3.5, a high NA objective is necessary for TIR to be feasible at the coverslip-sample interface10. Emitted fluorescence was collected by the same objective and the laser scattering was eliminated by a dichroic mirror (DM3, FF495-Di02, Semrock, USA), followed by extensive filtering using a combination of long-pass (LP-488-RS, Semrock, USA) and bandpass (HQ520/66, Chroma, USA) filters. Images were captured with a cooled electron-multiplying charge-coupled device (EMCCD) camera (DU-897BV, Andor, USA) with typical exposure times of 50-100 ms. The camera chip has 512 x 512 pixels and it features a very high photon detection efficiency in the visible region (~90%).

To avoid excessive photobleaching of the samples, the AOTF was synchronized with the camera using LabVIEW. The sample was illuminated strictly during the exposure time of each acquisition, so that no light reached the sample during the readout of each image in a kinetic series or in between different acquisitions.

The fluorescence signals can also be split into two channels as needed, by color or polarization, similar to the CLSM setup. The two fluorescence signals are captured on the same camera, by sending them to different, non-overlapping areas on the camera chip. The lens in front of the camera (L4) has either a 40 cm or a 15 cm focal length, depending on the required magnification. For immobilized single molecule experiments, the 15 cm focusing lens is commonly used in order to optimize the SNR. For samples containing large objects or for SM tracking, the 40 cm focusing lens is used.
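The two TIR numbers quoted in this section (a high-NA objective being required, and an evanescent depth on the order of 100 nm) can be checked with a short sketch. It uses the standard Snell's-law critical angle and the standard 1/e intensity depth of an evanescent wave. The refractive indices (1.518 for glass/oil, 1.33 for an aqueous sample) and the 70° incidence angle are assumptions chosen for illustration; they are not values taken from the setup.

```python
import math

def critical_angle_deg(n_glass, n_sample):
    """Critical angle for total internal reflection at the glass-sample interface."""
    return math.degrees(math.asin(n_sample / n_glass))

def max_objective_angle_deg(na, n_immersion):
    """Largest illumination angle the objective can deliver: NA = n * sin(theta_max)."""
    return math.degrees(math.asin(na / n_immersion))

def penetration_depth_nm(wavelength_nm, n_glass, n_sample, angle_deg):
    """1/e intensity depth of the evanescent field for an incidence angle above critical."""
    s = (n_glass * math.sin(math.radians(angle_deg))) ** 2 - n_sample ** 2
    if s <= 0:
        raise ValueError("angle is below the critical angle: no total internal reflection")
    return wavelength_nm / (4.0 * math.pi * math.sqrt(s))

# Assumed refractive indices (illustrative): glass/oil 1.518, aqueous sample 1.33
N_GLASS, N_SAMPLE = 1.518, 1.33
theta_c = critical_angle_deg(N_GLASS, N_SAMPLE)
theta_max = max_objective_angle_deg(1.45, N_GLASS)   # 1.45 NA objective from the text
depth = penetration_depth_nm(532.0, N_GLASS, N_SAMPLE, 70.0)  # assumed 70 degree incidence
print(f"critical angle {theta_c:.1f} deg, objective limit {theta_max:.1f} deg, depth {depth:.0f} nm")
```

TIR is only reachable when the objective limit exceeds the critical angle, which is equivalent to requiring that the NA be larger than the refractive index of the sample.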

2.3. Data Acquisition and Analysis Software

2.3.1. Hardware Connection Flow Diagram

The data acquisition software is programmed in LabVIEW 8.2 on a personal computer in the Gradinaru lab. Figure 2.3-1 shows the control flow diagram of the confocal microscope. The PicoHarp 300 and the Nano-stage controller are the two main instruments in this microscope. In the software, the PC communicates with the piezo stage controller and the PicoHarp via their high speed USB interfaces (double-headed arrows), for operations such as moving and scanning the stage, setting output ISS (image scan synchronization) clock levels, setting PicoHarp and router running modes, etc. The stage controller synchronizes with the PicoHarp by using one of its ISS clock outputs as a synchronization signal. The stage controller has four ISS clock outputs, named pixel, line, frame and aux. They are independently set to different levels, or associated with a read/write operation on a specified axis. The "line clock" is related, for instance, to scanning along the fast (x) axis: triggering a scan along the x-axis sends a pulse at the beginning and at the end of this line scan (marker pulses). The two marker pulses are recorded by the PicoHarp router and are used in the data analysis software to build correct images by providing start and stop time tags for each line scan. The "frame clock" is used to control an external mechanical shutter by simply setting the clock to TTL high or TTL low. Similarly, the "aux clock" is used to enable or disable the SPAD detectors by setting the clock to TTL high or TTL low.
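The role of the line-clock markers in image building can be illustrated with a minimal sketch. It is not the LabVIEW implementation used in the lab. It assumes that the photon time tags and the start/stop marker times are already available as arrays in the same time units, and that the stage moves at constant speed within a line, so that each line can be divided into equal-time pixels. The function name and the example data are hypothetical.

```python
import numpy as np

def build_image(photon_times, line_starts, line_stops, pixels_per_line):
    """Bin photon time tags into an image, one row per (start, stop) marker pair."""
    photon_times = np.asarray(photon_times, dtype=float)
    image = np.zeros((len(line_starts), pixels_per_line), dtype=int)
    for row, (t0, t1) in enumerate(zip(line_starts, line_stops)):
        # Keep only photons detected while this line was being scanned
        in_line = photon_times[(photon_times >= t0) & (photon_times < t1)]
        # Assumption: constant scan speed, so pixels are equal time slices of the line
        edges = np.linspace(t0, t1, pixels_per_line + 1)
        image[row], _ = np.histogram(in_line, bins=edges)
    return image

# Hypothetical data: two lines of two pixels each; the photon at t = 5.0 falls outside any line
image = build_image([0.1, 0.6, 1.2, 1.3, 1.9, 5.0], [0.0, 1.0], [1.0, 2.0], 2)
print(image)
```

Photons arriving between the stop marker of one line and the start marker of the next (for example during the stage fly-back) are discarded, which is the reason both markers are needed.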

Figure 2.3-1: Flow diagram of apparatus used in the laser scanning confocal microscope. The arrows indicate the direction of data flow (solid arrows), commands (dotted arrows), and communications.

The PicoHarp (and router) is a photon counting apparatus, and functions by receiving pulses from four SPAD detectors corresponding to single photons. For each photon, the PicoHarp registers three types of information: channel index (4 bits: values 0-3), delay to reference pulse (12 bits: values 0-4095) and arrival time (16 bits: values 0-65535). Therefore, for each fluorescence photon, the PicoHarp creates a 32-bit data point and stores it in a binary file. The marker pulses are registered in a similar way, with their channel index fixed at 14.
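The 32-bit record described above can be unpacked with a few bit operations. The sketch below assumes a particular bit order (channel in the 4 most significant bits, then the 12-bit delay, then the 16-bit arrival time in the least significant bits). The text gives the field widths but not their order, so this layout should be checked against the instrument documentation before use. The helper `pack_record` exists only to generate test records, and wrap-around of the 16-bit arrival counter is not handled.

```python
import numpy as np

MARKER_CHANNEL = 14  # channel index of marker records, as quoted in the text

def decode_records(records):
    """Split 32-bit records into (channel, delay, arrival) arrays.

    Assumed layout, most significant bit first: 4-bit channel, 12-bit delay, 16-bit arrival.
    """
    records = np.asarray(records, dtype=np.uint32)
    channel = (records >> np.uint32(28)) & np.uint32(0xF)
    delay = (records >> np.uint32(16)) & np.uint32(0xFFF)
    arrival = records & np.uint32(0xFFFF)
    return channel, delay, arrival

def pack_record(channel, delay, arrival):
    """Hypothetical helper: build a record with the same assumed layout."""
    return (channel << 28) | (delay << 16) | arrival

records = [pack_record(2, 1000, 40000), pack_record(MARKER_CHANNEL, 0, 40010)]
channel, delay, arrival = decode_records(records)
is_marker = channel == MARKER_CHANNEL
print(channel, delay, arrival, is_marker)
```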

2.3.2. Data Acquisition Software

The data acquisition software is programmed in LabVIEW 8.2, with the graphical user interface shown in Figure 2.3-2. The software can run experiments in two major modes (selected via a “stationary” dropdown box): a scanning mode and a stationary mode. The scanning mode is designed for confocal imaging of microscopic samples. The stationary mode is for collecting fluorescence data at a fixed stage position. Other functions, including lifetime fitting, correlation analysis, molecule identification and batch data collection, were also designed to facilitate different applications.

In the scanning mode, the stage controller first loads the information about the scanning area (red square box in Figure 2.3-2) into the stage memory, including the starting coordinates and the size of the scanning area, the pixel size and dwell time, etc. When the PicoHarp is ready for data acquisition, the software commands the stage controller to start scanning. After each scan, the saved binary file is decoded and an image is built and displayed in the intensity graph in the lower right panel. In Figure 2.3-2, a 5 μm × 5 μm area of spin-coated fluorescent beads is shown as an example. A click on the graph displays the intensity of the spot in “I (KHz)”, and clicking “Move” then brings that spot into the laser beam for stationary fluorescence acquisition. The “Zoom In” and “Zoom Out” buttons enable the user to focus on details of the acquired image. Lifetime information from the area of interest (surrounded by the green and yellow cursor lines) can also be displayed in the bottom left panel by double-clicking.

The stationary mode is used to continuously collect the fluorescence photons from a spot of interest in the sample. This could be a single molecule identified upon sample scanning, a fluorescent bead (as in Figure 2.3-2), or a randomly selected diffraction-limited volume inside the solution when doing FCS measurements. An intensity-time trajectory is built and displayed after each acquisition. The lifetime curve can be built from the region of interest, delimited by two vertical cursor lines.

Figure 2.3-2: The user interface of the data acquisition software. An “offline” version is available for data analysis without the need of hardware communication.

The acquisition software has a few additional features for other requirements. The “Z-scan” enables sample sectioning with a controllable Z-step, number of sections and lead time between sections. The images are built in a stack, and are either automatically saved or displayed in the intensity graph after scanning. The “online trajectory viewer (OT)” and “online correlation (AC)” build real-time intensity traces and correlation curves during data acquisition. The “auto-collect” feature (checkbox next to “collecting…”) enables the software to collect fluorescence at a group of positions. This is particularly useful when doing SM immobilization experiments. It saves time and effort by enabling the software to automatically identify molecules (“Molecules”), collect fluorescence signals from them one by one until the signal from each molecule falls below the settable “smart value”, and save the data automatically.

An offline version is available for doing data analysis on saved data files. It has all the features except apparatus communication. It also includes some other features to build anisotropy-time and lifetime-time trajectories, histograms of the intensity, anisotropy or lifetime from the single-molecule trajectories, time-resolved anisotropy curves, etc. The results can also be exported into “.txt” files and processed by other software, such as Origin and Matlab.

2.4. Fluorophore and Optical Filter Selections

In order to select optimal fluorophores and optical filters when designing single-molecule experiments, I developed a user-friendly “SpectraView” application (Figure 2.4-1). This was inspired by a web application, Fluorescence SpectraViewer (Life Technologies, USA)12, which I used as a guide for the graphical user interface. The application is programmed in LabVIEW 8.2 and provides not only fluorescence spectra viewing, but also other features which are discussed below.

This application enables the user to view the excitation and emission spectra of up to five fluorophores simultaneously, which are selectable by drop-down boxes. Adding spectra of a new fluorophore is very simple. The measured/downloaded spectra of the fluorophore should be stored in two files named “ex_fluorophorename.txt” and “em_fluorophorename.txt”. Adding these files to the folder named “Fluorophore” will automatically add the new fluorophore to the drop-down box list. SpectraView can also be used to examine the transmission spectra of dichroic mirrors and filters (under section “Filters”). Adding spectra of the dichroic mirrors and filters is similar to adding spectra of fluorophores. There are two separate folders, named “Dichroic” and “Filter”, for storing these spectra files.

As a guide for optimal experiment design, the application provides a theoretical estimation of the signals detected in each channel when using a certain set of dichroic mirrors and fluorescence filters. The feature is called “Signal split” in the application interface, and the schematic of the possible positions of dichroic mirrors and filters in the detection path is shown in the top right corner. The example shown in the figure displays the calculated signal fractions in the “blue” and “red” channels if we select the 640DCXR dichroic (Chroma, USA) to split the TMR fluorescence. Placing the 640DCXR at the position “D2” corresponds to selecting it in the second drop-down box in the section “Dichroics”. In this configuration, we expect 88.7% of the signal to be detected in the blue channel and 11.3% in the red channel. In a real experiment, if we do not detect nearly 8 times more signal in the blue channel than in the red one, it means that the SPAD alignment is not optimized in at least one of the detection arms and it must be improved.
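The idea behind the “Signal split” estimate can be sketched as a weighted average of the dichroic transmission over the emission spectrum. The sketch below is an illustration of the principle rather than the SpectraView code. It assumes spectra sampled on a common, uniformly spaced wavelength grid, a lossless dichroic (everything not transmitted is reflected), and it ignores emission filters and detector efficiency. The Gaussian emission band and the sigmoidal long-pass edge are synthetic stand-ins, not measured TMR or 640DCXR spectra.

```python
import numpy as np

def signal_split(emission, transmission):
    """Return (reflected, transmitted) signal fractions for one dichroic mirror.

    Both arrays must be sampled on the same uniformly spaced wavelength grid.
    """
    emission = np.asarray(emission, dtype=float)
    transmission = np.asarray(transmission, dtype=float)
    transmitted = float((emission * transmission).sum() / emission.sum())
    return 1.0 - transmitted, transmitted

# Synthetic spectra (assumptions for illustration only)
wl = np.arange(500.0, 801.0, 1.0)                       # wavelength grid in nm
emission = np.exp(-0.5 * ((wl - 580.0) / 25.0) ** 2)    # donor-like emission band
transmission = 1.0 / (1.0 + np.exp(-(wl - 640.0) / 5.0))  # long-pass edge near 640 nm

blue, red = signal_split(emission, transmission)
print(f"blue (reflected) {blue:.3f}, red (transmitted) {red:.3f}")

# Expected channel ratio for the fractions quoted in the text
quoted_ratio = 0.887 / 0.113
print(f"quoted split corresponds to a blue/red ratio of {quoted_ratio:.2f}")
```

The last two lines reproduce the alignment check described above: a predicted split of 88.7% to 11.3% corresponds to a blue-to-red ratio of roughly 7.8.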

Figure 2.4-1: The front panel of the SpectraView application used for visualizing the spectra of fluorophores and optical filters, and for calculating parameters for experiment design: signal split, Förster radius, etc. The inset on the top right shows possible mounting places for optical filters and dichroic mirrors used in our confocal setup.

The SpectraView application is especially useful for FRET pair selection, Förster radius calculation, optimum filter set selection for donor-acceptor signal split, setup detection efficiency estimation, etc. For the selection of a fluorophore pair for FRET measurements, a few important considerations should be taken into account:

(a). the FRET efficiency (EFT) has a steep dependence on the inter-dye distance. In studies of protein conformations, external (pH, salt concentration, etc) or internal (phosphorylation, state transitions, etc) factors can lead to conformational fluctuations or dynamics. To probe these variations, the Förster radius (Ro) of the selected FRET fluorophore pair should be comparable to the size of the molecules of interest. The application has an option (named “Forster” in Figure 2.4-1) to calculate the Förster radius, Ro, of a pair of fluorophores based on the overlap between their excitation and emission spectra, the donor’s quantum yield, and the acceptor’s peak extinction coefficient. The Förster radius is calculated in nm; the calculated Ro for TMR and Atto647N is shown as an example.

(b). the donor and acceptor emission spectra should have minimal overlap, to limit the influence of signal crosstalk on the EFT calculation. Moreover, a significant overlap in the emission spectra generally comes with significant direct excitation of the acceptor, because most fluorophores have a similar Stokes shift. The interference of acceptor direct excitation is even more problematic than signal crosstalk, because it is rather difficult to distinguish it from conformations with high EFT.

(c). the selected fluorophores should not affect the biological system to be investigated. For the intrinsically disordered protein project (Chapter 5), since the IDP is highly positively charged, fluorophores with negative charges, such as most Alexa fluorophores (normally -2 or -3), may have a strong electrostatic attraction with the protein being studied. Therefore, fluorophores such as TMR and Atto dyes (neutral or +1) are preferable for this system.

(d). for single molecule observations, in order to increase the signal-to-noise ratio, the fluorophores selected need to have a high fluorescence quantum yield, be photostable and show little photoblinking behavior.

Dichroic mirror and fluorescence filter selections are relatively straightforward. The general rule of thumb is that they should maximize the transmission efficiency for the fluorophores chosen, and minimize the inherent background signals due to Rayleigh and water Raman scattering. The emission dichroic mirror should be able to separate the donor and acceptor signals and minimize signal crosstalk.
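The Förster radius calculation mentioned in point (a) can be sketched from the standard textbook expression, Ro [nm] = 0.0211 (κ² Q_D n⁻⁴ J)^(1/6), where J is the spectral overlap integral in M⁻¹ cm⁻¹ nm⁴. This is a minimal illustration and not the SpectraView implementation. The orientation factor κ² = 2/3, the refractive index n = 1.33, and the Gaussian spectra, quantum yield and extinction coefficient in the example are assumptions chosen for illustration; they are not measured values for TMR or Atto647N.

```python
import numpy as np

def forster_radius_nm(wavelength_nm, donor_emission, acceptor_absorption,
                      eps_peak, qy_donor, kappa2=2.0 / 3.0, n=1.33):
    """Forster radius in nm from spectra on a uniformly spaced wavelength grid.

    eps_peak is the acceptor peak extinction coefficient in M^-1 cm^-1;
    the acceptor absorption shape is rescaled so that its maximum equals eps_peak.
    """
    wl = np.asarray(wavelength_nm, dtype=float)
    fd = np.asarray(donor_emission, dtype=float)
    shape = np.asarray(acceptor_absorption, dtype=float)
    eps = eps_peak * shape / shape.max()
    # Overlap integral J in M^-1 cm^-1 nm^4 (donor emission normalized to unit area)
    j = np.sum(fd * eps * wl ** 4) / np.sum(fd)
    return 0.0211 * (kappa2 * qy_donor * j / n ** 4) ** (1.0 / 6.0)

# Synthetic example spectra (illustrative assumptions only)
wl = np.arange(450.0, 801.0, 1.0)
donor_em = np.exp(-0.5 * ((wl - 580.0) / 20.0) ** 2)
acceptor_abs = np.exp(-0.5 * ((wl - 645.0) / 20.0) ** 2)

r0 = forster_radius_nm(wl, donor_em, acceptor_abs, eps_peak=150000.0, qy_donor=0.5)
r0_double_qy = forster_radius_nm(wl, donor_em, acceptor_abs, eps_peak=150000.0, qy_donor=1.0)
print(f"R0 = {r0:.2f} nm")
```

Because of the sixth root, Ro is only weakly sensitive to the input parameters: doubling the donor quantum yield increases Ro by a factor of 2^(1/6), about 12%.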

2.5. Coverslip Cleaning and Surface Modifications

Microscope coverslips are commonly used substrates for the fluorescence microscope. However, for SMF measurements, the coverslip has to be thoroughly cleaned to remove any fluorescent impurities11, 13. The cleaned coverslip also has to be treated to either prevent nonspecific adsorption or to promote specific attachment of the biomolecules under investigation on the glass surface, for observations within an extended time window14-16. The protocols for coverslip cleaning and chemical surface treatment are discussed in this section.

2.5.1. Plasma Cleaning Protocol

Microscope coverslips are cleaned of surface fluorescent impurities using an Argon-plasma cleaner (PDC-32G, Harrick Plasma, USA). For some specific experiments that require Polyethylene glycol (PEG)-coated surfaces, the coverslips are first washed in ethanol/methanol, and then dried under nitrogen flow. The coverslips are then placed under vacuum in the plasma chamber to remove any remaining ethanol/methanol residues. The plasma chamber is then flushed with Argon, which is an inert gas and will not oxidize the coverslip supports. The plasma cleaner is run at low pressure (~ 200 mTorr) for 20 minutes to fully remove fluorescent impurities by the plasma-activated Argon atoms. For coverslips used for measurements in solution (burst or FCS), pre-cleaning in ethanol/methanol is usually not applied. The effectiveness of coverslip cleaning is tested on the confocal microscope, by scanning the surface under experimental conditions similar to those of single-molecule observations. The fluorescent impurities are usually less photostable and are also characterized by multi-component lifetimes. Good sample coverage for SMF measurements is of the order of 5-10 molecules in an area of 5 μm × 5 μm, and therefore coverslips containing fewer than one fluorescent impurity in an area of 10 μm × 10 μm are considered clean.

2.5.2. Surface Coating with BSA(-biotin)

For long time-resolved SMF measurements, the coverslips need to be functionalized for specific immobilization of the molecules of interest and to prevent nonspecific adsorption of other molecular species from the buffer solution. The classic protocol is to incubate the clean coverslips with concentrated Bovine Serum Albumin (BSA). The BSA molecules are sticky, and adsorption of BSA molecules on the hydrophilic glass surface creates a thin BSA layer. The negative charge of the carboxylic groups in BSA and the hydrophobic side chains exposed upon surface-induced denaturation of BSA prevent the adsorption of other molecules17. The procedure is very simple: BSA solution (1 mg/ml in 50 mM Tris at pH 7.4) is applied on clean coverslips and incubated for about 15 minutes. The excess BSA is removed by gently flowing the buffer solution over the coverslip for about 5 minutes. The BSA-treated coverslips are frequently used for FCS and SM burst measurements. For these, we also add 0.01% Tween-20 (P2287, Sigma-Aldrich, Canada) to the sample, to further reduce nonspecific adsorption of proteins from solution over extended periods of time18.

Treating coverslips with biotinylated BSA (BSA-biotin) is very similar. The only difference is that a small fraction of BSA-biotin (~1%) is added to the BSA solution prior to incubation. The BSA-biotin adsorbed on the coverslip provides a way to immobilize biomolecules via the well-known biotin-streptavidin interaction. The interaction of biotin with the homo-tetrameric protein streptavidin is one of the strongest non-covalent interactions found in nature (dissociation constant on the order of 4 × 10⁻¹⁴ M)19. Incubation with the streptavidin solution creates a layer of streptavidin on top of the surface BSA-biotins. Since each streptavidin has four binding sites, up to three sites remain open to accept other biotin molecules. Therefore, incubation with biotinylated biomolecules will create a layer of firmly immobilized biomolecules on the coverslip for SMF studies. The fraction of BSA-biotin, the amount of biotinylated biomolecules and their incubation time control the final density of immobilized biomolecules. Good sample coverage for single molecule identification is on the order of 5-10 molecules per 5 μm × 5 μm.

To summarize, the protocol for BSA/BSA-biotin covered coverslips is:

1). 50 μL of 1 mg/ml BSA-biotin is applied and incubated at room temperature for 15 min, followed by washing five times with the sample buffer.

2). 50 μL of 0.2 mg/ml streptavidin is applied on the coverslip and incubated at room temperature for 15 min, followed by washing five times with the sample buffer.

3). 75 μL of ~100 pM biotinylated biomolecule solution is applied on the coverslip and incubated at room temperature for 15 min, followed by washing five times with the sample buffer.

BSA, BSA-biotin and streptavidin are commercially available (Sigma-Aldrich, Canada). Biotinylated biomolecules are either available commercially or can be prepared in the lab via biotin-labeling, which is discussed in section 2.6.

2.5.3. PEG Surface Coating

PEG is a highly hydrophilic polymer widely used in material science. Its excellent resistance to nonspecific adsorption of biomolecules (proteins, DNA, lipids, etc) makes it a perfect candidate for coverslip treatment. Although PEG coating has advantages over BSA coating, the preparation of PEG/PEG-biotin coated coverslips is more complicated and time consuming. Detailed preparation protocols can be found in several publications6, 20, and here the procedure will only be briefly described.

Unlike BSA, which has a nonspecific (noncovalent) interaction with the surface, PEG molecules have to be specifically attached to the coverslip via a chemical reaction. Amino-activated coverslips have to be prepared prior to incubation with methoxy-terminated PEG. The amino-activation of glass is achieved by incubating plasma-cleaned coverslips in methanol, with 1% (v/v) aminopropylsilane and 5% (v/v) acetic acid, for 30 minutes. Then, the coverslips are dipped sequentially in MilliQ water and in methanol, and this sequence is repeated once, after which they are dried under nitrogen gas flow before reacting with methoxy-terminated PEG molecules. Methoxy-terminated PEG is dissolved in sodium bicarbonate buffer (0.1 M, pH 8.5), at a concentration of 250 mg/ml. A 1:50 mix of PEG-biotin:PEG is used instead when preparing biotinylated surfaces for specific surface attachment of biomolecules. 60 μL of this solution is sandwiched between two amino-activated coverslips and left in the dark for 3 hours. After incubation, the sandwiched coverslips are separated and dipped sequentially in MilliQ water, methanol, MilliQ water, and methanol. Then they are dried under nitrogen gas flow and are kept in a dry and dark place (vacuum desiccator covered with black table cloth) for storage.

For immobilization measurements, PEG-ylated coverslips are treated further with 0.2 mg/ml streptavidin and then incubated with the biotinylated sample at ~100 pM. The procedure is identical to that discussed in section 2.5.2 for BSA-coated coverslips.

2.6. Protein Labeling and Purification

To perform fluorescence studies on proteins, fluorescent markers (fluorophores, quantum dots and fluorescent proteins) have to be attached to the proteins of interest11, 21, because most proteins do not fluoresce in the visible spectrum. The development of bright and photostable fluorophores in the visible range gave an enormous impetus to fluorescence imaging and spectroscopy applications in biophysical research. In proteins, these fluorophores are attached to specific residues using various types of cross-linking reagents21. The two most common cross-linking chemistries are discussed below, as well as the protein labelling protocols used in the Gradinaru lab.

For protein labelling, it is always necessary to add excess labelling reagent (2-20 times), considering the relatively low coupling efficiency of reactive dyes to the host protein. Therefore, the labelled material has to be separated from a considerable amount of free dye by a procedure called protein purification. Size exclusion chromatography is commonly used for this task, taking advantage of the fact that proteins (molecular weight (MW) ~ 20-100 kDa) are much larger than the typical fluorophore dyes (MW ~ 0.5-1 kDa). The principle and the experimental details of the purification protocol will be described below.

2.6.1. Amine Group Reactive Cross-linking Chemistry

Due to their abundance and high accessibility in polypeptides, amine groups are frequently used as targets for labelling reactions21. A common and easy-to-use reagent is N-hydroxysuccinimide ester (NHS ester). Under slightly basic conditions (pH 7.5-8.5), NHS ester forms a stable amide bond with a primary amine group (Figure 2.6-1 A). Fluorophores containing an NHS ester tag are commercially available and suitable for protein labeling. The NHS-amine coupling reaction can be done in phosphate and borate buffers, with pH adjusted between 7.5 and 8.5. Note that primary amine buffers such as Tris are not suitable because they compete for the NHS reaction. Generally, a protein concentration of 10-100 μM in a volume of 50 μL is used for amine labeling. An NHS-ester-activated fluorophore is then added to the protein sample at a molar ratio of 2:1 or higher (depending on the desired degree of labeling). The reaction is facilitated by gentle shaking in the dark at room temperature. The reaction is normally complete within 3 hours, after which the protein needs to be purified.
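The amount of reactive dye to add follows directly from the protein amount and the chosen molar ratio. The short sketch below shows the unit bookkeeping. The 10 mM dye stock concentration in the example is an assumption for illustration and is not a value from the protocol, and the function name is hypothetical.

```python
def dye_volume_ul(protein_conc_um, protein_volume_ul, molar_ratio, dye_stock_mm):
    """Volume of dye stock (in uL) needed for a given dye:protein molar ratio."""
    # uM * uL gives pmol; divide by 1000 to get nmol
    protein_nmol = protein_conc_um * protein_volume_ul * 1e-3
    dye_nmol = molar_ratio * protein_nmol
    # A 1 mM stock contains 1 nmol per uL
    return dye_nmol / dye_stock_mm

# 50 uL of 50 uM protein, 2:1 dye:protein, assumed 10 mM dye stock
volume = dye_volume_ul(50.0, 50.0, 2.0, 10.0)
print(f"add {volume:.2f} uL of dye stock")
```

For these example values the protein amount is 2.5 nmol, so a 2:1 ratio requires 5 nmol of dye, i.e. 0.5 μL of a 10 mM stock.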

Figure 2.6-1: Coupling principle of amine-NHS and thiol-maleimide cross-linking chemistries between reactive dyes and proteins.

The amine-group-targeted labelling is usually efficient, due to the high abundance of amino groups (at N-terminus, Lys, Arg) and to their high accessibility (positively charged and usually outward-facing on the protein surface). However, the disadvantage is also clear: due to multiple targets, it is often difficult to achieve stoichiometric and site-specific labelling by this method. Therefore, this cross-linking chemistry is mostly used for applications where site-specificity is not necessary, such as imaging and FCS.

2.6.2. Cysteine Reactive Cross-linking Chemistry

The naturally low frequency of the cysteine (Cys) residue in proteins makes this amino acid the best target for site-specific protein labelling11. The maleimide group forms stable thioether bonds with the thiol group on Cys at neutral pH conditions (Figure 2.6-1 B). The reaction specificity of maleimide with Cys also recommends this method for site-specific and stoichiometric labelling. In the presence of oxygen, the thiol group on one Cys may form a disulfide bond with the thiol group on another Cys. The disulfide bonds have to be reduced to make the thiol groups available for coupling to fluorophore-maleimide derivatives. Either tris(2-carboxyethyl)phosphine (TCEP) or DL-Dithiothreitol (DTT) can be used to reduce the disulfide bonds. However, TCEP is preferable because DTT competes for the reaction with maleimide groups. If DTT is used, it has to be removed before adding the fluorophore-maleimide derivative. The reaction is also best carried out in oxygen-free conditions to avoid disulfide bond formation.

For a general protocol, a protein concentration of 10-100 μM in a volume of 50 μL is used in buffers such as Tris at pH 6.5-7.5. Disulfide bonds are reduced with TCEP at a 10:1 molar ratio to the protein. Fluorophore maleimide derivatives are then added at a ~5:1 molar ratio to the protein. Oxygen is removed by placing the sample in a vacuum desiccator flushed with argon gas. The reaction is facilitated by gentle vortexing at room temperature for 3 hours and then continued in the fridge at 4 °C overnight.

The advantage of Cys-reactive chemistry is clearly its site specificity. However, Cys labeling has a poorer efficiency than amine-labelling. Also, different Cys residues may have very different solvent accessibility, since this amino acid is neutral and may be buried inside the protein. It is recommended that the number of dyes per protein be measured after each labeling reaction (by optical methods or mass spectroscopy), so that the dye:protein ratio and the incubation time may be adjusted to obtain the desired results.
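For the optical route, the number of dyes per protein is commonly estimated from two absorbance readings using the Beer-Lambert law, with a correction for the dye's own absorbance at 280 nm. The sketch below shows this commonly used estimate and is not a protocol taken from this thesis. It assumes a 1 cm path length, and all numerical values in the example (absorbances, extinction coefficients, correction factor) are hypothetical.

```python
def degree_of_labeling(a_dye, a_280, eps_dye, eps_protein, cf_280):
    """Estimate dyes per protein from absorbance readings (1 cm path length assumed).

    a_dye:       absorbance at the dye absorption maximum
    a_280:       absorbance at 280 nm (protein plus dye contribution)
    eps_dye:     dye extinction coefficient at its maximum, M^-1 cm^-1
    eps_protein: protein extinction coefficient at 280 nm, M^-1 cm^-1
    cf_280:      dye correction factor, A280(dye) / Amax(dye)
    """
    dye_molar = a_dye / eps_dye
    protein_molar = (a_280 - cf_280 * a_dye) / eps_protein
    if protein_molar <= 0:
        raise ValueError("corrected A280 is not positive; check the inputs")
    return dye_molar / protein_molar

# Hypothetical readings: 5 uM dye and 5 uM protein give one dye per protein
dol = degree_of_labeling(a_dye=0.5, a_280=0.3, eps_dye=100000.0,
                         eps_protein=40000.0, cf_280=0.2)
print(f"degree of labeling = {dol:.2f}")
```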

2.6.3. Size-Exclusion Chromatography

Size exclusion chromatography (SEC) is a method to separate molecules of different sizes by eluting the solution through a column of packed porous particles22. Generally speaking, molecules smaller than the pore size can diffuse inside the porous particles and therefore have a longer elution path and elution time (Figure 2.6-2). Molecules larger than the pore size cannot diffuse inside the pores and have a shorter elution path and elution time. The elution time of the molecules thus depends critically on their size. For protein labeling applications, the free dyes are usually much smaller than the (labelled) proteins, and SEC is suitable for separating the two species. In the Gradinaru lab, commercially available chromatography columns (Micro Bio-spin P-30, BIO-RAD, USA) are used to purify proteins of MW above 10 kDa from the free dyes upon labelling. Protein purification through this column is performed by centrifugation after loading the protein sample onto the gel bed. Four minutes of centrifugation at 1300g is sufficient to elute proteins out of the column.

Figure 2.6-2: The principle of size exclusion chromatography. Molecules smaller than the pore size of the porous particles in the column have a longer path and elution time through the column. Larger molecules such as proteins have a shorter path and elute through the column first. The sizes of fluorophores (red) and proteins (green-brown structures) in the figure are not to scale.

2.6.4. Degree of Labeling Measured by Mass Spectroscopy

It is always good practice to determine the degree of labeling of the protein after labelling and purification, especially when the protein needs to be labelled with a pair of different dyes for FRET. In this case, not only the degree of labeling, but also the fraction of proteins with both donor and acceptor labels is important. In particular, donor-only proteins interfere with the interpretation of EFT measurements, by giving rise to zero-EFT values in the data. If the fraction of donor-only proteins could be measured precisely, then protocols could be designed to minimize it, or at least it could be subtracted from FRET measurements.

Optical absorption is usually employed to measure the degree of protein labeling in the case of a single dye label11. In the case of labelling proteins with two different dyes, mass spectroscopy (MS) is preferable to determine the extent and the heterogeneity of labeling. The key principles and applications of MS are discussed in detail in reference 23. In brief, the sample is first ionized by either gaining or losing protons. This is done at low pressure to prolong the lifetime of the ionized forms. The mass-to-charge ratio (m/z) is then measured inside a region containing a strong magnetic field. The abundance of species with different m/z values is finally measured using a mass analyzer. The data is analyzed to obtain the molecular weight information (value and abundance) for the different species present in the sample.

2.7. Single Molecule Fluorescence Spectroscopy

As its name implies, single molecule fluorescence (SMF) spectroscopy probes the properties of individual fluorophores, and this information can be used to characterize the biomolecules of interest to which these fluorophores are attached21. One of the most remarkable advantages of SMF spectroscopy is the ability to investigate heterogeneous subpopulations of the ensemble11, which are obscured by averaging in typical bulk measurements. Another important advantage is the ability to probe the entire dynamic sequence of a single molecule transitioning between different conformational states5, 24.

The principle of achieving conditions for SMF observation is simpler than it initially appears. First, the sample needs to be diluted until at most one molecule is present in the detection volume of the microscope at any time. At the same time, the excitation/observation volume has to be minimized by focusing the laser excitation beam through a high-NA objective, in order to minimize the background signal from the solvent. Sub-femtoliter observation volumes can be achieved using high-NA (~1.4) oil objectives. This ensures that a sample concentration in the (sub)nanomolar range satisfies the conditions for single-molecule observation.

In our lab, we use two approaches for performing in vitro SMF measurements. The first approach is to position the observation volume inside a solution of the sample prepared at sub-nanomolar concentrations. The sample is probed at equilibrium. Molecules diffuse freely in solution, and each time one diffuses into the detection volume, a fluorescence burst is detected. By sampling thousands of such fluorescence bursts, the full microscopic distribution of states of the system at equilibrium is obtained25. Single molecule FRET (smFRET) measurements performed in this way provide the distribution of conformations and give insight into the structural heterogeneity of the sample. Other SMF techniques, such as FCS, are used in this experimental configuration to measure the diffusion time and hence the hydrodynamic radius of the molecule or its interaction with other molecules26, or to measure the time scale of conformational fluctuations via the FRET-FCS method27, 28.

The second approach is to sparsely immobilize the molecules of interest on the coverslip to ensure that no more than one molecule resides in the sampling volume. The full sequence of dynamic events of each molecule can be probed on the millisecond-to-second time scale by moving it into the observation volume and exposing it to excitation continuously until photobleaching occurs. These data are typically analyzed with change-point20 or hidden Markov29 models to obtain the biologically relevant states of the system and the conversion rates between them.
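The dilution requirement for single-molecule observation can be checked with a quick estimate of how many molecules occupy the detection volume on average. The sketch below is only illustrative: the 1 fL volume is an assumed round number (the text states only that sub-femtoliter volumes are achievable), and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def mean_occupancy(concentration_molar, volume_liters):
    """Average number of molecules found in a volume at a given concentration."""
    return concentration_molar * AVOGADRO * volume_liters


if __name__ == "__main__":
    detection_volume = 1e-15  # 1 fL, assumed for illustration only
    for conc in (1e-8, 1e-9, 1e-10):
        n = mean_occupancy(conc, detection_volume)
        print(f"{conc:.0e} M -> {n:.3f} molecules on average")
```

Under these assumptions, a 1 nM solution places about 0.6 molecules in a 1 fL volume on average, and every tenfold dilution lowers the occupancy tenfold.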

2.7.1. smFRET Burst Measurements

In smFRET burst measurements, donor-acceptor-labeled biomolecules (protein, DNA, etc.) in a dilute solution diffuse freely in the buffer above the coverslip. Occasionally, one molecule passes through the detection volume and for a short period of time (ca. 100 μs) absorbs and emits light, resulting in a fluorescence burst. The fluorescence is split by an emission dichroic into donor and acceptor channels. The EFT of this burst can be determined from the following equation6:

$$E_{FT} = \frac{N_A}{N_A + \gamma N_D}$$

where ND and NA represent the number of photons detected in the donor and acceptor channels, respectively, for each burst, and γ represents a correction factor accounting for differences in detection efficiencies and quantum yields between donor and acceptor. The value of γ has to be determined beforehand in separate experiments. ND and NA also have to be corrected for background and crosstalk to ensure the accuracy of the EFT determination.
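As an illustration of how the corrections enter, the following sketch computes the efficiency of one burst from raw photon counts. It assumes the commonly used form E = NA / (NA + γ ND), a simple additive background, and a single donor-to-acceptor crosstalk fraction; the function and parameter names are hypothetical and this is not the thesis software.

```python
def fret_efficiency(n_d, n_a, gamma=1.0, bg_d=0.0, bg_a=0.0, crosstalk=0.0):
    """Gamma-corrected FRET efficiency of a single burst.

    n_d, n_a   : raw photon counts in the donor and acceptor channels
    gamma      : correction for detection efficiency and quantum yield
    bg_d, bg_a : expected background counts during the burst (assumed known)
    crosstalk  : fraction of the donor signal leaking into the acceptor channel
    """
    donor = n_d - bg_d
    acceptor = n_a - bg_a - crosstalk * donor
    denominator = acceptor + gamma * donor
    if denominator <= 0:
        raise ValueError("burst contains no signal after corrections")
    return acceptor / denominator


if __name__ == "__main__":
    # Equal counts and gamma = 1 give an efficiency of one half.
    print(fret_efficiency(50, 50))
    # A gamma above one weights the donor photons more strongly.
    print(fret_efficiency(60, 40, gamma=2.0))
```

Keeping the corrections as explicit arguments makes it easy to see how sensitive a given burst is to the choice of γ or to the background estimate.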

Figure 2.7-1 shows the experimental scheme (A and B) for smFRET burst measurements and an example of such data (C and D) acquired on an intrinsically disordered protein (Chapter 5). The calculated EFT distribution can be converted to structural information, in this case the end-to-end distances of the disordered protein. The accuracy of the end-to-end distance calculation also depends critically on the accurate determination of R0 for the FRET pair, as discussed in section 1.3-6 (Chapter 1).

Figure 2.7-1: smFRET burst experiments. (A, B) Diffusion of individual molecules in and out of the detection volume shows up as bursts in the intensity-time trajectory; (C) EFT distribution (normalized by area) obtained from sampling thousands of bursts for the Sic1 protein (see Chapter 5); (D) Donor-acceptor distance information obtained from the EFT distribution using Eq. 1.3-4.
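The conversion from efficiency to distance can be sketched as follows. Eq. 1.3-4 is not reproduced in this section, so the code assumes it has the standard Förster form E = 1 / (1 + (r/R0)^6); the R0 value used in the example is a placeholder, not a value from this work.

```python
def distance_from_efficiency(efficiency, r0_nm):
    """Donor-acceptor distance from a FRET efficiency (standard Forster form)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


def efficiency_from_distance(distance_nm, r0_nm):
    """Inverse relation, useful for checking the conversion."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # nm, placeholder Forster radius
    for e in (0.2, 0.5, 0.8):
        print(f"E = {e:.1f} -> r = {distance_from_efficiency(e, r0):.2f} nm")
```

This point-by-point conversion treats each efficiency value as a single fixed distance. For a disordered protein, where the distance fluctuates during a burst, a polymer model is usually needed to relate the mean efficiency to an end-to-end distance distribution.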

Performing the burst (diffusion) experiments is simpler and faster than the trajectory (immobilization) experiments, since the average passage time of typical biomolecules through the detection volume is on the order of hundreds of microseconds. Most of the acquisition trace contains only background noise, because the sample is kept dilute enough to avoid having two molecules in the detection volume at the same time. The main tasks of the data analysis are to discriminate the bursts from the background and to calculate the multiparameter spectroscopic information of each burst (EFT, anisotropy, lifetime, transit time, etc.). An analysis program based on a “burst sliding scheme” was written in LabVIEW 8.2 to analyze such data files and output histograms of selected spectroscopic parameters of the sample under investigation. The interface of this application is shown in Figure 2.7-2. The details of this burst detection scheme were described previously20. In essence, three parameters (M, L and T) are used to identify a fluorescence burst in the measured intensity-time trajectory. Any photon that has at least “M” neighboring photons within a time window “T” around itself is considered a “candidate photon” for a fluorescence burst. Only when at least “L” successive candidate photons are found are they recognized as real photons belonging to a fluorescence burst; otherwise they are rejected as background photons.

Once a burst has been identified, the software uses correction factors to estimate the corresponding EFT value. The correction factors account for differences in the donor/acceptor channel detection efficiencies and in the donor/acceptor quantum yields, as well as for background subtraction (parameters in the red rectangle in Figure 2.7-2). The other parameter estimates (anisotropy, lifetime) require less extensive corrections to each “raw” burst. Finally, the detected fluorescence bursts are filtered by the number of photons contained in each burst (blue rectangle in Figure 2.7-2). Bursts with very few photons may be background signal mistakenly picked up by the software, while bursts with many more photons than the average may originate from biomolecule aggregates.
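The M, L, T search described above can be sketched in a few lines. This is a minimal illustration and not the LabVIEW program: it assumes that the window T is centered on each photon (T/2 on either side) and that "M neighboring photons" means at least M other photons inside that window. The function name and the example numbers are hypothetical.

```python
from bisect import bisect_left, bisect_right


def find_bursts(arrival_times, m, l, t):
    """Return (start, stop) index pairs of bursts; stop is exclusive.

    arrival_times : sorted photon arrival times (any consistent time unit)
    m : minimum number of neighbors inside the window around a photon
    l : minimum number of successive candidate photons in a burst
    t : full width of the time window centered on each photon
    """
    half = t / 2.0
    candidate = []
    for ti in arrival_times:
        lo = bisect_left(arrival_times, ti - half)
        hi = bisect_right(arrival_times, ti + half)
        candidate.append(hi - lo - 1 >= m)  # subtract the photon itself

    bursts = []
    start = None
    for i, is_candidate in enumerate(candidate):
        if is_candidate and start is None:
            start = i
        elif not is_candidate and start is not None:
            if i - start >= l:
                bursts.append((start, i))
            start = None
    if start is not None and len(candidate) - start >= l:
        bursts.append((start, len(candidate)))
    return bursts


if __name__ == "__main__":
    # Sparse background photons with one dense cluster (times in microseconds).
    times = [0, 1000, 2000] + list(range(3000, 3100, 10)) + [5000, 6000]
    print(find_bursts(times, m=3, l=5, t=100))
```

In this toy trace the ten closely spaced photons are returned as a single burst and the isolated photons are rejected as background. Raising M or L makes the search stricter, at the cost of discarding dim bursts.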

Figure 2.7-2: Graphical user interface of the burst identification and analysis software. See text for explanations of the various parameters and panels.

A weighting factor may need to be applied to avoid selection bias, especially when the correction factor γ is far from unity (< 0.8 or > 1.2). In that case the number of photons in each burst is not simply the sum of ND and NA, but rather the sum of γ×ND and 1×NA. This feature is turned on by checking the “Weighted?” checkbox in the program. The “View” option is used to inspect the burst selection results so that the user can judge whether the selection parameters were appropriate. The “FRET E” dropdown box selects whether the bursts are histogrammed by EFT, anisotropy or lifetime, when this information is available in the experiment.

The burst analysis software also provides some information about the dynamics of the sample, by dividing each burst equally into two parts. The EFT values of the two parts are calculated separately and used to build a correlation map (right panels, 2D and 3D representations). Conformational dynamics of the molecule under investigation on a time scale comparable to the burst passage time show up as significant off-diagonal signals and broadening in the correlation map.
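The weighted burst size and the two-half comparison can be sketched as below. The text does not say whether a burst is divided equally by time or by photon number; this sketch assumes photon number, uses E = NA / (NA + γ ND) without background or crosstalk corrections, and all names are hypothetical.

```python
def weighted_burst_size(n_d, n_a, gamma):
    """Burst size with donor photons weighted by gamma."""
    return gamma * n_d + n_a


def efficiency(acceptor_flags, gamma=1.0):
    """FRET efficiency from per-photon flags (True = acceptor channel)."""
    n_a = sum(acceptor_flags)
    n_d = len(acceptor_flags) - n_a
    denominator = n_a + gamma * n_d
    return n_a / denominator if denominator > 0 else float("nan")


def split_half_efficiencies(acceptor_flags, gamma=1.0):
    """Efficiencies of the first and second half of one burst.

    The pair gives one point of the correlation map; points far from the
    diagonal indicate a change of efficiency within the burst.
    """
    middle = len(acceptor_flags) // 2
    return (efficiency(acceptor_flags[:middle], gamma),
            efficiency(acceptor_flags[middle:], gamma))


if __name__ == "__main__":
    # A burst that switches from high to low efficiency halfway through.
    flags = [True] * 10 + [False] * 10
    print(split_half_efficiencies(flags))
    print(weighted_burst_size(30, 20, gamma=0.5))
```

Because each half contains only part of the photons, shot noise alone broadens the map around the diagonal, so only off-diagonal signal beyond that width points to dynamics.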

2.7.2. Single Molecule Measurements using Liposome Encapsulation

SMF spectroscopy is a sensitive probe of the nano-environment surrounding the fluorescing molecule, utilizing various fluorescence characteristics such as the emission intensity, spectrum, lifetime, and polarization13. However, one critical requirement of this experimental approach is that the nanoscale surroundings must be carefully controlled so that important biological properties of the sample are not compromised3. One possible way to achieve this is to perform measurements on individual protein molecules freely diffusing in a buffer solution, using, for instance, FCS26, 30 and the smFRET burst approach discussed in section 2.7.1. Although these methods provide the most native-like environment for biomolecules, the observation time window is limited by the diffusion of molecules through the detection volume, which rarely exceeds 1 ms. Moreover, the sample concentration is limited to the nanomolar range, which is not sufficient for most biochemical reactions to occur at a sufficiently high frequency.

To extend the observation time window of single molecules, the preferred method is to anchor the molecules under investigation to a solid support14, 31. The most widely used method relies on the strong biotin-streptavidin linkage to immobilize biotin-modified biomolecules on a surface coated with BSA/PEG-biotin-streptavidin (as described in section 2.5). One problem faced by direct surface tethering methods is that they often introduce interactions of the biomolecule under investigation with the surface via van der Waals forces3. Another major drawback is the (sub)nanomolar concentrations required to spatially separate molecules tethered on the surface for detection, which seriously limits the observation of weak protein-protein/DNA/RNA/ligand associations14, 32.

An alternative immobilization technique that overcomes the problem of unwanted surface effects is liposome encapsulation14, 33. This method is increasingly popular in single-molecule fluorescence studies of protein-protein/DNA/RNA interactions and folding14. Encapsulation of biomolecules such as enzymes in lipid vesicles protects them from self-denaturation at high dilution and prevents attacks by external agents such as proteases. The liposomes are tethered to BSA/PEG/lipid-coated coverslips via the biotin-streptavidin interaction. In contrast to the burst mode, the observation of freely diffusing biomolecules confined inside 100-nm liposomes is limited only by the photobleaching of the attached fluorescent markers.

Unfortunately, the traditional liposome preparation protocol has some limitations for encapsulating biological molecules. First of all, biological samples such as proteins and DNAs are costly and available in scarce amounts. Considering that the liposome storage conditions (~4 ˚C) are not ideal for protein stability, it is critically important to minimize material usage when preparing the sample. In traditional encapsulation protocols, about 100 μl of sample at 1 μM is necessary for a single preparation. Much of this material is wasted, since the sample cannot last long under the ~4 ˚C storage condition. More importantly, the freeze/thaw (FT) cycles used in vesicle preparation can lead to denaturation of sensitive biological material. Although some proteins are resistant to such harsh treatments, most are not: proteins have relatively fragile structures and many lose their functionality under such conditions. It is surprising that, despite the popularity of the method, little has been reported about denaturation effects caused by FT or about whether FT is necessary for the single molecule trapping protocol.

In our lab, protein denaturation and fluorophore photophysics changes caused by FT were studied in depth using the multiparameter single-molecule fluorescence setup. A similar approach was used to characterize the extent of surface effects upon vesicle encapsulation vs. direct immobilization and to estimate the vesicle size distribution with and without FT. A major improvement of the new method is that relatively high encapsulation efficiencies are achieved using much less biological material than in previous protocols. The full details are described separately in Chapter 3.

References:

1. Liu, B., et al., A photostable, pH-invariant fluorescein derivative for single-molecule microscopy. J Fluoresc, 2009. 19(5): p. 915-20.
2. Saleem, Q., et al., Lipogels: single-lipid-bilayer-enclosed hydrogel spheres. Biomacromolecules, 2011. 12(6): p. 2364-74.
3. Liu, B., A. Mazouchi, and C.C. Gradinaru, Trapping single molecules in liposomes: surface interactions and freeze-thaw effects. J Phys Chem B, 2010. 114(46): p. 15191-8.
4. Ritort, F., Single-molecule experiments in biological physics: methods and applications. J Phys Condens Matter, 2006. 18(32): p. R531-83.
5. Lipman, E.A., et al., Single-molecule measurement of protein folding kinetics. Science, 2003. 301(5637): p. 1233-5.
6. Borgia, A., P.M. Williams, and J. Clarke, Single-molecule studies of protein folding. Annu Rev Biochem, 2008. 77: p. 101-25.
7. PicoQuant. Stand Alone TCSPC Module with USB Interface. 2011 Nov 2012 [cited 2012 Sep 12, 2012].
8. PicoQuant, PHR800. 2011, PicoQuant GmbH.
9. Semrau, S., A. Pezzarossa, and T. Schmidt, Microsecond single-molecule tracking (musSMT). Biophys J, 2011. 100(4): p. L19-21.
10. Axelrod, D., Total internal reflection fluorescence microscopy in cell biology. Traffic, 2001. 2(11): p. 764-74.
11. Selvin, P.R. and T. Ha, Single Molecule Techniques: A Laboratory Manual. Cold Spring Harbor Laboratory Press. Vol. 2. 2008, New York: CSHL Press. 6.
12. LifeTechnologies. Fluorescence SpectraViewer. [Web Application] 2012 [cited 2012 Sep 13, 2012]; Available from: http://www.lifetechnologies.com/ca/en/home/about-us.html.
13. Liu, B., et al., A Photostable, pH-Invariant Fluorescein Derivative for Single-Molecule Microscopy. J. Fluoresc., 2009. 19(5): p. 915-920.
14. Okumus, B., et al., Vesicle Encapsulation Studies Reveal that Single Molecule Ribozyme Heterogeneities Are Intrinsic. Biophys. J., 2004. 87(4): p. 2798-2806.
15. Betzig, E. and R.J. Chichester, Single molecules observed by near-field scanning optical microscopy. Science, 1993. 262(5138): p. 1422-5.
16. Mazouchi, A., et al., On the performance of bioanalytical fluorescence correlation spectroscopy measurements in a multiparameter photon-counting microscope. Anal Chim Acta, 2011. 688(1): p. 61-9.
17. Visnapuu, M.L., D. Duzdevich, and E.C. Greene, The importance of surfaces in single-molecule bioscience. Mol. Biosyst., 2008. 4(5): p. 394-403.
18. Hoffmann, A., et al., Mapping protein collapse with single-molecule fluorescence and kinetic synchrotron radiation circular dichroism spectroscopy. Proc Natl Acad Sci U S A, 2007. 104(1): p. 105-10.
19. Holmberg, A., et al., The biotin-streptavidin interaction can be reversibly broken using water at elevated temperatures. Electrophoresis, 2005. 26(3): p. 501-10.
20. Roy, R., S. Hohng, and T. Ha, A practical guide to single-molecule FRET. Nat Methods, 2008. 5(6): p. 507-16.
21. Ferreon, A.C., et al., Single-molecule fluorescence studies of intrinsically disordered proteins. Methods Enzymol, 2010. 472: p. 179-204.
22. Striegel, A., et al., Modern Size-Exclusion Liquid Chromatography: Practice of Gel Permeation and Gel Filtration Chromatography. Second ed. 2009, NY: Wiley.
23. (2005) Lecture 7: Principles of mass spectrometry, AMS, and ion optics.
24. Zhuang, X. and M. Rief, Single-molecule folding. Curr Opin Struct Biol, 2003. 13(1): p. 88-97.
25. Schuler, B., et al., Application of confocal single-molecule FRET to intrinsically disordered proteins. Methods Mol Biol, 2012. 896: p. 21-45.
26. Krichevsky, O. and G. Bonnet, Fluorescence correlation spectroscopy: the technique and its applications. Rep. Prog. Phys., 2002. 65: p. 251-97.
27. Price, E.S., M. Aleksiejew, and C.K. Johnson, FRET-FCS detection of intralobe dynamics in calmodulin. J Phys Chem B, 2011. 115(29): p. 9320-6.
28. Ferreon, A.C., et al., Interplay of alpha-synuclein binding and conformational switching probed by single-molecule fluorescence. Proc Natl Acad Sci U S A, 2009. 106(14): p. 5645-50.
29. McKinney, S.A., C. Joo, and T. Ha, Analysis of single-molecule FRET trajectories using hidden Markov modeling. Biophys J, 2006. 91(5): p. 1941-51.
30. Schwille, P. and E. Haustein, Fluorescence Correlation Spectroscopy: An Introduction to its Concepts and Applications, in The Biophysics Textbook Online. 2002, www.biophysics.org/education/schwille.pdf.
31. Okumus, B., et al., Single Molecule Nanocontainers Made Porous Using a Bacterial Toxin. J. Am. Chem. Soc., 2009. 131(41): p. 14844-14849.
32. Benitez, J.J., et al., Probing Transient Copper Chaperone-Wilson Disease Protein Interactions at the Single-Molecule Level with Nanovesicle Trapping. JACS, 2008. 130(8): p. 2446-2447.
33. Boukobza, E., A. Sonnenfeld, and G. Haran, Immobilization in Surface-Tethered Lipid Vesicles as a New Tool for Single Biomolecule Spectroscopy. J. Phys. Chem. B, 2001. 105(48): p. 12165-12170.

Single Molecules Trapped in Liposomes 65

A New Liposome Preparation Method for Biomolecule Encapsulation

Work published in:

Trapping Single Molecules in Liposomes: Surface Interactions and Freeze-Thaw Effects

Baoxu Liu, Amir Mazouchi and Claudiu C. Gradinaru

J. Phys. Chem. B 114 (46), 15191-98 (2010)

3.1. Introduction

In recent years, remarkable developments in single molecule techniques have formed the basis for quantitative biological studies aimed at understanding the basic functionality of biological macromolecules and systems1-4. One critical experimental requirement is that the nanoscale surroundings must be carefully controlled so as not to compromise important biological properties of the sample5, 6. As discussed in section 2.7.2 (Chapter 2), single molecule encapsulation in liposomes provides such a quasi-native environment for biomolecules, and also overcomes the problem of unwanted surface effects encountered in direct surface immobilization methods2, 6, 7. However, the vesicle preparation protocols proposed in previous papers8-12 include several freeze-thaw (FT) cycles, alternating between liquid nitrogen and warm water, to reduce vesicle lamellarity. Proteins have relatively fragile structures and may lose their functionality under such conditions. It is therefore surprising that, despite the popularity of the method, little has been reported about denaturation effects caused by FT or about whether FT is necessary for the single molecule trapping protocol. We studied in depth the protein denaturation and fluorophore photophysics changes caused by FT using our multiparameter fluorescence setup13. We characterized the extent to which surface effects are eliminated by vesicle encapsulation compared to direct immobilization and estimated the vesicle size distribution with and without FT cycles. We also showed that a relatively high encapsulation efficiency can be achieved using much less biological material than in previously published protocols.

3.2. Materials and Methods

3.2.1. Materials

1,2-Dipalmitoyl-sn-glycero-3-phosphoethanolamine-N-(biotinyl) (Biotin-DMPE, 870285) and 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC, 850457) were purchased from Avanti Polar Lipids (Alabaster, AL, USA). A mini extruder and accessories (0.1 μm polycarbonate membrane, filter and filter supports, etc.) were also purchased from Avanti Polar Lipids. RhodamineB-DHPE was purchased from Invitrogen (L-1392, Burlington, ON, Canada). STAT3 protein (S54-54G) was purchased from SignalChem (Richmond, BC, Canada). Tetramethylrhodamine (TMR)-GpYLPQTV-NH2 peptide (QCP100302-17R) was purchased from CanPeptide (Montreal, QC, Canada). DNA sequences, 3-ATT TAG ATT TCA TTG TAT TCC/i6-TMR/TT GTA TTG CCA TTC AGG T-5 with biotinylated complement at the 5' end and 3/Cy3-TAGGAG- CCC-GTG-GTA-5, were purchased from IDT-DNA (Coralville, IA, USA). Alexa555-NHS (A20009) and TMR-maleimide (T6027) were purchased from Invitrogen (Burlington, ON, Canada).

3.2.2. Liposome Preparation Methods in Literature

Several liposome (vesicle) preparation methods have been proposed in the literature, including detergent dialysis, organic solvent/aqueous solution mixture, and lipid film hydration8-12. The lipid film hydration method is most commonly used for encapsulating biological materials5, 11, since it allows biomolecules to be captured in their functional form.

Scheme 1: Schematics of the molecule encapsulation protocol via lipid film hydration.

The procedure of lipid film hydration and sample encapsulation in liposomes is shown schematically in Scheme 1. In short, commercially available lipids are usually delivered in powder form (ideal for long-term storage) and need to be dissolved in organic solvents such as chloroform. The solvent is then evaporated under nitrogen gas, and chloroform traces are removed by leaving the vial open in a vacuum desiccator overnight. The resulting lipid film is usually hydrated with a buffer solution containing the fluorescent sample (shown in red in Scheme 1) to be trapped in vesicles. At this step, the solution contains mostly liposomes with multiple lamellae and varying sizes (mostly on the order of micrometers), the so-called multilamellar vesicles (MLVs). Freeze-thaw cycles (FT cycles; freezing in liquid nitrogen for 30 s and thawing in 30 ˚C water) reduce the lamellarity and size of the liposomes8, 12. In principle, FT cycles break the MLV membranes and allow them to reform into vesicles with reduced size and lamellarity. To make the vesicles more uniformly unilamellar and more uniform in size, extrusion of the MLV solution through a porous polycarbonate membrane is typically employed. As shown in Scheme 2, MLVs break into unilamellar vesicles after multiple passages through the pores of the membrane. The extrusion method works most efficiently when making vesicles with a diameter around 0.1 μm. Smaller vesicles (diameter < 0.05 μm) have high curvature and are less stable at room temperature. On the other hand, larger vesicles (diameter > 0.2 μm) feature larger polydispersity and multilamellarity.

Scheme 2: Schematics of lipid extrusion through a porous membrane.

Fortunately, for vesicle encapsulation in single-molecule fluorescence studies, 0.1 μm is a sufficiently large size to create a quasi-native environment for trapped soluble biomolecules, while being sufficiently smaller than the point-spread function (PSF, on the order of 300-350 nm)14 that the freely diffusing fluorescent molecule appears immobile in the confocal imaging system.

3.2.3. Liposome Preparation Protocol

Large unilamellar vesicles (LUVs) in this work were prepared using the lipid film hydration and extrusion method described in section 3.2.2, with some modifications. In short, a 1 mg mixture of phospholipids (POPC/biotin-DMPE, 99:1, or POPC/biotin-DMPE/RhB-DHPE, 99:1:0.28) was dissolved in chloroform. The lipid film obtained after solvent removal was vigorously hydrated for ca. 5 min, followed by gentle hydration for 15 min, with 10-200 μL of a buffer solution containing ca. 0.5 μM of the fluorescent sample to be trapped in vesicles. The FT cycles of the standard protocol were avoided except when used for comparison purposes, as indicated below. The hydrated multilamellar vesicles (MLVs) were diluted to a final lipid concentration of 5 mg/mL by adding buffer to a total volume of 200 μL and then extruded 35 times through a porous polycarbonate membrane filter using a Mini-Extruder apparatus to produce LUVs with a nominal diameter of 0.1 μm.

3.2.4. Liposome Size Measurements

The theory of FCS and its biophysical applications are discussed in detail in Chapter 1 (section 1.3.7). In this study, FCS measurements were performed on the liposome samples prepared by extrusion. Liposomes prepared by extrusion usually exhibit a broad size distribution and therefore a modified version15 of Eq. 1.3-7 needs to be used to fit the experimental curve and to determine the size distribution of the liposomes:

$$G(\tau) = \frac{\sum_i N_i B_i^2 \left(1 + \frac{4 D_i \tau}{\omega_0^2}\right)^{-1} \left(1 + \frac{4 D_i \tau}{s^2 \omega_0^2}\right)^{-1/2}}{\left(\sum_i N_i B_i\right)^2}$$

where Ni is the number of particles of brightness Bi and diffusion coefficient Di ($\omega_0^2 / 4 D_i$ was used to replace $\tau_D$ in Eq. 1.3-7). s is the aspect ratio of the axial and the lateral radii (z0 and ω0, respectively) of the detection volume (s = z0/ω0). The vesicle brightness distribution Ni(Bi) was measured separately by wide-field fluorescence imaging, and the vesicle diameter (Φ) distribution N(Φ) was obtained by fitting the measured FCS curves to the above equation and using the Stokes-Einstein relation (Eq. 1.3-8): $D = k_B T / (3 \pi \eta \Phi)$, where T and η are the solvent’s temperature and viscosity, respectively.
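To make the fitting model concrete, the sketch below evaluates a polydisperse FCS curve and the Stokes-Einstein conversion between diffusion coefficient and diameter. It assumes the standard brightness-weighted multi-species form for a 3D Gaussian detection volume and D = kB T / (3πηΦ) for a sphere of diameter Φ; the exact expressions of Eq. 1.3-7 and Eq. 1.3-8 are not visible in this section, and the default temperature, viscosity and beam parameters are placeholders.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def diffusion_coefficient(diameter_m, temperature_k=293.15, viscosity_pa_s=1.0e-3):
    """Stokes-Einstein diffusion coefficient (m^2/s) of a sphere of given diameter."""
    return K_B * temperature_k / (3.0 * math.pi * viscosity_pa_s * diameter_m)


def diameter_from_diffusion(d_m2_s, temperature_k=293.15, viscosity_pa_s=1.0e-3):
    """Hydrodynamic diameter (m) from a diffusion coefficient (m^2/s)."""
    return K_B * temperature_k / (3.0 * math.pi * viscosity_pa_s * d_m2_s)


def fcs_polydisperse(tau, species, w0, s):
    """Correlation amplitude at lag tau for a mixture of diffusing species.

    species : iterable of (N_i, B_i, D_i) tuples
    w0      : lateral radius of the detection volume (m)
    s       : aspect ratio z0 / w0
    """
    numerator = 0.0
    denominator = 0.0
    for n_i, b_i, d_i in species:
        x = 4.0 * d_i * tau / w0 ** 2
        numerator += n_i * b_i ** 2 / ((1.0 + x) * math.sqrt(1.0 + x / s ** 2))
        denominator += n_i * b_i
    return numerator / denominator ** 2


if __name__ == "__main__":
    d_100nm = diffusion_coefficient(100e-9)
    print(f"D for a 100 nm vesicle: {d_100nm * 1e12:.2f} um^2/s")
    mixture = [(1.0, 1.0, diffusion_coefficient(80e-9)),
               (1.0, 2.0, diffusion_coefficient(140e-9))]
    print(fcs_polydisperse(1e-3, mixture, w0=300e-9, s=5.0))
```

Because each species enters with the square of its brightness, larger and brighter vesicles dominate the measured curve, which is why the brightness distribution has to be supplied separately before the size distribution can be extracted.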

3.2.5. Fluorescence Polarization

Principles of fluorescence anisotropy (FA) and fluorescence polarization (FP) have been described in Chapter 1. In previous applications in biology and chemistry6, 16, FP has proven to be a valuable technique for studying the orientations of individual molecules. In our study, we adopted FP to probe the rotational freedom of molecules trapped in vesicles vs. directly immobilized on the coverslip. The vertically and horizontally polarized components of the emission, I∥ and I⊥, were split by a polarizing cube and measured with two different detectors. The polarization $P = (I_{\parallel} - g I_{\perp}) / (I_{\parallel} + g I_{\perp})$ was calculated for each time bin of a single-molecule time trajectory. The scalar factor g corrects for different detection sensitivities in the two channels, and it was determined using a dye solution (10 nM Alexa555) with virtually zero polarization. Most of the single molecule FP experiments in this report were carried out using circularly polarized excitation to avoid photoselection artifacts. In the measurements probing the functionality of the STAT3 protein, linearly polarized excitation was used instead.
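A minimal sketch of the per-bin polarization calculation follows. It assumes the convention P = (I∥ − g I⊥) / (I∥ + g I⊥), with g obtained as the ratio of the two channel intensities measured on a reference of zero polarization; the function names and the example counts are hypothetical.

```python
def g_factor(i_par_reference, i_perp_reference):
    """Channel sensitivity ratio from a reference with zero polarization."""
    return i_par_reference / i_perp_reference


def polarization(i_par, i_perp, g=1.0):
    """Polarization of one time bin from the two detector intensities."""
    denominator = i_par + g * i_perp
    if denominator <= 0:
        raise ValueError("no signal in this time bin")
    return (i_par - g * i_perp) / denominator


def polarization_trace(par_counts, perp_counts, g=1.0):
    """Polarization for each time bin of a single-molecule trajectory."""
    return [polarization(a, b, g) for a, b in zip(par_counts, perp_counts)]


if __name__ == "__main__":
    g = g_factor(1200.0, 1000.0)  # made-up reference dye counts
    print(polarization_trace([300, 280, 150], [100, 110, 125], g))
```

With g calibrated this way, the reference solution itself returns a polarization of zero, which is a convenient sanity check before analyzing single-molecule traces.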

3.3. The Size Distribution of Liposomes

Liposomes have been studied intensively for their use as biocompatible drug delivery systems and as models for artificial cell systems17-19. Extensive efforts have been directed into controlling and optimizing the size distribution20 and encapsulation efficiency21, 22 of lipid vesicles. However, little is known about the characteristics of vesicles used for trapping biomolecules for single-molecule studies. Here we report on the size distribution and encapsulation efficiency for vesicles prepared without FT and compare them with the FT vesicles used in previous studies7, 12, 20.

Liposomes prepared by extrusion usually exhibit a broad size distribution around the pore diameter of the membrane filter. Knowing the actual vesicle size is critical for studying biochemical reactions between confined reactants because it determines the effective concentration. To that end, we prepared fluorescent liposomes using a small amount of fluorescent lipids, 0.5% RhB-DHPE (w/w). Assuming the fluorescent lipids are uniformly distributed in the lipid bilayer, the total number of dyes in each vesicle is proportional to the total surface area, i.e., to the square of the diameter Φ. The size of the PSF of the wide-field microscope is on the order of 300-350 nm14, so that the brightness B of each liposome scales linearly with the number of dyes on its surface. Hence $\Phi = c \sqrt{B}$, where c is a proportionality constant20. The brightness of nearly 1500 liposomes was measured in wide-field images as shown in Figure 3.3-1A. Corrections for the non-uniform illumination field were applied to the raw images using data acquired under similar conditions from a diluted dye solution. The brightness distribution information was included in the FCS fitting analysis with the conversion factor c treated as a free fitting parameter (Figure 3.3-1B). The outcome of this combined wide-field and FCS approach is the vesicle size distribution, N(Φ) (Figure 3.3-1C, red column plot).

Consistent with previous reports20, 22, the liposome size distribution is rather broad and clearly non-Gaussian. The average hydrodynamic diameter is 110 nm, but a considerable number of vesicles are smaller than 100 nm. This comes from a natural tendency to form small vesicles during hydration, as observed previously20. Upon centrifugation at 12,000 rpm for 2 min prior to extrusion, smaller vesicles are removed, and the remaining distribution is approximately Gaussian (142 ± 44 nm, Figure 3.3-1D).
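The two size-related quantities discussed here can be illustrated with a short calculation: converting a vesicle brightness into a diameter under the surface-area scaling (brightness proportional to Φ²), and estimating the effective concentration of molecules confined in a vesicle. The conversion constant and brightness values are made up, the bilayer thickness is neglected, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def diameter_from_brightness(brightness, c):
    """Vesicle diameter assuming brightness scales with surface area."""
    return c * math.sqrt(brightness)


def effective_concentration(n_molecules, diameter_nm):
    """Molar concentration of n molecules confined in a sphere.

    The bilayer thickness is neglected, so the inner volume is
    slightly overestimated.
    """
    radius_dm = diameter_nm * 1e-8 / 2.0  # 1 nm = 1e-8 dm, so dm^3 = L
    volume_liters = 4.0 / 3.0 * math.pi * radius_dm ** 3
    return n_molecules / (AVOGADRO * volume_liters)


if __name__ == "__main__":
    c = 50.0  # nm per sqrt(brightness unit), placeholder
    for b in (2.0, 4.0, 8.0):
        phi = diameter_from_brightness(b, c)
        conc = effective_concentration(1, phi)
        print(f"B = {b:.0f} -> {phi:.0f} nm, one molecule = {conc * 1e6:.2f} uM")
```

A single molecule inside a 100 nm vesicle corresponds to roughly 3 μM, and because the volume scales with the cube of the diameter, a modest spread in vesicle size translates into a large spread in effective concentration.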

Figure 3.3-1: (A) Wide-field image of surface-immobilized vesicles (scale bar 1 μm); (B) FCS data and fitting for freely diffusing vesicles; (C) Size distribution of vesicles with (blue line) and without (red column) FT estimated from brightness data and FCS fitting; (D) Size distribution after partially removing micelles by centrifugation before extrusion.

The size distribution of FT liposomes was calculated using a similar approach (Figure 3.3-1C, blue line plot). It exhibits a similar shape to that of the no-FT preparation, but with a reduced occurrence of small vesicles and an average diameter of 94 nm. This is not surprising, knowing that FT cycles are expected to break the larger MLVs and fuse the smaller ones before extrusion. However, FT offers no clear advantage in terms of a more uniform size distribution: the distributions are similar in both cases, and they can be narrowed further by centrifugation.

3.4. Efficient Encapsulation Method for Scarce Biological Samples

Biological samples such as proteins and DNAs are costly and available in scarce amounts. Considering the limitations of the storage conditions for small vesicles (~4 ˚C, which is not ideal for proteins and other biomolecules), it is critically important to minimize material usage when preparing the vesicle sample.

Figure 3.4-1: Liposome preparation scheme. (A) Typical: hydration at high volume (A1) and extrusion (A2). (B) Modified: hydration at low volume (B1), dilution (B1’), and extrusion (B2).

It has been observed that the encapsulation efficiency (EE) in vesicle preparation increases with the hydration time and the number of FT cycles1, 21, 23, because of the reduced lamellarity of the MLVs and thus an increased void volume. However, these processes have to be used with restraint, for example in the presence of a stabilizer, because they can lead to denaturation of sensitive biological material. One way of minimizing biological material usage is to lower the sample volume used for vesicle preparation. The minimum working volume for the extruder is on the order of 100 μL, but for the hydration of the lipid film the limit is much lower; for instance, 10 μL is still adequate for a 1 mg POPC lipid film. This leads to an interesting scheme to minimize sample usage: first hydrate the lipids at low sample volume (~10 μL), then add buffer to the MLV solution up to the operating volume (~200 μL), and finally extrude the diluted MLV solution to obtain LUVs for single-molecule studies. A scheme comparing this method with the typical protocol is shown in Figure 3.4-1.
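The material saving of the low-volume scheme is simple arithmetic, sketched below for a 1 mg lipid film hydrated either directly at the 200 μL operating volume or at 10 μL followed by dilution. The 0.5 μM sample concentration is used only as an example value, and the function names are hypothetical.

```python
def sample_amount_nmol(concentration_um, volume_ul):
    """Amount of sample consumed (nmol) for a concentration in uM and volume in uL."""
    return concentration_um * volume_ul * 1e-3  # uM x uL = pmol; 1e-3 converts to nmol


def lipid_concentration_mg_per_ml(lipid_mass_mg, volume_ul):
    """Lipid concentration (mg/mL) for a film hydrated in a given volume."""
    return lipid_mass_mg / (volume_ul * 1e-3)


def bulk_concentration_after_dilution(concentration_um, hydration_ul, final_ul):
    """Sample concentration outside the vesicles after adding buffer."""
    return concentration_um * hydration_ul / final_ul


if __name__ == "__main__":
    conc = 0.5  # uM, example value
    for v_hyd in (200.0, 10.0):
        print(f"hydration at {v_hyd:.0f} uL: "
              f"{sample_amount_nmol(conc, v_hyd):.3f} nmol sample, "
              f"{lipid_concentration_mg_per_ml(1.0, v_hyd):.0f} mg/mL lipid, "
              f"bulk after dilution "
              f"{bulk_concentration_after_dilution(conc, v_hyd, 200.0):.3f} uM")
```

Hydrating at 10 μL instead of 200 μL uses twenty times less sample, while the lipid concentration during hydration rises from 5 mg/mL to 100 mg/mL before returning to 5 mg/mL after dilution.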

Ideally, immediately after the hydration step (A1 and B1 in Figure 3.4-1), the sample concentration both inside and outside the MLVs equals the concentration of the hydrating solution. This implies a sharp increase of EE for the low-volume hydration method (B1), because the same amount of sample is trapped inside MLVs while a much lower amount of sample is used for lipid hydration. During extrusion (parts A2 and B2 of Figure 3.4-1), MLVs break into LUVs upon being pushed through the polycarbonate membrane. Because breaking events are fast, the material encapsulated in MLVs during hydration has little time to equilibrate with the bulk. The extrusion is expected to simply reallocate MLV-encapsulated volumes into smaller LUV-encapsulated volumes and to release the rest. Hence, we expect that the local sample concentration in LUVs will resemble that found in MLVs before extrusion and that the average number of molecules trapped per vesicle will be similar in both cases (parts A3 and B3 in Figure 3.4-1).

We tested the proposed method by hydrating a 1 mg POPC film with buffered solutions of 3 μM Alexa555 at volumes of 10, 20, 50, 100 and 200 μL. After hydration, all samples were brought up to a total volume of 200 μL prior to extrusion by adding appropriate amounts of buffer solution. For each case, the average number of dyes trapped per vesicle, ⟨N⟩, was estimated by counting the photobleaching steps in the intensity-time trajectories, e.g., N = 2 in Figure 3.4-2 A. We found that ⟨N⟩ is rather insensitive to the amount of dye (volume) used for hydration (Figure 3.4-2 B). If MLV breaking and LUV formation were slow, ⟨N⟩ would be proportional to the amount of dye used. However, when the initial amount of dye is reduced 20-fold, from 0.6 nmol (in 200 μL) to 0.03 nmol (in 10 μL), ⟨N⟩ decreases only about 2.5-fold, from 1.04 to 0.42. This decrease could be explained by the higher lamellarity of MLVs prepared at higher lipid concentration, by leakage due to osmotic stress after dilution, and/or by some equilibration during extrusion. Given the much lower amount of initial sample, the EE actually increased five times6.
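
The quantities in this test can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes spherical vesicles with an inner diameter of 100 nm (the nominal vesicle size quoted later in this chapter; the bilayer thickness is neglected) and that the vesicle interior holds the same 3 μM concentration as the hydrating solution. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def amount_nmol(conc_uM, volume_uL):
    """Amount of solute in nmol (uM x uL gives pmol, hence the factor 1e-3)."""
    return conc_uM * volume_uL * 1e-3


def sphere_volume_L(diameter_nm):
    """Volume of a sphere in litres, given its diameter in nanometres."""
    radius_m = diameter_nm * 1e-9 / 2
    return (4 / 3) * math.pi * radius_m ** 3 * 1e3  # m^3 -> L


def expected_occupancy(conc_uM, diameter_nm):
    """Mean number of molecules in one vesicle filled at the bulk concentration."""
    return conc_uM * 1e-6 * sphere_volume_L(diameter_nm) * AVOGADRO


# Dye used for hydration at the two extreme volumes (3 uM Alexa555)
dye_high = amount_nmol(3, 200)   # 0.6 nmol
dye_low = amount_nmol(3, 10)     # 0.03 nmol

# Assumed 100 nm vesicle filled at 3 uM
n_expected = expected_occupancy(3, 100)
```

Under these assumptions the expected occupancy is roughly 0.95 molecules per vesicle, close to the ⟨N⟩ = 1.04 measured for the 200 μL hydration.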

Figure 3.4-2: (A) Intensity-time trajectory corresponding to 2 Alexa555 molecules trapped in the same vesicle; (B) The average number of molecules trapped per vesicle (⟨N⟩) as a function of the volume of sample (3 μM of Alexa555 in PBS buffer) used for hydrating the lipid film (1 mg POPC).

As a trade-off between saving material and ease of operation, we normally use 20 μL of buffered solution to hydrate a 1 mg lipid film when working with biological samples. This protocol consistently yields ⟨N⟩ ≈ 0.58 (Figure 3.4-2 B), which is comparable to literature values obtained with the classic one-step hydration protocol6, 24. Thus, our data show that valuable biological material can be saved by hydrating the lipid film at low volumes without compromising the quality of trapping.

3.5. Single Molecules: Vesicle Trapping vs. Direct Immobilization

3.5.1. Number of Emitting States

For an ideal fluorophore, only one emitting state is expected before photobleaching, corresponding to constant intensity, emission spectrum, lifetime, and polarization. However, complex photophysics and interactions with the environment often lead to deviations from the ideal behavior. For instance, a single TMR dye attached to a short piece of surface-tethered dsDNA can exhibit six emitting states, revealed by transitions between different intensity and/or polarization levels (Figure 3.5-1A). Besides reducing the excitation power, little can be done to control the dye photophysics. Instead, we attempt to reduce or eliminate the environmental interference by trapping individual molecules inside surface-tethered liposomes. These lipid vesicles are large enough for the trapped molecule to diffuse freely, but small enough compared to the microscope’s point spread function (PSF) that the molecule appears effectively immobile under the fluorescence microscope.

Figure 3.5-1: Intensity-time trajectories of single TMR-DNA molecules: (A) directly immobilized on the surface via biotin-streptavidin linkage, and (B) freely diffusing inside surface-tethered vesicles. In A and B, the fluorescence signals in the I∥ and I⊥ channels are shown in blue and red, respectively. The corresponding lifetime-time trajectories are shown in C and D, respectively.

Most frequently, a single TMR-DNA diffusing in a vesicle shows a single emitting state (Figure 3.5-1B). Counting the number of emitting states of the fluorophore is thus an effective way of evaluating the impact of surface interactions in different immobilization methods. The corresponding lifetime traces were built from the same data using as few as 800 photons/bin (parts C and D of Figure 3.5-1). Only one lifetime level characterizes the vesicle-trapped molecule, while as many as four can be distinguished for the directly immobilized one. Fluorescence quenching in the proximity of the surface is therefore a major source of nonideal behavior, but steric hindrance also plays a role, resulting in restricted dipole orientations. The statistics of the number of emitting states for molecules trapped in vesicles were computed from 135 single-molecule trajectories and compared to similar data obtained for direct immobilization. On average, a single TMR-DNA molecule trapped in a vesicle exhibits 1.37 ± 0.71 states before photobleaching, considerably fewer than the 6.30 ± 4.91 states of a directly immobilized molecule. Note the much larger variance in the latter case, which can seriously affect quantitative analysis of single-molecule data. The presence of significant surface effects in fluorescence studies has been observed before5, 6, 25 and constitutes a serious obstacle for studying the dynamics of the host macromolecule.

Under similar excitation conditions, the photobleaching time of TMR-DNA inside vesicles is about two times shorter than for directly immobilized DNA. The vesicle cavity might induce a local enhancement of the excitation field and/or the charge of the lipid head groups may lead to quenching. Previous encapsulation methods have used oxygen/triplet scavenger systems to extend the photostability of single fluorophores26, 27. It is therefore possible that vesicle-trapped molecules do not remain active long enough to sample more emitting states. In that case, a correlation should emerge between the number of states and the photobleaching time. Single molecules excited at lower power last longer, but a corresponding increase in the number of states was not observed (Table 1). This strongly suggests that the number of emitting states of TMR-DNA encapsulated in a lipid vesicle is intrinsically low and is determined mainly by the photophysics of the dye.

Table 1: Number of emitting states of single TMR-DNA molecules encapsulated in vesicles

Excitation power (W/cm²)    Photobleaching time (s)    No. of states
54.9                        10.4 ± 8.4                 1.64 ± 0.95
10.5                        58.9 ± 42.8                1.40 ± 0.72

In a previous study of TMR-DNA fluorescence in solution, two bright states and one dark state were detected. While an ensemble of fluorophores can display two fluorescence lifetimes, a given molecule may show only one of them. Environmental factors, such as salt concentration and temperature28 or surface interactions in this work, can drive the transitions between different TMR emitting states. Our results indicate that these transitions are reduced, although not completely eliminated, when the TMR is encapsulated in vesicles. In general, small lipid vesicles provide a non-interacting environment that is ideal for time-resolved single-molecule fluorescence studies.

3.5.2. Variance of Single-Molecule Signals: Noise Analysis

A single fluorophore emitting from a single electronic state is essentially a Poisson light source, which means that, for an intensity trace detected by photon counting, the standard deviation (STD) equals the square root of the mean (SRM). The ratio STD/SRM is therefore a good indicator of the electronic “purity” of a single-molecule signal. As shown in Figure 3.5-1A, a surface-immobilized TMR-DNA shows transitions between different “macro” states lasting at least a fraction of a second. To analyze faster fluctuations, we calculated the STD/SRM value for each “macro” state instead of for the entire trajectory. The Poissonian character of TMR emission is quantified either for the initial state or as the average over all the emitting states of a given molecule. This analysis can shed light on the origin of different emitting states in the trajectory of the same molecule; e.g., a higher noise ratio at later times compared to the initial state could be attributed to photoactivated states.
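
The Poisson criterion can be written out explicitly. The following is a short sketch rather than part of the original analysis; the symbols n (photon counts per bin), λ (count rate) and T (binning time) are introduced here for illustration, and the rate is assumed to be constant within a bin and independent from bin to bin.

```latex
% Constant rate: n ~ Poisson(\lambda T)
\langle n \rangle = \lambda T, \qquad
\sigma_n^2 = \lambda T
\quad\Longrightarrow\quad
\frac{\mathrm{STD}}{\mathrm{SRM}}
  = \frac{\sigma_n}{\sqrt{\langle n \rangle}} = 1 .

% Rate fluctuating from bin to bin (law of total variance):
\sigma_n^2 = \langle \lambda \rangle T + \sigma_\lambda^2 T^2
\quad\Longrightarrow\quad
\frac{\mathrm{STD}}{\mathrm{SRM}}
  = \sqrt{1 + \frac{\sigma_\lambda^2}{\langle \lambda \rangle}\, T} \;\ge\; 1 .
```

Any additional rate fluctuation therefore pushes the ratio above unity, and the excess grows with the binning time.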

Table 2: Noise analysis of intensity-time trajectories of single TMR-DNA molecules encapsulated in vesicles vs directly immobilized on surface.

STD/SRM         Directly immobilized    Trapped in vesicle
First state     1.56 ± 0.22             1.14 ± 0.15
All states      1.58 ± 0.60             1.12 ± 0.17

Table 2 shows the noise statistics for TMR-DNA molecules immobilized on the surface by the two methods. It is evident that vesicle-trapped fluorophores approach the ideal single-state behavior while directly immobilized ones do not. For vesicle-trapped molecules, the calculated ratios are close to unity for all “macro” states identified from single-molecule trajectories. In addition, both STD/SRM values depend on the choice of the binning time for the single-molecule trajectories, though much less for vesicles than for direct immobilization.

Simulation code was written in LabVIEW 8.2 to study the noise characteristics of the intensity-time trajectory of a single molecule. We focused on how the ratio between the standard deviation (STD) and the square root of the mean (SRM) of the signal depends on the choice of binning time. In the simulation of an ideal single emitter, a signal count rate of 1 kHz and a collection time of 10 seconds (typical values for a time trajectory) were used, and the STD/SRM value was calculated for each binning time. For example, for a binning time of 10 ms, 1000 points (10 s / 10 ms) are generated using a “Poisson Signal Generator” with an average number of photons of 10 (1 kHz × 10 ms). The STD/SRM value is then calculated from these 1000 points. For each binning time considered, from 1 ms to 500 ms, the simulation is repeated 100 times, and the average STD/SRM is used to plot the curve in Figure 3.5-2A. There is no significant dependence of the STD/SRM ratio on the binning time.

As discussed, the proximity of the surface affects the emission of directly anchored TMR-DNA molecules, as shown by the correlation of intensity and lifetime fluctuations for the "macro" states (Figure 3.5-1). However, due to signal intensity limitations and counting noise, binning times on the (sub)millisecond scale are not practical, hence transitions between "micro" states cannot be resolved. For instance, for a given "macro" emitting state, although the average intensity seems to be constant, it may still vary on the (sub)millisecond time scale because of surface interactions. We assume that this variation follows a Poisson distribution, which means that instead of using a constant intensity value (1000 Hz in the ideal case) to generate the number of photons in a given time bin, we first generate the intensity value for the current time bin. If that value is, say, 987 Hz, we then generate the number of photons in the current time bin based on the binning time and the generated intensity (987 Hz × 10 ms = 9.87). The same procedure is then applied to the remaining 999 time bins. In this case, a clear linear dependence of STD/SRM on the binning time is observed, which is identical no matter what the average signal strength is (Figure 3.5-2 B).
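
The two simulations described above can be sketched in plain Python. This is not the original LabVIEW code: the function names, the exponential-arrival Poisson sampler and the default parameters are assumptions made for illustration. In the non-ideal case the count rate (in Hz) is itself drawn from a Poisson distribution before the photons of each bin are generated, following the procedure described in the text.

```python
import math
import random
import statistics


def poisson_sample(mean, rng):
    """Draw one Poisson variate by counting exponential arrivals in unit time."""
    if mean <= 0:
        return 0
    count = 0
    elapsed = rng.expovariate(mean)
    while elapsed < 1.0:
        count += 1
        elapsed += rng.expovariate(mean)
    return count


def simulate_counts(rate_hz, bin_s, total_s, rng, fluctuating=False):
    """Photon counts per bin for an ideal or a rate-fluctuating emitter."""
    n_bins = round(total_s / bin_s)
    counts = []
    for _ in range(n_bins):
        # Non-ideal case: the rate is redrawn for every bin from a
        # Poisson distribution centred on rate_hz.
        rate = poisson_sample(rate_hz, rng) if fluctuating else rate_hz
        counts.append(poisson_sample(rate * bin_s, rng))
    return counts


def std_over_srm(counts):
    """Standard deviation divided by the square root of the mean."""
    return statistics.stdev(counts) / math.sqrt(statistics.fmean(counts))


def mean_ratio(rate_hz, bin_s, total_s=10.0, repeats=100,
               fluctuating=False, seed=1):
    """Average STD/SRM over several independent simulated trajectories."""
    rng = random.Random(seed)
    ratios = [
        std_over_srm(simulate_counts(rate_hz, bin_s, total_s, rng, fluctuating))
        for _ in range(repeats)
    ]
    return statistics.fmean(ratios)
```

For example, `mean_ratio(1000, 0.01)` simulates the ideal emitter at a 10 ms binning time, and `mean_ratio(1000, 0.5, fluctuating=True)` the non-ideal emitter at 500 ms. Under this model the variance-to-mean ratio of the counts is 1 + T, with T the binning time in seconds, so STD/SRM is expected to approach √(1 + T) regardless of the mean count rate, apart from finite-sampling effects at long binning times where only a few bins are available.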

Figure 3.5-2: STD/SRM vs. binning time for: (A) an ideal single emitter with constant intensity (simulated data); (B) a non-ideal single emitter (simulated data) with Poisson-distributed intensity with an average signal of 1 kHz (red) and 10 kHz (blue); (C) STD/SRM as a function of binning time (experimental data), calculated from the single-molecule time trajectories shown in Figure 3.5-1 (blue: TMR-DNA in vesicle; red: TMR-DNA on surface).

In Figure 3.5-2 C, for the vesicle-encapsulated single-molecule experimental data (blue), the ratio is almost independent of the binning time, similar to the ideal emitter. In contrast, for the molecule on the surface (red), a sharp dependence is observed. This lends support to the idea of fast variations in intensity occurring for the directly immobilized sample. We conclude that there are indeed fast surface interactions with the fluorophore (quenching, limited rotation and photoselection). These nonspecific interactions introduce unwanted intensity fluctuations that are unresolvable at the "macro" state level, but seriously affect quantitative single-molecule measurements. The reduced dependence observed for the vesicle-trapping protocol suggests that, under these conditions, TMR is closer to an ideal emitter. The absence of nonspecific surface interactions and photoinduced states recommends liposome encapsulation as the preferred method for single-molecule studies of fluorescently tagged biomolecules. This method is especially helpful in studying biomolecules that may have structural fluctuations on fast timescales. The described noise analysis (STD/SRM) can be a useful tool to determine the underlying timescale of these structural fluctuations in the absence of environment-induced noise in the signal trajectory, i.e., upon vesicle encapsulation.

3.5.3. Multiparameter Histograms

In our confocal microscope, several fluorescence parameters are measured simultaneously at the level of each photon. Correlations between intensity, polarization, and lifetime fluctuations can be studied in detail and help to identify different emitting states and/or conformations. We constructed 2-D histograms of all “macro” states found in single TMR-DNAs encapsulated in liposomes (parts A and B of Figure 3.5-3) and directly immobilized on the surface (parts C and D of Figure 3.5-3). Molecules trapped in liposomes display a narrow distribution around a single peak, as expected from free diffusion in the aqueous interior of the vesicles. This also shows that the influence of multilamellarity is limited and that the number of molecules trapped between vesicle layers is negligible. In contrast, directly immobilized molecules show dispersed values for all parameters and a quasi-linear lifetime-intensity correlation (Figure 3.5-3 C), indicating fluorescence quenching effects. The orientation of a single TMR emission dipole was already shown to vary randomly in time (Figure 3.5-1A), but here we can see that it also varies widely from one molecule to another. The signal is quenched and the dye rotation is constrained as a result of nonspecific interactions of the DNA or the dye with the surface. This shows that polarized spectroscopy, including FRET measurements, can be performed reliably with fluorophores trapped inside liposomes without interference from surfaces and photoselection.

Figure 3.5-3: Lifetime-intensity (A, C) and lifetime-polarization (B, D) “state” histograms from single-molecule data acquired for TMR-DNA immobilized on the surface via encapsulation in vesicles (A, B) or via biotin-streptavidin linkage (C, D).

3.6. FT-Induced Changes of Fluorescence

Using the TMR-DNA probe, we discussed and compared the surface effects in two different single-molecule immobilization methods. In this study, we are also interested in measuring the extent to which the emission properties of vesicle-trapped fluorophores are affected by the commonly used FT cycles. A Cy3-labeled DNA oligonucleotide was chosen as a test sample because Cy3 usually exhibits very “quiet” single-molecule trajectories (uniform intensities, low blinking probability) compared to other fluorophores. Thus, it is relatively straightforward to characterize the Cy3 fluorescence inside vesicles prepared via different protocols using simple analysis. In addition, Cy3 is a widely used label in single-molecule fluorescence experiments on proteins and nucleic acids. Understanding the behavior of Cy3-DNA in vesicles can provide valuable insight into the applicability of the vesicle immobilization protocol to single-molecule studies.

Cy3-DNA trapped in vesicles prepared either with or without FT was studied with single-molecule spectroscopy. Most molecules exhibit only one emitting state prior to irreversible photobleaching. Average photobleaching times of 12 and 50 seconds were recorded at excitation intensities of 55 and 10.5 W/cm², respectively, similar for both types of vesicles. For FT vesicles, ca. 90% of the molecules (223 out of 248) showed single-state emission before photobleaching, while the other 10% had two or more bright states. In contrast, for vesicles prepared without FT, only 4 out of 173 molecules (∼2%) displayed multiple emitting states. Typically, the transitions tend to occur immediately before photobleaching, and in two-thirds of them the second state lasts no more than 10% of the duration of the first state. The intensity noise analysis revealed virtually identical STD/SRM ratios of 1.08 ± 0.10 and 1.05 ± 0.06 for FT and no-FT vesicles, respectively. This shows that fast transitions are not likely to occur for either preparation method, and it demonstrates yet again the minimized surface interaction when using surface-tethered vesicles to immobilize single molecules.

Under further scrutiny, Cy3-DNA molecules encapsulated in FT vesicles show a distinct shoulder in the fluorescence polarization (FP) distribution toward high positive values (Figure 3.6-1). Each event in the histogram represents the average polarization value of an individual emitting state. The FP values for the no-FT vesicles are narrowly distributed near zero polarization (∼15 mP) and can be fitted with a single Gaussian (FWHM ~ 63 mP, Figure 3.6-1A). In contrast, the FT vesicle data are fitted with two Gaussians of similar width (60-70 mP): a major component (75-80%) that resembles the no-FT case, and a minor component (20-25%) that is clearly biased toward positive values (center at ∼80 mP). According to previous simulations6, the width of the polarization distribution is anticorrelated with the solid angle available to the fluorophore for tumbling/rotational motions. If that is the case, our results indicate that the tumbling is largely unaffected by FT, but the procedure seems to induce some permanent photophysical transitions in the organic dye. This occurs with relatively low probability, but repeated FT cycles will enhance the effect. Our study suggests that these artifacts can be circumvented at no cost for the quality of single-molecule data if the vesicles are prepared without the FT cycles.
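
The difference between the two preparations in the fraction of multi-state molecules can be checked against counting statistics. The sketch below is an illustrative addition, not part of the original analysis: it applies a pooled two-proportion z-test (normal approximation, which is only rough when one group contains as few as 4 events) to the molecule counts quoted above. The function name is hypothetical.

```python
import math


def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # normal approximation
    return z, p_value


# FT vesicles: 248 molecules, 223 single-state, so 25 multi-state
# no-FT vesicles: 173 molecules, 4 multi-state
ft_multi = 248 - 223
z, p = two_proportion_z(ft_multi, 248, 4, 173)
```

With these counts the z-score is about 3, so under the stated approximation the higher incidence of multiple emitting states after FT is unlikely to be a sampling fluctuation.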

Figure 3.6-1: Single-molecule FP distribution measured on single Cy3-ssDNA trapped in vesicles prepared without FT (A) and with four FT cycles (B).

3.7. FT Effects on Protein Functionality

Biomolecules such as proteins are quite sensitive to their environment and are often unstable outside their native environment. Proteins can lose activity as a result of proteolysis, aggregation and sudden temperature jumps. EGFP, for example, shows an approximately 85% reduction in the overall fluorescence intensity and a broader, red-shifted emission spectrum after only one FT cycle (Gradinaru Lab, unpublished results). Not surprisingly, most proteins used for research purposes need to be aliquoted into small amounts in order to avoid denaturation by repeated FT cycles. The effect of FT on a biological macromolecule needs to be addressed by monitoring changes directly related to its functionality. In this section, we choose the signaling protein STAT3 (signal transducer and activator of transcription 3) and use its binding affinity to a small ligand (see details in Chapter 4 of this thesis) to address the issue of protein denaturation caused by FT. The ligand is a short peptide sequence which was found to exhibit high affinity to STAT3, with Kd = 285 ± 84 nM measured by FCS titration experiments29. The TMR-labeled peptide is small (1.3 kDa), while the protein is large (93 kDa), so that the degree of binding can be monitored through the increase of fluorescence polarization (FP).

Upon merely applying one FT cycle to a buffered solution of TMR-peptide and STAT3, a clear reduction of the FP signal was observed. However, this may not necessarily be equivalent to FT conditions during vesicle preparation, because the aqueous compartment enclosed by the multilamellar vesicles may experience different temperature gradients from the bulk solution. Encapsulating single pairs of TMR-peptide and STAT3 molecules in 100-nm liposomes allowed us to study directly the effect of preparative FT cycles on the denaturation of this ligand-protein system. The local concentration for a single ligand-protein pair inside a 100 nm vesicle is around 3 μM, ca. an order of magnitude larger than Kd. Hence, the peptide is expected to be stably bound to the protein, and an FP value of 160-180 mP should be observed30, 31. However, if the TMR-peptide is free due to STAT3 thermal denaturation, the polarization measured should be much lower, around 15 ± 3 mP30.

The distributions of FP values measured for single TMR-peptides encapsulated in both types of vesicles in the presence of STAT3 are shown in Figure 3.7-1. Both distributions were fitted with a combination of two Gaussian functions. Figure 3.7-1A clearly shows two populations centered around P1 = 43 mP and P2 = 183 mP, respectively. The area ratio is 1:3, which suggests that in the absence of FT at least 75% of the ligand is bound to the protein. The free peptide component is likely caused by the stochastic absence of STAT3 from some vesicles. Data recorded for molecules entrapped in liposomes prepared using 3 FT cycles show a similar bimodal distribution (Figure 3.7-1 B). The two peaks are at similar positions as before, P1 = 41 mP and P2 = 166 mP, respectively; however, the area ratio is almost reversed, 1:0.4. Single-molecule experiments on TMR-peptides encapsulated in vesicles without STAT3 showed no evidence of interaction with the lipid membrane or significant differences induced by FT cycles. The protein does not spontaneously associate with lipid membranes, as confirmed by fluorescence images of TMR-STAT3 encapsulated in micrometer-sized liposomes. We can therefore conclude that, after FT, only 28% of the ligand is in the bound state, due to a significant reduction of STAT3 functionality caused by the FT process.
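
The numbers in this argument can be reproduced with a short calculation. The sketch below is illustrative and rests on stated assumptions: the vesicle is treated as a sphere with a 100 nm inner diameter, and the bulk 1:1 mass-action expression is applied to a single ligand-protein pair, which can only be read as the fraction of time the pair spends bound. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def single_molecule_conc_uM(diameter_nm):
    """Concentration (uM) of one molecule in a sphere of the given diameter."""
    radius_m = diameter_nm * 1e-9 / 2
    volume_L = (4 / 3) * math.pi * radius_m ** 3 * 1e3  # m^3 -> L
    return 1e6 / (AVOGADRO * volume_L)


def bound_fraction(protein_uM, ligand_uM, kd_uM):
    """Equilibrium fraction of ligand bound for 1:1 binding (mass action)."""
    s = protein_uM + ligand_uM + kd_uM
    complex_uM = (s - math.sqrt(s * s - 4 * protein_uM * ligand_uM)) / 2
    return complex_uM / ligand_uM


def bound_from_area_ratio(free_area, bound_area):
    """Bound fraction from the areas of the two fitted Gaussian components."""
    return bound_area / (free_area + bound_area)


local_conc = single_molecule_conc_uM(100)                    # about 3.2 uM
expected_bound = bound_fraction(local_conc, local_conc, 0.285)
no_ft_bound = bound_from_area_ratio(1, 3)                    # 0.75
ft_bound = bound_from_area_ratio(1, 0.4)                     # about 0.29
```

Under these assumptions the local concentration comes out near 3.2 μM and the mass-action estimate of the bound fraction is roughly 74%, of the same order as the value inferred from the 1:3 area ratio. The estimate does not by itself distinguish between the explanations for the free component discussed in the text.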

Figure 3.7-1: FT effects on STAT3 functionality. Single-molecule FP distribution of a TMR-peptide ligand co-encapsulated with STAT3 in lipid vesicles prepared (A) without FT and (B) with 3 FT cycles.

While this effect may be stronger for some biomolecules and weaker for others, leaving out the FT cycles when encapsulating proteins and nucleic acids for single molecule studies presents clear advantages. Risks of denaturation and fluorophore damage are simply too great to ignore, and our protocol shows how to obtain a high yield of encapsulation with low amounts of sample and how to control the size distribution using centrifugation before extrusion.

3.8. Conclusions

We described an improved protocol for encapsulating fluorescently tagged biomolecules inside surface-tethered liposomes in order to ensure optimal environmental conditions for single-molecule investigations. Multiparameter single-molecule fluorescence experiments helped characterize the anchoring method relative to direct immobilization via biotin-streptavidin linkers. Correlations of fluctuations of intensity, polarization, and lifetime revealed that molecules trapped in liposomes exhibit nearly ideal single-emitter behavior, whereas molecules directly immobilized on the surface emit from various states or conformations. The lack of nonspecific surface interactions and photoinduced states therefore recommends vesicle encapsulation as the preferred immobilization method for single-molecule fluorescence studies. By hydrating the lipid film at low volume, a high encapsulation yield was achieved with ca. 10 times less biological material than in previous reports. Using a combination of wide-field images and FCS curves, we determined absolute size distributions and found no significant advantage in using FT cycles during vesicle preparation. On the contrary, the temperature jump was found to induce irreversible damage to popular fluorophores such as Cy3 and to drastically diminish protein functionality, as demonstrated by single-ligand binding experiments on the signaling protein STAT3. The improved and biologically gentle molecule encapsulation protocol characterized in this report has great potential for widespread applications in single-molecule fluorescence spectroscopy.


References:

1. Colletier, J.P., et al., Protein encapsulation in liposomes: efficiency depends on interactions between protein and phospholipid bilayer. BMC Biotech., 2002. 2: p. 9.
2. Rosenkranz, T., et al., Observing Proteins as Single Molecules Encapsulated in Surface-Tethered Polymeric Nanocontainers. ChemBioChem, 2009. 10(4): p. 702-709.
3. Itoh, H., et al., Mechanically driven ATP synthesis by F1-ATPase. Nature, 2004. 427(6973): p. 465-468.
4. Adachi, K., et al., Coupling of Rotation and Catalysis in F1-ATPase Revealed by Single-Molecule Imaging and Manipulation. Cell, 2007. 130(2): p. 309-321.
5. Visnapuu, M.L., D. Duzdevich, and E.C. Greene, The importance of surfaces in single-molecule bioscience. Mol. Biosyst., 2008. 4(5): p. 394-403.
6. Boukobza, E., A. Sonnenfeld, and G. Haran, Immobilization in Surface-Tethered Lipid Vesicles as a New Tool for Single Biomolecule Spectroscopy. J. Phys. Chem. B, 2001. 105(48): p. 12165-12170.
7. Okumus, B., et al., Vesicle Encapsulation Studies Reveal that Single Molecule Ribozyme Heterogeneities Are Intrinsic. Biophys. J., 2004. 87(4): p. 2798-2806.
8. Barenholz, Y., et al., A simple method for the preparation of homogeneous phospholipid vesicles. Biochemistry, 1977. 16(12): p. 2806-2810.
9. MacDonald, R.C., et al., Small-volume extrusion apparatus for preparation of large, unilamellar vesicles. Biochim. Biophys. Acta, 1991. 1061(2): p. 297-303.
10. Lasic, D.D., The mechanism of vesicle formation. Biochem. J., 1988. 256(1): p. 1-11.
11. Szoka, F. and D. Papahadjopoulos, Comparative Properties and Methods of Preparation of Lipid Vesicles (Liposomes). Annu. Rev. Biophys. Bio., 1980. 9(1): p. 467-508.
12. Olson, F., et al., Preparation of liposomes of defined size distribution by extrusion through polycarbonate membranes. Biochim. Biophys. Acta, 1979. 557(1): p. 9-23.
13. Liu, B., et al., A Photostable, pH-Invariant Fluorescein Derivative for Single-Molecule Microscopy. J. Fluoresc., 2009. 19(5): p. 915-920.
14. Mazouchi, A., et al., On the performance of bioanalytical fluorescence correlation spectroscopy measurements in a multiparameter photon-counting microscope. Anal. Chim. Acta, 2011. 688(1): p. 61-69.
15. Schwille, P. and E. Haustein, Fluorescence Correlation Spectroscopy: An Introduction to its Concepts and Applications, in The Biophysics Textbook Online. 2002, www.biophysics.org/education/schwille.pdf.
16. Deschenes, L.A. and D.A. Vanden Bout, Single-molecule studies of heterogeneous dynamics in polymer melts near the glass transition. Science, 2001. 292(5515): p. 255-258.
17. Edidin, M., The State of Lipid Rafts: From Model Membranes to Cells. Annual Review of Biophysics and Biomolecular Structure, 2003. 32(1): p. 257-283.
18. Torchilin, V.P., Recent advances with liposomes as pharmaceutical carriers. Nat. Rev. Drug. Discov., 2005. 4(2): p. 145-160.
19. MacKinnon, N., et al., Liposome-hydrogel bead complexes prepared via biotin-avidin conjugation. Langmuir, 2009. 25(16): p. 9413-9423.
20. Kunding, A.H., et al., A Fluorescence-Based Technique to Construct Size Distributions from Single-Object Measurements: Application to the Extrusion of Lipid Vesicles. Biophys. J., 2008. 95(3): p. 1176-1188.
21. Sun, B. and D.T. Chiu, Determination of the encapsulation efficiency of individual vesicles using single-vesicle photolysis and confocal single-molecule detection. Anal. Chem., 2005. 77(9): p. 2770-2776.
22. Lohse, B., P.-Y. Bolinger, and D. Stamou, Encapsulation Efficiency Measured on Single Small Unilamellar Vesicles. J. Am. Chem. Soc., 2008. 130(44): p. 14372-14373.
23. Sou, K., et al., Effective Encapsulation of Proteins into Size-Controlled Phospholipid Vesicles Using Freeze-Thawing and Extrusion. Biotechnol. Progr., 2003. 19(5): p. 1547-1552.
24. Okumus, B., et al., Single Molecule Nanocontainers Made Porous Using a Bacterial Toxin. J. Am. Chem. Soc., 2009. 131(41): p. 14844-14849.
25. Krichevsky, O. and G. Bonnet, Fluorescence correlation spectroscopy: the technique and its applications. Rep. Prog. Phys., 2002. 65: p. 251-297.
26. Roy, R., S. Hohng, and T. Ha, A practical guide to single-molecule FRET. Nat. Methods, 2008. 5(6): p. 507-516.
27. Selvin, P.R. and T. Ha, Single Molecule Techniques: A Laboratory Manual. Cold Spring Harbor Laboratory Press. Vol. 2. 2008, New York: CSHL Press. 6.
28. Vamosi, G., C. Gohlke, and R.M. Clegg, Fluorescence characteristics of 5-carboxytetramethylrhodamine linked covalently to the 5' end of oligonucleotides: multiple conformers of single-stranded and double-stranded dye-DNA complexes. Biophys. J., 1996. 71(2): p. 972-994.
29. Avadisian, M., et al., Artificially induced protein-membrane anchorage with cholesterol-based recognition agents as a new therapeutic concept. Angew. Chem. Int. Ed. Engl., 2011. 50(28): p. 6248-6253.
30. Drewry, J.A., et al., Coordination complex SH2 domain proteomimetics: an alternative approach to disrupting oncogenic protein-protein interactions. Chem. Commun. (Camb), 2010. 46(6): p. 892-894.
31. Badali, D., et al., Development of STAT3 as an accessible target for fluorescence-based inhibition assays. Journal of Undergraduate Life Sciences, 2010. 4(1): p. 18-23.

New Cancer Therapies: Membrane Anchorage of the STAT3 Protein

Work published in:

Artificially Induced Protein-Membrane Anchorage with Cholesterol-based Recognition Agents: Towards a New Therapeutic Concept M. Avadisian*, S. Fletcher*, B. Liu*, W. Zhao*, X. Zhang, P. Yue, D. Badali, W. Xu, A.D. Schimmer, J. Turkson, C.C. Gradinaru, and P.T. Gunning, Angewandte Chemie Intl. 50 (28), 6248-6253 (2011) *: These authors contributed equally to this work

A Photostable, pH-invariant Fluorescein Derivative for Single-Molecule Microscopy B. Liu, S. Fletcher, M. Avadisian, P.T. Gunning, and C.C. Gradinaru, J. Fluoresc. 19, 915-920 (2009)

Single-Molecule Fluorescence Study of the Inhibition of the Oncogenic Functionality of STAT3 B. Liu, D. Badali, S. Fletcher, M. Avadisian, P.T. Gunning, and C.C. Gradinaru, Proc. SPIE 7386, 738605 (2009)

Signal-Transducer-and-Activator-of-Transcription 3 (STAT3) protein is a member of the STAT family, a group of transcription factors that initiate transcription upon receiving extracellular signals from cytokines and growth factors1. STAT3 has been found to be involved in oncogenesis by stopping apoptosis and transforming cell lines, and is constitutively active in many cancers2. What causes the persistent activity of STAT3 in tumour cells is not completely understood, but the evidence that it is active in a large number of tumours such as breast cancer, prostate cancer, leukemia, and lymphoma is compelling1, 3. As such, it is of special interest and is the target of our collaborative research with the Gunning group (Chemistry, UTM).

4.1. STAT3 Protein Structure

The STAT3 protein consists of 5 domains: a 4-helix bundle domain, a DNA binding domain, a connector domain, a Src-homology-2 (SH2) domain, and a trans-activation domain (TAD) (Figure 4.1-1)4. The entire protein has 770 amino acids, with an overall size of 14.5 × 11.5 × 6 nm³, as determined for a STAT3 homodimer bound to DNA4. At the N-terminus there is an intrinsically disordered region, which is quite common in signal transcription proteins4, 5. Consequently, the structure of the first 129 amino acids cannot be resolved by X-ray crystallography. The first amino acid which has a clear electron density map is residue 130, where the 4-helix bundle structure starts. The helical structures (~8 nm in length)4, connected via short loops, make the STAT3 protein appear very extended. The helical bundle structure was found to be essential for the interaction of STAT3 with several other proteins, in which the extended dimension may facilitate protein recognition3. Following the helical bundle domain, there is an eight-stranded β-barrel DNA binding domain. The two DNA binding domains on a STAT3 dimer behave like a clamp, contacting double-stranded DNA in the cell nucleus on both the major and minor grooves. The DNA binding domain is connected to the SH2 domain by the connector domain, which contains 4 helices connected via short, flexible loops. Although the overall structure of this domain shows high similarity with calcium-binding domains, as found for example in troponin C4, its function in STAT3 remains unknown.

Figure 4.1-1: Domain structure of the STAT3 monomer. PDB file: 3CWG.

The SH2 domain and a tyrosine at position 702 (Y702) are the most essential components for STAT3 dimerization and its subsequent function in signal transduction and transcription. The linking sequence between the SH2 domain and Y702 is intrinsically disordered, which may be advantageous in gaining flexibility for SH2 recognition and binding. Upon phosphorylation, this tyrosine and its neighbouring residues (702-716) form a phosphotyrosine peptide which is able to recognize the SH2 domain on its cognate STAT3 protein (see below). Interaction of the phosphopeptide with the SH2 domain on the opposite STAT3 leads to STAT3 dimerization and subsequent nuclear translocation to express cytokine-responsive genes in cells.

4.2. STAT3 Pathway and Inhibition

STAT3 has been a topic of recent anti-cancer research due to its significant role in tumorigenesis3, 6. Latent STAT proteins were found to dwell in the cytoplasm as inactive monomers or complexes. Binding of extracellular signals such as cytokines, growth factors, or hormones to a wide range of receptors leads to the activation of cytoplasmic receptor-bound proteins such as Janus kinases (JAKs). JAK-mediated phosphorylation of Tyr702 in STAT3 allows the formation of transcriptionally active STAT3:STAT3 homodimers that are mediated by reciprocal phosphotyrosine-SH2 domain interactions. Activated STAT3 dimers translocate to the nucleus where they bind to specific DNA-response elements in the promoters of target genes and regulate gene expression (Fig. 4.2-1A).

Figure 4.2-1: (A) Signaling pathway of STAT3 in the cell; (B) Proposed inhibition scheme of STAT3 activity using a ligand that anchors STAT3 in the cell membrane.

Although active STAT3 is not commonly harmful to cells, overactive STAT3 causes unwanted transcription of anti-apoptotic genes, leading to tumorigenesis. Many methods of inhibiting the oncogenic functionality of STAT3 have been proposed which target various steps of the signaling pathway6. Most commonly, previous strategies have focused on the suppression of upstream kinases7 and STAT3-STAT3 protein-protein interactions8-13. However, due to the complexity of the activation of STAT3, any approach which only targets one pathway has the disadvantage that activation could be initiated by an alternative pathway2, 8. To date, these approaches have not yielded a clinically relevant STAT3-targeting drug. For example, targeting the pathway of tyrosine phosphorylation only reduced STAT3 activity, but did not prevent it6.

Nature has developed a vast array of post-translational modifications that result in physical changes in a protein’s cellular location. An ingenious mechanism for protein localization is prenylation: the covalent attachment of a hydrophobic prenyl group to a protein to facilitate protein association with the plasma membrane14. Analogous post-translational modifications that induce protein–membrane anchorage include the covalent attachment of glycolipid anchors15, palmitoyl groups16 and cholesterol17, 18. Through these processes, otherwise soluble proteins are sequestered to cellular membranes and ultimately lose their cellular motility19. We postulated that induced membrane anchorage of proteins such as STAT3 could hold significant therapeutic value (Figure 4.2-1B). Therefore, we explored the therapeutic potential of applying the principles of protein anchorage to the development of a conceptually novel drug modality. The main goal was to develop a protein-membrane anchor (PMA) molecule that could sequester STAT3 at the plasma membrane and as a result suppress its nuclear translocation.
The in vitro and in vivo applications of this inhibition strategy, the biophysical characterization, and the first example of induced membrane localization of STAT3 through the use of a rationally designed PMA are the focus of this Chapter.

4.3. Materials and Methods

4.3.1. Materials

1,2-Dipalmitoyl-sn-glycero-3-phosphoethanolamine-N-(biotinyl) (Biotin-DMPE, 870285) and 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC, 850457) were purchased from Avanti Polar Lipids (Alabaster, AL, USA). A mini extruder and accessories (0.1 μm polycarbonate membrane, filter and filter supports, etc.) were also from Avanti Polar Lipids. STAT3 protein (S54-54G) was purchased from SignalChem (Richmond, BC, Canada). GpYLPQTV-NH2 and tetramethylrhodamine (TMR)-GpYLPQTV-NH2 peptide were purchased from CanPeptide (Montreal, QC, Canada). Rhodamine B-1,2-dihexadecanoyl-sn-glycero-3-phosphoethanolamine (RhodamineB-DHPE, L1392), Alexa555-NHS (A20009) and TMR-maleimide (T6027) were purchased from Invitrogen (Burlington, ON, Canada). Most other chemicals used in sample preparation (such as chloroform and NaCl) were obtained from Sigma Aldrich Canada (Oakville, ON), unless otherwise specified.

4.3.2. Synthesis of Protein-Membrane-Anchor (PMA 1)

Chemical synthesis of PMA 1 is shown in Scheme 1. N-methylmorpholine (4.5 μL, 3.5 eq) was added to a solution of cholesteryl chloroformate (6 mg, 1.1 eq) in tetrahydrofuran (THF, 0.2 mL). After 10 min of stirring, commercially purchased peptide GpYLPQTV-NH2 (10 mg, 1.0 eq) was added and the mixture was stirred for 12 h at room temperature. The reaction mixture was concentrated in vacuo, dissolved in CH2Cl2 (10 mL), and poured into a separatory funnel containing water (40 mL). The organic layer was collected and the aqueous layer was further extracted with CH2Cl2 (3 × 10 mL). The combined organic layers were washed with water (3 × 10 mL) and then with brine (10 mL) and dried over anhydrous sodium sulfate (Na2SO4). The organic layer was concentrated in vacuo, dissolved in a minimal amount of CH2Cl2, and purified on a silica gel column with Hex/EtOAc (5:1) as eluent to give PMA 1 as a white solid (15 mg, 90 %). For detailed structural characterization, please refer to the supporting information of reference20.

Scheme 1: Chemical synthesis of PMA 1 (reagents: N-methylmorpholine, tetrahydrofuran; yield 90%).

Two other PMA derivatives were also synthesized, one to examine the role played by the linking group (PMA 2) and one to serve as a fluorescent control compound (PMA 3). In PMA 2, an extended PEG linker was used to link the peptide and cholesterol moieties. In PMA 3, a synthesized bivalent fluoresceinated probe (please refer to reference21 for chemical synthesis) was used to incorporate the cholesterol unit and the peptide. Please refer to reference20 for the detailed synthesis procedures of PMA 2 and 3.

4.3.3. Labeling STAT3 with a Fluorescent Dye

The STAT3 protein was labeled with a thiol-reactive maleimide form of tetramethylrhodamine (TMR-maleimide), using the cysteine labeling protocol described in Chapter 2. In brief, STAT3 was delivered in a Tris buffer (50 mM Tris-HCl, 150 mM NaCl, pH 7.5) containing 0.1 M dithiothreitol (DTT) to prevent the formation of disulfide bonds. The DTT was removed by dialysis in Tris buffer (50 mM Tris, 150 mM NaCl, pH 7.4) and replaced with tris(2-carboxyethyl)phosphine (TCEP, C4706, Sigma, St. Louis, MO) at 20× molar excess. This solution was placed in a vacuum desiccator and flushed with argon for 5 min to remove oxygen, then the vial was capped tightly and incubated for half an hour at room temperature while stirring gently. TMR-maleimide was added at 20× molar excess, followed by desiccation and sealing under argon. The solution was gently mixed by vortexing for 2 hours at room temperature and then kept at 4 °C overnight. The excess dye was removed by centrifugation at 1300g for 1 min using 40 kDa cutoff spin columns (732-6223, Bio-Rad Laboratories, Hercules, CA). The purified labeled protein (TMR-STAT3) was dissolved in Tris buffer (50 mM, pH 7.4, 150 mM NaCl) and stored at -20 °C.

4.3.4. Preparation of Liposomes

Multilamellar vesicles (MLVs) and large unilamellar vesicles (LUVs) were prepared using the extrusion method modified from the protocol described in Chapter 3. In short, 5 mg/ml POPC and 1 mol % of Biotin-DMPE were dissolved in chloroform. For STAT3 anchoring experiments, 1 mol % of PMA 1 was added to the POPC/DMPE mixture. The solvent was dried out under nitrogen and chloroform traces were removed by leaving the glass vial open in a vacuum desiccator overnight. The resulting lipid film was gently hydrated for ca. 30 min with a 50 μL solution of 2 μM TMR-STAT3 in a Tris buffer (100 mM Tris, 150 mM NaCl, pH 7.4) to create MLVs. Liquid nitrogen freeze-thaw cycles of the standard protocol were avoided because they were found to denature STAT3 and other proteins and alter the photophysical properties of the fluorescence probes. MLVs were pushed 11 times through a porous polycarbonate membrane filter using a Mini-Extruder apparatus to produce LUVs with an average diameter of 1 μm.

4.3.5. Preparation of Supported Lipid-Bilayers

Lipid bilayers were formed by sandwiching a solution containing 10 mg/ml POPC vesicles, with or without 1 mol % of PMA 1, between closely spaced, plasma-cleaned glass coverslips. After incubation for ca. 2 h at room temperature, the coverslips were separated and rinsed thoroughly with PBS buffer (137 mM NaCl, pH 7.4). Under these conditions, lipid vesicles are expected to fuse together and spread out on the glass surface to form a lipid bilayer. We verified that this process indeed occurred in our experiments by using rhodamine B-labeled lipids. Under the TIRF microscope a highly homogeneous distribution of the fluorescence intensity was observed (data not shown), suggesting that (1) a lipid layer has been formed on the glass surface; (2) the layer has a highly uniform thickness; and (3) the thickness of the layer probably does not exceed 10-20 nm.

4.3.6. The Fluorescence Setup

Fluorescence imaging and spectroscopy experiments were performed on a custom-built confocal laser scanning microscope (CLSM) and a total internal reflection fluorescence microscope (TIRFM) that are capable of hyperspectral detection of single emitters. The technical details for both instruments are described in Chapter 2.

Single-molecule fluorescence (SMF) measurements were performed on the custom-built CLSM. In the studies reported in this chapter, the fluorescence signal is divided by a polarization cube beamsplitter (Newlight Photonics, Canada) into two beams (I∥ and I⊥) focused onto separate photon-counting detectors (PD5CTC, Optoelectronic Components, Canada), which are read and digitized by a multichannel time-correlated counting module (PicoHarp300, PicoQuant, Germany). In order to assess the functional integrity of STAT3’s key pY-binding SH2 domain, we conducted a series of fluorescence correlation spectroscopy (FCS) experiments using the GpYLPQTV-NH2 recognition sequence (referred to as “Peptide”). Due to the multiparameter functionality of our setup, fluorescence anisotropy (FA) values at each STAT3 concentration were measured from the same photons used to construct the FCS curves. The binding titration curves built from the FCS and FA analyses were used to determine the dissociation constant of the Peptide to STAT3 independently.

Liposomes and lipid bilayers were imaged on the custom-built, inverted wide-field TIRFM. The fluorescence images were acquired with a cooled electron-multiplying charge-coupled device (EMCCD, DU-897BV, Andor, USA). Typical individual frames were obtained from a 65 × 65 μm² area of the sample exposed to continuous laser illumination for 20-100 ms.

4.3.7. STAT3 Fluorescence Polarization (FP) Binding Assays

STAT3 FP binding assays were employed to determine dissociation constants of the Peptide and PMA derivatives with STAT3. To evaluate the fluorescein-labeled peptide probe (f-Peptide), a fixed concentration of 10 nM was incubated with increasing concentrations of the STAT3 protein for 30 min at room temperature in assay buffer (50 mM NaCl, 10 mM HEPES, 1 mM EDTA, 0.1% Nonidet P-40), and the FP measurements were taken using the TECAN Infinite M1000 (TECAN, NC), with the set gain adjustment at 35 mP. To evaluate the dissociation constant of PMA 1 with STAT3, a competitive FP binding assay was employed, because PMA 1 is nonfluorescent. Serial concentrations of PMA 1 were incubated with STAT3 protein (150 nM) in the indicated assay buffer conditions. Prior to the addition of the fluorescent probe, the protein:PMA 1 mixtures were allowed to equilibrate at room temperature for 15 min. f-Peptide was then added at a final concentration of 10 nM and incubated for 30 min at room temperature, after which the FP measurements were collected using the Infinite M1000, with the set gain adjustment at 35 mP.

4.4. PMA Structures and STAT3-Binding Characterization

4.4.1. The Structures of PMA Derivatives

Herein, we describe the design, synthesis, and application of a novel PMA that targets STAT3 protein in liposome and whole-cell systems. The prototype PMA is composed of two binding modules: a recognition motif to bind the protein and an anchor to sequester the protein complex at the membrane. The proof-of-concept PMA (Figure 4.4-1) comprised a potent STAT3-recognition sequence22, GpYLPQTV-NH2 (Peptide), covalently attached to a cholesterol membrane anchor at the N-terminus. The Peptide binds strongly to the SH2 domain of STAT3 and it offers an excellent handle for coupling lipid anchors.

Figure 4.4-1: Design of PMA molecules: Cholesterol is used as the membrane anchoring motif, the most potent STAT3 binder peptide sequence is used as the protein recognition motif, and a fluorescein derivative F-NAc is used to bridge the two motifs.

We elected to employ cholesterol as our membrane anchor in preference to prenyl and GPI lipids, owing to facile synthetic procedures and potent membrane insertion. Moreover, in support of this strategy, Simons and coworkers have successfully used cholesterol to anchor drugs that target membrane-embedded proteins to the plasma membrane17, 18. The Peptide is attached to the cholesterol unit in high yield through chloroformate coupling (PMA 1, Figure 4.4-2A). Furthermore, to examine the role played by the linking group, we prepared a PMA in which an extended PEG linker was used to link the peptide and cholesterol moieties (PMA 2, Figure 4.4-2B). As a control compound, we synthesized a bivalent fluoresceinated probe, PMA 3, which incorporates a fluorescein moiety between the cholesterol unit and the peptide (Figure 4.4-2C).

Figure 4.4-2: Chemical structures of PMA derivatives: (A) cholesterol-Peptide, (B) cholesterol-Peptide with a PEG linker, and (C) cholesterol-Peptide with a fluorescein (F-NAc) linker.

4.4.2. Binding of PMAs to the STAT3 Protein

In order to assess the functional integrity of STAT3’s key pY-binding SH2 domain, we conducted a series of FCS experiments using the f-Peptide sequence. Due to the multiparameter functionality of our setup, FA values at each STAT3 concentration were obtained simultaneously from the same photon dataset. FCS is a powerful method for studying ligand-protein interaction dynamics, provided the ligand is fluorescently labelled and has a much smaller size than the target protein. To this end, 12 nM f-Peptide was titrated with non-fluorescent STAT3. The family of FCS curves obtained is shown in Figure 4.4-3. The fraction of Peptide bound to protein increases with STAT3 concentration, as seen by the shifting of the FCS curve to longer decay (diffusion) times. By interpreting the family of correlation curves in terms of a two-state system, we can obtain the concentrations of both free and STAT3-bound peptide at each concentration of STAT3. The dependence of the fraction of STAT3-bound peptide (fb) on the titrated STAT3 concentration is then fit to the following equation23, 24:

where [peptide] and [STAT3] are the total peptide and STAT3 concentrations, respectively, [peptide-STAT3] is the concentration of bound peptide, and kd is the dissociation constant. Hydrolysis of the phosphate group of the peptide can inactivate its binding capabilities, so only a fraction of the peptide (f) is assumed active. From the fitting of our data to Eq. 4.4-1, we obtained kd = 285 ± 84 nM and f = 0.86±0.07.
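As an illustration of the kind of two-state titration model described above, the sketch below evaluates the bound fraction of peptide from the standard 1:1 binding quadratic, in which only a fraction f of the total peptide is able to bind. This is a minimal sketch under stated assumptions and not necessarily the exact form of Eq. 4.4-1: the closed-form expression and the function name `bound_fraction` are assumptions introduced here, and the inputs are the peptide concentration and fitted values quoted in the text (12 nM peptide, kd = 285 nM, f = 0.86).

```python
import math

def bound_fraction(stat3_nM, peptide_nM, kd_nM, f_active=1.0):
    """Fraction of TOTAL peptide bound to STAT3 for 1:1 binding.

    Hypothetical helper: assumes only a fraction f_active of the peptide
    can bind and uses the standard quadratic solution of mass action.
    """
    active = f_active * peptide_nM            # binding-competent peptide
    b = active + stat3_nM + kd_nM
    # The smaller root of C^2 - b*C + active*stat3 = 0 is the complex conc.
    complex_nM = (b - math.sqrt(b * b - 4.0 * active * stat3_nM)) / 2.0
    return complex_nM / peptide_nM

if __name__ == "__main__":
    for s in (50, 100, 250, 500, 1000):       # STAT3 titration points, nM
        fb = bound_fraction(s, peptide_nM=12.0, kd_nM=285.0, f_active=0.86)
        print(f"[STAT3] = {s:5d} nM  ->  fb = {fb:.3f}")
```

In this form the bound fraction saturates at f rather than at 1, which is how an inactive (hydrolyzed) peptide population would show up in the titration curve.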

Figure 4.4-3: (A) FCS curves measured for the titration of the binding of the GpYLPQTV peptide to STAT3. The peptide concentration is fixed at 12 nM and the STAT3 concentration increases from 50 nM to 1 μM. (B) The fraction of STAT3-bound peptide found by FCS as a function of STAT3 concentration, fitted to Eq. 4.4-1. (C) The fraction of STAT3-bound peptide found by FA as a function of STAT3 concentration, fitted to Eq. 4.4-2.

A similar titration of binding can be made using FA, as bound peptide would exhibit higher anisotropy values due to slower rotation of the much larger host protein. As FA is an additive property, the following equation can be employed to determine kd:

where rmax is the maximum measured anisotropy (from the STAT3-bound peptide), and rmin is the minimum measured anisotropy (from free peptide). Note that rmax and rmin were assumed to be constant. Curve fitting of anisotropy data with this model resulted in kd' = 209 ± 36 nM. Upon correction for the presence of inactive peptide, as measured by FCS (kd = kd'/f), this "apparent" value scales to the "real" value kd = 243 ± 46 nM, close to what was measured simultaneously by FCS.

To evaluate whether the ditopic inhibitor PMA 1 conserved its STAT3-binding capability when conjugated to the cholesterol steroid, we conducted two control FP binding experiments using the TECAN Infinite M1000. A first measurement probed the binding affinity of the fluorescent PMA (PMA 3) with full-length STAT3 (Figure 4.4-4A). The concentration of PMA 3 is fixed at 10 nM, and the concentration of STAT3 is increased up to 1.5 μM. As FP is not an additive property, a modified form of Eq. 4.4-2 is used instead to obtain kd, using the relation25:

where P is the overall measured polarization, Pmin and Pmax are the minimum and maximum measured fluorescence polarizations respectively.
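The distinction between the additive anisotropy and the non-additive polarization can be made concrete with a short sketch. It assumes the textbook relations r = 2P/(3 − P) and P = 3r/(2 + r), and that the free and bound species are equally bright, so that the anisotropy of the mixture is a simple weighted average. The function names and the numerical polarization values are illustrative assumptions, not measured data, and the sketch is not meant to reproduce Eq. 4.4-3 itself.

```python
def anisotropy_from_polarization(p):
    """Convert polarization P to anisotropy r (textbook relation)."""
    return 2.0 * p / (3.0 - p)

def polarization_from_anisotropy(r):
    """Convert anisotropy r back to polarization P."""
    return 3.0 * r / (2.0 + r)

def mixture_anisotropy(fb, r_min, r_max):
    """Anisotropy of a free/bound mixture, assuming equal brightness."""
    return fb * r_max + (1.0 - fb) * r_min

def bound_fraction_from_polarization(p, p_min, p_max):
    """Recover the bound fraction from a measured P by going through r."""
    r = anisotropy_from_polarization(p)
    r_min = anisotropy_from_polarization(p_min)
    r_max = anisotropy_from_polarization(p_max)
    return (r - r_min) / (r_max - r_min)

if __name__ == "__main__":
    # Illustrative (assumed) polarizations of the free and bound ligand
    p_min, p_max = 0.02, 0.17
    fb_true = 0.40
    r_mix = mixture_anisotropy(fb_true,
                               anisotropy_from_polarization(p_min),
                               anisotropy_from_polarization(p_max))
    p_mix = polarization_from_anisotropy(r_mix)
    # Linear interpolation directly in P is slightly biased
    print("naive fb from P:", (p_mix - p_min) / (p_max - p_min))
    # Converting to anisotropy first recovers the input fraction
    print("fb via r       :", bound_fraction_from_polarization(p_mix, p_min, p_max))
```

For polarizations this small the bias from interpolating directly in P is modest, but it is systematic, which is why a relation written in terms of P has to differ from the anisotropy expression.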

Encouragingly, a dissociation constant of kd = 72 ± 23 nM is obtained by fitting these data to Eq. 4.4-3. Although the data are not corrected for the possible fraction of inactive peptide, the small kd value obtained indicates that covalently coupling the peptide into the PMA via the fluorescein bridge largely preserves its binding affinity for the protein.

Figure 4.4-4: (A) FP titration of STAT3-PMA 3 binding (solid squares), fitted to a curve (red line) described by Eq. 4.4-3; (B) Competitive FP experiments to determine the dissociation constant of STAT3-PMA 1 (solid circles), fitted to a curve with an optimized kd. Please refer to the text for details.

To evaluate the dissociation constant of the non-fluorescent PMA 1 with the protein, a competitive FP measurement23, 26 was performed using the TECAN Infinite M1000. Since PMA 1 has a low solubility in water, a buffer containing 50% ethanol had to be used instead of pure aqueous buffer. Serial concentrations of PMA 1 were first incubated with STAT3 protein (150 nM) to equilibrate at room temperature for 15 min. f-Peptide was then added at a final concentration of 10 nM and incubated for 30 min at room temperature, after which the FP measurements were taken using the Infinite M1000. Because PMA 1 and f-Peptide compete for binding to STAT3, the polarization of the mixture decreases with increasing concentration of PMA 1. Due to the complexity of the competitive binding experiments, the dependence of the FP of the sample on the PMA 1 concentration cannot be represented by a simple model. Instead, the fractions of bound PMA 1 should be numerically solved based on the protein, f-Peptide and PMA 1 concentrations, and an assumed dissociation constant kd of PMA 1 with STAT3, following the theory for competitive binding models described in reference23. An iterative routine was written in LabView to find an optimized kd of PMA 1 with STAT3. Encouragingly, we found that the cholesterol conjugate retained a good binding potency for the STAT3 SH2 domain (kd = 132 ± 35 nM).
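The numerical procedure described above can be sketched as follows. This is a minimal illustration, assuming simple 1:1 competition of the fluorescent probe and the inhibitor for a single site on the protein, and solving the protein mass balance for the free protein concentration by bisection. It is not the LabView routine used in this work: all function names, the probe kd, the synthetic "measured" points and the grid of trial kd values are hypothetical.

```python
def free_protein(p_tot, l_tot, i_tot, kd_l, kd_i, tol=1e-12):
    """Solve the protein mass balance for the free protein P by bisection.

    p_tot = P + l_tot*P/(kd_l + P) + i_tot*P/(kd_i + P); the right-hand
    side increases monotonically with P, so the root in [0, p_tot] is unique.
    """
    lo, hi = 0.0, p_tot
    while hi - lo > tol * max(p_tot, 1.0):
        mid = 0.5 * (lo + hi)
        total = mid + l_tot * mid / (kd_l + mid) + i_tot * mid / (kd_i + mid)
        if total > p_tot:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def probe_bound_fraction(p_tot, l_tot, i_tot, kd_l, kd_i):
    """Fraction of fluorescent probe bound in the presence of a competitor."""
    p = free_protein(p_tot, l_tot, i_tot, kd_l, kd_i)
    return p / (kd_l + p)

def fit_kd_inhibitor(i_series, fb_measured, p_tot, l_tot, kd_l, kd_grid):
    """Grid search for the competitor kd that minimizes squared residuals."""
    def sse(kd_i):
        return sum((probe_bound_fraction(p_tot, l_tot, i, kd_l, kd_i) - fb) ** 2
                   for i, fb in zip(i_series, fb_measured))
    return min(kd_grid, key=sse)

if __name__ == "__main__":
    # Concentrations in nM: 150 nM protein and 10 nM probe as in the assay.
    # The probe kd and the noise-free data points are assumed for illustration.
    p_tot, l_tot, kd_l = 150.0, 10.0, 285.0
    i_series = [10, 30, 100, 300, 1000, 3000]
    synthetic = [probe_bound_fraction(p_tot, l_tot, i, kd_l, 132.0)
                 for i in i_series]
    grid = [float(k) for k in range(10, 501)]
    best = fit_kd_inhibitor(i_series, synthetic, p_tot, l_tot, kd_l, grid)
    print("recovered kd (nM):", best)
```

In a real analysis the bound fraction would first have to be obtained from the measured polarization, and a proper least-squares optimizer with uncertainty estimates would replace the coarse grid search used here for brevity.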

4.4.3. Quantitative Fluorescence Labeling of STAT3

To investigate the proposed inhibition scheme by using PMA to anchor STAT3 to the cell membrane, the protein needs to be labeled with fluorescent markers. The cysteine thiol groups in the protein were labeled with TMR-maleimide27 (see Chapter 2 for protein thiol-labeling protocol). The resulting protein (TMR–STAT3) was measured by FCS and it was found, by comparison with a fluorescent bead of known diameter (24 nm), that the hydrodynamic radius of STAT3 is 11.9 nm (Figure 4.4-5A). The experimental curve (green) is considerably different from that of free TMR (red), confirming the success of the labeling and purification protocol applied to the STAT3 protein. To quantify the extent of dye labeling, the protein was spin-coated into a poly(methyl methacrylate) (PMMA) polymer matrix. This procedure immobilizes and separates individual molecules in a thin polymer film for single-molecule imaging. By analyzing single-molecule intensity data, it was found that each protein molecule contained an average of ca. 2.4 fluorophores, up to a maximum of 6 (Figure 4.4-5B). Upon close examination of the protein sequence and crystal structure, it is apparent that there are 8 cysteine residues in STAT3 which could potentially be labeled by this procedure, but they vary widely in terms of solvent accessibility.

Figure 4.4-5: (A) FCS curves for TMR-maleimide (red), TMR-STAT3 (green) and 24 nm fluorescent microspheres (blue). (B) Histogram of the number of TMR dye labels per STAT3 protein. The solid line is the Poisson distribution that best fits the data, giving an average of λ = 2.38.
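For reference, the Poisson model used for the labeling histogram can be evaluated directly. The sketch below is illustrative only: it assumes a plain (untruncated) Poisson distribution with the fitted mean λ = 2.38, and the function name `poisson_pmf` is hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k dye labels for a Poisson mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

if __name__ == "__main__":
    lam = 2.38                        # fitted mean from Figure 4.4-5B
    for k in range(0, 9):             # up to the 8 cysteines in STAT3
        print(f"P({k} dyes) = {poisson_pmf(k, lam):.3f}")
    # Probability that a protein carries at least one dye
    print(f"P(>=1 dye) = {1.0 - poisson_pmf(0, lam):.3f}")
```

Under this simple model about 9% of the proteins would carry no dye at all. Such molecules do not appear in a fluorescence intensity histogram, so a zero-truncated Poisson distribution would be a possible refinement of the model.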

To assess whether fluorescence labeling compromised the phosphopeptide-binding function of the STAT3 SH2 domain, we conducted a series of control experiments. Samples of 6 nM FNAc-peptide with 250 nM TMR-STAT3 were incubated for 30 minutes prior to measurements. Simultaneously detected FA and FCS data confirmed that the binding affinity of TMR–STAT3 for the inhibitor was similar to that of the unlabeled protein27. We reasoned that successful binding of the phosphorylated ligands would confirm that TMR labeling had not affected SH2 domain function.

4.4.4. Characterization of F-NAc marker

PMA 3 is a fluorescent version of PMA 1. For imaging as well as for single-molecule experiments of the PMA-STAT3 interaction, the fluorescent bridge between cholesterol and peptide needs to be photostable. A novel fluorescein derivative was developed to be incorporated into the structure of the peptide-drug28. Extensive spectral characterization revealed that the new dye is ideally suited for the STAT3 application, as well as other single-molecule studies. The new dye, F-NAc, contains two readily accessible functional groups that can be utilized or blocked independently (Figure 4.4-6). This capability recommends F-NAc as a linking agent between the STAT3-recognizing peptide and the membrane-anchoring cholesterol.

Figure 4.4-6: 2D geometric model of the chemical structure of F-NAc with two accessible functional groups identified. The aniline group is used to couple the STAT3 recognition peptide, and the phenol group is used to couple cholesterol.

F-NAc exhibits remarkable pH-invariance over a large range of acidities. This is uncommon among fluorescent dyes, but it is a necessary trait for studying biological systems in general and cancerous cells in particular. The absorption and the emission spectra, as well as the quantum yield and fluorescence decay lifetime, hardly change with pH. For pH buffers between 4 and 10, the emission peak remained at 520 nm, the quantum yield held almost constant at Φ = 0.22, as did the fluorescence lifetime at τ = 2.8 ns.

Figure 4.4-7: (A) pH dependence of the emission spectrum of F-NAc. The spectra were normalized by their optical density at the excitation wavelength, 470 nm. (B) Absorption and emission spectra of F-NAc (blue) and FITC (orange) at pH 8. Also shown are the absorption spectrum of the biotinylated derivative (black, left panel) and the emission spectrum of F-NAc excited at 457 nm (red, right panel).

Photostability is also a distinguishing characteristic of fluorescent dyes used in biological studies. To test this, small volumes of F-NAc were illuminated in a wide-field fluorescence microscope and the signal was captured by an EMCCD camera. As seen in Figure 4.4-8A, F-NAc exhibits excellent photostability, with an estimated 50% photobleaching lifetime of about 15 minutes. This is considerably longer than for the typical fluorescein derivative FITC, which bleaches rapidly with a characteristic half-time of ca. 13 seconds, in agreement with previous reports29.

Figure 4.4-8: (A) Photobleaching kinetics of F-NAc and FITC measured using a CCD camera. The fluorescence was integrated across each frame and normalized for each sample to the initial signal. (B) Intensity time trajectory for a selected F-NAc molecule under continuous laser excitation.

The observed resistance to photobleaching recommends F-NAc for long-term observation of intracellular structures in confocal microscopy studies. We also tested the new fluorescein derivative in our custom-built confocal microscope that has single-molecule detection capability and enables the identification of single fluorophores by their spectrum or lifetime. An intensity time trajectory of a typical molecule immobilized on the surface via a biotin-streptavidin linkage is shown in Figure 4.4-8B. On average, we detected ca. 4.5 × 10⁴ photons from a single F-NAc molecule prior to its irreversible photobleaching, about three times more than the average number of photons detected from an individual FITC molecule under similar conditions, 1.6 × 10⁴ photons. Our data recommend the new fluorophore for a wide range of single-molecule investigations beyond the STAT3 project, in particular those aimed at resolving protein interactions.

4.5. STAT3 Distribution Inside Large Liposomes

To assess the efficacy of our STAT3 PMAs, we developed a series of in vitro fluorescence-based assays to visualize and quantify PMA-modulated STAT3 localization in micron-sized liposome model systems30. The protein localization images were acquired in the wide-field mode on our home-built TIRFM, which was described in detail in section 2.2 of this thesis.

Figure 4.5-1: Wide-field fluorescence images of liposomes: prepared with 1 mol % of PMA 3 (A); encapsulating TMR-STAT3, without PMA (B); encapsulating TMR-STAT3 with 1 mol % of PMA 1 (C). The scale bar is 1 µm.

Initial test experiments were aimed at measuring the distribution of the F-NAc conjugated inhibitor, PMA 3, encapsulated in micron-sized lipid vesicles. This was done with and without the STAT3 protein, and localization of PMA 3 to the liposome boundary was clearly observed (Figure 4.5-1A). As a further control, we prepared lipid vesicles containing only dye-labelled target protein (TMR-STAT3). In this case, in the absence of PMA compounds, the protein was found to be distributed uniformly throughout the aqueous interior of the liposome (Figure 4.5-1B). Additional experiments conducted at varying STAT3 concentrations consistently showed similar protein distributions and thus confirmed that the nonspecific adsorption/insertion of STAT3 to the lipid membrane was negligible. Remarkably, when the liposomes were treated with the non-fluorescent inhibitor, PMA 1, at levels as low as 1 mol%, STAT3 was sequestered at the liposome membrane. Indeed, as clearly seen in Figure 4.5-1C, most of the TMR fluorescence comes from the membrane region. This dramatic change in the spatial distribution of STAT3 can be interpreted in terms of the strong binding affinity between the protein and the ligand, which is effectively tethered to the membrane through the cholesterol moiety (Figure 4.5-1A). Continuous imaging of the liposomes for several minutes (Figure 4.5-2) showed that the association of the protein with the lipid membrane was stable over tens of minutes and that the interaction with the PMA 1 inhibitor was strong.

Figure 4.5-2: Time-lapse imaging of STAT3 inside liposomes prepared with 1 mol% PMA 1. The exposure time of each frame is 100 ms. The sample was illuminated only while taking images to reduce photobleaching. The scale bar is 4 µm.

4.6. Direct Observation of STAT3 Anchoring on a Lipid Bilayer

The STAT3-PMA 1-membrane interactions were measured quantitatively by incubating TMR-STAT3 (50 μL, 4 nM) on a POPC lipid bilayer supported on a glass coverslip. The data were acquired under evanescent field illumination on our TIRFM to avoid capturing the signal from free TMR-STAT3 proteins diffusing in the solution above the lipid bilayer. In the absence of PMA 1, nonspecific adsorption of single TMR-STAT3 proteins onto the lipid bilayer was observed by TIRFM imaging. The adhesion of the protein to the lipid surface is described by an average of 12.4 ± 2.6 molecules on an area of 16 × 16 μm² (Figure 4.6-1A). This coverage is very similar to the protein adhesion observed on the hydrophilic surface of plasma-cleaned glass coverslips (Figure 4.6-1B). On the other hand, when the lipid bilayer was exposed to PMA 1 (1 mol%) prior to the incubation with TMR-STAT3, more than 450 proteins were found immobilized on the same area (Figure 4.6-1C). Since phosphopeptides are often de-phosphorylated by intracellular phosphatases/esterases, a non-phosphorylated analogue of PMA 1 (np-PMA) was prepared as a control to investigate whether phenolic species alone could sequester STAT3 at the lipid bilayer. Interestingly, the phenolic PMA elicited moderate levels of STAT3 membrane localization (Figure 4.6-1D), around four times higher than the nonspecific adsorption limit. This result was not unexpected, as both Dourlat et al. and Gunning et al. have previously reported similar behavior, whereby the corresponding non-phosphorylated STAT3-binding sequence was found to retain moderate STAT3-binding activity31, 32.

Figure 4.6-1: TIRFM images of adsorbed STAT3 on a supported lipid bilayer: (A) without PMA, (B) on bare plasma-cleaned coverslip, (C) with PMA 1 (1 mol%), (D) with np-PMA (1 mol%) and (E) with PMA containing a long PEG linker. Scale bar: 2 µm.

The presence of the PEG linker in PMA 2 drastically reduced the membrane anchoring of STAT3 to about twice the level of the non-specific adsorption limit (Figure 4.6-1E). This surprising decrease in membrane localization may be caused by the folding of the cholesterol-linker-peptide complex. Folding of the complex could prevent both membrane anchoring and STAT3 binding. In the case of PMA 1, the surface coverage increased noticeably with incubation time and was limited only by the distribution of ligands on the surface of the bilayer. Successive imaging of the same membrane area also showed individual STAT3 molecules diffusing in the plane of the lipid membrane. The biophysical data shown herein demonstrate unambiguously the potency of protein-specific PMAs.

4.7. STAT3-Inhibitor Binding Observed at SM Level

Binding affinity between a ligand and its receptor protein is an important parameter characterizing ligand-protein interactions. The rates of ligand-protein association and dissociation are also important parameters to characterize the potency of the ligand, since they are directly related to the energetics of binding and release. The association and dissociation are usually described as ON and OFF states of a ligand with respect to its receptor protein, and SMF spectroscopy is ideally suited to capture these events. By means of SMF anisotropy of the PMA(no cholesterol)-STAT3 pair we were able to probe the ON and OFF binding states, and thus obtain quantitative information about the strength of their association.

As stated in Chapter 3, a primary obstacle to probing ligand-protein interactions in single-molecule experiments is the low concentration required to spatially separate the molecules of interest, while the ligand-protein interaction needs relatively high local concentrations to occur (100 nM – 10 μM). One strategy to satisfy both requirements is to encapsulate a single pair of ligand and protein molecules inside small liposomes (~0.1 μm diameter) that are then sparsely immobilized on the coverslip for SMF measurements. Inside these vesicles, the effective solute concentration is around 3 μM, sufficiently high for typical ligand-protein interactions. For the peptide-STAT3 pair, this concentration is ca. 10 times higher than their dissociation constant.

The diameter of lipid vesicles prepared by extrusion, with or without freeze-thaw (FT) cycles, follows a relatively broad distribution rather than taking a single value (Chapter 3). Extruding multilamellar vesicles (MLVs) through a 0.1 μm membrane leads to vesicles having a broad size distribution, with diameters from tens to hundreds of nanometers. However, smaller vesicles have poorer encapsulation efficiency than larger vesicles.
Although a large fraction of vesicles has a diameter smaller than 0.1 μm, most of them are empty when the sample is hydrated at a concentration of 1 μM or less. Therefore, an apparent vesicle size distribution for SMF studies is obtained when taking into account the encapsulation efficiency in our preparation method (Figure 4.7-1). The larger vesicles provide a lower local concentration (~0.4 μM for 0.2 μm vesicles), which approaches the dissociation constant of PMA-STAT3 and thus increases the chance to spontaneously observe dissociation events. Vesicles larger than 0.3 μm usually contain more than one PMA-STAT3 pair and can easily be eliminated from analysis based on the recorded photobleaching behavior. The uncertainty in local concentrations due to vesicle size variations is not important if we only consider the dissociation rate, which is given by the inverse of the average “ON” time of the ligand on the protein. The dissociation rate depends on the intermolecular binding forces between PMA and STAT3 and on their diffusion constants24. In addition to diffusion and energy barriers, the association kinetics also depend on the bulk concentration of the ligand and the protein.
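The local concentrations quoted above follow from simple geometry: one molecule confined in a sphere of a given diameter. The short Python sketch below reproduces the ~3 μM and ~0.4 μM figures. It assumes a perfectly spherical vesicle whose inner diameter equals the nominal diameter (the bilayer thickness is ignored), and the function name is chosen here for illustration only.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def effective_concentration_uM(diameter_um, n_molecules=1):
    """Concentration (in uM) of n molecules confined in a spherical vesicle.

    Assumption: the inner diameter equals the nominal vesicle diameter.
    """
    radius_dm = (diameter_um / 2.0) * 1e-5  # 1 um = 1e-5 dm, so the volume is in litres
    volume_l = (4.0 / 3.0) * math.pi * radius_dm ** 3
    return n_molecules / (AVOGADRO * volume_l) * 1e6  # mol/L -> umol/L


for d in (0.1, 0.2, 0.3):
    # Roughly 3.2 uM for 0.1 um and 0.4 uM for 0.2 um, as stated in the text
    print(f"{d:.1f} um vesicle: {effective_concentration_uM(d):.2f} uM")
```

Because the volume scales with the cube of the diameter, doubling the vesicle size lowers the single-molecule concentration eightfold, which is why the vesicle size distribution matters for association but not for the dissociation rate.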

Figure 4.7-1: The size distribution of lipid vesicles prepared via extrusion through a 0.1μm porous membrane. Encapsulation efficiency is taken into account, since most small vesicles are empty when hydrating the sample at ~ 1μM or less.

From previous FP studies of this complex, we learned that the free and STAT3-bound PMAs have different FP values, ~ 0.02 and ~ 0.16-0.18 on average, respectively33. These values may vary depending on chemical groups present in the PMA molecule. The single-molecule FP study was performed on TMR-labelled PMA, co-encapsulated with STAT3 inside surface-tethered vesicles. To ensure that the ligand is not anchored in the membrane but is freely diffusing inside the liposome, we removed the cholesterol motif from the PMA molecule and used only the peptide sequence. Polarization-resolved intensity trajectories were recorded for more than 200 single TMR-PMA molecules co-trapped with STAT3 in vesicles. Interestingly, the majority of the trajectories show a single FP state, indicating the presence of either free or protein-bound PMA before the photobleaching of the fluorophore (Figure 4.7-2A). The transitions between the two FP states occur rarely (Figure 4.7-2B), indicating that the average ON time (perhaps also the OFF time) is much longer than the average photobleaching time of the TMR probe (~30 s). A more detailed SMF investigation of the inhibitor-STAT3 interaction, using both the vesicle encapsulation and the supported lipid bilayer approaches which I developed, is currently being performed in the Gradinaru lab by a new graduate student.

Figure 4.7-2: Polarized intensity-time trajectory recorded for a single pair of PMA and STAT3 trapped inside a surface-immobilized liposome. The fluorescence signals in the I∥ and I⊥ channels are shown in blue and orange, respectively. (A) PMA bound to STAT3; (B) PMA free for t = 0-12 s and 16-17 s, then bound to STAT3 for t = 12-16 s and 17-22 s; (C) FP trajectory built from B to show the “ON” and “OFF” states of PMA on STAT3.
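As an illustration of how an FP trajectory such as the one in panel C can be built from the two polarized intensity channels, the sketch below computes the polarization per time bin and assigns a bound or free state by thresholding. The correction factor `g_factor`, the threshold of 0.1 (placed between the free value of ~0.02 and the bound value of ~0.16-0.18) and the photon counts in the example are assumptions made for illustration; they are not the analysis parameters or data of this work.

```python
import numpy as np


def polarization(i_par, i_perp, g_factor=1.0):
    """Fluorescence polarization P = (I_par - G*I_perp) / (I_par + G*I_perp)."""
    i_par = np.asarray(i_par, dtype=float)
    i_perp = g_factor * np.asarray(i_perp, dtype=float)
    total = i_par + i_perp
    # Time bins without photons have an undefined polarization and are set to NaN
    return np.divide(i_par - i_perp, total,
                     out=np.full_like(total, np.nan), where=total > 0)


def assign_states(fp, threshold=0.1):
    """Return True ("ON", bound) where FP exceeds the threshold, else False ("OFF")."""
    return np.asarray(fp) > threshold


# Made-up photon counts per time bin, for illustration only
i_par = [102, 118, 117, 51]
i_perp = [98, 82, 83, 49]
fp = polarization(i_par, i_perp)
print(fp)                 # polarization per bin
print(assign_states(fp))  # bound/free label per bin
```

A simple threshold is only reasonable when the photon counts per bin are high enough that shot noise does not blur the two FP levels; otherwise a hidden-Markov or change-point analysis would be a more robust choice.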

4.8. STAT3 Inhibition in Cells

The whole-cell inhibitory activity of PMA 1 was investigated in v-Src (Sarcoma)-transformed mouse fibroblasts20. Briefly, NIH3T3/v-Src fibroblasts that harbor aberrant STAT3 activity were treated with PMA 1 (at both 10 and 50 μM) for 24 h, and then nuclear extracts were prepared and subjected to a STAT3-STAT3:DNA-binding assay in vitro with an hSIE probe. The level of DNA-binding activity of STAT3 protein was then determined by an electrophoretic mobility shift assay (EMSA) analysis (Figure 4.8-1A). Encouragingly, the potent STAT3-binding activity of PMA 1 in vitro was reflected in this assay, in which the DNA-binding activity of STAT3 was repressed significantly in a dose-dependent manner (Figure 4.8-1A). The reduced levels of STAT3 activity might indicate that PMA 1 anchors STAT3 to the membrane and thus prevents phosphorylation and subsequent nuclear translocation.

Figure 4.8-1: (A) EMSA analysis of the inhibition of STAT3-STAT3 dimerization (as judged by the disruption of the STAT3–STAT3:DNA complex) by PMA 1 (10 and 50 μM) in whole NIH3T3/v-Src cells (the small molecule inhibitor is absent in the control). The position of the STAT3-STAT3:DNA complex in the gel is shown; the results are representative of two independent assays. (B) Cytosolic extracts of equal total protein were prepared from NIH3T3/v-Src fibroblasts (treated with PMA 1 for 24 h or untreated) that stably express the STAT3-dependent luciferase reporter (pLucTKS3) or from treated or untreated NIH3T3/v-Src fibroblasts that stably express the STAT3-independent luciferase pLucSRE and analyzed for luciferase activity with a luminometer.

Next, luciferase reporter studies were conducted to further determine the effects of PMA 1 on the transcriptional activity of STAT3. Encouragingly, the results showed that treatment of v-Src-transformed mouse fibroblasts that stably express the STAT3-dependent luciferase reporter (NIH3T3/v-Src/pLucTKS3) with PMA 1 at low-micromolar concentrations significantly repressed the induction of the STAT3-dependent reporter (Figure 4.8-1B). To examine nonspecific effects, the NIH3T3/v-Src/pLucSRE fibroblasts, which overexpress the STAT3-independent luciferase reporter (pLucSRE), were similarly treated with PMA 1 and a luciferase assay was performed on the cytosolic extracts. Most encouragingly, STAT3-independent luciferase activity was not inhibited in cells treated with PMA 1, even at concentrations of up to 80 μM (Figure 4.8-1C). These data corroborate the EMSA analysis of STAT3 DNA-binding activity and indicate that PMAs may specifically inhibit STAT3 nuclear translocation and thus inhibit the expression of genes targeted by STAT3.

Figure 4.8-2: Inhibition of the EGF-induced nuclear translocation of STAT3 by PMA. Serum-starved NIH3T3/hEGFR cells were treated with PMA 1 (50 μM) for 3 h and then stimulated with recombinant human epidermal growth factor (rhEGF; 100 ng/mL) for 15 min, immunostained with anti-STAT3 (green) or stained with DAPI (nucleus, blue), and analyzed by laser-scanning confocal microscopy.

To verify that PMAs inhibit STAT3 nuclear translocation, NIH3T3/hEGFR mouse fibroblasts overexpressing the human epidermal growth factor receptor (hEGFR) were starved of serum for 24 h and either treated or not with PMA 1 for 3 h prior to stimulation with EGF for 15 min. Cells were then subjected to immunostaining for STAT3 (green) or staining of the nucleus with DAPI and analyzed by laser-scanning confocal microscopy for the inhibition of STAT3 nuclear translocation. In the resting NIH3T3/hEGFR fibroblasts, STAT3 (green) was predominantly distributed within the cytoplasm, with limited presence in the nucleus (blue; Figure 4.8-2 A-C). Subsequent stimulation of untreated cells with EGF induced a strong nuclear presence of STAT3 (cyan for merged STAT3 (green) and DAPI (blue-stained nucleus); Figure 4.8-2 D-F). The treatment of resting NIH3T3/hEGFR cells with PMA 1 (50 μM) elicited a visible decrease in nuclear STAT3 protein and noticeable association with the cytoplasmic and intracellular membranes (Figure 4.8-2 G-I). Most encouragingly, EGF-stimulated STAT3 nuclear localization was strongly blocked when cells were pretreated with PMA 1 (50 μM) before stimulation with EGF (Figure 4.8-2 J-L), presumably through the PMA-mediated hindrance of the binding of the STAT3 SH2 domain to pY motifs of receptors and the prevention of de novo phosphorylation by tyrosine kinases. We conclude that PMA 1 probably captured STAT3 and anchored it to the membrane. To further define the mode of action of the PMA and to confirm that the inhibitory effects observed are predominantly a consequence of the PMA-mediated localization of STAT3 exclusively to the plasma membrane, we conducted an immunofluorescence study in combination with laser-scanning confocal microscopy for the localization of STAT3 (Figure 4.8-3).
The human breast-cancer cell line MDA-MB-231, which harbors constitutively active STAT3, was treated with PMA 1 (25 μM) and then subjected to staining with the plasma-membrane stain FM-4-64 (red) or the nuclear stain DAPI (blue), or to immunostaining with AlexaFluor488 antibody (green) for STAT3 detection. Again, we observed complete inhibition of STAT3 nuclear translocation in PMA-treated cells, whereas in untreated MDA-MB-231 cells, activated STAT3 resided predominantly in the nucleus (Figure 4.8-3). Most encouragingly, not only was STAT3 completely excluded from the nucleus in the presence of PMA 1, but the colocalization images unequivocally showed that it was predominantly localized at the labeled plasma membrane (yellow). The present results provide a proof-of-concept for the membrane-anchoring properties of PMAs. Membrane anchorage is a novel mechanism for the inhibition of STAT3 function: prevention of the nuclear translocation of STAT3 inhibits its transcriptional function. As the probes contain a labile phosphotyrosine moiety, biological activity may be decreased as a result of intracellular dephosphorylation and/or peptide hydrolysis. Therefore, the Gunning Lab is now in the process of preparing more metabolically resistant PMA inhibitors.

Figure 4.8-3: Induction of STAT3 membrane anchorage and inhibition of the nuclear translocation of STAT3 by PMA 1. MDA-MB-231 breast tumor cells were treated with PMA 1 (25 μM) for 6 h, immunostained with membrane stain FM-4-64 (red) or anti-STAT3 antibody (green) or stained with DAPI (nucleus, blue), and analyzed by laser-scanning confocal microscopy.

4.9. Summary

In summary, we presented a novel inhibitor for the targeting of aberrant signaling proteins associated with human disease. We demonstrated that PMA-induced protein localization is a conceptually viable therapeutic strategy with STAT3 as our model in liposome, lipid-bilayer, and whole-cell systems. We envisage that the successful application of PMAs in tumors with aberrant STAT3 activity will be of significant therapeutic importance. Further studies to determine the biochemical and biological utility of more drug-like, nonphosphorylated PMAs are ongoing. Future related approaches will incorporate more drug-like, less peptidic STAT3 binders that are less prone to metabolic degradation. The lipogel systems, which provide a better modeling of the cell system and are useful for future drug delivery applications, are under investigation.


References:

1. Aggarwal, B.B., et al., Targeting signal-transducer-and-activator-of-transcription-3 for prevention and therapy of cancer: modern target but ancient solution. Ann N Y Acad Sci, 2006. 1091: p. 151-69.
2. Drewry, J.A., et al., Coordination complex SH2 domain proteomimetics: an alternative approach to disrupting oncogenic protein-protein interactions. Chem Commun (Camb), 2010. 46(6): p. 892-4.
3. Levy, D.E. and J.E. Darnell, Jr., Stats: transcriptional control and biological impact. Nat Rev Mol Cell Biol, 2002. 3(9): p. 651-62.
4. Becker, S., B. Groner, and C.W. Muller, Three-dimensional structure of the Stat3beta homodimer bound to DNA. Nature, 1998. 394(6689): p. 145-51.
5. Uversky, V.N., C.J. Oldfield, and A.K. Dunker, Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys, 2008. 37: p. 215-46.
6. Turkson, J. and R. Jove, STAT proteins: novel molecular targets for cancer drug discovery. Oncogene, 2000. 19(56): p. 6613-26.
7. De Vos, J., et al., JAK2 tyrosine kinase inhibitor tyrphostin AG490 downregulates the mitogen-activated protein kinase (MAPK) and signal transducer and activator of transcription (STAT) pathways and induces apoptosis in myeloma cells. Br J Haematol, 2000. 109(4): p. 823-8.
8. Fletcher, S., J. Turkson, and P.T. Gunning, Molecular approaches towards the inhibition of the signal transducer and activator of transcription 3 (Stat3) protein. ChemMedChem, 2008. 3(8): p. 1159-68.
9. Zhang, X., et al., A novel small-molecule disrupts Stat3 SH2 domain-phosphotyrosine interactions and Stat3-dependent tumor processes. Biochem Pharmacol, 2010. 79(10): p. 1398-409.
10. Mandal, P.K., W.S. Liao, and J.S. McMurray, Synthesis of phosphatase-stable, cell-permeable peptidomimetic prodrugs that target the SH2 domain of Stat3. Org Lett, 2009. 11(15): p. 3394-7.
11. Mandal, P.K., et al., Conformationally constrained peptidomimetic inhibitors of signal transducer and activator of transcription. 3: Evaluation and molecular modeling. J Med Chem, 2009. 52(8): p. 2429-42.
12. Page, B.D., D.P. Ball, and P.T. Gunning, Signal transducer and activator of transcription 3 inhibitors: a patent review. Expert Opin Ther Pat, 2011. 21(1): p. 65-83.
13. Haftchenary, S., M. Avadisian, and P.T. Gunning, Inhibiting aberrant Stat3 function with molecular therapeutics: a progress report. Anticancer Drugs, 2011. 22(2): p. 115-27.
14. Roskoski, R., Jr., Protein prenylation: a pivotal posttranslational process. Biochem Biophys Res Commun, 2003. 303(1): p. 1-7.
15. Brown, D. and G.L. Waneck, Glycosyl-phosphatidylinositol-anchored membrane proteins. J Am Soc Nephrol, 1992. 3(4): p. 895-906.
16. Cadwallader, K.A., et al., N-terminally myristoylated Ras proteins require palmitoylation or a polybasic domain for plasma membrane localization. Mol Cell Biol, 1994. 14(7): p. 4722-30.
17. Rajendran, L., et al., Efficient inhibition of the Alzheimer's disease beta-secretase by membrane targeting. Science, 2008. 320(5875): p. 520-3.
18. Rajendran, L., H.J. Knolker, and K. Simons, Subcellular targeting strategies for drug design and delivery. Nat Rev Drug Discov, 2010. 9(1): p. 29-42.
19. Guldenhaupt, J., et al., Secondary structure of lipidated Ras bound to a lipid bilayer. FEBS J, 2008. 275(23): p. 5910-8.
20. Avadisian, M., et al., Artificially induced protein-membrane anchorage with cholesterol-based recognition agents as a new therapeutic concept. Angew Chem Int Ed Engl, 2011. 50(28): p. 6248-53.
21. Liu, B., et al., A photostable, pH-invariant fluorescein derivative for single-molecule microscopy. J Fluoresc, 2009. 19(5): p. 915-20.
22. Ren, Z., et al., Identification of a high-affinity phosphopeptide inhibitor of Stat3. Bioorg Med Chem Lett, 2003. 13(4): p. 633-6.
23. Goodrich, J.A. and J.F. Kugel, Binding and Kinetics for Molecular Biologists. Second ed. 2006, New York: Cold Spring Harbor. 182.
24. Jackson, M.B., Diffusion-limited association, in Molecular and Cellular Biophysics. 2006, Cambridge University Press.
25. Lakowicz, J.R., Principles of Fluorescence Spectroscopy. 4 ed. Vol. 1. 2006: SpringerLink.
26. Schust, J. and T. Berg, A high-throughput fluorescence polarization assay for signal transducer and activator of transcription 3. Anal Biochem, 2004. 330(1): p. 114-8.
27. Badali, D., et al., Development of STAT3 as an accessible target for fluorescence-based inhibition assays. Journal of Undergraduate Life Sciences, 2010. 4(1): p. 18-23.
28. Liu, B., et al., A photostable, pH-invariant fluorescein derivative for single-molecule microscopy. J Fluoresc, 2009. 19(5): p. 915-20.
29. Burchak, O.N., et al., Fluorescein-based amino acids for solid phase synthesis of fluorogenic protease substrates. Bioorg Med Chem, 2006. 14(8): p. 2559-68.
30. Liu, B., A. Mazouchi, and C.C. Gradinaru, Trapping single molecules in liposomes: surface interactions and freeze-thaw effects. J Phys Chem B, 2010. 114(46): p. 15191-8.
31. Dourlat, J., et al., A novel non-phosphorylated potential antitumoral peptide inhibits STAT3 biological activity. Biochimie, 2009. 91(8): p. 996-1002.
32. Shahani, V.M., et al., Design, synthesis, and in vitro characterization of novel hybrid peptidomimetic inhibitors of STAT3 protein. Bioorg Med Chem, 2011. 19(5): p. 1823-38.
33. Mazouchi, A., et al., On the performance of bioanalytical fluorescence correlation spectroscopy measurements in a multiparameter photon-counting microscope. Anal Chim Acta, 2011. 688(1): p. 61-9.

Intrinsic Disorder and Electrostatics 119

Electrostatics Tweaks the Conformational Properties of Intrinsically Disordered Proteins


5.1. Introduction

The traditional structure-function paradigm states that proteins need to adopt a well-defined three-dimensional structure in order to perform a specific function1. Until very recently, the dogma in biochemistry was that, under physiological conditions, proteins adopt structured conformations with a specific functionality. Although proteins containing unstructured sequences and/or domains under physiological conditions had been discovered by X-ray crystallography and NMR spectroscopy as early as the late 1970s, the presence of intrinsic disorder in proteins was unfortunately ignored2. Not even one example was mentioned, let alone carefully discussed, in the textbooks. Scientists instead preferred other interpretations, such as “artifacts in experiment”, to explain the unusual observations. With the latest developments in NMR, CD, and other techniques capable of in-depth characterization of structural heterogeneity and dynamics, more and more proteins were found to exhibit intrinsically disordered sequences or domains under physiological conditions3, with most of them vitally important in many biological processes4. According to recent statistical data, 35%-51% of eukaryotic proteins have disordered regions of 30-50 residues or longer2, 5. In bacteria and simpler organisms, the fraction is lower but still significant: 9-37%. Intrinsically disordered proteins (IDPs) play critical yet often poorly understood roles in a variety of cellular processes6, 7. Compared to globular proteins, IDPs differ significantly in amino acid composition, hydrophobicity, net charge, chain dimensions and flexibility, sequence and structural complexity, etc. In general, IDPs lack order-promoting amino acids4, such as hydrophobic and aromatic amino acids. Instead, IDPs are enriched in disorder-promoting amino acids, such as polar and structure-breaking amino acids (Gly, Pro).
They are mostly involved in biological activities such as protein-protein binding, membrane anchoring, molecular recognition, signalling, and enzymatic activities. Considering their natural abundance, it is surprising how long and convoluted the path to the current knowledge of IDPs has been.

Sic1 is a 283-residue disordered protein7, which is a stoichiometric inhibitor of the cyclin-dependent kinase (CDK) in the budding yeast Saccharomyces cerevisiae. Its main function is to prevent the yeast from prematurely entering the S phase from the G1 phase in the yeast life cycle (Figure 5.1-1A)8. In advanced stages of the G1 phase, the CDK complexes are needed to initiate bud formation and enter the S phase; therefore Sic1 degradation becomes essential and obligatory. This is done via binding to the Cdc4 (cell division control 4) protein, a subunit of the Skp, Cullin, F-box containing complex (SCF complex) ubiquitin ligase, when Sic1 is multi-phosphorylated. Upon recruitment, the SCF complex tags Sic1 with poly-ubiquitin to target it for proteasomal degradation (Figure 5.1-1A).

Figure 5.1-1: (A) Cell-cycle regulation in budding yeast: role of Sic1 is to prevent premature S-phase entry by inhibiting CDK activity; (B) Sic1 needs to be degraded in the advanced stage of G1 phase for the yeast to continue its life cycle.

The full sequence of Sic1 contains nine sites that can be phosphorylated, termed Cdc4 phosphodegrons (CPDs)7, 9. Seven of them, clustered in the 90-residue N-terminal region, were found to be sufficient for targeting Sic1 to Cdc410, and therefore the 90-residue N-terminal fragment of Sic1 was used in our study. The multi-phosphorylation dependence of Sic1 in recognizing its binding partner Cdc4 is interesting and important, although the physical basis is not fully understood. The mean-field statistical model proposed by our collaborator Julie Forman-Kay and her colleagues at the University of Toronto states the importance of electrostatic properties in the interaction of Sic1/pSic1 with Cdc4. Their model also suggests that pSic1 needs to be rather compact in order to achieve cumulative electrostatic interaction with the positively charged binding pocket on Cdc4. In this work, we studied in detail the effects of electrostatics on the conformational properties, flexibility and dynamics of an IDP, Sic1, using single-molecule fluorescence (SMF) spectroscopy methods, such as single-molecule fluorescence resonant energy transfer (smFRET), fluorescence correlation spectroscopy (FCS) and time-resolved fluorescence anisotropy (TRFA). We believe that an explicit representation of the conformational properties of Sic1 under varying conditions will be beneficial to understand the relationship between its structure and function.

5.2. Materials and Methods

5.2.1. Materials

Sic1 Sample Preparation: Sic1 1-90 was expressed as a GST fusion protein at 16 °C in Escherichia coli BL21(DE3) codon plus cells grown in minimal medium with [15N]NH4Cl and/or [13C]glucose as the sole nitrogen and carbon sources, respectively, depending on the desired labeling scheme. Two cysteine residues were added, one at each terminus (N- and C-) of the wild-type Sic1 sequence (Sic1-1c90c), to be labelled with a pair of fluorescent dyes for single-molecule fluorescent resonant energy transfer (smFRET) experiments. The GST-Sic1 protein was purified by using glutathione-Sepharose affinity chromatography. After digestion with TEV protease, GST was precipitated by lowering the pH to 5.0 and spun out at 42,000 × g. Sic1 was purified to homogeneity by using a Superdex 75 column (Amersham Biosciences), as confirmed by mass spectrometry.

We made use of maleimide chemistry and commercially available cysteine-reactive fluorophores to label Sic1-1c90c. Tetramethylrhodamine (TMR)-maleimide (T6027) and Alexa488-maleimide (A10254) were obtained from Invitrogen Canada, while Atto647N-maleimide (05316) and AttoRh101-maleimide (73522) were obtained from Sigma-Aldrich Canada. Upon labelling, the free dye was removed and the protein was purified using Micro Bio-Spin columns (P30-0.5mL, 732-6224, Bio-Rad, Canada). Other chemicals used for various buffer solutions, such as NaCl, guanidinium chloride (GuHCl), Tris(2-carboxyethyl)phosphine (TCEP), etc., were obtained from Sigma-Aldrich Canada, unless specified otherwise.

5.2.2. Quantitative Labeling of Sic1

The principles of protein labelling with fluorescent dyes, purification and quantification of the degree of labelling have been described in detail in Chapter 2 of this thesis. However, they may differ from protein to protein, and IDPs do present specific challenges. The procedure used for labelling and purifying Sic1 is briefly described below. First, the Sic1-1c90c sample was buffer-exchanged into a 50 mM Tris buffer containing 0.15 M NaCl at pH 7.4 for fluorescence labelling. Since Sic1 is highly positively charged, to avoid introducing interference due to protein-fluorophore interactions, we chose TMR (neutral, donor) and Atto647N (+1, acceptor) as the donor-acceptor pair for smFRET experiments (Förster radius of 6.0 ± 0.2 nm). From the stock, the Sic1 protein was diluted to 100 μM in Tris buffer and mixed with a 5× molar excess of TCEP (0.5 mM). After 30 minutes, a 0.5× molar excess of TMR and a 4× molar excess of Atto647N were added to the sample, followed by 3 hours of gentle shaking at room temperature. For FRET-FCS experiments, Alexa488 (donor) and TMR (acceptor; having a Förster radius of 6.5 ± 0.3 nm to increase the sensitivity) were used instead to label the Sic1 protein. For TRFA measurements (described in Chapter 1), a sample of Sic1 that is singly labelled with TMR was prepared following a similar procedure, except that no Atto647N was added during the labelling reaction. The dye-labelled protein was purified in MilliQ water (for the purpose of mass spectrometry) using the P30 Micro Bio-Spin column purchased from Bio-Rad, and kept at -80 °C for long-term storage. A fraction of the labelled Sic1 sample was then sent to the mass spectrometry facility to measure the nature and the extent of labelling.

5.2.3. SMF Spectroscopy Experiments

Observations of dually-labelled Sic1 fluorescence at the single-molecule level were performed using our custom-built multiparameter confocal microscope (described in Chapter 2). Samples were prepared at a protein concentration of 50-100 pM in buffer or MilliQ water in the presence of salt or denaturant at appropriate concentrations. 0.005% Tween-20 v/v (P2278, Sigma Aldrich Canada) was added to the sample to prevent the surface adsorption of the protein. The fluorescence signal of the sample was split by a dichroic mirror (640DCXR, Semrock, Canada) into donor and acceptor channels for smFRET experiments. A combination of three optical filters was used in the detection path to remove Rayleigh and Raman scattering: a sharp long-pass filter (Lp532RS, Semrock) before the dichroic mirror, a bandpass filter (HQ600/80, Chroma) in the donor channel, and a bandpass filter (D705/80, Chroma) in the acceptor channel. TRFA measurements were performed on singly labelled Sic1. In this case, Sic1 was labelled only with the donor (TMR) at a low labelling ratio (average dye/protein < 0.3) to minimize dual labelling. For TRFA, samples were prepared at a protein concentration of 10-50 nM. The sample was excited by a linearly polarized, pulsed laser beam. The fluorescence signal was split using a polarizing cube beamsplitter into parallel and perpendicular polarizations, I∥ and I⊥, with respect to the polarization of the excitation. Two single-photon avalanche diodes (SPAD) (PDM, Optoelectronic Components, Canada) with picosecond instrument response (full width at half maximum (FWHM) ~48 ps) were used in the two channels to accurately record the detection time for each fluorescence photon from the sample. For the smFRET experiments, a red-sensitive SPAD detector (COUNT, Laser Components, USA) was used in the acceptor channel.
The COUNT detector has an improved photon detection efficiency in the 600-1000 nm range (~73% peak at 660 nm) compared to the PDM detector (~52% peak at 550 nm), which is why we preferred it for acceptor signal detection. For FRET-FCS measurements, we inserted a pinhole (50 μm) in the detection path and the fluorescence signal was also split into donor and acceptor channels. Additionally, a non-polarizing cube was inserted in the donor channel to split the signal into two roughly equal parts. These two intensities were used to construct the pseudo-autocorrelation of the donor signal (GDD). The total (sum) signal in the donor channel and the signal in the acceptor channel were used to calculate the cross-correlation between donor and acceptor (GDA). By dividing GDD by GDA, the fluctuations from translational diffusion are eliminated and anticorrelated fluctuations of the donor and acceptor fluorescence become visible, as a marker for intra-molecular conformational dynamics.
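The correlation ratio described above can be sketched numerically as follows. This is a minimal illustration on binned intensity traces with linear lags, not the photon-arrival-time correlator of the actual setup. It assumes that G denotes the normalized fluctuation correlation, G_ab(k) = <δA(t) δB(t+k)> / (<A><B>), and the function names are hypothetical.

```python
import numpy as np


def correlation(a, b, max_lag):
    """Normalized fluctuation correlation of traces a and b for lags 1..max_lag (in bins)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da, db = a - a.mean(), b - b.mean()
    norm = a.mean() * b.mean()
    return np.array([np.mean(da[:-k] * db[k:]) / norm
                     for k in range(1, max_lag + 1)])


def fret_fcs_ratio(donor_1, donor_2, acceptor, max_lag):
    """Ratio G_DD / G_DA for two split donor traces and one acceptor trace."""
    donor_1 = np.asarray(donor_1, dtype=float)
    donor_2 = np.asarray(donor_2, dtype=float)
    # Pseudo-autocorrelation: correlate the two halves of the split donor signal
    g_dd = correlation(donor_1, donor_2, max_lag)
    # Cross-correlation of the total donor signal with the acceptor signal
    g_da = correlation(donor_1 + donor_2, acceptor, max_lag)
    return g_dd / g_da
```

Correlating the two halves of the split donor signal, rather than one detector with itself, avoids detector artifacts such as afterpulsing at short lag times. The ratio becomes noisy wherever G_DA approaches zero, so in practice it is only meaningful over the lag range in which both correlations are well above the noise.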

5.3. Information about Sic1 Obtained from Bulk Measurements

5.3.1. Multi-Phosphorylation Dependent Sic1-Cdc4 Recognition

One interesting and important aspect of Sic1 is the multi-phosphorylation dependence: a threshold in the number of phosphorylation sites on Sic1 is necessary for Cdc4 recognition. Sic1-CPDs are found to be suboptimal binding motifs of Cdc411. In principle, Sic1 with one phosphorylation (one CPD) should already be sufficient for Cdc4 recognition: although the binding affinity is lower than that of multi-phosphorylated Sic1, it is a mere matter of time before all Sic1 proteins will be degraded in the presence of the Sic1-phosphorylating kinase. Apparently, the absence of a threshold in the number of phosphorylation sites on Sic1 will lead to genome instability11. In fact, Sic1 needs to have at least any six of the nine CPD sites phosphorylated to bind Cdc4 at a considerable rate (dissociation constant ~ 0.6 μM11). The binding affinity of Sic1 to Cdc4 depends non-linearly on the average number of phosphorylated sites. Interestingly, inside the budding yeast, this leads to a switch-like response10 of Sic1 degradation to the concentration level of the kinase. The presence of the threshold prevents premature onset of Sic1 degradation and leads to genome stability. The biological significance is therefore clear: it allows the living organism to fine-tune regulatory processes. This phenomenon is well documented in the literature; however, its physical basis, that is, how multi-site phosphorylation drives binding between substrates, still remains unclear. Our collaborator Julie Forman-Kay and her colleagues at the University of Toronto proposed a mean-field statistical model for the pSic1-Cdc4 electrostatic interactions9. The model is based on the fact that the phosphorylation events have negative contributions to the net charge of Sic1 (from +11 to -1 for 6p-Sic1) and therefore change the binding affinity of pSic1 to the binding pocket on Cdc4.
The model also suggests that pSic1 needs to be rather compact in order to achieve cumulative electrostatic interaction with the positively charged binding pocket on Cdc4. The mean-field statistical model states the importance of electrostatic properties in the interaction of Sic1/pSic1 with Cdc4. In the budding yeast, pSic1 will be degraded upon being tagged with a poly-ubiquitin group by the SCF complex. In the case of phosphorylation below the threshold, the dwell time of pSic1 on Cdc4 may not be sufficient for pSic1 to be captured and tagged with the poly-ubiquitin group by the SCF complex. Probing, at the single-molecule level, how the residence time of pSic1 on Cdc4 depends on the degree of Sic1 phosphorylation may provide the missing details.
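The charge bookkeeping behind the model can be made explicit with a few lines of Python. The sketch assumes that each phosphorylation contributes a charge of -2, which is the value implied by the change from +11 to -1 upon six phosphorylations; the function and parameter names are chosen here for illustration.

```python
def net_charge(n_phosphates, unphosphorylated_charge=11, charge_per_phosphate=-2):
    """Net charge of Sic1 carrying n phosphate groups.

    Assumption: every phosphate group contributes -2, as implied by
    the +11 -> -1 change quoted for 6p-Sic1.
    """
    return unphosphorylated_charge + charge_per_phosphate * n_phosphates


for n in range(7):
    print(f"{n} phosphates: net charge {net_charge(n):+d}")
```

Under this assumption the net charge changes sign only at the sixth phosphorylation, which coincides with the threshold of six phosphorylated CPD sites discussed above.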

5.3.2. Sic1 Conformers: ENSEMBLE Calculations

An explicit representation of the Sic1/pSic1 ensemble will be beneficial to understand the relationship between their structure and function. From CD spectroscopy, it was found that Sic1 lacks stable secondary and tertiary structure in either the unphosphorylated or phosphorylated state10, and therefore the structure of neither Sic1 nor pSic1 could be directly resolved by conventional X-ray crystallography. Both Sic1 and pSic1 adopt a significant amount of transient secondary and tertiary structures under physiological conditions. The presence of these transient structures leads to conformational compactness in both Sic1 and pSic1, as indicated by the pulsed-field gradient (PFG) NMR experiments from Forman-Kay’s lab11. They calculated the hydrodynamic radius (Rh) values for Sic1 and pSic1 to be 21.5 ± 1.1 and 19.3 ± 1.4 Å, respectively. Both values are significantly smaller than that expected for a fully denatured 90-residue polypeptide (28.7 Å), and only slightly larger than that expected for a globular protein (17.5 Å).
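The two reference values quoted above can be reproduced with empirical power-law scaling relations between Rh and the number of residues N. The sketch below uses Rh = 4.75 N^0.29 Å for folded proteins and Rh = 2.21 N^0.57 Å for denatured chains, the relations commonly attributed to Wilkins et al. (1999); the text does not state which relations were used, so this choice is an assumption that happens to match the quoted 17.5 Å and 28.7 Å for N = 90. The `compaction` index is a hypothetical helper introduced here for illustration only.

```python
def rh_folded(n_residues):
    """Expected hydrodynamic radius (in Angstrom) of a folded, globular protein."""
    return 4.75 * n_residues ** 0.29


def rh_denatured(n_residues):
    """Expected hydrodynamic radius (in Angstrom) of a fully denatured chain."""
    return 2.21 * n_residues ** 0.57


def compaction(rh_measured, n_residues):
    """Illustrative index: 0 = as expanded as a denatured chain, 1 = as compact as a fold."""
    expanded = rh_denatured(n_residues)
    compact = rh_folded(n_residues)
    return (expanded - rh_measured) / (expanded - compact)


N = 90
print(f"folded: {rh_folded(N):.1f} A, denatured: {rh_denatured(N):.1f} A")
print(f"Sic1 compaction:  {compaction(21.5, N):.2f}")
print(f"pSic1 compaction: {compaction(19.3, N):.2f}")
```

On this scale both Sic1 and pSic1 lie closer to the globular limit than to the denatured limit, with pSic1 the more compact of the two, in line with the qualitative statement above.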

NMR experiments simply provide the ensemble average of Rh for both Sic1 and pSic1. In order to gain further information, Forman-Kay’s lab has developed the software code named ENSEMBLE7 to estimate the smallest possible ensemble of protein conformers that are collectively consistent with available experimental restraints. A variety of experimental restraints, such as Cα and Cβ chemical shifts, amide proton/nitrogen residual dipolar coupling (RDC), SAXS, paramagnetic relaxation enhancement (PRE) and 15N R2 relaxation rates, have been applied in the ENSEMBLE program for Sic1/pSic1/pSic1-Cdc4 ensemble determination. ENSEMBLE first starts with a large ensemble (structures from a pre-generated conformer pool), and then performs iterative ensemble calculations, in which structures in the ensemble are randomly modified to be more consistent with the experimental restraints. The first round of calculations ends when an ensemble that fits all of the experimental restraints is obtained. The ensemble size is then reduced by one structure, and the same iterative calculation routine is continued. This proceeds until no smaller ensemble can be found that is consistent with all experimental restraints. The so-called simplest-ensemble approach not only finds an ensemble of structures for Sic1/pSic1/pSic1-Cdc4, but also avoids the problem of over-fitting, given the very limited number of experimental restraints. Due to limitations of available experimental constraints (many of which also suffer from ensemble averaging), the ensemble of conformers calculated by ENSEMBLE may not be unique. As stated in Chapter 1, SMF spectroscopy, especially smFRET, is essential for resolving protein conformations and dynamics. Firstly, smFRET measurements could be used to verify results already calculated using ENSEMBLE. Secondly, the single-molecule results could be used as additional experimental restraints in new ENSEMBLE calculations. Finally, other fluorescence spectroscopy methods, such as FCS and FA, could also be used to probe properties of Sic1/pSic1/pSic1-Cdc4 such as the timescale of conformational fluctuations, molecular dimensions and shapes, association/dissociation dynamics, etc.

5.4. smFRET Measurements

5.4.1. Sic1 Labeling and Quantification

From its amino acid sequence, Sic1 was determined to have a molecular weight of 9822 Da, which matches the value obtained from mass spectrometry (MS) measurements (shown in red, Fig. 5.4-1). The single and sharp peak centered at the expected molecular weight (MW) value indicates a high quality, pure sample. Side peaks are observed in the MS spectrum of Sic1, due to additions of water molecules to Sic1.

Figure 5.4-1: MS spectrum of Sic1 (red) and fluorescently labelled Sic1 (blue). Sic1 has only one major peak in the spectrum, corresponding to a highly purified sample. The labelled Sic1 shows two major peaks, of which one is from Sic1 labelled with a donor-acceptor pair (Sic1-D-A) and the other is from Sic1 labelled with two acceptor molecules (Sic1-A-A).

The MS spectrum of labelled Sic1 indicates that all of the Sic1 proteins are labelled either with a donor-acceptor dye pair or with two acceptor molecules. The high labelling efficiency suggests that both ends of Sic1 are highly exposed, as expected from its disordered structure. In the dually labelled Sic1 sample, about 18% of the protein carries a donor-acceptor dye pair. Since the acceptor-only Sic1 proteins are invisible in smFRET experiments (they contribute only an offset to the background signal, which should be subtracted), the measurements and data interpretation are not affected. Using the custom-built Spectraview application (described in Chapter 2), the TMR-Atto647N pair is found to have a Förster radius of 6.0 ± 0.2 nm. The fluorophores are positioned at the N- and C-termini of the Sic1 protein, so that the energy transfer between them probes conformational properties of the full peptide chain, i.e., the end-to-end distance (REE) values and fluctuations.

5.4.2. Sic1 Size Distribution is Broad Under Physiological Conditions

IDPs usually contain heterogeneous and interconverting conformers. SAXS measurements in a previous report11 show that Sic1 has a broad Rg distribution, nearly uniformly spread between 10 Å and 60 Å. For pSic1 the distribution was measured to be slightly narrower, spanning from 15 Å to 45 Å. Such a broad Rg distribution is extremely uncommon even for IDPs, as reported in other studies12-14, which makes the investigation of Sic1's conformational properties via other experimental techniques, such as smFRET, appealing and necessary. The ensemble of conformers calculated with ENSEMBLE is consistent with the SAXS results. Output parameters from ENSEMBLE calculations also include both Rg and REE of each conformer. Therefore, comparing them with the REE distribution of Sic1 obtained from smFRET experiments provides a useful validation of the SAXS measurements and ENSEMBLE calculations.

Figure 5.4-2: (A) Four Sic1 structures calculated with ENSEMBLE; (B) REE distribution of 44 Sic1 conformers calculated with ENSEMBLE; (C) Simulated measurements of FRET efficiency (EFT) distribution, assuming Sic1 adopts the REE distribution shown in (B) and a Förster radius of 6.0 nm; (D) Experimentally measured EFT distribution of Sic1 in Tris buffer (50 mM, 150 mM NaCl, pH 7.4).

Similar to Rg, the REE of Sic1 is very broadly distributed (Figure 5.4-2B). In terms of polymer theory, Sic1 is clearly different from a random coil, for which the end-to-end distance obeys a Gaussian distribution15. I wrote a LabView simulation program to “measure” the EFT distribution based on the ENSEMBLE-calculated REE distribution. Typically, the simulation models a scenario in which Sic1 proteins are randomly distributed inside a 15-μm diameter sphere (~8 molecules in simulation at a concentration of ~50 pM). At the center of this sphere lies a 3D-Gaussian approximation of the confocal detection volume. The proteins perform a random walk inside the sphere. If a protein diffuses across the boundary, a new protein is generated at a random point near the boundary. The Sic1 proteins are simulated to have their REE values distributed as shown in Figure 5.4-2B, according to ENSEMBLE. The REE values of generated Sic1 molecules may be allowed to interchange into neighboring REE values at a settable timescale (Ttrans; Ttrans = ∞ for the present scenario). For each Sic1 molecule labelled with TMR (donor) and Atto647N (acceptor) dyes, the contributions to the detected fluorescence signal in the donor and acceptor channels depend on the protein position through the apparent detection efficiency CEF(x,y,z) of the objective16, and on the donor brightness, acceptor quantum yield, pinhole size, beam waist16, 17, expected EFT, and the detection efficiencies of the donor and acceptor channels. The donor brightness (in units of Hz, at the center of the detection volume) is affected by the excitation power, the donor extinction coefficient and the quantum yield. The donor brightness of all TMR molecules is assumed to be the same and is set to a value which produces a similar “average burst brightness” to that obtained from real experiments.
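One time step of a simulation of this kind can be sketched in a few lines. This is only an illustrative Python sketch, not the LabView program itself: the exponential form of the 3D-Gaussian profile, the function names and all numerical values are assumptions, and the binomial partition between donor and acceptor emission is replaced here by its mean value before Poissonian detection.

```python
import math
import random

def detection_profile(x, y, z, w_xy, w_z):
    """Assumed 3D-Gaussian detection profile, equal to 1 at the focus."""
    return math.exp(-2.0 * (x * x + y * y) / w_xy**2 - 2.0 * z * z / w_z**2)

def transfer_fraction(r_ee, r0):
    """Fraction of donor excitations transferred to the acceptor (Foerster relation)."""
    return 1.0 / (1.0 + (r_ee / r0) ** 6)

def expected_counts(pos, r_ee, r0, brightness_hz, dt,
                    qy_acc, eff_d, eff_a, w_xy, w_z):
    """Mean detected donor/acceptor counts for one molecule in one time step."""
    # Donor photons available in this step, weighted by the molecule position.
    n0 = brightness_hz * dt * detection_profile(*pos, w_xy, w_z)
    f_da = transfer_fraction(r_ee, r0)
    donor = n0 * (1.0 - f_da) * eff_d
    acceptor = n0 * f_da * qy_acc * eff_a
    return donor, acceptor

def poisson(mean, rng):
    """Knuth's Poisson sampler; adequate for the small means of a 1-us step."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

# Hypothetical parameters: molecule at the focus, R_EE equal to the Foerster radius.
rng = random.Random(1)
d_mean, a_mean = expected_counts((0.0, 0.0, 0.0), r_ee=6.0, r0=6.0,
                                 brightness_hz=50e3, dt=1e-6, qy_acc=0.65,
                                 eff_d=0.5, eff_a=0.5, w_xy=0.3, w_z=1.5)
donor_photon = poisson(d_mean, rng)      # almost always 0 or 1
acceptor_photon = poisson(a_mean, rng)
```

Because the mean counts per step are far below one, each sampled value is almost always 0 or 1, which is what allows the photon stream to be stored as a binary record.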

Generated EFT distributions are not affected by varying the donor brightness, provided the values are of the same order of magnitude. The typical time step in the simulation is 1 μs, two orders of magnitude shorter than the typical diffusion time of Sic1 through the detection volume (~100 μs). The number of photons from donor molecules (No) is simply the donor brightness times the step size, corrected for the protein position in the detection volume (refer to reference16 for details). The fraction of energy transfer (fDA) is then calculated using the Förster radius and REE. The fluorescence signal “emitted” by the donor and acceptor molecules is then determined from No, fDA and the acceptor quantum yield, assuming a binomial distribution. Finally, the fluorescence signal “detected” in the donor/acceptor channels is determined by the donor/acceptor channel detection efficiencies. A Poissonian number is generated using the “collected” fluorescence signal (usually << 1) as its mean value. The generated Poissonian number (either 0 or 1; 1 indicates a fluorescence photon) is then saved into a binary file. The saved file is finally analyzed with our burst analysis program (refer to Chapter 2 for details) to obtain the simulated EFT distribution.

Using the size distribution shown in Figure 5.4-2B and a Förster radius of 6.0 ± 0.2 nm for the FRET dye pair (TMR and Atto647N) in the simulation, an EFT distribution (Figure 5.4-2C) is generated for Sic1 in Tris (50 mM, 150 mM NaCl, and pH 7.5). At least 3-4 distinct FRET populations are obtained, spanning the entire efficiency scale. Encouragingly, smFRET burst experiments on Sic1 in Tris buffer yield a similar efficiency distribution (Figure 5.4-2D). This supports the presence of a very broad distribution of Sic1 sizes under physiological conditions and lends credence to the ENSEMBLE approach. However, some deviations do exist, e.g., the experimental fraction near EFT ~ 0 is more abundant than the simulated one. This might be due to apparent low EFT values of the proteins labelled with an inactive acceptor molecule (see also below).

5.4.3. Sic1 Conformations in KCl

Intrinsically disordered sequences are especially common in proteins involved in regulation, signal transduction and recognition3, 4, 6, 18. One proposed physical basis for this is the increased chain flexibility and capture radius of IDPs, which help them approach their binding partners and find the proper binding conformation3. Investigating the chain flexibility and dimensions of IDPs under various conditions therefore becomes important and necessary15. Similar approaches applied to Sic1 will help in understanding its physical properties and how these change upon phosphorylation at different CPD sites. Another consideration is that the intracellular and extracellular concentrations of ionic species are not constant. They may vary with organism and environment, but typically fluctuate between 1 and 200 mM19. IDPs are also highly charged, and it is therefore crucial to quantify the effect of salt on their structural flexibility and function. As mentioned above, Sic1 contains +11 net charges distributed along the N-terminal 92-amino acid fragment. In MilliQ water and in the absence of salt, the electrostatic repulsion between these residues (for instance, R8, R10 and R13, K32 and K36) may dominate other intra-chain interactions such as hydrogen bonding, hydrophobic interactions, van der Waals interactions, etc. Under such conditions, Sic1 may exhibit only expanded conformations. The degree of expansion should depend on the balance between the electrostatic repulsion and the attractive interactions.

Adding salt to the sample reduces the effective charge on the polypeptide according to the Debye-Hückel screening theory20, and so the electrostatic repulsion between different segments decreases. If the Sic1 ensemble is considered homogeneous in terms of molecular conformations, charge exposure and interactions, then its collapse upon increasing the salt concentration should be monotonic. The salt-induced collapse of Sic1 should then be detected as a monotonic shift of the overall EFT distribution from low to high values. However, in our smFRET experiments at varying KCl concentrations we clearly observe a different behavior (Figure 5.4-3). At least three, probably as many as four, clusters appear to contribute to the overall FRET distribution, and they respond to salt in different ways.

Figure 5.4-3 shows selected smFRET data in the presence of KCl at concentrations varying from 0 M to 4 M. A global fitting analysis was performed on these data sets, using four Gaussians for each EFT distribution above 0.1 M KCl. Below 0.1 M KCl we have effectively a three-component fit, and so we set the fourth Gaussian to zero to avoid overfitting the data. Each Gaussian in the fit represents one cluster of Sic1 conformers having similar properties, specifically similar REE values. Clusters #1 and #2 appear insensitive to salt, as they remain centered at low FRET even at the highest KCl concentration. Cluster #3 shifts monotonically to higher EFT values, indicating significant structural collapse as a function of salt concentration. Cluster #4 also responds to salt and becomes more compact, although at a slower pace and accompanied by significant broadening. Although in previous reports21 fractions having similarly broad FRET distributions were typically treated as experimental artifacts, this Sic1 cluster is probably “real”, as a similar peak is expected from calculations based on independent data (see above, Figure 5.4-2C).

Here we grouped the Sic1 ensemble of conformations into four distinct clusters, using a minimalist fitting criterion of the end-to-end FRET (distance) data. How "real" are they? Should we still think of Sic1 as consisting of a rather continuous distribution of conformers? If so, how static/dynamic is this distribution?

Figure 5.4-3: smFRET histograms (normalized by area) measured for TMR-Sic1-Atto647N at different KCl concentrations. The EFT distributions were fitted globally with four Gaussians to facilitate identification of Sic1 conformational clusters. Clusters #1 to #4 are indicated for the 1 M KCl concentration.

At first glance, the low-FRET clusters (#1 and #2) could be experimental artefacts due to the presence of proteins with donor only or with an inactive acceptor. However, the MS analysis (Figure 5.4-1) shows that the fraction of donor-only Sic1 is too small to account for such a significant population. As discussed in Chapter 1, FRET measurements are only sensitive in the range EFT = 10% - 90%, and therefore the details of the first two clusters are poorly resolved in our data. Nevertheless, as seen in Figure 5.4-2, the smFRET measurements show a reasonable similarity with the distribution predicted from the ENSEMBLE calculation, although the near-zero populations are slightly higher. To clarify this issue, we labelled the N- and the C-termini of Sic1 with another dye pair that has a larger Förster radius. We expect to see broadening of the near-zero peak towards larger EFT values if the two Sic1 low-FRET clusters correspond to extended structures of the sample.

Figure 5.4-4: smFRET histograms (normalized by area) obtained from AttoRh101-Sic1-Atto647N in MilliQ water in the presence of KCl at the indicated concentrations.

We chose AttoRh101 (net charge +1) and Atto647N (net charge +1) as a second pair, due to their relatively low charge and larger Förster radius of 7.1 ± 0.3 nm. smFRET burst measurements were performed on this sample at varying salt concentrations, as shown in Figure 5.4-4. Interestingly, we observed significant broadening in the near-zero EFT region (0 ‒ 30%). Two populations (clusters #1 and #2) are clearly resolved, corresponding to the questionable FRET populations mentioned above. Confirming the results obtained with the previous donor-acceptor pair, clusters #1 and #2 are largely insensitive to salt. Cluster #1 (EFT < 5%) is due to very expanded protein conformations (REE > 11.6 nm, predicted to exist in Sic1 from ENSEMBLE), or to inactive acceptors.
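The distances quoted for the two low-FRET clusters follow from inverting the Förster relation E = 1/(1 + (r/R0)^6). A minimal sketch of that conversion is given below; the function names are illustrative, and the calculation treats each cluster as a single fixed distance, which is a simplification for a heterogeneous ensemble.

```python
def fret_efficiency(r_nm, r0_nm):
    """Foerster transfer efficiency for a single donor-acceptor distance."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

def distance_from_efficiency(e, r0_nm):
    """Invert the Foerster relation; only defined for 0 < e < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * ((1.0 - e) / e) ** (1.0 / 6.0)

R0_ATTORH101_ATTO647N = 7.1  # nm, Foerster radius of the second dye pair

# E = 5% marks the upper efficiency bound of cluster #1 (lower bound on distance).
print(round(distance_from_efficiency(0.05, R0_ATTORH101_ATTO647N), 1))   # 11.6
# E = 12.5% is the mean efficiency of cluster #2 at 1 M KCl.
print(round(distance_from_efficiency(0.125, R0_ATTORH101_ATTO647N), 1))  # 9.8
```

The steep sixth-power dependence is also why a dye pair with a larger Förster radius is needed to separate populations that all appear near EFT ≈ 0 with the TMR-Atto647N pair.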

Cluster #2 (EFT ≈ 12.5 ± 3 % at 1 M KCl) is "real", corresponding to Sic1 conformations with an average end-to-end distance of ~9.8 nm, which was clearly beyond the resolution of the previous dye pair.

To explore the heterogeneity of the clusters, we analyzed the raw burst data using a different, novel approach. Each detected fluorescence burst was divided into two equal parts by photon number: an "initial" half and a "final" half. Thus, an "initial" and a "final" value of EFT are obtained for each burst. The two EFT values define a point on the 2-D correlation map, with the x-axis assigned to the "initial" EFT (EIni) and the y-axis to the "final" EFT (EFin). For example, the correlation map obtained for Sic1 at 0.15 M KCl is shown in Figure 5.4-5A. The LabView simulation program described in section 5.4.2 was modified to investigate how the timescale of Sic1 conformer inter-conversion affects the EFT distribution and the EFT correlation maps. When the REE values of the generated Sic1 molecules interchange into neighboring REE values at a timescale Ttrans ~10 times faster than the diffusion time of Sic1 (τd ~ 0.2 ms), a single spot is observed in the EFT correlation map, with an equivalent amount of spot broadening along the diagonal and off-diagonal, solely due to photon-counting noise in burst detection. If Ttrans is instead ~10 times slower than τd of Sic1, more significant spot broadening along the diagonal is observed, while the broadening along the off-diagonal is solely due to photon-counting noise. However, if Ttrans is of the same order as τd, broadening along both the diagonal and the off-diagonal in the EFT correlation map becomes more significant than expected from photon-counting noise. In Figure 5.4-5A, the spot is clearly elongated along the diagonal direction, indicating that cluster #3 consists of conformers with slightly different REE that do not inter-convert on the time scale of the burst duration (~0.2 ms).

Encouragingly, we reached a similar conclusion via another independent approach. We simulated the shot-noise limited FRET distribution with the same characteristics as the cluster #3 distribution: mean EFT value, protein diffusion coefficient, average number of photons per burst, etc. We assumed a "stick" FRET spectrum, i.e. a single EFT (REE) value, measured in the presence of only detection (photon-counting) noise. This case is indistinguishable from a distribution of fast inter-converting conformers around the same average value. Firstly, the simulated EFT distribution (FRET-PSF, Figure 5.4-5B) has a very similar width to that seen in the data along the red line, suggesting that the off-diagonal broadening originates mostly from photon-counting noise. Secondly, the FRET-PSF is at most half as wide as cluster #3 along the white line, suggesting that the diagonal broadening is due to the presence of slowly inter-converting conformers.
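The burst-splitting step behind the correlation map can be illustrated with a short sketch. It assumes that each burst is available as an ordered sequence of photon tags ('D' for the donor channel, 'A' for the acceptor channel) and uses the uncorrected acceptor fraction as a stand-in for EFT; the corrections applied in the actual burst analysis (background, crosstalk, detection efficiencies) are omitted, and all names are hypothetical.

```python
import math

def proximity_ratio(tags):
    """Uncorrected acceptor fraction of a photon sequence ('D'/'A' tags)."""
    n_acceptor = sum(1 for t in tags if t == "A")
    return n_acceptor / len(tags)

def split_burst(tags):
    """Return (E_ini, E_fin) from the first and second half of a burst.

    The burst is halved by photon number; for an odd photon count the
    extra photon is assigned to the final half.
    """
    half = len(tags) // 2
    if half == 0:
        raise ValueError("a burst needs at least two photons")
    return proximity_ratio(tags[:half]), proximity_ratio(tags[half:])

def diagonal_coordinates(e_ini, e_fin):
    """Rotate a map point into (along-diagonal, off-diagonal) coordinates."""
    along = (e_ini + e_fin) / math.sqrt(2.0)
    across = (e_fin - e_ini) / math.sqrt(2.0)
    return along, across

# Toy burst of eight photons in arrival order.
e_ini, e_fin = split_burst("DDADAAAD")   # (0.25, 0.75)
along, across = diagonal_coordinates(e_ini, e_fin)
```

Collecting the (along, across) values over all bursts of a cluster gives the two widths that are compared in the text: the off-diagonal width reflects photon-counting noise, while excess width along the diagonal points to conformers that do not inter-convert during a burst.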

Figure 5.4-5: (A) EFT correlation map of Sic1 in MilliQ water in the presence of 0.15 M KCl; (B) The EFT distribution of cluster #3 (red) and the simulated shot-noise limited EFT distribution with the same mean value (blue).

As clearly seen in Figure 5.4-3, cluster #3 shows the most dramatic response to increasing salt concentration. Its estimated average REE is plotted against KCl concentration in Figure 5.4-6 (blue line), showing a monotonic decrease upon adding salt to the solution. There seem to be two phases in this curve. At low [KCl] the protein shrinks slowly and quasi-linearly from REE = 7.3 nm at no salt to ca. 5.5 nm at 1 M KCl. The Debye length at 1 M KCl in MilliQ water is estimated to be ~0.31 nm from Debye-Hückel screening theory20. We attribute the decrease in REE to the screening of electrostatic repulsion between positively charged residues in the presence of Cl⁻ ions in the solution. At higher salt concentration we see a steep decrease of REE to a value of ~3 nm at 4 M KCl. This could be due to the Hofmeister effect at high salt concentration: hydrophobic interactions are promoted by the increased surface tension of the solvent and thus make Sic1 even more compact. The Hofmeister effect is normally used to precipitate proteins in the presence of such “water structure maker” salts, including KCl and NaCl22.
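The Debye length quoted above can be checked with the standard Debye-Hückel expression for a 1:1 electrolyte, λD = sqrt(ε0 εr kB T / (2 NA e² I)). The sketch below assumes 25 °C and the relative permittivity of pure water (78.4); both are assumptions, since the permittivity of a concentrated salt solution is lower, and the function name is illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

def debye_length_nm(ionic_strength_molar, temp_k=298.15, eps_r=78.4):
    """Debye screening length (nm) for a 1:1 electrolyte such as KCl."""
    # Convert mol/L to number density per m^3 (1 L = 1e-3 m^3).
    number_density = ionic_strength_molar * 1e3 * N_A
    lam = math.sqrt(eps_r * EPS0 * K_B * temp_k
                    / (2.0 * number_density * E_CHARGE**2))
    return lam * 1e9

print(f"{debye_length_nm(1.0):.2f} nm at 1 M KCl")
print(f"{debye_length_nm(0.15):.2f} nm at 0.15 M KCl")
```

Under these assumptions the screening length comes out at about 0.3 nm for 1 M KCl, in line with the value quoted in the text, and at roughly 0.8 nm for 0.15 M. The inverse square-root dependence on ionic strength means that most of the screening is already in place at sub-molar salt, which fits the attribution of the steeper high-salt collapse to a different mechanism.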

Figure 5.4-6: Dependence of the end-to-end distance (REE, blue) and the hydrodynamic radius (Rh, red) of FRET cluster #3 on the salt (KCl) concentration.

We implemented EFT-gated fluorescence correlation spectroscopy on the same data set used for smFRET analysis. This enabled us to directly and simultaneously measure the hydrodynamic radius (Rh) of Sic1 sub-populations at varying [KCl] conditions. For instance, the Rh of cluster #3 is determined by building an FCS curve using only photons from bursts with EFT values in that range. The dependence of Rh on [KCl] can thus be examined at the same time as that of REE, as shown in Figure 5.4-6. As expected, Rh decreases monotonically with increasing salt concentration, which reinforces the idea of Sic1 compaction. The correlation between Rh and REE will be discussed in more detail in section 5.4.5, together with similar results obtained when exposing Sic1 to increasing concentrations of denaturant.
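One common route from an FCS diffusion time to a hydrodynamic radius combines D = w²/(4 τd) for a 3D-Gaussian detection volume with the Stokes-Einstein relation Rh = kB T/(6 π η D). The sketch below illustrates that route only; it is not necessarily the procedure used for the data shown here. The lateral beam waist, temperature and viscosity are hypothetical values, and in practice the waist would be calibrated with a reference dye of known diffusion coefficient.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def diffusion_coefficient(w_xy_m, tau_d_s):
    """Translational diffusion coefficient (m^2/s) from the FCS diffusion time."""
    return w_xy_m**2 / (4.0 * tau_d_s)

def hydrodynamic_radius_nm(tau_d_s, w_xy_m, viscosity_pa_s=1.0e-3, temp_k=293.15):
    """Stokes-Einstein hydrodynamic radius (nm) of the diffusing species."""
    d = diffusion_coefficient(w_xy_m, tau_d_s)
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * d) * 1e9

# Hypothetical inputs: 0.28-um lateral waist, 0.2-ms diffusion time, water at 20 C.
rh = hydrodynamic_radius_nm(tau_d_s=0.2e-3, w_xy_m=0.28e-6)
print(f"Rh = {rh:.2f} nm")
```

For a fixed waist, Rh scales linearly with τd and inversely with solvent viscosity, so any change in viscosity on adding salt or denaturant has to be corrected for before changes in τd are interpreted as changes in chain dimensions.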

5.4.4. Sic1 Conformations in GuHCl

Guanidine hydrochloride (GuHCl) is a widely used protein denaturant that acts by breaking down hydrogen bonding and promoting hydrophobic interactions between the protein and the denaturant itself. For globular proteins, adding GuHCl drives the transition from folded to unfolded states, and these transitions usually reflect the shape of the protein folding free-energy profile23. At high concentrations (4-6 M), GuHCl efficiently denatures most proteins from well-structured conformations to random coils. GuHCl is the chloride salt of guanidine, which is a strong base in water. The ionic nature of GuHCl makes it a better denaturant than neutral denaturants like urea, because it screens structure-stabilizing charge-charge interactions in proteins containing non-zero net charges. For Sic1, as suggested by smFRET measurements under varying salt concentration, the charge-charge interaction plays a major role in “destabilizing” the formation of secondary structure, considering its high net charge and its reduced content of hydrophobic interactions. Varying [GuHCl] for Sic1 may lead to a completely different “denaturation” pattern from that of globular proteins, as reported recently15. GuHCl affects both the interactions involving charged residues and the hydrophobic interactions; smFRET experiments will therefore provide clues about the most critical factors determining Sic1's conformational properties.

Figure 5.4-7 shows smFRET histograms recorded for TMR-Sic1-Atto647N in the presence of 0 ‒ 8 M GuHCl. Interestingly, in the low concentration range, [GuHCl] = 0 ‒ 0.3 M, a significant collapse of the chain is observed. This effect is very similar to that observed when increasing the salt content (Figure 5.4-3). GuHCl is also a salt, and the chloride ions released in the aqueous buffer screen the repulsion between the Sic1 charges, just as in the case of adding KCl. Above ~0.3 M, GuHCl “denatures” the polypeptide chain through favorable interactions with the Sic1 hydrophobic residues. This leads to a gradual chain expansion accompanied by conformational broadening, as previously observed for several globular proteins24.

smFRET histograms acquired for Sic1 in the presence of GuHCl are globally fit to a sum of three clusters of conformers. The fitting results facilitate data analysis and interpretation, although Sic1 should still be treated as a continuum of conformers (see also above). Similarly to the FRET experiments in KCl, clusters #1 and #2 do not shift with increasing [GuHCl], although their amplitudes do vary considerably.

Figure 5.4-7: smFRET histograms (normalized by area) of Sic1 at various concentrations of GuHCl. The FRET distributions are fitted to a sum (red) of three Gaussian components (blue). Clusters #1 to #3 are indicated for 1 M GuHCl.

Conversely, cluster #3 is very sensitive to GuHCl. Its REE dependence on [GuHCl] is plotted in Figure 5.4-8 (blue line), together with its Rh dependence (red line). The REE and Rh are both corrected for differences in viscosity and refractive index of the solvent at high GuHCl concentrations. The non-monotonic variation of REE is most likely caused by a competition between the charge-screening effects and the hydrophobic interaction effects of GuHCl. As suggested by the salt experiments, cluster #3 should keep collapsing if other interactions in Sic1 are not significant. However, the hydrophobic interaction and hydrogen bonding between Sic1 and GuHCl become dominant over further charge-screening effects at molar-range concentrations, and the Sic1 protein expands, just like any typical globular protein under denaturing conditions.

Figure 5.4-8: The REE (blue line) and Rh (red line) dependence on [GuHCl] for FRET cluster #3 (see figure 5.4-7).

5.4.5. Mechanisms of Sic1 Charge Screening by KCl and GuHCl

The decreases of REE and Rh upon adding KCl and GuHCl indicate that electrostatic repulsion is a major factor controlling the dimensions of Sic1. Another interesting observation is that Sic1 is more compact in 8 M GuHCl than in MilliQ water in the absence of charge screening. Although the two sets of experiments showed highly similar behavior below 0.3 M KCl or GuHCl, the mechanisms of their interaction with Sic1 are quite different. Non-denaturing salts, including NaCl and KCl, are found to be more efficient in screening long-range interactions than short-range ones19, 22. They do not have direct contact with the proteins. In contrast, charged denaturant molecules such as GuHCl effectively screen protein charge-charge interactions by preferentially interacting with the protein surface22. They accumulate on the surface of the protein and create an additional layer, which increases the protein's effective hydrodynamic radius. This may explain why the Rh values of Sic1 are less sensitive than REE to changing GuHCl concentration in the range from 0 up to about 0.6 M (Figure 5.4-8). The decrease of Rh due to conformational collapse is balanced by the increase of Rh due to specific GuHCl-Sic1 binding.

The different mechanisms responsible for Sic1 charge screening by KCl and GuHCl could be further verified using an anionic surfactant, sodium dodecyl sulfate (SDS). Previous reports have investigated the nature of the SDS-protein interaction and found it to be predominantly hydrophobic at submicellar and exclusively hydrophobic at micellar SDS concentrations25. Upon binding to proteins, the electrostatic repulsions between attached DS⁻ ions expand the protein chain and lead to protein denaturation. However, in the case of the highly positively charged Sic1 protein, the presence of 0.02 ‒ 5 mM SDS induces high conformational compactness (Figure 5.4-9A), much more efficiently than 4 M KCl (Figure 5.4-3). In the presence of both KCl and SDS (Figure 5.4-9B), Sic1 adopts structures similar to those in Figure 5.4-9A, only slightly perturbed by the charge screening of KCl on SDS.

Figure 5.4-9: smFRET histograms (normalized by area) of end-labelled Sic1 obtained in the presence of: (A) 0.5 mM SDS in MilliQ water; (B) 0.5 mM SDS and 1 M KCl; (C) 0.5 mM SDS and 0.6 M GuHCl; and (D) 50 mM Tris buffer at pH 7.5.

The decoration of the Sic1 surface with DS⁻ ions might be physically similar to functional phosphorylation (6p) of Sic1. Both of these processes modify the apparent net charge of the protein instead of simply screening the electrostatic fields around the charged amino acids. Therefore, we predict that 6pSic1 will adopt a conformational distribution similar to that of Sic1 in the presence of 0.5 mM SDS. pSic1 with 5 phosphorylations or fewer maintains a net positive charge which dominates the chain compactness, similar to conditions of partial charge screening in the presence of KCl.

However, in the co-presence of both GuHCl and SDS, Sic1 seems to adopt structures determined by GuHCl (Figure 5.4-9C). GuHCl is ca. 1000 times more concentrated than SDS, and therefore it competes effectively with SDS for decoration of the protein surface. Our data also indicate that the charge screening ability of GuHCl is much weaker than that of SDS: at least 50 mM GuHCl is necessary to produce a sizeable change in the conformational distribution of the Sic1 protein (Figure 5.4-7).

One interesting observation is that Sic1 in a standard buffer (50 mM Tris, 0.15 M NaCl, pH 7.4) is much more compact (Figure 5.4-9D) than in MilliQ water in the presence of salt alone (0.15 M KCl, Figure 5.4-3). It seems that Tris molecules may also have direct contact with proteins. The accumulation of their (weak) charge screening ability can therefore notably affect the chain dimensions of Sic1. This further reduces the magnitude of the change in Sic1 conformation with salt concentration. Although we did not find similar observations for other proteins in the literature, the effects of Tris, and possibly of other commonly used buffer agents, should be taken seriously into account in studies involving electrostatic contributions, especially for highly charged proteins like IDPs.

5.4.6. The Nature of Sic1 in KCl and GuHCl

For globular proteins, it was found that in the presence of GuHCl at high concentrations, the protein adopts a random-coil-like structure, which is usually treated as a Gaussian chain. In a recent report from Ben Schuler's group, it was found that this also holds for highly charged IDPs, such as human prothymosin α (ProTα)15. Schuler et al. used a simple binding model and successfully described variations in chain dimensions across GuHCl concentrations. More interestingly, this simple binding model predicts the remarkable rollover of Rg in IDPs, under the assumption that the IDPs adopt a Gaussian end-to-end distance ('r' in the equation) distribution. In brief, at each GuHCl concentration, the mean FRET efficiency was used to estimate the equilibrium distance distribution P(r), as shown in Eq. 5.4-1 to 5.4-3:
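For a Gaussian chain, the relations linking the mean FRET efficiency to the end-to-end distance distribution are commonly written in the form below. This is a sketch of the standard textbook expressions rather than a verbatim reproduction of Eq. 5.4-1 to 5.4-3; the ordering and notation are assumptions, with R_0 denoting the Förster radius and ⟨r²⟩ the mean-square end-to-end distance.

```latex
\begin{align}
\langle E \rangle &= \int_0^{\infty} E(r)\, P(r)\, \mathrm{d}r \\
E(r) &= \frac{1}{1 + \left( r / R_0 \right)^{6}} \\
P(r) &= 4 \pi r^{2} \left( \frac{3}{2 \pi \langle r^{2} \rangle} \right)^{3/2}
        \exp\!\left( - \frac{3 r^{2}}{2 \langle r^{2} \rangle} \right)
\end{align}
```

In this form the distribution P(r) is normalized to one and depends on a single parameter, ⟨r²⟩, so a measured mean efficiency fixes the distribution completely.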

In Eq. 5.4-3 the only unknown parameter is the ⟨r²⟩ value (assuming ⟨Rg²⟩ = ⟨r²⟩/6 for the Gaussian chain model), to be obtained by numerical calculation. The obtained radius of gyration values are then fitted to a simple binding model with an effective constant K, as shown in Eq. 5.4-4:

where N is the number of amino acids, b is the segment length in a polypeptide (0.38 nm), lB is the Bjerrum length (0.7 nm in water), κ is the Debye length, a is the denaturant activity, ρ is the relative change in Rg at high denaturant activities, νb³ is the excluded volume in the uncharged chain, and f and g are the fractions of positive and negative charges, respectively. Although the formulas above seem quite complicated, most of the parameters (such as a, lB, κ, N, f and g) are obtained from calculations. The remaining three parameters, K, ρ and ν, are obtained by fitting the dependence of Rg on the GuHCl concentration.

We have applied a similar approach to cluster #3 of Sic1 in KCl, using the following parameters: N = 92, b = 0.38 nm, f = 0.12 and g = 0 (Sic1 contains only positively charged amino acids at pH 5-8, and therefore the polyelectrolyte theory is more suitable and simpler for Sic1). As shown in Figure 5.4-10, the simple binding model predicts the rollover of Sic1's Rg and yields the following fitting parameters: K = 0.54 ± 0.02, ρ = 0.75 ± 0.02 and ν = 1.02 ± 0.01.
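The numerical step mentioned earlier in this section, recovering ⟨r²⟩ (and hence Rg) from a measured mean FRET efficiency under the Gaussian-chain assumption, can be sketched as follows. The midpoint integration, the bisection bounds and the function names are illustrative choices, not the procedure actually used for the fits reported here.

```python
import math

def gaussian_chain_pdf(r, mean_sq):
    """End-to-end distance distribution P(r) of a Gaussian chain."""
    pref = 4.0 * math.pi * r * r * (3.0 / (2.0 * math.pi * mean_sq)) ** 1.5
    return pref * math.exp(-3.0 * r * r / (2.0 * mean_sq))

def mean_efficiency(mean_sq, r0, n=4000):
    """<E> = integral of E(r) P(r) dr, midpoint rule out to 6 rms distances."""
    r_max = 6.0 * math.sqrt(mean_sq)
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += gaussian_chain_pdf(r, mean_sq) / (1.0 + (r / r0) ** 6) * dr
    return total

def solve_mean_sq(e_target, r0, lo=1e-3, hi=1e4):
    """Bisect (in log space) for the <r^2> in nm^2 that reproduces e_target."""
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        # <E> decreases monotonically as the chain expands.
        if mean_efficiency(mid, r0) > e_target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def radius_of_gyration(mean_sq):
    """Gaussian-chain relation <Rg^2> = <r^2> / 6."""
    return math.sqrt(mean_sq / 6.0)

# Round trip with hypothetical numbers: rms end-to-end distance 7 nm, R0 = 6 nm.
e_mean = mean_efficiency(49.0, 6.0)
recovered = solve_mean_sq(e_mean, 6.0)
print(round(math.sqrt(recovered), 3), round(radius_of_gyration(recovered), 3))
```

Because ⟨E⟩ is a monotonic function of ⟨r²⟩, a simple bracketing search is sufficient, and each FRET histogram yields one Rg value for the subsequent fit against denaturant concentration.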

Figure 5.4-10: The dependence of the radius of gyration of Sic1's FRET cluster #3 on denaturant concentration, fitted to a polyelectrolyte binding model described by Eq. 5.4-6 (ref. 15).

However, one can still argue whether the binding model developed by Schuler et al. should be applied to Sic1, which has an unusual degree of structural heterogeneity even in the presence of denaturant (although the model was selectively applied to cluster #3). For instance, the νb³ value obtained for Sic1 (0.06 ± 0.01 nm³) approaches that of a charge-balanced protein, CspTm (0.05 ± 0.01 nm³), instead of those of the highly charged proteins ProTαN (0.86 ± 0.05 nm³) and ProTαC (0.72 ± 0.08 nm³)15.

An important property of the Gaussian polymer chain is the simple linear relationship between Rh, Rg, and REE. Rh is quite small compared with REE, at a ratio of ~0.27. From the smFRET burst experiments and the EFT-gated FCS analysis of each data set, we were able to obtain REE and Rh at each salt or denaturant concentration simultaneously. Therefore, we could use the ratio Rh/REE as an indication of whether Sic1 indeed adopts a Gaussian chain conformation or not.

Very interestingly, up to ~4 M GuHCl, the Rh/REE ratio for cluster #3 of Sic1 is approximately constant at 0.35 ± 0.03. This ratio is ~30% higher than the theoretically expected ratio for a Gaussian polymer chain, indicating that (1) the nature of this cluster does not change with [GuHCl], and (2) it may be a hydrated random coil, or (3) it may not be properly modelled as a random coil. The dependence of Rh/REE on KCl concentration (up to ~1 M) is surprisingly close to the dependence of Rh/REE on GuHCl.
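The comparison with the Gaussian chain can be checked with a short calculation. The sketch below assumes the standard ideal-chain relations Rg = REE/√6 and Rg/Rh = 8/(3√π) (Kirkwood approximation); the function name is hypothetical and the measured value of 0.35 is taken from the text.

```python
import math


def gaussian_chain_rh_over_ree():
    """Rh/REE for an ideal (Gaussian) chain in the Kirkwood approximation."""
    rg_over_ree = 1.0 / math.sqrt(6.0)              # Rg = REE / sqrt(6)
    rg_over_rh = 8.0 / (3.0 * math.sqrt(math.pi))   # Rg / Rh, about 1.505
    return rg_over_ree / rg_over_rh


ideal_ratio = gaussian_chain_rh_over_ree()
measured_ratio = 0.35  # cluster #3 of Sic1, from the text
excess_percent = 100.0 * (measured_ratio / ideal_ratio - 1.0)

print(f"ideal Rh/REE  = {ideal_ratio:.3f}")
print(f"excess of measured ratio over ideal = {excess_percent:.0f}%")
```

With these relations the ideal ratio is close to 0.27, and the measured ratio exceeds it by roughly 30%.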

Figure 5.4-11: Ratio of Rh over REE for cluster #3 of Sic1 as a function of KCl or GuHCl concentrations.

In section 5.4.5, we discussed that KCl and GuHCl screen charged amino acids and reduce electrostatic repulsion quite weakly compared to SDS. Therefore, even at high salt or denaturant concentrations, there may still exist significant electrostatic repulsions that disrupt the random behaviour of the Sic1 chain. Under this assumption, Sic1 will still be more “rod-like” due to the distribution of positively charged amino acids along the chain. The hydrodynamic radius of a rod-like molecule increases nearly linearly with molecular weight, which leads to a higher value of Rh/REE than that of the random chain (Rh/REE = 0.27 [26]). This seems to be in agreement with the trend observed in Figure 5.4-11.

At concentrations of KCl and GuHCl in the molar range, the Rh/REE ratios increase steeply. This increase may be due to the Hofmeister effect at high salt concentration. The Hofmeister effect promotes hydrophobic interactions within Sic1 and hence accelerates a conformational collapse of the protein (decrease in REE). The ion layer surrounding Sic1 instead accelerates the increase of Rh. This results in a steep increase of Rh/REE, particularly at high [KCl], as seen in Figure 5.4-11.

5.5. Time-Resolved Anisotropy: Sic1 is Rod-Like

Time-resolved fluorescence anisotropy (FA) spectroscopy is a powerful tool that provides information on both the mobility and the dynamics of fluorescent species [27]. We applied FA spectroscopy to TMR-labelled Sic1 in the presence of KCl at appropriate concentrations to learn about the molecular shape of the protein and its conformational properties. For fluorescently labelled proteins, the fluorophore is covalently attached to the host protein via a linkage such as a sulfhydryl group. Therefore, we expect a significant contribution to the anisotropy decay from the flexibility of the linker. Put differently, a fast decay on the order of the decay time of the free dye (sub-ns) should occur. If the host protein is spherical in shape, another component should be present, representing the tumbling motion of the dye-protein complex as a whole [27]:

Eq. 5.5-1 is derived from the common “wobbling-in-a-cone” model and describes the restricted motion of a dye that is attached to a (much) larger protein. The dye transition dipole is assumed to diffuse freely inside a cone fixed within the molecular frame, giving rise to the fast component (ϕf), which is affected by the free rotation of the dye, ϕd, and also by the overall tumbling motion of the dye-protein complex, ϕG (1/ϕf = 1/ϕd + 1/ϕG). The degree of steric restriction is described by the parameter A∞, and the overall tumbling motion of the dye-protein complex is described by the parameter ϕG.

Very large proteins (Rh > 5 nm) can give rise to ϕG values (> ~130 ns) that may be too large to be resolvable, in which case the FA decay curve will seem to have a single exponential component and a nonzero A∞ value. The model above is only valid for spherical proteins, although most globular proteins are nearly spherical in shape. Additional components in the TRFA decay curves are necessary if the host protein is shaped like an ellipsoid, to account for rotation around the long axis and a short axis. The TRFA measurements on Sic1 at each salt concentration (Figure 5.5-1) can be fitted to the following model [28]:

In Eq. 5.5-2 the single exponential describing the protein tumbling motion is replaced by a bi-exponential consisting of contributions from two different rotational components, roughly corresponding to rotations around the ellipsoid axes. The rationale for not using the simpler model (Eq. 5.5-1) is not just to obtain a better fit, but also that treating Sic1 as a spherical molecule cannot account for the unresolvable stationary FA values in the measured TRFA decay curves. The Stokes-Einstein equation, mentioned in Chapter 1, relates the rotational diffusion time of a molecule to its hydrodynamic radius, Rh. The Rh of Sic1 was found to vary between 1.8 and 2.6 nm in the KCl experiments, suggesting a rotational diffusion time of 6-18 ns, which should be just about resolvable on our instrument. This is clearly different from what we have observed in experiments (Figure 5.5-1): r(t) remains locked at more than 1/4 of its initial value, and a very slow decay component (> 50 ns) may be present.
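The rotational times quoted above for a spherical particle can be reproduced with the Stokes-Einstein-Debye relation ϕ = ηV/(kBT), with V = (4/3)πRh³. The sketch below assumes water near room temperature (T = 293.15 K, η = 1.0 mPa·s); these two values and the function name are assumptions chosen for illustration, as the exact conditions are not stated in this section.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def rotational_correlation_time_ns(rh_nm, temperature_k=293.15, viscosity_pa_s=1.0e-3):
    """Rotational correlation time (ns) of a sphere of hydrodynamic radius rh_nm."""
    volume_m3 = (4.0 / 3.0) * math.pi * (rh_nm * 1e-9) ** 3
    return viscosity_pa_s * volume_m3 / (K_B * temperature_k) * 1e9


for rh in (1.8, 2.6, 5.0):
    print(f"Rh = {rh:.1f} nm -> phi = {rotational_correlation_time_ns(rh):.1f} ns")
```

Under these assumptions, Rh = 1.8-2.6 nm corresponds to about 6-18 ns and Rh = 5 nm to about 130 ns, in line with the values given in the text.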

Figure 5.5-1: Time-resolved fluorescence anisotropy of the TMR dye and the TMR-labelled Sic1 in the presence of KCl at different concentrations.

Table 1. Sic1 anisotropy decay fitting results

[KCl]     r0     ϕd (ns)   A∞     AS     ϕS (ns)   AL     ϕL
0 M       0.37   0.54      0.78   24%    4.24      76%    ∞
0.05 M    0.35   0.54      0.71   44%    2.37      56%    ∞
0.50 M    0.35   0.54      0.69   50%    1.84      50%    ∞
1 M       0.35   0.54      0.69   52%    1.96      48%    ∞
4 M       0.35   0.54      0.70   64%    2.06      36%    ∞
TMR       0.37   0.25
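The fitted amplitudes in Table 1 can be related to the stationary anisotropy level seen in the decay curves. The sketch below assumes that the fitted model has the nested form r(t) = r0[(1 − A∞)exp(−t/ϕf) + A∞(AS exp(−t/ϕS) + AL exp(−t/ϕL))], so that for ϕL → ∞ the long-time plateau is r(∞)/r0 = A∞·AL. This form is an assumption consistent with the parameter names in the table, and the variable and function names are hypothetical.

```python
# Table 1 values: [KCl] label -> (A_inf, A_L as a fraction)
TABLE_1 = {
    "0 M": (0.78, 0.76),
    "0.05 M": (0.71, 0.56),
    "0.50 M": (0.69, 0.50),
    "1 M": (0.69, 0.48),
    "4 M": (0.70, 0.36),
}


def plateau_fraction(a_inf, a_long):
    """r(t -> infinity) / r0 when phi_L is infinite and all other terms have decayed."""
    return a_inf * a_long


plateaus = {label: plateau_fraction(a_inf, a_long)
            for label, (a_inf, a_long) in TABLE_1.items()}

for label, value in plateaus.items():
    print(f"{label:>7}: r(inf)/r0 = {value:.3f}")
```

Under this assumed form, every plateau in Table 1 lies above 1/4 of the initial anisotropy, which agrees with the observation reported for Figure 5.5-1.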

For a rod-like molecule of length L and diameter b, the hydrodynamic radius, Rh, and the rotational diffusion coefficient, Dr, are found to depend on L/b [26]:

where kB, T and η are the Boltzmann constant, the solvent temperature and the solvent viscosity, respectively.

Going back to section 5.4.6 and Figure 5.4-11, the Rh/REE ratio (equivalent to Rh/L) is 0.35 ± 0.03, which leads to an L/b ratio of 5.63 ± 0.17. Take, for example, Sic1 in the presence of 0.05 M KCl, for which we measured Rh = 2.4 ± 0.1 nm (cluster #3). Eq. 5.5-4 then gives Dr = 18.9 ± 1.5 μs⁻¹. This value of Dr predicts that Sic1 has a rotational diffusion time of ~53 ns in the presence of 0.05 M KCl, thus justifying the presence of the unresolvable long decay (ϕL) in the data and the nonzero stationary FA value. The short rotational correlation time, ϕS, reflects a fast rotational diffusion along the Sic1 chain, which does not require significant solvent displacement.
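The conversion from Dr to the rotational diffusion time can be sketched as follows. The convention τ = 1/Dr is inferred from the quoted numbers (1/18.9 μs⁻¹ ≈ 53 ns) rather than stated explicitly, and the uncertainty uses first-order error propagation; both are assumptions, as is the function name.

```python
def rotational_time_ns(dr_per_us, dr_err_per_us):
    """Return (tau, tau_err) in ns for tau = 1/Dr, with Dr given in 1/us."""
    tau_ns = 1.0e3 / dr_per_us
    # First-order propagation: relative error of tau equals relative error of Dr.
    tau_err_ns = tau_ns * dr_err_per_us / dr_per_us
    return tau_ns, tau_err_ns


tau, tau_err = rotational_time_ns(18.9, 1.5)
print(f"tau = {tau:.1f} +/- {tau_err:.1f} ns")
```

The result, roughly 53 ± 4 ns, is well beyond the 6-18 ns expected for a sphere of the same hydrodynamic radius.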

The parameter A∞ is called the motional restriction parameter and is related to the semicone angle θmax within which the dye rotates freely on the surface of the protein [29, 30]:

Using Eq. 5.5-8 and the fitting results (Table 1), it appears that the semicone angles are slightly different in the presence (~27.5°) and absence (~23.1°) of charge screening. However, both these values are surprisingly small, indicating that strong steric restrictions for an IDP persist even under conditions of (partial) screening of intrachain electrostatic repulsions. This also raises concerns about converting FRET to distance in our data and in other smFRET data on IDPs, as the dipole orientation may skew the conclusions in different, yet predictable, ways [29].
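The semicone angles can be recovered from the fitted A∞ values with a few lines of code. The sketch assumes the standard wobbling-in-a-cone relation A∞ = [½·cos θmax·(1 + cos θmax)]², which reproduces the angles quoted above; the function name is hypothetical.

```python
import math


def semicone_angle_deg(a_inf):
    """Invert A_inf = [0.5 * x * (1 + x)]^2, x = cos(theta_max), for theta_max in degrees."""
    if not 0.0 < a_inf <= 1.0:
        raise ValueError("A_inf must lie in (0, 1]")
    s = 2.0 * math.sqrt(a_inf)                     # x^2 + x - s = 0
    x = (-1.0 + math.sqrt(1.0 + 4.0 * s)) / 2.0    # positive root
    return math.degrees(math.acos(min(1.0, x)))


for label, a_inf in (("0 M KCl", 0.78), ("4 M KCl", 0.70)):
    print(f"{label}: A_inf = {a_inf:.2f} -> theta_max = {semicone_angle_deg(a_inf):.1f} deg")
```

A value of A∞ = 0.78 (no salt) gives about 23.1°, and A∞ ≈ 0.70 (with salt) gives about 27.5°.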

5.6. Sic1 Conformational Fluctuations

Using smFRET and FCS, we performed a systematic study of Sic1's conformations at varying salt (KCl) and denaturant (GuHCl) concentrations. The smFRET data suggest that Sic1 comprises a continuum of conformers with varying end-to-end distances. Previous NMR experiments [7] suggest that these conformers rapidly convert into each other. The lack of line broadening in the NMR spectra implies that the Sic1 conformational fluctuations should occur on a sub-microsecond timescale. However, these experiments were not able to resolve the Sic1 structural dynamics directly, and a new experimental approach is required to quantitatively measure the rates of conformational changes in intrinsically disordered proteins. The principles of FRET-FCS have been described in section 1.3.7 in Chapter 1. This is a very powerful method for probing (sub)μs conformational fluctuations, because FRET is distance-sensitive and the (sub)μs range is well separated from the timescale of molecular diffusion (~200 μs). The I// and I⊥ channels are used to obtain the pseudo-autocorrelation of the donor channel (GDD). The signals in the donor (sum of I// and I⊥) and acceptor channels are used to obtain the cross-correlation of the donor and acceptor channels (GDA). The ratio of GDD over GDA eliminates the contribution from diffusion to FCS and yields the contribution from conformational (end-to-end distance) fluctuations.
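The cancellation of the diffusion term in the GDD/GDA ratio can be illustrated with a toy model. The sketch assumes that both correlation functions factorize into a common 3D diffusion term and a single-exponential kinetic term (positive in GDD, anticorrelated in GDA). The amplitudes, the aspect ratio of the detection volume and all function names are illustrative assumptions, not fitted values.

```python
import math


def g_diffusion(tau_us, tau_d_us, n_molecules=1.0, aspect=5.0):
    """Standard 3D Gaussian-volume diffusion term of an FCS curve."""
    return (1.0 / n_molecules) / (
        (1.0 + tau_us / tau_d_us)
        * math.sqrt(1.0 + tau_us / (aspect ** 2 * tau_d_us))
    )


def g_dd(tau_us, tau_d_us, amp=0.3, tau_c_us=0.06):
    """Donor autocorrelation: diffusion term times a positive kinetic term."""
    return g_diffusion(tau_us, tau_d_us) * (1.0 + amp * math.exp(-tau_us / tau_c_us))


def g_da(tau_us, tau_d_us, amp=0.2, tau_c_us=0.06):
    """Donor-acceptor cross-correlation: anticorrelated kinetic term."""
    return g_diffusion(tau_us, tau_d_us) * (1.0 - amp * math.exp(-tau_us / tau_c_us))


def fret_fcs_ratio(tau_us, tau_d_us):
    """GDD/GDA; the common diffusion factor cancels in the division."""
    return g_dd(tau_us, tau_d_us) / g_da(tau_us, tau_d_us)


for lag in (0.0, 0.03, 0.1, 1.0, 10.0):
    print(f"tau = {lag:5.2f} us: ratio = {fret_fcs_ratio(lag, 200.0):.4f}")
```

In this toy model the ratio is independent of the diffusion time and decays to 1 on the timescale of the conformational fluctuations alone.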

From the smFRET burst experiments, we expect that Sic1 conformers also interconvert on timescales slower than the molecular diffusion time. In Figure 5.4-4, the elongation along the diagonal of the EFT correlation map points to the co-existence of conformers of different sizes for times longer than Sic1's typical diffusion time through the detection volume (~200 μs). Figure 5.6-1 shows the measured FRET-FCS curves for Sic1 under two different conditions: physiological (50 mM Tris, 0.15 M NaCl, pH 7.4) and (MilliQ) water only. Under physiological conditions, we indeed observe an apparent fast decay component with a lifetime of 59.5 ± 3.2 ns. Much slower components, 83.7 ± 6.3 ms (33%) and 17.3 ± 0.7 ms (67%), are also observed by fitting the curve at longer times. In the absence of charge screening (MilliQ water only), no fast-decaying component is found on the sub-μs timescale. However, the slow decay components, 58.6 ± 6.1 ms (37%) and 13.1 ± 0.9 ms (63%), are still present in MilliQ water.

Figure 5.6-1: FRET-FCS curves of Alexa488-TMR labelled Sic1 in the presence of (A) 50 mM Tris, 0.15 M NaCl, pH 7.4 and (B) MilliQ water only. The GDD/GDA curves are shown in orange and the fitted curves are shown in green (sub-μs timescale) and purple (ms timescale).

The fast nanosecond process could be attributed to a restricted pool of conformers interchanging around a metastable state, while the “full” conformational exchange takes tens of milliseconds to sample the full Sic1 conformational space. In the absence of charge screening, in pure water with no salt, the Sic1 chain is much stiffer due to strong electrostatic repulsion between charged residues, and the fast conformational fluctuations around metastable states are “frozen”. The presence of “slow” interconversion times of Sic1 conformers could be further tested by extended observations in single-molecule immobilization experiments, using for instance the liposome encapsulation protocol described in Chapter 3. Of course, the timescales obtained from FRET-FCS may only represent an average over a large heterogeneous population of Sic1 molecules. We expect that some proteins will show distinct EFT levels in their intensity-time trajectories. These are “slowly” interconverting Sic1 molecules, from which information could be obtained about the “connectivity map” between different conformations, as well as the residence time in each conformation, which provides the associated transition rates. Conversely, other proteins may only show a single time-averaged EFT value. These correspond to rapid interconversions and may contain extra “noise”, which could be analyzed using the routine outlined in Chapter 3. In addition, simulations could prove useful for predicting the rate of interconversion of a single Sic1 molecule based on the “noise” in its FRET-time trajectory.

5.7. Summary and Future Work Plans

We studied in detail the conformational properties and flexibility of an IDP, Sic1, using SMF spectroscopy methods such as smFRET, FCS and TRFA. Our smFRET data show that Sic1 adopts an extremely broad distribution of sizes under physiological conditions, consistent with previously published data [7, 11, 31]. From experiments in which we varied the salt (KCl) and denaturant (GuHCl) concentrations, we found that Sic1 charge-charge interactions (mainly electrostatic repulsions) dominate its chain dimensions and flexibility. Ordinary non-denaturing ions only weakly screen the intrachain charge interactions and play an insignificant role in affecting Sic1's conformations. Therefore, Sic1 may not be sensitive to variations of salt concentration, at least at the levels occurring in the intracellular environment. Sic1 adopts highly compact conformations upon full neutralization of its net charge (in the presence of SDS binding). We think that this physical mechanism is much closer to imitating the functional phosphorylation of Sic1, which reduces its net charge from +11 to -1. Sic1 with a low phosphorylation number still has some net charge, resembling conditions of insufficient charge screening in the presence of salt.

Unlike globular proteins and other reported IDPs [15, 18], the smFRET and TRFA experiments show that Sic1 cannot be modelled as a Gaussian chain in the absence or presence of salt or denaturant. A rod-like model seems a better candidate to describe and predict the behavior of Sic1 in the presence of KCl and GuHCl. Such a rod-like shape (aspect ratio ~6:1) is probably due to insufficient screening of the electrostatic repulsion. FCS data show that the conformational fluctuations of Sic1 occur on both ultrafast (10-100 ns) and slow (10-100 ms) timescales. The timescales of its conformational fluctuations are also found to be buffer dependent. In smFRET burst experiments, conformers with different REE persist much longer than Sic1's typical diffusion time through the detection volume (~200 μs). Therefore, we assign the sub-μs kinetics to a small pool of conformers exchanging around a metastable conformation, and we assign the sub-second process to conformers sampling the entire Sic1 conformational landscape.

The latter process will be further investigated by extended observations in single-molecule immobilization experiments, using for instance the improved vesicle preparation method (discussed in Chapter 3). The dually labelled Sic1 will be encapsulated inside unilamellar vesicles of suitable dimension (diameter ~0.1 μm), which are anchored to the coverslip via the biotin-streptavidin system. The slow conformational fluctuations will be reflected in the variations or jumps of the EFT-time trajectory, from which valuable information such as the interchanging pathways and timescales between various Sic1 conformers will be obtained.

One very important goal of this project is to understand the impact of multiple phosphorylations on Sic1's conformational properties. Future studies will apply similar approaches (smFRET burst experiments, FRET-FCS, etc.) to pSic1, for comparison with experimental results obtained from bulk measurements such as NMR and SAXS. The single-molecule results obtained on pSic1 will also serve as an additional experimental restraint for the ENSEMBLE program, to obtain further-refined simplest pSic1 ensembles. This explicit representation of the pSic1 ensemble will help to understand how it recognizes and interacts with the partner protein Cdc4, and to validate the “mean-field statistical model” proposed by Professor Julie Forman-Kay and her colleagues. SM studies will also provide direct observations of how pSic1 (CPD motifs) and Cdc4 interact as a “dynamic complex”, a model of which predicts that pSic1 rapidly exchanges on and off the Cdc4-binding pocket. By encapsulating a donor-labelled pSic1 and an acceptor-labelled Cdc4 protein inside a unilamellar vesicle of suitable dimension (diameter ~0.1 μm), the exchange pathways and timescales of pSic1-Cdc4 will be probed.

Overall, our single-molecule data resolve conformational heterogeneity and dynamics in the Sic1 protein and represent the first step towards the validation of the polyelectrostatic model for the Sic1-Cdc4 interaction. The techniques and the data analysis developed in this study, as well as our findings about the role of electrostatics in conformational flexibility, could be applied to other IDPs and help shed some light on the influence of structural disorder in biology.

References:

1. Liberles, D.A., et al., The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci, 2012. 21(6): p. 769-85.
2. Dunker, A.K., et al., Intrinsically disordered protein. J Mol Graph Model, 2001. 19(1): p. 26-59.
3. Tompa, P., Intrinsically disordered proteins: a 10-year recap. Trends Biochem Sci, 2012.
4. Uversky, V.N., C.J. Oldfield, and A.K. Dunker, Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu Rev Biophys, 2008. 37: p. 215-46.
5. Dunker, A.K., et al., Intrinsic disorder and protein function. Biochemistry, 2002. 41(21): p. 6573-82.
6. Babu, M.M., et al., Intrinsically disordered proteins: regulation and disease. Curr Opin Struct Biol, 2011. 21(3): p. 432-40.
7. Mittag, T., et al., Structure/function implications in a dynamic complex of the intrinsically disordered Sic1 with the Cdc4 subunit of an SCF ubiquitin ligase. Structure, 2010. 18(4): p. 494-506.
8. Orlicky, S., et al., Structural basis for phosphodependent substrate selection and orientation by the SCFCdc4 ubiquitin ligase. Cell, 2003. 112(2): p. 243-56.
9. Borg, M., et al., Polyelectrostatic interactions of disordered ligands suggest a physical basis for ultrasensitivity. Proc Natl Acad Sci U S A, 2007. 104(23): p. 9650-5.
10. Nash, P., et al., Multisite phosphorylation of a CDK inhibitor sets a threshold for the onset of DNA replication. Nature, 2001. 414(6863): p. 514-21.
11. Mittag, T., et al., Dynamic equilibrium engagement of a polyvalent ligand with a single-site receptor. Proc Natl Acad Sci U S A, 2008. 105(46): p. 17772-7.
12. Mukhopadhyay, S., et al., A natively unfolded yeast prion monomer adopts an ensemble of collapsed and rapidly fluctuating structures. Proc Natl Acad Sci U S A, 2007. 104(8): p. 2649-54.
13. Ferreon, A.C. and A.A. Deniz, Alpha-synuclein multistate folding thermodynamics: implications for protein misfolding and aggregation. Biochemistry, 2007. 46(15): p. 4499-509.
14. Ferreon, A.C., et al., Interplay of alpha-synuclein binding and conformational switching probed by single-molecule fluorescence. Proc Natl Acad Sci U S A, 2009. 106(14): p. 5645-50.
15. Muller-Spath, S., et al., Charge interactions can dominate the dimensions of intrinsically disordered proteins. Proc Natl Acad Sci U S A, 2010. 107(33): p. 14609-14.
16. Wohland, T., R. Rigler, and H. Vogel, The standard deviation in fluorescence correlation spectroscopy. Biophys J, 2001. 80(6): p. 2987-99.
17. Mazouchi, A., et al., On the performance of bioanalytical fluorescence correlation spectroscopy measurements in a multiparameter photon-counting microscope. Anal Chim Acta, 2011. 688(1): p. 61-9.
18. Ferreon, A.C., et al., Single-molecule fluorescence studies of intrinsically disordered proteins. Methods Enzymol, 2010. 472: p. 179-204.
19. Dominy, B.N., et al., The effects of ionic strength on protein stability: the cold shock protein family. J Mol Biol, 2002. 319(2): p. 541-54.
20. Wright, M.R., An Introduction to Aqueous Electrolyte Solutions. 2007: John Wiley & Sons. 602.
21. Selvin, P.R. and T. Ha, Single Molecule Techniques: A Laboratory Manual. Cold Spring Harbor Laboratory Press. Vol. 2. 2008, New York: CSHL Press. 6.
22. Perez-Jimenez, R., et al., The efficiency of different salts to screen charge interactions in proteins: a Hofmeister effect? Biophys J, 2004. 86(4): p. 2414-29.
23. Schuler, B. and W.A. Eaton, Protein folding studied by single-molecule FRET. Curr Opin Struct Biol, 2008. 18(1): p. 16-26.
24. Borgia, A., P.M. Williams, and J. Clarke, Single-molecule studies of protein folding. Annu Rev Biochem, 2008. 77: p. 101-25.
25. Bhuyan, A.K., On the mechanism of SDS-induced protein denaturation. Biopolymers, 2010. 93(2): p. 186-99.
26. Teraoka, I., Polymer solutions, in An introduction to physical properties. 2002, Wiley-Interscience.
27. Schroder, G.F., U. Alexiev, and H. Grubmuller, Simulation of fluorescence anisotropy experiments: probing protein dynamics. Biophys J, 2005. 89(6): p. 3757-70.
28. Lakowicz, J.R., Principles of Fluorescence Spectroscopy. 4 ed. Vol. 1. 2006: SpringerLink.
29. Badali, D. and C.C. Gradinaru, The effect of Brownian motion of fluorescent probes on measuring nanoscale distances by Forster resonance energy transfer. J Chem Phys, 2011. 134(22): p. 225102.
30. Kollmann, H., et al., Probing the Ca(2+) switch of the neuronal Ca(2+) sensor GCAP2 by time-resolved fluorescence spectroscopy. ACS Chem Biol, 2012. 7(6): p. 1006-14.
31. Marsh, J.A. and J.D. Forman-Kay, Structure and disorder in an unfolded state under nondenaturing conditions from ensemble models consistent with a large number of experimental restraints. J Mol Biol, 2009. 391(2): p. 359-74.

Statement of Contributions

I was involved in the building, the development and the daily maintenance of the multi-parameter confocal laser scanning microscope (CLSM) and the total internal reflection fluorescence microscope (TIRFM) described in Chapter 2. The hardware from PicoQuant (PicoHarp 300 and router) came with sample LabView code. A prototype LabView program was written by a summer student, Sangkyu Lee. I developed the program into the multi-featured and user-friendly controlling software described in Chapter 2. I calibrated the scanning stage and developed methods to synchronize it with other hardware components in the confocal microscope. I also developed a LabView program to control all hardware (except the CCD camera) in the TIRFM to facilitate operation during data acquisition. I also programmed the offline version of the controlling program for the confocal microscope, which was used to analyze the saved data files. I programmed a few other applications to facilitate data analysis in batch mode, such as lifetime fitting and single-molecule burst analysis. Amir Mazouchi developed the core program for FCS analysis based on an early prototype written by me.

I developed several sample protocols described in Chapter 2 based on the existing literature, and finalized them with the assistance of other lab members. These protocols include coverslip surface modification, protein labelling and purification, lipid vesicle preparation, etc. I developed a novel, improved vesicle preparation method to encapsulate biomolecules for single-molecule observations. I characterized the size distribution and the encapsulation efficiency of vesicles prepared by our method and compared them to those obtained with the classical method from the literature. I showed that a relatively high encapsulation efficiency can be achieved with much less biological material using my method. I characterized the extent to which surface effects are eliminated by vesicle encapsulation compared to direct immobilization. I also studied in depth the protein denaturation and the changes in fluorophore photophysics caused by freeze-thaw cycles. Details of the results are given in Chapter 3. Those results have been published in: “Trapping Single Molecules in Liposomes: Surface Interactions and Freeze-Thaw Effects”, B. Liu, A. Mazouchi and C.C. Gradinaru, J. Phys. Chem. B 114 (46), 15191-98 (2010), in which I am the first author.

I performed the fluorescent labeling of STAT3 and demonstrated STAT3 membrane anchorage in vitro in a liposome model system that I developed. PMA (protein membrane anchor) compounds were synthesized in the lab of our collaborator, Prof. Patrick Gunning (Chemistry, UTM). I characterized the photophysical properties of the fluorescent PMAs. I developed the STAT3-lipid bilayer anchoring assay to directly visualize STAT3 anchorage on lipid bilayers containing PMA molecules. I also performed single-molecule experiments on a single pair of dye-peptide and STAT3 trapped inside a vesicle, to probe the strength of the proposed STAT3 inhibition method. Based on our data, in vivo imaging experiments were performed in Professor James Turkson’s lab at the University of Central Florida. Details of the results are described in Chapter 4. The results have been published in:

(1). “Artificially Induced Protein-Membrane Anchorage with Cholesterol-based Recognition Agents: Towards a New Therapeutic Concept”, M. Avadisian, S. Fletcher, B. Liu, W. Zhao, X. Zhang, P. Yue, D. Badali, W. Xu, A.D. Schimmer, J. Turkson, C.C. Gradinaru, and P.T. Gunning, Angewandte Chemie Intl. 50 (28), 6248-6253 (2011), in which I am a shared-first author.

(2). “A Photostable, pH-invariant Fluorescein Derivative for Single-Molecule Microscopy”, B. Liu, S. Fletcher, M. Avadisian, P.T. Gunning, and C.C. Gradinaru, J. Fluoresc. 19, 915-920 (2009), in which I am the first author.

(3). “Single-Molecule Fluorescence Study of the Inhibition of the Oncogenic Functionality of STAT3”, B. Liu, D. Badali, S. Fletcher, M. Avadisian, P.T. Gunning, and C.C. Gradinaru, Proc. SPIE 7386, 738605 (2009), in which I am the first author.

Sic1 protein samples (Chapter 5) were expressed and purified in the lab of Prof. Julie Forman-Kay at the Hospital for Sick Children in Toronto. I did the fluorophore labelling of Sic1 and the subsequent purification by size-exclusion chromatography. I also performed mass spectrometry on Sic1 and labelled Sic1 samples, to probe the efficiency and the nature of the labeling for smFRET. I performed smFRET burst measurements on donor-acceptor-labelled Sic1 samples at varying concentrations of KCl and GuHCl. I wrote a LabView program to analyze the smFRET burst measurements and fitted the obtained FRET efficiency distributions to appropriate models. I obtained correlated end-to-end distances (from smFRET) and hydrodynamic radii (from burst-gated FCS) to understand the structural behavior of Sic1. I also performed time-resolved fluorescence anisotropy measurements at appropriate salt concentrations to further probe Sic1’s chain flexibility and its overall shape. Finally, I performed FRET-FCS measurements to learn about the rates of Sic1 intrachain fluctuations. Details of the results are described in Chapter 5. An undergraduate student, Darius Chia, assisted me in the above work and deserves much credit. A version of this work is currently being prepared for manuscript submission.

Apart from the work described in this thesis, I made significant contributions to a number of collaborative projects with Professor Peter Macdonald (Chemistry, UTM) and. Those projects involved the fabrication and characterization of microgel and lipogel systems. I designed and performed fluorescence co-localization experiments on microgels and lipogels, which were crucial for understanding their core-shell structure. I performed microgel and lipogel imaging under the TIRFM with heating cycles, which is important for understanding their thermoresponsive properties. I designed and performed fluorescence recovery after photobleaching (FRAP) experiments on the lipid layer coating of the lipogel in the confocal setup. The goal was to obtain the diffusion coefficients of lipids in the coating layer and thus understand the nature of the lipid layer (monolayer, bilayer or liposomes). FRAP curves obtained from a small object like the lipogel cannot be fitted to an analytical model. I programmed a LabView application to simulate the fluorescence recovery, considering the geometry of the object and the size and shape of the excitation-detection volume, and obtained diffusion coefficient values comparable to literature values. The work was published in three publications: in two of them I am the second author, and in one I am a shared-first author (refer to “List of Publications” for details).

List of Publications

1. M. Avadisian*, S. Fletcher*, B. Liu*, W. Zhao*, X. Zhang, P. Yue, D. Badali, W. Xu, A.D. Schimmer, J. Turkson, C.C. Gradinaru and P.T. Gunning, “Artificially Induced Protein-Membrane Anchorage with Cholesterol-based Recognition Agents: Towards a New Therapeutic Concept”, Angewandte Chemie Intl (2011), 50 (28), 6248-6253
   * The authors contributed equally to the work
2. Q. Saleem*, B. Liu*, C.C. Gradinaru and P.M. Macdonald, “Lipogels: Single Lipid Bilayer-Enclosed Hydrogel Spheres”, Biomacromolecules (2011), 12 (6), 2364-74
   * The authors contributed equally to the work
3. A. Mazouchi, B. Liu, A. Bahram and C.C. Gradinaru, “On the performance of bioanalytical fluorescence correlation spectroscopy measurements in a multiparameter photon-counting microscope”, Anal. Chim. Acta (2011), 688 (1), 61-69
4. B. Liu, A. Mazouchi and C.C. Gradinaru, “Trapping Single Molecules in Liposomes: Surface Interactions and Freeze-Thaw Effects”, J. Phys. Chem. B (2010), 114 (46), 15191-15198
5. D. Badali, B. Liu, A. Mazouchi, M. Avadisian, P.T. Gunning and C.C. Gradinaru, “Development of STAT3 as an accessible target for fluorescence-based inhibition assays”, JULS (2010), 4, 18-23
6. N. MacKinnon, G. Guerin, B. Liu, C.C. Gradinaru, J.L. Rubinstein, and P.M. Macdonald, “Triggered Instability of Liposomes Bound to Hydrophobically Modified Core-Shell PNIPAM Hydrogel Beads”, Langmuir (2010), 26 (2), 1081-1089
7. B. Liu, S. Fletcher, M. Avadisian, P.T. Gunning and C.C. Gradinaru, “A Photostable, pH-invariant Fluorescein Derivative for Single-Molecule Microscopy”, J. Fluoresc. (2009), 19, 915-920
8. N. MacKinnon, G. Guerin, B. Liu, C.C. Gradinaru, and P.M. Macdonald, “Liposome-Hydrogel Bead Complexes Prepared via Biotin-Avidin Conjugation”, Langmuir (2009), 25 (16), 9413-9423
9. W. R. Algar, M.B. Prigozhin, B. Liu, U.J. Krull, C.C. Gradinaru, “Developing Methods to Study the Conformational Dynamics of Quantum Dot-Oligonucleotide Conjugates by Single Molecule Spectroscopy”, Proc. SPIE (2009), 7386, 73860J
10. B. Liu, D. Badali, S. Fletcher, M. Avadisian, P.T. Gunning, and C.C. Gradinaru, “Single molecule fluorescence study of the inhibition of the oncogenic functionality of STAT3”, Proc. SPIE (2009), 7386, 738605