Application of Protein-based Biosensors in Detection of Novel Therapeutics & Environmental Monitoring

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Jeevan Baretto, M.S.

Graduate Program in Department of Chemical Engineering

The Ohio State University

2014

Dissertation Committee:

Professor David W. Wood, Advisor

Professor Andre Palmer

Professor Shian-Tian Yang

© Copyright by

Jeevan Baretto

2014

Abstract

Proteins play a vital role in living systems, and their malfunction can lead to serious disease. In recent years, thousands of genes have been discovered, each with specific functions. Developing tools to control their function is a challenging task on which researchers have worked for decades. Proteins have been used as molecular switches to construct biosensors. In this work, an engineered protein-based bacterial biosensor is introduced as a tool for detecting potential ligands of nuclear hormone receptor proteins. The modular design of the biosensor protein allows the receptor protein to be swapped, so that new biosensors can be created easily. The study has been extended to receptor proteins from different species, which were tested against a library of chemicals. A presynaptic protein, neurexin, known to play a role in neurodevelopmental disorders, has been incorporated into the biosensor and could open new avenues for the discovery of therapeutics. Techniques including nuclear magnetic resonance (NMR) have been used to confirm ligand binding. A detailed study of the biosensor mechanism is presented, which can guide the intelligent design of next-generation biosensors.

To my late Daddy

&

my Amma

Acknowledgments

I have to thank David for giving me this excellent opportunity to work in his lab and for providing every possible support. I am grateful to him for believing in me in spite of my not having any prior biology experience. He was a constant source of inspiration in the group, a great mentor, and an advisor. I also want to thank my committee members, Dr. Palmer and Dr. Yang, for their constant support. I have to mention Dr. Tapas Mal, who guided me extensively through the NMR experiments; I am very grateful for that.

I am very thankful to the current and past lab members who were there when I needed them. I would like to thank Drs. Jingjing, Iraj and Izabela for their guidance, support and deep insight during my initial days in the lab. I am thankful to other lab members - Mike, Dan, Miriam, Elif, Sam, Tzu Chiang, Ashwin, Steven, Samar among others for their friendship and constant support.

A special thanks to my best friends: Dr. Verena, Michi and Shadwa for cheering me up all the time when my experiments did not work.

I have to thank my roommates, with whom I spent a significant amount of time while working on my thesis. Special thanks to Sanket, Deepak, Arvind, Sughosh, and Sidharth.

I would never have made it to OSU without the efforts of my inspiring professors at IIT Bombay: Prof. Preeti Aghalayam, Prof. Sanjay Mahajani, and Prof. Anuradda Ganesh. This journey would not have been possible without the efforts of my amazing family: my mom, dad, and my sister, Mangala. I miss you a lot. I would like to thank my extended family, Mamatha and Rajeshanna, for their unconditional love and support even during hard times.

Vita

2006 ...... B.E. Chemical Engineering, S. V. National Institute of Technology, Surat, India

2013 ...... M.S. Chemical Engineering, The Ohio State University, Columbus, OH

2009-present ...... Graduate Teaching Associate, Department of Chemical Engineering, The Ohio State University, Columbus, OH

Fields of Study

Major Field: Chemical Engineering

Table of Contents

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

1. Introduction
    1.1 Goal
    1.2 Background
        1.2.1 Inteins
        1.2.2 Structure
        1.2.3 Engineered Mini-inteins
        1.2.4 Split Inteins
        1.2.5 Splicing Mechanism
        1.2.6 Applications of Inteins
        1.2.7 Construction of a Thymidylate Synthase Reporter System

2. Protein Switches for Biosensing
    2.1 Nuclear Hormone Receptors Superfamily
    2.2 Nuclear Hormone Receptors Classification
    2.3 Nuclear Hormone Receptor Ligands
    2.4 Structural features of NHRs
    2.5 Hormone-sensing protein-based Biosensor Design and Construction
    2.6 Detection of compounds with hormone-mimicking properties
    2.7 Biosensors with Receptors from Various Species
    2.8 Discussion
    2.9 Conclusions

3. Neurexin Biosensor
    3.1 Neurexin
    3.2 Plasmid Constructions
    3.3 Biosensor Assay
    3.4 15N Labeled Protein Expression and Purification
    3.5 Sample for NMR experiments
    3.6 Saturation Transfer Difference (STD) NMR
    3.7 Heteronuclear single quantum coherence (HSQC)
    3.8 Results and Discussion
    3.9 Conclusions

4. Mechanism of Biosensors
    4.1 Results
    4.2 Discussion
    4.3 Conclusion

5. Summary
    5.1 Future Work
        5.1.1 3D structure of neurexin using NMR
        5.1.2 Molecular Docking
        5.1.3 In-vitro biosensor

6. Materials & Methods
    6.1 Reagents and Strains
    6.2 Plasmid Construction
    6.3 15N Labeled Protein Expression and Purification

Appendices

A. DNA Sequences of Ligand Binding Domains of Nuclear Hormone Receptors and Neurexin
    A.1 Human Estrogen Receptor β
    A.2 Cow Estrogen Receptor β
    A.3 Rat Estrogen Receptor β
    A.4 Zebrafish Estrogen Receptor β
    A.5 Neurexin 2b

B. Plasmid Maps
    B.1 pMIT:ERβ Human
    B.2 pMIT:ERβ Cow
    B.3 pMIT:ERβ Rat
    B.4 pMIT:ERβ Zebrafish
    B.5 pMIT:Nrx 2b
    B.6 pET:Nrx-His6

Bibliography

List of Tables

2.1 List of ICCVAM-suggested chemicals
2.2 Statistical data for human ERβ biosensor
2.3 Statistical data for cow ERβ biosensor
2.4 Statistical data for rat ERβ biosensor
2.5 Statistical data for zebrafish ERβ biosensor

List of Figures

1.1 The central dogma of molecular biology
1.2 Evolution of inteins
1.3 Evolution of mini-inteins
1.4 Mechanism of protein splicing
1.5 Schematic of intein-mediated protein purification
1.6 Folate cycle
2.1 Mechanism of estrogen signaling
2.2 receptor ligands
2.3 Retinoid X receptor ligands
2.4 Estrogen receptor ligands
2.5 Tertiary structure of ligand-bound ERβ
2.6 Schematic of the biosensor modules
2.7 Animal biosensor plasmid construction
2.8 Dose response curves for chemicals with human ERβ 1
2.9 Dose response curves for chemicals with human ERβ 2
2.10 Dose response curves for chemicals with human ERβ 3
2.11 Dose response curves for chemicals with cow ERβ 1
2.12 Dose response curves for chemicals with cow ERβ 2
2.13 Dose response curves for chemicals with cow ERβ 3
2.14 Dose response curves for chemicals with cow ERβ 4
2.15 Dose response curves for chemicals with cow ERβ 5
2.16 Dose response curves for chemicals with cow ERβ 6
2.17 Dose response curves for chemicals with rat ERβ 1
2.18 Dose response curves for chemicals with rat ERβ 2
2.19 Dose response curves for chemicals with rat ERβ 3
2.20 Dose response curves for chemicals with rat ERβ 4
2.21 Dose response curves for chemicals with rat ERβ 5
2.22 Dose response curves for chemicals with zebrafish ERβ 1
2.23 Dose response curves for chemicals with zebrafish ERβ 2
2.24 Dose response curves for chemicals with zebrafish ERβ 3
2.25 Dose response curves for chemicals with zebrafish ERβ 4
3.1 Schematic of synapse junction
3.2 Schematic of neurexin subtypes
3.3 Schematic representation of neurexin biosensor
3.4 Schematic representation of the high-throughput screening method
3.5 Scheme showing the principle of STD NMR
3.6 Dose response curve for neurexin biosensor with rosiglitazone
3.7 STD NMR spectrum of neurexin with rosiglitazone
3.8 Overlay of HSQC for neurexin in apo form and in the ligand-bound form
4.1 Schematic of different biosensor constructs
4.2 Effect of MBP and/or intein domain deletions
4.3 Effect of intein V437L mutation
4.4 Effect of length of linker between MBP and intein domain
4.5 Effect of GS linkers between intein and TS domain
4.6 Effect of intein N440A mutation
4.7 Effect of intein N440A mutation and GS linkers

Chapter 1: Introduction

Proteins are an integral part of every living organism. An organism's ability to develop and sustain life depends entirely on the active production, regulation, and function of proteins. Enzymes catalyze biochemical reactions in our body, from the digestion of food to the absorption of oxygen into the bloodstream. For example, the enzyme amylase breaks down starch, a large molecule, into smaller molecules such as maltose, which is more readily absorbed by the intestines. Several metabolic pathways, such as glycolysis, which produces the organism's energy currency, ATP, are vital for the living system and would not be possible without enzymes catalyzing the biochemical reactions. Another example is homeostasis. The regulation of an organism's temperature and pH (acidity and alkalinity) requires tight control of protein activity within the system. Any malfunction may lead to a serious disease. Ion channels, which are membrane proteins, play a very important role in controlling the flow of ions that mediate the action potential. They act like gates that let only certain ions pass through. Our nervous system depends on these action potentials for the transfer of electrochemical signals, so moving an arm or looking to the right requires working ion channels. In fact, the venoms produced by some predators target the ion channels of their prey and inactivate them. Peptide-based hormones act as co-activators and regulate the function of certain genes. Proteins therefore play a very important role in the sustenance of life.

Proteins in different species, or even within the same species, display considerable variety in their phenotype and function. Each has specific functions that are important for the survival of the species. To understand life, we must first understand protein function. In recent years, thousands of genes have been discovered, each with a specific function in an organism. These genes interact with thousands of proteins that regulate their function. Understanding these functions will lead to discoveries that can help cure diseases and improve the yield of agricultural crops. For example, a metabolic pathway can be engineered for higher nutrient uptake and increased yield, an approach that has been applied successfully in the agricultural sciences.

The activity and function of proteins depend on their structure and folding characteristics. According to the central dogma of molecular biology (Figure 1.1), which describes the flow of genetic information within a biological system, the DNA encoding a particular gene is transcribed to messenger RNA, which is then translated to a protein.

The proteins then fold into secondary, tertiary, and quaternary structures and become functional. Proteins have a dynamic structure: they are "breathing" all the time and can undergo conformational changes that allow interactions with other proteins or small molecules. Enzymes bind their substrates in a similar fashion, often described by the lock-and-key model.

Figure 1.1: The central dogma of molecular biology

We can regulate the function of proteins by engineering their biophysical properties. Conversely, altering the biophysical characteristics of proteins can have serious consequences. Mutations in DNA lead to changes in the amino acid sequence and in turn alter the function of the protein. Several known diseases, such as sickle-cell anemia and Alzheimer's disease, are caused by or associated with mutations in the gene encoding a protein. The protein that was supposed to carry out a particular function is then unable to do so because its amino acid sequence has changed. Hence, understanding proteins and engineering them in a favorable way can help cure these diseases.
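The flow of information from DNA to protein, and the way a single-nucleotide change propagates to the amino acid sequence, can be illustrated with a short sketch. The code below is an illustration only and is not part of the original work: the function names (`transcribe`, `translate`, `point_mutation`) are hypothetical, and the DNA fragment is a short stretch modeled on the start of the human β-globin coding sequence, chosen to show the well-known sickle-cell substitution (GAG to GTG, glutamate to valine).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Return the mRNA equivalent of a coding-strand DNA sequence (T -> U)."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def point_mutation(dna: str, position: int, new_base: str) -> str:
    """Return a copy of dna with the base at a 0-based position replaced."""
    return dna[:position] + new_base + dna[position + 1:]


# Illustrative fragment (assumption: modeled on the start of human beta-globin).
normal_dna = "ATGGTGCATCTGACTCCTGAGGAGAAG"
# A -> T change in the seventh codon, counting the start codon (GAG -> GTG).
sickle_dna = point_mutation(normal_dna, 19, "T")

normal_protein = translate(transcribe(normal_dna))
sickle_protein = translate(transcribe(sickle_dna))
print(normal_protein)  # MVHLTPEEK
print(sickle_protein)  # MVHLTPVEK
```

A single base change out of 27 alters exactly one residue, yet in the real protein that one substitution is enough to change its behavior.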

To understand any system, we first need to perturb it, see how it reacts, and thereby learn the function of a particular module and how it affects the system. Similarly, to understand the function of a particular protein, one strategy is to knock out its gene from the chromosome, or to mutate nucleotides in the gene sequence that encodes the protein, and see what implications that has for the phenotype. This technique is simple and can be generalized to mutate any nucleotide among billions while remaining highly specific, so that the function of only that particular gene is affected. Unfortunately, the technique cannot be applied to every target gene, as some proteins are essential for cell viability and cannot be disturbed.

Bacterial and mammalian cell hosts have provided a cheap alternative for expressing human proteins. They act as cell factories that can make proteins for us in large quantities, and they are cheaper, faster, and more efficient. We can insert a specific piece of DNA into these host cells and transform them to make our protein of interest. These host cells are considered industrial workhorses for making therapeutic proteins and other biotechnology products. Proteins have also been used as molecular switches to regulate the functioning of a gene. This means that we can make use of the biophysical features of proteins to regulate a target gene.

1.1 Goal

Our goal in this thesis is to design alternative methodologies for regulating protein function. First, a method is described for engineering protein switches, called biosensors, that can detect chemicals binding to the protein. We then focus on developing these ideas and applying them to specific targets such as estrogen receptors, thyroid receptors, peroxisome proliferator-activated receptors, and neurexins. Finally, we probe the working mechanism of the biosensors by deleting domains and truncating protein sequences to study the role of individual domains.

1.2 Background

Let us now look more closely at the specific proteins central to this work. Proteins have a variety of functions, many of which are still unknown. In this chapter, we examine inteins: their structure, function, and applications.

1.2.1 Inteins

Several archaeal, eubacterial, and eukaryotic genes contain in-frame insertions that are excised during post-translational modification (Perler, 1998). These inserted sequences are termed inteins. Excision results in the formation of two products: the excised intein and the mature spliced host protein (Perler et al., 1994). Intein splicing is analogous to the self-splicing of introns at the RNA level (Derbyshire and Belfort, 1998).

An intein was first discovered in 1988 through sequence comparison of the carrot vacuolar ATPase with the exons of a Neurospora 69-kDa genomic clone from Saccharomyces cerevisiae (Zimniak et al., 1988). The comparison revealed a discrepancy between the size of the gene encoding this protein and the size of the mature translated product. When the gene was initially cloned, the expected size of the mature translated product was 119 kDa. After running the protein samples on a polyacrylamide gel, the researchers found that the actual size of the product was 69 kDa. Northern blot analysis confirmed the presence of a single mRNA transcript corresponding to the 119-kDa size. It was therefore believed that some other mechanism was involved. Researchers at New England Biolabs carried out a systematic study of the splicing mechanism and suggested that factors such as temperature and pH play an important role in the splicing reaction (Xu et al., 1993).
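The size discrepancy described above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value taken from this work, and the function name `missing_segment` is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per amino acid (assumption)


def missing_segment(expected_kda: float, observed_kda: float) -> tuple[float, int]:
    """Return the unaccounted-for mass (kDa) and its approximate length in residues."""
    missing_kda = expected_kda - observed_kda
    residues = round(missing_kda * 1000 / AVG_RESIDUE_MASS_DA)
    return missing_kda, residues


mass_kda, residues = missing_segment(expected_kda=119.0, observed_kda=69.0)
print(mass_kda, residues)  # 50.0 455
```

Roughly 50 kDa, on the order of 450 residues, is absent from the mature product even though the transcript is full length, which points to a loss at the protein level rather than during RNA processing.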

Inteins exist in nature in one of three forms:

1. As maxi-inteins, which contain a DNA homing endonuclease within the reading frame of the intein, resulting in N-terminal (IN) and C-terminal (IC) intein fragments.

2. As mini-inteins, in which the endonuclease domain is absent but a contiguous protein splicing domain is retained.

3. As split inteins, in which IN and IC are encoded by two independent genes and fused to their respective exteins (van den Heuvel et al., 1998).

Over 550 inteins have been catalogued in the InBase intein database (Raffo et al., 2013), the majority of them from unicellular organisms. A number of amino acids flanking the intein-extein splice junctions are well conserved. The first amino acid of an intein is typically Cys or Ser, while the first amino acid of the C-extein is Cys, Thr, or Ser. These conserved features have led to the discovery of inteins in many different host proteins and species (Pietrokovski, 2001). Most intein alleles are similar, which supports the theory that inteins evolved from a single ancestral gene (Karki et al., 2014). Although there is little similarity among non-allelic inteins, their integration points lie in highly conserved sequence motifs (Derbyshire and Belfort, 1998). Intein host proteins are very diverse and include vacuolar ATPase, DnaB helicase, RecA, gyrase A, DNA polymerase, PEP synthase, anaerobic rNTP reductase, and others (Karki et al., 2014).

1.2.2 Structure

The size of inteins varies significantly between species and also within the same species. The smallest intein discovered so far is the RIR-1 intein from Methanothermobacter thermautotrophicus (Mth RIR-1), consisting of 134 amino acids. On the other hand, the largest discovered is the RFC-2 intein from Pyrococcus abyssi (Pab RFC-2), with 608 amino acids (Raffo et al., 2013). In general, inteins are classified as large or maxi-inteins if they are longer than 350 amino acids and as mini-inteins if they are shorter than 200 amino acids. Mini-inteins lack the sequence of amino acids that resembles homing endonucleases.
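The size-based classification above can be written as a small helper. This is a minimal sketch: the function name is hypothetical, and the "unclassified" label for lengths between the two thresholds is an assumption, since the text only defines the two ends of the range.

```python
def classify_intein(length_aa: int) -> str:
    """Classify an intein by length using the thresholds quoted in the text."""
    if length_aa > 350:
        return "maxi-intein"
    if length_aa < 200:
        return "mini-intein"
    # The text does not name inteins of intermediate length (assumption).
    return "unclassified"


print(classify_intein(134))  # Mth RIR-1 -> mini-intein
print(classify_intein(608))  # Pab RFC-2 -> maxi-intein
```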

Inteins have two functional domains, similar to the hedgehog proteins: a splicing domain and a central homing endonuclease domain. The crystal structure shows that the homing endonuclease domain appears to have been inserted into the splicing domain. It is speculated that at some point the ancestral self-splicing gene was invaded by the homing endonuclease gene (Vogel et al., 2014).

Figure 1.2 shows how an ancestral Hedgehog and INTein (HINT) module was modified to a mini-intein by the addition of a polypeptide ligation region, and further to a maxi-intein by the addition of a homing endonuclease. The same HINT module, with the addition of a sterol recognition region, was modified to the hedgehog autoprocessing domain (Raffo et al., 2013).

1.2.3 Engineered Mini-inteins

To test the hypothesis that an endonuclease gene invaded a sequence encoding a small, functional splicing element, variable lengths of sequence were deleted from the endonuclease domain of the Mycobacterium tuberculosis (Mtu) RecA intein to create engineered mini-inteins that are splicing-proficient (Derbyshire et al., 1997). The goal was to show, using site-directed mutagenesis, that endonuclease activity is not required for protein splicing. The mini-intein was transferred to a tripartite fusion system (MIC) for in vivo characterization of splicing products. The MIC system consisted of maltose binding protein (M) fused in-frame to the intein (I) and then to the C-terminal domain of the homing endonuclease I-TevI (C). The mini-intein was able to excise itself to generate the ligated product MC, as verified by Western blot experiments (Derbyshire et al., 1997). A V67L substitution helped stabilize the mini-intein (Hiraga et al., 2005). This mini-intein plays an important part in this thesis and is examined further in the coming chapters.

Figure 1.2: Evolution of inteins and Hedgehog-like autoprocessing proteins (Raffo et al., 2013)

Figure 1.3: Evolution of mini-inteins. Originally the inteins contain two domains: a splicing domain and an endonuclease domain. The endonuclease domain was deleted to yield functional artificial mini-inteins. Figure adapted from Derbyshire et al. (1997).

1.2.4 Split Inteins

The splicing domain of a mini-intein can be split into two fragments, separated at the point where the endonuclease domain was excised, and expressed from two separate genes. The N- and C-terminal fragments reassociate in trans and regain their splicing activity. An artificial mini-intein derived from the Mtu RecA intein was split at the location where the endonuclease domain had been deleted and was shown to retain splicing activity in vivo; splicing was also shown to occur in vitro after the two fragments were denatured and then renatured (Derbyshire et al., 1997).

Split inteins also occur naturally in living organisms. The catalytic α subunit of DNA polymerase III (DnaE protein) is expressed as two fragments from two separate genes, each encoding part of the DnaE protein. When the intein fragments associate and protein splicing takes place, the DnaE fragments are ligated and the protein becomes functional. It was also recently found that a number of naturally occurring split inteins have split sites other than the location where the endonuclease domain is deleted (Paterni et al., 2013).

1.2.5 Splicing Mechanism

Inteins are able to splice out of, and re-ligate, a variety of polypeptide sequences. This means that all the information required to carry out the process is contained within the intein sequence. Protein splicing is therefore a self-catalyzed process, in which the intein can be viewed as an enzyme that catalyzes its own excision and concomitantly links the two flanking substrates with a new peptide bond.

Since the discovery of protein splicing, its mechanism has been thoroughly investigated. The initial amino acid residue of an intein is generally a cysteine or a serine. These residues act as nucleophiles and attack the carbonyl group of the N-extein/intein peptide bond. This results in the formation of an ester by an N-to-O acyl rearrangement in the case of serine, or a thioester by an N-to-S acyl rearrangement when cysteine is the initial amino acid of the intein (Chong et al., 1996). The proximity of the N- and C-termini of the intein then allows a transesterification reaction to take place, which results in the ligation of the two exteins. During this step, the hydroxyl or thiol group of the initial amino acid of the C-terminal extein acts as the nucleophile. It attacks the previously generated (thio)ester at the N-extein/intein linkage, creating a new (thio)ester bond between the two exteins at the C-terminal splice junction (Xu et al., 1993). The transesterification reaction thus leads to a branched intermediate in which the two exteins are linked by a (thio)ester bond and the intein remains connected to the C-extein by a peptide bond. This branched intermediate can be identified by SDS polyacrylamide gel electrophoresis, owing to its substantially reduced electrophoretic mobility, when the splicing reaction is slowed and the intermediate is allowed to accumulate (Xu et al., 1993). After this reaction, the intein releases itself through a cyclization reaction (aminosuccinimide formation) mediated by the asparagine present as the final amino acid of the intein sequence (Evans and Xu, 1999). Finally, the (thio)ester link between the ligated exteins spontaneously rearranges through a final acyl shift to form a native peptide bond, yielding the mature ligated extein product (Paulus, 2000). Figure 1.4 is a schematic representation of this process.
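The net outcome of the splicing steps above, together with the residue requirements they imply, can be sketched at the sequence level. The code below is a simplified illustration and not a chemical model: the function and class names are hypothetical, the toy sequence is invented, and the residue sets simply encode what the text states (a Cys or Ser at the start of the intein, an Asn at its end, and a Cys, Ser, or Thr at the start of the C-extein).

```python
from dataclasses import dataclass

INTEIN_FIRST_RESIDUES = {"C", "S"}         # thiol or hydroxyl nucleophile
C_EXTEIN_FIRST_RESIDUES = {"C", "S", "T"}  # thiol or hydroxyl nucleophile
INTEIN_LAST_RESIDUE = "N"                  # asparagine drives succinimide formation


@dataclass
class SplicingProducts:
    intein: str
    ligated_exteins: str


def splice(precursor: str, intein_start: int, intein_end: int) -> SplicingProducts:
    """Excise precursor[intein_start:intein_end] and join the flanking exteins.

    Raises ValueError when the junction residues do not meet the requirements.
    """
    n_extein = precursor[:intein_start]
    intein = precursor[intein_start:intein_end]
    c_extein = precursor[intein_end:]
    if not (n_extein and intein and c_extein):
        raise ValueError("precursor must contain N-extein, intein, and C-extein")
    if intein[0] not in INTEIN_FIRST_RESIDUES:
        raise ValueError("intein must start with Cys or Ser")
    if intein[-1] != INTEIN_LAST_RESIDUE:
        raise ValueError("intein must end with Asn")
    if c_extein[0] not in C_EXTEIN_FIRST_RESIDUES:
        raise ValueError("C-extein must start with Cys, Ser, or Thr")
    return SplicingProducts(intein=intein, ligated_exteins=n_extein + c_extein)


# Toy precursor (invented): N-extein "MKV", intein "CLAEGTRHN", C-extein "SGE".
products = splice("MKVCLAEGTRHNSGE", 3, 12)
print(products.intein)           # CLAEGTRHN
print(products.ligated_exteins)  # MKVSGE
```

The sketch deliberately skips the intermediates (linear (thio)ester, branched intermediate, succinimide) and records only the two final products.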

Figure 1.4: The mechanism of protein splicing. Adapted from Wood 2000.

1.2.6 Applications of Inteins

Certain host proteins are inactivated by the insertion of an intein but are subsequently reactivated when the intein splices itself out. Inteins that exhibit this behavior include Sce VMA (Zeidler et al., 2004), Mtu RecA (Derbyshire et al., 1997), and Mxe GyrA (Adam and Perler, 2002). Host proteins in which this behavior has been observed include aminoglycoside phosphotransferase (Daugelat and Jacobs, 1999) and thymidylate synthase (Wood et al., 1999), among others. As mentioned earlier, the intein splicing reaction is self-catalyzed, meaning that it does not require any cofactors or coenzymes. Inteins are therefore active both in vivo and in vitro. The Sce VMA intein has been found to be splicing-competent in bacteria, yeast (Kane et al., 1990), insects (Zeidler et al., 2004), and even mammalian cells (Mootz et al., 2003). Similarly, the Mtu RecA intein is able to splice in bacteria (Daugelat and Jacobs, 1999), in yeast (Buskirk et al., 2004), and in vitro (Gangopadhyay et al., 2003a,b).

Conventional protein purification uses an affinity tag so that the target protein can be easily separated from other proteins. In some cases the affinity tag need not be removed, but for applications such as pharmaceuticals its removal is necessary. Typically, a protease recognition site is introduced adjacent to the tag so that the tag can be cleaved off by a specific protease. This step is costly for two reasons: the proteases themselves are expensive, and the protease must be removed after cleavage. To overcome these drawbacks, inteins have been used as self-cleaving tags, as shown in Figure 1.5.

Figure 1.5: Schematic of the intein-mediated purification process with on-column cleavage. Inclusion of an intein between a binding domain and product protein renders the binding domain self-cleaving. A shift in pH initiates the cleavage reaction in column-bound material, allowing the collection of a pure product protein. Adapted from Wood et al. (2000).

1.2.7 Construction of a Thymidylate Synthase Reporter System

The characteristics of protein splicing mentioned earlier make intein technology an attractive candidate for the construction of molecular switches. During natural selection, only the inteins most active in the splicing reaction have survived. Furthermore, there are no examples in the literature in which nature uses the switching properties of an intein to regulate the level of active host protein, and thus there is no readily available natural mechanism for controlling the splicing reaction.

Since no mechanism for regulating intein activity is known, inteins must be modified to fine-tune their splicing activity so that the splicing reaction can be controlled. We therefore need a mode of control that can be easily manipulated, and a reporter system with which to evaluate the modified inteins. There are two types of reporter systems: screening systems and selection systems. A screening system requires evaluation of the properties of the reporter protein for every individual intein variant. A selection system relies on the survival of only those cells that express intein variants with desirable characteristics.

The enzyme thymidylate synthase (TS) is found in all living organisms and is involved in a metabolic pathway called the folate cycle. TS activity is critical for providing cells with deoxythymidine monophosphate (dTMP), the precursor of deoxythymidine triphosphate (dTTP), an essential structural unit for the synthesis of DNA (Belfort and Pedersen-Lane, 1984). TS catalyzes the reductive methylation of deoxyuridine monophosphate (dUMP) by 5,10-methylenetetrahydrofolate to yield dTMP and dihydrofolate. Two other enzymes, dihydrofolate reductase (DHFR) and serine transhydroxymethylase, are required to complete the folate cycle (Carreras and Santi, 1995). Cells expressing TS can be easily selected in a thymine-free medium (-THY medium), because TS activity is essential for DNA synthesis and therefore for cell survival. The stringency of selection can be tuned by changing the incubation temperature, as TS activity is temperature-dependent. As the temperature is increased, metabolic rates within the cell increase, and with them the requirement for active TS. Another useful characteristic of the TS reporter is that, in the presence of trimethoprim, a DHFR inhibitor, thymine production can be inhibited, which provides a negative selection. Genetic selection with the TS reporter system is therefore a versatile tool for evaluating the splicing performance of inteins.
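The idea of a selection whose stringency rises with temperature can be sketched as a simple filter over intein variants. Everything numeric here is an assumption made for illustration: the variant names, their TS activities, and the linear threshold curve are invented and are not measurements from this work.

```python
def required_ts_activity(temperature_c: float) -> float:
    """Hypothetical stringency curve: demand for active TS rises with temperature."""
    # Linear ramp from 0.2 at 20 C to 0.8 at 40 C, clamped to [0, 1] (made-up numbers).
    fraction = 0.2 + (temperature_c - 20.0) * 0.03
    return min(1.0, max(0.0, fraction))


def select_on_minus_thy(variants: dict[str, float], temperature_c: float) -> list[str]:
    """Return the variants whose TS activity supports growth without thymine."""
    threshold = required_ts_activity(temperature_c)
    return sorted(name for name, activity in variants.items() if activity >= threshold)


# Hypothetical intein variants and the relative TS activity each one yields.
variants = {"variant_A": 0.9, "variant_B": 0.5, "variant_C": 0.1}
print(select_on_minus_thy(variants, 25.0))  # ['variant_A', 'variant_B']
print(select_on_minus_thy(variants, 37.0))  # ['variant_A']
```

In a screen, by contrast, every variant in the dictionary would have to be assayed individually; the selection simply lets the cells below the threshold fail to grow.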

Figure 1.6: TS genetic selection system. The roles of thymidylate synthase (TS), dihydrofolate reductase (DHFR), and serine transhydroxymethylase (SHT) in the folate cycle. Adapted from Carreras and Santi (1995).

Chapter 2: Protein Switches for Biosensing

In the previous chapter, we introduced inteins, their different types, and how they can be used as protein switches. In this chapter, we explore the idea of using inteins to make a simple bacterial biosensor. Before that, we briefly review the current strategies used to create new protein switches. Over the past few decades, considerable work has gone into creating new protein switches for biosensing and regulation of protein function. While natural allosteric proteins often combine intrinsic molecular recognition and signal output in a single domain, artificially engineered protein switches usually rely on a chimeric fusion in which the ligand-binding domain and the signal reporter domain are joined. An important hurdle in constructing a protein switch is signal transmission from the binding domain to the reporter domain (Golynskiy et al., 2011; Stratton and Loh, 2011). Several protein fusion strategies have been used, including end-to-end fusion and domain insertion. For example, fluorescent proteins can be fused to a receptor protein to produce a Förster resonance energy transfer (FRET) signal (Miyawaki et al., 1997).

Drug discovery relies on biosensing tools for efficient and rapid detection of ligand binding to therapeutically relevant protein targets. Nuclear hormone receptors (NHRs) are one such family of proteins that participate in vital functions in our body. They share a similar structural design in which a particular alpha helix behaves as a switch that turns the receptor on or off depending on the bound ligand. This structural rearrangement can be exploited to construct a biosensor that screens for ligands binding to the receptors.

2.1 Nuclear Hormone Receptors Superfamily

Nuclear hormone receptors (NHRs) are a family of receptors that participate in vital functions in our body such as cell differentiation, reproduction, metabolism and cell growth. Physiology in mammals is subject to daily oscillations in hormone secretion, body temperature, renal activity, etc. A complex interlock system coordinates these activities with high precision (Kotronoulas et al., 2009; Marcheva et al., 2013). This interlock system consists of a superfamily of receptors that are collectively called NHRs. Many members of this family have been candidates for drug discovery studies because of their roles in important metabolic pathways (Lin et al., 2013; Yang et al., 2013). Estrogen receptors (ERs) are among the most extensively studied members of this superfamily. ERs bind the hormone estrogen and thereby undergo a conformational change in the ligand binding domain, which allows the receptor to dissociate from Hsp90 and form a homodimer. This homodimer can then translocate to the nucleus and interact with coactivators such as steroid receptor coactivator 1 (SRC-1), forming an active complex that can bind to regulatory regions of DNA (termed estrogen response elements, EREs) and activate the expression of specific genes (Manolagas et al., 2013). ERs occur as two subtypes, ERα and ERβ. ERα is expressed in breast cancer cells, ovarian stroma cells and the hypothalamus (Yaghmaie et al., 2005; Cheng et al., 2013). ERβ is expressed in kidney, brain, bone, heart, lungs, intestinal mucosa, prostate and endothelial cells (Babiker et al., 2002).

The ligand binding domains of nuclear hormone receptors share structural homology. They contain a switch region that includes helix 12, which acts like a door to the ligand binding pocket (Aranda and Pascual, 2001). The position of helix 12 depends on the nature of the ligand. When the ligand is an agonist, helix 12 moves towards the binding pocket, creating a charged area on the protein surface. This charged area is then occupied by a coactivator, resulting in the initiation of transcription. When the ligand is an antagonist, helix 12 tends to rotate away from the binding pocket, resulting in suppression of transcription. This structural rearrangement is important for designing biosensors that detect ligand binding.

2.2 Nuclear Hormone Receptors Classification

NHRs are broadly classified into three categories: (1) the steroid receptors, which include the estrogen (ER), androgen (AR), progesterone (PR), glucocorticoid (GR) and mineralocorticoid (MR) receptors; (2) the retinoid acid-heterodimeric receptors, such as the thyroid hormone (TR), vitamin D (VTD), retinoic-acid (RAR), 9-cis-retinoic-acid (RXR) and ecdysone (EcR) receptors; and (3) the orphan receptors, such as the estrogen-related receptor and steroidogenic factor 1, for which no endogenous ligands have been identified. For some of the orphan receptors, potential endogenous and synthetic ligands have been identified recently. For others, it is unknown whether they are activated by ligand binding or by some other mechanism (Weatherman et al., 1999).

NHRs are further classified into subtypes, which adds to the complexity. The estrogen receptor, for example, has α and β subtypes, while the estrogen-related receptor has α, β, and γ subtypes. Peroxisome proliferator-activated receptor (PPAR) has α, β, γ, and δ subtypes (Bourguet et al., 2000).

Figure 2.1: Mechanism of estrogen signaling. (A) The estrogen receptor (ER) is present in the cytosol in a heterodimeric state bound to the molecular chaperone Hsp90. (B) Hormone binding induces a conformational change in the receptor, dissociation from Hsp90 and subsequent homodimerization. (C) The estrogen-bound ER dimer translocates to the nucleus and binds to estrogen response elements (EREs). (D) The ER complex interacts with coactivators (e.g. SRC) and allows the RNA polymerase to bind to the TATA box and initiate gene transcription.

2.3 Nuclear Hormone Receptor Ligands

NHRs are activated by binding to their native ligands, which are small-molecule hormones. NHR ligands are mostly hydrophobic in nature, as they have to diffuse through the cellular membrane to reach the nucleus. They are broadly classified into two categories: (1) steroidal compounds, which are derivatives of cholesterol, and (2) non-steroidal compounds, which are derived from various sources such as metabolites.

An important property of NHR ligands is that they are very similar in structure and have a well-conserved molecular volume. This property matters because the ligand must fit into the ligand binding pocket of the NHR; the fit also determines affinity and specificity.

NHRs can be activated by a number of chemicals that are structurally similar to the native ligands. These chemicals function as synthetic hormones and have the potential to treat conditions that stem from hormonal deficiencies. They are termed agonists. Other chemicals can inhibit the response to natural hormones or to other synthetic agonists. These hormone analogues are called antagonists and have great medicinal value.

Figure 2.2: Steroid receptor ligands

Figure 2.3: Retinoid X receptor ligands

2.4 Structural features of NHRs

The structural features of NHRs are highly conserved. They have three main functional domains: an N-terminal transactivation domain, a DNA binding domain, and a C-terminal ligand binding domain. The N-terminal transactivation domain includes a ligand-independent activation function called AF-1. The ligand binding domain contains a second transcriptional activation function (AF-2) that is activated upon ligand binding (Bourguet et al., 2000).

The C-terminal ligand binding domains of NHRs contain 12 alpha helices organized in a three-layer helical sandwich. The 12th alpha helix is flexible and moves to accommodate ligand binding. It opens up the ligand binding pocket and thus allows the NHR to undergo a conformational change upon ligand binding. Helix 12 includes the AF-2 region and extends outwards from the ligand binding pocket. Several NHRs have been crystallized in both the apo and the ligand-bound form, confirming this phenomenon (Bourguet et al., 2000). Figure 2.5 shows an example of human ERβ bound to the estrogen E2.

Figure 2.4: Estrogen receptor ligands

Figure 2.5: The structures of ERβ in apo form and ligand-bound form. The ligand here is estrogen E2. The flexible helix 12 is shown in red and undergoes a conformational change to accommodate ligand binding.

2.5 Hormone-sensing protein-based Biosensor Design and Construction

The biosensor protein was designed by fusing the ligand binding domain of an NHR to a well-characterized reporter enzyme, TS, which was discussed in the previous chapter. Expression of NHR ligand binding domains is known to suffer from stability and solubility problems, which have been resolved by fusion with other genes (Wittliff et al., 1990). Fusions with other domains should therefore improve stability and solubility. In addition, the mini splicing domain of the Mycobacterium tuberculosis RecA intein is known to fold properly and retain activity when inserted into different protein hosts, and maltose binding protein (MBP) is known to increase the solubility of the fusion protein (Jeong et al., 2014). Thus, the endonuclease domain of the full intein was replaced with the ligand binding domain of an NHR, and the resulting construct was further fused at the C-terminus to the bacteriophage T4 td gene (expressing the T4 TS enzyme). In addition, the first amino acid of the intein was mutated from Cys to Ala to suppress any splicing and N-terminal cleavage. The resulting fusion was cloned into the plasmid pMal-c2, fused to the C-terminal side of E. coli MBP. The resulting plasmid is referred to as pMIT::LBD (MBP-Intein-TS::LBD), where LBD is the ligand binding domain of an NHR. E. coli D1210ΔthyA cells were transformed with the resulting pMIT::LBD plasmids. The lac operator carried a G-to-A nucleotide substitution 16 bases downstream of the TATAA motif, which reduces its affinity for the lac repressor and makes expression constitutive (Skretas and Wood, 2005). The cells were grown in a minimal medium without thymine, referred to as –Thy medium. The incubation temperature was optimized at 34 °C. Upon addition of ligand and incubation for about 16 h, the cells were able to take up the ligand, which bound to the ligand binding domain of the NHR and thereby activated TS. As a result, a TS+ growth phenotype can be observed.

Figure 2.6: Schematic of the different modules in the biosensor. Upon successful ligand binding, the ligand binding domain undergoes conformational changes that in turn activate TS.

Several biosensor plasmids were constructed by swapping in the LBD of different receptors, including pMIT::ERβ, pMIT::TRβ, pMIT::PPARγ and pMIT::Nrx2b. Biosensors for the α subtypes of ER and TR were also constructed (Skretas et al., 2007). Once we had the biosensor cells, we screened a library of chemicals for potential ligands, especially those that might have therapeutic properties. We plotted the dose-response curves and calculated the half-maximal effective concentration (EC50) of each ligand for each receptor protein.

2.6 Detection of compounds with hormone-mimicking properties

Our endocrine system consists of a series of glands that secrete hormones directly into the bloodstream. These hormones are involved in vital functions in our body such as development, reproduction, homeostasis and others (Burris et al., 2013). Endocrine disruptor chemicals (EDCs) are compounds, both natural and synthetic, that mimic the properties of hormones and alter the normal functioning of the endocrine system in both wildlife and humans. Many animal experiments have shown that they cause adverse health effects by affecting fundamental physiological processes (Agas et al., 2012). They are found in almost every commodity that we use in our daily lives, such as perfumes, baby foods, vaccines and plastic bottles. Historically, they were known only to impair reproductive and developmental processes, but recently the focus has shifted to other diseases such as obesity, diabetes, cancerous tumors, birth defects, autism spectrum disorders and others (Chen et al., 2009; Newbold, 2010; Polyzos et al., 2012). They can be classified according to the nature of their endocrine action; examples include anti-androgenic, androgenic, estrogenic and anti-thyroid compounds, inhibitors of steroid hormone synthesis and retinoid agonists. They are also classified according to their use in daily life: pesticides (DDT and ), fungicides (vinclozolin), herbicides (), industrial chemicals (PCBs, dioxins), chemicals used in the synthesis of plastics (phthalates, bisphenol A (BPA), alkylphenols), plant hormones (), pharmaceutical drugs (, DES) and personal care products (Gawrys et al., 2009; Skinner et al., 2011). It is therefore imperative to develop a tool that can detect these compounds and potentially lead to the discovery of therapeutics for diseases linked to exposure to these hormone-like compounds. Several committees have been set up to develop strategies to tackle the problems related to EDCs. The Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) and the Environmental Protection Agency (EPA) are examples of bodies that are inviting proposals to develop new tools for the detection of EDCs. We therefore tested the library of compounds suggested by ICCVAM with the biosensor cells for estrogenicity. The original list contained 74 compounds, but very few of them displayed a positive response. The next section describes biosensor cells built with receptors from different animal species and their use in testing the ICCVAM library.

2.7 Biosensors with Estrogen Receptors from Various Species

In the past, we have successfully constructed biosensors using the human estrogen and thyroid receptor proteins to detect chemicals with hormonal activity (Skretas and Wood, 2005a). We have extended the strategy by constructing biosensors with ERβ from various animal species including fish, pig, cow, zebrafish and mouse (Gierach et al., 2012). The biosensor design and the ligand binding domain boundaries are shown in Figure 2.7.

The experiment was then extended to test a library of chemicals suggested by the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) that were classified as estrogenic. The ICCVAM aims to reduce the use of animals and encourages replacing them with lower-level species for in vivo testing. Table 2.1 shows the list of chemicals that were tested and displayed a positive growth phenotype during the test. The potencies of these test compounds were compared using the half-maximal effective concentration (EC50), from which the relative pseudotransactivation (RPTA) was determined. The EC50 values and the standard deviations were obtained from triplicate samples and are presented as plots of optical density versus test compound concentration. The calculations were based on nonlinear regression with a variable Hill slope (GraphPad Prism 6; GraphPad Software, La Jolla, CA, USA):

$$Y = \mathrm{Bottom} + \frac{\mathrm{Top} - \mathrm{Bottom}}{1 + 10^{(\log \mathrm{EC}_{50} - X)\cdot \mathrm{HillSlope}}} \qquad (2.1)$$

$$\mathrm{RPTA} = \frac{\mathrm{EC}_{50}^{\mathrm{E2}}}{\mathrm{EC}_{50}^{\mathrm{ligand}}} \times 100\% \qquad (2.2)$$
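As a minimal sketch of how Equations 2.1 and 2.2 can be evaluated, the following Python snippet implements the variable-slope dose-response model and the relative potency calculation. The fitting in this work was done in GraphPad Prism; the function names and the numerical inputs below are illustrative assumptions, not code or values from this study.

```python
import math


def four_parameter_logistic(x, bottom, top, log_ec50, hill_slope):
    """Equation 2.1: response Y at log10 concentration x."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill_slope))


def relative_potency(ec50_e2, ec50_ligand):
    """Equation 2.2: potency of a ligand relative to E2, in percent."""
    return ec50_e2 / ec50_ligand * 100.0


# Hypothetical curve parameters, chosen only to illustrate the model.
log_ec50 = math.log10(0.5)  # an EC50 of 0.5 (same units as the x axis)
y_mid = four_parameter_logistic(
    log_ec50, bottom=0.1, top=0.9, log_ec50=log_ec50, hill_slope=1.0
)
print(y_mid)  # at X = logEC50 the response is halfway between Bottom and Top

# Hypothetical EC50 values: a ligand four times more potent than E2.
print(relative_potency(1.0, 0.25))
```

At X = logEC50 the exponent is zero, so the model returns the midpoint of Bottom and Top regardless of the Hill slope, which is a convenient sanity check on any fitted curve.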

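To further examine the quality of the assay, the Z factor, which indicates how well the signal is separated from the noise in a measurement, was calculated for each measurement (Zhang et al., 1999), together with the signal-to-noise (S/N) and signal-to-background (S/B) ratios:

$$Z = 1 - \frac{3\,(\mathrm{SD}_{max} + \mathrm{SD}_{min})}{|\mathrm{Mean}_{max} - \mathrm{Mean}_{min}|} \qquad (2.3)$$

$$S/N = \frac{\mathrm{Mean}_{max} - \mathrm{Mean}_{min}}{\mathrm{SD}_{min}} \qquad (2.4)$$

$$S/B = \frac{\mathrm{Mean}_{max}}{\mathrm{Mean}_{min}} \qquad (2.5)$$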
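The three assay-quality metrics can be sketched in a few lines of Python. This is an illustrative example only: it uses the Z factor as defined by Zhang et al. (1999), with the sum of the two standard deviations in the numerator, and the triplicate readings are hypothetical values rather than data from this work.

```python
from statistics import mean, stdev


def assay_quality(max_signal, min_signal):
    """Return (Z, S/N, S/B) from replicate readings of the
    maximum-signal and minimum-signal wells."""
    mean_max, mean_min = mean(max_signal), mean(min_signal)
    sd_max, sd_min = stdev(max_signal), stdev(min_signal)
    z = 1.0 - 3.0 * (sd_max + sd_min) / abs(mean_max - mean_min)
    s_n = (mean_max - mean_min) / sd_min
    s_b = mean_max / mean_min
    return z, s_n, s_b


# Hypothetical optical density readings in triplicate.
z, s_n, s_b = assay_quality([0.90, 1.00, 1.10], [0.09, 0.10, 0.11])
print(round(z, 3), round(s_n, 1), round(s_b, 1))
```

A Z value close to 1 means a wide separation between the two signal levels relative to their scatter; values at or below zero mean the two distributions overlap.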

Each ligand plate contained a positive control ligand (E2), and each assay was performed in triplicate. The plates with biosensor cells supplemented with ligands were incubated at 34 °C for 16-18 h.

Figure 2.7: Biosensor protein design illustrating the different domain fusions and the boundaries of the ligand binding domains from the different animal species. The species used are human (h), cow (c), zebrafish (zf) and rat (r).

Table 2.1: List of chemicals suggested by ICCVAM and used in our biosensor assay to test biosensors with estrogen receptor β across different species

Compound | Chemical Class | Product Class | Estrogenic Activity
--- | --- | --- | ---
Actinomycin D | Phenoxazone; Lactone; Peptide | Pharmaceutical |
Ammonium perchlorate | Organic acid; Organic salt | Pharmaceutical | Negative
Anastrazole | Nitrile; Triazole | Pharmaceutical | Negative
4- | Steroid, nonphenolic | Hormone |
 | Flavanoid; Flavone; Phenol | Natural product |
Apomorphine | Heterocycle; Quinoline | Pharmaceutical |
Atrazine | Aromatic amine; Triazine; Arylamine | Pesticide |
Bicalutamide | Anilide; Nitrile; Sulfone | Pharmaceutical |
Bisphenol A | Diphenylalkane; Bisphenol; Phenol | Chemical intermediate | **
Bisphenol B | Diphenylalkane; Bisphenol; Phenol | Adhesive, Chemical intermediate, Coatings | **
Butylbenzyl phthalate | Phthalate | Plasticizer | *
2-sec-Butylphenol | Phenol | Pharmaceutical |
CGS 18320B | Nitrile; Imidazole | Metabolic inhibitor |
Clomiphene citrate | Chlorinated triphenylethylene; Benzylidene; Stilbene | Pharmaceutical |
Corticosterone | Steroid, nonphenolic | Pharmaceutical |
 | ; Ketone; Benzopyranone | Natural product | *
4-Cumylphenol | Phenol | Chemical intermediate |
Cycloheximide | Piperidine; Glutaramide | Pharmaceutical | *
Cyproterone acetate | Nitrile; Diphenyl ether; Organochlorine | Pharmaceutical |
 | Flavanoid; Isoflavone; Phenol | Natural product | *
p,p-DDE | Organochlorine; Diphenylalkene | Pesticide metabolite | ***
o,p-DDT | Organochlorine; Diphenylalkene | Pesticide | *****
Dexamethasone | Steroid, nonphenolic | Pharmaceutical |
Dibenzo[a,h]anthracene | Polycyclic aromatic hydrocarbon; Anthracene | None | *
Di-n-butyl phthalate | Phthalate | Plasticizer | **
Diethylhexyl phthalate | Phthalate | Plasticizer | *
Diethylstilbestrol | Stilbene; Benzylidene; Diphenylalkene | Pharmaceutical | *****
5α- | Steroid, nonphenolic | Pharmaceutical | Negative
17α- (E1) | Steroid, phenolic; Estrene | None | **
17β-Estradiol (E2) | Steroid, phenolic; Estrene | Hormone | Positive
 | Steroid, phenolic; Estrene | Pharmaceutical | **
17α-Ethinyl estradiol (EE2) | Steroid, phenolic | Pharmaceutical | ****
Ethyl paraben | Paraben; Organic acid | Pharmaceutical | **
Fadrozole | Imidazole; Nitrile | Pharmaceutical | Negative
 | Heterocycle; Pyrimidine | Pesticide |
Finasteride | Steroid, nonphenolic; Androstene | Pharmaceutical |
Flavone | Flavanoid; Flavone | Natural product | *
Fluoranthene | Polycyclic aromatic hydrocarbon; Fluorene | None | *
Fluoxymestrone | Steroid, nonphenolic | Pharmaceutical |
Flutamide | Amide; Anilide; Nitrobenzene | Pharmaceutical |
 | Flavanoid; Isoflavone; Phenol | Natural product | **
Haloperidol | Butyrophenone; Ketone; Piperazine | Pharmaceutical |
meso- | Diphenylalkane; Bisphenol; Phenol | Pharmaceutical | ****
Hydroxyflutamide | Amide; Anilide; Nitrobenzene | Pharmaceutical, Metabolite |
4-Hydroxytamoxifen | ; Phenol; Benzylidene; Stilbene | Pharmaceutical | ****
ICI 182,780 | Steroid, phenolic | Pharmaceutical |
 | Flavanoid; Flavone; Phenol | Natural product |
Kepone | Organochlorine; Chlorinated bridged cycloalkane | Pesticide |
Ketoconazole | Imidazole; Piperazine | Pharmaceutical |
Linuron | Urea | Pesticide | *
Medroxyprogesterone acetate | Steroid, nonphenolic; Polycyclic hydrocarbon | Pharmaceutical |
p,p-Methoxychlor | Organochlorine; Chlorinated hydrocarbon | Pesticide |
Methyl | Steroid, nonphenolic; Androstene | Pharmaceutical | Negative
Methyltrienolone | Steroid, nonphenolic; Estrene | Pharmaceutical |
Mifepristone | Steroid, nonphenolic; Estrene | Pharmaceutical |
Morin | Flavanoid; Flavone; Phenol | Dye | **
Nilutamide | Heterocycle; Imidazole | Pharmaceutical | *
p-n- | Alkylphenol; Phenol | Chemical intermediate | **
Norethynodrel | Steroid, nonphenolic; Norpregnene | Pharmaceutical |
4-tert-Octylphenol | Alkylphenol; Phenol | Chemical intermediate | **
Oxazepam | Benzodiazepine | Pharmaceutical | Negative
Phenobarbital | Heterocycle; Pyrimidine | Pharmaceutical | Negative
Phenolphthalin | Triphenylmethane; Diphenyalkane carboxylic acid | Analytical reagent |
Pimozide | Piperidine; Benzimidazole | Pharmaceutical |
Procymidone | Organochlorine; Cyclic imide | Pesticide |
Progesterone | Steroid, nonphenolic; Pregnenedione | Pharmaceutical | *
Propylthiouracil | Pyrimidine; Uracil | Pharmaceutical | Negative
Reserpine | Heterocycle; Yohimban | Pharmaceutical |
Sodium azide | Organic salt; Azide | | Negative
 | Steroid, nonphenolic; Pregnene lactone | Pharmaceutical |
 | Triphenylethylene; Benzylidene; Stilbene | Pharmaceutical | ***
Testosterone | Steroid, nonphenolic | Pharmaceutical | Negative
12-O-Tetradecanoylphorbol-13-acetate | Phorbol ester; Terpene | Pharmaceutical | **
L-Thyroxine | Aromatic amino acid | Hormone | Negative
17β-Trenbolone | Steroid, nonphenolic; Estrene | Pharmaceutical |
2,4,5-Trichlorophenoxyacetic acid | Organochlorine; Chlorinated aromatic hydrocarbon | Pesticide |
Vinclozolin | Organochlorine; Cyclic imide; Carbamate | Pesticide |
 | Resorcylic acid lactone; Phenol | Chemical intermediate, Natural product | *

The dose-response curves were plotted for all the compounds that were tested. The test ligands were serially diluted, with a 1:10 dilution between successive data points. The biosensor cells were incubated at 34 °C for about 16 hours. The dose-response curves are shown in Figures 2.8 to 2.25.
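The tenfold dilution series can be sketched as follows. The starting concentration and the number of data points are hypothetical, since they are not stated here; the helper name is likewise an assumption made for illustration.

```python
import math


def serial_dilution(start, factor=10.0, points=8):
    """Concentrations obtained by repeatedly diluting `start` by `factor`."""
    return [start / factor ** i for i in range(points)]


# Hypothetical series starting at 100 (e.g. micromolar), diluted 1:10.
series = serial_dilution(100.0)
log_series = [math.log10(c) for c in series]  # x axis of a dose-response plot
print(series[:4])
```

Because each step is a 1:10 dilution, successive points are spaced exactly one unit apart on a log10 concentration axis.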

Figure 2.8: Dose-response curves for the human biosensor protein encoded by pMIT::ERβ treated with the test compounds EE2, E1 and E2. The EC50 values are listed in Table 2.2.

Figure 2.9: Dose-response curves for the human biosensor protein encoded by pMIT::ERβ treated with the test compounds morin hydrate, hexestrol, genistein and E2. The EC50 values are listed in Table 2.2.

Figure 2.10: Dose-response curves for the human biosensor protein encoded by pMIT::ERβ treated with the test compounds DES, BPA, daidzein and E2. The EC50 values are listed in Table 2.2.

Test Compound | EC50 (μM) | %RPTA | S/N | S/B | Z
--- | --- | --- | --- | --- | ---
Hexestrol | 0.05 | 1024 | 117.8 | 7.93 | 0.85
DES | 0.2 | 232.8 | 596.98 | 9.08 | 0.72
E2 | 0.48 | 100 | 117.62 | 8.87 | 0.92
Genistein | 0.5 | 97.19 | 119.8 | 9.83 | 0.88
EE2 | 0.85 | 56.79 | 153.72 | 7.71 | 0.79
E1 | 2.68 | 18.06 | 102.96 | 5.69 | 0.51
Daidzein | 4.29 | 11.29 | 216.91 | 8.65 | 0.77
Morin hydrate | 18.6 | 2.602 | 93.25 | 6.18 | 0.82
BPA | 20.42 | 2.37 | 189.37 | 3.56 | 0.41

Table 2.2: EC50 values and statistical parameters for human ERβ

Figure 2.11: Dose-response curves for the cow biosensor protein encoded by pMIT::ERβ treated with the test compound dioctyl phthalate. The EC50 values are listed in Table 2.3.

Figure 2.12: Dose-response curves for the cow biosensor protein encoded by pMIT::ERβ treated with the test compounds bisphenol B and E2. The EC50 values are listed in Table 2.3.

Figure 2.13: Dose-response curves for the cow biosensor protein encoded by pMIT::ERβ treated with the test compounds ethyl paraben, zearalenone and E2. The EC50 values are listed in Table 2.3.

Figure 2.14: Dose-response curves for the cow biosensor protein encoded by pMIT::ERβ treated with the test compounds EE2, E1 and E2. The EC50 values are listed in Table 2.3.

Figure 2.15: Dose-response curves for the cow biosensor protein encoded by pMIT::ERβ treated with the test compounds hexestrol, genistein and E2. The EC50 values are listed in Table 2.3.

Figure 2.16: Dose-response curves for the cow biosensor protein encoded by pMIT::ERβ treated with the test compounds DES, BPA, daidzein and E2. The EC50 values are listed in Table 2.3.

Test Compound | EC50 (μM) | %RPTA | S/N | S/B | Z
--- | --- | --- | --- | --- | ---
Hexestrol | 0.06 | 498 | 419.73 | 6.59 | 0.69
E2 | 0.3 | 100 | 271.46 | 9.24 | 0.87
DES | 0.4 | 76.54 | 165.57 | 8.89 | 0.85
Genistein | 0.51 | 58.96 | 706.68 | 10.42 | 0.93
Zearalenone | 0.9 | 33.91 | 80.06 | 3.27 | 0.73
E1 | 1.07 | 28.48 | 123.62 | 7.03 | 0.36
EE2 | 1.46 | 20.9 | 340 | 9.1 | 0.14
Bisphenol B | 1.48 | 20.64 | 89.67 | 5.48 | 0.86
Daidzein | 8.96 | 3.41 | 290.98 | 8.69 | 0.97
Bisphenol A | 10.1 | 3.04 | 149.53 | 3.04 | 0.89
Ethyl paraben | 16.1 | 1.9 | 132.29 | 6.65 | 0.92
Dioctyl phthalate | 74.6 | 0.41 | 13 | 2.27 | -0.82

Table 2.3: EC50 values and statistical parameters for cow ERβ

Figure 2.17: Dose-response curves for the rat biosensor protein encoded by pMIT::ERβ treated with the test compounds bisphenol B and E2. The EC50 values are listed in Table 2.4.

Figure 2.18: Dose-response curves for the rat biosensor protein encoded by pMIT::ERβ treated with the test compounds nilutamide, zearalenone and E2. The EC50 values are listed in Table 2.4.

Figure 2.19: Dose-response curves for the rat biosensor protein encoded by pMIT::ERβ treated with the test compounds EE2, E1 and E2. The EC50 values are listed in Table 2.4.

Figure 2.20: Dose-response curves for the rat biosensor protein encoded by pMIT::ERβ treated with the test compounds hexestrol, genistein and E2. The EC50 values are listed in Table 2.4.

Figure 2.21: Dose-response curves for the rat biosensor protein encoded by pMIT::ERβ treated with the test compounds DES, BPA, daidzein and E2. The EC50 values are listed in Table 2.4.

Test Compound | EC50 (μM) | %RPTA | S/N | S/B | Z
--- | --- | --- | --- | --- | ---
Hexestrol | 0.06 | 465.2 | 78.99 | 3.64 | 0.87
DES | 0.12 | 220.6 | 40.5 | 4.12 | 0.74
E2 | 0.28 | 100 | 47.61 | 5.28 | 0.86
Zearalenone | 0.54 | 51.76 | 12.62 | 1.65 | -0.2
EE2 | 0.74 | 37.57 | 81.43 | 4.41 | 0.92
Genistein | 0.93 | 29.97 | 253.57 | 5.32 | 0.94
Bisphenol B | 1.2 | 23.26 | 21.42 | 2.54 | 0.42
E1 | 5 | 5.6 | 60.11 | 2.84 | 0.61
Daidzein | 8.23 | 3.37 | 32.05 | 4.03 | 0.72
Bisphenol A | 14.03 | 2 | 13.46 | 2.35 | 0.57
Nilutamide | 32100* | 0 | 25.73 | 3.22 | 0.07

Table 2.4: EC50 values and statistical parameters for rat ERβ. * indicates that the EC50 value could not be determined.

Figure 2.22: Dose-response curves for the zebrafish biosensor protein encoded by pMIT::ERβ treated with the test compounds nilutamide, zearalenone and E2. The EC50 values are listed in Table 2.5.

Figure 2.23: Dose-response curves for the zebrafish biosensor protein encoded by pMIT::ERβ treated with the test compounds EE2, E1 and E2. The EC50 values are listed in Table 2.5.

Test Compound | EC50 (μM) | %RPTA | S/N | S/B | Z
--- | --- | --- | --- | --- | ---
DES | 0.53 | 138.8 | 52.32 | 4.67 | 0.58
E2 | 0.73 | 100 | 87.44 | 6.51 | 0.76
Hexestrol | 1.05 | 69.7 | 38.94 | 3.46 | 0.81
EE2 | 1.71 | 43.04 | 61.18 | 4.38 | 0.74
Zearalenone | 3.79 | 19.41 | 28.93 | 2.25 | 0.6
Genistein | 9.39 | 7.83 | 71.18 | 6.35 | 0.81
E1 | 11.54 | 6.38 | 66.27 | 3.22 | 0.69
Daidzein | 19.02 | 3.87 | 13.84 | 2.22 | 0.49
Nilutamide | 2256* | 0.03 | 202 | 3.97 | 0.82

Table 2.5: EC50 values and statistical parameters for zebrafish ERβ. * indicates that the EC50 value could not be determined.

Figure 2.24: Dose-response curves for the zebrafish biosensor protein encoded by pMIT::ERβ treated with the test compounds hexestrol, genistein and E2. The EC50 values are listed in Table 2.5.

Figure 2.25: Dose-response curves for the zebrafish biosensor protein encoded by pMIT::ERβ treated with the test compounds DES, daidzein and E2. The EC50 values are listed in Table 2.5.

The EC50 values were obtained from normalized data, with the bottom of the curve set to the lowest value and the top of the curve set to the highest value. Each data point was collected in triplicate. The cell growth time for each assay was about 16-20 h.
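One simple way to read the normalization described above is min-max scaling, sketched below. This is an assumption about the procedure rather than a description of the exact steps used in GraphPad Prism, and the readings are hypothetical.

```python
def normalize(readings):
    """Scale readings so the lowest value maps to 0 and the highest to 1."""
    lowest, highest = min(readings), max(readings)
    if highest == lowest:
        raise ValueError("cannot normalize a flat response")
    return [(r - lowest) / (highest - lowest) for r in readings]


# Hypothetical optical density readings across a dilution series.
normalized = normalize([0.2, 0.35, 0.5, 0.8])
print(normalized)
```

After this scaling, the EC50 is the concentration at which the fitted curve crosses 0.5, which makes curves from different plates directly comparable.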

2.8 Discussion

Biosensor cells provide a novel method for studying the agonistic effects of ligands on the estrogen receptor β (ERβ) ligand binding domain across multiple species: human, zebrafish, cow and rat. These species were chosen because of their extensive use in animal model studies. The strength of this method lies in the ability to swap in ligand binding domains (LBDs) from different species to create a new biosensor. The modular design of the biosensor also makes it possible to study the function of individual domains in detail. The TS reporter system provides a reliable, sensitive and rapid method of detection. Compounds identified as agonists by our bacterial biosensor system have been confirmed in subsequent human and animal cell assays, with generally good correlation (Skretas and Wood, 2005a; Skretas et al., 2007; Hartman et al., 2009). The animal biosensors constructed here were able to identify estrogen agonists with a high degree of certainty (Z > 0.5); the signal-to-background ratio (S/B) was in the range of 2 to 9, while the signal-to-noise ratio (S/N) was much higher. The RPTA values for each test chemical were calculated. For the human version, the potencies of the test compounds were in the following order: Hexestrol > DES > E2 > Genistein > EE2 > E1 > Daidzein > Morin hydrate > BPA. This trend correlates with previously published results from other researchers using different assays, such as yeast and mammalian cell assays (Bovee et al., 2004; Escande et al., 2006; Chu et al., 2009). Hexestrol, a non-steroidal estrogen, is known to be more potent than E2, and our results show a similar trend (Spradau and Katzenellenbogen, 1998). Xenoestrogens such as hexestrol, DES and EE2 have been reported to have strong estrogenic activities in yeast assays, which correlates with our results (Bovee et al., 2004). In zebrafish in vivo and in vitro assays, cyp19a1b mRNA expression, aromatase B protein expression and brain aromatase activity were reported to increase upon exposure to the xenoestrogens just mentioned (Le Page et al., 2006; Vosges et al., 2010). In addition, a combination of chemical samples led to a concentration-additive effect (Le Page et al., 2006). Genistein has been confirmed, both in vitro and in vivo, to increase the growth rate of ER-expressing breast cancer cells, and our system detected it accurately as an estrogen agonist (Chen and Wong, 2004; Ju et al., 2006; Yang et al., 2010). In a CXCL12 secretion model in human MCF-7 cells, E2, EE2, genistein and BPA were reported to have potencies of 1e-11 M, 1e-12 M, 1e-8 M and 1e-6 M respectively, compared with values of 4.8e-7 M, 8.52e-7 M, 5e-7 M and 2e-5 M respectively in our assays (Habauzit et al., 2010).

The order of estrogenic potencies of the test chemicals with the cow estrogen receptor is as follows: Hexestrol > E2 > DES > Genistein > E1 > EE2 > Daidzein > BPA. Similarly, the order for rat is Hexestrol > DES > E2 > EE2 > Genistein > E1 > Daidzein > BPA, and for zebrafish it is DES > E2 > Hexestrol > EE2 > Genistein > E1 > Daidzein. Hexestrol was the most potent compound for three of the four species; the exception was zebrafish, for which DES was the most potent.
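The potency orders quoted above follow directly from sorting the EC50 values in ascending order. As a small illustration, the sketch below ranks the human ERβ values from Table 2.2 and converts them from micromolar to molar for comparison with literature values; the variable names are assumptions made for this example.

```python
# EC50 values in micromolar for the human ERβ biosensor (Table 2.2).
ec50_um = {
    "Hexestrol": 0.05,
    "DES": 0.2,
    "E2": 0.48,
    "Genistein": 0.5,
    "EE2": 0.85,
    "E1": 2.68,
    "Daidzein": 4.29,
    "Morin hydrate": 18.6,
    "BPA": 20.42,
}

# A lower EC50 means a more potent compound, so sort in ascending order.
ranked = sorted(ec50_um, key=ec50_um.get)
print(" > ".join(ranked))

# Convert micromolar to molar for comparison with published potencies.
ec50_molar = {name: value * 1e-6 for name, value in ec50_um.items()}
print(f"{ec50_molar['E2']:.1e} M")
```

The same approach applies to the cow, rat and zebrafish tables; only the dictionary of EC50 values changes.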

There are several studies of endocrine disruptors in other fish species. E2 and EE2 have been reported to modulate the inflammatory response in gilthead seabream (Sparus aurata) through activation of endothelial cells (Liarte et al., 2011). EE2 has been reported to disrupt sexual selection in sand gobies (Pomatoschistus minutus) and to reduce the competitive reproductive fitness of male guppies (Poecilia reticulata) (Kristensen et al., 2005; Saaristo et al., 2009). In a zebrafish study, the potency of E2 was found to be similar to that of EE2, while DES was twofold more potent than E2 and E1 was tenfold less potent than E2 (Escande et al., 2006).

The direct impact of these compounds is difficult to assess accurately, since most EDCs have a cumulative impact. In addition, the potencies measured using different assays vary considerably, by up to a factor of about 1000. It is therefore difficult to measure the exact impact of EDCs on the human body (Dobbins et al., 2008). The half-life of EDCs can also be significantly longer than expected because we are exposed to these chemicals on a daily basis (Diamanti-Kandarakis et al., 2009). Studying the impact of weakly binding chemicals can be challenging, especially when they are combined with a highly potent EDC.

2.9 Conclusions

In this chapter we demonstrated how a new biosensor can be constructed by simply swapping the ligand binding domain of an existing biosensor. As mentioned earlier, the biosensor system relies on conformational changes in the LBD, which activate TS and lead to a growth phenotype. This method differs from conventional transactivation assays, which require coactivators and a separate detection system for reporter genes. The results presented here are consistent and reproducible. Biosensors with receptors from different species will be a useful tool for screening candidate chemicals before moving to animal models. In the next chapter, we shall see how this system can be extended beyond the nuclear hormone receptor family of proteins.

Chapter 3: Neurexin Biosensor

The previous chapter described the construction of a protein-based bacterial biosensor that uses ligand binding domains from nuclear hormone receptors to detect potential ligands. In this chapter we extend this study to a different class of proteins, which are involved in the formation of synapses and are known to play a role in several cognitive disorders including autism, schizophrenia and Alzheimer's disease.

I have to thank Drs. Tapas K. Mal and Chunhua Yuan for their extensive help during the whole project.

3.1 Neurexin

The human nervous system consists of billions of neurons bridged by trillions of connections (Washbourne et al., 2004). To ensure that these connections are formed correctly, an intricate, spatially and temporally coordinated multi-step process must take place (O’Donnell and Nolan, 2011). This involves the formation of growth cones and their arrival at specific target cells.

The neurons then form synapses, which are highly specialized intercellular junctions dedicated to the transfer of information from a neuron to a target cell, usually another neuron (Glees and Meller, 1964). Neurexins (Nrx) and neuroligins are the best characterized synaptic cell adhesion molecules (CAMs) and are the only ones for which a specific

Figure 3.1: Schematic of a synaptic junction. The presynaptic cell adhesion molecule neurexin binds to its postsynaptic partner neuroligin (Sudhof, 2008).

synaptic function has been established (Washbourne et al., 2004). Nrx is located in the presynaptic region and neuroligin in the postsynaptic region (Ushkaryov et al., 1992; Song et al., 1999), and together they form trans-synaptic complexes. Nrxs were originally discovered as receptors for α-latrotoxin, a component of black widow spider (Latrodectus) venom (Sugita et al., 1999), which binds to presynaptic Ca2+ channels and other proteins and triggers massive neurotransmitter release (Ushkaryov et al., 1992). The Nrx family has three members in mammals, Nrx 1-3 (Ichtchenko et al., 1996). Each member has two different promoters, giving rise to a longer α subtype and a shorter β subtype (Ushkaryov et al., 1992). α-Nrxs consist of six LNS (Laminin, Nrx, Sex-hormone-binding globulin) domains interspersed with three EGF (Epidermal Growth Factor) domains. Nrxs undergo extensive splicing to generate a large number of splice variants (Sudhof, 2008). α-Nrxs have 5 splice sites while β-Nrxs have only 2 (figure 3.2) (Sudhof, 2008), which results in almost 4000 possible splice forms originating from the three genes (Rowen et al., 2002; Tabuchi and Sudhof, 2002).

Figure 3.2: Schematic of neurexin subtypes. The alternative splice sites are also shown: α-Nrx has 5 sites and the shorter β-Nrx has 2 sites.

The structure of the β-Nrx gene with and without splice inserts at splice site 4 has been reported (Koehnke et al., 2008). The three known extracellular binding partners for Nrx are neuroligins, dystroglycan and neurexophilins (Ichtchenko et al., 1995; Petrenko et al., 1996; Sugita et al., 2001). These proteins are involved in the formation of synapses, and interference with this system leads to defects in synaptic transmission and has been associated with cognitive disorders (Blasi et al., 2006; Kim et al., 2008; Kirov et al., 2008; Ching et al., 2010; Gauthier et al., 2011). Nrx is a potential target for drug discovery for several cognitive disorders, and as such there is considerable interest in finding a ligand for Nrx.

As discussed in the previous chapter, we have developed a protein-based bacterial biosensor with a modular design. This biosensor is rapid, inexpensive, and sensitive to low concentrations of a ligand. With 16 to 20 h of incubation, the biosensor cells display a growth phenotype in response to binding of a small molecule. The modules in the biosensor design include the ligand binding domain of the receptor protein, maltose binding protein, and a reporter protein such as thymidylate synthase (TS) (Skretas and Wood, 2005a). Since Nrx is an important target for many cognitive disorders, we have developed a Nrx biosensor that can detect small molecules that bind to the Nrx protein. From our initial tests we found that the chemical rosiglitazone binds strongly to Nrx. Rosiglitazone (trade name Avandia, GlaxoSmithKline) is an anti-diabetic drug belonging to the thiazolidinedione class. The drug is a known agonist of peroxisome proliferator-activated receptor γ and acts as an insulin sensitizer (Filz, 2000). It is reported that while it restores alveolar and pulmonary vascular development in mice

(Lee et al., 2014), it also poses a risk of cancer development (Bosetti et al., 2013) and cardiovascular arrest (Bach et al., 2013). A number of small molecules were screened using the protein-based bacterial biosensor (Skretas and Wood, 2005a), and it was found that rosiglitazone binds to Nrx with an EC50 of 1.8 μM. We report here the full characterization of this binding using biochemical (biosensor assay) and biophysical (NMR) methods.

3.2 Plasmid Constructions

The Nrx gene (Gly870 to Gly1047, NCBI reference sequence XP0052744621) was amplified from the vector pBluescript II SK+ (BC150275), which was purchased from Open Biosystems (ThermoScientific). The gene was then inserted into the mini-intein domain in a pMal-c2 vector containing the T4 td gene encoding thymidylate synthase (Skretas and Wood, 2005a) using the restriction sites AgeI and XhoI to make the pMIT::Nrx2b plasmid. A schematic representation of the pMIT::Nrx2b plasmid is shown in figure 3.3. The pET::Nrx2b His6 plasmid was prepared by inserting Nrx2b with a six-histidine tag into the pET vector using the restriction sites BamHI and XhoI.

Figure 3.3: Schematic representation of the Nrx biosensor. Notations are as follows: Ptac* = mutant tac promoter associated with hormone-dependent phenotypes; MBD = Maltose Binding Domain; N-Mtu = first 110 residues of the Mycobacterium tuberculosis RecA intein (Mtu RecA intein); β Nrx = Nrx 2 β subtype; C-Mtu = last 48 residues of the Mtu RecA intein; TS = bacteriophage T4 thymidylate synthase enzyme.

3.3 Biosensor Assay

The plasmid pMIT::Nrx2b was transformed into the TS-deficient E. coli strain D1210ΔthyA::KanR [F- Δ(gpt-proA)62 leuB6 supE44 ara-14 galK2 lacY1 Δ(mcrC-mrr) rpsL20 (Strr) xyl-5 mtl-1 recA13 lacIq] and selected on Luria-Bertani (LB) agar supplemented with 100 μg/mL ampicillin and 50 μg/mL thymine. A single clone was inoculated into 5 mL of LB medium with 100 μg/mL ampicillin and 50 μg/mL thymine and cultivated at 37 °C until the optical density at 600 nm (OD600) reached 1.3. The culture was diluted 1:200 into thymine-free medium (-Thy) and dispensed into a 96-well plate at 200 μL per well. Either 2 μL of ligand dissolved in DMSO or DMSO alone was added to each well. The plates were incubated in a shaker at 34 °C for 16 hours. The OD600 was read using a BioTek Synergy2 UV-Vis plate reader.
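
The volumes in this protocol fix both the vehicle content of each well and the dilution of each ligand stock. The following is a minimal sketch of that arithmetic in Python. The function names and the example target concentration are illustrative assumptions and are not part of the original protocol.

```python
# Dilution arithmetic for the 96-well biosensor assay (illustrative sketch).
# Volumes are taken from the protocol: 200 uL of culture plus 2 uL of ligand in DMSO.

CULTURE_UL = 200.0  # culture volume dispensed per well
LIGAND_UL = 2.0     # volume of ligand stock (or DMSO alone) added per well


def dmso_percent(ligand_ul=LIGAND_UL, culture_ul=CULTURE_UL):
    """Final DMSO content of a well, in % (v/v)."""
    return 100.0 * ligand_ul / (culture_ul + ligand_ul)


def final_concentration(stock_um, ligand_ul=LIGAND_UL, culture_ul=CULTURE_UL):
    """Ligand concentration in the well (uM) for a given DMSO stock (uM)."""
    return stock_um * ligand_ul / (culture_ul + ligand_ul)


def stock_for_target(target_um, ligand_ul=LIGAND_UL, culture_ul=CULTURE_UL):
    """DMSO stock concentration (uM) needed to reach target_um in the well."""
    return target_um * (culture_ul + ligand_ul) / ligand_ul


if __name__ == "__main__":
    print(f"DMSO in well: {dmso_percent():.2f} % (v/v)")
    # Hypothetical target concentration of 6.25 uM in the well
    print(f"Stock needed: {stock_for_target(6.25):.2f} uM")
```

With these volumes the vehicle is close to 1% (v/v) DMSO and each stock is diluted about 101-fold in the well.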

Figure 3.4: Schematic representation of the high-throughput screening method as described in Materials and Methods. Adapted from Gierach et al. (2013)

3.4 15N-Labeled Protein Expression and Purification

The pET::Nrx2b His6 plasmid was transformed into E. coli BLR (DE3) (Novagen) and selected on LB agar supplemented with 100 μg/mL ampicillin. A single clone was inoculated into 5 mL of LB medium with 100 μg/mL ampicillin and grown overnight at 37 °C as a seed culture. The overnight seed cultures were subsequently used to inoculate expression cultures. For expression, 1 L of LB medium was inoculated with seed culture at a ratio of 1:50 and incubated at 37 °C until the optical density at 600 nm (OD600) was 0.6-0.8. The culture was gently spun down to collect the pellet, which was re-suspended in 1 L of M9 medium supplemented with 100 μg/mL ampicillin and isopropyl β-D-1-thiogalactopyranoside (IPTG) at a final concentration of 1 mM. The recipe used for 1 L of M9 medium is as follows: Na2HPO4 (6.5 g), KH2PO4 (3.0 g), NaCl (0.5 g), 15NH4Cl (1 g), glucose (4 g), thiamin (5-10 mg), biotin (5-10 mg), CaCl2 (14.7 mg), and MgSO4 (120.4 mg) in deionized water. The expression temperature was lowered to 16 °C and the culture was incubated for 20 h. The cells were collected by spinning down the culture at 5000 g for 10 min at 4 °C and re-suspended in 50 mL of PBS buffer (NaCl 137 mM, KCl 2.7 mM, Na2HPO4 10 mM, KH2PO4 1.8 mM) + 500 mM NaCl + 10 mM imidazole at pH 7.4. The re-suspended cells were lysed by sonication and centrifuged at 14,000 g and 4 °C for 15 min. The lysate was loaded onto a column packed with 1 mL of Ni-NTA resin (New England Biolabs) in Econo-Pac chromatography columns (Bio-Rad Laboratories, Hercules, CA) and washed with 10 mL of wash buffer (PBS buffer + 500 mM NaCl + 20 mM imidazole at pH 7.4). Elution buffer (PBS buffer + 500 mM NaCl + 250 mM imidazole at pH 7.4) was used to elute the protein. The final protein concentration was 10 mg/mL. The protein was concentrated using a Vivaspin column (Sartorius, Bohemia, NY) by centrifuging at 4000 g and 4 °C.
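
The M9 recipe in this section is stated per litre, so it can be scaled to other culture volumes. The sketch below illustrates this in Python. It reads the first component as Na2HPO4 and the nitrogen source as 15N-labeled NH4Cl. The function names are hypothetical, and the molar mass used in the check (anhydrous MgSO4, about 120.37 g/mol) is an assumption, since the hydration state of the salt is not stated in the text.

```python
# Scaling the M9 recipe listed in this section (amounts per 1 L of medium).
# Single-valued components are in grams; thiamin and biotin are ranges in mg.

M9_GRAMS_PER_LITER = {
    "Na2HPO4": 6.5,
    "KH2PO4": 3.0,
    "NaCl": 0.5,
    "15NH4Cl": 1.0,
    "glucose": 4.0,
    "CaCl2": 0.0147,
    "MgSO4": 0.1204,
}
M9_MG_RANGE_PER_LITER = {"thiamin": (5.0, 10.0), "biotin": (5.0, 10.0)}


def scale_recipe(volume_l):
    """Return (grams, mg_ranges) for volume_l litres of M9 medium."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    grams = {name: amount * volume_l for name, amount in M9_GRAMS_PER_LITER.items()}
    mg_ranges = {
        name: (low * volume_l, high * volume_l)
        for name, (low, high) in M9_MG_RANGE_PER_LITER.items()
    }
    return grams, mg_ranges


def millimolar(grams_per_liter, molar_mass_g_per_mol):
    """Convert a mass concentration (g/L) to mM for a given molar mass."""
    return 1000.0 * grams_per_liter / molar_mass_g_per_mol


if __name__ == "__main__":
    grams, mg_ranges = scale_recipe(0.5)  # hypothetical 0.5 L culture
    for name, amount in grams.items():
        print(f"{name}: {amount:.4f} g")
    for name, (low, high) in mg_ranges.items():
        print(f"{name}: {low:.1f}-{high:.1f} mg")
    # Assumes anhydrous MgSO4; the hydration state is not given in the text.
    print(f"MgSO4: {millimolar(0.1204, 120.37):.2f} mM")
```

Under the anhydrous assumption, 120.4 mg/L of MgSO4 corresponds to about 1 mM.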

3.5 Sample for NMR experiments

The NMR sample (550 μL) was prepared in 10 mM phosphate buffer containing 50 mM NaCl at pH 6.5 with 10% (v/v) D2O. All NMR experiments were performed at 25 °C on a Bruker DRX 600 or DRX 800 spectrometer equipped with Z-axis gradient triple-resonance inverse TXI cryoprobes.

3.6 Saturation Transfer Difference (STD) NMR

STD NMR is an established technique for studying interactions between proteins and small molecules (Lopez-Cebral et al., 2013; Antonini et al., 2014; Adam and Perler, 2002; Sivertsen et al., 2014). The sample for STD NMR contained 2 μM Nrx along with 0.2 mM rosiglitazone. Data were collected with and without saturation of the protein, in interleaved fashion, using a train of selective 50 ms Gaussian pulses separated by 1 ms and applied for 3 s: in the saturated spectrum (Isat), the protein was irradiated on resonance at -0.5 ppm, and in the off-resonance spectrum (I0), an identical train of Gaussian pulses was applied at -20 ppm. Saturation is transferred through the protein by spin diffusion. A ligand that is bound receives the saturation from the protein. Ligand protons that are in close contact with the protein (Ha) (as shown in figure 3.5) receive more saturation than those that are fully in solvent (Hc). The difference between the off-resonance and on-resonance spectra gives the STD spectrum, which shows signals only from the bound ligand protons. The saturation difference spectrum (Istd) was obtained by subtracting Isat from I0 (Istd = I0 - Isat).
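
The subtraction described above, together with the ligand-to-protein ratio of the sample, can be written out as a short sketch. This is only an illustration in Python: the function names are hypothetical and the intensity values are toy numbers, not measured data.

```python
# STD NMR bookkeeping (illustrative sketch with toy numbers).


def std_difference(i_off, i_sat):
    """Point-by-point difference spectrum: Istd = I0 - Isat."""
    if len(i_off) != len(i_sat):
        raise ValueError("spectra must have the same number of points")
    return [off - sat for off, sat in zip(i_off, i_sat)]


def ligand_excess(ligand_mm, protein_um):
    """Molar excess of ligand over protein (ligand in mM, protein in uM)."""
    return ligand_mm * 1000.0 / protein_um


if __name__ == "__main__":
    i_off = [1.00, 0.80, 0.50]  # hypothetical off-resonance intensities (I0)
    i_sat = [1.00, 0.60, 0.50]  # hypothetical on-resonance intensities (Isat)
    print(std_difference(i_off, i_sat))
    # Sample composition from the text: 0.2 mM rosiglitazone, 2 uM Nrx
    print(ligand_excess(0.2, 2.0))
```

Only points where the on-resonance intensity drops below the off-resonance intensity survive the subtraction, which is why the difference spectrum reports on ligand protons that received saturation from the protein.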

Figure 3.5: Scheme showing the principle of the STD NMR method. The target protein is saturated with a cascade of selective Gaussian pulses. Saturation is transferred through the protein by spin diffusion. A ligand that is bound receives the saturation from the protein. Ligand protons that are in close contact with the protein (Ha) receive more saturation than those that are fully in solvent (Hc). The difference between the on-resonance and off-resonance spectra gives the STD spectrum, showing signals only from the bound ligand protons. Figure adapted from Packer and Karlsson (2009).

3.7 Heteronuclear single quantum coherence (HSQC)

Two-dimensional (2D) 1H-15N HSQC experiments were carried out on 15N-labeled Nrx with and without rosiglitazone. 1H-15N HSQC data were recorded with 128 x 512 complex points in t1 and t2, respectively. Spectral widths were 100-140 ppm and 6-11 ppm for the 15N and 1H dimensions, respectively. The data were processed and analyzed using Bruker's TopSpin software.

3.8 Results and Discussion

Protein-based bacterial biosensors have the unique advantage of being a rapid, sensitive and inexpensive tool for screening ligands. This tool has the potential to help researchers in drug discovery. We have developed a sensor that can detect ligands, both synthetic and natural, with affinities down to the μM scale. An important advantage of our system is the ability to swap the ligand binding domain with another to easily create a new biosensor. The pMIT::Nrx2b biosensor cells were used for the biosensor assay and tested with several test compounds. We have previously developed biosensors using the hormone receptors estrogen receptor α and β, thyroid receptor α and β, and peroxisome proliferator-activated receptor (PPAR) γ (Skretas and Wood, 2005a; ?; ?; Gierach et al., 2013). We used a thymidylate synthase (TS) genetic selection system in which a ligand binding event translates into activation of TS to produce thymidine, which is a precursor molecule for thymine. The TS-deficient E. coli cells mentioned in the methods can grow in a thymine-free medium during a positive selection. We can observe a dose-dependent growth phenotype for the E. coli cells.

As mentioned earlier, Nrx is a potential target for several neurodevelopmental disorders. Nrx also regulates visual function by mediating retinoid transport (Tian et al., 2013). Therefore, we decided to construct a Nrx protein biosensor to understand the binding of Nrx to small molecules. pMIT::Nrx2b biosensor cells were generated and screened against our chemical library to check whether any compound interacts with Nrx. To the best of our knowledge, no small-molecule ligand that interacts with Nrx has been reported in the literature, apart from the black widow spider venom α-latrotoxin. From the biosensor assay, we observed that the PPARγ agonist rosiglitazone binds strongly to Nrx; a dose response curve gave an EC50 of 1.8 μM. Rosiglitazone, an FDA-approved anti-diabetic drug, is currently on the market and is also under clinical trial for treatment of Alzheimer's disease (Miller et al., 2011).

Figure 3.6: Dose response curve for pMIT::Nrx2b with test compounds rosiglitazone and E2. The EC50 for rosiglitazone is 1.8 μM.
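
To make the meaning of the EC50 concrete, the sketch below evaluates a Hill-type dose response model in Python. The model form, the Hill coefficient of 1, and the normalized response range are assumptions made for illustration; the text does not state which model was fitted to the data in Figure 3.6. Only the EC50 value of 1.8 μM is taken from the text.

```python
# Hill-type dose response model (illustrative; the fitting model used for
# Figure 3.6 is not stated in the text).


def hill_response(conc_um, ec50_um, bottom=0.0, top=1.0, hill=1.0):
    """Response at conc_um for a sigmoidal curve with the given EC50."""
    if conc_um < 0 or ec50_um <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    if conc_um == 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50_um / conc_um) ** hill)


def conc_for_fraction(fraction, ec50_um, hill=1.0):
    """Concentration (uM) giving a stated fraction (0-1) of the full response."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ec50_um * (fraction / (1.0 - fraction)) ** (1.0 / hill)


if __name__ == "__main__":
    ec50 = 1.8  # uM, value reported for rosiglitazone in the text
    for conc in (0.1, 1.8, 10.0, 100.0):
        print(f"{conc:6.1f} uM -> {hill_response(conc, ec50):.3f}")
    print(conc_for_fraction(0.1, ec50), conc_for_fraction(0.9, ec50))
```

By construction, the response at a ligand concentration equal to the EC50 is halfway between the bottom and top of the curve.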

To confirm the interaction between rosiglitazone and Nrx, we employed solution NMR spectroscopy. We performed an STD NMR experiment (refer to methods and materials), which is widely used to test binding between ligands and receptors. In the STD NMR experiment, two data sets were collected in interleaved fashion: the first was recorded on resonance, where Nrx signals were selectively saturated at -0.5 ppm (no signals from rosiglitazone were observed in this region) with a train of 50 ms Gaussian pulses separated by 1 ms for 2 s, and the second was recorded by applying the same selective Gaussian pulses in an off-resonance region (-20 ppm, where no signals from Nrx or rosiglitazone were observed). The 1D 1H off-resonance spectrum of rosiglitazone in the presence of 2 μM Nrx is shown in Figure 3.7a. Protein signals were not visible because rosiglitazone was present in 100-fold excess. Figure 3.7b shows the STD spectrum of rosiglitazone in the presence of 2 μM Nrx. Only the signals of rosiglitazone that received saturation transfer from the protein in the on-resonance experiment, via spin diffusion through the nuclear Overhauser effect (NOE), are observed. This experiment clearly demonstrates that rosiglitazone interacts with Nrx.

Figure 3.7: STD NMR spectrum (B) and 1H NMR reference spectrum (A) for Nrx in the presence of rosiglitazone (200 μM). The spectra were acquired at 600 MHz and 298 K with 16 scans each for the difference and the reference spectrum. The signals in the difference spectrum are from rosiglitazone, unambiguously indicating binding of this ligand to the Nrx protein.

However, the STD NMR experiment does not exclude non-specific binding of rosiglitazone to Nrx. Therefore, we tested the interaction between rosiglitazone and Nrx using a two-dimensional (2D) HSQC experiment, in which the protein signals are observed. To perform this experiment, we recombinantly expressed 15N-labeled Nrx using M9 media (refer to methods and materials) with 15N-labeled NH4Cl as the sole 15N source. Two HSQC experiments, with and without the ligand rosiglitazone, were performed (Figure 3.8). Figure 3.8A shows two overlaid spectra of 15N-labeled Nrx with (blue, bound state) and without (red, apo state) rosiglitazone. Only a limited number of 1H-15N cross peaks in the HSQC spectra are affected, which supports the selective nature of the binding.

Figure 3.8: An overlay of HSQC spectra of β-Nrx in apo form and with rosiglitazone. We can clearly see the shifting of peaks after the addition of the ligand.
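
One common way to quantify which cross peaks move between two HSQC spectra is a combined 1H/15N chemical shift difference. The sketch below shows such a calculation in Python. It is not the analysis performed in this work: the weighting factor of 0.14 for the 15N dimension is an assumed convention, the threshold is arbitrary, and the peak identifiers and shifts are hypothetical, since the peaks have not been assigned.

```python
# Combined chemical shift perturbation between apo and bound HSQC peak lists.
# Illustrative only: weighting factor, threshold and peak lists are assumptions.
import math

N15_SCALE = 0.14  # assumed weighting of 15N shifts relative to 1H shifts


def combined_shift(d_h_ppm, d_n_ppm, n_scale=N15_SCALE):
    """Combined 1H/15N shift difference in ppm."""
    return math.sqrt(d_h_ppm ** 2 + (n_scale * d_n_ppm) ** 2)


def perturbed_peaks(apo, bound, threshold_ppm):
    """Return ids of peaks whose combined shift exceeds threshold_ppm.

    apo and bound map a peak id to a (1H ppm, 15N ppm) tuple.
    Peaks missing from the bound list are skipped.
    """
    hits = []
    for peak_id in sorted(apo):
        if peak_id not in bound:
            continue
        h0, n0 = apo[peak_id]
        h1, n1 = bound[peak_id]
        if combined_shift(h1 - h0, n1 - n0) > threshold_ppm:
            hits.append(peak_id)
    return hits


if __name__ == "__main__":
    # Hypothetical, unassigned peaks identified by arbitrary labels
    apo = {"p1": (8.20, 120.0), "p2": (7.95, 115.3), "p3": (9.10, 128.4)}
    bound = {"p1": (8.20, 120.0), "p2": (8.01, 115.9), "p3": (9.11, 128.4)}
    print(perturbed_peaks(apo, bound, threshold_ppm=0.05))
```

A list of this kind would become informative about the binding site only after the peaks are assigned to residues.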

3.9 Conclusions

To the best of our knowledge, this is the first time that Nrx has been reported to bind to a synthetic small molecule. Structural studies can lead to better insights into the biophysical interactions between the protein and the ligand. Peak assignment by labeling the carbon atoms with 13C will allow us to determine which amino acids are involved in this interaction. Using this structural knowledge, we can improve the binding specificity and affinity of the drug. We can extend this experiment to screen more chemicals using a high-throughput screening assay which has already been developed in our lab. Due to its high sensitivity, speed, and cost effectiveness, the protein-based biosensor assay has an advantage over other biosensing tools. This result could potentially open new avenues for the development of new therapeutics for neurodevelopmental disorders.

Chapter 4: Mechanism of Biosensors

In the previous chapters we introduced biosensors and their applications in detecting therapeutic compounds, but did not discuss in detail how the biosensors work. We hypothesized that the growth phenotype arises from ligand-induced cleaving of inteins, but no experiments had been performed to test the hypothesis. In this chapter we discuss the experiments that were performed to determine the mechanism. We used techniques such as domain deletion, site-directed mutagenesis and linker engineering to construct several biosensors to test our hypothesis. I would like to thank Dr. Jingjing Li (JJL), who contributed significantly to this work. Below I list the constructs that were generated and the person who carried out the work (JB denotes the author). The constructs were designed to study the roles of maltose binding protein (MBP), the intein, the linkers between them, and the linkers between the intein and the thymidylate synthase (TS), among others. Some of the work of Dr. Jingjing Li is presented here because, without those results, the results I generated would not give the complete picture. The constructs generated are as follows:

1. Sensors without the MBP domain (constructed and tested by JJL)

2. Sensors without the intein domain (constructed and tested by JJL)

3. Replacing the intein with V437L mutated version (constructed and tested by JJL)

4. Sensors without both MBP and intein (constructed and tested by JJL)

5. Sensors with truncated linker between MBP and intein (constructed by JJL and tested by JB)

6. Sensors with Glycine-Serine linker between intein and the TS (constructed and tested by JB)

7. Sensors with non-cleaving N440 mutations in intein (constructed and tested by JB)

8. Sensors with non-cleaving N440A mutation and the Glycine-Serine linker between the intein and the TS (constructed and tested by JB)

4.1 Results

The prototype constructs are pMIT:ERβ, pMIT:TRβ and pMIT:PPARγ (Skretas and Wood, 2005a; Li et al., 2011), which were tested with their native ligands E2, T3, and rosiglitazone (rosig), respectively. They all displayed a ligand-dependent growth phenotype at 34 °C. The constructs without MBP were made to study the role played by MBP in generating the ligand-dependent growth phenotype. None of these constructs, pIT:ERβ, pIT:TRβ and pIT:PPARγ, showed any growth phenotype at either 34 °C or a less stringent 26 °C (Wood et al., 1999) in the presence of their native ligands (figure 4.2). This suggests that MBP is required for

Figure 4.1: Schematic of the prototype biosensors and mutant constructs. Ptac* refers to a modified Ptac promoter isolated in our previous work (Skretas et al., 2007); MBP is maltose binding protein; N-Int is the first 110 residues of the Mycobacterium tuberculosis RecA intein with a Cys to Ala mutation at the first intein residue; C-Int is the last 58 residues of the intein; TS is the bacteriophage T4 thymidylate synthase enzyme. ERβ, TRβ and PPARγ refer to the LBDs of the respective NHRs, and are inserted between N-Int and C-Int, resulting in pMIT:ERβ, pMIT:TRβ, and pMIT:PPARγ, respectively. The lines between MBP and intein and between TS and intein represent the linkers between these domains. The restriction sites for assembly PCR and cloning are shown in the prototype construct.

TS activation and hence any growth phenotype. MBP has been reported to assist in solubility of the fused protein (Kapust and Waugh, 1999).

Figure 4.2: Effects of MBP or intein domain deletion on the ERβ, TRβ and PPARγ biosensors. (A) The sensor cells were incubated at 34 °C in the presence of either 1% DMSO vehicle control (black bars) or 6.25 μM agonists (E2 for the ERβ sensor, T3 for the TRβ sensor and Rosig for the PPARγ sensor) (grey bars). (B) The biosensors were incubated at 26 °C for 20 h to decrease the stringency of the TS selection. In each case, the OD was read at 600 nm.

To study the role of inteins, the following constructs were tested: pMT:ERβ, pMT:TRβ and pMT:PPARγ. None of the constructs showed any growth phenotype at 34 °C in the presence of their respective ligands. When the temperature was lowered to 26 °C, the ER and PPAR versions of the biosensor showed a growth phenotype but no signal (figure 4.2). This suggests that the intein domain plays a role in stabilizing the LBD at higher temperatures. The constructs with both MBP and intein domains deleted showed no growth phenotype at either 34 °C or 26 °C. This further confirms that MBP is imperative for a growth phenotype. It is worth mentioning here that MBP has been reported to assist in solubility of the fused protein (Kapust and Waugh, 1999; Nettles and Greene, 2005).

The V437L mutation in the intein is known to destabilize it (Wu et al., 2010). We tested the biosensors containing the mutated intein domain and found that they displayed a decreased growth phenotype and ligand-dependent growth response (figure 4.3). This suggests that intein stability is important for a good biosensor signal.

The linker between the MBP and intein was modified by truncating part of it to see whether steric inhibition plays any role. The prototype sensor contains a 20 amino acid linker (NNNNNNNNNNLGIEGRISEF). For each of the ER, TR and PPAR versions, two constructs were made, with the last 10 amino acids and all 20 amino acids of the linker truncated, respectively. Deletion of the last 10 amino acids from the linker improved the signal for the ER and TR biosensors, while for PPAR it remained the same. With 20 amino acids truncated, the growth response was inhibited for both the ER and PPAR biosensors, while the TR biosensor remained the same as the prototype (figure 4.4). These results suggest that the linker plays a role in steric inhibition of TS dimerization by the MBP domain.
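
The two truncations can be written out explicitly from the linker sequence quoted above. The sketch below is a minimal Python illustration. It assumes that "last" refers to the C-terminal end of the sequence as written, and the function name is hypothetical.

```python
# Truncation of the 20-residue MBP-intein linker (illustrative sketch).
# Assumes "last N amino acids" means the C-terminal end of the sequence as written.

PROTOTYPE_LINKER = "NNNNNNNNNNLGIEGRISEF"  # linker sequence quoted in the text


def truncate_c_terminal(linker, n_removed):
    """Remove n_removed residues from the C-terminal end of the linker."""
    if not 0 <= n_removed <= len(linker):
        raise ValueError("n_removed must be between 0 and the linker length")
    return linker[: len(linker) - n_removed]


if __name__ == "__main__":
    for n_removed in (0, 10, 20):
        remaining = truncate_c_terminal(PROTOTYPE_LINKER, n_removed)
        print(n_removed, repr(remaining), len(remaining))
```

Under this reading, the ten-residue truncation leaves only the poly-asparagine stretch, and the twenty-residue truncation removes the linker entirely.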

The flexibility between the intein and the TS domain was increased by addition of G4S and (G4S)3 linkers. The addition of the G4S linker resulted in an increase in growth phenotype for the ER and TR sensors but, surprisingly, their ligand potency went down.

Figure 4.3: Effect of intein V437L mutations on ligand sensitivity of the sensors. The sensor cells were incubated at 34 °C for 16 h for ERβ (a), TRβ (b), and for 12 h for PPARγ (c) sensors. Solid black squares, the prototype; open gray circles, the V437L mutant sensors.

Figure 4.4: Effect of truncation of the linker between the MBP and intein domains on ligand sensitivity. The sensor cells were incubated at 34 °C in the presence of various concentrations of respective agonists. (a) ERβ in the presence of E2; (b) TRβ in T3; (c) PPARγ in Rosig. Solid black squares represent the prototype sensors; solid gray triangles represent a ten-residue truncation; open gray circles represent sensors with a twenty-residue truncation.

PPAR sensors displayed an increase in growth phenotype but with a high background masking any signal (figure 4.5). These results suggest that, while a small linker helps in TS activation, a large linker tends to obscure the ligand-dependent TS activation.

Figure 4.5: Effect of GS linker insertion between the intein and TS domains on ligand sensitivity. The ERβ (a), TRβ (b), and PPARγ sensors (c) without any linker, or with a five-residue G4S linker, or with a fifteen-residue (G4S)3 linker between intein and TS were incubated at 34 °C in the presence of their respective agonists (E2 for ERβ, T3 for TRβ and Rosig for PPARγ) at indicated temperatures. Solid black squares represent the prototype sensors; open gray circles represent the sensors with a five-residue G4S linker; solid gray triangles represent the sensors with a fifteen-residue (G4S)3 linker.
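
The G4S and (G4S)3 linkers are repeats of the five-residue unit GGGGS. A minimal Python sketch of how such linker sequences can be generated and placed between two domains is shown below. The function names are hypothetical, and the domain strings used in the example are placeholders rather than real intein or TS sequences.

```python
# Building (G4S)n linkers and joining two domains (illustrative sketch).


def gs_linker(repeats, unit="GGGGS"):
    """Return the amino acid sequence of a (G4S)n linker."""
    if repeats < 0:
        raise ValueError("repeats must be non-negative")
    return unit * repeats


def join_with_linker(n_domain, c_domain, linker=""):
    """Concatenate two domain sequences with an optional linker in between."""
    return n_domain + linker + c_domain


if __name__ == "__main__":
    for repeats in (0, 1, 3):
        linker = gs_linker(repeats)
        # "X" and "Y" are placeholders, not real intein or TS sequences
        print(repeats, len(linker), join_with_linker("X", "Y", linker))
```

The lengths returned for one and three repeats match the five-residue and fifteen-residue linkers described in the caption of Figure 4.5.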

The intein used in the prototype sensors contains asparagine (N) as its last amino acid. This N is required for the cleaving reaction of inteins (Wood et al., 1999). We introduced an N440A mutation to abolish the cleaving ability of the intein to see whether the cleaving reaction plays any role in the TS growth phenotype (Van Roey et al., 2007). We found that the N440A mutation severely affects the growth phenotype, and none of the biosensors grew at 34 °C. When the temperature was lowered to 26 °C, the growth phenotype was rescued but with no signal, except for the TR biosensor (figure 4.6).

In another set of experiments, we introduced a G4S linker between the intein carrying the N440A mutation and the TS. We found that the linker led to a growth phenotype but without any signal. When we increased the length of this linker threefold, to (G4S)3, we found that all the sensors displayed an overgrowth phenotype except the ER version, which showed a signal to E2 (figure 4.7).

4.2 Discussion

The results clearly indicate that MBP plays a key role in the working of the biosensor. As previously mentioned, MBP helps the solubility of the fused protein (Kapust and Waugh, 1999; Nettles and Greene, 2005) and is thus necessary for any growth phenotype. The intein, on the other hand, provides a scaffold for the LBD and thus stabilizes the LBD structure (Skretas and Wood, 2005a,b; Skretas et al., 2007; Li et al., 2011). The intein also provides a conduit for the ligand binding signal to reach the TS domain. Results from the constructs without the intein domain suggest that, at lower temperatures, some LBDs can attain proper folding and give a signal. This could be due to ligand-induced stabilization of NHR LBDs, which has been established in the literature (Nygaard and Harlow, 2001). The results from

Figure 4.6: Effect of the intein N440A mutation on the growth phenotype of the sensors. Sensor cells with and without the cleavage-abolishing N440A mutation were incubated for 16 h for ERβ and TRβ, and 12 h for PPARγ sensor at the indicated respective agonist concentrations. The cells were incubated at 34 °C (A) and 26 °C (B). Solid squares represent the prototype sensors; open circles represent the N440A mutant sensors.

Figure 4.7: Effect of GS linker insertions between the intein and TS domains in the presence of the N440A mutation. The N440A sensor cells with a five-residue G4S linker or a fifteen-residue (G4S)3 linker between the intein and TS domain were incubated for 16 h for ERβ (a) and TRβ (b), and 12 h for PPARγ sensor (c) at 34 °C. Solid black squares represent the prototype sensors; open gray circles represent the sensors with the G4S linker; solid gray triangles represent the sensors with the (G4S)3 linker.

the V437L intein variant in the biosensors suggest that a stable intein is necessary for a strong signal.

Excessively short linkers can restrict the normal folding and activity of fused domains (Nygaard and Harlow, 2001). The linkers between MBP and intein were modified to study their role in possible steric blockage of TS dimerization. From the results, we can see that modifying the linker length did affect the ligand potency, but completely removing the linker abolished the growth phenotype, suggesting steric inhibition of TS dimerization in the absence of a linker. Similarly, the flexible G4S linkers between the intein and the TS led to an increase in the growth phenotype but reduced ligand sensitivity. When the linker length was increased to (G4S)3, there was an overgrowth phenotype and no signal was observed. Further, when a GS linker was added between the intein and TS in the N440A intein mutant biosensors, the growth phenotype was rescued. These results suggest that linker length and flexibility affect TS dimerization and thus sensor behavior. At the beginning of the chapter we mentioned the hypothesis that TS activation is due to ligand-induced intein cleavage leading to TS dimerization. As mentioned earlier, the N440A mutation inhibits intein cleavage. The tests of the biosensor with the mutant intein at 34 °C suggest that abolishing intein cleavage inhibits TS dimerization and thus diminishes the growth phenotype. With the addition of linkers, however, the growth phenotype was restored. Similar results were seen when we tested constructs with inteins carrying the N440D, N440E and N440Q mutations. These results suggest that linkers between domains affect the steric interaction between domains and hence TS dimerization. The hypothesis drawn from these results is that agonist binding to the LBD causes helix 12 to reposition itself and thus changes the structural conformation of the LBD (Skretas et al., 2007; Li et al., 2011). This signal is transferred to the intein scaffold, which cleaves the TS, leading to TS dimerization. When an antagonist binds to the LBD, helix 12 repositions in a different configuration, thus inhibiting the intein cleavage reaction. Another factor affecting ligand sensitivity is the linker length between the LBD and the intein, which has already been published.

4.3 Conclusion

Our observations reveal that MBP is crucial for the proper folding and solubility of the whole fusion protein. It also participates in the generation of the sensor signal by providing a steric interference for TS dimerization. The intein may act as a scaffold for LBD folding and stabilize the folded LBD structure. Additionally the intein plays a critical role in the transmission of conformational changes upon ligand binding.

Thanks to the aforementioned factors, the four domain fusion protein uses slight changes in intein cleaving activity for the discrimination of agonists and antagonists.

This study also provides a potential approach to improve sensor performance by engineering optimized linkers.

Chapter 5: Summary

Protein-based bacterial biosensors have great potential as a tool for screening therapeutic ligands for hormone receptor proteins. Several attempts have been made in the past to develop new biosensors, some with unique advantages over the others. Nuclear hormone receptor proteins, as mentioned earlier, have been a target for a majority of drug discovery projects. This enables us to use this family of proteins to develop a sensor that can detect their interaction with small molecules. Also, with the rise of endocrine disruptor chemicals (EDCs), there is an urgent need to develop tools that can detect these chemicals even at nanomolar concentrations. EDCs have been known to be a leading cause of breast cancer, early puberty, sex reversals and infertility, among others. EDCs also affect individuals across different species, including fish, chickens, and crocodiles. In this thesis, I have discussed the development of a tool with different ligand binding domains. An important advantage of the current biosensor is its modular structure and the ability to swap out the ligand binding domains and create new biosensors. The TS reporter system is a reliable and powerful genetic selection method. This tool was tested with a library of chemicals suggested by the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM). We were able to successfully screen chemicals with ligand binding domains derived from several animal species. The mechanism of the biosensor was studied in detail, which will help in the intelligent design of the next generation of biosensors. This includes incorporating newer reporter proteins and expanding the array of ligand binding domains.

The neurexin biosensor will be an invaluable tool that can open new avenues in the study of therapeutics and drug discovery in the field of neuroscience. With the discovery that an anti-diabetic drug binds to neurexin, we expect a new beginning in the study of the rosiglitazone-neurexin interaction.

5.1 Future Work

Several projects could take shape from the work presented in this thesis, both experimental and computational. Some of these have already been started and should be completed soon. The strategies that were applied to the current protein-based bacterial biosensor can be applied to these projects as well.

5.1.1 3D structure of neurexin using NMR

We now know from the NMR experiments that neurexin binds to rosiglitazone. To understand the binding further, we will have to determine the 3D structure of the ligand-bound protein. For 3D structure determination, the protein has to be labeled with both 13C and 15N isotopes. Once it is labeled, we can run 3D scans to collect the data and assign the peaks. Once the peaks are assigned, an ensemble of structures can be generated. The 3D structure will give deeper insight into the binding pocket and the amino acids involved in the binding. With this information, we can use computational modeling techniques to dock small molecules and hence screen a library of chemicals before doing the actual experiment in the lab. This project has great potential to help researchers understand neurodevelopmental disorders at a molecular level.

5.1.2 Molecular Docking

Computational tools can be handy, and they do not occupy bench space in the lab. They can save time and resources if used carefully. Molecular docking is a technique that takes protein structures as input and evaluates the fit of a small molecule based on energy minimization calculations. The technique has been used successfully in industry to screen large libraries of chemicals. In conjunction with the bacterial biosensor, we can use molecular docking to refine our library and improve the chance of finding a ligand through laboratory experiments.
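The library-refinement step can be sketched in a few lines of Python. This is only an illustration, assuming that docking scores have already been computed by a separate docking program and are reported as binding energies in kcal/mol, where more negative values indicate a better predicted fit. The ligand names, the score values, and the cutoff are hypothetical placeholders, not results from this work.

```python
from typing import Dict, List, Tuple


def shortlist_by_docking_score(
    scores: Dict[str, float], cutoff_kcal_per_mol: float
) -> List[Tuple[str, float]]:
    """Keep ligands whose score is at or below the cutoff, best score first."""
    hits = [(name, score) for name, score in scores.items()
            if score <= cutoff_kcal_per_mol]
    # More negative binding energy = better predicted fit, so sort ascending.
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical placeholder values for illustration only (not computed results).
example_scores = {"ligand_A": -9.1, "ligand_B": -5.2, "ligand_C": -7.8}

# Hypothetical cutoff; a real cutoff would have to be chosen per receptor.
shortlist = shortlist_by_docking_score(example_scores, cutoff_kcal_per_mol=-7.0)

for name, score in shortlist:
    print(f"{name}: {score} kcal/mol")
```

Only the shortlisted compounds would then be carried forward to the biosensor assay, which is where binding is actually tested.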

5.1.3 In-vitro biosensor

Although the thymidylate synthase (TS) genetic selection system is a reliable and fast reporter system, it takes about 16 h for the biosensor cells to show a dose-dependent growth phenotype. If a β-galactosidase reporter protein is used in place of TS and the protein is expressed in vitro, the entire biosensing process should take less than 3 h. Work on this project is already in progress. The key advantage is that the protein can be expressed using an in vitro kit in a small volume, so a lower amount of ligand would be required for testing. Once the protein is expressed, we can check for the change in color using a simple colorimetric assay. This project has great potential because it requires very little volume and responds much faster.

Chapter 6: Materials & Methods

6.1 Reagents and Strains

All cloning experiments were carried out with E. coli strain DH5α cells (Invitrogen, Carlsbad, CA), while E. coli strain D1210ΔthyA::KanR [F- Δ(gpt-proA)62 leuB6 supE44 ara-14 galK2 lacY1 Δ(mcrC-mrr) rpsL20 (Strr) xyl-5 mtl-1 recA13 lacIq] was used for the determination of TS growth phenotypes. BL21 (DE3) (Novagen, Madison, WI) was used for the protein expression experiments.

The estrogen analogues 17-α-estradiol, 17-β-estradiol, diethylstilbestrol, hexestrol, estrone, tamoxifen, cycloheximide, linuron, 2,4,5-trichlorophenoxyacetic acid, 2-sec-butylphenol, 4-cumylphenol, ethylparaben, zearalenone, flutamide, rosiglitazone, and T3 were purchased from Sigma Aldrich (Saint Louis, MO). Rosiglitazone was purchased from Cayman Chemical Company (Ann Arbor, MI). All hormone analogues were dissolved in dimethylsulfoxide (DMSO) to form 10 mM solutions.
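As a worked example of preparing a 10 mM stock, the mass of compound to weigh out can be computed as below. This is a minimal sketch: the 1 mL stock volume is an assumption (the volume is not stated here), and the molecular weight of 17-β-estradiol (272.38 g/mol) is a standard reference value that should be confirmed against the supplier's documentation.

```python
def mass_mg_for_stock(conc_mM: float, volume_mL: float,
                      mol_weight_g_per_mol: float) -> float:
    """Mass in mg needed for a stock of the given concentration and volume.

    mM * mL gives micromoles; micromoles * g/mol gives micrograms;
    dividing by 1000 converts micrograms to milligrams.
    """
    return conc_mM * volume_mL * mol_weight_g_per_mol / 1000.0


# Assumed inputs: 1 mL of DMSO stock, 17-beta-estradiol at 272.38 g/mol.
estradiol_mg = mass_mg_for_stock(conc_mM=10.0, volume_mL=1.0,
                                 mol_weight_g_per_mol=272.38)
print(f"Weigh out {estradiol_mg:.2f} mg per 1 mL of DMSO")
```

The same function applies to the other compounds once their molecular weights are substituted.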

The recipe for –Thy media is 10 mL of 20% glucose, 10 mL of Thy pool (2 mg/mL each of L-Arg, L-His, L-Leu, L-Met, L-Pro, L-Thr), 10 mL of 10% casamino acids solution, 200 μL of 1% thiamine HCl, 1 mL of 0.1 M calcium chloride, 200 mL of 5X Minimal Davis Broth (35 mg/mL K2HPO4, 10 mg/mL KH2PO4, 2.5 mg/mL sodium citrate, 0.5 mg/mL MgSO4, 5 mg/mL (NH4)2SO4), and 4 mL of 25 mg/mL ampicillin, filled to 1 L with deionized water. The glucose, Thy pool, casamino acids, and ampicillin components were sterilized by microfiltration through a 0.2 μm membrane. All other components were sterilized by autoclaving at 121 °C for 25 min, and the final medium was mixed after all the components had cooled to room temperature. Deionized water was used for preparing the media (Gierach, Shapero et al. 2013).
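For batches other than 1 L, the –Thy recipe can be scaled linearly. The sketch below is one way to do this in Python, using the per-liter volumes listed above; the 250 mL batch size is an arbitrary example, and the computed water volume is only an estimate, since in practice the medium is simply filled to the final volume.

```python
# Stock volumes (mL) per 1 L of -Thy medium, taken from the recipe above.
THY_MINUS_PER_LITER_ML = {
    "20% glucose": 10.0,
    "Thy pool": 10.0,
    "10% casamino acids": 10.0,
    "1% thiamine HCl": 0.2,
    "0.1 M calcium chloride": 1.0,
    "5X Minimal Davis Broth": 200.0,
    "25 mg/mL ampicillin": 4.0,
}


def scale_recipe(final_volume_ml: float) -> dict:
    """Scale every stock volume linearly and add water up to the final volume."""
    factor = final_volume_ml / 1000.0
    scaled = {name: vol * factor for name, vol in THY_MINUS_PER_LITER_ML.items()}
    # Approximate water volume; in practice, fill to the mark instead.
    scaled["deionized water (to volume)"] = final_volume_ml - sum(scaled.values())
    return scaled


# Example batch size (an arbitrary choice for illustration).
quarter_liter = scale_recipe(250.0)

# Final ampicillin concentration in ug/mL: 4 mL of 25 mg/mL stock in 1000 mL.
amp_final_ug_per_ml = (25.0 * THY_MINUS_PER_LITER_ML["25 mg/mL ampicillin"]
                       / 1000.0 * 1000.0)

for name, vol in quarter_liter.items():
    print(f"{name}: {vol:.2f} mL")
```

The ampicillin check gives 100 μg/mL, the same working concentration used for the LB and M9 media in Section 6.3.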

The recipe used for M9 media is as follows: Na2HPO4 (6.5 g), KH2PO4 (3.0 g), NaCl (0.5 g), and 15NH4Cl (1 g), sterilized by autoclaving at 121 °C for 25 min. Glucose (4 g), thiamin (5 – 10 mg), biotin (5 – 10 mg), CaCl2 (14.7 mg), and MgSO4 (120.4 mg) were then added, and the volume was brought to 1 L with deionized water.

6.2 Plasmid Construction

The coding sequence corresponding to residues Arg254-Lys504 of the human estrogen hormone receptor β was amplified from the vector pSG5-hERβ (a gift from Cathleen Valentine, Department of Medicine, University of California, San Francisco, CA), while the coding sequence for residues Ser296-Val519 of the zebrafish estrogen hormone receptor β was amplified from the vector pME18S-FL3 (Thermoscientific Bio, Pittsburgh, PA). The coding sequence for residues Arg209-Val440 of the mouse estrogen hormone receptor β was amplified from the vector pCR4-TOPO (Thermoscientific Bio, Pittsburgh, PA). In each case, PCR primers were used to perform a SOEing reaction to join intein fragments. The restriction sites AgeI and XhoI were used to insert the product into the pMIT vector, replacing residues 110-383 of the Mtu RecA intein (Wood, Wu et al. 1999). The coding sequence for residues Gly870-Gly1047 of the human nrx2b was amplified from the vector pBluescript II SK+ (Thermoscientific Bio, Pittsburgh, PA). The sequences of the constructed mini-intein/binding domain chimeras were verified by nucleotide sequencing.
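When checking sequencing results, it helps to know how long each amplified coding region should be. The sketch below computes the expected length from the residue ranges quoted above, assuming inclusive ranges and three nucleotides per residue; primer overhangs, restriction sites, and intein flanks are not included, so the lengths are for the binding-domain coding region only.

```python
def coding_length_nt(first_residue: int, last_residue: int) -> int:
    """Nucleotides needed to encode an inclusive residue range (3 nt per codon)."""
    if last_residue < first_residue:
        raise ValueError("last_residue must not be smaller than first_residue")
    return (last_residue - first_residue + 1) * 3


# Residue ranges as quoted in the text above.
INSERT_RANGES = {
    "human ER beta (Arg254-Lys504)": (254, 504),
    "zebrafish ER beta (Ser296-Val519)": (296, 519),
    "mouse ER beta (Arg209-Val440)": (209, 440),
    "human nrx2b (Gly870-Gly1047)": (870, 1047),
}

expected_lengths = {name: coding_length_nt(first, last)
                    for name, (first, last) in INSERT_RANGES.items()}

for name, length in expected_lengths.items():
    print(f"{name}: {length} nt")
```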

6.3 15N Labeled Protein Expression and Purification

The pET::Nrx2b His6 plasmid was transformed into E. coli BLR (DE3) (Novagen) and selected on LB agar supplemented with 100 μg/mL ampicillin. A single clone was inoculated into 5 mL of LB medium with 100 μg/mL ampicillin and grown overnight at 37 °C as a seed culture. The overnight seed cultures were subsequently used to inoculate expression cultures. For expression, 1 L of LB medium was inoculated with the seed culture at a ratio of 1:50 and incubated at 37 °C until the optical density at 600 nm (OD600) reached 0.6-0.8. The culture was gently spun down to collect the cell pellet, which was re-suspended in 1 L of M9 medium supplemented with 100 μg/mL ampicillin and isopropyl β-D-1-thiogalactopyranoside (IPTG) at a final concentration of 1 mM. The expression temperature was lowered to 16 °C and the culture was incubated for 20 h. The cell pellets were collected by spinning down the culture at 5000 x g for 10 min at 4 °C and re-suspended in 50 mL of PBS buffer (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4) + 500 mM NaCl + 10 mM imidazole at pH 7.4. The re-suspended cells were lysed by sonication and centrifuged at 14,000 x g and 4 °C for 15 min. The lysate was loaded onto a Ni-NTA column containing 1 mL of resin (New England Biolabs) packed in an Econo-Pac chromatography column (Bio-Rad Laboratories, Hercules, CA) and washed with 10 mL of wash buffer (PBS buffer + 500 mM NaCl + 20 mM imidazole at pH 7.4). Elution buffer (PBS buffer + 500 mM NaCl + 250 mM imidazole at pH 7.4) was used to elute the protein. The final protein concentration was 10 mg/mL. The protein was concentrated using a Vivaspin column (Sartorius, Bohemia, NY) by centrifuging at 4000 x g and 4 °C.

Appendix A: DNA Sequences of Ligand Binding Domains of Nuclear Hormone Receptors and Neurexin

A.1 Human Estrogen Receptor β

CGCCGAGTGCGGGAGCTGCTGCTGGACGCCCTGAGCCCCGAGCAGCT AGTGCTCACCCTCCTGGAGGCTGAGCCGCCCCATGTGCTGATCAGCCGC CCCAGTGCGCCCTTCACCGAGGCCTCCATGATGATGTCCCTGACCAAGT TGGCCGACAAGGAGTTGGTACACATGATCAGCTGGGCCAAGAAGATTCC CGGCTTTGTGGAGCTCAGCCTGTTCGACCAAGTGCGGCTCTTGGAGAGC TGTTGGATGGAGGTGTTAATGATGGGGCTGATGTGGCGCTCAATTGACC ACCCCGGCAAGCTCATCTTTGCTCCAGATCTTGTTCTGGACAGGGATGA GGGGAAATGCGTAGAAGGAATTCTGGAAATCTTTGACATGCTCCTGGCA ACTACTTCAAGGTTTCGAGAGTTAAAACTCCAACACAAAGAATATCTCT GTGTCAAGGCCATGATCCTGCTCAATTCCAGTATGTACCCTCTGGTCAC AGCGACCCAGGATGCTGACAGCAGCCGGAAGCTGGCTCACTTGCTGAAC GCCGTGACCGATGCTTTGGTTTGGGTGATTGCCAAGAGCGGCATCTCCT CCCAGCAGCAATCCATGCGCCTGGCTAACCTCCTGATGCTCCTGTCCCA CGTCAGGCATGCGAGTAACAAGGGCATGGAACATCTGCTCAACATGAAG TGCAAAAATGTGGTCCCAGTGTATGACCTGCTGCTGGAGATGCTGAATG CCCACGTGCTTCGCGGGTGCAAGGCG

A.2 Cow Estrogen Receptor β

CGAGTGAAAGAGCTGCTGCTGAGCGCCCTGAGCCCAGAGCAGCTGGT GCTTACGCTCCTGGAGGCCGAGCCGCCCCACGTGCTCATAAGCCGCCCC AGCACGCCCTTCACTGAGGCCTCCATGATGATGTCCCTCACCAAGCTGG CCGACAAGGAACTGGTACACATGATCAGCTGGGCCAAGAAGATTCCGGG CTTCGTGGAGCTCAGCCTGTACGACCAAGTGCGGCTTTTGGAGAGCTGC TGGTTGGAGGTGCTCATGGTGGGGCTGATGTGGCGCTCCATCGACCACC CTGGCAAGCTCATCTTTGCTCCAGACCTCATTCTGGACAGGGATGAAGG GAAATGTGTTGAAGGAATTCTAGAAATCTTTGACATGCTCCTGGCAACG ACTTCAAGGTTTCGTGAGTTAAAACTCCAACACAAAGAATATCTCTGTG TCAAGGCCATGATCCTCCTCAACTCCAGTATGTACCCTTCAGCTACAGC ACCCCAGGAGGCTGACAGTGGCCGGAAGCTGACTCACCTGCTGAATGCT GTGACGGACGCTCTGGTCTGGGTGATTGCCAAGAGTGGCATGTCCTCCC AGCAGCAGTCCATGCGCCTGGCTAACCTGCTGATGCTCCTGTCTCACGT CAGGCACGCCAGTAACAAGGGCATGGAACACCTGCTCAACATGAAGTGC AAAAACGTGGTCCCCGTCTACGATCTGCTGCTGGAGATGCTGAATGCCC ACACACTTCGCGGCAAC

A.3 Rat Estrogen Receptor β

AGTCCCGAGCAGCTGGTGCTCACCCTGCTGGAAGCTGAGCCACCCAA TGTGCTAGTGAGCCGTCCCAGCATGCCCTTCACCGAGGCCTCCATGATG ATGTCCCTCACGAAGCTGGCTGACAAGGAACTGGTGCACATGATTGGCT GGGCCAAGAAAATCCCTGGCTTTGTGGAGCTCAGCCTGTTGGACCAAGT CCGCCTCTTGGAAAGCTGCTGGATGGAGGTGCTGATGGTGGGGCTGATG TGGCGCTCCATCGACCACCCCGGCAAGCTCATCTTTGCTCCAGACCTCG TTCTGGACAGGGATGAGGGGAAGTGCGTGGAAGGGATTCTGGAAATCT TTGACATGCTCCTGGCGACGACGGCACGGTTCCGTGAGTTAAAACTGCA GCACAAAGAATATCTGTGTGTGAAGGCCATGATTCTCCTCAACTCCAGT ATGTACCCCTTGGCTACCGCAAGCCAGGAAGCAGAGAGTAGCCGGAAGC TGACACACCTATTGAACGCAGTGACAGATGCCCTGGTCTGGGTGATTTC GAAGAGTGGAATCTCTTCCCAGCAGCAGTCAGTCCGTCTGGCCAACCTC CTGATGCTTCTTTCTCATGTCAGGCACATCAGTAACAAGGGCATGGAAC ATCTGCTCAGCATGAAGTGCAAAAATGTGGTCCCGGTGTACGACCTGCT GCTGGAGATGCTGAATGCTCAC

A.4 Zebrafish Estrogen Receptor β

TCCCCTGAACAATTGGTTAGCTGTATTCTAGAGGCGGAGCCACCTCA AATTTACCTGAGAGAGCCGGTGAAAAAGCCATACACTGAGGCTAGCATG ATGATGTCACTAACAAGCCTCGCCGACAAGGAGCTAGTGCTCATGATTA GCTGGGCGAAGAAGATACCAGGTTTTGTAGAGTTGACTTTGTCAGATCA GGTGCATTTGCTGGAATGCTGCTGGCTGGATATTCTGATGTTAGGATTG ATGTGGAGATCTGTGGATCATCCTGGGAAACTCATCTTCACCCCTGACC TCAAGCTCAACAGGGAGGAAGGGAATTGTGTTGAAGGCATCATGGAGAT TTTCGACATGCTGCTGGCCACCACCTCTCGATTCAGAGAGCTGAAGCTG CAGAGAGAGGAATACGTCTGTCTCAAAGCCATGATCCTGCTCAACTCTA ATAACTGTTCGAGTTTGCCACAGACTCCTGAGGATGTGGAGAGTCGCGG GAAGGTGCTGAATCTGCTGGACTCAGTGACCGATGCTCTGGTGTGGATC ATCTCCAGAACGGGTCTGTCCTCACAACAACAGTCCATCCGGCTCGCTC ATCTGCTAATGCTGCTCTCACACATCCGACACCTCAGCAACAAAGGCAT CGAGCATCTGTCAAACATGAAGAGGAAAAACGTGGTG

A.5 Neurexin 2b

GGGACCACATACATCTTTGGGAAGGGGGGAGCGCTCATCACCTACAC GTGGCCCCCCAATGACAGGCCCAGCACGAGGATGGATCGCCTGGCCGTG GGCTTCAGCACCCACCAGCGGAGCGCTGTGCTGGTGCGGGTGGACAGCG CCTCCGGCCTTGGAGACTACCTGCAGCTGCACATCGACCAGGGCACCGT GGGGGTGATCTTTAACGTGGGCACGGACGACATTACCATCGACGAGCCC AACGCCATAGTAAGCGACGGCAAATACCACGTGGTGCGCTTCACTCGAA GCGGCGGCAACGCCACCCTGCAGGTGGACAGCTGGCCGGTCAACGAGCG GTACCCGGCAGGCCGCCAGCTGACCATCTTCAACAGCCAGGCTGCCATC AAGATCGGGGGCCGGGATCAGGGCCGCCCCTTCCAGGGCCAGGTGTCCG GCCTCTACTACAATGGGCTCAAGGTGCTGGCGCTGGCCGCCGAGAGCGA CCCCAATGTGCGGACTGAGGGTCACCTGCGCCTGGTGGGGGAGGGG
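The sequences in this appendix are wrapped across lines, so copying them out of the document can introduce spaces or other stray characters. A minimal Python sketch for cleaning a pasted sequence before use is given below; the function names are illustrative, and the example input is just the first 18 nucleotides of the neurexin 2b listing with whitespace added to mimic the wrapping.

```python
def clean_sequence(raw: str) -> str:
    """Remove all whitespace, upper-case, and check that only A/C/G/T remain."""
    seq = "".join(raw.split()).upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    return seq


def is_whole_codons(seq: str) -> bool:
    """True if the sequence length is a multiple of three."""
    return len(seq) % 3 == 0


# First 18 nt of the neurexin 2b listing, with whitespace added for illustration.
pasted = "GGGACC ACATAC\nATCTTT"
cleaned = clean_sequence(pasted)

print(cleaned, len(cleaned), is_whole_codons(cleaned))
```

A check like this will also catch stray digits, such as page numbers, that end up inside a pasted sequence.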

Appendix B: Plasmid Maps

B.1 pMIT:ERβ Human

CCGACACCATCGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAG CGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGT TATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCG CGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTG GAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAAC AACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCT GGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCC GATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGC GTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCA GTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGT GGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGAC CAGACACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGAC TGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGTT AGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGG CATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAG GCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAA TGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCG CTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGG ATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGTTATAT CCCGCCGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACC AGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCA ATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCC CAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAG CTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCA ATTAATGTGAGTTAGCTCACTCATTAGGCACAATTCTCATGTTTGACAG CTTATCATCGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCA TCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCG TGTCGCTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACA TCATAACGGTTCTGGCAAATATTCTGAAATGAGCTGTTGACAATTAATC

91 ATCGGCTCGTATAATGTGTGGAATTGTGAACGGATAACAATTTCACACA GGAAACAGCCAGTCCGTTTAGGTGTTTTCACGAGCACTTCACCAACAAG GACCATAGATTATGAAAATCGAAGAAGGTAAACTGGTAATCTGGATTAA CGGCGATAAAGGCTATAACGGTCTCGCTGAAGTCGGTAAGAAATTCGAG AAAGATACCGGAATTAAAGTCACCGTTGAGCATCCGGATAAACTGGAAG AGAAATTCCCACAGGTTGCGGCAACTGGCGATGGCCCTGACATTATCTT CTGGGCACACGACCGCTTTGGTGGCTACGCTCAATCTGGCCTGTTGGCT GAAATCACCCCGGACAAAGCGTTCCAGGACAAGCTGTATCCGTTTACCT GGGATGCCGTACGTTACAACGGCAAGCTGATTGCTTACCCGATCGCTGT TGAAGCGTTATCGCTGATTTATAACAAAGATCTGCTGCCGAACCCGCCA AAAACCTGGGAAGAGATCCCGGCGCTGGATAAAGAACTGAAAGCGAAAG GTAAGAGCGCGCTGATGTTCAACCTGCAAGAACCGTACTTCACCTGGCC GCTGATTGCTGCTGACGGGGGTTATGCGTTCAAGTATGAAAACGGCAAG TACGACATTAAAGACGTGGGCGTGGATAACGCTGGCGCGAAAGCGGGTC TGACCTTCCTGGTTGACCTGATTAAAAACAAACACATGAATGCAGACAC CGATTACTCCATCGCAGAAGCTGCCTTTAATAAAGGCGAAACAGCGATG ACCATCAACGGCCCGTGGGCATGGTCCAACATCGACACCAGCAAAGTGA ATTATGGTGTAACGGTACTGCCGACCTTCAAGGGTCAACCATCCAAACC GTTCGTTGGCGTGCTGAGCGCAGGTATTAACGCCGCCAGTCCGAACAAA GAGCTGGCAAAAGAGTTCCTCGAAAACTATCTGCTGACTGATGAAGGTC TGGAAGCGGTTAATAAAGACAAACCGCTGGGTGCCGTAGCGCTGAAGTC TTACGAGGAAGAGTTGGCGAAAGATCCACGTATTGCCGCCACCATGGAA AACGCCCAGAAAGGTGAAATCATGCCGAACATCCCGCAGATGTCCGCTT TCTGGTATGCCGTGCGTACTGCGGTGATCAACGCCGCCAGCGGTCGTCA GACTGTCGATGAAGCCCTGAAAGACGCGCAGACTAATTCGAGCTCGAAC AACAACAACAATAACAATAACAACAACCTCGGGATCGAGGGAAGGATTT CAGAATTCGCCCTCGCAGAGGGCACTCGGATCTTCGATCCGGTCACCGG TACAACGCATCGCATCGAGGATGTTGTCGATGGGCGCAAGCCTATTCAT GTCGTGGCTGCTGCCAAGGACGGAACGCTGCATGCGCGGCCCGTGGTGT CCTGGTTCGACCAGGGAACGCGGGATGTGATCGGGTTGCGGATCGCCGG TGGCGCCATCGTGTGGGCGACACCCGATCACAAGGTGCTGACAGAGTAC GGCTGGCGTGCCGCGGGGGAACTCCGCAAGGGAGACAGGGTGGCGCAA CCGCGACGCTTCGATGGATTCGGTGACAGTGCGCCGATTCCGGCGGACG CCGAGTGCGGGAGCTGCTGCTGGACGCCCTGAGCCCCGAGCAGCTAGTG CTCACCCTCCTGGAGGCTGAGCCGCCCCATGTGCTGATCAGCCGCCCCA GTGCGCCCTTCACCGAGGCCTCCATGATGATGTCCCTGACCAAGTTGGC CGACAAGGAGTTGGTACACATGATCAGCTGGGCCAAGAAGATTCCCGGC TTTGTGGAGCTCAGCCTGTTCGACCAAGTGCGGCTCTTGGAGAGCTGTT GGATGGAGGTGTTAATGATGGGGCTGATGTGGCGCTCAATTGACCACCC 
CGGCAAGCTCATCTTTGCTCCAGATCTTGTTCTGGACAGGGATGAGGGG AAATGCGTAGAAGGAATTCTGGAAATCTTTGACATGCTCCTGGCAACTA CTTCAAGGTTTCGAGAGTTAAAACTCCAACACAAAGAATATCTCTGTGT CAAGGCCATGATCCTGCTCAATTCCAGTATGTACCCTCTGGTCACAGCG

92 ACCCAGGATGCTGACAGCAGCCGGAAGCTGGCTCACTTGCTGAACGCCG TGACCGATGCTTTGGTTTGGGTGATTGCCAAGAGCGGCATCTCCTCCCA GCAGCAATCCATGCGCCTGGCTAACCTCCTGATGCTCCTGTCCCACGTC AGGCATGCGAGTAACAAGGGCATGGAACATCTGCTCAACATGAAGTGCA AAAATGTGGTCCCAGTGTATGACCTGCTGCTGGAGATGCTGAATGCCCA CGTGCTTCGCGGGTGCAAGGCGGCGGATGCCCTGGATGACAAATTCCTG CACGACATGCTGGCGGAAGAACTCCGCTATTCCGTGATCCGAGAAGTGC TGCCAACGCGGCGGGCACGAACGTTCGACCTCGAGGTCGAGGAACTGCA CACCCTCGTCGCCGAAGGGGTTGTTGTACACAACTGTAAACAATACCAA GATTTAATTAAAGACATTTTTGAAAATGGTTATGAAACCGATGATCGTA CAGGCACAGGAACAATTGCTCTGTTCGGTACTAAATTACGCTGGGATTT AACTAAAGGTTTTCCTGCGGTAACAACTAAGAAGCTCGCCTGGAAAGCT TGCATTGCTGAGCTAATATGGTTTTTATCAGGAAGCACAAATGTCAATG ATTTACGATTAATACAACACGATTCGTTAATCCAAGGCAAAACAGTCTG GGATGAAAATTACGAAAATCAAGCAAAAGATTTAGGATACCATAGCGGT GAACTTGGTCCAATTTATGGAAAACAGTGGCGTGATTTTGGTGGTGTAG ACCAAATTATAGAAGTTATTGATCGTATTAAAAAACTGCCAAATGATAG GCGTCAAATTGTTTCTGCATGGAATCCAGCTGAACTTAAATATATGGCA TTACCGCCTTGTCATATGTTCTATCAGTTTAATGTGCGTAATGGCTATT TGGATTTGCAGTGGTATCAACGCTCAGTAGATGTTTTCTTGGGTCTACC GTTTAATATTGCGTCATATGCTACGTTAGTTCATATTGTAGCTAAGATG TGTAATCTTATTCCAGGGGATTTGATATTTTCTGGTGGTAATACTCATA TCTATATGAATCACGTAGAACAATGTAAAGAAATTTTGAGGCGTGAACC TAAAGAGCTCTGTGAACTAGTAATAAGTGGTCTACCTTATAAATTCCGA TATCTTTCTACTAAAGAACAATTAAAATATGTTCTTAAACTTAGGCCTA AAGATTTCGTTCTTAACAACTATGTATCACACCCTCCTATTAAAGGAAA GATGGCGGTGTAATCTAGAGTCGACCTGCAGGCAAGCTTGGCACTGGCC GTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTA ATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGA GGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAA TGGCAGCTTGGCTGTTTTGGCGGATGAGATAAGATTTTCAGCCTGATAC AGATTAAATCAGAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGG CGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTG AAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAG GGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGG CCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGAC AAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGG TGGCGGGCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGA AGGCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTTGT 
TTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACC CTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAA CATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGT TTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAG

93 TTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGA TCCTTGAGAGTTTTCGCCCCGAAGAACGTTCTCCAATGATGAGCACTTT TAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAA GAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGT ACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGA ATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTA CTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACA ACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAA TGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATG GCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTT CCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACC ACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCT GGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAG ATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGC AACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTG ATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGA TTGATTTACCCCGGTTGATAATCAGAAAAGCCCCAAAAACAGGAAGATT GTATAAGCAAATATTTAAATTGTAAACGTTAATATTTTGTTAAAATTCG CGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAAT CGGCAAAATCCCTTATAAATCAAAAGAATAGCCCGAGATAGGGTTGAGT GTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCA ACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGA ACCATCACCCAAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTA AATCGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGC CGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCG CTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACC CGCCGCGCTTAATGCGCCGCTACAGGGCGCGTAAAAGGATCTAGGTGAA GATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCG TTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAG ATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACC GCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTT CCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTC TAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCC TACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGC GATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATA AGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTT GGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGA GAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAA GCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAA 
ACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGA GCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAAC GCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTG CTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATT

94 ACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGC GCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATT TTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTC AGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCT ATCGCTACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCC GCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGAC AAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTC ATCACCGAAACGCGCGAGGCAGCTGCGGTAAAGCTCATCAGCGTGGTCG TGCAGCGATTCACAGATGTCTGCCTGTTCATCCGCGTCCAGCTCGTTGA GTTTCTCCAGAAGCGTTAATGTCTGGCTTCTGATAAAGCGGGCCATGTT AAGGGCGGTTTTTTCCTGTTTGGTCACTTGATGCCTCCGTGTAAGGGGG AATTTCTGTTCATGGGGGTAATGATACCGATGAAACGAGAGAGGATGCT CACGATACGGGTTACTGATGATGAACATGCCCGGTTACTGGAACGTTGT GAGGGTAAACAACTGGCGGTATGGATGCGGCGGGACCAGAGAAAAATC ACTCAGGGTCAATGCCAGCGCTTCGTTAATACAGATGTAGGTGTTCCAC AGGGTAGCCAGCAGCATCCTGCGATGCAGATCCGGAACATAATGGTGCA GGGCGCTGACTTCCGCGTTTCCAGACTTTACGAAACACGGAAACCGAAG ACCATTCATGTTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGC TTCACGTTCGCTCGCGTATCGGTGATTCATTCTGCTAACCAGTAAGGCA ACCCCGCCAGCCTAGCCGGGTCCTCAACGACAGGAGCACGATCATGCGC ACCCGTGGCCAGGACCCAACGCTGCCCGAAATT

B.2 pMIT:ERβ Cow

CCGACACCATCGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAG CGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGT TATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCG CGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTG GAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAAC AACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCT GGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCC GATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGC GTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCA GTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGT GGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGAC CAGACACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGAC TGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGTT AGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGG CATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAG GCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAA TGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCG CTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGG ATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGTTATAT CCCGCCGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACC AGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCA ATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCC CAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAG CTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCA ATTAATGTGAGTTAGCTCACTCATTAGGCACAATTCTCATGTTTGACAG CTTATCATCGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCA TCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCG TGTCGCTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACA TCATAACGGTTCTGGCAAATATTCTGAAATGAGCTGTTGACAATTAATC ATCGGCTCGTATAATGTGTGGAATTGTGAACGGATAACAATTTCACACA GGAAACAGCCAGTCCGTTTAGGTGTTTTCACGAGCACTTCACCAACAAG GACCATAGATTATGAAAATCGAAGAAGGTAAACTGGTAATCTGGATTAA CGGCGATAAAGGCTATAACGGTCTCGCTGAAGTCGGTAAGAAATTCGAG AAAGATACCGGAATTAAAGTCACCGTTGAGCATCCGGATAAACTGGAAG AGAAATTCCCACAGGTTGCGGCAACTGGCGATGGCCCTGACATTATCTT CTGGGCACACGACCGCTTTGGTGGCTACGCTCAATCTGGCCTGTTGGCT GAAATCACCCCGGACAAAGCGTTCCAGGACAAGCTGTATCCGTTTACCT GGGATGCCGTACGTTACAACGGCAAGCTGATTGCTTACCCGATCGCTGT TGAAGCGTTATCGCTGATTTATAACAAAGATCTGCTGCCGAACCCGCCA AAAACCTGGGAAGAGATCCCGGCGCTGGATAAAGAACTGAAAGCGAAAG

96 GTAAGAGCGCGCTGATGTTCAACCTGCAAGAACCGTACTTCACCTGGCC GCTGATTGCTGCTGACGGGGGTTATGCGTTCAAGTATGAAAACGGCAAG TACGACATTAAAGACGTGGGCGTGGATAACGCTGGCGCGAAAGCGGGTC TGACCTTCCTGGTTGACCTGATTAAAAACAAACACATGAATGCAGACAC CGATTACTCCATCGCAGAAGCTGCCTTTAATAAAGGCGAAACAGCGATG ACCATCAACGGCCCGTGGGCATGGTCCAACATCGACACCAGCAAAGTGA ATTATGGTGTAACGGTACTGCCGACCTTCAAGGGTCAACCATCCAAACC GTTCGTTGGCGTGCTGAGCGCAGGTATTAACGCCGCCAGTCCGAACAAA GAGCTGGCAAAAGAGTTCCTCGAAAACTATCTGCTGACTGATGAAGGTC TGGAAGCGGTTAATAAAGACAAACCGCTGGGTGCCGTAGCGCTGAAGTC TTACGAGGAAGAGTTGGCGAAAGATCCACGTATTGCCGCCACCATGGAA AACGCCCAGAAAGGTGAAATCATGCCGAACATCCCGCAGATGTCCGCTT TCTGGTATGCCGTGCGTACTGCGGTGATCAACGCCGCCAGCGGTCGTCA GACTGTCGATGAAGCCCTGAAAGACGCGCAGACTAATTCGAGCTCGAAC AACAACAACAATAACAATAACAACAACCTCGGGATCGAGGGAAGGATTT CAGAATTCGCCCTCGCAGAGGGCACTCGGATCTTCGATCCGGTCACCGG TACAACGCATCGCATCGAGGATGTTGTCGATGGGCGCAAGCCTATTCAT GTCGTGGCTGCTGCCAAGGACGGAACGCTGCATGCGCGGCCCGTGGTGT CCTGGTTCGACCAGGGAACGCGGGATGTGATCGGGTTGCGGATCGCCGG TGGCGCCATCGTGTGGGCGACACCCGATCACAAGGTGCTGACAGAGTAC GGCTGGCGTGCCGCGGGGGAACTCCGCAAGGGAGACAGGGTGGCGCAA CCGCGACGCTTCGATGGATTCGGTGACAGTGCGCCGATTCCGGCGGACG AGTGAAAGAGCTGCTGCTGAGCGCCCTGAGCCCAGAGCAGCTGGTGCTT ACGCTCCTGGAGGCCGAGCCGCCCCACGTGCTCATAAGCCGCCCCAGCA CGCCCTTCACTGAGGCCTCCATGATGATGTCCCTCACCAAGCTGGCCGA CAAGGAACTGGTACACATGATCAGCTGGGCCAAGAAGATTCCGGGCTTC GTGGAGCTCAGCCTGTACGACCAAGTGCGGCTTTTGGAGAGCTGCTGGT TGGAGGTGCTCATGGTGGGGCTGATGTGGCGCTCCATCGACCACCCTGG CAAGCTCATCTTTGCTCCAGACCTCATTCTGGACAGGGATGAAGGGAAA TGTGTTGAAGGAATTCTAGAAATCTTTGACATGCTCCTGGCAACGACTT CAAGGTTTCGTGAGTTAAAACTCCAACACAAAGAATATCTCTGTGTCAA GGCCATGATCCTCCTCAACTCCAGTATGTACCCTTCAGCTACAGCACCCC AGGAGGCTGACAGTGGCCGGAAGCTGACTCACCTGCTGAATGCTGTGAC GGACGCTCTGGTCTGGGTGATTGCCAAGAGTGGCATGTCCTCCCAGCAG CAGTCCATGCGCCTGGCTAACCTGCTGATGCTCCTGTCTCACGTCAGGC ACGCCAGTAACAAGGGCATGGAACACCTGCTCAACATGAAGTGCAAAAA CGTGGTCCCCGTCTACGATCTGCTGCTGGAGATGCTGAATGCCCACACA CTTCGCGGCAACGCGGATGCCCTGGATGACAAATTCCTGCACGACATGC TGGCGGAAGAACTCCGCTATTCCGTGATCCGAGAAGTGCTGCCAACGCG 
GCGGGCACGAACGTTCGACCTCGAGGTCGAGGAACTGCACACCCTCGTC GCCGAAGGGGTTGTTGTACACAACTGTAAACAATACCAAGATTTAATTA AAGACATTTTTGAAAATGGTTATGAAACCGATGATCGTACAGGCACAGG AACAATTGCTCTGTTCGGTACTAAATTACGCTGGGATTTAACTAAAGGT

97 TTTCCTGCGGTAACAACTAAGAAGCTCGCCTGGAAAGCTTGCATTGCTG AGCTAATATGGTTTTTATCAGGAAGCACAAATGTCAATGATTTACGATT AATACAACACGATTCGTTAATCCAAGGCAAAACAGTCTGGGATGAAAAT TACGAAAATCAAGCAAAAGATTTAGGATACCATAGCGGTGAACTTGGTC CAATTTATGGAAAACAGTGGCGTGATTTTGGTGGTGTAGACCAAATTAT AGAAGTTATTGATCGTATTAAAAAACTGCCAAATGATAGGCGTCAAATT GTTTCTGCATGGAATCCAGCTGAACTTAAATATATGGCATTACCGCCTT GTCATATGTTCTATCAGTTTAATGTGCGTAATGGCTATTTGGATTTGCA GTGGTATCAACGCTCAGTAGATGTTTTCTTGGGTCTACCGTTTAATATT GCGTCATATGCTACGTTAGTTCATATTGTAGCTAAGATGTGTAATCTTA TTCCAGGGGATTTGATATTTTCTGGTGGTAATACTCATATCTATATGAA TCACGTAGAACAATGTAAAGAAATTTTGAGGCGTGAACCTAAAGAGCTC TGTGAACTAGTAATAAGTGGTCTACCTTATAAATTCCGATATCTTTCTA CTAAAGAACAATTAAAATATGTTCTTAAACTTAGGCCTAAAGATTTCGT TCTTAACAACTATGTATCACACCCTCCTATTAAAGGAAAGATGGCGGTG TAATCTAGAGTCGACCTGCAGGCAAGCTTGGCACTGGCCGTCGTTTTAC AACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGC AGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACC GATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCAGCTTG GCTGTTTTGGCGGATGAGATAAGATTTTCAGCCTGATACAGATTAAATC AGAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGC GCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTA GCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCA GGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTT TATCTGTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCG GGAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGGGCA GGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCATCC TGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTTGTTTATTTTTCT AAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAAT GCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGT GTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCA CCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCA CGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGA GTTTTCGCCCCGAAGAACGTTCTCCAATGATGAGCACTTTTAAAGTTCT GCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAGCAACTC GGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAG TCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAG TGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACA ACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGG 
ATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCAT ACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACG TTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAAC AATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCG

98 CTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGT GAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGC CCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGA TGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCAT TGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAC CCCGGTTGATAATCAGAAAAGCCCCAAAAACAGGAAGATTGTATAAGCA AATATTTAAATTGTAAACGTTAATATTTTGTTAAAATTCGCGTTAAATT TTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAAT CCCTTATAAATCAAAAGAATAGCCCGAGATAGGGTTGAGTGTTGTTCCA GTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAAAG GGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCACC CAAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAAC CCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAACG TGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGCGC TGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCGCT TAATGCGCCGCTACAGGGCGCGTAAAAGGATCTAGGTGAAGATCCTTTT TGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGA GCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTT TTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGC GGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTA ACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGC CGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCT CGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCG TGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGC GGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAAC GACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCC ACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGG GTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGG TATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGAT TTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAA CGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATG TTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTT TGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAG TCAGTGAGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTA CGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAAT CTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGCTAC GTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCGCTGACGC GCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTG ACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGA 
AACGCGCGAGGCAGCTGCGGTAAAGCTCATCAGCGTGGTCGTGCAGCGA TTCACAGATGTCTGCCTGTTCATCCGCGTCCAGCTCGTTGAGTTTCTCC AGAAGCGTTAATGTCTGGCTTCTGATAAAGCGGGCCATGTTAAGGGCGG TTTTTTCCTGTTTGGTCACTTGATGCCTCCGTGTAAGGGGGAATTTCTG

99 TTCATGGGGGTAATGATACCGATGAAACGAGAGAGGATGCTCACGATAC GGGTTACTGATGATGAACATGCCCGGTTACTGGAACGTTGTGAGGGTAA ACAACTGGCGGTATGGATGCGGCGGGACCAGAGAAAAATCACTCAGGGT CAATGCCAGCGCTTCGTTAATACAGATGTAGGTGTTCCACAGGGTAGCC AGCAGCATCCTGCGATGCAGATCCGGAACATAATGGTGCAGGGCGCTGA CTTCCGCGTTTCCAGACTTTACGAAACACGGAAACCGAAGACCATTCAT GTTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTCACGTTC GCTCGCGTATCGGTGATTCATTCTGCTAACCAGTAAGGCAACCCCGCCA GCCTAGCCGGGTCCTCAACGACAGGAGCACGATCATGCGCACCCGTGGC CAGGACCCAACGCTGCCCGAAATT

B.3 pMIT:ERβ Rat

CCGACACCATCGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAG CGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGT TATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCG CGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTG GAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAAC AACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCT GGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCC GATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGC GTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCA GTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGT GGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGAC CAGACACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGAC TGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGTT AGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGG CATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAG GCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAA TGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCG CTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGG ATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGTTATAT CCCGCCGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACC AGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCA ATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCC CAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAG CTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCA ATTAATGTGAGTTAGCTCACTCATTAGGCACAATTCTCATGTTTGACAG CTTATCATCGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCA TCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCG TGTCGCTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACA TCATAACGGTTCTGGCAAATATTCTGAAATGAGCTGTTGACAATTAATC ATCGGCTCGTATAATGTGTGGAATTGTGAACGGATAACAATTTCACACA GGAAACAGCCAGTCCGTTTAGGTGTTTTCACGAGCACTTCACCAACAAG GACCATAGATTATGAAAATCGAAGAAGGTAAACTGGTAATCTGGATTAA CGGCGATAAAGGCTATAACGGTCTCGCTGAAGTCGGTAAGAAATTCGAG AAAGATACCGGAATTAAAGTCACCGTTGAGCATCCGGATAAACTGGAAG AGAAATTCCCACAGGTTGCGGCAACTGGCGATGGCCCTGACATTATCTT CTGGGCACACGACCGCTTTGGTGGCTACGCTCAATCTGGCCTGTTGGCT GAAATCACCCCGGACAAAGCGTTCCAGGACAAGCTGTATCCGTTTACCT GGGATGCCGTACGTTACAACGGCAAGCTGATTGCTTACCCGATCGCTGT TGAAGCGTTATCGCTGATTTATAACAAAGATCTGCTGCCGAACCCGCCA AAAACCTGGGAAGAGATCCCGGCGCTGGATAAAGAACTGAAAGCGAAAG

101 GTAAGAGCGCGCTGATGTTCAACCTGCAAGAACCGTACTTCACCTGGCC GCTGATTGCTGCTGACGGGGGTTATGCGTTCAAGTATGAAAACGGCAAG TACGACATTAAAGACGTGGGCGTGGATAACGCTGGCGCGAAAGCGGGTC TGACCTTCCTGGTTGACCTGATTAAAAACAAACACATGAATGCAGACAC CGATTACTCCATCGCAGAAGCTGCCTTTAATAAAGGCGAAACAGCGATG ACCATCAACGGCCCGTGGGCATGGTCCAACATCGACACCAGCAAAGTGA ATTATGGTGTAACGGTACTGCCGACCTTCAAGGGTCAACCATCCAAACC GTTCGTTGGCGTGCTGAGCGCAGGTATTAACGCCGCCAGTCCGAACAAA GAGCTGGCAAAAGAGTTCCTCGAAAACTATCTGCTGACTGATGAAGGTC TGGAAGCGGTTAATAAAGACAAACCGCTGGGTGCCGTAGCGCTGAAGTC TTACGAGGAAGAGTTGGCGAAAGATCCACGTATTGCCGCCACCATGGAA AACGCCCAGAAAGGTGAAATCATGCCGAACATCCCGCAGATGTCCGCTT TCTGGTATGCCGTGCGTACTGCGGTGATCAACGCCGCCAGCGGTCGTCA GACTGTCGATGAAGCCCTGAAAGACGCGCAGACTAATTCGAGCTCGAAC AACAACAACAATAACAATAACAACAACCTCGGGATCGAGGGAAGGATTT CAGAATTCGCCCTCGCAGAGGGCACTCGGATCTTCGATCCGGTCACCGG TACAACGCATCGCATCGAGGATGTTGTCGATGGGCGCAAGCCTATTCAT GTCGTGGCTGCTGCCAAGGACGGAACGCTGCATGCGCGGCCCGTGGTGT CCTGGTTCGACCAGGGAACGCGGGATGTGATCGGGTTGCGGATCGCCGG TGGCGCCATCGTGTGGGCGACACCCGATCACAAGGTGCTGACAGAGTAC GGCTGGCGTGCCGCGGGGGAACTCCGCAAGGGAGACAGGGTGGCGCAA CCGCGACGCTTCGATGGATTCGGTGACAGTGCGCCGATTCCGGCGGACA GTCCCGAGCAGCTGGTGCTCACCCTGCTGGAAGCTGAGCCACCCAATGT GCTAGTGAGCCGTCCCAGCATGCCCTTCACCGAGGCCTCCATGATGATG TCCCTCACGAAGCTGGCTGACAAGGAACTGGTGCACATGATTGGCTGGG CCAAGAAAATCCCTGGCTTTGTGGAGCTCAGCCTGTTGGACCAAGTCCG CCTCTTGGAAAGCTGCTGGATGGAGGTGCTGATGGTGGGGCTGATGTG GCGCTCCATCGACCACCCCGGCAAGCTCATCTTTGCTCCAGACCTCGTTC TGGACAGGGATGAGGGGAAGTGCGTGGAAGGGATTCTGGAAATCTTTG ACATGCTCCTGGCGACGACGGCACGGTTCCGTGAGTTAAAACTGCAGCA CAAAGAATATCTGTGTGTGAAGGCCATGATTCTCCTCAACTCCAGTATG TACCCCTTGGCTACCGCAAGCCAGGAAGCAGAGAGTAGCCGGAAGCTGA CACACCTATTGAACGCAGTGACAGATGCCCTGGTCTGGGTGATTTCGAA GAGTGGAATCTCTTCCCAGCAGCAGTCAGTCCGTCTGGCCAACCTCCTG ATGCTTCTTTCTCATGTCAGGCACATCAGTAACAAGGGCATGGAACATC TGCTCAGCATGAAGTGCAAAAATGTGGTCCCGGTGTACGACCTGCTGCT GGAGATGCTGAATGCTCACGCGGATGCCCTGGATGACAAATTCCTGCAC GACATGCTGGCGGAAGAACTCCGCTATTCCGTGATCCGAGAAGTGCTGC CAACGCGGCGGGCACGAACGTTCGACCTCGAGGTCGAGGAACTGCACAC 
CCTCGTCGCCGAAGGGGTTGTTGTACACAACTGTAAACAATACCAAGAT TTAATTAAAGACATTTTTGAAAATGGTTATGAAACCGATGATCGTACAG GCACAGGAACAATTGCTCTGTTCGGTACTAAATTACGCTGGGATTTAAC TAAAGGTTTTCCTGCGGTAACAACTAAGAAGCTCGCCTGGAAAGCTTGC

102 ATTGCTGAGCTAATATGGTTTTTATCAGGAAGCACAAATGTCAATGATT TACGATTAATACAACACGATTCGTTAATCCAAGGCAAAACAGTCTGGGA TGAAAATTACGAAAATCAAGCAAAAGATTTAGGATACCATAGCGGTGAA CTTGGTCCAATTTATGGAAAACAGTGGCGTGATTTTGGTGGTGTAGACC AAATTATAGAAGTTATTGATCGTATTAAAAAACTGCCAAATGATAGGCG TCAAATTGTTTCTGCATGGAATCCAGCTGAACTTAAATATATGGCATTA CCGCCTTGTCATATGTTCTATCAGTTTAATGTGCGTAATGGCTATTTGG ATTTGCAGTGGTATCAACGCTCAGTAGATGTTTTCTTGGGTCTACCGTT TAATATTGCGTCATATGCTACGTTAGTTCATATTGTAGCTAAGATGTGT AATCTTATTCCAGGGGATTTGATATTTTCTGGTGGTAATACTCATATCT ATATGAATCACGTAGAACAATGTAAAGAAATTTTGAGGCGTGAACCTAA AGAGCTCTGTGAACTAGTAATAAGTGGTCTACCTTATAAATTCCGATAT CTTTCTACTAAAGAACAATTAAAATATGTTCTTAAACTTAGGCCTAAAG ATTTCGTTCTTAACAACTATGTATCACACCCTCCTATTAAAGGAAAGAT GGCGGTGTAATCTAGAGTCGACCTGCAGGCAAGCTTGGCACTGGCCGTC GTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATC GCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGC CCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGG CAGCTTGGCTGTTTTGGCGGATGAGATAAGATTTTCAGCCTGATACAGA TTAAATCAGAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGG CAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAA CGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGA ACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCT TTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAA TCCGCCGGGAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTG GCGGGCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAG GCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTTGTTTA TTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTG ATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACAT TTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTT TGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTG GGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCC TTGAGAGTTTTCGCCCCGAAGAACGTTCTCCAATGATGAGCACTTTTAA AGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAG CAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACT CACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATT ATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTT CTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACA TGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGA 
AGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCA ACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCC GGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACT TCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGA

103 GCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATG GTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAAC TATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATT AAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTG ATTTACCCCGGTTGATAATCAGAAAAGCCCCAAAAACAGGAAGATTGTA TAAGCAAATATTTAAATTGTAAACGTTAATATTTTGTTAAAATTCGCGT TAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGG CAAAATCCCTTATAAATCAAAAGAATAGCCCGAGATAGGGTTGAGTGTT GTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACG TCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACC ATCACCCAAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAAT CGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGG CGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTA GGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGC CGCGCTTAATGCGCCGCTACAGGGCGCGTAAAAGGATCTAGGTGAAGAT CCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCC ACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCC TTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTA CCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGA AGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGT GTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACA TACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATA AGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGC GCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGA GCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAA AGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCG GCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACG CCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCG TCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCC AGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTC ACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACC GCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCA GCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCT CCTTACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGT ACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATC GCTACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCGCT GACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAG CTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATC ACCGAAACGCGCGAGGCAGCTGCGGTAAAGCTCATCAGCGTGGTCGTGC 
AGCGATTCACAGATGTCTGCCTGTTCATCCGCGTCCAGCTCGTTGAGTT TCTCCAGAAGCGTTAATGTCTGGCTTCTGATAAAGCGGGCCATGTTAAG GGCGGTTTTTTCCTGTTTGGTCACTTGATGCCTCCGTGTAAGGGGGAAT TTCTGTTCATGGGGGTAATGATACCGATGAAACGAGAGAGGATGCTCAC

GATACGGGTTACTGATGATGAACATGCCCGGTTACTGGAACGTTGTGAG GGTAAACAACTGGCGGTATGGATGCGGCGGGACCAGAGAAAAATCACTC AGGGTCAATGCCAGCGCTTCGTTAATACAGATGTAGGTGTTCCACAGGG TAGCCAGCAGCATCCTGCGATGCAGATCCGGAACATAATGGTGCAGGGC GCTGACTTCCGCGTTTCCAGACTTTACGAAACACGGAAACCGAAGACCA TTCATGTTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTCA CGTTCGCTCGCGTATCGGTGATTCATTCTGCTAACCAGTAAGGCAACCC CGCCAGCCTAGCCGGGTCCTCAACGACAGGAGCACGATCATGCGCACCC GTGGCCAGGACCCAACGCTGCCCGAAATT

B.4 pMIT:ERβ Zebrafish

CCGACACCATCGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAG CGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGT TATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCG CGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTG GAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAAC AACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCT GGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCC GATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGC GTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCA GTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGT GGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGAC CAGACACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGAC TGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGTT AGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGG CATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAG GCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAA TGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCG CTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGG ATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGTTATAT CCCGCCGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACC AGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCA ATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCC CAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAG CTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCA ATTAATGTGAGTTAGCTCACTCATTAGGCACAATTCTCATGTTTGACAG CTTATCATCGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCA TCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCG TGTCGCTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACA TCATAACGGTTCTGGCAAATATTCTGAAATGAGCTGTTGACAATTAATC ATCGGCTCGTATAATGTGTGGAATTGTGAACGGATAACAATTTCACACA GGAAACAGCCAGTCCGTTTAGGTGTTTTCACGAGCACTTCACCAACAAG GACCATAGATTATGAAAATCGAAGAAGGTAAACTGGTAATCTGGATTAA CGGCGATAAAGGCTATAACGGTCTCGCTGAAGTCGGTAAGAAATTCGAG AAAGATACCGGAATTAAAGTCACCGTTGAGCATCCGGATAAACTGGAAG AGAAATTCCCACAGGTTGCGGCAACTGGCGATGGCCCTGACATTATCTT CTGGGCACACGACCGCTTTGGTGGCTACGCTCAATCTGGCCTGTTGGCT GAAATCACCCCGGACAAAGCGTTCCAGGACAAGCTGTATCCGTTTACCT GGGATGCCGTACGTTACAACGGCAAGCTGATTGCTTACCCGATCGCTGT TGAAGCGTTATCGCTGATTTATAACAAAGATCTGCTGCCGAACCCGCCA AAAACCTGGGAAGAGATCCCGGCGCTGGATAAAGAACTGAAAGCGAAAG

106 GTAAGAGCGCGCTGATGTTCAACCTGCAAGAACCGTACTTCACCTGGCC GCTGATTGCTGCTGACGGGGGTTATGCGTTCAAGTATGAAAACGGCAAG TACGACATTAAAGACGTGGGCGTGGATAACGCTGGCGCGAAAGCGGGTC TGACCTTCCTGGTTGACCTGATTAAAAACAAACACATGAATGCAGACAC CGATTACTCCATCGCAGAAGCTGCCTTTAATAAAGGCGAAACAGCGATG ACCATCAACGGCCCGTGGGCATGGTCCAACATCGACACCAGCAAAGTGA ATTATGGTGTAACGGTACTGCCGACCTTCAAGGGTCAACCATCCAAACC GTTCGTTGGCGTGCTGAGCGCAGGTATTAACGCCGCCAGTCCGAACAAA GAGCTGGCAAAAGAGTTCCTCGAAAACTATCTGCTGACTGATGAAGGTC TGGAAGCGGTTAATAAAGACAAACCGCTGGGTGCCGTAGCGCTGAAGTC TTACGAGGAAGAGTTGGCGAAAGATCCACGTATTGCCGCCACCATGGAA AACGCCCAGAAAGGTGAAATCATGCCGAACATCCCGCAGATGTCCGCTT TCTGGTATGCCGTGCGTACTGCGGTGATCAACGCCGCCAGCGGTCGTCA GACTGTCGATGAAGCCCTGAAAGACGCGCAGACTAATTCGAGCTCGAAC AACAACAACAATAACAATAACAACAACCTCGGGATCGAGGGAAGGATTT CAGAATTCGCCCTCGCAGAGGGCACTCGGATCTTCGATCCGGTCACCGG TACAACGCATCGCATCGAGGATGTTGTCGATGGGCGCAAGCCTATTCAT GTCGTGGCTGCTGCCAAGGACGGAACGCTGCATGCGCGGCCCGTGGTGT CCTGGTTCGACCAGGGAACGCGGGATGTGATCGGGTTGCGGATCGCCGG TGGCGCCATCGTGTGGGCGACACCCGATCACAAGGTGCTGACAGAGTAC GGCTGGCGTGCCGCGGGGGAACTCCGCAAGGGAGACAGGGTGGCGCAA CCGCGACGCTTCGATGGATTCGGTGACAGTGCGCCGATTCCGGCGGACT CCCCTGAACAATTGGTTAGCTGTATTCTAGAGGCGGAGCCACCTCAAAT TTACCTGAGAGAGCCGGTGAAAAAGCCATACACTGAGGCTAGCATGATG ATGTCACTAACAAGCCTCGCCGACAAGGAGCTAGTGCTCATGATTAGCT GGGCGAAGAAGATACCAGGTTTTGTAGAGTTGACTTTGTCAGATCAGGT GCATTTGCTGGAATGCTGCTGGCTGGATATTCTGATGTTAGGATTGATG TGGAGATCTGTGGATCATCCTGGGAAACTCATCTTCACCCCTGACCTCA AGCTCAACAGGGAGGAAGGGAATTGTGTTGAAGGCATCATGGAGATTTT CGACATGCTGCTGGCCACCACCTCTCGATTCAGAGAGCTGAAGCTGCAG AGAGAGGAATACGTCTGTCTCAAAGCCATGATCCTGCTCAACTCTAATA ACTGTTCGAGTTTGCCACAGACTCCTGAGGATGTGGAGAGTCGCGGGAA GGTGCTGAATCTGCTGGACTCAGTGACCGATGCTCTGGTGTGGATCATC TCCAGAACGGGTCTGTCCTCACAACAACAGTCCATCCGGCTCGCTCATC TGCTAATGCTGCTCTCACACATCCGACACCTCAGCAACAAAGGCATCGA GCATCTGTCAAACATGAAGAGGAAAAACGTGGTGGCGGATGCCCTGGAT GACAAATTCCTGCACGACATGCTGGCGGAAGAACTCCGCTATTCCGTGA TCCGAGAAGTGCTGCCAACGCGGCGGGCACGAACGTTCGACCTCGAGGT CGAGGAACTGCACACCCTCGTCGCCGAAGGGGTTGTTGTACACAACTGT 
AAACAATACCAAGATTTAATTAAAGACATTTTTGAAAATGGTTATGAAA CCGATGATCGTACAGGCACAGGAACAATTGCTCTGTTCGGTACTAAATT ACGCTGGGATTTAACTAAAGGTTTTCCTGCGGTAACAACTAAGAAGCTC GCCTGGAAAGCTTGCATTGCTGAGCTAATATGGTTTTTATCAGGAAGCA

107 CAAATGTCAATGATTTACGATTAATACAACACGATTCGTTAATCCAAGG CAAAACAGTCTGGGATGAAAATTACGAAAATCAAGCAAAAGATTTAGGA TACCATAGCGGTGAACTTGGTCCAATTTATGGAAAACAGTGGCGTGATT TTGGTGGTGTAGACCAAATTATAGAAGTTATTGATCGTATTAAAAAACT GCCAAATGATAGGCGTCAAATTGTTTCTGCATGGAATCCAGCTGAACTT AAATATATGGCATTACCGCCTTGTCATATGTTCTATCAGTTTAATGTGC GTAATGGCTATTTGGATTTGCAGTGGTATCAACGCTCAGTAGATGTTTT CTTGGGTCTACCGTTTAATATTGCGTCATATGCTACGTTAGTTCATATT GTAGCTAAGATGTGTAATCTTATTCCAGGGGATTTGATATTTTCTGGTG GTAATACTCATATCTATATGAATCACGTAGAACAATGTAAAGAAATTTT GAGGCGTGAACCTAAAGAGCTCTGTGAACTAGTAATAAGTGGTCTACCT TATAAATTCCGATATCTTTCTACTAAAGAACAATTAAAATATGTTCTTAA ACTTAGGCCTAAAGATTTCGTTCTTAACAACTATGTATCACACCCTCCTA TTAAAGGAAAGATGGCGGTGTAATCTAGAGTCGACCTGCAGGCAAGCTT GGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTT ACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTA ATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCT GAATGGCGAATGGCAGCTTGGCTGTTTTGGCGGATGAGATAAGATTTTC AGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAACAGA ATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAA CTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCAT GCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCG AAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCC TGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGCAACG GCCCGGAGGGTGGCGGGCAGGACGCCCGCCATAAACTGCCAGGCATCAA ATTAAGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAA CTCTTTTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGA GACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTAT GAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTT GCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGC TGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAAC AGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTCTCCAATGA TGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGA CGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGAC TTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGA CAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGC GGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCT TTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAAC CGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCC 
TGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTT ACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAG TTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGC TGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCA

108 CTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGG GGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGG TGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATAT ATACTTTAGATTGATTTACCCCGGTTGATAATCAGAAAAGCCCCAAAAA CAGGAAGATTGTATAAGCAAATATTTAAATTGTAAACGTTAATATTTTG TTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTTTTTAACCAATA GGCCGAAATCGGCAAAATCCCTTATAAATCAAAAGAATAGCCCGAGATA GGGTTGAGTGTTGTTCCAGTTTGGAACAAGAGTCCACTATTAAAGAACG TGGACTCCAACGTCAAAGGGCGAAAAACCGTCTATCAGGGCGATGGCCC ACTACGTGAACCATCACCCAAATCAAGTTTTTTGGGGTCGAGGTGCCGT AAAGCACTAAATCGGAACCCTAAAGGGAGCCCCCGATTTAGAGCTTGAC GGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAG GAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAA CCACCACACCCGCCGCGCTTAATGCGCCGCTACAGGGCGCGTAAAAGGA TCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACG TGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGA TCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAA AAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCA ACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATA CTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGT AGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCT GCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGT TACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACAC AGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCG TGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAG GTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCT TCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCAC CTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCC TATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTG CTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGG ATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCG AACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCT GATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATA TGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGT ATACACTCCGCTATCGCTACGTGACTGGGTCATGGCTGCGCCCCGACAC CCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCAT CCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAG GTTTTCACCGTCATCACCGAAACGCGCGAGGCAGCTGCGGTAAAGCTCA TCAGCGTGGTCGTGCAGCGATTCACAGATGTCTGCCTGTTCATCCGCGT 
CCAGCTCGTTGAGTTTCTCCAGAAGCGTTAATGTCTGGCTTCTGATAAA GCGGGCCATGTTAAGGGCGGTTTTTTCCTGTTTGGTCACTTGATGCCTC CGTGTAAGGGGGAATTTCTGTTCATGGGGGTAATGATACCGATGAAACG AGAGAGGATGCTCACGATACGGGTTACTGATGATGAACATGCCCGGTTA

CTGGAACGTTGTGAGGGTAAACAACTGGCGGTATGGATGCGGCGGGAC CAGAGAAAAATCACTCAGGGTCAATGCCAGCGCTTCGTTAATACAGATG TAGGTGTTCCACAGGGTAGCCAGCAGCATCCTGCGATGCAGATCCGGAA CATAATGGTGCAGGGCGCTGACTTCCGCGTTTCCAGACTTTACGAAACA CGGAAACCGAAGACCATTCATGTTGTTGCTCAGGTCGCAGACGTTTTGC AGCAGCAGTCGCTTCACGTTCGCTCGCGTATCGGTGATTCATTCTGCTA ACCAGTAAGGCAACCCCGCCAGCCTAGCCGGGTCCTCAACGACAGGAGC ACGATCATGCGCACCCGTGGCCAGGACCCAACGCTGCCCGAAATT

B.5 pMIT:Nrx 2b

CCGACACCATCGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAG CGCCCGGAAGAGAGTCAATTCAGGGTGGTGAATGTGAAACCAGTAACGT TATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCG CGTGGTGAACCAGGCCAGCCACGTTTCTGCGAAAACGCGGGAAAAAGTG GAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAAC AACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACCTCCAGTCT GGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCC GATCAACTGGGTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGC GTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCGCGCAACGCGTCA GTGGGCTGATCATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGT GGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGAC CAGACACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGAC TGGGCGTGGAGCATCTGGTCGCATTGGGTCACCAGCAAATCGCGCTGTT AGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGG CATAAATATCTCACTCGCAATCAAATTCAGCCGATAGCGGAACGGGAAG GCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAA TGAGGGCATCGTTCCCACTGCGATGCTGGTTGCCAACGATCAGATGGCG CTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGG ATATCTCGGTAGTGGGATACGACGATACCGAAGACAGCTCATGTTATAT CCCGCCGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACC AGCGTGGACCGCTTGCTGCAACTCTCTCAGGGCCAGGCGGTGAAGGGCA ATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCC CAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAG CTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCA ATTAATGTGAGTTAGCTCACTCATTAGGCACAATTCTCATGTTTGACAG CTTATCATCGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCA TCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCG TGTCGCTCAAGGCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACA TCATAACGGTTCTGGCAAATATTCTGAAATGAGCTGTTGACAATTAATC ATCGGCTCGTATAATGTGTGGAATTGTGAACGGATAACAATTTCACACA GGAAACAGCCAGTCCGTTTAGGTGTTTTCACGAGCACTTCACCAACAAG GACCATAGATTATGAAAATCGAAGAAGGTAAACTGGTAATCTGGATTAA CGGCGATAAAGGCTATAACGGTCTCGCTGAAGTCGGTAAGAAATTCGAG AAAGATACCGGAATTAAAGTCACCGTTGAGCATCCGGATAAACTGGAAG AGAAATTCCCACAGGTTGCGGCAACTGGCGATGGCCCTGACATTATCTT CTGGGCACACGACCGCTTTGGTGGCTACGCTCAATCTGGCCTGTTGGCT GAAATCACCCCGGACAAAGCGTTCCAGGACAAGCTGTATCCGTTTACCT GGGATGCCGTACGTTACAACGGCAAGCTGATTGCTTACCCGATCGCTGT TGAAGCGTTATCGCTGATTTATAACAAAGATCTGCTGCCGAACCCGCCA AAAACCTGGGAAGAGATCCCGGCGCTGGATAAAGAACTGAAAGCGAAAG

111 GTAAGAGCGCGCTGATGTTCAACCTGCAAGAACCGTACTTCACCTGGCC GCTGATTGCTGCTGACGGGGGTTATGCGTTCAAGTATGAAAACGGCAAG TACGACATTAAAGACGTGGGCGTGGATAACGCTGGCGCGAAAGCGGGTC TGACCTTCCTGGTTGACCTGATTAAAAACAAACACATGAATGCAGACAC CGATTACTCCATCGCAGAAGCTGCCTTTAATAAAGGCGAAACAGCGATG ACCATCAACGGCCCGTGGGCATGGTCCAACATCGACACCAGCAAAGTGA ATTATGGTGTAACGGTACTGCCGACCTTCAAGGGTCAACCATCCAAACC GTTCGTTGGCGTGCTGAGCGCAGGTATTAACGCCGCCAGTCCGAACAAA GAGCTGGCAAAAGAGTTCCTCGAAAACTATCTGCTGACTGATGAAGGTC TGGAAGCGGTTAATAAAGACAAACCGCTGGGTGCCGTAGCGCTGAAGTC TTACGAGGAAGAGTTGGCGAAAGATCCACGTATTGCCGCCACCATGGAA AACGCCCAGAAAGGTGAAATCATGCCGAACATCCCGCAGATGTCCGCTT TCTGGTATGCCGTGCGTACTGCGGTGATCAACGCCGCCAGCGGTCGTCA GACTGTCGATGAAGCCCTGAAAGACGCGCAGACTAATTCGAGCTCGAAC AACAACAACAATAACAATAACAACAACCTCGGGATCGAGGGAAGGATTT CAGAATTCGCCCTCGCAGAGGGCACTCGGATCTTCGATCCGGTCACCGG TACAACGCATCGCATCGAGGATGTTGTCGATGGGCGCAAGCCTATTCAT GTCGTGGCTGCTGCCAAGGACGGAACGCTGCATGCGCGGCCCGTGGTGT CCTGGTTCGACCAGGGAACGCGGGATGTGATCGGGTTGCGGATCGCCGG TGGCGCCATCGTGTGGGCGACACCCGATCACAAGGTGCTGACAGAGTAC GGCTGGCGTGCCGCGGGGGAACTCCGCAAGGGAGACAGGGTGGCGCAA CCGCGACGCTTCGATGGATTCGGTGACAGTGCGCCGATTCCGGCGGACG GGACCACATACATCTTTGGGAAGGGGGGAGCGCTCATCACCTACACGTG GCCCCCCAATGACAGGCCCAGCACGAGGATGGATCGCCTGGCCGTGGGC TTCAGCACCCACCAGCGGAGCGCTGTGCTGGTGCGGGTGGACAGCGCCT CCGGCCTTGGAGACTACCTGCAGCTGCACATCGACCAGGGCACCGTGGG GGTGATCTTTAACGTGGGCACGGACGACATTACCATCGACGAGCCCAAC GCCATAGTAAGCGACGGCAAATACCACGTGGTGCGCTTCACTCGAAGCG GCGGCAACGCCACCCTGCAGGTGGACAGCTGGCCGGTCAACGAGCGGTA CCCGGCAGGCCGCCAGCTGACCATCTTCAACAGCCAGGCTGCCATCAAG ATCGGGGGCCGGGATCAGGGCCGCCCCTTCCAGGGCCAGGTGTCCGGCC TCTACTACAATGGGCTCAAGGTGCTGGCGCTGGCCGCCGAGAGCGACCC CAATGTGCGGACTGAGGGTCACCTGCGCCTGGTGGGGGAGGGGGACAT GCTGGCGGAAGAACTCCGCTATTCCGTGATCCGAGAAGTGCTGCCAACG CGGCGGGCACGAACGTTCGACCTCGAGGTCGAGGAACTGCACACCCTCG TCGCCGAAGGGGTTGTTGTACACAACTGTAAACAATACCAAGATTTAAT TAAAGACATTTTTGAAAATGGTTATGAAACCGATGATCGTACAGGCACA GGAACAATTGCTCTGTTCGGTACTAAATTACGCTGGGATTTAACTAAAG GTTTTCCTGCGGTAACAACTAAGAAGCTCGCCTGGAAAGCTTGCATTGC 
TGAGCTAATATGGTTTTTATCAGGAAGCACAAATGTCAATGATTTACGA TTAATACAACACGATTCGTTAATCCAAGGCAAAACAGTCTGGGATGAAA ATTACGAAAATCAAGCAAAAGATTTAGGATACCATAGCGGTGAACTTGG TCCAATTTATGGAAAACAGTGGCGTGATTTTGGTGGTGTAGACCAAATT

112 ATAGAAGTTATTGATCGTATTAAAAAACTGCCAAATGATAGGCGTCAAA TTGTTTCTGCATGGAATCCAGCTGAACTTAAATATATGGCATTACCGCC TTGTCATATGTTCTATCAGTTTAATGTGCGTAATGGCTATTTGGATTTG CAGTGGTATCAACGCTCAGTAGATGTTTTCTTGGGTCTACCGTTTAATA TTGCGTCATATGCTACGTTAGTTCATATTGTAGCTAAGATGTGTAATCT TATTCCAGGGGATTTGATATTTTCTGGTGGTAATACTCATATCTATATG AATCACGTAGAACAATGTAAAGAAATTTTGAGGCGTGAACCTAAAGAGC TCTGTGAACTAGTAATAAGTGGTCTACCTTATAAATTCCGATATCTTTC TACTAAAGAACAATTAAAATATGTTCTTAAACTTAGGCCTAAAGATTTC GTTCTTAACAACTATGTATCACACCCTCCTATTAAAGGAAAGATGGCGG TGTAATCTAGAGTCGACCTGCAGGCAAGCTTGGCACTGGCCGTCGTTTT ACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTT GCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCA CCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCAGCT TGGCTGTTTTGGCGGATGAGATAAGATTTTCAGCCTGATACAGATTAAA TCAGAACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTA GCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCG TAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGC CAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGT TTTATCTGTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGC CGGGAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGGG CAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCAT CCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTTGTTTATTTTT CTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAA ATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCC GTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCT CACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTG CACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGA GAGTTTTCGCCCCGAAGAACGTTCTCCAATGATGAGCACTTTTAAAGTT CTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAGCAAC TCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACC AGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGC AGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGA CAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGG GGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCC ATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAA CGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCA ACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTG CGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCG 
GTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAA GCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATG GATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGC ATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTT

113 ACCCCGGTTGATAATCAGAAAAGCCCCAAAAACAGGAAGATTGTATAAG CAAATATTTAAATTGTAAACGTTAATATTTTGTTAAAATTCGCGTTAAA TTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAA ATCCCTTATAAATCAAAAGAATAGCCCGAGATAGGGTTGAGTGTTGTTC CAGTTTGGAACAAGAGTCCACTATTAAAGAACGTGGACTCCAACGTCAA AGGGCGAAAAACCGTCTATCAGGGCGATGGCCCACTACGTGAACCATCA CCCAAATCAAGTTTTTTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGA ACCCTAAAGGGAGCCCCCGATTTAGAGCTTGACGGGGAAAGCCGGCGAA CGTGGCGAGAAAGGAAGGGAAGAAAGCGAAAGGAGCGGGCGCTAGGGC GCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCACCACACCCGCCGCG CTTAATGCGCCGCTACAGGGCGCGTAAAAGGATCTAGGTGAAGATCCTT TTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACT GAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTT TTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCA GCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGG TAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTA GCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATAC CTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGT CGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCA GCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCG AACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGC GCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCA GGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCT GGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCG ATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGC AACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACA TGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCC TTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCG AGTCAGTGAGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCT TACGCATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACA ATCTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGCT ACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCGCTGAC GCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTG TGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACC GAAACGCGCGAGGCAGCTGCGGTAAAGCTCATCAGCGTGGTCGTGCAGC GATTCACAGATGTCTGCCTGTTCATCCGCGTCCAGCTCGTTGAGTTTCT CCAGAAGCGTTAATGTCTGGCTTCTGATAAAGCGGGCCATGTTAAGGGC GGTTTTTTCCTGTTTGGTCACTTGATGCCTCCGTGTAAGGGGGAATTTC TGTTCATGGGGGTAATGATACCGATGAAACGAGAGAGGATGCTCACGAT 
ACGGGTTACTGATGATGAACATGCCCGGTTACTGGAACGTTGTGAGGGT AAACAACTGGCGGTATGGATGCGGCGGGACCAGAGAAAAATCACTCAGG GTCAATGCCAGCGCTTCGTTAATACAGATGTAGGTGTTCCACAGGGTAG CCAGCAGCATCCTGCGATGCAGATCCGGAACATAATGGTGCAGGGCGCT

GACTTCCGCGTTTCCAGACTTTACGAAACACGGAAACCGAAGACCATTC ATGTTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTCACGT TCGCTCGCGTATCGGTGATTCATTCTGCTAACCAGTAAGGCAACCCCGC CAGCCTAGCCGGGTCCTCAACGACAGGAGCACGATCATGCGCACCCGTG GCCAGGACCCAACGCTGCCCGAAATT

B.6 pET:Nrx-His6

TGGCGAATGGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGT GTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGC CCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTT CCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTG CTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACG TAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAG TCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCA ACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTC GGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAAT TTTAACAAAATATTAACGTTTACAATTTCAGGTGGCACTTTTCGGGGAA ATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATAT GTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGA AAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCT TTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGT GAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATC GAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAG AACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGT ATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACAC TATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATC TTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCAT GAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCG AAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCC TTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCG TGACACCACGATGCCTGCAGCAATGGCAACAACGTTGCGCAAACTATTA ACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGA TGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGC TGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGC GGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAG TTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACA GATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGAC CAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATT TAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATC CCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGA TCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTT GCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAA GAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGA TACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAA GAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCA GTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAA GACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTT

116 CGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATA CCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAG GCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACG AGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGT TTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGG GCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTG GCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGA TTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGC CGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAA GAGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCAC ACCGCATATATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAG TTAAGCCAGTATACACTCCGCTATCGCTACGTGACTGGGTCATGGCTGC GCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTG CTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCA TGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGGCAGCTGCG GTAAAGCTCATCAGCGTGGTCGTGAAGCGATTCACAGATGTCTGCCTGT TCATCCGCGTCCAGCTCGTTGAGTTTCTCCAGAAGCGTTAATGTCTGGC TTCTGATAAAGCGGGCCATGTTAAGGGCGGTTTTTTCCTGTTTGGTCAC TGATGCCTCCGTGTAAGGGGGATTTCTGTTCATGGGGGTAATGATACCG ATGAAACGAGAGAGGATGCTCACGATACGGGTTACTGATGATGAACATG CCCGGTTACTGGAACGTTGTGAGGGTAAACAACTGGCGGTATGGATGCG GCGGGACCAGAGAAAAATCACTCAGGGTCAATGCCAGCGCTTCGTTAAT ACAGATGTAGGTGTTCCACAGGGTAGCCAGCAGCATCCTGCGATGCAGA TCCGGAACATAATGGTGCAGGGCGCTGACTTCCGCGTTTCCAGACTTTA CGAAACACGGAAACCGAAGACCATTCATGTTGTTGCTCAGGTCGCAGAC GTTTTGCAGCAGCAGTCGCTTCACGTTCGCTCGCGTATCGGTGATTCAT TCTGCTAACCAGTAAGGCAACCCCGCCAGCCTAGCCGGGTCCTCAACGA CAGGAGCACGATCATGCGCACCCGTGGGGCCGCCATGCCGGCGATAATG GCCTGCTTCTCGCCGAAACGTTTGGTGGCGGGACCAGTGACGAAGGCTT GAGCGAGGGCGTGCAAGATTCCGAATACCGCAAGCGACAGGCCGATCAT CGTCGCGCTCCAGCGAAAGCGGTCCTCGCCGAAAATGACCCAGAGCGCT GCCGGCACCTGTCCTACGAGTTGCATGATAAAGAAGACAGTCATAAGTG CGGCGACGATAGTCATGCCCCGCGCCCACCGGAAGGAGCTGACTGGGTT GAAGGCTCTCAAGGGCATCGGTCGAGATCCCGGTGCCTAATGAGTGAGC TAACTTACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAA ACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGG CGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGAC GGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGAGAGTTGCAGC AAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGG 
TGGTTAACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCC CACTACCGAGATATCCGCACCAACGCGCAGCCCGGACTCGGTAATGGCG CGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTGG GAACGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACAT

117 GGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGA GTGAGATATTTATGCCAGCCAGCCAGACGCAGACGCGCCGAGACAGAAC TTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGACCCAATGCGACCAG ATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTG TTGATGGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAG TGCAGGCAGCTTCCACAGCAATGGCATCCTGGTCATCCAGCGGATAGTT AATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCT TTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCACCACGCTGG CACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTGCGACGG CGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGT TTGCCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCCG CCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGC CTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCT GCGACATCGTATAACGTTACTGGTTTCACATTCACCACCCTGAATTGAC TCTCTTCCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGCGCCATTC GATGGTGTCCGGGATCTCGACGCTCTCCCTTATGCGACTCCTGCATTAG GAAGCAGCCCAGTAGTAGGTTGAGGCCGTTGAGCACCGCCGCCGCAAGG AATGGTGCATGCAAGGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGC CTGCCACCATACCCACGCCGAAACAAGCGCTCATGAGCCCGAAGTGGCG AGCCCGATCTTCCCCATCGGTGATGTCGGCGATATAGGCGCCAGCAACC GCACCTGTGGCGCCGGTGATGCCGGCCACGATGCGTCCGGCGTAGAGGA TCGAGATCTCGATCCCGCGAAATTAATACGACTCACTATAGGGGAATTG TGAGCGGATAACAATTCCCCTCTAGGATCCAATAATTTTGTTTAACTTT AAGAAGGAGATATACATATGGGGACCACATACATCTTTGGGAAGGGGG GAGCGCTCATCACCTACACGTGGCCCCCCAATGACAGGCCCAGCACGAG GATGGATCGCCTGGCCGTGGGCTTCAGCACCCACCAGCGGAGCGCTGTG CTGGTGCGGGTGGACAGCGCCTCCGGCCTTGGAGACTACCTGCAGCTGC ACATCGACCAGGGCACCGTGGGGGTGATCTTTAACGTGGGCACGGACGA CATTACCATCGACGAGCCCAACGCCATAGTAAGCGACGGCAAATACCAC GTGGTGCGCTTCACTCGAAGCGGCGGCAACGCCACCCTGCAGGTGGACA GCTGGCCGGTCAACGAGCGGTACCCGGCAGGCCGCCAGCTGACCATCTT CAACAGCCAGGCTGCCATCAAGATCGGGGGCCGGGATCAGGGCCGCCCC TTCCAGGGCCAGGTGTCCGGCCTCTACTACAATGGGCTCAAGGTGCTGG CGCTGGCCGCCGAGAGCGACCCCAATGTGCGGACTGAGGGTCACCTGCG CCTGGTGGGGGAGGGGCATCATCATCATCATCACTAATCTAGAGTCGAC CTGCAGGCAAGCTTGCGGCCGCACTCGAGCACCACCACCACCACCACTG AGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCC ACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCT TGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGAT

Bibliography

Adam, E. and Perler, F. B. Development of a positive genetic selection system for inhibition of protein splicing using mycobacterial inteins in Escherichia coli DNA gyrase subunit A. J Mol Microbiol Biotechnol, 4(5):479–87, September 2002.

Agas, D., Sabbieti, M. G., and Marchetti, L. Endocrine disruptors and bone metabolism. Arch Toxicol, November 2012.

Antonini, L. V., Peregrina, J. R., Angulo, J., Medina, M., and Nieto, P. M. A STD-NMR study of the interaction of the Anabaena ferredoxin-NADP+ reductase with the coenzyme. Molecules, 19(1):672–85, 2014.

Aranda, A. and Pascual, A. Nuclear hormone receptors and gene expression. Physiol Rev, 81(3):1269–304, July 2001.

Babiker, F. A., De Windt, L. J., van Eickels, M., Grohe, C., Meyer, R., and Doevendans, P. A. Estrogenic hormone action in the heart: regulatory network and function. Cardiovasc Res, 53(3):709–19, February 2002.

Bach, R. G., Brooks, M. M., Lombardero, M., Genuth, S., Donner, T. W., Garber, A., Kennedy, L., Monrad, E. S., Pop-Busui, R., Kelsey, S. F., Frye, R. L., and BARI 2D Investigators. Rosiglitazone and outcomes for patients with diabetes mellitus and coronary artery disease in the Bypass Angioplasty Revascularization Investigation 2 Diabetes (BARI 2D) trial. Circulation, 128(8):785–94, August 2013.

Belfort, M. and Pedersen-Lane, J. Genetic system for analyzing Escherichia coli thymidylate synthase. J Bacteriol, 160(1):371–8, October 1984.

Blasi, F., Bacchelli, E., Pesaresi, G., Carone, S., Bailey, A. J., Maestrini, E., and International Molecular Genetic Study of Autism Consortium. Absence of coding mutations in the X-linked genes neuroligin 3 and neuroligin 4 in individuals with autism from the IMGSAC collection. Am J Med Genet B Neuropsychiatr Genet, 141B(3):220–1, April 2006.

Bosetti, C., Rosato, V., Buniato, D., Zambon, A., La Vecchia, C., and Corrao, G. Cancer risk for patients using thiazolidinediones for type 2 diabetes: a meta-analysis. Oncologist, 18(2):148–56, 2013.

Bourguet, W., Germain, P., and Gronemeyer, H. Nuclear receptor ligand-binding domains: three-dimensional structures, molecular interactions and pharmacological implications. Trends Pharmacol Sci, 21(10):381–8, October 2000.

Bovee, T. F., Helsdingen, R. J., Rietjens, I. M., Keijer, J., and Hoogenboom, R. L. Rapid yeast estrogen bioassays stably expressing human estrogen receptors alpha and beta, and green fluorescent protein: a comparison of different compounds with both receptor types. J Steroid Biochem Mol Biol, 91(3):99–109, July 2004.

Burris, T. P., Solt, L. A., Wang, Y., Crumbley, C., Banerjee, S., Griffett, K., Lundasen, T., Hughes, T., and Kojetin, D. J. Nuclear receptors and their selective pharmacologic modulators. Pharmacol Rev, 65(2):710–78, 2013.

Buskirk, A. R., Ong, Y. C., Gartner, Z. J., and Liu, D. R. Directed evolution of ligand dependence: small-molecule-activated protein splicing. Proc Natl Acad Sci USA, 101(29):10505–10, July 2004.

Carreras, C. W. and Santi, D. V. The catalytic mechanism and structure of thymidylate synthase. Annu Rev Biochem, 64:721–62, 1995.

Chen, J. Q., Brown, T. R., and Russo, J. Regulation of energy metabolism pathways by estrogens and estrogenic chemicals and potential implications in obesity associated with increased exposure to endocrine disruptors. Biochimica Et Biophysica Acta-Molecular Cell Research, 1793(7):1128–1143, July 2009.

Chen, W. F. and Wong, M. S. Genistein enhances insulin-like growth factor signaling pathway in human breast cancer (MCF-7) cells. J Clin Endocrinol Metab, 89(5):2351–9, May 2004.

Cheng, G., Butler, R., Warner, M., Gustafsson, J. A., Wilczek, B., and Landgren, B. M. Effects of short-term estradiol and norethindrone acetate treatment on the breasts of normal postmenopausal women. Menopause, 20(5):496–503, May 2013.

Ching, M. S., Shen, Y., Tan, W. H., Jeste, S. S., Morrow, E. M., Chen, X., Mukaddes, N. M., Yoo, S. Y., Hanson, E., Hundley, R., Austin, C., Becker, R. E., Berry, G. T., Driscoll, K., Engle, E. C., Friedman, S., Gusella, J. F., Hisama, F. M., Irons, M. B., Lafiosca, T., LeClair, E., Miller, D. T., Neessen, M., Picker, J. D., Rappaport, L., Rooney, C. M., Sarco, D. P., Stoler, J. M., Walsh, C. A., Wolff, R. R., Zhang, T., Nasir, R. H., Wu, B. L., and Children’s Hospital Boston Genotype Phenotype Study, G. Deletions of nrxn1 (neurexin-1) predispose to a wide spectrum of developmental disorders. Am J Med Genet B Neuropsychiatr Genet, 153B(4):937–47, June 2010.

Chong, S., Shao, Y., Paulus, H., Benner, J., Perler, F. B., and Xu, M. Q. Protein splicing involving the saccharomyces cerevisiae vma intein. the steps in the splicing pathway, side reactions leading to protein cleavage, and establishment of an in vitro splicing system. J Biol Chem, 271(36):22159–68, September 1996.

Chu, W. L., Shiizaki, K., Kawanishi, M., Kondo, M., and Yagi, T. Validation of a new yeast-based reporter assay consisting of human estrogen receptors alpha/beta and coactivator src-1: application for detection of estrogenic activity in environmental samples. Environ Toxicol, 24(5):513–21, October 2009.

Daugelat, S. and Jacobs, W. R., Jr. The mycobacterium tuberculosis reca intein can be used in an ORFTRAP to select for open reading frames. Protein Sci, 8(3):644–53, March 1999.

Derbyshire, V. and Belfort, M. Lightning strikes twice: intron-intein coincidence. Proc Natl Acad Sci U S A, 95(4):1356–7, February 1998.

Derbyshire, V., Wood, D. W., Wu, W., Dansereau, J. T., Dalgaard, J. Z., and Belfort, M. Genetic definition of a protein-splicing domain: functional mini-inteins support structure predictions and a model for intein evolution. Proc Natl Acad Sci U S A, 94(21):11466–71, October 1997.

Diamanti-Kandarakis, E., Bourguignon, J. P., Giudice, L. C., Hauser, R., Prins, G. S., Soto, A. M., Zoeller, R. T., and Gore, A. C. Endocrine-disrupting chemicals: an endocrine society scientific statement. Endocr Rev, 30(4):293–342, June 2009.

Dobbins, L. L., Brain, R. A., and Brooks, B. W. Comparison of the sensitivities of common in vitro and in vivo assays of estrogenic activity: application of chemical toxicity distributions. Environ Toxicol Chem, 27(12):2608–16, December 2008.

Escande, A., Pillon, A., Servant, N., Cravedi, J. P., Larrea, F., Muhn, P., Nicolas, J. C., Cavailles, V., and Balaguer, P. Evaluation of ligand selectivity using reporter cell lines stably expressing estrogen receptor alpha or beta. Biochem Pharmacol, 71(10):1459–69, May 2006.

Evans, T. C., Jr. and Xu, M. Q. Intein-mediated protein ligation: harnessing nature’s escape artists. Biopolymers, 51(5):333–42, 1999.

Filz, H. P. [insulin sensitizer. a new therapy option for type 2 diabetic patients]. MMW Fortschr Med, 142(38):31–3, September 2000.

Gangopadhyay, J. P., Jiang, S. Q., and Paulus, H. An in vitro screening system for protein splicing inhibitors based on green fluorescent protein as an indicator. Anal Chem, 75(10):2456–62, May 2003a.

Gangopadhyay, J. P., Jiang, S. Q., van Berkel, P., and Paulus, H. In vitro splicing of erythropoietin by the mycobacterium tuberculosis reca intein without substituting amino acids at the splice junctions. Biochim Biophys Acta, 1619(2):193–200, January 2003b.

Gauthier, J., Siddiqui, T. J., Huashan, P., Yokomaku, D., Hamdan, F. F., Champagne, N., Lapointe, M., Spiegelman, D., Noreau, A., Lafreniere, R. G., Fathalli, F., Joober, R., Krebs, M. O., DeLisi, L. E., Mottron, L., Fombonne, E., Michaud, J. L., Drapeau, P., Carbonetto, S., Craig, A. M., and Rouleau, G. A. Truncating mutations in nrxn2 and nrxn1 in autism spectrum disorders and schizophrenia. Hum Genet, 130(4):563–73, October 2011.

Gawrys, M. D., Hartman, I., Landweber, L. F., and Wood, D. W. Use of engineered escherichia coli cells to detect estrogenicity in everyday consumer products. Journal of Chemical Technology and Biotechnology, 84(12):1834–1840, December 2009.

Gierach, I., Li, J., Wu, W. Y., Grover, G. J., and Wood, D. W. Bacterial biosensors for screening isoform-selective ligands for human thyroid receptors alpha-1 and beta-1. FEBS Open Bio, 2:247–53, 2012.

Gierach, I., Shapero, K., Eyster, T. W., and Wood, D. W. Bacterial biosensors for evaluating potential impacts of estrogenic endocrine disrupting compounds in multiple species. Environ Toxicol, 28(4):179–89, April 2013.

Glees, P. and Meller, K. The finer structure of synapses and neurones. a review of recent electronmicroscopical studies. Paraplegia, 2:77–95, August 1964.

Golynskiy, M. V., Koay, M. S., Vinkenborg, J. L., and Merkx, M. Engineering protein switches: sensors, regulators, and spare parts for biology and biotechnology. Chembiochem, 12(3):353–61, February 2011.

Habauzit, D., Boudot, A., Kerdivel, G., Flouriot, G., and Pakdel, F. Development and validation of a test for environmental estrogens: Checking xeno-estrogen activity by cxcl12 secretion in breast cancer cell lines (cxcl-test). Environ Toxicol, 25(5):495–503, October 2010.

Hartman, I., Gillies, A. R., Arora, S., Andaya, C., Royapet, N., Welsh, W. J., Wood, D. W., and Zauhar, R. J. Application of screening methods, shape signatures and engineered biosensors in early drug discovery process. Pharm Res, 26(10):2247–58, October 2009.

Hiraga, K., Derbyshire, V., Dansereau, J. T., Van Roey, P., and Belfort, M. Minimization and stabilization of the mycobacterium tuberculosis reca intein. J Mol Biol, 354(4):916–26, December 2005.

Ichtchenko, K., Hata, Y., Nguyen, T., Ullrich, B., Missler, M., Moomaw, C., and Sudhof, T. C. Neuroligin 1: a splice site-specific ligand for beta-neurexins. Cell, 81(3):435–43, May 1995.

Ichtchenko, K., Nguyen, T., and Sudhof, T. C. Structures, alternative splicing, and neurexin binding of multiple neuroligins. J Biol Chem, 271(5):2676–82, February 1996.

Jeong, T. H., Son, Y. J., Ryu, H. B., Koo, B. K., Jeong, S. M., Hoang, P., Do, B. H., Song, J. A., Chong, S. H., Robinson, R. C., and Choe, H. Soluble expression and partial purification of recombinant human erythropoietin from e. coli. Protein Expr Purif, 95C:211–218, January 2014.

Ju, Y. H., Allred, K. F., Allred, C. D., and Helferich, W. G. Genistein stimulates growth of human breast cancer cells in a novel, postmenopausal animal model, with low plasma estradiol concentrations. Carcinogenesis, 27(6):1292–9, June 2006.

Kane, P. M., Yamashiro, C. T., Wolczyk, D. F., Neff, N., Goebl, M., and Stevens, T. H. Protein splicing converts the yeast tfp1 gene product to the 69-kd subunit of the vacuolar h(+)-adenosine triphosphatase. Science, 250(4981):651–7, November 1990.

Kapust, R. B. and Waugh, D. S. Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci, 8(8):1668–74, August 1999.

Karki, P., Smith, K., Johnson, J., Jr., and Lee, E. Astrocyte-derived growth factors and estrogen neuroprotection: Role of transforming growth factor-alpha in estrogen-induced upregulation of glutamate transporters in astrocytes. Mol Cell Endocrinol, January 2014.

Kim, H. G., Kishikawa, S., Higgins, A. W., Seong, I. S., Donovan, D. J., Shen, Y., Lally, E., Weiss, L. A., Najm, J., Kutsche, K., Descartes, M., Holt, L., Braddock, S., Troxell, R., Kaplan, L., Volkmar, F., Klin, A., Tsatsanis, K., Harris, D. J., Noens, I., Pauls, D. L., Daly, M. J., MacDonald, M. E., Morton, C. C., Quade, B. J., and Gusella, J. F. Disruption of neurexin 1 associated with autism spectrum disorder. Am J Hum Genet, 82(1):199–207, January 2008.

Kirov, G., Gumus, D., Chen, W., Norton, N., Georgieva, L., Sari, M., O’Donovan, M. C., Erdogan, F., Owen, M. J., Ropers, H. H., and Ullmann, R. Comparative genome hybridization suggests a role for nrxn1 and apba2 in schizophrenia. Hum Mol Genet, 17(3):458–65, February 2008.

Koehnke, J., Jin, X., Trbovic, N., Katsamba, P. S., Brasch, J., Ahlsen, G., Scheiffele, P., Honig, B., Palmer, A. G., 3rd, and Shapiro, L. Crystal structures of beta-neurexin 1 and beta-neurexin 2 ectodomains and dynamics of splice insertion sequence 4. Structure, 16(3):410–21, March 2008.

Kotronoulas, G., Stamatakis, A., and Stylianopoulou, F. Hormones, hormonal agents, and neuropeptides involved in the neuroendocrine regulation of sleep in humans. Hormones (Athens), 8(4):232–48, October 2009.

Kristensen, T., Baatrup, E., and Bayley, M. 17alpha-ethinylestradiol reduces the competitive reproductive fitness of the male guppy (poecilia reticulata). Biol Reprod, 72(1):150–6, January 2005.

Le Page, Y., Scholze, M., Kah, O., and Pakdel, F. Assessment of xenoestrogens using three distinct estrogen receptors and the zebrafish brain aromatase gene in a highly responsive glial cell system. Environ Health Perspect, 114(5):752–8, May 2006.

Lee, H. J., Lee, Y. J., Choi, C. W., Lee, J. A., Kim, E. K., Kim, H. S., Kim, B. I., and Choi, J. H. Rosiglitazone, a peroxisome proliferator-activated receptor-gamma agonist, restores alveolar and pulmonary vascular development in a rat model of bronchopulmonary dysplasia. Yonsei Med J, 55(1):99–106, January 2014.

Li, J., Gierach, I., Gillies, A. R., Warden, C. D., and Wood, D. W. Engineering and optimization of an allosteric biosensor protein for peroxisome proliferator-activated receptor gamma ligands. Biosens Bioelectron, August 2011.

Liarte, S., Cabas, I., Chaves-Pozo, E., Arizcun, M., Meseguer, J., Mulero, V., and Garcia-Ayala, A. Natural and synthetic estrogens modulate the inflammatory response in the gilthead seabream (sparus aurata l.) through the activation of endothelial cells. Mol Immunol, 48(15-16):1917–25, September 2011.

Lin, H. Y., Su, Y. F., Hsieh, M. T., Lin, S., Meng, R., London, D., Lin, C., Tang, H. Y., Hwang, J., Davis, F. B., Mousa, S. A., and Davis, P. J. Nuclear monomeric integrin alphav in cancer cells is a coactivator regulated by thyroid hormone. FASEB J, 27(8):3209–16, August 2013.

Lopez-Cebral, R., Martin-Pastor, M., Paolicelli, P., Casadei, M. A., Seijo, B., and Sanchez, A. Application of nmr spectroscopy in the development of a biomimetic approach for hydrophobic drug association with physical hydrogels. Colloids Surf B Biointerfaces, 115C:391–399, December 2013.

Manolagas, S. C., O’Brien, C. A., and Almeida, M. The role of estrogen and androgen receptors in bone health and disease. Nat Rev Endocrinol, 9(12):699–712, December 2013.

Marcheva, B., Ramsey, K. M., Peek, C. B., Affinati, A., Maury, E., and Bass, J. Circadian clocks and metabolism. Handb Exp Pharmacol, (217):127–55, 2013.

Miller, B. W., Willett, K. C., and Desilets, A. R. Rosiglitazone and pioglitazone for the treatment of alzheimer’s disease. Ann Pharmacother, 45(11):1416–24, November 2011.

Miyawaki, A., Llopis, J., Heim, R., McCaffery, J. M., Adams, J. A., Ikura, M., and Tsien, R. Y. Fluorescent indicators for ca2+ based on green fluorescent proteins and calmodulin. Nature, 388(6645):882–7, August 1997.

Mootz, H. D., Blum, E. S., Tyszkiewicz, A. B., and Muir, T. W. Conditional protein splicing: a new tool to control protein structure and function in vitro and in vivo. J Am Chem Soc, 125(35):10561–9, September 2003.

Nettles, K. W. and Greene, G. L. Ligand control of coregulator recruitment to nuclear receptors. Annu Rev Physiol, 67:309–33, 2005.

Newbold, R. R. Impact of environmental endocrine disrupting chemicals on the development of obesity. Hormones-International Journal of Endocrinology and Metabolism, 9(3):206–217, July 2010.

Nygaard, F. B. and Harlow, K. W. Heterologous expression of soluble, active proteins in escherichia coli: the human estrogen receptor hormone-binding domain as paradigm. Protein Expr Purif, 21(3):500–9, April 2001.

O’Donnell, C. and Nolan, M. F. Tuning of synaptic responses: an organizing principle for optimization of neural circuits. Trends Neurosci, 34(2):51–60, February 2011.

Packer, N. and Karlsson, N. G. Glycomics: methods and protocols. Springer protocols. Humana Press; Springer distributor, New York, NY; London, 2009. ISBN 9781588297747 (hbk. alk. paper), 1588297748 (hbk. alk. paper), 9781597450225 (e-ISBN), 1597450227 (e-ISBN). ISSN 1064-3745.

Paterni, I., Bertini, S., Granchi, C., Macchia, M., and Minutolo, F. Estrogen receptor ligands: a patent review update. Expert Opin Ther Pat, 23(10):1247–71, October 2013.

Paulus, H. Protein splicing and related forms of protein autoprocessing. Annu Rev Biochem, 69:447–96, 2000.

Perler, F. B. Protein splicing of inteins and hedgehog autoproteolysis: structure, function, and evolution. Cell, 92(1):1–4, January 1998.

Perler, F. B., Davis, E. O., Dean, G. E., Gimble, F. S., Jack, W. E., Neff, N., Noren, C. J., Thorner, J., and Belfort, M. Protein splicing elements: inteins and exteins – a definition of terms and recommended nomenclature. Nucleic Acids Res, 22(7):1125–7, April 1994.

Petrenko, A. G., Ullrich, B., Missler, M., Krasnoperov, V., Rosahl, T. W., and Sudhof, T. C. Structure and evolution of neurexophilin. J Neurosci, 16(14):4360–9, July 1996.

Pietrokovski, S. Intein spread and extinction in evolution. Trends Genet, 17(8):465–72, August 2001.

Polyzos, S. A., Kountouras, J., Deretzi, G., Zavos, C., and Mantzoros, C. S. The emerging role of endocrine disruptors in pathogenesis of insulin resistance: A concept implicating nonalcoholic fatty liver disease. Current Molecular Medicine, 12(1):68–82, January 2012.

Raffo, D., Berardi, D. E., Pontiggia, O., Todaro, L., de Kier Joffe, E. B., and Simian, M. Tamoxifen selects for breast cancer cells with mammosphere forming capacity and increased growth rate. Breast Cancer Res Treat, 142(3):537–48, December 2013.

Rowen, L., Young, J., Birditt, B., Kaur, A., Madan, A., Philipps, D. L., Qin, S., Minx, P., Wilson, R. K., Hood, L., and Graveley, B. R. Analysis of the human neurexin genes: alternative splicing and the generation of protein diversity. Genomics, 79(4):587–97, April 2002.

Saaristo, M., Craft, J. A., Lehtonen, K. K., Bjork, H., and Lindstrom, K. Disruption of sexual selection in sand gobies (pomatoschistus minutus) by 17alpha-ethinyl estradiol, an endocrine disruptor. Horm Behav, 55(4):530–7, April 2009.

Sivertsen, A., Isaksson, J., Leiros, H. K., Svenson, J., Svendsen, J. S., and Brandsdal, B. O. Synthetic cationic antimicrobial peptides bind with their hydrophobic parts to drug site ii of human serum albumin. BMC Struct Biol, 14(1):4, 2014.

Skinner, M. K., Manikkam, M., and Guerrero-Bosagna, C. Epigenetic transgenerational actions of endocrine disruptors. Reprod Toxicol, 31(3):337–43, April 2011.

Skretas, G. and Wood, D. W. A bacterial biosensor of endocrine modulators. J Mol Biol, 349(3):464–74, June 2005a.

Skretas, G. and Wood, D. W. Regulation of protein activity with small-molecule-controlled inteins. Protein Sci, 14(2):523–32, February 2005b.

Skretas, G., Meligova, A. K., Villalonga-Barber, C., Mitsiou, D. J., Alexis, M. N., Micha-Screttas, M., Steele, B. R., Screttas, C. G., and Wood, D. W. Engineered chimeric enzymes as tools for drug discovery: generating reliable bacterial screens for the detection, discovery, and assessment of estrogen receptor modulators. J Am Chem Soc, 129(27):8443–57, July 2007.

Song, J. Y., Ichtchenko, K., Sudhof, T. C., and Brose, N. Neuroligin 1 is a postsynaptic cell-adhesion molecule of excitatory synapses. Proc Natl Acad Sci U S A, 96(3):1100–5, February 1999.

Spradau, T. W. and Katzenellenbogen, J. A. Ligands for the estrogen receptor, containing cyclopentadienyltricarbonylrhenium units. Bioorg Med Chem Lett, 8(22):3235–40, November 1998.

Stratton, M. M. and Loh, S. N. Converting a protein into a switch for biosensing and functional regulation. Protein Sci, 20(1):19–29, January 2011.

Sudhof, T. C. Neuroligins and neurexins link synaptic function to cognitive disease. Nature, 455(7215):903–11, October 2008.

Sugita, S., Khvochtev, M., and Sudhof, T. C. Neurexins are functional alpha-latrotoxin receptors. Neuron, 22(3):489–96, March 1999.

Sugita, S., Saito, F., Tang, J., Satz, J., Campbell, K., and Sudhof, T. C. A stoichiometric complex of neurexins and dystroglycan in brain. J Cell Biol, 154(2):435–45, July 2001.

Tabuchi, K. and Sudhof, T. C. Structure and evolution of neurexin genes: insight into the mechanism of alternative splicing. Genomics, 79(6):849–59, June 2002.

Tian, Y., Li, T., Sun, M., Wan, D., Li, Q., Li, P., Zhang, Z. C., Han, J., and Xie, W. Neurexin regulates visual function via mediating retinoid transport to promote rhodopsin maturation. Neuron, 77(2):311–22, January 2013.

Ushkaryov, Y. A., Petrenko, A. G., Geppert, M., and Sudhof, T. C. Neurexins: synaptic cell surface proteins related to the alpha-latrotoxin receptor and laminin. Science, 257(5066):50–6, July 1992.

van den Heuvel, R. H., Fraaije, M. W., Laane, C., and van Berkel, W. J. Regio- and stereospecific conversion of 4-alkylphenols by the covalent flavoprotein vanillyl-alcohol oxidase. J Bacteriol, 180(21):5646–51, November 1998.

Vogel, C. L., Johnston, M. A., Capers, C., and Braccia, D. Toremifene for breast cancer: a review of 20 years of data. Clin Breast Cancer, 14(1):1–9, February 2014.

Vosges, M., Le Page, Y., Chung, B. C., Combarnous, Y., Porcher, J. M., Kah, O., and Brion, F. 17alpha-ethinylestradiol disrupts the ontogeny of the forebrain gnrh system and the expression of brain aromatase during early development of zebrafish. Aquat Toxicol, 99(4):479–91, September 2010.

Washbourne, P., Dityatev, A., Scheiffele, P., Biederer, T., Weiner, J. A., Christopherson, K. S., and El-Husseini, A. Cell adhesion molecules in synapse formation. J Neurosci, 24(42):9244–9, October 2004.

Weatherman, R. V., Fletterick, R. J., and Scanlan, T. S. Nuclear-receptor ligands and ligand-binding domains. Annu Rev Biochem, 68:559–81, 1999.

Wittliff, J. L., Wenz, L. L., Dong, J., Nawaz, Z., and Butt, T. R. Expression and characterization of an active human estrogen receptor as a ubiquitin fusion protein from escherichia coli. J Biol Chem, 265(35):22016–22, December 1990.

Wood, D. W., Wu, W., Belfort, G., Derbyshire, V., and Belfort, M. A genetic system yields self-cleaving inteins for bioseparations. Nat Biotechnol, 17(9):889–92, September 1999.

Wu, W. Y., Gillies, A. R., Hsii, J. F., Contreras, L., Oak, S., Perl, M. B., and Wood, D. W. Self-cleaving purification tags re-engineered for rapid topo(r) cloning. Biotechnol Prog, 26(5):1205–12, September 2010.

Xu, M. Q., Southworth, M. W., Mersha, F. B., Hornstra, L. J., and Perler, F. B. In vitro protein splicing of purified precursor and the identification of a branched intermediate. Cell, 75(7):1371–7, December 1993.

Yaghmaie, F., Saeed, O., Garan, S. A., Freitag, W., Timiras, P. S., and Sternberg, H. Caloric restriction reduces cell loss and maintains estrogen receptor-alpha immunoreactivity in the pre-optic hypothalamus of female b6d2f1 mice. Neuro Endocrinol Lett, 26(3):197–203, June 2005.

Yang, C., Shen, H. C., Wu, Z., Chu, H. D., Cox, J. M., Balsells, J., Crespo, A., Brown, P., Zamlynny, B., Wiltsie, J., Clemas, J., Gibson, J., Contino, L., Lisnock, J., Zhou, G., Garcia-Calvo, M., Bateman, T., Xu, L., Tong, X., Crook, M., and Sinclair, P. Discovery of novel oxazolidinedione derivatives as potent and selective mineralocorticoid receptor antagonists. Bioorg Med Chem Lett, 23(15):4388–92, August 2013.

Yang, X., Yang, S., McKimmey, C., Liu, B., Edgerton, S. M., Bales, W., Archer, L. T., and Thor, A. D. Genistein induces enhanced growth promotion in er-positive/erbb-2-overexpressing breast cancers by er-erbb-2 cross talk and p27/kip1 downregulation. Carcinogenesis, 31(4):695–702, April 2010.

Zeidler, M. P., Tan, C., Bellaiche, Y., Cherry, S., Hader, S., Gayko, U., and Perrimon, N. Temperature-sensitive control of protein activity by conditionally splicing inteins. Nat Biotechnol, 22(7):871–6, July 2004.

Zhang, J. H., Chung, T. D., and Oldenburg, K. R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J Biomol Screen, 4(2):67–73, 1999.

Zimniak, L., Dittrich, P., Gogarten, J. P., Kibak, H., and Taiz, L. The cdna sequence of the 69-kda subunit of the carrot vacuolar h+-ATPase. homology to the beta-chain of f0f1-ATPases. J Biol Chem, 263(19):9102–12, July 1988.
