Role of the Maize Transcription Factor R in the Regulation of Anthocyanin Biosynthesis

ROLE OF THE TRANSCRIPTION FACTOR R IN THE

REGULATION OF ANTHOCYANIN BIOSYNTHESIS

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

in the Graduate School of The Ohio State University

By

Antje Christin Feller

Graduate Program in Molecular, Cellular and Developmental Biology

The Ohio State University

2010

Dissertation Committee:

Professor Erich Grotewold, Advisor
Professor David Bisaro
Professor Venkat Gopalan
Professor JC Jang
Professor Bernd Weisshaar

Copyright by

Antje Christin Feller

2010

ABSTRACT

The maize flavonoid biosynthetic pathway is one of the best characterized plant model systems to study the combinatorial regulation of gene expression. Flavonoids are important secondary metabolites, both for the plant, as in the case of anthocyanins acting as attractants to pollinators, and for humans, due to a number of biological activities. Anthocyanins in maize are regulated by the cooperation of the R2R3-MYB domain protein C1 and the basic helix-loop-helix (bHLH) protein R. In contrast, phlobaphene pigments, derived from a separate flavonoid biosynthetic branch, are regulated by the R2R3-MYB domain protein P1, which can activate transcription in the absence of a (known) bHLH. Our laboratory has established that the bHLH transcription factor R plays a key role in determining the biological specificity of C1. I describe here how R might contribute to regulatory specificity, on the one hand by forming a platform for protein-protein interactions, and on the other by binding to different sites in the regulatory regions of its target genes depending on the interacting partners. Using a variety of techniques, I have investigated three protein-protein interacting regions of R. I demonstrate that the highly conserved bHLH domain of R is involved in transcriptional regulation and histone functions. I show that R interacts with the EMSY-like protein RIF1 specifically via the bHLH domain and that this interaction is required for the regulation of endogenous flavonoid genes. RIF1 is part of the C1/R regulatory complex and I discuss how RIF1 links transcriptional regulation of flavonoid biosynthetic genes with chromatin function. In addition, I show that the region adjacent to the bHLH domain has structural similarity to a leucine zipper and that the extended bHLH-LZ-like region is able to homodimerize. The bHLH-LZ-like mediated dimerization is required for activation of a synthetic pG-box::Luc promoter::reporter construct in transient expression studies and for binding to a synthetic G-box probe, as well as to the Bz1 and C2 promoters in vitro. Furthermore, I demonstrate that the ACT domain at the C-terminus of R homodimerizes. This domain is necessary for anthocyanin pigment formation and for transcriptional activation of at least four anthocyanin biosynthetic genes, and is important for DNA binding to the A1 and Bz1 promoters. I show that interplay between the functional domains described here is necessary for transcriptional activation and DNA binding. I am also characterizing R-interacting partners that possibly tether R to as yet unknown target genes and therefore might show the involvement of R in other cellular processes.

Taken together, these studies emphasize the importance of the bHLH transcription factor R in combinatorial regulation of anthocyanin biosynthetic genes and open new possibilities for R to function in other cellular processes. Moreover, these studies highlight the complexity of biochemical pathway regulation and show novel mechanisms of how one TF can participate in several regulatory complexes.

Dedication

This document is dedicated to my family.

Acknowledgments

This work would not have been possible without the continuous guidance, support and friendship of my advisor Prof. Erich Grotewold. Thank you Boss for believing in me.

I would like to thank my collaborator Dr. Ling Yuan and his lab members Que Kong and Sitakanta Pattanaik for starting a highly interesting project and for letting me be a part of it.

I am grateful to all my lab mates throughout the years, undergraduate students (Kenneth Frame, Ali Azad, Carmen Perrino and Julia Muntean), visiting scholars (Margarita Barros, Andrea Gonzalez-Conca, Lina Palacio), visiting professors (Rivka Barg and Lijun Wang), graduate students (Zidian Xie, George Heine, Marcela Hernandez, Anusha Diaz, Robert Lockwood, Laura Martz and MinGab Kim) and postdoctoral fellows (Vinod Malik, Frantisek Poustka and Yuhua Lu). A super special thanks to former graduate student Niloufer Irani for her tremendous help in the lab, for her friendship and for making the time spent in the lab so much more fun.

I especially want to thank my current colleagues, visiting scholars Andres Bohorquez and Katherine Mejia-Guerra, graduate students (Katja Machemer and Daniel Arango), postdoctoral fellows (Lucille Pourcel, Kengo Morohashi, Asela Wijeratne, Chenglin Chai, Yongqin Wang, Waka Omata, Alper Yilmaz, Xinli Sun and Dan Siegal-Gaskin). Thank you for all the discussions and for helping me make my research a success. I want to express my great appreciation to my lab mate and graduate student Isabel Casas. Thank you Isa for scientific and non-scientific conversations and for being a wonderful friend. Thank you Isabel, Katja, Chenglin and Yongqin for carefully reviewing my dissertation.

A special thanks goes to all faculty, postdoctoral fellows and students in the Plant Biotech Center for letting me borrow lab supplies and use equipment and for many helpful discussions.

Thanks also to the staff in Rightmire Hall: Melinda Parker, Diane Furtney, Scott Hines, Dave Long and Joe Takayama. Thank you for taking care of “things”, which I have no clue about.

I am grateful to my committee members, Prof. David Bisaro, Prof. J.C. Jang, Prof. Venkat Gopalan and Prof. Bernd Weisshaar for their time and guidance.

A special thanks goes to Oliver Voss, for a very unique friendship, for his help with any scientific question and for keeping me company while writing my dissertation.

Thank you John Bruzzese for flowers, dinners, movies and for wonderful times.

Finally I would like to thank my family, for sending packages with my favorite chocolate, for hour-long phone calls and for believing in me.

Vita

March 1st, 1976 …… Born in Brandenburg a. d. H., Germany
1998 …… B.S. Chemistry, GeoForschungsZentrum (National Research Center for Geosciences), Potsdam, Germany
2002 …… Diploma in Biotechnology, University of Applied Sciences, Berlin, Germany
2004 to present …… Graduate Teaching and Research Associate, Department of Plant Cellular and Molecular Biology, The Ohio State University

Publications

Falcone Ferreyra M, Rius S, Emiliani J, Feller A, Pourcel L, Morohashi K, Casati P, Grotewold E. 2010. Cloning and Characterization of a UV-B Inducible Maize Flavonol Synthase. Plant J. 62: 77-91

Poustka F, Irani NG, Feller A, Lu Y, Pourcel L, Frame K, Grotewold E (2007) A Trafficking Pathway for Anthocyanins Overlaps with the Endoplasmic Reticulum-to-Vacuole Protein Sorting Route in Arabidopsis and Contributes to the Formation of Vacuolar Inclusions. Plant Physiol. 145: 1323-1335

*Hernandez JM, *Feller A, *Morohashi K, Frame K, Grotewold E. (2007) The Basic-helix-loop-helix Domain of Maize R Links Transcriptional Regulation and Histone Modifications by Recruitment of an EMSY-related Factor. Proc. Natl. Acad. Sci. USA 104: 17222-17227 (*these authors contributed equally to the work)

Feller A, Hernandez JM, Grotewold E. (2006) An ACT-like Domain Participates in the Dimerization of Several Plant Basic-helix-loop-helix Transcription Factors. J. Biol. Chem. 281: 28964-28974

Hernandez JM, Heine GF, Irani NG, Feller A, Kim MG, Matulnik T, Chandler VL, Grotewold E. (2004) Different Mechanisms Participate in the R-dependent Activity of the R2R3 MYB Transcription Factor C1. J. Biol. Chem. 279: 48205-48213

Fields of Study

Major Field: Molecular, Cellular and Developmental Biology

Table of Contents

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures
Abbreviations

CHAPTER 1
INTRODUCTION
1.1 Overview
1.2 Combinatorial regulation of gene expression
1.2.1. Aspects of Combinatorial Gene Regulation
1.2.2. The Role of Chromatin Structure in Transcriptional Regulation
1.3 The Flavonoid Biosynthetic Pathway
1.3.1. The Regulators of Anthocyanin and Phlobaphene Biosynthesis
1.3.1.1 The MYB Domain Regulators
1.3.1.2 The bHLH Domain Regulators
1.3.2 Structural Genes of the Anthocyanin Biosynthetic Pathway
1.4 Research Goal

CHAPTER 2
AN ACT-LIKE DOMAIN PARTICIPATES IN THE DIMERIZATION OF PLANT BASIC HELIX-LOOP-HELIX TRANSCRIPTION FACTORS
2.1 Introduction
2.2 Materials and Methods
2.2.1 Plasmids used in transient expression experiments
2.2.2 Plasmids Used in Yeast Two-hybrid Experiments
2.2.3 Microprojectile Bombardment and Gene Expression Experiments
2.2.4 Yeast Two-hybrid Experiments
2.2.5 GST Pull-down Experiments
2.2.6 Plant Transformation and Confocal Microscopy
2.2.7 Analysis of Dimerization Domain Structure
2.3 Results
2.3.1 R Contains a C-terminal Dimerization Domain
2.3.2 The Dimerization Region of R Is Necessary for Regulatory Activity
2.3.3 The Dimerization Domain Is Required for Only a Subset of the R Activities
2.3.4 Structural Analysis of the Dimerization Domain of R
2.3.5 Identification of ACT Dimerization Domains in Other bHLH
2.4 Discussion

CHAPTER 3
THE bHLH DOMAIN OF MAIZE R LINKS TRANSCRIPTIONAL REGULATION AND HISTONE MODIFICATIONS BY RECRUITMENT OF AN EMSY-LIKE FACTOR
3.1 Introduction
3.2 Materials and Methods
3.2.1 Plant Materials
3.2.2 Protoplast Isolation and Transformation
3.2.3 Plant Transformation and Fluorescence Microscopy
3.2.4 Protein-Protein Interaction Analyses
3.2.5 ChIP Analyses
3.2.6 Transient Expression Experiments in Maize Cells
3.2.7 Nuclei Isolation and Micrococcal Nuclease Digestion
3.2.7 Antibody Production
3.3 Results
3.3.1 An Essential Function for the bHLH Domain of R
3.3.2 Identification of RIF1 as an ENT Domain Protein that Specifically Interacts with the bHLH Domain of R
3.3.3 RIF1 Displays Speckled Nuclear Localization and is Necessary for the Activation of Maize Flavonoid Genes
3.3.4 R Recruits RIF1 to the A1 Gene Promoter
3.4 Discussion

CHAPTER 4
A DIMERIZATION-MEDIATED SWITCH LEADS TO REGULATION OF DIFFERENT SETS OF TARGET GENES
4.1 Introduction
4.2 Materials and Methods
4.2.1 Plasmids used in transient expression experiments
4.2.2 Transient Expression Assays
4.2.3 Protoplast Transformation
4.2.4 Chromatin-Immunoprecipitation
4.2.5 Protein Expression and Purification
4.2.6 Electrophoretic-Mobility-Shift-Assay (EMSA)
4.2.7 Constructs Used for Protein-Protein Interaction Experiments
4.2.8 Protein-Protein Interaction Experiments
4.3 Results
4.3.1 The ACT Domain is Important for A1 and G-box Activation and Binding
4.3.2. Promoter Analysis of Anthocyanin Biosynthetic Genes
4.3.3 Full-length R Homodimerizes in the Presence of C1
4.3.4 The bHLH Mediated Dimerization Inhibits Dimerization with RIF1
4.4 Discussion

CHAPTER 5
CHARACTERIZATION OF NOVEL R-INTERACTING FACTORS FROM MAIZE AND ARABIDOPSIS
5.1 Introduction
5.2 Material and Methods
5.2.1 Yeast Two-Hybrid Assay
5.2.3 Antibody Production
5.2.3 Localization Studies
5.3 Results
5.3.1 Characterization of ZmbHLH5
5.3.2 Characterization of RIF-2C
5.3.3 Characterization of At5g46690
5.4 Discussion

CHAPTER 6
FINAL DISCUSSION
6.1 Summary of Findings
6.2 The Role of the ACT and bHLH Domain in R Function
6.3 R Binds DNA in Different Ways Depending on the Target
6.4 C1 is Required for Tethering R to DNA
6.5 Final Remarks
APPENDIX A: CONSTRUCTS USED
APPENDIX B: PRIMERS USED
APPENDIX C: BACTERIA/YEAST STRAINS USED
APPENDIX D: YEAST-TWO-HYBRID SCREEN ANALYSIS
List of References

List of Tables

Table 1.1: Functional characterization of bHLH proteins from different species
Table 3.1: Summary of yeast protein-protein interactions
Table 4.1: Yeast II-hybrid and yeast III-hybrid results

List of Figures

Figure 1.1: The flavonoid biosynthetic pathway
Figure 1.2: Model of TF binding and chromatin remodeling factor binding
Figure 1.3: Plant bHLH proteins
Figure 2.1: Schematic representation of R and alignment of the C-terminal region of R-like proteins from maize and Arabidopsis
Figure 2.2: The C-terminal region of R contains a dimerization domain
Figure 2.3: The R dimerization region is necessary for the efficient activation of anthocyanin biosynthesis
Figure 2.4: The R dimerization region is required for a subset of the R regulatory activities
Figure 2.5: The dimerization region of R has structural similarities to ACT domains
Figure 2.6: ACT-like domains present in several other plant bHLH proteins
Figure 3.1: The bHLH region of R and chromatin functions
Figure 3.2: Enrichment of H3K9/14ac in the proximal A1 promoter region in BMS cells
Figure 3.3: RIF1 corresponds to an EMSY-like protein that interacts with the bHLH domain of R
Figure 3.4: Phylogenetic analysis of RIF1 and RIF1-like genes
Figure 3.5: RIF1 is nuclear and necessary for anthocyanin accumulation
Figure 3.6: Knock down of RIF1 reduces the accumulation of anthocyanin in BMS cells
Figure 3.7: The recruitment of RIF1 to the A1 promoter depends on the presence of C1 and R
Figure 3.8: RIF1 and RIF1 homologs from Arabidopsis interact with a subset of bHLH proteins
Figure 4.1: R homodimerizes via a bHLH-LZ like domain and binds to DNA
Figure 4.2: The ACT-like domain of R is important for activation and binding to A1
Figure 4.3: R can activate from a synthetic G-box containing promoter in the absence of C1
Figure 4.4: Maize flavonoid gene promoter analysis
Figure 4.5: Analysis of the Bz1 promoter
Figure 5.1: Characterization of the bHLH protein ZmbHLH5
Figure 5.2: Protein-protein interaction and localization of ZmbHLH5
Figure 5.3: Characterization of ZmRIF-2C
Figure 5.4: Alignment of ZmRIF-2C, ZmRIF-2C-like and putative Arabidopsis homolog At2g27230
Figure 5.5: Phylogenetic tree of bHLH proteins
Figure 6.1: Model of TF complex formation on the A1 or Bz1 promoter

Abbreviations

ACT: Aspartate Kinase, Chorismate Mutase and TyrA
AD: Activation Domain
ADE: Adenine
AGRIS: The Arabidopsis Gene Regulatory Information Server
Ala: Alanine
AN1, 2, 11: ANTHOCYANINLESS 1, 2, 11
ARE: Anthocyanin Regulatory Element
bHLH: basic Helix-Loop-Helix
B: Booster
BLAST: Basic Local Alignment Search Tool
BMS: Black Mexican Sweet
BR: Brassinosteroid
BS: Binding Site
°C: Degree Celsius
C1: COLORLESS1
ChIP: Chromatin-Immunoprecipitation
ChIP-chip: Chromatin-Immunoprecipitation followed by microarray
DB: Database
DBD: DNA-Binding Domain
DEL: DELILA
DFR: DIHYDROFLAVONOL REDUCTASE
DNA: DeoxyriboNucleic Acid
DTT: Dithiothreitol
E-box: Enhancer-box
ENT: EMSY N-terminal domain
Fig: Figure
GA: Gibberellic Acid
GFP: Green Fluorescent Protein
Gln: Glutamine
GPD: Glyceraldehyde-3-Phosphate Dehydrogenase
GST: Glutathione-S-Transferase
H3, H4: Histone 3, 4
haPBS: high affinity P1-binding site
HAT: Histone Acetyl Transferase
HIS: Histidine
IN1: INTENSIFIER1
IPTG: Isopropyl β-D-1-thiogalactopyranoside
JAF13: JOHNANDFRANCESCA13
laPBS: low affinity P1-binding site
LB: Luria-Bertani
LEU: Leucine
MIR: MYB-interacting region
MIZ1: Myc-interacting zinc finger protein
NaPi: Sodium Phosphate
OD: Optical Density
O/N: Over Night
P1: PERICARP COLOR 1
PAC1: PALE ALEURONE COLOR
PBS: P1-binding site
pH: Potentiometric Hydrogen Ion Concentration
PMSF: Phenylmethanesulfonylfluoride
PL: PURPLE LEAF
R: RED1
S.D: Standard Deviation
Ser: Serine
TF: Transcription Factor
TFDB: Transcription Factor Data Base
TG: Target Gene
TRP: Tryptophan
TSS: Transcription Start Site
TT: TRANSPARENT TESTA
TTG: TRANSPARENT TESTA GLABRA
UV: Ultra Violet
γ32P-ATP: gamma-Phosphorus-32 Adenosine-5'-Triphosphate

CHAPTER 1

INTRODUCTION

1.1 Overview

Flavonoids are plant secondary metabolites which are best known for producing beautiful red, blue or purple pigmentation in the flowers, fruits and leaves of many plants.

Since they are non-essential for plant survival, flavonoid pigments have been used for more than 100 years by geneticists like Gregor Mendel and Barbara McClintock as tools to analyze principles of genetics and have helped make significant biological discoveries, such as mobile genetic elements (transposons) [1].

Colors produced by the flavonoid biosynthetic pathway (Fig. 1.1) serve mainly as attractants for pollinators and dispersers [2-3] as well as shields from ultra-violet (UV) light or from attacks by microbes and insects [4-5]. Numerous studies on anthocyanins, one subgroup of flavonoids (Fig. 1.1), have shown that these small molecules display a wide range of biological activities important not only for the plant, but also for human health [6]. Given the importance of these secondary metabolites for plants and humans, it is of great interest to further investigate how these compounds are synthesized and regulated inside the plant.

This research study mainly addresses the question of how gene expression is regulated in plants. The flavonoid biosynthetic pathway has developed into a powerful model system to study the combinatorial regulation of gene expression, which will be discussed in this first chapter. The main focus of this dissertation is the maize bHLH (basic helix-loop-helix) transcription factor RED1 (R) and its role in the combinatorial regulation of this pathway. This introductory chapter provides the background to the present research study along with an outline of the research goals. I will first discuss different aspects of combinatorial regulation of gene expression in animals and plants, followed by a characterization of bHLH and MYB domain transcription factors, the main players in this regulation. I will further explain what is known about the regulation of the flavonoid biosynthetic pathway, its regulatory and structural genes, and indicate the gaps where my research fits in.

1.2 Combinatorial regulation of gene expression

Gene expression can be controlled at multiple levels, including transcription (initiation or rate of transcription), post-transcription (mRNA processing and stability), translation or post-translation. For almost two decades, another level of control of gene expression has been investigated: small interfering RNAs (siRNAs) [7]. Just recently, it was shown that some R2R3-MYB domain regulators of the flavonoid biosynthetic pathway are targeted by certain siRNAs, but neither how this is achieved nor what the consequences are is fully understood [8]. The regulation of the flavonoid biosynthesis genes is at the crux of this dissertation. The transcriptional regulation of this pathway has been studied in much detail, but many questions remain to be answered [9].

Transcription factors (TFs) are proteins that control the rate of transcription of a given gene by binding to specific DNA motifs in its regulatory region in a sequence-specific manner. A TF usually does not act by itself; rather, a complex combination of TFs and other co-factors is formed at the regulatory regions, leading to transcriptional activation or repression. Furthermore, individual TFs can contribute to the regulation of one set of genes by participating in a specific protein complex but might regulate another set of genes by becoming part of another TF-complex. This process is termed combinatorial regulation of gene expression [10]. A TF must possess one or several of the following features in order to modulate gene expression. First, the TF must bind to DNA in a sequence-specific manner. Second, the TF must interact with co-regulators such as chromatin modifiers or proteins which associate with the general transcription machinery (e.g. TATA binding protein-associated factors, TAFs) to be able to influence transcription positively or negatively. Finally, the synthesis or activity of most TFs must be tightly regulated, so that each is active only in particular situations or cell types.

Combinatorial interaction between transcription factors, co-regulators and DNA is highly important for gene regulation. Examples are presented below which provide insight into aspects of gene regulation in the common yeast Saccharomyces cerevisiae (hereafter called S. cerevisiae or yeast), humans and plants.

1.2.1. Aspects of Combinatorial Gene Regulation

Studies in the model organism S. cerevisiae have shown that a small number of TFs can regulate complex spatial and temporal patterns of gene expression. Balaji and collaborators assembled a large transcriptional network for S. cerevisiae with 157 specific TFs, 4,410 TGs (target genes) and 12,873 regulatory interactions between them [11]. The authors used genome-wide ChIP-chip (Chromatin-Immunoprecipitation followed by DNA-microarray hybridization) data and determined that each TF in yeast regulates on average 82 TGs. This might be achieved by the presence of one specific cis-regulatory element recognized by each TF in the regulatory region of all the genes that this specific TF regulates. It is, however, more likely that interactions of that specific TF with various other TFs or co-regulators lead to this combinatorial regulation and expression of more than one target gene. The authors in addition determined that one specific TG is regulated on average by 2.9 TFs. This is mainly accomplished by combinatorial physical interactions of a specific TF with other TFs [11].
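The two averages quoted for the yeast network follow directly from the reported network size. As a quick arithmetic check (a minimal sketch; the variable names are illustrative and not taken from the cited study):

```python
# Network size reported for the S. cerevisiae transcriptional network [11].
n_tfs = 157              # specific transcription factors
n_tgs = 4_410            # target genes
n_interactions = 12_873  # regulatory TF-TG interactions

# Every interaction is one TF-TG edge, so dividing the edge count by the
# number of TFs (or TGs) gives the mean number of partners per node.
tgs_per_tf = n_interactions / n_tfs
tfs_per_tg = n_interactions / n_tgs

print(f"average TGs per TF: {tgs_per_tf:.0f}")   # 82
print(f"average TFs per TG: {tfs_per_tg:.1f}")   # 2.9
```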

Another aspect of combinatorial interaction between TFs, co-regulators and DNA was studied by Kato et al. [12]. By using previously published ChIP-chip data combined with TF-DNA binding motif analysis, they assembled a combinatorial map of the yeast cell cycle. They found that multiple target genes for a specific TF contained an over-representation of a specific DNA motif. There are three ways by which a TF can associate with a specific BS (binding site): direct binding, piggy-back binding or cross-binding (Fig. 1.2A). For example, the yeast TF FKH2, together with MCM1 and NDD1, regulates cell-cycle specific transcription of G2/M phase genes [13-15]. FKH2 can bind directly to cis-regulatory elements containing the sequence GTAAACAA, and it is not surprising that this motif is over-represented in the FKH2 targets identified by ChIP-chip. However, this same binding motif is also over-represented in NDD1 ChIP-chip data, even though NDD1 cannot bind DNA directly. Therefore, GTAAACAA is a direct binding motif for FKH2, but a piggy-back binding motif for NDD1 (Fig. 1.2A). Cross-binding, on the other hand, occurs at the regulatory region of the CLB2 cluster gene SWI5, to which FKH2 binds directly only after MCM1 is tethered to its BS [16].
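The idea of motif over-representation can be illustrated with a small sketch. This is not the method used in the cited study; it simply compares the fraction of sequences carrying the FKH2 motif on either strand in a ChIP-bound set versus a background set. The promoter sequences and all function names are hypothetical and invented for illustration only.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_motif(seq: str, motif: str) -> bool:
    """True if the motif occurs on either strand of the sequence."""
    seq = seq.upper()
    return motif in seq or reverse_complement(motif) in seq


def motif_fraction(seqs: list[str], motif: str) -> float:
    """Fraction of sequences that contain the motif at least once."""
    return sum(has_motif(s, motif) for s in seqs) / len(seqs)


FKH2_MOTIF = "GTAAACAA"  # direct FKH2 binding motif named in the text

# Hypothetical toy promoters (not real yeast sequences).
chip_bound = ["TTGTAAACAAGC", "CCTTGTTTACGG", "ACGTACGTACGT"]
background = ["ACGTACGTACGT", "GGGCCCGGGCCC", "TTGTAAACAAGC", "ATATATATATAT"]

bound_frac = motif_fraction(chip_bound, FKH2_MOTIF)
bg_frac = motif_fraction(background, FKH2_MOTIF)
print(f"bound: {bound_frac:.2f}, background: {bg_frac:.2f}, "
      f"enrichment: {bound_frac / bg_frac:.2f}x")
```

Note that such a count alone cannot distinguish direct from piggy-back binding: the same enrichment would appear for FKH2 and NDD1 targets, which is why independent evidence on DNA-binding ability is needed.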

It is interesting that, depending on the association with certain TFs or co-regulators, a specific TF can have different functions and participate in diverse cellular processes. FKH2, for example, acts as a repressor of CLB2 cluster genes in complex with MCM1 throughout the cell cycle. Only when NDD1 joins the complex at the G2/M transition does this new complex activate transcription [13]. Furthermore, SWI6 is a non-DNA-binding co-regulator that can bind to SWI4 or MBP1. Depending on the interaction with SWI4 or MBP1, SWI6 then controls budding/cell wall genes or regulates DNA replication [12].
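One way to picture this combinatorial logic is as a lookup in which the regulatory outcome depends on the full set of factors in a complex rather than on any single factor. The sketch below only restates the yeast examples from the text; the dictionary layout and the function name are illustrative assumptions.

```python
# Regulatory outcome keyed by the set of factors present in the complex.
# Entries restate the yeast examples in the text; nothing here is measured data.
COMPLEX_OUTCOME = {
    frozenset({"FKH2", "MCM1"}): "represses CLB2 cluster genes",
    frozenset({"FKH2", "MCM1", "NDD1"}): "activates CLB2 cluster genes at G2/M",
    frozenset({"SWI6", "SWI4"}): "controls budding/cell wall genes",
    frozenset({"SWI6", "MBP1"}): "regulates DNA replication",
}


def outcome(*factors: str) -> str:
    """Look up the outcome for a complex, independent of factor order."""
    return COMPLEX_OUTCOME.get(frozenset(factors), "not described in the text")


print(outcome("MCM1", "FKH2"))          # same TF, repressive complex
print(outcome("NDD1", "FKH2", "MCM1"))  # same TF, activating complex
print(outcome("SWI6", "MBP1"))
```

Keying on a `frozenset` makes the point that the identity of the partners, not the order in which they are listed, determines the function of the shared factor.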

To determine how DNA-binding specificity is achieved in vivo, Rabinovich and collaborators investigated E2F (E2 PROMOTER BINDING FACTOR) target gene promoters with and without E2F consensus binding sequences [17]. The E2F family of TFs consists of eight members (E2F1 – E2F8) and is involved in the regulation of many cellular processes, such as control of cell cycle progression, apoptosis and tumorigenesis [17]. The authors performed eChIP (episomal ChIP) and mutant analysis, as well as genomic tiling (ENCODE) array analysis, on regulatory regions up to 2 kb upstream of the TSS (Transcription Start Site). To determine how E2F is recruited to non-consensus E2F-BS, they tested indirect recruitment, DNA looping combined with binding to a distal consensus sequence, and direct binding to a site which weakly resembles an E2F-BS. The authors determined that E2F is directly recruited to target gene promoter sites which only weakly resemble an E2F consensus sequence [17]. Additionally, the distance from the TSS was found to be highly correlated with binding of E2Fs to the consensus sequence and correlated with specific histone modifications [17]. For example, histone H3 and H4 acetylation seems necessary for E2F1 binding to the E2F2 promoter in the presence of MYC, since the knockdown of MYC resulted in loss of H4 acetylation, which led to loss of binding of E2F1 to the E2F2 promoter [18].

These examples show the complexity and dynamics of gene regulation in vivo, and the cases described here can be applied to the regulation of many other cellular processes. Additional genome-wide studies (e.g. ChIP-chip) and high-throughput analyses such as the ones described above are necessary to understand how genes are regulated, how transcriptional complexes are assembled at regulatory regions and how DNA-binding specificity is achieved in vivo. Although gene expression studies in vertebrates have benefited plant research, the mechanism of combinatorial regulation of gene expression in plants is still poorly understood. We are trying to build a network like the ones described above in plants, using the flavonoid biosynthetic pathway as a model.

1.2.2. The Role of Chromatin Structure in Transcriptional Regulation

Combinatorial gene regulation is affected by chromatin structure, which has a strong influence on the accessibility of the DNA to transcription factors and the transcriptional machinery, including RNA polymerase. Chromatin structure is itself affected by chemical modifications of the histone tails at the N-terminus of the histone proteins, which may lead to changes in histone-DNA interaction but, more importantly, result in the recruitment of non-histone proteins [19]. Histone modifications include acetylation, methylation and phosphorylation and affect cellular processes such as DNA repair, replication or transcription. Acetylation and methylation of particular lysine residues of H3 (Histone 3) and H4 within promoter chromatin are usually associated with gene expression (e.g. H3K9/14ac, H3K36me2) and are conserved across eukaryotes, including plants. Other methylation marks, however, correspond to a repressive chromatin structure (e.g. H3K9me1,2, H3K27me1,2) (for review see [20-21]). Nevertheless, according to Bua et al., “the biological outcome of histone marks are impacted by their location in chromatin regions and on the repertoire of effectors that have access to those regions” [22].

While acetyl-lysines in the histone tails are recognized and bound by bromodomain-containing proteins (Fig. 1.2B), such as histone acetyl transferases (HATs) and HAT-associated transcriptional co-activators [23-24], methyl-lysines are bound by members of the “Royal” family of chromo- (chromatin binding domain), Tudor- and MBT domain proteins (Fig. 1.2B) [25-26]. HP1 proteins are examples of chromodomain-containing proteins, which bind to methylated histone H3, lysine 9 (H3K9me) and are required for heterochromatic gene silencing [27]. The structure of the HP1 chromodomain is highly similar to that of the Tudor domain of SMN (Survival Motor Neuron), and both proteins bind methylated histone peptides in vitro [28]. Furthermore, the PWWP domain (contains conserved Pro-Trp-Trp-Pro) shows structural similarity to the Tudor domain, and both might be related by divergent evolution [29]. The DNA methyltransferase DNMT3b contains a PWWP domain and binds DNA, possibly methylated [28, 30]. The picture that emerges is that members of the Royal family of chromatin effectors are structurally related and are associated with methylated histones (Fig. 1.2B).

AGENET domains are plant-specific homologs of Tudor domains [28]. One protein can have up to 6 adjacent AGENET domains. An AGENET domain can be accompanied by additional domains, for example an ENT domain [28]. The function of the plant AGENET domain is unknown and it will be interesting to investigate if proteins containing such a domain bind to methyl-lysines in histones, or to methylated DNA.

A great amount of research has been done in vertebrates to understand the link between chromatin remodeling and transcriptional activation, but there is still a need to better understand this cross-talk in plants. One facet of this aspect of gene regulation will be discussed in Chapter 3 of this dissertation.

1.3 The Flavonoid Biosynthetic Pathway

The flavonoid biosynthetic pathway provides an outstanding model to study the combinatorial regulation of gene expression. First, it has been investigated by geneticists, chemists and molecular biologists and many aspects of gene regulation in plants have been discovered using this pathway as a model. Second, many of the structural and regulatory genes of this pathway have been identified [31-32], and mutations in any of these genes are not lethal since flavonoids are not essential. Finally, the presence or absence of pigments makes mutant phenotypes easy to score.

1.3.1 The Regulators of Anthocyanin and Phlobaphene Biosynthesis

Pigmentation is regulated in a similar way in many flowering plants, which indicates that the anthocyanin biosynthetic pathway arose prior to the separation of monocots and dicots [33]. Studies of flavonoid biosynthesis regulation in flowering plants, such as Zea mays (maize), Arabidopsis thaliana (Arabidopsis), Petunia hybrida (petunia), Antirrhinum majus (snapdragon), Oryza sativa (rice) and Perilla frutescens (perilla), have determined that R2R3-MYB domain proteins and bHLH proteins are involved in the regulation of the structural genes [34-39]. The interaction of bHLH proteins with MYB TFs occurs via the MIR (MYB-interacting region), a 250 amino acid long region corresponding to the N-terminal portion of group III bHLH proteins [40-41]. The amino acids in that region that are responsible for the specificity of the interaction remain to be determined. This interaction domain seems to be functionally conserved across plant species, since maize R can interact, for example, with the MYB proteins WER and CPC from Arabidopsis and FaMYB1 from strawberry [42-44].

1.3.1.1 The MYB Domain Regulators

1.3.1.1.1 General Introduction to MYB Domain Proteins

"MYB" is an acronym derived from "myeloblastosis", an old name for a type of leukemia. The MYB transcription factor family was first recognized in the form of the v-Myb oncogene of the avian myeloblastosis virus (AMV) [45]. These so-called “classical” MYBs display a modular structure comprising three imperfect repeats (R1R2R3) of about 50 amino acids, each forming a helix-turn-helix structure. Each repeat contains three regularly spaced tryptophan residues, a hallmark of MYB proteins [46].

The first plant MYB gene identified was the C1 (COLORLESS1) locus of maize [47]. In contrast to the small number of MYB-domain proteins found in vertebrates, 173 MYB-domain proteins have been identified in maize, 125 in rice (http://grassius.org/) and 133 in the dicot Arabidopsis (http://arabidopsis.med.ohio-state.edu) [48-49]. Most plant MYB domains comprise two repeats (R2R3), but three-repeat proteins (pc-MYB or 3R-MYB) and single MYB repeat proteins (R3) exist as well [49-53]. R2R3-MYB proteins are found only in plants, where they control plant-specific processes [54], such as the regulation of secondary metabolism (e.g., anthocyanin biosynthesis [47], phlobaphene pigmentation [55] and flavonol accumulation (AtMYB12) [56-57]) and developmental processes such as trichome formation [58], formation of root hairs [42] or stomatal development [59]. In addition, plant MYB proteins participate in plant responses to environmental factors such as viral infections or drought [60-61].

Recently, a new function of MYB genes has been discovered. Expression of v-MYB (R1R2R3-MYB) induces extensive chromatin remodeling of target genes, for example the mim-1 enhancer region [62]. Chromatin remodeling involves alterations of nucleosomal organization and is transcription dependent [62]. These studies have been performed in vertebrates and it will be interesting to see whether plant R2R3-MYBs or 3R-MYBs can perform similar functions.

1.3.1.1.2 MYB-Domain Proteins Involved in Flavonoid Biosynthesis

The maize R2R3-MYB domain proteins C1 or PL1 (PURPLE LEAF 1) regulate anthocyanin biosynthesis when dimerizing with a bHLH protein, R or B (BOOSTER1). C1 and PL1, as well as R and B, are functionally equivalent but differ in their temporal and spatial expression. C1 is expressed in the aleurone and embryo of the maize kernel and at low levels in the husks [63], whereas PL1 is expressed in the vegetative tissue [64-66]. The MYB domain of C1 has been shown to bind DNA and to interact with R in vitro and in vivo, whereas the C1 C-terminal domain mediates transcriptional activation via an acidic activation domain [67-68]. C1 binds DNA in vitro with low affinity and is able to activate transcription of all known biosynthetic genes in the pathway, but only if it physically interacts with R [68].

P1 (PERICARP COLOR 1) is an R2R3-MYB domain protein that regulates biosynthesis of the red phlobaphene pigments in maize kernels, flavonoids found exclusively in maize and a few other plant species (e.g., sorghum and gloxinia) [69]. P1 activation of the biosynthetic genes studied so far (e.g. A1, which encodes DFR; DIHYDROFLAVONOL REDUCTASE) does not require a known bHLH protein. However, when P1 was co-expressed in transient expression experiments with the petunia R-like protein AN1 (ANTHOCYANINLESS 1), the activation of DFR was increased 10-fold compared to activation by P1 alone [70]. It remains to be determined whether P1 and AN1 physically interact and, if they do, what distinguishes AN1 and whether a maize bHLH protein exists that is required for the activation of some biosynthetic genes by P1. Furthermore, it needs to be determined whether similar regulatory processes exist in other plant systems.

P1* is a synthetic mutant of P1 with six amino acid changes in the MYB domain that reflect the residues present in C1 at identical positions. These changes allow P1* to interact with R and to activate the entire anthocyanin pathway. The interaction of P1* with R leads to an R-enhanced activity on the A1 promoter compared to C1 and R, and this activity requires the MYB-binding sites and the anthocyanin regulatory element (ARE) [71].

PAP1 and PAP2 (PRODUCTION OF ANTHOCYANIN PIGMENT 1 and 2) are Arabidopsis C1 homologs, and overexpression of these MYB factors leads to higher anthocyanin accumulation in the whole plant body [72-73]. Recently, two other MYB-domain proteins have been identified (AtMYB113, AtMYB114), which, when expressed from the CaMV (cauliflower mosaic virus) 35S promoter (p35S), lead to an increase in anthocyanin accumulation in young hypocotyls and cotyledons of Arabidopsis plants [73]. Interestingly, when they were overexpressed in the gl3/egl3/tt8 triple mutant, no pigmentation was observed, suggesting that AtMYB113 and AtMYB114 depend on the expression of a bHLH partner. TT2 (TRANSPARENT TESTA 2) is another Arabidopsis MYB-domain protein, which regulates the expression of DFR, LDOX, BAN and TT12, genes involved in proanthocyanidin (condensed tannin) pigmentation in the seed coat. TT2 interacts with the bHLH protein TT8 for regulatory function [74].

In petunia, the R2R3-MYB protein AN2 regulates anthocyanin biosynthesis in the flower, and its paralog AN4 is expressed in anthers and specifies their pigmentation [4, 75]. AN2 and AN4 activate expression of AN1, a bHLH protein involved in anthocyanin biosynthesis. AN1 and AN2 are sufficient to activate the DFRA promoter in leaf cells.

From studies in the plants mentioned above and also in other monocots, dicots and even the gymnosperm Picea mariana (black spruce) [76], the picture arises that MYB regulators are among the main players in flavonoid biosynthesis.

1.3.1.2 The bHLH Domain Regulators

1.3.1.2.1 General Introduction to bHLH Proteins

bHLH proteins are a group of transcription factors that regulate many essential physiological and developmental processes in eukaryotic cells [65, 77]. The bHLH domain contains approximately 60 amino acids with two functionally distinctive regions, the basic region and the helix-loop-helix (HLH) domain (Fig. 1.3A).

The basic region consists of a stretch of ~13 mainly basic amino acids and is able to bind to the CANNTG sequence (E-box), where N corresponds to any nucleotide. Three amino acids in the basic region are highly conserved and directly involved in binding DNA: a histidine at position five (H5), a glutamic acid at position nine (E9) and an arginine at position 13 (R13), together forming the H-E-R motif (Fig. 1.3A). E9 is conserved throughout eukaryotes and contacts the C nucleotide in the E-box [78]. This amino acid is a general indicator that distinguishes DNA-binding bHLH proteins from non-binders. Indeed, proteins that lack the corresponding glutamic acid and show a low frequency of basic amino acids generally do not bind DNA [78].
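The two sequence rules described above (the CANNTG E-box consensus and the H5/E9/R13 positions of the basic region) can be sketched in a few lines of Python. This is an illustration only and not a method used in this work: the function names `find_eboxes` and `has_her_motif` and the toy sequences are hypothetical, and the H-E-R check assumes that the basic region has already been extracted and is numbered from position 1.

```python
import re

# E-box consensus CANNTG, where N is any nucleotide.
# A lookahead is used so that overlapping sites are also reported.
# CANNTG is its own reverse-complement pattern, so scanning one strand
# finds the sites on both strands.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(dna):
    """Return (0-based start, site) for every E-box in the sequence."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(dna.upper())]


def has_her_motif(basic_region):
    """Check for H5, E9 and R13 in a basic region numbered from 1.

    Assumption: the caller supplies the ~13-residue basic region already
    aligned so that its first residue is position 1.
    """
    if len(basic_region) < 13:
        return False
    region = basic_region.upper()
    return region[4] == "H" and region[8] == "E" and region[12] == "R"


if __name__ == "__main__":
    # Toy promoter fragment containing a G-box (CACGTG) and a second E-box.
    print(find_eboxes("TTCACGTGAACAGCTGTT"))
    # Toy basic regions, not taken from any real protein.
    print(has_her_motif("KRSNHNAVERRRR"), has_her_motif("KRSNHNAVARRRR"))
```

A check of this kind only flags candidate DNA binders; as noted in the text, binding itself has to be confirmed experimentally.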

The HLH motif contains two amphipathic α-helices with a variable loop in between (Fig. 1.3A) and mediates homo- or heterodimerization [79]. Dimerization is a prerequisite for DNA-binding, meaning that bHLH proteins can bind DNA only as dimers. Most bHLH proteins seem to preferentially form heterodimers over homodimers [80-81], and depending on which dimer is formed, they can act as repressors or activators [81]. Dimerization partners do not have to include a bHLH motif, as shown by the interaction of human MYC with MIZ-1 [82]. MIZ-1 is a POZ TF which targets MYC to non-E-box BS and represses BCL2 (B-cell lymphoma 2), leading to apoptosis [83].

bHLH proteins were originally classified according to their ability to bind variations of the E-box cis-regulatory sequence in regulatory regions [84-86]. Most plant bHLH proteins have evolved from group B of bHLH proteins, which bind to G-box (CACGTG) elements. In Arabidopsis, 74% of the bHLH proteins are predicted to bind E-box sequences according to the presence of the conserved H-E-R motif in the basic region [87]. Since only 10% of Arabidopsis bHLH proteins have been functionally characterized, and even fewer in important crops like rice and maize (Table 1.1), in vitro and in vivo DNA-binding studies need to be performed to confirm these predictions.

Various phylogenetic analyses using A. thaliana and O. sativa bHLH proteins have been performed [40, 88]. According to these analyses, bHLH proteins were classified into 15-25 subgroups, with most groups containing additional conserved domains outside the bHLH domain (e.g. MIR, PAS = PER/ARNT/SIM domain, WRPW = Trp-Arg-Pro-Trp). The bHLH proteins of the plant lineage seem to have expanded compared to the animal lineage, and the two lineages show no sequence conservation outside the conserved DNA-binding domain [88]. Recently, the bHLH proteins from nine land plants and algae were classified according to their evolutionary relationship and grouped into 26 subfamilies containing 28 non-bHLH motifs [89]. The functions of various plant bHLH proteins are summarized in Table 1.1 (adapted from [89]).

Several subgroups of bHLH proteins contain a leucine zipper (LZ) adjacent to the second helix. The LZ forms an α-helical structure which extends the second helix of the HLH domain (Fig. 1.3A). This might help stabilize the bHLH dimer and therefore provide stronger DNA-binding. It might also help restrict promiscuous interactions with other bHLH or bHLHZ proteins and thereby support DNA-binding specificity.

One class of bHLH factors that needs to be mentioned, represented in both plants and animals, comprises the HLH proteins, or non-DNA binders. Recently, two HLH proteins, BU1 from rice and PRE1 from Arabidopsis, both missing the basic DNA-binding helix, have been identified and shown to positively regulate brassinosteroid (BR) and gibberellic acid (GA) signaling, respectively [90-91]. Proteins that lack the basic domain belong to group D and are usually negative regulators, such as the Id-like proteins from mammals [87]. The mouse protein Id acts as a negative regulator by dimerizing with bHLH transcriptional activators such as MyoD, which prevents binding to the muscle creatine kinase enhancer and thus inhibits transactivation [92].

Most bHLH proteins contain a nuclear localization signal (NLS) sequence as part of the basic region, which promotes localization to the nucleus [93]. Even in the absence of the basic domain, many bHLH proteins are able to move to the nucleus, either because of an additional NLS in the protein or because of dimerization with bHLH proteins that contain an NLS [94-95].

1.3.1.2.2 bHLH Proteins Involved in Flavonoid Biosynthesis

The first plant bHLH protein identified was the maize protein R, and it has been shown to function as a co-regulator in flavonoid biosynthesis [65]. The role of R in transcriptional regulation is still under investigation. We now know that R is an essential co-activator of C1 but does not increase the in vivo affinity of C1 for DNA [71]. From studies with C1, P1 and P1*, a few possible models for R function have been proposed and will be discussed throughout this dissertation [71]. Possible models for R function include, but are not limited to: (i) release of the effect of a C1 inhibitor or of an inhibitory domain within C1; (ii) mediation of localization of C1 to the nucleus; (iii) stabilization of the C1 protein; or (iv) serving as a docking platform for additional factors required for transcriptional activation.

Another bHLH protein identified in maize is IN1 (INTENSIFIER1), which negatively regulates the anthocyanin biosynthetic pathway. It is highly similar to R in the first 220 amino acids as well as in the bHLH domain, suggesting that it can dimerize with MYB-domain and bHLH proteins. The basic region of IN1 contains the H-E-R motif proposed to be important for DNA-binding (H5, E9 and R13). There are several possible ways in which IN1 could function as an inhibitor. On the one hand, IN1 could form non-functional heterodimers with a bHLH protein such as R or B and thereby inhibit functional homo- or heterodimerization or DNA-binding. On the other hand, IN1 could bind to a MYB domain protein such as C1 or P1 and inhibit its function, i.e. DNA-binding or binding to an activating bHLH [96]. How exactly IN1 functions will be interesting to determine, and the mechanism may apply to other regulatory processes. Interestingly, a bHLH protein acting as an inhibitor of anthocyanin pigmentation has not been found in any other plant to date.

In Arabidopsis, four R-like bHLH proteins (GL3, EGL3, TT8 and AtMYC1) are present, and all seem to participate to different extents in anthocyanin production. The tt8 and gl3 single mutants produce a significant amount of anthocyanin in hypocotyls and cotyledons of 5-day-old seedlings, but neither gl3/egl3 double nor gl3/egl3/tt8 triple mutants show any anthocyanin pigmentation [97]. This suggests that EGL3 is the major contributor to anthocyanin biosynthesis but that TT8 and GL3 also play a role. The role of AtMYC1 (bHLH012) in anthocyanin formation is still unclear. Some evidence for a possible function in anthocyanin pigmentation comes from its interaction with PAP1 and PAP2 [57]. AtMYC1 can activate a DFR::GUS promoter::reporter construct when transformed together with PAP1 into Arabidopsis protoplasts, but this activation is much weaker than the activation by PAP1 with either GL3, EGL3 or TT8 [72].

GL3 and EGL3 also interact with the MYB domain proteins PAP1/2 and GL1 and have been shown to homodimerize via the C-terminus, which includes the bHLH domain [97]. I will show in Chapter 4 that these proteins, as well as TT8 and AtMYC1, are able to dimerize specifically through an extended bHLH domain, and that the presence of the region C-terminal to the bHLH domain stabilizes this dimerization.

Interestingly, a GL3 mutation shows a pleiotropic phenotype with respect to trichome formation. Trichome initiation is moderately affected, but trichome branching, endoreduplication and cell size are strongly influenced [98]. In addition to reduced anthocyanin pigmentation, the gl3/egl3 double mutant shows other phenotypes, including no trichomes, less seed coat mucilage and defects in root hair morphology. This observation suggests that GL3 and EGL3 have overlapping regulatory capabilities and that they regulate processes other than flavonoid biosynthesis, among other things by interacting with different MYB domain proteins.

The petunia bHLH proteins JAF13 and AN1 activate DFR when co-expressed with the MYB domain protein AN2. JAF13 is the ortholog of maize R, whereas AN1 is more closely related to IN1 from maize. In addition to pigment formation, AN1 controls acidification of the vacuole in petal cells as well as size and morphology in the seed coat epidermis, most likely by interacting with another protein via the bHLH domain or the region C-terminal to the bHLH domain [70, 99].

The examples mentioned above suggest that bHLH proteins have multiple functions and that one bHLH protein can regulate more than one cellular process by interacting with different sets of proteins. R seems to be involved in trichome formation as well as root hair formation in Arabidopsis, since overexpression of R in A. thaliana leads to increased trichome numbers and a reduction in root hairs [100]. To explore the possibility that R might have other functions in maize, I performed a yeast two-hybrid screen using cDNA libraries from maize. I identified several R-interacting factors involved in anthocyanin biosynthesis as well as in other cellular processes; these are described in Chapters 3 and 5 of this dissertation.

1.3.1.3 Other Regulators Required for Flavonoid Biosynthesis

In Arabidopsis and petunia, anthocyanin production requires a WD repeat-containing protein in addition to MYB and bHLH proteins. WD-repeat proteins contain tandem repeats of about 40 amino acids, all of them ending in WD (Trp-Asp) [101]. Structure analysis revealed a β-propeller structure which seems to be necessary for protein-protein interactions. AN11 from petunia and TTG1 from Arabidopsis encode WD40-repeat proteins, and mutations in AN11 or TTG1 lead to non-pigmented flowers in petunia and no pigmentation in stems and leaves in Arabidopsis, respectively [102-103]. Besides the defects in anthocyanin accumulation, ttg1-1 mutants have other pleiotropic effects, such as lack of , lack of trichomes, excess root hair formation and no seed coat mucilage [103]. Ectopic expression of GL3, EGL3, TT8 or R can completely or partially complement the pleiotropic ttg1-1 mutant phenotype and bypass the need for TTG1 [97].

In maize, the PALE ALEURONE COLOR1 gene (PAC1) encodes a WD40 protein shown to complement the Arabidopsis ttg1-1 mutant phenotype [104]. Mutants in PAC1 exhibit, specifically in the aleurone, a reduction in the mRNA levels of the structural genes A1, Bz1 and A2 but not of the regulatory genes C1 and B. Neither AN11 nor PAC1 mutants show pleiotropic effects, suggesting that other WD40-repeat proteins are expressed and are involved in these functions, such as perhaps maize MP1 [105]. WD-repeat proteins involved in anthocyanin formation interact with bHLH proteins via the acidic domain between the MIR and the bHLH domain (Fig. 2.1) [97] (Oh and Grotewold, unpublished). In addition, TTG1 can interact with TT2 and PAP1 [106] (Norambuena and Grotewold, unpublished).

Other regulatory proteins have been implicated in pigment biosynthesis. A. thaliana TT1 and TTG2 are a WIP plant zinc-finger protein and a WRKY zinc-finger protein, respectively, and are involved in the accumulation of condensed tannins in the seed coat [107-108]. ANL2 (ANTHOCYANINLESS 2) from Arabidopsis is a homeodomain protein which controls accumulation of anthocyanins in subepidermal tissue of seedlings and mature plants, as well as cellular organization of the primary root [109]. It will be interesting to see if these types of proteins are involved in flavonoid biosynthesis in other plant species as well.

1.3.2 Structural Genes of the Anthocyanin Biosynthetic Pathway

The maize A1 gene is the best studied structural gene in the flavonoid biosynthetic pathway. The A1 promoter has been used in many studies as a model to understand the process of DNA-binding by C1/R and P1. It has been shown that the region -123 to -88 upstream from the TSS is necessary for activation of the A1 gene by the anthocyanin and phlobaphene regulatory genes [110]. This region has a modular structure and contains two MYB-like BS, the high affinity P1-binding site (haPBS) and the low affinity P1-binding site (laPBS) (Fig. 4.4A). It was established by gel retardation assays and DNase footprinting that C1 and P1 can bind to both sites with different affinities [68, 111]. The region necessary for transcriptional activation of A1 does not contain a canonical E-box, but it harbors an anthocyanin regulatory element (ARE) present also in the promoters of the A2 and the Bz1 genes, where it resembles an E-box BS and is located downstream of both MYB-binding sites [112] (Fig. 4.4A). This region has been proposed to recruit the bHLH partner either by direct binding or through the recruitment of other factors. All three cis-regulatory elements participate in full promoter activity, as shown in transient expression experiments [110-111].
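Promoter coordinates such as "-123 to -88" are given relative to the TSS. As a minimal sketch of how such a window maps onto a sequence string, the helper below slices an upstream region out of a longer sequence. The function name `upstream_window`, the `tss_index` parameter and the toy sequences are hypothetical, and the sketch assumes the common convention that the TSS is position +1 and the base immediately upstream of it is -1, with no position 0.

```python
def upstream_window(seq, tss_index, start, end):
    """Return the bases from `start` to `end` (both negative, inclusive).

    `tss_index` is the 0-based index of the +1 base in `seq`.
    Assumption: position -1 lies directly upstream of +1 (no position 0).
    """
    if not (start <= end < 0):
        raise ValueError("expected start <= end < 0")
    lo = tss_index + start      # index of the most upstream base
    hi = tss_index + end + 1    # slice end is exclusive
    if lo < 0:
        raise ValueError("window extends beyond the 5' end of seq")
    return seq[lo:hi]


if __name__ == "__main__":
    # Toy sequence in which 'F' stands in for the +1 base.
    print(upstream_window("ABCDEFG", 5, -3, -1))
```

With the A1 coordinates quoted above, a call such as `upstream_window(promoter, tss_index, -123, -88)` would return a 36-base window; the A1 sequence itself is not reproduced here.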

Comparison of the MYB-binding sites in A1 with promoter sequences in Bz1, A2, C2 and Bz2 revealed a high degree of similarity (Fig. 4.4A). All cis-regulatory elements in the C2 promoter are predicted, and I will show in Chapter 4 that the 223 nucleotides upstream of the TSS are sufficient for activation by C1 and R. The Bz1 promoter has been examined, and the region -76 to -45 bp upstream of the TSS was determined to be necessary for transcriptional activation by C1 and R [113]. This region contains a C1-BS and an E-box (Fig. 4.4A). Mutation of the C1-BS or of the E-box reduced the promoter activity to 10% or 1%, respectively, suggesting that both elements are important for activation.

1.4 Research Goal

This dissertation discusses the role of the bHLH protein R in the regulation of maize anthocyanin biosynthesis and possibly other cellular processes. R was the first plant bHLH protein identified and to date is probably the best studied plant bHLH protein. Several conserved domains have been identified, but only the N-terminal MIR has been functionally characterized and shown to be involved in protein-protein interactions [114].

Chapter 2 of this dissertation investigates the role of the C-terminal region in R, covering amino acids 525 to 610 of the protein. I have shown that this region can form homodimers, is necessary for anthocyanin pigmentation and has structural homology to an ACT domain. Together with Dr. J. Marcela Hernandez, a former graduate student in the Grotewold laboratory, I determined that this fold is present in many other bHLH proteins and that some other bHLH proteins dimerize via it.

In Chapter 3, I present the role of the highly conserved bHLH domain in transcriptional regulation and in other regulatory processes such as histone modifications. This work was done together with Dr. Hernandez, Dr. Kengo Morohashi and Mr. Kenneth Frame. We have determined that the bHLH domain is dispensable for the activation of transiently expressed genes but necessary for the expression of endogenous genes. In addition, I identified an R-interacting factor (RIF1) which specifically interacts with the bHLH domain and which contains a domain related to chromatin function, the AGENET domain. This RIF1 protein links the transcriptional regulatory function of R with chromatin modification and is part of the complex formed on the A1 promoter. This finding adds considerably to our knowledge of combinatorial gene regulation in plants.

In Chapter 4, I investigate the role of the bHLH domain and ACT domain in DNA-binding. R has always been referred to as a MYC-like protein; the difference between the two is the presence in MYC of the LZ adjacent to the second helix of the bHLH domain. Structure prediction analyses and DNA-binding studies performed by Dr. Ling Yuan’s laboratory at the University of Kentucky revealed that R contains an LZ-like motif adjacent to the bHLH domain which is necessary for homodimerization and DNA-binding. I performed DNA-binding studies in vitro (EMSA, Electrophoretic Mobility Shift Assay) and in vivo (ChIP and ChIP-seq) with the help of Dr. Chenglin Chai and Dr. Morohashi, respectively, and determined binding of R to several anthocyanin biosynthetic promoters in the presence and absence of the ACT domain. On A1, R dimerizes via the ACT domain and binds to the regulatory region through C1. However, on promoters containing an E-box, R homodimerizes via the bHLH-LZ-like domain and binds DNA directly. Depending on how R is tethered to DNA, the presence of the ACT domain might inhibit or enhance DNA-binding. Together with the finding that C1 is required for homodimerization of R, this suggests that there is an active interplay between different domains of R and that, depending on the dimerization partner, R might target different genes.

In Chapter 5, I examine other R-interacting factors identified in yeast two-hybrid screens. Investigation of these factors suggests additional regulatory functions for R in other pathways.

Figure 1.1: The flavonoid biosynthetic pathway. Shown are five branches of the pathway leading to flavones, flavonols, anthocyanins, condensed tannins and phlobaphene pigments. Biosynthetic genes are shown in blue, in red and brackets. The regulators of the pathway are marked on the right in brown (P1) or purple (C1+R). Adapted from [69, 115].

Figure 1.2: Models for TF and chromatin remodeling factor binding. A, Based on over-represented motifs from ChIP-chip data, Kato et al. (2004) hypothesized three types of binding mechanisms for TFs: a) direct binding, b) indirect piggyback binding, c) indirect cross-binding. B, Schematic representation of the histone tail with either acetylated (ac) or methylated (me) lysine residues. Bromodomain-containing proteins bind acetylated lysine residues while members of the “Royal family” such as chromo-, Tudor-, PHD- or MBT domain proteins bind methylated lysines. This figure was modified from [12].

Figure 1.3: Plant bHLH proteins. A, Alignment of the bHLH domain of R-like proteins from maize (R, B and IN1), Arabidopsis (GL3, EGL3, TT8, AtMYC1), petunia (AN1, JAF13) and A. majus (DEL). Numbers on top of the alignment correspond to amino acids in R. The H-E-R motif in the basic region is marked in blue. B, Phylogenetic tree including bootstrap analysis of the R-like proteins used in the alignment in Fig. 1.3A.

Name | bHLH # | Function | References

Subfamily Ia
AtMUTE, AtFAMA, AtSPCH | AtbHLH045, AtbHLH097, AtbHLH098 | Control sequential cell fate specification during stomatal differentiation | [116-118]
OsMUTE, OsFAMA, OsSPCH2 | OsbHLH055, OsbHLH051, OsbHLH053 | Control of stomata development | [119]

Subfamily Ib(1)
RGE1/ZHOUPI | AtbHLH095 | regulates embryonic development and endosperm breakdown | [120-121]

Subfamily Ib(2)
OsIRO2 | OsbHLH056 | regulates genes involved in Fe uptake under Fe-deficiency conditions | [122]

Subfamily III(a+c)
FIT | AtbHLH029 | required for the up-regulation of responses to iron deficiency in Arabidopsis roots | [123]
RERJ1 | OsbHLH006 | involved in the rice shoot growth inhibition caused by jasmonic acid | [124]

Subfamily IIIb
ICE/SCRM | AtbHLH116 | control stomatal development; implicated in the cold acclimation response and freezing tolerance | [125]
ICE2/SCRM2 | AtbHLH033 | | [126-127]

Subfamily III(d+e)
MYC2/JAI1/JIN1 | AtbHLH006 | involved in abscisic acid, jasmonic acid and light signalling pathways | [128-130]
AIB | AtbHLH017 | involved in abscisic acid signalling | [131]
PsGBF (Pea) | - | regulates biosynthetic pathway | [132]

Subfamily IIIf
TT8, GL3, EGL3 | AtbHLH042, AtbHLH001, AtbHLH002 | partially redundant; regulate anthocyanin and proanthocyanidin biosynthesis; trichome, root hair and seed coat epidermal cell development | [97-98, 133-135]
Ra/OSB1, Rb, Rc, OSB2, Lc, IN1, An1 (Petunia) | OsbHLH013, OsbHLH165, OsbHLH017, OsbHLH016, -, -, - | regulate the anthocyanin biosynthetic pathway | [70, 96, 136-139]

Subfamily IVa
NAI1 | AtbHLH020 | required for the formation of an endoplasmic reticulum-derived structure, the ER body | [140]

Subfamily IVc
ILR3 | AtbHLH105 | modulate metal homeostasis and auxin-conjugate metabolism | [141]

Subfamily Va
BIM1, BIM2, BIM3 | AtbHLH046, AtbHLH102, AtbHLH141 | implicated in brassinosteroid signalling | [142]

Subfamily VII(a+b)
PIF1/PIL5, PIF3, PIF4, PIF5/PIL6, PIF7 | AtbHLH015, AtbHLH008, AtbHLH009, AtbHLH065, AtbHLH072 | bind to activated phytochromes and mediate light and gibberellin signalling responses; mediates plant architecture responses to high temperatures (PIF4) | [143-146]
HFR1 | AtbHLH026 | mediate both phytochrome and cryptochrome signalling | [147]
SPATULA | AtbHLH024 | regulator of carpel margin development; mediator of germination responses to light and temperature | [148-149]
ALCATRAZ | AtbHLH073 | required for the formation of a cell layer necessary for fruit dehiscence | [150]
UNE10 | AtbHLH016 | involved in the fertilization process | [151]
BP-5 | OsbHLH102 | involved in the regulation of amylose synthesis in the rice endosperm | [152]

Subfamily VIIIb
HEC1, HEC2, HEC3 | AtbHLH088, AtbHLH037, AtbHLH043 | redundantly control the development of the transmitting tract and stigma; each of these proteins can form hetero-dimers with SPATULA | [153]
LAX | OsbHLH122 | regulator of axillary meristem generation | [154]
INDEHISCENT | AtbHLH040 | required for the differentiation, in the Arabidopsis fruit, of three cell types; involved in seed dispersal | [155]

Subfamily VIIIc(1)
AtRHD6, AtRSL1 | AtbHLH083, AtbHLH086 | required for the formation of root hairs | [156]
PpRSL1, PpRSL2 | PpbHLH043, PpbHLH033 | redundantly required for the development of rhizoids and caulonemata | [156]

Subfamily XI
UNE12 | AtbHLH059 | involved in the fertilization process | [151]
PTF1 | OsbHLH096 | involved in the responses to phosphate deficiency stress | [157]

Subfamily XII
ZCW32/BPE | AtbHLH031 | controls petal size | [158]
BEE1, BEE2, BEE3 | AtbHLH044, AtbHLH058, AtbHLH050 | redundant positive regulators of brassinosteroid signalling | [159]
CIB1, CIB5 | AtbHLH063, AtbHLH076 | shown to interact with the blue-light receptor CRY2 and promote floral initiation | [160]

Subfamily XIII
LHW | AtbHLH156 | regulates the size of the vascular initial population in the root meristem | [161]

Subfamily XIV
SAC51 | AtbHLH142 | involved in a spermidine synthase mediated stem elongation process | [162]

Subfamily XV
PRE1, PRE2, PRE3, PRE4 | AtbHLH136, AtbHLH134, AtbHLH135, AtbHLH161 | proposed to act as positive regulators of gibberellin signalling | [91]
KIDARI | At1g26945 | represses light signal transduction; interacts and negatively regulates HFR1 | [163]

Orphans
AMS | AtbHLH021 | required for correct anther development, particularly tapetum development | [164]
DYT1 | AtbHLH022 | | [165]
TDR | OsbHLH005 | | [166]
Udt1 | OsbHLH164 | | [167]
MEE8 | AtbHLH108 | required for early embryo development | [151]
PAR1 | At2g42870 | negatively control growth and metabolic shade avoidance responses | [168]

Table 1.1: Functional characterization of bHLH proteins from different species. The Table was adapted and modified from [89].

CHAPTER 2

AN ACT-LIKE DOMAIN PARTICIPATES IN THE DIMERIZATION

OF PLANT BASIC HELIX-LOOP-HELIX TRANSCRIPTION

FACTORS

This Chapter is based on the following publication: Feller A, Hernandez JM, and Grotewold E. (2006) An ACT-like domain participates in the dimerization of several plant basic-helix-loop-helix transcription factors. J Biol Chem 281:28964-74. Dr. Hernandez performed the GST pulldowns (Fig. 2.2B), analyzed the dimerization domain structure (Fig. 2.5A) and performed phylogenetic analysis to detect ACT domains in other bHLH proteins (Fig. 2.6A). In addition, she tested dimerization of At2g22770 with itself and with R (Fig. 2.6B).

2.1 Introduction

Proteins containing the basic-helix-loop-helix (bHLH) domain compose one of the largest transcription factor families in plants [40, 87-88, 169]. The bHLH signature that defines the family is constituted by an N-terminal ~16-amino acid-long basic α-helix that binds DNA at the canonical E-box (CANNTG, where N is any nucleotide) [170] and a C-terminal helix-loop-helix (HLH) domain involved in homo- and/or heterodimerization [79]. Some factors, however, such as the Id myogenic regulator or the Arabidopsis light signal regulator KIDARI, lack the basic region and function as inhibitors by forming heterodimers that cannot bind DNA [163, 171]. It is common for bHLH proteins to contain additional protein-protein interaction domains that contribute in unique ways to their regulatory function [172]. For example, the MYC proto-oncoprotein forms heterodimers with MAX through the respective bHLH and adjacent basic leucine zipper (LZ) domains [173]. In contrast to MYC, which cannot homodimerize, MAX can form homo- or heterodimers with several related proteins, including MAD1 and MNT [174]. The bHLH region of MYC mediates the interaction with MIZ-1, a POZ transcription factor that allows MYC to bind and repress promoters lacking the CACGTG G-box [82, 174]. Plant bHLH proteins are also characterized by the presence of several conserved domains in addition to the bHLH motif. For example, the analysis of 544 bHLH proteins from nine species of land plants and algae revealed 26 subfamilies of bHLH proteins with 28 conserved non-bHLH amino acid motifs [89]. Only a few of these domains have been functionally characterized, among them the APB motif (active phytochrome binding), necessary for PIF4 function in phyB signaling [175], and the region that mediates the interaction with R2R3-MYB factors [72, 114], which is central to providing R2R3-MYB transcriptional regulators that have very similar DNA-binding preferences with the ability to control distinct sets of target genes in vivo [176]. The bHLH/R2R3-MYB cooperation is best exemplified by the interaction between the N-terminal region of a member of the R/B group of bHLH regulators and the R2R3-MYB domain of C1 [114]. The interaction with R is essential for the ability of C1 to activate all known flavonoid biosynthetic genes, resulting in anthocyanin pigment accumulation [71].
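As an illustration of the E-box consensus mentioned above (CANNTG, where N is any nucleotide), the following minimal Python sketch scans a DNA sequence for matching sites. The function name and the example sequence are hypothetical and chosen only for illustration; they are not taken from any promoter analyzed in this work. Because the reverse complement of CANNTG is again CANNTG, scanning one strand is sufficient to locate every site.

```python
import re

# Canonical E-box consensus from the text: CANNTG, where N is any nucleotide.
# The lookahead allows overlapping matches to be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based position, site) for every E-box in the sequence."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


# Hypothetical sequence, invented for illustration only.
example = "TTCACGTGAACAGCTGTT"
for position, site in find_eboxes(example):
    print(position, site)
```

In this made-up sequence, the first site found (CACGTG) is the G-box variant of the E-box referred to above.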

A second function of R was uncovered using a mutant of the R2R3-MYB P1 regulator (P1*) [41]. P1 normally regulates a subset of the C1/R-regulated genes [111]. Among the P1-regulated genes is A1, which is necessary for the formation of the anthocyanins (controlled by C1+R) and the phlobaphenes (controlled by P1) (Fig. 1.1) [2]. Unlike P1, P1* interacts with R [41], and in the presence of R, the P1* regulatory activity is enhanced. This R-enhanced activity requires the ARE (anthocyanin regulatory element) present in the promoter of A1 and other anthocyanin biosynthetic genes [71, 112].

Based on the bHLH domain and the presence of conserved N-terminal regions, R belongs to group IIIf of the plant bHLH gene family [40]. This group also includes the Arabidopsis GL3 (GLABROUS3), EGL3 (ENHANCER OF GLABRA3), TT8 (TRANSPARENT TESTA8) and AtMYC1 proteins, which participate in trichome and root hair formation and in the control of flavonoid pigments [98, 106, 133-134, 177], functions that can be complemented by the maize R gene, Lc [178]. Similar to the interaction of R with C1, members of this group of bHLH proteins function by interacting with R2R3-MYB factors. For example, GL3 and EGL3 interact with the related R2R3-MYB GL1 (GLABROUS1) and WER (WEREWOLF) proteins to control trichome or root hair production, respectively, or with PAP1 (PRODUCTION OF ANTHOCYANIN PIGMENT1) to control anthocyanin accumulation [97]. Similarly, TT8 interacts with the R2R3-MYB TT2 (TRANSPARENT TESTA2) to activate proanthocyanidin accumulation in the seed coat [106]. The picture that emerges from these and other studies is that the R2R3-MYB factors are responsible for providing the specificity for a particular process, whereas the bHLH factors play more pleiotropic roles, being shared between two or more cellular processes [71, 97]. Given the multiplicity of conserved domains in R and R-related proteins and what is known about the mechanism by which animal bHLH proteins function, the question remains as to whether regions in R participate in hetero- or homodimer formation and how these interactions contribute to R function.

Here I show that, similar to other bHLH proteins, R contains a dimerization domain that can direct the formation of homodimers in vitro and in vivo. This dimerization domain is necessary for the R regulatory activity, as demonstrated by mutants that exhibit a significant reduction of the activation of maize flavonoid genes and anthocyanin accumulation. The dimerization region of R is C-terminal to the bHLH motif and has structural similarity to the ACT domain involved in the regulation of many amino acid metabolic enzymes. Structural homology searches identified other bHLH factors with similar ACT domains, and we show that some of them can mediate the formation of specific dimers. These findings highlight the role of a novel dimerization domain for the R regulatory activity and provide evidence for the recruitment by eukaryotic transcription factors of protein-protein interaction domains characteristic of metabolic enzymes.

2.2 Materials and Methods

2.2.1 Plasmids used in transient expression experiments

Unless otherwise specified, all plant expression vectors include the cauliflower mosaic virus 35S promoter, the tobacco mosaic virus Ω’ leader and the first intron of maize Adh1-S in the 5’ untranslated region, and the potato proteinase inhibitor II (pinII) termination signal.


Previously described plasmids include p35S::C1 (pPHP665), pBz1::Luc, pA1::Luc, pA1mPBS::Luc [68, 111], pA2::Luc [112], pBz2::Luc [179], p35S::R (pPHP471) and p35S::P1* (35S::PI77L, K80R, A83R, T84L, S94G, H95R) [41]. p35S::BAR (pPHP611), also previously described [180], was used for normalizing the concentration of p35S sequences delivered in each bombardment. The pGAL4BS::Luc reporter construct and p35S::GAL4DBD-C1-Cterm were also described [71, 181]. p35S::GAL4DBD-R411-610 was generated by PCR and then cloned into a plant expression vector generated on a pBluescript backbone and containing the features described above, except that it did not contain a terminator. pUbi::GUS (pPHP3953) was used to normalize the efficiency of each bombardment [182]. To express a p35S::R-GFP fusion protein for transient expression studies in maize and in N. benthamiana, the coding region was cloned into the pENTR/D-TOPO vector (Invitrogen) and recombined into the pGWB5 binary vector containing a C-terminal GFP (green fluorescent protein) fusion. The pGWB5 vector was kindly provided by Dr. Tsuyoshi Nakagawa (Research Institute of Molecular Genetics, Shimane University). p35S::RΔ532–560-GFP was generated with the QuikChange PCR kit (Stratagene, La Jolla, CA) using the pENTR/D-TOPO clone as a template followed by recombination into pGWB5. All constructs were verified by DNA sequencing.

2.2.2 Plasmids Used in Yeast Two-hybrid Experiments

The C-terminus of R with (amino acids 411–610) and without (462–610) the bHLH region, the bHLH domain only, and a region containing the last 86 amino acids (525–610) were generated by PCR and cloned into the pADGAL4 and/or pBDGAL4 vector (Stratagene). The regions corresponding to At2g22770 (amino acids 233–314) and EGL3 (amino acids 511–596 or 402–596) were amplified by PCR and cloned into the pADGAL4 and pBDGAL4 vectors. The maize RIF1 and bHLH5 genes were isolated by a yeast two-hybrid screen for proteins interacting with the C-terminal (amino acids 411–610) region of R. The ACT region of bHLH5 (amino acids 323–393) or the C-terminus containing the bHLH domain (amino acids 117–393) were amplified by PCR and cloned into pADGAL4 or pBDGAL4, respectively.

2.2.3 Microprojectile Bombardment and Gene Expression Experiments

Bombardment conditions of maize BMS (Black Mexican Sweet) suspension cells and transient expression assays for luciferase and GUS were performed essentially as previously described [111]. One µg of each of the regulators and 3 µg of reporter plasmid (pA1::Luc, pA2::Luc, pBz2::Luc, pBz1::Luc, pA1mPBS::Luc, or pGAL4BS::Luc) were used in each bombardment. For each microprojectile preparation, the mass of DNA was adjusted to 10 µg with p35S::BAR [180] to equalize the amount of 35S promoter in each bombardment. To normalize luciferase activity to GUS activity, 3 µg of pUbi::GUS was included in every bombardment. Each treatment was done at least in triplicate, and entire experiments were repeated at least twice. The assays for luciferase and GUS and the normalization of the data were done as described [111]. Data are expressed as the ratio of arbitrary light units (luciferase) to arbitrary units of fluorescence (GUS).
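The normalization described above (luciferase light units divided by GUS fluorescence units, summarized across replicate bombardments) can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, the readings are invented numbers, and the use of the mean and sample standard deviation as summary statistics is an assumption rather than the procedure of reference [111].

```python
from statistics import mean, stdev


def normalized_activity(luc_units, gus_units):
    """Return the luciferase/GUS ratio for each bombardment."""
    if len(luc_units) != len(gus_units):
        raise ValueError("need one GUS reading per luciferase reading")
    return [luc / gus for luc, gus in zip(luc_units, gus_units)]


def summarize(luc_units, gus_units):
    """Mean and sample standard deviation of the ratios (assumed summary)."""
    ratios = normalized_activity(luc_units, gus_units)
    return mean(ratios), stdev(ratios)


# Invented readings for one treatment done in triplicate.
luc = [1200.0, 1500.0, 900.0]   # arbitrary light units
gus = [400.0, 500.0, 300.0]     # arbitrary fluorescence units
print(normalized_activity(luc, gus))
print(summarize(luc, gus))
```

Dividing by the GUS signal of the co-bombarded pUbi::GUS plasmid corrects each replicate for differences in delivery efficiency before replicates are compared.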

2.2.4 Yeast Two-hybrid Experiments

Plasmids containing R411–610, R411–462, or R525–610 in the pBDGAL4 (TRP+) vector and R411–610, R525–610, or RIF1 cloned into the pADGAL4 (LEU+) vector were co-transformed into yeast strain PJ69.4a [183] and plated on SC-LEU-TRP medium (synthetic complete drop out medium: 4 g Difco yeast nitrogen base w/o amino acids, 10 g glucose, 6 g ammonium sulfate, 182.2 g sorbitol in 950 ml dH2O; adjusted to pH 5.8 with KOH; autoclave and add appropriate 50X amino acid stock solution). Further, the plasmid containing ZmbHLH5117-393 fused to the GAL4 AD and the plasmids containing ZmbHLH5323-393, R463-610, R525-610 or EGL3511-596 fused to the GAL4 DBD were co-transformed into the same yeast strain and plated on SC-LEU-TRP. Colonies were then screened for growth on SC-LEU-TRP, SC-LEU-TRP-HIS, and SC-LEU-TRP-HIS-ADE.

β-galactosidase assays were performed with the PJ69.4a yeast strain on three separate cultures carrying each set of plasmids (biological triplicates). Cells were grown in selective SC-LEU-TRP media overnight and then used to inoculate 10 ml of yeast extract/peptone/dextrose-rich media. When cultures reached an OD600 of ~0.8, cells were collected and lysed in 0.1 M Tris-HCl, pH 8, 20% v/v glycerol, and 1 mM dithiothreitol (DTT) using glass beads on a mini-bead-beater (Biospec Products). β-galactosidase assays were carried out essentially as described previously [184], and β-galactosidase units were calculated using the formula (A420 × 377.8) / (time of incubation × volume of extract × protein concentration in mg/ml).
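The unit calculation given above can be written as a short Python function. This is a sketch under stated assumptions: the text specifies only the protein concentration unit (mg/ml), so the incubation time in minutes and the extract volume in ml are assumptions here, and the function name and example values are hypothetical.

```python
def beta_gal_units(a420, incubation_min, extract_volume_ml, protein_mg_per_ml):
    """Units = (A420 x 377.8) / (time x extract volume x protein concentration).

    Assumed units: time in minutes and volume in ml (not stated in the text);
    protein concentration in mg/ml as stated.
    """
    denominator = incubation_min * extract_volume_ml * protein_mg_per_ml
    if denominator <= 0:
        raise ValueError("time, volume and protein concentration must be positive")
    return (a420 * 377.8) / denominator


# Invented example values for a single extract.
print(beta_gal_units(a420=0.5, incubation_min=10, extract_volume_ml=0.1,
                     protein_mg_per_ml=1.0))
```

Normalizing to the protein concentration of the extract makes values comparable between cultures lysed with different efficiency.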

2.2.5 GST Pull-down Experiments

The GST pull-down bait was constructed by cloning the R C-terminal region (residues 411–610) into the vector pGEX-KG (53) as an XhoI-HindIII fragment. The pGEX-KG vector was used as a negative control. The R cDNA was excised and cloned into pBluescript (Stratagene). The GST-R411–610 and pGEX-KG constructs were each transformed into E. coli (Escherichia coli) BL21(DE3)pLysS cells for expression. Cultures were grown and induced with IPTG, and proteins were purified essentially as described [185], with the following modifications: After induction of a 500 ml culture with 1 mM isopropyl 1-thio-β-D-galactopyranoside (IPTG), the cells were harvested by centrifugation and stored at -80°C until further use. The cells were resuspended in 10 ml of phosphate-buffered saline buffer (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 2 mM KH2PO4, 1 mM DTT, 1 mM IPTG) and passed twice through a French press. The cell lysate was centrifuged at 4000 × g for 20 min; the supernatant was filtered through two layers of Miracloth (Calbiochem). The GST pull-down protocol previously described was used [186]. Glutathione-Sepharose beads (Novagen) were prewashed with NETN150 buffer (0.5% v/v Nonidet P-40, 0.1 mM EDTA, 20 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM DTT, 1 mM PMSF) four times and equilibrated in NETN150 buffer. The beads were coated by incubating the bacterial cell extract (100–200 µl) containing GST (used as negative control) or GST-R411–610 with 20 µl of glutathione-Sepharose beads at 4°C and nutating for 1 h. The beads were washed 4 times by nutating with 1 ml of NETN150 at 4°C (20 min/wash) and then resuspended in 20 µl of NETN150. Proteins were in vitro transcribed and translated in the presence of Redivue [35S]-methionine, in vitro translation grade (Amersham Biosciences), using the TNT T7/T3 coupled wheat germ extract system (Promega, Madison, WI). The pull-down reactions were done with 20 µl samples of coated beads to which 100 µl of NETN150 buffer was added together with 5 µl of labeled protein and incubated at 4°C nutating for 1 h or overnight. Reactions included beads loaded with GST-R411–610 and GST, used as a negative control. The beads were washed four times (nutation at 4°C for 5 min) with 1 ml of NETN150 buffer. The bound proteins were eluted by boiling in 10 µl of 2X SDS sample buffer and visualized using Coomassie Blue staining followed by autoradiography.

2.2.6 Plant Transformation and Confocal Microscopy

Plasmids corresponding to the GFP fusions were transformed into Agrobacterium tumefaciens strain GV3101, and infiltration was performed as described by Voinnet et al. [187] with modifications. In brief, bacterial strains carrying binary constructs were grown at 29°C to stationary phase in LB media supplemented with antibiotics, acetosyringone, and MES. Cells were harvested and resuspended in media containing MgCl2, MES, and acetosyringone, giving a final OD600 of 0.9–1.1. After 3 h at room temperature (RT), co-transfection of 3–4-week-old N. benthamiana plant leaves with Agrobacterium cells containing the p35S::R-GFP or p35S::RΔ532–560-GFP plasmid and a plasmid expressing the tomato bushy stunt virus (pBIN61) silencing suppressor [187] was performed. After 3 days, infiltrated leaf areas were excised, and localization of GFP was determined by confocal laser scanning microscopy (Nikon Eclipse E600, Japan) using identical gain (2000).

2.2.7 Analysis of Dimerization Domain Structure

The structural analysis of the C-terminal dimerization domain of the R-like proteins and the other Arabidopsis proteins was done using the FUGUE Version 2.s.07 software (updated on May 2, 2002) [188]. The sequences of the C-terminal domains of R, GL3, JAF13, AN1, NAI1, At2g46810, At1g49770, At2g31210, At5g65640, At1g32640, At4g16430, At5g54680, At1g68810, At1g27740, At3g19500, At4g30980, and At1g73830 were individually analyzed with FUGUE, which used a fold library of size 6348. The length of the query sequence varied between 79 and 88 amino acids. The query divergence values were 0.577 for R and GL3, 0.656 for JAF13, 0.735 for NAI1, 0.586 for AN1, 0.779 for At2g46810, 0.732 for At1g49770, 0.443 for At2g31210, 0.759 for At5g65640, 0.794 for At1g32640, 0.76 for At4g16430, 0.708 for At5g54680, 0.755 for At1g68810, 0.266 for At1g27740, 0.555 for At3g19500, 0.413 for At4g30980, and 0.302 for At1g73830. The recommended cutoff values were as follows: ZSCORE ≥ 6.0 (CERTAIN, 99% confidence), ZSCORE ≥ 4.0 (LIKELY, 95% confidence), ZSCORE ≥ 3.5 (MARGINAL, 90% confidence), ZSCORE ≥ 2.0 (GUESS, 50% confidence), and ZSCORE < 2.0 (UNCERTAIN). After the scores were obtained for all R-like proteins, three common structural hits were chosen and aligned with the dimerization domain sequences. The MEME/MAST system was used with the sequences mentioned above to generate a motif that was then utilized as a probe to search the nr database available at the same site.
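The recommended cutoffs listed above translate directly into a small lookup. The Python sketch below is illustrative only: the function name is hypothetical and the code is not part of the FUGUE software. The two z-scores in the example are the values reported for R and GL3 in section 2.3.4.

```python
def fugue_confidence(zscore):
    """Map a FUGUE z-score to the confidence class given by the cutoffs above."""
    if zscore >= 6.0:
        return "CERTAIN"    # 99% confidence
    if zscore >= 4.0:
        return "LIKELY"     # 95% confidence
    if zscore >= 3.5:
        return "MARGINAL"   # 90% confidence
    if zscore >= 2.0:
        return "GUESS"      # 50% confidence
    return "UNCERTAIN"


# z-scores reported in section 2.3.4 for the ACT domain hit.
for name, z in [("R", 3.74), ("GL3", 4.84)]:
    print(name, z, fugue_confidence(z))
```

With these cutoffs the R hit falls in the MARGINAL class and the GL3 hit in the LIKELY class.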

2.3 Results

2.3.1 R Contains a C-terminal Dimerization Domain

To investigate whether the C-terminal region of R (residues 411–610, Fig. 2.1), which includes the bHLH domain, was capable of mediating the formation of dimers, I fused this region to the GAL4DBD and GAL4AD and tested them for interaction in yeast. Growth in selective medium (SC-LEU-TRP for the selection of the plasmids, and SC-LEU-TRP-HIS-ADE) is indicative of the formation of interacting partners (Fig. 2.2A, #1), providing evidence that the C-terminal region of R contains a dimerization domain. The equivalent regions of GL3 and EGL3 were also shown to interact with each other and to mediate homodimer formation in yeast two-hybrid experiments [97], suggesting that the presence of a C-terminal dimerization domain is a general feature of this group of bHLH proteins. To determine whether this dimerization was mediated by the bHLH motif, residues 411–462 (bHLH, Fig. 2.1A) were fused to the GAL4DBD and tested for interaction with GAL4AD-R411–610. Under the conditions used, no interaction was observed (Fig. 2.2A, #3). To ensure that the GAL4DBD-R411–462 protein is properly expressed, we tested this protein for interaction with GAL4AD-RIF1, where RIF1 corresponds to a maize R-interacting factor that interacts with the bHLH region of R (see Chapter 3 for a detailed description of RIF1). The growth observed in SC-LEU-TRP-HIS-ADE indicates that GAL4DBD-R411–462 is properly expressed (Fig. 2.2A, #4). Deletion analyses of R462–610 demonstrated that the R525–610 region was necessary and sufficient for homodimer formation in yeast (Fig. 2.2A, #5), and that it does not activate on its own when fused to the GAL4DBD (Fig. 2.2A, #6). To biochemically verify the ability of R411–610 to mediate homodimer formation, the full-length R protein was in vitro transcribed and translated in the presence of [35S]-methionine ([35S] R, Fig. 2.2B) and assayed in GST pull-down experiments for its ability to interact with bacterially expressed GST-R411–610 (Fig. 2.2B). As a control, the pull-down experiment was performed in parallel with GST (Fig. 2.2B). GST-R411–610 (Fig. 2.2B, lane 2), but not GST (Fig. 2.2B, lane 3), resulted in the efficient recovery of the ~79-kDa R protein (Fig. 2.2B, lane 1) after autoradiography (Fig. 2.2B, top panel), providing evidence that the dimerization observed in yeast can also be observed biochemically.

To establish whether the C-terminal region of R can mediate homodimer formation in maize cells, I took advantage of the presence of the acidic region in R (Fig. 2.1A), which can mediate transcriptional activation when fused to a heterologous DNA-binding domain (e.g. GAL4DBD, not shown). p35S::GAL4DBD-R411–610 does not activate transcription from a promoter containing GAL4 binding sites and driving the luciferase reporter (pGAL4BS::Luc) [71] in BMS maize cells, because the acidic region is absent (Fig. 2.2C). However, when p35S::R was co-expressed with p35S::GAL4DBD-R411–610, a significant induction of the pGAL4BS::Luc reporter construct was observed (Fig. 2.2C, compare p35S::GAL4DBD-R411–610 and p35S::GAL4DBD-R411–610 + p35S::R) at levels comparable with those attained by the C1 transcriptional activation domain [189] fused to the GAL4DBD (Fig. 2.2C, p35S::GAL4DBD-C1TAD). This plant two-hybrid result suggests that the R411–610 region can mediate dimerization in vivo in maize cells. Taken together, these findings identify a novel dimerization region in the C-terminal domain of R (R525–610).

2.3.2 The Dimerization Region of R Is Necessary for Regulatory Activity

I generated several single amino acid mutations in residues conserved between different R-like proteins (Fig. 2.1B, red stars). These mutations (E544A or E544R, C547A, R548E, R556E, K562E, and D567K) did not affect the dimerization of R at a detectable level in yeast two-hybrid experiments, consistent with the large protein-protein interface involved in ACT domain dimerization [190]. However, the dimerization was completely abolished when the region spanning residues 532–560 (Fig. 2.1B, indicated by green line underneath the alignment) was deleted (not shown). The R525–610 dimerization region contains one of three R nuclear localization signals [95]. Thus, I suspected that the 532–560 deletion within this region could affect the nuclear localization of R. To monitor the nuclear localization of R mutants, I constructed C-terminal GFP fusions and investigated their subcellular localization in N. benthamiana leaves after Agrobacterium-mediated infiltration (Fig. 2.3A). Both RΔ532–560-GFP and R-GFP accumulate in the nucleus without any obvious difference in the amount of fluorescence. Interestingly, R-GFP displayed a speckled nuclear fluorescence, a pattern that was not previously observed in onion skin cells with the GUS reporter [95]. Although the deletion of the 532–560 region does not affect the nuclear import of R, it appears to interfere with the formation of discrete speckles (Fig. 2.3A). The biological significance of this pattern for the R regulatory function is not known, but many plant proteins display a similar speckled nuclear localization [191].

Next, I investigated the effect of the 532–560 amino acid deletion on the ability of R (together with C1) to activate the anthocyanin biosynthetic pathway. The bombardment of p35S::RΔ532–560-GFP (the same construct as used in Fig. 2.3A) into BMS cells resulted in the very rare formation of red cells (occasionally one or two) compared with the hundreds of red cells usually observed in similar experiments using p35S::R or p35S::R-GFP (Fig. 2.3B) in the presence of p35S::C1. The inability of p35S::RΔ532–560-GFP to activate anthocyanins could not be compensated by increasing the amount of plasmid in the bombardments (Fig. 2.3B), suggesting that the lack of activity of p35S::RΔ532–560-GFP is not due, for example, to the inefficient expression of this protein. As described later, the protein encoded by p35S::RΔ532–560-GFP continues to be fully active in some of the other functions of R, indicating that the inability to induce anthocyanin formation in this experiment is not a consequence of, for example, no protein being formed.

I then compared the ability of p35S::RΔ532–560-GFP and p35S::R-GFP, together with p35S::C1, to activate transcription of four anthocyanin biosynthetic genes (A1, A2, Bz1, and Bz2) in transient expression experiments in maize BMS cells. Consistent with the decreased anthocyanin accumulation specified by p35S::RΔ532–560-GFP (Fig. 2.3B), the activity of p35S::RΔ532–560-GFP was significantly reduced on the pA1::Luc, pA2::Luc, pBz1::Luc, and pBz2::Luc promoter constructs, with the most dramatic effect on pA1::Luc (~75% reduction) (Fig. 2.3C). Taken together, these results indicate that the dimerization domain is necessary for the transcriptional activity of R.

2.3.3 The Dimerization Domain Is Required for Only a Subset of the R Activities

Previously, we utilized a version of the P1 protein that can interact with R (P1*) [41] to identify two distinct regulatory activities for R [71]. The R-enhanced activity is evidenced by a 2–3-fold increase in the activation of pA1::Luc when R is present [71]. To investigate whether the dimerization region participates in the R-enhanced activity, I tested P1* with p35S::R-GFP and p35S::RΔ532–560-GFP for the activation of the pA1::Luc reporter construct (Fig. 2.4A). As previously observed for p35S::R, p35S::R-GFP increases the expression of the luciferase reporter from the pA1::Luc construct about 2-fold (Fig. 2.4A, P1* + R-GFP). When the 532–560 region was deleted, R continued to enhance the P1* activity (Fig. 2.4A, compare P1* with P1* + 35S::RΔ532–560-GFP). Thus, the dimerization domain is not required for the R-enhanced activity.

We previously also showed that the R-enhanced activity requires the ARE element [71], a cis-regulatory motif conserved in the promoters of several flavonoid biosynthetic genes [112] and located between the distal low affinity and high affinity P1 binding sites (laPBS and haPBS, respectively). The ARE is necessary for the C1+R regulation of A1 but not for the activation by P1 [192]. Consistent with this, a mutant A1 promoter lacking the laPBS and haPBS (pA1mPBS::Luc) shows no activation by P1. However, pA1mPBS::Luc continues to be modestly activated by C1 + R [68], presumably by the recruitment of the complex to the DNA by R through the ARE, a mechanism similar to that by which P1* was proposed to activate Bz1 [71].

To determine whether R dimerization is necessary for the activation of the Bz1 promoter (pBz1::Luc) by P1*, I tested P1* together with R-GFP or RΔ532–560-GFP. Similar to P1, P1* is unable to activate pBz1::Luc in the absence of R (Fig. 2.4A, P1*). However, the activation of pBz1::Luc is similar to that of C1 when P1* is co-bombarded with R-GFP or with RΔ532–560-GFP, indicating that the dimerization region is not required for the regulation of Bz1 by P1*.

Next, I compared the ability of R-GFP and RΔ532–560-GFP to cooperate with C1 to activate an A1 promoter lacking the laPBS and haPBS (Fig. 2.4B, pA1mPBS::Luc). Both p35S::R-GFP and p35S::RΔ532–560-GFP similarly activated the pA1mPBS::Luc reporter together with p35S::C1 (Fig. 2.4B, C1). Thus, the absence of R dimerization did not affect the R-enhanced activity, represented by the recruitment of the complex through the ARE. Note, however, that the activation of pA1mPBS::Luc is significantly lower than that of pA1::Luc, highlighting the importance of the laPBS and haPBS in the regulatory activity by C1+R. Taken together, these results indicate that the dimerization region participates in just a subset of the activities displayed by R, likely to include the formation of a stable complex of C1 with the C1 binding sites (represented by the laPBS and haPBS).

2.3.4 Structural Analysis of the Dimerization Domain of R

The last 86 amino acids of R correspond to a region that is conserved among multiple R-related bHLH transcription factors (Fig. 2.1B). BLAST analyses (Basic Local Alignment Search Tool) at http://blast.ncbi.nlm.nih.gov/Blast.cgi using this region of R as a query identify a number of other R-like proteins in the database but provide no evidence for the presence of this domain in proteins other than closely related bHLH factors. As a first step in establishing whether this region of R and R-related proteins has structural similarity to any known domain structure, Dr. Hernandez utilized the FUGUE sequence-structure homology recognition program [188]. The top hit obtained (probe divergence values were 0.577 for R and GL3, and their z-scores were 3.74 and 4.84, respectively) corresponded to the ACT domain, a fold present in many proteins that participate in the ligand-mediated allosteric regulation of several biosynthetic enzymes [193]. One such enzyme is PGDH (PHOSPHOGLYCERATE DEHYDROGENASE), which catalyzes the first reaction in the L-serine biosynthetic pathway. The pathway is feedback-regulated by L-serine, which inhibits PGDH by binding to the allosteric site that resides within the ACT domain. This domain also mediates homodimerization of the protein. The interface of the two ACT domains is formed by a 6- or 8-strand β-sheet composed of three or four β-strands from each subunit [194-195] (Fig. 2.5A). Two L-serine binding sites are formed in the interface of the two ACT domains, each of which includes His-342 and Asn-346 from one subunit and Val-363, Asn-364, and Ile-365 from the other subunit, thus creating 2 binding sites in each homodimer. The predicted secondary structure for the R dimerization domain is in agreement with the secondary structure of the E. coli PGDH ACT domain (Fig. 2.5B). The three β-strands (blue arrows in Fig. 2.5B) involved in the homodimerization of the PGDH ACT domain align well with three predicted β-strands in R (green arrows in Fig. 2.5B). Moreover, the region deleted in the RΔ532–560 mutant (underlined with green line in Fig. 2.1B) includes one of the β-strands involved in the dimerization interface. I made several single amino acid mutations (indicated with a red star in Fig. 2.1B) to Ala, but all mutants failed to affect the dimerization of the ACT-like domain (not shown). However, when Ser-560, Gln-562, and Ser-564 were simultaneously replaced by Ala, a significant (p < 0.01) reduction in the ability of R411–610 to dimerize was observed, evidenced by the reduced β-galactosidase activity in quantitative yeast two-hybrid assays (Fig. 2.5C). These residues form part of a β-strand, central for the dimerization of ACT domains (Fig. 2.5A) and predicted to be conserved in the R ACT-like domain (Fig. 2.5B), providing additional evidence for the presence of a similar fold in R. Interestingly, quantitative yeast two-hybrid interactions monitoring the activation of LacZ also showed that the 525–610 region of R dimerizes significantly more weakly than the 411–610 region (Fig. 2.5C), a difference not evident from the growth in selective media (compare #1 and #5 in Fig. 2.2A). Although we cannot rule out the possibility that GAL4DBD-R525–610 or GAL4AD-R525–610 is expressed at a significantly lower level than the GAL4DBD-R411–610 and GAL4AD-R411–610 proteins, these results may suggest that the bHLH region of R participates in or stabilizes the dimerization provided by the ACT-like domain.

2.3.5 Identification of ACT Dimerization Domains in Other bHLH Proteins

The structural similarity of the dimerization region of R with the ACT domain prompted us to investigate whether similar domains might be present in other plant bHLH proteins outside of the group formed by R and related factors (Fig. 2.1B). It should be noted that the dimerization domain was not among the conserved regions identified in a previous analysis of Arabidopsis bHLH proteins [40]. We reasoned that ACT-like domains could be present in other bHLH proteins, but due to extensive sequence divergence, we would not be able to detect them by using simple sequence alignments. Therefore, to investigate whether the ACT-like domain was also present in the C-terminus of other bHLH families in Arabidopsis, we used the FUGUE program with representative C-terminal sequences from several of the bHLH subclasses described [40]. We found that ACT-like domains were present in proteins belonging to groups Ia (e.g. At2g46810), IIIa (e.g. At4g21330), IIIb (e.g. At5g65640), IIIc (e.g. At1g10610), IIId (e.g. At4g16430), IIIe (e.g. At1g32640), IVa (e.g. At2g22770), and Vb (e.g. At1g68810) (Fig. 2.6A). We used BLAST to check whether all the members of these groups contained the ACT-like domain. We found this to be the case with the exception of one protein (At3g06120) in Group Ia and three proteins in Group III (At2g28160, At5g57150, At4g29930). We verified that these were the only bHLH subclasses to contain the ACT-like domain by using the MEME algorithm for the discovery of conserved motifs [196] with a training set that included the ACT-like motifs of the R-like bHLHs as well as those of the proteins mentioned above in subclasses Ia, III, IVa, and Vb. The resulting motif was used to search the entire non-redundant database, and we were only able to detect the motif in the same bHLH subclasses that we had obtained with the previous method. The C-terminus of R and of other bHLH proteins analyzed is similar but not identical to ACT domains previously identified. With this in mind, I will refer to it as the ACT domain instead of the ACT-like domain from this point on.

The presence of an ACT domain in other bHLH proteins does not necessarily imply that they can mediate similar protein-protein interactions (Fig. 2.2). Thus, to determine whether the presence of the ACT domain correlates with dimerization, we selected the corresponding regions of At2g22770, AtEGL3 and ZmbHLH5 (a maize R-interacting factor described in detail in Chapter 5) as case studies. In yeast two-hybrid experiments we tested the interaction of At2g22770233–314 with itself and with R525–610 (Fig. 2.6B). The robust growth in SC-LEU-TRP-HIS-ADE media of a strain expressing GAL4DBD-At2g22770233–314 and GAL4AD-At2g22770233–314 (Fig. 2.6B, #1) and the absence of autoactivation of At2g22770233–314-GAL4AD (Fig. 2.6B, #4) provide strong evidence that the corresponding region in At2g22770 mediates homodimer formation. In contrast, this region cannot interact with R525–610 (Fig. 2.6B, #2 and #3). When I tested for homodimerization of EGL3 or ZmbHLH5 (GAL4AD-EGL3402-596 + GAL4DBD-EGL3511-596 or GAL4AD-ZmbHLH5200-393 + GAL4DBD-ZmbHLH5117-393), no growth was observed on SC-LEU-TRP-HIS-ADE (Fig. 2.6C, D), indicating that neither EGL3 nor ZmbHLH5 homodimerizes via the ACT domain. However, EGL3 interacted strongly with the ACT domain of R (GAL4AD-R525-610) as indicated by growth on SC-LEU-TRP-HIS-ADE (Fig. 2.6C, #4), whereas ZmbHLH5 did not (Fig. 2.6D, #4+5). Lastly, EGL3 and ZmbHLH5 do not heterodimerize via the ACT domain (Fig. 2.6C, #5; Fig. 2.6D, #5).

These results suggest that the ACT domain is a dimerization domain. The data presented here show that maize R and At2g22770 homodimerize via this domain in yeast whereas EGL3 heterodimerizes with R but does not homodimerize. ZmbHLH5 does not homodimerize but it is possible that it heterodimerizes with an as yet unidentified partner. Therefore, a clear specificity in the dimerization preference was not observed.

2.4 Discussion

I describe here the presence of a novel protein-protein interaction domain in a group of plant bHLH transcription factors. This region has structural similarity with the ACT domain present in several metabolic enzymes. The corresponding domain in R is central to its ability to cooperate with the R2R3-MYB transcription factor C1 to control anthocyanin pigment formation in maize. These findings provide additional support for the importance of R in mediating multiple protein-protein interactions in the regulation of transcription of flavonoid biosynthetic genes.

Previous studies identified R alleles derived from the excision of the Ds transposon from the r-m9 allele, inserted immediately upstream of the sequence encoding the identified dimerization domain. One of these derivative alleles, R-v24, carrying a 7-bp insertion, resulted in a frameshift mutation with reduced aleurone anthocyanin pigmentation [197]. RNA expression analyses on the mutant v24 showed a 50–70% reduction in the mRNA steady-state levels of C2 and A1, two anthocyanin biosynthetic genes [197]. Because the v24 mutation affected the R C-terminal nuclear localization signal (Fig. 2.1A), the results were interpreted to indicate that the reduced nuclear localization of R in the mutant was at least in part responsible for the decreased anthocyanin accumulation [197]. Strikingly, however, I found a similar reduction in the activation of the A1 promoter when I deleted the 532–560 region of R (Fig. 2.3C). RΔ532–560 displays nuclear localization similar to that of wild-type R when fused to GFP (Fig. 2.3A). Taken together, we interpret these results to indicate that the main effect of the R-v24 mutation is the absence of the dimerization domain and not an altered nuclear localization. The significantly reduced accumulation of anthocyanin pigments in the R-v24 allele is consistent with the decreased ability of RΔ532–560 to induce anthocyanin formation in maize cells (Fig. 2.3B), providing in vivo evidence for the significance of the dimerization domain for R function.

My studies show that the R525–610 region can mediate homodimerization in yeast (Fig. 2.2A), in vitro (Fig. 2.2B), and in maize cells (Fig. 2.2C). However, I am cautious about proposing that the main role of this region of R is to mediate homodimer formation in vivo or that the observed reduction in the R regulatory activity (Fig. 2.3, B and C) caused by deletions of this region is due to the absence of R homodimerization. The C-terminal regions of GL3 and EGL3 (including the bHLH domains) have been shown to homodimerize [97]. I have determined that this dimerization is not mediated by the ACT domain in EGL3 but rather through an extended bHLH domain (see Chapter 3 for details). Further, I have established that ZmbHLH5, a maize R-interacting factor, does not homodimerize via the ACT domain, and AtMYC2 does not homodimerize in yeast either (Dr. Yuan, personal communication). No dimerization partners have been identified for ZmbHLH5 or AtMYC2, but it is possible that such partners exist. It is also possible that an additional plant protein is required for ACT domain-mediated homodimerization of ZmbHLH5 and AtMYC2 and that this protein is not present in yeast. It is probable that the truncated ZmbHLH5 protein (ZmbHLH5200-393) is not functional in yeast. One way to test this is to perform a western blot with yeast transformed with GAL4AD-ZmbHLH5200-393 and GAL4DBD-ZmbHLH5117-393 using antibodies against the GAL4 AD or GAL4 DBD and determine if the protein is degraded.

Furthermore, I cannot yet rule out the possibility that this region of R mediates the formation of heterodimers with other as-yet unidentified proteins. I have established by yeast two-hybrid that the corresponding regions of R, B and EGL3 can interact as well (Fig. 2.6C; not shown). However, in maize R functions in the absence of B and, similar to GL3 and EGL3, R and B have very similar functions [65]. Thus, it is unlikely that the role of this region is solely to mediate R/B heterodimer formation. It is very likely that EGL3 heterodimerizes with one of the other R-like proteins in Arabidopsis (GL3, TT8, AtMYC1) and this hypothesis will be tested in the near future.

Although ZmbHLH5, EGL3 and AtMYC2 do not homodimerize via the ACT domain, At2g22770 does. At2g22770 corresponds to the NAI1 gene, which controls the formation of the ER bodies [140], endoplasmic reticulum-derived structures involved in the transport of proteins to the vacuole [198]. The possibility that the regulatory activity of NAI1 requires the identified dimerization domain has not been previously explored and its functional role in ER body formation needs to be determined.

By sequence-structure analysis using the Phyre tool box [254], I identified an ACT domain in the C-terminus of AMS (ABORTED MICROSPORES), a bHLH protein of subgroup IIIa. This region interacts with the SET domain and the PHD finger of ASHR3 (ASH1-related protein 3), which is involved in stamen and anther development [199]. SET domains and PHD finger domains have both been shown to be involved in chromatin remodeling and histone modification. It will be interesting to determine if the ACT domain of R dimerizes with similar proteins or if this dimerization is specific to AMS. We have shown recently that R interacts with the ENT/AGENET domain-containing protein RIF1 via the bHLH domain and that this protein links transcriptional regulation by R with histone modifications (Chapter 3) [200]. Further analyses of other bHLH proteins from groups Ia, IIIa or IIIb are required to understand the ACT domain dimerization process. Interestingly, a mutation in SPCH (SPEECHLESS), a member of bHLH group Ia, that truncates the protein seven amino acids before the STOP codon, inside the predicted ACT domain, results in plants that lack stomata because asymmetric cell division is not initiated [201]. This suggests that, as described for R, the ACT domain of SPCH is important for regulatory function. Whether the ACT domain of SPCH forms a homodimer or heterodimer will be determined in the near future.

The regulation of A1 transcription by C1 and R requires at least three cis-regulatory elements, the haPBS, the laPBS, and the ARE (16). In the absence of C1 DNA-binding, either because of a mutation in one of the C1 DNA-recognition helices or because of mutations in the haPBS and laPBS, C1 can be recruited in an R-dependent fashion to the A1 promoter by the ARE (16, 33). Similarly, R can recruit the P1* protein to the Bz1 promoter, which is normally not regulated by P1 [41, 71]. Our results showing that C1 and RΔACT activate a mutant A1 promoter lacking the haPBS and laPBS to the same degree as C1 and R indicate that the deletion of the dimerization domain of R does not affect the ARE-mediated tethering of the C1/P1* proteins to the corresponding promoters (Fig. 2.4). Rather, it is apparent that the dimerization of R is necessary for the binding of C1 to the haPBS, to the laPBS, or to both. The finding that the dimerization domain of R is important for just one aspect of the R regulatory function is of importance from at least two perspectives. First, it provides strong confirmation that the deletion of the 532–560 region does not affect the overall folding or stability of R, since other R-dependent activities remain unaffected. Second, it may help understand how a stable C1-DNA complex is formed given the very low affinity of C1 for DNA [68].

It is common for bHLH transcription factors to have other protein-protein interaction domains, and domain shuffling was proposed as one mechanism to explain the domain multiplicity in members of this large family of regulatory proteins [172]. The presence of an ACT domain in the dimerization region of one of the best-described groups of plant bHLH proteins provides a new addition to the diversity of protein-protein interaction domains associated with this class of transcription factors. ACT domains have been typically associated with metabolic enzymes, including PGDH [202] and phenylalanine hydroxylase [203], where they often play a regulatory role by providing allosteric regulation through the binding of specific small molecule ligands [193]. In addition to PGDH, ACT domain-mediated protein-protein interactions have been described in several other factors [193]. The high conservation of several residues among the ACT domains present in plant bHLH proteins (Fig. 2.6A) and their position outside the dimerization interface in the E. coli PGDH ACT structure (Fig. 2.5A) suggest that they could be involved in other functions such as, for example, ligand recognition. It should be noted, however, that no plant bHLH proteins have yet been shown to interact with a small molecule ligand. An alignment of 17 ACT domains present in plant bHLH factors permitted us to deduce a loose consensus (Fig. 2.6E) that may facilitate the identification of a similar fold in other proteins. Interestingly, however, when this consensus is compared with that of several ACT domain-containing enzymes [204], significant differences are evident (Fig. 2.6E). The presence of an ACT domain in various distinct groups of bHLH proteins is intriguing and suggests an ancient recruitment of this domain to the plant bHLH family. The occurrence of an ACT domain in a regulatory protein has, however, a precedent in the glycine binding domain of the E. coli transcriptional accessory protein GcvR, which participates in the regulation of the gcvTHP operon [205].

The presence of a bHLH-ACT domain fusion protein was first mentioned by Anantharaman et al. [204]. The authors analyzed complete genomes from bacteria, archaea and eukaryotes (including A. thaliana) for the presence of small-molecule-binding domains (SMBD) [206]. They determined that in eukaryotes, SMBDs such as ACT or PAS (Per-Arnt-Sim) domains can be fused to DNA binding domains such as Zn-finger or bHLH domains. According to the authors, At2g41130 contains a bHLH and an ACT domain but no further analysis was performed [206]. In our analysis, this same protein was found to contain an ACT domain. It will be of interest to establish whether ACT domains are limited to the plant bHLH family or whether the presence of ACT domains is a more general feature of other transcription factor families.

Taken together, the results presented here demonstrate the presence of a novel ACT dimerization domain in R and other plant bHLH proteins. This domain plays an important function in the ability of R to cooperate with the R2R3-MYB factor for the activation of maize pigment biosynthesis. These findings highlight the growing need to combine protein fold recognition with functional analyses in discovering novel functional protein domains.


Figure 2.1: Schematic representation of R and alignment of the C-terminal region of R-like proteins from maize and Arabidopsis. A, R encodes a 610-amino acid protein that contains a MIR region (amino acids 1-252), an acidic region (amino acids 253-410), and a basic helix-loop-helix domain (amino acids 411-462). The leucine-zipper-like domain extending the bHLH domain to amino acid 478 is marked with LZ. Nuclear localization signals are indicated by black lines beneath the scheme. B, Sequence alignment of the C-terminal 86 amino acids (amino acids 525-610) of Z. mays R-Lc (ZmR), Z. mays B-Peru (ZmB) (amino acids 477-562), A. thaliana GLABRA3 (AtGL3) (amino acids 552-637), A. thaliana ENHANCER OF GLABRA3 (AtEGL3) (amino acids 511-596), and A. thaliana TRANSPARENT TESTA8 (AtTT8) (amino acids 432-516). Boxed in black and gray are conserved or similar amino acid residues, respectively. Red and blue asterisks indicate residues mutated, and the blue line represents the changes that were done together in an effort to abolish the dimerization of R (see text). The green line underneath the alignment indicates the 532-560 region deleted in many of the constructs used in this study. Alignment was done using ClustalX2 [207] and GeneDoc [208].

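[Figure 2.2, panels A–C: images not reproduced. Labels recoverable from the figure: panel A, yeast growth on SC-LEU-TRP and SC-LEU-TRP-HIS-ADE; panel B, 35S-Met signal and Coomassie-stained gel, lanes 1–3, with markers at 75, 50 and 25 kDa. Key to the yeast strains in panel A:]
#1) GAL4AD-R411-610 + GAL4DBD-R411-610
#2) pADGAL4 + GAL4DBD-R411-610
#3) GAL4AD-R411-610 + GAL4DBD-R411-462
#4) GAL4AD-RIF1 + GAL4DBD-R411-462
#5) GAL4AD-R525-610 + GAL4DBD-R525-610
#6) pADGAL4 + GAL4DBD-R525-610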

Figure 2.2: The C-terminal region of R contains a dimerization domain. A, yeast two-hybrid experiments showing interaction between a fusion of the GAL4 AD to the C-terminal region of R containing the bHLH (GAL4AD-R411-610) or to the C-terminal 86 amino acids (GAL4AD-R525-610) with R525-610, R411-610, or R411-462 fused to the GAL4 DBD (GAL4DBD) in the yeast strain PJ69.4a [183] containing the HIS3 and ADE2 genes under the control of GAL4 binding sites. RIF1 corresponds to an R partner that specifically interacts with the bHLH region and is described in Chapter 3. B, GST-pulldown experiments showing interaction of in vitro transcribed/translated [35S] methionine-labeled R and E. coli-expressed GST-R411-610. Lane 1, [35S]R; lane 2, [35S]R and GST-R411-610; lane 3, [35S]R and GST. C, R411-610 fused to the GAL4DBD and expressed from the 35S promoter (p35S::Gal4DBD-R411-610) was co-bombarded with p35S::R into maize BMS cells. A pUbi::GUS construct was included in every bombardment as a normalization control. The fold activation was calculated as described under “Materials and Methods”. The average values from at least three replicates for each treatment are shown; the error bars indicate the S.D. (standard deviation) of the samples. The activation of p35S::Gal4DBD-R411-610 + p35S::R is significantly different from the activation of p35S::Gal4DBD-R411-610 (P < 0.5).

Figure 2.3: The R dimerization region is necessary for the efficient activation of anthocyanin biosynthesis. A, Agrobacterium-infiltrated N. benthamiana leaf cells expressing p35S::R-GFP (R-GFP) or p35S::RΔ532-560-GFP (RΔ532-560-GFP) were visualized by laser scanning confocal microscopy. Fluorescence pictures were taken at the same gain level. The bar corresponds to 10 µm. B, Number of red anthocyanin-accumulating maize BMS cells co-bombarded with p35S::C1 and different amounts (0.01, 0.03, 0.1, 0.3, 1.0, 3.0, and 9.0 µg) of p35S::R, p35S::R-GFP or p35S::RΔ532-560-GFP. C, maize BMS cells co-bombarded with p35S::C1 and p35S::R-GFP or p35S::C1 and p35S::RΔ532-560-GFP. Indicated is the fold activation of pBz2::Luc, pBz1::Luc, pA2::Luc, and pA1::Luc. Fold activation was determined as described in Figure 2.2. Error bars indicate S.D. The activation of p35S::C1 + p35S::RΔ532-560-GFP is significantly different from the activation of p35S::C1 + p35S::R-GFP on all promoters tested (P < 0.5).


Figure 2.4: The R dimerization region is required for a subset of the R regulatory activities. A, Maize BMS cells co-bombarded with p35S::P1* (P1*) and p35S::R-GFP (R-GFP) or p35S::RΔ532-560-GFP (RΔ532-560-GFP) and tested for activation of the pA1::Luc or pBz1::Luc reporter constructs. The activation of P1* + RΔ532-560-GFP on pA1::Luc and pBz1::Luc is significantly different from the activation of P1* + R-GFP on these promoters (P < 0.5). B, Maize BMS cells co-bombarded with p35S::C1 (C1) and p35S::R-GFP (R-GFP) or p35S::RΔ532-560-GFP (RΔ532-560-GFP) and the pA1mPBS::Luc reporter construct containing mutations of the high- and low-affinity P1 binding sites in the A1 promoter. Fold activation was determined as described in Materials and Methods. Error bars represent S.D.


Figure 2.5: The dimerization region of R has structural similarities to ACT domains. A, Three-dimensional structure reconstruction showing the interface of the ACT domains of chains B and C in the E. coli PGDH tetramer crystal structure. The side chain for residue Arg-347 is shown in each of the two E. coli PGDH monomers. The file for the active PGDH tetramer (code 1YBA) [195] was downloaded from the RCSB Protein Data Bank. B, Alignment of the E. coli PGDH ACT domain with the dimerization region of R (residues 525-610). The predicted secondary structure of R is shown on top of the alignment in blue, and the secondary structure of PGDH [209] is shown in green. Boxed amino acids indicate identical (black) or very similar (grey) residues. The star indicates Arg-347 in PGDH. Arrows indicate β-strands, and helices indicate α-helices. C, Quantitative yeast two-hybrid assay using the β-galactosidase reporter gene driven by GAL4-binding sites present in the PJ69.4a yeast strain [183].


Figure 2.6: ACT domains present in several other plant bHLH proteins. A, Relationship between bHLH proteins containing ACT domains based on the phylogenetic reconstruction of the A. thaliana bHLH family generated by Toledo-Ortiz et al. [87]. The subgroups identified by Heim et al. [40] were mapped onto the tree, and those groups containing the ACT domain are indicated in bold, with the number of group members that contained the domain, compared with the total number of group members, indicated in parentheses. For those groups in which an ACT domain was found, a graphical representation of the domain structure of the proteins is indicated including a dark gray box for the bHLH domain and a light gray box for the ACT domain. B, Yeast two-hybrid experiment showing homodimerization of At2g22770233-314 but no heterodimerization with R525-610 in the yeast strain PJ69.4a [183] containing HIS3 and ADE2 genes under control of the GAL4-BS. C, Yeast two-hybrid experiment showing no homodimerization of the ACT domain of EGL3 but heterodimerization with the ACT domain of R (R525-610). D, Yeast two-hybrid experiment showing that the C-terminus of ZmbHLH5 containing the ACT domain does not homodimerize in yeast and does not heterodimerize with R or EGL3. E, Alignment of plant ACT domains present in bHLH proteins, as identified in this study. Based on these sequences, a consensus was derived (Consensus bHLH) and compared with the consensus obtained for ACT domains present in enzymes (Consensus enzymes) previously described [204] following the criteria of big amino acid (FILMVWYKREQ) (b), charged amino acid (DEHKR) (c), hydrophobic amino acid (ACFILMVWY) (h), branched amino acid (ILV) (i), polar amino acid (DEHKNQRST) (p), and small amino acid (ACSTDNGP) (s). To appear in the consensus, an amino acid should be present in at least 8/17 sequences.
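The consensus criteria given in the legend of panel E can be illustrated with a short Python sketch. It uses the six residue classes and the minimum-count threshold stated above; everything else is an assumption made for illustration only, in particular the rule that an identical residue is reported in upper case when it alone reaches the threshold, the order in which overlapping classes are tried (smallest class first), and the function names `consensus_symbol` and `consensus`. It is not the procedure actually used to build the figure.

```python
from collections import Counter

# Residue classes from the figure legend, ordered from the smallest to the
# largest class. The ordering (most specific class wins) is an assumption.
CLASSES = [
    ("i", set("ILV")),          # branched
    ("c", set("DEHKR")),        # charged
    ("s", set("ACSTDNGP")),     # small
    ("h", set("ACFILMVWY")),    # hydrophobic
    ("p", set("DEHKNQRST")),    # polar
    ("b", set("FILMVWYKREQ")),  # big
]


def consensus_symbol(column, min_count):
    """Return the consensus symbol for one alignment column."""
    residues = [r for r in column if r.isalpha()]  # ignore gap characters
    counts = Counter(residues)
    if counts:
        residue, n = counts.most_common(1)[0]
        if n >= min_count:
            return residue  # identical residue reaches the threshold
    for symbol, members in CLASSES:
        if sum(1 for r in residues if r in members) >= min_count:
            return symbol
    return "."  # no residue or class reaches the threshold


def consensus(alignment, min_count):
    """Return the consensus string for equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return "".join(
        consensus_symbol([seq[i] for seq in alignment], min_count)
        for i in range(length)
    )


if __name__ == "__main__":
    # Toy four-sequence alignment (hypothetical, not taken from Fig. 2.6E).
    toy = ["ILKD", "IVRE", "ILHG", "IIE-"]
    print(consensus(toy, min_count=3))
```

For the 17 sequences of panel E the threshold described in the legend corresponds to `min_count=8`.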

[Figure 2.6, panels A–D: images not reproduced.]

[Figure 2.6, panel E (positions labeled 526 to 604): alignment of the ACT domains of R, B, Jaf13, GL3, An1, DEL, At2g22770, At2g22760, At2g22750, At4g37850, At2g46810, At4g21330, At5g65640, At1g10610, At4g16430, At1g32640 and At1g68810. The aligned sequences are not reproduced here; the two consensus rows are given below as extracted, with the original column alignment lost.]

Consensus bHLH: p c . c p p l p l l p h c . p l h c p p p s h h p h . h p l c p h p h p p p p p l p l p p s p p p . p p s h h p p p l c p p b p . p p . h . . h b l p p h l p

Consensus enzymes: . . l . h . . s c s G h l . p l . . h h s p . s h s l . . h ...... s . . . h ...... b . . . . .

CHAPTER 3

THE bHLH DOMAIN OF MAIZE R LINKS TRANSCRIPTIONAL

REGULATION AND HISTONE MODIFICATIONS BY

RECRUITMENT OF AN EMSY-LIKE FACTOR

Part of this Chapter is published: Hernandez, J.M., Feller, A., Morohashi, K., Frame, K., and Grotewold, E. (2007). The bHLH domain of maize R links transcriptional regulation and histone modifications by recruitment of an EMSY-like factor. Proc. Nat. Acad. Sci. 104: 17222-27. Dr. Hernandez performed the GST pulldown and interaction studies in yeast with truncated RIF1 and R; Dr. Morohashi performed the ChIP experiments, the PCRs with the ChIPed DNA, and the micrococcal nuclease treatment. K. Frame designed the RIF1 RNAi construct and helped with the preparation of the tilling. For the non-published part, Dr. Asela Wijeratne performed the phylogenetic analyses.

3.1 Introduction

As mentioned in Chapter 2, the bHLH family of transcription factors is among the largest in animals and plants and the highly conserved bHLH domain contributes to DNA-binding and dimerization of TFs from this family. Maize R was the first plant bHLH transcription factor described [138]. R belongs to a small gene family, which includes B, and R/B specify anthocyanin pigmentation in different plant tissues [210]. They participate in the transcriptional regulation of the anthocyanin pathway genes through cooperation with the R2R3-MYB transcription factor C1, or its paralog PL1 [211]. C1 and R/B physically interact through the MYB-domain of C1 and the N-terminal region of R (which does not contain the bHLH motif) [212-213]. C1 also makes direct DNA contact with specific cis-regulatory elements which, in the case of the A1 gene, correspond to the high- and low-affinity P1 binding sites (haPBS and laPBS, respectively; Fig. 4.4A) [214-215]. In Arabidopsis and various other plants, R-like genes participate in the regulation of trichome and root hair formation (e.g. GL3) and in the control of flavonoid pigments (e.g. TT8). All these factors function by interacting with R2R3-MYB proteins, recognizing particular signature motifs in the corresponding MYB DNA-binding domains [72, 213]. In addition, members of this group of bHLH proteins contain a conserved ACT domain at the C-terminus, which participates in homodimer formation (Chapter 2) [216]. Despite the extensive knowledge implicating the cooperation of MYB and bHLH factors in a number of important plant functions [217], the mechanism by which the R/B bHLH region contributes to protein function has remained elusive.

The bHLH-LZ domain of the human proto-oncoprotein MYC has been studied in much detail [174]. It dimerizes with the bHLH-LZ protein MAX and binds to the consensus DNA-recognition sequence CANNTG to activate transcription. However, MYC represses transcription of its target genes when tethered to promoters by non-bHLH proteins such as MIZ1 or BRCA1 [174, 218]. The interaction of MYC with MIZ1 or BRCA1 requires the bHLH domain but not the adjacent LZ. Moreover, MYC recruits various histone modifiers such as the HAT proteins CBP (CREB-binding protein) or p300 to its target genes in vivo via the C-terminus containing the bHLH-LZ domain [219]. This implies that MYC interacts with bHLH proteins and non-bHLH proteins and, depending on its partner, is tethered to different cis-regulatory elements. This is likely applicable to other bHLH proteins and will be discussed in more detail throughout this dissertation.

It has been shown that histone modifications and chromatin structure are intimately linked to the regulatory activity of many transcription factors [19]. In this Chapter, I describe a novel function for the bHLH region of maize R in linking transcriptional regulation with histone modifications by recruiting an EMSY-like factor to flavonoid biosynthetic gene promoters. I demonstrate that deletion of the bHLH region of R has minor consequences for the transient expression of reporter constructs, but that the region is essential for the activation of flavonoid genes in their normal chromatin environment. Highlighting a role of histone modification in the regulation of flavonoid gene expression, I also show that H3K9/14 acetylation is intimately associated with the recruitment of R to DNA. Furthermore, I identified RIF1 as a nuclear maize factor with homology to the BRCA2-interacting EMSY N-terminal region (the ENT domain), which specifically interacts with the bHLH region of R. EMSY associates with the “Royal Family” domain proteins HP1 and BS69, and it re-localizes to sites of DNA damage, consistent with a role of chromatin remodeling in DNA repair [220]. RIF1 is an example of how plants combined the ENT domain and a “Royal Family” domain (i.e., AGENET) into one protein. Mutations that abolish the R-RIF1 interaction significantly decrease pigment formation, with a similar effect observed when RIF1 expression is knocked down. Phylogenetic analysis of RIF1-like genes revealed that most flowering plants have three ENT/AGENET protein-encoding genes, but four exist in Arabidopsis. Interaction studies that I conducted between various bHLH proteins and RIF1-like proteins revealed a specific interaction pattern. Together, my findings reveal a novel role of bHLH domains in tethering non-HLH proteins to DNA and modulating gene expression by histone modifications.

3.2 Materials and Methods

3.2.1 Plant Materials

The generation and analysis of the BMS cells expressing p35S::C1 and p35S::R was previously described [221]. The B-I Pl stock was kindly provided by Dr. Vicki Chandler (University of Arizona, Tucson, AZ), and the M142X stock (b Pl1 R1-g) was obtained from the Maize COOP (http://maizecoop.cropsci.uiuc.edu/). TILLING mutant seeds were obtained from the maize TILLING project at Purdue University.

3.2.2 Protoplast Isolation and Transformation

Protoplasts from 9- to 12-day-old etiolated maize seedlings were obtained essentially as previously described [222], with the following modifications. After chopping the third leaves into small pieces, leaf strips were digested in 3% cellulase RS, 0.6% macerozyme R10 (both from Yakult Honsha Co., Japan), 0.6 M mannitol, 10 mM MES (pH 5.7), 5 mM CaCl2, 7.5 mM β-mercaptoethanol and 0.1% w/v BSA for 10 min under vacuum followed by 2 h gentle shaking (40 rpm) at 25°C in the dark. After releasing the protoplasts at 80 x g, the protoplasts were filtered through a 35 µm nylon mesh and collected by centrifugation at 150 x g for 1 min. The protoplasts were washed in ES buffer (0.6 M mannitol, 5 mM MES, pH 5.7, 10 mM KCl) and counted with a hemocytometer.

Electroporation was carried out on ~10⁵ protoplasts with 30 µg of DNA per transformation, using 100 V/cm, 10 msec and one pulse with a BTX Electro-Square-Porator T820. After electroporation, protoplasts were incubated for 12-16 h in the dark at RT.

The fluorescence furnished by p35S::GFP was used to calculate the transformation efficiency, which usually ranged from 30-50%. If efficiency was below 10%, three individual reactions were pooled together to yield one biological replicate.

3.2.3 Plant Transformation and Fluorescence Microscopy

p35S::RIF1-GFP, in the presence or absence of p35S::R tagged with the MYC epitope, was transformed into A. tumefaciens strain GV3101 and infiltrated into leaves of 3- to 4-week-old N. benthamiana plants as described in Chapter 2. Localization of GFP was determined by confocal laser scanning microscopy on a Nikon Eclipse E600 microscope. p35S::RIF1-GFP expression in maize protoplasts was visualized by fluorescence microscopy (Nikon Eclipse E600) at 40x or 100x magnification.

3.2.4 Protein-Protein Interaction Analyses

Yeast two-hybrid library screens were performed using a bait containing the C-terminal 200 amino acids of R fused to the GAL4 DBD in the pBDGAL4 plasmid (Stratagene), and two maize cDNA libraries in the pADGAL4 vector (Stratagene) obtained from RNA extracted from immature B73 tassels (provided by Dr. Robert Schmidt, University of California, San Diego, CA), or from young maize seedlings (provided by Dr. Marja Timmermans, Cold Spring Harbor Labs, NY). The screen was performed in the PJ69.4a yeast strain [223] and positives were selected in synthetic media SC-LEU-TRP (for selection of the prey and bait plasmids, respectively) and SC-LEU-TRP-HIS-ADE (for the selection of interacting partners). Plasmid DNA was isolated from putative positives, transformed into E. coli DH5α cells (Invitrogen), and after plasmid purification, re-transformed into PJ69.4a with the bait. A combined total of ~3.5 × 10⁶ transformants was screened.

Protein-protein interaction experiments in yeast were performed as described in Chapter 2. All genes were amplified by PCR and cloned into the pADGAL4 or pBDGAL4 vector. Genes fused to the GAL4 AD are: ACK1, ACK2, ACK3, ACK3, SPCH97-364, At5g46690, ZmRIF-2C, TT8361-518, R411-610 and GL3441-637. Genes fused to the GAL4 DBD include: ACK1, ACK2, ACK3, ACK4, GL3441-637, EGL3402-595, AtMYC1336-526, TT8361-518, MUTE, SPCH97-364, FAMA190-414, At5g46690, At5g65320. For GST pull-down experiments, the RIF1 cDNA (GenBank Accession number EF647588) was cloned into the vector pGEX-KG [224] and expressed in E. coli BL21(DE3)pLysS. Induction, purification and GST pull-down experiments were performed as described in Chapter 2 and [216].

3.2.5 ChIP Analyses

Approximately 60 mg of tissue (or ~10⁴ protoplasts) were used for each immunoprecipitation. BMS cells and maize tissues were immersed in buffer A (0.4 M sucrose, 10 mM Tris-HCl pH 8.0, 1 mM EDTA, 1 mM PMSF) and protoplasts were resuspended in ES buffer (0.6 M mannitol, 5 mM MES, pH 5.7, 10 mM KCl) containing 1% formaldehyde and incubated under vacuum for 20 min. Glycine was added to 0.1 M final concentration, and incubation was continued for an additional 10 min. The cross-linked material was resuspended in 0.1 ml Lysis Buffer (50 mM HEPES pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% deoxycholate, 0.1% SDS, 1 mM PMSF, 10 mM sodium butyrate) with plant proteinase inhibitor cocktail (Sigma). DNA was sheared by sonication to approximately 300-1000 bp fragments with a main peak of 500 bp. Sonication (Sonics & Materials Inc.) was performed on ice with an amplitude of 10% using 5x15 sec pulses (5 sec between bursts). After pre-clearing with 40 µl of salmon sperm DNA/Protein A-agarose beads (Upstate) for 120 min at 4˚C, immunoprecipitations were performed overnight at 4˚C with either 1 µg of anti-acetyl-histone H3 (06-599; Upstate), anti-dimethyl-K9-histone H3 (ab7312; Abcam), or 1 µl of anti-GFP antibody (ab290; Abcam). After incubation, beads were washed 2 times with LNDET Buffer (0.25 M LiCl, 1% NP40, 1% deoxycholate, 1 mM EDTA), and 2 times with TE buffer. The washed beads and input fraction were resuspended in Elution Buffer (1% SDS, 0.1 M NaHCO3) with 0.25 mg/ml proteinase K, and incubated overnight at 65˚C. After crosslink reversal of the immunoprecipitated and Input DNA (set aside from the sonication step), the DNA was purified using the PCR Purification Kit (Qiagen).

Semi-quantitative PCR was performed under standard PCR conditions (35-38 cycles). DNA was detected using agarose gel electrophoresis, and quantified by ethidium bromide staining. Quantitative PCR was performed using standard PCR conditions with 1 µCi [α-32P]-dCTP, 0.2 mM each of dATP, dTTP, dGTP, 0.02 mM of dCTP and 1 unit of Taq DNA polymerase (GenScript Corp.) for 24-30 cycles. Amplified products were immobilized on filter paper, TCA precipitated and counted on a scintillation counter.

Normalization was performed by calculating the ratio to the signal of the reference fragment after scaling by the signal provided by the input.
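As a minimal sketch of the normalization just described (an illustration only, not the analysis code used in this study), the signal of each fragment can first be scaled by its input signal, and the scaled target signal can then be divided by the scaled reference signal. The function name, argument names and counts below are hypothetical.

```python
def normalize_chip(ip_target, input_target, ip_reference, input_reference):
    """Input-scaled target signal divided by input-scaled reference signal."""
    for name, value in (("input_target", input_target),
                        ("input_reference", input_reference),
                        ("ip_reference", ip_reference)):
        if value <= 0:
            raise ValueError(f"{name} must be positive")
    target_scaled = ip_target / input_target            # scale IP signal by its input
    reference_scaled = ip_reference / input_reference   # same scaling for the reference
    return target_scaled / reference_scaled


# Hypothetical scintillation counts (cpm), for illustration only.
enrichment = normalize_chip(ip_target=2000.0, input_target=4000.0,
                            ip_reference=500.0, input_reference=4000.0)
print(enrichment)
```

A value above 1 would indicate that the target fragment is enriched relative to the reference fragment in the immunoprecipitated material.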

3.2.6 Transient Expression Experiments in Maize Cells

Microprojectile bombardment of maize BMS suspension cells and transient expression assays for luciferase were performed as described in Chapter 2 and in [213] with the following modifications. Ten µg DNA, including 1 µg regulator, 3 µg reporter plasmid (i.e. pA1::Luc) and p35S::BAR (to bring the total to 10 µg DNA), were coated onto gold particles (BioRad) using 0.1 M spermidine (cat. # 85558 from BioRad) and 2.5 M CaCl2. BMS cells were prepared 18 h before bombardment by adding 30 ml M237 medium containing 1 ml 50% PEG 8000 to 3 g of BMS cells. One ml of cells was added onto filter paper placed in the center of a petri dish. Coated gold particles were sonicated for 2 x 2 sec using the Bioruptor (Diagenode) at medium level, and 8 µl of suspended gold particles were immediately placed onto a macrocarrier (BioRad) and bombarded using the particle gun (BioRad).
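The amount of p35S::BAR filler needed to keep the DNA load constant at 10 µg can be computed as in the following sketch. The helper name and the dictionary layout are hypothetical and serve only to illustrate the arithmetic. Whether additional plasmids, such as a normalization control, count toward the 10 µg total is an assumption to be checked against the actual protocol.

```python
def filler_amount_ug(plasmids_ug, total_ug=10.0):
    """Return the µg of filler plasmid (here p35S::BAR) needed to reach total_ug."""
    used = sum(plasmids_ug.values())
    if used > total_ug:
        raise ValueError("plasmids already exceed the target total")
    return total_ug - used


# Amounts stated in the text: 1 µg regulator and 3 µg reporter.
mix = {"regulator": 1.0, "pA1::Luc reporter": 3.0}
print(filler_amount_ug(mix))
```

Further plasmids can be added to the dictionary, and the filler amount decreases accordingly.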

For transient expression assays of firefly luciferase and Renilla luciferase (Renilla), the Dual-Luciferase Reporter Assay System (Promega) was used. To normalize the number of red cells (counted 36-48 hours after bombardment) or the luciferase activity to the Renilla or GUS activity, 3 µg of p35S::Ren [225] or pUbi::GUS [213] were included in each bombardment. Each treatment was done in triplicate, and entire experiments were repeated at least twice.
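The normalization of triplicate measurements can be sketched as follows. This is a hedged illustration rather than the analysis code used here: each reporter reading (luciferase activity or red cell count) is divided by the control reading (Renilla or GUS) from the same bombardment, and the fold activation is taken as the ratio between a treatment and the reporter-alone treatment, as defined in the legend of Figure 3.1. All function names and readings are hypothetical.

```python
from statistics import mean, stdev


def normalize(reporter, control):
    """Divide each reporter reading by the control reading of the same replicate."""
    if len(reporter) != len(control):
        raise ValueError("replicate counts differ")
    return [r / c for r, c in zip(reporter, control)]


def fold_activation(treatment, baseline):
    """Ratio of the mean normalized treatment to the mean normalized baseline."""
    return mean(treatment) / mean(baseline)


# Hypothetical triplicate readings, for illustration only.
with_regulators = normalize([800.0, 1000.0, 1200.0], [100.0, 100.0, 100.0])
reporter_alone = normalize([90.0, 100.0, 110.0], [100.0, 100.0, 100.0])

print(fold_activation(with_regulators, reporter_alone))
print(stdev(with_regulators))  # sample standard deviation, as used for error bars
```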

3.2.7 Nuclei Isolation and Micrococcal Nuclease Digestion

BMS cells (100 mg) were subjected to nuclear isolation with the CellLytic PN plant nuclei isolation kit (Sigma) 40 h after transformation with the pA1::Luc plasmid by microprojectile bombardment. Ten µg of isolated nuclear protein were resuspended in 300 µl of buffer N (15 mM HEPES pH 7.5, 60 mM KCl, 15 mM NaCl, 3 mM CaCl2 and 0.1 mM DTT), and 50 µl aliquots were incubated with 15 U of micrococcal nuclease (USB) for various time periods at 37˚C. As a control, we used 5 ng of the pure pA1::Luc plasmid. After the reaction was completed, DNA was extracted using the PCR Purification Kit (Qiagen) into 30 µl of EB buffer. The purified DNA (2 µl) was used in PCR reactions with primers that recognize the A1 promoter and luciferase (to detect pA1::Luc), or the A1 promoter and the A1 5’ UTR (to detect the endogenous A1 gene).

3.2.8 Antibody Production

Amino acids 232-347 in R and 105-244 in RIF1 were PCR amplified using primers R-ab-F and R-ab-R or RIF1-ab-F and cloned into the pENTR/D-TOPO vector (Invitrogen), followed by recombination into the pDEST17 vector (Invitrogen). Proteins were expressed in E. coli and purified as follows. The pellet obtained from harvesting 500 ml of bacteria expressing His6-R(232-347) or His6-RIF1(105-244) was resuspended in 10 ml Lipsick Buffer (10 mM Tris pH 7.5, 50 mM NaCl, 10% glycerol, 7 M urea; PMSF was added to 100 µg/ml right before lysis). Bacteria were lysed using a French press and the total protein extract was spun down for 20 min at 4000 x g. The supernatant was filtered through one layer of Miracloth and added to equilibrated Ni2+ beads (0.5 ml slurry for 500 ml original culture volume). Beads were incubated with the supernatant for 2.5 h at RT and applied to a column. After collecting the flow-through, the column was washed three times with sonication buffer in 7 M urea (50 mM NaPi pH 8, 300 mM NaCl, 100 µg/ml PMSF) and three times with wash buffer in 7 M urea (50 mM NaPi, pH 8, 300 mM NaCl, 1% v/v Tween-20, 10% v/v glycerol, 5 mM β-mercaptoethanol, 100 µg/ml PMSF). Purified protein was eluted with 3 x 3 ml wash buffer containing 250 mM imidazole. Twenty µl of the elutions were loaded on a 15% SDS-polyacrylamide gel (SDS-PAGE) and quantified using the BCA protein quantification kit (Promega). One mg of purified protein was then loaded on a 15% SDS-PAGE gel and electrophoresed for 240 min at 150 mV. The gel was stained with Coomassie Brilliant Blue and destained. The stained protein band was cut out and sent for injection into rabbits for antibody production at Cocalico Biological, Inc. (www.cocalicobiological.com).

3.3 Results

3.3.1 An Essential Function for the bHLH Domain of R

The bHLH region of R/B is highly conserved, yet in transient expression experiments, it appears to be dispensable for the activation of reporter constructs containing promoters from the anthocyanin pathway [212, 226-227]. In agreement with these studies, the deletion of the bHLH of R has only a moderate effect on the activation of the pA1::Luc reporter (about 40% reduction) when co-bombarded with C1 into BMS maize cells (Fig. 3.1A, RΔbHLH). The previously characterized RD12 mutant, containing an insertion of three amino acids at the C-terminus of the second helix of the bHLH motif, was originally identified as a Ds-transposition derived mutation that significantly reduces anthocyanin accumulation and flavonoid gene expression [226]. RD12 shows a similarly moderate reduction in the normalized expression of the pA1::Luc reporter (Fig. 3.1A, RD12). Previously it was shown that RD12 activity on Bz1::Luc is reduced by up to 90%. The difference observed between the activation of A1 and Bz1 might be due to the different regulation by R explained in detail in Chapter 4.

However, both RΔbHLH and RD12 show a dramatic reduction in the number of cells accumulating anthocyanins (~80% reduction for RΔbHLH and 98% reduction for RD12, Fig. 3.1A) when these mutants of R were co-bombarded with C1 (p35S::C1) into BMS cells. These results are in agreement with previous in vivo findings showing that the RD12 mutation causes a 97% reduction in anthocyanin accumulation in maize aleurones [226], and demonstrate that the bHLH region of R is necessary for the activation of endogenous flavonoid gene promoters, but not for the transient expression of reporter constructs. The reproducible difference in the activity of RΔbHLH and RD12 (Fig. 3.1A) [226] might reflect different stability of these proteins in the transient expression system. Alternatively, the bHLH motif may have both positive and negative regulatory activities; the D12 mutation perhaps abolishes only the positive regulatory activity, while the deletion of the bHLH in RΔbHLH might abolish both, resulting in increased activation of the anthocyanin genes when compared to RD12.

One way to explain the difference in the activation of the endogenous genes and the transiently expressed constructs is that epigenetic marks, such as chromatin modifications, differ between the endogenous and transiently expressed promoters. Micrococcal nuclease protection experiments performed with the help of Dr. Morohashi showed a significantly higher sensitivity of the transiently expressed pA1::Luc construct compared to the endogenous A1 promoter (Fig. 3.1B). This suggests that, at most, a small fraction of the pA1::Luc plasmid is nucleosome associated in BMS cells, under the conditions tested. This is consistent with earlier studies in animal cells which have shown that transiently introduced and stably integrated genes can have different chromatin structures, and hence might be transcribed differently [228-229].

To investigate whether histone modifications are associated with the activation of the flavonoid genes by C1 and R, Dr. Morohashi analyzed previously described [221] transgenic BMS cells constitutively expressing R and C1 from the 35S promoter (BMSR+C1). Chromatin immunoprecipitation (ChIP) experiments were performed in parallel on BMS and BMSR+C1 cells with commercial antibodies that recognize several different modifications of H3, including H3K9/14ac, H3K9/14me2, H3K4me2, H3K4me3 and H3K27me2. The most striking and reproducible difference in histone modification of the A1 gene promoter between BMS and BMSR+C1 is in the acetylation of Lys-9 and Lys-14 in H3 (H3K9/14ac), particularly in the proximal region (Fig. 3.2B, P) immediately upstream of the TSS, a promoter fragment necessary and sufficient for the regulation by C1 and R (Fig. 3.2A) [214, 227]. H3K9/14ac is significantly enriched in histones associated with the A1 promoter in BMSR+C1 cells, compared to BMS cells that do not express the pathway regulators (Fig. 3.2C). In contrast, the levels of H3K9/14ac are similar in a distal region (probe D, Fig. 3.2A) positioned 1.5 kb upstream of the TSS (Fig. 3.2C). Similarly, no difference between BMS and BMSR+C1 cells was observed in H3K9/14ac in a gene unrelated to the flavonoid pathway (Fig. 3.2B, Actin).


To ensure that the differences observed between BMS and BMSR+C1 cells were not a consequence of the cell culture selection process, Dr. Morohashi investigated the status of H3K9/14ac between B-I (fully pigmented) and b (no pigment) maize plants. The expression of B-I correlated with increased accumulation of H3K9/14ac at the A1 gene promoter in the proximal, but not in the distal region (not shown). Taken together, these results indicate that the bHLH region of R is essential for the activation of flavonoid genes in their normal chromatin environment, and that the R/B and C1/PL1 pathway regulators influence histone modifications specifically associated with the proximal region of the A1 promoter.

3.3.2 Identification of RIF1 as an ENT Domain Protein that Specifically Interacts with the bHLH Domain of R

To determine the function of the bHLH region of R, I carried out yeast two-hybrid screens using the C-terminal region of R (residues 411-610, Fig. 3.3A) fused to the GAL4 DNA-binding domain [GAL4DBD-R(411-610)] in the PJ69.4a strain [223] as bait. A total of ~3.5 x 10⁶ clones were screened (see Appendix D for details), resulting in the identification of several positive clones, including two novel bHLH factors (described in Chapter 5). Four different clones corresponding to RIF1 were identified, all of them containing the complete ORF for a 452 amino acid-long protein (Fig. 3.3A). The bHLH region of R [R(411-462), Fig. 3.3B] is sufficient for the interaction of R with RIF1, and GST pull-down experiments, using in vitro transcribed and translated RIF1 ([35S]-GAL4AD-RIF1) and E. coli expressed GST-R(411-610), provided an independent biochemical confirmation for the interaction (Fig. 3.3C).

I reasoned that, if RIF1 had anything to do with the R activities described in Figure 3.1, then the D12 mutation should impair the R/RIF1 interaction. To determine whether this was the case, the bHLH region of RD12 was fused to the GAL4DBD [GAL4DBD-RD12(411-462)] and tested for interaction with GAL4AD-RIF1 in yeast. No interaction was observed (Fig. 3.3D, compare 1 and 2), indicating that the D12 mutation in R abolishes the interaction with RIF1.

RIF1 encodes a novel plant protein harboring a unique domain architecture (Fig. 3.3A). The ENT domain was first described as necessary and sufficient for the interaction of EMSY and BRCA2 [220]. In humans, it is only found in EMSY. In plants, however, the ENT domain is present in a small protein family, usually in association with an AGENET domain, a member of the “Royal Family”, which includes the Tudor domain. Proteins belonging to the ENT/AGENET domain family are often associated with chromatin functions [28], but in plants none of these proteins has been functionally characterized.

Phylogenetic analysis of RIF1-like genes revealed that most flowering plants have three ENT/AGENET protein encoding genes, with the exception of A. thaliana where four of these genes were identified (Fig. 3.4A). In maize, three RIF1-like genes were found (ZmRIF1, ZmRIF1-like1, ZmRIF1-like2), all containing the typical ENT/AGENET domain structure (Fig. 3.4B).

Mapping of these genes in the maize genome revealed that RIF1 is localized on chromosome 7, bin 02, and RIF1-like2 maps to chromosome 4 (Fig. 3.4C). Interestingly, the RIF1-like1 gene is localized on chromosome 2, bin 06 (Fig. 3.4C), a region containing the enr1 locus (enhancer of r1) involved in color formation [230]. In addition, during the course of QTL mapping, modifiers that increase anthocyanin accumulation in vegetative tissues in Pl-Blotched plants were mapped to that specific region (Ed Grow and Karen Cone, personal communication). To determine if RIF1-like1 is involved in anthocyanin formation, I sent this gene to the Maize Tilling Project at Purdue University (http://genome.purdue.edu/maizetilling/index.htm). Two tilling mutants were found, one with a mutation in the third intron and one with a glycine to aspartate change at amino acid 224 (Figure 3.4B). I received the seeds from the tilling facility, and heterozygous seeds were planted and crossed by Dr. Yongqin Wang (a postdoc in Erich Grotewold's laboratory) to obtain homozygous seeds. Homozygous plants for the RIF1-like1 G224D mutation are viable and show no change in anthocyanin pigmentation when compared to W22 (Dr. Yongqin Wang, personal communication). The G224D mutation lies within a region which is not conserved and is highly variable between RIF1-like genes (Fig. 3.4A, B). This suggests that amino acid 224 in RIF1-like1 is not necessary for pigment formation.

While the ENT domain of RIF1 homodimerizes in yeast (Table 3.1), as is the case for the EMSY ENT domain [231], neither the ENT nor the AGENET domain of RIF1 is sufficient for the interaction with R. Dr. Hernandez truncated RIF1 and cloned the following RIF1 fragments into pADGAL4 to test for interaction with GAL4DBD-R(411-610) in yeast: A. RIF1(1-114) (including the ENT domain); B. RIF1(241-462) (including the AGENET domain); C. RIF1(49-282) (containing the ENT and AGENET domains); D. RIF1(1-242) and RIF1(49-462). No interaction was observed with any of the RIF1 truncations, suggesting that multiple regions in RIF1 are involved. Together, these results provide the first evidence for the interaction of a bHLH domain with an ENT/AGENET containing protein.

3.3.3 RIF1 Displays Speckled Nuclear Localization and is Necessary for the Activation of Maize Flavonoid Genes

To determine whether RIF1 localizes to the nucleus, as could be expected for an R partner, we fused RIF1 to GFP and investigated the localization of the p35S::RIF1-GFP fusion in maize protoplasts (Fig. 3.5A) and in N. benthamiana leaf epidermal cells (RIF1-GFP, Fig. 3.5B). In both cases, RIF1-GFP was detected exclusively in the nucleus in a distinct speckled pattern. No difference in the RIF1-GFP localization pattern was observed when R was co-expressed (Fig. 3.5B), suggesting that the speckled nuclear localization of RIF1 is independent of the similar pattern observed for R-GFP [216].

Next, I investigated whether RIF1 is necessary for R regulatory activity. Several ESTs for RIF1 have been identified, including some from BMS cells, indicating that this gene is ubiquitously expressed, consistent with the ability of maize to accumulate anthocyanins in almost every plant organ. Since no maize RIF1 mutants are currently available, Kenneth Frame generated a construct that, when expressed, would produce a dsRNA that should target RIF1 for degradation (p35S::RNAiRIF1). A 500 bp fragment of the RIF1 coding region (nucleotides 164-663) in the forward and reverse orientations, separated by the rice waxy-a intron, was cloned in the pMCG161 vector (http://www.chromdb.org/). With the help of Dr. Morohashi, I tested whether this construct is able to target RIF1 for degradation by bombarding p35S::RIF1-GFP in the presence and absence of p35S::RNAiRIF1 into maize BMS cells, followed by RNA isolation and qRT-PCR. The RNAi construct knocks down the expression of the transiently expressed RIF1-GFP (Fig. 3.6). I then investigated the effect of p35S::RNAiRIF1 on the ability of R and C1 to activate anthocyanin biosynthesis. p35S::RNAiRIF1 significantly (P < 0.01) reduces the normalized number of red cells induced by R+C1 in BMS cells (Fig. 3.6B, RNAiRIF1), while no significant difference was observed when R+C1 were co-bombarded with the pMCG161 empty vector (Fig. 3.6B, empty vector). These results confirm that RIF1 participates in the C1/R regulatory function. I reasoned that, if this activity of RIF1 is related to the role of RIF1 in chromatin functions, then the knock-down of RIF1 should affect anthocyanin accumulation, but not the expression of the pA1::Luc reporter. When I assayed luciferase and normalized it for Renilla activity provided by the co-bombarded p35S::Ren plasmid, I found that the expression levels of pA1::Luc remained the same (Fig. 3.6B). Taken together, these results indicate that RIF1 is a nuclear protein that is specifically required for the regulation of the endogenous flavonoid genes by C1 and R.

3.3.4 R Recruits RIF1 to the A1 Gene Promoter

To determine whether RIF1 would be recruited to the A1 promoter in a C1/R dependent fashion, I adopted a maize protoplast transient-expression system [232-233], which provides significantly higher transformation efficiency (30-50%) than the bombardment of BMS cells. In Chapter 2, I determined that p35S::R-GFP is as active as p35S::R in promoting anthocyanin accumulation. When protoplasts were electroporated with p35S::R-GFP (R-GFP) and ChIP experiments were performed with GFP-specific antibodies and probed on the P and D regions of the endogenous A1 promoter, no recruitment of R to DNA was observed (Fig. 3.7, R-GFP), as expected based on current models of how C1 and R function [234]. However, when p35S::C1 (C1) was co-transformed, a significant recruitment of R-GFP to DNA was observed, yet only to the P region of the A1 promoter (Fig. 3.7, R-GFP + C1). No significant DNA recruitment of RIF1-GFP was observed in the absence of R and C1 (Fig. 3.7, RIF1-GFP). However, the presence of R and C1 was sufficient to tether RIF1 to the proximal, yet not to the distal region of the A1 promoter (Fig. 3.7, RIF1-GFP + C1 + R). Taken together, these results show that RIF1 is part of the C1/R regulatory complex and provide the first evidence that in vivo, C1 is essential for the assembly of the C1/R/RIF1 complex on A1.

3.3.5 RIF1 and Arabidopsis Homologs of RIF1 Interact with R-like and Non-R-like bHLH Proteins

Given the RIF1 domain structure and my results showing the possible involvement of RIF1 in chromatin function for anthocyanin biosynthetic gene regulation, it is possible that RIF1 is a common chromatin modifier which might recognize the helix-loop-helix fold, and therefore might interact with other bHLH proteins. In Chapter 5, I will describe in detail a maize bHLH protein (ZmbHLH5) which I identified in a yeast two-hybrid screen and which interacts with R. To determine if RIF1 has the potential to interact with other bHLH proteins from maize, I considered this protein as a case study. I fused RIF1 to the GAL4AD (GAL4AD-RIF1) and ZmbHLH5(117-393) to the GAL4DBD [GAL4DBD-ZmbHLH5(117-393)] and tested the interaction in yeast. GAL4AD-RIF1 interacts with GAL4DBD-ZmbHLH5(117-393), as indicated by growth on selective medium SC-LEU-TRP-HIS-ADE (Table 3.1). This suggests that R is not the only interacting partner for RIF1 and that RIF1 might be a more general chromatin-associated protein which recognizes the bHLH fold.

As mentioned previously, the Arabidopsis genome contains four RIF1-like genes, which were designated ACK1 to ACK4 (At5g13020, At2g24440, At3g12140, At5g06780). I also mentioned previously that Arabidopsis contains four R-like genes (GL3, EGL3, TT8 and AtMYC1) and that R belongs to group IIIf of bHLH proteins [40]. These R-like genes are involved in epidermal cell differentiation (leaf and root), seed coat pigmentation and/or anthocyanin biosynthesis. To establish whether RIF1 and the ACK genes are functionally similar, I first tested if Arabidopsis RIF1 homologs and the R homologs are able to interact in yeast. When fusing the four ACK genes to the GAL4 activation domain (GAL4AD-ACK) and the bHLH plus C-terminal regions of the bHLH genes to the GAL4 DNA-binding domain (GAL4DBD), I found that ACK1 interacts with GL3 and weakly with EGL3, but not with TT8 (Fig. 3.8A), whereas ACK2 and ACK3 do not interact with any of the R-like proteins (Fig. 3.8A, Table 3.1). In contrast, ACK4 interacts weakly with TT8 (Table 3.1). ACK3 interacts with R(411-610) and with FAMA (but not with SPEECHLESS or MUTE, Table 3.1). This suggests that ACK1 might be involved in anthocyanin biosynthesis but that the other RIF1 homologs in Arabidopsis might be recruited to other regulatory regions and be involved in processes other than pigment formation.

Next, I tested whether RIF1 can interact with any of the R-like proteins from Arabidopsis in yeast. None of the Arabidopsis R-like proteins interacted with RIF1, as indicated by lack of growth on SC-LEU-TRP-HIS and SC-LEU-TRP-HIS-ADE (Fig. 3.8B). In addition, I tested the interaction of RIF1 with Arabidopsis group I bHLH proteins such as SPEECHLESS, FAMA, MUTE, At5g65320 and At5g46690 and with NAI1, which belongs to group VIa [40]. All proteins were truncated and contain only the C-terminus including the bHLH domain. RIF1 interacts with MUTE, but not with any of the other bHLH proteins tested (Fig. 3.8B). This suggests that RIF1 and RIF1 homologs from Arabidopsis interact specifically with certain bHLH proteins. Although I have not been able to find an interacting bHLH partner for ACK2, it is possible that it dimerizes with an as yet unidentified bHLH protein. Since the bHLH domain forms a conserved 3D-structure, it is unlikely that RIF1 or the ACK proteins recognize the bHLH fold for binding; rather, they probably recognize specific sequences within the bHLH domain. The bHLH proteins tested are highly similar in the bHLH domain (>70%) at the amino acid level (Fig. 3.8C). If RIF1 binds to certain amino acids, for example residues facing the outside of the helix, as has been described for the interaction of human MIZ1 and MYC1 [82], a structural analysis could determine which amino acids face the inside or outside of the helix, and further mutational analysis could reveal the exact interacting site for RIF1.

3.4 Discussion

The results presented here provide a novel function for the bHLH region of the maize transcription factor R in linking transcriptional activation of flavonoid biosynthetic genes with chromatin functions. The finding that the R bHLH region is essential for the regulation of the endogenous genes in maize cells, yet is dispensable for the activation of transiently expressed reporters, reconciles a number of conflicting reports [212, 226-227] while providing the first evidence that it may play a role in chromatin functions. The identification of RIF1 as a partner for the bHLH domain of R further supports the link between R function and chromatin structure. The presence in RIF1 of an AGENET motif, which is closely related to Tudor domains [28], further highlights its likely participation in chromatin functions.

In addition to the AGENET domain, RIF1 also contains an ENT domain, located near the N-terminus of the protein (Fig. 3.3A). Whereas humans contain a single ENT-domain protein, the BRCA2-interacting EMSY factor [220], plant genomes contain several genes that code for ENT-domain containing proteins [220]. ENT domains in plants are almost always accompanied by AGENET domains, as found in RIF1. The function of the ENT domain remains unknown; however, a conserved sequence adjacent to the ENT domain mediates the interaction of EMSY with HP1 and BS69 [235]. EMSY forms homodimers through its ENT domain [231] and, as expected given the high sequence identity, the RIF1 ENT domain mediates RIF1 homodimer formation (Table 3.1).

Recently, a group of plant proteins containing a DUF724 domain (domain of unknown function 724) has been identified [236]. As described for the ENT domain proteins, DUF724 domains are almost always accompanied by an AGENET domain. There are 10 DUF724 domain containing proteins in Arabidopsis and 3 of them homodimerize through this domain [236]. The AGENET domain in the DUF724 domain containing proteins has significant sequence similarity with the N-terminal region of FMRP (fragile X mental retardation protein), which has been shown to have RNA binding activity [28, 237]. Cao and collaborators proposed that the AGENET domains in DUF724 proteins may be involved in RNA binding and the DUF724 domain in interaction with microtubules or actin filaments [236].

In addition to RIF1, there are two RIF1-like genes (RIF1-like1, RIF1-like2) in maize. RIF1 and RIF1-like1 are 89% identical at the amino acid level. Whereas RIF1 is localized on chromosome 7, RIF1-like1 is localized on chromosome 2, bin 6. Interestingly, two independent studies found that this genomic region contains an enhancer of color formation in maize kernels or in vegetative tissues [230] (Ed Grow and Karen Cone, personal communication). It is possible that RIF1-like1 is the gene responsible for the color formation. Seeds of enr1 mutants are available and plants should be screened for mutations in RIF1-like1. We will also test if this gene is able to interact with R in yeast two-hybrid experiments and if it is involved in anthocyanin formation in maize BMS cells by transiently expressing RIF1-like1 in the presence of C1 and R and looking for red cell formation.

RIF1 accumulates in the nucleus of maize and N. benthamiana cells in discrete speckles. However, the presence or absence of R does not influence the nuclear patterning of RIF1 (Fig. 3.5A, B). Interestingly, EMSY is also exclusively nuclear and forms speckles that co-localize with γ-H2AX after DNA damage [220]. LHP1, one of the Arabidopsis HP1-like factors, also shows a speckled pattern [238]. Many plant nuclear proteins localize to heterogeneous nuclear foci [191] and many “Royal Family” domain proteins heterodimerize, such as EMSY and HP1 [220], so it would be interesting to see if LHP1 (containing a chromodomain) and RIF1 might associate. Dr. Hernandez tested this in yeast by fusing the CDS of LHP1 to the GAL4DBD and testing for interaction with GAL4AD-RIF1. Yeast two-hybrid experiments failed to detect an interaction between LHP1 and RIF1, which is consistent with the absence of the EMSY HP1-interacting region in RIF1. Given that both the chromodomain in HP1 and the AGENET domain of RIF1 belong to the “Royal Family”, it is possible that the functions contributed by HP1 to the EMSY complex in metazoans are fulfilled by the RIF1 AGENET domain (similar to the chromodomain in HP1) in the C1/R/RIF1 complex.

EMSY interacts with BRCA2 and co-localizes with γ-H2AX, which strongly suggests that EMSY is part of the BRCA2-containing DNA-repair complex. Coordinating the recruitment of chromatin-remodeling proteins by EMSY strengthens the link between BRCA2 and chromatin repair [220, 239]. This is not surprising given that DNA repair, like transcription, is a process that is challenged by chromatin structure. In addition to its role in DNA repair, EMSY regulates the transcriptional activity of BRCA2. However, unlike what our studies suggest with regard to the function of RIF1, EMSY is a transcriptional repressor that interacts with the activation domain of BRCA2 [220], possibly by the association with chromatin-remodeling proteins.

The recruitment of RIF1 to the C1/R enhanceosome is mediated by the bHLH region of R. bHLH motifs are typically expected to interact with other bHLH proteins, and the finding that the corresponding region of R is essential for the interaction with RIF1 (Fig. 3.3A, B) opens the possibility for additional protein-protein interactions mediated by similar domains in other plant or animal bHLH proteins. Given the high identity in the bHLH region between R and several other plant proteins [240-243] and the results obtained from my yeast two-hybrid experiments, it is very likely that RIF1 is shared by a number of regulatory complexes. Until maize RIF1 loss-of-function mutants become available, the identity of those regulatory complexes will remain unknown. Interestingly, however, mutations in ACK1 (At5g13020), one of the RIF1 homologs from Arabidopsis, cause a number of developmental defects (Wijeratne, Hernandez and Grotewold, unpublished). There are four RIF1-like genes in Arabidopsis (ACK1-ACK4), and all can homodimerize, possibly via the corresponding ENT domain (Fig. 3.8A; Table 3.1). Furthermore, all ACKs can heterodimerize with each other (e.g. ACK1 interacts with ACK2, -3 and -4). In addition, the ACK proteins interact with various bHLH proteins (Fig. 3.8B, Table 3.1). ACK1, for example, interacts weakly with GL3, EGL3 and maize R in yeast, three bHLH factors involved in anthocyanin formation. This is in agreement with the phenotype of ack1 mutants, which show reduced anthocyanin levels (Wijeratne and Grotewold, unpublished). Further functional analysis of all four ACK genes is under way (Wijeratne and Grotewold, unpublished).

My results also provide the first in vivo evidence that the assembly of the C1/R enhanceosome on the proximal region of the A1 promoter requires C1, and that R, despite the presence of the bHLH region, is unable to be recruited to the cis-regulatory regions important for A1 regulation in the absence of C1 (Fig. 3.7). Such a model had been predicted from extensive transient expression experiments and mutational analyses of the A1 promoter [213, 215, 227, 234], but a direct in vivo tethering of C1 to the A1 promoter was never shown before. Most significant from a mechanistic perspective, however, my results support a model in which C1 is primarily responsible for specifying the promoters to which R needs to be recruited, while R furnishes a docking platform for the recruitment of additional factors to the complex, including RIF1, as shown in this study. As is the case for some of the other factors recruited by R, for example the WD40 factor PAC1 [104], or the R dimerization [216], the specific role that RIF1 plays in the complex remains to be fully determined. However, the results shown here strongly suggest that the recruitment of the C1/R/RIF1 complex to the proximal region of the A1 promoter is required for chromatin functions that include the acetylation of H3K9/14 and ultimately result in the expression of A1. H3K9/14 acetylation in gene promoter regions has been extensively associated with transcriptional activation [244]. In addition, acetylated H3K14 (K14ac) and di-methylated H3K36 (K36me2) are associated with “open” chromatin [21]. Interestingly, it has been shown that ACK3 can bind to both modified histones, but not to H3K4me2 (Wijeratne, Morohashi, Omata, personal communication). These results are consistent with the notion that chromatin-associated proteins, such as PHD finger or bromodomain containing proteins, are recruited to certain modified histones [19].

In conclusion, my study uncovered the recruitment of a novel EMSY-related protein to the regulatory region of the A1 gene as a novel function for a plant bHLH domain. This finding provides a new link between gene transcriptional regulatory mechanisms and chromatin functions in one of the best-described plant regulatory systems to date, and highlights a previously unknown function of bHLH domains.


Figure 3.1: The bHLH region of R and chromatin functions. A, Activation of the A1 promoter and red cell accumulation by mutants in the bHLH region of R. Results of transient expression experiments after co-bombardment of cultured maize BMS cells with R and bHLH mutants of R driven by the CaMV 35S promoter (p35S) together with p35S::C1 and pA1::Luc. A construct expressing GUS from pUbi (pUbi::GUS) was included in all bombardments as a normalization control. Each treatment was done in triplicate, and the number of red cells and the luciferase activity were normalized for GUS activity. The fold activation corresponds to the ratio of each particular treatment to the treatment with pA1::Luc without activator. The scale for the normalized red cells is in arbitrary units of red cells/GUS. The average values are shown and the error bars indicate the standard deviation of the samples. B, Micrococcal nuclease (MNase) sensitivity assay. PCR of MNase digestion experiments using the nuclear extract from BMS cells bombarded with pA1::Luc and the pure (naked) pA1::Luc plasmid as a control. PCR was performed using primers that selectively amplify the pA1::Luc or the endogenous A1 promoter. MNase treatments were carried out for 30 sec, 1 min, and 5 min in the presence of MNase (+) and 20 min without MNase (-), followed by agarose gel electrophoresis. Locations of PCR primers are indicated by arrows in the diagram.


Figure 3.2: Enrichment of H3K9/14ac in the proximal A1 promoter region in BMS cells. A, Structure of the A1 gene promoter indicating the proximal (P) and distal (D) fragments analyzed by ChIP. The arrow corresponds to the transcription start site (TSS) at position +1 and numbers in the promoter are in reference to this. The laPBS and haPBS recognized by C1 [215] are indicated. B, Enrichment of H3K9/14ac (indicated in the rest of this figure as H3K9ac) in the P but not in the D region of the A1 promoter in BMSR+C1 cells compared to BMS, determined by semi-quantitative PCR using three four-fold serial dilutions of ChIPed DNA and C, by quantitative PCR. ChIP experiments were performed in biological triplicates, and PCRs were performed using proximal (P) and distal (D) fragments. Asterisks indicate a sample significantly different from the rest (P < 0.05). The normalization of the H3K9/14ac abundance is described in the Materials and Methods section. Different asterisk numbers indicate sample groups that are statistically significantly different (P < 0.05).


Figure 3.3: RIF1 corresponds to an EMSY-like protein that interacts with the bHLH domain of R. A, Schematic representation of the structure of R and RIF1 indicating the ENT domain (green), the AGENET domain (brown) and a conserved domain C-terminal of RIF1 (blue). Underlined is the region involved in transcriptional activation. B, Yeast two-hybrid interaction of GAL4AD-RIF1 with the C-terminal region of R including the bHLH (R411-610), the C-terminal region of R excluding the bHLH (R462-610), and the R bHLH domain alone (R411-462). All R constructs were fused to the GAL4DBD at the N-terminus in pBDGAL4, which is the plasmid used in the empty vector control. C, Autoradiogram (right) and Coomassie stained gel (left) of an SDS-PAGE of a GST pull-down using GST-R411-610 (including the bHLH) as bait (lane 2) with an in vitro transcribed and translated GAL4AD-RIF1. GAL4AD-RIF1 was radio-labeled with 35S-methionine as shown in the input lane (lane 1). GST alone was used as a negative control (lane 3). These data were generated by Dr. Marcela Hernandez. D, Yeast two-hybrid interaction of GAL4AD-RIF1 with the bHLH of R containing the D12 allele sequence [GAL4DBDRD12(411-462)]. GAL4DBDR411-462 was used as positive control and the empty pADGAL4 plasmid (GAL4AD) as negative control. Yeast two-hybrid assays were performed using yeast strain PJ69.4a [223] containing the HIS3 and ADE2 genes under the control of GAL4-binding sites. Growth on SC-LEU-TRP-HIS-ADE plates is indicative of activation.


Figure 3.4: Phylogenetic analysis of RIF1 and RIF1-like genes. A, Phylogenetic tree showing the evolutionary relationship between RIF1 and RIF1-like genes from maize (ZmRIF1-like1 and ZmRIF1-like2) and Arabidopsis (ACK1-ACK4). The GenBank accession of maize RIF1 is EF647588. B, Alignment of maize RIF1 with two RIF1-like proteins (RIF1-like1 and RIF1-like2). RIF1 shares 89% and 62% similarity at the amino acid level with RIF1-like1 and RIF1-like2, respectively. Alignment was performed using ClustalX2 [207] and GeneDoc software [208]. The red arrow indicates amino acid 224, which was mutated in the TILLING project to Asp. C, Schematic representation of the 10 chromosomes of Zea mays and the localization of RIF1 and RIF1-like genes within the chromosomes (red, yellow and black lines). Figure adapted from http://www.maizegdb.org/ after using the RIF1 CDS in the BLAST function.


Figure 3.5: RIF1 localizes to the nucleus. A, GFP fluorescence of maize protoplasts transiently transformed with RIF1-GFP. The left panel shows the GFP fluorescence, the middle panel the DAPI stain, and the merged images are shown on the right. The bar corresponds to 5 µm. B, p35S::RIF1-GFP (RIF1-GFP) localizes to nuclear speckles in agroinfiltrated N. benthamiana leaf epidermal cells, and this pattern is not affected by the presence of p35S::R (RMYC).


Figure 3.6: Knock down of RIF1 reduces the accumulation of anthocyanin in BMS cells. A, RIF1-GFP expression is knocked down in BMS cells when co-bombarded with an siRNA construct targeting RIF1 [siRNA(+)] compared to when co-bombarded with empty vector [siRNA(-)]. No signal was detected for endogenous RIF1 (RIF1) in the presence or absence of the siRNA construct. B, Activation of the A1 promoter (open bars) and red cells (gray bars) by p35S::R (R) and p35S::C1 (C1) in the absence (-) or presence of a plasmid expressing a double stranded fragment of RIF1 (RNAiRIF1), or with the corresponding empty vector control (Empty vector). All other experimental details are described in the legend of Fig. 3.1 or in Materials and Methods. p35S::Renilla was used as a normalization control. The number of red cells formed after transformation with C1+R in the presence of RIF1RNAi is significantly different from the red cell number of C1 and R alone (P < 0.5).

Figure 3.7: The recruitment of RIF1 to the A1 promoter depends on the presence of C1 and R. Plasmids expressing R-GFP, C1, RIF1 and RIF1-GFP were transformed into maize protoplasts and ChIP experiments were performed using antibodies against GFP. The semi-quantitative PCR of ChIPed material was performed on the A1 promoter as described in Fig. 3.2. The ChIP experiments and semi-quantitative PCR were performed by Dr. Morohashi (OSU).


Figure 3.8: RIF1 and RIF1 homologs from Arabidopsis interact with a subset of bHLH proteins. A, Yeast two-hybrid experiments showing interaction between a fusion of the GAL4 activation domain to ACK1 or ACK2 (GAL4AD-ACK1, -ACK2) or to the C-terminus of TT8 or R (GAL4AD-TT8361-518, GAL4AD-R411-610) with GL3441-637, EGL3402-595, AtMYC1336-526, TT8361-518, R411-610, Ack1 or Ack2 fused to the GAL4 DNA-binding domain (GAL4DBD) in the yeast strain PJ69.4a [183] containing the HIS3 and ADE2 genes under the control of GAL4 binding sites. ACK1 and ACK2 are ENT/AGENET domain containing proteins and homologs of maize RIF1 described below. B, Yeast two-hybrid experiments showing interaction between a fusion of the GAL4 activation domain to RIF1 (GAL4AD-RIF1) with AtMYC1336-526, Mute, TT8361-518, EGL3402-595, R411-610 or RIF2 fused to the GAL4 DNA-binding domain (GAL4DBD) in the yeast strain PJ69.4a [177]. RIF1 corresponds to an R partner that specifically interacts with the bHLH region and is described in chapter 3. RIF2 was identified in a yeast two-hybrid screen using R as bait and corresponds to an HLH-like protein from maize (see Chapter 5 for detail). C, Alignment of the bHLH domains of RIF1 interactors (R415-475, ZmbHLH5121-183, MUTE1-63, Myc1336-396) and non-interactors (GL3439-500, EGL3403-464, TT8361-422, At2g22770129-192, At5g4669087-149, At5g65320102-170, FAMA193-255, Speechless97-159). Alignment was done using ClustalX2 [207] and GeneDoc [208].


GAL4AD construct    GAL4DBD construct    Growth in SC-LTH    Growth in SC-LTHA
RIF1ENT             RIF1ENT              YES                 YES
ACK1                ACK1                 YES                 YES
ACK2                ACK2                 YES                 NO
ACK3                ACK3                 YES                 YES
ACK4                ACK4                 YES                 YES
pADGAL4             ACK1                 NO                  NO
pADGAL4             ACK2                 NO                  NO
pADGAL4             ACK3                 NO                  NO
pADGAL4             ACK4                 NO                  NO
RIF1                ZmbHLH5117-393       YES                 YES
RIF1                R411-610             YES                 YES
RIF1                MUTE1-203            YES                 YES
RIF1                AtMYC1339-526        NO                  NO
RIF1                GL3434-503           NO                  NO
RIF1                EGL3402-596          NO                  NO
RIF1                TT8259-418           NO                  NO
RIF1                SPEECHLESS97-364     NO                  NO
RIF1                At5g4669081-327      NO                  NO
RIF1                At5g6532097-296      NO                  NO
ACK1                AtMYC1339-526        NO                  NO
GL3434-637          ACK1                 YES                 NO
ACK1                EGL3402-596          YES                 NO
ACK1                TT8259-418           YES                 NO
SPEECHLESS97-364    ACK1                 NO                  NO
ACK2                AtMYC1339-526        NO                  NO
GL3434-637          ACK2                 NO                  NO
ACK2                EGL3402-596          NO                  NO
ACK2                TT8259-418           NO                  NO
ACK3                AtMYC1339-526        NO                  NO
GL3434-637          ACK3                 NO                  NO
ACK3                EGL3402-596          NO                  NO
ACK3                TT8259-418           NO                  NO
ACK4                AtMYC1339-526        NO                  NO
GL3434-637          ACK4                 NO                  NO
ACK4                EGL3402-596          NO                  NO
ACK4                TT8259-418           YES                 NO
SPEECHLESS97-364    ACK3                 NO                  NO
ACK3                MUTE1-203            NO                  NO
ACK3                FAMA190-414          YES                 YES
ACK1                R411-610             YES                 YES
ACK3                R411-610             YES                 NO

Table 3.1: Summary of yeast protein-protein interactions. Listed here are interactions tested between GAL4AD-RIF1 or GAL4AD-ACK1-4 with various bHLH proteins from maize and Arabidopsis fused to the GAL4DBD in the yeast strain pJ69.4a. YES in the SC-LTH or SC-LTHA column indicates yeast growth and therefore interaction.

CHAPTER 4

A DIMERIZATION-MEDIATED SWITCH LEADS TO

REGULATION OF DIFFERENT SETS OF TARGET GENES

4.1 Introduction

Plant bHLH proteins are classified into discrete groups or families according to their similarity in the bHLH domain, the ability of the bHLH domain to bind DNA, or the existence of additional conserved domains besides the bHLH domain [40, 87]. Group IIIf, according to Heim et al., contains an N-terminal MYB-interacting region (MIR), required for interaction with MYB-domain proteins, which is essential for activation of flavonoid biosynthetic genes or trichome genes [98, 114]. Another domain frequently found in bHLH proteins is a leucine zipper (LZ), adjacent to the bHLH domain. It extends the second helix and stabilizes the HLH bundle by extensive van der Waals contacts and electrostatic interactions between charged residues. The human MYC proto-oncogene contains such a LZ. An E. coli expressed GST-MYC fusion polypeptide containing the MYC-bHLH-LZ region specifically binds the G-box sequence CACGTG in vitro [245]. Full-length MYC1 protein, however, does not homodimerize in vitro or in cell extracts and fails to bind DNA [246-247]. Instead, MYC forms a heterodimer with the bHLH-LZ protein MAX, and this heterodimer binds to the G-box sequence and activates cell proliferation. Interestingly, MAX can form homodimers and additionally interacts with several related bHLH proteins including MAD1 [248] and MNT [249]. While the MYC/MAX heterodimerization results in transcriptional activation, interaction of MAX with MAD1 or MNT leads to transcriptional repression through recruitment of histone deacetylase complexes [174].

Several plant bHLH-LZ proteins have been identified and grouped into families IVb and IVc [40]. In addition, two members of family IIIb have been functionally characterized and shown to contain a LZ-like structure C-terminal to the bHLH domain. ICE1 (INDUCER OF CBF EXPRESSION 1, AtbHLH116) was identified as a MYC-like bHLH-LZ-like transcription factor which regulates the transcription of CBF in the cold [125]. Kanaoka et al. recently identified two MYC-like transcription factors involved in stomatal differentiation and found that one of these regulators, SCREAM1, corresponds to ICE1 [127]. ICE1 binds to several E-box sequences in the CBF3 promoter, and both ICE1/SCRM1 and SCRM2 (AtbHLH033) heterodimerize with the regulators of stomatal differentiation, SPEECHLESS, MUTE and FAMA [125, 127].

Structure analysis of the region adjacent to the bHLH domain of maize R revealed a similar leucine zipper-like (LZ-like) structure (Kong et al. unpublished) (Figure 4.1B). Other domains in R have been studied and found to be important for anthocyanin pigmentation. The N-terminal MIR domain interacts with the R2R3-MYB domain protein C1 [41], the middle region interacts with the WD40-repeat protein PAC1 (Oh and Grotewold, unpublished observation) and the bHLH domain interacts with the ENT/AGENET domain containing protein RIF1 [200]. In addition, R is able to form homodimers via the C-terminal ACT domain (Chapter 2) [250] and this domain is essential for pigment formation. These findings imply that R serves as a docking platform for many proteins and that R is a main player in the transcriptional complex formed on the promoter of the anthocyanin biosynthetic genes. Interestingly, no homodimerization via the bHLH domain has been determined to date and no other bHLH protein has been shown to interact with R. Since dimerization through the bHLH domain is a prerequisite for DNA-binding, it is not surprising that DNA-binding has not been reported for R, despite the fact that the basic DNA-binding domain contains the typical H5-E9-R13 (H-E-R) motif which distinguishes DNA-binding proteins from non-binders.

Recent findings from Dr. Ling Yuan’s laboratory at the University of Kentucky form the basis of the research described in this chapter and are summarized in Figure 4.1. The main question I will address is whether R homo- or heterodimerizes through the bHLH domain and binds to cis-regulatory elements in the anthocyanin biosynthetic promoters in vivo. In addition, I am interested in how the transcriptional complex on these structural genes is formed. As we described previously, the bHLH domain of R (amino acids 411-462) is not sufficient for homodimerization [250]. When this region was extended to amino acid 470 or 478, and tested for interaction in yeast by fusion to the GAL4 AD (GAL4AD) or GAL4 DBD (GAL4DBD), growth on synthetic medium lacking LEU, TRP, HIS and ADE (SC-LEU-TRP-HIS-ADE) indicated that it mediates homodimerization (Kong et al. unpublished, Fig. 4.1A). Structure prediction analysis of the region adjacent to the bHLH domain revealed a LZ-like structure, comparable to the LZ found in human MAX (Fig. 4.1B). EMSA, performed using an E. coli expressed and purified His-tagged truncated protein (His6-R411-478), showed that the bHLH domain of R is able to bind to the G-box sequence CACGTG in vitro (Fig. 4.1C), but did not bind to a mutated E-box sequence CAATTG. Interestingly, neither R411-610 (Fig. 4.1C), R1-478 nor full-length E. coli expressed His-tagged R protein bound the G-box probe. Deletion and domain swapping experiments of the ACT domain suggest that dimerization via the ACT domain inhibits DNA-binding (Fig. 4.1C). In addition, the presence of the N-terminus seems to inhibit DNA-binding in vitro. One explanation is that R can bind to the G-box when dimerizing via the bHLH, but not when it forms a homodimer through the ACT domain, and that the N-terminus (MIR) has to undergo structural changes in order for DNA-binding to occur.

I show here that R is able to homodimerize in yeast when bound to C1 via the N- terminal MIR domain, implying that C1 likely induces structural changes to R, which make dimerization possible. It is known that C1 and R form a complex on the anthocyanin biosynthetic promoters and that RIF1 is part of that complex on the A1 promoter (Chapter 3) [200]. I determined that RIF1 can bind to the bHLH monomer but not to the dimer, suggesting that, when R is not bound to DNA, it exists as a bHLH monomer. Furthermore, I show that the ACT domain at the C-terminus of R is important for binding to the A1 promoter as well as for binding to promoters containing an E-box sequence, i.e. Bz1 and C2, two structural genes in the anthocyanin biosynthetic pathway.


4.2 Materials and Methods

4.2.1 Plasmids used in transient expression experiments

p35S::R (DP471), p35S::R-GFP, p35S::C1 (DP665), p35S::RD12, p35S::RΔ532-560-GFP, pA1::Luc (ZO11), p35S::Renilla (pHTT672) and p35S::BAR (DP611) were described in Chapters 2 and 3. The G-box::Luc construct contains five CACGTG (G-box) repeats separated by 3 nucleotides, was cloned into the EcoRI/BamHI sites of the pGL3 vector (Promega), and was kindly provided by Dr. Yuan (UK). In addition, we received the p35S::RL461A construct from Dr. Yuan’s lab.

To design the C2 promoter::Luc fusion constructs (pC2::Luc), I amplified the region -1249 to +10 relative to the TSS or the region -223 bp to +10 bp by PCR and cloned it into the ZO11 vector, from which I had deleted the A1 promoter sequence by digestion with BamHI and KpnI. In the Bz1C1-BS*::Luc construct, the MYB-BS (ACCTAAA) was replaced by a PvuI site (ACCCGATCG); this construct was prepared by George Heine, a former PhD student in the Grotewold laboratory, by site-directed mutagenesis using the Bz1::Luc plasmid as a template.

4.2.2 Transient Expression Assays

Transient expression assays in cultured maize BMS cells were performed as described in Chapters 2 and 3. Luciferase assays were performed in a 96-well plate using the Dual Luciferase assay kit from Promega and a Centro LB960 luminometer (Berthold Technologies).
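The normalization behind the reported fold activation values can be sketched as follows. This is only an illustration of the arithmetic described in the figure legends (reporter signal divided by the normalization control, then expressed relative to the reporter-only treatment); the function names and all readings are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def normalized_ratios(firefly, control):
    """Divide each firefly luciferase reading by the normalization-control
    reading (e.g. Renilla) from the same well."""
    if len(firefly) != len(control):
        raise ValueError("each firefly reading needs a matching control reading")
    return [f / c for f, c in zip(firefly, control)]


def fold_activation(treatment, reporter_only):
    """Mean normalized signal of a treatment relative to the mean normalized
    signal of the reporter construct without activator."""
    return mean(treatment) / mean(reporter_only)


# Hypothetical triplicate readings (arbitrary luminometer units).
reporter_only = normalized_ratios([100.0, 120.0, 110.0], [1000.0, 1000.0, 1000.0])
c1_plus_r = normalized_ratios([2000.0, 2400.0, 2200.0], [1000.0, 1200.0, 1100.0])

fold = fold_activation(c1_plus_r, reporter_only)
spread = stdev(c1_plus_r)  # sample standard deviation of the triplicate
print(f"fold activation: {fold:.1f}, SD of normalized treatment: {spread:.2f}")
```

Normalizing each well before averaging keeps differences in transformation efficiency between bombardments from inflating the spread of the triplicate.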


4.2.3 Protoplast Transformation

Genes of interest (C1, R, RΔ532-560) were cloned into a Gateway vector containing a 35S CaMV promoter and GFP as a C-terminal fusion in a pBluescript backbone. The p35S::sGFP vector, used as a control, contains a synthetic GFP gene with the S65T mutation. Both vectors were a kind gift of Dr. J.C. Jang’s laboratory (OSU). Eleven leaves of 12 to 14 day old etiolated maize seedlings from B73 x MO17 hybrid seeds were cut with a razor blade and digested in 10 ml K3 medium (13.7 g sucrose, 25 mg xylose, 50 mg MES and 310 mg Gamborg’s B-5 medium, Caisson lab Inc., in 100 ml dH2O; pH 5.6-5.8), containing 5% cellulase and 0.3% macerozyme (both from Karlan Research Products Corporation, AZ, USA). Tissue was vacuum-infiltrated for 20 min before digestion at 40 x g in the dark for 2.5-3 h. Digested tissue was then rotated at 80 rpm for 30 min to release the protoplasts. Protoplasts were filtered through a mesh, pore size 35 µm, spun down at 1500 rpm for 5 min and washed twice with 10 ml W5 medium (154 mM NaCl, 125 mM CaCl2, 5 mM KCl and 2 mM MES, pH 5.7). Protoplasts were resuspended in 1-2 ml suspension medium (0.4 M mannitol, 20 mM CaCl2, 5 mM MES, pH 5.7) and inspected under the light microscope at 20-40x magnification. Protoplasts were transformed by mixing 10 µg DNA, 200 µl suspended protoplasts and 220 µl 40% PEG solution (40% w/v PEG, 0.4 M mannitol, 100 mM CaCl2, pH 7.0) and incubated for 15-20 min at RT. After incubation, 700 µl W5 medium was added, mixed and centrifuged at 2000 x g in a microcentrifuge. Pelleted protoplasts were resuspended in 700 µl W5 medium and stored O/N at RT in the dark. After 12-16 h, GFP fluorescence was observed using a fluorescence microscope and efficiency of transformation was calculated.
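As a rough check on the transformation mix, the final PEG concentration can be estimated from the volumes given above. The sketch below assumes that the DNA is added in a negligible volume, which is not stated in the protocol; the helper name is hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration of a component after dilution into the total volume
    (simple C1 * V1 = C2 * V2 arithmetic)."""
    return stock_conc * stock_volume_ul / total_volume_ul


protoplasts_ul = 200.0    # suspended protoplasts
peg_solution_ul = 220.0   # 40% (w/v) PEG solution
dna_ul = 0.0              # assumption: DNA volume is negligible (not stated)

total_ul = protoplasts_ul + peg_solution_ul + dna_ul
peg_final_percent = final_concentration(40.0, peg_solution_ul, total_ul)
dna_ug_per_ul = 10.0 / total_ul  # 10 µg DNA in the whole mix

print(f"final PEG: {peg_final_percent:.1f}% (w/v)")
print(f"DNA: {dna_ug_per_ul * 1000:.1f} ng/µl")
```

Under this assumption the protoplasts are exposed to roughly 21% (w/v) PEG during the incubation; a larger DNA volume would lower that value accordingly.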


4.2.4 Chromatin-Immunoprecipitation

Transformed protoplasts were cross-linked with 1% v/v formaldehyde in W5 medium under vacuum for 20 min. Glycine was added to a final concentration of 0.1 M to stop the cross-linking. Protoplasts were washed twice with W5 medium and chromatin immunoprecipitation (ChIP) was performed as described in Chapter 3. One µl ChIPed DNA was used for semi-quantitative PCR or Q-PCR (performed at the PMGF facility at OSU) using the following primers (sequences can be found in Table S2):

For synthetic 5xG-box::Luc amplification: AF-G-box-Luc-F and AF-G-box-Luc-R

For A1 promoter amplification: KM_A1-A6 and KM_A1-B7

For Bz1 promoter amplification: KM_Bz1-A1 and KM_Bz1-B1

For Copia amplification: Copia-F and Copia-R

4.2.5 Protein Expression and Purification

The pHis6-R411-478 construct was kindly provided by Dr. Yuan (Univ. of Kentucky, Lexington, KY). For protein expression and purification, 500 ml LB plus 500 µl of a 50 µg/ml stock solution of kanamycin were inoculated with 1 ml O/N culture of His6-R411-478, grown to OD600=0.5, induced with 1 mM IPTG and incubated for 18 h at 18°C. Bacterial cells were lysed in sonication buffer (300 mM NaCl, 50 mM sodium phosphate, pH 7, PMSF, E. coli proteinase inhibitor cocktail from Sigma) by sonication using a microtip probe followed by incubation with 7.5 U lysozyme (Novagen) for 20 min at RT. The lysate was centrifuged at 4000 x g for 15 min and filtered through one layer of Miracloth, and the filtrate was incubated with equilibrated NTA-Ni2+-beads (Qiagen) for 2 h at 4°C. Beads were spun down at 3000 x g for 5 min and washed 3x with wash buffer (300 mM NaCl, 50 mM NaPi, pH 7.0, 10 mM imidazole). Purified protein was eluted with 250 mM imidazole in wash buffer and dialyzed against E-box buffer (20 mM HEPES, 50 mM KCl, 0.5 mM EDTA, 5% glycerol, 1 mM DTT) using a dialysis cassette (Promega). Purified protein was quantified using a BCA protein quantification kit (Promega) and run on a 15% PAGE.

4.2.6 Electrophoretic-Mobility-Shift-Assay (EMSA)

Fifty picomolar of the C2-, Bz1-, and synthetic G-box promoter oligos were end-labeled with γ32P-ATP (activity >6,000 Ci/mmol) and T4 PNK (Polynucleotide Kinase) and annealed to an equimolar amount of the non-radioactive complementary strand to produce double-stranded probes for EMSA. The radioactively labeled double-stranded probes were then separated and purified by 12% PAGE. The DNA-binding reactions were carried out in 10 mM Tris-HCl, pH 7.5, 50 mM KCl, 1 mM DTT, 5% glycerol, and 100 ng poly(dI:dC) at a final volume of 20 µl. Purified proteins (100 ng) were incubated with 0.25-0.5 nM DNA probe (1 x 10^5 cpm, approximately 0.01 µCi) on ice for 30 min in the presence of 3 µg BSA and 1 mM DTT. The DNA-protein complexes were resolved on a 6% nondenaturing PAGE run at 200 V for 30-35 min in 0.5x TBE buffer. The dried gel was then subjected to autoradiography at -70°C. The following oligomers were used in EMSA experiments; their sequences can be found in the Appendix Table S2.

Bz1 promoter: AF-Bz1prom-EMSA-F and AF-Bz1prom-EMSA-R

C2 promoter: AF-C2prom-EMSA-F and AF-C2-EMSA-R3

G-box: G-box_F and G-box_R
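For orientation, the amount of probe per binding reaction follows directly from the concentration and reaction volume given above. The short sketch below only restates that arithmetic; the function name is hypothetical.

```python
def probe_fmol(conc_nM, volume_ul):
    """Amount of probe in femtomoles.

    1 nM equals 1 fmol per µl, so concentration (nM) times volume (µl)
    gives femtomoles directly.
    """
    return conc_nM * volume_ul


reaction_volume_ul = 20.0  # final volume of the binding reaction
low = probe_fmol(0.25, reaction_volume_ul)
high = probe_fmol(0.5, reaction_volume_ul)
print(f"probe per reaction: {low:g}-{high:g} fmol")
```

Each 20 µl reaction therefore contains on the order of 5-10 fmol of labeled probe.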


4.2.7 Constructs Used for Protein-Protein Interaction Experiments

Full-length R yeast expression constructs were generated using the coding sequence from ATG to the STOP codon with primers LcN1 and LcC2 (Table S1) and cloned into the pADGAL4 and pBDGAL4 vectors as EcoRI and SalI fragments. GAL4DBD-C1MYB and GAL4DBD-R411-462 were generated by Erich Grotewold by cloning amino acids 1-117 of C1 or amino acids 411-462 of R into pBDGAL4. GAL4AD-R411-610 and GAL4DBD-R411-610 were generated using primers LcN3 and LcN11 (Table S1) and were cloned into pADGAL4 or pBDGAL4 as EcoRI and SalI fragments. The GPD1::C1 construct was generated by Dr. Hernandez, by cloning the CDS of C1 into the yeast expression vector YEplac112 [251] under the control of the yeast GPD1 promoter. The GAL4DBD-PAC1 construct was generated by Choon Seok Oh, a former student in the lab, by cloning the cDNA of PAC1 into the pBDGAL4 vector. GAL4-RIF1AD was isolated from the cDNA library used in yeast two-hybrid screens (Chapters 3 and 5). The constructs GAL4DBD-R411-478 and GAL4DBD-R411-478-L461A were a gift of Dr. Yuan’s lab (UK).

4.2.8 Protein-Protein Interaction Experiments

Yeast two-hybrid studies were performed as described in Chapter 2. The yeast strain used was pJ69.4a, which has the following genotype: MATalpha trp1-901 leu2-3,112 ura3-52 his3-200 gal4Δ gal80Δ LYS2::GAL1-HIS3 GAL2-ADE2 met2::GAL7-lacZ [183]. For yeast three-hybrid experiments, I first transformed GAL4AD-R1-610 and GAL4DBD-R411-610, GAL4AD-R1-610 and GAL4DBD-PAC1, or, as a control, GAL4AD-FL-R and pBDGAL4 into pJ69-4a. These yeast strains were then grown in liquid YEPD medium O/N at 30°C until OD600 = 1.5-2. Yeast transformation was then carried out as described in Chapter 2 and transformants were selected on SC-LEU-TRP. Colonies from SC-LEU-TRP were then streaked out on selective medium SC-LEU-TRP-HIS-ADE to determine interaction.

4.3 Results

4.3.1 The ACT Domain is Important for A1 and G-box Activation and Binding

As I established in Chapter 2, the ACT domain at the C-terminus of R is necessary for regulation of anthocyanin pigmentation in cultured maize BMS cells and is important for the activation of the A1 anthocyanin biosynthetic gene (Fig. 4.2A). To establish the function of the ACT domain in regulating the A1 promoter in more detail, I transformed p35S::C1 and p35S::R-GFP (C1+R), p35S::C1 and p35S::RΔ532-560-GFP (C1+RnoAct-GFP) or p35S::GFP into maize protoplasts. After establishing by fluorescence microscopy that the GFP fusion constructs are expressed (Fig. 4.2B), I performed ChIP to determine binding to the A1 promoter. As shown previously, R-GFP binds to the A1 promoter in the presence of C1 (Fig. 4.5C) [200]. This recruitment of R to the A1 promoter is probably indirect (by piggy-back binding, Fig. 1.2A) through C1, since the A1 promoter does not contain an E-box sequence in the 219 bp upstream of the TSS that are sufficient for transcriptional activation. Interestingly, the deletion of amino acids 532 to 560, which disrupts homodimerization via the ACT domain, inhibits binding of C1 and R to the A1 promoter (Fig. 4.5C). These results imply that the ACT domain of R is important for activation of A1 as well as for binding of R to A1 in the presence of C1.

Based on the results from Dr. Yuan’s laboratory indicating that the bHLH-LZ-like domain of R can bind to a G-box sequence in vitro, I wanted to determine whether R can activate a promoter::reporter construct containing a synthetic G-box fused to luciferase. p35S::R or p35S::R-GFP (R or R-GFP) in the presence or absence of p35S::C1 (C1) can activate luciferase expression from a promoter containing five G-box elements (Fig. 4.3A), which suggests that R can activate transcription from a G-box containing promoter without the need for C1 (Fig. 4.3B). When I transiently expressed two mutants of R that cannot homodimerize via the bHLH domain (RD12 and R-L461A), no pG-box::Luc activation was observed (Fig. 4.3B). Furthermore, when I tested the RΔ532-560 mutant on the same pG-box::Luc construct, I observed a 5- or 3-fold reduction in activation when compared to R or C1 and R, respectively (Fig. 4.3B). When considered together with the results by Dr. Yuan’s lab that dimerization through the ACT domain inhibits G-box binding via the bHLH domain in vitro, these results suggest that the ACT domain is important for R to activate G-box-containing genes. Moreover, this domain might be important for activation of genes that do not require C1 for binding. The decrease in activation observed when the ACT domain was deleted might be due to the inability of R to interact with an unknown protein or to homodimerize through the ACT domain. It may indicate that there is a difference in C1/R complex assembly depending on the presence or absence of cis-regulatory elements in the promoter.


4.3.2. Promoter Analysis of Anthocyanin Biosynthetic Genes

To establish if the C1/R regulatory complex is tethered to the DNA in similar ways on all anthocyanin biosynthetic promoters, I examined the regulatory regions upstream of the TSS of four additional anthocyanin biosynthetic genes: C2, A2, Bz1 and Bz2. As described previously, ≤ 300 bp upstream of the TSS are sufficient for transcriptional activation of A1, A2, Bz1 and Bz2 in maize cultured cells [68, 112-113, 252]. For C2, CHI and F3H, the promoter region has not been experimentally examined to date and all cis-regulatory elements shown for the C2 promoter are predicted (Fig. 4.4A). In this chapter, I experimentally dissected the C2 and the Bz1 promoters.

4.3.2.1 Analysis of C2

C2 encodes CHS (CHALCONE SYNTHASE), which catalyzes the first step in the anthocyanin biosynthetic pathway. To determine which part of the region upstream of the predicted TSS is sufficient for transcriptional activation, I fused the region -1248 to +10 of the predicted regulatory region to luciferase and tested activation by C1 and R in maize cultured BMS cells. Figure 4.4B shows that this region is sufficient for activation by C1 and R, but not by C1 or R alone. Next, I tested the 223 base pairs upstream of the TSS. Figure 4.4B shows that 223 bp are sufficient for activation by C1 and R, but not by C1 or R alone. The region 223 bp upstream of the TSS contains three E-box sequences (2x CACGTG, 1x CAACTG). To test if R can bind to the C2 promoter, I labeled a double-stranded DNA oligo covering the region -97 bp to -55 bp upstream of the TSS with γ32P-ATP. In addition to two bHLH binding sites, this region also contains a putative MYB-binding site (ACCTAACCC). EMSA using E. coli purified His6-R411-478 protein shows that R can bind to the C2 regulatory region in vitro (Fig. 4.4C) in the absence of C1. This implies that R can bind to an E-box sequence in the C2 promoter directly but needs C1 to activate expression of the C2 gene. Furthermore, these results suggest a mechanism for the regulation of C2 that is significantly different from that for A1.
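The kind of motif search used to predict bHLH binding sites in a promoter can be illustrated with a short script. This is a sketch, not the procedure used in this study: it scans one strand for the general E-box consensus CANNTG and flags the palindromic G-box CACGTG. The example sequence is hypothetical and is not the C2 promoter.

```python
import re

# General E-box consensus CANNTG; the lookahead also reports overlapping sites.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")
GBOX = "CACGTG"


def find_eboxes(sequence):
    """Return (0-based start, motif) for every CANNTG match on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in EBOX_PATTERN.finditer(seq)]


# Hypothetical test sequence containing one G-box and one CAACTG E-box.
example = "TTCACGTGAACAACTGTT"
for start, motif in find_eboxes(example):
    kind = "G-box" if motif == GBOX else "E-box"
    print(f"{kind} {motif} at position {start}")
```

Because the consensus has CA at one end and TG at the other, the reverse complement of any CANNTG site also matches CANNTG at the same coordinates, so scanning a single strand is sufficient to locate all sites.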

4.3.2.2 Analysis of Bz1

Next, I examined the Bz1 promoter. The Bz1 structural gene encodes a UDP glucose:flavonol 3-O-glucosyltransferase, which catalyzes one of the last steps in the anthocyanin biosynthetic pathway [253]. Two MYB-BSs have been predicted and have been tested for their importance for Bz1 promoter expression [113, 254]. Since C1 can bind to both MYB-BSs in vitro [254], I will refer here to both sites as C1-BSs. The results obtained by two different investigators are somewhat different. Roth and collaborators determined that when the C1-BS TAACTG (-71 to -66) was mutated to GCTAGC, the expression of pBz1::Luc in maize embryos lacking C1 and R decreased to 10% compared to wild-type pBz1::Luc [113]. Another study, however, showed that the same mutation had no effect on pBz1::Luc activity [254]. Yet, this study showed that when an additional C1-BS at -92 to -87 was mutated (ACCTAA to GCTAGC, Fig. 4.5A), a mild reduction in expression was observed (88% compared to 100% wild-type pBz1::Luc). Moreover, the study by Roth and collaborators found that the region between -90 and -80 is necessary for Bz1 promoter expression (89% reduction when this region was deleted) but that this expression might be independent of C1 and R [113]. Since these results are not consistent, I wanted to investigate the function of the C1-BS at position -92 to -87 bp in BMS cells. I took advantage of a pBz1::Luc mutant construct made by Dr. George Heine. This mutant contains a PvuI restriction enzyme site at position -90 to -85, which replaces the C1-BS almost completely. When I bombarded this construct together with C1 and R into maize BMS cells and assayed the luciferase activity, I determined a reduction in expression of 90% compared to the wild-type Bz1 promoter (Fig. 4.5B). These data are consistent with findings by Roth et al. [109] and suggest that the region -90 to -84 is important for Bz1 promoter expression and that C1 and R are required.

In addition to the two C1-BSs, the Bz1 promoter contains a MYC-BS (-60 to -52), in this study referred to as E-box (Fig. 4.5A, GGCAGGTGC). This E-box has been shown to be important for Bz1 promoter activity in transient expression studies [109, 251]. I wanted to test if R can bind to the E-box in the Bz1 promoter. I labeled an oligo covering -93 bp to -50 bp upstream of the TSS as described in Materials and Methods and ran EMSA using E. coli purified His6-R411-478. R411-478 can bind to the Bz1 promoter in vitro (Fig. 4.4C, lane 6). In addition, I transformed maize protoplasts with p35S::C1 and p35S::R-GFP, or with p35S::C1 and p35S::RΔ532-560-GFP, followed by ChIP, and tested binding to the Bz1 promoter. Figure 4.5C shows that p35S::C1 and p35S::R-GFP can bind to the Bz1 promoter. Interestingly, when dimerization through the ACT domain is abolished, binding to the Bz1 promoter seems enhanced.

These data suggest that R can bind directly to the Bz1 promoter, possibly via the

E-box sequence, but needs C1 to activate expression of Bz1 in vivo. Preliminary results from ChIP experiments testing binding of FL-R to the Bz1 promoter in maize protoplasts do not show binding of R in the absence of C1. Together with the ChIP data presented

here, this suggests that C1 might be necessary for binding of R to the Bz1 promoter.

However, this observation needs to be investigated in more detail.

One interesting question which I need to address is if RIF1 is recruited to the Bz1 promoter. As mentioned in Chapter 4, RIF1 interacts with the R monomer but not with the bHLH dimer. Therefore, and different from the A1 promoter, I do not expect RIF1 to be part of the complex on the Bz1 promoter. The increase in binding in the absence of the

ACT domain might be due to a stabilized R homodimer via the bHLH domain which in turn leads to increased affinity for DNA. This seems to be a different mechanism from A1 activation, where R does not bind to DNA directly but rather through C1 and this complex activates transcription of A1.

4.3.3 Full-length R Homodimerizes in the Presence of C1

So far, we used a truncated R protein to test dimerization and DNA-binding in vitro. To establish if FL R can homodimerize, I fused the coding region of R to the GAL4 AD (GAL4AD-R1-610) and amino acids 411-610 of R to the GAL4 DBD (GAL4DBD-R411-610). I could not use the FL R construct because it self-activates in yeast when fused to the GAL4DBD due to the presence of an activation domain in the middle region of R. Growth on synthetic medium SC-LEU-TRP indicated that the yeast cells had taken up both plasmids, but lack of growth on synthetic medium SC-LEU-TRP-HIS or SC-LEU-TRP-HIS-ADE indicated that R does not dimerize when the N-terminal domain is present. Both constructs are functional in yeast, since GAL4AD-R1-610 interacts with GAL4DBD-C1MYB and GAL4DBD-R411-610 interacts with GAL4AD-R411-610 (Table 4.1).

To investigate whether C1 might be required for dimerization of R, I performed a yeast three-hybrid experiment. The yeast strain pJ69-4a [183] containing GAL4AD-R1-610 and GAL4DBD-R411-610 was transformed with the full-length cDNA of C1 driven by the yeast GPD1 promoter (pGPD1::C1). The pGPD1::C1 construct does not activate transcription on its own, as determined by Dr. Hernandez (see the dissertation of Marcela Hernandez). When all three plasmids were transformed into the yeast strain, I detected growth on SC-LEU-TRP-HIS-ADE, indicating interaction between FL-R, R411-610 and C1 (Table 4.1). As a negative control, I transformed pGPD1::C1 into a yeast strain containing GAL4AD-R1-610 and the empty pBDGAL4 vector, and no growth on selective media was observed (Table 4.1). This indicates that full-length R dimerizes in yeast in the presence of C1.

Preliminary results also suggest that C1 is required for interaction of GAL4AD-R1-610 with PAC1 when fused to the GAL4 DBD (GAL4DBD-PAC1). PAC1 is a maize WD40 protein, which interacts with the acidic domain of R located N-terminal to the bHLH domain and is required for anthocyanin pigmentation [104]. When GAL4AD-R1-610 and GAL4DBD-PAC1 were transformed into pJ69-4a, no interaction was observed. When I transformed pGPD1::C1 into the pJ69-4a yeast strain containing GAL4AD-FL-R and GAL4DBD-PAC1, yeast growth on selective media indicated that interaction takes place.

The data obtained here suggest that R can homodimerize in the presence of C1.

This might be due to a structural change in R induced by C1. However, preliminary data from our lab (by former student MinGab Kim) suggest that C1 might affect the stability of R. This possibility will be tested by extracting proteins from the transformed yeast cells described above and analyzing them by Western blotting using GAL4 AD or GAL4 DBD antibodies.

4.3.4 The bHLH Mediated Dimerization Inhibits Dimerization with RIF1

I identified the R-interacting factor RIF1 in a yeast two-hybrid screen using

GAL4DBD-R411-610 as bait, and determined that the bHLH domain (amino acids 411 to

462) is sufficient for this interaction [200]. However, when I tested the extended bHLH region fused to GAL4 DBD (GAL4DBD-R411-478), which is able to homodimerize in yeast and bind to DNA in vitro, the interaction with GAL4AD-RIF1 was abolished (Table 4.1).

To determine if there is a correlation between homodimerization via the bHLH domain and heterodimerization with RIF1, I tested several mutants in the bHLH domain which cannot homodimerize (GAL4DBD-R411-478-L461A, GAL4DBD-R411-478-RD12). Interestingly, heterodimerization with RIF1 was restored in all bHLH dimerization mutants tested

(Table 4.1), which suggests that RIF1 binds to a bHLH monomer and that a bHLH homodimer blocks the interaction of RIF1 with R. Furthermore, this implies that R411-610 dimerizes via the ACT domain but not via the bHLH domain.

4.4 Discussion

The finding by Dr. Yuan’s lab that R contains an LZ-like structure at the C-terminus of the bHLH domain, which can mediate homodimerization, opens a new door for the functional analysis of the bHLH protein R and for the analysis of combinatorial gene expression in plants. R was the first plant bHLH protein described and has since been used as a reference in many studies. The extended bHLH-LZ-like domain of R

can bind to a synthetic G-box sequence in vitro as determined by EMSA by Dr. Yuan’s lab. SELEX experiments (performed by Dr. Chai, a postdoc in the Grotewold laboratory) using a His6-tagged R411-478 fusion protein and a 26 nt random sequence established

CACGTG as the most abundant sequence to which R binds in vitro. In vivo, R can activate a synthetic (G-box)5::Luc promoter::reporter construct in the presence or absence of C1. However, when maize protoplasts were transformed with R-GFP followed by

ChIP experiments, no direct binding of R-GFP to the G-box was observed (Fig. 4.5C).
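
To make the motif terminology concrete, the following minimal Python sketch scans a sequence for E-boxes (consensus CANNTG) and flags those that are G-boxes (CACGTG). It is only an illustration: the function name is hypothetical, only the strand as written is scanned, and it is not part of the analysis performed in this work. The two example sequences are the synthetic G-box oligonucleotide (Fig. 4.1C) and the Bz1 E-box region (Fig. 4.5A).

```python
import re

# E-box consensus is CANNTG; the G-box CACGTG is one specific E-box.
# The lookahead lets overlapping matches be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
GBOX = "CACGTG"


def find_eboxes(seq):
    """Return (0-based position, motif, is_gbox) for every E-box in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(1) == GBOX)
            for m in EBOX.finditer(seq)]


gbox_probe = "CGTTCCCCACGTGCTTCTCC"  # synthetic G-box oligonucleotide
bz1_ebox = "GGCAGGTGC"               # E-box region of the Bz1 promoter

print(find_eboxes(gbox_probe))  # [(7, 'CACGTG', True)]
print(find_eboxes(bz1_ebox))    # [(2, 'CAGGTG', False)]
```

The Bz1 element is an E-box but not a canonical G-box, which is the distinction drawn in the text between the synthetic promoter and the endogenous Bz1 promoter.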

Several explanations exist for why R might not form a homodimer, or why homodimerization in planta might not be easily detectable. First, an E. coli expressed

GST-MYCbHLH-LZ construct binds to the G-box sequence in vitro but in vivo, MYC prefers heterodimerization with the bHLH-LZ protein MAX. However, MYC homodimerizes under certain conditions, such as when protein concentration rises above a certain threshold or when aberrant protein modifications occur [255]. It has also been suggested that a naturally occurring variant encoding a minimal MYC-bHLH-LZ protein exists in vivo which is able to homodimerize under physiological conditions [255].

Furthermore, homodimerization of the bHLH-LZ MAX was hard to determine since phosphorylation in the N-terminus of MAX inhibited DNA-binding. It is possible that R functions in a similar way and some of the possibilities described above will be examined further. I might not see binding of R in vivo because the heterodimerization-partner is not expressed in leaves of 12 day old etiolated maize seedlings used for ChIP experiments.

Another possibility is that the synthetic G-box promoter I used does not contain a C1-BS. It is possible that FL R physically binds to DNA only in the presence of another protein such as C1. Moreover, plant and animal bHLH proteins have been shown to activate other bHLH or bHLH-LZ proteins, which in turn bind to the G-box sequence.

Human c-MYC, for example, can activate the bHLH-LZ protein AP4, which in turn binds as a homodimer to specific recognition sequences located on the promoter of one of its target genes, p21 [256]. I identified two bHLH-containing R-interacting factors in a yeast two-hybrid screen (Chapter 5) and it will be interesting to determine if they bind DNA as dimers with R or as homodimers. The activation by R of the G-box containing promoter (Fig. 4.3) implies that R might bind to DNA, but it could also be due to an indirect effect, meaning R could activate a factor X which then binds to the G-box.

The A1 promoter contains two MYB-BSs, to which C1 binds with low affinity in the absence of R, but has no E-box sequence in close proximity to the TSS. It also contains an anthocyanin-regulatory-element (ARE), which is important for transcriptional activation by C1 and R, but DNA-binding studies of R to this element have so far been unsuccessful. It is not without precedent that R regulates expression of A1 in the absence of a bHLH binding site. In vivo binding studies have placed human c-MYC on promoters without MYC consensus binding sequences, possibly through recruitment of interacting partners [256]. Consistent with this, our results indicate that R activates A1 through recruitment by C1.

One question that remains to be answered is the role of the ACT domain in DNA-binding to A1, Bz1 and possibly other G-box containing promoters in the anthocyanin biosynthetic pathway. I determined previously that the ACT domain at the C-terminus of R is important for activation of A1 [250], and I show here that it is also important for binding to A1 in vivo. The absence of dimerization of the ACT domain inhibits binding of C1 and R to A1 (Figures 4.1 and 4.2). The results shown here suggest that R dimerizes via the ACT domain on the A1 promoter, which might be necessary for activation of the endogenous A1 gene. Furthermore, the finding that RIF1 interacts with the R-bHLH monomer (Table 4.1) and is part of the complex formed on the A1 promoter [200] highlights that R does not form a homodimer via the bHLH domain, but rather dimerizes via the ACT domain on A1.

Compared to the function of the ACT domain on A1, the ACT domain appears to play an inhibitory role on G-box containing promoters. On a synthetic G-box containing oligomer, the dimerization of the ACT domain inhibits DNA-binding (Fig. 4.1). Furthermore, in the absence of the ACT domain, binding of R to the

Bz1 promoter in vivo seems enhanced (in the presence of C1, Fig. 4.5C).

My model suggests that since R dimerizes via the ACT domain, it is structurally not able to homodimerize via the bHLH domain, and therefore DNA-binding is inhibited.

How exactly the ACT domain functions on G-box containing promoters needs to be determined. One way to explain the data presented here is that dimerization might happen between two R proteins or between R and an as yet unknown factor. As shown in Chapter

2, the ACT domain of R can dimerize with EGL3 in yeast (Fig. 2.6C). In addition, ACT domains in other bHLH proteins might be able to heterodimerize. The Arabidopsis TF

ABORTED MICROSPORES (AMS, At2g16910) belongs to group IIIa of bHLH proteins [40] and contains an ACT domain at its C-terminus, according to sequence-structure analysis I conducted using the Phyre tool box [257]. The C-terminus of AMS

(excluding the bHLH domain) interacts with both the SET and the PHD-finger domain of ASHR3, a protein involved in regulation of stamen and anther development [199].

SET and PHD-finger domains have been associated with chromatin remodeling proteins.

The PHD domain and chromodomain in the transcriptional co-factor p300 play an important role in acetylation-dependent nucleosome binding [258], and the PHD domain in the ING2 tumor suppressor recognizes the euchromatic mark tri-methylated H3K4 [259]. In addition, ICE1 interacts with the MYB-domain protein At-MYB15 via the C-terminus containing an ACT domain [260]. From these studies, I propose that it is possible that the

ACT domain is able to interact with an unknown protein, which in turn leads to structural rearrangements in R or which is necessary for transcriptional activation. Interestingly,

RIF1 contains a PHD finger-like domain and an AGENET domain, both known for involvement in chromatin remodeling. The bHLH domain of R is sufficient for interaction with RIF1, but AGENET and PHD-like domain are not sufficient for dimerization (see Chapter 3 for more details).

In addition to the ACT domain, the presence of the N-terminus also seems to inhibit DNA-binding of the bHLH-LZ-like domain of R. E. coli expressed His6-R1-524 does not bind to the G-box sequence in vitro in the absence of C1 (binding assays were done by Que Kong, Dr. Yuan’s lab). One possibility is that the N-terminus masks for example the bHLH domain

(intramolecular protein-protein interactions) and the presence of C1 or another RIF is required for structural changes making DNA-binding possible. I tested binding of

GAL4AD-R1-252 to the bHLH domain or to the ACT domain fused to the GAL4DBD in yeast and did not observe interaction. This does not rule out intramolecular protein-protein interactions in general, and other domains of R might be involved. Other possibilities, which I need to explore in more detail, are that C1 stabilizes R or that small molecules bind to R. Both the ACT domain and the N-terminal domain have structural similarities to small molecule binding domains as determined by Phyre tool box analysis

[257]. As described previously, small molecules can be involved in gene expression [261] and it will be interesting to determine if a small molecule can bind to R and play a role in

R function.

Taken together, the results presented here demonstrate a regulatory switch between two dimerization domains in R, which leads to activation of different sets of target genes. bHLH-related dimerization activates E-box containing genes, whereas ACT domain associated dimerization activates genes that do not contain E-box sequences in the promoter region. These findings highlight the dynamics of protein complexes formed on regulatory regions.

Figure 4.1: R homodimerizes via a bHLH-LZ like domain and binds to DNA. A, Diagram of R and its structural domains. When the bHLH domain was extended towards the C-terminus, this region was sufficient for homodimerization in yeast (R411-470, R411-478, R411-510). No dimerization was observed when R411-524 was tested, whereas R411-610 dimerized very well. +++ indicates very good dimerization, ++ reasonable dimerization, + weak dimerization. Figure was made based on data provided by Dr. Yuan. B, Sequence and structure analysis of the bHLH-LZ-like structure of R and comparison to human MAX, a typical bHLH-LZ protein. Leucine residues contributing to the extension of the second helix are shown in red. ZmLc = R, AtEGL3 = A. thaliana EGL3, a homolog of R, HsMyc and HsMax = H. sapiens protein MYC and MAX, two bHLH-LZ proteins shown to homodimerize. C, EMSA showing binding of His6-R411-478 and His6-R411-524 to a synthetic oligonucleotide containing a G-box sequence (5’-CGTTCCCCACGTGCTTCTCC-3’) (lanes 3, 4). His6-R411-463 does not bind to the G-box sequence (lane 2), and neither does His6-R411-610, which contains the ACT domain (lane 5). When the ACT domain of R was replaced by the ACT domain of AtMYC2 or when the ACT domain was mutated, DNA-binding did occur (compare lane 5 with lanes 6, 7 and 8). Panels B and C were provided by Dr. Yuan.

Figure 4.2: The ACT domain of R is important for activation of A1 but not for localization. A, Transient expression assay in maize BMS cells. Cells were bombarded with p35S::C1 and p35S::R-GFP or p35S::C1 and p35S::RΔ532-560-GFP and A1::Luciferase activity was measured. Shown is fold activation normalized to 35S::Renilla-Luciferase activity. Error bars represent standard deviation. The activation of C1+RΔ532-560-GFP is statistically different from activation of C1+R-GFP (P < 0.5). B, Fluorescence microscopy images of protoplasts transformed with p35S::GFP (right), p35S::C1 and p35S::R-GFP (left) or p35S::C1 and p35S::RΔ532-560-GFP (middle). Scale bar corresponds to 5 µm.

Figure 4.3: R can activate from a synthetic G-box containing promoter in the absence of C1. A, Schematic representation of the G-box::Luc construct used in transient expression assays in B. B, Activation of a synthetic pG-box::Luc promoter::reporter construct by p35S::R (R) or p35S::R-GFP (R-GFP) and several dimerization mutants (RΔ532-560-GFP, RD12 and R-L461A) in the presence and absence of p35S::C1 (C1) in maize BMS cells. Shown is percent activation compared to R-GFP or R, which is set to 100%. Luciferase values were divided by p35S::Renilla values and normalized to promoter activation only. Error bars represent standard deviation. The activation by RΔ532-560-GFP or by C1 + RΔ532-560-GFP is statistically different from those of R-GFP or C1 + R-GFP (P < 0.5), respectively.
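
The normalization described in this legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis script used for the figure: I assume that "normalized to promoter activation only" means dividing by the Luciferase/Renilla ratio of the reporter bombarded without regulators, and the function names and example readings are hypothetical.

```python
def normalized_activation(firefly, renilla, reporter_only_firefly, reporter_only_renilla):
    """Luciferase/Renilla ratio of a sample relative to the reporter-only control.

    Assumption: the reporter-only ratio serves as the baseline
    ("promoter activation only").
    """
    sample_ratio = firefly / renilla
    baseline_ratio = reporter_only_firefly / reporter_only_renilla
    return sample_ratio / baseline_ratio


def percent_of_reference(sample_activation, reference_activation):
    """Express activation as percent of a reference (e.g. R-GFP set to 100%)."""
    return 100.0 * sample_activation / reference_activation


# Made-up readings, for illustration only.
reference = normalized_activation(8000.0, 400.0, 500.0, 250.0)  # (20 / 2) = 10-fold
mutant = normalized_activation(3000.0, 500.0, 500.0, 250.0)     # (6 / 2) = 3-fold
print(percent_of_reference(mutant, reference))  # 30.0
```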

Figure 4.4: Maize flavonoid gene promoter analysis. A, Schematic representation of the promoters of the flavonoid biosynthetic genes. Predicted cis-regulatory elements are shown in striped boxes. Full boxes represent cis-regulatory elements that have been experimentally shown to be important for regulation by C1 and R. Numbers correspond to the position of the cis-regulatory elements from the TSS [68, 112-113, 252, 254]. B, Activation of two pC2::Luc promoter::reporter constructs by p35S::R (R) and p35S::C1 (C1) in maize BMS cells. C2-long corresponds to the 1248 bp region upstream of the TSS, whereas C2-short corresponds to 223 bp upstream of the TSS. All other experimental details are described in the legend of Fig. 3.1 or in Materials and Methods of Chapter 3. For normalization, luciferase values were divided by p35S::Renilla values and normalized to promoter activation only. Error bars represent standard deviation. C, EMSA showing binding of purified His6-R411-478 protein to the C2 (lane 4) and Bz1 promoter (lane 6) sequence and to the synthetic G-box sequence (lane 1). Free probes of C2, Bz1 and G-box are shown in lanes 2, 3 and 5, respectively. The sequences used as probes are shown on the right of the gel picture and the E-box sequences are boxed.

Figure 4.5: Analysis of the Bz1 promoter. A, Bz1 promoter sequence from -93 to -50 upstream of the TSS. The cis-regulatory elements are represented on top of the alignment; in blue and brown: C1-BS [254] and in pink the E-box. The mutation introduced by Lesnick is shown below the alignment (GCTAGC) [254], and the mutation introduced by our lab (CGATCGTC), with the PvuI site underlined, is shown at the very bottom. The % activation by C1 and R compared to WT Bz1 promoter activity is shown on the right. B, Activation of pBz1::Luc or pBz1*::Luc promoter::reporter construct by p35S::R (R) and p35S::C1 (C1) in maize BMS cells. Bz1* has a mutation in the C1-BS (brown) as represented in Fig. 4.5A. For normalization, luciferase values were divided by p35S::Renilla values and normalized to promoter activation only. Error bars represent standard deviation and single experiments were repeated twice. Experimental details are described in the legend of Fig. 3.1 or in Materials and Methods of Chapter 3. C, Maize protoplasts were transformed with p35S::C1 and p35S::R-GFP or p35S::RΔ532-560-GFP, and ChIP was performed as described in Materials and Methods followed by Q-PCR. Shown are the Q-PCR results on the A1 and Bz1 promoters. Copia represents the reverse transcriptase sequence of the maize copia TYI type retrotransposon (GenBank # AF398212) and was used as a negative control. Data were normalized by dividing the Q-PCR signal derived from the ChIP sample by the Q-PCR signal derived from the input sample, and % of input is shown [262].
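
The % of input normalization described in panel C amounts to a simple ratio, sketched below in Python. The function name and the `input_fraction` parameter are assumptions added for illustration (the fraction of chromatin set aside as the input sample is not stated here, so it defaults to 1.0, i.e. a plain ChIP/input ratio); the example values are made up.

```python
def percent_input(chip_signal, input_signal, input_fraction=1.0):
    """Return the ChIP Q-PCR signal as a percentage of the input signal.

    input_fraction is a hypothetical correction for the case where the input
    sample represents only part of the chromatin used in the ChIP.
    """
    if input_signal <= 0:
        raise ValueError("input_signal must be positive")
    return 100.0 * input_fraction * chip_signal / input_signal


# Made-up Q-PCR quantities, for illustration only.
print(percent_input(2.0, 50.0))  # 4.0
```

The same calculation would be applied to the negative control region (Copia), so that enrichment at A1 or Bz1 can be compared against background.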

AD-fusion    BD-fusion          +/- GPD::C1    Growth on
construct    construct                         SC-LEU-TRP-HIS-ADE

R1-610       R411-610           -              NO
R1-610       C11-172            -              NO
R411-610     R411-610           -              YES
R1-610       R411-610           +              YES
R1-610       pBDGAL4            +              NO
R1-610       PAC1               -              NO
R1-610       PAC1               +              YES

RIF1         R411-462                          YES
RIF1         R411-478                          NO
RIF1         R411-478-L461A                    YES
RIF1         R411-478-RD12                     YES

Table 4.1: Yeast two-hybrid and yeast three-hybrid results. The top 7 rows show results from testing homodimerization of R and heterodimerization of R and PAC1 in the presence or absence of C1. The bottom 4 rows show interaction of RIF1 with R monomer, but not R dimer, constructs. Growth on SC-LEU-TRP-HIS-ADE indicates interaction.

CHAPTER 5

CHARACTERIZATION OF NOVEL R-INTERACTING FACTORS

FROM MAIZE AND ARABIDOPSIS

5.1 Introduction

Combinatorial interactions between transcription factors, co-regulators and DNA are highly important for gene regulation. The anthocyanin biosynthetic pathway in maize, Arabidopsis and petunia has been used as a model system to study these processes in more detail. The regulatory complex formed on A1, one gene in the anthocyanin biosynthetic pathway in maize, has been investigated in much detail and consists of the R2R3-MYB domain protein C1, the bHLH protein R and the ENT/AGENET domain containing protein RIF1 [200]. Interaction of R with C1 or RIF1 has been determined in yeast, in vitro and in vivo [200]. Moreover, I have shown in Chapter 4 of this dissertation that R dimerizes via the ACT domain on the A1 promoter, leaving the bHLH as a monomer for interaction with RIF1 (Fig. 6.1A). However, on Bz1, another structural gene in the anthocyanin biosynthetic pathway of maize, R seems to homodimerize via the bHLH-LZ-like domain and bind to an E-box sequence in the promoter. The picture that emerges is one in which R functions as a scaffold for the protein complex and, depending on the target gene, R can adopt different conformations and interact with different partners.

Overexpression of R in wild-type Arabidopsis has pleiotropic effects, such as an increase in anthocyanin pigmentation in young seedlings, an increase in trichome numbers and a decrease in the number of root hairs [100]. It is known that R interacts with the Arabidopsis MYB-domain proteins GL1, CPC and WER when expressed under the 35S promoter [72, 263] and these interactions partly mediate the phenotypes described above. However, R likely interacts with other TFs or co-factors for regulation of the cellular processes described and possibly additional processes. Therefore, the identification of additional R partners will allow us to unveil new gene regulators that might be involved in known processes regulated by R, and may also reveal new regulatory processes.

I will show here that R interacts with three novel bHLH proteins, two from maize and one from Arabidopsis. Further functional analysis of these R-interacting partners and their contribution to R function will provide a new aspect to combinatorial regulation of gene expression in the plant kingdom.

5.2 Material and Methods

5.2.1 Yeast Two-Hybrid Assay

GAL4DBD-R411-610 and GAL4DBD-R1-152 were used as bait, and a maize cDNA library from early tassel (gifts from Robert Schmidt and Marja Timmermans) or a cDNA library from Arabidopsis green tissue (ABRC stock CD4-22) was used as prey. The bait was transformed into the yeast strain pJ69-4a and this modified strain was then transformed with the cDNA library. Fifty ml YEPD medium were inoculated with yeast cells containing the bait plasmid and grown at 30°C until OD600 reached 1.5 to 2, which corresponds to a cell number of about 2.5 x 10^8. Cells were spun down at 4000 x g for 5 min, and the pellet was washed with 25 ml dH2O and spun as above. Three ml of 0.1 M LiAc were added to the pellet, which was vortexed and incubated at 30°C for 15 min. Cells were spun down and the following was added to the pellet: 2.4 ml 50% w/v PEG 3280, 360 µl 1 M LiAc, 500 µl ss-DNA (salmon sperm DNA, boiled and cooled down on ice). Ten µg library plasmid DNA were added, the volume was adjusted to 3.6 ml with dH2O and the mix was vortexed vigorously to resuspend the cell pellet. The transformation mix was incubated for 30 min at 30°C followed by heat shock at 42°C for 30 min. The Falcon tube was inverted every 5 min for temperature equilibration. The cells were collected by centrifugation at 4000 x g for 5 min, the supernatant was discarded and the pellet was resuspended in 3 ml dH2O. Resuspended cells were plated on SC-LEU-TRP-HIS medium. In addition, 30 µl cells were plated on three SC-LEU-TRP plates to calculate the efficiency of transformation. After 4-7 days at 30°C, colonies were counted from SC-LEU-TRP plates, efficiency of transformation was calculated and colonies from SC-LEU-TRP-HIS plates were streaked out on SC-LEU-TRP-HIS and SC-LEU-TRP-HIS-ADE plates. Positive clones were re-transformed into E. coli, plasmid DNA was extracted, sequenced and tested for interaction with the bait protein. For exact numbers of colonies screened and efficiencies of transformation see Appendix Table S4.
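
The efficiency calculation mentioned above can be sketched as follows. This is a minimal illustration rather than the calculation used for Appendix Table S4: it assumes that 30 µl were plated on each of the three SC-LEU-TRP plates, that the plated aliquots are representative of the 3 ml resuspension, and that 10 µg library DNA were used. The function name and the colony counts are hypothetical.

```python
def transformation_efficiency(colony_counts, plated_volume_ul, total_volume_ul, dna_ug):
    """Estimate total transformants and transformants per µg DNA.

    colony_counts: colonies on each SC-LEU-TRP counting plate.
    plated_volume_ul: volume plated per counting plate (assumed 30 µl each).
    total_volume_ul: volume in which the transformed cells were resuspended.
    dna_ug: amount of library plasmid DNA used.
    """
    mean_colonies = sum(colony_counts) / len(colony_counts)
    total_transformants = mean_colonies * total_volume_ul / plated_volume_ul
    return total_transformants, total_transformants / dna_ug


# Made-up colony counts, for illustration only.
total, per_ug = transformation_efficiency([100, 110, 120], 30.0, 3000.0, 10.0)
print(total, per_ug)  # 11000.0 1100.0
```

The total number of transformants estimated this way corresponds to the number of library clones screened in that transformation.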

5.2.2 Antibody Production

Amino acids 194 to 317 of ZmbHLH5 were PCR amplified using primers AF-ZmbHLH5-ab-F and -R (for protein sequence see Fig. 5.1B) and cloned into the pENTR/D-TOPO vector (Invitrogen), followed by recombination into the pDEST17 vector (Invitrogen). His6-tagged ZmbHLH5(194-317) was expressed in E. coli and purified as described in Chapter 3. The purified protein (1 µg) was sent to Cocalico Biologicals Inc. for antibody production.

5.2.3 Localization Studies

Localization of GFP fusion constructs was done as described in Chapter 2.

5.3 Results

5.3.1 Characterization of ZmbHLH5

5.3.1.1 Structural and Phylogenetic Analysis

ZmbHLH5 (bHLH5) was identified in a yeast two-hybrid screen using a cDNA library from early maize tassel as prey and R411-610 fused to the GAL4DBD as bait.

ZmbHLH5 encodes a 393 amino acid-long protein, and sequence-structure analysis revealed the bHLH domain at position 119 to 170 and an LZ-like region adjacent to the bHLH domain extending the second helix to amino acid 193. Further sequence-structure analysis of the conserved region at the C-terminus of ZmbHLH5 (amino acids 314-393) using the Phyre tool box [257] established similarities to the ACT domain found in R (Figure 5.1A, B). Analysis of the N-terminus of bHLH5, however, did not identify regions similar to any known structure. A Basic Local Alignment Search Tool (BLAST) search at http://blast.ncbi.nlm.nih.gov/Blast.cgi showed two maize genes with high homology to bHLH5 (ZmbHLH5-like1, -like2, Fig. 5.1B), two genes from Sorghum bicolor, two genes from Oryza sativa (Japonica) and one gene from Vitis vinifera (Fig. 5.1B). The closest homolog in Arabidopsis is SPCH (SPEECHLESS, At5g53210, AtbHLH098) with 52% identity over the entire protein and 88% identity in the bHLH-LZ-like domain (Fig. 5.1B). SPEECHLESS, bHLH5 and Os02g15760 have been predicted to be orthologs according to http://pogs.uoregon.edu.

5.3.1.2 Protein-Protein Interactions

To determine in more detail which domain of R is sufficient for the dimerization with bHLH5, I tested in yeast interaction with several truncated R proteins. Neither the bHLH domain of R (GAL4DBD-R411-462) nor the region C-terminal to the bHLH domain

(GAL4DBD-R462-610 or GAL4DBD-R525-610) is sufficient for dimerization (Figure 5.2A).

However, when I tested the extended bHLH-LZ-like domain described in Chapter 4

(GAL4DBD-R411-478), growth on SC-LEU-TRP-HIS-ADE indicated that interaction takes place (Fig. 5.2B). Furthermore, when I tested dimerization with a mutant of R, which cannot homodimerize in yeast (GAL4DBD-LcD12411-610), no interaction was observed (Fig.

5.2A). These data indicate that bHLH5 interacts with the bHLH-LZ-like domain of R. In addition, I tested interaction of bHLH5 with other bHLH proteins from maize and

Arabidopsis as well as with ZmRIF1. bHLH5 homodimerizes weakly and interacts with

ZmRIF1 (Fig. 5.2A) and ZmRIF-2C (described below) in yeast (Fig. 5.2B). Based on yeast two-hybrid results, bHLH5 does not interact with FAMA, SPEECHLESS,

At5g65320, At5g46690, TT8361-518, GL3, GL3/EGL3434-503 or GL3/EGL3434-637 from Arabidopsis. This suggests that bHLH5 can form homo- and heterodimers via the bHLH-LZ-like domain. However, what provides the specificity for dimer formation needs to be determined.

5.3.1.3 Localization

A putative NLS (nuclear localization signal) was predicted for Os02g15760 [264], and by homology search I identified an NLS in ZmbHLH5

(RRRTGEEEEEKGSGGSAPGPAHKK, amino acids 80-104, Figure 5.1A). To test if bHLH5 localizes to the nucleus, I transformed p35S::bHLH5-GFP into N. benthamiana leaves as described in Materials and Methods. bHLH5 localizes to the nucleus in N. benthamiana (Figure 5.2B). In contrast to bHLH5-GFP, R-GFP localizes to foci within the nucleus (see Chapter 2). To determine if R and bHLH5 interact in planta and if this interaction might change the localization of bHLH5-GFP, I transformed both constructs together into N. benthamiana leaves. The presence of R does not change the localization of bHLH5-GFP (Figure 5.2B). Whether the putative NLS is responsible for the transport into the nucleus needs to be determined by mutational analysis. Taken together, bHLH5 contains a putative NLS and localizes to the nucleus.

5.3.1.4 Future Studies with ZmbHLH5

To determine the function of bHLH5 in planta, I made polyclonal antibodies against ZmbHLH5 to test protein-protein interaction (immunoprecipitation) as well as DNA-binding in vivo (ChIP and ChIP-Seq). I will perform transient expression assays with p35S::bHLH5, with and without p35S::R and p35S::C1, to determine if bHLH5 has an effect on pigment formation in maize cultured BMS cells or on the activation of any of the anthocyanin biosynthetic genes. With the help of Kenneth Frame, I initiated a TILLING request for ZmbHLH5 at Purdue University (http://genome.purdue.edu/maizetilling/index.html) and we are waiting for mutants to be identified in the TILLING screen. Once mutants have been identified, seeds will be planted and plants will be phenotypically and genotypically analyzed.

5.3.2 Characterization of RIF-2C

RIF-2C was identified in the same yeast two-hybrid screen as bHLH5 using R411-610 fused to GAL4DBD as bait. After I sequenced GAL4AD-RIF-2C, I searched available BAC clones at www.maizegdb.org. The sequence matched a region in BAC clone AC210794 and mapped to the long arm of chromosome 5. Only recently was I able to identify the full-length cDNA at www.maizesequence.org, which encodes a 731 amino acid protein. The yeast two-hybrid clone I pulled out from the maize library contained only amino acid 495 to the STOP codon. Sequence-structure analysis [257] revealed a putative HLH domain located at amino acids 516 to 568 and an ACT domain located at the C-terminus of RIF-2C (Figure 5.3A). Sequence-structure analysis of the N-terminal domain (amino acids 1-117, Fig. 5.3C) of RIF-2C revealed a GAF-like domain. This domain has been found in prokaryotes such as E. coli as well as in plants and humans. It was originally described as a common domain in cGMP-specific phosphodiesterases, adenylyl cyclases and FhlA, hence the name GAF [265]. In E. coli, the yebR protein binds 2-(N-morpholino)ethanesulfonic acid via the GAF domain and has methionine (R)-sulfoxide reductase activity [266]. In addition, it is able to homodimerize (Fig. 5.3D) [267]. Most eukaryotes with GAF domains, on the other hand, contain two GAF repeats; one important for small ligand binding and the other for dimerization. Additional functions besides small ligand binding have been predicted [268].

Interestingly, when I analyzed the N-terminus of R (amino acids 126 to 224) I found that it contains the same secondary structure as RIF-2C, which corresponds to the predicted

GAF-like fold (Fig. 5.3C). In GL3, an R homolog from Arabidopsis, amino acids 1-100 are sufficient for interaction with GL1 [97]. This suggests that this GAF-like domain is not required for interaction with the MYB partner. In fact, examination of the 3D structure of the GAF-like domain in R (Fig. 5.3D) revealed that it does not contain the dimerization interface present in E. coli yebR. However, it might contain the binding pocket for ligand binding (Fig. 5.3C, D).

The closest homologs to RIF-2C are Sb04g032060 from Sorghum bicolor and Os02g0673500 from Oryza sativa (Figure 5.4). One hypothetical protein from maize (LOC100280243) shows high homology to RIF-2C, specifically in the N-terminal 110 amino acids, which correspond to the putative GAF-like domain, and in the C-terminal 272 amino acids, which correspond to the putative HLH-like domain and an ACT domain. In Arabidopsis, At1g64625 and At2g27230 are paralogs and the closest homologs to RIF-2C. At1g64625 corresponds to bHLH157 according to [269] and to the AGRIS database (http://arabidopsis.med.ohio-state.edu/), but no experimental data are available. At2g27230 (LONESOME HIGHWAY, LHW), on the other hand, has been functionally characterized and is involved in root development; specifically, it is required to establish and maintain the normal vascular cell number and pattern in primary and lateral roots [161]. LHW is a transcriptional activator and the bHLH-like domain is sufficient for homodimerization in yeast. Interestingly, the dimerization of LHW is much stronger when the putative ACT domain is present [161], which is similar to what I have determined for R and bHLH5. In addition, a yeast two-hybrid screen has shown that At2g27230 interacts with at least 4 other bHLH proteins from subfamilies Va, Vb and XIV [161].

Protein-protein interaction studies in yeast revealed that a GAL4AD-ZmRIF-2C495-731 fusion construct interacts with GAL4DBD-R411-610, but not with GAL4DBD-R411-462, GAL4DBD-R411-478 or GAL4DBD-R462-610 (Fig. 5.3B). These results suggest that neither the bHLH domain of R nor the bHLH-LZ-like domain is sufficient for dimerization with RIF-2C and that at least part of the C-terminus of R is required for this interaction. Since the bHLH-like domain of RIF-2C differs from typical bHLH domains, it is possible that the interaction with R requires an extension of the bHLH beyond the bHLH-LZ-like domain for dimerization. These interaction studies have been done only once and need to be repeated. Based on further interaction studies in yeast, RIF-2C495-731 interacts with RIF1 (not shown) and with bHLH5 (Fig. 5.2B), but not with At5g46690 (Fig. 5.3B).

5.3.2.1 Future studies with ZmRIF-2C

Now that the full-length cDNA of ZmRIF-2C is available, it needs to be confirmed that R and RIF-2C interact in planta. Furthermore, over-expression and mutant analysis will determine the function of ZmRIF-2C in planta in the presence and absence of R.


5.3.3 Characterization of At5g46690

The yeast two-hybrid screen using the Arabidopsis green tissue library as prey revealed At5g46690 as an R-interacting factor. At5g46690 encodes a 327 amino acid protein and groups with bHLH proteins from subgroup Ia, according to Heim et al. [40]. Sequence-structure analysis identified the highly conserved bHLH domain and an additional ACT domain at the C-terminus. I tested the interaction of At5g46690 with truncated R proteins in yeast and determined that the interaction with the bHLH-LZ-like domain of R alone (GAL4DBD-R411-478) is weak and that the presence of the C-terminal domain of R stabilizes the interaction with At5g46690 (Fig. 5.3B). Interestingly, a yeast two-hybrid screen using a seedling library (ABRC stock CD4-22) as prey and GAL4DBD-FAMA110-414 as bait identified At5g46690 as a FAMA-interacting partner [270]. Overexpression of At5g46690 had a minor effect on stomata formation, similar to the fama1-1 mutation, and loss-of-function assays did not show any effect on stomatal formation. Together with the finding by Ohashi-Ito et al. that At5g46690 is broadly expressed throughout the plant, this suggests that At5g46690, in addition to promoting stomatal fate, is involved in other cellular processes and might not be a regular partner required for FAMA function [270].

Since At5g46690 interacts with FAMA [270], I wanted to test whether it can interact with other bHLH proteins involved in stomata development. I tested the interaction with SPEECHLESS in yeast, but no growth on selective media was observed, indicating that these proteins do not interact. Furthermore, since At5g46690 interacts with R, I wanted to test the interaction with R homologs from Arabidopsis. No interaction was found between At5g46690 and GL3441-63, and studies are under way to examine dimerization with EGL3 and TT8. I also tested the interaction of At5g46690 with ZmbHLH5 and ZmRIF-2C, the two R-interacting proteins from maize (Fig. 5.3B), but no interaction was observed.

Taken together, these data show that At5g46690 interacts with R in yeast via the bHLH-LZ-like domain. At5g46690 also interacts with FAMA in yeast, but a knockout of At5g46690 does not show a mutant stomata phenotype. Therefore, it will be interesting to determine the function of an R/At5g46690 (or R-homolog/At5g46690) heterodimer and whether this dimer requires, for example, a MYB protein for function.

5.4 Discussion

I have shown here that R can heterodimerize with three new bHLH proteins. I have tested the interactions in yeast; dimerization in vivo will be determined in the near future. It is interesting that ZmbHLH5 and At5g46690 are both connected to stomata development. The closest homolog to ZmbHLH5 in Arabidopsis is SPCH, a TF regulating the early cell division steps during stomata development. At5g46690 interacts with FAMA, a regulator of division and differentiation of the guard cell [271], but not with SPCH (this study). Stomata and pavement cells are arranged differently in monocots and dicots, and the asymmetric cell divisions seem to differ. In both plant types, however, an asymmetric cell division initiates the stomatal lineage, which in Arabidopsis is regulated by SPCH (for review see [271]). Mutants with defects in asymmetric divisions during stomata development have been identified in maize, but no genes have been associated with the mutations [272]. It will be interesting to see whether genes similar to those responsible for stomata development in the dicot Arabidopsis regulate stomata development in maize.


RIF-2C, the third bHLH protein identified in the yeast two-hybrid screen, shows similarity to the Arabidopsis LHW protein, which is required to establish and maintain the normal vascular cell number and pattern in primary and lateral roots [161]. LHW is expressed in the root meristem and positively regulates the size of stele cell population in vascular roots.

Whether there is a correlation between stomata patterning and root cell patterning, and whether both are regulated by bHLH proteins in maize, is an appealing question that remains to be answered. In addition, I am interested in the question of how maize R is involved in these processes. ChIP-seq data from maize protoplasts transformed with p35S::R-GFP, which are currently being analyzed, will hopefully provide more details on this issue.

I identified additional potential R-interacting factors in the yeast two-hybrid screen. The biggest challenge when performing a yeast two-hybrid screen is the identification of false positives. A table of commonly identified two-hybrid false positives has been compiled at http://fccc.edu:80/research/labs/golemis/InteractionTrapInWork. Ribosomal genes as well as yeast ADH seem to be common false positives and were also identified during my screen.

Taken together, the results presented here suggest that the bHLH domain of R heterodimerizes with other bHLH domain-containing proteins, which might tether R to target genes different from the anthocyanin biosynthetic genes.


Figure 5.1: Characterization of the bHLH protein bHLH5. A, Schematic representation of the domain structure of ZmbHLH5. ZmbHLH5 contains a highly conserved bHLH domain (pink, amino acids 119-170), an LZ-like structure extending the second helix of the bHLH (brown, amino acids 171-193) and a highly conserved C-terminus with similarity to an ACT domain (purple, amino acids 314-393). ZmbHLH5 contains a putative NLS in the N-terminal region (grey, amino acids 80-104). B, Amino acid sequence alignment of ZmbHLH5, two ZmbHLH5-like proteins from maize (GenBank accession numbers: ZmbHLH5-like 1 = NP_001152521 and ZmbHLH5-like 2 = NP_001132879), two bHLH proteins from Sorghum bicolor (GenBank accession numbers: Sb_bHLH5-like1 = XP_002453634 and Sb_bHLH5-like2 = XP_002437061), two bHLH proteins from rice (Os02g15760 and Os06g33450), SPEECHLESS (At5g53210) from Arabidopsis and one bHLH protein from grape (Vitis vinifera, GenBank accession number: XP_002267745). The bHLH domain, LZ and ACT domain are boxed in colors corresponding to the colors in Figure 5.1A.


Figure 5.2: Protein-protein interaction and localization of bHLH5. A, Yeast two-hybrid experiments showing interaction between a fusion of the GAL4 AD to bHLH5 and the bHLH plus C-terminus of R (GAL4DBD-R411-610) in the yeast strain pJ69.4a [183]. bHLH5 does not interact with the bHLH domain of R (GAL4DBD-R411-462, #2) or the C-terminus of R containing the ACT domain (GAL4DBD-R462-610, #3; GAL4DBD-R525-610, #4). Moreover, it does not interact with a mutant of R which contains three amino acid residue insertions in the bHLH domain (GAL4DBD-LcD12411-462, #6; GAL4DBD-LcD12411-610, #7). bHLH5 interacts with RIF1 (GAL4AD-RIF1, #5). The positive control shows homodimerization of the C-terminus (+ bHLH) of R. B, Yeast two-hybrid experiments showing interaction between a fusion of the GAL4 AD to bHLH5 and the bHLH-LZ-like domain of R (GAL4DBD-R411-478) and RIF-2C (GAL4DBD-RIF-2C495-731), but not TT8 (GAL4DBD-TT8361-610). C, p35S::bHLH5-GFP localizes to the nucleus in N. benthamiana leaves in the presence and absence of p35S::R. Bars correspond to 10 µm.


Figure 5.3: Characterization of ZmRIF-2C. A, Schematic representation of the domain structure of ZmRIF-2C. ZmRIF-2C contains a putative GAF-like domain at the N-terminus (orange, amino acids 17-134), a bHLH-like domain (pink, amino acids 504-568) and an ACT domain at the C-terminus (purple, amino acids 603-695). B, Yeast two-hybrid studies in yeast strain pJ69.4a showing interaction of At5g46690 or RIF-2C495-703 with R411-610 (#2 and #5, respectively). C, Alignment of the GAF-like domain of maize R and RIF-2C with part of the GAF domain of E. coli yebR (YEBR). The structure on top of the alignment corresponds to the α-helices or β-strands in R. The structure below the alignment corresponds to the α-helices or β-strands in E. coli yebR. Numbers on top of the alignment correlate with numbers in Figure 5.3D on the right (for maize R), and numbers below the alignment correlate with numbers in Figure 5.3D on the left (for E. coli yebR). Colors correspond to colors in Figure 5.3D. D, Left: 3D structure of the GAF domain of E. coli yebR. Shown is a dimer of chain A and chain B. Boxed is the part of the GAF domain that is shown in the alignment in C and that corresponds to the GAF-like domain of maize R on the right. yebR binds the ligand MES (2-(N-morpholino)-ethanesulfonic acid). The dimerization surface is formed between the α-helix in turquoise (amino acids 45 to 56 in chain A) and two β-strands shown in green in chain B (amino acids 62 to 68 and 72 to 77). These structures are missing in the GAF-like domain of R (on the right), but it contains the putative small ligand binding site. The Protein Data Bank file for E. coli yebR (code 1vhm) [191] was downloaded from the RCSB Protein Data Bank.


Figure 5.4: Alignment of ZmRIF-2C, ZmRIF-2C-like and putative Arabidopsis homolog At2g27230. Boxed in colors are the GAF-like domain, the bHLH-like domain and the ACT domain according to the color-code in Figure 5.3A.


Figure 5.5: Phylogenetic tree of bHLH proteins used in this chapter. GenBank accession numbers are as follows: ZmbHLH5-like1 = NP_001152521; ZmbHLH5-like2 = NP_001132879; Sb_bHLH5-like1 = XP_002453634; Sb_bHLH5-like2 = XP_002437061; LOC100272406 = NP_001140356; ZmRIF-2C-like = NP_001146644. Full-length cDNAs were aligned using ClustalX2 and GENEDOC, and the tree was generated using MEGA4. Numbers correspond to bootstrap values and are in %.


CHAPTER 6

FINAL DISCUSSION

6.1 Summary of Findings

Combinatorial regulation of gene expression is a complex field of study which has been extensively investigated in vertebrates. In plants, the flavonoid biosynthetic pathway provides one of the best-studied models of how control of gene expression is achieved [32]. One major breakthrough was the finding that R (bHLH) interacts with C1 (R2R3-MYB) but not with P1 (R2R3-MYB). The interaction with R, however, can be transferred to P1 by changing a few amino acid residues in the MYB domain, making it more “C1-like”. This “new” protein, P1*, now has novel regulatory functions. Together with other findings by members of our lab, this led to the discovery that R does not simply allow C1 to function, but rather contributes to the regulatory specificity of C1 [71]. In addition, it has been postulated that R contributes to the activity of its target genes by making additional (direct or indirect) DNA contacts and by releasing C1 from a plant-specific inhibitor [71].

The main question I addressed in this dissertation was how exactly R achieves these functions. Following are some of the major findings from my research:


A. R contains an ACT dimerization domain at the C-terminus which is necessary for its regulatory function in activation of the anthocyanin biosynthetic genes.

B. The bHLH domain of R is essential for activation of anthocyanin biosynthetic genes through the interaction with RIF1, an ENT/AGENET-containing protein, which links R transcriptional regulation with chromatin functions.

C. R contains an LZ-like structure which extends the second helix of the bHLH domain, and this extended bHLH is able to form homodimers and bind to the Bz1 and C2 promoters in vitro. In vivo, R binds to the A1 promoter indirectly through C1, but likely binds directly to the Bz1 promoter in the presence of C1.

D. The dimerization via the ACT domain plays an important role in binding to different regulatory regions. For the activation of A1, R requires C1, and the deletion of the ACT domain leads to reduced DNA-binding and activation. On Bz1, however, deletion of the ACT domain also leads to reduced activation, but leads to an increase in DNA-binding.

E. The bHLH-LZ-like domain interacts with two other bHLH proteins, which might tether R to novel target genes and provide new functions for R.

6.2 The Role of the ACT and bHLH Domain in R Function

Following our identification of the ACT domain in several bHLH proteins, it was determined that a mutation in the predicted ACT domain of SPCH results in a lack of stomata due to a failure to initiate asymmetric cell division [201]. This is in agreement with the lack of pigmentation observed when C1 and RΔACT were co-bombarded into cultured maize BMS cells.


This observation indicates that the ACT domain is important for the regulatory function of SPCH in stomata development and of R in anthocyanin biosynthesis. The absence of the ACT domain further leads to reduced levels of activation of the biosynthetic gene promoters of A1, A2, Bz1 and Bz2 in transient expression assays. It is puzzling that the activation of the transiently introduced promoter::reporter constructs is decreased to a maximum of 75%, but no pigmentation was visible. A possible explanation for this finding is that the deletion might have a more dramatic effect on a biosynthetic gene not tested. Another possibility comes from studies done on AMS, an Arabidopsis bHLH protein required for tapetal cell development and postmeiotic microspore formation [161]. The C-terminus of AMS, including the predicted ACT domain, binds the SET/PHD domain protein ASHR3 (ASH1-Related3), and both domains are required for interaction with AMS. AMS seems to tether ASHR3 to its target genes. SET domains have been associated with the recognition of methylated lysines in histone tails, and PHD fingers have been shown to contribute to nucleosome binding and to interaction with nuclear proteins and tri-methylated H3K4 [199], which suggests epigenetic regulation by AMS-ASHR3. In the case of R, it is possible that under certain conditions discussed below, the ACT domain binds a chromatin-modifying protein, and the phenotype associated with the absence of the ACT domain is due to the disturbed interaction.

A similar mechanism was observed for the bHLH domain of R. We determined that the bHLH domain is dispensable for activation of transiently expressed genes but necessary for activation of endogenous biosynthetic genes of the anthocyanin biosynthetic pathway. The effect of deleting the bHLH domain on the activation of the transiently expressed genes, however, was less dramatic than the effect resulting from deletion of the ACT domain. This difference in transient versus endogenous gene activation is likely due to the interaction of the bHLH domain with the ENT/AGENET-containing protein RIF1. AGENET domains are plant-specific and belong to the “Royal family” of domains of chromatin recognition proteins. They have high similarity to Tudor domains, which have been shown to bind methylated lysines in histone tails [273].

The mechanism of action of RIF1 is still unknown. We determined that it is recruited to the A1 promoter by C1 and R. A RIF1-like protein from Arabidopsis (ACK3) is able to bind acetylated H3K14 and di-methylated H3K36, both modifications associated with “open” chromatin (Wijeratne et al., unpublished). To further determine the function of RIF1, we are at present testing binding to modified histones in vitro and in vivo. The fact that RIF1 does not interact with bHLH proteins in general (Fig. 3.8B) suggests that it is targeted to specific transcription factor complexes and therefore acts on specific regulatory sites in the chromatin. I generated antibodies against RIF1 which will be used in ChIP and ChIP-Seq experiments to explore targets of RIF1. In addition, we will knock down RIF1 in planta. We have tested and used a RIF1-RNAi construct in cultured maize BMS cells to show participation of RIF1 in anthocyanin biosynthesis. We could use a similar construct for transformation into maize plants using protocols provided by the PTF (Plant Transformation Facility) at Iowa State University (http://www.agron.iastate.edu/ptf/). It would be useful to have RNAi plants to determine novel phenotypes related to this mechanism.


6.3 R Binds DNA in Different Ways Depending on the Target

Deletion of the ACT domain of R affected not only the activation of A1 and Bz1, as described above, but also DNA-binding. Deletion of the ACT domain leads to reduced binding of R and C1 to the A1 promoter (Fig. 4.5C). In contrast, deletion of this domain seems to increase binding to Bz1 when compared to wild-type R and C1 (Fig. 4.5C). According to these data, the structural conformation of R is highly dynamic, and R is able to dimerize via the ACT domain or the bHLH-LZ-like domain depending on how it is tethered to DNA.

For binding to A1, R homodimerizes via the ACT domain, which leaves the bHLH-LZ-like domain as a monomer and able to mediate dimerization of RIF1 with the bHLH domain (Fig. 6.1A). The ACT domain-mediated dimerization most likely stabilizes the complex formed on A1 since deletion of the ACT domain leads to reduced activation of A1. The missing stabilizer therefore might lead to reduced interaction with C1, PAC1 or RIF1 (Fig. 6.1B). These possibilities will be tested in yeast using the yeast three-hybrid method described in Chapter 4.

For binding to Bz1 and possibly other E-box-containing promoters, R homodimerizes via the bHLH-LZ-like domain, which is a prerequisite for DNA-binding (Fig. 6.1D). The ACT-mediated dimerization is disrupted, and the monomer might now be able to heterodimerize with an as yet unidentified partner. The deletion of the ACT domain prevents the formation of the heterodimer, and a homodimer is unable to form due to steric hindrance. This leads to a loss of stability in the complex and therefore, as seen for A1, reduced activation of Bz1 (Fig. 6.1D). I will test and quantify the interaction of R with and without the ACT domain with C1 and PAC1. According to my model, RIF1 does not bind a bHLH dimer and might therefore be absent from the Bz1 promoter. However, it is possible that the ACT domain recruits a chromatin factor, such as an ASHR3-like protein. To determine whether RIF1 is recruited to Bz1, I will use a maize protoplast system followed by ChIP as described in Chapter 3. Stronger binding to Bz1 in the absence of the ACT domain might be a consequence of the ability of the bHLH-LZ-like domain to form a more functional dimer. Partners interacting with R via the ACT domain could include, as mentioned earlier, a chromatin modifier as found for AMS, or another TF, as has been shown for ICE1 (group IIIb), which interacts with MYB15 via the C-terminus that includes a predicted ACT domain [260]. Novel partners for the ACT domain could be identified by yeast two-hybrid screens.

Interestingly, RIF1 appears to contain a PHD finger-like domain, which was identified based on homology to ATX, which is highly similar to the PHD domain in ASHR3. Although Prof. Zoya Avramova (University of Nebraska-Lincoln), an expert in epigenetics and chromatin structure, had some doubts whether this was a PHD domain, we will call it a PHD finger-like domain until further functional analyses reveal the role of this domain. One could speculate that RIF1 binds the bHLH of R on promoters like A1, which require a bHLH monomer, but binds to an ACT monomer on promoters which contain an E-box and to which R binds as a bHLH homodimer. However, we have tested binding of RIF1 to the C-terminus of R excluding the bHLH in yeast, and no interaction was observed under the conditions used (Fig. 3.3B).


6.4 C1 is Required for Tethering R to DNA

We have previously shown by ChIP experiments that R can be tethered to the A1 promoter when bound to C1, but not on its own [200]. I have determined in this dissertation that the bHLH-LZ-like domain of R can bind to the Bz1 and C2 promoters in vitro in the absence of C1. Furthermore, R can be recruited to the Bz1 promoter in vivo. This has been tested by ChIP experiments using maize protoplasts transformed with C1 and R-GFP (Fig. 4.5). However, preliminary ChIP results with protoplasts transformed only with R-GFP show no binding to Bz1. These results are in agreement with ChIP experiments showing that R does not bind to a synthetic G-box::Luc construct containing a G-box but no C1-BS (Fig. 4.3B). It is possible that R binds directly to the E-box site in the Bz1 promoter but requires C1 for several reasons. First, I have shown that full-length R needs C1 to homodimerize in yeast. C1 might induce structural changes which allow homodimerization to occur. Alternatively, C1 might stabilize R, and this possibility will also be tested. It is known that C1 needs R for activating Bz1, and perhaps both C1 and R bind to the corresponding MYB and bHLH BS in the Bz1 promoter for activation to occur (Fig. 4.4A). To show that R binds directly to Bz1 in the presence of C1, I would perform ChIP experiments on maize protoplasts using a mutant of C1 that cannot bind DNA but can still bind R. Such a C1 mutant has been described (C1D101E) [68]. C1D101E binds the haPBS present in the A1 promoter very weakly compared to wild-type C1 and is not able to activate pBz1::Luc when tested together with B in transient expression studies. However, the interaction with the bHLH protein B was not impaired in this mutant. If my prediction is true, R would bind to Bz1 in the presence of this C1 mutant, and indirect binding could be excluded.

6.5 R-interacting Factors

As discussed before, R has at least three conserved domains essential for protein-protein interactions. The importance of the N-terminus of R and the interaction with C1 has been extensively described [41]. An additional interaction occurs between the acidic domain of R and PAC1, a WD40 repeat protein homologous to TTG1 from Arabidopsis (Oh and Grotewold, unpublished) [104]. This protein interaction, however, is not part of the present study, although it will be very interesting to see how PAC1 participates in R function. A yeast two-hybrid screen identified a number of other R-interacting factors, among which RIF1 has been discussed in detail earlier (see Chapter 3). I am in the process of characterizing two bHLH proteins (ZmbHLH5 and RIF-2C) with unknown function. The closest homolog to ZmbHLH5 in Arabidopsis is SPCH, which directs the first asymmetric division to initiate the stomata lineage [274]. The closest homolog to RIF-2C in Arabidopsis is LHW, a transcriptional activator which is required to establish and maintain the normal vascular cell number and pattern in primary and lateral roots [161]. Both RIF-2C and LHW contain a conserved domain with weak homology to an HLH motif and lack the basic region completely. Previously identified HLH proteins that lack the basic region have been shown to function as repressors [275-276].

Stomata and root architecture are significantly different between monocots and dicots. Therefore, it would be interesting to see whether SPCH or LHW orthologs play a similar role in maize in asymmetric cell division or in promoting vascular proliferation, respectively. Of course, it needs to be determined first whether the R-interacting factors are orthologs of SPCH or LHW and, further, whether R is part of these processes. Maize plants with mutations in R have no pigmentation in the aleurone layer of the kernel, and to date, no phenotype in stomata or root architecture has been reported which would suggest that R is part of these processes. However, a bHLH protein with redundant function to R might be involved in these processes but not in pigment formation.

Many mutants of bHLH proteins show pleiotropic effects, possibly because the proteins interact with different partners, are tethered to specific targets by the interacting partner and are therefore shared between different cellular processes [97-98].

The first experimental step to determine whether R interacts with these proteins in vivo is a protein-protein interaction study such as co-immunoprecipitation (co-IP). Furthermore, I would like to determine whether these R-interacting partners play a role in known functions of R, such as anthocyanin pigmentation. I also need to investigate where these genes are expressed and where the proteins localize.

6.6 Final Remarks

Taken together, in this dissertation, three conserved domains of the maize TF R have been investigated, all able to homo- or heterodimerize and all important for the regulatory function of R. This reinforces the picture that one way R contributes to the regulatory specificity of the TF complex is by serving as a docking platform for co-regulators. I also established that R binds to DNA either directly or indirectly via interacting partners. If R is recruited to DNA by another TF, for example C1, R appears to dimerize via the ACT domain, leaving the bHLH domain as a monomer and able to interact with RIF1. If R binds DNA directly through an E-box, R dimerizes via the bHLH domain, which might leave the ACT domain as a monomer that possibly heterodimerizes or binds small molecules, which abolishes homodimerization. Whether R binds small molecules for regulatory function remains an open question. Structural analysis of the GAF-like domain in the N-terminus of R (amino acids 126-224) shows a possible small molecule binding pocket (Figure 5.3D).

The knowledge obtained from this study in maize is most likely transferable to regulatory mechanisms of anthocyanin biosynthesis in other plants. It provides new insights into the regulatory specificity of TFs, not only for plants but also for vertebrates. Many aspects of combinatorial gene regulation have been investigated in cellular processes in yeast or humans, and this study in plants shows that the complexity and the dynamics of gene regulation are comparable between species.


Figure 6.1: Model of TF complex formation on the A1 or Bz1 promoter. A, In the non-activated state, the A1 or Bz1 promoter is occupied by nucleosomes with histone modifications specific for heterochromatin. C1 is probably bound by an as yet unknown inhibitor (marked with ?). R is shown with all its domains. The N-terminal MIR domain is masked by an unknown mechanism when not bound to C1 and when not dimerizing. RIF1 and PAC1 are shown. B, C1, R and RIF1 are part of the complex on the A1 promoter as determined by ChIP experiments [200]. C1 binds the MIR region of R (pink), PAC1 has been shown to bind the acidic region (amino acids 251-410, orange; Oh and Grotewold, unpublished) and RIF1 interacts with the bHLH domain of R (blue). C1 binds to the haPBS and laPBS in the promoter and requires R for activation. R forms a bHLH monomer which interacts with RIF1. The ACT domain (amino acids 525-610, yellow) forms a dimer, which is required for A1 activation and binding, since the deletion of the ACT domain as seen in C leads to reduced activation and binding. C, In the absence of the ACT domain dimer, interaction between R and its partners might weaken, which leads to the reduced activation and binding shown in Chapters 2 and 4. D, C1 and R make direct contact with two C1-BS or an E-box, respectively. The bHLH domain of R (blue) forms a homodimer, which is required for E-box binding. RIF1 cannot bind the bHLH dimer. The ACT domain might heterodimerize with an unknown protein or a small molecule (in red) or homodimerize (not shown). It has been shown that C1 binds to both MYB-BS [254] and that the C1-BS at position -71 to -66 is important for activation by C1 and R [113]. I have shown in this dissertation that the C1-BS at position -92 to -86 is important for activation by C1 and R. E, In the absence of the ACT domain, activation of Bz1 is significantly reduced, which might be due to the missing homo- or heterodimer. However, the binding of R to the E-box is significantly increased in the absence of the ACT domain. This is likely due to the formation of a much more stable bHLH dimer.


APPENDIX A

CONSTRUCTS USED

Construct name | Common name in EG-lab DB | Definition
p35S::C1 | DP665 (also pPHTT665) | 2x 35S promoter driving C1 CDS. Made by Erich Grotewold
p35S::R | DP471 | 2x 35S promoter driving R CDS. Made by Erich Grotewold
p35S::C1+R | DP687 | 2x 35S promoter driving C1 and R CDS. Made by Erich Grotewold
p35S::RΔbHLH | DP5660 | 2x 35S promoter driving R with amino acids 411-462 deleted. Made by Erich Grotewold
p35S::RD12 | 35S::RD12 | From Sue Wessler
p35S::P1 | 5AG31 | 35S promoter driving P1 CDS. Made by Marcela Hernandez
p35S::P1* | 5AG’31 | 35S promoter driving P1* CDS. Made by Marcela Hernandez
p35S::R-GFP(protoplast) | 16LF-51 | R CDS driven by 35S, vector received from JC Jang’s lab
R in pENTR | 16DN-16 | R CDS cloned into pENTR-D/TOPO
RΔ532-560 in pENTR | 16DP-1 | Site-directed mutagenesis using 16DN-16 as template
p35S::R-GFP | W5-R | R CDS in pGWB5
p35S::R-Myc | W20-R | R CDS in pGWB20
p35S::GAL4DBD-C1-Cterm | same | Gal4-C1-Cterm in 5AG31
p35S::GAL4DBD-R411-610 | 16EH-34 | Gal4-BD-R(HLH-C) in 5AG31
p35S::RΔ532-560-GFP | 16EW-18 | LR of 16DP-1 into pGWB5
pA1::Luc | ZO11 | From Vicky Chandler’s laboratory
pA2::Luc | same | A2 promoter driving Luc
pBz1::Luc | same | Bz1 promoter driving Luc
pBz2::Luc | same | Bz2 promoter driving Luc
pA1mPBS::Luc | same | A1::Luc with mutated haPBS and laPBS
p35S::REN | pHTT672 | From Pioneer
pUbi::GUS | DP3953 | From Pioneer
p35S::BAR | DP611 | From Pioneer
pC2long::Luc | 16MS-3, -8 | C2 regulatory region -1249 to +10 from TSS cloned into ZO11 (- A1) as BamHI/PstI fragment
pC2short::Luc | 16MU-1 | C2 regulatory region -223 to +10 from TSS cloned into ZO11 (- A1) as BamHI/PstI fragment
pBz1C1-BS*::Luc | 31E82 | C1-BS at -92 to -86 mutated to PvuI site
pG-Box::Luc | same | 5x G-box seq cloned as EcoRI/BamHI in pGL3 vector. Received from L. Yuan
pGAL4BS::Luc | same | 5XGAL4UAS-TATA-LUC-NOS, received from Masaru Ohme-Takagi (Japan)
GAL4AD-R1-610 | 16FF-1 | LR from 16DN-16 into pADGAL4-GWC1
GAL4AD-R411-610 | GS88 | Made by Erich Grotewold
GAL4DBD-R411-610 | GS99 | Made by Erich Grotewold
GAL4AD-R411-478 | same | Received from L. Yuan (U Kentucky)
GAL4DBD-R411-478 | same | Received from L. Yuan (U Kentucky)
GAL4DBD-R411-462 | GG75 | Made by Erich Grotewold
GAL4DBD-LcD12411-462 | same | bHLH region of the R allele LcD12 generated by PCR w/primers LcN3 + LcC3, cloned as an EcoRI/SalI fragment into pBDGAL4
GAL4DBD-LcD12411-610 | same | bHLH and C-term of LcD12 in pBDGal4
GAL4AD-R525-610 | 16AD-2 | Amino acids 525-610 of R cloned into pADGAL4 as EcoRI/SalI using primers LcN11 and LcC2
GAL4DBD-R525-610 | 16AE-4 | Amino acids 525-610 of R cloned into pBDGAL4 as EcoRI/SalI using primers LcN11 and LcC2
GAL4DBD-R462-610 | 16P-2 | Amino acids 525-610 of R cloned into pBDGAL4 as EcoRI/SalI using primers LcN10 and LcC2
RIF1-pENTR | 5CY-1 | CDS of RIF1 in pENTR-D/TOPO using primers Zm-RIF-1C-GWF & Zm-RIF-1C-GWR
p35S::RIF1-GFP | 5DA-17 | LR of 5CY-1 into pGWB5
RIF1-ab-pENTR | 16HP-1 | Amino acids 106-241 of RIF1 into pENTR-D/TOPO vector
RIF1-pDEST17 | 16HP-1GW | LR of 16HP-1 into pDEST17
R-ab-pENTR | 16HR-2 | Amino acids 232-347 of R into pENTR-D/TOPO vector
R-ab-pDEST17 | 16HR-2GW | LR of 16HR-2 into pDEST17
ZmbHLH5-pENTR | TF5-pENTR | Received from J. Gray
ZmbHLH5-GFP | 16LW-21 | LR of TF5-pENTR into pGWB5 (protoplast)
His6-ZmbHLH-5 | 16GJ-2 | LR from TF5-pENTR into pDEST17
ZmbHLH5-ab-pENTR | 16HE-100 | Amino acids 194-317 of ZmbHLH5 into pENTR-D/TOPO vector
ZmbHLH5-ab-pDEST17 | 16HE-100GW | LR of 16HE-100 into pDEST17
GAL4AD-ZmbHLH5 | M13-23C | Isolated from yeast two-hybrid screen
GAL4AD-RIF-2C495-703 | M8-46E | Isolated from yeast two-hybrid screen
GAL4AD-GL3434-637 | 16HS-1 | Aa 434-637 of GL3 into pADGal4 as EcoRI/SalI
GAL4AD-SPCH97-364 | 16HT-2, -5 | Aa 97-364 of SPEECHLESS into pADGal4 as EcoRI/SalI
GAL4AD-RIF1ENT | 47B-10 | Made by Marcela Hernandez/Brett Hirsch
GAL4AD-ACK1 | 5DW-5 | FL-ACK1 into pADGal4-GW-C1, made by Asela Wijeratne
GAL4AD-ACK2 | CG2-1 | FL-ACK2 into pADGal4-GW-C1, made by Asela Wijeratne
GAL4AD-ACK3 | CK1-1 | FL-ACK3 into pADGal4-GW-C1, made by Asela Wijeratne
GAL4AD-ACK4 | CE3-1 | FL-ACK4 into pADGal4-GW-C1, made by Asela Wijeratne
GAL4AD-RIF1 | VIII-4-1 | Isolated from yeast two-hybrid screen
GAL4DBD-EGL3402-596 | 16IF-1 | Aa 402-596 of EGL3 into pBDGal4 as EcoRI/SalI
GAL4DBD-AtMYC1339-526 | 16KH-1 | Aa 339-526 of AtMYC1 into pBDGal4 as EcoRI/SalI
GAL4DBD-TT8359-518 | 16IJ-2 | Aa 359-518 of TT8 into pBDGal4 as EcoRI/SalI
GAL4DBD-SPCH97-364 | 16HW-1, -2, -3 | Aa 97-364 of SPEECHLESS into pBDGal4 as EcoRI/SalI
GAL4DBD-At5g46690 | A21-12C | Isolated from yeast two-hybrid screen
GAL4DBD-At5g6532097-296 | 16HY-1 | Aa 97-296 (bHLH+Cterm) were amplified and cloned into pBDGAL4 as EcoRI/SalI
GAL4DBD-MUTE1-203 | 16KA-7, -8 | Aa 1-203 of MUTE into pBDGal4 as EcoRI/SalI
GAL4DBD-ACK1 | 5DV-21 | FL-ACK1 into pBDGal4-GW-C1, made by A.W.
GAL4DBD-ACK2 | CF3-1 | FL-ACK2 into pBDGal4-GW-C1, made by A.W.
GAL4DBD-ACK3 | CJ2-1 | FL-ACK3 into pBDGal4-GW-C1, made by A.W.
GAL4DBD-ACK4 | CD4-1 | FL-ACK4 into pBDGal4-GW-C1, made by A.W.
GAL4DBD-ZmbHLH5117-393 | 16IA-5 | Aa 117-393 of bHLH5 in pBDGal4 as EcoRI/SalI
GAL4DBD-PAC1 | 40M-6 | Made by Choon Seok Oh
GAL4DBD-C11-172 | GS20 | Made by Erich Grotewold
GPD::C1 | GY56 | Made by Erich Grotewold
GST-R411-610 | 16AB-64 | Aa 411-610 of R cloned into pGEX-KG


APPENDIX B

PRIMERS USED


APPENDIX C

BACTERIA/YEAST STRAINS USED

Strain | Genotype
Yeast pJ69.4a | MATa trp1-901 leu2-3,112 ura3-52 his3-200 gal4(deleted) gal80(deleted) LYS2::GAL1-HIS3 GAL2-ADE2 met2::GAL7-lacZ
E. coli DH5α | φ80dlacZΔM15, recA1, endA1, gyrAB, thi-1, hsdR17(rK-, mK+), supE44, relA1, deoR, Δ(lacZYA-argF) U169, phoA
E. coli TOP10 | F–, mcrA, Δ(mrr-hsdRMS-mcrBC), φ80lacZΔM15, ΔlacX74, deoR, recA1, araD139, Δ(ara, leu)7697, galU, galK, rpsL(strr), endA1, nupG
E. coli BL21(DE3)pLysS | F–, ompT, hsdSB(rB-, mB-), dcm, gal, λ(DE3), pLysS (Cmr)
E. coli JM109 | endA1, recA1, gyrA96, thi-1, hsdR17(rK-, mK+), relA1, supE44, Δ(lac-proAB), [F', traD36, proAB, lacIqZΔM15]


APPENDIX D

YEAST-TWO-HYBRID SCREEN ANALYSIS

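Bait | Prey | Transformation # | Efficiency of transformation | # of colonies screened
R411-610 | Maize early tassel | 1 | 1920 | 768000
R411-610 | Maize early tassel | 2 | 2770 | 1108000
R411-610 | Maize early tassel | 3 | 1320 | 263000
R411-610 | Maize early tassel | 4 | 3030 | 680850
R411-610 | Arabidopsis green tissue | 1 | 2500 | 805000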


References

1. Coe, E.H., Jr., The origins of maize genetics. Nat Rev Genet, 2001. 2(11): p. 898- 905. 2. Grotewold, E., The genetics and of floral pigments. Annu Rev Plant Biol, 2006. 57: p. 761-80. 3. Kevan, P.G. and H.G. Baker, Insects as flower visitors and pollinators. Annu Rev Entomol, 1983. 28. 4. Koes, R., F. Quattrocchio, and J.N.M. Mol, The flavonoid biosynthetic pathway in plants: Function and evolution. Bioessays, 1994. 16: p. 123-132. 5. Ferreyra, M.L., et al., Cloning and characterization of a UV-B-inducible maize flavonol synthase. Plant J, 2010. 6. Mazza, G.J., Anthocyanins and heart health. Ann Ist Super Sanita, 2007. 43(4): p. 369-74. 7. Vaucheret, H., Post-transcriptional small RNA pathways in plants: mechanisms and regulations. Genes Dev, 2006. 20(7): p. 759-71. 8. Rajagopalan, R., et al., A diverse and evolutionarily fluid set of microRNAs in Arabidopsis thaliana. Genes Dev, 2006. 20(24): p. 3407-25. 9. Koes, R., W. Verweij, and F. Quattrocchio, Flavonoids: a colorful model for the regulation and evolution of biochemical pathways. Trends Plant Sci, 2005. 10(5): p. 236-42. 10. Martinez, E., Multi-protein complexes in eukaryotic gene transcription. Plant Mol Biol, 2002. 50(6): p. 925-47. 11. Balaji, S., et al., Comprehensive analysis of combinatorial regulation using the transcriptional regulatory network of yeast. J Mol Biol, 2006. 360(1): p. 213-27. 12. Kato, M., et al., Identifying combinatorial regulation of transcription factors and binding motifs. Genome Biol, 2004. 5(8): p. R56. 13. Wittenberg, C. and S.I. Reed, Cell cycle-dependent transcription in yeast: promoters, transcription factors, and transcriptomes. Oncogene, 2005. 24(17): p. 2746-55. 14. Simon, I., et al., Serial regulation of transcriptional regulators in the yeast cell cycle. Cell, 2001. 106(6): p. 697-708. 15. Darieva, Z., et al., Cell cycle-regulated transcription through the FHA domain of Fkh2p and the coactivator Ndd1p. Curr Biol, 2003. 13(19): p. 1740-5. 16. 
Boros, J., et al., Molecular determinants of the cell-cycle regulated Mcm1p-Fkh2p transcription factor complex. Nucleic Acids Res, 2003. 31(9): p. 2279-88. 17. Rabinovich, A., et al., E2F in vivo binding specificity: comparison of consensus versus nonconsensus binding sites. Genome Res, 2008. 18(11): p. 1763-77. 18. Leung, J.Y., et al., A role for Myc in facilitating transcription activation by E2F1. Oncogene, 2008. 27(30): p. 4172-9. 171

19. Kouzarides, T., Chromatin modifications and their function. Cell, 2007. 128(4): p. 693-705. 20. Lusser, A., D. Kolle, and P. Loidl, Histone acetylation: lessons from the plant kingdom. Trends Plant Sci, 2001. 6(2): p. 59-65. 21. Fuchs, J., et al., Chromosomal histone modification patterns--from conservation to diversity. Trends Plant Sci, 2006. 11(4): p. 199-208. 22. Bua, D.J., et al., Epigenome microarray platform for proteome-wide dissection of chromatin-signaling networks. PLoS One, 2009. 4(8): p. e6789. 23. Jeanmougin, F., et al., The bromodomain revisited. Trends Biochem Sci, 1997. 22(5): p. 151-3. 24. Haynes, S.R., et al., The bromodomain: a conserved sequence found in human, Drosophila and yeast proteins. Nucleic Acids Res, 1992. 20(10): p. 2603. 25. Bannister, A.J., et al., Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain. Nature, 2001. 410(6824): p. 120-4. 26. Lachner, M., et al., Methylation of histone H3 lysine 9 creates a binding site for HP1 proteins. Nature, 2001. 410(6824): p. 116-20. 27. Motamedi, M.R., et al., HP1 proteins form distinct complexes and mediate heterochromatic gene silencing by nonoverlapping mechanisms. Mol Cell, 2008. 32(6): p. 778-90. 28. Maurer-Stroh, S., et al., The Tudor domain 'Royal Family': Tudor, plant Agenet, Chromo, PWWP and MBT domains. Trends Biochem Sci, 2003. 28(2): p. 69-74. 29. Russell, R.B., et al., Recognition of analogous and homologous protein folds: analysis of sequence and structure conservation. J Mol Biol, 1997. 269(3): p. 423-39. 30. Qiu, C., et al., The PWWP domain of mammalian DNA methyltransferase Dnmt3b defines a new family of DNA-binding folds. Nat Struct Biol, 2002. 9(3): p. 217-24. 31. Holton, T.A. and E.C. Cornish, Genetics and Biochemistry of Anthocyanin Biosynthesis. Plant Cell, 1995. 7(7): p. 1071-1083. 32. Mol, J., E. Grotewold, and R. Koes, How genes paint fkowers and seeds. Trends Plant Sci, 1998. 3: p. 212-217. 33. 
Hernandez, J.M., Combinatorial transcriptional regulation of the maize flavonoid pathway: Understanding the old players and discovering new ones., in Ohio State Biochemistry Graduate Program, Department of Plant, Cellular and Molecular Biology. 2006, The Ohio State University: Columbus, OH. 34. Abe, H., et al., Role of arabidopsis MYC and MYB homologs in drought- and abscisic acid-regulated gene expression. Plant Cell, 1997. 9(10): p. 1859-68. 35. Goodrich, J., R. Carpenter, and E.S. Coen, A common gene regulates pigmentation pattern in diverse plant species. Cell, 1992. 68(5): p. 955-64. 36. Gong, Z.Z., et al., A constitutively expressed Myc-like gene involved in anthocyanin biosynthesis from Perilla frutescens: molecular characterization, heterologous expression in transgenic plants and transactivation in yeast cells. Plant Mol Biol, 1999. 41(1): p. 33-44. 37. Ludwig, S.R., et al., A Regulatory Gene as a Novel Visible Marker for Maize Transformation. Science, 1990. 247(4941): p. 449-450. 172

38. Quattrocchio, F., et al., Analysis of bHLH and MYB domain proteins: species- specific regulatory differences are caused by divergent evolution of target anthocyanin genes. Plant J, 1998. 13(4): p. 475-88. 39. Hu, J., B. Anderson, and S.R. Wessler, Isolation and characterization of rice R genes: evidence for distinct evolutionary paths in rice and maize. Genetics, 1996. 142(3): p. 1021-31. 40. Heim, M.A., et al., The basic helix-loop-helix transcription factor family in plants: a genome-wide study of and functional diversity. Mol Biol Evol, 2003. 20(5): p. 735-47. 41. Grotewold, E., et al., Identification of the residues in the Myb domain of maize C1 that specify the interaction with the bHLH R. Proc Natl Acad Sci U S A, 2000. 97(25): p. 13579-84. 42. Lee, M.M. and J. Schiefelbein, WEREWOLF, a MYB-related protein in Arabidopsis, is a position-dependent regulator of epidermal cell patterning. Cell, 1999. 99(5): p. 473-83. 43. Aharoni, A., et al., The strawberry FaMYB1 transcription factor suppresses anthocyanin and flavonol accumulation in transgenic tobacco. Plant J, 2001. 28(3): p. 319-32. 44. Wada, T., et al., Epidermal cell differentiation in Arabidopsis determined by a Myb homolog, CPC. Science, 1997. 277(5329): p. 1113-6. 45. Foos, G., S. Grimm, and K.H. Klempnauer, Functional antagonism between members of the family: B-myb inhibits v-myb-induced gene activation. EMBO J, 1992. 11(12): p. 4619-29. 46. Ogata, K., et al., Solution structure of a DNA-binding unit of Myb: a helix-turn- helix-related motif with conserved tryptophans forming a hydrophobic core. Proc Natl Acad Sci U S A, 1992. 89(14): p. 6428-32. 47. Paz-Ares, J., et al., Molecular cloning of the c locus of Zea mays: a locus regulating the anthocyanin pathway. EMBO J, 1986. 5(5): p. 829-33. 48. Rabinowicz, P.D., et al., Maize R2R3 Myb genes: Sequence analysis reveals amplification in the higher plants. Genetics, 1999. 153(1): p. 427-44. 49. Stracke, R., M. Werber, and B. 
Weisshaar, The R2R3-MYB gene family in Arabidopsis thaliana. Curr Opin Plant Biol, 2001. 4(5): p. 447-56. 50. Baranowskij, N., et al., A novel DNA binding protein with homology to Myb oncoproteins containing only one repeat can function as a transcriptional activator. EMBO J, 1994. 13(22): p. 5383-92. 51. Lipsick, J.S., One billion years of Myb. Oncogene, 1996. 13(2): p. 223-35. 52. Braun, E.L. and E. Grotewold, Newly discovered plant c-myb-like genes rewrite the evolution of the plant myb gene family. Plant Physiol, 1999. 121(1): p. 21-4. 53. Kranz, H.D., et al., Towards functional characterisation of the members of the R2R3-MYB gene family from Arabidopsis thaliana. Plant J, 1998. 16(2): p. 263- 76. 54. Riechmann, J.L., et al., Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science, 2000. 290(5499): p. 2105-10.

173

55. Grotewold, E., P. Athma, and T. Peterson, Alternatively spliced products of the maize P gene encode proteins with homology to the DNA-binding domain of myb- like transcription factors. Proc Natl Acad Sci U S A, 1991. 88(11): p. 4587-91. 56. Mehrtens, F., et al., The Arabidopsis transcription factor MYB12 is a flavonol- specific regulator of phenylpropanoid biosynthesis. Plant Physiol, 2005. 138(2): p. 1083-96. 57. Borevitz, J.O., et al., Activation tagging identifies a conserved MYB regulator of phenylpropanoid biosynthesis. Plant Cell, 2000. 12(12): p. 2383-2394. 58. Oppenheimer, D.G., et al., A myb gene required for leaf trichome differentiation in Arabidopsis is expressed in stipules. Cell, 1991. 67(3): p. 483-93. 59. Lai, L.B., et al., The Arabidopsis R2R3 MYB proteins FOUR LIPS and MYB88 restrict divisions late in the stomatal cell lineage. Plant Cell, 2005. 17(10): p. 2754-67. 60. Gubler, F., et al., Gibberellin-regulated expression of a myb gene in barley aleurone cells: evidence for Myb transactivation of a high-pI alpha-amylase gene promoter. Plant Cell, 1995. 7(11): p. 1879-91. 61. Yang, Y., J. Shah, and D.F. Klessig, Signal perception and transduction in plant defense responses. Genes Dev, 1997. 11(13): p. 1621-39. 62. Wilczek, C., et al., Myb-induced chromatin remodeling at a dual enhancer/promoter element involves non-coding rna transcription and is disrupted by oncogenic mutations of v-myb. J Biol Chem, 2009. 284(51): p. 35314-24. 63. Cooper, P.S. and K.C. Cone, C1 is expressed at low levels in husks. Maize Newsletter, 1997. 71: p. 25-26. 64. Goff, S.A., et al., Transactivation of anthocyanin biosynthetic genes following transfer of B regulatory genes into maize tissues. EMBO J, 1990. 9(8): p. 2517- 22. 65. Ludwig, S.R. and S.R. Wessler, Maize R gene family: tissue-specific helix-loop- helix proteins. Cell, 1990. 62(5): p. 849-51. 66. Cone, K.C., et al., Maize anthocyanin regulatory gene pl is a dublicate of c1 that functions in the plant. 
Plant Cell, 1993. 5: p. 1795-1805. 67. Cone, K.C., F.A. Burr, and B. Burr, Molecular analysis of the maize anthocyanin regulatory locus C1. Proc Natl Acad Sci U S A, 1986. 83(24): p. 9631-5. 68. Sainz, M.B., E. Grotewold, and V.L. Chandler, Evidence for direct activation of an anthocyanin promoter by the maize C1 protein and comparison of DNA binding by related Myb domain proteins. Plant Cell, 1997. 9(4): p. 611-25. 69. Winkel-Shirley, B., Flavonoid biosynthesis. A colorful model for genetics, biochemistry, cell biology, and biotechnology. Plant Physiol, 2001. 126(2): p. 485-93. 70. Spelt, C., et al., anthocyanin1 of petunia encodes a basic helix-loop-helix protein that directly activates transcription of structural anthocyanin genes. Plant Cell, 2000. 12(9): p. 1619-32.

174

71. Hernandez, J.M., et al., Different mechanisms participate in the R-dependent activity of the R2R3 MYB transcription factor C1. J Biol Chem, 2004. 279(46): p. 48205-13. 72. Zimmermann, I.M., et al., Comprehensive identification of Arabidopsis thaliana MYB transcription factors interacting with R/B-like BHLH proteins. Plant J, 2004. 40(1): p. 22-34. 73. Gonzalez, A., et al., Regulation of the anthocyanin biosynthetic pathway by the TTG1/bHLH/Myb transcriptional complex in Arabidopsis seedlings. Plant J, 2008. 53(5): p. 814-27. 74. Nesi, N., et al., The Arabidopsis TT2 gene encodes an R2R3 MYB domain protein that acts as a key determinant for proanthocyanidin accumulation in developing seed. Plant Cell, 2001. 13(9): p. 2099-114. 75. Quattrocchio, F., et al., Regulatory Genes Controlling Anthocyanin Pigmentation Are Functionally Conserved among Plant Species and Have Distinct Sets of Target Genes. Plant Cell, 1993. 5(11): p. 1497-1512. 76. Xue, B., et al., Characterization of a MYBR2R3 gene from black spruce (Picea mariana) that shares functional conservation with maize C1. Mol Genet Genomics, 2003. 270(1): p. 78-86. 77. Massari, M.E. and C. Murre, Helix-loop-helix proteins: regulators of transcription in eucaryotic organisms. Mol Cell Biol, 2000. 20(2): p. 429-40. 78. Atchley, W.R. and W.M. Fitch, A natural classification of the basic helix-loop- helix class of transcription factors. Proc Natl Acad Sci U S A, 1997. 94(10): p. 5172-6. 79. Murre, C., P.S. McCaw, and D. Baltimore, A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD, and myc proteins. Cell, 1989. 56(5): p. 777-83. 80. O'Shea, E.K., et al., Preferential heterodimer formation by isolated leucine zippers from fos and jun. Science, 1989. 245(4918): p. 646-8. 81. Ghosh, A.K., P.K. Datta, and S.T. Jacob, The dual role of helix-loop--helix-zipper protein USF in ribosomal RNA gene transcription in vivo. Oncogene, 1997. 14(5): p. 589-94. 82. 
Herold, S., et al., Negative regulation of the mammalian UV response by Myc through association with Miz-1. Mol Cell, 2002. 10(3): p. 509-21. 83. Patel, J.H. and S.B. McMahon, BCL2 is a downstream effector of MIZ-1 essential for blocking c-MYC-induced apoptosis. J Biol Chem, 2007. 282(1): p. 5-13. 84. Dang, A.Q., et al., Altered fatty acid composition in the plasma, platelets, and aorta of the streptozotocin-induced diabetic rat. Metabolism, 1988. 37(11): p. 1065-72. 85. Swanson, H.I., W.K. Chan, and C.A. Bradfield, DNA binding specificities and pairing rules of the Ah receptor, ARNT, and SIM proteins. J Biol Chem, 1995. 270(44): p. 26292-302. 86. Voronova, A. and D. Baltimore, Mutations that disrupt DNA binding and dimer formation in the E47 helix-loop-helix protein map to distinct domains. Proc Natl Acad Sci U S A, 1990. 87(12): p. 4722-6. 175

87. Toledo-Ortiz, G., E. Huq, and P.H. Quail, The Arabidopsis basic/helix-loop-helix transcription factor family. Plant Cell, 2003. 15(8): p. 1749-70. 88. Buck, M.J. and W.R. Atchley, Phylogenetic analysis of plant basic helix-loop- helix proteins. J Mol Evol, 2003. 56(6): p. 742-50. 89. Pires, N. and L. Dolan, Origin and Diversification of Basic-Helix-Loop-Helix Proteins in Plants. Mol Biol Evol, 2009. 90. Tanaka, A., et al., BRASSINOSTEROID UPREGULATED1, encoding a helix- loop-helix protein, is a novel gene involved in brassinosteroid signaling and controls bending of the lamina joint in rice. Plant Physiol, 2009. 151(2): p. 669- 80. 91. Lee, S., et al., Overexpression of PRE1 and its homologous genes activates Gibberellin-dependent responses in Arabidopsis thaliana. Plant Cell Physiol, 2006. 47(5): p. 591-600. 92. Benezra, R., et al., Id: a negative regulator of helix-loop-helix DNA binding proteins. Control of terminal myogenic differentiation. Ann N Y Acad Sci, 1990. 599: p. 1-11. 93. Dang, C.V. and W.M. Lee, Identification of the human c-myc protein nuclear translocation signal. Mol Cell Biol, 1988. 8(10): p. 4048-54. 94. Raikhel, N., Nuclear Targeting in Plants. Plant Physiol, 1992. 100(4): p. 1627- 1632. 95. Shieh, M.W., S.R. Wessler, and N.V. Raikhel, Nuclear targeting of the maize R protein requires two nuclear localization sequences. Plant Physiol, 1993. 101(2): p. 353-61. 96. Burr, F.A., et al., The maize repressor-like gene intensifier1 shares homology with the r1/b1 multigene family of transcription factors and exhibits missplicing. Plant Cell, 1996. 8(8): p. 1249-59. 97. Zhang, F., et al., A network of redundant bHLH proteins functions in all TTG1- dependent pathways of Arabidopsis. Development, 2003. 130(20): p. 4859-69. 98. Payne, C.T., F. Zhang, and A.M. Lloyd, GL3 encodes a bHLH protein that regulates trichome development in arabidopsis through interaction with GL1 and TTG1. Genetics, 2000. 156(3): p. 1349-62. 99. 
Spelt, C., et al., ANTHOCYANIN1 of petunia controls pigment synthesis, vacuolar pH, and seed coat development by genetically distinct mechanisms. Plant Cell, 2002. 14(9): p. 2121-35. 100. Lloyd, A.M., V. Walbot, and R.W. Davis, Arabidopsis and Nicotiana anthocyanin production activated by maize regulators R and C1. Science, 1992. 258(5089): p. 1773-5. 101. Smith, T.F., et al., The WD repeat: a common architecture for diverse functions. Trends Biochem Sci, 1999. 24(5): p. 181-5. 102. de Vetten, N., et al., The an11 locus controlling flower pigmentation in petunia encodes a novel WD-repeat protein conserved in yeast, plants, and animals. Genes Dev, 1997. 11(11): p. 1422-34.

176

103. Walker, A.R., et al., The TRANSPARENT TESTA GLABRA1 locus, which regulates trichome differentiation and anthocyanin biosynthesis in Arabidopsis, encodes a WD40 repeat protein. Plant Cell, 1999. 11(7): p. 1337-50. 104. Carey, C.C., et al., Mutations in the pale aleurone color1 regulatory gene of the Zea mays anthocyanin pathway have distinct phenotypes relative to the functionally similar TRANSPARENT TESTA GLABRA1 gene in Arabidopsis thaliana. Plant Cell, 2004. 16(2): p. 450-64. 105. Hernandez, J.M., M. Pizzirusso, and E. Grotewold, The maize Mp1 gene encodes a WD-repeat protein similar to An11 and TTG. Maize Genetics Cooperation Newsletter, 2000. 74: p. 24-26. 106. Baudry, A., et al., TT2, TT8, and TTG1 synergistically specify the expression of BANYULS and proanthocyanidin biosynthesis in Arabidopsis thaliana. Plant J, 2004. 39(3): p. 366-80. 107. Sagasser, M., et al., A. thaliana TRANSPARENT TESTA 1 is involved in seed coat development and defines the WIP subfamily of plant zinc finger proteins. Genes Dev, 2002. 16(1): p. 138-49. 108. Johnson, C.S., B. Kolevski, and D.R. Smyth, TRANSPARENT TESTA GLABRA2, a trichome and seed coat development gene of Arabidopsis, encodes a WRKY transcription factor. Plant Cell, 2002. 14(6): p. 1359-75. 109. Kubo, H., et al., ANTHOCYANINLESS2, a homeobox gene affecting anthocyanin distribution and root development in Arabidopsis. Plant Cell, 1999. 11(7): p. 1217-26. 110. Tuerck, J.A. and M.E. Fromm, Elements of the maize A1 promoter required for transactivation by the anthocyanin B/C1 or phlobaphene P regulatory genes. Plant Cell, 1994. 6(11): p. 1655-63. 111. Grotewold, E., et al., The myb-homologous P gene controls phlobaphene pigmentation in maize floral organs by directly activating a flavonoid biosynthetic gene subset. Cell, 1994. 76(3): p. 543-53. 112. Lesnick, M.L. and V.L. Chandler, Activation of the maize anthocyanin gene a2 is mediated by an element conserved in many anthocyanin promoters. Plant Physiol, 1998. 
117(2): p. 437-45. 113. Roth, B.A., et al., C1- and R-dependent expression of the maize Bz1 gene requires sequences with homology to mammalian myb and myc binding sites. Plant Cell, 1991. 3(3): p. 317-25. 114. Goff, S.A., K.C. Cone, and V.L. Chandler, Functional analysis of the transcriptional activator encoded by the maize B gene: evidence for a direct functional interaction between two classes of regulatory proteins. Genes Dev, 1992. 6(5): p. 864-75. 115. Grotewold, E., Plant metabolic diversity: a regulatory perspective. Trends Plant Sci, 2005. 10(2): p. 57-62. 116. Nadeau, J.A., Stomatal development: new signals and fate determinants. Curr Opin Plant Biol, 2009. 12(1): p. 29-35. 117. Serna, L., Emerging parallels between stomatal and muscle cell lineages. Plant Physiol, 2009. 149(4): p. 1625-31. 177

118. Pillitteri, L.J., et al., Termination of asymmetric cell division and differentiation of stomata. Nature, 2007. 445(7127): p. 501-5. 119. Liu, T., K. Ohashi-Ito, and D.C. Bergmann, Orthologs of Arabidopsis thaliana stomatal bHLH genes and regulation of stomatal development in grasses. Development, 2009. 136(13): p. 2265-76. 120. Kondou, Y., et al., RETARDED GROWTH OF EMBRYO1, a new basic helix- loop-helix protein, expresses in endosperm to control embryo growth. Plant Physiol, 2008. 147(4): p. 1924-35. 121. Yang, S., et al., The endosperm-specific ZHOUPI gene of Arabidopsis thaliana regulates endosperm breakdown and embryonic epidermal development. Development, 2008. 135(21): p. 3501-9. 122. Ogo, Y., et al., The rice bHLH protein OsIRO2 is an essential regulator of the genes involved in Fe uptake under Fe-deficient conditions. Plant J, 2007. 51(3): p. 366-77. 123. Bauer, P., H.Q. Ling, and M.L. Guerinot, FIT, the FER-LIKE IRON DEFICIENCY INDUCED TRANSCRIPTION FACTOR in Arabidopsis. Plant Physiol Biochem, 2007. 45(5): p. 260-1. 124. Kiribuchi, K., et al., RERJ1, a jasmonic acid-responsive gene from rice, encodes a basic helix-loop-helix protein. Biochem Biophys Res Commun, 2004. 325(3): p. 857-63. 125. Chinnusamy, V., et al., ICE1: a regulator of cold-induced transcriptome and freezing tolerance in Arabidopsis. Genes Dev, 2003. 17(8): p. 1043-54. 126. Fursova, O.V., G.V. Pogorelko, and V.A. Tarasov, Identification of ICE2, a gene involved in cold acclimation which determines freezing tolerance in Arabidopsis thaliana. Gene, 2009. 429(1-2): p. 98-103. 127. Kanaoka, M.M., et al., SCREAM/ICE1 and SCREAM2 specify three cell-state transitional steps leading to arabidopsis stomatal differentiation. Plant Cell, 2008. 20(7): p. 1775-85. 128. Abe, H., et al., Arabidopsis AtMYC2 (bHLH) and AtMYB2 (MYB) function as transcriptional activators in abscisic acid signaling. Plant Cell, 2003. 15(1): p. 63-78. 129. 
Lorenzo, O., et al., JASMONATE-INSENSITIVE1 encodes a MYC transcription factor essential to discriminate between different jasmonate-regulated defense responses in Arabidopsis. Plant Cell, 2004. 16(7): p. 1938-50. 130. Yadav, V., et al., A basic helix-loop-helix transcription factor in Arabidopsis, MYC2, acts as a repressor of blue light-mediated photomorphogenic growth. Plant Cell, 2005. 17(7): p. 1953-66. 131. Li, H., et al., The bHLH-type transcription factor AtAIB positively regulates ABA response in Arabidopsis. Plant Mol Biol, 2007. 65(5): p. 655-65. 132. Qian, W., et al., Identification of a bHLH-type G-box binding factor and its regulation activity with G-box and Box I elements of the PsCHS1 promoter. Plant Cell Rep, 2007. 26(1): p. 85-93.

178

133. Nesi, N., et al., The TT8 gene encodes a basic helix-loop-helix domain protein required for expression of DFR and BAN genes in Arabidopsis siliques. Plant Cell, 2000. 12(10): p. 1863-78. 134. Bernhardt, C., et al., The bHLH genes GLABRA3 (GL3) and ENHANCER OF GLABRA3 (EGL3) specify epidermal cell fate in the Arabidopsis root. Development, 2003. 130(26): p. 6431-9. 135. Gonzalez, A., et al., TTG1 complex MYBs, MYB5 and TT2, control outer seed coat differentiation. Dev Biol, 2009. 325(2): p. 412-21. 136. Sweeney, M.T., et al., Caught red-handed: Rc encodes a basic helix-loop-helix protein conditioning red pericarp in rice. Plant Cell, 2006. 18(2): p. 283-94. 137. Sakamoto, W., et al., The Purple leaf (Pl) locus of rice: the Pl(w) allele has a complex organization and includes two genes encoding basic helix-loop-helix proteins involved in anthocyanin biosynthesis. Plant Cell Physiol, 2001. 42(9): p. 982-91. 138. Ludwig, R., et al., Lc, a member of the maize R gene family responsible for tissue- specific anthocyanin production, encodes a protein similar to transcriptional activators and contains the myc-homology region. Proc. Natl. Acad. Sci. USA, 1989. 86: p. 7092-7096. 139. Hu, J., V.S. Reddy, and S.R. Wessler, The rice R gene family: two distinct subfamilies containing several miniature inverted-repeat transposable elements. Plant Mol Biol, 2000. 42(5): p. 667-78. 140. Matsushima, R., et al., NAI1 gene encodes a basic-helix-loop-helix-type putative transcription factor that regulates the formation of an endoplasmic reticulum- derived structure, the ER body. Plant Cell, 2004. 16(6): p. 1536-49. 141. Rampey, R.A., et al., An Arabidopsis basic helix-loop-helix leucine zipper protein modulates metal homeostasis and auxin conjugate responsiveness. Genetics, 2006. 174(4): p. 1841-57. 142. Yin, Y., et al., A new class of transcription factors mediates brassinosteroid- regulated gene expression in Arabidopsis. Cell, 2005. 120(2): p. 249-59. 143. Castillon, A., H. 
Shen, and E. Huq, Phytochrome Interacting Factors: central players in phytochrome-mediated light signaling networks. Trends Plant Sci, 2007. 12(11): p. 514-21. 144. de Lucas, M., et al., A molecular framework for light and gibberellin control of cell elongation. Nature, 2008. 451(7177): p. 480-4. 145. Leivar, P., et al., The Arabidopsis phytochrome-interacting factor PIF7, together with PIF3 and PIF4, regulates responses to prolonged red light by modulating phyB levels. Plant Cell, 2008. 20(2): p. 337-52. 146. Koini, M.A., et al., High temperature-mediated adaptations in plant architecture require the bHLH transcription factor PIF4. Curr Biol, 2009. 19(5): p. 408-13. 147. Duek, P.D. and C. Fankhauser, HFR1, a putative bHLH transcription factor, mediates both phytochrome A and cryptochrome signalling. Plant J, 2003. 34(6): p. 827-36.

179

148. Heisler, M.G., et al., SPATULA, a gene that controls development of carpel margin tissues in Arabidopsis, encodes a bHLH protein. Development, 2001. 128(7): p. 1089-98. 149. Penfield, S., et al., Cold and light control seed germination through the bHLH transcription factor SPATULA. Curr Biol, 2005. 15(22): p. 1998-2006. 150. Rajani, S. and V. Sundaresan, The Arabidopsis myc/bHLH gene ALCATRAZ enables cell separation in fruit dehiscence. Curr Biol, 2001. 11(24): p. 1914-22. 151. Pagnussat, G.C., et al., Genetic and molecular identification of genes required for female gametophyte development and function in Arabidopsis. Development, 2005. 132(3): p. 603-14. 152. Zhu, Y., et al., An interaction between a MYC protein and an EREBP protein is involved in transcriptional regulation of the rice Wx gene. J Biol Chem, 2003. 278(48): p. 47803-11. 153. Gremski, K., G. Ditta, and M.F. Yanofsky, The HECATE genes regulate female reproductive tract development in Arabidopsis thaliana. Development, 2007. 134(20): p. 3593-601. 154. Komatsu, K., et al., LAX and SPA: major regulators of shoot branching in rice. Proc Natl Acad Sci U S A, 2003. 100(20): p. 11765-70. 155. Liljegren, S.J., et al., Control of fruit patterning in Arabidopsis by INDEHISCENT. Cell, 2004. 116(6): p. 843-53. 156. Menand, B., et al., An ancient mechanism controls the development of cells with a rooting function in land plants. Science, 2007. 316(5830): p. 1477-80. 157. Yi, K., et al., OsPTF1, a novel transcription factor involved in tolerance to phosphate starvation in rice. Plant Physiol, 2005. 138(4): p. 2087-96. 158. Szecsi, J., et al., BIGPETALp, a bHLH transcription factor is involved in the control of Arabidopsis petal size. EMBO J, 2006. 25(16): p. 3912-20. 159. Friedrichsen, D.M., et al., Three redundant brassinosteroid early response genes encode putative bHLH transcription factors required for normal growth. Genetics, 2002. 162(3): p. 1445-56. 160. 
Liu, H., et al., Photoexcited CRY2 interacts with CIB1 to regulate transcription and floral initiation in Arabidopsis. Science, 2008. 322(5907): p. 1535-9. 161. Ohashi-Ito, K. and D.C. Bergmann, Regulation of the Arabidopsis root vascular initial population by LONESOME HIGHWAY. Development, 2007. 134(16): p. 2959-68. 162. Imai, A., et al., The dwarf phenotype of the Arabidopsis acl5 mutant is suppressed by a mutation in an upstream ORF of a bHLH gene. Development, 2006. 133(18): p. 3575-85. 163. Hyun, Y. and I. Lee, KIDARI, encoding a non-DNA Binding bHLH protein, represses light signal transduction in Arabidopsis thaliana. Plant Mol Biol, 2006. 61(1-2): p. 283-96. 164. Sorensen, A.M., et al., The Arabidopsis ABORTED MICROSPORES (AMS) gene encodes a MYC class transcription factor. Plant J, 2003. 33(2): p. 413-23.

180

165. Zhang, W., et al., Regulation of Arabidopsis tapetum development and function by DYSFUNCTIONAL TAPETUM1 (DYT1) encoding a putative bHLH transcription factor. Development, 2006. 133(16): p. 3085-95.
166. Li, N., et al., The rice tapetum degeneration retardation gene is required for tapetum degradation and anther development. Plant Cell, 2006. 18(11): p. 2999-3014.
167. Jung, K.H., et al., Rice Undeveloped Tapetum1 is a major regulator of early tapetum development. Plant Cell, 2005. 17(10): p. 2705-22.
168. Roig-Villanova, I., et al., Interaction of shade avoidance and auxin responses: a role for two novel atypical bHLH proteins. EMBO J, 2007. 26(22): p. 4756-67.
169. Atchley, W.R., et al., Correlations among amino acid sites in bHLH protein domains: an information theoretic analysis. Mol Biol Evol, 2000. 17(1): p. 164-78.
170. Murre, C. and D. Baltimore, in Transcriptional Regulation, S.L. McKnight and K.R. Yamamoto, Editors. 1992, Cold Spring Harbor Press: Cold Spring Harbor, New York.
171. Benezra, R., et al., The protein Id: a negative regulator of helix-loop-helix DNA binding proteins. Cell, 1990. 61(1): p. 49-59.
172. Morgenstern, B. and W.R. Atchley, Evolution of bHLH transcription factors: modular evolution by domain shuffling? Mol Biol Evol, 1999. 16(12): p. 1654-63.
173. Blackwood, E.M. and R.N. Eisenman, Max: a helix-loop-helix zipper protein that forms a sequence-specific DNA-binding complex with Myc. Science, 1991. 251(4998): p. 1211-7.
174. Adhikary, S. and M. Eilers, Transcriptional regulation and transformation by Myc proteins. Nat Rev Mol Cell Biol, 2005. 6(8): p. 635-45.
175. Khanna, R., et al., A novel molecular recognition motif necessary for targeting photoactivated phytochrome signaling to specific basic helix-loop-helix transcription factors. Plant Cell, 2004. 16(11): p. 3033-44.
176. Quattrocchio, F., et al., in The Science of Flavonoids, E. Grotewold, Editor. 2006, Springer, NY. p. 97-122.
177. Bernhardt, C., et al., The bHLH genes GL3 and EGL3 participate in an intercellular regulatory circuit that controls cell patterning in the Arabidopsis root epidermis. Development, 2005. 132(2): p. 291-8.
178. Lloyd, A.M., et al., Epidermal cell fate determination in Arabidopsis: patterns defined by a steroid-inducible regulator. Science, 1994. 266(5184): p. 436-9.
179. Bodeau, J.P. and V. Walbot, Regulated transcription of the maize Bronze-2 promoter in electroporated protoplasts requires the C1 and R gene products. Mol Gen Genet, 1992. 233(3): p. 379-87.
180. Grotewold, E., et al., Engineering secondary metabolism in maize cells by ectopic expression of transcription factors. Plant Cell, 1998. 10(5): p. 721-40.
181. Goff, S.A., K.C. Cone, and M.E. Fromm, Identification of functional domains in the maize transcriptional activator C1: comparison of wild-type and dominant inhibitor proteins. Genes Dev, 1991. 5(2): p. 298-309.

182. Gordon-Kamm, W.J., et al., in Molecular Improvements of Cereal Crops, I.K. Vasil, Editor. 1999, Kluwer Academic Publishers: Dordrecht, Netherlands. p. 189-253.
183. James, P., J. Halladay, and E.A. Craig, Genomic libraries and a host strain designed for highly efficient two-hybrid selection in yeast. Genetics, 1996. 144(4): p. 1425-36.
184. Mount, R.C., B.E. Jordan, and C. Hadfield, Reporter gene systems for assaying gene expression in yeast. Methods Mol Biol, 1996. 53: p. 239-48.
185. Williams, C.E. and E. Grotewold, Differences between plant and animal Myb domains are fundamental for DNA binding activity, and chimeric Myb domains have novel DNA binding specificities. J Biol Chem, 1997. 272(1): p. 563-71.
186. Frangioni, J.V. and B.G. Neel, Solubilization and purification of enzymatically active glutathione S-transferase (pGEX) fusion proteins. Anal Biochem, 1993. 210(1): p. 179-87.
187. Voinnet, O., et al., An enhanced transient expression system in plants based on suppression of gene silencing by the p19 protein of tomato bushy stunt virus. Plant J, 2003. 33(5): p. 949-56.
188. Shi, J., T.L. Blundell, and K. Mizuguchi, FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol, 2001. 310(1): p. 243-57.
189. Sainz, M.B., S.A. Goff, and V.L. Chandler, Extensive mutagenesis of a transcriptional activation domain identifies single hydrophobic and acidic amino acids important for activation in vivo. Mol Cell Biol, 1997. 17(1): p. 115-22.
190. Mas-Droux, C., et al., A novel organization of ACT domains in allosteric enzymes revealed by the crystal structure of Arabidopsis aspartate kinase. Plant Cell, 2006. 18(7): p. 1681-92.
191. Moriguchi, K., et al., Functional isolation of novel nuclear proteins showing a variety of subnuclear localizations. Plant Cell, 2005. 17(2): p. 389-403.
192. Pooma, W., C. Gersos, and E. Grotewold, Transposon insertions in the promoter of the Zea mays a1 gene differentially affect transcription by the Myb factors P and C1. Genetics, 2002. 161(2): p. 793-801.
193. Liberles, J.S., M. Thorolfsson, and A. Martinez, Allosteric mechanisms in ACT domain containing enzymes involved in amino acid metabolism. Amino Acids, 2005. 28(1): p. 1-12.
194. Bell, J.K., G.A. Grant, and L.J. Banaszak, Multiconformational states in phosphoglycerate dehydrogenase. Biochemistry, 2004. 43(12): p. 3450-8.
195. Thompson, J.R., et al., Vmax regulation through domain and subunit changes. The active form of phosphoglycerate dehydrogenase. Biochemistry, 2005. 44(15): p. 5763-73.
196. Bailey, T.L. and C. Elkan, Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol, 1994. 2: p. 28-36.

197. Liu, Y., M. Alleman, and S.R. Wessler, A Ds insertion alters the nuclear localization of the maize transcriptional activator R. Proc Natl Acad Sci U S A, 1996. 93(15): p. 7816-20.
198. Herman, E. and M. Schmidt, Endoplasmic reticulum to vacuole trafficking of endoplasmic reticulum bodies provides an alternate pathway for protein transfer to the vacuole. Plant Physiol, 2004. 136(3): p. 3440-6.
199. Thorstensen, T., et al., The Arabidopsis SET-domain protein ASHR3 is involved in stamen development and interacts with the bHLH transcription factor ABORTED MICROSPORES (AMS). Plant Mol Biol, 2008. 66(1-2): p. 47-59.
200. Hernandez, J.M., et al., The basic helix loop helix domain of maize R links transcriptional regulation and histone modifications by recruitment of an EMSY-related factor. Proc Natl Acad Sci U S A, 2007. 104(43): p. 17222-7.
201. Pillitteri, L.J. and K.U. Torii, Breaking the silence: three bHLH proteins direct cell-fate decisions during stomatal development. Bioessays, 2007. 29(9): p. 861-70.
202. Schuller, D.J., G.A. Grant, and L.J. Banaszak, The allosteric ligand site in the Vmax-type cooperative enzyme phosphoglycerate dehydrogenase. Nat Struct Biol, 1995. 2(1): p. 69-76.
203. Kobe, B., et al., Structural basis of autoregulation of phenylalanine hydroxylase. Nat Struct Biol, 1999. 6(5): p. 442-8.
204. Chipman, D.M. and B. Shaanan, The ACT domain family. Curr Opin Struct Biol, 2001. 11(6): p. 694-700.
205. Heil, G., L.T. Stauffer, and G.V. Stauffer, Glycine binds the transcriptional accessory protein GcvR to disrupt a GcvA/GcvR interaction and allow GcvA-mediated activation of the Escherichia coli gcvTHP operon. Microbiology, 2002. 148(Pt 7): p. 2203-14.
206. Anantharaman, V., E.V. Koonin, and L. Aravind, Regulatory potential, phyletic distribution and evolution of ancient, intracellular small-molecule-binding domains. J Mol Biol, 2001. 307(5): p. 1271-92.
207. Larkin, M.A., et al., Clustal W and Clustal X version 2.0. Bioinformatics, 2007. 23(21): p. 2947-8.
208. Nicholas, K.B., N.H.B. Jr, and D.W. Deerfield, 2nd, GeneDoc: Analysis and Visualization of Genetic Variation. EMBNEW.NEWS 1997. 4(14).
209. Pollastri, G., et al., Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins, 2002. 47(2): p. 228-35.
210. Ludwig, S.E. and S.R. Wessler, Maize R gene family: Tissue-specific helix-loop-helix proteins. Cell, 1990. 62: p. 849-851.
211. Cone, K.C., et al., Maize anthocyanin regulatory gene pl is a duplicate of c1 that functions in the plant. Plant Cell, 1993. 5: p. 1795-1805.
212. Goff, S.A., K.C. Cone, and V.L. Chandler, Functional analysis of the transcriptional activator encoded by the maize B gene: evidence for a direct functional interaction between two classes of regulatory proteins. Genes Dev, 1992. 6: p. 864-875.

213. Grotewold, E., et al., Identification of the residues in the Myb domain of maize C1 that specify the interaction with the bHLH cofactor R. Proc Natl Acad Sci U S A, 2000. 97(25): p. 13579-84.
214. Grotewold, E., et al., The myb-homologous P gene controls phlobaphene pigmentation in maize floral organs by directly activating a flavonoid biosynthetic gene subset. Cell, 1994. 76(3): p. 543-53.
215. Sainz, M.B., E. Grotewold, and V.L. Chandler, Evidence for direct activation of an anthocyanin promoter by the maize C1 protein and comparison of DNA binding by related Myb domain proteins. Plant Cell, 1997. 9: p. 611-625.
216. Feller, A., J.M. Hernandez, and E. Grotewold, An ACT-like domain participates in the dimerization of several plant bHLH transcription factors. J Biol Chem, 2006. 281: p. 28964-28974.
217. Ramsay, N.A. and B.J. Glover, MYB-bHLH-WD40 protein complex and the evolution of cellular diversity. Trends Plant Sci, 2005. 10(2): p. 63-70.
218. Wang, Q., et al., BRCA1 binds c-Myc and inhibits its transcriptional and transforming activity in cells. Oncogene, 1998. 17(15): p. 1939-48.
219. Vervoorts, J., et al., Stimulation of c-MYC transcriptional activity and acetylation by recruitment of the cofactor CBP. EMBO Rep, 2003. 4(5): p. 484-90.
220. Hughes-Davies, L., et al., EMSY links the BRCA2 pathway to sporadic breast and ovarian cancer. Cell, 2003. 115(5): p. 523-35.
221. Grotewold, E., et al., Engineering secondary metabolism in maize cells by ectopic expression of transcription factors. Plant Cell, 1998. 10(5): p. 721-40.
222. Sheen, J., Molecular mechanisms underlying the differential expression of maize pyruvate, orthophosphate dikinase genes. Plant Cell, 1991. 3(3): p. 225-45.
223. James, P., J. Halladay, and E.A. Craig, Genomic libraries and a host strain designed for highly efficient two-hybrid selection in yeast. Genetics, 1996. 144: p. 1425-1436.
224. Guan, K.L. and J.E. Dixon, Eukaryotic proteins expressed in Escherichia coli: an improved thrombin cleavage and purification procedure of fusion proteins with glutathione S-transferase. Anal Biochem, 1991. 192(2): p. 262-7.
225. Elomaa, P., et al., A bHLH transcription factor mediates organ, region and flower type specific signals on dihydroflavonol-4-reductase (dfr) gene expression in the inflorescence of Gerbera hybrida (Asteraceae). Plant J, 1998. 16(1): p. 93-9.
226. Liu, Y., et al., Molecular consequences of Ds insertion into and excision from the helix-loop-helix domain of the maize R gene. Genetics, 1998. 150: p. 1639-1648.
227. Tuerck, J.A. and M.E. Fromm, Elements of the maize A1 promoter required for transactivation by the Anthocyanin B/C1 or Phlobaphene P regulatory genes. Plant Cell, 1994. 6: p. 1655-1663.
228. Archer, T.K., et al., Transcription factor loading on the MMTV promoter: a bimodal mechanism for promoter activation. Science, 1992. 255(5051): p. 1573-6.
229. Almouzni, G. and A.P. Wolffe, Replication-coupled chromatin assembly is required for the repression of basal transcription in vivo. Genes Dev, 1993. 7(10): p. 2033-47.

230. Stinard, P.S., J.L. Kermicle, and M.M. Sachs, The maize enr system of r1 haplotype-specific aleurone color enhancement factors. J Hered, 2009. 100(2): p. 217-28.
231. Chavali, G.B., et al., Crystal structure of the ENT domain of human EMSY. J Mol Biol, 2005. 350(5): p. 964-73.
232. Sheen, J., Signal transduction in maize and Arabidopsis mesophyll protoplasts. Plant Physiol, 2001. 127(4): p. 1466-75.
233. He, P., L. Shan, and J. Sheen, The use of protoplasts to study innate immune responses. Methods Mol Biol, 2007. 354: p. 1-9.
234. Hernandez, J., et al., Different mechanisms participate in the R-dependent activity of the R2R3 MYB transcription factor C1. J Biol Chem, 2004. 279(12): p. 48205-48213.
235. Ekblad, C.M., et al., Binding of EMSY to HP1beta: implications for recruitment of HP1beta and BS69. EMBO Rep, 2005. 6(7): p. 675-80.
236. Cao, X., et al., Characterization of DUF724 gene family in Arabidopsis thaliana. Plant Mol Biol, 2010. 72(1-2): p. 61-73.
237. Brown, V., et al., Purified recombinant Fmrp exhibits selective RNA binding as an intrinsic property of the fragile X mental retardation protein. J Biol Chem, 1998. 273(25): p. 15521-7.
238. Libault, M., et al., The Arabidopsis LHP1 protein is a component of euchromatin. Planta, 2005. 222: p. 910-925.
239. Hakimi, M.A., et al., A core-BRAF35 complex containing histone deacetylase mediates repression of neuronal-specific genes. Proc Natl Acad Sci U S A, 2002. 99(11): p. 7420-5.
240. Atchley, W.R., et al., Correlations among amino acid sites in bHLH protein domains: an information theoretic analysis. Mol Biol Evol, 2000. 17(1): p. 164-78.
241. Heim, M.A., et al., The basic helix-loop-helix transcription factor family in plants: a genome-wide study of protein structure and functional diversity. Mol Biol Evol, 2003. 20(5): p. 735-47.
242. Buck, M.J. and W.R. Atchley, Phylogenetic analysis of plant basic helix-loop-helix proteins. J Mol Evol, 2003. 56(6): p. 742-50.
243. Toledo-Ortiz, G., E. Huq, and P.H. Quail, The Arabidopsis basic/helix-loop-helix transcription factor family. Plant Cell, 2003. 15(8): p. 1749-70.
244. Berger, S.L., The complex language of chromatin regulation during transcription. Nature, 2007. 447(7143): p. 407-12.
245. Blackwell, T.K., et al., Sequence-specific DNA binding by the c-Myc protein. Science, 1990. 250(4984): p. 1149-51.
246. Littlewood, T.D., et al., Max and c-Myc/Max DNA-binding activities in cell extracts. Oncogene, 1992. 7(9): p. 1783-92.
247. Dang, C.V., et al., Intracellular leucine zipper interactions suggest c-Myc hetero-oligomerization. Mol Cell Biol, 1991. 11(2): p. 954-62.

248. Ayer, D.E. and R.N. Eisenman, A switch from Myc:Max to Mad:Max heterocomplexes accompanies monocyte/macrophage differentiation. Genes Dev, 1993. 7(11): p. 2110-9.
249. Hurlin, P.J., C. Queva, and R.N. Eisenman, Mnt: a novel Max-interacting protein and Myc antagonist. Curr Top Microbiol Immunol, 1997. 224: p. 115-21.
250. Feller, A., J.M. Hernandez, and E. Grotewold, An ACT-like domain participates in the dimerization of several plant basic-helix-loop-helix transcription factors. J Biol Chem, 2006. 281(39): p. 28964-74.
251. Gietz, R.D. and A. Sugino, New yeast-Escherichia coli shuttle vectors constructed with in vitro mutagenized yeast genes lacking six-base pair restriction sites. Gene, 1988. 74(2): p. 527-34.
252. Bodeau, J.P. and V. Walbot, Structure and regulation of the maize Bronze2 promoter. Plant Mol Biol, 1996. 32(4): p. 599-609.
253. Dooner, H.K. and O.E. Nelson, Genetic control of UDPglucose:flavonol 3-O-glucosyltransferase in the endosperm of maize. Biochem Genet, 1977. 15(5-6): p. 509-19.
254. Lesnick, M.L., Analysis of the cis-acting sequences required for C1/B activation of the maize anthocyanin biosynthetic pathway. In Department of Biology (Eugene: University of Oregon), 1997: p. 32-53.
255. Hurlin, P.J. and J. Huang, The MAX-interacting transcription factor network. Semin Cancer Biol, 2006. 16(4): p. 265-74.
256. Cowling, V.H. and M.D. Cole, The Myc transactivation domain promotes global phosphorylation of the RNA polymerase II carboxy-terminal domain independently of direct DNA binding. Mol Cell Biol, 2007. 27(6): p. 2059-73.
257. Kelley, L.A. and M.J. Sternberg, Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc, 2009. 4(3): p. 363-71.
258. Ragvin, A., et al., Nucleosome binding by the bromodomain and PHD finger of the transcriptional cofactor p300. J Mol Biol, 2004. 337(4): p. 773-88.
259. Pena, P.V., et al., Molecular mechanism of histone H3K4me3 recognition by plant homeodomain of ING2. Nature, 2006. 442(7098): p. 100-3.
260. Agarwal, M., et al., A R2R3 type MYB transcription factor is involved in the cold regulation of CBF genes and in acquired freezing tolerance. J Biol Chem, 2006. 281(49): p. 37636-45.
261. Arndt, H.D., Small molecule modulators of transcription. Angew Chem Int Ed Engl, 2006. 45(28): p. 4552-60.
262. Haring, M., et al., Chromatin immunoprecipitation: optimization, quantitative analysis and data normalization. Plant Methods, 2007. 3: p. 11.
263. Lee, M.M. and J. Schiefelbein, Developmentally distinct MYB genes encode functionally equivalent proteins in Arabidopsis. Development, 2001. 128(9): p. 1539-46.
264. Brameier, M., A. Krings, and R.M. MacCallum, NucPred--predicting nuclear localization of proteins. Bioinformatics, 2007. 23(9): p. 1159-60.
265. Aravind, L. and C.P. Ponting, The GAF domain: an evolutionary link between diverse phototransducing proteins. Trends Biochem Sci, 1997. 22(12): p. 458-9.

266. Lin, Z., et al., Free methionine-(R)-sulfoxide reductase from Escherichia coli reveals a new GAF domain function. Proc Natl Acad Sci U S A, 2007. 104(23): p. 9597-602.
267. Badger, J., et al., Structural analysis of a set of proteins resulting from a bacterial genomics project. Proteins, 2005. 60(4): p. 787-96.
268. Zoraghi, R., J.D. Corbin, and S.H. Francis, Properties and functions of GAF domains in cyclic nucleotide phosphodiesterases and other proteins. Mol Pharmacol, 2004. 65(2): p. 267-78.
269. Bailey, P.C., et al., Update on the basic helix-loop-helix transcription factor gene family in Arabidopsis thaliana. Plant Cell, 2003. 15(11): p. 2497-502.
270. Ohashi-Ito, K. and D.C. Bergmann, Arabidopsis FAMA controls the final proliferation/differentiation switch during stomatal development. Plant Cell, 2006. 18(10): p. 2493-505.
271. Casson, S. and J.E. Gray, Influence of environmental factors on stomatal development. New Phytol, 2008. 178(1): p. 9-23.
272. Gallagher, K. and L.G. Smith, Roles for polarity and nuclear determinants in specifying daughter cell fates after an asymmetric cell division in the maize leaf. Curr Biol, 2000. 10(19): p. 1229-32.
273. Huang, Y., et al., Recognition of histone H3 lysine-4 methylation by the double tudor domain of JMJD2A. Science, 2006. 312(5774): p. 748-51.
274. MacAlister, C.A., K. Ohashi-Ito, and D.C. Bergmann, Transcription factor control of asymmetric cell divisions that establish the stomatal lineage. Nature, 2007. 445(7127): p. 537-40.
275. Bou-Torrent, J., et al., PAR1 and PAR2 integrate shade and hormone transcriptional networks. Plant Signal Behav, 2008. 3(7): p. 453-4.
276. Zhang, L.Y., et al., Antagonistic HLH/bHLH Transcription Factors Mediate Brassinosteroid Regulation of Cell Elongation and Plant Development in Rice and Arabidopsis. Plant Cell, 2009. 21(12): p. 3767-80.