Chemical Genetic Interrogation of Strigolactone Receptors

in Arabidopsis thaliana and Striga hermonthica

by

Duncan Holbrook-Smith

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Department of Cell and Systems Biology

University of Toronto

© Copyright by Duncan Holbrook-Smith (2016)

Abstract

Strigolactones are a class of terpenoid plant hormones that regulate various aspects of plant growth and development. They are best known as suppressors of axillary growth, but have also been implicated in processes as diverse as leaf shape, hypocotyl length, and seed germination. However, parasitic plants of the genera Striga, Orobanche, and Phelipanche have evolved to use strigolactones released from host plant roots as a cue for germination and parasitism. Because these parasitic plants cause billion-dollar yield losses in the developing world each year, considerable research effort has been dedicated to understanding the mechanism of strigolactone perception.

The alpha/beta hydrolase HTL is the receptor for strigolactones in the seed. HTL interacts with effector proteins to elicit a strigolactone response. HTL-dependent signaling leads to increased germination and reduced hypocotyl length. We decided to use a chemical genetic approach both to probe HTLs and to develop Striga control technologies. Separately, we screened approximately 4000 compounds from a chemical library to identify small molecules that could act as agonists and antagonists for Arabidopsis thaliana HTL. We were able to show that many of the compounds we isolated from the chemical screen were able to directly bind to HTL, and that their action was specific to strigolactone signaling. These compounds were also able to stimulate Striga hermonthica germination, showing they have promise as leads for Striga control technology development.

In order to identify new genes involved in strigolactone signaling, we screened a collection of overexpression lines for the ability to resist the effects of an HTL antagonist on germination. We found that the overexpression of the splicing factor U2AF35B was sufficient to suppress the effect of an antagonist on germination, and could partially suppress phenotypes associated with loss-of-function alleles of HTL. This suggests that U2AF35B plays a role in strigolactone signaling at or downstream of HTL.

Acknowledgements

The work that goes into a PhD does not take place in a vacuum, and I am profoundly grateful to the many people who contributed to my efforts in so many ways. Firstly, I must thank my supervisor, Dr. Peter McCourt, for his always excellent guidance. His constant appetite for the discussion of data, his openness to different directions or ideas, and his enthusiasm for hypothesis building were absolutely instrumental to my development as a PhD student.

I was also very lucky to have the support of senior McCourt lab members. Dr. Shelley Lumba served as a constant resource for questions of all types, as well as a sharp extra pair of eyes for data. Dr. Yuichiro Tsuchiya was the founding member of “Team Strigolactone” and his excellent work drew me to work in the lab. Dr. Shigeo Toh, the second member of “Team Strigolactone”, was my closest collaborator and helped me in too many ways to count. I am especially grateful for the many hours we spent in the office discussing arcane details and theories about the perception of strigolactones. I would also like to thank Asrinus Subha, Michael Stokes, and Amir Arellano Saab for their help and support over my PhD and assistance with experiments. I also owe Aram Karkar and Avery Noonan my thanks for their time as undergraduate volunteers and project students.

Outside of the McCourt lab I would like to acknowledge the members of my supervisory committee, Drs. David Guttman and Darrell Desveaux, for their excellent input on my project. I would also like to thank Drs. Sonia Gazzarrini, Eiji Nambara, Daphne Goring, Steven Prescott, and Matthew Kimber for serving on various examination committees.

Lastly, I must thank my friends, my family, and my ever-supportive girlfriend for their support over the course of my PhD. In particular, I would like to thank my parents for the love of science they instilled in me.

Table of contents

Abstract

Acknowledgements

Table of contents

List of Tables

List of Figures

Abbreviations

Chapter 1: General Introduction

Abstract

Introduction

Chemical biology

Chemical genetics

Methods for developing agonists and antagonists

Methods of high-throughput screening

Small molecule hormones as controllers of development

Hormones in plants and mechanisms of signaling

A historical perspective on strigolactone signaling and biosynthesis

Identification of SL signaling genes

Mechanisms of SL perception by D14 and HTL

Structural biology of strigolactone receptors

Striga control technologies

Strigolactone agonists and chemical genetics

Strigolactone antagonists and chemical genetics

Identifying novel SL signaling genes through overexpression screening

Conclusions and future directions

Thesis overview

Chapter 2: High throughput screening uncovers agonists for SL receptors in Arabidopsis and Striga hermonthica

Abstract

Introduction

Results and discussion

Chemical screening identifies stimulants of the HTL x MAX2 yeast two-hybrid interaction

Chemical screen identifies stimulants of the HTL x SPA1 yeast two-hybrid interaction

Compound activity in HTL x MAX2 is not very predictive of HTL x SPA1 activity

In planta chemical screen for SL mimics

Comparison of hits from in planta and yeast-based screens

Prioritization of compounds for further study

Lead compounds have diverse chemical structures

d14-1 htl-3 plants are insensitive to some lead compounds

Native fluorescence shows AtHTL is the target of selected lead compounds

YLG hydrolysis assays show binding of leads 2 and 4 to AtHTL

Striga responds to HTL agonists

Agonists signal preferentially through ShHTL7

Agonists 3 through 6 are able to bind to ShHTL7 directly

Investigation of potential synergistic activity by agonists

Conclusions

Materials and methods

Chapter 3: Discovery and characterization of antagonists for Arabidopsis HTL and Striga hermonthica HTL7

Abstract

Introduction

Results and discussion

Chemical screening for antagonists for HTL uncovers Soporidine

RGs may act through HTL signaling to inhibit germination

RGs may act through HTL to lengthen hypocotyls

RG4 is Soporidine, and is the best inhibitor of Arabidopsis germination

SOP acts through SL signaling

SOP acts directly on HTL based on YLG hydrolysis assays

SOP acts directly on HTL based on intrinsic protein fluorescence

SOP is able to inhibit the GR24 dependent interaction between HTL and MAX2

SOP as well as other RGs stabilize HTL in a DARTS assay

RGs 2, 3, and 5 are also able to bind directly to HTL

SOP and other RGs are able to suppress Striga hermonthica germination

SOP does not noticeably inhibit rice development

YLG competition assays show that SOP acts directly on ShHTL7

SOP analogs show no improvement on SOP potency in planta

Conclusions

Materials and methods

Chapter 4: Exploration of downstream strigolactone signaling using gain-of-function screening and RNA sequencing approaches

Abstract

Introduction

Results and discussion

Overexpression of U2AF35B increases germination

SOP cannot suppress the germination of U2AF35B overexpressors

GR24 treatment can enhance the effects of U2AF35B overexpression

Overexpression of U2AF35B in an htl-3 background can partially suppress that background’s germination defect

Overexpression of U2AF35B in an htl-3 background can partially suppress that mutant’s hypocotyl length defect

RNA sequencing based exploration of htl-3 and SOP treated seedlings

Changes in gene expression levels in htl-3 and SOP treated seedlings are weakly correlated

Exploration of isoform abundance in htl-3 and SOP treated seedlings

Conclusions

Methods

Chapter 5: General discussion

Discussion

Future directions

Conclusions

Tables

References

Appendix 1 RNA sequencing analysis

List of Tables

Table I – Hit chemicals from HTL x MAX2 yeast two-hybrid screen

Table II – Hit chemicals from HTL x SPA1 yeast two-hybrid screen

Table III – Hit chemicals from hypocotyl elongation screen

Table IV – Identities of hits in primary chemical screen for hypocotyl lengthening agents

Table V – Identities and description of SOPR lines

Table VI – Gene names and transcript abundances for genes significantly changed between htl-3 and DMSO treated Col seedlings

Table VII – Gene names and transcript abundances for genes significantly changed between SOP and DMSO treated Col seedlings

Table VIII – GO biological process enrichment or depletion in htl-3 transcripts

Table IX – GO biological process enrichment or depletion in SOP treated plant transcripts

List of Figures

Figure 1.1 – Schematic diagram of chemical screening using both target and phenotype-based approaches

Figure 1.2 – Chemical structures of natural and synthetic strigolactones

Figure 1.3 – Overview of shared elements of D14 and HTL dependent SL signaling

Figure 1.4 – Phylogenetic relationship between HTL proteins from Striga hermonthica and other plants

Figure 1.5 – Enantiomeric structures of GR24

Figure 1.6 – Overview of the action of agonists and antagonists for HTL

Figure 2.1 – Schematic diagram of chemical screens for SL mimics

Figure 2.2 – Structurally diverse compounds can stimulate the yeast two-hybrid interaction of HTL with MAX2

Figure 2.3 – Structurally diverse compounds can stimulate the yeast two-hybrid interaction of HTL with SPA1

Figure 2.4 – Relationship between compound activity in HTL x MAX2 and SPA1 yeast two-hybrid screens

Figure 2.5 – Structural similarity between hits in hypocotyl elongation based screen

Figure 2.6 – Relationship between compound activity in HTL x MAX2 and SPA1 yeast two-hybrid screens for 35S::GUS:COP1 hypocotyl shorteners

Figure 2.7 – Compound overlap between screens uncovers 7 likely SL mimics

Figure 2.8 – Dose response relationship of compound concentration to Arabidopsis hypocotyl length

Figure 2.9 – Dose response relationship of compound concentration to Arabidopsis hypocotyl length

Figure 2.10 – Intrinsic fluorescence binding assays show the binding of GR24 to HTL

Figure 2.11 – Intrinsic fluorescence binding assays show that some SL mimics interact directly with AtHTL

Figure 2.12 – YLG hydrolysis assays show binding of GR24, lead 2, and lead 4 to AtHTL

Figure 2.13 – SL mimics are able to stimulate Striga hermonthica seed germination

Figure 2.14 – HTL agonists act to stimulate germination specifically through ShHTL4-7

Figure 2.15 – Effect of Agonists 3-6 on ShHTL7 intrinsic fluorescence

Figure 2.16 – Dose response relationship of ShHTL7 expressing Arabidopsis seed germination to increasing concentrations of Agonists 3-6

Figure 2.17 – Effect of simultaneous treatment of Agonists 3-6 on the germination of ShHTL7 expressing Arabidopsis seeds

Figure 3.1 – Schematic of HTL-dependent SL signaling and anticipated HTL antagonist activity

Figure 3.2 – Schematic representation of the chemical screen for antagonists for HTL and structures of lead compounds

Figure 3.3 – Effect of RGs on the germination of Col as well as SL signaling mutants

Figure 3.4 – Dose-response relationship of hypocotyl length to RG concentration and effect of compounds on SL signaling mutants

Figure 3.5 – The most potent inhibitor of Arabidopsis germination is RG4 (Soporidine)

Figure 3.6 – SOP acts through SL signaling

Figure 3.7 – SOP binds directly to AtHTL based on YLG competition assays

Figure 3.8 – SOP binds directly to AtHTL based on intrinsic protein fluorescence assays

Figure 3.9 – SOP can inhibit the interaction between HTL and MAX2 based on yeast two-hybrid assays

Figure 3.10 – SOP and other RGs are able to stabilize AtHTL in a DARTS assay

Figure 3.11 – RGs 2, 3, and 5 bind directly to AtHTL based on intrinsic protein fluorescence assays

Figure 3.12 – SOP and RGs can suppress Striga hermonthica germination

Figure 3.13 – Early development in rice is not obviously perturbed by SOP treatment

Figure 3.14 – SOP can bind to ShHTL7 based on YLG competition assays

Figure 3.15 – SOP analogs show similar germination activity

Figure 4.1 – Schematic diagram of screen for SOP insensitive FOX lines

Figure 4.2 – Selection of U2AF35B overexpression lines

Figure 4.3 – GR24 addition can enhance 35S::U2AF35B phenotype

Figure 4.4 – SOP does not inhibit the germination rate of U2AF35B overexpression lines

Figure 4.5 – 35S::U2AF35B htl-3 seeds germinate more readily than htl-3

Figure 4.6 – 35S::U2AF35B htl-3 seedlings have shorter hypocotyls than htl-3

Figure 4.7 – Western blot showing the expression of flag tagged U2AF35B protein in Col and htl-3 backgrounds

Figure 4.8 – Transcript abundance for Arabidopsis seedlings under genetic and pharmacological perturbation of HTL

Figure 4.9 – Transcript abundance changes in SOP treated seedlings compared to htl-3

Figure 4.10 – Volcano plots of transcripts for htl-3 seedlings and SOP treated Col

Figure 4.11 – Transcript abundances are shown for hits from FOX screen

Figure 4.12 – Isoform abundance for each treatment

Figure 4.13 – Gene isoform abundance changes in SOP treated seedlings compared to htl-3

Figure 4.14 – Volcano plots of isoforms for htl-3 seedlings and SOP treated Col

Figure 4.15 – Low abundance isoforms show greater relative variability than higher abundance isoforms

Figure 4.16 – MDS analysis shows variability between replicates

Abbreviations

ABA - abscisic acid

ABF - ABSCISIC ACID RESPONSIVE ELEMENTS-BINDING FACTOR

AHP - Arabidopsis thaliana histidine phosphotransfer proteins

ARF - auxin response factor

ARR – ARABIDOPSIS RESPONSE REGULATOR

AtHTL - Arabidopsis thaliana HYPOSENSITIVE TO LIGHT

Aux/IAA - INDOLE-3-ACETIC ACID INDUCIBLE

BAK1 - BRI1-ASSOCIATED RECEPTOR KINASE 1

BES1 - BRI1-EMS-SUPPRESSOR 1

BRI1 - BRASSINOSTEROID INSENSITIVE 1

BR - brassinosteroid

COI1 - CORONATINE INSENSITIVE 1

CRE1 - CYTOKININ RESPONSE 1

CTR1 - CONSTITUTIVE TRIPLE RESPONSE 1

D14 – DWARF 14

D53 – DWARF 53

DMSO - dimethyl sulfoxide

DNA - deoxyribonucleic acid

EIN2 - ETHYLENE INSENSITIVE 2

ETR1 - ETHYLENE RESPONSE 1

ER - endoplasmic reticulum

EV - empty vector control

FPKM - fragments per kilobase of transcript per million mapped reads

FOX - full length overexpressor

GA - gibberellic acid

GID1 - GIBBERELLIN INSENSITIVE DWARF 1

GUS - β-glucuronidase

HTL - HYPOSENSITIVE TO LIGHT

JA - jasmonic acid

JAZ1 - JASMONATE-ZIM-DOMAIN PROTEIN 1

KAR - karrikin

MAX1 - MORE AXILLARY BRANCHES 1

MAX2 - MORE AXILLARY BRANCHES 2


MAX3 - MORE AXILLARY BRANCHES 3

MAX4 - MORE AXILLARY BRANCHES 4

MDS - multidimensional scaling

PBS-T - phosphate buffered saline with TWEEN

RNA - ribonucleic acid

SCF - Skp, Cullin, F-box

SMAX1 - SUPPRESSOR OF MAX2 1

ShHTL - Striga hermonthica HYPOSENSITIVE TO LIGHT

SL - strigolactone

SNRK - SNF1-RELATED PROTEIN KINASE

SOP - soporidine

SPA1 - SUPPRESSOR OF PHYA-105

TIR1 - TRANSPORT INHIBITOR RESPONSE 1

TPR - TOPLESS RELATED

YLG - Yoshimulactone green


Chapter 1: General Introduction


Abstract

The mobilization of chemical approaches to solving mechanistic problems in the realm of molecular biology has generated a new field known as chemical biology. Akin to genetic analysis where mutations are used to perturb gene function, in chemical biology small molecules are used to probe biological processes through interactions with macromolecules such as proteins. These small molecules, or chemical probes, often work by perturbing a specific protein within a biological system. Chemical probes that act in this way can be used in concert with genetic approaches to understand systems that are recalcitrant to traditional genetic interrogation. This combination of chemical biology and genetics is known as chemical genetics and has been a fruitful approach in many areas of biology.

One area where it has been particularly useful is in the understanding of hormone signaling. Many plant hormone receptors recognize small molecule hormones whose combined action results in the generation of the plant’s body plan. Strigolactones (SLs) are one such class of small molecule hormones. SLs act endogenously to suppress shoot branching, but are also exuded into the soil to act as a cue for symbiosis with soil fungi. Unfortunately, exuded SLs also serve as a cue for the germination of parasitic plants of the genera Striga, Orobanche, and Phelipanche, which allows them to attack crop plants and dramatically reduce crop yields throughout the developing world. For this reason, the perception of SLs in the seed has become an area of intense study.

Chemical biology and chemical genetic approaches hold significant promise with regard to increasing our understanding of the perception of SLs both in the model plant Arabidopsis thaliana and in the parasitic plant Striga hermonthica. Additionally, chemicals that can either mimic or inhibit SL signaling are potentially agriculturally useful in Striga control measures.

Introduction

Chemical biology

Chemical biology is a relatively new field that defies easy definition. However, I feel that a good working definition is that chemical biology refers to the mobilization of drug-like chemical tools to answer biological questions. Although the term is relatively new, it is possible to trace the intellectual history of chemical biology back more than 100 years. With the advent of light microscopy, chemical pigments were used as chemical probes to enhance contrast and visualize biological samples1. Although scientists did not necessarily understand how dyes were able to differentially stain samples, this is an example of small molecules being used as markers for specific biological entities such as chromosomes2. The use of chemical probes to enhance the visualization of structures such as tumors continues to be an area of intense and ongoing research3. Chemical biology approaches became more broadly applicable as more discoveries were made regarding the mechanism of action of drugs4 and the chemical composition and structures of life at the lowest molecular level5. Chemical biology is a useful approach since all biological processes have at their base a chemical element. With that said, certain areas of biology are more amenable to analysis through chemical biology.

In a general chemical biology workflow, a researcher either uses or develops a chemical probe for some macromolecular entity and uses it to understand some underlying property of the biological system. Chemical biology can thus be divided into two main groups of approaches that use chemical probes of different sorts. In the first, a chemical probe is designed and synthesized in order to possess some desirable chemical property. For example, it is possible to attach a chemical group that will crosslink with nearby proteins when exposed to light, allowing for identification of the proteins bound by this chemical probe6. Another example is a probe that acts as a general substrate and becomes covalently attached to an enzyme when the enzyme attempts to process it7. This first set of approaches relies heavily on custom synthesis of small molecules. These approaches can be extremely powerful. For example, although cysteine is sometimes a catalytically important amino acid, there is no universal consensus sequence that predicts whether a cysteine is catalytically reactive. By using specifically designed chemical probes that covalently bind only to catalytically active cysteines, one group was able to profile the whole S. cerevisiae genome and identify a novel hyper-reactive cysteine residue that is conserved across the eukaryotic phylogeny and is essential for survival in yeast8. This sort of genome-wide interrogation of cysteine reactivity would have been difficult to achieve using any other suite of techniques. However, since these approaches rely on synthetic chemists, they are generally beyond the capabilities of labs that traditionally study biology. Additionally, these approaches often require some a priori knowledge of their expected molecular targets. For this reason, many of these approaches are better suited to mechanistic studies as opposed to exploratory work.

The second flavour of chemical biology relies on chemical probes that act as agonists or antagonists and engage in non-covalent interactions with their targets. In contrast to the more exotic chemical probes described above, these compounds are simply intended to either activate or inactivate some molecular target. These agonists or antagonists can be found either by screening libraries of chemicals synthesized commercially and arrayed for high throughput screening, or by targeted synthesis. Since treatments with agonists or antagonists are conceptually similar to gain-of-function and loss-of-function mutations, this screening-based methodology is often combined with genetics in an approach known as chemical genetics.

Chemical genetics

Chemical genetics can be seen as a subset of the field of chemical biology where chemical probes are used in combination with genetic approaches to understand biological systems9. These approaches can include genetic screening for mutants that differ from wild-type in their response to a chemical probe10, or the use of chemical probes in combination with mutant analysis to perform chemical epistasis experiments11. In fact, techniques that would now be considered chemical genetics have been at the heart of our understanding of molecular genetics for more than 50 years. Perhaps the best example of early chemical genetics is the elucidation of the lac operon by Jacob and Monod. In their seminal work they conducted genetic screens to identify E. coli mutants that were able to grow on plates containing a chemical analog of lactose called PGAL that served as a carbon source but did not activate the lac operon. In this way they were able to identify mutants with constitutive LACZ and LACY expression12. In other words, they used PGAL as a chemical probe for constitutive expression from the lac operon. Subsequent mechanistic work revealed that PGAL was unable to bind to the lac repressor but was still a substrate for LACZ. This serves as an early example of how useful a chemical probe can be in targeting different members of a pathway to tease apart genetic interactions. Time and time again these types of tools have been instrumental in the understanding of signaling pathways and mechanisms13,14.

Although selective agonists have a long history of usefulness, perhaps the most prominent element of the chemical geneticist’s toolkit is the antagonist. Molecular genetic studies have frequently relied on antagonists to support genetic results. For example, cycloheximide15 has been used in a wide variety of circumstances to test hypotheses about the functioning of the cell. Cycloheximide exemplifies some of the useful properties of antagonists. A mutation that left an organism unable to synthesize protein would be lethal, and thus such a mutant could not be isolated. However, antagonists can be added to an organism or assay at any desired stage and thus circumvent issues of lethality before the experiment. Thus an antagonist like cycloheximide makes it possible to test hypotheses that would be hard to test using traditional genetics. Another attractive property of a compound like cycloheximide is that it can easily be added to various genetic backgrounds, allowing for the analysis of the interaction between protein synthesis and various other biological processes. In some organisms the construction of homozygous double mutants is a time-consuming process, which makes testing for genetic epistasis between two genes laborious. If an antagonist that is able to inhibit a specific gene product is available, it can dramatically speed up the process of testing epistasis. This is done by assessing the ability of one genotype to suppress the effects of an antagonist for the other gene or vice versa16.

Another way of deploying chemical genetics is to use it as a way of uncovering hidden phenotypes in a particular genotype. In genetics, phenotype (P) can be abstracted as the combined result of genotype (G) and environment (E). Each chemical treatment from a chemical library can be conceptualized as a new “environment”. Thus, if a particular mutant genotype lacks an obvious phenotype compared to WT, treating the mutant and WT with chemicals from a library and screening for differential responses between the two genotypes makes it possible to reveal novel phenotypes for that genotype by altering the “E” term in the P = G + E abstraction10.
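The differential-response idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the compound names, the measurements, and the 20% threshold are all hypothetical and are not taken from any screen described in this thesis.

```python
# Hypothetical sketch: each compound is treated as a new "environment" (E),
# and we look for compounds whose effect differs between two genotypes (G).
# All names, values, and the threshold below are illustrative assumptions.

def differential_hits(wt, mutant, wt_control, mutant_control, threshold=0.2):
    """Return compounds whose relative response differs between genotypes.

    wt, mutant: dicts mapping compound name -> phenotype measurement
    wt_control, mutant_control: phenotype under solvent-only treatment
    threshold: minimum absolute difference in relative response
    """
    hits = []
    for compound in sorted(wt):
        if compound not in mutant:
            continue  # compound was not tested on both genotypes
        wt_response = wt[compound] / wt_control
        mutant_response = mutant[compound] / mutant_control
        if abs(wt_response - mutant_response) >= threshold:
            hits.append(compound)
    return hits


# Made-up measurements (for example, hypocotyl length in mm).
wt_lengths = {"cmpd_A": 5.0, "cmpd_B": 10.0, "cmpd_C": 9.5}
mutant_lengths = {"cmpd_A": 9.8, "cmpd_B": 10.1, "cmpd_C": 9.6}

hits = differential_hits(wt_lengths, mutant_lengths,
                         wt_control=10.0, mutant_control=10.0)
print(hits)
```

In this toy data set only the first compound changes the phenotype of one genotype and not the other, which is the kind of chemically revealed difference the paragraph above describes. A real screen would need replicates and a statistical test rather than a fixed cutoff.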

Due to its ease of genetic manipulation, one system in which chemical genetic approaches have been particularly well developed is the yeast S. cerevisiae. One chemical genetic approach pioneered in yeast that is intellectually related to the idea of a chemical as an environment is drug-induced haploinsufficiency17. For most genes in diploid organisms, one copy of that gene is sufficient under laboratory conditions. However, if a heterozygous deletion strain is treated with a compound that inhibits the product of the deleted gene, that strain can be made haploinsufficient. Using genome-wide fitness analysis it is possible to profile the sensitivity of all heterozygous deletion strains in the S. cerevisiae genome to identify strains with an enhanced fitness defect in the presence of a given drug. For example, in a proof of concept this approach was used to show that a heterozygous deletion strain of DFR1 shows an enhanced fitness defect in the presence of methotrexate, a drug that targets its gene product18. Another analogous approach is homozygous profiling, where the sensitivities of homozygous deletion strains to a drug are profiled. In one seminal case, homozygous deletions in the drug-binding FKBP12 and TOR kinases showed reduced sensitivity to rapamycin19. In both of these approaches the chemical acts as an environment that uncovers a phenotype for the genotypes tested. One final yeast chemical genetic approach that is of particular interest is multicopy suppression profiling. Again this technique takes advantage of the ease of manipulation of gene dosage in yeast. However, unlike the previous examples, in multicopy suppression profiling the amount of gene product is increased by transforming yeast that are susceptible to some compound with a high copy number library of barcoded yeast open reading frames. The transformed yeast are then grown under competitive conditions and genes that confer reduced sensitivity to the compound are identified through sequencing of their barcodes20.
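The logic of haploinsufficiency profiling described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that strain abundance is read out as per-strain counts from a pooled culture grown with and without drug; the counts, the strain labels other than DFR1, the pseudocount, and the scoring formula are hypothetical and are not taken from the cited studies.

```python
import math

# Hypothetical sketch of scoring a pooled fitness assay.
# Assumption: each strain's abundance is a count measured in a control
# pool and in a drug-treated pool. A strain depleted under drug gets a
# positive score, i.e. an enhanced fitness defect.

def fitness_defect_scores(control_counts, drug_counts, pseudocount=1.0):
    """Return log2(control frequency / drug frequency) for each strain."""
    control_total = sum(control_counts.values())
    drug_total = sum(drug_counts.values())
    scores = {}
    for strain, count in control_counts.items():
        # The pseudocount avoids division by zero for strains lost under drug.
        control_freq = (count + pseudocount) / control_total
        drug_freq = (drug_counts.get(strain, 0) + pseudocount) / drug_total
        scores[strain] = math.log2(control_freq / drug_freq)
    return scores


# Made-up counts for three heterozygous deletion strains.
control = {"dfr1/DFR1": 1000, "geneX/GENEX": 1000, "geneY/GENEY": 1000}
drug = {"dfr1/DFR1": 100, "geneX/GENEX": 1450, "geneY/GENEY": 1450}

scores = fitness_defect_scores(control, drug)
most_sensitive = max(scores, key=scores.get)
print(most_sensitive, round(scores[most_sensitive], 2))
```

With these invented numbers the strain heterozygous for DFR1 receives the highest score, mirroring the methotrexate example in the text. The same scoring could in principle be reused for homozygous profiling, where strains with strongly negative scores would be the ones showing reduced sensitivity.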

Chemical genetics has also been used to great effect in systems that are more closely related to human health and disease. In particular, finding small molecules that are able to inhibit cell growth or kill cells in a genotype-specific way has been of particular interest because of the role of cell growth in cancer. For example, one group used a specific tumor cell line21 that possesses the v-Ha-ras Harvey rat sarcoma viral oncogene homologue to screen for compounds that were able to inhibit its growth but not that of isogenic non-tumorigenic cells. Through this screening approach they were able to identify the small molecule erastin22, which acted through a mitochondrial voltage-dependent channel, an entity never before identified as a target for anti-tumor agents. This example emphasizes two things. Firstly, the thoughtful incorporation of genetic approaches into phenotype-based chemical screens can be used to achieve practical outcomes such as the identification of new lead compounds for chemotherapeutic development. Secondly, a phenotypic screening approach can identify a novel molecular target for drug development.

The successful application of chemical genetics is not limited to cell lines and yeast; chemical genetics has been applied to whole organism systems to great effect. For example, one group used a Drosophila-based chemical screening system where a specific oncogene was selectively driven such that only 50% of larvae survived to the pupal stage. The group was then able to measure the influence of a collection of kinase inhibitors on survival in order to identify kinase inhibitors that could oppose the effects of the oncogene, while simultaneously assessing the effects of the compounds on the morphology of the developed flies23. This allowed for the development of a compound that showed more favourable efficacy and toxicity profiles compared to known clinical kinase inhibitors. This study shows that a well-designed whole-organism screen using a carefully chosen genotype can be used to identify drugs with established mechanisms of action and good clinical potential.

As an approach, chemical genetics is best applied to processes that are sensitive to chemical perturbation. This is because it is easier to activate or inactivate some gene product using a non-natural small molecule probe if that gene product already interacts with other small molecules as a part of its normal function. This ease of modulation by small molecules is referred to as “druggability”, and the set of all gene products that can be drugged is known as the “druggable genome”24. The scope of the druggable genome is largely derived from what types of gene products are known to be the targets of small-molecule drugs. The most striking thing about the druggable genome is that it is not representative of the genome as a whole. About 25% of drug targets are GPCRs, and another 10% and 11% are kinases and enzymes respectively. Those proportions are much larger than the prevalence of those types of genes in the genome where, for example, less than 1% of protein coding genes are GPCRs25. It is estimated that approximately 10-14% of the human genome is druggable24. Conspicuously missing from the list of targets of drugs are transcription factors, which typically bind to DNA and are generally not directly modulated by small-molecule binding in vivo.

Although there is work being conducted to expand the scope of targets that can be chemically perturbed, for example blocking protein-protein interactions26 or finding RNA binding drugs27, the lesson of the druggable genome is that certain targets are more amenable to modulation than others. For example, small molecule hormone receptors are by definition sensitized to small molecules and thus are attractive targets for chemical genetic perturbation. In fact, 3% of marketed drugs target nuclear hormone receptors in humans24 although those receptors make up only approximately 0.2% of proteins in the genome28. The same principles from drug discovery can also be translated to agricultural chemistry and the search for herbicides. Common herbicides target the plant hormone receptor for auxin29, and others such as glyphosate famously target the shikimate biosynthetic pathway30. This shows that the lessons of the druggable genome hold true even in agribusiness and plant biology.
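The over-representation argument above can be made concrete with a small calculation. The following is a minimal sketch that uses only the approximate percentages quoted in the text; the function name and the dictionary layout are illustrative assumptions, and the GPCR genome share of 1% is treated as an upper bound because the text states "less than 1%".

```python
def fold_enrichment(share_among_drugs: float, share_of_genome: float) -> float:
    """Ratio of a target class's share among drugs to its share of the genome."""
    if share_of_genome <= 0:
        raise ValueError("share_of_genome must be positive")
    return share_among_drugs / share_of_genome


# (share among drugs or drug targets, share of genome), expressed as fractions.
# Values are the approximate figures quoted in the text, not new data.
target_classes = {
    "GPCRs": (0.25, 0.01),  # genome share is an upper bound ("less than 1%")
    "nuclear hormone receptors": (0.03, 0.002),
}

for name, (drug_share, genome_share) in target_classes.items():
    ratio = fold_enrichment(drug_share, genome_share)
    print(f"{name}: roughly {ratio:.0f}-fold over-represented")
```

Under these figures nuclear hormone receptors come out roughly 15-fold over-represented, and GPCRs at least 25-fold, which is the sense in which the druggable genome is not representative of the genome as a whole.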


Methods for developing agonists and antagonists

Strategies for uncovering agonists and antagonists for a target can be broken down into different categories that are more or less favoured by different groups of scientists9. The first group of approaches are sometimes grouped under the umbrella of “rational design”.

Perhaps the most obvious approach to perturbing a target is to synthesize compounds that are structurally similar to known endogenous ligands and to then test whether they are able to agonize or antagonize the intended target. This is in essence the approach that Jacob and Monod used in their studies12. This approach has several benefits, including that any synthetic analog likely has a similar mechanism of action as the ligand it was modeled after. This restricts the number of potential targets that must be assayed to show the direct interaction of the probe with its target. One disadvantage of this approach is that only receptors with known ligands can be targeted in such a way.

Additionally, this approach restricts the chemical space that can be explored by the experimenter and, by restricting the types of structures that are used, could place a cap on the potency that is achieved.

Another “rational” approach is to take advantage of crystallographic data on the desired target, and design molecules that are predicted to bind to that target31. This approach again has the benefit that the mechanism of action of the designed probe is likely known. Compared with the first approach described, it also has the advantage of not requiring any a priori knowledge of the natural ligand for the target. The obvious limitation for this approach is that it requires detailed structural information on the protein. Algorithms used to predict what hypothetical molecules might bind to the target make use of angstrom-level information about the orientation of specific atoms in the side chains of the amino acids that make up the protein32. If the positions of the atoms in the crystal structure do not match their true position in the soluble protein, it is unlikely that the designed molecules will fill the active site or engage in hydrogen bonding with the desired residues. Even in the event that a sufficiently high-quality structure of the protein is available, ligands often interact with proteins using an induced fit mechanism33, where the structure of the protein changes to accommodate the binding of the ligand. In such cases a ligand designed to fit into the binding pocket of a ligand-free protein may not induce the conformational changes in the protein that are required for signal transduction. This means that in order to have the highest chances of successfully designing a drug, the researcher needs a crystal structure of the target-ligand pairing. This can be a daunting proposition depending on the target. For example, crystallization of the alpha folate receptor from humans with its ligand required truncating the protein and fusing it with a secreted IgG Fc domain, expressing and purifying it in human cells, and also selectively inhibiting certain forms of glycosylation34.

In the approaches listed above, a relatively small number of compounds are rationally chosen based on predicted properties in order to increase the chances that any given compound might bind to the target. The other major approach to uncovering agonists and antagonists is to screen through a large collection of small molecules to find them35. In this case each of the compounds that is screened does not have a particularly high chance of binding to the target, but the experimenter leverages the large number of compounds available in chemical libraries such that the chances that one or more of them will act on the target become high. This approach will be referred to here as high-throughput chemical screening. As was described for the rational design paradigm outlined above, there are many ways of conducting a high-throughput chemical screen. The major factor that differentiates ways of screening is whether they are target- or phenotype-based (Figure 1.1)9. As with the different approaches described above, both have strengths and weaknesses. The relationship between these two approaches is often compared to the relationship between forward and reverse genetic screening9. In a forward genetic screen an unknown gene is perturbed by a mutation, and in a phenotype-based chemical screen an unknown protein is targeted by a chemical. In both screening approaches the researcher knows that the perturbation results in a phenotypic output, but does not know how it was generated. On the other hand, in a reverse genetics approach the researcher mutates a specific gene, much as in a target-based screening approach a known macromolecule is targeted by small molecules. In both cases the target of the perturbation is known, but the consequence of the perturbation is not.
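The library-size argument made at the start of this discussion of high-throughput screening can be illustrated numerically. The sketch below assumes, purely for illustration, that every compound has the same independent probability of being a hit; real libraries violate this assumption, and the hit rate used here is a hypothetical value rather than a measured one. The library size of 4000 matches the approximate size of the screen described in the abstract.

```python
def prob_at_least_one_hit(per_compound_hit_rate: float, library_size: int) -> float:
    """P(at least one hit) = 1 - (1 - p)^N for N independent compounds."""
    if not 0.0 <= per_compound_hit_rate <= 1.0:
        raise ValueError("per_compound_hit_rate must lie between 0 and 1")
    if library_size < 0:
        raise ValueError("library_size must be non-negative")
    return 1.0 - (1.0 - per_compound_hit_rate) ** library_size


hypothetical_hit_rate = 0.001  # assumed 0.1% chance per compound, not a measured rate
library_size = 4000            # approximate library size mentioned in the abstract

p_any = prob_at_least_one_hit(hypothetical_hit_rate, library_size)
expected_hits = hypothetical_hit_rate * library_size
print(f"P(at least one hit) = {p_any:.3f}, expected number of hits = {expected_hits:.1f}")
```

Even with a per-compound hit rate this low, the chance of recovering at least one active compound from a library of this size is high, which is the rationale for screening large collections rather than a handful of designed molecules.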

Methods of high-throughput screening

Target based approaches are centered on identifying small molecules that are able to directly interact with a target macromolecule. For example, a researcher attempting to uncover small molecules that inhibit the activity of an enzyme might mix the target protein, screening compounds, and an enzymatic reporter and look for compounds that reduce the output of the reporter36. The primary benefit of such approaches is that chemical hits derived from screens of this type should work through the target, and thus elucidating the mechanism of action of the hit is relatively straightforward. The disadvantage of such an approach is that the in vitro conditions that are generally used for these sorts of screens omit a large number of factors that may be important if the probe is intended for use in a whole organism. For example, small molecules can be modified37, sequestered38, or excreted by the organism39. This means that although a hit from a target-based screen may work in vitro it may be completely ineffective in vivo.

Figure 1.1 - Schematic diagram of chemical screening using both target and phenotype-based approaches.

For either approach a chemical library of unique small molecules is diluted into an assay plate. For a target-based screen (left) the chemicals are screened for the ability to modulate the activity of a particular macromolecular target. The hits are then prioritized based on ability to generate an in vivo response. In the phenotype-based chemical screen (right) the primary screen generates hits with in vivo activity as represented with purple colour in the seedling, but the target is unknown. Thus it is necessary to identify the molecular target of the hits from that primary screen.

In some ways, phenotypic screens suffer from the opposite problems but also have the opposite benefits compared to target based screening approaches. In a phenotypic screen, chemicals are screened and evaluated for their ability to elicit some desired phenotypic output. This can be conducted using an in vivo reporter system or simply by looking at the physiological changes in the organism used in the assay. In either case the compounds that are called as hits from the screen by definition have some activity that the experimenter wants the eventual probe/drug to possess. Further, in the whole organism case the experimenter also knows that the compound is not being entirely inactivated by any of the processes of modification or displacement described above.

However, the experimenter faces a daunting question: how is the compound actually working? This can be a difficult question to answer for many reasons. One major impediment is that the experimenter cannot be sure that the compound that was added to the assay is the same molecule that has the desired activity. Although modifications to small-molecules can be inactivating, they can also be activating40. This can complicate attempts to identify the target. Furthermore, although an experimenter may have a number of candidate macromolecules that could be targets of the small molecules, there is no guarantee that the compounds are working on a macromolecule that is known to have any role in the process under investigation. This means that the sometimes lengthy process of target identification is required to identify the target and clarify the mechanism of action. This field of target identification has become an active field in its own right41. However, the cost of the effort required to conduct target validation is partially compensated for by the potential to uncover new genes in a process42. Perhaps the greatest limitation to phenotypic chemical screens is that not all physiological processes can be easily adapted to a phenotypic screening approach.

For example, identifying drugs for neurological conditions such as Alzheimer’s disease would be challenging using a phenotypic approach. In that field historically efforts have been focussed towards finding anti-Aβ drugs, although that target-based approach has failed to generate approved drugs43. For other neurological disorders such as Huntington’s disease, finding drugs that target the disease causing protein or other molecular approaches that target the mRNA for that protein still remain the best strategies44.


Small molecule hormones as controllers of development

All organisms go through a process of development through their lifecycle. Within simple single-celled organisms this process may be more limited, but on some level these organisms still have to interpret their genetic code and engage in concerted changes of cellular morphology to complete their lifecycle. In multi-cellular organisms, the coordination of cellular machinery is even more complex and must be coordinated over larger spatial and temporal scales. Thus the question of how organisms are able to interpret their genetic code and engage in specific patterns of development has long been a major area of inquiry in biology.

The use of chemical messengers is an evolutionarily ancient method for eliciting concerted changes in cellular morphology and gross physiological changes. In bacteria, small molecules are used for quorum sensing and influence their rate of cellular division in response to crowding45. In yeast small peptide mating pheromones control mating types to allow for sexual reproduction46. In more complex organisms small molecules or peptides are often biosynthesized and secreted from the organisms’ tissues47. Chemical messengers are often transported to different parts of the organism to elicit developmental changes remotely48. Collectively, endogenously produced molecules that travel to remote parts of the organism to spur physiological changes are called hormones.

Hormones in plants and mechanisms of signaling

While animals show a diverse mix of peptide and small molecule hormones, the best characterized hormones in plants are small molecules. At this point at least eight families of small molecules are recognized as plant hormones whose core signaling pathways are at least partially characterized at the molecular level. These hormones are brassinosteroid (BR), ethylene, cytokinin, auxin, jasmonic acid (JA), abscisic acid (ABA), gibberellic acid (GA), and most recently discovered, strigolactones (SLs). These hormones control elements of plant development from the earliest stages of embryonic growth through to flowering time and seed set. This collection of hormones can be broken down into two general categories: those whose receptors are membrane proteins, and those whose receptors are not. I will first briefly describe mechanisms of hormone perception and signal transduction in hormones with membrane bound receptors before doing the same for the nuclear hormone receptors. Except where specified otherwise, signaling pathways and gene names are from Arabidopsis thaliana.

BR is perceived by a small family of leucine rich repeat receptor-like kinases (LRR RLKs) with the founding member called BRI149. When BR binds to BRI1 and the co-receptor BAK150 a series of phosphorylation events on the cytoplasmic domains of those proteins51 as well as others allows for the activation of a transcriptional response to the hormone. Ethylene is perceived by a family of five proteins, the best known of which is ETR152, that are embedded in the membrane53 of the endoplasmic reticulum (ER). Binding of ethylene to the receptors blocks the inhibitory phosphorylation of EIN2 by CTR154. This allows for the cleavage and translocation of the C terminal tail of EIN2 to the nucleus where it leads to the activation of transcriptional activity54. Like ethylene, cytokinin is perceived primarily by ER membrane bound receptors such as CRE155. Those receptors phosphorylate AHPs which in turn are able to phosphorylate downstream response regulators that lead to the transcriptional changes associated with cytokinin treatment through type-B ARRs56.

The rest of the known plant hormones have nuclear receptors. The main opponent of cytokinin, auxin, is perceived by the F-box protein TIR157,58 with Aux/IAA proteins as coreceptors. The SCFTIR1 is then able to target Aux/IAA proteins for degradation, alleviating their repression of ARF proteins which allows for those ARF proteins to alter the transcriptional profile of the cell59. JA signaling is commonly compared to auxin signaling. In JA signaling the F-Box protein COI160 acts as a coreceptor with JAZ transcription factors. These transcription factors are ubiquitinated by the SCFCOI1 and targeted for degradation allowing for transcriptional changes61. ABA is perceived by a family of START domain proteins14 that interact with type 2 C protein phosphatases to inhibit their inhibitory dephosphorylation of SNRK proteins62 which allows for the phosphorylation of ABF transcription factors that lead to ABA response at the transcriptional level63. Finally, GA is perceived by GID1, an alpha/beta hydrolase protein with an inactive catalytic triad. GID1 is able to bind GA64,65 and then interacts with DELLA domain proteins66 that act as repressors of GA response in the cell. The interaction of GID1 with DELLAs leads to their targeted degradation and thus to GA response in the cell67,68.

From the signaling processes described above a few themes emerge. The first is that only one major hormone is perceived by a cell-surface receptor and that most are perceived by nuclear hormone receptors. Secondly, for nuclear hormone receptors, ubiquitination and targeted degradation and phosphorylation are key elements of hormone signaling pathways in plants. These paradigms continue to play an important role in how researchers approach understanding newer signaling pathways including the response to the novel plant hormone strigolactone (SL).

A historical perspective on strigolactone signaling and biosynthesis

The first strigolactone (SL) to be uncovered was strigol. Strigol was isolated from the roots of cotton plants and, when added to Striga lutea seeds, was shown to stimulate their germination69. At that point it was unclear why the host plant would produce a molecule that was a cue for parasitic plant germination and attack. In subsequent years other structurally related compounds were isolated and chemically characterized70. Over time this led to a large collection of small molecules that are referred to collectively as strigolactones (SLs) (Figure 1.2). Although some other plant hormones have multiple active forms, SLs are unusual in that they have so many. For example, there is only one active form of ABA.

Figure 1.2 Chemical structures of natural and synthetic strigolactones.

(A) The core 2D chemical structure of canonical strigolactones is shown with A, B, C and D rings indicated. The chemical structures of the synthetic strigolactone GR24 (B), natural strigolactones (C-E), and two synthetic strigolactone mimics (F, G) are shown.

The first non-parasitic role for SLs that was uncovered was their role in stimulating hyphal branching in symbiotic mycorrhizal fungi71. This corresponded well with the subsequent observation that SLs are exuded in greater amounts under conditions of low phosphorus72. This generated a satisfying narrative where low nutrient availability stimulated the release of SLs into the rhizosphere in an attempt to gain additional nutrients through symbiotic interactions with soil fungi. To some degree this also explained why Striga infestations are often worst in areas where farmers are working more marginal land and using little if any synthetic fertilizer. However, even at this point it was recognized that other roles for SLs must exist within the plant. This was based on the observation that SLs are produced even in plants that do not form associations with mycorrhizal fungi. In the 1990s and early 2000s a number of groups had been working on a collection of mutants that had increased numbers of axillary branches in Pisum sativum73, Arabidopsis74, and rice75. Well-designed micrografting experiments76,77 showed that there was a mobile, upward-moving chemical signal that was involved in those mutants’ branching phenotypes. The cloning of MAX4 showed that it was a carotenoid cleavage dioxygenase and thus showed that the mobile signal was likely carotenoid derived78. Since SLs were a likely carotenoid-derived orphan molecule, and the MAX mutants were carotenoid biosynthetic mutants with branching defects, it was logical to check whether SLs could rescue their branching phenotype.

Two groups independently found that exogenous application of SLs could rescue the branching defects of max1, max3, and max4 plants, that max2 plants were blind to exogenous SLs and that biosynthetic mutants had lower amounts of SLs79,80.

Identification of SL signaling genes

A simplified overview of SL signaling is shown in Figure 1.3. MAX2 was cloned and found to be an F-BOX protein74,81. This was seen as a promising development since F-BOX proteins have featured prominently in a number of plant hormone signaling pathways. As described above, in GA signaling the F-BOX protein SLY1 interacts with the alpha/beta hydrolase fold protein GID1 and targets negative regulators of GA signaling for degradation68. In other hormone signaling pathways such as auxin signaling, F-BOX proteins bind directly to the hormone as receptors. Thus two obvious questions arose. The first was whether MAX2 was serving as a receptor for SLs, and the second was what MAX2 might be targeting for degradation. The speculation surrounding MAX2 as a receptor was resolved with the discovery of a SL insensitive mutant that was found to be a loss-of-function allele of D14, a gene coding for an alpha/beta hydrolase82.

As was described above, alpha/beta hydrolases are well known as nuclear receptors for plant hormones. In addition to the example of GID1 in the gibberellin signaling pathway64, COI1 is an alpha/beta hydrolase that is a nuclear receptor for the plant hormone jasmonate60. Soon it was shown that the petunia paralog of D14 was able to bind to the synthetic SL GR24 and interact with MAX2 in a SL dependent manner83. In addition to D14, in Arabidopsis there is another alpha/beta hydrolase that is a paralog of D14 called HTL. We were able to show that much like D14, GR24 is able to bind to HTL and that HTL is able to interact with MAX284. Genetic analysis has shown that D14 is required for the response of plants to SLs in branching82 and that HTL is required for the germination response of seeds to the exogenous application of SLs85. Both D14 and HTL seem to play a role in controlling hypocotyl length85.

With an established interaction between a SL receptor and an F-BOX protein the question of what MAX2 targets for degradation became an obvious area of inquiry.

Within a relatively short period of time several potential targets were published. First it was reported that the rice DELLA domain protein could also interact with D14 making it a potential target for MAX2, but it was not established that MAX2 was able to target it for ubiquitination86. On the other hand, the Clp ATPase protein D53 was shown to interact with D14 and be targeted for degradation by MAX2 in a SL dependent manner87,88. This connected well with an earlier observation that a loss-of-function mutant in a paralogous protein in Arabidopsis called SMAX1 could genetically suppress the phenotypes of a MAX2 loss-of-function allele including germination defects89.


Moreover, evidence has begun to emerge for a direct interaction between HTL and SMAX-like proteins such as SMXL6 that seem to be involved in regulating leaf shape90. Finally the brassinosteroid transcription factor BES1 was also shown to interact with MAX2 and be degraded in a SL dependent manner91. This result is not without some problems. The mechanism of how SLs influence BES1 degradation is unclear given that BES1 was not shown to interact with D14. Additionally, the exceedingly strong phenotype of the BES1 RNAi lines means that epistatic analysis of BES1 loss-of-function alleles with d14-1 or max2-1 is somewhat problematic. Moreover, the role of BES1 degradation with respect to germination is uncharacterized. Collectively this has raised more skepticism towards the role of BES1 in SL signaling than towards that of D53. In addition to the published interactors described above, unpublished work in the McCourt laboratory suggests that the light signaling protein SPA1 is also able to interact with HTL. Additional lines of evidence suggest that this interaction may be biologically meaningful, including the MAX2-dependent degradation of SPA1 under GR24 treatment. Thus, the core model shown in Figure 1.3 shows a simplified version of our best current understanding of the D14 and HTL dependent SL signaling pathways.

Even with these new targets of MAX2-mediated degradation, it is still far from clear how the addition of exogenous SLs can lead to the transcriptional reprogramming of the cell. The clear missing link between what we know of SL signaling and these transcriptional responses would be a collection of transcription factors that are somehow targeted by D53 or its homologs to activate or inactivate them. There are some early indications that proteins of the TOPLESS family might be involved based on a protein-protein interaction between the TOPLESS family member protein TPR2 and D5388. Indeed, there may also be a role for TPR proteins in HTL/SMXL dependent processes.

It has been shown recently that TPR2 may interact with SMXL6-8 to regulate processes such as the development of correct leaf shape90. However, it remains unclear if TPRs play any role in HTL dependent processes in the seed. This suggests that an unbiased screening approach may be needed to uncover genes that might be involved in downstream responses to exogenous SLs in the seed.

Although this core pathway is now relatively well accepted in model plants, the question of whether SL signaling in Striga occurs in the same way is still open to debate. However, at least some elements of this core SL perception machinery are intact in Striga hermonthica. Of particular interest in Striga are HTL orthologs since they are responsible for SL perception in the seed in model plants. Striga hermonthica has 11 HTL homologs (Figure 1.4). This is a dramatic expansion compared to model plants such as Arabidopsis or rice where there is only a single copy. This marked increase in the number of HTL proteins expressed92 by Striga hermonthica may in part be a result of its highly SL dependent lifestyle where the perception of exogenous SLs in the soil is a crucial element of its life cycle.

Mechanisms of SL perception by D14 and HTL

As with many other plant hormones, there exist a large number of chemically related small molecules designated SLs that are able to elicit a SL response when added to Striga or added to a SL auxotroph. This is reminiscent of the situation in GA signaling, where more than one hundred varieties of GA were identified but only a handful of them were actually able to bind to the receptor94. Thus there has been significant interest in trying to understand which compounds are able to bind directly to the receptor, as opposed to chemicals that are chemically altered in the plant to generate the bioactive form of the hormone. In addition to the various decorations of the A and B rings that can be seen on canonical SLs, there is the potential for chiral variability in SL structures. Canonical SLs have two distinct chiral centers. Recent work has shown that different enantiomers (Figure 1.5) of GR24 show different levels of activity when added to plants exogenously95. We were able to show that mixes of GR24 with different chiral content were able to bind to HTL differentially84.

Figure 1.3 Overview of shared elements of D14 and HTL dependent SL signaling.

In both scenarios an α/β hydrolase (D14 or HTL) engages in a SL-dependent protein-protein interaction with the F-box protein MAX2. MAX2 is then able to target a Clp-ATPase (D53 or SMAX1) for degradation by ubiquitination. This leads to an inhibition of axillary branching for D14, increased germination for HTL, or reduced hypocotyl length in both cases. HTL can also interact with the light signaling component SPA1 in a SL dependent manner.

Figure 1.4 - Phylogenetic relationship between HTL proteins from Striga hermonthica and other plants.

This figure is adapted from Toh et al.93 The phylogenetic relationships between HTL proteins from Arabidopsis (circled) and Striga, as well as HTLs from other plants, are shown with bootstrap values out of 100 indicated at the nodes.

Naturally occurring SLs are present in small quantities within the plant and are very difficult to synthesize chemically. For this reason, most experiments conducted with the addition of exogenous SL use the synthetic SL GR24. In fact, the earliest characterization of binding of a SL to the D14 homolog DAD2 was conducted using GR2483. The binding assays conducted in that study also turned up an unexpected result. Differential scanning fluorimetry was used to measure the binding of GR24 to bacterially expressed petunia D14/DAD2 protein. In this approach the apparent melting temperature of the protein is assessed as a function of varying concentrations of the likely ligand. By definition the binding of a ligand to its receptor must be a spontaneous and energetically favourable interaction. This will generally translate into a greater stability of the ligand-receptor complex. However, in the case of D14/DAD2 with GR24, the stability of the complex was less than that of the protein alone. This result has been confirmed elsewhere using rice D1496 but is still rather surprising. It is possible that the destabilization of D14 in response to GR24 may be connected to the enzymatic activity of D14.
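To make the differential scanning fluorimetry readout described above concrete, the sketch below estimates an apparent melting temperature (Tm) from a melt curve and reports the shift caused by a ligand. It is an illustration only: the curves are synthetic two-state sigmoids, the Tm values are hypothetical and not taken from any of the cited studies, and taking the steepest point of the curve as the Tm is one common convention rather than the specific analysis used in those studies.

```python
import math


def synthetic_melt_curve(temps, tm, slope=1.5):
    """Hypothetical two-state unfolding curve: normalized fluorescence vs. temperature."""
    return [1.0 / (1.0 + math.exp((tm - t) / slope)) for t in temps]


def estimate_tm(temps, fluorescence):
    """Apparent Tm: midpoint of the interval where fluorescence rises most steeply."""
    best_slope = float("-inf")
    best_tm = None
    for i in range(len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_tm = 0.5 * (temps[i] + temps[i + 1])
    return best_tm


temps = [25.0 + 0.5 * i for i in range(141)]  # 25 to 95 degrees C in 0.5 degree steps

# Hypothetical Tm values chosen only to illustrate a destabilizing ligand.
apo_curve = synthetic_melt_curve(temps, tm=52.25)
ligand_curve = synthetic_melt_curve(temps, tm=48.25)

delta_tm = estimate_tm(temps, ligand_curve) - estimate_tm(temps, apo_curve)
print(f"Apparent Tm shift on ligand addition: {delta_tm:+.2f} degrees C")
```

With this sign convention a positive shift indicates stabilization of the protein by the ligand, which is the usual expectation, whereas a negative shift corresponds to the destabilization reported for D14/DAD2 in the presence of GR24.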


Figure 1.5 Enantiomeric structures of GR24

The chemical structures of GR24 in all four possible enantiomeric forms are shown. Enantiomers are named according to their similarity to the chirality of either 5-deoxystrigol (5DS) or 4-deoxyorobanchol (4DO). The ent forms are the mirror image of the 5DS or 4DO versions. The GR24rac used in most experiments is a racemic mixture of GR245DS and GR24ent-4DO.


There has been some speculation that the destabilization of D14 is an important aspect of SL signaling; however, this seems problematic since it is well known that loss-of-function mutants in D14 are SL insensitive. It is hard to understand how the destabilization of a receptor could be a key aspect of its action if an inactivated receptor confers insensitivity to the hormone rather than constitutive activity.

Structural biology of strigolactone receptors

Many groups have adopted a structural approach to understanding the perception of SLs. The earliest contributions to this body of work engaged in comparative structural studies between AtHTL and AtD1497, and between OsD14 and OsHTL (OsD14L)98. One conclusion that was drawn from these comparative studies was that the binding pocket found in HTL proteins is much smaller than that found in D14s, making it impossible to dock known SLs with the receptor98,99. This was paired with observations that although both receptors have an intact hydrolytic triad, HTL's rate of hydrolysis was dramatically slower than that of D1498. However, these observations pair poorly with observations that HTL is required for the response to exogenously applied SLs in the seed85. Additionally, we were able to observe that purified AtHTL protein was capable of both binding to and hydrolyzing the synthetic SL GR2484, which should not be possible if SLs could not fit into the binding pocket. Since all crystal structures are snapshots of the normally dynamic protein, this may suggest that the existing structures of the HTL apoprotein are not representative of the true structure of the receptor in solution. AtHTL has a number of binding partners that seem to interact with the protein constitutively, based on unpublished yeast two-hybrid data from the McCourt lab. It is possible that crystallizing HTL in the absence of those interactors leads to the generation of a structure that is not representative of reality in the cell. This lack of agreement between structural biology and in vivo and in vitro activity suggests that structure-based design of agonists or antagonists for that receptor would prove challenging.

Although there are no published crystal structures showing the interaction of a SL with HTL, there is some structural detail on the binding of SL with D14. The first structure of a complex of GR24 with D14 showed the hydrolyzed D ring of the SL interacting with D14 in the entrance of the binding pocket86. This suggested that the hydrolysis of SLs by D14 might be an activating step where the D ring is biologically active, although the D ring product has little in vivo biological activity of its own86. Additionally, a recent study showed that the hydrolysis of SL analogs by a D14 homolog is an irreversible reaction where the D ring remains attached to the catalytic histidine residue, which further discredits the importance of a free D ring in SL perception100. A subsequent study was able to generate a crystal structure of unhydrolyzed GR24 in complex with D14 showing the D ring of GR24 entering the binding pocket101. Interestingly this structure showed that there were only very minor changes in the structure of D14 when the GR24 bound structure was compared to the unbound version101. This raises the question of how it is possible that such a small structural change can cause the receptor to interact with MAX2 in a SL dependent manner. Further work within the context of larger complexes of D14, MAX2, and GR24 might have the potential to answer these questions. However, this might also suggest that the binding of SLs to the receptor is insufficient to allow for signaling, and that hydrolysis is a required step in the signaling pathway. This idea that hydrolysis is important is potentially supported by the observation that the non-hydrolysable SL analog carba-GR24 is biologically inactive; however, that analog is also apparently unable to bind to D1492.

Taken together, these data are unable to conclusively answer the question of whether hydrolysis is an activating step or an inactivating step in SL perception.

Striga control technologies

The massive scale of Striga infestations worldwide has led to considerable interest in approaches to controlling Striga outbreaks. Various approaches have been employed to address this problem, with varying levels of success.

Since Striga only germinates when it encounters SLs released by a host’s roots, perhaps the most obvious approach to reducing the impact of Striga infestations is to reduce the amount of SLs that are released. As was described above, a number of mutants in the SL biosynthetic pathway are known. These plants do not make significant quantities of any known SL79, and mutants that produce less SL are more resistant to Striga102. Also, existing rice strains have been shown to have differing levels of SL production, and those with reduced SL production tend to be more resistant to Striga hermonthica103. The major disadvantage of this approach is that rice accessions with reduced SL levels and reduced Striga susceptibility also show less beneficial agronomic traits. Plants with reduced SL content tend to show reduced apical dominance103 and are not necessarily well suited to growth in the areas that are the worst hit by Striga. One more disadvantage of this approach is that for each crop that the farmer wishes to plant it is necessary to generate a SL-defective line in that plant. Most subsistence farmers do not grow a monoculture of any one crop, but rather plant a mixture of many types of plants. Generating SL biosynthesis defective lines for all the plants that a farmer might wish to plant is not a trivial undertaking.

An approach that has been used to greater effect is the application of herbicide to crop seeds prior to planting. This approach is best known in the form of the StrigAway™ system that has been developed by the agribusiness firm BASF. In this system, maize seeds that are resistant to the herbicide imazapyr are coated with the herbicide prior to planting. This kills germinating Striga seeds in the area of the planted seed104. Since the impact of the parasite on the plant is worst when the plant is young, killing the Striga that are likely to attach to the young plant can protect yields. The disadvantage of this approach is similar to that of the low SL emitting lines described above. For each crop that the farmer wishes to plant, it is necessary to generate, or more likely purchase, an imazapyr-resistant strain. Additionally, herbicide-resistant plants are often categorized as GMOs, and as such it is not permitted to plant them in many parts of Africa. These concerns limit the effectiveness of the system in combatting Striga.

One approach that is generalizable to a larger range of locations and crops is the planting of a false crop prior to planting the true crop. In this approach, a plant that is able to grow quickly and release large amounts of SLs into the soil is planted and allowed to grow briefly. This causes Striga seeds in the area of the false crop to germinate and attach to the plant. It is then possible to burn off the false crop and thus kill the Striga attached to it. A true crop can then be planted and presumably would be able to grow with less interference from Striga. Although this approach is affordable and is applicable to any range of crops, its major disadvantage is that it takes a significant amount of time. In many places farmers are reluctant to shorten their growing season in exchange for the uncertain prospect of a reduced intensity of Striga infestation.

Another approach that has been discussed for many years is the idea of “suicidal germination”69. Since Striga seeds have a small amount of nutrient reserves, if germination can be triggered in the absence of the host it would be possible to kill those seeds. Thus their germination would be “suicidal”. The advantage of this approach is that it is applicable to any mixture of crop types, and it is faster than the false crop strategy described above since no plants have to grow before planting the true crop. Early approaches suggested using analogs of naturally occurring SLs69.

However, the low stability of these compounds under alkaline conditions69, together with their difficult and costly synthesis, makes them less than ideal as candidates. Some pioneering studies have shown that when the soil is treated with ethylene, Striga seeds can be induced to germinate105. However, the challenges of treating soil in remote and poor parts of the world with ethylene would be considerable, especially when the logistics required to transport large volumes of pressurized gas to those areas are considered. For the suicidal germination approach to be feasible, low-cost, high-potency, stable SL mimics are needed.

One final approach that could be used to control Striga infestations is the treatment of crop seeds with antagonists for SL receptors in Striga. Using much the same reasoning as for the StrigAway strategy above, if plants can be protected from Striga in their early life, they can become established and be less impacted by the parasite. Much like suicidal germination, this approach would have the benefit that it could be applicable to any crop plant. The main disadvantage of this approach was that, at the time we began this work, there were no known antagonists for SL receptors in Striga. Another concern is that any compound that is able to inhibit the germination of Striga in response to exogenous SLs might also inhibit the germination or growth of the crop plant.

Strigolactone agonists and chemical genetics

As receptors for a class of small-molecule hormones, strigolactone receptors should be good candidates for chemical perturbation. The synthetic SL GR24 is the most commonly used SL in laboratory settings. It is an analog of naturally occurring SLs such as 5-deoxystrigol and has the stereotypical ABC ring system connected to a butenolide D ring via an enol linkage106 that is the target of degradation107. Other analogs such as nijmegen108 and debranone109 lack similarity to the ABC ring, but retain structural similarity to natural SLs at the enol linkage and the butenolide D ring. This suggests that, at least in butenolide molecules, the enol linkage is necessary for in vivo activity. In essence, all existing SL mimics and analogs have been designed based on similarity to known SLs. This left an opening to uncover potentially novel ligands for HTL using an unbiased screening approach.

The use of SL mimics that are not designed could have a range of benefits. One impediment to the usage of synthetic SLs as suicide germination compounds is the poor stability of those compounds under alkaline conditions. This poor stability is due to the spontaneous hydrolysis of the enol linkage between the C and D rings107. SL mimics that lack this enol linkage could be more stable in soil and thus be more workable in the field. An additional advantage of unbiased screening is that it could further define the chemical space that is perceived by HTL and related SL receptors.

After engaging in screening for potential SL mimics there are three possible outcomes. The first outcome is that a generalized collection of molecules from a chemical library does not contain any molecules that are able to act as SL mimics. In this case it could be inferred that SL receptors have exquisite specificity and that any deviation from a canonical SL structure is not tolerated by the receptor. The second outcome is that the compounds that are uncovered as mimics from chemical screening might vary from many elements of a perfectly canonical SL structure but share some significant structural overlap with known SLs. This would seem to be the most likely outcome, given the evidence suggesting that the enol linkage between the C and D rings is necessary for activity. The final and perhaps most interesting case would be where the SL mimics that are uncovered are not structurally related to SLs whatsoever, but are still able to signal through the SL perception system. In such a case these structurally diverse compounds could be used in conjunction with biochemical and structural interrogation to further understand the mechanisms of SL signaling independently from endogenous SLs or SL mimics.

Although this last case may sound unlikely, there are many examples where chemical screening was able to identify ligands for a receptor that are not structurally related to the endogenous ligand. In plant hormone signaling one noted example is pyrabactin37, which was shown to be a synthetic ABA mimic14 despite lacking structural similarity to ABA. Examples abound in animal systems as well, where multiple small molecules have been found through screening that antagonize receptors whose natural ligands are peptides110.


Pyrabactin also exemplifies another element of the usefulness of agonists uncovered by chemical screening. Unlike ABA, pyrabactin is recognized only by a subset of ABA receptors14. This made it possible for Park et al. to uncover the ABA receptors in a genetic screen for pyrabactin insensitivity. The discovery of small molecules that are able to mimic SLs may be useful in dissecting the roles of SL receptors not only in Arabidopsis but also in other species. This could be especially useful in species such as Striga, which has a large number of HTL proteins and is genetically intractable, making characterization of its receptors using traditional genetic approaches more challenging.

Strigolactone antagonists and chemical genetics

In contrast to the relatively large number of agonists for the SL receptor, no small molecule antagonists have yet been described for any SL receptor. This is true despite attempts to make inactive GR24 analogs. For example, in carba-GR24 the oxygen atom in the enol ether bridge between rings C and D is replaced by a carbon, rendering it non-hydrolyzable. It was found that carba-GR24 was not a competitive inhibitor of ShHTL hydrolysis, implying that it cannot bind to those proteins to block their perception of a ligand92. Antagonists for the SL receptors could have a range of useful effects.

Perhaps the most obvious use for SL signaling antagonists would be as chemical control agents for Striga hermonthica infestations. As opposed to agonists, which could be used to stimulate Striga germination in the absence of a host, antagonists could be used strategically to inhibit Striga germination in the presence of the host. This could be done cost effectively by treating crop seeds with a seed coating agent that included a SL signaling antagonist.


SL signaling antagonists would also be useful academically as chemical genetic probes. As outlined above, antagonists can be used in conjunction with genetic screening approaches to uncover chemical genetic relationships that elucidate signal transduction pathways. This chemical genetic approach could be particularly useful in the case of SL signaling. As was described earlier, there is no well-established mechanistic link between the D53 homologs that play a role in the seed and the transcriptional reprogramming of the organism. Characterizing mutants with SL signaling defects has not uncovered transcription factors or other signaling components downstream of D53, and so it seems likely that alternative approaches are required.

One such approach would be to screen for mutants that are able to suppress the effect of a SL signaling antagonist on the plant. If, for example, a SL signaling antagonist that targets HTL was uncovered, it would be possible to screen for mutants that are able to suppress the effect of the chemical. This could be manifested as the mutant germinating in the presence of the compound or the mutant having a short hypocotyl in the presence of the antagonist. This approach is apparently similar to the suppressor screen that was conducted to uncover SMAX89; however, it has some advantages, including its amenability to overexpression-based screening approaches.

Identifying novel SL signaling genes through overexpression screening

One of the primary advantages of Arabidopsis relative to other multicellular, developmentally complex model organisms is that it is relatively easy to transform Arabidopsis using Agrobacterium-mediated gene transfer. This ease of transformation has allowed the generation of an extremely large number of T-DNA insertion lines that are exploited as resources for those seeking to do reverse genetics using loss-of-function mutations in particular genes111. However, this has also made it possible to generate large libraries of independent constitutive overexpression lines that can be screened to find mutants with a phenotype of interest112. Screening such an overexpression library has both advantages and disadvantages over screening a conventional EMS-mutagenized population.

One advantage is the ease with which the overexpressed gene can be identified. Because the sequence flanking the transgene is a part of the T-DNA cassette, it is the same in all lines. This allows primers complementary to that region to be used to amplify the overexpressed gene by PCR, using the genomic DNA of the individual as template. Thus there is no need to generate a mapping population or engage in rough mapping as one must with a traditional genetic screen. An additional advantage of this approach is derived from the fact that overexpression lines tend to be gain-of-function mutants for the gene that is overexpressed. This is different from traditional genetic screening, where most hits in a screen will be loss-of-function mutants. This also means that screening using an overexpression approach will allow the experimenter to screen in a different genetic direction, which can be useful when traditional genetic screens have not uncovered the desired types of mutants. Finally, the generation of independent overexpression alleles is straightforward compared to isolating additional loss-of-function alleles, although this may be changing with increasing adoption of CRISPR113.
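The logic of the first advantage can be illustrated with a minimal Python sketch. It assumes that the amplified product has been sequenced and that the constant cassette sequences on either side of the transgene are known; the function name extract_insert and all of the sequences shown are hypothetical placeholders chosen for illustration, not sequences from any real T-DNA cassette or library.

```python
def extract_insert(read, left_flank, right_flank):
    """Return the sequence lying between two constant flanking sequences.

    Returns None if either flank cannot be found in the read.
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    # Search for the right flank only downstream of the left flank.
    end = read.find(right_flank, start)
    if end == -1:
        return None
    return read[start:end]


# Hypothetical placeholder sequences (assumptions, not real cassette sequence).
LEFT_FLANK = "ACGTACGT"
RIGHT_FLANK = "TTGACCAA"
transgene = "ATGGCTTCTAAG"
sequenced_product = "GG" + LEFT_FLANK + transgene + RIGHT_FLANK + "CC"

recovered = extract_insert(sequenced_product, LEFT_FLANK, RIGHT_FLANK)
print(recovered)
```

Because the flanks are identical in every line, the same two sequences can in principle be reused for every hit; the recovered sequence would then be compared against the genome annotation to assign a gene ID.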

However, an overexpression-based screening system is not without problems. The first problem is that not all 35S-derived lines show strong expression. If the overexpression of a transgene is too weak, a gain-of-function phenotype might not be apparent. Another problem comes from co-suppression of transgene expression over generations. Even if a transgene can generate a phenotype, subsequent generations may not have the same phenotype. Since the seed that is being screened in a typical overexpression collection is in generation T3, it is likely that some lines will show co-suppression. It is also possible that for many genes no amount of overexpression will lead to a significant accumulation of the gene product. This could be the case if there is a strong negative regulator of protein accumulation in the pathway of interest. Finally, not all genes are equally easy to introduce into Arabidopsis using Agrobacterium. Very long genes may not be possible to add, which would skew the pool of mutants that are screened. With these limitations noted, we sought to use this approach to uncover new SL signaling genes.

In uncovering overexpression mutants that are related to SL signaling there are multiple ways of going about the screen. The first question is whether it is better to look for overexpression mutants that show a phenotype that mimics a lack of SL signaling, or a constitutive SL response phenotype. We have already uncovered one gene that has a SL-related phenotype when overexpressed. COP1, when overexpressed, generates a long hypocotyl114, which mimics the hypocotyl phenotype seen in a loss-of-function mutation in HTL. Since we have already uncovered a gene that acts in this direction, I decided that I would be more interested in genes that generate a constitutive SL signaling phenotype when overexpressed. An advantage of this approach is that it would be possible to do easy epistatic analysis by overexpressing a candidate gene in an htl-3 background to see whether it is able to rescue a background that lacks SL perception in the seed.

The most obvious approach when looking for constitutive SL response mutants would be to look for mutants that either have constitutively short hypocotyls or germinate more readily while WT is still in primary dormancy. The disadvantage of such an approach is that the genes that are uncovered could be very genetically distant from SL signaling. The ideal screen would be to screen an overexpression library in an htl-3 background, looking for overexpression lines that are able to suppress the effect of htl-3 on hypocotyl length or germination. Generating such a population would, however, be time consuming and resource intensive. A simpler approach would be to leverage a SL signaling antagonist that inactivates HTL to find overexpression lines that are able to suppress the effect of the antagonist on the plant. This approach would be analogous to multicopy suppression profiling in yeast. However, rather than growing yeast under competitive conditions to enrich for strains that can resist the effects of some drug, we would select seeds carrying a transgene that is capable of conferring insensitivity to an antagonist at the level of germination.

In such a screen a phenotype such as germination or hypocotyl length would be evaluated for the members of the overexpression library in the presence of an antagonist. After reconfirming the antagonist resistance, the gene ID for each resistant line would be identified. This would yield a list of gene IDs that are potential SL signaling genes. For genes that are potentially interesting, such as transcription factors, kinases, or other signaling-type genes, it is then possible to make independent overexpression lines in both WT and htl-3 backgrounds. This would ultimately yield two important pieces of information. First, by isolating additional overexpression lines that are in a WT background it would be possible to confirm that the phenotype is due to the overexpression of the gene, as opposed to the disruption of a gene by the T-DNA insertion. The second piece of information is generated by the overexpression in the htl-3 background. If multiple independent overexpression lines are able to suppress some or all of the htl-3 associated phenotypes, then that gene has some relationship to SL signaling.

Conclusions and future directions

In this literature review we have described the nature of chemical biology as a discipline and discussed various approaches that are used within that field. In particular, we have discussed ways that chemical genetics has been used in the past to make discoveries that would have been challenging without it. We also argue that chemical genetics has great potential within the world of plant hormone signaling to characterize more molecular details of these important classes of small molecules.

We have also reviewed known mechanisms of plant hormone perception and the signaling pathways that lead to responses to those hormones. This gives context in which to understand the young and rapidly evolving field of SL signaling. We have discussed what is known about the perception of this important plant hormone and what gaps exist in our understanding of it. Furthermore, we have also described its relationship to the agriculturally devastating parasitic plant Striga hermonthica. Here we argue that if chemical genetics and chemical biology hold promise in illuminating plant hormone signaling in general, they hold even greater promise in the area of SLs, in part due to the potential agricultural usefulness of agonists and antagonists for strigolactone receptors in Striga hermonthica.


Thesis overview

The primary goals of the thesis projects described here were to identify agonists and antagonists for the strigolactone receptor HTL in Arabidopsis and in Striga hermonthica (Figure 1.6). As is described in Chapter 2, we first used a combination of target-based and phenotype-based chemical screening to identify compounds that were able to mimic SLs. These compounds were further analyzed and were found to act directly through HTL in Arabidopsis and Striga to varying degrees. The relative activities of these compounds in htl-3 Arabidopsis seeds expressing HTLs from Striga showed that ShHTL7, which showed the highest sensitivity to SLs, was responsive to the broadest range of the SL mimics that were uncovered. In Chapter 3 we used a combination of phenotypic chemical screens to identify Soporidine (SOP), an antagonist that acts specifically and directly through Arabidopsis HTL and is able to work on the sensitive Striga HTL ShHTL7. In Chapter 4 we were able to use SOP as a tool to screen for overexpression lines that show a phenotype that is reminiscent of a constitutive SL response in the seed. Using genetic approaches, we were able to show that the overexpression of the splicing factor U2AF35B was able to partially suppress the phenotypes displayed by the HTL loss-of-function allele htl-3. This shows a potential role for this protein as a downstream element of SL signaling. We then explored the changes in gene isoform abundance in htl-3 and SOP-treated seedlings using an RNA sequencing approach.


Figure 1.6 Overview of the action of agonists and antagonists for HTL

The effect of an agonist (AG) for HTL is shown. An agonist should be able to stimulate the interaction of HTL with MAX2 and SPA1, promote germination, and reduce hypocotyl length. A SL antagonist (AT) should be able to reduce germination, increase hypocotyl length, and inhibit the interaction of HTL with SPA1 and MAX2.


Chapter 2: High throughput screening uncovers agonists for SL receptors in Arabidopsis and Striga hermonthica

Modified from:

Detection of parasitic plant suicide germination compounds using a high-throughput Arabidopsis HTL/KAI2 strigolactone perception system

Duncan Holbrook-Smith*, Shigeo Toh*, Michael E. Stokes, Yuichiro Tsuchiya, and Peter McCourt (2014)

Chemistry & Biology 21, 988–998

*both authors contributed equally to this study

and

Structure-function analysis identifies highly sensitive strigolactone receptors in Striga

Shigeo Toh, Duncan Holbrook-Smith, Peter J. Stogios, Olena Onopriyenko, Shelley Lumba, Yuichiro Tsuchiya, Alexei Savchenko, and Peter McCourt (2015)

Science 350, 203–208


Abstract

In the Arabidopsis seed, strigolactones (SLs) are perceived by the α/β hydrolase HTL. This is also true in the parasitic plant Striga hermonthica, although the parasite has a massively expanded array of HTL proteins (ShHTLs). We used a combination of yeast two-hybrid and in planta chemical screening to identify small molecules that are agonists for Arabidopsis HTL. These agonists are useful for two reasons. Firstly, these agonists were able to stimulate Striga hermonthica germination, despite lacking structural similarity to known SLs. This is desirable because the addition of SL mimics to the soil has long been proposed as a way of generating “suicidal germination” in the obligate parasites that lie dormant in the soil. However, natural SLs are quickly degraded in the soil. Since the new HTL agonists lack structural similarity to known natural or synthetic SLs, they could be more stable in the soil. The second reason that the compounds were useful is that we were able to use them to probe the Striga HTLs and understand their activities. When ShHTLs were introduced into Arabidopsis backgrounds lacking the endogenous HTL, the transgenics expressing those ShHTLs were all able to germinate in response to the synthetic SL GR24. However, when the SL mimics that we uncovered were added, only plants expressing one of four HTLs were able to germinate. Since these mimics were able to cause germination in Striga seeds, this showed that this subgroup was important. Further analysis showed that one of these particular ShHTLs, ShHTL7, had very high sensitivity to exogenous SLs. Several of the mimics were shown to bind directly to ShHTL7 in addition to AtHTL.


Introduction

Agonists for SL receptors, and HTL in particular, are useful molecules for two main reasons. The first is that agonists can be used as chemical probes11 to dissect signaling pathways. In the past, compounds like pyrabactin that are selective agonists for a limited array of hormone receptors have been instrumental in understanding the signaling pathways controlled by small-molecule hormones14. The other main use of agonists for HTL is as leads for the development of compounds for use in Striga control measures69. At the outset of this project, all existing agonists for HTL were structural analogs of natural SLs. Rather than pursuing a synthetic chemical approach where we would generate yet more structural analogs of known SLs, we decided to use a chemical screening approach. We used this approach for four main reasons. The first reason was that we were interested in exploring the chemical space that HTL was able to perceive. Natural SLs have varied structures, particularly within the A and B rings. We wanted to know what kinds of compounds would be able to act on HTL, and by screening for compounds with SL-like activity we hoped to explore that chemical space in a relatively unbiased way. Secondly, existing agonists that can cause Striga germination seemed to act on both the SL receptor important for response to SLs in germination (HTL) and the receptor that is important in the suppression of axillary growth (D14). By screening for compounds that acted on HTL we hoped that we might uncover agonists specific for HTL over D14 that were active in Striga. Thirdly, it is known that SLs are hydrolyzed by their receptors. We wanted to know whether such hydrolysis was absolutely required by receptors for biological activity. All known analogs of SLs that are active maintain an enol bridge that is the target for hydrolysis115, so synthesizing additional analogs of those natural SLs seemed unlikely to generate non-hydrolyzable but biologically active agonists. Finally, known SL mimics at that time were unstable in the soil, and we hoped that by uncovering compounds with unrelated structures we would be able to find compounds that would be more stable in the soil and thus more effective compounds with which to pursue “suicide germination” crop control schemes.

We were next faced with the decision of whether to adopt a target-based or phenotype-based screening approach. One option for performing a target-based screen was to take advantage of the SL-dependent interaction between HTL and the F-box protein MAX2. Since the HTL ligand GR24 is able to stimulate the interaction of HTL with MAX2 in a yeast two-hybrid assay84, other compounds that are able to stimulate that interaction could also be agonists for the receptor. Thus, we set out to conduct a chemical screen for compounds that were able to stimulate the interaction of HTL with MAX2 based on the β-galactosidase activity readout from the yeast two-hybrid system. Additionally, unpublished data from the McCourt laboratory suggested that the light signaling component SPA1 could also interact with HTL in a SL-dependent manner (Figure 2.1a), and so we chose to include it in our screen as well. We chose to screen the YeastActive chemical library116. This library was constructed by pre-screening a larger set of libraries at a high concentration for the ability to slow the growth of yeast. Reduced fitness of yeast treated with the compounds was used as a marker of broad bioactivity. When this curated collection of 4182 compounds was tested on other organisms it was found to have a higher rate of bioactive compounds than the libraries it was derived from116. This indicates that this library could deliver a higher rate of hits per compound than other less specialized libraries.

When we performed the screen, we were able to uncover a collection of small molecules with diverse chemical structures that were in the top 1% of compounds with respect to relative activity for stimulating the interaction between HTL and MAX2. We repeated the screen, looking for compounds that stimulated the interaction between HTL and SPA1. Once again the collection of compounds that were in the top 1% of the screen with respect to stimulating the interaction was chemically diverse. With these data in hand we were able to ask whether the activity of a compound in one screen was predictive of its activity in the other. We found that the correlation between activity in the HTL x MAX2 and HTL x SPA1 screens was quite poor. Since both screens were using the same receptor, this result was rather unexpected. It was possible that compounds that were acting differentially in either screen were targeting MAX2 or SPA1 rather than HTL, but this seemed unlikely since neither of those proteins had any reported activity as small molecule binding proteins. To explore this further, we decided to introduce a third chemical screen.
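As an illustration of the comparison described above, the following Python sketch shows one way that the top 1% of compounds could be called as hits in each yeast two-hybrid screen, and how agreement between the two screens could be summarized with a Pearson correlation coefficient. The compound identifiers and activity values are simulated placeholders, and the function names (top_fraction, pearson) are hypothetical; this is a sketch under those assumptions and not the analysis code used in this work.

```python
import random
import statistics


def top_fraction(scores, fraction=0.01):
    """Return the IDs of the highest-scoring compounds (top `fraction` of the library)."""
    n_hits = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_hits])


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Simulated relative activities (assumption: placeholders, not real screening data).
rng = random.Random(0)
compound_ids = [f"cpd{i:04d}" for i in range(4182)]
max2_scores = {c: rng.gauss(1.0, 0.2) for c in compound_ids}
spa1_scores = {c: rng.gauss(1.0, 0.2) for c in compound_ids}

max2_hits = top_fraction(max2_scores)
spa1_hits = top_fraction(spa1_scores)
r = pearson([max2_scores[c] for c in compound_ids],
            [spa1_scores[c] for c in compound_ids])

print(f"{len(max2_hits)} HTL x MAX2 hits, {len(spa1_hits)} HTL x SPA1 hits")
print(f"compounds that are hits in both screens: {len(max2_hits & spa1_hits)}")
print(f"Pearson r across the library: {r:.3f}")
```

A rank-based measure such as the Spearman correlation could be substituted if the activity values were strongly skewed; the choice of Pearson here is only for simplicity.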

In the third independent screen we adopted a phenotypic approach. We reasoned that this would allow us to assess, in a relatively unbiased way, the relationship between activity in the yeast screens and activity in planta. Perhaps the most obvious screen that could be conducted for compounds with SL-like activity would be to assay their ability to stimulate Striga hermonthica germination. This approach is problematic for two main reasons. First, Striga hermonthica seeds show huge variability in germinability. This makes normalization between wells and assay plates in a large chemical screen extremely difficult. Secondly, we wanted to use a screening approach that could be widely emulated and expanded by other groups. Striga’s classification as a noxious weed makes its import to many regions illegal, and thus a non-Striga based approach would likely serve the community best as a template for chemical screening.

Figure 2.1 Schematic diagram of chemical screens for SL mimics.

A) Left, the effect of a DMSO control as well as the synthetic SL GR24 on the yeast two-hybrid interaction between HTL and MAX2 or SPA1 is shown. Right, the effect of three hypothetical screening compounds on the same yeast two-hybrid interactions is shown. Active compounds would be called as “hits” in the proposed screen. B) Left, the effect of DMSO and the synthetic SL GR24 on the hypocotyl length of 35S::GUS:COP1 seedlings is shown. Right, the effect of three hypothetical screening compounds on the hypocotyl lengths of 35S::GUS:COP1 seedlings is shown.

Since the exogenous application of SLs to Arabidopsis seeds is able to stimulate germination85,117, one possibility was to screen for the ability of compounds to cause Arabidopsis seeds to germinate. The main drawback of such an approach was that the relationship between compounds that are able to stimulate Arabidopsis germination and compounds that are able to stimulate Striga germination is somewhat complicated. The smoke-derived compound karrikin (KAR) is a stimulant of germination that acts through HTL84,85. However, KAR is not able to stimulate parasitic plant germination118, which signals that not all compounds that signal through HTL in the seed are able to stimulate Striga germination. This suggested to us that it might be preferable to use a phenotypic system that takes advantage of the action of both SL receptors in Arabidopsis.

The exogenous application of SLs is able to inhibit hypocotyl length in Arabidopsis seedlings and it does so by signaling through both HTL and D1485, and thus we would expect that SL-like molecules would also be able to inhibit hypocotyl elongation.

Hypocotyl elongation is a consistent and easy to score phenotype, and Arabidopsis is used for research everywhere. For these reasons we decided to screen for small molecules that were able to inhibit hypocotyl elongation in a sensitized background (Figure 2.1b). We were then able to compare which compounds were hits in either of the target-based screens with those that were hits in the phenotypic screen. This comparison showed that when only hits in the phenotypic screen were considered, the relationship between activity in the two yeast screens was much stronger. We decided to work further on compounds that were hits in at least one of the target-based screens and were also a hit in the phenotypic screen. Seven compounds fit this description.

Further work on these compounds showed that most of them were able to bind directly to AtHTL, and that some of them signal specifically through AtHTL in the seed and the hypocotyl. Additionally, all of the lead compounds were able to stimulate Striga hermonthica germination through a subclade of ShHTLs that included ShHTL7, the ShHTL that seems to be particularly important for the high sensitivity of Striga to exogenous SLs. Taken together these results showed that the approach taken to identifying agonists for HTL was successful in finding lead “suicidal germination” compounds, and was also successful in finding chemical probes to dissect the activity of the HTL proteins found in Striga hermonthica.

Results and discussion

Chemical screening identifies stimulants of the HTL x MAX2 yeast two-hybrid interaction

We tested the effect of all 4182 compounds in the YeastActive chemical library on the β-galactosidase output of the HTL x MAX2 yeast two-hybrid assay. The relative amount of β-galactosidase output was assessed by measuring the blue colour of yeast colonies treated with each compound using image analysis software. The average relative intensity of the blue colour generated under every compound treatment is shown in Figure 2.2a. The compounds that were in the top percentile based on relative intensity were called as hits, on the reasoning that compounds that stimulated the HTL x MAX2 yeast two-hybrid interaction most robustly should be the best agonist candidates. These 42 compounds (Table I) were structurally diverse, as shown in Figure 2.2b, where the 2D structural similarity between hits is presented as a dendrogram.
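The hit-calling rule described above, ranking compounds by average relative intensity and keeping the top percentile, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the dictionary layout, and the rounding-up of the hit count are assumptions, and the toy intensities are made up rather than taken from the screen.

```python
import math


def call_top_percentile_hits(intensities, fraction=0.01):
    """Return the compound IDs whose relative intensity is in the top `fraction`.

    `intensities` maps a compound ID to its average relative intensity.
    This layout is hypothetical; the original analysis pipeline is not described.
    """
    n_hits = math.ceil(len(intensities) * fraction)  # assumption: round up
    ranked = sorted(intensities, key=intensities.get, reverse=True)
    return ranked[:n_hits]


# Toy library with the same size as the screen (4182 compounds), fake values.
toy_library = {f"cmpd_{i}": 1.0 + (i % 97) / 100 for i in range(4182)}
hits = call_top_percentile_hits(toy_library)
print(len(hits))
```

With a 4182-compound library, a 1% cutoff rounded up gives 42 compounds, which matches the number of hits reported for the screen.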

Chemical screen identifies stimulants of the HTL x SPA1 yeast two-hybrid interaction

In order to further explore the chemical space that HTL is able to perceive, we decided to repeat the screen in its entirety with another binding partner for HTL. Unpublished data generated in the McCourt laboratory by Dr. Shigeo Toh shows that HTL also interacts with the light signaling component SPA1 in an SL-dependent manner. The screen for HTL x SPA1 interaction stimulants was conducted in the same way as the MAX2 screen described above. The compounds that were in the top 1% of this screen are shown in Table II. The effect of every compound in the library on the relative intensity of β-galactosidase output is shown in Figure 2.3a. As with the MAX2 screening hits, the diversity of the chemical structures of compounds that were hits was quite high (Figure 2.3b).


Figure 2.2 – Structurally diverse compounds can stimulate the yeast two-hybrid interaction of HTL with MAX2.

A) The relative intensity of the blue colour of yeast colonies treated with each of the 4182 compounds in the YeastActive chemical library is shown. Values are the average relative intensities found from two independent screening plates. Compounds that caused a relative intensity value in the top percentile are represented as blue dots; those in the bottom 99% are represented as black dots. B) The degree of structural dissimilarity between each of the 42 compounds that were in the top 1% of the HTL x MAX2 screen is shown. Structural dissimilarity is based on pairwise 2D Tanimoto similarity scores. Compounds with scores of less than 0.33 are statistically significantly similar. This is indicated by nodes below the dotted line. Numbers are PubChem IDs.

Compound activity in HTL x MAX2 is not very predictive of HTL x SPA1 activity

When the relative activity of each compound in both of the chemical screens is compared (Figure 2.4), the Pearson correlation coefficient is only 0.24. Although this indicates that there is a positive correlation between the activity of a compound in one screen and the other, this is a very weak relationship. However, the number of compounds that were found in the top 1% of both screens was 8. If activity in the two screens were independent, the chance of a compound being in the top 1% of both screens would be 0.01 × 0.01 = 0.0001, or 0.01%. Therefore, for the whole library we would expect to find 4182 × 0.0001, or 0.4182 compounds. This means that there are more compounds that are hits in both screens than we would expect naively. Nonetheless, the lack of a strong relationship between activity in the yeast two-hybrid screens could have multiple meanings. One interpretation is that the system is very noisy and that, if the effect of each compound on HTL could be measured with greater precision, the relationship would be stronger. The other is that the difference in the response of the different pairings to each compound is a real property of the system. This could be an intriguing result since it might allow for the development of chemical probes that are specific for one interaction partner compared to another and that could thus be used to explore the effect of each individual interaction on SL signaling. In order to convincingly address


Figure 2.3 – Structurally diverse compounds can stimulate the yeast two-hybrid interaction of HTL with SPA1.

A) The relative intensity of the blue colour of yeast colonies treated with each of the 4182 compounds in the YeastActive chemical library is shown. Values are the average relative intensities found from two independent screening plates. Compounds that caused a relative intensity value in the top percentile are represented as purple dots; those in the bottom 99% are represented as black dots. B) The degree of structural dissimilarity between each of the 42 compounds that were in the top 1% of the HTL x SPA1 screen is shown. Structural dissimilarity is based on pairwise 2D Tanimoto similarity scores. Compounds with scores of less than 0.33 are statistically significantly similar. This is indicated by nodes below the dotted line. Numbers are PubChem IDs.

this question, we needed alternative ways of showing the SL-dependent interaction of HTL with MAX2 and SPA1 in planta. Other McCourt laboratory members such as Drs. Michael Stokes and Shigeo Toh dedicated significant time to finding independent methods of measuring the HTL x MAX2 and HTL x SPA1 interactions but were unable to do so, and so this particular line of inquiry was de-prioritized. However, in the event that a robust in planta approach for measuring HTL interactors is developed, this area of inquiry could be very fruitful.

In planta chemical screen for SL mimics

The variability in structures of potential SL agonists, as well as the variability between the MAX2 and SPA1 based screens, led us to ask how we should prioritize hits for further study. One possibility was to focus on compounds that were hits in both yeast-based screens. However, it was possible that compounds that were hits in both yeast-based screens could be artifacts of the yeast system. Thus, we became interested in the question of how the yeast two-hybrid data would compare to an in planta chemical screen. To address this question systematically, we decided to conduct an independent chemical screen for compounds with SL-like activity in planta using the


Figure 2.4 – Relationship between compound activity in HTL x MAX2 and SPA1 yeast two-hybrid screens.

The relative intensity of the blue colour of yeast colonies treated with each of the 4182 compounds in the YeastActive chemical library is shown for both HTL x MAX2 and HTL x SPA1. Compounds that caused a relative intensity value in the top percentile are represented as blue circles for HTL x MAX2, purple circles for HTL x SPA1, and filled magenta dots if they were in the top 1% for both. Compounds that were in the bottom 99% for both screens are coloured black. The Pearson correlation coefficient for this data set is 0.24.


Figure 2.5 – Structural similarity between hits in hypocotyl elongation based screen.

A) The degree of structural relatedness between all hits in the screen for compounds that were able to inhibit hypocotyl length is shown in dendrogram format. The node position is based on the 2D Tanimoto dissimilarity scores. B) Average and median 2D Tanimoto dissimilarity scores are shown for each indicated set of compounds. Compounds with scores of less than 0.33 are statistically significantly similar. This is indicated by nodes below the dotted line. Numbers are PubChem IDs.

same chemical library described above.

We opted to use a hypocotyl-based chemical screening approach. As was described above, the exogenous addition of SLs to Arabidopsis causes a reduction in hypocotyl length. However, under normal growth conditions WT Columbia (Col) ecotype seedlings have a quite short hypocotyl. This makes it challenging to screen for compounds that inhibit hypocotyl length. To alleviate this issue, we chose to use a transgenic line that overexpresses COP1. Plants that carry the 35S::GUS:COP1 transgene have much longer hypocotyls than WT119, and their hypocotyls are very sensitive to exogenous SLs114.

When the YeastActive chemical library was screened for the ability to shorten the hypocotyls of the 35S::GUS:COP1 plants, we found a total of 44 hits (Table III). As with the yeast screens, these compounds were structurally diverse (Figure 2.5a). With three independent chemical screens conducted, each with apparently diverse compounds as hits, we wanted to compare how diverse these collections of hits were relative to each other as well as relative to randomly chosen compounds. When the average and median values for 2D dissimilarity based on Tanimoto similarity scores were compared, compounds that were hits in the HTL x SPA1 chemical screen showed the lowest dissimilarity of the three screens. All three collections of hits were less dissimilar from each other on average than compounds randomly selected from the PubChem database were from each other. However, this is perhaps not surprising since the YeastActive chemical library was selected to only contain compounds with “drug-like” properties with respect to size, hydrophilicity, and other properties116. To more accurately assess whether the collections of hits were more similar to each other than would be expected by chance alone, we randomly selected compounds from the YeastActive chemical library and computed the average and median dissimilarities between these compounds. We found that the HTL x MAX2 and hypocotyl screening hits were more dissimilar from each other than these randomly chosen compounds, and that although the HTL x SPA1 compounds were on average less dissimilar, the median dissimilarity was essentially the same as what was observed in the randomly chosen compounds from the library. These results suggest that the hits found in the three screens are not dramatically enriched for compounds of any particular type.
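The comparison of average and median 2D dissimilarities between sets of compounds can be illustrated with a small Python sketch. It assumes that each compound is represented by the set of "on" bits in a 2D structural fingerprint and that dissimilarity is taken as one minus the Tanimoto similarity; the fingerprint type and software actually used are not stated here, and the toy fingerprints below are invented for illustration.

```python
from itertools import combinations
from statistics import mean, median


def tanimoto_similarity(bits_a, bits_b):
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 1.0  # convention used here: two empty fingerprints are identical
    return len(bits_a & bits_b) / union


def pairwise_dissimilarity_summary(fingerprints):
    """Mean and median of (1 - Tanimoto similarity) over all compound pairs."""
    scores = [1.0 - tanimoto_similarity(a, b)
              for a, b in combinations(fingerprints, 2)]
    return mean(scores), median(scores)


# Invented fingerprints: two related compounds and one unrelated compound.
toy_fingerprints = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
avg_dissim, median_dissim = pairwise_dissimilarity_summary(toy_fingerprints)
print(avg_dissim, median_dissim)
```

Running the same summary on a set of hits and on a random sample drawn from the same library gives the kind of side-by-side comparison described in the text.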

Comparison of hits from in planta and yeast-based screens

In Figure 2.6 the relative intensity of β-galactosidase activity for every compound in HTL x MAX2 and HTL x SPA1 is plotted, and compounds that were hits in the hypocotyl-based screen are marked in red. The average HTL x MAX2 and HTL x SPA1 relative intensities of a hit in the hypocotyl screen were 1.82 and 1.39, compared to 1.04 and 1.07 for all other HTL x MAX2 and HTL x SPA1 treatments respectively. This shows that compounds that are hypocotyl shorteners stimulate the yeast two-hybrid interactions more than compounds that are inactive in the hypocotyl. When the Pearson correlation coefficient for only hits in the hypocotyl screen is calculated its value is 0.52. This value is higher than for the data set as a whole, where the Pearson correlation coefficient was 0.24. This implies that the relationship between the intensity of the HTL x MAX2 and HTL x SPA1 interactions becomes stronger if only compounds that have some SL-like activity in planta are considered. This might suggest that many of the compounds that have dramatically different activity in different pairings might be the result of either noise in the data or some more systematic artifact.
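The comparison made above, a Pearson correlation coefficient computed over the whole library versus one computed only over hypocotyl hits, can be sketched as follows. The records are invented toy values and the tuple layout is an assumption made purely for illustration; none of the numbers come from the screens.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented records: (HTL x MAX2 intensity, HTL x SPA1 intensity, hypocotyl hit?)
toy_records = [
    (1.0, 1.1, False), (1.2, 0.9, False), (0.9, 1.3, False), (1.1, 1.0, False),
    (1.8, 1.5, True), (2.2, 1.7, True), (1.5, 1.2, True),
]

r_all = pearson_r([r[0] for r in toy_records], [r[1] for r in toy_records])
hit_records = [r for r in toy_records if r[2]]
r_hits = pearson_r([r[0] for r in hit_records], [r[1] for r in hit_records])
print(r_all, r_hits)
```

Restricting the calculation to a phenotypically defined subset is a simple filter applied before the correlation, so the two coefficients are directly comparable.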

It is also worth noting that many of the hypocotyl shortening hits were not able to stimulate the interaction of HTL with MAX2 or SPA1. This could be for a range of reasons. It is possible that these compounds are targeting HTL in the hypocotyl of Arabidopsis seedlings, but are modified by the yeast and thus are not able to interact with the receptor. Alternatively, it is possible that in plants the compounds are altered into a form where they become able to activate the receptor. It is also very possible that many of the hits in the hypocotyl-based screen are simply compounds that are toxic to the plant, and thus they are able to reduce their stature. Such non-specific toxic compounds would not be expected to enhance the intensity of the β-galactosidase activity in the yeast.

Prioritization of compounds for further study

The overlap between hypocotyl and yeast-based screens (Figure 2.7a) was larger than would be expected by chance alone. Assuming independence, the expected number of compounds that would appear in the top 1% of one of the yeast screens and would also be one of the 44 hypocotyl hits is 44 × 0.01 = 0.44. However, 5 and 3 hits were shared between the hypocotyl data set and the MAX2 and SPA1 data sets, respectively. Again, this suggests that there is a positive relationship between a compound’s activity in yeast and its activity in the hypocotyl.
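The expected overlaps quoted in this chapter follow from multiplying hit frequencies under the assumption that the screens are independent. The short Python sketch below reproduces that arithmetic and, as an addition that is not part of the original analysis, estimates how unlikely the observed overlaps would be under a simple binomial model. The function names are hypothetical, and the binomial model is only an approximation of sampling without replacement.

```python
from math import comb

LIBRARY_SIZE = 4182     # compounds in the library
TOP_FRACTION = 0.01     # top 1% called as hits in each yeast screen
HYPOCOTYL_HITS = 44     # hits in the hypocotyl screen


def expected_overlap(n_trials, p_hit):
    """Expected number of shared hits if the screens are independent."""
    return n_trials * p_hit


def prob_at_least(k_min, n_trials, p_hit):
    """Binomial probability of seeing k_min or more shared hits by chance."""
    lower_tail = sum(
        comb(n_trials, k) * p_hit ** k * (1 - p_hit) ** (n_trials - k)
        for k in range(k_min)
    )
    return 1.0 - lower_tail


# Top 1% of both yeast screens: 4182 x (0.01 x 0.01)
both_yeast = expected_overlap(LIBRARY_SIZE, TOP_FRACTION * TOP_FRACTION)
# One yeast screen and the hypocotyl screen: 44 x 0.01
yeast_and_hypocotyl = expected_overlap(HYPOCOTYL_HITS, TOP_FRACTION)

p_8_shared = prob_at_least(8, LIBRARY_SIZE, TOP_FRACTION ** 2)
p_5_shared = prob_at_least(5, HYPOCOTYL_HITS, TOP_FRACTION)
p_3_shared = prob_at_least(3, HYPOCOTYL_HITS, TOP_FRACTION)
print(round(both_yeast, 4), round(yeast_and_hypocotyl, 2))
```

The expectations come out at roughly 0.42 and 0.44 compounds, so observed overlaps of 8, 5, and 3 compounds are all well above what independence would predict.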


Figure 2.6 – Relationship between compound activity in HTL x MAX2 and SPA1 yeast two-hybrid screens for 35S::GUS:COP1 hypocotyl shorteners.

The relative intensity of the blue colour of yeast colonies treated with each of the 4182 compounds in the YeastActive chemical library is shown for both HTL x MAX2 and HTL x SPA1. Compounds that were able to shorten the hypocotyls of 35S::GUS:COP1 seedlings in two independent replicates are coloured red. The Pearson correlation coefficient for those compounds is 0.51 compared to the value of 0.24 for the data set as a whole. The average HTL x MAX2 and HTL x SPA1 relative intensities of a hit in the hypocotyl screen were 1.82 and 1.39, compared to 1.04 and 1.07 for all other HTL x MAX2 and HTL x SPA1 treatments respectively.


Because we were interested in compounds that acted as agonists for HTL, we decided to focus our attention on compounds that were a hit in either of the yeast screens and were also a hit in the hypocotyl screen. The reason for this is that compounds that are able to stimulate the interaction of HTL with either MAX2 or SPA1 are likely targeting HTL since it is the receptor, and compounds that are able to shorten Arabidopsis hypocotyls seem to be SL mimics in planta. Therefore, compounds that are in both groups should be compounds that bind to HTL and cause hypocotyl shortening, or in other words are agonists for HTL. The 7 compounds that fulfilled these criteria were carried forward for further analysis and were designated leads 1 through 7.

Lead compounds have diverse chemical structures

The 7 lead compounds (Figure 2.7b) were notable for the fact that only two of them, leads 6 and 7, had any obvious structural similarity. Although leads 6 and 7 were the most structurally similar compounds, one was discovered in the HTLxMAX2 screen and the other in the HTLxSPA1 screen. It is also notable that none of the 7 compounds had significant structural similarity to known natural SLs. This might imply that their mechanism of action was different than that of SLs or that they might even have additional molecular targets within the plant. In order to assess this possibility, we decided to use a combination of genetic and biochemical approaches to explore their mechanisms of action.


Figure 2.7 – Compound overlap between screens uncovers 7 likely SL mimics.

A) The number of compounds that were hits in each of the screens is represented in Venn diagram form. B) The 2D chemical structures of all compounds that were a hit in at least one of the yeast two-hybrid screens and the hypocotyl screen are shown.

d14-1htl-3 plants are insensitive to some lead compounds

In order to understand to what degree the lead compounds were acting through SL signaling machinery, we decided to use a genetic approach. If our lead compounds were acting specifically through HTL or D14, we would expect that when a compound was added to d14-1htl-3 seedlings, where neither receptor is functional, it would not be able to shorten their hypocotyls. In order to test this in a meaningful way we needed to know what concentration of each of the lead compounds would be appropriate to add. If a compound is active in a sub-micromolar concentration range, adding it in tens of micromolar amounts might lead to off-target or toxic effects that would be less pronounced at a lower concentration. Therefore, we measured the dose-response relationship between each of the lead compounds and the hypocotyl length of Col seedlings (Figure 2.8). We found that the concentration of each lead compound that is required to generate a hypocotyl length that is 50% of that of seedlings treated with a vehicle control was quite variable. The compound with the highest value was lead 5 with a value of 37 µM, and the lowest was lead 4 with a value of 4 µM.
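One simple way to read a 50% concentration off a dose-response series like the one described above is to interpolate between the two doses that bracket the 50% response on a log-concentration scale. The Python sketch below does this. It is an assumption for illustration rather than the method used for Figure 2.8, where the fitting procedure is not specified, and the data points are invented.

```python
import math


def interpolate_ic50(concentrations, relative_lengths, target=0.5):
    """Concentration at which the response crosses `target`.

    Uses linear interpolation on log10(concentration) between the two
    bracketing doses. Concentrations must be positive, and lengths are
    expressed relative to the vehicle control (1.0 = no effect).
    Returns None if the response never crosses the target.
    """
    points = sorted(zip(concentrations, relative_lengths))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        crosses = (y_lo - target) * (y_hi - target) <= 0
        if crosses and y_lo != y_hi:
            frac = (y_lo - target) / (y_lo - y_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Invented dose-response data: concentration in µM, length relative to control.
toy_conc = [1, 3, 10, 30, 100]
toy_length = [0.95, 0.80, 0.60, 0.40, 0.20]
ic50 = interpolate_ic50(toy_conc, toy_length)
print(ic50)
```

A full sigmoidal curve fit would make better use of all data points; the interpolation is shown only because it makes the definition of the 50% value explicit.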

We found that when each of the compounds was added to Col at the concentrations listed, it was able to shorten hypocotyls to 50% of control length (Figure 2.9). However, we found that only GR24, lead 2, and lead 5 were unable to significantly reduce hypocotyl lengths of d14-1htl-3 plants. When the d14-1htl-3 seedlings were treated with leads 1 and 3 the resulting hypocotyl length was significantly greater than Col treated with those same compounds, but was significantly shorter than DMSO control for d14-1htl-3 (Figure 2.9). This represents an incomplete ability to resist the hypocotyl shortening effects of each of the compounds. These results suggest that lead compounds 2 and 5 are acting specifically through SL signaling to shorten hypocotyls much like GR24, and that lead compounds 1 and 3 are shortening hypocotyls partially through SL signaling and partially through some other process such as general toxicity. Finally, these results suggest that although lead compounds 4, 6, and 7 may be able to bind to HTL and stimulate its interaction with MAX2 or SPA1 based on a yeast two-hybrid assay, the effect of the compounds on the hypocotyl might be almost entirely due to off-target or toxic effects.

Native fluorescence shows AtHTL is the target of selected lead compounds

HTL is required for response to SLs in the seed and is involved in the perception of SLs in the hypocotyl, and since it is a receptor it should be very druggable. Considering that all of the lead compounds were able to stimulate the interaction of HTL with either MAX2 or SPA1, it might seem likely that each of the compounds is targeting HTL directly. To assess whether the lead compounds were targeting HTL we used an intrinsic protein fluorescence approach. Aromatic amino acids fluoresce at a wavelength of 333 nm when they are excited with light at a wavelength of 280 nm. The intensity of the fluorescence is dependent on the environment of the amino acids in the protein120,121. Reduced solvent exposure or stacking interactions with a ligand can reduce the fluorescence of the amino acids. Thus by treating protein with increasing concentrations of a putative ligand it is possible to measure the degree to which the ligand is altering the fluorescence of the protein. When purified AtHTL protein was


Figure 2.8 – Dose response relationship of compound concentration to Arabidopsis hypocotyl length.

The effects of increasing concentrations of each of the SL mimics on the relative hypocotyl lengths of Col seedlings are shown. Error bars represent SEM from three independent biological replicates. The concentration of each compound required to shorten Col hypocotyls by 50% of their untreated length was inferred for each compound from this data. For GR24 and lead compounds 1 through 7 the calculated IC50 values were 7.5, 34, 22, 27, 4, 37, 16, and 20 µM.


Figure 2.9 – Dose response relationship of compound concentration to Arabidopsis hypocotyl length.

The effects of each of the SL mimics as well as GR24 on the hypocotyls of Col and d14-1htl-3 seedlings are shown. Each of the compounds was added at a concentration that reduced the length of Col hypocotyls by 50%. Those IC50s were, for GR24 and the lead compounds 1-7: 7.5, 34, 22, 27, 4, 37, 16, and 20 µM. * p-value < 0.01 when comparing hypocotyl length under compound treatment relative to DMSO treatment. ‡ p-value < 0.01 comparing hypocotyl length under compound treatment in d14-1htl-3 and Col backgrounds based on a two-tailed Student’s t-test.

treated with increasing concentrations of GR24 we were able to see a change in intrinsic fluorescence that allowed us to infer that the dissociation constant (Kd) for the binding of a mixture of 4 enantiomers of GR24 was 7.7 ± 1.2 µM (Figure 2.10a). For a racemic mixture of two enantiomers we found that the Kd was 4.7 ± 2.0 µM (Figure 2.10b), which represents binding that is approximately 1.6-fold stronger.

When the same approach was used with leads 1-7, we were able to construct binding curves for leads 1, 3, 5, 6 and 7 (Figure 2.11). The apparent Kds using a model with single site binding with non-specific binding were 0.49, 8.6, 0.41, 1.1, and 4.15 µM. For leads 1, 5, and 6 there was significant non-specific binding on the protein. These results together with the abilities of these compounds to stimulate the interactions of HTL with MAX2 and SPA1 suggest that these compounds can directly target HTL, although the genetic evidence from Figure 2.9 may suggest that their activity as hypocotyl shortening agents may not be due to an interaction with HTL.
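A single-site binding model with a non-specific component is commonly written as a saturable term plus a term that is linear in ligand concentration. The Python sketch below fits such a model to a titration, under several assumptions: the functional form, the parameter names (bmax, kd, ns), and the grid-scan fitting strategy are illustrative choices rather than the procedure used for Figure 2.11, and the titration data are simulated.

```python
def one_site_plus_nonspecific(ligand, bmax, kd, ns):
    """Signal change for one saturable site plus a linear non-specific term."""
    return bmax * ligand / (kd + ligand) + ns * ligand


def fit_binding(ligand, signal, kd_grid):
    """Scan candidate Kd values; for each, solve bmax and ns by least squares.

    For a fixed Kd the model is linear in bmax and ns, so the two
    parameters come from the 2 x 2 normal equations.
    """
    best = None
    for kd in kd_grid:
        sat = [l / (kd + l) for l in ligand]  # saturable regressor
        s11 = sum(a * a for a in sat)
        s12 = sum(a * l for a, l in zip(sat, ligand))
        s22 = sum(l * l for l in ligand)
        b1 = sum(a * y for a, y in zip(sat, signal))
        b2 = sum(l * y for l, y in zip(ligand, signal))
        det = s11 * s22 - s12 * s12
        if abs(det) < 1e-12:
            continue  # regressors are collinear for this Kd
        bmax = (b1 * s22 - b2 * s12) / det
        ns = (s11 * b2 - s12 * b1) / det
        sse = sum((one_site_plus_nonspecific(l, bmax, kd, ns) - y) ** 2
                  for l, y in zip(ligand, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, bmax, ns)
    if best is None:
        raise ValueError("no usable Kd candidate in kd_grid")
    return best[1], best[2], best[3]


# Simulated titration (µM) from known parameters, then recovered by the fit.
toy_ligand = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0]
toy_signal = [one_site_plus_nonspecific(l, 1.0, 5.0, 0.002) for l in toy_ligand]
kd_fit, bmax_fit, ns_fit = fit_binding(
    toy_ligand, toy_signal, [0.1 * k for k in range(1, 501)])
print(kd_fit, bmax_fit, ns_fit)
```

Separating the linear term matters for compounds with substantial non-specific binding, since ignoring it would push the apparent Kd upward.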

YLG hydrolysis assays show binding of leads 2 and 4 to AtHTL

Leads 2 and 4 did not alter the intrinsic fluorescence of AtHTL protein. It was possible that those compounds were binding by a mechanism that is different from that of the other lead compounds, and that this was why they were unable to elicit the altered fluorescence that other leads could. For this reason, we wanted to use an independent assay to assess the binding of those compounds to HTL. We decided to exploit Yoshimulactone Green (YLG)92, a hydrolytically activated chemical probe for D14/HTL activity. When YLG is hydrolysed by HTL protein it becomes fluorescent. By adding a competitor for binding to HTL to the assay it is possible to outcompete the YLG for access to the receptor and thus decrease the amount of fluorescence that is generated.


Figure 2.10 – Intrinsic fluorescence binding assays show the binding of GR24 to HTL.

A) The effect of a mix of 2 enantiomers of the synthetic SL GR24 on the level of AtHTL fluorescence is shown. The apparent Kd for this binding interaction was 4.7 ± 2.0 µM. B) The effect of a mix of 4 enantiomers of GR24 on AtHTL fluorescence is shown. The apparent Kd for this binding interaction was 7.7 ± 1.2 µM. Error bars represent SD from three independent assays.


Figure 2.11 – Intrinsic fluorescence binding assays show that some SL mimics interact with AtHTL.

A-E) The effects of leads 1, 3, 5, 6, and 7 on AtHTL protein fluorescence are shown. Error bars represent standard deviation from three technical replicates. The apparent Kds for these assays were 0.49, 8.6, 0.41, 1.1, and 4.15 µM.


When AtHTL is treated with increasing concentrations of GR24rac, the apparent concentration of GR24 that generated a 50% decrease in relative fluorescence is 8.5 µM (Figure 2.12a). When the same is done with lead 2 (Figure 2.12b) and lead 4 (Figure 2.12c), the IC50 values are 3.8 and 217 µM, respectively. The IC50 value for lead 4 was very high, indicating that the interaction between the lead and the receptor might be very weak. However, these results show that at least lead 2 is also able to interact directly with AtHTL. These results show that all the lead compounds except for 4 have three markers of SL activity. They are able to shorten 35S::GUS:COP1 hypocotyls, they can stimulate the AtHTL x MAX2/SPA1 yeast two-hybrid interaction, and they are able to bind to AtHTL based on the biochemical data presented above. For this reason, the compounds are likely agonists for HTL, and will be called as such from this point on.

Striga responds to HTL agonists

Although some of the compounds seemed not to be acting through SL signaling in the hypocotyl, we were curious to know whether these compounds could work through SL signaling in the seed. Since our interest in the role of these compounds in germination was largely based on the role of germination in Striga’s lifecycle, we decided to first test whether the SL mimics were able to stimulate Striga hermonthica germination. This seemed reasonable since Striga has a broad array of HTL paralogs (Figure 1.4), and we might expect that compounds that are agonists for AtHTL might also be able to stimulate Striga germination by binding to Striga hermonthica HTLs (ShHTLs). The ability of such agonists to stimulate germination is also instrumental to their usefulness


Figure 2.12 – YLG hydrolysis assays show binding of GR24, lead 2, and lead 4 to AtHTL.

The relative amount of YLG fluorescence is shown for YLG treated with AtHTL protein and increasing concentrations of A) GR24, B) lead 2, and C) lead 4. Average IC50s of 8.5, 3.8, and 21.7 µM were calculated for A-C. Error bars represent SD from 2 independent assays.


Figure 2.13 – SL mimics are able to stimulate Striga hermonthica seed germination.

The effects of the indicated concentrations of GR24 as well as each of the SL mimics on the germination rate of conditioned Striga hermonthica seeds are shown. Error bars represent SD from three technical replicates.

as Striga control measures in the form of suicidal germination compounds. Therefore, we tested the ability of the agonists to stimulate Striga hermonthica germination at a range of concentrations (Figure 2.13). All of the lead compounds were able to stimulate Striga germination when they were added at high (100 µM) concentrations. Lead compound 5 had the strongest ability to stimulate germination both in terms of the raw percentage of Striga germination generated and in terms of the low concentrations at which it was able to stimulate the germination.

Agonists signal preferentially through ShHTL7

We became interested in which Striga receptors were responsible for the germination response of Striga to the lead compounds. Fortunately, Dr. Shigeo Toh had begun to investigate which Striga HTL proteins (ShHTLs) were responsible for the parasite’s response to exogenous natural and synthetic SLs. Dr. Toh was able to clone and express all 11 ShHTLs in an htl-3 background in Arabidopsis. He found that ShHTL1-9 expressors could respond to GR24 and that ShHTL4-9 expressors could respond to natural SLs (Figure 2.14). Although this narrowed down which ShHTLs might be important for the germination response, it still left a significant amount of ambiguity. A clearer picture emerged when the Striga germination data described above was combined with the results of treating ShHTL expressing seeds with the agonists we had uncovered. It became apparent that only ShHTL4-8 expressors were able to respond to the agonists, and that only ShHTL7 was targeted by all of the agonists92. This suggested that ShHTL7 is important for the sensitivity of Striga to the lead compounds.

Furthermore, since agonist 7 was exclusively able to activate ShHTL7 and was able to cause germination in Striga hermonthica seeds, it implied that activating ShHTL7 is sufficient for eliciting a germination response in Striga. It is worth noting that when htl-3 seeds without ShHTLs were treated with the agonists there was no germination. This suggests that the germination response seen in ShHTL expressing lines is specific to the overexpression of those genes and is not due to some other receptor in the cell. This means that the action of the agonists in the seed seems to be more specific for HTLs than was the activity of the agonists in the hypocotyl as shown in Figure 2.9.

This observation was accompanied by the result that ShHTL7 expressing plants were as much as 10 000 times more sensitive to exogenous SL at the level of germination93. Together, these observations suggest that the sensitivity of Striga hermonthica to SLs can largely be explained by the sensitivity of ShHTL7 to SLs. Additionally, although ShHTL7 showed the most sensitivity to exogenous application of lead compounds it was also sensitive to the largest array of compounds. This suggests that ShHTL7 is able to pair a high sensitivity to SLs with a broad ability to respond to SL-related molecules. This combination of sensitivity and a lack of selectivity is striking since the exquisite sensitivity of ShHTL7 expressing plants implies that there should be an interaction between those molecules and the receptor that occurs at a very high affinity. However, high affinity interactions generally require many specific intermolecular interactions between the ligand and the receptor. With so many putative ligands, it is not obvious how ShHTL7 could have a binding site that is well tailored to respond to each of them individually.


Figure 2.14 – HTL agonists act to stimulate germination specifically through ShHTL4-7.

This figure is adapted from Toh et al.93. Arabidopsis germination assays were performed by Dr. Shigeo Toh; Striga germination assays were performed by Duncan Holbrook-Smith. The germination rate of htl-3 Arabidopsis seeds overexpressing the indicated genes is shown for treatments with endogenous SLs at a concentration of 1 µM (top) and agonists at a concentration of 10 µM (bottom).


Agonists 3 through 6 are able to bind to ShHTL7 directly

Since seeds expressing ShHTL7 became sensitive to the AtHTL agonists, we decided to investigate whether those agonists were binding directly to that receptor. We tested this using the intrinsic fluorescence approach that was described above. When purified ShHTL7 protein was treated with increasing concentrations of agonists 3 through 6 we were able to observe a change in fluorescence for each of those compounds. Agonist 3 showed the most robust change out of the four compounds tested. We were able to calculate an apparent Kd for each of the agonists with ShHTL7. Those Kds were 12.1 ± 14.4 µM, 1.3 ± 2.7 µM, 2.7 ± 2.9 µM, and 3.0 ± 3.7 µM for agonists 3, 4, 5, and 6 respectively (Figure 2.15). Although there is considerable uncertainty for each of those Kd values, it is clear that each of the compounds is able to influence ShHTL7 fluorescence and that they are therefore able to bind to it.

Investigation of potential synergistic activity by agonists

The wide range of agonists that were binding to ShHTL7 caused us to wonder if their action on the receptor might be in any way synergistic. In this context synergistic is meant in the sense that by treating with two compounds simultaneously, they might be able to stimulate Striga germination to a greater degree and at lower concentrations than would be expected11. Rather than using actual Striga hermonthica seeds to pursue this line of inquiry we chose to use Arabidopsis that was expressing ShHTL7. This approach was preferable to using Striga seeds because of the substantial batch-to-batch variability in Striga response to exogenous SLs that is due in part to the fact that the population of Striga seeds under analysis has the potential to be genetically heterogeneous. First we determined the concentrations of agonists 3-6 required to

80

Figure 2.15 – Effect of Agonists 3-6 on ShHTL7 intrinsic fluorescence.

A-D) The effects of the indicated concentrations of Agonists 3 through 6 on the fluorescence of purified ShHTL7 protein is shown. Data is from three independent experiments. Error bars represent SEM. The calculated Kds were 12.1 ± 14.4 µM, 1.3 ± 2.7 µM, 2.7 ± 2.9 µM, and 3.0 ± 3.7 µM for agonists 3, 4, 5, and 6 respectively.

81

Figure 2.16 – Dose response relationship of ShHTL7 expressing

Arabidopsis seed germination to increasing concentrations of

Agonists 3-6.

The average germination rate 7 days after imbibition is shown for each compound and concentration Error bars represent SEM from three independent assays with ~75 seeds per assay. EC40 values were inferred based on the logisitic fit of the curve. Those values for Agonists 3, 4, 5, and 6 required 0.92, 2.7, 0.26, and 0.89 µM respectively.

82

generate 40% germination in ShHTL7 seeds (Figure 2.16). 40% was chosen due to the fact that the maximum germination level was not 100%. Agonist 5 showed the greatest potency with only 0.26 µM required to generate 40% germination. Agonists 3,

4, 5, and 6 required 0.92, 2.7, 0.26, and 0.89 µM respectively. Interestingly this meant that the compounds that were most potent in Striga hermonthica (Figure 2.13) were able to stimulate ShHTL7 germination at the lowest concentrations.

Figure 2.17 – Effect of simultaneous treatment of Agonists 3-6 on the germination of ShHTL7 expressing Arabidopsis seeds.

The average germination rate 7 days after imbibition is shown for each condition. Each agonist was added at its EC40 alone, or together with another compound with both added at ½ of their EC40s. Error bars represent SEM from three independent replicates with ~75 seeds per replicate.

Each of the compounds was added to ShHTL7 seeds either alone at the concentration that generated 40% germination, or at half of that concentration together with another of the agonists, also at half of its concentration. This approach to assessing synergy is an adaptation of the isobolic approach122. The reasoning is fairly straightforward. If two compounds are each applied at 50% of the concentration that would alone cause 40% germination, and the two compounds act additively, the germination rate will be 40%. As a thought experiment, consider two samples of the same compound: treating the assay as described above adds 2 x 50% of the amount of compound required to generate 40% germination, and thus gives 40% germination. However, if significantly more than 40% germination occurs, the compounds are causing more germination than would have been expected and are acting synergistically. Alternatively, if the compounds produce less than 40% germination when both are added at 50% of the amount required for 40% germination, the interaction between the compounds would be described as antagonistic. However, when each of the agonists was applied alone or in pairwise combinations, none of the pairs of treatments caused a rate of germination that was significantly greater than either of the full strength treatments alone (Figure 2.17).
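The half-dose reasoning used for the pairwise treatments can be written down as a small calculation. The sketch below is illustrative only: the function names, the tolerance used to call a departure from additivity, and the example observed germination rate are assumptions, not methods or data from this work. It expresses each dose as a fraction of the dose that alone gives 40% germination, so that two half doses sum to one full additive dose.

```python
def combination_index(dose_a, ec_a, dose_b, ec_b):
    """Sum of fractional doses; 1.0 is the additive expectation."""
    return dose_a / ec_a + dose_b / ec_b


def classify_interaction(observed_pct, expected_pct=40.0, tolerance_pct=10.0):
    """Label a pairwise treatment relative to the additive expectation.

    tolerance_pct is a hypothetical stand-in for a proper statistical test.
    """
    if observed_pct > expected_pct + tolerance_pct:
        return "synergistic"
    if observed_pct < expected_pct - tolerance_pct:
        return "antagonistic"
    return "additive"


# EC40 values for Agonists 3 and 5 in µM, as reported for Figure 2.16.
ec40_agonist3 = 0.92
ec40_agonist5 = 0.26

# Each agonist at half of its EC40 gives a combination index of 1.
ci = combination_index(ec40_agonist3 / 2, ec40_agonist3,
                       ec40_agonist5 / 2, ec40_agonist5)
print(ci)

# Hypothetical observed germination (%) for the paired treatment.
print(classify_interaction(42.0))
```

Because the combination index is 1 for every half-dose pair, any germination rate well above 40% would have to come from an interaction between the compounds rather than from the total dose.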

Conclusions

In order to identify small molecule agonists for SL receptors in Arabidopsis and Striga hermonthica, we undertook a three-part chemical screen. 4182 compounds were screened for their ability to stimulate the interaction of AtHTL with MAX2 and independently with SPA1. In a third independent chemical screen the same 4182 compounds were screened for the ability to inhibit hypocotyl elongation in the SL sensitized 35S::GUS:COP1 Arabidopsis line. Compounds that stimulated the interaction of HTL with one of its interactors and were also hits in the hypocotyl screen were applied to conditioned Striga hermonthica seeds, and all of them were able to stimulate germination. Thus, we have presented a new way of identifying compounds that mimic the activity of SLs in Striga, which can be easily adopted and expanded by others in the scientific community to find more SL mimics. Further analysis of the SL mimics showed that most were able to bind directly to AtHTL and that their efficacy as hypocotyl shorteners was reduced in a d14-1htl-3 background.

Taken together this suggests that these compounds are active agonists for AtHTL despite the fact that they lack the enol bridge that would be expected to be required for hydrolysis by HTL. This might suggest that the act of hydrolysis by HTL is not an entirely essential step in SL perception as has been hypothesized by others. Further analysis using Striga hermonthica HTL proteins showed that all the agonists were able to stimulate the germination of htl-3 Arabidopsis seeds expressing ShHTL7 and that agonists 3 through 6 were able to bind directly to the receptor.

Materials and methods

Yeast growth conditions

Solid YPD media was made by dissolving 10 g of peptone, 5 g of yeast extract, one pellet of NaOH, and 10 g of agar in 450 mL of water, followed by autoclaving. 50 mL of filter sterilized 20% (w/v) D-glucose was added to the autoclaved media. YNB media for growth was made by dissolving 0.85 g of yeast nitrogen base, 2.5 g of (NH4)2SO4, 0.3 g of amino acid mixture (-H -U -L -W), and 10 g of agar (for solid media only) in 450 mL of water. After autoclaving, 50 mL of filter sterilized 20% (w/v) D-glucose and 30 mg of L-leucine were added to the media. For YNB used for assays, the ingredients described above were dissolved in 350 mL of water, and after autoclaving, filter sterilized ingredients were added in the following amounts: 50 mL of 20% (w/v) D-galactose, 50 mL of 10% (w/v) D-raffinose, 50 mL phosphate buffer (493 mM disodium phosphate, 250 mM monosodium phosphate), 30 mg L-leucine, and 40 mg X-gal.

Yeast two-hybrid assays

The DupLEXA yeast two-hybrid system was used to analyze protein-protein interactions between HTL/KAI2 and MAX2. The yeast strain RFY48 was transformed with HTL/KAI2 in the bait plasmid pEG202 and the reporter pSH18-34. EGY48 was transformed with prey plasmid pJG4-5 carrying either MAX2 or nothing as an empty vector control. The HTL carrying strain was mated with both the MAX2 and empty pJG4-5 carrying strains on YPD. Mated yeast were streaked on selective media and a single colony was grown in liquid media. The cells were harvested and resuspended in glycerol before being frozen at -80 °C. To perform the assays, the cells were streaked on YNB plates from the glycerol stock and grown for 2 days at 30 °C. The yeast was transferred from the YNB plate to the assay plates using a 96-pin tool. Assays were conducted at 30 °C.

Yeast two-hybrid based small-molecule screening and analysis

4182 compounds from the Yeast-active library116 were arrayed into a 96-well format at a concentration of 3 mM in DMSO. 1 µL of each compound was added to the wells of 96-well plates in duplicate. 100 µL of assay media (see yeast growth conditions) was then added to each well of the plate. The plates were allowed to solidify for one day in the dark before yeast were pinned onto the plates as described above, and the results of the assays were photographed after 16-24 h. The intensity of the blue colour of each colony was measured using ImageJ (rsbweb.nih.gov/ij/). Lead compounds were purchased from ChemBridge Corporation.

Relative intensity (I) was calculated as follows:

IRelative = ((Ireplicate1 + Ireplicate2)/2) / ((IPlateAverage + IScreenAverage)/2)
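As a worked illustration, the relative intensity can be computed as follows. This sketch reads the formula as the mean of the duplicate colony intensities divided by the mean of the plate average and the screen average, which is an interpretation of the notation; the function name and the example intensities are hypothetical.

```python
def relative_intensity(replicate1, replicate2, plate_average, screen_average):
    """Mean of duplicate colony intensities over the mean of the two baselines."""
    colony_mean = (replicate1 + replicate2) / 2
    baseline = (plate_average + screen_average) / 2
    return colony_mean / baseline


# Hypothetical ImageJ intensities in arbitrary units.
example = relative_intensity(120.0, 140.0, 60.0, 70.0)
print(example)  # 130 / 65
```

Normalizing to both the plate and the screen averages means a value near 1 marks a colony that is no bluer than a typical colony, whether the comparison is made within the plate or across the screen.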

Hypocotyl-based chemical screening

As with the yeast based screening system, 1 µL of the 3 mM stock of the Yeast-active116 chemical library was added to the corresponding well of a 96-well plate in duplicate. Murashige and Skoog (MS) basal salts (4.4 g/L) and agar (16 g/L) were dissolved in ddH2O and autoclaved. While still warm the two were mixed in equal proportions, and 99 µL of the mixture was added to each well of the 96-well plates containing the chemicals. Plates were allowed to solidify for several hours. ~15 surface sterilized GUS-COP misexpressing seeds were then aliquoted into each well of the 96-well plate. Plates were stratified for 2 days at 4 °C before being placed at room temperature. Plates were treated with continuous white light with a fluence of ~45 µE.

Hypocotyl length was evaluated after 5 days of growth. Compounds that caused a reduction in hypocotyl length in both replicates were scored as “hits”.

Chemical structure based clustering

The ChemBridge ID numbers for all of the compounds that were marked as a “hit” in a particular screen were used to search the ChemBridge website. A structure data file (SDF) for that group of compounds was then downloaded. This SDF was then used as the input in a structure search on the PubChem website (https://pubchem.ncbi.nlm.nih.gov/) using a search by identity. The list of returned compounds was then used as the input for structural clustering using the PubChem 2D structural clustering tool. Similarity data were exported and analyzed further using R to draw the dendrograms seen in Figures 2.2, 2.3, and 2.5.
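Pairwise similarity scores of this kind are the usual starting point for a dendrogram. As a rough illustration, the sketch below assumes a Tanimoto-type score on binary substructure fingerprints and converts it to a distance. The metric, the fingerprint contents, and the compound names are all assumptions made for illustration; the analysis in this work used the PubChem tool and R, not Python.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 1.0  # two empty fingerprints are treated as identical
    return len(fp_a & fp_b) / union


def distance_matrix(fingerprints):
    """Pairwise distances (1 - similarity), ordered by sorted compound name."""
    names = sorted(fingerprints)
    rows = [[1.0 - tanimoto(fingerprints[a], fingerprints[b]) for b in names]
            for a in names]
    return names, rows


# Hypothetical fingerprints: each integer stands for one substructure bit.
hits = {
    "cmpd_A": {1, 2, 3, 4},
    "cmpd_B": {1, 2, 3, 5},
    "cmpd_C": {7, 8, 9},
}
names, dist = distance_matrix(hits)
for name, row in zip(names, dist):
    print(name, [round(d, 2) for d in row])
```

In this toy set, cmpd_A and cmpd_B share three of five bits and would join first in a dendrogram, while cmpd_C shares none and would join last.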

Hypocotyl elongation assays

For assays shown in Figures 2.8 and 2.9, liquid plant growth media (10 mM MES pH 5.8, 1x Gamborg's vitamins) was mixed with compounds to the indicated final concentrations, keeping the concentration of DMSO constant in all assays. The media with compound was added to 24-well plates at a 1 mL scale, and approximately 15 surface sterilized seeds were added to each well. Plates were sealed with transparent tape and placed at 4 °C for 2 days to stratify. Plates were then placed under 10 µE of continuous white light for 5 days. Each well was then photographed and hypocotyls were measured using ImageJ.

Plant growth conditions

Plants used as sources for seed were grown under continuous white light at a fluence of 100 µE at a temperature of 19 ⁰C. Plants were fertilized with 1 g/L 20-20-20 fertilizer added to soil saturation twice in their lifecycle. The first was at the time of potting, and the second at bolting.

HTL protein expression and purification

AtHTL protein84 was expressed in a p15TV-L vector in BL21 CodonPlus E. coli cells. Cells were induced with 300 µM IPTG overnight at 16 °C. All subsequent steps were performed at 4 °C. After harvesting and washing with lysis buffer (20 mM HEPES pH 7.0, 150 mM NaCl, 1 mM 2-mercaptoethanol, 5% glycerol (v/v)), cells were disrupted by sonication. The lysate was cleared by centrifugation (15 min, 10 000 rcf) and applied to equilibrated HisPur resin. After gentle inversion (1 hour), the protein-bound resin was applied to a column and washed with 20 column volumes of washing buffer (20 mM HEPES pH 7.0, 150 mM NaCl, 1 mM 2-mercaptoethanol, 5% glycerol (v/v), 30 mM imidazole). The protein was eluted in elution buffer (20 mM HEPES pH 7.0, 150 mM NaCl, 1 mM 2-mercaptoethanol, 5% glycerol (v/v), 250 mM imidazole) and dialyzed overnight in the lysis buffer. Protein was concentrated to >10 mg/mL and flash frozen in liquid nitrogen in small volumes.

S. hermonthica seed conditioning and germination assays

Striga seeds were washed 10 times in sterilization solution (20% bleach) with 3-minute intervals of agitation between changes of sterilization solution. The seeds were then washed extensively with sterile distilled water until all traces of bleach were removed. The seeds were then deposited onto 2.5 mm glass microfibre filters (Whatman GF/B) saturated with sterile distilled water and were incubated at 30 °C in the dark for three weeks in a closed container. Conditioned Striga seeds were deposited on glass fibre filters that were saturated with compounds dissolved in sterile distilled water at the concentrations indicated. GR24 was added at a concentration of 10 nM. SOP was added at the concentrations indicated. After 5 days of incubation at room temperature in the dark, the number of Striga seeds showing radicle emergence was scored.

Arabidopsis germination assays

Compounds were diluted to the indicated final concentrations in ddH2O, keeping the concentration of DMSO constant between treatments. The diluted compounds were aliquoted into the wells of 48-well plates with a volume of 250 µL per well. Arabidopsis seeds were directly added to the plate. 15-25 seeds were added to each well. Assays were performed in technical triplicate within each experiment, and were replicated a total of 3 times independently. Radicle emergence was used universally as the mark of germination. Plates were incubated at the temperatures indicated in the relevant figure caption under white light with a fluence of ~45 µE.
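Germination rates and their error bars follow directly from the seed counts. The sketch below shows one way to compute a percentage germination per experiment and the standard error of the mean (SEM) across independent experiments. The counts are hypothetical, and treating each independent experiment as one data point is an assumption about how replicates were pooled.

```python
import math
import statistics


def germination_rate(germinated, total):
    """Percentage of seeds showing radicle emergence."""
    return 100.0 * germinated / total


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical (germinated, total) counts from three independent experiments.
counts = [(10, 20), (15, 25), (14, 20)]
rates = [germination_rate(g, t) for g, t in counts]
mean_rate, sem_rate = mean_and_sem(rates)
print(rates, mean_rate, sem_rate)
```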

Synergy analysis

For Figures 2.16 and 2.17, the germination rate of 35S::ShHTL7 htl-3 T3 homozygous seeds treated with compounds at the indicated concentrations was measured seven days after imbibition. Concentrations of DMSO were held constant in all cases. For those seven days the temperature was held at 32 °C and the seeds were under continuous light. The concentration of each compound predicted to be required to generate a 40% germination rate was estimated using a four parameter logistic regression in SigmaPlot 11. Note that this concentration is not the concentration required for a 40% response, but the amount required for a raw germination rate of 40%.
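The distinction between a raw germination rate of 40% and a 40% response matters whenever the fitted curve does not span 0 to 100%. The sketch below inverts a four parameter logistic curve to show the difference. The parameterization (bottom, top, EC50, Hill slope) is one common form and is assumed here rather than taken from the SigmaPlot fit, and the parameter values are hypothetical.

```python
def four_param_logistic(x, bottom, top, ec50, hill):
    """Four parameter logistic curve (one common parameterization)."""
    return bottom + (top - bottom) / (1.0 + (x / ec50) ** (-hill))


def concentration_for_raw_rate(target, bottom, top, ec50, hill):
    """Invert the curve: concentration giving a raw germination rate of target (%)."""
    if not bottom < target < top:
        raise ValueError("target must lie strictly between bottom and top")
    return ec50 * ((top - bottom) / (target - bottom) - 1.0) ** (-1.0 / hill)


# Hypothetical fit in which germination plateaus at 80% rather than 100%.
bottom, top, ec50, hill = 0.0, 80.0, 1.0, 1.5

# Concentration for a raw germination rate of 40%.
raw_40 = concentration_for_raw_rate(40.0, bottom, top, ec50, hill)

# Concentration for 40% of the fitted response range (32% raw germination here).
response_40 = concentration_for_raw_rate(
    bottom + 0.4 * (top - bottom), bottom, top, ec50, hill)

print(raw_40, response_40)
```

With a plateau below 100%, the concentration for a 40% response is lower than the concentration for 40% raw germination, so the two definitions are not interchangeable.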

Intrinsic fluorescence assays

The small molecules dissolved in DMSO were added to the protein in the amount needed to reach the final concentrations indicated. The amount of DMSO in each well was identical. 100 µL of 10 µM HTL was added to the wells of a flat bottom black 96-well plate. A Tecan infinite M1000Pro spectrophotometer was used to take fluorescence measurements. Fluorescence intensities were recorded for each concentration of ligand with the protein as well as for the ligand alone. Readings were taken using an excitation wavelength of 280 nm and an emission wavelength of 333 nm. The gain was set to 70, the number of flashes to 50, the flash frequency to 400 Hz, and the integration time to 2 µs. ΔF was calculated by taking the difference in protein fluorescence of the DMSO control and each ligand concentration. ΔF/ΔFmax was calculated by dividing ΔF by the maximum change in fluorescence for that series of concentrations. The Kd determination was performed in SigmaPlot 11 using a model with a single binding site with nonspecific binding, described by the equation below, where y is total binding concentration, x is ligand concentration, and N accounts for the nonspecific binding.

y = (Bmax x)/(Kd + x) + Nx
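The binding model and the ΔF/ΔFmax normalization can be sketched in a few lines. This is an illustration only: the function names, the sign convention for ΔF (DMSO control minus ligand-treated reading), and the example readings are assumptions, and the curve fitting itself, which was done in SigmaPlot, is not reproduced here.

```python
def one_site_with_nonspecific(x, bmax, kd, n):
    """y = (Bmax * x) / (Kd + x) + N * x"""
    return (bmax * x) / (kd + x) + n * x


def normalized_delta_f(ligand_readings, dmso_reading):
    """Return dF/dFmax for one ligand series, relative to the DMSO control."""
    delta_f = [dmso_reading - reading for reading in ligand_readings]
    delta_f_max = max(delta_f, key=abs)  # largest change in the series
    return [d / delta_f_max for d in delta_f]


# Hypothetical fluorescence readings (arbitrary units) at rising ligand doses.
normalized = normalized_delta_f([980.0, 950.0, 900.0], dmso_reading=1000.0)
print(normalized)

# Without the nonspecific term, binding is half-maximal when x equals Kd.
half_max = one_site_with_nonspecific(12.1, bmax=1.0, kd=12.1, n=0.0)
print(half_max)
```

The linear N term lets the fitted curve keep rising at high ligand concentrations, which is why a series that never plateaus can give an apparent Kd with a wide uncertainty.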

YLG hydrolysis assays

Concentrated ShHTL7 protein was thawed from -80 °C and diluted to a concentration of 5 µM in 20 mM HEPES pH 7.0, 150 mM NaCl, 5% (v/v) glycerol. The agonists were diluted to twice the indicated concentrations in the protein solution. 50 µL of the agonist and protein mixture, or a buffer and DMSO mixture, was aliquoted into the wells of a 96-well black plate. YLG was diluted to a concentration of 10 µM in the same buffer described above. 50 µL of the YLG solution was then aliquoted into each well of the plate using a 12-channel pipette, and the measurement was started promptly. The concentration of DMSO was held constant at 1% (v/v) in all assays, and the final concentrations of protein and YLG were 2.5 and 5 µM respectively. Fluorescence was monitored using a Tecan infinite M1000Pro using a 480 nm excitation wavelength and a 520 nm emission wavelength with a 5 nm bandwidth and a 30 minute kinetic read cycle. The gain was set to 100, the flash frequency was set to 400 Hz, the z-position was set to 20000 µm, and the readings were taken in a 2x2 square matrix. The change in fluorescence was measured after 2.5 hours. The inhibitory curves and IC50 values were generated using the four parameter logistic curve in SigmaPlot 11.0. The IC50 values are the EC50 as determined by SigmaPlot based on the predicted maximum change observed for each assay.
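The final concentrations quoted above follow from mixing equal volumes. A minimal check of that arithmetic is sketched below; the function name is hypothetical, and the agonist stock value is an arbitrary example of a compound prepared at twice its intended final concentration.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after dilution: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume / total_volume


# 50 µL of protein mixture plus 50 µL of YLG solution gives 100 µL per well.
total_volume = 50.0 + 50.0

protein_final = final_concentration(5.0, 50.0, total_volume)   # µM ShHTL7
ylg_final = final_concentration(10.0, 50.0, total_volume)      # µM YLG
# An agonist prepared at 2x (20 µM here, hypothetical) ends up at 1x.
agonist_final = final_concentration(20.0, 50.0, total_volume)  # µM

print(protein_final, ylg_final, agonist_final)
```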

Chapter 3: Discovery and characterization of antagonists for Arabidopsis HTL and Striga hermonthica HTL7

Previously published as:

Small-molecule antagonists of germination of the parasitic plant Striga hermonthica

Duncan Holbrook-Smith, Shigeo Toh, Yuichiro Tsuchiya, and Peter McCourt (2016)

Nature Chemical Biology (advance online access)

Abstract

Antagonists for the strigolactone (SL) receptor HTL could be very useful chemical tools for at least two reasons. Firstly, much of the mechanism of SL perception downstream of HTL in the seed is currently poorly understood. HTL antagonists could be used as chemical probes to better understand the mechanism of SL perception through that receptor. Secondly, perception of SLs by HTL proteins in Striga hermonthica is a key step in that plant’s parasitic lifestyle. A compound that is able to block the perception of SLs by the parasitic plant’s HTLs could be deployed in Striga control measures. Despite their obvious usefulness, there are no known antagonists for HTL.

We used an in planta chemical screening approach to identify compounds that might be antagonists for HTL based on their effects on Arabidopsis hypocotyl length and germination rate. When we tested these compounds for their strength, we found that one compound, which we named Soporidine (SOP), was the most potent. We used genetic analysis to show that SOP was acting through SL signaling, and used biochemical analysis to show that SOP binds directly to Arabidopsis HTL. Further analysis showed that SOP was an effective inhibitor of Striga germination and that SOP is able to target Striga HTLs (ShHTLs) in general, and ShHTL7 in particular. Basic analysis of the toxicity of SOP on rice plants showed that the compound does not cause obvious stunting of growth or other defects. Taken together this evidence suggests that SOP is an antagonist for HTLs in Arabidopsis and Striga that can serve both as a chemical probe for AtHTL, and a lead compound for the development of Striga control measures.

Introduction

Antagonists are molecules that are able to block the action of a target macromolecule. These types of molecules have been historically useful as drugs123 and as chemical probes124. Since small molecule hormone receptors are easily targeted by small drug-like molecules, they make excellent targets for modulation through small molecule antagonists24. Strigolactones (SLs) are a class of small molecule terpenoid plant hormones that are perceived by the α/β hydrolase HTL in the seed84,85. SLs are also a key player in the parasitic lifecycle of plants of the genus Striga69. Striga seeds germinate after sensing SLs exuded into the soil by their host and attack its roots, a process that causes billion dollar losses in crop productivity in Africa125. As with other plants, HTLs are important in the perception of SLs in the seeds of plants of that genus93.

Since the perception of SLs in the seed is poorly understood and is a key step in the parasitic Striga lifestyle, we opted to attempt to uncover antagonists for HTLs in Arabidopsis and Striga with the twin goals of developing a chemical probe to understand SL signaling in the seed as well as a potential agrichemical to suppress Striga germination.

In the search for HTL antagonists, we were first faced with a fundamental choice about which approach to compound discovery we should take. The first option would be to use a synthetic chemistry approach. Since we know the structures of a number of compounds that bind directly to AtHTL84, we could have tried to design and synthesize compounds that would bind to the receptor but not activate it. We opted against using this approach for three main reasons. Firstly, it has been shown that a non-hydrolysable version of GR24 (carba-GR24) is not a competitive inhibitor for the hydrolytic activity of the HTL related hydrolase D14126. This suggests that such a synthetic approach might not generate compounds that are able to competitively inhibit HTL. Secondly, the success we had already achieved using chemical screening approaches to identify small molecules that perturbed HTL activity84,127 led us to believe that we should be able to uncover SL antagonists through chemical screening. Thirdly, the structural diversity in the agonists that were discovered suggested that HTL was susceptible to perturbation by compounds with a range of structures. This made screening a diverse chemical library more appealing than a rational design approach.

Next we had to choose whether to use a target-based or phenotype-based chemical screening approach. As was described above, there are advantages and disadvantages to both ways of screening. However, we chose to use a phenotype-based chemical screening approach for two reasons. Firstly, our results described in Chapter 2 showed that when the hypocotyl-based chemical screen data was combined with the target-based yeast screens, the correlation between HTLxMAX2 and HTLxSPA1 activity became much stronger (Figure 2.6). This suggested that hypocotyl-based screens are efficient in terms of lead generation. Secondly, because of our experience in target validation techniques with HTL, we had access to a battery of biochemical tests that we knew worked reasonably well with Arabidopsis HTL protein. Since the main drawback of phenotypic chemical screens is that it is hard to know what the molecular target of a given hit is, the relative ease of target identification in our system made a phenotypic approach much more attractive.

Although our experience in working with HTL reduced the challenge of target validation, we did want to ensure that the largest proportion possible of hit compounds directly targeted HTL. For this reason, we used two independent phenotypes as the basis of the screen to increase the chances that hit compounds were targeting HTL. In both cases we took advantage of the fact that we would expect an antagonist for the receptor to at least in part mimic the phenotypes displayed by a loss-of-function mutant in the receptor (Figure 3.1). The two htl phenotypes that are the easiest to assess in a multi-well plate format are elongated hypocotyl length in the presence of GR24128, and slower germination both in the presence and absence of GR2485. These two phenotypes are good ones to attempt to emulate not only because of their tractability in a multi-well plate format, but also because broadly speaking one is an increase in growth, whereas the other is a decrease. This means that a compound that is a general toxin would not be called as a hit in this screen, since it would be unlikely to increase hypocotyl length although it might inhibit germination. Similarly, compounds that target some other biological process that generates increased hypocotyl length would not generally be expected to inhibit germination. Thus this approach should limit the number of hit compounds that are targeting other biological processes.

Figure 3.1 – Schematic of HTL-dependent SL signaling and anticipated HTL antagonist activity.

A) The α/β hydrolase HTL is a receptor for SLs. It interacts with the F-BOX protein MAX2 and the Clp-ATPase SMAX1 in a SL dependent manner. This targets SMAX1 for degradation, which in turn leads to increased germination and reduced hypocotyl length. B) The proposed mechanism of an antagonist targeting HTL is shown. The antagonist binds to HTL in place of SL and therefore there is no interaction of HTL with MAX2 or SMAX1. This results in no transduction of SL signal through the pathway, and thus there is no increased rate of germination and the hypocotyl is not shortened.

This screening uncovered 7 compounds that had both phenotypic activities. The compound with the best activity in vivo was named Soporidine (SOP) and was studied in greater depth. We found that SOP bound to AtHTL, acted specifically through SL signaling, and was able to inhibit the germination response of Striga hermonthica to exogenous SL. In order for SOP to be useful as a seed coating agent that can inhibit nearby Striga germination, it is necessary that the antagonist not hinder the development of the crop plant and thus reduce grain yields. As a rough proxy for this we assessed the effect of SOP on rice germination and early development. We found that rice germination was not inhibited by SOP, and that the development of the seedling appeared normal. Taken together this suggests that SOP is a useful chemical probe for HTL function in Arabidopsis and Striga, as well as a promising lead for the development of Striga control technologies.

Results and discussion

Chemical screening for antagonists for HTL uncovers Soporidine

In the primary screen the phenotype we used was the response of the hypocotyl to exogenous SL (Figure 3.2a). As was described above, the response to exogenously applied SLs includes the reduction of hypocotyl length, and that response is mediated by HTL85. Therefore, we would expect that if a compound is able to block the effects of exogenously applied SLs on the hypocotyl, it might be a SL signaling antagonist. In the screen, rather than using wild-type Arabidopsis seeds we chose to employ a 35S::GUS:COP1 line. We chose this line because it has a longer hypocotyl than WT, and is very sensitive to the addition of GR24114. This made it easier to score the differences in hypocotyl length between a GR24 treated or untreated plant, and thus easier to screen for compounds that blocked the effect of GR24 on the plant. Any compound that increased the hypocotyl length of GR24 treated seedlings was called as a hit (Table IV).

Figure 3.2 – Schematic representation of the chemical screen for antagonists for HTL and structures of lead compounds.

A) In a primary screen GUS:COP1 expressing seeds were treated with GR24 and screening molecules. Compounds that were able to block the inhibition of hypocotyl elongation by GR24 were called primary hits. Those primary hits were tested for their ability to inhibit the germination of Arabidopsis seed. Compounds that were germination inhibitors were called as secondary hits (RGs). Accompanying photographs show relevant assays using RG4. B) 2D chemical structures for RG compounds are shown.

For the secondary screen, the phenotype we exploited was germination. Because htl-3 is more dormant than WT117, we would expect that an antagonist for HTL would make seeds less likely to germinate. Therefore, in a secondary screen we tested all hits in the primary screen for the ability to inhibit Col germination (Figure 3.2a). The seven compounds (Figure 3.2b) that were able to reduce the germination of Col seeds were designated as RGs for their ability to counteract the action of GR24. We chose to conduct the screen in this order for two reasons. Firstly, the enhancement of hypocotyl length is a positive trait compared to the suppression of germination, which is a negative trait. If the suppression of germination were used in the primary screen, the best hits would likely be compounds that are toxic to the seed and thus prevent it from germinating. This might skew the hits away from HTL antagonists, which would require more intensive secondary screening to find hits that lengthen hypocotyls. Secondly, scoring a reduction in germination is challenging. In order to screen for compounds that inhibit germination, a condition would need to be chosen where germination is close to 100%. Measuring the germination rate of a given compound treatment would require counting all the seeds and all the germinated seeds in any well with an apparently reduced rate of germination relative to DMSO in order to calculate the % germination. That is significantly more labour intensive than qualitatively assessing whether a given compound treatment causes a longer apparent hypocotyl length.

RGs may act through HTL signaling to inhibit germination

The ability to inhibit germination could easily be due to a general toxicity in the seed, and thus we wanted to know if any of the RGs were acting to inhibit germination through SL signaling specifically. In order to assess this, we decided to test whether RGs were able to inhibit the germination of lines that had loss-of-function mutations in MAX2 or double mutants in HTL and D14. Before doing this, we wanted to make sure we were using the minimum amount of each compound that was required to inhibit germination. Therefore, we measured the relationship between germination rates of Col seeds after 2 days and a range of concentrations for each of the RGs (Figure 3.3a). We used this relationship to determine a minimum effective concentration (MEC) for each of the compounds. When Col, max2-1, and d14-1htl-3 seeds were treated with every RG at its MEC, we saw that only Col germination rates declined (Figure 3.3b). This is consistent with a model where RGs are acting specifically through SL signaling, because if they act by inhibiting HTL they should not have any further activity in the two mutant backgrounds that were tested.

RGs may act through HTL to lengthen hypocotyls

Having established the results above that are consistent with RGs working through HTL to reduce germination, we wanted to establish whether this was also true for hypocotyl elongation. We used a similar approach to establish this. First we established a dose-response relationship between RG concentration and Col seedling hypocotyl length. We used this relationship to determine the concentration of each RG that generated the maximum hypocotyl length (Cmax, Figure 3.4a). We then treated Col, max2-1, and d14-1htl-3 with each RG at its Cmax and measured the resulting hypocotyl length (Figure 3.4b). We found that treating mutants that are defective in SL perception did not lead to any further increase in hypocotyl length. This is consistent with a model of SOP action where it is inhibiting the action of HTL, because if HTL and D14 are already mutated then a compound that targets either of them would not be expected to further increase the hypocotyl length.

Figure 3.3 – Effect of RGs on the germination of Col as well as SL signaling mutants.

A) Dose response of Col germination rates to RG compounds. The average germination rates after two days are shown from three biological replicates. From these data a minimum effective concentration (MEC) for all subsequent germination experiments was determined for each compound, and each MEC is highlighted as an orange bar. Error bars represent standard error of the mean (n=3, approximately 40-70 seeds each). B) Ripened seeds of the indicated genotypes were treated with each compound at its MEC. The rate of germination was measured after 2 days. Error bars represent the standard error of the mean generated from three biological replicates (approximately 40-70 seeds each).

RG4 is Soporidine, and is the best inhibitor of Arabidopsis germination

To prioritize which compounds were the most promising leads for use as antagonists for HTL, the relative abilities of the compounds to inhibit the GR24 dependent germination of Col seeds were assayed (Figure 3.5a). The compound that showed the strongest inhibition of germination in that assay was RG4 (Figure 3.5b), which we renamed soporidine (SOP) from “sopor” meaning an unnaturally deep sleep, and “idine” from the piperidine moiety found in the chemical structure (Figure 3.5c). Interestingly, the core phenylpiperidine structure of SOP is associated with compounds with opioid effects (PMID 14451235), is shared with the opioid analgesic pethidine129, and shares substantial structural similarity with other synthetic opioids such as anileridine130. Whether or not SOP has the same opioid activity in animals is not currently known and would need to be assessed before moving forward with agricultural usage.

Figure 3.4 – Dose-response relationship of hypocotyl length to RG concentration and effect of compounds on SL signaling mutants.

A) Hypocotyl length dose response to RGs. The effect of RGs on Col hypocotyls is shown for 5-day-old seedlings. The lowest concentration at which each compound produced the maximum hypocotyl length was chosen as Cmax and used in subsequent hypocotyl elongation experiments (RG3, RG5 and RG7, 20 μM; RG1, RG4 and RG6, 30 μM; RG2, 50 μM). B) The hypocotyl lengths of seedlings with the indicated genotypes grown on each compound at its Cmax. Error bars represent standard error of the mean (n=3 with approximately 10 hypocotyls per replicate). “a” indicates a difference from the DMSO case with a p-value < 0.05 based on a two-tailed Student’s t-test.

SOP acts through SL signaling

Although SOP seemed to mimic a loss-of-function mutant in HTL, that was not sufficient evidence to say that it was acting by inhibiting SL signaling. Therefore, we undertook a genetic approach to establish SOP’s mechanism of action. If a compound works by inhibiting the action of a receptor, increasing the amount of that receptor should reduce the efficacy of the compound19. Under this rationale we tested the susceptibility of two independent transgenic Arabidopsis lines that expressed HTL under the control of a strong 35S promoter. When WT Arabidopsis seeds were treated with SOP at its minimum effective concentration (MEC), the germination rate was significantly reduced. In contrast, when the overexpression lines were treated in the same way, their germination rate was not significantly reduced (Figure 3.6a). This suggested that SOP was working by inhibiting HTL.

As was described above, an additional genetic prediction can be made if SOP acts through SL signaling: if there is a loss-of-function mutation in HTL, SOP should not further reduce the germination rate of the seeds. To be more certain of the results from Figure 3.3, we treated htl-3 with increasing concentrations of SOP to determine at what point SOP would start to inhibit the germination of the receptor mutant. However, even when htl-3 seeds were treated with 80 µM SOP, there was no reduction in the germination rate of htl-3 relative to a DMSO control (Figure 3.6b). This is consistent with a model where SOP acts specifically through SL signaling. Taken together with the gain-of-function genetics, this shows that, from a genetic perspective, SOP acts through HTL. The next question we sought to answer was whether SOP was acting directly on HTL at the biochemical level.

Figure 3.5 – The most potent inhibitor of Arabidopsis germination is RG4 (Soporidine).

A) Germination rate of wild type seeds treated with either DMSO or an RG compound at their minimum effective concentrations in the presence (+) or absence (-) of GR24rac (5 µM) as measured over the course of a week. B) A detailed representation of the germination rates of each treatment from the left panel on Day 4. * p-value < 0.05 based on a two-tailed Student’s t-test relative to treatment with GR24 alone. Error bars represent SD for 3 technical replicates. C) The 2D chemical structure of Soporidine.

SOP acts directly on HTL based on YLG hydrolysis assays

To assay whether SOP was directly binding to HTL we used two main biochemical assays. The first assay took advantage of a recently developed fluorogenic reporter called Yoshimulactone green (YLG)126. YLG is hydrolytically activated by SL receptors, so it is possible to assay for competitive inhibition of YLG hydrolysis by treating purified SL receptor protein with YLG and increasing concentrations of ligands for the receptor. When HTL protein was treated with increasing concentrations of GR24, the known agonist for HTL (Figure 3.7a), or SOP (Figure 3.7b), the level of YLG fluorescence was reduced. The apparent IC50 for GR24 was 52.8 µM, compared to 82.2 µM for SOP. These IC50s were high relative to those reported for GR24 with D14126; however, this may be because of HTL’s relatively poor hydrolytic activity98. Additionally, when GA3, a different plant hormone that signals through a different α/β hydrolase, was added to HTL there was no reduction in YLG fluorescence (Figure 3.7c).

Figure 3.6 – SOP acts through SL signaling.

A) Germination rate of wild type, two lines expressing HTL under the control of the CaMV35S promoter (HTL ox21, HTL ox23) and the loss-of-function athtl-3 seeds treated without (-) or with (+) SOP at its germination IC50. B) The germination rates of seeds of the indicated genotypes in the presence of increasing concentrations of SOP are shown. Seeds were allowed to germinate at 24 °C for 2 days. Error bars represent SD for 3 technical replicates of 40-70 seeds. The SOP IC50 used for this experiment was 30 µM. * p-value < 0.05 based on a two-tailed Student’s t-test relative to treatment with DMSO.

SOP acts directly on HTL based on intrinsic protein fluorescence

To independently test the conclusion that SOP was binding to HTL we also assayed binding of SOP with HTL using an intrinsic fluorescence binding assay. Aromatic amino acids fluoresce at a wavelength of 333 nm when they are excited with light at a wavelength of 280 nm. The intensity of the fluorescence is dependent on the environment of the amino acids in the protein; reduced solvent exposure or stacking interactions with a ligand can reduce the fluorescence of the amino acids120. Thus by treating protein with increasing concentrations of a putative ligand it is possible to measure the degree to which the ligand is altering the fluorescence of the protein.

When this was done with GR24 (Figure 3.8a) and SOP (Figure 3.8b), the changes that were seen were used to compute dissociation constants (Kds) of 0.77 µM ± 0.54 µM and 0.53 µM ± 0.40 µM for GR24 and SOP respectively. This can be contrasted with the addition of GA3, which did not cause any significant change in protein fluorescence (Figure 3.8c). Taken together, these two biochemical assays show that SOP acts directly on HTL, and with the genetics described above it is clear that SOP acts through SL signaling. Thus, SOP is a specific antagonist for HTL.

SOP is able to inhibit the GR24 dependent interaction between HTL and MAX2

As was described above, HTL engages in an SL-dependent interaction with MAX284. If SOP is able to target HTL directly and inhibit signaling through it, we might expect that it would be able to inhibit this SL-dependent interaction. To test this, we treated yeast expressing the HTL and MAX2 yeast two-hybrid pairing, or HTL and an empty-vector control (EV), with DMSO, GR24, or GR24 and SOP. When the HTLxMAX2 pairing is treated with GR24 there is a significant increase in relative β-galactosidase activity. However, when SOP is added together with GR24 there is not a significant increase in relative β-galactosidase activity (Figure 3.9). This suggests that SOP may be able to inhibit the SL-dependent interaction between these two proteins and further supports the conclusion that SOP binds to HTL.

Figure 3.7 – SOP binds directly to AtHTL based on YLG competition assays.

A-B) Relative fluorescence of a hydrolytically activated probe YLG in the presence of purified AtHTL at increasing concentrations of GR24rac and SOP. Error bars represent standard deviation for three technical replicates. The average IC50 found from two independent experiments was 82.2 µM for SOP and 52.8 µM for GR24rac. Error bars represent SEM from two independent experiments. C) Relative YLG fluorescence is shown for assays where purified AtHTL was treated with the indicated concentrations of compounds. Error bars represent standard deviation within a single experiment with three technical replicates.

SOP as well as other RGs stabilize HTL in a DARTS assay

As one final assay to measure the interaction of SOP, as well as the other RGs, with HTL, we decided to use a DARTS assay131. The binding of a ligand to a receptor protein is normally a thermodynamically stabilizing event for the receptor. For this reason, when purified receptor protein is subjected to limited proteolysis in the presence of a ligand, the degree of proteolysis is often reduced compared to a sample treated with a vehicle control. When purified AtHTL protein was treated with GR24, SOP, or RGs 1, 2, 3, or 7 there was reduced proteolysis compared to a DMSO control (Figure 3.10). This suggests that those compounds are able to interact with HTL and that the binding of ligands to HTL is a stabilizing event. This stands in contrast to D14, which is destabilized by the addition of ligand83, and suggests that although D14 and HTL act similarly in SL perception, the biochemical details of their response to SL are quite different. This may be related to the dramatically slower rate of hydrolysis that HTL shows compared to D1498.

Figure 3.8 – SOP binds directly to AtHTL based on intrinsic protein fluorescence assays.

Binding properties of AtHTL in the presence of GR24rac, SOP or GA3. Apparent dissociation coefficients (Kd) were derived from intrinsic protein fluorescence measurements over a 5-fold range-excess. For GR24 and SOP, error bars represent standard error of the mean from three biological replicates, and ± indicates standard error of the mean with respect to the calculated Kds. For GA3, error bars indicate SD for three technical replicates.

RGs 2, 3, and 5 are also able to bind directly to HTL

The result that almost all of the rest of the RGs were able to stabilize HTL in the DARTS assay caused us to wonder if we could measure the interaction of any of the other RGs with HTL. We used the intrinsic fluorescence assay to measure the binding of all the remaining RGs to HTL. We found that RGs 2, 3, and 5 are able to bind to HTL directly based on altered intrinsic protein fluorescence (Figure 3.11).

SOP and other RGs are able to suppress Striga hermonthica germination

The ability of the various RGs and SOP to inhibit Arabidopsis germination and bind to AtHTL led us to ask whether they would also be able to inhibit Striga germination in the presence of exogenous GR24. When we treated conditioned Striga seeds with 1 µM GR24 as well as 50 µM of each of the RGs, we saw that SOP as well as RGs 1, 3, and 6 were able to significantly decrease Striga germination relative to a GR24-only treatment (Figure 3.12a). To assess the potency of SOP as a Striga germination inhibitor, we treated conditioned Striga seeds with 10 nM GR24 as well as SOP in increasing concentrations. We found that as little as 1 µM of SOP was able to significantly reduce Striga germination rates relative to a GR24-only treatment (Figure 3.12b). This suggested that SOP had the potential to serve as a Striga control measure.

SOP does not noticeably inhibit rice development

SOP’s activity as an inhibitor of Striga and Arabidopsis seed germination led us to wonder whether it would also inhibit rice germination. If SOP inhibited rice germination it would make a much less useful agrichemical, since germination is a key step in crop development. To test this, we treated rice seeds with 20 µM SOP and found that they were able to germinate at a rate that was indistinguishable from a vehicle control (Figure 3.13a). To further assay whether SOP had any off-target effects we also grew the rice hydroponically for 2 weeks under SOP treatment. We found that young rice plants were similar in stature and general appearance to a vehicle control (Figure 3.13b). This approach is relatively limited, but it does suggest that SOP has no obvious toxicity in rice at the stages tested and at the concentration of SOP that was added. Further study in which rice is grown for seed would be needed to assess whether SOP has any impact on seed set or growth at later stages.

Figure 3.9 – SOP can inhibit the interaction between HTL and MAX2 based on yeast two-hybrid assays.

The relative β-galactosidase activity for each of the yeast two-hybrid pairings indicated is shown. Where GR24 and SOP are indicated, 25 and 50 µM of the compounds were added respectively. Data are the average of six independent experiments, and error bars represent SEM. * indicates p-value < 0.05 based on a two-tailed Student’s t-test relative to the relevant DMSO control.

Figure 3.10 – SOP and other RGs are able to stabilize AtHTL in a DARTS assay.

Purified protein (shown undigested in the first lane) was treated with the indicated compounds at a concentration of 50 µM as well as proteinase K. The top band indicated with the arrow is the full length AtHTL protein.

Figure 3.11 – RGs 2, 3, and 5 bind directly to AtHTL based on intrinsic protein fluorescence assays.

Binding properties of AtHTL in the presence of RGs 2, 3 and 5 are shown (panels A, B, and C respectively). Apparent dissociation coefficients (Kd) were 7.2 ± 2.0 µM, 6.5 ± 1.8 µM, and 1.4 ± 0.4 µM for RG2, 3 and 5 respectively. Kd values were derived from intrinsic protein fluorescence measurements over a 20-fold range-excess. Error bars represent SD from three technical replicates.

Figure 3.12 – SOP and RGs can suppress Striga hermonthica germination.

A) The germination rate of conditioned Striga hermonthica seeds is shown for the indicated treatments. RGs were added at a concentration of 50 µM. Error bars represent standard deviation. Averages are the result of three technical replicates. * indicates p-value < 0.05 from Student’s t-test relative to 1 µM GR24rac treatment alone. B) The germination rate of Striga hermonthica treated with either DMSO, 10 nM GR24rac or GR24rac with the indicated concentrations of SOP in µM. Error bars represent SEM from two independent replicates.

YLG competition assays show that SOP acts directly on ShHTL7

Because SOP was able to inhibit Striga germination, and because ShHTLs seem to be involved in the perception of SLs in Striga germination, we asked whether SOP acted on Striga through those ShHTLs. To accomplish this, we took advantage of the YLG competition assays that were described above. First we tested whether increasing concentrations of SOP could inhibit the hydrolysis of YLG by a crude Striga hermonthica extract. We found that SOP was able to reduce YLG fluorescence with an IC50 of 19.8 ± 4.1 µM (Figure 3.14a). Since ShHTL7 was shown to have an important role in the sensitivity of the seed to exogenous SLs, we decided to test whether SOP was able to directly interact with purified ShHTL7 based on YLG competition. The same sort of YLG hydrolysis assay described above, conducted with ShHTL7, generated an IC50 of 12.0 ± 2.9 µM for SOP (Figure 3.14b). Taken together with earlier evidence this strongly suggests that SOP is able to inhibit SL-dependent germination of Striga in response to exogenous application of GR24 through inhibiting ShHTLs including ShHTL7.

Figure 3.13 – Early development in rice is not obviously perturbed by SOP treatment.

A) Photographs of rice seeds germinated in the presence of either a vehicle control or 20 µM SOP are shown. Scale bar represents 5 mm. B) Assay was performed by Dr. Shigeo Toh. Photographs of rice grown hydroponically on either DMSO or SOP for two weeks are shown.

SOP analogs show no improvement on SOP potency in planta

Although SOP was able to inhibit Striga germination at micromolar concentrations, a robust response in Arabidopsis requires tens of µM of SOP. We sought to improve on the activity of SOP by acquiring several SOP analogs (Figure 3.15a). The analogs shared the core piperidine structure as well as the ethyl ester. When Arabidopsis seeds were treated with each of the analogs and SOP there was no significant difference in the activities of the various analogs from SOP (Figure 3.15b). Thus, the search for analogs with increased affinity should likely vary the ethyl ester position or make more dramatic alterations to the halogen decorations of the molecule. This might also have the benefit of finding SOP analogs that do not share complete structural similarity to the synthetic opioids described above.

Figure 3.14 – SOP can bind to ShHTL7 based on YLG competition assays.

A) YLG fluorescence is shown for a crude Striga hermonthica seed extract treated with the indicated concentrations of SOP. B) The YLG fluorescence of purified ShHTL7 protein treated with the indicated concentrations of SOP is shown. Values are the average of three independent assays. Error bars represent standard deviation.

Figure 3.15 – SOP analogs show similar germination activity.

A) The chemical structures of SOP as well as three structural analogs are shown. B) The effect of the indicated concentrations of SOP and each analog on the germination rate of Col seeds treated with 5 µM GR24 (rac) is shown. Values are the mean of three independent experiments, each of which is made up of triplicate observations for each condition with approximately 15 seeds per replicate. Error bars represent the standard error of the mean.

Conclusions

The major goal of the research we embarked on was to identify a small molecule antagonist for the SL receptor HTL in Arabidopsis and Striga. We used a phenotypic screening approach to identify small molecules that generated two phenotypic responses that we expected an HTL antagonist would elicit. When the putative antagonists were ranked on their ability to inhibit Arabidopsis germination, RG4 stood out as the strongest inhibitor. RG4 was renamed soporidine (SOP) and was found to directly and specifically inhibit the activity of Arabidopsis HTL. SOP was able to inhibit Striga hermonthica’s SL-dependent germination, and thus may hold promise as a lead for the development of seed coating agents that inhibit the germination of Striga seeds in the area of a treated crop seed. Furthermore, SOP was found to bind ShHTL7, the Striga HTL that seems to be the most important for the response of Striga to low concentrations of SLs in the soil.

SOP is the first antagonist for HTL that has been described. SOP shares structural similarity to synthetic opioid analgesics, but it is not known whether SOP would share that physiological property. Future development of SOP to make it a more potent inhibitor of Striga germination could focus on making changes to the core structure of the compound so that it no longer shares a common substructure with known opioid analgesics.

Materials and methods

Chemical screen for hypocotyl lengthening agents

For the primary hypocotyl screen, 1 µL of each compound dissolved in DMSO was transferred from the 3 mM working chemical library into the corresponding well of a 96-well plate. 99 µL of liquid growth media (4.3 g/L Murashige and Skoog basal salts, 25 mg/L MES, pH 5.8 with KOH, 1x Gamborg’s vitamins) was added to each well. In all wells except for negative controls, the media also contained 5 µM of GR24rac 95. Five to 15 35S::GUS:COP1 seeds were added to each well. Seeds were stratified at 4 °C for two days. The screening plates were then transferred to room temperature and grown under continuous white light at an intensity of 10 µE. The plates were scored for long hypocotyls after five days of growth. Hypocotyl lengths were compared to those of seedlings not treated with GR24 and of seedlings treated with GR24 alone to determine whether a compound would be scored as a hit. All chemicals were screened in duplicate. For the secondary germination screen, 30 µM of each hit compound was dissolved in water and added to a well of a 24-well microtiter plate. Well-germinating, ripened wild type seeds were added, and compounds that obviously reduced the speed of germination after 2-3 days were selected as secondary hits.
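The working concentrations in the primary screen follow from simple dilution arithmetic (C1·V1 = C2·V2). The short Python sketch below is illustrative only: the function names are hypothetical and are not part of the original protocol. It computes the final compound concentration and the DMSO content for 1 µL of a 3 mM stock brought to 100 µL with media.

```python
def final_concentration_um(stock_mm, stock_volume_ul, total_volume_ul):
    """Final concentration (µM) after dilution, from C1*V1 = C2*V2."""
    stock_um = stock_mm * 1000.0  # convert mM to µM
    return stock_um * stock_volume_ul / total_volume_ul


def vehicle_percent(stock_volume_ul, total_volume_ul):
    """Vehicle (DMSO) content of the well as % (v/v)."""
    return 100.0 * stock_volume_ul / total_volume_ul


# Values taken from the screen description: 1 µL of 3 mM stock plus 99 µL of media.
total_ul = 1.0 + 99.0
screen_conc_um = final_concentration_um(3.0, 1.0, total_ul)
dmso_percent = vehicle_percent(1.0, total_ul)
print(f"compound: {screen_conc_um:.1f} µM, DMSO: {dmso_percent:.1f}% (v/v)")
```

Under these volumes each library compound is screened at 30 µM in 1% (v/v) DMSO, which matches the 30 µM used in the secondary germination screen.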

Hypocotyl elongation assays

Assays were performed in the same liquid media described above. The solutions of the indicated compounds or DMSO with the media were aliquoted into the wells of 24-well plates. Surface sterilized seeds were added to the wells of the assay plate and the plates were sealed and placed at 4 °C for two to four days. Assay plates were moved to room temperature and assays were performed under continuous light at an intensity of 45 µE. After five days the hypocotyl lengths were measured using ImageJ.

Arabidopsis germination assays

The seeds used to perform the germination assays described in Figure 3.5ab were harvested less than a month before the experiment was performed, and thus were still dormant. For all other Arabidopsis germination experiments, the seeds were outside of primary dormancy and thus germinated readily. It was necessary to use seeds outside of primary dormancy because during primary dormancy atd14-1 athtl-3 and max2-1 will not germinate even if exogenous SL is added. In order to test whether antagonists were able to further suppress germination in these genetic backgrounds some level of germination was necessary. In all cases approximately 25 seeds were added to a solution of water and the indicated compounds or DMSO as a vehicle control that had been deposited into a well of a multi-well plate. Assays were performed at room temperature, under 45 µE of white light. Radicle emergence was used as the mark of germination in all assays.
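Germination is reported throughout this chapter as a mean rate with a standard error of the mean (SEM) across replicates. As a minimal sketch of that summary, assuming each replicate is scored as a count of seeds with an emerged radicle out of the seeds plated, the calculation could look like the following. The counts are hypothetical and are not data from this thesis.

```python
import math
import statistics


def germination_rate(germinated, total):
    """Percentage of seeds showing radicle emergence in one replicate."""
    return 100.0 * germinated / total


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical (germinated, total) counts for three replicates of one treatment.
replicates = [(20, 25), (18, 25), (22, 25)]
rates = [germination_rate(g, n) for g, n in replicates]
mean_rate, sem_rate = mean_and_sem(rates)
print(f"germination: {mean_rate:.1f}% +/- {sem_rate:.1f}% (SEM, n={len(rates)})")
```

The sample standard deviation (n - 1 denominator) is used here; whether the original analysis used the sample or population form is not stated in the text.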

AtHTL protein expression and purification

BL21 CodonPlus E. coli were transformed with p28-SBP-TEV carrying the AtHTL (Arabidopsis accession: AT4G37470) or ShHTL7 (GenBank accession: KR013127) cDNA. Cells were grown to an OD600 of 0.7 at a scale of 3 L. IPTG was then added to a final concentration of 300 µM. The cells were grown at 16 °C overnight (approximately 18 hours). The cells were harvested and washed in 80 mL lysis buffer (20 mM HEPES pH 7.0, 150 mM NaCl, 1 mM 2-mercaptoethanol, 5% glycerol (v/v)) at 4 °C. All steps after this point were conducted at 4 °C. After washing, cells were resuspended in 40 mL of lysis buffer and disrupted by sonication. The lysate was centrifuged at 10 000 rcf for 60 minutes to clear it of insoluble components. The cleared lysate was applied to 2.5 mL of HisPur resin which had been equilibrated in lysis buffer. The lysate was allowed to mix gently with the resin for 1 hour. The mixture of lysate and resin was applied to the column, which was then washed with 100 mL of wash buffer (20 mM HEPES pH 7.0, 150 mM NaCl, 1 mM 2-mercaptoethanol, 5% glycerol (v/v), 30 µM imidazole). The protein was eluted in approximately 3 mL elution buffer (20 mM HEPES pH 7.0, 150 mM NaCl, 1 mM 2-mercaptoethanol, 5% glycerol (v/v), 300 µM imidazole). The eluate was allowed to dialyze overnight into 4 L of lysis buffer. The protein was then concentrated to > 10 mg/mL and frozen in small volumes using liquid nitrogen.

Drug affinity response target stability (DARTS) assay

Recombinant AtHTL was diluted to a concentration of 2 mg/mL in 20 mM HEPES pH 7.0, 150 mM NaCl, 5% glycerol. A sample was taken for use as an undigested protein control. Proteinase K (Invitrogen, RNase free) was then added to the ice cold, diluted AtHTL protein to a final concentration of 1 µg/mL. Reactions were incubated at room temperature for 5 minutes and then each reaction was stopped by the addition of SDS-PAGE loading buffer and boiling for one minute. Equal volumes of each sample were run on a 12% polyacrylamide gel and analyzed by Coomassie Blue staining.

S. hermonthica seed conditioning and germination assays

Striga seeds were washed 10 times in sterilization solution (20% bleach) with 3 minute intervals of agitation in between changes of sterilization solution. The seeds were then washed extensively with sterile distilled water until all traces of bleach were removed. The seeds were then deposited onto 2.5 mm glass microfibre filters (Whatman GF/B) saturated with sterile distilled water and were incubated at 30 °C in the dark for three weeks in a closed container. Conditioned Striga seeds were deposited on glass fibre filters that were saturated with compounds dissolved in sterile distilled water at the concentrations indicated. GR24 was added at a concentration of 10 nM. SOP was added at the concentrations indicated. After 5 days of incubation at room temperature in the dark, the number of Striga seeds showing radicle emergence was scored.

Yeast two-hybrid assays

The DupLEXA yeast two-hybrid system was used to analyze protein-protein interactions between HTL and MAX2. The yeast strain RFY48 with an integrated copy of the Arabidopsis SKP1 homolog ASK1 was transformed with HTL in the bait plasmid pEG202 and the reporter pSH18-34. EGY48 was transformed with prey plasmid pJG4-5 carrying either MAX2 or nothing as an empty vector control. The HTL carrying strain was mated with both the MAX2 and empty pJG4-5 carrying strains on YPD. The mated yeast were streaked on selective media and a single colony was grown in liquid media. The cells were harvested and resuspended in 50% glycerol before being frozen at -80 °C. Yeast from the glycerol stock were added to yeast Nitrogen-Glucose (YNG) media and were grown to saturation over the course of 48 hours. The cells were then diluted 100 fold in YNG in the presence of 0.4 mg/mL ortho-Nitrophenyl-β-galactoside (ONPG) and the indicated compounds. GR24-rac was added at a concentration of 25 µM and SOP was added at 50 µM. 1 mL of the media was taken 18 hours later. The yeast cells were pelleted by centrifugation. The supernatant was collected and the OD420 was measured to determine the amount of β-galactosidase activity. The cell pellet was resuspended in 1 mL water and then was diluted 10 fold. The OD600 was then measured. The ratio of the OD420 to OD600 is plotted to describe the relative β-galactosidase activity. YNG media for growth was made by dissolving 0.85 g of yeast nitrogen base, 2.5 g of (NH4)2SO4, 0.3 g of amino acid mixture (-H -U -L -W), and 10 g of agar (for solid media only) in 850 mL of water. After autoclaving, filter sterilized ingredients were added in the following amounts: 50 mL of 20% (w/v) D-galactose, 50 mL of 10% (w/v) D-raffinose, 50 mL phosphate buffer (493 mM disodium phosphate, 250 mM monosodium phosphate), 30 mg L-leucine.
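The relative β-galactosidase activity described above is a simple ratio of two absorbance readings. The following Python sketch illustrates that calculation; the function name and all readings are hypothetical and are used only to show the arithmetic, not to reproduce data from Figure 3.9.

```python
def relative_bgal_activity(od420_supernatant, od600_cells):
    """Relative beta-galactosidase activity as the OD420/OD600 ratio."""
    if od600_cells <= 0:
        raise ValueError("OD600 must be positive")
    return od420_supernatant / od600_cells


# Hypothetical (OD420 of supernatant, OD600 of diluted cells) per treatment.
readings = {
    "DMSO": (0.10, 0.50),
    "GR24": (0.30, 0.50),
    "GR24 + SOP": (0.12, 0.48),
}
activities = {
    treatment: relative_bgal_activity(od420, od600)
    for treatment, (od420, od600) in readings.items()
}
for treatment, activity in activities.items():
    print(f"{treatment}: {activity:.2f}")
```

Normalising to OD600 corrects the reporter signal for differences in cell density between cultures, so treatments that slow yeast growth are not mistaken for treatments that reduce the interaction.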

Intrinsic fluorescence assays

Concentrated HTL protein was thawed from -80 °C and was diluted to a final concentration of 10 µM in 20 mM MES pH 6.0, 150 mM NaCl. GR24, SOP, and GA3 were diluted to the final concentrations indicated while keeping the concentration of DMSO constant at 1% (v/v). 100 µL of the protein-compound mixture was added to the wells of a flat bottom black 96-well plate. Each concentration point was measured in triplicate. A Tecan Infinite M1000 Pro spectrophotometer was used to take fluorescence measurements. Fluorescence intensities were recorded for each concentration of ligand with the protein as well as for the ligand alone. Readings were taken using an excitation wavelength of 280 nm and an emission wavelength of 333 nm. The gain was set to 70, the number of flashes to 50, the flash frequency to 400 Hz, and the integration time to 2 µs. -ΔF was calculated by taking the negative of the difference in protein fluorescence of the DMSO control and each ligand concentration. -ΔF/ΔFmax was calculated by dividing -ΔF by the maximum change in fluorescence for that series of concentrations. The results of three independent experiments were used to generate Figure 3.8 for the GR24 and SOP treatments. For the GA3 treatment one experiment was performed. The Kd determination was performed in SigmaPlot 11 using a model with a single binding site with nonspecific binding, described by the equation below, where y is total binding, x is ligand concentration, Bmax is the maximum specific binding, and N accounts for the nonspecific binding.

\[ y = \frac{B_{\max}\, x}{K_d + x} + N x \]
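The normalisation and the one-site binding model with a nonspecific term can be sketched in Python as follows. This is not the SigmaPlot procedure used in the thesis: it is a minimal stand-in that assumes ΔF is the ligand reading minus the DMSO reading, and it fits the model by scanning Kd on a log grid while solving for Bmax and N by linear least squares at each trial Kd. All function names and the input data are hypothetical; the data are synthetic values generated from the model itself.

```python
import math


def binding_model(x, bmax, kd, n):
    """One-site specific binding plus a linear nonspecific term."""
    return bmax * x / (kd + x) + n * x


def normalised_quench(f_dmso, f_ligand):
    """-dF/dFmax, assuming dF = F(ligand) - F(DMSO control)."""
    neg_delta_f = [f_dmso - f for f in f_ligand]
    max_change = max(abs(v) for v in neg_delta_f)
    return [v / max_change for v in neg_delta_f]


def fit_linear_terms(xs, ys, kd):
    """For a fixed Kd the model is linear in Bmax and N: solve the 2x2 normal equations."""
    a = [x / (kd + x) for x in xs]
    saa = sum(v * v for v in a)
    sab = sum(v * x for v, x in zip(a, xs))
    sbb = sum(x * x for x in xs)
    say = sum(v * y for v, y in zip(a, ys))
    sby = sum(x * y for x, y in zip(xs, ys))
    det = saa * sbb - sab * sab
    bmax = (say * sbb - sby * sab) / det
    n = (saa * sby - sab * say) / det
    sse = sum((binding_model(x, bmax, kd, n) - y) ** 2 for x, y in zip(xs, ys))
    return bmax, n, sse


def fit_kd(xs, ys, log_lo=-2.0, log_hi=2.0, rounds=6, points=41):
    """Scan log10(Kd) on a grid, then repeatedly zoom in around the best value."""
    best = None
    for _ in range(rounds):
        step = (log_hi - log_lo) / (points - 1)
        for i in range(points):
            kd = 10 ** (log_lo + i * step)
            bmax, n, sse = fit_linear_terms(xs, ys, kd)
            if best is None or sse < best[3]:
                best = (kd, bmax, n, sse)
        centre = math.log10(best[0])
        log_lo, log_hi = centre - step, centre + step
    return best


# Synthetic, noise-free example (Bmax = 1.0, Kd = 0.5 µM, N = 0.01); not thesis data.
ligand_um = [0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0, 20.0]
synthetic_y = [binding_model(x, 1.0, 0.5, 0.01) for x in ligand_um]
kd_fit, bmax_fit, n_fit, sse = fit_kd(ligand_um, synthetic_y)
print(f"Kd ~ {kd_fit:.3f} µM, Bmax ~ {bmax_fit:.3f}, N ~ {n_fit:.4f}")
```

Treating Kd as the only nonlinear parameter keeps the sketch free of external libraries; a general nonlinear least-squares routine would normally be used for real data and would also provide the parameter uncertainties reported alongside the Kd values.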

YLG hydrolysis assays

Concentrated HTL protein was thawed from -80 °C and was diluted to a final concentration of 5 µM for AtHTL protein, 10 µM for ShHTL7, and 100 µg/mL for the crude Striga extract in 20 mM HEPES pH 7.0, 150 mM NaCl, 5% (v/v) glycerol. GR24, SOP, and GA3 were diluted to the indicated concentrations in the protein solution. The concentration of DMSO was held constant at 1% (v/v) in all assays. YLG was added to a concentration of 2.5 µM for AtHTL experiments, and 1 µM for the assays described in Figure 3.7. Fluorescence was monitored using a Tecan Infinite M1000 Pro using a 480 nm excitation wavelength and a 520 nm emission wavelength with a 5 nm bandwidth and a 30 minute kinetic read cycle. The gain was set to 100, the flash frequency was set to 400 Hz, the z-position was set to 20000 µm, and the readings were taken in a 2x2 square matrix. The change in fluorescence observed over the course of 1.5 hours of YLG in buffer without AtHTL was subtracted from the change observed for AtHTL with the various amounts of each compound to generate the relative fluorescence values reported. In the case of the Striga extract experiments (Figure 3.14a) the fluorescence was observed at 20 minutes, and for the HTL7 experiments (Figure 3.14b) it was recorded at 7.5 minutes, with both using a 2.5 minute kinetic read cycle. The inhibitory curves were plotted and the IC50 values determined using the SigmaPlot 11.0 four parameter logistic curve. The IC50 values are the EC50 as determined by SigmaPlot based on the predicted maximum change observed for each assay.
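For orientation, the background subtraction and the shape of a four parameter logistic inhibition curve can be written out in Python. This is a sketch under stated assumptions rather than the SigmaPlot implementation: the parameterisation below (bottom, top, IC50, Hill slope) is one common convention, the function names are hypothetical, and apart from the 82.2 µM IC50 reported for SOP on AtHTL every number is invented for illustration.

```python
def relative_fluorescence(change_with_protein, change_buffer_only):
    """Subtract the YLG-in-buffer background change from each protein-containing well."""
    return [c - change_buffer_only for c in change_with_protein]


def four_parameter_logistic(x, bottom, top, ic50, hill):
    """Decreasing dose-response curve (for hill > 0); equals the midpoint at x = ic50."""
    return bottom + (top - bottom) / (1.0 + (x / ic50) ** hill)


# Hypothetical curve parameters; only the IC50 value is taken from the text.
bottom, top, ic50, hill = 0.0, 1.0, 82.2, 1.0
concentrations_um = [1.0, 10.0, 82.2, 500.0]
curve = [four_parameter_logistic(c, bottom, top, ic50, hill) for c in concentrations_um]
for conc, value in zip(concentrations_um, curve):
    print(f"{conc:7.1f} µM -> relative YLG fluorescence {value:.3f}")
```

In this form the IC50 is the competitor concentration at which the signal falls halfway between the fitted top and bottom plateaus, which is why the value depends on the predicted maximum change for each assay.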

Rice germination assays

300 µL of ddH2O containing either 20 µM SOP or DMSO (0.2% DMSO in both cases) was added to each well of a 24-well plate. One Nipponbare rice seed was added to each well. The plate was sealed and placed at 30 °C under continuous light. The germination rate of the rice was determined by radicle emergence two days after imbibition.

RG compound effects on rice seedlings

Nipponbare rice seeds were removed from the husks. To sterilize the seeds, they were imbibed in 1.2% sodium hypochlorite solution for 15 min and rinsed five times in sterile water. Seeds were then incubated in sterile water at 30 °C in the dark for 4 days in a 50 ml tube. Germinated seeds were put in 20 μM RG compounds with 1.5 ml 0.8% agar in a 24-well plate and cultured at 24 °C under continuous fluorescent white light (100 μmol m-2 s-1) for 10 days.

Chapter 4: Exploration of downstream strigolactone signaling using gain-of-function screening and RNA sequencing approaches

Abstract

We screened the Arabidopsis FOX line collection to identify Arabidopsis cDNA overexpression lines that were able to resist the effect of SOP, an antagonist for the Arabidopsis SL receptor HTL, at the level of germination using a thermoinhibition system. This screening uncovered a collection of genes with no known role in SL signaling including SOPR244, which was found to code for the splicing factor U2AF35B. Multiple independent U2AF35B overexpression lines in a Col background showed enhanced germination, confirming a role for that gene in germination. When U2AF35B was overexpressed in an htl-3 genetic background, multiple lines showed partial rescue of the htl-3 reduced germination phenotype. Additionally, multiple independent U2AF35B overexpression lines partially rescued the htl-3 increased hypocotyl length phenotype. These pieces of evidence suggest that U2AF35B may have a role in SL signaling.

In parallel we decided to pursue a transcriptomic approach to understanding SL response. We used RNA sequencing to analyze the transcriptomes of Col seedlings treated with SOP, Col seedlings treated with a vehicle control, and htl-3 seedlings treated with a vehicle control. When we looked at the transcriptional state of the antagonist treatment and the mutant compared to Col we found that there was a weak correlation in the changes in gene and isoform abundance seen. However, out of the 14 genes that were hits in the FOX line screen and whose transcripts were detected, 6 were differentially expressed in Col compared to htl-3, hinting that those genes might be involved in SL response.

Introduction

Our understanding of strigolactone (SL) signaling has evolved quickly in the 8 years since the discovery that SLs are endogenous plant hormones that control, most notably, branching79,80. The model shown in Figure 1.3 that describes the mechanism of SL perception, however, is missing detail on what events actually cause the transcriptional changes associated with SL signaling. As mentioned above, the TOPLESS related protein TPR2 seems to play a role in D14 mediated processes88 as well as ones that are regulated by SMXL6-8132 but a role for that protein in the seed is still not supported by any evidence. This raises the obvious question of what downstream genes are responsible for transmitting the SL signal from the HTL/MAX2/SMAX1 complex into a transcriptional change. Up to this point, mutants with an SL phenotype have not been mapped to transcription factors, and therefore new approaches are required in order to identify genes that may be involved in HTL-dependent transcriptional reprogramming.

We chose to use a gain-of-function genetic screening approach to attempt to uncover novel SL signaling genes. This approach was particularly attractive in light of the newly discovered antagonist for HTL, Soporidine (SOP). With the development of SOP, it became possible to screen a library of overexpression (OX) lines not only for SL phenotypes, but for SL phenotypes in the presence of an antagonist for the SL receptor involved in germination. This is a desirable situation because it could allow us to uncover genes that are acting at or downstream of HTL in the SL signaling pathway.

The reasoning for why this approach should enrich for genes that act downstream of the receptor is as follows. If we were to simply screen an overexpression library for lines that are able to mimic a constitutive SL response, we would expect to find not only genes involved in the perception of SLs but also genes involved in their biosynthesis or in signaling pathways that are genetically upstream of SL signaling. For example, if we were to overexpress an SL biosynthetic gene in a normal assay, we might expect to see increased germination due to increased accumulation of SL. If, however, we were to overexpress an SL biosynthesis gene in the presence of an antagonist for HTL, the overaccumulation of natural SLs would be less likely to generate a response since the receptor is inactivated by the antagonist. On the other hand, if a transcription factor that is working downstream of HTL is overexpressed, we would not expect the antagonist for HTL to be able to block its effect on the seed. Thus, a downstream signaling gene would be a “hit” in the screen.

The arguments made above for genetic screening for insensitivity to SOP could equally be applied to a more traditional chemically mutagenized population. However, there are arguments for screening using a gain-of-function approach instead of a normal M2 population. Firstly, there is the consideration of the ease of identifying and confirming hits. In a normal genetic screen, lengthy stages of confirmation are required to identify the mutation that is causing the mutant phenotype of interest. After a hit from an EMS mutagenized population is selected, it must be grown up and retested in the next generation. Following this stage, it should ideally be backcrossed into a wild-type background to reduce the number of other mutations that are carried along with the mutation that is responsible for the phenotype, as well as to assess whether the mutation is dominant or recessive. After backcrossing and recovering mutants, it is necessary to cross the mutant into a different accession, carry the cross forward, and recover mutants in subsequent generations. At that point sequencing can be used to generate a list of candidate genes. However, the experimenter’s work is still not done at this stage, as it is necessary to generate additional alleles to confirm that the mutation is sufficient to generate the phenotype.

This might be as simple as ordering T-DNA insertion lines, but it could also require using CRISPR to make additional mutants133. All told, the process of backcrossing, making a mapping population, and generating additional mutant lines can easily take at least seven generations before the causative mutation is identified and multiple independent lines with a mutation in the targeted gene can be assessed. By comparison, after a hit from an overexpression-based screen is selected and retested, it is simply a matter of using PCR to amplify the transgene and sequencing to identify it. It is then possible to make new transgenic lines to test whether the transgene is sufficient to generate the phenotype of interest. This takes no more than three generations. Thus, using an overexpression-based system is much faster than using a traditionally mutagenized population.

The second advantage of a gain-of-function system based on overexpression is that it should tend to generate mutants that are dominant. This is an advantage because of functional redundancy. Most mutants generated in a chemically mutagenized population will be loss-of-function mutants. This is simply because it is easier to render a gene non-functional than it is to make it more functional. Since most loss-of-function mutants are recessive, a normal genetic screen will tend to generate a collection of mostly recessive mutants. This is a problem since many important gene families have many members. For example, in Arabidopsis the family of ABA receptors contains 14 members, 13 of which seem to be responsive to ABA134. A loss-of-function mutation in any one gene in such a large gene family is unlikely to generate a clear phenotype. On the other hand, in the case where a single member of a large family is overexpressed it may have a phenotype even if the loss-of-function mutation does not.

One final advantage of a gain-of-function genetic screening approach that uses overexpression is that hits that are generated are more likely to be easy to work with than an average gene from the Arabidopsis genome. Many genes are difficult to clone or express in planta. Since any hit from an overexpression screen had to be successfully cloned and overexpressed, hits generated from this sort of screen should be relatively easy to work with in the lab.

For the reasons above we chose to use the Arabidopsis full-length overexpressor (FOX) collection that has been assembled by RIKEN112. This collection is composed of approximately 20,000 independent overexpression lines. Those 20,000 lines are separated into 20 sets of 20 pools of seed, where each pool of seed is provided in a 1.5 mL tube containing 50 lines with approximately 8 individuals per line. Because each pool contains a small number of lines (compared to a chemically mutagenized population) it is possible to conduct a screen that requires fairly small quantities of reagents.

Having settled on a collection to screen, the next question was how to conduct the screen. SOP is able to lengthen the hypocotyls of Arabidopsis seedlings and is able to suppress their germination. These phenotypes are similar to the long hypocotyls and reduced germination that htl-3 plants have. Both phenotypes are easy to assess and thus are suitable for a genetic screen. However, we chose to screen for overexpression lines that were able to germinate in the presence of SOP. We decided this for three reasons. The first reason is that germination is a phenotype that is more relevant to the Striga problem: if we are going to use genetic approaches to uncover new SL signaling genes, we are most interested in those that might be important for understanding how Striga is able to sense and attack its host. Secondly, germination in the presence of SOP is a positive trait, whereas a short hypocotyl in the presence of SOP is a negative trait. Any overexpression line that makes the plant “sick” could generate what looks like a short hypocotyl, but such sick plants would not be expected to germinate more readily than WT. The final reason we chose to use germination was that it is easier to score rare germination events in the presence of SOP than it is to look for hypocotyls that are shorter than average in the presence of SOP.

However, germination is not without its challenges as a screening system. One main challenge relates to the age of the seeds that are provided. Ripened Arabidopsis seeds germinate well at room temperature in water. Under these conditions SOP can reduce the average germination rate but not suppress it entirely. For this reason, we used conditions that induce secondary dormancy. In particular, when Arabidopsis seeds are thermally stressed they will fail to germinate, and exogenous application of SLs is able to overcome that inhibition117. By imbibing the FOX line seeds at 32 ºC we were able to suppress germination to the point that in the presence of SOP there was only very limited germination. Thus we were able to select only the few seeds that were able to germinate in the presence of SOP and high temperature. Using this thermoinhibition system we were able to isolate overexpression lines that had the potential to play a role in downstream SL signaling. One of these genes, SOPR244, codes for U2AF35B, which is a splicing factor. Genetic analysis showed that U2AF35B may play a role in downstream SL signaling.

Results and discussion

Genetic screening for SOP insensitivity

The Arabidopsis FOX collection was screened to identify seeds that could germinate in the presence of SOP (Figure 4.1). Approximately 500 individuals that were able to germinate in the presence of 20 µM SOP were collected in the primary screen and were grown to seed. In the next generation 92 out of those 500 were confirmed to germinate more readily than WT when treated with DMSO and/or SOP.

In 18 out of the 92 lines, the genes that were overexpressed were amplified by PCR and successfully sequenced. The identities of those genes are shown in Table V.

Some of the genes were expected but uninteresting. For example, the overexpression of a major facilitator superfamily protein (AT4G36790) conferred insensitivity to SOP.

Genes of this class are often implicated in antibiotic resistance because they can pump drug-like molecules out of the cell. Thus, it is not hard to imagine how overexpressing such a gene could lead to resistance to SOP. However, some genes that were uncovered were more interesting. For example, SOPR244 carried a gene coding for a protein with a zinc finger motif that seemed to be the spliceosome subunit U2AF35B. This sort of gene could be involved in mediating transcriptional changes in response to SL treatment, and therefore it was chosen for further study. The only other protein with predicted DNA binding activity was ATL68, a U-Box protein with zinc finger motifs. Both it and U2AF35B were carried forward for further analysis, but U2AF35B was successfully cloned first and will be discussed in more depth below.

Figure 4.1 – Schematic diagram of screen for SOP insensitive FOX lines.

Individual T2 lines expressing Arabidopsis cDNAs are pooled to form FOX screening pools. FOX screening lines were screened for the ability to overcome the effect of SOP on germination rates. Seeds that are able to germinate in the presence of SOP are selected and grown for further analysis.

Overexpression of U2AF35B increases germination

Because the phenotype that was seen from the original SOPR244 hit in the screen could have been due to a positional effect such as disrupting a negative regulator of SL signaling it was necessary to generate independent U2AF35B overexpression lines.

Independent transgenic lines overexpressing U2AF35B were made in both a Col and htl-3 background. We chose to generate lines in an htl-3 background concurrently with the Col background because if U2AF35B is a downstream element of SL signaling it might be expected to be able to suppress some or all of the phenotypes shown by htl-3.

We also chose to overexpress the protein separately with either an N- or a C-terminal flag tag. We chose to try the overexpression with the tag on either end of the protein because some proteins will not express properly when a tag is attached to their N or C terminus. As can be seen in Figure 4.2, various independent lines showed differing levels of germination. We observed that overexpression lines with a C-terminal flag tag fusion showed increased rates of germination overall, and so we decided to use the two best germinating lines from those constructs for subsequent analysis.

SOP cannot suppress the germination of U2AF35B overexpressors

Although the initial hit from the genetic screen was able to germinate in the presence of the HTL antagonist SOP, we wanted to test whether the U2AF35B overexpression lines that were made independently and that carried a C-terminal flag tag were also able to resist it. When two independent SOPR244 overexpression lines in a Col background were treated with 30 µM SOP and their germination rate was monitored over the course of 7 days, we found that at no point was there any significant decrease in the germination rate of those overexpression lines (Figure 4.4). This confirmed that lines that overexpressed U2AF35B were in fact SOP resistant.

Figure 4.2 – Selection of U2AF35B overexpression lines.

Average germination rates for the indicated genotypes one week after imbibition. Overexpression (OX) lines are in the second generation post transformation (T2) with SOPR244 from pGWB611/612 vectors. All seeds are the same age. Seeds were germinated in water under continuous white light. Data is derived from three independent experiments with approximately 25 seeds per replicate. Error bars represent SEM. * p-value < 0.05, 2 tailed student’s T test.

Figure 4.3 – GR24 addition can enhance 35S:U2AF35B phenotype.

A) The germination rate of Col, and two independent SOPR244 overexpression lines (T2) treated with a vehicle control or 5 µM GR24rac are shown over the course of 7 days. B) The germination rates from part A are shown for only day 3. Germination rates are the average of three biological replicates, ~ 75 seeds/replicate. Error bars represent standard deviation. * p-value < 0.05, 2 tailed student’s T test.

GR24 treatment can enhance the effects of U2AF35B overexpression

Since SOP was not able to suppress the germination rate of U2AF35B overexpressors, it seemed possible that U2AF35B overexpressors were displaying a constitutive SL response in the seed. To address this question we decided to test the effect of GR24 treatment on germination in U2AF35B overexpression lines. We reasoned that if the germination rate of overexpression lines showed no further increase in the presence of GR24, those lines would be showing a strong constitutive SL response phenotype.

When the germination rates of overexpressors were measured over the course of one week in the presence and absence of GR24, we found that additional GR24 delivered a modest stimulation of germination (Figure 4.3). These results may indicate that the expression level in the overexpression lines tested is not high enough to generate a phenotype that is so strong that GR24 has no additional effect. Another plausible possibility is that U2AF35B is only able to act in some parts of the downstream response of the seed to SLs. Thus, no matter what level of expression is generated, U2AF35B will not be able to reproduce the entire phenotype associated with GR24 addition. This seems quite plausible given past results showing that there may be a role for light signaling components in SL signaling114.

Figure 4.4 – SOP does not inhibit the germination rate of U2AF35B overexpression lines.

A) The germination rate of Col, and two independent SOPR244 overexpression lines (T2) treated with a vehicle control (-) or 20 µM SOP (+) are shown over the course of 5 days. B) The germination rates from part A are shown for only day 5. Germination rates are the average of three biological replicates, ~ 75 seeds/replicate. Error bars represent standard deviation.

Overexpression of U2AF35B in an htl-3 background can partially suppress that mutant’s germination defect

Since the overexpression of U2AF35B was able to confer insensitivity to SOP, we expected that it might be acting at or downstream of SOP’s target, HTL. We were also emboldened by the fact that U2AF35B is necessarily a DNA binding protein and thus is likely to be acting towards the end of the signaling pathway. In order to test whether U2AF35B is acting at or downstream of the receptor, we decided to test the ability of overexpression lines to suppress the effect of htl-3 on germination. As was shown in Figure 4.2, U2AF35B overexpression lines were made in an htl-3 background with both an N- and C-terminal flag tag fusion. As was the case with the Col background, C-terminal fusions showed a higher germination rate.

When the germination rates of two independent 35S::U2AF35B/htl-3 lines were measured over the course of one week, we found that the overexpression lines showed a higher average germination rate than htl-3 (Figure 4.5). This showed that overexpression of U2AF35B was at least partially able to suppress the germination defects associated with htl-3. This result suggests that U2AF35B acts at or downstream of HTL in the perception of SLs in the seed. It is also worth noting that the seed that was used in this assay was T2 seed, and thus we expect as many as 1 in 4 of the seeds to have an htl-3 genotype but to lack the transgene. Thus this assay likely underestimated, rather than overestimated, the effect of U2AF35B overexpression on the germination rate of htl-3 plants.
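
The dilution argument above can be made concrete with a short calculation. The sketch below is only illustrative: it assumes a single-locus transgene insertion segregating 3:1 in the T2 generation (insertion number was not determined here), it assumes that seeds lacking the transgene behave like htl-3, and the function name and germination rates are hypothetical placeholders rather than measured values.

```python
def transgenic_rate(pool_rate, background_rate, fraction_without_transgene=0.25):
    """Estimate the germination rate of transgene-carrying seeds in a T2 pool.

    Assumes the pooled rate is a weighted average of seeds that carry the
    transgene and seeds that lack it (the latter behaving like the background).
    """
    if not 0 <= fraction_without_transgene < 1:
        raise ValueError("fraction_without_transgene must be in [0, 1)")
    fraction_with_transgene = 1.0 - fraction_without_transgene
    # Remove the contribution of non-transgenic seeds, then rescale.
    return (pool_rate - fraction_without_transgene * background_rate) / fraction_with_transgene


# Hypothetical rates for illustration only (not data from this chapter).
pool_rate = 0.50   # rate observed for a segregating 35S::U2AF35B htl-3 T2 pool
htl3_rate = 0.20   # rate observed for htl-3 alone
estimate = transgenic_rate(pool_rate, htl3_rate)
print(f"Estimated rate for transgene-carrying seeds: {estimate:.2f}")
```

Under these assumptions, whenever the pooled rate exceeds the htl-3 rate, the corrected estimate is higher than the pooled rate, which is the sense in which the T2 assay is conservative.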

Figure 4.5 – 35S::U2AF35B htl-3 seeds germinate more readily than htl-3.

A) Average germination rates for the indicated genotypes are shown over the course of one week. B) Germination rates are shown for day 4. OX lines are T2. All seeds are the same age. Seeds were germinated in water under continuous white light. Data is derived from three independent experiments with approximately 75 seeds per replicate. Error bars represent SD. * p-value < 0.05 relative to htl-3 based on 2 sided student’s T test.

Overexpression of U2AF35B in an htl-3 background can partially suppress that mutant’s hypocotyl length defect

The observation that U2AF35B overexpression was sufficient to partially suppress the germination defect of htl-3 plants raised the question of whether U2AF35B overexpression could suppress other htl-3 phenotypes. As was mentioned above, loss-of-function mutations in HTL cause seedlings to have a longer hypocotyl, and so we decided to test whether 35S::U2AF35B htl-3 plants show shorter hypocotyls than the mutant background alone. When the average hypocotyl length of seedlings after 5 days of growth was measured, we found that two independent U2AF35B overexpression lines in an htl-3 background showed a shorter hypocotyl than htl-3 alone (Figure 4.6).

Although the suppression was not total, since the seed pool used for these assays was T2, it is likely that as many as 1 in 4 of the seeds were htl-3 but lacked the transgene. Thus individuals that are homozygous for the overexpression construct will likely have an even shorter hypocotyl and perhaps more robust expression (Figure 4.7).

This result shows that U2AF35B is able to partially suppress two independent phenotypes associated with htl-3. This places U2AF35B as a potential player in HTL mediated processes in the seed and seedling, although the question of how close to SL signaling U2AF35B acts is not thoroughly answered at this point. Although U2AF35B seems to be genetically at or below the receptor, it is possible that it is only distantly involved in downstream responses and is not directly influenced by the receptor. In order to address whether U2AF35B is directly targeted by SL signaling machinery, further work will have to be done to establish a mechanism for its involvement.

Figure 4.6 – 35S::U2AF35B htl-3 seedlings have shorter hypocotyls than htl-3.

Average hypocotyl length of the indicated genotypes from three independent experiments is shown. Hypocotyls were measured on day 5. OX lines are T2 plants. * p-value < 0.05, 2 sided student’s T test relative to htl-3. Error bars represent SEM.

RNA sequencing based exploration of htl-3 and SOP treated seedlings

The overexpression based genetic screening described above was successful in uncovering at least one gene that may play a role in SL signaling downstream of HTL.

We were intrigued by the types of transcriptomic responses that are elicited when the SL signaling machinery is functioning properly compared to when it is perturbed through either genetic or chemical approaches. In the past, transcriptomic analyses of HTL mode of action have been limited to analyzing the response of plants to ligands for HTL135. However, since those plants already contain endogenous HTL ligands, it is unclear whether transcriptomic changes upon compound treatment are necessarily biologically important. Additionally, since a microarray was used, the abundances of various splice variants under different conditions have not been explored on a whole genome level. In order to address this shortcoming of presently available data, we decided to use RNAseq to compare the abundances of mRNAs and their splice variants in Col treated with a vehicle control to those in Col treated with SOP or htl-3 treated with a vehicle control. Additionally, since SOP was able to mimic the phenotypes of htl-3 plants, we expected that by measuring the transcriptome of SOP treated plants we would be able to generate a more robust picture of the transcriptomic changes associated with blocking SL perception, since we would be measuring the effect of two independent ways of inhibiting the effects of endogenous SLs on the transcriptome.

Changes in gene expression levels in htl-3 and SOP treated seedlings are weakly correlated

The transcript abundances in seedlings for all treatments were calculated as described in the methods section and were expressed in FPKM (Fragments Per Kilobase of transcript per Million mapped reads). Comparison of average transcript abundances showed that in general the transcript abundances between treatments were well correlated (Figure 4.8). Indeed, the Pearson correlation coefficients of DMSO treated Col seedlings with SOP treated Col seedlings and with htl-3 seedlings were 0.9852 and 0.9440, respectively. The correlation between SOP treated Col seedlings and htl-3 seedlings was 0.9783. This shows that the transcriptomic state of SOP treated Col seedlings is more similar to that of htl-3 seedlings than is that of DMSO treated Col. When the fold change of transcript abundance in SOP treated Col versus DMSO treated Col and in DMSO treated htl-3 versus DMSO treated Col were plotted (Figure 4.9), there was a weak positive correlation between those fold changes. This again shows that although the transcriptome of SOP treated Col is more similar to that of DMSO treated Col than to that of DMSO treated htl-3, SOP treated Col is more similar to DMSO treated htl-3 seedlings than DMSO treated Col is.
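
As an illustration of the two comparisons described above, the following minimal Python sketch computes a Pearson correlation first on transcript abundances and then on log2 fold changes. It is not the pipeline used for this analysis: the function names, the pseudocount of 1 added before taking ratios, and the FPKM values are all assumptions made purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two abundances; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical average FPKM values for five transcripts (not data from this study).
col_dmso = [12.0, 150.0, 3.5, 48.0, 0.8]
col_sop = [14.0, 140.0, 4.0, 60.0, 0.5]
htl3_dmso = [20.0, 120.0, 6.0, 75.0, 0.2]

# Abundance-level comparison between two samples.
r_abundance = pearson(col_sop, htl3_dmso)

# Fold-change comparison: do the same transcripts move in the same direction?
fc_sop = [log2_fold_change(t, c) for t, c in zip(col_sop, col_dmso)]
fc_htl3 = [log2_fold_change(t, c) for t, c in zip(htl3_dmso, col_dmso)]
r_fold_change = pearson(fc_sop, fc_htl3)

print(f"abundance r = {r_abundance:.3f}, fold-change r = {r_fold_change:.3f}")
```

Correlations on raw abundances tend to sit close to 1 because transcripts differ by orders of magnitude in every sample, so the fold-change comparison is generally the more discriminating of the two.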

As might be expected from the finding described above, when the number of genes with significantly altered transcription between DMSO treated htl-3 and DMSO treated Col was compared to the number between SOP treated Col and DMSO treated Col, we found that more genes were significantly altered when comparing htl-3 to Col. This can be seen in the volcano plots (Figure 4.10), which show a larger number of changed genes in the htl-3 comparison to Col. In fact, the number of genes found to be significantly different between htl-3 and DMSO treated Col with a p-value of 0.01 was 1144 (Table VI), compared to 75 from SOP treatment (Table VII). Of those genes, four were significantly altered under both forms of treatment. Those genes were AT1G07135, CRWN1, PHOT1, and ZAT10.

Figure 4.7 – Western blot showing the expression of flag tagged U2AF35B protein in Col and htl-3 backgrounds.

Blots were performed by Asrinus Subha. A) Western blot of 35S:U2AF35B lines. The migration of 20 µL of protein extract on a 12% polyacrylamide gel from the indicated overexpression lines is shown after probing and exposing with a primary anti-flag (mouse) antibody, and a secondary alkaline-phosphatase (anti-mouse) antibody. Protein ladder from first lane is superimposed to show migration distance. B) Ponceau staining of the blot from A is shown to indicate equal loading.

Figure 4.8 – Transcript abundance for Arabidopsis seedlings under genetic and pharmacological perturbation of HTL.

Average transcript abundances are shown for DMSO treated Col (D), SOP treated Col (R), and DMSO treated htl-3 (H) seedlings. Diagonal panels show the number of gene transcripts (in FPKM) occurring at various abundances. All other panels show the average abundances (FPKM) of a given transcript in one treatment relative to the same for another treatment.

Figure 4.9 – Transcript abundance changes in SOP treated seedlings compared to htl-3.

Log2 fold change of transcript abundance in htl-3 over Col plants treated with DMSO is plotted versus log2 fold changes in transcript abundance of SOP treated Col over DMSO treated Col. Pearson correlation coefficient was calculated to be 0.27. Black line indicates line of best fit for log transformed data. Blue lines indicate density contours.

In order to explore what kinds of genes were significantly changed between treatments we used Gene Ontology (GO) enrichment analysis. When the GO biological processes were checked for genes that were differentially expressed between Col and htl-3 (Table VIII), some of the enriched categories were expected. For example, genes that were differentially expressed showed a four-fold enrichment in genes involved in response to karrikin. Other expected GO categories included response to hormone and response to endogenous stimulus. When the same approach was used to analyze categories overrepresented in SOP treated Col compared to DMSO treated Col (Table IX), the GO category “response to hormone” was still present, but response to karrikin was not.

When only genes that were differentially expressed in both htl-3 and SOP treated seedlings were analyzed, those four genes were not significantly enriched in any category.
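
For readers unfamiliar with how a fold enrichment of this kind is obtained, the sketch below shows one common way to compute it, together with a hypergeometric test for over-representation. The statistical test used by the GO tool in this work is not stated here, so the hypergeometric test is an assumption, and all gene counts in the example are hypothetical.

```python
from math import comb


def fold_enrichment(hits_in_category, n_hits, category_size, n_background):
    """Observed hits in a category divided by the number expected by chance."""
    expected = n_hits * category_size / n_background
    return hits_in_category / expected


def hypergeom_upper_tail(k, n_hits, category_size, n_background):
    """P(X >= k) when n_hits genes are drawn without replacement from the background."""
    total = comb(n_background, n_hits)
    upper = min(n_hits, category_size)
    favourable = sum(
        comb(category_size, i) * comb(n_background - category_size, n_hits - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts for illustration only (not values from Table VIII).
n_background = 20000   # genes considered in the analysis
category_size = 100    # genes annotated to the GO category
n_hits = 1000          # differentially expressed genes
hits_in_category = 20  # differentially expressed genes in the category

fe = fold_enrichment(hits_in_category, n_hits, category_size, n_background)
p = hypergeom_upper_tail(hits_in_category, n_hits, category_size, n_background)
print(f"fold enrichment = {fe:.1f}, hypergeometric p = {p:.2e}")
```

With very short gene lists, such as the four genes shared between the two comparisons, a test of this kind has little power, which is consistent with no category reaching significance.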

We were curious to know how the expression of the genes that were generated as hits in the FOX line screens was influenced by either the htl-3 background or SOP treatment. We were able to detect expression of 14 of the 18 FOX screen hits. Six of those genes were differentially expressed in htl-3 plants, although no significant differences were seen under SOP treatment (Figure 4.11). Although none of the genes changed in abundance under SOP treatment, it is worth noting that the screen was performed using germination, not hypocotyl elongation. If the RNA had been collected from germinating seeds rather than from seedlings, there might have been a more robust change in expression for those genes under SOP treatment.

Figure 4.10 – Volcano plots of transcripts for htl-3 seedlings and SOP treated Col.

A) The log2 fold change in DMSO treated htl-3 transcript abundance over DMSO treated Col is shown relative to the -log10(p-value) for each gene. B) The log2 fold change in SOP treated Col transcript abundance over DMSO treated Col is shown relative to the -log10(p-value) for each gene. Red dots represent transcripts with a p-value < 0.01.

Exploration of isoform abundance in htl-3 and SOP treated seedlings

The relative changes in gene expression that we were able to detect using whole gene analysis were relatively small. We were curious to see how the data might be affected if the analysis was shifted from the gene level to the gene isoform level. This form of analysis seemed particularly attractive since at least one gene that seemed to be playing a role in SL signaling downstream of HTL was a splicing factor.

The first thing that we noticed when conducting isoform analysis was that the inter-treatment variability was no lower than when gene level analysis was used. The Pearson correlation coefficients between Col and SOP treated Col or htl-3 were 0.9850 and 0.9447, respectively, and the correlation between SOP treated Col and htl-3 was 0.9790 (matrix scatter plot between samples in Figure 4.12). These values were almost identical to those where gene abundance was used rather than isoforms and give the same impression that SOP treated Col is more similar to htl-3 than DMSO treated Col is. Interestingly, the correlation coefficient for the log2 transformed relative expression data for the isoforms was 0.39 (Figure 4.13) compared to 0.27 in the whole gene analysis. This result seemed to support the idea that when there are indications that splicing is involved in a process, looking at individual isoforms may be more meaningful than looking at whole genes. Additionally, comparing Figure 4.9 with Figure 4.13 makes it readily apparent that the magnitude of the fold changes seen using isoforms is greater than at the gene level.

Figure 4.11 – Transcript abundances are shown for hits from FOX screen.

The transcript abundances of genes that were hits in the FOX line screen for insensitivity to SOP are shown (FPKM). Error bars represent standard deviation. Values are the average of three independent experiments. * p-value < 0.05 based on 2 sided student’s T test.

Figure 4.12 – Isoform abundance for each treatment.

Average gene isoform abundances are shown for DMSO treated Col (D), SOP treated Col (R), and DMSO treated htl-3 (H) seedlings. Diagonal panels show the number of gene isoforms (in FPKM) occurring at various abundances. All other panels show the average abundances (FPKM) of a given gene isoform in one treatment relative to the same for another treatment.

However, this larger difference between samples was not necessarily indicative of a more robust transcript level response between samples. When volcano plots (Figure 4.14) were used to visualize the p-values and fold change in expression between treatments, it appeared that the isoforms that show the largest variability also show very high p-values. This implies that although some isoform abundances are very different between treatments, there is great variability between replicates and those fold changes are not necessarily reliable. A closer inspection of Figure 4.13 shows that most of the points that are very far from the diagonal relationship fall beneath a value of 0 on the log10 scale. This indicates that the isoforms where the FPKM values are the most different between treatments are generally present at abundances of less than 1 FPKM. If very low abundance isoforms are responsible for much of the variation between samples, it might explain why those differences are not meeting the p-value cut-off for significance.

To explore this possibility, we plotted, for each isoform in each treatment, the ratio of the standard deviation between replicates to the average FPKM of that isoform against that isoform’s average FPKM (Figure 4.15). Since the FPKM value reported for each treatment is the average of three replicates, by looking at the size of the standard deviation between those replicates and comparing it to the average FPKM we were able to look at the noise to signal ratio. This result showed that at lower average FPKM values there was a larger ratio of noise to signal. This confirmed our analysis that low abundance isoforms were leading to unreliable but large differences between treatments. To further explore variability between replicates in each treatment we used multidimensional scaling (MDS) analysis to look at the relationship between treatments.
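
The noise to signal ratio described above is the coefficient of variation across replicates. A minimal sketch of the calculation follows; it assumes the sample standard deviation was used (the choice of sample versus population standard deviation is not stated here), and the isoform names and FPKM values are hypothetical.

```python
from statistics import mean, stdev


def noise_to_signal(replicate_fpkm):
    """Standard deviation across replicates divided by their mean (coefficient of variation)."""
    average = mean(replicate_fpkm)
    if average == 0:
        return float("nan")  # ratio is undefined when the isoform is not detected
    return stdev(replicate_fpkm) / average


# Hypothetical FPKM values from three replicates (not data from this study).
replicates = {
    "abundant_isoform": [95.0, 100.0, 105.0],
    "rare_isoform": [0.1, 0.5, 0.9],
}

for name, values in replicates.items():
    print(f"{name}: mean FPKM = {mean(values):.2f}, noise/signal = {noise_to_signal(values):.2f}")
```

In this toy example the rare isoform has a far larger ratio than the abundant one, mirroring the trend described for low abundance isoforms.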

When an MDS plot was generated (Figure 4.16) two things were fairly obvious. The first was that htl-3 replicates were much further apart from each other on the plot than the DMSO or SOP treated replicates were from the other replicates of those treatments. The second was that one SOP treatment clustered relatively far away from the other two SOP treatments, indicating that it might be something of an outlier. Together, these two observations might explain some of the poor overlap between SOP treated Col and htl-3.

Conclusions

The results presented here showed that overexpression based screening for genes involved in SL signaling was able to uncover genes that to this point had not been implicated in SL response. Additionally, overexpression of at least one of those genes, U2AF35B, was able to partially suppress htl-3. This was shown using both germination assays and hypocotyl elongation assays. Although the mechanistic link between U2AF35B and SL signaling is not yet known, further experiments such as chromatin immunoprecipitation of U2AF35B paired with DNA sequencing could be used to establish that link. We were also able to investigate the transcriptional profile of htl-3 and SOP treated Col and establish that changes in gene isoform abundance seemed more informative than changes in gene expression alone. This dovetails with U2AF35B’s role in the spliceosome and provides further evidence that it might be involved in the response to SLs.


Figure 4.13 – Gene isoform abundance changes in SOP treated seedlings compared to htl-3.

Log2 fold change of gene isoform abundance in htl-3 over Col plants treated with DMSO is plotted versus log2 fold change of gene isoform abundance in SOP treated Col over DMSO treated Col. The Pearson correlation coefficient was calculated to be 0.39. The black line indicates the line of best fit for the log transformed data. Blue lines indicate density contours.


Figure 4.14 – Volcano plots of isoforms for htl-3 seedlings and SOP treated Col.

A) The log2 fold change in DMSO treated htl-3 isoform abundance over DMSO treated Col is shown relative to the -log10(p-value) for each gene. B) The log2 fold change in SOP treated Col isoform abundance over DMSO treated Col is shown relative to the -log10(p-value) for each gene. Red dots represent transcripts with a p-value < 0.01.


Figure 4.15 – Low abundance isoforms show greater relative variability than higher abundance isoforms.

The ratio of the standard deviation between replicates within each treatment divided by the average abundance of the transcript is shown relative to the abundance of the transcript. Blue lines represent density contours.


Methods

Genetic screening for SOP insensitive FOX lines

Arabidopsis FOX lines112 were obtained from RIKEN, Japan. 1 mL of 70% ethanol was added to each 1.5 mL tube containing a seed pool. The tube was agitated for 10 minutes. The supernatant was discarded and 1 mL of 100% ethanol was added. The tube was then agitated for an additional 10 minutes before the supernatant was discarded. As much of the remaining ethanol as possible was removed using a p10 micropipette, and the residual ethanol was removed under a vacuum at room temperature.

1 mL of 20 µM SOP dissolved in water (0.1% DMSO) was added to each well of a 24-well plate. For each individual seed pool of FOX line seeds, half of the seeds in the tube were added to a well of the 24-well plate. This was equivalent to approximately 100 seeds per well, meaning that on average 4 seeds for each individual transformation event would be expected to be screened. The probability per well of not having at least one seed of any given transgenic within the pool was less than 5% based on the binomial theorem. Col seeds of an equivalent apparent age were also added as an internal control. The plates were sealed with transparent tape and were placed under white light at a temperature of 32 ºC.
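The binomial estimate above can be reproduced with a short calculation. The figure of 25 lines per pool is not stated in the text; it is inferred here from 100 seeds per well and an average of 4 seeds per transformation event. The sketch also assumes that every line is equally represented in the pool and that seeds are drawn independently.

```python
def prob_line_missed(seeds_per_well, lines_per_pool):
    """Binomial probability that a given line contributes zero seeds to a well.

    Each seed is assumed to come from any one line with probability
    1 / lines_per_pool, independently of the other seeds.
    """
    return (1 - 1 / lines_per_pool) ** seeds_per_well

seeds_per_well = 100   # approximate number of seeds per well (from the text)
lines_per_pool = 25    # inferred: 100 seeds / 4 seeds per event (assumption)

p_missed = prob_line_missed(seeds_per_well, lines_per_pool)
print(f"Probability that a given line is absent from a well: {p_missed:.3f}")
```

Under these assumptions the probability of missing any particular line is roughly 1.7%, which is consistent with the stated bound of less than 5%.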

The plates were checked every day for germinated seeds. Germinated seeds were picked up to a maximum number of 46 per plate, or until there was widespread Col germination. Germinated seeds were transferred to a ½ MS agar plate and were grown under continuous white light for 1 week before they were transferred to soil. Col seeds were also transferred simultaneously to act as controls for later retesting. Plants were grown under continuous light until senescence. At that point they were harvested individually.

The total number of hits in the primary screen of the FOX lines was 582. Those 582 lines were designated SOPR1 through SOPR582 for SOPORIDINE RESISTANT. SOPR lines were tested in duplicate for the ability to germinate faster than Col of the same age both in the presence and the absence of 20 µM SOP (0.1% DMSO). Any SOPR line that was able to germinate significantly faster than Col on average in either of the treatments was designated as a confirmed hit. A total of 69 lines were confirmed.

Germination assays

Germination assays shown in Figures 4.2-5 were performed at room temperature. 0.2 mL of water containing the appropriate concentration of compound and a final concentration of 0.1% DMSO was added to the wells of 48-well plates. Approximately 25 seeds of the appropriate genetic background were placed inside each well with compound. For each genotype and treatment, three replicates were performed, and the entire assay was repeated three times. Germination was measured daily over the course of one week.

DNA extraction

One mature Arabidopsis leaf was added to a 1.5 mL microfuge tube. 100 µL of Tris EDTA (TE: 10 mM Tris, 1 mM EDTA, pH 8) was added to the tube along with one 3 mm diameter glass bead. The tubes were then shaken for 2 minutes in a bead beating machine in order to grind the leaf. 200 µL of CTAB buffer (100 mM TRIS pH 8, 20 mM EDTA, 1.4 M NaCl, 3% CTAB, 0.2% β-mercaptoethanol) was then added. The tubes were then mixed and incubated at 65 ºC for 30 minutes. 300 µL of chloroform was then added to the tubes and the tubes were mixed. The tubes were subjected to centrifugation at 18 000 rcf for 15 minutes at 4 ºC. The supernatant was decanted into a new 1.5 mL tube. 200 µL of isopropanol was mixed into the contents of the new tube. The tubes were subjected to centrifugation at 18 000 rcf for 15 minutes at 4 ºC. The supernatant was discarded by inversion. The DNA pellet was washed with 200 µL of 70% ethanol. The tubes were subjected to centrifugation at 18 000 rcf for 15 minutes at 4 ºC. The supernatant was discarded by inversion, and the pellet was dried in a vacuum. The pellet was resuspended in 50 µL of TE.

Figure 4.16 – MDS analysis shows variability between replicates.

MDS representation shows the relationships between replicates of DMSO treated Col (D), SOP treated Col (R), and DMSO treated htl-3 (H). The three replicates are numbered 0 through 2.

Primer sequences

F5 primer 5’– AAGTTCATTTATTCGGAGAGG – 3’

R3 primer 5’– CAAATGTTTGAACGATCGGGGAAAT – 3’

SOPR244F 5’– GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGCAGAGCATTTAGCTTCAATC – 3’

SOPR244R 5’– GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAAACTCCCTCATCACGTTCTCG – 3’

PCR amplification of SOP genes from FOX lines

Amplifications were performed at a 50 µL scale using Thermo Scientific Phusion High-Fidelity DNA Polymerase. 10 µL of Phusion HF buffer, 5 µL of 2 mM dNTPs, 2.5 µL each of 10 µM forward and reverse primers, 1.5 µL DMSO, 2.5 µL template DNA (1/10 dilution from DNA purification), 26 µL ddH2O, and 0.25 µL of Phusion DNA polymerase were mixed for each reaction. When the F5/R3 primer pairing was used, the thermocycler was set to the following program: 98 ºC for 2 minutes, 35 x (98 ºC for 0.75 minutes, 61 ºC for 0.75 minutes, 72 ºC for 5 minutes), 72 ºC for 5 minutes.
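As a check on the reaction recipe above, the component volumes can be summed and the final concentrations worked out by simple dilution arithmetic. The sketch below assumes that 2.5 µL of each primer was added (the text gives one volume for both primers), and all variable names are illustrative only.

```python
# Component volumes in microlitres, as listed in the text.
components_ul = {
    "Phusion HF buffer": 10.0,
    "2 mM dNTPs": 5.0,
    "10 uM forward primer": 2.5,   # assumption: 2.5 uL of each primer
    "10 uM reverse primer": 2.5,
    "DMSO": 1.5,
    "template DNA (1/10 dilution)": 2.5,
    "ddH2O": 26.0,
    "Phusion DNA polymerase": 0.25,
}

total_ul = sum(components_ul.values())

def final_concentration(stock, volume_ul, total_volume_ul):
    """Concentration after dilution, in the same units as the stock."""
    return stock * volume_ul / total_volume_ul

dntp_mM = final_concentration(2.0, components_ul["2 mM dNTPs"], total_ul)
primer_uM = final_concentration(10.0, components_ul["10 uM forward primer"], total_ul)
dmso_percent = 100.0 * components_ul["DMSO"] / total_ul

print(f"Total volume: {total_ul:.2f} uL")
print(f"dNTPs: {dntp_mM:.2f} mM, each primer: {primer_uM:.2f} uM, DMSO: {dmso_percent:.1f}%")
```

Under that assumption the components sum to 50.25 µL, which matches the stated 50 µL scale, with approximately 0.2 mM dNTPs, 0.5 µM of each primer, and 3% DMSO in the final reaction.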

Molecular cloning of SOP genes

PCR products were cleaned up using the PureLink Quick PCR Purification Kit (Invitrogen). Purified products were sent for sequencing with F5 and R3 primers. To clone U2AF35B, the primers described above (SOPR244F/SOPR244R) were designed to include sequences for BP reactions. PCRs were performed as described above but with an annealing temperature of 68 ºC. PCR products were purified with the same kit mentioned above. BP reactions were conducted by mixing 1 µL of a 1/10 dilution of vector DNA (pDONR207), 1 µL of a 1/10 dilution of PCR products, 1.5 µL of Gateway BP Clonase II mix (Invitrogen), and 1.5 µL of TE (pH 8). Reactions were allowed to sit at room temperature overnight. 1 µL of Proteinase K was then added and the mixture was held at 37 ºC for 10 minutes. Transformation into E. coli (DH5α) was performed by adding the whole volume to 50 µL of chemically competent E. coli cells. The mixture was incubated on ice for 30 minutes before being exposed to 42 ºC for 45 seconds.

The cells were then transferred to ice and held there for 5 minutes. 1 mL of LB was added to the cells and the mixture was shaken at 37 ºC for 1 hour. The cells were pelleted and all but 100 µL of the LB was removed. The pellet was resuspended and the cells were added to an LB agar plate with 25 µg/mL gentamicin. Plates were incubated at 37 ºC overnight. Colonies were picked and added to 3 mL of LB with 25 µg/mL gentamicin. The plasmid was purified from the E. coli using the PureLink Quick Plasmid Miniprep Kit (Invitrogen). U2AF35B was then transferred into the pGWB611 and pGWB612 vectors by LR reaction. For the reaction, 1 µL of a 1/10 dilution of the miniprep of both pDONR207 with U2AF35B and the empty pGWB vector were mixed with 0.5 µL of LR Clonase II enzyme mix (Invitrogen). Reactions were allowed to sit at room temperature overnight. 1 µL of Proteinase K was then added and the mixture was held at 37 ºC for 10 minutes. Transformation, selection of transformants, and vector purification were performed as described above but using spectinomycin at a concentration of 50 µg/mL. In order to introduce 35S::U2AF35B into Agrobacterium, 50 µL of electrocompetent cells of the strain GV3101 were thawed on ice. 1 µL of the pGWB611 or pGWB612 binary vector with U2AF35B was added to the cells. An electroporation cuvette was cooled on ice before the mix of cells and DNA was added to it. The Gene Pulser (BioRad) device was set as follows: capacitance 25 µF, resistance 200 Ω, and voltage 1.25 kV. The cells were then electroporated. The cells were then washed from the chamber of the cuvette with several changes of ice cold LB, which were collected. The LB and cell mixture was shaken at 30 ºC for 2 hours before 100 µL of the mixture was spread on an LB plate with 100 µg/mL rifampicin and 50 µg/mL gentamicin. Colonies were picked from the plate after three days of growth to generate glycerol stocks.

Agrobacterium mediated transformation by floral dip

A 3 mL culture of Agrobacterium in LB with selectable markers was started. 2 days later the 3 mL of Agrobacterium were added to 100 mL of LB with selectable markers. 2 days later the floral dip was performed. The cells were harvested by centrifugation at 4 000 rpm for 10 minutes. The supernatant was discarded and the cells were resuspended in 100 mL of 5% sucrose. A stainless steel cart was shielded with several pieces of cling wrap and paper towels were placed over the cling wrap. 20 µL of Silwet L-77 was added to a 150 mm diameter and 25 mm depth tissue culture plate placed on top of the paper towels. The bacterial resuspension was carefully poured into the plate to mix the Silwet with the bacteria. Flowering Arabidopsis plants in one pot were then dipped into the bacterial suspension. The time of dipping was chosen to maximize the number of flowers present on the plants. The flowers and buds of the plant were lightly pressed to allow the bacterial suspension to enter the flowers. After the flowers were thoroughly soaked, excess suspension was blotted away with the paper towels on the cart, and the pot and plants were wrapped with the cling wrap on the cart. Cling wrap was gradually removed over the course of three days.

Selection of transgenic plants

T1 seed was harvested from the dipped T0 plants. 200 mg of seed was surface sterilized and sown out on a 150 mm diameter MS-agar plate containing 10 µg/mL BASTA. The plate with the seeds was stratified at 4 ºC for 5 days. The plates were placed under 24 hour, 45 µE light for 7 days before resistant seedlings were selected. Those plants were transferred to MS-agar plates and allowed to grow before being transplanted to soil and grown for seed. In parallel with selection of transgenics, Col and htl-3 plants were also sown out and transplanted to soil to ensure plants of the same age were available to assess the phenotypes shown by T2 seed.

Hypocotyl elongation assays


Approximately thirty surface sterilized seeds per genotype were spread onto a ½ MS agar plate. The plate was stratified at 4 ºC for 5 days. The plate was then transferred to room temperature and was placed under 34 µE light for 5 days. Hypocotyls were then laid down on an agar plate and photographed from above. The length of each hypocotyl was then measured using the measure feature of the program ImageJ136,137. Average hypocotyl lengths were then calculated. The assays were repeated three times independently to generate the average hypocotyl lengths reported in Figure 4.6.

Protein extraction and Western blotting

50 mg of 5 day old seedlings were added to a 1.5 mL microfuge tube together with 50 µL of water and two 3 mm diameter glass beads. The sample was disrupted by bead beating for one minute. Samples were centrifuged at room temperature for 5 minutes at 18 000 rcf. 25 µL of 5X Laemmli buffer (0.31 M TRIS pH 6.8, 0.346 M SDS, 25% glycerol, 0.5% BPB, 0.5 M DTT) was added to each tube. The samples were boiled for 5 minutes, and then centrifuged one last time as described above before loading. 1 µL of Precision Plus Protein Dual Xtra Prestained Protein Standards was loaded in the first lane of a 12% polyacrylamide gel, and 20 µL of each of the protein extracts was loaded in the next wells. The protein was transferred to a PVDF membrane and was probed with the Flag-Tag antibody (mouse) at 1:10 000 (v/v) concentration in 3% (w/v) skim milk overnight at room temperature, followed by four washes in PBS-T (8 mM Na2HPO4, 150 mM NaCl, 2 mM KH2PO4, 3 mM KCl, 0.05% Tween 20, pH 7.4), one with a duration of 15 minutes and the others with a duration of 5 minutes. Alkaline Phosphatase (mouse) secondary antibody was then added at 1:2500 (v/v) concentration in 3% (w/v) skim milk. The blot was incubated with the secondary antibody for one hour. Washes were performed as described above. CDP-Star ready-to-use reagent (Sigma-Aldrich) was then used to visualize the secondary antibody.

RNA extraction and purification

Three autoclaved rounds of filter paper were cut into three equally sized pieces and were applied to MS-agar plates. Stratified seeds harvested from three Col and three htl-3 individuals were applied to the filter paper so that the seeds from each individual were on different pieces of filter paper, in the same plate as the rest of the individuals of that genotype. This was done twice for the Col individuals to allow for comparison of the effect of DMSO to SOP. The plates were placed under 24 h light at a fluence of 45 µE for two days. The filter paper and the seedlings on it were then transferred to new MS-agar plates. The htl-3 seedlings were transferred to an MS-agar plate with 0.2% DMSO, and one of the sets of Col seedlings was also transferred to an MS-agar plate with 0.2% DMSO. The final Col seedlings were transferred onto an MS-agar plate with 20 µM SOP and 0.2% DMSO. The plates were exposed to 24 h of 45 µE light before the seedlings were harvested and their RNA was purified. Seedlings were ground in liquid nitrogen in a precooled mortar and pestle. Approximately 100 mg of tissue was ground into a fine powder per sample. Ground samples were scraped into a chilled 1.5 mL tube. 1 mL of Tri Reagent (Molecular Research Center) was promptly added to the powdered sample. The sample was then mixed by flicking. The mixture was placed on ice until all samples were complete. Samples were incubated at room temperature for 5 minutes. 0.2 mL of chloroform was added to each tube. Tubes were shaken vigorously by hand for 15 seconds. The mixture was incubated at room temperature for 3 minutes.

Tubes were centrifuged at 12 000 g for 15 minutes at 4 ºC. The upper, colourless layer was drawn off and added to a fresh tube. 0.5 mL of isopropanol was added to precipitate the RNA. The samples were incubated at room temperature for 10 minutes. The tubes were centrifuged at 12 000 g for 15 minutes at 4 ºC. The supernatant was decanted and the pellet was washed in 1 mL of cold 75% ethanol. The samples were centrifuged at 7 500 g for 5 minutes at 4 ºC. The supernatant was discarded and the RNA pellet was dried under a vacuum for 5 minutes. The sample was then dissolved in 50 µL of RNase free water. RNA concentration and purity were assessed by spectroscopy at a pH of 7.5. RNA used for sequencing had an A260/A280 ratio of 1.8 or higher and a concentration of at least 50 µg/µL. RNA used for sequencing was also checked for degradation by gel electrophoresis.

RNA sequencing data quality control

RNA sequencing data was analyzed using FastQC138 and was trimmed using Trimmomatic139. In raw files the per base sequence quality score for reads was on average greater than the cut-off of 28 for all read positions up to position 145. Abnormal per base sequence content was observed for the first 15 base pairs. All data failed the sequence duplication levels and Kmer content filters, but these measurements are not relevant for the analysis of RNAseq data where some sequences are expected to be overrepresented. Trimmomatic was then used to trim the first 15 bases from each sequence, and bases with a quality of less than 28 were trimmed from the beginning and the end of each sequence. Reads with a length of less than 40 bases were discarded. After this treatment no position showed an average quality score of less than 28, and the per base sequence content was no longer anomalous. The trimmed reads were used for subsequent analysis. On average 70 million reads with an average length of 130 base pairs remained for each biological replicate after trimming. Since approximately 85% of reads were mappable to the genome, the final number of mapped reads per sample was approximately 60 million with an average length of 130 bases.
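The trimming rules described above can be illustrated with a small function that applies them to a single read. This is a sketch of the logic only and is not Trimmomatic itself; the function name, its parameters, and the example read are invented for illustration, and quality scores are assumed to be already decoded to integers.

```python
def trim_read(sequence, qualities, head_crop=15, min_quality=28, min_length=40):
    """Apply the three trimming steps described in the text to one read.

    1. Remove the first `head_crop` bases.
    2. Remove bases below `min_quality` from the start and the end.
    3. Discard the read (return None) if fewer than `min_length` bases remain.
    """
    sequence = sequence[head_crop:]
    qualities = qualities[head_crop:]

    start = 0
    while start < len(qualities) and qualities[start] < min_quality:
        start += 1
    end = len(qualities)
    while end > start and qualities[end - 1] < min_quality:
        end -= 1

    sequence = sequence[start:end]
    qualities = qualities[start:end]
    if len(sequence) < min_length:
        return None
    return sequence, qualities

# Hypothetical read: 15 biased leading bases, 2 low quality bases,
# 50 good bases, and 3 low quality trailing bases.
read_seq = "A" * 15 + "C" * 2 + "G" * 50 + "T" * 3
read_qual = [30] * 15 + [20] * 2 + [35] * 50 + [5] * 3

kept = trim_read(read_seq, read_qual)
print(len(kept[0]) if kept else "read discarded")

# Mapped read estimate from the text: 70 million reads, about 85% mappable.
mapped_reads = 70_000_000 * 0.85
print(f"Approximate mapped reads per sample: {mapped_reads:,.0f}")
```

The last two lines repeat the arithmetic behind the figure of approximately 60 million mapped reads per sample.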

RNA sequencing analysis

RNA sequencing analysis was performed using the Tophat-Cufflinks-CummeRbund pipeline140 to map and analyze reads. Complete code for the RNAseq analysis can be found in Appendix 1.

GO enrichment analysis

The GO enrichment web tool (http://geneontology.org/page/go-enrichment-analysis)141,142 was used to analyze the fold enrichment or depletion of GO terms in the differentially regulated gene sets. The biological processes mode was used in the analysis. Bonferroni correction was used to control for multiple hypothesis testing.
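For readers unfamiliar with the Bonferroni correction, the sketch below shows how it adjusts a set of p-values. The term names and p-values are invented, the 0.05 threshold is chosen only for illustration, and the code is not the web tool's implementation.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    number_of_tests = len(p_values)
    return [min(1.0, p * number_of_tests) for p in p_values]

# Hypothetical raw enrichment p-values for three GO terms.
raw_p = {"term_a": 0.0004, "term_b": 0.02, "term_c": 0.3}

adjusted_p = dict(zip(raw_p, bonferroni(list(raw_p.values()))))
significant = [term for term, p in adjusted_p.items() if p < 0.05]

for term, p in adjusted_p.items():
    print(f"{term}: raw {raw_p[term]}, adjusted {p:.4f}")
print("Significant after correction:", significant)
```

In this example term_b would pass an uncorrected threshold of 0.05 but does not survive the correction, which shows how conservative the method is when many terms are tested.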


Chapter 5: General discussion


Discussion

Striga hermonthica is a devastating parasitic plant that causes enormous harm to agriculture in the developing world each year125, and its lifecycle centres around the perception of the plant hormone strigolactone (SL)69. In this project, we used a variety of chemical genomic approaches to attempt to develop technologies that could be useful in two realms. Firstly, we wanted to identify compounds that could be developed to lessen Striga’s impact on agriculture. Secondly, we wanted to develop chemical probes that could be used to understand the mechanism of SL perception in model organisms as well as in Striga. In order to achieve these goals, we developed a set of high-throughput chemical screening approaches that were used to uncover both agonists and antagonists for the SL receptor HTL in Arabidopsis as well as in Striga. In addition to developing tools, we wanted to use those tools ourselves to understand SL signaling. In collaboration with Dr. Shigeo Toh, we found that the agonists that we uncovered were able to selectively activate a subset of Striga HTL proteins93. This allowed us to narrow down which of the 11 ShHTL proteins are the most important for the perception of germination cues. We were also able to use the HTL antagonist SOP in conjunction with gain-of-function genetic screening to uncover a role for the splicing factor U2AF35B in HTL-dependent SL responses.

In the search for HTL agonists we screened for chemicals that were able to shorten the hypocotyls of 35S::GUS:COP1 seedlings and were also able to stimulate the interaction of HTL with either MAX2 or SPA1 based on a yeast two-hybrid assay. This screening approach had the advantage that it did not use Striga and thus can be emulated on a larger scale in places where the import of Striga seeds is not permitted.


Most of the compounds that were designated as “hits” in this trio of screens were able to directly target HTL based on at least one biochemical assay. Furthermore, all hits from this screen were able to stimulate Striga germination to some degree, meaning they could serve as leads to develop soil-stable ShHTL agonists. These results validate our approach as one that is efficient in the identification of direct agonists for HTL that have in vivo activity. Additionally, all of the compounds were able to stimulate the germination of htl-3 transgenics expressing ShHTL proteins, but none could stimulate germination in an htl-3 background. This showed that, at least in this germination assay, the action of the compounds was dependent on the presence of an active HTL protein in the plant. However, many of the compounds were able to inhibit hypocotyl elongation in an htl-3/d14-1 background, implying that some portion of the activity they showed in the hypocotyl screen was not dependent on SL signaling. It is interesting that although the activity of many of these compounds in the hypocotyl was not SL dependent, most of them were still able to directly bind to AtHTL and were able to activate ShHTLs. This suggests that the inclusion of a target-based screening approach overcame shortcomings in the phenotypic screen and reduced the number of false positives.

These agonists were very useful in assessing the activity of specific ShHTLs. The addition of the synthetic SL GR24 was able to elicit a germination response in plants expressing ShHTLs1-9; however, the analysis of which ShHTLs were activated by the agonists showed that the germination response was limited to a subset of SL receptors (ShHTLs4-7). Since these agonists were able to stimulate Striga germination, this showed that the activation of that subset is sufficient to stimulate germination. This insight into the role of this subset of receptors would not have been possible without the agonists that we uncovered. Narrowing down which ShHTLs are important in germination serves as an example of how the use of specific agonists can dissect the roles of various receptors in a biological process.

With respect to chemical screening for HTL antagonists, a different approach was taken. While in the screen for HTL agonists we used a target-based screen in the form of yeast two-hybrid assays, in the screen for antagonists we did not use any target-based assays for screening. Instead we used a two step screen in which, in order to be a hit, a compound had to cause two independent phenotypic changes that are associated with htl-3 plants. We found 7 compounds that were able to both cause an increased hypocotyl length in the presence of GR24 and inhibit germination of Col seeds. Of those seven compounds, five were able to enhance the stability of HTL in a DARTS assay, and four were able to bind to HTL based on intrinsic fluorescence binding assays. This suggests that a carefully designed chemical screen based purely on phenotypic assays can enrich for compounds that target a particular protein. Also, 4 of the 7 compounds were able to cause significant reductions in Striga germination rates in the presence of GR24, showing that this Arabidopsis based screen was effective in locating compounds with activity in Striga. As was the case with the chemical screen described above, this screening approach does not require access to Striga seeds, but can be used to find Striga germination inhibiting compounds. Thus this approach can be widely used by the research community in any country to look for more antagonists, or more potent antagonists, for Striga HTLs.


Through genetic analysis we were able to show that the most potent of the inhibitors of Arabidopsis germination, SOP, was able to act specifically through HTL at the level of germination. Therefore, we decided to use SOP as a screening tool to uncover new SL signaling genes. In order to do this, we screened the Arabidopsis FOX line collection112 for lines that were able to germinate in the presence of SOP. We found that a variety of overexpression lines could potentially confer insensitivity to SOP, but we decided to focus on the splicing factor U2AF35B (SOPR244) because of its role in transcription.

We were able to show with two independent lines that the overexpression of U2AF35B made seeds germinate more readily, and that overexpression of U2AF35B in an htl-3 background was partially able to suppress the germination defect of that line as well as its long hypocotyl phenotype. These results placed U2AF35B at or downstream of HTL genetically. Since U2AF35B is a splicing factor, we decided to employ an RNA sequencing approach to explore transcription and splicing under genetic or chemical perturbation of HTL. The gene and isoform abundances seen in SOP treated Col correlated better with the abundances seen in htl-3 seedlings than those seen in DMSO treated Col did. This means that the transcriptional profile of SOP treated Col is closer to htl-3 than that of DMSO treated Col is. This suggests that SOP treatment is at least partially able to mimic the transcriptional changes that occur when the receptor is mutated and is an additional line of evidence suggesting that SOP acts directly on HTL. When the fold changes in gene transcription seen in htl-3 and SOP treated Col versus DMSO treated Col were compared to each other, we found that there was a weak positive correlation. This implied that although the transcriptional changes under SOP treatment and in htl-3 are more similar than not, the relationship was weak. When the relative abundance of gene isoforms was assessed, the correlation became stronger although the value was still quite low. The question of why the SOP treated transcriptome isn’t more similar to the htl-3 transcriptome is an interesting one. It is possible that the timing or intensity of the treatment could be to blame for the poor correlation. However, it is worth noting that the consequences of blocking the activity of a gene product are not the same as the consequences of eliminating all its protein-protein interactions143.

Unpublished data from the McCourt laboratory suggest that HTL engages in constitutive protein-protein interactions with light signaling components. Eliminating the binding partner of those components by mutating the protein could have many effects on the cell that are secondary to the inhibition of SL perception. This might in part explain the differences in the transcriptional changes that occur under mutation and pharmacological perturbation.

Future directions

An important finding generated by this project was that chemical screening in Arabidopsis has proven to be a reliable way of finding both SL mimics and antagonists for SL receptors. However, this work was conducted with a specific focus on the role of SLs in the seed. As was described at length above, SLs play a role in the branching pattern of plants both above79,80 and below ground144. In particular, axillary branching is controlled through the SL receptor D1483. In this project we did not systematically attempt to identify compounds that might also be acting as either agonists or antagonists for D14. In the chemical screens described in both chapters 2 and 3, hypocotyl elongation was used as a marker for SL action. Although mutants in D14 do not have long hypocotyls, they are insensitive to the addition of exogenous SLs at the level of hypocotyl elongation85. This suggests that many of the hits in those chemical screens could be agonists or antagonists for D14. Testing whether the compounds are targeting D14 should be fairly straightforward. YLG hydrolysis92, differential scanning fluorimetry83, or even isothermal titration calorimetry98 can be used to analyze the binding of small molecules to D14. Compounds that act as agonists or antagonists for D14 could be useful chemical probes to understand the mechanisms of SL perception at the level of shoot branching. In particular, antagonists for D14 could be used in conjunction with genetics to identify genes downstream of D14 in the shoot branching pathway.

Another aspect of the work described in both chapters 2 and 3 that could be further expanded is the search for more potent agonists and antagonists. This could be accomplished in several ways. The first way to find more potent agonists or antagonists would be to expand the screen to a larger chemical library. By looking at more compounds we could probably locate a compound with a greater potency than those uncovered so far. The second approach takes advantage of the fact that we now know that the chemical screens that were used can be effective in finding lead compounds.

With this knowledge in hand it is possible to be more stringent regarding which compounds are considered hits. One way of being more stringent would be to use a lower concentration of compounds in the initial screen. For example, in the screen for antagonists we looked for compounds that were able to cause a longer hypocotyl in the presence of GR24, and we screened with a concentration of 30 µM for the chemicals from the library. In Figure 3.4A you can see that when these compounds were added to hypocotyls at higher concentrations, such as 50 µM, they often acted to shorten hypocotyls. It is entirely possible that there were compounds that were more potent than SOP in the chemical library, but at 30 µM they had an inhibitory effect on hypocotyl length. Thus, if the goal is to identify compounds that act at nanomolar concentrations it is necessary to screen at a lower concentration than was chosen for this study.

One final future direction related to screening concerns the Arabidopsis plants that express ShHTLs in an htl-3 background. At the outset of both the agonist and antagonist screens that are described here no ShHTLs had been cloned. One promising approach to phenotypic screening for small molecules that are able to perturb the activity of ShHTLs is to use those overexpression lines as screening tools.

For example, it would be possible to screen for small molecules that are able to inhibit hypocotyl elongation or stimulate germination in plants overexpressing ShHTL7 in order to find agonists for ShHTL7. Conversely, it would be possible to screen for compounds that are antagonists for ShHTL7 by looking for compounds that can inhibit the germination of ShHTL7 seeds in the presence of GR24 or compounds that are able to increase hypocotyl length of ShHTL7 expressing seedlings in the presence of GR24.

This might find compounds that are better suited to ShHTLs than to AtHTLs and thus are more potent. Most of the agonists that we uncovered were able to activate ShHTLs4-7, but none were able to robustly activate the individual receptors. This means that although we have found that this group of ShHTLs is important, we still are not able to assay them selectively in Striga. Screening for compounds that are able to selectively activate each of the ShHTLs would give us the opportunity to investigate not only the function of that core group of 4 receptors but also the functions of the other ShHTLs that hitherto do not have a clear function.

Although screening approaches might yield more potent probes, another important element is the expense of those compounds. After more potent Striga germination stimulants or inhibitors are identified, extensive structure-activity relationship studies would be required to identify analogs of those active compounds that are cheaper to produce, more potent, or both. If good quality crystal structures of Striga HTLs in complex with agonists or antagonists could be obtained, that information could also be used to aid the design of cheaper or more potent analogs of those probes.

Another consideration for potential Striga control measures is that it is not enough to show that compounds are able to perturb Striga germination under laboratory conditions; it is also necessary to show through field trials that the compounds are still effective under real world conditions.

There are regulatory challenges beyond the challenges of ensuring that any chemical Striga control measures are cheap, potent, and effective in the field. Although different countries have different standards for the approval of agrichemicals, the Environmental Protection Agency (EPA) in the United States of America analyzes risks to both human health and the environment. This means that the risks to human health through food, water, and occupational exposure for those working with the chemical would need to be assessed before it could be approved. The potential for groundwater contamination, the risks to endangered species, and the potential for endocrine disruption would also need to be analyzed. That level of analysis is beyond the scope of any single research group, as can be seen by the fact that the current work plan for the approval of conventional pesticides by the EPA shows that the only entities bringing new pesticides to approval are large agribusinesses145. It is also telling that diethofencarb, a fungicide, was only recently approved by the FDA and approved by the European Commission Health and Consumers Directorate-General in 2011146 although studies of its effects on rats were conducted as far back as 1992147. This should give some sense of the time frame that exists for the translation of compounds from discovery to approval for use in the field. Therefore, if more potent analogs of SOP are shown to be effective in the field, the next steps would include animal testing for safety as well as assessment of the environmental risks posed by the compound.

The other major area that warrants further investigation is the role of U2AF35B in SL signal transduction. Although the work presented here suggests that U2AF35B might have a role in SL perception, it could also be influencing germination and hypocotyl length only indirectly. For example, both GA and ABA seem to act at or downstream of SL signaling117. Thus any mutant that perturbs those two master regulators of dormancy might have some of the phenotypes observed. Clearly a greater degree of mechanistic insight is required with respect to the role of the splicing factor, but there are several attractive avenues of inquiry that could be pursued to generate that insight.

The first potential avenue is to take advantage of mass spectrometry to identify proteins that interact with U2AF35B. If U2AF35B is somehow affected by SL response, we might expect it to co-isolate with SL signaling proteins together with other parts of the spliceosome. Since we have a FLAG-tagged U2AF35B expressed in both a Col and an htl-3 background, it should be possible to look for changes in the proteins that are isolated with U2AF35B with and without GR24, and also to do so in a mutant background. These experiments could give insight into what proteins U2AF35B or its interacting partners are interacting with in different SL signaling environments. Additionally, this approach would be sensitive to post-translational modifications of U2AF35B or its interactors that might be important for understanding SL signaling.

Another approach would be to use chromatin immunoprecipitation paired with sequencing to identify what genes are directly targeted by U2AF35B. This approach could be used to great effect in combination with the RNAseq data described above to give insight into what DNA-protein interactions are truly associated with SL response and also hint at other downstream genes that are involved in SL signaling.
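As a rough illustration of how such a combined analysis might be organized, the sketch below intersects a set of ChIP-seq-bound genes with genes that pass the expression criteria used for Table VI (p-value cutoff of 0.01, fold change computed as htl-3 FPKM over DMSO FPKM). This is only a minimal sketch under stated assumptions: the ChIP-seq experiment is proposed here and has not been performed, so the bound-gene set, the p-values, and the placeholder gene GENE_X are all hypothetical, as are the function names. Only the FPKM values for AT1G01300 and AT1G01800 are taken from Table VI.

```python
def significantly_changed(rnaseq, p_cutoff=0.01):
    """Map gene -> htl-3/DMSO fold change for genes passing the p-value cutoff."""
    return {
        gene: row["htl3_fpkm"] / row["dmso_fpkm"]
        for gene, row in rnaseq.items()
        if row["p_value"] < p_cutoff
    }


def candidate_direct_targets(rnaseq, chip_bound_genes, p_cutoff=0.01):
    """Keep only the significantly changed genes that are also ChIP-seq bound."""
    changed = significantly_changed(rnaseq, p_cutoff)
    return {gene: fc for gene, fc in changed.items() if gene in chip_bound_genes}


# FPKM values for the two AT1G genes are from Table VI.
# All p-values, GENE_X, and the ChIP-bound set are hypothetical placeholders.
rnaseq = {
    "AT1G01300": {"dmso_fpkm": 25.4933, "htl3_fpkm": 46.7983, "p_value": 0.001},
    "AT1G01800": {"dmso_fpkm": 20.8908, "htl3_fpkm": 10.6165, "p_value": 0.001},
    "GENE_X": {"dmso_fpkm": 10.0, "htl3_fpkm": 10.5, "p_value": 0.40},
}
chip_bound_genes = {"AT1G01300", "GENE_X"}

candidates = candidate_direct_targets(rnaseq, chip_bound_genes)
for gene, fold_change in sorted(candidates.items()):
    print(f"{gene}\t{fold_change:.3f}")
```

In this toy input only AT1G01300 is both bound and significantly changed, so it is the only candidate returned. A real analysis would also need peak calling, replicate handling, and multiple-testing correction, none of which are modeled here.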

Conclusions

This project generated numerous chemical probes for the strigolactone receptor HTL. These probes included agonists that were able to selectively activate a smaller subset of Striga HTLs than known SLs, as well as an antagonist, soporidine, that was able to inhibit the Striga HTL that seems to be the most important for sensitivity to low concentrations of SLs. We were able to use that antagonist in conjunction with genetic screening to identify novel genes that might play a role in strigolactone signal transduction.


Tables

Table I – Hit chemicals from HTL x MAX2 yeast two-hybrid screen.

ChemBridge ID numbers, CID numbers, SMILES strings, and relative activity in the screen are shown for each compound that was in the top 1% of hits in the HTL x MAX2 yeast two-hybrid based chemical screen.

CB ID  CID  SMILES  Activity
5104063  744276  C1=CC=C2C(=C1)C=CC=C2SCCC3=CC=NC=C3  9.701112
5142942  1638979  C1CCC(CC1)(C2=CC=C(C=C2)O)C3=CC(=CC=C3)Cl  6.30106
5162496  38047  C1CCN(CC1)C2=C(C(=O)OC3=CC=CC=C32)N  8.258068
5173960  224144  CC1=CC=C(C=C1)S(=O)(=O)NC2=CC3=CC=CC=C3C=C2  10.78537
5175167  627384  C1=CC2=C(C(=C(C=C2Br)Br)N)N=C1  6.124259
5236846  1908457  C1CCC2=C(C1)C3=CC=CC=C3N2CCC#N  9.300362
5261161  792453  COC1=CC=C(C=C1)CN2CCN(CC2)C3=CC=CC=C3Cl  5.73128
5268570  2056197  CC1=CC=CC=C1N2CCN(CC2)CC3=CC(=C(C=C3)OC)C  6.585653
5358125  3114940  CCC1CCCCN1S(=O)(=O)C2=CC(=C(C=C2)C)C  7.180085
5850825  1906281  CCCCSC1=NN=C2N1C3=CC=CC=C3S2  6.82286
5937705  748649  CCOC(=O)C1=C(C2=C(O1)C=CC=C2O)C  9.71448
6127796  1147059  CCC1=CC=C(C=C1)C2=CSC(=N2)C3=CC=NC=C3  8.209232
6613675  805516  C1CCCCCC(CCCCC1)CC(=O)N2CCOCC2  8.965091
6680521  845254  CC1=CC(=C(C=C1)C)OCC2=NC(=NO2)C3=CC=NC=C3  6.943512
6739747  2289474  CCCOC1=C(C=C(C=C1)C=NO)Cl  5.731938
6847048  18823256  CC1=CC2=C(C=C1)OC(=O)N2CCC3=CC=CC=C3  5.932956
7860273  2985028  C1=CC=C(C(=C1)CN2C=C(C=N2)I)Cl  8.94396
7992462  2985104  CC(=O)C1=CC=C(C=C1)OC2=C(C(=C(C(=C2F)F)C=CCN3CCOCC3)F)F.Cl  6.465744
7992615  6459556  C1CCN(CC1)S(=O)(=O)NC2=CC(=CC(=C2)Cl)Cl  6.278501
9023622  3745815  C1CN(CCN1C2=CC=CC=C2Cl)C(=O)NC3=CC=CC=C3F  9.795961
9047892  6467430  CCCCS(=O)(=O)NC1=CC=C(C=C1)C(C)CC  7.816696
9048645  6469730  CC1=CC=CC=C1N2CCN(CC2)CC(=O)NC3CCCCCC3  5.960104
9053218  6469992  C1CCCC(CC1)NC(=O)CN2CCN(CC2)C3=CC=C(C=C3)Cl  13.5269
9053767  42095683  COCCCNC(=O)C1CCN(CC1)C2=NC(=NC3=C2CCC3)C4=CC=CC=C4  11.75773
10624391  42099214  CC1=CC(=C(C=C1)N2CCN(CC2)CC3=C(OC(=N3)C4=CC=C(C=C4)OC)C)C  10.71791
12439376  42216542  CN1CCC(CC1)CN(CCC2=CC=C(C=C2)OC)CC3=CC=C(C=C3)N4C=CC=N4  8.067979
15495092  45185555  COC1=CC=C(C=C1)N2C=CN=C2CN3CCCC(C3)C(=O)C4=CC=C(C=C4)SC  6.689579
21505967  45188399  COC1=CC(=CC=C1)OCC(CNC(=O)C2=CC3=C(C=C2)N=C(O3)C4CCCCC4)O  7.451932
23637537  16188165  CN(CCN1C=CN=C1)CC2=CN(N=C2C3=CC4=C(C=C3)OCO4)C5=CC=CC=C5  6.466713
25030516  45191240  CC1CN(CCN1CC2=C(OC(=N2)C3=C(C(=CC=C3)OC)OC)C)C4=CC=CC=C4  5.912073
25936320  45192246  C1CC(CN(C1)CC2=CC=C(C=C2)OC(F)F)NC3=CC=C(C=C3)F  6.091226
26771177  45194576  CC1=CC(=CC=C1)CNCC2=CC=C(C=C2)OCC(CN(C)CC3=CC=CC=C3)O  8.182693
28746369  42532805  C1=CC(=C(C=C1F)F)OCC2=CC(=NO2)C(=O)NCCC3=CC=NC=C3  9.95233
47381587  45225677  COC1=C(C=CC(=C1)CN2CCCC(C2)C(=O)C3=CC=CC4=CC=CC=C43)O  6.132977
61375060  45228223  CC1=CC(=CC(=C1Cl)C)OCC2=CC(=NO2)C(=O)NC(C)C3=CN(N=C3)C  12.2684
64496440  26331447  CC1=C(OC2=C1C=CC=C2F)C(=O)NCC3=CN(N=C3)C4=CC=CC=C4OC  15.9089
70077959  42193475  CC1=C(C2=C(C=C1)C(=C(O2)C(=O)N3CCC(CC3)C4=CC=NC=C4)C)C  8.747228
72511030  42213486  C1CCC(=CC1)CC(=O)N2CCC3=C(C2)C=C(C=C3)NC(=O)C4=CC=CC=N4  11.02413
76005093  42197252  CN(CC1=CC=CC=C1)CC2=C(N=C3N2C=C(C=C3)Cl)C(=O)N4CCC5=CC=CC=C5C4  29.2948
81979574  42392549  CC1=C(C=CC(=C1)C2=NC(=C(O2)C)CN3CCC(CC3)OC4=CC=C(C=C4)F)OC  5.976164
83564111  45244448  C1CC(CN(C1)C(=O)NCC2=CC=CC=C2)C(=O)C3=CC(=CC=C3)Cl  5.898211
86793212  45247735  CN1CCC(CC1)N2CCCC(C2)NC(=O)CC3=CC=C(C=C3)C4=CC=CC=C4  6.240835
91655981  744276  C1=CC=C2C(=C1)C=CC=C2SCCC3=CC=NC=C3  5.990135
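The "top 1% of hits" criterion in the caption of Table I can be illustrated with a short sketch. It assumes that compounds were ranked by their relative activity in the screen and that the highest-scoring 1% were retained; the exact ranking procedure is an assumption made here for illustration. The function name `top_fraction_hits` and the toy library of background compounds are hypothetical; the two activity values attached to real ChemBridge IDs are taken from Table I.

```python
import math


def top_fraction_hits(activities, fraction=0.01):
    """Return (compound_id, activity) pairs in the top `fraction`, highest first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Always keep at least one compound, and round the hit count up.
    n_hits = max(1, math.ceil(len(activities) * fraction))
    ranked = sorted(activities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n_hits]


# Toy library: 198 hypothetical inactive compounds (activity 1.0)
# plus two compounds whose activities are taken from Table I.
library = {f"compound_{i}": 1.0 for i in range(198)}
library["9053218"] = 13.5269
library["5104063"] = 9.701112

hits = top_fraction_hits(library, fraction=0.01)
for compound_id, activity in hits:
    print(compound_id, activity)
```

With a library of roughly 4000 compounds, as screened in this work, the same 1% cutoff would retain about 40 compounds.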

Table II – Hit chemicals from HTL x SPA1 yeast two-hybrid screen.

ChemBridge ID numbers, CID numbers, SMILES strings, and relative activity in the screen are shown for each compound that was in the top 1% of hits in the HTL x SPA1 yeast two-hybrid based chemical screen.

CB ID  CID  SMILES  Activity
5162496  38047  C1CCN(CC1)C2=C(C(=O)OC3=CC=CC=C32)N  5.387078
5307697  759368  CC(=O)C1=C(C(=O)C2=CC=CC=C2C1=O)C3=CC=CC=C3  4.87666
5660085  741638  CC1=C(C=C(C=C1)NS(=O)(=O)C2=CC=C(C=C2)Br)C  4.809868
5701225  868873  C1CN(CCN1C2=CC=CC=C2)C(=O)CC3=CC=C(C=C3)Cl  4.059232
5850825  3114940  CCC1CCCCN1S(=O)(=O)C2=CC(=C(C=C2)C)C  4.216858
6088083  5348080  CC1=CC=CC=C1OCC(=O)O/N=C(/CC2=CC=CC=C2OC)\N  3.673789
9015458  2623077  CC1=C(C(=CC=C1)N2CCN(CC2)C(=O)COC3=CC=CC=C3OC)C  4.112435
9023622  6459556  C1CCN(CC1)S(=O)(=O)NC2=CC(=CC(=C2)Cl)Cl  3.999109
9053767  6469992  C1CCCC(CC1)NC(=O)CN2CCN(CC2)C3=CC=C(C=C3)Cl  13.52596
11166552  24817643  CC1=CC(=CC(=C1Cl)C)OCC2=CC(=NO2)C(=O)N(C)C3CCOC3  5.047472
11779621  24816686  CC1=CC(=CC(=C1Cl)C)OCC2=CC(=NO2)C(=O)N3CCN(CC3)CC4CC4  4.750521
12439376  42099214  CC1=CC(=C(C=C1)N2CCN(CC2)CC3=C(OC(=N3)C4=CC=C(C=C4)OC)C)C  6.865662
13902878  42191614  CC1=C(N=C(O1)C2=CC=CC=C2F)CN3CCOC4=C(C3)C=C(C=C4)Cl  4.481197
14868249  45175815  CN(CC1CCCN(C1)CCC2=CC(=CC=C2)C(F)(F)F)C3CCSCC3  5.461691
17658802  45180056  C1CN(C(CN1CC2=C(C=CC(=C2)Cl)O)CCO)CC3=CC(=C(C=C3)F)F  4.123248
21802139  42395512  CC1=C(N=C(O1)C2=C(C=CC(=C2)OC)F)CN3CCC4(CC3)C=CC5=CC=CC=C45  4.202052
22760174  28954725  CC1=CC2=C(C=C1C)N=C(N2)CN(C)CC3=NC(=NO3)C4=CC(=CC=C4)Cl  3.681337
23002086  45187588  C1CC(CN(C1)C(=O)NC2=CC=CC=C2C(F)(F)F)C(=O)C3=CC(=CC=C3)Cl  4.041701
24129931  45189014  CC1=C(N=C(O1)C2=CC=C(C=C2)Cl)CN(C)C(C)C3=CC(=CC=C3)OC  4.205442
25936320  45191240  CC1CN(CCN1CC2=C(OC(=N2)C3=C(C(=CC=C3)OC)OC)C)C4=CC=CC=C4  4.082349
32035096  25455718  CC1=C(N=C(O1)C2=CC(=C(C=C2)OC)OC)CN(C)CC3=CC=CC4=CC=CC=C43  4.021164
32465407  45198624  CC1=C(C=CC(=C1)Cl)C(=O)C2CCCN(C2)CC3=CC=C(C=C3)C#CCCO  5.05854
34824127  45201167  CN(CC1=NC(=NO1)C2=CC(=CC=C2)Cl)CC(C3=CC=CC=C3)O  3.891738
37402697  25395541  CN(CC1=CN(N=C1C2=CC3=C(C=C2)OCO3)C4=CC=CC=C4)CC5=NC=CS5  4.005499
40139765  16189896  C1CN(CCC1C2=CC=NC=C2)C(=O)C3=NOC(=C3)COC4=C(C=C(C=C4)F)Cl  4.699297
40943835  25545463  CC1=C(N=C(O1)C2=CC=CC3=CC=CC=C32)CN(C)CC4=CC=CC5=C4C=CN=C5  4.002543
41629182  45209487  CC1=NC=C(C=C1)OCC2=CC(=NO2)C(=O)N(C)CC3CC4=CC=CC=C4O3  5.191042
43054605  42531263  COCCN1CCC(CC1)CN(CC2=CC=C(C=C2)Cl)CC3=CC=NC=C3  3.839052
44003006  45215073  C1CC(CN(C1)C(=O)NCCC2=CC=CC=C2)C(=O)C3=CC(=CC=C3)Cl  8.56698
48985341  28346853  CC1=C(N=C(O1)C2=CC3=C(C=C2Cl)OCO3)CN(CC#C)C4CCCCC4  8.078174
59300521  23723393  CC1=CC2=C(C=C1)OC(=C2C)C(=O)NCC(COC3=CC=CC(=C3)OC)O  4.270828
68464060  26331447  CC1=C(OC2=C1C=CC=C2F)C(=O)NCC3=CN(N=C3)C4=CC=CC=C4OC  5.248262
70077959  45233541  C1CC(CN(C1)C2=NC3=CC=CC=C3S2)NCC4=CC=C(C=C4)C#CC5=CN=CN=C5  3.797547
71528219  42213486  C1CCC(=CC1)CC(=O)N2CCC3=C(C2)C=C(C=C3)NC(=O)C4=CC=CC=N4  4.041684
76005093  45238316  CC1=CC=C(C=C1)N2C=C(C(=N2)C3=CC=C(C=C3)F)CN(C)C(C)C4=NC=NC=C4  11.65264
78047604  45238526  CC1=C(C=CC(=C1)C2=NC(=C(O2)C)CN(C)C(C)C3=NOC=C3)OC  4.096662
78324256  42379269  CCN(CCN1C=CC=N1)CC2=C(OC(=N2)C3=CC=C(C4=CC=CC=C43)OC)C  3.805628
81752333  42461203  CC1=C(N=C(O1)C2=CC=C(C=C2)Cl)CN3CCC(CC3)OC4=CC=C(C=C4)OC  5.390244
86577258  45251837  CN(C)C1=NC=C(S1)CN2CCCC(C2)C(=O)C3=C4C=CC=C5C4=C(CC5)C=C3  4.647549
97768764  26411798  CC1=C(N=C(O1)C2=CC(=CC=C2)F)CN(CCO)CC34CC5CC(C3)CC(C5)C4  3.894615
98641713  24819183  CN(CCCC1CCCC1)C(=O)CC2C(=O)NCCN2CC3=C(C(=CC=C3)F)F  3.674584
98784987  38047  C1CCN(CC1)C2=C(C(=O)OC3=CC=CC=C32)N  4.179529

Table III – Hit chemicals from hypocotyl elongation screen.

ChemBridge ID numbers, CID numbers, and SMILES strings are shown for each hit in the hypocotyl elongation chemical screen.

CB ID  CID  SMILES
5107663  75462  CCN1C(=CC2=[N+](C3=CC=CC=C3C=C2)CC)C=CC4=CC=CC=C41.[Cl-]
5117972  6759427  CCOC(=O)C1=C(SC(=C1C)C)NC=C2C=C(C=C(C2=O)Br)Br
5228209  8515  C1=CC=C2C(=C1)C3=NNC4=CC=CC(=C43)C2=O
5260028  2787032  C1=CC(=CC(=C1)Br)C=C2C(=O)NC(=O)S2
5348278  2842821  COC1=CC=C(C=C1)S(=O)(=O)NC2=CC=C(C=C2)[N+](=O)[O-]
5732168  3122356  CC1=C(C(=NN1)NC2=CC(=CC=C2)OC)[N+](=O)[O-]
5850825  3114940  CCC1CCCCN1S(=O)(=O)C2=CC(=C(C=C2)C)C
5937705  1906281  CCCCSC1=NN=C2N1C3=CC=CC=C3S2
6088083  5348080  CC1=CC=CC=C1OCC(=O)O/N=C(/CC2=CC=CC=C2OC)\N
6682831  2910086  CC1=NC2=CC=CC=C2N1CC(COC3=CC=C(C=C3)C4CCC=C4)O
7938002  2969068  CN1CCN(CC1)C(=O)C2=C(C3=C(S2)N=C(C=C3C(F)(F)F)C4=CC=CC=C4)N
9025351  5300685  CC1=C(C2=CC=CC=C2N1)C(=O)C(=O)NC3=CC=C(C=C3)Cl
9047892  3745815  C1CN(CCN1C2=CC=CC=C2Cl)C(=O)NC3=CC=CC=C3F
9055547  6470861  C1=CC=C(C=C1)C#CC(=O)NC2=CC(=CC=C2)F
11166552  24817643  CC1=CC(=CC(=C1Cl)C)OCC2=CC(=NO2)C(=O)N(C)C3CCOC3
15303190  45176485  C=CCCC(=O)N1CCCC(C1)C(=O)C2=CC=CC3=CC=CC=C32
16599179  42241511  CC1=C(N=C(O1)C2=C(C=CC(=C2)OC)F)CN3CCN(CC3)C4=NC5=CC=CC=C5C=N4
19688989  42358517  CC1=C(N=C(O1)C2=C(C=C(C=C2)OC)F)CN3CCN(CC3)C4=NC5=CC=CC=C5C=N4
26253632  42592396  C1CCN(C1)C(=O)C2=CC3=C(C=C2)OC(=N3)CC4=CC=C(C=C4)Cl
40279003  45206801  COC1=C(C=C(C=C1)C2=C(C=NN2)CN3CCN(CC3)C4CCC5=CC=CC=C5C4)F
42106061  45208580  CC1=C(N=C(O1)C2=CC3=C(C=C2Cl)OCO3)CN4CCCCC4C5=CC(=CC=C5)OC
42929067  42528437  CN(CC1=CC=CC=C1)CC2=CN=C(N2CC3CCCCC3)S(=O)(=O)CC4CC4
47626980  45213786  COC1=CC2=C(C=C1)C=C(C=C2)C(=O)C3CCCN(C3)C4CCSCC4
49340567  45215403  CC(CCN1C=CC=N1)NC(=O)C2=CSC3=NC(=CN23)C4=CC=C(C=C4)C(F)(F)F
52282410  45230682  CN(CC1CCCCO1)C2=NC(=CN=N2)C3=CC(=CC=C3)F
53704063  42193702  C1CN(CCC1CCC(=O)NCC2=CC(=CC=C2)Cl)C(=O)C3=CC4=CC=CC=C4O3
67698045  24819013  CC1=C(C=C(C=C1)N2C3=C(C=N2)C(CCC3)NC(=O)C4=CC=CC=C4N5CCOCC5)C
72667535  45237957  CC(C)C1=CC=C(C=C1)NC2CCCN(C2)C(=O)C3=CC4=NO[N+](=C4C=C3)[O-]
77516946  45241454  CCC(C1=CC=NC=C1)NC(=O)C2=CC3=C(C=C2)OC(=N3)CC4=CC=C(C=C4)Cl
77524257  26390654  CC1=C(C(=CC=C1)N2CCN(CC2)CC3=C(OC(=N3)C4=CC=C(C=C4)Cl)C)C
82408900  16192152  CCOC(=O)C1(CCN(CC1)C(=O)C2=NN(C(=C2)C)C)CC3=CC=CC=C3C(F)(F)F
87721987  42119384  C1CCCN(CC1)C(=O)CCC2=NN=C(O2)CC3=CC=C(C=C3)C4=CC=CC=C4
97681499  16190219  CC1=C(C=CC(=C1)F)N2CCC(CC2)N3CCN(C(C3)CCO)CCC(C)C
11629806  45228223  CC1=CC(=CC(=C1Cl)C)OCC2=CC(=NO2)C(=O)NC(C)C3=CN(N=C3)C
38656525  26396312  CCOC(=O)N1CCN(CC1)CC2=C(OC(=N2)C3=CC(=CC=C3)C(F)(F)F)C
64496440  23723410  CC1=C(N=C(O1)C2=CC=C(C=C2)C3=CC=CC=C3)CN(CCOC)C(C)C
91273746  45226374  CC(C)N1CCC(CC1)N2CCCC(C2)C(=O)C3=CC4=C(C=C3)C=C(C=C4)OC
13463847  45220928  COC1=CC=CC=C1C(=O)NC2CCCC3=C2C=NN3C4=C(C=C(C=C4)F)F
62223044  42525364  COCC(=O)N1CCC(CC1)C(=O)NC2=CC=CC(=C2)C3=CC(=CC=C3)Cl
55590923  45252681  CC(C1=NC=NC=C1)NC(=O)C2=CSC3=NC(=CN23)C4=CC=C(C=C4)C(F)(F)F
41095052  42592845  CCN(C)CC1=NC(=NO1)C2=CC=C(C=C2)Cl
28115534  75462  CCN1C(=CC2=[N+](C3=CC=CC=C3C=C2)CC)C=CC4=CC=CC=C41.[Cl-]

Table IV – Identities of hits in primary chemical screen for hypocotyl lengthening agents.

The ChemBridge ID numbers, CID numbers, SMILES strings, and whether or not a compound was able to inhibit germination are shown for all hits. If the compound was able to inhibit germination its RG number is listed.

CB ID  CID  SMILES  Germination?
5378298  2844330  C1=CC2C3C(C1O2)C(=O)N(C3=O)C4=CC=C(C=C4)Br  N
5476484  520327  COC1=C(C=C2C(=C1)C(=NC(=N2)Cl)Cl)OC  N
9013088  2995765  CC1=CC(=C(C(=C1Cl)C)Cl)OCCN2CCN(CC2)C.Cl  N
75762136  45236652  CC1=C(NC2=CC=CC=C12)CN3CCCC(C3)N4CCN(CC4)C5=CC=C(C=C5)OC  N
98984089  26412749  CN(CC1CCN(CC1)CCC2=CC=C(C=C2)OC)CC3=CC=CS3  N
20282340  24794229  CCCCC1=NC2=C(O1)C=CC(=C2)C(=O)N3CCCC3  N
55286919  25304500  CC1=CC=C(O1)CN2CCC(CC2)CN(C)CC3=CC4=CC=CC=C4O3  N
68627672  26328080  C1CCC(C1)NC(=O)C2=CC(=C(C=C2)OC3CCN(CC3)CC4=CSC=C4)Cl  N
99918769  45253253  CC(CCN1CCC(CC1)C(=O)NC2=CC=CC(=C2)C3=CSC=N3)C4=CC=CC=C4  N
15852610  28389584  CC1=CC=C(O1)CN2CCC(CC2)CN(C)CC3=CN=C(S3)C4=CC=CC=C4  N
32261691  45198402  CC1=C(C=CC(=C1)CN2CCN(CC2CCO)CC3=CC=C(C=C3)C#C)OC  N
40870762  45207388  CC1(CC(C2=C(C1)N(N=C2)C3=CC=C(C=C3)OC)NC(=O)CN4N=C5C=CC=CC5=N4)C  N
88036311  45245290  CC1=C(C=CC(=C1C)OC)C2=NC(=C(O2)C)CN3CCCC3C4=CC=CC=C4OC  N
11541255  42118433  CC1=CC=C(S1)C2=NC(=C(O2)C)CN3CCN(CC3)C4=NC=NC5=CC=CC=C54  N
78628928  45238750  COC1=CC=CC(=C1)C2=CC=C(C=C2)NC(=O)C3CCCN(C3)CC4=CC(=CC=C4)O  N
15233450  42215958  CCOC(=O)C1(CCN(CC1)CC2=CC=C(C=C2)C#CCCO)CC3=CC(=CC=C3)C(F)(F)F  RG4
19486260  45182717  C1CC(CN(C1)CC2=CC=CC=N2)NCC3=CC=C(C=C3)OCC4=CC=CC=C4  RG5
85445356  45243570  CN(CC1CCN(CC1)CCC2=CC=CC=C2F)CC3CCC=CC3  RG7
43204705  45209642  CC1=C(N=C(O1)C2=CC=C(C=C2)C(F)(F)F)CN3CCCCC3C4=NC=CN4C  N
54044684  45219551  CC1=C(NC2=C1C=C(C=C2)Cl)CN3CCN(C(C3)CCO)C4CCCC4  N
54190612  42567453  CC1=CC(=CC=C1)CN(CCC2=CC=C(C=C2)OC)CC3CCN(CC3)C  N
84027977  42393812  CC1=CC=C(C=C1)N2C=C(C(=N2)C3=CC(=CC=C3)C)CN4CCC(CC4)C(=O)OC  N
12426770  23723435  COC1=CC(=C(C=C1)F)CN2CCCC(C2)(CC3=CC(=CC=C3)Cl)CO  N
22315646  42290368  COC1=CC2=C(C=C1)C=C(C=C2)C3=C(C=NN3)CNCC4=CC=C(C=C4)C(F)(F)F  N
58176071  45223106  COC1=CC=CC=C1N2CCN(CC2)C3CCCN(C3)CCCC4CCCC4  N
5135250  69298  C1=CC=C(C=C1)NC(=O)CC(=O)NC2=CC=CC=C2  N
5547669  2249387  CCCCNCCCOC1=CC(=C(C=C1)Cl)C  RG1
6051520  721066  C1=CC=C(C(=C1)C2=NC3=C(O2)C=CC(=C3)N)Br  RG3
5621856  6741140  C1=CC=C(C=C1)C2=NNC(=S)N2NC=C3C=CC=CC3=O  RG2
5931792  2877228  C1=CC=C(C=C1)C2=CSC(=N2)NC3=CC=C(C=C3)O.Br  N
5849268  781570  C1CN(CCN1C2=CC=C(C=C2)N)C(=O)C3=CC=CC=C3  N
5194006  6810063  COC1=CC(=CNNC(=O)C2CC2(C3=CC=CC=C3)C4=CC=CC=C4)C=CC1=O  N
38941253  45205447  C1CC2CN(CCN2C1)C3=NC(=CN=N3)C4=CC=CC5=CC=CC=C54  N
28144154  25274894  CN(CC1CCN(CC1)CCC2=CC(=CC=C2)F)CC3=CN=CC=C3  N
20585905  42364911  CC1=C(N=C(O1)C2=CC=C(C=C2)OC)CN3CCC(CC3)C4=CC=CC=C4  N
15625510  45177000  CC1=C(C=CC(=C1)Cl)C(=O)C2CCCN(C2)CC3=C(C=C(C=C3)O)OC  N
41990395  72076803  CCOC1=CC=C(C=C1)CN2CCN(CC2CCO)CC=CC3=CC=C(C=C3)OC  RG6

Table V – Identities and description of SOPR lines.

Arabidopsis gene IDs, short gene names, and gene descriptions are shown for all SOPR lines where available.

SOP#  Arabidopsis ID  Gene names  Gene description
SOPR7  AT5G38530  TSBTYPE2  Tryptophan synthase beta type 2
SOPR12  AT2G43430  GLX2-1  Mitochondrial glyoxalase 2-1, required for growth under abiotic stress
SOPR32  AT1G03220  AT1G03220  Eukaryotic aspartyl protease family protein, may be involved in salt stress
SOPR53  AT2G22590  AT2G22590  UDP-Glycosyltransferase superfamily protein
SOPR55  AT2G33840  AT2G33840  Mitochondrial tyrosyl-tRNA synthetase, class Ib
SOPR75  AT4G14090  AT4G14090  UDP glycosyl transferase acts on anthocyanin
SOPR83  AT1G55670.1  PSAG  Photosystem I subunit G, has role in electron transport within photosystem
SOPR94  AT4G14090  AT4G14090  UDP glycosyl transferase acts on anthocyanin
SOPR110  AT5G54270  LHCB3  Light-harvesting chlorophyll B-binding protein 3
SOPR149  AT5G18360  AT5G18360  Disease resistance protein (TIR-NBS-LRR class)
SOPR189  AT5G57050  ABI2  Protein phosphatase 2C family protein, core ABA signaling component
SOPR196  AT3G27420  AT3G27420  Unknown protein
SOPR213  AT5G11740  AGP15  Arabinogalactan protein 15, unknown function
SOPR221  AT5G56680  SYNC1  Class II aminoacyl-tRNA and biotin synthetases superfamily protein
SOPR222  AT4G36790  AT4G36790  Major facilitator superfamily protein
SOPR223  AT3G61550  ATL68  RING/U-box superfamily protein with zinc finger motifs
SOPR224  AT3G51240  F3H  Flavanone 3-hydroxylase
SOPR244  AT5G42820  U2AF35B  Splicing factor with zinc finger motif

Table VI – Gene names and transcript abundances for genes significantly changed between htl-3 and DMSO treated Col seedlings.

The gene names, transcript abundances in FPKM, and relative fold changes are shown for all genes that were significantly changed in abundance between htl-3 plants and DMSO treated Col plants. The p-value cutoff used was 0.01.

FPKM Fold changes Gene name DMSO htl-3 SOP htl-3/DMSO SOP/DMSO AT1G01300 25.4933 46.7983 22.2537 1.83571 0.475524 AT1G01800 20.8908 10.6165 17.8595 0.50819 1.68224 BGLU11 11.2771 6.26293 12.7117 0.555367 2.029673 Sadhu4-2 13.1648 24.5839 13.5788 1.867396 0.552345 3AT2 2.48533 0.65541 2.39504 0.263711 3.654262 EIF2 GAMMA 15.3225 9.08091 14.93 0.592652 1.644108 AT1G04420 23.2836 13.6593 19.1031 0.586649 1.398542 AT1G04560 5.53167 14.3264 3.14753 2.589887 0.219701 ADS1 2.67345 8.1123 1.73393 3.034394 0.213741 ACX3 10.1627 5.6475 10.3012 0.555709 1.824028 AT1G06510 4.78334 1.81296 4.35785 0.379015 2.403721 AT1G07210 7.00096 2.7245 6.63145 0.389161 2.434006 RPL10AA 87.127 53.479 75.6937 0.613805 1.415391 PRA1E 13.0713 25.9837 12.0759 1.987844 0.464749 ERD6 40.992 22.0239 35.4599 0.537273 1.610065 CSP41B 374.635 219.826 325.962 0.586774 1.482818 GLP4 51.1993 85.9264 54.0887 1.678273 0.629477 RPL21A 151.801 96.8878 116.352 0.638255 1.200894 AT1G09750 62.988 145.062 62.0981 2.30301 0.42808 XBCP3 4.70207 2.09417 4.31787 0.445372 2.061853 AT1G09870 5.95678 3.06933 5.08431 0.515267 1.656489 AT1G09900 4.90333 1.83904 4.30925 0.375059 2.343206 ERG28 25.9237 50.3417 23.0426 1.941918 0.457724 ACLA-1 37.4009 23.6272 34.1896 0.631728 1.447044 LRX1 1.44916 3.86715 2.43046 2.668546 0.628489 AT1G12080 18.4631 39.1157 20.2469 2.118588 0.517616 AT1G14345 117.447 215.282 119.087 1.833014 0.553167 AT1G14890 29.9939 48.7593 26.0668 1.625641 0.534602 PNSB1 30.8267 15.9418 29.1641 0.517143 1.829411 AT1G16080 49.1872 31.4236 39.108 0.638857 1.244542 AT1G16180 65.2499 108.935 57.809 1.669504 0.530674 AT1G16320 20.1626 10.2265 17.7597 0.507201 1.736635 SPS2 11.8378 6.29703 11.6677 0.531943 1.852889 GSTU24 7.33199 2.54051 6.48973 0.346497 2.554499

199

GSTU25 6.70068 2.76025 6.04003 0.411936 2.188218 AT1G17200 89.6623 181.771 75.1378 2.027285 0.413365 ALAAT1 29.4034 18.0415 27.6336 0.613586 1.531669 AT1G17620 16.0277 27.8438 14.8054 1.73723 0.531731 TIP3-2 3.12717 15.6629 3.5586 5.00865 0.227199 AT1G19600 16.0602 9.37647 12.8557 0.583833 1.37106 PUP14 29.1398 17.7978 27.2898 0.610773 1.533324 VHA-C2 184.483 390.374 169.525 2.116043 0.434263 CAT3 149.662 259.349 146.366 1.732898 0.564359 AT1G22750 41.6191 66.1478 35.0262 1.589362 0.529514 AT1G23040 14.247 29.3894 11.3169 2.062848 0.385067 AT1G23130 66.5733 29.8183 62.1587 0.447902 2.084582 SPDS1 35.519 21.9439 30.0037 0.617807 1.367291 TULP10 8.51167 4.86906 8.05613 0.572045 1.654555 AT1G27000 20.9959 12.4222 17.2061 0.591649 1.385109 AT1G27690 2.86603 1.11802 2.5025 0.390094 2.238332 NTF2 70.1271 117.94 67.5914 1.681803 0.5731 RPL34 293.41 476.618 254.081 1.62441 0.533091 AT1G29670 27.5013 17.0198 24.1493 0.618873 1.418894 UGT78D1 6.79686 3.58782 4.82272 0.527864 1.344192 AT1G30700 12.3684 6.01517 11.3052 0.486334 1.879448 PER7 14.4205 24.3662 15.3688 1.689692 0.630743 ECS1 77.691 273.681 79.6652 3.522686 0.291088 AT1G32410,PCMP-E56 20.0364 34.0343 17.8543 1.698624 0.524597 PKP3 9.48006 4.64337 7.44786 0.489804 1.603977 ADR1 2.12994 0.824896 2.06038 0.387286 2.497745 AT1G33811 24.0521 13.6488 20.9121 0.567468 1.532157 AT1G35516 63.1335 117.388 56.8871 1.859362 0.484607 AT1G36310 6.03587 2.53608 5.1445 0.420168 2.028524 AT1G36980 137.807 270.234 123.529 1.96096 0.457119 ARP1 190.738 107.655 162.231 0.564413 1.506953 AT1G45165 12.5026 27.4036 9.94964 2.191832 0.363078 AT1G47710 4.76156 2.16597 4.08081 0.454887 1.884057 PEX3-2 3.56232 1.52257 3.88224 0.42741 2.549794 NUCL1 23.1575 12.8203 24.2203 0.553613 1.889215 CCS1 7.25184 3.63369 5.63008 0.501071 1.549411 PAB8 16.9377 9.75319 16.8603 0.575827 1.728696 AT1G50380 3.88699 1.73053 3.66795 0.445211 2.119553 TGG5 4.44424 2.15793 4.91397 0.485557 2.277168 IAA18 7.35088 3.7275 7.12051 0.507082 1.910264

200

NPF1.2 4.8262 9.09228 4.51881 1.883942 0.496994 AT1G52870 14.9992 8.8061 15.0619 0.587105 1.710394 AT1G52930 7.95332 2.64512 6.78726 0.332581 2.565955 HEI10 0.388583 8.84649 0.14946 22.76602 0.016895 AT1G54520 10.1579 5.80853 9.9507 0.571824 1.713118 AT1G54740 20.4005 10.3751 20.2191 0.508571 1.94881 PRP1 3.49574 10.9128 5.74298 3.121742 0.526261 AT1G56050 8.18094 4.26523 6.47496 0.521362 1.51808 AT1G56220 182.19 283.343 175.701 1.555206 0.6201 GSTU16 10.248 5.00934 9.64282 0.488811 1.924968 EXPA18 10.6671 19.707 8.64936 1.847456 0.438898 AtGH9C2 5.92241 9.82697 5.58374 1.659286 0.568206 RPN12A 39.7047 23.6127 33.9281 0.594708 1.436858 AT1G65220 12.3776 5.83029 10.2001 0.471036 1.749501 CICDH 101.483 57.3629 93.6696 0.565246 1.63293 MIR163 9.47043 3.50575 9.69434 0.370179 2.765268 AT1G66820 33.7621 59.1194 31.3708 1.751058 0.530635 AT1G67060 14.1688 27.7961 11.652 1.961782 0.419195 PIS1 29.4244 50.3432 28.3691 1.710934 0.563514 AT1G68670 9.8521 5.51328 8.88288 0.559605 1.611179 AT1G68945 63.3585 137.502 54.2832 2.170222 0.394781 TCP15 3.94054 7.25912 3.36249 1.842164 0.463209 GGAT2 7.83794 2.9786 7.16316 0.380023 2.404875 TIFY7 21.1371 11.6561 17.1703 0.551452 1.473074 PIN3 16.8123 26.6364 18.2238 1.58434 0.684169 AT1G71840 8.35457 4.26265 6.62416 0.510218 1.554 AT1G71970 12.4744 24.8751 12.7111 1.994092 0.510997 AT1G72090 3.25717 1.47153 3.41068 0.451782 2.317778 AT1G72230 9.52632 29.1816 8.35522 3.063261 0.286318 AT1G72645 29.2026 51.167 25.5063 1.752139 0.498491 TIP3-1 1.61021 8.12225 2.32343 5.044218 0.286057 AT1G73920 8.2602 4.86228 8.0208 0.58864 1.649596 AT1G74450 3.29064 1.08133 3.20194 0.328608 2.961113 GASA6 62.5034 99.8171 57.6173 1.596987 0.577229 AT1G75280 20.6408 11.5068 16.1389 0.557478 1.402553 VHA-C4 97.9785 218.926 84.2126 2.234429 0.384662 PDF1.1 17.6844 55.5458 14.687 3.140949 0.264412 OPR1 28.9623 14.1392 28.3101 0.488193 2.002242 ACR3 15.7013 9.49284 16.0042 0.604589 1.685923 RHM1 12.8414 5.66219 13.1125 0.440932 2.3158

201

PSBR 5866.42 10662.8 5601.87 1.817599 0.525366 PIP1-3 518.212 1039.88 471.731 2.006669 0.45364 AT1G03106 1.00035 5.67087 0.618253 5.668886 0.109023 PSAD2 212.265 473.167 260.092 2.229133 0.549683 PHB2 17.3684 8.89691 14.0628 0.512247 1.580639 FLA9 269.854 515.689 261.463 1.910993 0.507017 AT1G04340 13.5789 25.1162 12.4887 1.849649 0.497237 TPR4 11.5771 5.72033 10.8312 0.494107 1.893457 PRMT10 7.37118 2.90261 6.64479 0.393778 2.289247 AT1G05135 35.8973 63.8058 37.5304 1.777454 0.588197 GLX2-4 5.10459 2.25827 4.86829 0.4424 2.155761 FTSH8 17.0634 9.50151 19.2941 0.556836 2.030635 LSH6 5.09414 13.5012 4.91865 2.650339 0.364312 AT1G07135 19.5687 10.2712 8.5914 0.524879 0.836455 AT1G07310 19.6031 31.5888 22.1516 1.611419 0.701249 AT1G07750 12.0029 22.4814 11.3866 1.872997 0.50649 RPS15AA 301.389 513.073 275.067 1.702361 0.536117 RABA2A 6.97007 2.96851 4.91984 0.425894 1.657343 PUR2 7.66436 3.80512 6.4538 0.496469 1.696083 NFXL1 4.98262 2.9875 5.45317 0.599584 1.825329 MLO2 7.54756 4.01389 7.28053 0.531813 1.813834 XTH8 17.1499 29.2037 16.2682 1.70285 0.55706 AT1G12064 5.86632 15.1664 5.51422 2.585335 0.363581 ELP 799.269 1971.07 837.135 2.466091 0.424711 AT1G12845 11.8961 25.1926 6.72658 2.117719 0.267006 ERF1-2 18.0438 6.10918 15.0182 0.338575 2.4583 AT1G13380 31.5483 52.9776 25.8588 1.679254 0.488108 SBH2 17.3375 28.5421 15.5296 1.646264 0.544095 AT1G14450 75.1005 137.398 70.9218 1.829522 0.516178 AL7 13.998 7.15825 11.2103 0.511377 1.566067 DECOY 7.72575 2.49542 6.55259 0.323 2.625847 KDSA2 4.97482 2.40777 4.00583 0.483991 1.66371 AT1G17960 5.07052 0.612113 5.71701 0.12072 9.339795 RPL6A 98.9896 58.0551 86.3336 0.586477 1.487098 AT1G18850 4.99806 1.9809 4.39687 0.396334 2.219632 AT1G18980 14.3282 34.0669 12.7665 2.377612 0.374748 AT1G19370 2.41667 0.785907 2.23624 0.325202 2.845426 BBD2 28.3051 16.719 25.4881 0.590671 1.524499 CLH1 6.69685 15.255 7.16528 2.277937 0.4697 APS2 20.3782 10.8236 16.2962 0.531136 1.505617

202

EXPA11 9.15582 24.4243 9.32144 2.667626 0.381646 COR47 15.7236 25.467 20.2663 1.619667 0.795787 AT1G22200 7.44318 3.33913 5.52593 0.448616 1.654901 GRF10 38.0991 23.917 35.9368 0.627758 1.502563 UGT85A1 19.1827 10.831 15.3538 0.564623 1.417579 AT1G23390 58.7087 91.7699 65.0916 1.56314 0.709291 AT1G23950 12.1464 6.83637 10.8904 0.562831 1.593009 CCT5 17.9202 9.92904 17.9838 0.55407 1.811233 AT1G25260 19.6676 8.92429 19.1487 0.453756 2.145683 AT1G25400 16.2121 9.07858 11.04 0.559988 1.216049 AT1G27100 12.3394 7.46695 10.5252 0.605131 1.409572 AT1G27200 3.756 1.69452 3.75456 0.45115 2.215707 AT1G27470 1.90315 0.665244 2.15394 0.349549 3.23782 ZAT10 22.4283 10.2741 10.9249 0.458086 1.063344 AT1G28120 9.32804 3.77047 7.40391 0.404208 1.963657 DYL1 197.786 337.239 176.586 1.70507 0.523623 AT1G29090 24.249 12.8967 19.2267 0.531845 1.490823 COR413IM2 19.9269 36.5423 18.948 1.833818 0.518522 AT1G29785 15.957 31.8826 13.5742 1.998032 0.425756 LHCB1.2 3996.57 7642.05 4164.54 1.912152 0.544951 RPL18AA 32.0729 53.413 27.0145 1.665362 0.505766 AT1G30580 34.6547 21.2558 29.5311 0.61336 1.38932 AT1G30630 20.7991 11.52 17.2911 0.55387 1.500964 AT1G30820 6.38031 3.64992 5.96644 0.57206 1.634677 IMDH1 17.2261 9.83198 16.3931 0.570761 1.667324 PSAF 1077.72 1865.12 1284.52 1.730616 0.688706 SKIP25 7.55512 3.22635 6.62046 0.427042 2.051997 DAD1 67.8676 111.227 53.5673 1.638882 0.481603 AT-GTL1 2.33794 4.3439 3.84016 1.858003 0.884035 TCP23 7.52296 13.3011 6.42168 1.768067 0.482793 CYCA1-1 2.41749 0.635898 1.45355 0.263041 2.285823 GAMMACA2 29.4386 14.8577 23.8138 0.504701 1.602792 DELTA-ADR 5.29096 9.06032 5.66015 1.712415 0.624719 REM19 9.5436 5.30254 9.20797 0.555612 1.736521 RBP47A 17.4212 9.77729 15.2926 0.561229 1.564094 FTSH1 57.2693 35.6219 50.663 0.622007 1.422243 AT1G50290 10.2009 21.8127 11.2023 2.138311 0.513568 UBP6 5.70834 2.80053 6.13718 0.490603 2.191435 JAL8 11.239 4.46774 8.65886 0.397521 1.938085 JAL9 20.0257 9.18651 16.9884 0.458736 1.849277

203

JAL10 37.4788 13.4578 32.4882 0.359078 2.41408 AT1G52100 55.5609 33.7946 48.3793 0.608244 1.431569 CURT1C 178.145 298.895 173.764 1.677819 0.581355 GLL22 23.3693 13.2383 19.3469 0.566483 1.461434 AT1G54410 30.5788 72.828 44.563 2.38165 0.611894 AT1G55152 12.1693 24.186 10.5703 1.98746 0.437042 DIR20 23.0128 40.7036 22.6072 1.768737 0.55541 AGP21 375.492 849.956 354.006 2.26358 0.416499 TBP2 11.3725 6.2563 9.32583 0.550125 1.49063 PSAG 1659.84 3578.28 1787.22 2.155798 0.499463 ATDI19 103.113 57.4521 90.6624 0.557176 1.578052 AT1G60000 32.9828 18.3055 29.3666 0.555001 1.60425 ACLA-2 8.19042 4.42872 7.77236 0.54072 1.75499 ARP2 5.63921 2.34406 4.34539 0.415672 1.853788 Apr-02 32.693 12.6924 29.2095 0.38823 2.301338 WRKY6 3.02154 1.36397 2.63463 0.451416 1.931589 AT1G62510 154.375 294.926 185.552 1.910452 0.629148 AT1G63010 5.74555 3.34115 5.61093 0.58152 1.679341 UGE3 6.56245 2.69052 6.11136 0.409987 2.271442 AT1G63660 7.48305 4.15781 5.59214 0.55563 1.344972 ENODL8 22.6923 47.9073 21.5916 2.11117 0.450695 AT1G65020 4.45662 1.33933 3.73345 0.300526 2.78755 ECI1 9.53087 21.4793 8.4038 2.253656 0.391251 AT1G65720 79.0544 151.549 72.1103 1.917022 0.475822 AT1G66100 13.1484 35.0706 17.7845 2.66729 0.507106 BGLU21 16.7261 9.72639 15.5602 0.58151 1.599792 BGLU22 9.93256 4.67309 10.1451 0.470482 2.170962 AT1G66330 15.556 8.46163 15.2467 0.543946 1.801863 CRWN1 1.16621 2.53137 2.78171 2.170595 1.098895 PDCB4 15.0237 26.5406 13.3309 1.766582 0.502283 HIR2 10.711 5.16508 9.06735 0.482222 1.75551 SPDSYN2 28.081 16.732 22.0051 0.595848 1.315151 BCA4 65.1843 40.1729 54.6369 0.616297 1.360044 ARF2-A 114.304 206.595 109.486 1.807417 0.529955 MLP34 38.7588 65.081 47.0972 1.679128 0.723671 AT1G70985 6.32506 14.1824 7.80533 2.242255 0.550353 AT1G70990 2.20002 7.23397 2.53787 3.288138 0.350827 ATMYBL2 7.52558 15.248 11.4333 2.026156 0.749823 AT1G71060,AT1G71070 5.23258 2.79313 3.94234 0.533796 1.411442 AT1G72020 130.768 229.469 130.995 1.754779 0.570861

204

AT1G72550 7.56855 2.82643 7.02494 0.373444 2.485446 CAD1 5.53562 1.66796 4.79059 0.301314 2.872125 AT1G73120 102.178 171.466 84.0176 1.678111 0.489996 NRP1 8.38892 3.73005 9.23426 0.44464 2.47564 MIF1 5.7506 15.3192 7.11837 2.663931 0.46467 WAT1 47.8933 79.0973 40.8511 1.651532 0.516466 PETE 128.314 206.326 125.509 1.607977 0.608304 AT1G77670 6.72464 3.29442 5.51356 0.489903 1.673606 SRK2C 4.71996 1.5843 4.28439 0.33566 2.704279 RPS17 485.018 1039.55 449.415 2.143323 0.432317 AT1G80530 10.2543 18.379 8.16487 1.792321 0.44425 AT2G01008 9.03768 16.7104 8.12216 1.84897 0.486054 AT2G01021 9614.61 36350.4 14131.2 3.780746 0.38875 PDF2.2 268.311 531.699 260.497 1.981652 0.489933 PDF2.3 201.128 500.991 199.815 2.490906 0.39884 UBC2 42.1447 67.1265 40.5263 1.592763 0.60373 SPR1 86.5146 142.277 74.8049 1.644543 0.525769 FLA7 26.329 50.3309 27.5056 1.911615 0.546495 GRP3S 822.052 1946.64 848.801 2.368025 0.436034 GRP3 181.871 298.709 168.315 1.642422 0.563475 AT2G05790 23.9111 38.9866 24.8009 1.630481 0.636139 XTH4 287.097 496.415 297.299 1.729085 0.598892 AT2G07717 5.54106 9.9905 6.79205 1.802994 0.679851 MT-CYB 7.46659 14.6889 10.3794 1.967284 0.706615 AT2G14878 46.3126 95.648 40.349 2.065269 0.421849 AGP9 83.276 199.649 76.2113 2.397437 0.381726 UGT73B4,UGT73B5 2.62676 0.98171 2.63116 0.373734 2.680181 COR413PM1 90.1362 189.089 97.6705 2.097814 0.516532 AT2G16586 185.561 357.577 202.942 1.927005 0.567548 AT2G17710 8.3214 17.9321 8.78841 2.154938 0.490094 AT2G19350 7.69804 18.0401 7.69612 2.343467 0.426612 AT2G19530,AT2G19540 10.6697 5.24256 9.67171 0.49135 1.844845 ASHR2 3.98827 1.8093 3.17356 0.453655 1.754026 TOM2AH2 38.8719 63.3797 37.5188 1.630476 0.591969 AT2G20420 13.9554 5.82602 12.055 0.417474 2.069166 GLK1 8.71067 16.3663 10.4284 1.87888 0.637187 TOM2AH3 13.9159 25.585 13.6176 1.838544 0.532249 AT2G20820 84.4916 144.381 79.483 1.708821 0.550509 AT2G21187 24.1304 50.3556 25.3611 2.086812 0.50364 AT2G21195 11.3203 27.2827 13.2781 2.410069 0.486686

AT2G22425 57.1587 103.02 54.6586 1.80235 0.530563 RER1C 15.5145 28.1554 16.9114 1.81478 0.600645 AT2G24040 13.451 30.9198 13.5944 2.298699 0.439666 AT2G24090 209.083 343.912 186.467 1.644859 0.542194 CYP71B6 7.33811 2.90058 6.4009 0.395276 2.206766 AT2G24980 0.577754 2.73187 0.548549 4.728431 0.200796 AT2G25210 171.474 286.162 143.053 1.668836 0.499902 AT2G25250 9.3231 19.7713 9.8049 2.120679 0.495916 TIP4-1 9.06683 17.08 7.69505 1.88379 0.45053 AT2G26355 11.0066 19.9348 11.1334 1.811168 0.558491 RPP2A 225.377 373.081 216.85 1.655364 0.581241 CSD2 135.371 227.479 127.227 1.680412 0.559291 AT2G28370 26.0561 51.25 23.0537 1.96691 0.449828 APK1B 3.00336 1.02828 3.49043 0.342377 3.394435 AT2G29660 5.62848 12.8819 5.92725 2.2887 0.460122 ASP1 14.7423 8.11825 11.3755 0.550677 1.401226 AT2G31585 5.13855 11.8331 4.59348 2.302809 0.388189 AT2G32090 14.6413 27.3962 18.9122 1.871159 0.690322 OFP16 4.72667 13.6969 5.73044 2.897791 0.418375 RPL27A 16.2386 7.14518 11.791 0.440012 1.650203 UCC1 12.7091 37.4242 11.9309 2.944677 0.318802 CSLB3 6.5561 3.48027 5.38781 0.530845 1.548101 LHB1B1 3559.29 7801.38 3913.9 2.191836 0.501693 NHL12 6.0743 14.6405 6.6769 2.410237 0.456057 DOT1 74.3508 131.963 75.2664 1.77487 0.57036 AT2G36220 9.42815 22.0557 9.67652 2.339345 0.438731 PXC1 2.74293 6.17115 2.54254 2.249839 0.412004 TIP1-1 556.309 1520.75 521.681 2.733643 0.343042 AT2G37690 2.46671 0.957148 2.79393 0.388026 2.919016 AT2G37750 61.9871 134.231 59.9718 2.165467 0.446781 AKR4C8 15.0521 8.34847 13.3674 0.554638 1.60118 RPS31 205.586 471.406 214.253 2.292987 0.454498 DTX44 6.72799 11.5432 5.30974 1.715698 0.459989 PER22 13.767 7.69493 12.987 0.55894 1.687735 LTP2 394.744 1116.16 328.604 2.827554 0.294406 LTP1 445.593 882.894 448.675 1.981391 0.508187 PIP2-6 132.249 270.268 123.055 2.04363 0.455307 AT2G39020 21.3292 35.3885 21.4311 1.659157 0.605595 DIR9 3.72839 7.94705 4.35038 2.131496 0.547421 ACR9 70.6014 44.779 63.4576 0.634251 1.417129

WLIN2A 11.8359 21.7379 11.6324 1.836607 0.535121 AT2G40110 12.515 23.0123 12.1869 1.838777 0.529582 CZF1 11.1737 5.50886 8.18344 0.49302 1.485505 CML12 21.8086 12.2327 20.3945 0.560912 1.667212 M17 1.84256 18.6252 4.64776 10.10833 0.249541 AT2G41312 7.39279 23.7259 6.2161 3.20933 0.261996 OEP163 56.4283 94.5682 56.9073 1.6759 0.601759 AT2G42310 98.3574 208.236 90.5357 2.117136 0.434774 PIF4 1.53139 3.45426 2.84717 2.255637 0.824249 PME16 2.88708 15.1203 2.347 5.237229 0.155222 AT2G43540 8.30483 17.0407 7.31508 2.051902 0.429271 AT2G44410 2.47268 0.806782 2.08608 0.326278 2.58568 AT2G44870 20.5802 11.699 18.5273 0.568459 1.583665 AT2G45180 2616.15 6109.77 2571.16 2.335405 0.420828 CURT1B 402.72 637.902 441.909 1.583984 0.692754 AT2G47420 9.29536 4.70071 9.05003 0.505705 1.925247 TSPO 3.54939 14.9481 3.71935 4.211456 0.248818 AT2G47780 17.6747 57.8821 9.56016 3.274856 0.165166 CRR6,NET3C 27.2102 45.7676 22.8751 1.682002 0.49981 RPH1 40.2175 69.8368 34.3051 1.736478 0.491218 EDA4 6.19547 17.423 7.24656 2.812216 0.415919 AT2G01410 7.50045 13.3176 7.62626 1.775573 0.572645 PIN4 15.475 27.1658 15.8727 1.755464 0.58429 SOT11 6.37273 3.09971 6.45628 0.486402 2.082866 SOT12 13.9126 4.43067 11.0519 0.318465 2.494408 AT2G03820 11.6874 5.53523 11.3499 0.473607 2.050484 DTXL1 38.9966 3.5965 42.7049 0.092226 11.87402 LHCB2.2 1084.84 2172.07 1192.96 2.002203 0.549227 LHCB2.1 1485.11 3113.53 1630.34 2.096498 0.523631 PSBX 1160.69 2058.41 1181.57 1.773436 0.574021 PLA2-ALPHA 9.06443 20.0897 8.40745 2.216322 0.418496 AT2G06950 2.18889 0.909905 2.17264 0.415692 2.387766 AT2G07671 28.3403 77.2865 34.0003 2.727088 0.439925 COX3 7.62763 19.443 9.10724 2.549022 0.468407 AT2G07811 4.11135 8.48245 4.9215 2.063179 0.580198 AT2G07711 6.05043 18.1432 6.82063 2.998663 0.375933 AT2G07723,AT2G07815 2.70559 5.3457 3.15033 1.975798 0.58932 AT2G07731 3.49713 8.10717 4.32599 2.318235 0.533601 AT2G07733 4.73534 9.33858 6.81535 1.972103 0.729806 AT2G10940 608.376 1326.1 576.111 2.179738 0.43444

AGT1 716.92 1239.53 675.646 1.728966 0.545082 AT2G13820 29.062 65.2027 30.5968 2.243572 0.469257 TIC21 28.0722 46.7603 25.9807 1.665716 0.555614 AT2G15292 17.4023 31.6928 15.499 1.821185 0.489039 VHA-C3 73.0662 135.151 63.6069 1.849706 0.470636 EXL5 20.9347 33.9269 20.5646 1.620606 0.606144 ATL44 28.2163 62.1952 29.9955 2.204229 0.48228 AT2G18190 1.61171 0 1.29352 0 #DIV/0! AT2G18193 5.8751 0.37683 7.7187 0.06414 20.48324 AT2G18900 1.7212 0.632866 2.43959 0.367689 3.854829 AT2G19340 6.90496 15.598 8.30788 2.258956 0.532625 PRO1 95.0579 172.087 93.1578 1.810339 0.541341 MIOX2 27.6695 15.2837 26.8606 0.552366 1.757467 COV1 27.8747 56.1721 25.1843 2.015164 0.448342 AT2G20835 8.95737 22.0112 11.2063 2.457328 0.509118 CSP4 78.5936 138.476 65.6012 1.761925 0.473737 PRP2 9.73556 18.6077 8.29673 1.911313 0.445876 AT2G21180 7.62559 18.3191 9.42601 2.402319 0.514545 RBG7 400.209 663.392 339.028 1.657614 0.511052 AT2G22170 121.801 253.54 110.521 2.081592 0.435911 AGP2 25.4467 47.6886 27.8937 1.874058 0.584913 HHP3 36.8866 73.6628 33.1156 1.997007 0.449557 FZF 7.17681 3.21017 6.5018 0.447298 2.025376 JAL20 3.98823 1.52337 4.33742 0.381966 2.847253 AT2G27385 183.102 379.322 199.728 2.071643 0.526539 CKS1 33.6358 65.9326 31.1142 1.960191 0.471909 CKS2 17.3382 32.1114 18.2886 1.852061 0.569536 DIR10 19.4823 58.4662 19.7545 3.000991 0.337879 AT2G28790 18.6467 32.7397 16.8426 1.755791 0.51444 EXPA6 35.4495 77.7572 34.6141 2.193464 0.445156 CYP73A5 37.9932 22.9021 30.4637 0.602795 1.330171 PSBW 874.296 1533.45 936.431 1.753925 0.610669 AT2G31141 36.2499 78.5531 43.7438 2.166988 0.556869 AT2G32235,AT2G32240 0.910804 2.03009 1.85551 2.228899 0.914004 AT2G32380 9.81869 19.3606 9.27489 1.971811 0.47906 ATGRP23 418.721 774.159 441.984 1.848866 0.570921 AGP30 7.12476 17.6036 6.15615 2.470764 0.34971 AT2G33830 193.012 358.63 193.912 1.858071 0.540702 LHB1B2 7295.5 15896.2 7946.66 2.178905 0.499909 CYP710A2 5.59762 13.088 5.36266 2.338137 0.409739

CYP710A1 3.79924 8.50633 4.35023 2.238956 0.511411 AT2G34750 7.91556 3.95311 6.59877 0.49941 1.66926 ATL9 1.78395 4.65968 2.21267 2.612001 0.474854 LEA18 12.759 37.6426 14.5136 2.950278 0.385563 AT2G35880 1.1884 3.03035 2.43826 2.549941 0.804613 CASP1 24.2726 62.2702 22.4614 2.565452 0.360709 UGT73C5 9.14182 1.46155 8.88601 0.159875 6.079854 AT2G37035 5.56934 10.2659 5.34043 1.843288 0.520211 PAL1 8.54677 4.50364 10.1568 0.526941 2.255242 S1FA2 26.8411 55.4612 27.8417 2.066279 0.502003 CP29B 420.543 685.403 377.133 1.629805 0.550235 EXPA3 37.3087 98.9332 33.0474 2.651746 0.334038 PRA1B4 14.6789 31.0676 14.0992 2.11648 0.453823 AT2G38480 13.5289 29.1018 13.8758 2.151084 0.476802 AT2G40290 8.32142 4.25137 7.27396 0.510895 1.710968 PRA1B2 18.2147 30.807 16.7102 1.691326 0.542416 EXPA8 8.27027 35.6687 11.5062 4.312882 0.322585 AT2G41420 83.5724 139.105 78.1629 1.664485 0.561899 PDF1 84.7711 196.308 85.8681 2.315742 0.437415 AT2G43590 1.40468 4.23596 2.33924 3.015605 0.552234 AT2G43610 31.3569 15.3471 25.2863 0.489433 1.647627 HOL2 9.51838 19.306 7.47259 2.028286 0.38706 AT2G43945 13.9526 6.13069 12.9717 0.439394 2.115863 RPL7C 31.724 13.5993 31.2186 0.428675 2.295603 UCC2 125.888 223.247 120.151 1.773378 0.538198 FLA8 86.5566 200.432 89.8785 2.315618 0.448424 DBP 14.9737 26.1139 19.9627 1.743984 0.764447 AGP16 142.26 354.653 147.531 2.492992 0.415987 PI4KG4 7.08519 3.93737 7.34439 0.555718 1.865303 SDH4 28.489 52.3352 30.1551 1.837032 0.576192 ABCB4 1.76333 0.744662 3.06874 0.422304 4.120984 TIC20-II 29.5599 57.851 30.4024 1.957077 0.525529 AGP26 21.9122 76.3432 21.3682 3.48405 0.279897 AT3G01690 21.2317 36.4001 22.3231 1.714422 0.61327 AT3G01950 52.0403 119.802 57.1404 2.3021 0.476957 AT3G02120 11.1995 22.722 9.42918 2.028841 0.41498 AT3G02200 23.4765 11.6176 20.6201 0.494861 1.774902 AT3G02480 29.3929 83.3594 27.7873 2.836039 0.333343 STE1 35.6427 73.4693 33.5696 2.061272 0.45692 CML18 7.10161 14.5335 8.55144 2.046508 0.588395

AT3G03020 12.9639 24.6416 10.7143 1.900786 0.434805 AT3G03160 107.048 282.567 93.8074 2.639629 0.331983 AT3G03180 8.03247 17.5651 6.68712 2.186762 0.380705 AT3G03790 0.96616 1.9078 1.82512 1.974621 0.956662 RPL23C 728.789 1375.96 641.423 1.888009 0.466164 RPI3 216.472 385.853 191.193 1.782461 0.495507 AT3G05150 3.93874 9.42437 4.62594 2.392737 0.490849 BHLH150 15.6808 34.8957 17.4014 2.225378 0.498669 AT3G06125 124.68 256.011 129.019 2.053345 0.503959 AT3G06170 27.3688 48.949 25.0365 1.788496 0.511481 AT3G06470 29.3512 52.7884 26.478 1.798509 0.501587 CITRX 18.9662 32.7517 15.8973 1.726846 0.485389 AT3G06780 40.9325 103.309 40.8816 2.523887 0.395722 AT3G07350 24.3791 53.2928 27.2134 2.186004 0.510639 AIR12 23.4077 44.5496 21.2828 1.903203 0.477733 emb1990 89.5941 195.207 89.0671 2.178793 0.45627 AT3G07480 105.148 221.364 96.0644 2.105261 0.433966 AT3G08030 63.6908 137.172 66.3368 2.153718 0.483603 AT3G08600 11.5113 30.3145 12.1188 2.633456 0.399769 STR10 35.8995 67.6314 34.4171 1.883909 0.508892 LHCB4.2 2677.54 5341.54 2689.62 1.994943 0.503529 AT3G09035 13.0042 22.9829 10.9236 1.767344 0.475293 AT3G09570 24.2913 42.0627 22.3798 1.731595 0.532058 SCPL7 6.42194 3.25101 6.2391 0.506235 1.919127 NAC053 4.32233 2.12032 3.87139 0.49055 1.825852 RPL41E 400.087 704.769 355.12 1.761539 0.503881 FAD7 53.842 33.8317 45.409 0.628351 1.342203 AT3G11340 8.60209 4.4802 8.17023 0.520827 1.823631 CASP2 9.6279 23.1761 10.2682 2.407181 0.443051 AT3G11590 0.742684 2.05642 1.22419 2.768903 0.595302 AT3G11690 10.4974 19.9811 11.3408 1.903433 0.567576 AT3G11800 34.7764 60.2742 35.7507 1.733193 0.593134 AT3G11810 31.2343 60.4817 28.5724 1.936387 0.472414 CCT7 26.4828 16.4391 26.7334 0.620746 1.626208 PRMT3 2.88601 1.17148 2.75984 0.405917 2.355858 AT3G12502 3.18728 10.0994 2.96349 3.168658 0.293432 AGP12 558.02 1380.39 593.382 2.473729 0.429865 DIR7 5.37504 10.3402 6.04403 1.923744 0.584518 ESM1 3.00485 7.57137 3.13356 2.519716 0.41387 LYSA1 16.8008 9.13655 15.7863 0.543816 1.727818

AT3G14595 40.2872 85.0541 45.7201 2.111194 0.537541 RPL18AC 84.8864 53.9356 74.1434 0.635386 1.374665 DJ1A 148.224 82.4449 132.805 0.556218 1.610833 mMDH2 14.8154 25.5398 13.2733 1.723868 0.51971 TCP4 10.8422 17.6943 11.2626 1.631984 0.63651 AT3G15090 7.66999 3.96768 6.06242 0.517299 1.527951 ERF4 45.9019 91.3096 37.3167 1.989234 0.408683 IAA19 2.70731 6.79447 2.31098 2.509676 0.340127 COX5B-1 55.5768 92.5514 53.2872 1.665288 0.575758 AT3G16220 14.1623 7.87866 14.1139 0.556312 1.791409 TIP2-1 2691.77 9582.98 2796.62 3.560104 0.291832 PBP2 38.5672 61.3451 36.4833 1.590603 0.594722 JAL32 13.1315 7.27906 10.9538 0.554321 1.504837 RALF23 232.454 652.317 216.364 2.80622 0.331685 BHLH147 31.1068 61.0312 32.8729 1.961989 0.538625 TASIR-ARF 50.1836 85.2869 46.8209 1.699497 0.548981 ATHS1 192.833 310.168 161.559 1.60848 0.520876 AT3G18050 54.1273 106.742 55.0012 1.972055 0.515272 BGLU44 8.17326 14.1873 8.97272 1.735819 0.632447 RPL30C 346.725 559.894 314.898 1.614807 0.562424 TIC62 8.75248 15.3008 13.3857 1.748167 0.874837 AT3G19680 20.3804 32.8356 21.2409 1.611136 0.646886 FAX6 26.6316 54.3631 26.5992 2.041301 0.489288 JMJD5 9.50958 15.533 8.01839 1.633405 0.516216 AT3G21390 14.4167 23.5271 13.757 1.631934 0.58473 UGT84A2 11.557 6.92416 10.8487 0.599131 1.566789 RH9 7.4768 3.73169 6.90673 0.499103 1.850832 AOX1A 31.9667 13.539 29.1405 0.423534 2.152338 AT3G22620 36.4243 79.9631 35.7009 2.195323 0.446467 IAA7 84.7924 132.325 85.5595 1.560576 0.646586 AT3G23255 3.97873 9.45341 5.40988 2.375987 0.572268 AT3G23620 7.10493 2.28594 5.16896 0.32174 2.261197 RALFL24 26.4098 49.2408 31.4225 1.86449 0.63814 CER26L 6.29862 12.1952 5.78013 1.93617 0.473968 RCH2 1.84086 0.88347 2.36023 0.479922 2.671545 CLE41 9.12723 24.4853 9.32755 2.682665 0.380945 AT3G24927 13.4034 23.769 10.1595 1.773356 0.427426 AT3G25290 7.38568 12.5682 7.36444 1.701698 0.585958 AT3G25700 5.97093 12.1102 7.06339 2.028193 0.58326 AT3G27270 3.38178 9.97951 2.86888 2.950964 0.287477

OL2 2.93 9.26975 2.86751 3.163737 0.309341 LHCB2.4 805.021 1320.46 875.721 1.64028 0.663194 RPL12C 306.397 593.177 265.164 1.935975 0.447023 AT3G28130 3.09472 6.95912 2.80919 2.248707 0.40367 AT3G28160 4.88397 2.39909 5.8384 0.491217 2.433589 PER31 47.4073 94.9526 46.5995 2.002911 0.490766 AT3G29370 2.58592 18.0104 3.19566 6.964794 0.177434 FAX2 24.3111 38.8153 25.6132 1.596608 0.659874 AT3G44020 35.8534 60.0137 32.4407 1.673864 0.540555 HDT1 17.8703 6.96541 13.8227 0.389776 1.984478 AT3G45230 14.2343 28.9407 15.7674 2.033166 0.544818 PHOT1 4.85815 8.89185 8.77614 1.830295 0.986987 AT3G46490 5.85637 11.165 6.255 1.906471 0.560233 PTAC16 55.1702 94.8954 85.2164 1.720048 0.898003 HSD2 19.6252 44.5532 17.52 2.270204 0.393238 AT3G48020 5.88504 15.9842 4.89441 2.716073 0.306203 T24C20_20 532.224 842.55 551.308 1.583074 0.654333 PME34 15.0597 26.9188 15.5923 1.787473 0.579235 BAM2 18.067 34.9065 19.8287 1.932058 0.568052 F3H 28.8398 13.7256 23.4908 0.475926 1.711459 AT3G51510 103.303 163.317 97.2349 1.580951 0.595375 DGAT2 20.1654 32.8373 21.6254 1.628398 0.658562 AT3G52360 67.3128 118.174 64.8114 1.755595 0.54844 AT3G52460 3.48783 8.09777 3.63728 2.321722 0.449171 RPA3A 9.50922 22.1082 10.6577 2.324923 0.48207 PRXIIE 257.491 580.302 230.926 2.253679 0.397941 AT3G54366 62.9777 161.926 60.3089 2.571164 0.372447 ATHRGP1 4.74544 11.5191 5.25993 2.427404 0.456627 DJ1F 2.71616 5.70035 2.59708 2.09868 0.4556 PIP2-5 6.45631 12.0598 7.9121 1.867909 0.656072 AT3G54940 2.93708 8.9747 3.06243 3.055654 0.341229 DIR24 18.4096 42.7728 18.1577 2.323396 0.424515 SZF1 15.2462 7.84793 12.6491 0.514747 1.611775 AT3G56360 85.561 183.829 79.5313 2.148514 0.432637 HIT3 55.5217 96.4326 55.2523 1.736845 0.572963 AT3G56590 3.87447 7.57818 6.12034 1.955927 0.807627 AT3G56880 48.4756 77.8642 44.4802 1.606256 0.571254 GK-2 12.0515 7.15529 9.89223 0.593726 1.382506 GATA4 61.3497 99.4782 64.3634 1.621494 0.64701 ATL68 21.6992 45.8906 21.2183 2.114852 0.462367

CRF6 20.353 9.87864 18.9654 0.485365 1.919839 AT3G61770 36.6415 69.3528 36.1812 1.892739 0.521698 AT3G62400 87.5923 171.606 86.8542 1.959145 0.506126 CP12-2 148.852 93.5788 110.327 0.62867 1.178974 AT3G62650 85.9274 138.464 80.5683 1.611407 0.581872 GATL7 21.8717 42.7662 22.8631 1.955321 0.534607 PRP3 32.4652 91.6488 34.7777 2.822986 0.379467 PLP9 30.608 55.0734 27.9731 1.799314 0.507924 SCPL40 4.60895 10.4665 4.83875 2.270908 0.462308 AT3G63510 7.82909 13.5676 7.72022 1.732973 0.569019 AT3G01130 56.5692 96.5728 48.3235 1.707162 0.500384 HIR3 66.121 42.3171 57.8272 0.639995 1.366521 AT3G01570 1.84025 8.43398 1.7912 4.583062 0.212379 AT3G01850 25.9661 49.0649 25.3025 1.889575 0.515695 AT3G02640 33.6772 58.644 28.1615 1.741356 0.480211 AT3G03070 79.1829 133.55 72.1566 1.686602 0.540297 AT3G03150 67.6511 107.469 73.4002 1.588577 0.68299 AT3G03270 32.9179 60.748 27.636 1.84544 0.454929 AT3G03341 11.2878 57.7099 13.6835 5.112591 0.237108 LTL1 38.8579 77.0759 35.962 1.983532 0.466579 HEL 25.8271 47.7843 28.2223 1.850161 0.590619 AT3G04860 4.8015 9.52765 4.36154 1.984307 0.457777 AT3G05000 35.2749 72.9788 32.1434 2.068859 0.440448 AT3G05165 37.0335 58.1511 36.4638 1.57023 0.627053 RCI2A 197.018 497.315 198.225 2.524211 0.39859 RCI2B 112.254 191.637 100.531 1.707173 0.524591 AT3G05900 2.06313 5.65842 4.1364 2.742639 0.731017 AT3G05990 15.0943 29.479 15.4236 1.952989 0.523206 LUL4 3.83237 8.01111 4.70758 2.09038 0.587631 AT3G06390 6.21752 19.129 6.03152 3.076629 0.315308 BHLH148 23.6999 43.6685 24.1922 1.842561 0.553997 AT3G06750 103.303 224.175 103.154 2.170073 0.460149 AT3G06770 9.96361 17.6932 9.45318 1.775782 0.534283 AT3G07010 8.63457 14.5076 9.99483 1.680176 0.688938 PEX13 84.1358 147.386 82.0147 1.751763 0.556462 AT3G07570 19.1412 32.7746 17.1383 1.712254 0.522914 AT3G07910 78.68 151.438 76.5132 1.924733 0.505244 AT3G08610 284.408 489.614 243.664 1.72152 0.497666 MT2A 470.454 932.016 426.173 1.981099 0.457259 AT3G10080 15.5671 34.9959 18.6554 2.248068 0.533074

RTNLB8 23.2358 42.6854 21.324 1.837053 0.499562 PP2CA 11.0211 18.2683 10.0743 1.657575 0.551463 RPS14B 274.218 485.802 277.175 1.77159 0.570551 AT3G11530 32.3291 55.2699 29.1542 1.709602 0.527488 NHL1 22.8258 40.9452 23.6687 1.793812 0.578058 AT3G11770 20.4747 34.1852 20.7057 1.669631 0.605692 AT3G11780 52.004 93.0176 56.217 1.788662 0.604369 MTPC2 7.46501 13.6194 7.14751 1.824432 0.524804 AT3G12260 163.385 282.81 144.471 1.730942 0.510841 CHI-B 2.58548 7.77003 2.822 3.005256 0.36319 OST4A 24.0179 46.7837 28.0181 1.947868 0.598886 DRT100 46.3445 154.621 49.1625 3.33634 0.317955 SAUR72 6.14158 16.6367 7.02475 2.708863 0.422244 ABCC3 8.20453 3.31116 12.5189 0.403577 3.78082 SAT3 42.4102 87.6 42.1974 2.065541 0.481705 AT3G13310 159.039 307.853 181.703 1.935708 0.590227 AT3G13700,PRA1F4 3.54078 6.38149 3.63205 1.802284 0.569154 PRA1F3 59.3412 173.056 61.6936 2.916288 0.356495 CWINV1 26.7633 16.1477 27.37 0.603352 1.694978 AT3G13980 2.11539 6.62229 2.56088 3.130529 0.386706 AT3G14060 20.3373 43.0387 17.7544 2.116245 0.412522 AT3G14067 108.795 241.967 110.527 2.224064 0.456785 AT3G14240 48.5006 103.72 48.7497 2.13853 0.470013 AT3G14430 49.7347 112.046 46.1442 2.252874 0.411833 AT3G15480 52.3589 88.2894 48.4349 1.686235 0.548592 AT3G15530 21.218 45.4759 22.2601 2.14327 0.489492 AT3G15670 1.92465 6.59074 1.51308 3.424384 0.229577 NAI2 15.7527 25.1057 24.581 1.593739 0.9791 MFP1 1.9473 5.73522 4.1102 2.945216 0.71666 PSAH1 1550.97 3331.02 1587.92 2.147701 0.476707 UGT88A1 57.9175 90.9387 56.3749 1.570142 0.619922 AT3G16530 14.6867 6.16162 10.3737 0.419537 1.683599 TCTP 2102.37 4301.44 1985.3 2.045996 0.461543 SWEET16 14.1664 25.6595 13.7779 1.811293 0.536951 RPL3B 13.5337 7.72307 11.9096 0.570655 1.542081 AT3G17520 1.17841 4.8616 0.70589 4.125559 0.145197 ASPG1 61.1896 99.4575 63.5125 1.625399 0.638589 AT3G18530,AT3G18535 14.4188 24.8417 14.6255 1.722869 0.588748 PUB25 5.08135 10.1337 4.5122 1.994293 0.445267 AT3G19920 1.4928 4.41872 1.20969 2.960021 0.273765

ASPG2 8.97386 18.1083 8.85541 2.017894 0.489025 GRP-5 17.0602 52.227 17.118 3.061336 0.327762 UCNL 6.69139 19.673 6.80834 2.940047 0.346075 PSBT 3326 7635.34 3179.83 2.295652 0.416462 AT3G21140 7.39496 3.75422 5.73987 0.507673 1.528911 DMP2 12.5948 23.4608 13.2823 1.862737 0.566149 ICL 417.952 1065.19 361.08 2.548594 0.338982 CWLP 193.622 422.064 222.118 2.179835 0.526266 AT3G22240 56.5629 113.722 57.2977 2.01054 0.50384 NRPB5A 18.6698 7.94751 18.2951 0.425688 2.301991 AT3G22600 44.0108 92.0835 47.3283 2.092293 0.513972 LRX6 8.79172 16.4815 9.08201 1.874662 0.551043 ELIP1 53.3706 19.0898 36.1617 0.357684 1.894294 IAA2 10.1339 18.3907 13.5213 1.81477 0.735225 AT3G23170 15.3347 26.4732 12.8241 1.726359 0.484418 AMT1-3 6.23178 13.5979 5.45582 2.182025 0.401225 AT3G24420 15.7521 3.21924 14.5651 0.204369 4.524391 LRX4 25.4488 49.6688 30.6143 1.951715 0.616369 ATL5 114.578 65.7956 99.3829 0.574243 1.510479 TIP1-2 2604.58 7559.08 2607.12 2.902226 0.344899 AT3G28550 4.37142 8.07158 4.50213 1.846443 0.557776 AT3G28950 5.4216 11.4416 7.35332 2.110373 0.642683 EXPA5 11.3749 26.3355 11.6319 2.315229 0.441681 SDR4 24.3562 49.9496 28.116 2.050796 0.562887 AT3G30390 161.281 277.273 154.231 1.719192 0.556242 AT3G41762 22.4734 44.1387 22.9457 1.964042 0.519854 SADHU3-2 3.11397 16.604 2.81458 5.3321 0.169512 AT3G43430 17.5753 34.7187 14.9105 1.975426 0.429466 LTPG2 179.407 501.788 188.995 2.796925 0.376643 AT3G43850 5.63361 19.2129 5.25474 3.410406 0.273501 RPS29A 404.956 839.01 399.25 2.071855 0.475858 AT3G44100 56.316 100.947 58.6153 1.79251 0.580654 AT3G45040 6.47383 11.5832 7.04577 1.789235 0.608275 TET3 36.3868 81.0435 33.5958 2.227277 0.41454 FLA4 14.2206 25.2262 15.3126 1.77392 0.607012 CDF3 2.34222 5.09491 3.1436 2.175248 0.617008 AT3G48115 39.816 75.0435 34.9901 1.884757 0.466264 AT3G48970 6.92051 22.6371 5.40921 3.271016 0.238953 NAC062 5.44493 2.89637 5.864 0.531939 2.024603 AtRLP44 2.14433 6.20613 2.73721 2.894205 0.441049

AT3G50340 6.66948 19.5529 7.04832 2.931698 0.360474 AT3G50685 113.935 236.152 102.452 2.072691 0.433839 UGT72E1 28.1402 45.5615 27.0153 1.619089 0.592941 AT3G50810 5.66607 12.7659 5.17568 2.253043 0.40543 LTP5 1160.11 2029.58 1066.07 1.749472 0.525266 EBP1 16.2798 8.91281 16.7444 0.547477 1.878689 AT3G52060 75.0474 151.705 81.5241 2.021456 0.537386 AT3G52480 24.3399 48.5535 21.041 1.994811 0.433357 AT3G52500 40.8041 81.4966 39.6778 1.997265 0.486864 AT3G52730 134.488 252.452 143.497 1.877134 0.568413 AT3G52760 15.4326 27.3848 14.0268 1.774477 0.512211 S1FA1 25.2707 44.8427 25.3218 1.774494 0.564681 PIP2-1 943.22 1782.5 936.831 1.889803 0.525571 CP29A 345.451 567.002 288.336 1.641338 0.508527 AT3G53850 12.9027 21.8621 13.1042 1.694382 0.599403 AT3G53980 14.7973 30.8104 17.5027 2.082164 0.568078 AT3G54580 8.34614 16.8534 9.3637 2.019305 0.555597 GRXS14 129.126 208.483 144.228 1.61457 0.691797 CHI1 17.8951 9.35659 14.4927 0.522858 1.54893 ECR 64.7105 119.982 59.7221 1.854135 0.497759 PRA1B1 26.4056 51.7988 24.1647 1.96166 0.466511 BASS4 24.8884 45.4405 22.2931 1.82577 0.4906 AT3G56210 4.60106 8.90741 3.56538 1.935947 0.400271 AT3G56290 52.2296 28.812 45.4604 0.551641 1.577829 CAM2 76.1959 119.858 72.9425 1.573024 0.608574 PSRP5 397.78 741.75 388.136 1.864724 0.523271 RPS2D 14.5847 6.52612 10.8563 0.447463 1.663515 RFS2 44.8889 28.8262 47.2514 0.642168 1.639182 U2.3 5.96214 29.7888 6.8817 4.996327 0.231016 AT3G57785 34.8192 64.0341 35.0353 1.839046 0.547135 BZIP61 7.8458 14.0591 8.64094 1.791927 0.614615 VHA-D 82.0657 149.303 76.2887 1.819311 0.510966 AT3G58930 3.32681 1.39204 3.12042 0.418431 2.241617 PIF5 9.90702 18.4505 16.0935 1.862366 0.872253 AT3G59070 1.17752 3.90142 1.29343 3.313252 0.331528 AT3G59340 2.15882 5.41793 2.1302 2.509672 0.393176 AT3G60070 3.27187 6.55757 3.28983 2.004227 0.501684 UCC3 5.33683 14.0658 4.95948 2.63561 0.352591 AT3G61260 32.3786 66.5906 46.5738 2.056624 0.699405 BRH1 32.4068 76.6427 27.6298 2.365019 0.360501

AGP20 50.2525 90.3383 43.2505 1.797688 0.478761 AT3G61820 11.8514 37.1273 10.7533 3.132735 0.289633 AT3G62460 6.8788 2.36625 6.27275 0.343992 2.650924 ATG18A 19.04 32.6513 19.4245 1.714879 0.594907 RPL7AB 137.897 86.7398 122.626 0.629019 1.413722 TIR1 43.1245 71.1771 39.8767 1.650503 0.560246 ATMRK1 27.7032 47.5096 28.5059 1.71495 0.600003 BIL4 66.2979 114.809 60.7302 1.731714 0.528967 AT3G63390 19.2061 39.2979 18.1769 2.046116 0.462541 CYP86A2 60.0953 139.861 62.8033 2.32732 0.449041 AT4G00780 9.99913 5.18066 8.49862 0.518111 1.640451 PAP1 14.2723 6.41894 14.9456 0.449748 2.32836 AT4G04925 22.4324 44.1307 22.7227 1.967275 0.514896 AT4G05150 18.4579 29.3196 17.7525 1.588458 0.605482 AT4G08555 31.6278 7.44512 26.4024 0.235398 3.546269 SAH7 18.999 42.3372 18.2822 2.228391 0.431824 EXO 92.237 42.1692 73.1116 0.457183 1.733768 AGP10 25.4337 45.1359 26.7987 1.774649 0.593734 DIR13 6.81173 15.3802 6.71202 2.257899 0.436407 AT4G11211 56.2023 165.729 51.0548 2.948794 0.308062 AT4G11320 20.8183 42.0688 26.1886 2.020761 0.622518 AIR1B 15.7059 34.236 14.7337 2.179818 0.430357 AIR1 3.87777 13.9345 3.65021 3.593431 0.261955 PSAL 1901.43 3609.95 1906.25 1.898545 0.528054 LRX3 19.3332 29.8682 20.1814 1.544918 0.675682 AMT1-1 52.7167 95.3154 48.366 1.808068 0.507431 AT4G13615 292.89 481.008 269.523 1.642282 0.56033 ATJ20 8.57082 16.222 12.5473 1.892701 0.773474 CID2 21.0074 39.0485 19.4457 1.858797 0.497988 ELIP2 34.8761 12.6009 27.8295 0.361305 2.208533 FdC2 49.5351 86.3449 47.4353 1.743105 0.54937 KMS1 3.69692 1.7748 3.75641 0.480075 2.116526 AT4G16450 82.2633 188.469 81.2784 2.291046 0.431256 AT4G16980 629.94 1446.31 623.116 2.295949 0.430832 TIP2-2 49.7994 151.172 55.9545 3.035619 0.370138 RPA3B 29.5013 56.6449 29.0874 1.920081 0.513504 AT4G18905 2.42503 0.82299 2.29029 0.339373 2.782889 AT4G19200 57.4519 91.0947 52.695 1.585582 0.578464 IRT2 2.23248 6.68139 2.28407 2.992811 0.341856 PP2A1 13.9191 6.33052 14.0254 0.454808 2.215521

AT4G20390 3.49236 10.2114 5.1007 2.923925 0.49951 TIF3D1 10.7971 6.13653 9.1691 0.56835 1.494183 AT4G21105 88.2925 148.641 90.5596 1.683507 0.60925 AT4G21620 57.0399 152.952 56.7664 2.681491 0.371139 AT4G22212 21.9334 41.1063 21.6784 1.874142 0.527374 CYP706A1 3.58236 1.73316 2.91235 0.483804 1.68037 PYR4 9.13307 4.60827 7.1761 0.50457 1.557222 PIP1-5 107.743 192.97 98.7946 1.791021 0.511969 AT4G23470 50.5697 81.7638 46.1651 1.616854 0.564615 AT4G25780 7.74632 16.3557 7.87956 2.111415 0.481762 XTH23 22.3972 9.91521 18.8666 0.442699 1.902794 XTH14 18.2475 31.4947 17.4792 1.725973 0.554989 RPL31B 72.2696 42.0431 57.1242 0.581754 1.358706 D6PKL1 7.19643 3.37448 6.23666 0.46891 1.848184 TIM22-2 13.6764 29.7663 12.2841 2.176472 0.412685 AT2S1 0 7.30276 0.076954 #DIV/0! 0.010538 AT2S3 0.725501 10.0265 0.876737 13.82011 0.087442 RMA3 4.69405 9.6833 4.41498 2.062888 0.455938 AT4G29020 119.183 275.148 114.048 2.308618 0.414497 AT4G29870 23.4824 49.6402 21.6664 2.113932 0.436469 AT4G30660 18.5062 44.2424 17.3934 2.39068 0.393139 AT4G31180 13.7771 8.11858 12.6802 0.589281 1.561874 AT4G31290 33.5573 18.2158 27.3533 0.542827 1.501625 CBL10 16.9048 28.6336 17.6307 1.693815 0.615735 AT4G33610 7.25285 18.5413 7.63348 2.556416 0.411701 AT4G33740 1.10685 2.85175 2.21739 2.576456 0.777554 AT4G34265 31.7119 63.7425 31.2804 2.01005 0.490731 SQS1 14.6368 8.2812 12.1174 0.565779 1.463242 RPS3AB 90.5325 56.8352 76.3417 0.627788 1.343212 AT4G34881 97.7755 159.459 83.0376 1.630869 0.520746 PSAT1 10.1082 5.66089 9.64924 0.560029 1.704545 AT4G37220 13.9827 26.2586 13.9773 1.877935 0.532294 CAD7 11.0737 19.0824 9.70671 1.723218 0.508673 AT4G38080 7.88442 24.4378 7.55307 3.099505 0.309073 AT4G38250 41.891 86.7603 36.9531 2.071096 0.425922 GGR 27.5716 47.4376 27.608 1.720524 0.581986 VHA-C3 106.66 186.815 97.1231 1.7515 0.519889 AT4G00165 51.766 85.177 51.7278 1.645424 0.607298 RPP1B 161.671 302.619 151.042 1.87182 0.499116 WTF1 2.21262 0.664367 2.66217 0.300263 4.007077

RHS13 23.6491 45.9138 20.3897 1.941461 0.444087 LECRK43 1.70812 0.481134 1.34766 0.281675 2.801008 PSAD1 708.229 1342.86 773.953 1.896082 0.576347 TUFA 19.6008 11.287 15.3583 0.575844 1.360707 AT4G04692 60.4399 99.2608 50.7926 1.642306 0.511709 ABCI8 17.8848 10.0634 18.9634 0.562679 1.884393 AT4G07410 5.42912 2.76176 5.57274 0.508694 2.017822 AT4G10480 61.3541 35.8853 49.8513 0.584888 1.389184 AT4G10810 24.0561 47.5212 30.1224 1.975432 0.633873 AZI1 72.3098 137.449 78.9433 1.900835 0.574346 EARLI1 3.14816 9.67247 3.0438 3.07242 0.314687 AT4G12520 39.2597 95.6412 43.1263 2.436116 0.450918 FLA2 93.4892 203.212 93.3272 2.173641 0.45926 ENODL19 49.1173 120.381 42.5197 2.450888 0.353209 AT4G12980 9.37215 16.8115 10.6526 1.793772 0.63365 RALFL32 32.657 74.5348 31.216 2.282353 0.418811 TUBA6 111.72 188.514 105.897 1.687379 0.561746 AT4G15390 2.79941 5.73911 3.01814 2.050114 0.52589 FCA 1.91357 4.22283 2.74397 2.206781 0.649794 AT4G16490 23.4837 38.1664 23.0374 1.625229 0.603604 CYS4 26.1505 45.7732 26.1603 1.750376 0.57152 RGF6 6.5739 48.6087 9.67835 7.394195 0.199107 AHL23 8.66572 15.805 8.02738 1.823853 0.507901 EIF4E1 11.6746 5.62928 9.05901 0.482182 1.609266 CHLI1 76.5351 47.2124 71.0375 0.616873 1.504636 LRX5 7.40535 17.0023 8.15775 2.295948 0.479803 AT4G18970 20.7863 36.7419 19.4852 1.767602 0.530326 AT4G19880 7.77975 3.43068 8.55935 0.440976 2.494943 AT4G20150 117.42 188.528 120.997 1.605587 0.641799 DER2.1 26.3555 43.8696 25.4554 1.664533 0.580251 MSRB9 37.3693 75.3177 32.1844 2.015497 0.427315 AT4G22490 8.99459 20.0128 6.13075 2.224982 0.306341 LPPB 5.32717 13.2551 5.52258 2.488207 0.416638 CPuORF27 2.37518 0.572235 1.68639 0.240923 2.947024 AT4G24830 6.41184 3.31714 6.46729 0.517346 1.949658 AT4G25790 3.56399 11.1472 4.0334 3.12773 0.361831 AGP13 20.5954 64.9006 17.8668 3.151218 0.275295 ATPHOS34 30.6407 50.6604 29.8592 1.65337 0.589399 ENODL2 107.958 209.687 111.718 1.942302 0.532785 BGLU10 2.62239 0.883061 2.75298 0.336739 3.117542

UBC9 46.6593 85.2777 48.2474 1.827668 0.565768 YLMG1-2 22.8584 50.9696 22.1688 2.229797 0.434942 AT4G28230,AT4G28240 20.156 36.572 20.571 1.814447 0.562479 EXPB3 7.42028 18.9754 6.50664 2.557235 0.342899 AT4G29260 17.0924 28.0143 16.2711 1.638992 0.580814 CDEF1 14.721 7.47166 13.1866 0.507551 1.764882 AT4G30450 9.72905 25.2743 7.78272 2.597818 0.30793 AT4G30460 12.6242 23.2581 11.0898 1.842342 0.476815 AT4G30670 30.6337 59.9108 28.7294 1.955715 0.479536 BHLH69 1.55057 4.74586 2.36375 3.06072 0.498066 RPS6A 104.811 66.9839 89.2772 0.639092 1.332816 AT4G31810 3.29511 1.31323 2.93963 0.398539 2.238473 AT4G32260 384.738 651.107 392.004 1.692339 0.602058 SHM3 11.4323 5.01802 11.2291 0.438934 2.237755 AT4G33625 12.2539 20.7238 11.1017 1.6912 0.535698 DAP 21.3183 10.888 17.3822 0.510735 1.596455 SSR16 746.729 1835.68 717.208 2.458295 0.390704 VHA-C3 151.549 313.321 132.027 2.067457 0.421379 AT4G34760 6.76018 14.0793 6.50847 2.082681 0.462272 MRL1 4.54397 2.67012 5.68684 0.587618 2.129807 SBT1.6 11.6896 20.2026 10.4558 1.728254 0.517547 ATJ11 104.794 169.468 121.985 1.617154 0.719811 AT4G36530 18.5147 10.238 18.251 0.552966 1.782672 CYP81D8 2.13778 0.530497 1.20182 0.248153 2.265461 AGP18 37.9637 68.3355 33.6067 1.800022 0.49179 BT5 19.0884 10.6783 18.4215 0.559413 1.725134 XTH7 295.728 550.732 256.778 1.862292 0.466249 AT4G38660 20.2333 35.139 16.9423 1.736691 0.482151 CSP2 181.352 388.042 172.34 2.139717 0.444127 PRP4 141.362 364.343 155.903 2.577376 0.427902 AT4G38932 22.4521 44.8168 20.443 1.996107 0.456146 BBX20 10.2074 2.86434 8.19008 0.280614 2.859325 RD19A 183.411 316.627 192.736 1.726325 0.608716 DRG3 7.20522 3.90044 6.23623 0.541335 1.598853 HTR8 217.759 351.403 199.834 1.613724 0.568675 AGP3 30.385 84.2237 24.1904 2.771884 0.287216 AT5G03460 38.0103 72.8338 38.5773 1.91616 0.529662 AT4 6.15457 13.909 5.07445 2.259947 0.364832 CHI3 14.2055 7.03837 11.6983 0.495468 1.662075 PRA1B3 2.82743 8.13082 3.12 2.875693 0.383725

AT5G05500 13.6446 34.129 12.8394 2.501283 0.376202 UGT76C1 2.76804 0.911256 3.01216 0.329206 3.305504 MEE60 20.7177 42.3889 15.9557 2.046023 0.376412 CASP4 2.8864 9.46037 3.3489 3.277567 0.353992 RPS4B 107.623 64.2035 92.1992 0.596559 1.436046 PRA1B6 11.4069 25.2223 10.9301 2.211144 0.433351 FH16 1.40613 2.84749 2.0047 2.025055 0.704024 AT5G07860 5.49787 2.58474 4.47837 0.470135 1.732619 AT5G07960 26.6046 49.3426 23.6637 1.854664 0.47958 AT5G08570 6.17061 3.0588 6.75156 0.495705 2.207258 RH25 3.48268 1.50271 2.96381 0.431481 1.97231 FLS1 19.9845 5.85095 16.0934 0.292774 2.750562 EXL4 36.5164 62.354 34.4924 1.707562 0.553171 AT5G09580,U2.5 1.83177 6.33483 3.16211 3.458311 0.499163 RH15 15.0991 8.94019 13.3013 0.592101 1.48781 AT5G11280 30.3191 49.6028 23.2279 1.636025 0.468278 AT5G11420 46.3072 80.7103 49.3486 1.742932 0.611429 AGP15 306.819 798.062 318.242 2.601084 0.398769 AT5G13210 3.67312 0.332418 3.54486 0.0905 10.66386 AT5G13370 7.50914 3.56808 7.15887 0.475165 2.006365 AT5G13650 30.2733 18.5969 29.5575 0.6143 1.589378 AT5G13720 16.5973 27.9053 16.5966 1.681316 0.594747 CHS 129.149 46.7048 103.018 0.361635 2.205726 GASA14 15.7213 28.7819 16.1232 1.830758 0.560185 CASP5 9.33572 22.7732 10.22 2.439362 0.448773 AT5G15610 8.35055 4.65757 6.69805 0.557756 1.4381 RPS7C 82.0619 49.0863 69.8522 0.598162 1.423049 ENH1 58.4488 33.5435 53.1483 0.573895 1.584459 AT5G17190 29.8458 52.2825 31.0022 1.751754 0.592975 AT5G17670 29.873 49.0324 28.95 1.641362 0.590426 PSRP6 402.902 999.915 386.402 2.481782 0.386435 SAUR21 4.44485 17.6305 4.53352 3.966501 0.257141 BAM9 60.5659 27.3302 51.0779 0.451247 1.868918 AT5G19120 155.776 242.784 154.733 1.558546 0.637328 AT5G19290 26.6409 47.0002 26.5515 1.764212 0.564923 AT5G19440 19.7279 10.8285 18.1261 0.548893 1.673925 RPT6B 4.3105 1.69436 3.60822 0.393077 2.129547 NUDT19 6.88357 3.74949 5.97734 0.544701 1.594174 AT5G20400 7.22183 2.50666 5.34423 0.347095 2.132012 AT5G21020 155.336 264.806 142.758 1.70473 0.539104

FAR1    1.52022    4.13465    2.08602    2.719771    0.504522
AT5G22580    6.59171    15.7776    5.06537    2.393552    0.321048
AT5G23820    109.478    220.479    95.1611    2.013911    0.431611
AT5G24130    0.629625    3.6622    0.860746    5.816478    0.235035
AT5G24210    5.56383    2.1233    3.83469    0.381626    1.806005
AT5G24610    25.0604    43.0501    24.5771    1.717854    0.570895
KO    6.70382    2.83521    5.10537    0.422925    1.800703
AT5G26270    17.1015    33.8839    16.3069    1.981341    0.481258
AT5G26280    135.942    71.452    120.011    0.525607    1.679603
AT5G26710    4.29941    1.83289    4.69592    0.426312    2.56203
AT5G27400,AT5G27410    8.99714    4.97558    8.33744    0.553018    1.675672
AT5G27430    9.45058    4.37223    7.13323    0.462641    1.631486
AT5G27470    10.1177    5.36228    10.4235    0.52999    1.943856
AT5G35190    1.68049    6.58571    2.76256    3.918922    0.419478
AT5G36230    18.1177    10.3749    15.8937    0.572639    1.531938
PMAT1    13.9934    6.33791    11.1112    0.452921    1.753133
UGD4    12.2695    6.86626    11.6245    0.55962    1.692989
RPL5B    42.437    20.7438    36.8738    0.488814    1.777582
RPS9C    11.284    4.52967    11.8684    0.401424    2.620147
ATL46    6.96823    3.62835    5.07653    0.520699    1.399129
AGP24    66.4712    126.685    66.3912    1.905863    0.524065
GDPD2    85.631    47.5259    71.7824    0.555008    1.510385
AT5G42720    9.24344    15.0914    8.89315    1.632661    0.589286
RPN9A    10.7159    6.16504    11.3168    0.575317    1.835641
AP2M    11.0867    6.06611    10.347    0.547152    1.705706
AT5G46730    10.3602    18.8988    10.7985    1.824173    0.571385
AT5G49440    204.359    380.375    189.205    1.861308    0.497417
ATCP1    37.0834    20.6502    35.3855    0.556858    1.713567
AT5G49640    20.8972    36.0587    21.0704    1.725528    0.584336
CPI1    13.1696    22.5374    11.9744    1.71132    0.531312
AT5G50560    10.2172    25.3391    7.79329    2.480043    0.30756
NBP35    7.1041    3.02408    5.96213    0.425681    1.971552
HSP23.5    6.363    1.23673    5.30908    0.194363    4.292837
SCL8    10.4002    19.8702    9.05242    1.910559    0.455578
NLE1    5.8437    2.70479    5.20994    0.462856    1.92619
PYL8    34.1995    16.7117    25.9003    0.488653    1.54983
AGP22    30.1842    63.9811    26.4864    2.119688    0.413972
BOB1    4.70639    1.89308    3.76799    0.402236    1.990402
AHL    11.9049    6.97069    9.41846    0.585531    1.351152
RBP45A    19.6441    10.0173    16.1365    0.509939    1.610863
COX15    4.94465    1.68556    3.61547    0.340886    2.144967
NMT1,ckl12    14.8079    8.14352    13.5122    0.549944    1.659258
AT5G57625    17.1736    42.4392    18.3374    2.471188    0.432086
AT5G58770    10.6133    4.38644    7.21358    0.413297    1.644518
AT5G59250    36.9226    63.49    31.6959    1.719543    0.499227
ZEU1    14.9381    7.69182    11.8488    0.514913    1.540442
AT5G59500    14.9353    24.9875    13.0017    1.67305    0.520328
AT5G62280    4.73377    10.5964    6.06685    2.23847    0.572539
AT5G62350    45.6743    72.1306    46.4512    1.579238    0.643987
PUB41    5.94387    10.479    5.25914    1.762993    0.501874
AT5G62865    9.77321    23.9193    10.8889    2.447435    0.455235
AT5G63500    17.4793    37.1485    23.3158    2.125285    0.627638
AGP1    37.9566    63.4341    33.1193    1.671227    0.522106
AT5G64400    254.548    405.292    229.915    1.592203    0.567282
ASN2    31.5629    19.5059    27.0212    0.618001    1.385283
XTH6    73.6902    30.6128    61.9603    0.415426    2.024
STR16    176.526    314.118    153.119    1.779443    0.487457
AT5G66052    64.5278    114.665    58.9571    1.776986    0.514168
STR18    19.1143    32.7762    19.5488    1.714748    0.596433
RALFL34    50.1353    110.065    43.3745    2.195359    0.394081
AT5G01075    11.3737    27.2409    12.6354    2.395078    0.463839
AT5G01215    27.9878    61.8776    28.8321    2.210878    0.465954
FER1    75.7973    48.7592    64.5331    0.643284    1.323506
AT5G01790    3.83099    11.1416    4.08593    2.908282    0.366727
AT5G02050    11.3698    4.4147    8.71514    0.388283    1.974118
ABCI20    14.6521    6.10722    11.2265    0.416815    1.838234
MT2B    1872.14    3236.24    1874.48    1.728631    0.579215
RPS28A    240.407    383.689    238.815    1.595998    0.622418
AT5G03970    3.0263    0.779303    2.24955    0.25751    2.886618
AT5G05370    82.3573    149.592    83.8595    1.816378    0.560588
CYP90A1    40.2283    73.4565    38.9931    1.825991    0.530833
ABCG22    7.03705    2.68549    7.69408    0.381622    2.865056
AT5G06640    2.14366    4.98515    2.81035    2.325532    0.563744
AT5G07030    18.0274    36.7588    16.2562    2.039052    0.44224
ATRBL3    8.7441    16.7044    8.18688    1.910362    0.490103
AT5G07360    2.81712    1.24813    2.97193    0.443052    2.381106
AT5G08180    17.9348    7.85906    16.6542    0.438202    2.119108
AT5G08350    5.3976    1.62316    5.1128    0.300719    3.149905
NAC081    10.4638    2.94256    9.78233    0.281213    3.324428
AT5G09225    25.0947    48.2612    24.9849    1.923163    0.517702
AT5G09530    0.894655    3.15445    1.69046    3.525884    0.535897
PPA6    71.3628    44.1336    57.8501    0.61844    1.310795
ASN3    4.02509    1.33927    3.37002    0.33273    2.516311
AGP4    25.586    72.0432    28.484    2.815727    0.395374
OCP3    7.88484    3.35184    6.11876    0.425099    1.825493
AT5G11550    9.83102    19.8543    7.68312    2.019556    0.386975
AT5G11770    100.652    168.046    98.367    1.669574    0.585358
DIT1    182.141    292.785    178.468    1.607463    0.609553
AT5G13890    6.1429    11.7828    6.84609    1.918117    0.581024
AT5G14330    29.0274    54.0009    24.3768    1.860342    0.451415
VDAC3    47.2635    30.0128    40.9487    0.63501    1.364375
CNGC2    14.5289    23.2963    15.1445    1.603446    0.650082
AT5G15780    101.413    188.001    101.853    1.853816    0.541768
AT5G16250    49.2395    98.426    45.0297    1.998924    0.457498
P1    58.2576    36.2573    51.9297    0.622362    1.432255
PSY1    32.2874    15.135    32.214    0.468759    2.128444
ATG5    3.73253    1.38061    2.77345    0.369886    2.008858
AT5G17460    2.46063    5.13262    2.56866    2.085897    0.500458
MEX1    2.16387    4.62087    1.90862    2.135466    0.413043
EMB1241    19.8318    11.3843    16.5423    0.574043    1.45308
PER57    37.3828    61.5838    34.1883    1.647383    0.555151
RPS16C    758.061    1419.2    665.178    1.872145    0.468699
GRXS2    21.6157    11.2584    21.3686    0.520844    1.898014
AT5G19860    56.4775    93.2905    50.1533    1.651817    0.537604
MPC1    42.8975    76.2155    44.9108    1.776689    0.589261
AT5G20165    20.2628    36.9252    22.4084    1.822315    0.606859
GER3    916.5    1993.27    767.598    2.174872    0.385095
COPT5    71.669    143.4    68.9487    2.000865    0.480814
AT5G20740    6.98586    20.6953    6.13598    2.962456    0.296491
CCT2    13.0203    6.63052    12.5992    0.509245    1.900183
AT5G22020    9.37878    4.33191    7.89198    0.461884    1.821825
AT5G22090    10.293    6.21269    10.8103    0.603584    1.740035
RPL10AC    37.6383    21.0451    32.6189    0.559141    1.549952
CAS    95.683    61.3819    85.2774    0.641513    1.389292
SQE6    5.3832    2.65804    4.21907    0.493766    1.587286
AT5G24170    6.4293    12.7479    6.7706    1.982782    0.531115
CRY3    9.86997    5.8254    7.64685    0.590215    1.312674
AT5G24890    8.04916    3.90713    6.82169    0.485408    1.745959
RD22    19.106    32.1536    23.8083    1.682906    0.740455
STP13    13.1844    7.61885    12.237    0.577869    1.606148
RH3    36.4024    23.6257    36.8617    0.649015    1.560237
DEGP14    14.7425    26.7827    12.8385    1.8167    0.479358
PAP26    47.8434    27.9072    43.5921    0.583303    1.562038
CYP28    68.9809    134.231    61.7566    1.945915    0.460077
EIF(ISO)4E    26.027    15.1451    19.5192    0.5819    1.288813
RVE2    38.1125    12.4462    28.7322    0.326565    2.308512
AT5G37475    12.6387    6.22685    9.54164    0.492681    1.532338
PER62    22.1483    13.0639    17.2906    0.589838    1.32354
OLEO2    3.37821    26.5194    3.42568    7.850134    0.129176
AT5G41400    6.50863    17.7033    6.45039    2.719973    0.364361
AT5G42110    58.3149    100.355    55.1257    1.720915    0.549307
DIR2    2.79369    10.1345    3.22603    3.627639    0.318322
DIR1    10.0946    31.8839    6.49315    3.15851    0.20365
CYP74A    13.3872    29.2913    15.2855    2.188008    0.521844
PHT1-1    13.7966    25.3744    12.9375    1.839178    0.509864
AT5G43450    10.0533    5.16477    9.20942    0.513739    1.783123
AT5G43580    133.845    52.9383    122.907    0.395519    2.321703
APS4    26.2103    48.1798    23.5694    1.838201    0.489197
AT5G44260    11.6265    28.1572    12.74    2.421812    0.45246
AT5G44440    2.47032    12.4423    2.51949    5.036716    0.202494
AT5G44450    5.68303    2.67151    4.15978    0.470086    1.557089
RPI4    11.948    5.33406    7.7918    0.44644    1.460763
AT5G46890    22.2585    59.8913    25.9478    2.690716    0.433248
AT5G46900    24.075    60.6031    26.4815    2.517263    0.436966
TIP2-3    6.92651    22.5912    5.3764    3.261556    0.237986
AT5G47620    9.19425    5.30251    8.72468    0.57672    1.645387
RPP1C    362.143    604.531    313.589    1.669316    0.518731
GFA2    3.08635    0.972966    3.36948    0.315248    3.463101
AT5G48120    1.99245    0.807023    1.67276    0.405041    2.072754
NSP5    88.0845    141.058    83.3718    1.601394    0.591046
AT5G48412    32.4621    67.0995    28.195    2.06701    0.420197
AT5G48490    50.3328    88.5948    50.5685    1.76018    0.570784
CRRSP55    12.8083    6.24167    10.7206    0.487314    1.717585
CYTB5-D    119.627    204.131    113.117    1.706396    0.554139
HST    36.2507    59.6931    33.1048    1.646674    0.554583
AT5G52190,AT5G52200    26.0738    44.9808    23.529    1.725134    0.52309
AT5G53540    7.01149    3.16577    5.44232    0.451512    1.719114
AT5G53570    4.03771    2.0452    3.50803    0.506525    1.71525
AT5G54170    5.94653    2.50197    4.23961    0.420745    1.694509
SESA5    2.51618    9.50816    1.01006    3.778808    0.106231
FTSZ1    8.6527    4.59041    7.48218    0.530518    1.629959
TOP1    0.699212    1.68049    1.71265    2.403406    1.019137
AT5G55790    8.86053    17.7366    7.97136    2.001754    0.44943
OXS3    18.3279    9.68721    15.9643    0.52855    1.647977
XTH22    42.3032    16.6276    36.8293    0.393058    2.21495
FLA12    2.6416    6.95603    3.11978    2.633264    0.4485
PIP2-4    11.2975    44.5224    13.0828    3.940907    0.293848
ABCF1    10.2763    4.43623    9.54977    0.431695    2.152677
PNM1    3.51347    1.44654    3.44355    0.411713    2.380543
AT5G61820    30.3953    14.5827    31.4931    0.479768    2.159621
RH7    10.8534    5.13363    12.7291    0.472997    2.479552
BT1    45.3176    26.3315    41.7617    0.581044    1.585998
NAC102    14.1091    7.40109    13.7015    0.524561    1.851281
FLCY    4.71293    2.24846    4.29128    0.477083    1.908542
PER70    18.8506    10.5079    18.5252    0.557431    1.762978
PER71    97.4168    60.8683    94.7902    0.624823    1.5573
DIT2-1    41.5849    65.1831    37.8607    1.56747    0.580836
AT5G64830    2.59776    0.523286    1.38063    0.201437    2.638385
AT5G65207    32.3919    63.2315    33.4978    1.952078    0.529764
AGP7    45.1297    137.711    46.3293    3.051449    0.336424
AT5G65860    4.96295    1.77331    3.40886    0.35731    1.922315
AT5G66330    2.05888    5.62328    1.83373    2.731233    0.326096
RAB18    6.05975    16.4618    4.6228    2.716581    0.28082
AT5G67220    13.9289    7.36436    13.5612    0.528711    1.841463
SBT1.7    60.4965    110.213    62.2041    1.821808    0.564399
CKA1    19.5509    11.3278    15.5693    0.5794    1.374433
ORF107A    1257.73    2563.56    1800.66    2.038244    0.702406
ATP6-1    2.57264    6.82398    3.11274    2.65252    0.456147
COX3    9.15511    25.3004    9.33051    2.763528    0.368789
ATP9    37.0171    100.43    47.4077    2.71307    0.472047
RRN26    1311.15    2442.66    1673.46    1.862991    0.685097
NAD5C    11.2827    38.8778    18.0285    3.445789    0.463722
COX2    12.3457    37.7792    16.2141    3.06011    0.429181
NAD5A    0.923702    3.87612    1.70174    4.196288    0.439032
ORF25    11.1611    38.0942    15.8076    3.413122    0.414961
ORF149    8.55327    19.7719    8.96073    2.311619    0.453205
ATP6-2    2.91813    11.8938    3.7019    4.075829    0.311246
NAD2B    3.8077    9.21312    5.39134    2.419602    0.585181
COX1    14.6184    39.8537    18.3737    2.72627    0.461029
RRN18    1627.31    3709.93    1981.91    2.279793    0.534218
PSBK    328.468    925.203    404.198    2.816722    0.436875
PSBC,PSBD    852.16    1879.6    1254.6    2.205689    0.667482
PSBZ    493.338    1226.49    763.185    2.486105    0.622251
RBCL    2821.55    8350.88    3588.62    2.959678    0.42973
PETA    112.302    200.77    182.722    1.787769    0.910106
PSAJ    360.479    1432.72    780.537    3.974489    0.544794
RPL33    340.456    707.781    461.004    2.078921    0.651337
PSBB    311.376    984.077    435.483    3.160414    0.442529
PSBH    421.442    1007.91    541.867    2.391575    0.537614
PETB    257.61    759.774    363.862    2.949319    0.478908
PETD    312.718    936.685    428.019    2.995302    0.456951
CCSA    17.0307    39.5617    20.7384    2.322964    0.524204
PSBA    1664.75    5675.9    2282.76    3.409461    0.402185
MATK    108.54    189.276    146.636    1.743836    0.774721
ATPF    349.69    568.502    562.59    1.625731    0.989601
ATPH    1980.55    4217.94    2887.57    2.129681    0.684592
ATPI    88.6548    177.648    137.313    2.003817    0.77295
RPS14    457.91    939.221    662.272    2.051104    0.705129
PSAB    234.954    518.139    366.1    2.205278    0.706567
PSAA    210.268    464.995    335.583    2.21144    0.721692
YCF3    44.3804    82.1784    69.1818    1.851682    0.841849
PSBE    428.481    1530.02    544.394    3.5708    0.355808
CLPP1    111.972    174.063    165.638    1.554523    0.951598
NDHF    16.9494    31.642    21.152    1.866851    0.668479
NDHD    69.8394    204.969    91.4814    2.934862    0.446318
PSAC    263.906    826.728    366.04    3.132661    0.442757
NDHE    114.933    276.938    176.252    2.40956    0.636431
NDHG    96.6026    215.473    137.544    2.230509    0.638335
NDHA    79.1016    142.216    132.802    1.79789    0.933805
YCF1.2    6.40195    10.3159    10.1356    1.611368    0.982522

Table VII – Gene names and transcript abundances for genes

significantly changed between SOP and DMSO treated Col seedlings.

The gene names, transcript abundances in FPKM, and relative fold changes

are shown for all genes that changed significantly in abundance

between SOP- and DMSO-treated Col plants. The p-value cutoff used was

0.01.

Gene name    DMSO (FPKM)    htl-3 (FPKM)    SOP (FPKM)    htl-3/DMSO (fold change)    SOP/DMSO (fold change)
AT1G22930    2.96201    2.87405    5.74326    0.970304    1.998316
THO2    0.611274    0.547117    1.45438    0.895044    2.658261
ABCC1    3.56201    3.28279    7.49802    0.921612    2.284039
ACC1    3.54106    3.69707    7.51247    1.044057    2.032006
PATL1    16.8209    26.0629    30.6726    1.549436    1.176868
SUS2    3.88789    3.6836    7.81713    0.947455    2.122144
AT1G01320    6.90127    6.49993    12.9535    0.941845    1.992868
AT1G02890    2.4863    2.94827    5.11689    1.185806    1.735557
SPI    0.982814    0.814681    2.00027    0.828927    2.45528
AT1G07135    19.5687    10.2712    8.5914    0.524879    0.836455
GWD1    1.68351    2.80082    3.99302    1.663679    1.425661
VALRS    2.04342    1.89063    4.4085    0.925228    2.331762
AT1G18270    4.05805    4.33554    7.28759    1.06838    1.680896
emb1507    3.50744    3.3509    6.9366    0.955369    2.070071
ZAT10    22.4283    10.2741    10.9249    0.458086    1.063344
NPF7.3    5.522    7.85858    10.7436    1.42314    1.367117
AT1G48650    1.67896    1.02647    3.3387    0.611373    3.252604
SAB    0.902853    0.686137    2.10679    0.759965    3.070509
CRWN1    1.16621    2.53137    2.78171    2.170595    1.098895
UPL2    2.89046    2.82994    6.12349    0.979062    2.163823
HAC1    1.36644    1.14117    2.9062    0.835141    2.546685
GC5    1.62395    2.51998    3.32758    1.55176    1.320479
SYD    0.92649    0.832584    2.0858    0.898643    2.505213
MOR1    1.66547    1.6545    3.20326    0.993413    1.936089
BRM    1.21421    1.21419    2.63933    0.999984    2.173737
AT2G17930    1.02045    0.963686    2.07537    0.944374    2.153575
BGAL8    1.70197    2.92685    3.64962    1.719684    1.246945
SPA1    2.85829    2.61313    5.33111    0.914228    2.040124
MED14    1.33118    2.0361    2.63367    1.529545    1.293488
EMB2454    2.15293    3.10598    4.73929    1.442676    1.52586
POLIB    1.24255    0.79984    2.7077    0.643709    3.385302
TIC    5.23322    7.29007    9.50333    1.393037    1.303599
CHUP1    6.09691    8.27557    11.5076    1.357338    1.390551
PHOT1    4.85815    8.89185    8.77614    1.830295    0.986987
FMT    3.58788    3.38559    6.66435    0.943619    1.968446
NRPA1    0.736614    0.392141    1.83565    0.532356    4.681097
EIF4G    2.81103    3.57424    5.92922    1.271505    1.658876
BIG2    1.52711    1.87313    3.29663    1.226585    1.759958
BIG    3.21434    4.00507    7.12628    1.246001    1.779315
CHC2    8.69092    10.1471    15.8878    1.167552    1.565748
AT3G11964    1.03496    0.531017    2.12726    0.51308    4.006011
PA200    1.11087    1.38253    2.23542    1.244547    1.616905
TPR4    3.0626    4.05972    5.54533    1.32558    1.365939
FHY3    1.93893    2.6531    4.10318    1.368332    1.546561
AT3G48050    2.1561    2.5375    4.25991    1.176893    1.678782
AT3G50370    3.67869    4.89003    6.55496    1.329286    1.340474
ATXR3    0.742567    0.637738    1.67672    0.858829    2.629167
emb2726    7.85608    10.5189    14.1668    1.33895    1.346795
SPT16    1.17834    1.39323    2.71593    1.182367    1.949377
ATG2484-1    1.23043    1.19005    2.88385    0.967182    2.423302
TPP2    3.99397    4.83751    7.98659    1.211203    1.650971
AT4G28080    7.94721    11.6747    17.211    1.469031    1.474213
NRPB1    2.11939    1.91903    4.31529    0.905463    2.248683
GLU1    31.8052    31.7784    58.2357    0.999157    1.832556
AT5G47690    1.83254    1.82907    3.60893    0.998106    1.973096
GLT1    7.38894    11.086    13.493    1.500351    1.217121
CHLH    50.9146    49.0102    96.1953    0.962596    1.962761
EMB2773    0.594933    0.614668    1.42685    1.033172    2.321334
XIK    1.38437    1.86785    2.89382    1.349242    1.549279
CIPK20    1.13941    1.66555    3.56002    1.461765    2.137444
CLPC1    22.7358    26.7165    41.4085    1.175085    1.549922
ABCF5    5.36577    8.68917    12.0726    1.619371    1.389385
ATMG00090,RPL16    30.8296    27.663    65.2085    0.897287    2.357246
YCF4    105.854    161.023    189.062    1.52118    1.17413
ATPA    175.323    247.505    361.172    1.411709    1.459251
RPOC2    11.5557    11.7649    23.5801    1.018104    2.004275
RPOC1    16.4305    15.965    30.1655    0.971669    1.889477
RPOB    11.6674    12.3861    21.478    1.061599    1.734041
RPOA    22.5134    35.7307    45.5145    1.587086    1.273821
RPS11    40.3927    50.0671    81.9206    1.239509    1.636216
RPS8    28.4381    26.6907    74.165    0.938554    2.778683
RPL14    66.9233    70.6621    132.054    1.055867    1.868809
RPL16    71.3209    81.9882    164.417    1.149568    2.005374
RPL22,RPS3    75.3633    90.065    160.892    1.195078    1.786399
NDHH    22.0243    28.3632    48.111    1.287814    1.696247
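
As a minimal sketch of how the htl-3/DMSO fold-change column in Table VII relates to the FPKM columns, the fold change can be taken as a simple ratio of two abundances. The function name is hypothetical, and the sketch assumes that no pseudocount or log transformation was applied, which the source does not state.

```python
def fold_change(numerator_fpkm: float, denominator_fpkm: float) -> float:
    """Return the ratio of two FPKM abundances (hypothetical helper)."""
    if denominator_fpkm <= 0:
        # A ratio is undefined for a zero or negative reference abundance.
        raise ValueError("denominator FPKM must be positive")
    return numerator_fpkm / denominator_fpkm


# FPKM values for AT1G22930 as listed in Table VII.
dmso_fpkm = 2.96201
htl3_fpkm = 2.87405

ratio = fold_change(htl3_fpkm, dmso_fpkm)
print(f"AT1G22930 htl-3/DMSO fold change: {ratio:.6f}")
```

The printed ratio can be compared against the htl-3/DMSO entry for the same gene in the table.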

Table VIII – GO biological process enrichment or depletion in htl-3

transcripts.

Over- or under-represented GO biological processes among genes whose

transcripts are found at significantly different levels in htl-3 compared to Col

are shown. The cutoff for significance in the comparison of htl-3 to Col was

0.01. Bonferroni correction for multiple hypothesis testing was applied to GO

terms.

GO biological process complete    Fold enrichment    P-value
electron transport chain (GO:0022900)    6.06    2.18E-03
photosynthesis, light reaction (GO:0019684)    5.46    6.32E-03
photosynthesis (GO:0015979)    4.92    7.67E-07
cellular respiration (GO:0045333)    4.42    4.93E-02
lipid transport (GO:0006869)    4.28    4.08E-04
energy derivation by oxidation of organic compounds (GO:0015980)    4.27    3.30E-02
lipid localization (GO:0010876)    4.24    1.12E-04
response to karrikin (GO:0080167)    4.11    2.46E-02
generation of precursor metabolites and energy (GO:0006091)    4.06    1.00E-06
response to cytokinin (GO:0009735)    3.71    1.79E-03
response to cold (GO:0009409)    3.24    4.58E-03
peptide biosynthetic process (GO:0043043)    2.33    7.52E-04
peptide metabolic process (GO:0006518)    2.32    1.49E-04
translation (GO:0006412)    2.3    1.50E-03
amide biosynthetic process (GO:0043604)    2.27    8.95E-04
cellular amide metabolic process (GO:0043603)    2.17    5.93E-04
organonitrogen compound biosynthetic process (GO:1901566)    2.07    5.44E-05
response to abiotic stimulus (GO:0009628)    1.95    4.11E-05
organonitrogen compound metabolic process (GO:1901564)    1.88    3.90E-05
response to hormone (GO:0009725)    1.8    2.16E-02
response to endogenous stimulus (GO:0009719)    1.75    2.98E-02
response to chemical (GO:0042221)    1.74    6.19E-05
response to organic substance (GO:0010033)    1.73    1.07E-02
single-organism metabolic process (GO:0044710)    1.39    2.04E-02
Unclassified (UNCLASSIFIED)    1.09    0.00E+00
macromolecule modification (GO:0043412)    0.5    8.38E-03
protein modification process (GO:0036211)    0.49    1.64E-02
cellular protein modification process (GO:0006464)    0.49    1.64E-02
phosphorylation (GO:0016310)    0.27    2.90E-04
protein phosphorylation (GO:0006468)    < 0.2    3.90E-05
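
As a minimal sketch of the Bonferroni correction named in the captions of Tables VIII and IX, each raw p-value can be multiplied by the number of tests performed and capped at 1. The function name, the example p-values, and the number of tests are illustrative assumptions; the source does not state how many GO terms were tested, and the enrichment tool that was used may implement the correction differently.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)  # number of hypotheses tested together
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three GO terms (not taken from the thesis).
raw_p = [0.001, 0.02, 0.4]
adjusted_p = bonferroni(raw_p)

# Flag terms that stay below an illustrative 0.05 threshold after correction.
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

With only three hypothetical tests, the first term remains below the threshold after correction while the other two do not; with thousands of GO terms the multiplier is correspondingly larger.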

Table IX – GO biological process enrichment or depletion in SOP

treated plant transcripts.

Over- or under-represented GO biological processes among genes whose

transcripts are found at significantly different levels in SOP-treated Col

compared to DMSO-treated Col are shown. The cutoff for significance in the

comparison of SOP-treated Col to DMSO-treated Col was 0.01.

Bonferroni correction for multiple hypothesis testing was applied to GO

terms.

GO biological process complete    Fold enrichment    P-value
DNA-templated transcription, elongation (GO:0006354)    > 100    7.37E-05
organelle organization (GO:0006996)    5.91    4.25E-03
response to hormone (GO:0009725)    4.81    2.94E-02
response to organic substance (GO:0010033)    4.58    7.68E-03
cellular component organization (GO:0016043)    4.15    4.18E-03
response to chemical (GO:0042221)    3.85    4.73E-03
cellular component organization or biogenesis (GO:0071840)    3.67    1.82E-02
nitrogen compound metabolic process (GO:0006807)    2.7    1.45E-02
response to stimulus (GO:0050896)    2.49    1.86E-02
single-organism cellular process (GO:0044763)    2.34    1.24E-02
cellular process (GO:0009987)    1.9    4.47E-03
Unclassified (UNCLASSIFIED)    1.06    0.00E+00

References

1. Morrison, K. L. & Weiss, G. A. The origins of chemical biology. Nat. Chem. Biol.

2, 3–6 (2006).

2. Paweletz, N. Walther Flemming: pioneer of mitosis research. Nat. Rev. Mol. Cell

Biol. 2, 72–75 (2001).

3. Ding, X. et al. Activatable Molecular Probes for Cancer Imaging. Curr Top Med

Chem. 12, 1135–1144 (2010).

4. Ehrlich, P. Chemotherapeutics: Scientific principles, methods, and results. Lancet

353–359. (1913).

5. Pauling, L. & Corey, R. B. Configurations of Polypeptide Chains With Favored

Orientations Around Single Bonds. Proc Natl Acad Sci U S A 37, (1951).

6. Mackinnon, A. L. & Taunton, J. Target Identification by Diazirine Photo-Cross-

linking and Click Chemistry. Curr. Protoc. Chem. Biol. 1, 55–73 (2009).

7. Niphakis, M. J. et al. A Global Map of Lipid-Binding Proteins and Their

Ligandability in Cells. Cell 161, 1668–1680 (2015).

8. Weerapana, E. et al. Quantitative reactivity profiling predicts functional cysteines

in proteomes. Nature 468, 790–795 (2010).

9. Stockwell, B. R. Chemical genetics: ligand-based discovery of gene function. Nat.

Rev. Genet. 1, 116–25 (2000).

10. Cutler, S. & McCourt, P. Dude, where’s my phenotype? Dealing with redundancy

in signaling networks. Plant Physiol. 138, 558–559 (2005).

11. Lehár, J., Stockwell, B. R., Giaever, G. & Nislow, C. Combination chemical

genetics. Nat. Chem. Biol. 4, 674–81 (2008).

12. Jacob, F. & Monod, J. Genetic regulatory mechanisms in the synthesis of

proteins. J. Mol. Biol. 3, 318–356 (1961).

13. Germain, P., Iyer, J., Zechel, C. & Gronemeyer, H. Co-regulator recruitment and

the mechanism of retinoic acid receptor synergy. Nature 415, 187–192 (2002).

14. Park, S.-Y. et al. Abscisic acid inhibits type 2C protein phosphatases via the

PYR/PYL family of START proteins. Science. 324, 1068–71 (2009).

15. Miller, A. L., Kress, B. C., Lewis, L., Stein, R. & Kinnon, C. Effect of Tunicamycin

and Cycloheximide on the Secretion of Acid Hydrolases from I-Cell Cultured

Fibroblasts. 971–975 (1980).

16. Kliegman, J. I. et al. Chemical genetics of rapamycin-insensitive TORC2 in S.

cerevisiae. Cell Rep. 5, 1725–1736 (2013).

17. Giaever, G. et al. Genomic profiling of drug sensitivities via induced

haploinsufficiency. Nat. Genet. 21, 278–283 (1999).

18. Giaever, G. et al. Chemogenomic profiling: identifying the functional interactions

of small molecules in yeast. Proc. Natl. Acad. Sci. U. S. A. 101, 793–798 (2004).

19. Heitman, J., Movva, N. R. & Hall, M. N. Targets for cell cycle arrest by the

immunosuppressant rapamycin in yeast. Science. 253, 905 (1991).

20. Ho, C. H. et al. A molecular barcoded yeast ORF library enables mode-of-action

analysis of bioactive compounds. Nat. Biotechnol. 27, 369–377 (2009).

21. Hahn, W. C. et al. Creation of human tumour cells with defined genetic elements.

Nature 400, 464–8 (1999).

22. Yagoda, N. et al. RAS-RAF-MEK-dependent oxidative cell death involving

voltage-dependent anion channels. Nature 447, 864–8 (2007).

23. Dar, A. C., Das, T. K., Shokat, K. M. & Cagan, R. L. Chemical genetic discovery

of targets and anti-targets for cancer polypharmacology. Nature 486, 80–4

(2012).

24. Hopkins, A. L. & Groom, C. R. The druggable genome. Nat. Rev. Drug Discov. 1,

727–30 (2002).

25. Jassal, B. et al. The systematic annotation of the three main GPCR families in

Reactome. Database (Oxford). 2010, (2010).

26. Arkin, M. R. & Wells, J. A. Small-molecule inhibitors of protein-protein

interactions: progressing towards the dream. Nat. Rev. Drug Discov. 3, 301–317

(2004).

27. Howe, J. A. et al. Selective small-molecule inhibition of an RNA structural

element. Nature 526, 672–677 (2015).

28. Robinson-Rechavi, M., Carpentier, A. S., Duffraisse, M. & Laudet, V. How many

nuclear hormone receptors are there in the human genome? Trends Genet. 17,

554–6 (2001).

29. Grossmann, K. Auxin herbicides: Current status of mechanism and mode of

action. Pest Manag. Sci. 66, 113–120 (2010).

30. Steinrucken, H. C. & Amrhein, N. The herbicide glyphosate is a potent inhibitor of

5-enolpyruvylshikimic acid-3-phosphate synthase. Biochem. Biophys. Res.

Commun. 94, 1207–1212 (1980).

31. Seddon, G. et al. Drug design for ever, from hype to hope. J. Comput. Aided.

Mol. Des. 26, 137–150 (2012).

32. Anderson, A. C. The process of structure-based drug design. Chem. Biol. 10,

787–797 (2003).

33. Koshland, D. E. Application of a Theory of Enzyme Specificity to Protein

Synthesis. Proc. Natl. Acad. Sci. U. S. A. 44, 98–104 (1958).

34. Chen, C. et al. Structural basis for molecular recognition of folic acid by folate

receptors. Nature 500, 486–9 (2013).

35. Stockwell, B. R. Exploring biology with small organic molecules. Nature 432,

846–854 (2004).

36. Acker, M. G. & Auld, D. S. Considerations for the design and reporting of enzyme

assays in high-throughput screening applications. Perspect. Sci. 1, 56–73 (2014).

37. Zhao, Y. et al. Chemical genetic interrogation of natural variation uncovers a

molecule that is glycoactivated. Nat. Chem. Biol. 3, 716–721 (2007).

38. Zhitomirsky, B. & Assaraf, Y. G. Lysosomes as mediators of drug resistance in

cancer. Drug Resist. Updat. 24, 23–33 (2016).

39. Sharifi, M. & Ghafourian, T. Estimation of biliary excretion of foreign compounds

using properties of molecular structure. AAPS J. 16, 65–78 (2014).

40. Borgna, J. L. & Rochefort, H. Hydroxylated metabolites of tamoxifen are formed

in vivo and bound to estrogen receptor in target tissues. J. Biol. Chem. 256, 859–

868 (1981).

41. Schenone, M., Dančík, V., Wagner, B. K. & Clemons, P. A. Target identification

and mechanism of action in chemical biology and drug discovery. Nat. Chem.

Biol. 9, 232–40 (2013).

42. Harding, M. W., Galat, A., Uehling, D. E. & Schreiber, S. L. A receptor for the

immunosuppressant FK506 is a cis-trans peptidyl-prolyl isomerase. Nature 341,

758–760 (1989).

43. Golde, T. E., Schneider, L. S. & Koo, E. H. Anti-Aβ therapeutics in Alzheimer’s

disease: The need for a paradigm shift. Neuron 69, 203–213 (2011).

44. Yu, S., Liang, Y., Palacino, J., Difiglia, M. & Lu, B. Drugging unconventional

targets: Insights from Huntington’s disease. Trends Pharmacol. Sci. 35, 53–62

(2014).

45. Hawver, L. A., Jung, S. A. & Ng, W. Specificity and complexity in bacterial

quorum-sensing systems. FEMS Microbiol. Rev. 1–15 (2016). doi:10.1093/femsre/fuw014

46. Alvaro, C. G. & Thorner, J. Heterotrimeric G Protein-coupled

Receptor Signaling in Yeast Mating Pheromone Response. J. Biol. Chem. 291,

7788–7795 (2016).

47. Motomitsu, A., Sawa, S. & Ishida, T. Plant peptide hormone signalling. Essays

Biochem. 58, 115–131 (2015).

48. Adamowski, M. & Friml, J. PIN-Dependent Auxin Transport: Action, Regulation,

and Evolution. Plant Cell Online 27, 20–32 (2015).

49. Wang, Z. Y., Seto, H., Fujioka, S., Yoshida, S. & Chory, J. BRI1 is a critical

component of a plasma-membrane receptor for plant steroids. Nature 410, 380–

383 (2001).

50. Sun, Y. et al. Structure reveals that BAK1 as a co-receptor recognizes the BRI1-

bound brassinolide. Cell Res. 23, 1326–9 (2013).

51. Wang, X. et al. Identification and Functional Analysis of in Vivo Phosphorylation

Sites of the Arabidopsis BRASSINOSTEROID-INSENSITIVE1 Receptor Kinase.

Plant Cell 17, 1685–1703 (2005).

52. Schaller, G. E. & Bleecker, A. B. Ethylene-binding sites generated in yeast

expressing the Arabidopsis ETR1 gene. Science. 270, 1809–1811 (1995).

53. Schaller, G. E., Ladd, A. N., Lanahan, M. B., Spanbauer, J. M. & Bleecker, A. B.

The ethylene response mediator ETR1 from Arabidopsis forms a disulfide-linked

dimer. Journal of Biological Chemistry 270, 12526–12530 (1995).

54. Ju, C. et al. CTR1 phosphorylates the central regulator EIN2 to control ethylene

hormone signaling from the ER membrane to the nucleus in Arabidopsis. Proc.

Natl. Acad. Sci. U. S. A. 109, 19486–91 (2012).

55. Inoue, T. et al. Identification of CRE1 as a cytokinin receptor from Arabidopsis.

Nature 409, 1060–1063 (2001).

56. Rashotte, A. M. et al. A subset of Arabidopsis AP2 transcription factors mediates

cytokinin responses in concert with a two-component pathway. Proc. Natl. Acad.

Sci. U. S. A. 103, 11081–5 (2006).

57. Dharmasiri, N., Dharmasiri, S. & Estelle, M. The F-box protein TIR1 is an auxin

receptor. Nature 435, 441–5 (2005).

58. Kepinski, S. & Leyser, O. The Arabidopsis F-box protein TIR1 is an auxin

receptor. Nature 435, 446–451 (2005).

59. Gray, W. M., Kepinski, S., Rouse, D., Leyser, O. & Estelle, M. Auxin regulates

SCF(TIR1)-dependent degradation of AUX/IAA proteins. Nature 414, 271–276 (2001).

60. Katsir, L., Schilmiller, A. L., Staswick, P. E., He, S. Y. & Howe, G. A. COI1 is a

critical component of a receptor for jasmonate and the bacterial virulence factor

coronatine. Proc. Natl. Acad. Sci. U. S. A. 105, 7100–7105 (2008).

61. Thines, B. et al. JAZ repressor proteins are targets of the SCF(COI1) complex

during jasmonate signalling. Nature 448, 661–665 (2007).

62. Yoshida, R. et al. ABA-activated SnRK2 protein kinase is required for

dehydration stress signaling in Arabidopsis. Plant Cell Physiol. 43, 1473–1483

(2002).

63. Furihata, T. et al. Abscisic acid-dependent multisite phosphorylation regulates the

activity of a transcription activator AREB1. Proc Natl Acad Sci U S A 103, 1988–

1993 (2006).

64. Ueguchi-Tanaka, M. et al. GIBBERELLIN INSENSITIVE DWARF1 encodes a

soluble receptor for gibberellin. Nature 437, 693–8 (2005).

65. Murase, K., Hirano, Y., Sun, T. & Hakoshima, T. Gibberellin-induced DELLA

recognition by the gibberellin receptor GID1. Nature 456, 459–463 (2008).

66. Ueguchi-Tanaka, M. et al. Molecular interactions of a soluble gibberellin receptor,

GID1, with a rice DELLA protein, SLR1, and gibberellin. Plant Cell 19, 2140–55

(2007).

67. Dill, A., Jung, H. S. & Sun, T. P. The DELLA motif is essential for gibberellin-

induced degradation of RGA. Proc. Natl. Acad. Sci. U. S. A. 98, 14162–7 (2001).

68. Silverstone, A. L. et al. Repressing a repressor: gibberellin-induced rapid

reduction of the RGA protein in Arabidopsis. Plant Cell 13, 1555–1566 (2001).

69. Cook, C., Whichard, L. & Wall, M. Germination stimulants. The structure of

Strigol. J. Am. Chem. Soc. 1447, 6198–6199 (1964).

70. Muller, S., Hauck, C. & Schildknecht, H. Germination stimulants produced by

Vigna unguiculata Walp; cv Saunders Upright. J. Plant Growth Regul. 11, 77–84

(1992).

71. Akiyama, K., Matsuzaki, K. & Hayashi, H. Plant sesquiterpenes induce hyphal

branching in arbuscular mycorrhizal fungi. Nature 435, 824–7 (2005).

72. Yoneyama, K., Yoneyama, K., Takeuchi, Y. & Sekimoto, H. Phosphorus

deficiency in red clover promotes exudation of orobanchol, the signal for

mycorrhizal symbionts and germination stimulant for root parasites. Planta 225,

1031–8 (2007).

73. Beveridge, C. A., Ross, J. J. & Murfet, I. C. Branching Mutant rms-2 in Pisum

sativum (Grafting Studies and Endogenous Indole-3-Acetic Acid Levels). Plant

Physiol. 104, 953–959 (1994).

74. Stirnberg, P., van De Sande, K. & Leyser, H. M. O. MAX1 and MAX2 control

shoot lateral branching in Arabidopsis. Development 129, 1131–1141 (2002).

75. Zou, J. et al. The rice HIGH-TILLERING DWARF1 encoding an ortholog of

Arabidopsis MAX3 is required for negative regulation of the outgrowth of axillary

buds. Plant J. 48, 687–698 (2006).

76. Napoli, C. Highly Branched Phenotype of the Petunia dad1-1 Mutant Is Reversed

by Grafting. Plant Physiol. 111, 27–37 (1996).

77. Ongaro, V. & Leyser, O. Hormonal control of shoot branching. J. Exp. Bot. 59,

67–74 (2008).

78. Sorefan, K. et al. MAX4 and RMS1 are orthologous dioxygenase-like genes that

regulate shoot branching in Arabidopsis and pea. Genes Dev. 17, 1469–1474

(2003).

79. Umehara, M. et al. Inhibition of shoot branching by new terpenoid plant

hormones. Nature 455, 195–200 (2008).

80. Gomez-Roldan, V. et al. Strigolactone inhibition of shoot branching. Nature 455,

189–94 (2008).

81. Woo, H. R. et al. ORE9, an F-box protein that regulates leaf senescence in

Arabidopsis. Plant Cell 13, 1779–90 (2001).

82. Arite, T. et al. d14, a Strigolactone-Insensitive Mutant of Rice, Shows an

Accelerated Outgrowth of Tillers. Plant Cell Physiol. 50, 1416–1424 (2009).

83. Hamiaux, C. et al. DAD2 is an α/β hydrolase likely to be involved in the

perception of the plant branching hormone, strigolactone. Curr. Biol. 1–5 (2012).

84. Toh, S., Holbrook-Smith, D., Stokes, M. E., Tsuchiya, Y. & McCourt, P. Detection

of Parasitic Plant Suicide Germination Compounds Using a High-Throughput

Arabidopsis HTL/KAI2 Strigolactone Perception System. Chem. Biol. 21, 988–98

(2014).

85. Waters, M. T. et al. Specialisation within the DWARF14 protein family confers

distinct responses to karrikins and strigolactones in Arabidopsis. Development

139, 1285–1295 (2012).

86. Nakamura, H. et al. Molecular mechanism of strigolactone perception by

DWARF14. Nat. Commun. 4, 2613 (2013).

87. Zhou, F. et al. D14–SCFD3-dependent degradation of D53 regulates

strigolactone signalling. Nature 1–2 (2013).

88. Jiang, L. et al. DWARF 53 acts as a repressor of strigolactone signalling in rice.

Nature (2013).

89. Stanga, J. P., Smith, S. M., Briggs, W. R. & Nelson, D. C. SUPPRESSOR OF

MAX2 1 (SMAX1) controls seed germination and seedling development in

Arabidopsis thaliana. Plant Physiol. (2013).

90. Wang, L. et al. Strigolactone Signaling in Arabidopsis Regulates Shoot

Development by Targeting D53-Like SMXL Repressor Proteins for Ubiquitination

and Degradation. Plant Cell 27, 1–16 (2015).

91. Wang, Y. et al. Strigolactone/MAX2-Induced Degradation of Brassinosteroid

Transcriptional Effector BES1 Regulates Shoot Branching. Dev. Cell 27, 681–688

(2013).

92. Tsuchiya, Y. et al. Probing strigolactone receptors in Striga hermonthica with

fluorescence. Science. 349, 864–868 (2015).

93. Toh, S. et al. Structure-function analysis identifies highly sensitive strigolactone

receptors in Striga. Science. 350, 203–208 (2015).

94. Yamaguchi, S. Gibberellin Metabolism and its Regulation. Annu. Rev. Plant Biol.

59, 225–251 (2008).

95. Scaffidi, A. et al. Strigolactone hormones and their stereoisomers signal through

two related receptor proteins to induce different physiological responses in

Arabidopsis. Plant Physiol. (2014).

96. Zhao, L.-H. et al. Destabilization of strigolactone receptor DWARF14 by binding

of ligand and E3-ligase signaling effector DWARF3. Cell Res. 14, 1–18 (2015).

97. Zhao, L.-H. et al. Crystal structures of two phytohormone signal-transducing α/β

hydrolases: karrikin-signaling KAI2 and strigolactone-signaling DWARF14. Cell

Res. 1–4 (2013). doi:10.1038/cr.2013.19

98. Kagiyama, M. et al. Structures of D14 and D14L in the strigolactone and karrikin

signaling pathways. Genes to cells 1–14 (2013).

99. Zhao, L.-H. et al. Crystal structures of two phytohormone signal-transducing α/β

hydrolases: karrikin-signaling KAI2 and strigolactone-signaling DWARF14. Cell

Res. 23, 436–9 (2013).

100. de Saint Germain, A. et al. An histidine covalent receptor and butenolide complex

mediates strigolactone perception. Nat. Chem. Biol. (2016).

doi:10.1038/nchembio.2147

101. Zhao, L.-H. et al. Destabilization of strigolactone receptor DWARF14 by binding

of ligand and E3-ligase signaling effector DWARF3. Cell Res. 25, 1219–1236

(2015).

102. Cardoso, C. et al. Natural variation of rice strigolactone biosynthesis is

associated with the deletion of two MAX1 orthologs. Proc. Natl. Acad. Sci. U. S.

A. 2–7 (2014).

103. Jamil, M., Charnikhova, T., Houshyani, B., van Ast, A. & Bouwmeester, H. J.

Genetic variation in strigolactone production and tillering in rice and its effect on

Striga hermonthica infection. Planta 235, 473–484 (2012).

104. Berner, D., Ikie, F. O. & Green, J. M. ALS-inhibiting herbicide seed treatments

control Striga hermonthica in ALS-modified corn (Zea mays). Weed Technol. 11,

704–707 (1997).

105. Bebawi, F. F. & Eplee, R. E. Efficacy of Ethylene as a Germination Stimulant of Striga

hermonthica Seed. Weed Sci. 34, 694–698 (1986).

106. Johnson, A. W., Rosebery, G. & Parker, C. A novel approach to Striga and

Orobanche control using synthetic germination stimulants. Weed Res. 16, 223–

227 (1976).

107. Akiyama, K., Ogasawara, S., Ito, S. & Hayashi, H. Structural Requirements of

Strigolactones for Hyphal Branching in AM Fungi. Plant Cell Physiol. 51, 1104–

1117 (2010).

108. Wigchert, S. C. et al. Dose-response of seeds of the parasitic weeds Striga and

Orobanche toward the synthetic germination stimulants GR 24 and Nijmegen 1.

J. Agric. Food Chem. 47, 1705–10 (1999).

109. Zwanenburg, B., Nayak, S. K., Charnikhova, T. V & Bouwmeester, H. J. New

strigolactone mimics: Structure-activity relationship and mode of action as

germinating stimulants for parasitic weeds. Bioorg. Med. Chem. Lett. 23, 5182–6

(2013).

110. Knudsen, L. B. et al. Small-molecule agonists for the glucagon-like peptide 1 receptor.

Proc. Natl. Acad. Sci. U. S. A. 104, 937–942 (2007).

111. Alonso, J. M. et al. Genome-wide insertional mutagenesis of Arabidopsis

thaliana. Science 301, 653–657 (2003).

112. Ichikawa, T. et al. The FOX hunting system: An alternative gain-of-function gene

hunting technique. Plant J. 48, 974–985 (2006).

113. Jiang, W. et al. Demonstration of CRISPR/Cas9/sgRNA-mediated targeted gene

modification in Arabidopsis, tobacco, sorghum and rice. Nucleic Acids Res. 1–12

(2013).

114. Tsuchiya, Y. et al. A small-molecule screen identifies new functions for the plant

hormone strigolactone. Nat. Chem. Biol. 6, 741–9 (2010).

115. Umehara, M. et al. Structural requirements of strigolactones for shoot branching

inhibition in rice and Arabidopsis. Plant Cell Physiol. 56, 1059–1072 (2014).

116. Wallace, I. M. et al. Compound prioritization methods increase rates of chemical

probe discovery in model organisms. Chem. Biol. 18, 1273–83 (2011).

117. Toh, S. et al. Thermoinhibition uncovers a role for strigolactones in Arabidopsis

seed germination. Plant Cell Physiol. 53, 107–17 (2012).

118. Nelson, D. C. et al. Karrikins Discovered in Smoke Trigger Arabidopsis Seed

Germination by a Mechanism Requiring Gibberellic Acid Synthesis and Light.

Plant Physiol. 149, 863–873 (2008).

119. McNellis, T. W., von Arnim, A. G. & Deng, X. W. Overexpression of Arabidopsis

COP1 results in partial suppression of light-mediated development: evidence for

a light-inactivable repressor of photomorphogenesis. Plant Cell 6, 1391–1400

(1994).

120. Epps, D. E., Raub, T. J., Caiolfa, V., Chiari, A. & Zamai, M. Determination of the

affinity of drugs toward serum albumin by measurement of the quenching of the

intrinsic tryptophan fluorescence of the protein. J. Pharm. Pharmacol. 51, 41–8

(1999).

121. Rawel, H. M., Frey, S. K., Meidtner, K., Kroll, J. & Schweigert, F. J. Determining

the binding affinities of phenolic compounds to proteins by quenching of the

intrinsic tryptophan fluorescence. Mol. Nutr. Food Res. 50, 705–13 (2006).

122. Tallarida, R. J. Quantitative methods for assessing drug synergism. Genes

Cancer 2, 1003–8 (2011).

123. Vane, J. Inhibition of prostaglandin synthesis as a mechanism of action for

aspirin-like drugs. Nat. New Biol. 231, 232–5 (1971).

124. Asami, T. et al. Characterization of brassinazole, a triazole-type brassinosteroid

biosynthesis inhibitor. Plant Physiol. 123, 93–99 (2000).

125. Parker, C. Observations on the current status of Orobanche and Striga problems

worldwide. Pest Manag. Sci. 2009, 453–459 (2009).

126. Tsuchiya, Y. et al. Supplemental Probing strigolactone receptors in Striga

hermonthica with fluorescence. Science 349, 864–868 (2015).

127. Toh, S. et al. Preliminary Structure-function analysis identifies highly sensitive

strigolactone receptors in Striga. Science 350, 203–207 (2015).

128. Sun, X. D. & Ni, M. HYPOSENSITIVE to LIGHT, an alpha/beta fold protein, acts

downstream of ELONGATED HYPOCOTYL 5 to regulate seedling de-etiolation.

Mol. Plant 4, 116–126 (2011).

129. Ebert, B., Andersen, S. & Krogsgaard-Larsen, P. Ketobemidone, methadone and

pethidine are non-competitive N-methyl-D-aspartate (NMDA) antagonists in the

rat cortex and spinal cord. Neurosci. Lett. 187, 165–168 (1995).

130. Mclaughlan, S. D., Marshall, D. J. & Majesty, H. Pharmacology of ethyl-(4-

aminophenethyl)-4-phenylisonipecotate, anileridine, a new potent synthetic

analgesic. J. Pharmacol. Exp. Ther. 32, 343–344 (1970).

131. Lomenick, B., Jung, G., Wohlschlegel, J. A. & Huang, J. Target identification using

drug affinity responsive target stability (DARTS). Curr. Protoc. Chem. Biol. 3,

163–180 (2011).

132. Soundappan, I. et al. SMAX1-LIKE/D53 Family Members Enable Distinct MAX2-

Dependent Responses to Strigolactones and Karrikins in Arabidopsis. Plant Cell

14, 1–18 (2015).

133. Bortesi, L. & Fischer, R. The CRISPR/Cas9 system for plant genome editing and

beyond. Biotechnol. Adv. 33, 41–52 (2015).

134. Fujii, H. et al. In vitro reconstitution of an abscisic acid signalling pathway. Nature

462, 660–4 (2009).

135. Nelson, D. C. et al. Karrikins enhance light responses during germination and

seedling development in Arabidopsis thaliana. Proc. Natl. Acad. Sci. U. S. A. 107,

7095–100 (2010).

136. Abràmoff, M. D., Magalhães, P. J. & Ram, S. J. Image processing with ImageJ.

Biophotonics Int. 11, 36–41 (2004).

137. Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years

of image analysis. Nat. Methods 9, 671–675 (2012).

138. Andrews, S. FastQC: a quality control tool for high throughput sequence data.

(2010).

139. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: A flexible trimmer for

Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

140. Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq

experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–78 (2012).

141. The Gene Ontology Consortium. Gene ontology: Tool for the identification of

biology. Nat. Genet. 25, 25–29 (2000).

142. The Gene Ontology Consortium. Gene ontology consortium: Going forward.

Nucleic Acids Res. 43, D1049–D1056 (2015).

143. Nookaew, I., Olivares-Hernández, R., Bhumiratana, S. & Nielsen, J. Genome-

scale metabolic models of Saccharomyces cerevisiae. Methods in Molecular

Biology 759, (2011).

144. Rasmussen, A. et al. Strigolactones suppress adventitious rooting in Arabidopsis

and pea. Plant Physiol. 158, 1976–1987 (2012).

145. Environmental Protection Agency. Multi-Year Workplan for Conventional Pesticide Registration - New

Chemical Registration Candidates. 1–2 (2016).

146. European Commission. Final review report for the active substance Diethofencarb.

(2014).

147. Hosokawa, S. et al. Effects of diethofencarb on thyroid function and hepatic UDP-

glucuronyltransferase activity in rats. J. Toxicol. Sci. 17, 155–66 (1992).

Appendix 1

RNA sequencing analysis

All analysis was performed on a machine running Ubuntu 14.04 LTS. Fastq files were extracted to the following location:

“/media/mccourt/AC1687B716878156/RNAseq/data/” (timing ~ 1 h). FastQC was opened by entering the following command in the terminal:

perl /home/mccourt/Downloads/FastQC/fastqc

It was then possible to select fastq files to analyze through the program’s GUI. To clear the relevant warnings regarding the sequencing data, the following commands were run in the terminal with Trimmomatic (timing ~ 1.5 h):

java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/D1_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D1R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/D2_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D2R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/D2_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D2R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/D3_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D3R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/D3_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D3R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/R1_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R1R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/R1_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R1R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/R2_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R2R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/R2_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R2R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/R3_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R3R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/R3_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R3R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/Ht1_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H1R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/Ht1_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H1R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/Ht2_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H2R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/Ht2_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H2R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/Ht3_150624_NextSeq_R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H3R1.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40 &&
java -jar /home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar SE /media/mccourt/AC1687B716878156/RNAseq/data/Ht3_150624_NextSeq_R2.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H3R2.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40
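The Trimmomatic calls above all follow one pattern, so they could also be generated with a loop. The following is a minimal sketch rather than the commands as originally run: the variable names DATA and JAR, the helper trim_cmd, and the choice to print the commands instead of executing them are illustrative assumptions. Only the paths, the file-name pattern, and the trimming steps are taken from the commands above. Trimmomatic applies the steps in the order given, so the first 15 bases are removed, low-quality bases (below 28) are trimmed from both ends, the read is cut to at most 129 bases, and reads shorter than 40 bases are discarded.

```shell
#!/bin/sh
# Dry-run sketch: print one Trimmomatic SE command per read file.
# DATA and JAR mirror the paths used above; trim_cmd is a hypothetical helper.
DATA=/media/mccourt/AC1687B716878156/RNAseq/data
JAR=/home/mccourt/Downloads/Trimmomatic-0.33/trimmomatic-0.33.jar

# trim_cmd INPUT_PREFIX OUTPUT_PREFIX READ  (for example: trim_cmd Ht1 H1 R2)
trim_cmd() {
    printf 'java -jar %s SE %s/%s_150624_NextSeq_%s.fastq %s/%s%s.fastq HEADCROP:15 LEADING:28 TRAILING:28 CROP:129 MINLEN:40\n' \
        "$JAR" "$DATA" "$1" "$3" "$DATA" "$2" "$3"
}

# Input prefixes are D1-D3, R1-R3 and Ht1-Ht3; output prefixes use D, R and H.
for pair in D1:D1 D2:D2 D3:D3 R1:R1 R2:R2 R3:R3 Ht1:H1 Ht2:H2 Ht3:H3; do
    in_prefix=${pair%%:*}
    out_prefix=${pair##*:}
    for read in R1 R2; do
        trim_cmd "$in_prefix" "$out_prefix" "$read"
    done
done
```

The loop covers both read files of all nine samples. Piping the printed lines to sh would run them, assuming Java and the Trimmomatic jar are available at those paths.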

The fastq files were then checked again with FastQC to ensure that any relevant errors had been eliminated. With trimming and quality control concluded, the processed files were mapped using TopHat. The TopHat commands for all samples follow (timing ~ 36 h):

tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/D1thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/D1R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D1R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/D2thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/D2R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D2R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/D3thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/D3R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/D3R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/R1thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/R1R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R1R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/R2thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/R2R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R2R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/R3thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/R3R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/R3R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/H1thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/H1R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H1R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/H2thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/H2R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H2R2.fastq &&
tophat -p 8 -G /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -o /media/mccourt/AC1687B716878156/RNAseq/data/H3thout /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/Bowtie2Index/genome /media/mccourt/AC1687B716878156/RNAseq/data/H3R1.fastq /media/mccourt/AC1687B716878156/RNAseq/data/H3R2.fastq

On average, approximately 85% of reads could be mapped to the reference genome. With the reads mapped it was now possible to use Cufflinks to assemble the transcripts and estimate their abundances. This was done using the following code:

cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/D1clout /media/mccourt/AC1687B716878156/RNAseq/data/D1thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/D2clout /media/mccourt/AC1687B716878156/RNAseq/data/D2thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/D3clout /media/mccourt/AC1687B716878156/RNAseq/data/D3thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/R1clout /media/mccourt/AC1687B716878156/RNAseq/data/R1thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/R2clout /media/mccourt/AC1687B716878156/RNAseq/data/R2thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/R3clout /media/mccourt/AC1687B716878156/RNAseq/data/R3thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/H1clout /media/mccourt/AC1687B716878156/RNAseq/data/H1thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/H2clout /media/mccourt/AC1687B716878156/RNAseq/data/H2thout/accepted_hits.bam &&
cufflinks -p 8 -o /media/mccourt/AC1687B716878156/RNAseq/data/H3clout /media/mccourt/AC1687B716878156/RNAseq/data/H3thout/accepted_hits.bam
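The mapping and assembly steps above repeat the same two commands for each of the nine samples. As a hedged illustration of that pattern, the sketch below prints the TopHat and Cufflinks command for every sample. The names ROOT, REF, map_cmd and assemble_cmd are assumptions introduced here for readability, and the script only prints the commands; the options and paths themselves are copied from the commands above.

```shell
#!/bin/sh
# Dry-run sketch: print the TopHat and Cufflinks commands for each sample.
# ROOT, REF, map_cmd and assemble_cmd are illustrative names only.
ROOT=/media/mccourt/AC1687B716878156/RNAseq
REF=$ROOT/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10

# map_cmd SAMPLE: TopHat call with the annotation, Bowtie2 index and both read files.
map_cmd() {
    printf 'tophat -p 8 -G %s/Annotation/Genes/genes.gtf -o %s/data/%sthout %s/Sequence/Bowtie2Index/genome %s/data/%sR1.fastq %s/data/%sR2.fastq\n' \
        "$REF" "$ROOT" "$1" "$REF" "$ROOT" "$1" "$ROOT" "$1"
}

# assemble_cmd SAMPLE: Cufflinks call on that sample's accepted_hits.bam.
assemble_cmd() {
    printf 'cufflinks -p 8 -o %s/data/%sclout %s/data/%sthout/accepted_hits.bam\n' \
        "$ROOT" "$1" "$ROOT" "$1"
}

for sample in D1 D2 D3 R1 R2 R3 H1 H2 H3; do
    map_cmd "$sample"
    assemble_cmd "$sample"
done
```

Each sample's Cufflinks input is the accepted_hits.bam file that TopHat writes into that sample's output directory, which is why the two commands share the sample prefix.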

Cuffmerge was used to put together replicates into a single framework to analyze differential expression. To do this we first generated a file “assemblies.txt” to tell Cuffmerge where to look for the files. The contents of that text file were as follows:

/media/mccourt/AC1687B716878156/RNAseq/data/D1clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/D2clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/D3clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/R1clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/R2clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/R3clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/H1clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/H2clout/transcripts.gtf
/media/mccourt/AC1687B716878156/RNAseq/data/H3clout/transcripts.gtf
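Cuffmerge reads this manifest as one GTF path per line. A file with the contents listed above could be produced with a short loop, sketched below. The variable DATA and the helper list_assemblies are hypothetical names introduced for this illustration, and the script writes to standard output rather than to a file.

```shell
#!/bin/sh
# Sketch: list the nine Cufflinks transcript files, one path per line.
# DATA and list_assemblies are illustrative names only.
DATA=/media/mccourt/AC1687B716878156/RNAseq/data

list_assemblies() {
    for sample in D1 D2 D3 R1 R2 R3 H1 H2 H3; do
        printf '%s/%sclout/transcripts.gtf\n' "$DATA" "$sample"
    done
}

# Printed to standard output here; redirect to assemblies.txt to create the file.
list_assemblies
```

Redirecting the output, for example with list_assemblies > assemblies.txt, would give a manifest in the location of one's choosing.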

We next entered the following command into the terminal for Cuffmerge (timing ~ 1 h):

cuffmerge -g /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Annotation/Genes/genes.gtf -s /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/WholeGenomeFasta/genome.fa -p 8 /media/mccourt/AC1687B716878156/RNAseq/assemblies.txt

Finally, with the transcript abundance data merged it was possible to use Cuffdiff to analyze the relative abundances of transcripts between treatments and replicates. The following code was entered in the terminal (timing ~ 48 h):

cuffdiff -o diff_out -b /media/mccourt/AC1687B716878156/RNAseq/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10/Sequence/WholeGenomeFasta/genome.fa -p 4 -L D,R,H -u /media/mccourt/AC1687B716878156/RNAseq/merged_asm/merged.gtf /media/mccourt/AC1687B716878156/RNAseq/data/D1thout/accepted_hits.bam,/media/mccourt/AC1687B716878156/RNAseq/data/D2thout/accepted_hits.bam,/media/mccourt/AC1687B716878156/RNAseq/data/D3thout/accepted_hits.bam /media/mccourt/AC1687B716878156/RNAseq/data/R1thout/accepted_hits.bam,/media/mccourt/AC1687B716878156/RNAseq/data/R2thout/accepted_hits.bam,/media/mccourt/AC1687B716878156/RNAseq/data/R3thout/accepted_hits.bam /media/mccourt/AC1687B716878156/RNAseq/data/H1thout/accepted_hits.bam,/media/mccourt/AC1687B716878156/RNAseq/data/H2thout/accepted_hits.bam,/media/mccourt/AC1687B716878156/RNAseq/data/H3thout/accepted_hits.bam
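Cuffdiff takes the replicates of one condition as a comma-separated list and the conditions themselves as separate arguments, in the same order as the labels given to -L. The sketch below shows one way such lists could be assembled; ROOT, REF and bam_list are names assumed for this illustration, and the script prints the resulting command rather than running it.

```shell
#!/bin/sh
# Dry-run sketch: assemble the cuffdiff call from per-condition BAM lists.
# ROOT, REF and bam_list are illustrative names only.
ROOT=/media/mccourt/AC1687B716878156/RNAseq
REF=$ROOT/Arabidopsis_thaliana_Ensembl_TAIR10/Arabidopsis_thaliana/Ensembl/TAIR10

# bam_list CONDITION: comma-joined accepted_hits.bam paths for replicates 1 to 3.
bam_list() {
    list=
    for rep in 1 2 3; do
        list="${list:+$list,}$ROOT/data/$1${rep}thout/accepted_hits.bam"
    done
    printf '%s\n' "$list"
}

# Conditions are passed as separate arguments in the same order as the -L labels.
printf 'cuffdiff -o diff_out -b %s/Sequence/WholeGenomeFasta/genome.fa -p 4 -L D,R,H -u %s/merged_asm/merged.gtf %s %s %s\n' \
    "$REF" "$ROOT" "$(bam_list D)" "$(bam_list R)" "$(bam_list H)"
```

Building the lists from the condition prefix keeps the order of the BAM files aligned with the D, R and H labels.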

The output from this command was analyzed in an R environment using the

CummeRbund package to assess relative expression levels of genes and isoforms.
