Deciphering Pleiotropic Effects:

A molecular characterization of the foraging gene in Drosophila melanogaster.

by

Aaron Munro Allen

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Cell and Systems Biology
University of Toronto

© Copyright by Aaron Munro Allen 2016

Abstract

Pleiotropy is defined as the manifold effects of a gene at the phenotypic level.

Understanding the mechanisms of manifold gene action has important implications for many fields of biology, ranging from evolution to medicine. The foraging gene, in Drosophila melanogaster, has long been a pivotal example of a single gene with natural variants that affect feeding-related phenotypes. One possible mode of action for foraging’s pleiotropy is through independent regulation of its gene products. Characterization of the foraging gene revealed 4 distinct promoters that produce 21 transcripts and 9 ORFs. A null mutant of the foraging locus was generated using ends-out gene targeting. foraging null mutants had reduced foraging behaviour, reduced food intake behaviour, and increased lipid levels. A recombineered full genomic rescue of the gene rescued the effects of the null mutation. By comparing the effects of the null mutant with those of the natural variants, I showed that these feeding-related phenotypes were differentially regulated. A promoter manipulation strategy identified diverse and non-overlapping expression patterns associated with the 4 foraging promoters. Expression was seen in the nervous and gastric systems of the larva and adult fly, as well as the reproductive systems of the adult fly. This expression suggests potential new roles for the foraging gene in the larva and adult fly. Characterizing the regulation of foraging's gene products will further our understanding of its role in behaviour, and shed light on the evolutionary origins of natural variants of the foraging gene. Not only will this study further our understanding of this gene's conserved function across taxa, but it will also help elucidate the role of differential transcriptional regulation in achieving multiple functions of a gene. This could further serve as fodder for investigations into the roles of neo-functionalization versus escape from adaptive conflict.

Acknowledgements

I’d like to thank Prof. Marla Sokolowski for her supervision, guidance and support throughout this work. I have learned many valuable lessons and have had many unique experiences throughout my time in the lab. There was always something new to learn with such a diverse group. Prof. Joel Levine for the many technical discussions and frankness throughout the degree, for his out-of-the-box thinking, and for instigating the collaboration with Prof. Stephen Goodwin at the University of Oxford. Prof. Tim Westwood for his insightful comments and discussion on transcriptional regulation, and for keeping in touch with multiple interesting and relevant studies and speakers. Prof. Mariana Wolfner and Prof. Henry Krause for making time to examine and give feedback on this thesis.

To Dr. Scott Douglas for teaching me many of the basics of Drosophila, including many now defunct techniques. Prof. Jean-Christophe Billeter for multiple discussions earlier in my degree that were pivotal in the overall design of this project, and for introducing me to the bioinformatic software package Geneious, which single-handedly saved a minimum of two years of work. Dr. Amsale Belay for teaching me immunohistochemistry and for many valuable discussions on the molecular biology of the foraging gene. Dr. Tony So for many early morning coffee discussions, giving me yet another perspective into the voodoo of molecular biology.

Prof. Stephen Goodwin who graciously collaborated on the HR and recombineering projects, and for allowing me to conduct some of the early work on these projects for 4 months in his laboratory at the University of Oxford. Dr. Megan Neville for her advice, instruction and discussions that helped guide the HR and recombineering project while in the Goodwin Lab. Yet another invaluable insight into the voodoo of molecular biology. To the other members of the Goodwin lab for making it such a welcoming environment.

To many Sokolowski lab members over the years for the many science and non-science discussions. To Bryon Hughson for much plotting and scheming on how we’re going to fix the world, and for educating me in areas of metabolism. The new kids, Ina Anreiter and Oscar Vasquez, for their shenanigans and many useful discussions on topics of chromatin modifications and biochemistry, respectively. Dr. Jeff Dason for many useful discussions with a grounded and pragmatic point of view on science.

To the many graduate students through the years. Thomas Braukmann for providing a valuable outside perspective, and for always being a willing participant in procrastination and tomfoolery. Kevin Judge, Marion Andrews, Matt Janicki, Amy Wong, Becky Rooke, Aaron LeBlanc, Mark McDougall, Laura Junker, Audrey Reid, and the whole UTM Friday Beers crew for always providing interesting and entertaining discussions. To the Earth Science Centre community of graduate students on the St. George campus who made our lab’s move to downtown such an easy transition.

My mother, Michele Robinson, and my father, David Allen, for their genes and the environmental conditions to cultivate them. For instilling a sense of curiosity, wonder, and skepticism in interrogating the world around me. For giving me a grounded work ethic and the true value of a dollar. Dorothy and Russell Merrifield, my second set of parents, for even further instilling a solid work ethic. For trying to keep my smart aleck responses in check; it always made for a good challenge. My brothers, Patrick, Kyle and Sean, who helped me cut my pedantic and semantic teeth from a very young age. For our countless silly, ridiculous, and Monty Python inspired academic discussions that frequently go all through the night.

To Dr. Kyla Ercit for her support and companionship throughout the degree. You helped me retain the last remnants of my sanity. For always listening when I need to vent. For distracting me when I need distracting. For providing an invaluable outside perspective on my work; making sure I am explaining myself in an interpretable way.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Initialisms and Acronyms

Chapter 1: General Introduction
    Pleiotropy
        Origins
        “One gene, one enzyme”
        Introducing molecular biology
        Philosophical utility
    Transcriptional regulation
        Chromatin remodeling
        Insulators act to delineate regulatory domains
        Transcription factor binding and prediction
        The long reach of the enhancer
        Promoter structure
        Promoter-proximal pausing
        Alternative promoters and splicing
    The model organism Drosophila melanogaster
        Advances in tools for genetic analysis
    The foraging gene
        foraging’s associated phenotypes
        foraging’s gene products
        foraging is conserved
    The foraging gene as a model to study the mechanisms of pleiotropy
    References

Chapter 2: A Reverse Genetic Dissection of the foraging gene in the Fruit Fly (Drosophila melanogaster)
    Abstract
    Introduction
        foraging and its feeding-related phenotypes
        Conserved nature of the foraging gene
        Improving upon natural variant associations
    Materials and Methods
        Fly Strains and Rearing
        Larval Synchronization
        Bioinformatics
        DNA Extraction
        PCR, Restriction Digestion, and Ligation
        E. coli Strains
        Electrocompetent E. coli Cell Preparation
        E. coli Transformation
        Plasmid Preparation
        Gene model characterization
        Ends-Out gene targeting
        Recombineering
        Western Blot Analysis
        Southern Blot
        RNAi lines
        Partial Deletions of 3'-end foraging
        Partial Deletions of 5'-end foraging
        Immunohistochemistry
        Triglyceride analysis
        Food intake
        Path length
        Statistical analysis
    Results and Discussion
        The Structure of the foraging Locus and its Products
        Sequencing the foraging locus
        Identifying alternative promoters
        Identifying alternatively spliced variants
        Generating deletion mutants
        Generating partial deletions
        Generating a null with HR
        Generating a genomic rescue
        The foraging null mutant is pupal lethal
        Determining the stage of lethality
        Inducing lethality with RNAi
        Increased foraging expression increases path length behaviour
        Increased foraging expression increases food intake behaviour
        Increased foraging expression decreases DAG/TAG levels
    Conclusion
    References

Chapter 3: Promoter Specific Expression and Function of the foraging gene in the Larvae of the Fruit Fly (Drosophila melanogaster)
    Abstract
    Introduction
        Pleiotropy
        Coding vs noncoding variation
        Mapping CREs and their functions
        foraging and its pleiotropy
        Decoding foraging’s pleiotropy
    Materials and Methods
        Strains
        Larval Synchronization
        Bioinformatics
        DNA Extraction
        PCR, Restriction Digestion, and Ligation
        E. coli Transformation
        Plasmid Preparation
        Promoter-Gal4s
        Western Blot Analysis
        Immunohistochemistry
        Triglyceride analysis
        Food intake
        Path length
        Statistical analysis
    Results and Discussion
        Dissecting foraging’s pleiotropy
        Lack of variation in foraging coding sequence
        Promoter bashing
        for-pr-Gal4 construction
        for-pr-Gal4 expression
        Mapping CREs
        Reliability of Gal4s
        anti-FOR has non-specificity
        Support for pr-Gal4 expression patterns
        Missing CREs
        P1 or P3 protein isoforms are required for viability
        P1 or P3 protein isoforms are required for foraging behaviour
        Mapped CREs were insufficient in regulating food intake behaviour
        Fat body expression of foraging autonomously regulates DAG/TAG levels
    Conclusion
    References

Chapter 4: Promoter Specific Expression and Function of the foraging Gene in the Adult Fruit Fly (Drosophila melanogaster)
    Abstract
    Introduction
        Conserved expression/CRE across development
        foraging’s adult phenotypes
        Identifying foraging’s adult CREs and their functions
    Materials and Methods
        Fly Strains and Rearing
        Bioinformatics
        Promoter-Gal4s construction
        Western blot analysis
        Immunohistochemistry
        Triglyceride analysis
        Sucrose responsiveness
        Starvation resistance
        Statistical analysis
    Results and Discussion
        for-pr-Gal4 expression
        for-pr-Gal4 in the CNS
        for-pr-Gal4 in the gastric system
        for-pr-Gal4 in the salivary and fat tissues
        for-pr-Gal4 in the reproductive systems
        Mapping CREs
        Mapped CREs were not sufficient to alter sucrose responsiveness
        Altered foraging expression in the fat body affects DAG/TAG metabolism
        Increasing foraging expression increases starvation resistance
    Conclusion
    References

Chapter 5: General Discussion
    Deciphering foraging's pleiotropic effects
    foraging’s gene structure
    Genetic dissection of the locus
    Transcriptional regulation
    Promoter specific function
    Other avenues of regulation
    Promoter-proximal pausing
    Post-transcriptional regulation of foraging
    Differential protein function
    Pleiotropy
    Final thoughts
    References

Appendix

List of Tables

Chapter 2 Table
    Table 1: Complementation analysis for lethality

Chapter 3 Tables
    Table 1: Selection analysis of foraging coding sequence using PAML
    Table 2: Gal4>RNAi lethality screen

List of Figures

Chapter 2 Figures
    Figure 1: Contiguous sequence assembly of sitter line
    Figure 2: Example of sequence divergence from reference genome
    Figure 3: Alignment of RACE and cDNA reads
    Figure 4: Schematic of the foraging gene and associated features
    Figure 5: Cloning of HR targeting construct
    Figure 6: Generation and verification of null mutant
    Figure 7: Schematic of the foraging gene and associated tools
    Figure 8: Recombineering of foraging BAC
    Figure 9: Western blot analysis of null and BAC mutants
    Figure 10: Null mutant lethality
    Figure 11: Foraging path length behaviour of null mutant
    Figure 12: Foraging path length behaviour and effectiveness of RNAi
    Figure 13: Rover sitter allele swap
    Figure 14: Food intake behaviour
    Figure 15: Lipid level of null mutant

Chapter 3 Figures
    Figure 1: Cloning of for-pr-Gal4s
    Figure 2: Site-specific integration of for-pr-Gal4s
    Figure 3: for-pr-Gal4s expression in the 3rd instar larval CNS
    Figure 5: for-pr-Gal4 expression in salivary glands and fat body
    Figure 6: pr-Gal4 expression in carcass
    Figure 7: for-pr-Gal4 co-labeling
    Figure 8: Mapped CREs back onto the foraging locus
    Figure 9: Anti-FOR IHC of foraging null mutant
    Figure 10: in-situ and enhancer trap expression
    Figure 11: for-pr-Gal4>RNAi western blot analysis
    Figure 12: modENCODE TFBSs aligned to the foraging locus
    Figure 13: pr-Gal4 foraging behaviour
    Figure 14: Isoform-specific RNAi foraging behaviour
    Figure 15: pr-Gal4>RNAi food intake
    Figure 16: Fat body>RNAi lipid levels

Chapter 4 Figures
    Figure 1: pr-Gal4 expression in adult CNS
    Figure 2: pr-Gal4 expression in adult gastric system
    Figure 3: pr-Gal4 co-labeling
    Figure 4: pr-Gal4 expression in adult salivary and fat tissues
    Figure 5: pr-Gal4 expression in adult reproductive systems
    Figure 6: Mapped CREs back onto the locus
    Figure 7: pr-Gal4>RNAi sucrose responsiveness
    Figure 8: Fat body>RNAi lipid levels
    Figure 9: Starvation resistance
    Figure 10: BAC adult western blot analysis

List of Appendices

Supplemental Figures
    Supplemental figure 1: Southern blot of hobo transposable elements
    Supplemental figure 2: RNA Pol II ChIP-Seq from modENCODE
    Supplemental figure 3: Path length behaviour and foraging expression in pumilio mutant

Chapter 2 Supplemental Tables
    Table S1: Statistical analysis for lethality - Chapter 2, figure 11
    Table S2: Statistical analysis for null path length - Chapter 2, figure 11
    Table S3: Statistical analysis for RNAi path length - Chapter 2, figure 12
    Table S4: Statistical analysis for recombinant path length - Chapter 2, figure 13
    Table S5: Statistical analysis for null food intake - Chapter 2, figure 14
    Table S6: Statistical analysis for null fat levels - Chapter 2, figure 15

Chapter 3 Supplemental Tables
    Table S1: Statistical analysis for pr>RNAi path length - Chapter 3, figure 13
    Table S2: Statistical analysis for isoform specific path length - Chapter 3, figure 14
    Table S3: Statistical analysis for pr>RNAi food intake - Chapter 3, figure 15
    Table S4: Statistical analysis for RNAi fat levels - Chapter 3, figure 16

Chapter 4 Supplemental Tables
    Table S1: Statistical analysis for pr>RNAi PER - Chapter 4, figure 6
    Table S2: Statistical analysis for fat levels - Chapter 4, figure 7
    Table S3: Statistical analysis for SR - Chapter 4, figure 8

List of Initialisms and Acronyms

cGMP        cyclic guanosine monophosphate
PKG         cGMP dependent protein kinase
CRE         cis-regulatory elements
TFBS        transcription factor binding site
PCR         polymerase chain reaction
RT-PCR      reverse transcription PCR
RACE        rapid amplification of cDNA ends
CAGE        cap analysis gene expression
EAC         escape from adaptive conflict
TSS         transcription start site
polyA       transcription termination site
pr#         promoter #
P#          protein isoform #
RNA Pol II  RNA polymerase II
RBP         RNA binding proteins
ChIP-Seq    chromatin immuno-precipitation, next generation sequencing

Chapter 1:

General Introduction

Pleiotropy

Origins

Correlated traits have been documented for a long time (Mendel, 1866), but the term pleiotropy was not coined until the start of the 20th century (Plate, 1910). Originally it was defined as a unit of inheritance having several characteristics dependent on it. From the Greek roots, it loosely translates as “many directions”. Although there has been much discussion about the unit of pleiotropy, the main idea is that it is manifold, with one input at the genetic level and multiple outputs at the phenotypic level. The early mechanistic studies subdivided pleiotropy into two categories, genuine and spurious (Gruneberg, 1938). Genuine pleiotropy occurs when two or more distinct phenotypes are directly caused by independent mechanisms of a gene. Crystallin proteins represent an example of genuine pleiotropy: crystallins are important structural proteins in lens tissue and can also function as enzymes such as lactate dehydrogenase and enolase (Tomarev and Piatigorsky, 1996). In contrast, spurious pleiotropy occurs when there is a shared mechanism or when one phenotypic effect leads to another. Spurious pleiotropy is exemplified by a mutation in a liver enzyme, phenylalanine hydroxylase, which causes increased levels of phenylalanine in the plasma. This in turn causes defects in myelination, resulting in mental retardation (Hodgkin, 1998).

The question of “How does a gene produce its ‘pleiotropic’ effects?” is a long-standing one (Gruneberg, 1938). The phenomenon of pleiotropy has far-reaching implications in the biological sciences when trying to elucidate the genotype-phenotype map. One can conceive of multiple different modes of action for a gene to perform multiple pleiotropic functions. For example, a single gene product can elicit a phenotypic effect which in turn elicits another phenotype (model 1), as with the phenylalanine hydroxylase example above. A single gene product can also elicit multiple independent effects (model 2), as with the crystallin example. Both of these modes of action assume only a single product from a gene. We can imagine a more complicated model in which a gene produces multiple independent products that separately produce the independent phenotypic effects of a gene (model 3). More subtle distinctions between and within these models are discussed elsewhere (Pyeritz, 1989; Hodgkin, 1998).

“One gene, one enzyme”

Although Beadle and Tatum's research in 1941 significantly impacted the field of biology, it most certainly had detrimental effects on the study of pleiotropy. Beadle and Tatum’s “one gene - one enzyme” hypothesis did not allow for complications such as a single gene producing multiple products with different functions, that is, genuine pleiotropy. It did, however, allow for the study of spurious pleiotropy by looking at the multiple phenotypic effects of a single gene product affecting biochemical processes. Under the hypothesis that a gene product can only have one function, there could not be a case of genuine pleiotropy, as Gruneberg defined it. Research in this area subsided until the molecular revolution, when it became more widely accepted that a gene can produce multiple polypeptides that may have distinct biochemical functions.

Introducing molecular biology

The introduction of molecular techniques into biological research has drastically changed how we view the genotype-phenotype map and the pleiotropic functions of the gene. Modern discoveries show that overlapping reading frames, alternative promoters, alternative splicing, post-transcriptional regulation, RNA editing, multidomain proteins and other factors have consequences for the mechanisms of pleiotropic action of a gene. For example, alternative splicing allows for the generation of unique protein products, which in turn can have unique biochemical functions. This may include unique interactions which then induce independent signaling cascades resulting in different phenotypes. The same polypeptide can also be expressed in different tissues at different times through the use of discrete cis-regulatory elements (CREs, discussed below). When alternative splicing and CREs are combined through the use of unique promoters, the analysis of the mechanistic pleiotropic functions of a gene is further complicated. Even when looking at only a few of the levels at which gene products are produced, our knowledge of molecular biology makes the study of pleiotropy tractable but more complicated.

Philosophical utility

The knowledge that a single gene performs many roles in an organism is paradigm changing and affects many disciplines ranging from evolution to medicine (Orr, 2000; Pavlicev and Wagner, 2012; Pyeritz, 1989; Mackay and Anholt, 2006). Many diseases have been characterized for cross-phenotype associations and the potential for a single pleiotropic underpinning. Many psychological disorders, cancers, and famously Marfan syndrome have been shown to have a genetic foundation with pleiotropic functions (Pyeritz, 1998; Solovieff et al., 2013). Knowledge of the pleiotropic effects of a gene is also relevant for the pharmaceutical industry. A new drug may alleviate some symptoms but cause new symptoms (side effects) due to the pleiotropic nature of the gene product it is targeting and the holistic nature of oral therapy.

An issue frequently encountered is the question of what genetic unit has pleiotropic effects. In the context of “molecular gene pleiotropy”, used by molecular geneticists, it is the gene that is pleiotropic. In the case of “selectional pleiotropy”, used by evolutionary biologists, it is the mutation that is pleiotropic. This discipline-dependent point of view can lead to confusion. The former is primarily interested in the biochemical functions of gene products in a cell, whereas the latter is concerned with the prevalence and fitness consequences of new mutational variants. Both approaches and points of view are appropriate for their respective disciplines. As long as one’s use of the term pleiotropy is defined, the distinction is trivial, as it is the core concepts that are important and not the words we choose to represent them. In this study, I use the “molecular gene pleiotropy” definition, in which the “molecular gene” is the unit at the genetic level that performs pleiotropic function.

The distinction between what is or is not pleiotropy, including questions such as what constitutes a different phenotypic function, is for the most part philosophical and does not alter the primacy of any associated findings. The utility that I gain from questions into the mechanisms of pleiotropic function of a gene is in the distinction between models 1, 2 and 3 above when filling in the genotype-phenotype map. Having independent regulation via transcription is an interesting extension of this argument. If the differential functions of a gene were due to unique spatial or temporal expression, then mutating the regulatory elements would only affect a subset of the gene’s functions. In the case of a single polypeptide gene, any mutation in the coding sequence of the gene will necessarily affect all of its functions. So, how are genes regulated?

Transcriptional regulation

Transcriptional regulation is a multistep process, and many of the steps are discussed in more detail below. But briefly, the chromatin first may have to be modified to open, allowing access to the DNA for transcription factors to bind. This process can be regulated through the use of insulator sequences. Next, transcription factors bind the locus to recruit RNA polymerase II (RNA Pol II). Once recruited, RNA Pol II may pause just downstream of the transcription start site (TSS) and require additional regulation to progress to elongation. During elongation, alternative splicing can occur, possibly generating tissue specific expression. After the mature transcript is made, it may be bound by RNA binding proteins (RBPs) to regulate its localization and stability before translation. This process gives ample opportunity to achieve independent regulation and expression that may result in pleiotropic functions.

Chromatin remodeling

In a cell, DNA is associated with multiple histones and other proteins in order to form chromatin. This chromatin has many important functions relating to gene expression and cellular processes. The expression activity of a gene can be regulated through modifications of the histone proteins. The best known modifications are methylation and acetylation of the multiple histone subunits, but modifications also include phosphorylation and ubiquitylation. Many modifications of histone proteins have been associated with unique signatures of gene activity. For example, H3K4me3/me2 and H3K9ac are significantly associated with transcriptionally active promoters (Kharchenko et al., 2011). Similarly, there are associations for exonic regions, intronic regions, and transcriptionally silent regions.

Insulators act to delineate regulatory domains

Regulatory elements can be isolated from a specific promoter by insulator sequences (ref). Insulators are regions of DNA that are bound by proteins to act as a barrier to condensed chromatin or as a block to enhancers that would otherwise interact with a promoter of interest. Insulator sequences serve as landmarks to be bound by chromatin modifying proteins in order to make loops of inactive and active chromatin. Active loops can then have linearly distant enhancers brought close to a promoter, even if there is a region of inactive chromatin between them. The insulator in the gypsy retrotransposon, bound by Su(Hw), is among the most studied insulator sequences (Maeda and Karch, 2007). gypsy elements can function at distances up to 85 kilobases (Dorsett, 1993). Chromatin remodeling and insulator activity allow for the regulatory regions to be exposed and brought close to their corresponding TSSs.

Transcription factor binding and prediction

Transcription factors are proteins that bind DNA in order to regulate gene expression. As such, this includes many of the proteins involved in the previous sections. The regions of DNA that they bind are known as cis-regulatory elements (CREs). Transcription factors can be either positive signals for expression, such as activators, or negative signals for expression, such as repressors. Once the chromatin has been remodeled, making a region of DNA available for binding, transcription factors are then able to bind and affect the expression of a locus. These transcription factors typically recognize a very short stretch of nucleotides, between 4 and 8 base pairs long (Moses and Sinha, 2009). Through the use of bioinformatic approaches, especially in this age of next generation sequencing, many consensus binding sequences have been identified for transcription factors (ref). However, due to the short length of these sequences, this information has not proven useful in a single gene context. Given that a specific 6 base recognition sequence of a transcription factor is expected to occur by chance once every 4096 (4^6) bases, such sites can be very common. In a 40,000 bp locus, it would occur at random about 10 times. This is known as the “Futility Theorem” (Wasserman and Sandelin, 2004), since searching for individual transcription factor binding sites based solely on the consensus motif proves to be futile.

Of course, this view of single transcription factors binding 4 to 8 base pair sequences is too simplistic. Transcription factors do not work in isolation; they work primarily in large complexes of many proteins. The other constituents of these complexes dictate a higher level of specificity than the previously mentioned consensus binding sequences. The constituents of these complexes can vary widely depending on the stage of development and the tissue in which they reside.
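The chance-occurrence arithmetic behind the “Futility Theorem” can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: the function name `expected_random_matches` is hypothetical, and it assumes equal base frequencies, independent positions, exact matches to one fixed motif, and a search of a single strand only.

```python
def expected_random_matches(motif_length: int, locus_length: int,
                            alphabet_size: int = 4) -> float:
    """Expected number of chance exact matches to one fixed motif.

    Assumes a random sequence with equal, independent base frequencies
    and counts matches on a single strand only (illustrative assumptions).
    """
    # Number of windows of motif_length that fit in the locus.
    windows = locus_length - motif_length + 1
    if windows <= 0:
        return 0.0
    # Each window matches with probability 1 / alphabet_size**motif_length.
    return windows / alphabet_size ** motif_length


if __name__ == "__main__":
    # Motif lengths in the 4 to 8 bp range quoted in the text,
    # scanned against a 40,000 bp locus.
    for k in range(4, 9):
        print(k, round(expected_random_matches(k, 40_000), 2))
```

Under these assumptions a 6 bp motif gives just under 10 expected chance matches in 40,000 bp, in line with the figure quoted above, while an 8 bp motif gives fewer than one. Searching both strands would roughly double these values for non-palindromic motifs.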

The long reach of the enhancer

With the advent of chromosome conformation capture (3C) experiments (Dekker et al., 2002), and their higher throughput derivatives, 4C and Hi-C (Zhao et al., 2006), the range over which to look for CREs has changed drastically. Traditionally, the search for enhancers of a gene was restricted to within a few kilobases 5’ to the TSS. As the years progressed, it was realized that many enhancers of genes are 3’ to the TSS and lie within the introns of the gene. Recent studies in Drosophila have shown that even this may be too naive. Not only are discrete regions along the entire length of a chromosome potential enhancers of a gene of interest, but so too are regions on other chromosomes (Sexton et al., 2012). In effect, much of the genome is up for grabs. It should be noted that these interactions are not structureless. Discrete regions both within and between chromosomes show repeated interactions, and are delineated by insulator binding and DNase hypersensitivity. Once enhancers are bound by transcription factors, they recruit RNA Pol II to bind the promoter.

Promoter structure

The common definition of a promoter is the sequence on which the RNA polymerase II holoenzyme binds DNA at the TSS of a gene. As a result, the promoter consists of only an approximately 100 base pair sequence. This is sometimes referred to as the core promoter. In this thesis I use this definition of promoter. Any elements further out from this region are CREs. Although the promoter is a small region of DNA, it is by no means simple. The elements responsible for transcriptional initiation lie in this region. A given promoter may include, but is not limited to, the following elements: a TATA box (Lifton et al., 1978), an Initiator sequence (Inr; Smale and Baltimore, 1989), and a Downstream Promoter Element (DPE; Burke and Kadonaga, 1996). Many more elements exist, and a promoter can have a variety of subsets of them. Originally, many genes were thought to have a TATA box; however, recent studies in Drosophila showed that only 16-28% of promoters contained this element (Ohler et al., 2002; Gershenzon et al., 2006). In contrast, almost 66% of promoters contained Inr sequences. Furthermore, the combination of an Inr and a DPE was much more prevalent than any other pair of elements (Gershenzon et al., 2006). The composition of these elements in a promoter also differs between genes with a single promoter and those with multiple promoters (Zhu and Halfon, 2009).

A promoter can be grouped into one of two categories, peaked or broad. In peaked promoters, the TSS maps to a discrete base. Broad promoters, on the other hand, have multiple transcription start sites along a 100 base pair range. These different classes of promoter structure have different associations with the previously mentioned promoter elements. Peaked promoters are enriched for the Inr and DPE sequences. In contrast, broad promoters are not enriched for these classical elements, but instead are enriched for Ohler elements, a novel class of elements (Ohler et al., 2002; Rach et al., 2009). Peaked promoters also show significant association with spatially and temporally restricted expression patterns (Hoskins et al., 2011). For the vast majority of genes, the usage of promoters is consistent across development. More than 90% of promoters used earlier in development are also used in adults. Furthermore, of those that are consistent, 95% have a shared promoter structure across development, with broad promoters remaining broad in the adult, and similarly for peaked promoters (Hoskins et al., 2011).
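The peaked versus broad distinction can be illustrated with a toy classifier over mapped TSS positions. This is only a sketch under stated assumptions and is not the method used in the cited studies: the function `classify_promoter`, the `peak_fraction` cut-off of 0.5, and the example coordinates are all hypothetical, chosen purely for illustration. Only the 100 base pair window is taken from the description above.

```python
from collections import Counter


def classify_promoter(tss_positions, peak_fraction=0.5, broad_window=100):
    """Label a cluster of TSS reads as 'peaked' or 'broad'.

    tss_positions: iterable of coordinates, one per mapped 5' end.
    peak_fraction: hypothetical cut-off; if at least this share of reads
        starts at a single base, the promoter is called peaked.
    broad_window: window (bp) within which broad promoters are expected
        to spread their start sites.
    """
    counts = Counter(tss_positions)
    if not counts:
        raise ValueError("no TSS positions supplied")
    total = sum(counts.values())
    # Most frequently used start site and the number of reads supporting it.
    modal_tss, modal_count = counts.most_common(1)[0]
    # Distance spanned by all observed start sites, in base pairs.
    span_bp = max(counts) - min(counts) + 1
    shape = "peaked" if modal_count / total >= peak_fraction else "broad"
    return {
        "shape": shape,
        "modal_tss": modal_tss,
        "span_bp": span_bp,
        "within_window": span_bp <= broad_window,
    }


if __name__ == "__main__":
    # Hypothetical read starts: most reads begin at one base.
    peaked_reads = [1000] * 18 + [999, 1001]
    # Hypothetical read starts: spread evenly over roughly 100 bp.
    broad_reads = list(range(2000, 2100, 5))
    print(classify_promoter(peaked_reads))
    print(classify_promoter(broad_reads))
```

A single-fraction cut-off is the simplest possible rule; any real analysis would need a threshold justified by the data and by read depth.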

Promoter-proximal pausing

Once the RNA Pol II holoenzyme is recruited, it does not necessarily proceed to elongation immediately. Another interesting way in which a gene could be transcriptionally regulated is through promoter-proximal pausing of RNA polymerase II (Wu and Snyder, 2008). In this situation, the RNA polymerase II holoenzyme is recruited to the promoter of a gene, but due to several interacting factors the holoenzyme pauses roughly 20–40 base pairs downstream of the TSS (Li and Gilmour, 2011). The holoenzyme remains paused until a separate factor releases it to progress to elongation, transcribing the gene. This primes the locus for rapid ramp-up of expression, because the initiation complex is already in position and engaged (Price, 2008). If the cell gets the signal that it needs more of this gene product, it only has to hit “play” to get more transcript, without the need to go through chromatin remodeling, transcription factor binding, and RNA polymerase II recruitment. The roles of promoter-proximal pausing in environmental responsiveness (Adelman and Lis, 2012) and developmental control (Zeitlinger et al., 2007) are well established.

Alternative promoters and splicing

Although discrete CREs can drive expression of a transcript in multiple different places, even more independence of regulation can be achieved through alternative promoters. By having multiple TSSs, CREs can be further separated from each other. Promoter-proximal pausing at one TSS and not another allows for even more differential regulation. Insulator sequences can further ensure the isolation of these elements. In a recent study in Drosophila, more than 12,000 promoters were identified for 8,000 genes (Hoskins et al., 2011). This represents an average of more than 1.5 promoters per gene, indicating that multiple promoters are quite common. As many as 14% of genes may have alternative promoters (Zhu and Halfon, 2009). Alternative promoters necessarily allow for alternative splicing. Alternative splicing is yet another way to affect the regulation of a gene, and it has its own cis- and trans-regulatory elements. The proteins encoded by the trans-regulatory elements bind CREs at the intron-exon borders, as well as at the branch point in the intron, in order to mediate excision of the intron. Variation in these sequences would allow for variation in alternative splicing. Alternative splicing can generate distinct polypeptides with different functions, or different untranslated regions (UTRs) that may be subject to different post-transcriptional regulation.

The model organism Drosophila melanogaster

Much of the early genetic work done on behaviour traces its roots back to the laboratory of William Castle and began with investigation of the phototactic behaviour of Drosophila (Carpenter, 1905). Studies then moved on to olfactory behaviour in the fly (Barrows, 1907). The Morgan lab introduced the study of genetic variants and chromosomal mechanics by hunting for spontaneous mutations (Morgan, 1910). The first mutants characterized for behaviour, tan and vestigial, were initially identified for their morphological defects, an early example of pleiotropy (McEwen, 1918). The importance of genetic background in such studies was noted from a very early stage (Scott, 1943) and has been more explicitly studied in recent years (de Belle and Heisenberg, 1996; Chandler et al., 2014). The realization that individuals of a population could be variable in their behavioural output allowed for genetic analysis (Hirsch and Erlenmeyer-Kimling, 1961). Quantitative genetic approaches furthered the study of behaviour using natural and selected populations (Hirsch and Tryon, 1956; Hirsch, 1963). Through the use of chromosome segregation, the phenotypic effects were mapped to individual chromosomes, but the researchers were unable, at the time, to map the polygenes any further. In contrast, the single-gene induced-mutation approach (Benzer, 1967) represented a significant departure from the natural population and selection approaches (Hirsch, 1963). A series of point mutations was induced in independent lines, and flies were screened for aberrant behavioural performance to identify the genes involved. This search for single genes influencing behaviour yielded significant results (Konopka and Benzer, 1971). Both of these approaches followed a forward genetics model, in which genes affecting a particular phenotype were mapped. Reverse genetics, on the other hand, characterizes the phenotypic effects of mutating a known gene. Using both natural variants and single-gene induced mutations provides invaluable insight into all aspects of the pleiotropic genes that influence behaviour (Tully, 1996; Greenspan, 1997).

Drosophila melanogaster is highly amenable to behavioural genetic analyses and has been for over 100 years. With a short life cycle of less than 14 days and a small size, flies are easy to maintain. Chromosomal substitutions between different fly lines are possible through the use of multiple genetic tools such as balancer chromosomes and dominant phenotypic markers (Lindsley and Zimm, 1992). Environmental conditions such as food quality can be readily manipulated. Many of the above studies into behaviour, genetics, and transcriptional regulation were pioneered in the fly.

Advances in tools for genetic analysis

Drosophila’s genetic and molecular tool kit is unsurpassed. A sequenced genome and sequenced genome-wide cDNAs facilitate mutant discovery (Celniker et al., 2002; Stapleton et al., 2002). Multiple resources such as FlyAtlas and Fly-FISH allow for expression analysis (Chintapalli et al., 2007; Lécuyer et al., 2007). The modENCODE project permits extensive interrogation of the transcriptional regulation of a locus (Roy et al., 2010). The Gal4/UAS system facilitates targeted expression or knockdown of genes of interest (Brand and Perrimon, 1993). GAL80ts allows for temperature-based temporal control of Gal4 function (McGuire et al., 2003). Other binary expression systems, such as lexA and QF, provide intersectional methods to further restrict expression patterns (Yagi et al., 2010; Potter et al., 2010). Comparisons between transgenes are more reliable with site-specific integration (Bischof et al., 2007; Markstein et al., 2008). Homologous recombination-mediated gene targeting (HR) and CRISPR/Cas9 provide unparalleled capacity to make sequence-specific mutations (Gong and Golic, 2003; Bassett et al., 2013). Both can be used to delete a locus as well as to introduce a specific mutation. Recombination-mediated genetic engineering, or recombineering, permits in vitro generation of specific mutations in a gene (Warming et al., 2005; Venken et al., 2006). RNAi lines are available for most genes in the genome (Dietzl et al., 2007; Ni et al., 2009; Venken and Bellen, 2005). These tools, along with the assays we have developed to quantify larval and adult food-related traits (see below), facilitate the molecular genetic investigation of foraging’s pleiotropy.

The foraging gene

foraging’s associated phenotypes

The foraging gene was initially characterized by its influence on individual differences in larval food search behaviour on a nutritive medium (Sokolowski, 1980). Larvae of the rover strain had longer foraging path lengths than those of the sitter strain when in the presence of food, but showed no difference when on a non-nutritive medium (Sokolowski, 1980). This behaviour was shown to segregate as a single gene (de Belle and Sokolowski, 1987), which was later mapped to the left arm of chromosome 2 (de Belle and Sokolowski, 1989). A mutagenesis procedure, known as lethal tagging, was used to further map the gene. Briefly, flies were mutated with radiation and screened for both lethality and altered foraging path length behaviour. The lethality was then mapped by complementation analysis, and the behavioural alteration was mapped relative to the lethality. This further refined the location to a small region of chromosome 2 corresponding to polytene position 24A3-24C5 (de Belle et al., 1989). The rover/sitter behavioural difference was found to vary in response to food quality, and it was exhibited throughout most of the larval instars (Graf and Sokolowski, 1989). foraging has subsequently been shown to be pleiotropic, involved in a suite of other behavioural and physiological traits for which rovers and sitters exhibit differences (Reaume and Sokolowski, 2009). Additional feeding-related traits were shown to involve the foraging allelic polymorphism: these include food intake, nutrient absorption, and fat metabolism and storage (Kaun et al., 2007; Kaun et al., 2008). Compared to sitters, rovers forage over longer distances, eat less, absorb more, and preferentially store their acquired nutrients as carbohydrates over lipids. Many of these patterns are also seen in the adult fly (Pereira and Sokolowski, 1993; Kent et al., 2009).

foraging’s gene products

foraging encodes a cGMP-dependent protein kinase (PKG) in Drosophila (Kalderon and Rubin, 1989; Osborne et al., 1997; Celniker et al., 2002). That multiple functions are associated with a PKG is not surprising given the many interacting targets that this family of proteins has in different signaling pathways (Schlossmann and Desch, 2009). These interacting targets include ion channels, G proteins, and transcription factors. A single gene product activating multiple different cascades would fall into model 2 discussed above. Mammalian PKG genes encode multiple protein isoforms which have been shown to differ in their expression, biochemical activity, and targets (Smith et al., 1996; Hofmann et al., 2009). This presents us with an interesting alternative, in which independent regulation of distinct gene products with differing functions elicits independent phenotypic effects (model 3 above). foraging produces multiple gene products.
It was initially characterized as producing upwards of 8 RNAs, all of which share a common 3’-end coding for the catalytic domain of the PKG protein (Kalderon and Rubin, 1989). Only one TSS was identified in that study, but as this thesis shows, there are multiple. Prior to the experiments described in this thesis, foraging had eleven annotated transcripts, which were predicted to code for four different protein isoforms. Previously uncharacterized transcripts were deduced from EST libraries and computer annotation (Stapleton et al., 2002). All eleven transcripts were predicted to share a common 3’-UTR, and transcripts that coded for the same protein isoform only differed in their 5’-UTR. These transcripts were capable of producing unique proteins (Kalderon and Rubin, 1989; Belay et al., 2007; MacPherson et al., 2004). In general, rovers have higher transcript, protein and enzyme activity levels than sitters (Osborne et al., 1997; Belay et al., 2007; Kaun et al., 2007; Dawson-Scully et al., 2010). Little was known about the transcriptional regulation of the foraging gene in Drosophila melanogaster. It is important to understand how foraging is regulated in order to address the mechanisms mediating foraging’s diverse phenotypes.

foraging is conserved

PKGs are evolutionarily ancient kinases and are found in many diverse taxa across eukarya (Manning et al., 2002). They are members of the larger AGC kinase group, alongside the cAMP-dependent protein kinases (PKA) and protein kinase C (PKC). PKGs are thought to have originated from a PKA duplication, and even now both PKG and PKA can be activated by each other's cNMP (Pearce et al., 2010). It is not surprising, then, that foraging’s sequence is conserved in many taxa (Fitzpatrick and Sokolowski, 2004). Many metazoans, from nematodes to insects to mammals, have multiple PKG genes (Fitzpatrick and Sokolowski, 2004). The duplication of PKG is thought to have occurred in early metazoans. foraging’s orthologues in many different species are pleiotropic and have also been associated with behavioural effects. Mutating foraging’s homologue in nematodes, egl-4, resulted in greater time spent roaming instead of dwelling (Fujiwara et al., 2002), as well as in multiple other pleiotropic behavioural and metabolic phenotypes, such as effects on growth, life span, body size and fat levels (Hirose et al., 2003; Raizen et al., 2006). In the harvester ant, foraging expression levels were associated with caste-specific roles in food search, where foragers have lower expression than workers (Ingram et al., 2005). A role for foraging in task-specific behaviours was also found in another ant, Pheidole pallidula (Lucas and Sokolowski, 2009). In honey bees, an increase in foraging gene expression was linked to a developmental transition from nurse to forager behaviour (Ben-Shahar et al., 2002). foraging’s mammalian homologue, cGKI, has also been implicated in multiple behaviours (Hofmann et al., 2006). In mice and humans, manipulations of cGKI resulted in altered fat levels and body size (Miyashita et al., 2009; Pfeifer et al., 1998). The conserved nature of foraging, both in sequence and in pleiotropic function, furthers its importance as a model in behaviour genetics and metabolism.

The foraging gene as a model to study the mechanisms of pleiotropy

Many genes that influence behaviour are pleiotropic (Sokolowski, 2001). Expression throughout development and in many tissues is common, including when genes are involved in metabolic functions (Hall, 1994). Such broad expression is also common for vital genes that influence behaviour. The foraging gene is an important example of a single gene with natural allelic variants that influence multiple developmental and physiological phenotypes involved with behaviour. With its multiple isoforms and multiple associated phenotypes, foraging is a good candidate to study the mechanisms of pleiotropy. In this thesis I address the broad question: How does the foraging gene produce its broad range of pleiotropic effects? I hypothesize that foraging’s gene products have isoform-specific functions to produce the gene’s pleiotropic effects. The specific aims of this thesis are:

Aim I: Molecular characterization of the foraging gene and its products. I conducted an in-depth analysis of foraging's gene products. I identified the transcription start sites (TSSs), transcription termination site (polyA) and the splicing patterns of the locus. (Chapter 2)

Aim II: Create an amorph of the locus. I generated a genetic null mutation of the foraging gene using HR, and assayed its phenotypic effects. I generated a genomic rescue construct of the entire locus, using recombineering. (Chapter 2)

Aim III: Transcriptional regulation of the foraging gene. I generated foraging-Gal4 constructs to identify cis-regulatory elements (CREs) in the foraging locus and localized their spatial and temporal expression. (Chapters 3 and 4)

Aim IV: Promoter-specific function. Once the products and promoters of the foraging gene had been characterized, I assessed the promoter-specific functions of the gene products using gene-specific drivers, cDNA and RNAi to alter foraging’s expression in temporal- and spatial-specific patterns, and assayed for altered phenotypes. (Chapters 3 and 4)

Knowledge of the mechanistic basis underlying the generation of foraging’s mRNA and protein products will permit the investigation of the means by which these products influence foraging’s phenotypes.

The majority of this thesis will focus on the larval stage of Drosophila development. Given the recent work in the lab on adult phenotypes, I will extend our expression and functional analysis to the adult stage of development in Chapter 4. The chapters that follow will lead to separate publications. As such, there may be some repetition of information.

References:

Adelman, K., and Lis, J.T. (2012). Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet. 13, 720–731.

Barrows, W.M. (1907). The reactions of the pomace fly, Drosophila ampelophila Loew, to odorous substances. J. Exp. Zool. 4, 515–537.

Bassett, A.R., Tibbit, C., Ponting, C.P., and Liu, J.-L. (2013). Highly efficient targeted mutagenesis of Drosophila with the CRISPR/Cas9 system. Cell Reports 4, 220–228.

Beadle, G.W., and Tatum, E.L. (1941). Genetic control of biochemical reactions in Neurospora. PNAS 27, 499–506.

Belay, A.T., Scheiner, R., So, A.K.-C., Douglas, S.J., Chakaborty-Chatterjee, M., Levine, J.D., and Sokolowski, M.B. (2007). The foraging gene of Drosophila melanogaster: Spatial-expression analysis and sucrose responsiveness. J. Comp. Neurol. 504, 570–582.

de Belle, J.S., and Sokolowski, M.B. (1989). Rover/sitter foraging behavior in Drosophila melanogaster: Genetic localization to chromosome 2L using compound autosomes. J. Insect Behav. 2, 291–299.

de Belle, J.S., and Heisenberg, M. (1996). Expression of Drosophila mushroom body mutations in alternative genetic backgrounds: a case study of the mushroom body miniature gene (mbm). PNAS 93, 9875–9880.

de Belle, J.S., and Sokolowski, M.B. (1987). Heredity of rover/sitter: Alternative foraging strategies of Drosophila melanogaster larvae. Heredity 59, 73–83.

de Belle, J.S., Hilliker, A.J., and Sokolowski, M.B. (1989). Genetic localization of foraging (for): a major gene for larval behavior in Drosophila melanogaster. Genetics 123, 157–163.

Ben-Shahar, Y., Robichon, A., Sokolowski, M.B., and Robinson, G.E. (2002). Influence of gene action across different time scales on behavior. Science 296, 741–744.

Benzer, S. (1967). Behavioral mutants of Drosophila isolated by countercurrent distribution. PNAS 58, 1112–1119.

Bischof, J., Maeda, R.K., Hediger, M., Karch, F., and Basler, K. (2007). An optimized transgenesis system for Drosophila using germ-line-specific φC31 integrases. PNAS 104, 3312–3317.

Brand, A.H., and Perrimon, N. (1993). Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development 118, 401–415.

Burke, T.W., and Kadonaga, J.T. (1996). Drosophila TFIID binds to a conserved downstream basal promoter element that is present in many TATA-box-deficient promoters. Genes Dev. 10, 711–724.

Carpenter, F.W. (1905). The reactions of the pomace fly (Drosophila ampelophila Loew) to light, gravity, and mechanical stimulation. American Naturalist 39, 157–171.

Celniker, S.E., Wheeler, D.A., Kronmiller, B., Carlson, J.W., Halpern, A., Patel, S., Adams, M., Champe, M., Dugan, S.P., Frise, E. et al. (2002). Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence. Genome Biol. 3, RESEARCH0079.

Chandler, C.H., Chari, S., Tack, D., and Dworkin, I. (2014). Causes and consequences of genetic background effects illuminated by integrative genomic analysis. Genetics 196, 1321–1336.

Chintapalli, V.R., Wang, J., and Dow, J.A.T. (2007). Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat. Genet. 39, 715–720.

Dawson-Scully, K., Bukvic, D., Chakaborty-Chatterjee, M., Ferreira, R., Milton, S.L., and Sokolowski, M.B. (2010). Controlling anoxic tolerance in adult Drosophila via the cGMP-PKG pathway. J. Exp. Biol. 213, 2410–2416.

Dekker, J., Rippe, K., Dekker, M., and Kleckner, N. (2002). Capturing chromosome conformation. Science 295, 1306–1311.

Dietzl, G., Chen, D., Schnorrer, F., Su, K.-C., Barinova, Y., Fellner, M., Gasser, B., Kinsey, K., Oppel, S., Scheiblauer, S. et al. (2007). A genome-wide transgenic RNAi library for conditional gene inactivation in Drosophila. Nature 448, 151–156.

Dorsett, D. (1993). Distance-independent inactivation of an enhancer by the suppressor of hairy-wing DNA-binding protein of Drosophila. Genetics 134, 1135–1144.

Fitzpatrick, M.J., and Sokolowski, M.B. (2004). In Search of food: Exploring the evolutionary link between cGMP-dependent protein kinase (PKG) and behaviour. Integr. Comp. Biol. 44, 28–36.

Fujiwara, M., Sengupta, P., and McIntire, S.L. (2002). Regulation of body size and behavioral state of C. elegans by sensory perception and the EGL-4 cGMP-dependent protein kinase. Neuron 36, 1091–1102.

Gershenzon, N.I., Trifonov, E.N., and Ioshikhes, I.P. (2006). The features of Drosophila core promoters revealed by statistical analysis. BMC Genomics 7, 161.

Gong, W.J., and Golic, K.G. (2003). Ends-out, or replacement, gene targeting in Drosophila. PNAS 100, 2556–2561.

Graf, S.A., and Sokolowski, M.B. (1989). Rover/sitter Drosophila melanogaster larval foraging polymorphism as a function of larval development, food-patch quality, and starvation. J. Insect Behav. 2, 301–313.

Greenspan, R.J. (1997). A kinder, gentler genetic analysis of behavior: dissection gives way to modulation. Current Opinion in Neurobiology 7, 805–811.

Gruneberg, H. (1938). An analysis of the “pleiotropic” effects of a new lethal mutation in the rat (Mus norvegicus). Proceedings of the Royal Society of London. Series B, Biological Sciences 125, 123–144.

Hall, J.C. (1994). Pleiotropy of behavioral genes. In Flexibility and constraints in behavioral systems, C.P. Kyriacou and R.J. Greenspan, eds. (John Wiley and Sons Ltd), pp. 15–27.

Hirose, T., Nakano, Y., Nagamatsu, Y., Misumi, T., Ohta, H., and Ohshima, Y. (2003). Cyclic GMP-dependent protein kinase EGL-4 controls body size and lifespan in C. elegans. Development 130, 1089–1099.

Hirsch, J. (1963). Behavior genetics and individuality understood. Science 142, 1436–1442.

Hirsch, J., and Erlenmeyer-Kimling, L. (1961). Sign of taxis as a property of the genotype. Science 134, 835–836.

Hirsch, J., and Tryon, R.C. (1956). Mass screening and reliable individual measurement in the experimental behavior genetics of lower organisms. Psychol. Bull 53, 402–410.

Hodgkin, J. (1998). Seven types of pleiotropy. Int. J. Dev. Biol. 42, 501–505.

Hofmann, F., Feil, R., Kleppisch, T., and Schlossmann, J. (2006). Function of cGMP-dependent protein kinases as revealed by gene deletion. Physiological Reviews 86, 1–23.

Hoskins, R.A., Landolin, J.M., Brown, J.B., Sandler, J.E., Takahashi, H., Lassmann, T., Yu, C., Booth, B.W., Zhang, D., Wan, K.H. et al. (2011). Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 21, 182–192.

Huet, F., Lu, J.T., Myrick, K.V., Baugh, L.R., Crosby, M.A., and Gelbart, W.M. (2002). A deletion-generator compound element allows deletion saturation analysis for genome-wide phenotypic annotation. PNAS 99, 9948–9953.

Ingram, K.K., Oefner, P., and Gordon, D.M. (2005). Task-specific expression of the foraging gene in harvester ants. Molecular Ecology 14, 813–818.

Kalderon, D., and Rubin, G.M. (1989). cGMP-dependent protein kinase genes in Drosophila. J. Biol. Chem. 264, 10738–10748.

Kaun, K.R., Riedl, C.A.L., Chakaborty-Chatterjee, M., Belay, A.T., Douglas, S.J., Gibbs, A.G., and Sokolowski, M.B. (2007). Natural variation in food acquisition mediated via a Drosophila cGMP-dependent protein kinase. J. Exp. Biol. 210, 3547–3558.

Kaun, K.R., Chakaborty-Chatterjee, M., and Sokolowski, M.B. (2008). Natural variation in plasticity of glucose homeostasis and food intake. J. Exp. Biol. 211, 3160–3166.

Kent, C.F., Daskalchuk, T., Cook, L., Sokolowski, M.B., and Greenspan, R.J. (2009). The Drosophila foraging gene mediates adult plasticity and gene–environment interactions in behaviour, metabolites, and gene expression in response to food deprivation. PLoS Genet 5, e1000609.

Kharchenko, P.V., Alekseyenko, A.A., Schwartz, Y.B., Minoda, A., Riddle, N.C., Ernst, J., Sabo, P.J., Larschan, E., Gorchakov, A.A., Gu, T. et al. (2011). Comprehensive analysis of the chromatin landscape in Drosophila melanogaster. Nature 471, 480–485.

Knight, C.G., Zitzmann, N., Prabhakar, S., Antrobus, R., Dwek, R., Hebestreit, H., and Rainey, P.B. (2006). Unraveling adaptive evolution: how a single point mutation affects the protein coregulation network. Nat. Genet. 38, 1015–1022.

Konopka, R.J., and Benzer, S. (1971). Clock mutants of Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. 68, 2112–2116.

Lécuyer, E., Yoshida, H., Parthasarathy, N., Alm, C., Babak, T., Cerovina, T., Hughes, T.R., Tomancak, P., and Krause, H.M. (2007). Global analysis of mRNA localization reveals a prominent role in organizing cellular architecture and function. Cell 131, 174–187.

Li, J., and Gilmour, D.S. (2011). Promoter proximal pausing and the control of gene expression. Curr. Opin. Genet. Dev. 21, 231–235.

Lifton, R.P., Goldberg, M.L., Karp, R.W., and Hogness, D.S. (1978). The organization of the histone genes in Drosophila melanogaster: functional and evolutionary implications. Cold Spring Harb. Symp. Quant. Biol. 42 Pt 2, 1047–1051.

Lindsley, D.L., and Zimm, G.G. (1992). In The genome of Drosophila melanogaster, D.L.L.G. Zimm, ed. (San Diego: Academic Press), pp. vii – viii.

Lucas, C., and Sokolowski, M.B. (2009). Molecular basis for changes in behavioral state in ant social behaviors. PNAS 106, 6351–6356.

Mackay, T.F.C., and Anholt, R.R.H. (2006). Of flies and man: Drosophila as a model for human complex traits. Annu. Rev. Genomics Hum. Genet. 7, 339–367.

MacPherson, M.R., Lohmann, S.M., and Davies, S.-A. (2004). Analysis of Drosophila cGMP-dependent protein kinases and assessment of their in vivo roles by targeted expression in a renal transporting epithelium. J. Biol. Chem. 279, 40026–40034.

Maeda, R.K., and Karch, F. (2007). Making connections: boundaries and insulators in Drosophila. Curr. Opin. Genet. Dev. 17, 394–399.

Manning, G., Plowman, G.D., Hunter, T., and Sudarsanam, S. (2002). Evolution of protein kinase signaling from yeast to man. Trends in Biochemical Sciences 27, 514–520.

Des Marais, D.L., and Rausher, M.D. (2008). Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454, 762–765.

Markstein, M., Pitsouli, C., Villalta, C., Celniker, S.E., and Perrimon, N. (2008). Exploiting position effects and the gypsy retrovirus insulator to engineer precisely expressed transgenes. Nat. Genet. 40, 476–483.

McEwen, R.S. (1918). The reactions to light and to gravity in Drosophila and its mutants. J. Exp. Zool. 25, 49–106.

McGuire, S.E., Le, P.T., Osborn, A.J., Matsumoto, K., and Davis, R.L. (2003). Spatiotemporal rescue of memory dysfunction in Drosophila. Science 302, 1765–1768.

Medawar, P.B. (1952). An unsolved problem of biology. (University College London, H.K. Lewis & Co. Ltd., London).

Mendel, J. (1866). Versuche über Pflanzenhybriden. Verhandlungen des naturforschenden Vereines in Brünn, Bd. IV für das Jahr 1865, Abhandlungen: 3–47. English translation at http://www.mendelweb.org/Mendel.html.

Miyashita, K., Itoh, H., Tsujimoto, H., Tamura, N., Fukunaga, Y., Sone, M., Yamahara, K., Taura, D., Inuzuka, M., Sonoyama, T. et al. (2009). Natriuretic peptides/cGMP/cGMP-dependent protein kinase cascades promote muscle mitochondrial biogenesis and prevent obesity. Diabetes 58, 2880–2892.

modENCODE Consortium, Roy, S., Ernst, J., Kharchenko, P.V., Kheradpour, P., Negre, N., Eaton, M.L., Landolin, J.M., Bristow, C.A., Ma, L. et al. (2010). Identification of functional elements and regulatory circuits by Drosophila modENCODE. Science 330, 1787–1797.

Morgan, T.H. (1910). Sex limited inheritance in Drosophila. Science 32, 120–122.

Moses, A., and Sinha, S. (2009). Regulatory motif analysis. In Bioinformatics, D. Edwards, J. Stajich, and D. Hansen, eds. (Springer New York), pp. 137–163.

Ni, J.-Q., Liu, L.-P., Binari, R., Hardy, R., Shim, H.-S., Cavallaro, A., Booker, M., Pfeiffer, B.D., Markstein, M., Wang, H. et al. (2009). A Drosophila resource of transgenic RNAi lines for neurogenetics. Genetics 182, 1089–1100.

Ohler, U., Liao, G., Niemann, H., and Rubin, G.M. (2002). Computational analysis of core promoters in the Drosophila genome. Genome Biol 3, research0087.1–87.12.

Orr, H.A. (2000). Adaptation and the cost of complexity. Evolution 54, 13–20.

Osborne, K.A., Robichon, A., Burgess, E., Butland, S., Shaw, R.A., Coulthard, A., Pereira, H.S., Greenspan, R.J., and Sokolowski, M.B. (1997). Natural behavior polymorphism due to a cGMP-dependent protein kinase of Drosophila. Science 277, 834–836.

Paaby, A.B., and Rockman, M.V. (2013). The many faces of pleiotropy. Trends in Genetics 29, 66–73.

Parks, A.L., Cook, K.R., Belvin, M., Dompe, N.A., Fawcett, R., Huppert, K., Tan, L.R., Winter, C.G., Bogart, K.P., Deal, J.E. et al. (2004). Systematic generation of high-resolution deletion coverage of the Drosophila melanogaster genome. Nat. Genet. 36, 288–292.

Pavlicev, M., and Wagner, G.P. (2012). A model of developmental evolution: selection, pleiotropy and compensation. Trends Ecol. Evol. (Amst.) 27, 316–322.

Pearce, L.R., Komander, D., and Alessi, D.R. (2010). The nuts and bolts of AGC protein kinases. Nat. Rev. Mol. Cell Biol. 11, 9–22.

Pereira, H.S., and Sokolowski, M.B. (1993). Mutations in the larval foraging gene affect adult locomotory behavior after feeding in Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. 90, 5044–5046.

Pfeifer, A., Klatt, P., Massberg, S., Ny, L., Sausbier, M., Hirneiss, C., Wang, G.X., Korth, M., Aszódi, A., Andersson, K.E. et al. (1998). Defective smooth muscle regulation in cGMP kinase I-deficient mice. EMBO J 17, 3045–3051.

Plate, L. (1910). Genetics and evolution, pp. 536-610 in Festschrift zum sechzigsten Geburtstag Richard Hertwigs. Fischer, Jena, Germany.

Potter, C.J., Tasic, B., Russler, E.V., Liang, L., and Luo, L. (2010). The Q system: a repressible binary system for transgene expression, lineage tracing, and mosaic analysis. Cell 141, 536–548.

Price, D.H. (2008). Poised polymerases: on your mark...get set...go! Mol. Cell 30, 7–10.

Pyeritz, R.E. (1989). Pleiotropy revisited: molecular explanations of a classic concept. American Journal of Medical Genetics 34, 124–134.

Rach, E.A., Yuan, H.-Y., Majoros, W.H., Tomancak, P., and Ohler, U. (2009). Motif composition, conservation and condition-specificity of single and alternative transcription start sites in the Drosophila genome. Genome Biol. 10, R73.

Raizen, D.M., Cullison, K.M., Pack, A.I., and Sundaram, M.V. (2006). A novel gain-of-function mutant of the cyclic GMP-dependent protein kinase egl-4 affects multiple physiological processes in Caenorhabditis elegans. Genetics 173, 177–187.

Reaume, C.J., and Sokolowski, M.B. (2009). cGMP-dependent protein kinase as a modifier of behaviour. In cGMP: Generators, effectors and therapeutic implications, H.H.H.W. Schmidt, F. Hofmann, and J.-P. Stasch, eds. (Springer Berlin Heidelberg), pp. 423–443.

Schlossmann, J., and Desch, M. (2009). cGK substrates. In cGMP: Generators, effectors and therapeutic implications, H.H.H.W. Schmidt, F. Hofmann, and J.-P. Stasch, eds. (Springer Berlin Heidelberg), pp. 163–193.

Scott, J.P. (1943). Effects of single genes on the behavior of Drosophila. The American Naturalist 77, 184–190.

Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A., and Cavalli, G. (2012). Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472.

Smale, S.T., and Baltimore, D. (1989). The “initiator” as a transcription control element. Cell 57, 103–113.

Smith, J.A., Francis, S.H., Walsh, K.A., Kumar, S., and Corbin, J.D. (1996). Autophosphorylation of type Iβ cGMP-dependent protein kinase increases basal catalytic activity and enhances allosteric activation by cGMP or cAMP. J. Biol. Chem. 271, 20756–20762.

Sokolowski, M.B. (1980). Foraging strategies of Drosophila melanogaster: a chromosomal analysis. Behav. Genet. 10, 291–302.

Sokolowski, M.B. (2001). Drosophila: Genetics meets behaviour. Nat. Rev. Genet. 2, 879–890.

Solovieff, N., Cotsapas, C., Lee, P.H., Purcell, S.M., and Smoller, J.W. (2013). Pleiotropy in complex traits: challenges and strategies. Nat. Rev. Genet. 14, 483–495.

Stapleton, M., Liao, G., Brokstein, P., Hong, L., Carninci, P., Shiraki, T., Hayashizaki, Y., Champe, M., Pacleb, J., Wan, K. et al. (2002). The Drosophila gene collection: identification of putative full-length cDNAs for 70% of D. melanogaster genes. Genome Res. 12, 1294–1300.

Stearns, F.W. (2010). One hundred years of pleiotropy: A retrospective. Genetics 186, 767–773.

Thibault, S.T., Singer, M.A., Miyazaki, W.Y., Milash, B., Dompe, N.A., Singh, C.M., Buchholz, R., Demsky, M., Fawcett, R., Francis-Lang, H.L. et al. (2004). A complementary transposon tool kit for Drosophila melanogaster using P and piggyBac. Nat. Genet. 36, 283–287.

Tomarev, S.I., and Piatigorsky, J. (1996). Lens crystallins of invertebrates--diversity and recruitment from detoxification enzymes and novel proteins. Eur. J. Biochem. 235, 449–465.

Tully, T. (1996). Discovery of genes involved with learning and memory: An experimental synthesis of Hirschian and Benzerian perspectives. Proc. Natl. Acad. Sci. U.S.A. 93, 13460–13467.

Venken, K.J.T., and Bellen, H.J. (2005). Emerging technologies for gene manipulation in Drosophila melanogaster. Nat. Rev. Genet. 6, 167–178.

Venken, K.J.T., He, Y., Hoskins, R.A., and Bellen, H.J. (2006). P[acman]: A BAC transgenic platform for targeted insertion of large DNA fragments in D. melanogaster. Science 314, 1747–1751.

Venken, K.J.T., Kasprowicz, J., Kuenen, S., Yan, J., Hassan, B.A., and Verstreken, P. (2008). Recombineering-mediated tagging of Drosophila genomic constructs for in vivo localization and acute protein inactivation. Nucl. Acids Res. 36, e114–e114.

Warming, S., Costantino, N., Court, D.L., Jenkins, N.A., and Copeland, N.G. (2005). Simple and highly efficient BAC recombineering using galK selection. Nucl. Acids Res. 33, e36–e36.

Wasserman, W.W., and Sandelin, A. (2004). Applied bioinformatics for the identification of regulatory elements. Nat. Rev. Genet. 5, 276–287.

Williams, G.C. (1957). Pleiotropy, natural selection, and the evolution of senescence. Evolution 11, 398–411.

Wu, J.Q., and Snyder, M. (2008). RNA polymerase II stalling: loading at the start prepares genes for a sprint. Genome Biol. 9, 220.

Yagi, R., Mayer, F., and Basler, K. (2010). Refined LexA transactivators and their use in combination with the Drosophila Gal4 system. Proc. Natl. Acad. Sci. U.S.A. 107, 16166–16171.

Zeitlinger, J., Stark, A., Kellis, M., Hong, J.-W., Nechaev, S., Adelman, K., Levine, M., and Young, R.A. (2007). RNA polymerase stalling at developmental control genes in the Drosophila melanogaster embryo. Nat. Genet. 39, 1512–1516.

Zhao, Z., Tavoosidana, G., Sjölinder, M., Göndör, A., Mariano, P., Wang, S., Kanduri, C., Lezcano, M., Singh Sandhu, K., Singh, U., et al. (2006). Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat. Genet. 38, 1341–1347.

Zhu, Q., and Halfon, M.S. (2009). Complex organizational structure of the genome revealed by genome-wide analysis of single and alternative promoters in Drosophila melanogaster. BMC Genomics 10, 9.

Chapter 2

A Reverse Genetic Dissection of the foraging gene in the Fruit Fly (Drosophila melanogaster)

The vast majority of the experimental research in this chapter was performed by Aaron Allen. However, some of the experiments were collaborative, with data contributed by others. Those data and the contributors are listed below:

● The homologous recombination and recombineering experiments were conducted with the aid and advice of our collaborators Stephen Goodwin and Megan Neville (University of Oxford).
● Df(2L)for08112-3 and Df(2L)for32122-1 partial deletions: crosses to generate the deletions were carried out by Dr. Scott Douglas; molecular and phenotypic characterization of the mutants by Aaron Allen.
● Df(2L)ford0f0 and Df(2L)forf0e0: crosses to generate the deletions were carried out by Dr. Scott Douglas; molecular characterization of the mutants by Dr. Scott Douglas and Aaron Allen.
● Table 2, lethal complementation analysis: crosses were performed by Aaron Allen, Dr. Scott Douglas, Dr. Amsale Belay, and Dr. Hisato Kuniyoshi.
● Figure 16C, partial deletion lipid analysis: Dr. Scott Douglas.

Abstract:

Natural variants and single-gene induced mutations can provide invaluable insight into the modes of action of the pleiotropic genes that influence behaviour. The foraging gene of Drosophila melanogaster has long been a pivotal example of a single gene with natural variants that differentially influence behaviour. foraging has been shown to be highly pleiotropic, influencing many different phenotypes. The molecular nature of foraging's alleles and how foraging influences many of its phenotypes are still not known. Here I characterized the sequences of the rover and sitter alleles and conducted a genetic dissection of the locus to uncover its role in foraging behaviour, ingestion behaviour and lipid content of the larva. I sequenced multiple isogenized isolates from rover and sitter populations. I constructed a consensus sequence for both the rover and sitter alleles. Comparisons between these sequences uncovered 425 non-identical sites. These differences were primarily in intronic regions. There were only two non-synonymous coding region mutations. I identified 4 transcription start sites and 1 transcription termination site. This supported a gene model with 4 promoters that produce RNA with a common 3’-end. I identified many novel alternative splice variants. Our current gene model contains 21 transcripts, which code for 9 distinct proteins. I generated a full genetic deletion of the foraging gene. This null mutant was pupal lethal and died during the pharate adult stage. This stage of lethality allowed us to investigate phenotypes during the larval stage of development. Null mutant larvae had reduced foraging behaviour, reduced ingestion behaviour and increased lipid levels. I rescued these phenotypes with a genomic rescue transgene, generated with recombineering. By comparing the phenotypes of the rover and sitter natural alleles to the mutants generated in this thesis, I inferred independent regulation of the foraging and ingestion behaviours. 
I was unable to conclude if the altered lipid metabolism phenotype was independently regulated from either the foraging or ingestion behaviour. These new tools conclusively showed that foraging is pleiotropic, influencing multiple feeding-related phenotypes. This study sheds light on the regulation of the foraging gene and the nature of the rover and sitter allelic variants.

Introduction:

foraging and its feeding-related phenotypes

The Drosophila melanogaster foraging gene is a well-established model in behaviour genetic research (Sokolowski, 2001). The foraging gene was initially characterized because of its influence on individual differences in larval locomotion on a nutritive medium between two strains called rover and sitter (Sokolowski, 1980). On a nutritive medium, rovers foraged over longer distances than sitters. This difference in behaviour was mapped to the left arm of chromosome 2 in cytological position 24A3-C5 (de Belle et al., 1989). The gene was later cloned and shown to encode a cGMP-dependent protein kinase, PKG (Osborne et al., 1997). The rover and sitter alleles of the foraging gene were subsequently shown to differ in a suite of other behavioural and physiological traits, and as such foraging is pleiotropic. Additional feeding-related traits associated with variation in the foraging gene include food intake, nutrient absorption, and fat metabolism and storage (Kaun et al., 2007; Kent et al., 2009; Belay et al., 2010). foraging serves as an important example of how naturally occurring variants can be used to deduce the functions of a gene.

Conserved nature of the foraging gene

foraging’s sequence is conserved among many taxa. PKGs are found in many eukaryotes, from protists to green algae, fungi, and humans (Fitzpatrick and Sokolowski, 2004; Manning et al., 2002). Furthermore, PKGs influence behaviour in a variety of other organisms such as nematodes (Fujiwara et al., 2002; L’Etoile et al., 2002; Hirose et al., 2003; Raizen et al., 2006; Raizen et al., 2008), ants (Ingram et al., 2005; Lucas and Sokolowski, 2009), honey bees (Ben-Shahar et al., 2002) and mammals (Hofmann et al., 2006). The metabolic functions of foraging are also seen in other organisms. Manipulations in mice of cGKI, foraging’s orthologue, resulted in altered fat levels and body size (Miyashita et al., 2009; Pfeifer et al., 1998). Similarly in C. elegans, manipulations of foraging’s orthologue, egl-4, result in altered body size and lipid levels (Raizen et al., 2006). This conserved nature of foraging sequence and function furthers its importance as a model in behaviour genetics and metabolism.

Improving upon natural variant associations

Much of the work conducted to date on the foraging gene has relied on the natural variants, rover and sitter and a hypomorphic mutant fors2. Although there are benefits to using hypomorphic mutants in behaviour analysis (Greenspan, 1997), we still do not know the amorphic phenotype of this gene. In addition, the natural variants and hypomorph may not uncover all of the functions of foraging. Techniques such as homologous recombination mediated gene targeting, HR (Gong and Golic, 2003), and Flp/FRT carrying transposable elements (Thibault et al., 2004; Parks et al., 2004) allow for the precise deletion of all, or part of foraging. These deletions would allow us to uncover the amorphic phenotype. Deletions in foraging enable us to uncover the relevant isoforms or tissues in which foraging influences its multiple phenotypes. foraging was previously characterized to produce multiple transcripts and protein isoforms (Kalderon and Rubin, 1989; Stapleton et al., 2002). Microarray and RNA-Seq studies revealed that foraging is expressed in many tissues throughout development (Chintapalli et al., 2007; Graveley et al., 2011). These studies identified many candidate tissues for foraging’s function for further investigation. Targeted expression of foraging is possible with the Gal4/UAS system (Brand and Perrimon, 1993). These expression patterns can be further temporally restricted with temperature sensitive manipulations of the GAL80ts gene by changing the incubation temperature (McGuire et al., 2003). Genome wide RNAi collections allow for targeted reduction of foraging expression in a tissue-specific manner (Dietzl et al., 2007; Ni et al., 2010). Here, I generate a null mutant of the foraging gene, to identify the amorphic phenotype in larval foraging, food intake, and lipid metabolism. 
To further resolve our genotype-phenotype map for this locus, I perform a dissection and rigorous characterization of the gene structure and its products, shedding light on the multiple functions of the foraging gene.

Materials and Methods

Fly Strains and Rearing

Strains were reared in 40 mL vials with 10 mL of food and 170 mL bottles with 40 mL of food at 25 ± 1ºC with a 12L:12D photoperiod. Our food recipe contained 1.5 % sucrose, 1.4 % agar, 3 % glucose, 1.5 % cornmeal, 1 % wheat germ, 1 % soy flour, 3 % molasses, 3.5 % yeast, 0.5 % propionic acid, 0.2 % Tegosept and 1 % ethanol in water. Fly strains for homologous recombination were acquired from Stephen Goodwin. They include yw, hsFLP, hsI-SceI/Y, hs-his and yw ; eyFLP5 and yw, eyFLP2 ; Pin/CyO. The nuclear GFP, UAS-Stinger was obtained from Joel Levine and the maternal driver, nos-Gal4, was obtained from Craig Smibert. The following lines were acquired from the Exelixis Collection at Harvard Medical School; (w* ; PBac{WH}forf04293/CyO), (w* ; PBac{WH}f00049/CyO), (w* ; PBac{WH}fore02991 /CyO), (w* ; PBac{WH}fore00955/CyO), and (w* ; P{XP}ford07690/CyO). The following lines were acquired from Bloomington Drosophila Stock Center: (y1, w67c23, P{Crey}1b ; snaSco/CyO), (w1118 ; Df(ED243)/SM6a), (y1, w* ; Df(2L)drm-P2/SM6b), (Df(2L)ed1/CyO ; P{ftz/lacC}1), (y1, w67c23 ; P{wHy}forDG23111/SM6a), (w1118 ; P{UAS-Dcr-2.D}2), and (w1118 ; Df(2L)Exel7018/CyO).

Larval Synchronization

To work with mid third instar larvae, I needed to synchronize their development. Five to seven day old adults were allowed to oviposit on a grape juice and agar medium (45 % grape juice, 2.5 % ethanol, 2.5 % acetic acid, 2 % agar in water) for 20 hrs. The following day, any hatched larvae were cleared. Following a four-hour incubation, newly hatched larvae were seeded into a 100 mm Petri dish containing 30 mL of our yeast-cornmeal-molasses-agar food. The plates were incubated at 25 ± 1 ºC in a 12L:12D photocycle until they reached mid third instar (72 ± 2 hrs. post hatch).

Bioinformatics

Bioinformatic analyses were conducted with the Geneious 8.1.7 software package (http://www.geneious.com, Kearse et al., 2012). These included primer design, editing and visualization of Sanger sequencing chromatograms, contig assembly, nucleotide alignments, in silico design of cloning schemes, and restriction endonuclease digestion prediction.

DNA Extraction

Larvae were ground in a sterile 1.5 mL centrifuge tube with 150 μL of Solution A (0.1 M Tris-HCl, 0.1 M EDTA, 0.1 M NaCl, 0.5 % SDS, pH 8.0). An additional 150 μL of Solution A was added and the flies were ground again. The homogenate was incubated at 65 °C for 30 min. 172 μL of 5 M KOAc and 428 μL of 6 M LiCl were added. The sample was mixed by inversion and then incubated on ice for 20 min. The mixture was then centrifuged for 20 min. at 12,000 RCF. 600 μL of supernatant was transferred to a fresh sterile 1.5 mL tube, and 450 μL of ice cold isopropanol was added, then centrifuged for 5 min. at 12,000 RCF. The supernatant was discarded and the pellet was washed in 70 % ethanol, dried, and then re-suspended in 50 μL of sterile 10 mM Tris-HCl, pH 8.0, 1 mM EDTA.

PCR, Restriction Digestion, and Ligation

PCR, restriction endonuclease digestions, and ligations were performed using New England Biolabs (NEB) products following the manufacturer's instructions. Routine PCR, such as colony PCR and RT-PCR, was performed using NEB Taq DNA Polymerase. Genomic regions for cloning and sequencing of the locus were amplified with NEB Phusion High-Fidelity DNA Polymerase. All restriction enzymes used were purchased from NEB and used according to the manufacturer's instructions. NEB T4 DNA ligase was used for ligation reactions in all cloning schemes.

E. coli Strains

The SW102 cells for recombineering (Warming et al., 2005) were obtained from the Biological Resources Branch of National Cancer Institute at Frederick. EPI300 cells were obtained from Epicentre. Standard cloning was conducted using NEB 5-alpha and NEB 10-beta cells from New England Biolabs.

Electrocompetent E. coli Cell Preparation

A 5 mL overnight culture was grown from an isolated colony in LB medium. The overnight culture was then diluted 1:50 with fresh medium. The larger inoculated culture was grown to an optical density of 0.6 at 600 nm (O.D.600 = 0.6). The culture was chilled on ice and washed with 5 volumes of ice cold sterile deionized water. The cells were then washed with 1 volume of ice cold 15 % glycerol. Every 10 mL of original culture was concentrated to a 50 μL aliquot for an individual transformation.

E. coli Transformation

Electrocompetent E. coli cells were removed from -80 ºC and thawed on ice. The cell suspension was mixed with 1 µL of ligation (or intact vector DNA). This suspension was then added to a 1 mm gap electroporation cuvette. Cells were then electroporated at 1.8 kV. Immediately following the electroporation, 1 mL of LB medium was added to the cuvette and mixed. The cell suspension was then transferred to a culture tube and incubated for 1 hr. in a 37 ºC shaker (30 ºC for recombineering). Following recovery, a dilution series of the transformation was plated on LB medium containing appropriate antibiotics and incubated at 37 ºC overnight (or 24 hours at 30 ºC for recombineering). Colonies were then inoculated in 5 mL of LB medium containing appropriate antibiotics and incubated at 37 ºC overnight in a shaker.

Plasmid Preparation

The cells were pelleted by centrifugation of 1 mL of the culture for 1 min. at 16,000 RCF. The supernatant was discarded and 200 μL of GTE buffer (50 mM glucose, 2.5 mM Tris-HCl, 10 mM EDTA, pH 8.0) was added. The tubes were then vortexed to resuspend the pellet, and then 400 μL of Lysis buffer (1 % SDS, 0.2 M NaOH) was added. The mixture was incubated at room temperature for 3 min. Then 300 μL of Neutralization buffer (3 M K+, 5 M OAc-) was added, briefly vortexed and allowed to incubate on ice for 5 min. The mixture was then centrifuged for 4 min. at 16,000 RCF and then the supernatant was transferred to a fresh 1.5 mL tube. The plasmids were precipitated by adding 500 μL of ice cold isopropanol, vortexed and then centrifuged for 5 min. The supernatant was discarded and the pellet was washed in 70 % ethanol. The pellet was re-suspended in 25 μL of 1x TE (10 mM Tris-HCl, 1 mM EDTA, pH 8.0). Any RNA present was then digested by adding 1 μL of RNase (20 μg/mL, Sigma) and allowed to incubate at room temperature for 1 min.

Gene model characterization

To sequence the foraging locus, I needed isogenic strains as a starting template. Rover and sitter alleles were isolated by balancer-mediated isogenization. Heterogeneous rover and heterogeneous sitter populations were crossed to Sp/CyO balancer flies. Single F1 individuals with foraging alleles balanced over CyO were backcrossed to Sp/CyO balancer flies. The progeny of each cross were then homozygosed by selecting against the Cy phenotype. Regions from the locus were amplified using NEB's Phusion polymerase, following the manufacturer's instructions. 15 overlapping regions spanning 40 kb were amplified from 4 separate isogenic rover alleles and 4 separate isogenic sitter alleles. These PCR fragments were used as templates for Sanger sequencing. The samples were sequenced on an ABI 3130xl Capillary Sequencer using the BigDye Terminator v3.1 Cycle Sequencing Kit following the manufacturer's instructions (Life Technologies). Transcription start sites (TSSs) were identified with 5’-RACE experiments that used homopolymeric tailing (Michelson and Orkin, 1982; Sambrook and Russell, 2001) and RNA ligase-mediated RACE using the GeneRacer Kit (Life Technologies). Total RNA was extracted from adult flies with TRIzol Reagent (Life Technologies). RNA was reverse transcribed with Superscript III (Life Technologies) and primed with random hexamers and oligo dT primers. Terminal Transferase (New England Biolabs) was used to add a poly-guanosine tail to the 5' end of the isolated cDNA. The 5' ends of the transcripts were amplified with an oligo dC primer and a gene-specific primer targeting the kinase region of the foraging gene. Transcription termination sites were identified by 3’-RACE using the GeneRacer Kit (Life Technologies) following the manufacturer's instructions. Splice variants were identified by PCR using primers targeting sequence near the TSS and the common coding 3' end exons. The resulting amplicons were cloned into the pGEM-T Easy vector (Promega). Clones were then sequenced by Sanger sequencing on an ABI 3130xl Capillary Sequencer using the BigDye Terminator v3.1 Cycle Sequencing Kit (Life Technologies).

Ends-Out gene targeting

An attP sequence was cloned into the pW25 (Gong and Golic, 2004) ends-out gene targeting vector between the w+mC gene and the P3 P element end. The attP sequence was amplified with PCR from the pCaryP (Groth et al., 2004) vector obtained from Addgene. Homology arms, roughly 5 kb in length, that flanked the foraging gene were cloned into the multiple cloning sites flanking the w+mC gene in the pW25-attP vector. This construct was designed to replace (delete) the foraging gene and leave behind an attP site. The vector was integrated into w1118 flies using P element transgenesis (performed by Genetic Services Inc.). This yielded an X-chromosome transformant. A series of crosses were conducted to mobilize the targeting construct and to allow homologous recombination (as in Rideout et al., 2010). Briefly, the element was crossed into a hs-Flp, hs-I-SceI containing background, and embryos and L1 larvae were heat shocked. Progeny were then crossed into an ey-Flp background and were screened for red eye colour. These crosses yielded multiple integrations at the foraging locus, but no deletions of the gene. The recombinants had integrated at either the 5'- or 3'-end of the locus, corresponding to the two homology arms. Because the pW25 vector contains loxP sites, it was still possible to generate a deletion of foraging. I crossed the 5' and 3' recombinants to make a trans-heterozygote in a hs-Cre background. Recombinants were screened by PCR, sequencing and Southern blot.

Recombineering

To generate new BACs for recombineering, a loxP site specific recombination sequence was added and the order of the attB-MCS was rearranged from the P[acman] (Venken et al., 2006) vector. I called the resulting vector p[attlox]. This vector was designed to be paired with the pW25-attP vector in order to have a minimal footprint upon re-integration of the engineered locus of interest. The P[acman] clone CH321-64J02 BAC available from Children’s Hospital Oakland Research Institute (http://www.chori.org) was used as the source of the foraging gene sequence. A gap repair protocol (as in Venken et al., 2006) was used to trim the larger BAC down to a 40 kb segment containing foraging. The BACs in the P[acman] library were generated using DNA isolated from the y1;cn1,bw1,sp1 genome strain, and contained a naturally occurring copia transposable element in the foraging gene. The galK selection/counter-selection (as in Warming et al., 2005) was used to remove the copia transposable element. The BAC was incorporated into the fly's genome using φC31 integration into the attP2 landing site on the third chromosome (Groth et al., 2004). Transgenesis was performed by Genetic Services Inc.

Western Blot Analysis

Twenty mid third instar larvae (72 ± 2 hrs. post hatch) were homogenized on ice in 400 μL of lysis buffer (50 mM Tris-HCl pH 7.5, 10 % glycerol, 150 mM NaCl, 1 % Triton-X 100, 5 mM EDTA, 1x Halt Protease Inhibitor Cocktail, 1x Halt Phosphatase Inhibitor Cocktail). Samples were centrifuged at 16,000 RCF at 4 °C and the supernatant was transferred to a new tube and placed on ice. Protein quantification was performed with the Pierce BCA kit. 20 μg of protein was denatured for 5 min. at 100 °C. The samples were run on a 4 % stacking/7 % resolving SDS-polyacrylamide gel at 150 V for 1 hr. in running buffer (25 mM Tris-HCl, 200 mM glycine, 0.1 % SDS). Proteins were transferred onto a nylon membrane at 100 V for 1 hr. in transfer buffer (25 mM Tris-HCl, 200 mM glycine, 10 % methanol). Blots were blocked for 2 hrs. in 5 % nonfat milk in 0.1 % Tween-20 in 1x TBS (0.1 % TBST). Primary antibody was incubated for 1 hr. at a concentration of 1:10,000 for anti-FOR (Belay et al., 2007) and 1:5,000 for anti-ACTIN (Sigma). The blots were rinsed twice and then washed 3x for 5 min. each in 0.1 % TBST. Blots were then incubated with HRP-conjugated secondary antibody (Jackson ImmunoResearch Laboratories) at a concentration of 1:10,000 for 45 min. They were then rinsed twice and washed 3x as before. Blots were incubated for 5 min. in GE’s ECL Prime Detection reagent and exposed to X-ray film.

Southern Blot

DNA was extracted using the high salt method described above. DNA was digested with restriction endonucleases as mentioned above. Samples were electrophoresed on an agarose gel and transferred to a nylon membrane, using capillary transfer, as described in Molecular Cloning (Sambrook and Russell, 2001). Probes were prepared using the New England Biolabs NEBlot Kit, a non-radioactive labelling kit, following the manufacturer's instructions. Blocking, hybridization and washing were conducted as described in Molecular Cloning (Sambrook and Russell, 2001).

RNAi lines

Multiple foraging-specific RNAi lines targeting different exons were generated using the pWIZ RNAi cloning vector (Lee and Carthew, 2003). A region complementary to part of exons 7 and 8 was used to generate a series of common coding RNAi lines. P element injections into w1118 were performed by BestGene Inc.

Partial Deletions of 3'-end foraging

The P{wHy}forDG23111 construct, located 40 bp 5' to exon 6, was used to delete the 3’-end of the foraging gene (as in Huet et al. 2002). Flies were crossed to a source of hobo transposase and screened for the loss of the white transgene. Since hobo has a “copy-and-paste” mode of replication, neighbouring elements were able to undergo homologous recombination. This deleted the intervening sequence, which included a dominant marker, either w+mC or y+t7.7, depending on the direction of the local hop. Deletion breakpoints were verified with PCR, iPCR, sequencing and Southern blotting (data not shown).

Partial Deletions of 5'-end foraging

Multiple partial deletions of the foraging gene were generated using the Flp/FRT system (as described in Thibault et al. 2004 and Parks et al. 2004). Flies carrying transposable elements with FRT sequences were obtained from The Exelixis Collection at Harvard Medical School. The following elements were used to generate the following deletions: f04293, e02991 -> Df(2L)forf0e0; d07690, f00049 -> Df(2L)ford0f0; e00955, f00049 -> Df(2L)fore0f0. The breakpoints of the deletions were confirmed with PCR and sequencing (data not shown).

Immunohistochemistry

Dissected samples were fixed in 4 % paraformaldehyde in 1x PBS for 1 h. The fixed samples were rinsed twice in 0.5 % Triton X in 1x PBS (PBT) and then washed 4x for 30 min. each in PBT. The samples were blocked in 10 % normal goat serum (NGS, Jackson ImmunoResearch Laboratories) and 0.1 % BSA (Sigma) in PBT for 2 hrs. at room temperature. Primary antibodies were diluted in blocking solution and incubated overnight at 4 °C. After primary incubation the samples were rinsed twice and then washed 4x for 30 min. each in PBT. Secondary antibody was diluted 1:1000 in blocking solution and incubated at room temperature for 2 hrs. Washing was conducted as described above for the primary antibody. Tissues were mounted on slides in Vectashield (Vector Labs). Samples were imaged using a Zeiss Axioscope epifluorescence microscope as well as Zeiss LSM 510 and Leica SP5 confocal microscopes. Images were analysed using the Fiji software package (Schindelin et al., 2012).

Triglyceride analysis

Groups of 10 larvae (72 ± 2 hrs. post hatch) were homogenized in 200 μL of 0.1 % Tween 20 in 1x PBS. Samples were incubated at 70 °C for 5 min. and then chilled for 2 min. on ice. Debris was pelleted by centrifugation at 16,000 RCF for 5 min. 25 μL of the supernatant was then mixed with 200 μL Infinity TAG Reagent (Fisher Scientific) in a 96-well spectrophotometer plate. Samples were incubated at 37 °C for 10 min. The absorbance was measured at 540 nm. A separate aliquot was mixed with Pierce BCA reagent to quantify protein levels in the samples. The BCA samples were incubated at 37 °C for 30 min. Standard curves were used for both the DAG/TAG and protein analysis. Lipid levels are displayed as μg glycerol/mg protein.
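The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in this thesis: it assumes linear standard curves, assumes that the lipid and protein amounts refer to equal volumes of the same homogenate, and uses invented standard and sample readings. The function names (fit_line, amount_from_absorbance, lipid_per_protein) are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_absorbance(absorbance, slope, intercept):
    """Invert a standard curve to convert an absorbance into an amount."""
    return (absorbance - intercept) / slope


def lipid_per_protein(lipid_abs, protein_abs, lipid_curve, protein_curve):
    """Return ug glycerol per mg protein for one sample."""
    glycerol_ug = amount_from_absorbance(lipid_abs, *lipid_curve)
    protein_mg = amount_from_absorbance(protein_abs, *protein_curve)
    return glycerol_ug / protein_mg


# Hypothetical standards (invented for illustration, not thesis data):
# glycerol in ug versus absorbance, protein in mg versus absorbance.
lipid_curve = fit_line([0.0, 1.0, 2.0, 4.0], [0.05, 0.15, 0.25, 0.45])
protein_curve = fit_line([0.0, 0.01, 0.02, 0.04], [0.10, 0.30, 0.50, 0.90])

# One hypothetical sample: absorbance in the lipid and protein assays.
print(lipid_per_protein(0.35, 0.50, lipid_curve, protein_curve))
```

Expressing lipid per unit protein rather than per larva corrects for differences in body size between genotypes, which is why both assays are run on the same homogenate.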

Food intake

Cell culture strainers (40 μm mesh, 20 mm diameter) placed in a 35 mm Petri dish were used as an arena. 700 μL of liquid food was placed into the cell culture dish. The food drained into the space below, but remained in contact with the mesh and would flow up by capillary action when eaten. The food contained 0.5 % fluorescein (Sigma), 5 % sucrose and 5 % yeast extract. Mid-third instar larvae (72 ± 2 hrs. post hatch) were removed from food plates, washed and then placed into the test arenas in groups of 10. Larvae were left to feed for 10 min. The cell culture strainers were then lifted out of the food and rinsed with water. Larvae were washed 3x in water. Washed individual larvae were placed into a 0.2 mL well of a 96-well PCR plate and frozen. Larvae were homogenized in 150 μL of 1x PBS with a stainless steel ball bearing and agitation. Samples were centrifuged at 3,500 RCF for 15 min. 20 μL of the supernatant was mixed with 180 μL of 1x PBS in a fluorometer plate, excited at 488 nm, and emission was measured at 562 nm. Homogenates from larvae that were fed food without fluorescein were used as a blank.
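The blank correction mentioned above amounts to subtracting the background signal of unlabelled larvae from each sample reading. A minimal Python sketch follows, assuming the mean of the blank wells is used as the background. That choice, the function name blank_corrected, and the readings are assumptions for illustration only.

```python
def blank_corrected(sample_readings, blank_readings):
    """Subtract the mean blank fluorescence from every sample reading."""
    mean_blank = sum(blank_readings) / len(blank_readings)
    return [reading - mean_blank for reading in sample_readings]


# Hypothetical fluorometer readings in arbitrary units (not thesis data).
blanks = [10.0, 12.0, 14.0]   # larvae fed food without fluorescein
samples = [112.0, 62.0]       # larvae fed fluorescein-labelled food
print(blank_corrected(samples, blanks))
```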

Path length

Foraging path length was measured using black rectangular Plexiglas plates (37 cm width, 60 cm length, 0.5 cm height) with 10 wells (0.5 mm depth, 9.5 cm diameter) arranged in a 2-by-5 well fashion (Sokolowski et al., 1997). Mid-third instar larvae (72 ± 2 hrs. post hatch) were randomly selected from the food plates and washed in water. A homogenous yeast suspension (2:1 w/w) was spread across the wells creating a thin even layer in each well. Individual larvae were placed in the centre of each well and covered with the lid of a 10 cm Petri dish. Larvae were allowed to move for 5 min., and then the path length was traced on the Petri-dish lid. Path lengths were digitized using Fiji (Schindelin et al., 2012).

Statistical analysis

Statistical analysis was performed in R (R Core Team, 2013). The effects of genotype as well as blocking factors such as the date of the experiment were modeled with general linear models using the lm function, and the car package (Fox and Weisberg, 2011). ANOVA and Tukey HSD post-hoc tests were run to compute statistical significances. 95 % confidence intervals were calculated using the effects package (Fox, 2003). Effect sizes were calculated using the compute.es package (Re, 2013). Power was calculated using the pwr package (Champley, 2012). The packages lattice and gplots were used for plotting (Sarkar, 2008; Warnes et al., 2013 respectively).
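For readers unfamiliar with the test, the F statistic behind a one-way ANOVA on genotype can be sketched as follows. This is a simplified illustration in Python, not the R code used here: it omits the blocking factors and the Tukey HSD post-hoc step, and the function name one_way_anova and the path length values are invented.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their group mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical path lengths in cm for three genotypes (not thesis data).
path_lengths = {
    "null": [1.0, 2.0, 3.0],
    "sitter": [2.0, 3.0, 4.0],
    "rover": [6.0, 7.0, 8.0],
}
print(one_way_anova(path_lengths))
```

A large F indicates that differences between genotype means are large relative to the spread within genotypes. Modelling the experiment date as an additional term, as done in this thesis, removes day-to-day variation from the within-group error.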

Results and Discussion:

The Structure of the foraging Locus and its Products

Sequencing the foraging locus

In order to connect sequence variation to phenotype, I required a full sequence of the foraging locus for the rover and sitter strains. I began by isogenizing our rover and sitter wild-type strains. Balancer-mediated isogenization yielded 4 isogenic rover and 4 isogenic sitter lines, and these lines were sequenced separately. DNA was extracted from these lines, and 15 overlapping regions were amplified with PCR for each of the 8 lines. These amplicons were used as the sequencing templates to derive the sequence for the locus. The resulting reads for each rover line and each sitter line were pooled during alignment. Rovers yielded 649 reads, which were aligned to the approximately 40 kb locus with an average depth of coverage of 11.6 and a standard deviation of 3.8 reads. Sitters yielded 645 reads with an average depth of coverage of 12.2 and a standard deviation of 3.9 (fig. 1). I then built a sitter consensus sequence and compared it to the y1;cn1,bw1,sp1 reference genome on FlyBase (Celniker et al., 2002). When compared to the reference genome, our wild-type sitter consensus sequence had many insertions and deletions (fig. 2A), some larger than 40 bp. Many SNPs also differed between the sitter strain and the reference genome. Divergence was approximately 2.1% across the locus, with 1266 non-identical sites (fig. 2B). It is not surprising that the vast majority of the divergence occurred in the introns of the gene. Many sites in introns will not have any functional consequences to the locus, but those that do may only affect a subset of the gene’s function. When considering only the exons, the divergence dropped to 0.9% with only 59 non-identical bases over the 6,925 bp. When comparing the consensus sequences of the rover and sitter lines, I saw many differences as well, but fewer than in the comparison with the y1;cn1,bw1,sp1 genome strain. I identified 425 non-identical sites between our rover and sitter lines at the foraging locus (fig. 2B). 
There were only 10 non-identical sites within protein coding sequences and only 10 more non-identical sites within the 5'UTR. There were no differences seen within the 3'UTR. All the remaining 405 non-identical sites were within introns, and upstream and downstream of the locus. Of the non-identical sites within coding sequences, there were only 2 non-synonymous mutations at the N-terminus of the P1 isoform, and none in the cGMP or kinase domains of the protein sequences. The sequence differences between these rover and sitter strains are candidates for determining the causal variants involved.
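The divergence figures reported above are counts of non-identical sites divided by the number of aligned positions compared. The following is a minimal Python sketch of that calculation for two pre-aligned sequences. The function name, the `count_gaps` option, and the toy sequences are illustrative assumptions; they do not describe the software or settings actually used for the alignments in this chapter.

```python
from typing import Tuple


def percent_divergence(seq_a: str, seq_b: str, count_gaps: bool = True) -> Tuple[int, int, float]:
    """Return (non-identical sites, sites compared, percent divergence).

    Both inputs must already be aligned (equal length, '-' for gaps).
    Whether indel columns count as non-identical sites is an assumption
    controlled by count_gaps; the thesis does not state its convention.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences is uninformative
        if (a == "-" or b == "-") and not count_gaps:
            continue  # optionally ignore indel columns entirely
        compared += 1
        if a != b:
            mismatches += 1
    pct = 100.0 * mismatches / compared if compared else 0.0
    return mismatches, compared, pct


# Toy alignment (hypothetical sequences, not data from this study)
sitter_toy = "ATGCC-GTTA"
reference_toy = "ATGACGGTTA"
print(percent_divergence(sitter_toy, reference_toy))
print(percent_divergence(sitter_toy, reference_toy, count_gaps=False))
```

As a check of the arithmetic, the exonic figure in the text (59 non-identical bases over 6,925 bp) corresponds to 100 × 59 / 6925 ≈ 0.85%, which agrees with the reported 0.9% after rounding.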

Identifying alternative promoters

I identified the transcription start sites (TSSs) and transcription termination sites (polyAs) of the foraging gene using pooled mid third instar larval and adult RNA samples from our rover and sitter strains. I used both homopolymeric as well as RNA ligase mediated rapid amplification of cDNA ends (RACE, Michelson and Orkin, 1982) to identify TSSs, and oligo dT RT-PCR to identify transcription termination sites (polyA). I sequenced 144 clones of our RACE products and the chromatograms were mapped back onto the locus. I identified four separate TSSs. These transcription start sites fall into two categories, peaked and broad. TSS 1, 2 and 4 are peaked (fig. 3A), whereas TSS 3 is broad (fig. 3B). There are some interesting implications of these different structures of TSS. Peaked TSSs are typically found in regulated promoters, whereas broad TSSs are found in constitutive promoters (Hoskins et al., 2011). All 4 TSSs are near strong matches to Initiator (Inr) and Downstream Promoter Element (DPE) sequences. Our 3’-RACE experiments identified only one polyA which had strong matches to consensus polyadenylation signals. This model of multiple TSSs with one shared polyA is consistent with what is seen in foraging’s orthologues in other organisms (Ørstavik et al., 1997; Stansberry et al., 2001). These RACE results support a gene model of four independent minimal promoters (for- pr1-4.7-4, fig. 4) whose RNA products all splice into a common 3’-exon. It should be noted that the coding sequences for the cGMP binding domains as well as the kinase domain of the protein products are all located in this 3’-end. After completion of these experiments, the modENCODE project set out to conduct genome wide RACE and CAGE experiments (Hoskins et al., 2011). Their results are consistent with our results, with the exception that they failed to identify the promoter 3 TSS. 
Promoter 3 was, however, the only TSS that was successfully identified before these experiments (Kalderon and Rubin, 1989). A more in-depth dissection of the promoters of foraging is undertaken in Chapters 3 and 4.
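
The distinction between peaked and broad TSSs can be illustrated with a simple heuristic applied to the mapped 5' ends of RACE clones. The sketch below, in Python, calls a TSS "peaked" when most clone ends fall within a few base pairs of the most common start position. The window of 2 bp and the cutoff of 0.5 are hypothetical values chosen for illustration only; they are not the criteria of Hoskins et al. (2011), nor those used to classify the foraging TSSs here.

```python
from collections import Counter
from typing import Sequence, Tuple


def classify_tss(positions: Sequence[int], window: int = 2,
                 peaked_fraction: float = 0.5) -> Tuple[str, int, float]:
    """Classify a cluster of mapped 5' ends as 'peaked' or 'broad'.

    positions: genomic coordinates of clone 5' ends for one promoter.
    window, peaked_fraction: illustrative thresholds (assumptions).
    Returns (shape, modal position, fraction of ends near the mode).
    """
    if not positions:
        raise ValueError("need at least one mapped 5' end")
    counts = Counter(positions)
    mode_pos, _ = counts.most_common(1)[0]
    # Count clone ends lying within +/- window bp of the modal start site
    near_mode = sum(n for pos, n in counts.items() if abs(pos - mode_pos) <= window)
    fraction = near_mode / len(positions)
    shape = "peaked" if fraction >= peaked_fraction else "broad"
    return shape, mode_pos, fraction


# Hypothetical coordinates, not data from this study
tight_cluster = [100] * 8 + [101, 130]
spread_cluster = list(range(200, 260, 3)) + [200]
print(classify_tss(tight_cluster))
print(classify_tss(spread_cluster))
```

With real data, the thresholds would need to be calibrated against the number of clones per promoter, since a small sample can make a broad promoter appear peaked.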

Identifying alternatively spliced variants

Once the starts and stops of the transcripts were identified, I characterized alternatively spliced variants of the gene. I isolated and sequenced 240 full length cDNA clones and aligned them to the locus (fig. 3B). Our experiments identified many novel transcripts compared to previous reports (Kalderon and Rubin, 1989; Stapleton et al., 2002). I verified 21 distinct transcripts which represent 9 distinct open reading frames (fig. 4). Due to the shared 3’-end of the transcripts, these variants differ in their 5’-UTR and the corresponding N-terminal coding sequence of their open reading frames. Although these results suggest a much more complicated gene structure compared to previous reports, all the new variants identified here still use the same 12 exons originally described (Kalderon and Rubin, 1989). I conducted RT-PCR experiments of the splice site junctions between our rover and sitter strains given this new gene model. I did not find any differences between the strains with respect to their alternative splicing (data not shown). The relative lack of coding sequence variation (i.e. only 2 non-synonymous replacements in the P1 isoform) and the lack of splicing differences suggest that the rover-sitter differences are likely in the expression level of a common set of transcripts.

Generating deletion mutants.

Deletion mutants can serve as an extremely valuable tool in genetic research. A genetic deletion of a gene provides definitive proof that a particular trait or behaviour can be ascribed, at least in part, to a particular gene. Deletion mutants also serve as a valuable negative control in many molecular biology experiments, especially for antibody verification. Now that we had a precise and in-depth characterization of the gene and its products, I generated more precise mutations of the gene, including multiple partial and full deletions of the foraging gene.

Generating partial deletions

Deletions spanning the 5’-end of the locus were generated using the Flp/FRT system and include Df(2L)fore0f0, Df(2L)forf0e0, and Df(2L)ford0f0. These deletions removed varying extents of the 5’ TSS of the locus: TSS 1-4, TSS 2-3, and TSS 1-3, respectively. None of them deleted the kinase domain of the protein. All three removed the start codons for the P1 and P3 protein isoforms. These deletions were verified using PCR, RT-PCR, and sequencing (data not shown). However, RNA from the common coding exons was still detected in these homozygous deletions. As such, it was possible that these mutants still produced at least some functional FOR-PKG protein. These mutants proved useful in determining isoform specific function (Chapter 3), but were not sufficient as a negative control for our antibodies. Partial deletions of the 3’-end of the locus were generated with the hobo transposase deletion generator system (Huet et al., 2002). We used P{wHy}forDG2311 found within the foraging locus, 40 bp 5' to promoter 4. The hobo transposable element was mobilized to induce local replicative hops. Progeny were then screened for recombination events that deleted the mini-white transgene between the neighboring hobo transposable elements. These deletions, which included Df(2L)for08112-3 and Df(2L)for31122-1, removed most of the two cGMP binding domains and all of the kinase domain (fig. 8). Deletions were verified using PCR, RT-PCR, sequencing, and Southern blotting (S.fig. 1). These deletions did remove the coding sequence of the target for our antibody and were therefore valuable negative controls. Subsequent Southern blot analysis showed that our rover and sitter backgrounds, as well as many common UAS/Gal4 and balancer strains, contained active hobo transposable elements (data not shown). Thus, putting these hobo-generated deletions into these backgrounds would likely have resulted in an ever changing mutant, rendering any phenotypic analysis uninterpretable.

Generating a null with HR

To generate a null mutation of the foraging gene I implemented the ends-out gene targeting system (HR, Gong and Golic, 2004). I re-engineered the pW25 vector (Gong and Golic, 2004) to contain foraging-specific homology arms, as well as an attP (Groth et al., 2004) site-specific recognition sequence (fig. 5A). Once this vector was integrated and recombined at the locus, the gene would be deleted, yielding no foraging sequence, and replaced with an attP sequence. This site-specific recognition sequence could then be used for integration of future transgenes. Sequencing and restriction enzyme digestion (fig. 5B and 5C) confirmed the construction of the foraging ends-out HR construct. Crosses were then carried out to mobilize this targeting construct in order to allow homologous recombination at the foraging locus. Although these crosses yielded 8 second chromosome integrations, none were a deletion of the locus. Upon subsequent analysis, I determined that these integration events indeed recombined at one of the homology arms but not both (data not shown). I found separate integration events at both the 5’- and 3’-end of the locus, but no deletion. Importantly, I was able to use the loxP site-specific recombination sequences in the pW25 vector, which were intended for removal of the mini-white gene, to generate a null mutation while removing mini-white (fig. 6A). By crossing the 5’-loxP element to the 3’-loxP element, generating a trans-heterozygous mutant, in the presence of a Cre recombinase, I induced a recombination event which resulted in the deletion of the entire locus. It may also be of interest that for every recombination event that yielded a deletion of the foraging gene, there was a paired recombinant that generated a duplication of the foraging gene. The approach used to generate foraging deletions was similar in concept to the well-established Flp/FRT deletions (Thibault et al., 2004; Parks et al., 2004). Our foraging null mutants were validated using PCR, sequencing, and Southern blot (data not shown). I PCR amplified an 11kb region spanning the locus in the null mutants. The corresponding region in wild type animals was 45kb. Digestion of this 11kb fragment yielded all the sizes predicted for an appropriately deleted locus (data not shown). Sequencing of the regions flanking both homology arms was consistent with a recombination event. Southern blotting failed to detect DNA homologous to the foraging gene in the null mutants (data not shown). These data conclusively showed that the foraging DNA sequence had been deleted. I further verified the lack of foraging gene products produced from these mutants. Using RT-PCR, I determined that the homozygous foraging null mutant larvae expressed control genes, but no detectable foraging RNA (fig. 6B). I amplified four different control genes (actin5c, αtubulin84B, and 1433ε) but failed to amplify foraging with seven separate primer pairs. 
All primer pairs amplified with cDNA from wild-type larvae. Finally, in our western blot analysis, I saw no FOR immunoreactivity in the mutants, despite there being an abundance of protein loaded on the gel, as seen via Ponceau S staining (fig. 6C). These data showed that our previously generated antibody (Belay et al., 2007) was specific in a western blot analysis. More importantly, these DNA, RNA, and protein data strongly suggest that the loxP recombination mutants were bona fide genetic null mutations of the foraging gene.

Generating a genomic rescue.

In order to rescue any mutant phenotypes seen in the foraging null, I constructed a full genomic rescue construct. I employed the technique known as recombination mediated genetic engineering, or recombineering. I started by reengineering the P[acman] recombineering vector (Venken et al., 2006). I rearranged the relative orientation of the attB and mini-white sequences in the BAC. This would allow it to be paired with the loxP-attP alleles generated using the pW25-attP. After φC31 integration, loxP recombination would remove the mini-white and leave a minimal footprint. Once attB and mini-white sequences were rearranged, I performed gap repair experiments isolating the foraging gene sequence from a larger BAC, CHORI-64J02. The BACs available from the P[acman] resource website used DNA obtained from the y1;cn1,bw1,sp1 strain. The unfortunate consequence was that this strain contained a large copia transposable element within the foraging locus. In order to have a rescue construct similar to our wild-type strain, I removed this transposable element using the galK selection method (Warming et al., 2005). Once the copia element was removed I had a wild-type BAC ready to inject into the fly as a rescue construct (fig. 8A). PCR, sequencing, and digestion confirmed that the BAC was constructed appropriately. The in-silico digestion (fig. 8B) was consistent with the in-vitro digestion of the BAC (fig. 8C). Once integrated into the fly's genome, into the attP2 landing site on the third chromosome (Groth et al., 2004), the BAC was crossed into the null mutant background. A western blot analysis confirmed that the BAC for the most part recapitulated the foraging expression pattern (fig. 9A). Furthermore, the BAC in a wild-type second chromosome background showed over expression on a western blot (fig. 9A, 9B). It should be noted that there were some relative expression differences between the wild-type and the BAC rescue strains. 
Even though some of the immunoreactive bands were at relatively different levels between these two strains, both strains expressed all of the foraging isoforms.

The foraging null mutant is pupal lethal.

Determining the stage of lethality

The first noteworthy phenotype exhibited by the foraging null mutant was its recessive lethality. I identified the stage of lethality as late pupal (the pharate adult stage) by using a GFP expressing balancer chromosome to identify individuals that were homozygous null. This lethality was not due solely to a failure to eclose because when I dissected away the pupal case, the pharate adults still did not survive (fig. 10A). In addition, this late pupal lethality phenotype was recapitulated in our lethal complementation analysis where I crossed the foraging null mutant to additional deletions that deleted at least a part of the foraging gene sequence (table 1). The BAC transgenic insert of the foraging gene was sufficient to rescue the lethality of the null mutant. This demonstrated that the 35 kb genomic sequence of foraging was sufficient to rescue the lethality and showed that distal regulatory elements outside the locus were not needed. I found no significant lethality prior to the late pupal stage in foraging null mutants. Homozygous null mutants (fig. 10B) as well as null mutants crossed to the other deficiencies of foraging (fig. 10C) yielded the same survivorship from first instar to late pupal as control animals. Partial deletions of the 3’-half of the locus, Df(2L)for08112-3 and Df(2L)for32122-1, showed similar survivorship from first instar to pupa as the control strain (fig. 10D). Therefore, I concluded that the foraging null mutant was pharate adult/late pupal lethal and that the mutation has no lethal effects during larval development. The only deletion that did complement mutants within the foraging gene was Df(2L)ed1. The locus was originally mapped using, in part, this deficiency (de Belle et al., 1989). But with these new data, it seems likely that the lethality of the originally mapped mutation is an effect on a neighbouring gene. 
It was also reported that Df(2L)ed1 and Df(2L)drmP2 overlap (Green et al., 2002), but this is inconsistent with our complementation analysis. It should also be noted that the breakpoints of the Df(2L)ed1 and Df(2L)drmP2 have not been mapped with accuracy, via sequencing, and were defined via polytene chromosomal analysis and restriction fragment mapping, respectively.

Inducing lethality with RNAi

The RNAi lines, v38319 and v38320, available for the foraging gene from the Vienna Drosophila RNAi Center (VDRC) had significant off target effects. Crossing these RNAi lines to da-Gal4 and tub-Gal4 caused significant lethality during the first instar larval stage. This stage of lethality is inconsistent with the lethal phase of every other mutant of the foraging gene. Off target effects may be expected given that the RNAi construct was designed to target the coding sequence for the kinase domain of the protein. This region has a high similarity to other kinase genes. We then constructed our own RNAi library targeting multiple exons along the foraging gene (fig. 7). I was able to induce a late pupal pharate adult lethality when driving our newly generated common coding RNAi, UAS-Dcr2 ; UAS-forRNAi-exon7:8, with ubiquitous drivers, da-Gal4 and tub-Gal4. Multiple other reports suggested that the foraging mRNA was highly maternally loaded in the developing embryo (Lécuyer et al., 2007; Graveley et al., 2011; Tomancak et al., 2002). Since the null is lethal, I maintained stocks over a balancer chromosome. As a result, the foraging allele on the balancer chromosome was able to maternally deposit foraging mRNA. Therefore, I could not determine from the null alone whether foraging is essential in embryological development. In order to test if foraging was required for embryo development, I drove foraging RNAi with a maternal Gal4, nos-Gal4 (results not shown). These individuals survived to adulthood. This suggested that foraging was not required for embryogenesis but was required for pupal development. We do not know the extent to which foraging was knocked down in this experiment. We further mapped the stage at which foraging was required for survival by the use of tub-gal80;tub-Gal4 driving our UAS-Dcr;UAS-forRNAi7:8. We were able to restrict foraging’s expression by moving the developing animal from the restrictive (18ºC) to permissive (30ºC) temperatures, and vice versa. We concluded that the foraging RNAi needed to be active from late wandering until late pupa, approximately P8-P13, in order to induce lethality (Ina Anreiter, unpublished data). 
Both the on-early/off-late and the off-early/on-late experiments supported this window in which the RNAi was required to be active to get high levels of pharate adult lethality. It has been previously stated that pleiotropic genes affecting metabolism and behaviour are likely to be vital (Hall, 1994). I have shown that this is true for foraging. Nothing is known about foraging’s role in pupal development, but these data indicate that foraging is necessary for it.

Increased foraging expression increases path length behaviour.

Since the foraging null mutation did not produce any lethality during larval development, I was able to study its effects on larval phenotypes. I used the null allele to investigate foraging behaviour, food intake behaviour and lipid metabolism of larvae. The natural alleles showed many differences in metabolic associated phenotypes and behaviours. The foraging null allele that I generated allowed me to determine if they all map to within the foraging gene. There are extensive reports showing the natural foraging alleles differ in foraging path length behaviour (Sokolowski, 1980; de Belle et al., 1989; Osborne et al., 1997; Kaun et al., 2007; fig. 11A). Notably, our newly generated foraging null mutant has reduced foraging path length behaviour. When heterozygous over another deficiency, Df(2L)Exel7018, the foraging null mutants exhibited shorter path length behaviour compared to wild-type over the same deficiency (fig. 11B). This effect was further supported by the Df(2L)for08112-3 and Df(2L)for32122-1 partial deletions of the locus which also show reduced path length behaviour, relative to controls (fig. 11C). I rescued this mutant phenotype with our genomic BAC copy of foraging (fig. 11D). This suggests that the regulatory elements required to modulate this phenotype are contained within this 35kb locus. In addition, ubiquitous expression of our foraging RNAi construct, UAS-Dcr2; UAS-forRNAi-exon7:8, that targets all known transcripts (fig. 8), reduced path length behaviour (fig. 12A). Therefore, my results indicate that the foraging gene is necessary for wild-type levels of foraging behaviour. Furthermore, increased foraging gene copy number resulted in increased whole larva protein expression (fig. 9) and increased foraging behaviour (fig. 11C). Since rovers have longer foraging path length behaviour, these data are consistent with a model that rovers have higher expression of the foraging gene in the relevant cells to elicit this phenotypic effect. 
Increased gene copy number with the BAC transgene increased path length behaviour in a dose dependent manner (fig. 11D). This dose response is an interesting effect. Previous reports suggested that the foraging path lengths of rover/sitter heterozygotes were indistinguishable from homozygous rover larvae (de Belle et al., 1989). This suggests the rover strain did not respond to gene dosage for path length behaviour as did the sitter and BAC alleles in the current study. However, our BAC over expresser did not reach the phenotypic levels of our rover strain in the path length assay. This suggested that rovers may have a much higher expression level than our BAC over expresser in cells relevant for foraging path length.

By conducting a tandem series of 9-generation backcrosses, I used the null mutation to introgress the rover foraging locus into a sitter background (R1), and vice versa (R4). Null mutants were backcrossed to both the rover and sitter strains for 9 generations (fig. 13A). This yielded null mutations in both backgrounds. I used these lines as background donors for an allele swap. Since these background donors had no foraging sequence, they could not recombine at the foraging locus. I also conducted a series of control crosses in which I backcrossed the rover allele into the rover background (R2) and the sitter allele into the sitter background (R3). When conducting a foraging path length experiment, I saw that the path lengths of both rover in a sitter background (R1) and rover in a rover background (R2) were significantly longer than those of sitter in a sitter background (R3) and sitter in a rover background (R4) (fig. 13B). This suggested that the allelic difference between these lines did reside within the foraging locus. Rovers and sitters showed different relative expression of FOR immunoreactive bands in a whole larva western blot analysis (fig. 13C). The allele swap lines mapped the difference in expression between rovers and sitters to the foraging locus.

Increased foraging expression increases food intake behaviour.

Schoofs et al. (2014) showed that neuromuscular control of feeding behaviour is suppressed during locomotion. Other studies have shown that when selecting for increased feeding rate, increased locomotory behaviour was co-selected (Sewell et al., 1974). I showed that the foraging null mutation affected foraging locomotion behaviour. I wondered if the foraging behaviour was related to food intake behaviour. Previous studies found a difference in food intake by the rover and sitter lines (Kaun et al., 2007; Kaun et al., 2008). These studies suggested that individuals with increased locomotory behaviour will have decreased ingestion behaviour. This was confirmed here (fig. 14A). This, however, was in contrast to what I found with the foraging null mutant, which showed reduced food intake; the genomic BAC rescued this phenotype (fig. 14A). Again, I saw a parallel dose-response effect of the foraging null, BAC rescue, and BAC over expresser on food intake behaviour. Increasing foraging gene copy number increased food intake behaviour (fig. 14A). These experiments were consistent with a partial foraging deletion, Df(2L)forED243 (fig. 8), that also showed a reduction in food intake behaviour relative to its control (fig. 14B). Rovers had higher path length and lower food intake than sitters, and increasing gene copy number increased both path length and food intake behaviour. This suggested that rovers had lower expression in the relevant cells for this food intake behaviour. This showed that the rover phenotype is differentially regulated with respect to the path length and the food intake behaviours.

Increased foraging expression decreases DAG/TAG levels.

I wondered if foraging's roles in movement on food and ingestion affected nutrient storage. Selecting for lower feeding rates has been shown to reduce body weight in larvae (Burnet et al., 1977). Lipid metabolism is important for energy homeostasis of an organism. Previous work showed that rovers have less fat than sitters (Belay, 2010) and that they store less of their ingested nutrients as fat (Kaun et al., 2008). foraging’s mammalian orthologue, cGKI, functions in adipose tissue and affects lipid metabolism. The foraging null mutants had significantly more lipids than the BAC rescue and the BAC rescue had significantly more DAG/TAG than the BAC over expresser (fig. 15A). This was further supported by partial deletions of the foraging gene at both the 5’- and 3’-end of the locus (fig. 15B, 15C, respectively). Thus, increased foraging expression decreased lipid levels. This supports foraging’s involvement in lipid metabolism in the fly. It is not known whether this lipid phenotype is a cause of, a consequence of, or unrelated to larval food intake and foraging behaviours. The foraging null mutant forages less, likely expending less energy which could result in more stored lipids. However, these mutants also have lower food intake, which could lead to lower lipid stores. Further research will determine if this lipid phenotype is differentially regulated from both foraging and food intake behaviours.

Conclusion:

In this study, I described and characterized the structure of the foraging gene. I identified 4 separate promoters that produce 21 transcripts, many of which are new variants. I also verified multiple previously annotated variants. I generated sequences for the rover and sitter alleles for the entire 40,000 bp foraging locus. I identified a significant level of variation between these strains, and even more when compared to the reference genome. These variations in the DNA sequence had little effect on the amino acid sequence, as there were only two non-synonymous mutations. I successfully generated a foraging null mutant allele, as well as multiple partial deletions of the locus. Null mutants are pupal lethal, demonstrating that foraging gene products were not required for viability during larval development. This allowed us to evaluate the feeding related behaviour and metabolic state of the null larvae. I showed that foraging influences how much larvae move in their food environment, how much food they ingest, and how much DAG/TAG is stored. Ours is the first behavioural characterization of any genetic deletion of the foraging locus. Our deletion study unequivocally proved a causal relationship between the foraging gene and its pleiotropic influence on these feeding-related traits. By comparing the effects of the null mutant with those of the natural variants, I showed that several of foraging’s associated phenotypes were differentially regulated. These studies will shed further light onto foraging’s regulation and the nature of the rover and sitter allelic variants. I identified many sequence differences between the rover and sitter alleles. It seems likely that this pleiotropy was achieved through differential transcriptional regulation of a common set of transcripts given the lack of variation in splicing patterns and amino acid sequence. 
This work conclusively showed foraging’s involvement in these phenotypes and implicated its differential regulation in these phenotypes. A more in-depth genetic dissection of the transcriptional regulation of the foraging gene is necessary to identify the relevant isoforms and tissues for these feeding related behaviours.

References:

Belay, A.T. (2010). Cellular components of naturally varying behaviours in the fruit fly, Drosophila melanogaster. Thesis. University of Toronto (http://hdl.handle.net/1807/19022).

Belay, A.T., Scheiner, R., So, A.K.-C., Douglas, S.J., Chakaborty-Chatterjee, M., Levine, J.D., and Sokolowski, M.B. (2007). The foraging gene of Drosophila melanogaster: Spatial-expression analysis and sucrose responsiveness. J. Comp. Neurol. 504, 570–582.

de Belle, J.S., Hilliker, A.J., and Sokolowski, M.B. (1989). Genetic localization of foraging (for): a major gene for larval behavior in Drosophila melanogaster. Genetics 123, 157–163.

Benzer, S. (1967). Behavioral mutants of Drosophila isolated by countercurrent disruption. PNAS 58, 1112–1119.

Brand, A.H., and Perrimon, N. (1993). Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development 118, 401–415.

Burnet, B., Sewell, D., and Bos, M. (1977). Genetic analysis of larval feeding behaviour in Drosophila melanogaster: II. Growth relations and competition between selected lines. Genetics Research 30, 149–161.

Celniker, S.E., Wheeler, D.A., Kronmiller, B., Carlson, J.W., Halpern, A., Patel, S., Adams, M., Champe, M., Dugan, S.P., Frise, E. et al. (2002). Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence. Genome Biol. 3, RESEARCH0079.

Champely, S. (2012). pwr: Basic functions for power analysis.

Chintapalli, V.R., Wang, J., and Dow, J.A.T. (2007). Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat Genet 39, 715–720.

Dietzl, G., Chen, D., Schnorrer, F., Su, K.-C., Barinova, Y., Fellner, M., Gasser, B., Kinsey, K., Oppel, S., Scheiblauer, S. et al. (2007). A genome-wide transgenic RNAi library for conditional gene inactivation in Drosophila. Nature 448, 151–156.

L’Etoile, N.D., Coburn, C.M., Eastham, J., Kistler, A., Gallegos, G., and Bargmann, C.I. (2002). The cyclic GMP-dependent protein kinase EGL-4 regulates olfactory adaptation in C. elegans. Neuron 36, 1079–1089.

Fitzpatrick, M.J., and Sokolowski, M.B. (2004). In search of food: Exploring the evolutionary link between cGMP-dependent protein kinase (PKG) and behaviour. Integr. Comp. Biol. 44, 28–36.

Fox, J. (2003). Effect displays in R for generalised linear models. Journal of Statistical Software 8, 1–27.

Fox, J., and Weisberg, S. (2011). An R companion to applied regression (Thousand Oaks CA: Sage).

Fujiwara, M., Sengupta, P., and McIntire, S.L. (2002). Regulation of body size and behavioral state of C. elegans by sensory perception and the EGL-4 cGMP-dependent protein kinase. Neuron 36, 1091–1102.

Gong, W.J., and Golic, K.G. (2003). Ends-out, or replacement, gene targeting in Drosophila. PNAS 100, 2556–2561.

Gong, W.J., and Golic, K.G. (2004). Genomic deletions of the Drosophila melanogaster hsp70 genes. Genetics 168, 1467–1476.

Graveley, B.R., Brooks, A.N., Carlson, J.W., Duff, M.O., Landolin, J.M., Yang, L., Artieri, C.G., van Baren, M.J., Boley, N., Booth, B.W. et al. (2011). The developmental transcriptome of Drosophila melanogaster. Nature 471, 473–479.

Green, R.B., Hatini, V., Johansen, K.A., Liu, X.-J., and Lengyel, J.A. (2002). Drumstick is a zinc finger protein that antagonizes Lines to control patterning and morphogenesis of the Drosophila hindgut. Development 129, 3645–3656.

Greenspan, R.J. (1997). A kinder, gentler genetic analysis of behavior: dissection gives way to modulation. Current Opinion in Neurobiology 7, 805–811.

Groth, A.C., Fish, M., Nusse, R., and Calos, M.P. (2004). Construction of transgenic Drosophila by using the site-specific integrase from phage phiC31. Genetics 166, 1775–1782.

Hall, J.C. (1994). Pleiotropy of behavioral genes. In Flexibility and constraints in behavioral systems, C.P. Kyriacou and R.J. Greenspan, eds. (John Wiley and Sons Ltd), pp. 15–27.

Hirose, T., Nakano, Y., Nagamatsu, Y., Misumi, T., Ohta, H., and Ohshima, Y. (2003). Cyclic GMP-dependent protein kinase EGL-4 controls body size and lifespan in C. elegans. Development 130, 1089–1099.

Hirsch, J., and Erlenmeyer-Kimling, L. (1961). Sign of taxis as a property of the genotype. Science 134, 835–836.

Hofmann, F., Feil, R., Kleppisch, T., and Schlossmann, J. (2006). Function of cGMP-dependent protein kinases as revealed by gene deletion. Physiological Reviews 86, 1–23.

Hoskins, R.A., Landolin, J.M., Brown, J.B., Sandler, J.E., Takahashi, H., Lassmann, T., Yu, C., Booth, B.W., Zhang, D., Wan, K.H. et al. (2011). Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 21, 182–192.

Huet, F., Lu, J.T., Myrick, K.V., Baugh, L.R., Crosby, M.A., and Gelbart, W.M. (2002). A deletion-generator compound element allows deletion saturation analysis for genome wide phenotypic annotation. Proc. Natl. Acad. Sci. U.S.A. 99, 9948–9953.

Ingram, K.K., Oefner, P., and Gordon, D.M. (2005). Task-specific expression of the foraging gene in harvester ants. Molecular Ecology 14, 813–818.

Kalderon, D., and Rubin, G.M. (1989). cGMP-dependent protein kinase genes in Drosophila. J. Biol. Chem. 264, 10738–10748.

Kaun, K.R., Riedl, C.A.L., Chakaborty-Chatterjee, M., Belay, A.T., Douglas, S.J., Gibbs, A.G., and Sokolowski, M.B. (2007). Natural variation in food acquisition mediated via a Drosophila cGMP-dependent protein kinase. J. Exp. Biol. 210, 3547–3558.

Kaun, K.R., Chakaborty-Chatterjee, M., and Sokolowski, M.B. (2008). Natural variation in plasticity of glucose homeostasis and food intake. J. Exp. Biol. 211, 3160–3166.

Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C. et al. (2012). Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649.

Kent, C.F., Daskalchuk, T., Cook, L., Sokolowski, M.B., and Greenspan, R.J. (2009). The Drosophila foraging gene mediates adult plasticity and gene–environment interactions in behaviour, metabolites, and gene expression in response to food deprivation. PLoS Genet. 5, e1000609.

Lécuyer, E., Yoshida, H., Parthasarathy, N., Alm, C., Babak, T., Cerovina, T., Hughes, T.R., Tomancak, P., and Krause, H.M. (2007). Global analysis of mRNA localization reveals a prominent role in organizing cellular architecture and function. Cell 131, 174–187.

Lee, Y.S., and Carthew, R.W. (2003). Making a better RNAi vector for Drosophila: use of intron spacers. Methods 30, 322–329.

Lucas, C., and Sokolowski, M.B. (2009). Molecular basis for changes in behavioral state in ant social behaviors. PNAS 106, 6351–6356.

Manning, G., Plowman, G.D., Hunter, T., and Sudarsanam, S. (2002). Evolution of protein kinase signaling from yeast to man. Trends in Biochemical Sciences 27, 514–520.

McGuire, S.E., Le, P.T., Osborn, A.J., Matsumoto, K., and Davis, R.L. (2003). Spatiotemporal rescue of memory dysfunction in Drosophila. Science 302, 1765–1768.

Miyashita, K., Itoh, H., Tsujimoto, H., Tamura, N., Fukunaga, Y., Sone, M., Yamahara, K., Taura, D., Inuzuka, M., Sonoyama, T. et al. (2009). Natriuretic peptides/cGMP/cGMP-dependent protein kinase cascades promote muscle mitochondrial biogenesis and prevent obesity. Diabetes 58, 2880–2892.

Ni, J.-Q., Liu, L.-P., Binari, R., Hardy, R., Shim, H.-S., Cavallaro, A., Booker, M., Pfeiffer, B.D., Markstein, M., Wang, H. et al. (2009). A Drosophila resource of transgenic RNAi lines for neurogenetics. Genetics 182, 1089–1100.

Orstavik, S., Natarajan, V., Taskén, K., Jahnsen, T., and Sandberg, M. (1997). Characterization of the human gene encoding the type I alpha and type I beta cGMP-dependent protein kinase (PRKG1). Genomics 42, 311–318.

Osborne, K.A., Robichon, A., Burgess, E., Butland, S., Shaw, R.A., Coulthard, A., Pereira, H.S., Greenspan, R.J., and Sokolowski, M.B. (1997). Natural behavior polymorphism due to a cGMP-dependent protein kinase of Drosophila. Science 277, 834–836.

Parks, A.L., Cook, K.R., Belvin, M., Dompe, N.A., Fawcett, R., Huppert, K., Tan, L.R., Winter, C.G., Bogart, K.P., Deal, J.E. et al. (2004). Systematic generation of high-resolution deletion coverage of the Drosophila melanogaster genome. Nat. Genet. 36, 288–292.

Pfeifer, A., Klatt, P., Massberg, S., Ny, L., Sausbier, M., Hirneiss, C., Wang, G.X., Korth, M., Aszódi, A., Andersson, K.E. et al. (1998). Defective smooth muscle regulation in cGMP kinase I-deficient mice. EMBO J 17, 3045–3051.

Raizen, D.M., Cullison, K.M., Pack, A.I., and Sundaram, M.V. (2006). A novel gain-of-function mutant of the cyclic GMP-dependent protein kinase egl-4 affects multiple physiological processes in Caenorhabditis elegans. Genetics 173, 177–187.

Raizen, D.M., Zimmerman, J.E., Maycock, M.H., Ta, U.D., You, Y., Sundaram, M.V., and Pack, A.I. (2008). Lethargus is a Caenorhabditis elegans sleep-like state. Nature 451, 569–572.

R Core Team (2013). R: A language and environment for statistical computing (Vienna, Austria: R foundation for statistical computing).

Re, A.C.D. (2013). compute.es: Compute Effect Sizes.

Rideout, E.J., Billeter, J.-C., and Goodwin, S.F. (2007). The sex-determination genes fruitless and doublesex specify a neural substrate required for courtship song. Current Biology 17, 1473–1478.

Sambrook, J., and Russell, D.W. (2001). Molecular cloning: A laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.

Sarkar, D. (2008). Lattice: Multivariate data visualization with R (New York: Springer).

Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., Preibisch, S., Rueden, C., Saalfeld, S., Schmid, B. et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat. Meth. 9, 676–682.

Schoofs, A., Hückesfeld, S., Schlegel, P., Miroschnikow, A., Peters, M., Zeymer, M., Spieß, R., Chiang, A.-S., and Pankratz, M.J. (2014). Selection of motor programs for suppressing food intake and inducing locomotion in the Drosophila brain. PLoS Biol 12, e1001893.

Sewell, D., Burnet, B., and Connolly, K. (1974). Genetic analysis of larval feeding behaviour in Drosophila melanogaster. Genetics Research 24, 163–173.

Sokolowski, M.B. (1980). Foraging strategies of Drosophila melanogaster: a chromosomal analysis. Behav. Genet. 10, 291–302.

Sokolowski, M.B. (2001). Drosophila: Genetics meets behaviour. Nat. Rev. Genet. 2, 879–890.

Sokolowski, M.B., Pereira, H.S., and Hughes, K. (1997). Evolution of foraging behavior in Drosophila by density-dependent selection. PNAS 94, 7373–7377.

Stansberry, J., Baude, E.J., Taylor, M.K., Chen, P.-J., Jin, S.-W., Ellis, R.E., and Uhler, M.D. (2001). A cGMP-dependent protein kinase is implicated in wild-type motility in C. elegans. Journal of Neurochemistry 76, 1177–1187.

Stapleton, M., Liao, G., Brokstein, P., Hong, L., Carninci, P., Shiraki, T., Hayashizaki, Y., Champe, M., Pacleb, J., Wan, K. et al. (2002). The Drosophila gene collection: Identification of putative full-length cDNAs for 70% of D. melanogaster genes. Genome Res. 12, 1294–1300.

Thibault, S.T., Singer, M.A., Miyazaki, W.Y., Milash, B., Dompe, N.A., Singh, C.M., Buchholz, R., Demsky, M., Fawcett, R., Francis-Lang, H.L. et al. (2004). A complementary transposon tool kit for Drosophila melanogaster using P and piggyBac. Nat. Genet. 36, 283–287.

Tomancak, P., Beaton, A., Weiszmann, R., Kwan, E., Shu, S., Lewis, S.E., Richards, S., Ashburner, M., Hartenstein, V., Celniker, S.E. et al. (2002). Systematic determination of patterns of gene expression during Drosophila embryogenesis. Genome Biol. 3, research0088.1–88.14.

Venken, K.J.T., He, Y., Hoskins, R.A., and Bellen, H.J. (2006). P[acman]: A BAC transgenic platform for targeted insertion of large DNA fragments in D. melanogaster. Science 314, 1747–1751.

Warming, S., Costantino, N., Court, D.L., Jenkins, N.A., and Copeland, N.G. (2005). Simple and highly efficient BAC recombineering using galK selection. Nucl. Acids Res. 33, e36–e36.

Warnes, G.R., Bolker, B., Bonebakker, L., Gentleman, R., Liaw, W.H.A., Lumley, T., Maechler, M., Magnusson, A., Moeller, S., Schwartz, M. et al. (2013). gplots: Various R programming tools for plotting data.

Chapter 2 Table

Table 1: Complementation analysis for lethality

Above the diagonal are the results of the lethal complementation analysis. Below the diagonal are the stages at which the lethality was observed. All genotypes were balanced over a CyO, act-GFP chromosome. Df/Df animals were identified by a lack of GFP expression. The nomenclature is as follows: “+” complementation (viable), “-” lack of complementation (lethal), “e” embryonic lethality, “PA” pharate adult lethality, “P” pupal lethality. The breakpoints of these deletions can be seen in figure 7.

Df(2L)Exel7018 e - - - - - -
Df(2L)ed1 P e + + + + +
Df(2L)drmP2 e e - - - - - -
Df(2L)forED243 PA - - - -
Df(2L)forf0e0 PA PA PA - - -
Df(2L)ford0f0 PA PA PA PA - -
Df(2L)fore0f0 PA PA PA PA PA -
Df(2L)for08112-3 PA -
Df(2L)for32122-1 PA PA
Df(2L)fornull PA PA PA PA PA +

Chapter 2 Figures

Figure 1: Contiguous sequence assembly of sitter line

Fifteen overlapping fragments spanning the gene were amplified using high-fidelity polymerase from multiple isolated alleles of the rover and sitter populations. These fragments served as the templates for 649 Sanger sequencing reactions for rovers and 645 Sanger sequencing reactions for sitters. Purified reactions were run on the ABI 3130xl capillary sequencer. Reads were assembled to the genome reference sequence using Geneious bioinformatics software. This yielded an average depth of coverage of 11.6 with a standard deviation of 3.8 for rovers and an average depth of coverage of 12.2 with a standard deviation of 3.9 for sitters across the 40 kb locus.
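The depth-of-coverage summary can be illustrated with a minimal sketch. It assumes that each assembled read is represented as a half-open (start, end) interval on the locus and that the standard deviation is taken over all bases; the function name and the toy intervals are hypothetical and are not taken from the Geneious workflow.

```python
from statistics import mean, pstdev

def coverage_stats(reads, locus_length):
    """Return (mean, SD) of per-base depth for reads given as half-open (start, end) intervals."""
    depth = [0] * locus_length
    for start, end in reads:
        # Clip each read to the locus and count it once per covered base.
        for pos in range(max(start, 0), min(end, locus_length)):
            depth[pos] += 1
    return mean(depth), pstdev(depth)

# Toy example: three hypothetical reads over a 10 bp locus.
avg, sd = coverage_stats([(0, 6), (2, 10), (4, 8)], 10)
```

For a real 40 kb locus the same calculation would run over the coordinates of every aligned read.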


Figure 2: Example of sequence divergence from reference genome.

A. Magnified image of the alignment shown in figure 1. Shown is a 22 bp deletion relative to the genome reference line. B. Alignment of the consensus isogenized wild-type sitter sequence (lower gray with black bars) and the genome reference sequence (upper gray with black bars). The black bars denote lack of sequence identity. The green and yellow plot above the alignment represents the level of sequence identity with a sliding window of 50 bp. The exons of foraging appear in blue. C. Alignment of the consensus sitter sequence (upper gray with black bars) and the consensus rover sequence (lower gray with black bars). The black bars denote lack of sequence identity. The green and yellow plot above the alignment represents the level of sequence identity with a sliding window of 50 bp. The exons of foraging appear in blue.
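A sliding-window identity profile of the kind plotted above the alignments can be sketched as follows. This is an illustration only: it assumes two pre-aligned sequences of equal length with "-" marking gaps, and it counts a gap as a mismatch. How Geneious computes its identity graph is not stated here, and the function name is hypothetical.

```python
def sliding_identity(seq_a, seq_b, window=50):
    """Fraction of identical aligned columns in each window of two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # A column matches only when both bases are identical and neither is a gap.
    matches = [int(a == b and a != "-") for a, b in zip(seq_a, seq_b)]
    return [sum(matches[i:i + window]) / window
            for i in range(len(matches) - window + 1)]

# Toy example with a 4-column window; '-' marks an alignment gap.
profile = sliding_identity("ACGTACGT", "ACGAAC-T", window=4)
```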


Figure 3: Alignment of RACE and cDNA reads

A. Homopolymeric tailed RACE reads aligned to the genome. In this example we see promoter 2, a peaked promoter, with all reads mapping the transcription start site to a single base. Reads were aligned to the genome reference sequence using Geneious bioinformatics software. B. Homopolymeric tailed RACE reads aligned to the genome. In this example we see promoter 3, a broad promoter, with the reads mapping the transcription start site to a 200 bp region. Reads were aligned to the genome reference sequence using Geneious bioinformatics software. C. Full length cDNAs (light blue peaks with black lines) were sequenced and aligned to the exons (blue) of the locus (green). Consensus splice variants were identified and characterized.

Figure 4: Schematic of the foraging gene and associated features.

The transcription start sites (for-pr1-4.7-4, up-and-right arrows) and one transcription termination site (AAA) were identified with rapid amplification of cDNA ends (RACE). The splicing patterns of the transcripts were identified by sequencing full length cDNAs. Exons (dark blue) are annotated along the locus (black line) with the transcripts below. UTRs are denoted with narrow gray bars and ORFs are denoted with broad blue bars. Transcripts and proteins are labeled following FlyBase nomenclature with “R” for RNA and “P” for protein followed by alphabetic label of the individual variants. A through K are currently annotated on FlyBase. I have labeled some of the ORFs P1 – P4, due to the fact that some distinct transcripts code for identical ORFs.


Figure 5: Cloning of HR targeting construct

A. Schematic of Ends-out targeting vector (pW25) with foraging homology arms (HA1 and HA2) and an attP site flanking a w+mC gene. B. In-silico digestion of the foraging targeting construct with restriction enzymes. The lanes are as follows; 1. NEB 1kb ladder, 2. undigested plasmid, 3. BsiWI, 4. PstI, 5. PvuI. C. Digestion confirmation of the targeting vector. The lanes are as follows; 1. NEB 1kb ladder, 2. undigested plasmid, 3. BsiWI, 4. PstI, 5. PvuI. All digestions are as expected according to the in-silico digestion.


Figure 6: Generation and verification of foraging null mutant

A. Ends-out HR was used to incorporate loxP (yellow triangle) and attP (gray irregular pentagon) recombination sequences into the 5’- and 3’-ends of foraging. Subsequent recombination between the loxP (red diagonal line) sites deleted the gene and left only a loxP and attP site behind. This recombination event also yielded alleles with two copies of foraging at the locus. B. RT-PCR of wild type and null mutants amplified a control gene product but the nulls failed to amplify any foraging gene product. actin, tubulin, and 1433epsilon were all run as control loci. Seven separate foraging common coding regions were run, with all showing the same result. C. Western blot detected with anti-FOR showed no banding in null mutants compared to the numerous bands in wild type animals. Ponceau staining shows adequate protein for all lines.

Figure 7: Schematic of the foraging gene and associated tools.

The breakpoints of the deletions used are shown above the locus (red). Df(2L)Exel7018 and Df(2L)forED243 were ordered from the Bloomington Drosophila Stock Center. The other deletions were generated in the lab. Below the locus are the target sequences of two RNAi constructs generated in the lab. Both target all known transcripts. The exon7:8 construct targets what will code for the first cGMP binding domain, and exon7:8 targets what will code for the AGC Kinase C-terminus.


Figure 8: Recombineering of foraging BAC

A. Schematic of recombineered foraging BAC. The P(acman) vector was re-engineered to have two different site specific recombination sequences on either side of the multiple cloning site (MCS). Homology regions were cloned into the MCS and gap repair was performed. The resulting BAC has a synteny of “loxP-for-attB”. B. In-silico digestion of the foraging BAC with restriction enzymes. The lanes are as follows; 1. NEB 10kb ladder, 2. BstXI, 3. EcoRV, 4. NsiI, 5. PstI, 6. XhoI. C. Digestion confirmation of the foraging gene in the p(attlox) BAC. The lanes are as follows; 1. NEB 1kb ladder, 2. BstXI, 3. EcoRV, 4. NsiI, 5. PstI, 6. XhoI. All digestions are as expected according to the in-silico digestion.

[Figure 9B plot. y-axis: Relative expression, 0.0 to 4.0]

Figure 9: Western blot analysis of foraging null and BAC mutants

A. A western blot analysis of the generated foraging null and BAC mutant larvae. The foraging BAC integrated into the attP2 site on the third chromosome was crossed into our null mutant (BAC rescue) and our wild type (BAC over expressor) second chromosome backgrounds. The BAC rescue recapitulated the wild type bands (albeit at slightly different levels). The BAC over expressor showed approximately double expression relative to both the wild type and the BAC rescue. B. Quantification of western blots (n=4). Western blots were quantified and standardized to actin expression using ImageJ. All FOR bands were summed to give a total FOR expression metric. Data are plotted as means ± standard error of the mean.
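The quantification described in panel B can be sketched as follows, assuming band intensities have already been exported from ImageJ as plain numbers. The function names and all intensity values below are hypothetical and serve only to show the arithmetic (sum of FOR bands, divided by actin, then mean ± SEM across blots).

```python
from math import sqrt
from statistics import mean, stdev

def total_for_expression(for_bands, actin):
    """Sum all FOR band intensities in a lane and standardize to the actin band."""
    return sum(for_bands) / actin

def mean_sem(values):
    """Mean and standard error of the mean across replicate blots."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical intensities (arbitrary units) for one genotype across n=4 blots:
# each entry is (FOR band intensities, actin band intensity).
blots = [([120, 80, 40], 100), ([100, 90, 50], 120),
         ([150, 70, 30], 125), ([110, 60, 30], 80)]
normalized = [total_for_expression(bands, actin) for bands, actin in blots]
avg, sem = mean_sem(normalized)
```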

[Figure 10 plots. Panel B genotypes: BAC over, BAC rescue, null, wt. Panel C genotypes: f0e0/null, d0f0/null, Exel7018/null, CyO,GFP/null. Panel D genotypes: 32122-1, 08112-3, yw. x-axis (panels B to D): Proportion Survival from L1 to Pupa, 0 to 1]

Figure 10: Null mutant lethality

A. Homozygous and heterozygous null mutants were dissected from pupal cases. Homozygous null pupae survived to the pharate stage but never eclosed. When removed from the pupal case, null flies twitched but never walked or flew. They died soon after removal from the pupal case. B. Individual homozygous foraging null mutant, BAC rescue, BAC over expressor, and wild type first instar larvae were seeded into vials (χ2=0.0047, df=3, p=0.9999). C. The foraging null mutation was crossed to three different deficiencies as well as a CyO, act-GFP control. Four vials with 20 GFP-negative first instar larvae each were seeded and reared in standard conditions. GFP-positive individuals were seeded for the control. The vials were incubated for 10 days and the number of pupae was scored (χ2=0.0035, df=3, p=0.9999). D. Individual homozygous larvae of partial foraging deletions Df(2L)for08112-3 and Df(2L)for32122-1 and y1w67c23 were seeded into isolation vials. The vials were visually inspected daily (χ2=1.0164, df=2, p=0.6016).
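A comparison of survival across genotypes can be expressed as a Pearson chi-square test on a genotype-by-outcome table of counts. The sketch below is an assumption about the form of the test, since the exact test setup is not described here; the function name and the counts are hypothetical. With four genotypes and two outcomes it gives df=3, and with three genotypes df=2.

```python
def chi_square_homogeneity(table):
    """Pearson chi-square statistic and degrees of freedom for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis that survival is the same in every genotype.
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: (pupated, died) out of 80 larvae for each of four genotypes.
counts = [(72, 8), (70, 10), (71, 9), (71, 9)]
stat, df = chi_square_homogeneity(counts)
```

The p-value would then be read from the chi-square distribution with the returned degrees of freedom.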

[Figure 11 plots, panels A to D. y-axis: Path length (mm), 0 to 60]

Figure 11: Foraging path length behaviour of null mutant

A. Rovers had longer path length behaviour on yeast than sitters. Data plotted as means ± 95% confidence intervals. B. Crossing to a deficiency (Df) reduced path length of the foraging null relative to wild-type. The deficiency used was Df(2L)Exel7018 which deleted from odd to shaw resulting in a 120kb deletion. Data were plotted as means ± 95% confidence intervals. C. Homozygous deletion mutants of the kinase domain coding region, Df(2L)for08112-3 and Df(2L)for32122-1 showed a reduction in path length behaviour on yeast relative to the background control, y1w67c23. Data plotted as means ± 95% confidence intervals. D. Increasing foraging gene copy number increases path length behaviour on yeast. Path lengths of homozygous foraging null mutants were significantly shorter than the BAC rescue, which in turn were significantly shorter than the BAC over expressor. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.2 table S2)
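The "means ± 95% confidence intervals" used throughout these figures can be computed as in the following sketch. It uses a normal approximation for the critical value, which is an assumption; the interval method is not specified here, and small samples would normally use a t critical value instead. The path lengths are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def mean_ci95(values):
    """Mean and half-width of an approximate 95% confidence interval (normal approximation)."""
    z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
    half_width = z * stdev(values) / len(values) ** 0.5
    return mean(values), half_width

path_lengths_mm = [32.0, 41.5, 28.0, 36.5, 39.0, 30.5]  # hypothetical values
centre, half = mean_ci95(path_lengths_mm)
```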


Figure 12: Foraging path length behaviour and effectiveness of RNAi

A. Driving the UAS-Dcr2; UAS-forRNAi-exon7:8 with a ubiquitous da-Gal4 reduced path length behaviour on yeast. The da>RNAi line was significantly lower than both controls, although the wt>RNAi control was significantly higher than the da>wt control. Data plotted as means ± 95% confidence intervals. B. A western blot analysis confirmed a down regulation of foraging gene product. The UAS-Dcr2; UAS-forRNAi-exon7:8 crossed to da-Gal4 does not show expression relative to wild type and UAS and Gal4 controls. (Full statistics in Ch.2 table S3)


Figure 13: Rover sitter allele swap

A. Schematic of the backcrossing scheme to generate the recombinant lines. The foraging null mutant was backcrossed for 9 generations into both our rover (iB55, red) and sitter (ie4, blue) lines. These lines were then used as background donors for two crosses each for 9 generations. The resulting lines are rover in a sitter background (R1), rover in a rover background (R2, control), sitter in a sitter background (R3, control), and sitter in a rover background (R4). B. Path length behaviour of the recombinant lines on yeast. Data plotted as means ± 95% confidence intervals. C. A western blot analysis of the generated recombinant lines. Blots (n=4) were stained with anti-FOR and anti-ACTIN. (Full statistics in Ch.2 table S4)

[Figure 14 plots, panels A and B. y-axis: Food intake (a.f.u.), 0 to 30000]

Figure 14: Food intake behaviour

A. Rovers ate less than sitters. Increasing foraging gene copy number increased larval food intake behaviour. Data plotted as means ± 95% confidence intervals. B. A partial deletion of the foraging gene also reduced food intake behaviour of larvae. Data collected by Dr. Scott Douglas. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.2 table S5)

[Figure 15 plots, panels A to C. y-axis: Fat levels (µg glycerol/mg protein), 0 to 800]

Figure 15: Lipid level of null mutant

A. An increase in foraging gene copy number decreased larval fat levels. All lines were homozygous for their respective mutations and transgenes. Data plotted as means ± 95% confidence intervals. B. Deletions of the kinase coding domain of the gene also increased lipid levels. All lines were homozygous. Data are plotted as means ± 95% confidence intervals. C. Deletion of the 5'-end of the gene increased fat levels. All lines were homozygous. Data collected by Dr. Scott Douglas. Data are plotted as means ± 95% confidence intervals. (Full statistics in Ch.2 table S6)

Chapter 3

Promoter Specific Expression and Function of the foraging gene in the Larvae of the Fruit Fly (Drosophila melanogaster)

The vast majority of the experimental research done in the thesis was performed by Aaron Allen. However, some of the experiments performed in this thesis were collaborative, with data being contributed by others. Those data and the contributors are listed below;

● Table 2: lethality Gal4 screen - crosses were performed by the following people: Aaron Allen, Ina Anreiter, Lydia To, Bryon Hughson, Dr. Jeff Dason.
● Figure 9B, D, and F: IHC of wild type larval CNS - Dr. Amsale Belay
● Figure 10B: RNA in-situ in larval CNS - Dr. Amsale Belay
● Figure 10C: IHC of larval CNS enhancer trap - Dr. Amsale Belay
● Figure 16A and B: fat body manipulations in larvae - Dr. Scott Douglas

Abstract:

Pleiotropy has important implications for a number of fields of biology including evolution and medicine. The foraging gene, in Drosophila melanogaster, is highly pleiotropic. Its complex gene structure makes it an ideal system to study the mechanisms underlying pleiotropy. I hypothesized that foraging achieves its pleiotropy through the use of independently regulated cis-regulatory elements that drive distinct spatial expression of foraging isoforms, and that this results in foraging's multiple associated phenotypes. I employed promoter analysis and isoform specific RNAi to address this hypothesis. I identified a broad expression pattern from the 4 promoters of the foraging gene. The promoters co-expressed in the same tissue systems, but had quite distinct expression within those systems. for-pr1-4.7-Gal4 and for-pr4-2.4-Gal4 expressed in neurons in the CNS, and for-pr2-4.0-Gal4 and for-pr3-3.3-Gal4 expressed in different subsets of glial cells in the CNS. I also saw extensive expression in the alimentary canal of the larva. for-pr1-4.7-Gal4 had limited expression in a few enteroendocrine cells of the midgut. for-pr2-4.0-Gal4 expressed in the stem cells of the midgut. for-pr3-3.3-Gal4 had the broadest expression in the gastric system, and expressed in the midgut enterocytes, midgut enteroendocrine cells, and midgut and hindgut visceral muscle. for-pr4-2.4-Gal4 was restricted to the hindgut epithelia of the alimentary canal. Through the use of isoform specific RNAi, I showed that the P1 or P3 protein isoforms are necessary for larval foraging behaviour. I identified a cis-regulatory element that drives expression in the fat body of the larva. Reduced foraging in the fat body increased lipid levels. Our results suggest that the foraging gene is an example of genuine pleiotropy, in that its associated phenotypes are influenced by independent regulation of its products.

Introduction:

Pleiotropy

Pleiotropy is defined as one element at the genetic level affecting multiple phenotypes (Plate, 1910). Here I use the “molecular gene pleiotropy” definition where the gene is the unit of measure at the genetic level and not the mutation (Paaby and Rockman, 2013). The pleiotropic actions of a gene are typically deduced by comparing the phenotypes of a mutant to a wild-type, or by comparing the phenotypes of two (or more) natural variants of a gene. With this definition, a mutation itself is not pleiotropic. However, when compared to wild-type it allows us to uncover the multiple functions of a gene and investigate how a mutation might alter a gene’s pleiotropy. Phenotypes can be measured at different levels of biological organization ranging from gene expression, to physiological states, to behavioural patterns. Pleiotropy can be achieved through multiple modes of action (discussed below). Pleiotropy has important implications for many fields of biology. These range from medicine and disease (Pyeritz, 1989; Mackay and Anholt, 2006), to adaptation and selection (Orr, 2000; Pavlicev and Wagner, 2012). Arguably the most notable is the theory of antagonistic pleiotropy in aging (Medawar, 1952; Williams, 1957). Genes that influence behaviour are for the most part pleiotropic and also affect other aspects of physiological and developmental processes (Sokolowski, 2001). Examples include period (Williams and Sehgal, 2001), doublesex (Greenspan and Ferveur, 2000), and foraging (Reaume and Sokolowski, 2009). In the case of foraging, both mutants and natural alleles implicate the pleiotropic actions of the gene (Chapter 2). The minimal variation in amino acid sequence suggests that the phenotypic variation in the natural rover and sitter variants is regulatory.

Coding vs noncoding variation

Mutations in the coding sequence are more likely to affect all functions of a polypeptide. Variation in noncoding regions may only affect a subset of a gene's expression, and as a result, its function. Variations within cis-regulatory elements (CREs), the regions of DNA bound by transcription factors, can affect a subset of the spatial- or temporal-expression of the gene. Variation in CREs has been proposed as a prevalent source of morphological evolution (Carroll, 2000). The importance of non-coding variation in the DNA sequence for evolution is not a new idea (Britten and Davidson, 1971). Regulatory differences are known to be important for physiological and behavioural evolution between lineages (Andersson and Georges, 2004; Hofmann, 2003). In order to study the relevance of such regulatory elements, it is important to find them and deduce their functions.

Mapping CREs and their functions

Promoter analyses have previously been used successfully to identify CREs and deduce isoform-specific expression and function. Examples of genes analyzed by promoter analysis in Drosophila include slowpoke (Thomas et al., 1997), string (Lehman et al., 1999), paramyosin (Arredondo et al., 2001), pdf (Park et al., 2000), timeless (Okada et al., 2001), and fruitless (Billeter and Goodwin, 2004). CREs may be combinatorial in nature, and do not necessarily associate with their closest promoter. The slowpoke gene illustrates this point (Brenner and Atkinson, 1996). slo has several regulatory elements across the locus that drive expression in different tissues (Thomas et al., 1996; Brenner et al., 1996). These elements associate with specific promoters of the locus, in a combinatorial fashion, and not necessarily with the closest promoter (Brenner and Atkinson, 1996). Multiple similar CREs may be found together, resulting in stronger expression. Expression in muscle steadily decreased when a series of nested constructs of the paramyosin gene were examined (Arredondo et al., 2001). In this case there were multiple CREs closely positioned along the DNA that were responsible for expression in the muscle. Many CREs can only interact with a specific promoter. In the example of string, expression was seen in various tissues by fusing reporter constructs with the native promoter of the string gene (Lehman et al., 1999). However, when these elements were fused to a non-native promoter, expression was not seen. This CRE-by-promoter interaction underlines the importance of maintaining the endogenous promoter in these analyses. Similarly, CREs that drive tissue- and sex-specific expression of the fruitless gene were also mapped with promoter bashing (Billeter and Goodwin, 2004). fru encodes a transcription factor involved in the sex determination pathway and it affects courtship behaviour; it has multiple transcripts regulated by four independent promoters (Ryner et al., 1996; Goodwin et al., 2000). The first promoter regulates sex-specific expression of transcripts while the remaining promoters are not sex-specific. Lesions in the intronic regions of the sex-specific transcripts cause aberrant splicing which induces defective courtship behaviour in males (Goodwin et al., 2000). A promoter-Gal4 transgene identified male-specific expression patterns that co-localize with endogenous FRUM protein (Billeter and Goodwin, 2004).

foraging and its pleiotropy

foraging is a pleiotropic gene (Chapter 2). It has primarily been investigated for its roles in feeding-related behaviour and metabolism. In the previous Chapter, I showed that foraging has a complex gene structure with multiple promoters. Furthermore, I generated and used a null mutation to conclusively demonstrate that foraging is involved with larval foraging behaviour, food intake, and lipid metabolism. I found these phenotypes to be correlated and responsive to gene dosage. PKGs are known to interact with multiple signaling pathways, so multiple functions may not be surprising (Schlossmann and Desch, 2009). foraging's mammalian orthologue, cGKI, produces multiple isoforms with differing expression, biochemical activity, and interacting partners (Smith et al., 1996; Hofmann et al., 2009). Our data from the previous Chapter suggest that foraging's associated phenotypes are not all co-regulated through a common CRE. The foraging gene is an excellent system in which to study pleiotropy due to its complexity and the suite of associated phenotypes.

Decoding foraging’s pleiotropy

It is not known why foraging has so many transcripts and protein isoforms. I hypothesize that foraging achieves its pleiotropy through the tissue-specific expression and function of its numerous products. Furthermore, I hypothesize that CREs along the locus drive expression of the promoters and their transcripts in a tissue-specific manner. Here I used a promoter bashing approach, employing the Gal4/UAS system (Brand and Perrimon, 1993), to test these hypotheses. Knowledge of the mechanistic basis underlying the generation of foraging’s products will permit the future investigation of the means by which these products influence foraging’s phenotypes.

Materials and Methods:

Strains

Strains were kept in 10 mL of food per 40 mL vial and 40 mL of food per 170 mL bottle at 25 ± 1ºC with a 12L:12D photocycle. Our fly food recipe contains 1.5% sucrose, 1.4% agar, 3% glucose, 1.5% cornmeal, 1% wheat germ, 1% soy flour, 3% molasses, 3.5% yeast, 0.5% propionic acid, 0.2% Tegosept and 1% ethanol in water. The nuclear GFP, UAS-Stinger, was from Joel Levine. The maternal driver, nos-Gal4, was from Craig Smibert. The muscle driver, 24b-Gal4, was obtained from Rolf Bodmer. The following lines were acquired from the Bloomington Drosophila Stock Center: (w1118 ; P{UAS-Dcr-2.D}2), (w1118 ; P{Gal4}repo/TM3,Sb1), (y1w1118 ; P{Lsp2-Gal4.H}3), (w* ; P{ppl-Gal4.P}2), and (w1118 ; P{Cg-Gal4.A}2).
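The food recipe is given as percentages, so scaling it to a batch is simple arithmetic, sketched below. Whether each percentage is weight or volume per volume is not stated here; the sketch assumes grams (solids) or millilitres (liquids) per 100 mL of final volume. The function name and the 1 L batch size are hypothetical.

```python
# Percentages as stated in the recipe; treating them as grams (or millilitres)
# per 100 mL of final volume is an assumption made for this sketch.
RECIPE_PERCENT = {
    "sucrose": 1.5, "agar": 1.4, "glucose": 3.0, "cornmeal": 1.5,
    "wheat germ": 1.0, "soy flour": 1.0, "molasses": 3.0, "yeast": 3.5,
    "propionic acid": 0.5, "Tegosept": 0.2, "ethanol": 1.0,
}

def batch_amounts(final_volume_ml):
    """Amount of each ingredient (g or mL) needed for a batch of the given final volume."""
    return {name: pct / 100 * final_volume_ml for name, pct in RECIPE_PERCENT.items()}

amounts = batch_amounts(1000)  # a hypothetical 1 L batch
vials_filled = 1000 // 10      # at 10 mL of food per vial
```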

Larval Synchronization

To work with mid third instar larvae, I synchronized larval development. To accomplish this, five to seven day old adults were allowed to oviposit on a grape juice and agar medium (45% grape juice, 2.5% ethanol, 2.5% acetic acid, 2% agar in water) for 20 hours. The following day, any hatched larvae were removed and discarded. Following a four hour incubation, newly hatched larvae were seeded into a 100 mm Petri dish containing 30 mL of our yeast-cornmeal-molasses-agar food. The plates were incubated at 25 ± 1ºC with a 12L:12D photocycle until the larvae reached mid third instar (72 ± 2 h post hatch).

Bioinformatics

Bioinformatic analyses were conducted with the Geneious 8.1.7 software package (http://www.geneious.com, Kearse et al., 2012). These included primer design, editing and visualization of Sanger sequencing chromatograms, contig assembly, nucleotide alignments, in-silico design of cloning schemes, and restriction endonuclease digestion prediction.

DNA Extraction

Larvae were ground in a sterile 1.5ml centrifuge tube with 150μl of Solution A (0.1M Tris-HCl, 0.1M EDTA, 0.1M NaCl, 0.5% SDS, pH 8.0). An additional 150μl of Solution A was added and the sample was ground again. The homogenate was incubated at 65°C for 30 minutes. 172μl of 5M KOAc and 428μl of 6M LiCl were added. The mixture was inverted vigorously 3 times and then incubated on ice for 20 minutes. The mixture was then centrifuged for 20 minutes at 12,000 RCF. 600μl of supernatant was transferred to a fresh sterile 1.5ml tube, and 450μl of ice cold isopropanol was added, then centrifuged for 5 minutes at 12,000 RCF. The supernatant was discarded and the pellet was washed in 70% ethanol, dried, and then re-suspended in 50μl of sterile deionized water.

PCR, Restriction Digestion, and Ligation

PCR, restriction endonuclease digestions, and ligations were performed using New England Biolabs (NEB) products following the manufacturer's instructions. Routine PCR, such as colony PCR and RT-PCR, was performed using NEB Taq DNA Polymerase. Genomic regions for cloning and sequencing of the locus were amplified with NEB Phusion High-Fidelity DNA Polymerase. All restriction enzymes used were NEB enzymes. NEB T4 DNA ligase was used for ligation reactions in all cloning schemes.

E. coli Transformation

Electro-competent E. coli cells were removed from -80ºC and thawed on ice. The cell suspension was mixed with 1µl of ligation reaction (or intact vector DNA). This suspension was then added to an electroporation cuvette with a 1 mm gap. Cells were then electroporated at 1.8 kV. Immediately following the electroporation, 1 mL of LB broth was added to the cuvette and mixed. The cell suspension was then transferred to a culture tube and incubated for 1 hour in a 37ºC shaker. Following recovery, a dilution series of the transformation was plated on LB medium containing appropriate antibiotics and incubated at 37ºC overnight. Colonies were then inoculated into 5ml of LB broth containing appropriate antibiotics and incubated at 37ºC overnight in a shaker. Standard cloning was conducted using NEB 5-alpha and NEB 10-beta cells from New England Biolabs.

Plasmid Preparation

The cells were pelleted by centrifugation of 1 mL of the culture for 1 minute at 16,000 RCF. The supernatant was discarded and 200μl of GTE buffer (50mM glucose, 2.5mM Tris-HCl, 10mM EDTA, pH8.0) was added. The tubes were then vortexed to resuspend the pellet, and then 400μl of lysis buffer (1% SDS, 0.2M NaOH) was added. The mixture was incubated at room temperature for 3 minutes. Then 300μl of neutralization buffer (3M potassium, 5M acetate) was added, and the mixture was briefly vortexed and incubated on ice for 5 minutes. The mixture was then centrifuged for 4 minutes at 16,000 RCF and the supernatant was transferred to a fresh 1.5 mL tube. The plasmids were precipitated by adding 500μl of ice cold isopropanol, vortexing, and centrifuging for 5 minutes. The supernatant was discarded and the pellet was washed in 70% ethanol. The pellet was resuspended in 25μl of 1x TE (10mM Tris-HCl, 1mM EDTA, pH8.0). Any RNA present was then digested by adding 1μl of RNase and incubating at room temperature for 1 minute.

Promoter-Gal4s

An attB sequence was amplified with PCR from the pUAST-attB vector (Bischof et al., 2007). NsiI sites were added to the primers. The NsiI fragment was then cloned into the pStinger vector (Barolo et al., 2000). Regions of the foraging gene were amplified with PCR and cloned into the pGEM-T Easy vector (Promega). KpnI and SacI sites were added to the primers used. The KpnI/SacI fragment was digested out of pGEM and inserted into the Gal4-containing pMARTINI-Gal4 vector (Billeter and Goodwin, 2004). The NotI pr-Gal4 fragment was then inserted into the pStinger-attB vector, replacing the eGFP sequence. The resulting vector contained a pr-Gal4 sequence between two gypsy insulators, together with an attB sequence. The nested constructs were cloned by first inserting the NotI fragment of pMARTINI-Gal4 into the pStinger-attB vector. This insulated Gal4 vector with attB, called pIGa, was digested with KpnI and end-filled with Klenow enzyme (New England Biolabs). Nested regions were amplified with PCR from the larger pr-Gal4 vectors and cloned into the end-filled, KpnI-digested pIGa. All pr-Gal4 constructs were injected into the attP2 landing site (Groth et al., 2004) by Genetic Services Inc. Successful integration was confirmed with PCR.

Western Blot Analysis

Twenty mid third instar larvae (72 ± 2 hours post hatch) were homogenized in 400μl of lysis buffer (50 mM Tris-HCl pH7.5, 10% glycerol, 150 mM NaCl, 1% Triton-X 100, 5 mM EDTA, 1x Halt Protease Inhibitor Cocktail, 1x Halt Phosphatase Inhibitor Cocktail) on ice. Samples were centrifuged at 16,000 RCF at 4°C and the supernatant was transferred to a new tube and placed on ice. Protein quantification was performed with the Pierce BCA kit. 20 μg of protein was denatured for 5 minutes at 100°C. The samples were run on a 4% stacking/7% resolving SDS-polyacrylamide gel at 150V for 1 hour in running buffer (25 mM Tris-HCl, 200 mM glycine, 0.1% SDS). Proteins were transferred onto a nylon membrane at 100V for 1 hour in transfer buffer (25 mM Tris-HCl, 200 mM glycine, 10% methanol). Blots were blocked for 2 hours in 5% nonfat milk in 0.1% Tween-20 in 1x TBS (0.1% TBST). Primary antibody was incubated for 1 hour at a concentration of 1:10,000 for anti-FOR (Belay et al., 2007) and 1:5,000 for anti-ACTIN (Sigma). The blots were rinsed twice and then washed 3 times for 5 minutes each in 0.1% TBST. Blots were then incubated with HRP-conjugated secondary antibody (Jackson ImmunoResearch Laboratories) at a concentration of 1:10,000 for 45 minutes. They were then rinsed twice and washed 3 times. Blots were incubated for 5 minutes in GE’s ECL Prime Detection reagent and exposed to X-ray film.

Immunohistochemistry

Dissected samples were fixed in 4% paraformaldehyde in 1x PBS for 1 hour. The fixed samples were rinsed twice in 0.5% Triton X in 1x PBS (PBT) and then washed 4 times for 30 minutes each in PBT. The samples were blocked in 10% normal goat serum (NGS, Jackson ImmunoResearch Laboratories) and 0.1% BSA (Sigma) in PBT for 2 hours at room temperature. Primary antibodies were diluted in blocking solution and incubated overnight at 4°C. After primary incubation the samples were rinsed twice and then washed 4 times for 30 minutes each in PBT. Secondary antibody was incubated in PBT at room temperature for 2 hours. Washing was conducted as described above for the primary antibody. Tissues were mounted on slides in Vectashield (product infor). Samples were imaged using a Zeiss Axioscope epifluorescence microscope as well as Zeiss LSM 510 and Leica SP5 confocal microscopes. Images were analysed using the Fiji software package (Schindelin et al., 2012).

Triglyceride analysis

Groups of 10 larvae were homogenized in 200 μL of 0.1% Tween 20 in 1x PBS. Samples were incubated at 70°C for 5 minutes and then chilled for 2 minutes on ice. Debris was pelleted by centrifugation at 16,000 RCF for 5 minutes. 25 μL of the supernatant was then mixed with Infinity TAG Reagent (Fisher Scientific) in a 96-well spectrophotometer plate. Samples were incubated at 37°C for 5 minutes. The absorbance was measured at 540 nm. A separate aliquot was mixed with Pierce BCA reagent to quantify protein levels in the samples. The BCA samples were incubated at 37°C for 30 minutes. Standard curves were used for both the DAG/TAG and protein analysis. Lipid levels are expressed as μg glycerol/mg protein.
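The standard-curve normalization described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the thesis: the standard concentrations, absorbance readings, and function names are all hypothetical, and it assumes both readings refer to the same volume of supernatant and fall within the linear range of each curve.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to recover the amount of analyte."""
    return (absorbance - intercept) / slope


# Hypothetical standards (not values from the thesis).
GLYCEROL_STD_UG = [0.0, 2.5, 5.0, 10.0]
GLYCEROL_STD_ABS = [0.05, 0.30, 0.55, 1.05]
PROTEIN_STD_MG = [0.0, 0.01, 0.02, 0.04]
PROTEIN_STD_ABS = [0.10, 0.30, 0.50, 0.90]


def normalized_lipid(tag_absorbance, bca_absorbance):
    """Return lipid content as ug glycerol per mg protein."""
    g_slope, g_int = fit_line(GLYCEROL_STD_UG, GLYCEROL_STD_ABS)
    p_slope, p_int = fit_line(PROTEIN_STD_MG, PROTEIN_STD_ABS)
    glycerol_ug = amount_from_absorbance(tag_absorbance, g_slope, g_int)
    protein_mg = amount_from_absorbance(bca_absorbance, p_slope, p_int)
    return glycerol_ug / protein_mg
```

Normalizing to protein rather than to larval number corrects for differences in body size between genotypes, which matters when a manipulation also changes growth.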

Food intake

Cell culture dishes placed in a 35 mm Petri dish were used as an arena. 700 μL of liquid food was placed into the cell culture dish. The food contained 0.5% fluorescein, 5% sucrose and 5% yeast extract. Mid-third instar larvae were removed from food plates, washed and then placed into the test arenas in groups of 10. Larvae were left to feed for 10 minutes. The cell culture dishes were then lifted out of the food and rinsed with water. Larvae were then picked out of the cell culture dish and washed an additional 2 times in water. Washed larvae were placed into a 1.5ml tube and frozen. The 10 larvae were homogenized in 150 μL of 1x PBS and then centrifuged at 16,000 RCF for 15 minutes. 20 μL of the supernatant was mixed with 180 μL of 1x PBS in a fluorometer plate and excited at 488 nm, with emission measured at 562 nm. Larvae that were fed food without fluorescein were homogenized and used as a blank.
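The blank correction implied by the last step can be sketched as below. The function name and readings are hypothetical; the sketch simply subtracts the mean fluorescence of the fluorescein-free homogenates from each sample reading.

```python
from statistics import mean


def blank_corrected(sample_readings, blank_readings):
    """Subtract the mean blank fluorescence from each sample reading."""
    blank = mean(blank_readings)
    return [reading - blank for reading in sample_readings]


# Hypothetical fluorometer readings in arbitrary units.
corrected = blank_corrected([120.0, 150.0], [18.0, 22.0])
```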

Path length

Foraging path length was measured using black rectangular Plexiglas plates (37 cm width, 60 cm length, 0.5 cm height) with 10 wells (0.5 mm depth, 9.5 cm diameter) arranged in a 2-by-5 grid (Sokolowski et al., 1997). Mid-third instar larvae (72 ± 2 hours post hatch) were randomly selected from the food plates and washed in water. A homogeneous yeast suspension (2:1 w/w) was spread across the wells, creating a thin even layer in each well. Individual larvae were placed in the centre of each well and covered with the lid of a 10 cm Petri dish. Larvae were allowed to move for 5 minutes, and then the path was traced on the Petri dish lid. Path lengths were digitized using Fiji (Schindelin et al., 2012).
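As an illustration of what digitizing a traced path amounts to, the sketch below sums the straight-line segments between consecutive points along a trace and converts pixels to centimetres. It is not the Fiji procedure itself; the coordinates and the `cm_per_pixel` scale factor are hypothetical.

```python
import math


def path_length(points, cm_per_pixel=1.0):
    """Total length of a traced path given as a list of (x, y) points."""
    segments = zip(points, points[1:])
    return cm_per_pixel * sum(math.dist(p, q) for p, q in segments)


# Hypothetical trace in pixel coordinates.
trace = [(0, 0), (3, 4), (3, 10)]
length_cm = path_length(trace, cm_per_pixel=0.5)
```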

Statistical analysis

Statistical analysis was performed in R (R Core Team, 2013). The effects of genotype, as well as blocking factors such as the date of the experiment, were modeled with general linear models using the lm function and the car package (Fox and Weisberg, 2011). ANOVA and Tukey HSD post-hoc tests were run to assess statistical significance. 95% confidence intervals were calculated using the effects package (Fox, 2003). Effect sizes were calculated using the compute.es package (Del Re, 2013). Power was calculated using the pwr package (Champely, 2012). The packages lattice and gplots were used for plotting (Sarkar, 2008; Warnes et al., 2013, respectively).
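To make the ANOVA step concrete, the sketch below computes a one-way ANOVA F statistic for genotype groups by hand. It is a simplification and not the R analysis described above: it omits the blocking factor and the Tukey HSD comparisons, and the path-length values are invented for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical path lengths (cm) for three genotypes.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

In the full model, a term for experiment date would absorb day-to-day variation before the genotype effect is tested.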

Results and Discussion:

Dissecting foraging’s pleiotropy.

There were many differences in the foraging gene sequence between our rover and sitter lines: 425 non-identical sites along the 40,000 bp locus. It was impractical to perform a promoter analysis on both the rover and sitter genetic backgrounds. I chose the sitter background since it has more phenotypic similarities to other common wild type strains used in Drosophila genetic research, such as Canton S and Oregon R. In order to understand the differences between these alleles, I needed to discover the location of the causal variation. I began by using genetic dissection techniques on the sitter background to understand how the foraging gene accomplishes its pleiotropy.

Lack of variation in foraging coding sequence.

We wondered whether the natural allelic variation resides in the coding sequence or the regulatory sequence of the foraging gene. In Chapter 2, we found almost no variation in amino acid sequence between our rover and sitter strains. Similarly, when I looked at the amino acid sequence of foraging in the DGRP collection (Mackay et al., 2012), I also saw very little variation in the coding sequence. The common coding region, which contains the cGMP-binding domains as well as the kinase domain, had only 2 variable amino acid sites out of 527. As this region is critical for the function of the cGMP-dependent protein kinase enzyme, it is not surprising that it contained little DNA variation. The rest of the protein coding regions have 99.7% identity at the amino acid level. This minimal variation within a population of Drosophila melanogaster was further echoed when I conducted a selection analysis (table 1) using non-synonymous to synonymous substitution rates with PAML (Yang, 2007) on 12 sequenced Drosophila species (Clark et al., 2007). This analysis showed that the common coding region of the gene is under strong purifying selection (table 1). Even between Drosophila species that are separated by more than 40 million years of evolution (e.g. D. melanogaster and D. grimshawi; Powell, 1997), there is still minimal variation in the foraging coding sequence. The mammalian amino acid sequence shares roughly 63% identity with the Drosophila melanogaster foraging sequence (Butt et al., 1993), so the coding region remains well conserved even though flies and mammals are separated by 540 million years. Taken together, we suggest that DNA variation is much more likely to occur in the regulatory sequences of the gene for most natural lines, and not just our rover and sitter strains. In summary, both within- and between-species analyses show little DNA variation in the coding regions of foraging. Our data also demonstrate that foraging is an essential gene for development. The null mutation, as well as many transposable element insertions downstream of the start codon, resulted in lethality in the fly. This may explain why the coding sequence is under strong purifying selection. When a gene is vital to the survival of an animal, it is easier to acquire new functions, or alter existing functions, through altered regulation of the gene than through mutation of the coding sequence (Carroll, 2008). Accordingly, I focus here on the transcriptional regulation of the foraging gene.
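The two quantities used in this argument can be illustrated with a short sketch: percent identity between aligned sequences, and the conventional reading of the non-synonymous to synonymous rate ratio (dN/dS, often written ω), where values below 1 indicate purifying selection. This is not the PAML analysis; the function names and the toy peptide sequences are hypothetical. The final line reproduces the arithmetic for 2 variable sites out of 527, treated here as a simple fraction of sites.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions that are identical (no gap handling)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def selection_regime(omega):
    """Conventional interpretation of a dN/dS ratio."""
    if omega < 1:
        return "purifying"
    if omega > 1:
        return "positive"
    return "neutral"


# Toy aligned peptides (hypothetical, one substitution in five sites).
toy_identity = percent_identity("MKVLA", "MKVLG")

# 2 variable amino acid sites out of 527 in the common coding region.
conserved_fraction = round(100.0 * (527 - 2) / 527, 2)
```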

Promoter bashing.

Given our hypothesis that the pleiotropy of the foraging gene is achieved through differential transcriptional regulation, I set out to identify the differential functions of these promoters. In order to interrogate the functions of these promoters I cloned regions surrounding foraging TSSs and fused them to a reporter (as in Brenner et al., 1996; Arredondo et al., 2002). By this method, I identified differential expression patterns arising from each promoter. The tools generated were then used to manipulate the expression of the gene in a spatial- and temporal-specific manner.

for-pr-Gal4 construction. I selected regions 5’ to, and including, the TSSs of the four identified minimal promoters (fig. 1A). These regions were 4.9, 4.2, 3.7, and 2.4 kb for promoters 1-4, respectively. These sizes were chosen with consideration of restriction endonuclease sites, primer design, and non-overlapping exons. Thirteen sub-regions were cloned from the larger regions. This nested construct strategy has been successfully employed many times before (Park et al., 2000; Arredondo et al., 2001). Because position effects are an issue for Drosophila transgenesis (Kellum and Schedl, 1991), the design of our reporter constructs needed to avoid them. I constructed a vector with gypsy insulator sequences flanking the promoter and Gal4 segments (fig. 1B). This vector was constructed by replacing the eGFP sequence in pStinger (Barolo et al., 2000) with the Gal4 sequence from pMARTINI-Gal4 (Billeter and Goodwin, 2004). gypsy insulators are well known for their barrier element activity (Gdula et al., 1996). To further guard against position effects, I added the φC31 site-specific integration sequence to these constructs so that all reporter transgenes were incorporated into the same location of the fly’s genome (fig. 2A, Groth et al., 2004). Having all pr-Gal4s inserted in the same location and insulated with gypsy increased the likelihood of faithfully recapitulating the expression patterns arising from the regulatory sequences I included in each construct (Markstein et al., 2008; Billeter and Goodwin, 2004). PCR confirmed appropriate integration of the pr-Gal4 lines (fig. 2B). The P{CaryP}attP2 integration site was chosen for its high levels of inducible expression and low levels of leaky expression due to position effects (Groth et al., 2004; Markstein et al., 2008).

for-pr-Gal4 expression. All four for-pr-Gal4 lines carrying the largest fragment of for-pr DNA showed expression in varying cell types in different tissue systems in larvae (fig. 3-8). All four promoters expressed in the CNS, yet none overlapped in their specific cellular expression (fig. 3). for-pr1-4.7- and for-pr4-2.4-Gal4 both expressed in neurons. for-pr2-4.0- and for-pr3-3.3-Gal4 both expressed in glia. for-pr1-4.7-Gal4 expressed in neurons throughout the central brain and VNC (fig. 3A), whereas for-pr4-2.4-Gal4 expression was restricted to the optic lobes and discs (fig. 3D). for-pr1-4.7-Gal4 cells co-localized with anti-ELAV (fig. 7A-C). for-pr2-4.0-Gal4 expressed in the midline glia (fig. 3B). The midline glia, which are only present in the larvae, are crucial for midline axon guidance and nervous system morphogenesis (Jacobs, 2000). for-pr3-3.3-Gal4 expressed in the surface glia (fig. 3C). These for-pr3-3.3-Gal4 cells co-localized with anti-REPO (fig. 7G-I). The surface glia are made up of two distinct types, perineurial and subperineurial glia. These cells are likely the perineurial glia, given the size and number of cells in this pattern (Stork et al., 2012). The surface glia are important for blood-brain-barrier activities (Edwards and Meinertzhagen, 2011). The for-pr-Gal4s similarly drove expression in different regions of the larval gastric system. All were expressed in the gut, but each in a specific and unique set of cells (fig. 4). Expression of for-pr1-4.7-Gal4 was only evident in a few enteroendocrine cells (EEC) in the anterior of the midgut (fig. 4A). Given the location of these cells, they may be allatostatin-positive EEC (Veenstra, 2009). Future co-labeling experiments for allatostatin B and labial with anti-ASTB and anti-LAB will verify their identity. Release of endocrine molecules from EECs is important for maintaining gut homeostasis (Veenstra, 2009). for-pr2-4.0-Gal4 expressed in the adult midgut precursor cells (AMP) (fig. 4B). This cell cluster contains the intestinal stem cells and other support cells (fig. 4C). During preparation for pupal development, these cells begin to increase in number to repopulate the midgut with adult cells (Jiang and Edgar, 2009). This for-pr2-4.0-Gal4 expression co-localized with anti-DELTA, a marker for the AMP (fig. 7D-F). for-pr2-4.0-Gal4 also expressed in the ureter of the malpighian tubules. for-pr3-3.3-Gal4 had the broadest expression of the four drivers in the gastric system. for-pr3-3.3-Gal4 expressed in midgut anterior enterocytes (fig. 4D), the copper cells responsible for acid secretion (fig. 4E, Dubreuil, 2003), as well as the large flat cells of the middle midgut (fig. 4F).
These epithelia of the midgut are important for the secretion and absorption that aid digestion (Shanbhag and Tripathi, 2009). for-pr3-3.3-Gal4 drove expression in the circular and longitudinal muscle of the midgut and hindgut (fig. 4G, 4H). These muscles are necessary for peristaltic contraction in the gastric system. for-pr3-3.3-Gal4 also expressed in EEC further down the midgut from those labelled by for-pr1-4.7-Gal4 (fig. 4I). These cells are important for sending hormonal cues to the enterocytes to regulate digestion and absorption (Veenstra, 2009). Finally, for-pr3-3.3-Gal4 expressed in the primary cells of the malpighian tubules (fig. 4J). for-pr4-2.4-Gal4, on the other hand, expressed exclusively in the hindgut. The larval hindgut is separated into many distinct sections, each with differing functions. for-pr4-2.4-Gal4 expressed in the h3, h5d, h6d, hv and h7 regions of the hindgut (fig. 4K, 4L). These structures have different physiological functions, including peristaltic contractions (h3), ion and water transport (h5d and h6d), absorption (hv), and contractions for fecal waste (h7) (Murakami and Shiotsuki, 2001). Both for-pr2-4.0- and for-pr4-2.4-Gal4 expressed in the salivary gland imaginal ring (fig. 5A, 5D). The salivary imaginal ring develops into the adult salivary duct and gland during metamorphosis. for-pr3-3.3-Gal4 expressed in the developed salivary gland of the larvae (fig. 5B). for-pr3-3.3-Gal4 also expressed in the fat body (fig. 5C). for-pr3-3.3-Gal4 was also extensively expressed in body wall muscle (fig. 6A), whereas for-pr4-2.4-Gal4 expressed in dorsal denticles (fig. 6B), as well as the anterior and posterior spiracles (fig. 6C, 6D).

Mapping CREs. The characterization of the expression of the four larger for-pr-Gal4 lines served to map CREs to the locus. I was able to refine this mapping using a series of nested Gal4 lines generated from each of the 4 larger for-pr-Gal4 lines (fig. 1). The expression of each nested Gal4 line was a subset of the expression seen in the larger for-pr-Gal4. This lack of novel expression in the nested lines suggested a lack of any silencer sequences in the cloned regions. By moving to progressively smaller Gal4 lines, I inferred the location of a CRE from the loss of expression in the next smaller Gal4 line (fig. 8). I saw the best refinement in the case of for-pr3-3.3. A perineurial glia CRE mapped to an 850 bp region 1.7 kb 5’ to pr3’s TSS. A fat body CRE mapped to an 830 bp region 200 bp 5’ to pr3’s TSS. The remaining expression mapped to an 880 bp region between them. The lack of expression observed in the smallest of the Gal4 lines supports the efficacy of the gypsy insulator sequences and the site-specific integration site used in our study.
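The inference used here, assigning a CRE to the interval that is present in one construct but absent from the next smaller one, can be sketched as a small routine. The construct sizes and tissue sets below are hypothetical placeholders rather than the actual nested lines, and the sketch assumes each nested construct shares the same 3’ end at the TSS and that expression is only ever lost, never gained, as constructs shrink.

```python
def map_cres(constructs):
    """Map each tissue to the upstream interval (bp from TSS) holding its CRE.

    constructs: list of (upstream_extent_bp, set_of_expressing_tissues).
    A tissue lost between a larger and the next smaller construct is
    assigned to the interval (smaller_extent, larger_extent).
    """
    ordered = sorted(constructs, key=lambda c: c[0], reverse=True)
    # Sentinel: a zero-length construct drives no expression.
    ordered.append((0, set()))
    intervals = {}
    for (big, big_tissues), (small, small_tissues) in zip(ordered, ordered[1:]):
        for tissue in big_tissues - small_tissues:
            intervals[tissue] = (small, big)
    return intervals


# Hypothetical nested series for one promoter region.
nested = [
    (3300, {"perineurial glia", "midgut", "fat body"}),
    (2400, {"midgut", "fat body"}),
    (1000, {"fat body"}),
    (200, set()),
]
cre_map = map_cres(nested)
```

The resolution of the map is set by the spacing of the nested breakpoints, which is why the densest series gives the best refinement.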

Reliability of Gal4s. Promoter fusion Gal4s are not a direct measure of the expression of a gene. There are many examples of cloned Gal4s faithfully recapitulating a gene's endogenous expression pattern, as previously mentioned, but this is far from the “rule”. Direct measures of a gene's expression in tissue include RNA in situ hybridization and IHC. Together these techniques verify a gene's expression pattern. Another potential limitation of cloned Gal4s is that they remove a sequence segment from the context of the locus under investigation. This could result in a loss of expression due to the absence of an enhancer, or in a gain of expression due to the absence of a silencer. There is also the issue of regulatory elements in the neighbourhood of the insertion site. In these experiments I used gypsy insulator sequences and φC31 integrase to limit these neighbourhood effects.

anti-FOR shows non-specificity. A polyclonal anti-FOR antibody has been previously published (Belay et al., 2007) and was run with many controls. Now that we have a full genetic null mutation, I was able to test whether it showed any non-specific immunoreactivity. Staining our foraging null mutant with this antibody showed significant immunoreactivity, indicating that the antibody is non-specific (fig. 9). As an illustration, figure 9 shows three regions of the larval CNS. The staining seen in the null mutant SOG (fig. 9A), central brain (fig. 9C), and VNC (fig. 9E) was indistinguishable from wild type (fig. 9B, 9D, 9F). I also saw staining in other tissues that failed to uncover differences between the null mutant and wild type (data not shown). Co-localization of the pr-Gal4s with this anti-FOR antibody is, at present, not feasible. This antibody is faithful in western analysis (Ch. 1, fig. 10) but seems to be non-specific in larval immunohistochemical analyses with the current protocol. It is not uncommon for an antibody to work in one application and not another (Baker, 2015).

Support for pr-Gal4 expression patterns. Although I cannot verify that our pr-Gal4s recapitulate native foraging expression with our anti-FOR antibody, there is other evidence that they do. RT-PCR on dissected tissues, as well as data from Fly Atlas and modENCODE, support foraging expression in some tissues including the fat body, salivary glands, and trachea (data not shown; Chintapalli et al., 2007; Graveley et al., 2011). Recent RNA fluorescent in situ hybridization experiments, conducted in our lab, have also identified foraging transcripts in larval trachea (Ina Anreiter, unpublished data). In situ hybridization and an enhancer trap of foraging exon 6 support the expression in the optic lobes and eye discs seen with for-pr4-2.4-Gal4 (fig. 10). Enhancer traps in the same location also show expression in the anterior and posterior spiracles (data not shown).

Missing CREs. Our cloned for-pr-Gal4 expression patterns probably do not capture the majority of foraging’s expression. I saw no change in expression levels on whole-larval western blots when I expressed our foraging RNAi construct, UAS-Dcr2 ; UAS-forRNAi-exon7:8, with each of the four largest pr-Gal4 lines (fig. 11A, 11B). These whole-larval western blot patterns were indistinguishable from the controls. I showed that this RNAi construct was sufficient to eliminate all FOR bands when driven by a ubiquitous da-Gal4 (Ch. 1, fig. 13B). Thus I am forced to conclude that even though I have shown extensive expression patterns and multiple regions containing CREs, these expression patterns do not represent the majority of foraging’s expression pattern. The cloned pr-Gal4s only cover 15kb of the 40kb locus, leaving 25kb that could be cloned in the future. The modENCODE project has identified multiple transcription factor binding sites (TFBSs) that were not covered by our pr-Gal4 constructs (Nègre et al., 2011). There are multiple trithorax-like (GAGA factor), senseless, twist, huckebein, and Chinmo binding sites across the locus outside of our cloned regions (fig. 12). These missing CREs could represent elements that drive more expression in the already characterised cells, or elements that drive expression in additional, as yet uncharacterized, cells. I did find that the foraging BAC recapitulates the FOR western immunoreactivity of wild type larvae (Ch. 1, fig. 10), suggesting that the 40kb locus contains all necessary CREs. To discover the missing CREs, I recently generated a foraging::GFP BAC with recombineering, and am in the process of creating a foraging-Gal4 BAC. These lines will be critical for the identification of the remaining foraging expression patterns. Interestingly, ubiquitous knockdown of exon4:5 eliminates most of the wild type expression pattern on westerns (fig. 11C). This suggests that the vast majority of the wild-type protein is from a subset of the transcripts produced from for-pr1-4.7, for-pr2-4.0, and for-pr3-3.3. These transcripts only produce 2 protein isoforms, P1 and P3 (Ch. 2, fig. 4). We have generated a P1-Gal4 HR targeting line in the lab, and efforts are underway to mobilize this element to knock it in. Given the previously mentioned 25kb not covered by the cloned pr-Gal4s, an isoform-specific knock-in may uncover some of these remaining CREs. With this in mind, I went on to drive RNAi using the large pr-Gal4s in the hope that one or more of these lines contained CREs associated with the phenotypes of interest.

P1 or P3 protein isoforms are required for viability.

I previously established that foraging is necessary to make it through pupal development. Determining which isoform is required at which time in which tissue for survivorship to adulthood is of great interest. I first evaluated whether expression in the patterns of our pr-Gal4 lines is necessary for survivorship to adulthood. Crossing each of the cloned pr-Gal4 lines to our UAS-Dcr2 ; UAS-forRNAi-exon7:8 did not incur any lethality relative to controls. This suggested that either (1) our pr-Gal4s did not express in the relevant cells, (2) they did express in the relevant cells, but the expression was too weak, or (3) expression in a combination of the patterns of different pr-Gal4s was required. We have found that many other stronger and broader tissue-specific drivers were not sufficient to induce lethality (table 2). These include neuronal drivers, fat body drivers, muscle drivers, tracheal drivers, an ecdysone receptor driver, glial drivers, and more, none of which induced lethality. As a result, the tissue-specific requirements for viability remain, for now, a mystery. I then used our isoform-specific RNAi lines to investigate the lethality caused by loss of foraging function. Pharate adult lethality occurred when driving our isoform-specific RNAi, UAS-Dcr2 ; UAS-forRNAi-exon4:5, with the ubiquitous drivers da-Gal4 and tub-Gal4. This suggested that for-pr4-2.4 expression was not sufficient for survivorship. Although for-pr1-4.7, for-pr2-4.0, and for-pr3-3.3 produced many transcripts, the UAS-forRNAi-exon4:5 only targeted a subset of them. This further suggested that the P1 or P3 protein isoforms were necessary for survivorship. Additionally, the National Institute of Genetics in Kyoto, Japan created two independent inserts of a foraging isoform-specific RNAi, UAS-forRNAi10033R-1 and UAS-forRNAi10033R-2. They reported that crossing these lines to an act-Gal4 caused lethality, although they did not specify the stage.
This suggested that for-pr3-3.3 expression is not sufficient to get the flies through pupal development. Together, these data narrowed the candidates for lethality down to a subset of the transcripts derived from for-pr1-4.7 and for-pr2-4.0, which in turn only produce the P1 protein isoform. It was recently reported that expressing UAS-forRNAi10033R-1 with a UAS-Dcr2 ; hedgehog-Gal4 driver induced lethality (Swarup et al., 2015). This group also showed that UAS-forRNAi10033R-1 or UAS-forRNAi10033R-2 crossed to MS1096-Gal4 ; UAS-Dcr2, a wing disc driver, induced lethality. In both cases the stage of lethality was not mentioned. However, other reports suggested that altering hedgehog signaling in the wing discs during pupal development caused pharate adult lethality (Matusek et al., 2014; Blanco et al., 2009; Apidianakis et al., 2001). This was consistent with our foraging mutants. I hypothesize that foraging P1 protein is expressed in hedgehog-positive cells in the wing discs during pupal development, and that this expression is necessary for survival. Future studies will address this hypothesis. Manipulating one imaginal disc can have domino effects on the other discs and the developing pupae, and in severe cases cause lethality (John Ewer, personal communication).

As with the null mutant, ubiquitous and tissue specific expression of foraging RNAi did not have any effect on larval survivorship. This allows us to carry out behaviour and metabolic assays of the mid third instar larvae. As in Chapter 2, I am interested in how foraging influences larval foraging behaviour, food intake, and lipid metabolism.

P1 or P3 protein isoforms are required for foraging behaviour.

In Chapter 1, I reported that deleting foraging reduced foraging path length behaviour. In this chapter, I found that the pr-Gal4s driving UAS-Dcr2; UAS-forRNAi-exon7:8 were not sufficient to reduce larval path length behaviour (fig. 12A, 12B). To drive the RNAi, I used the largest Gal4 line from each of the four promoter groups, as these had the broadest expression. In all cases, the experimental line was not significantly different from its controls. Furthermore, over-expressing foraging with a UAS-forcDNA construct with each of the for-pr-Gal4s did not significantly affect path length behaviour (fig. 12C). In contrast, the ubiquitous knockdown of all foraging gene products did yield a significant reduction in path length behaviour (Ch. 2, fig. 13A). I concluded that the CREs cloned into the for-pr-Gal4s were not sufficient to affect path length behaviour. As above, this suggested that these for-pr-Gal4s did not express in the relevant cells, expressed at too low a level in the correct cells, or needed to be expressed in a combination of these cells. Previous experiments were also unable to modify foraging path length behaviour with a variety of tissue-specific drivers, which included pan-neuronal, PNS, neuroendocrine cell, and muscle drivers (Belay, 2010; Douglas, unpublished data; Hughson, unpublished data). I then used an isoform-specific RNAi construct, UAS-Dcr2; UAS-forRNAi-exon4:5, to interrogate promoter-specific function. Ubiquitously reducing the expression of a subset of foraging transcripts reduced path length behaviour (fig. 13B). This RNAi line targeted only 4 transcripts, which are collectively expressed from for-pr1-4.7, for-pr2-4.0, and for-pr3-3.3. These 4 transcripts code for only 2 different protein isoforms, P1 and P3. A deletion that removes the first protein coding exon of P1 and P3, Df(2L)forf0e0, also failed to complement the null mutant for path length behaviour (fig. 13C). These data suggest that for-pr4-2.4 is not sufficient to produce wild-type path length behaviour and, more importantly, that only P1 or P3 are necessary for foraging path length behaviour. Here, I have shown that only 4 of the 21 transcripts, and only 2 of the 9 protein isoforms, are relevant for foraging behaviour. This is a significant narrowing of the field of possibilities. With the previously mentioned foraging-Gal4 BAC and knock-in P1-Gal4, we will be able to conduct a much more restricted investigation into the tissue requirements for foraging behaviour.

Mapped CREs were insufficient in regulating food intake behaviour.

I had previously shown that the foraging null mutant has lower food intake than both wild type and the null carrying a rescue foraging-BAC. Phenotypic comparisons with the natural strains implied that rovers had lower expression in the cells relevant for food intake. Driving RNAi with our pr-Gal4 lines did not reduce food intake behaviour. As with the foraging path length experiments, these Gal4s may not express in the tissues relevant for food intake behaviour. As in the case of foraging behaviour, the previously mentioned foraging-Gal4 BAC may identify additional expression that is sufficient to alter food intake behaviour.

Fat body expression of foraging autonomously regulates DAG/TAG levels.

Since I identified a fat body CRE in our promoter bashing, and mammalian foraging has previously been characterized for its role in adipose tissue, I wanted to test if foraging acts autonomously in the fat body to influence DAG/TAG levels. In the previous chapter, I saw that altered foraging expression resulted in altered lipid levels. In Drosophila melanogaster, triglycerides are the primary lipid storage molecule, and are assembled from diglycerides in the fat body of the larvae (Canavoso et al., 2001). Our expression analysis identified a fat body CRE in the locus. I then asked whether foraging acted locally, in the fat body, to alter lipid metabolism. Reducing foraging levels with a fat body specific driver, Lsp2-Gal4>UAS-forv38320, increased lipid levels (fig. 16A). Reducing foraging expression in fat bodies, Cg-Gal4>UAS-forv38320, also resulted in a reduction of total body size (fig. 16B).

Conclusion:

I identified a diverse expression pattern associated with the foraging locus. Each of the cloned regions upstream of a TSS drove distinct expression in multiple tissue systems. The diversity of expression supports our use of site-specific integration and gypsy insulator sequences, since there was no common neighbourhood effect. The observed expression occurred in many tissues in which foraging has previously been detected. Previous reports support the expression of foraging in tissues such as the fat body, trachea, and salivary glands (Chintapalli et al., 2007). They also support expression in less homogeneous tissues such as the central nervous system and the gastric system. Our isoform-specific RNAi and promoter bashing strategies allowed us to uncover isoform-specific functions and regulatory elements within the locus. Our data showed that the foraging P1 or P3 isoforms were required for survivorship during the first three quarters of pupal development. However, I was not able to show conclusively where foraging is necessary for survivorship. In Chapter 2, I deduced independent regulation of foraging behaviour and food intake behaviour by comparing the null mutants and the natural alleles. In the present chapter, I identified a subset of the isoforms involved, P1 or P3. I was unable to localize a CRE or tissue controlling foraging behaviour or food intake behaviour. I did identify a fat body CRE within the locus, and manipulating foraging locally in the fat body altered lipid levels. This is a significant advance in filling in the genotype-phenotype map of the foraging gene and its associated feeding-related phenotypes. I have used the concept of “molecular gene pleiotropy”, where the gene, and not the mutation, is the genetic unit of pleiotropic action (Paaby and Rockman, 2013).
The independent regulation of foraging's associated larval phenotypes supported a model of genuine pleiotropy (Pyeritz, 1989; Gruneberg, 1938; Paaby and Rockman, 2013). So far, I cannot rule out that the phenotypes discussed are influenced by the same, single protein isoform expressed in different tissues. Further experiments into the isoform and tissue requirements will elucidate these distinctions.

References:

Andersson, L., and Georges, M. (2004). Domestic-animal genomics: deciphering the genetics of complex traits. Nat. Rev. Genet. 5, 202–212.

Apidianakis, Y., Grbavec, D., Stifani, S., and Delidakis, C. (2001). Groucho mediates a Ci-independent mechanism of hedgehog repression in the anterior wing pouch. Development 128, 4361–4370.

Arredondo, J.J., Ferreres, R.M., Maroto, M., Cripps, R.M., Marco, R., Bernstein, S.I., and Cervera, M. (2001). Control of Drosophila paramyosin/miniparamyosin gene expression: differential regulatory mechanism for muscle-specific transcription. J. Biol. Chem. 276, 8278–8287.

Baker, M. (2015). Reproducibility crisis: Blame it on the antibodies. Nature 521, 274–276.

Barolo, S., Carver, L.A., and Posakony, J.W. (2000). GFP and β-galactosidase transformation vectors for promoter/enhancer analysis in Drosophila. BioTechniques 29, 726–732.

Belay, A.T., Scheiner, R., So, A.K.-C., Douglas, S.J., Chakaborty-Chatterjee, M., Levine, J.D., and Sokolowski, M.B. (2007). The foraging gene of Drosophila melanogaster: Spatial-expression analysis and sucrose responsiveness. J. Comp. Neurol. 504, 570–582.

Billeter, J.-C., and Goodwin, S.F. (2004). Characterization of Drosophila fruitless-Gal4 transgenes reveals expression in male-specific fruitless neurons and innervation of male reproductive structures. J. Comp. Neurol. 475, 270–287.

Blanco, J., Seimiya, M., Pauli, T., Reichert, H., and Gehring, W.J. (2009). Wingless and Hedgehog signaling pathways regulate orthodenticle and eyes absent during ocelli development in Drosophila. Developmental Biology 329, 104–115.

Brand, A.H., and Perrimon, N. (1993). Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development 118, 401–415.

Brenner, R., and Atkinson, N. (1996). Developmental- and eye-specific transcriptional control elements in an intronic region of a Ca(2+)-activated K+ channel gene. Dev. Biol. 177, 536–543.

Brenner, R., Thomas, T.O., Becker, M.N., and Atkinson, N.S. (1996). Tissue-specific expression of a Ca(2+)-activated K+ channel is controlled by multiple upstream regulatory elements. J. Neurosci. 16, 1827–1835.

Britten, R.J., and Davidson, E.H. (1971). Repetitive and non-repetitive DNA sequences and a speculation on the origins of evolutionary novelty. The Quarterly Review of Biology 46, 111–138.

Butt, E., Geiger, J., Jarchau, T., Lohmann, S.M., and Walter, U. (1993). The cGMP-dependent protein kinase-gene, protein, and function. Neurochem. Res. 18, 27–42.

Canavoso, L.E., Jouni, Z.E., Karnas, K.J., Pennington, J.E., and Wells, M.A. (2001). Fat metabolism in insects. Annual Review of Nutrition 21, 23–46.

Carroll, S.B. (2000). Endless Forms: The Evolution of Gene Regulation and Morphological Diversity. Cell 101, 577–580.

Carroll, S.B. (2008). Evo-devo and an expanding evolutionary synthesis: A genetic theory of morphological evolution. Cell 134, 25–36.

Champely, S. (2012). pwr: Basic functions for power analysis.

Chintapalli, V.R., Wang, J., and Dow, J.A.T. (2007). Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat. Genet. 39, 715–720.

Clark, A.G., Eisen, M.B., Smith, D.R., Bergman, C.M., Oliver, B., Markow, T.A., Kaufman, T.C., Kellis, M., Gelbart, W., Iyer, V.N. et al. (2007). Evolution of genes and genomes on the Drosophila phylogeny. Nature 450, 203–218.

Dubreuil, R.R. (2004). Copper cells and stomach acid secretion in the Drosophila midgut. Int. J. Biochem. Cell Biol. 36, 745–752.

Edwards, T.N., and Meinertzhagen, I.A. (2010). The functional organisation of glia in the adult brain of Drosophila and other insects. Prog. Neurobiol. 90, 471–497.

Fox, J. (2003). Effect displays in R for generalised linear models. Journal of Statistical Software 8, 1–27.

Fox, J., and Weisberg, S. (2011). An R companion to applied regression (Thousand Oaks CA: Sage).

Gdula, D.A., Gerasimova, T.I., and Corces, V.G. (1996). Genetic and molecular analysis of the gypsy chromatin insulator of Drosophila. PNAS 93, 9378–9383.

Goodwin, S.F., Taylor, B.J., Villella, A., Foss, M., Ryner, L.C., Baker, B.S., and Hall, J.C. (2000). Aberrant splicing and altered spatial expression patterns in fruitless mutants of Drosophila melanogaster. Genetics 154, 725–745.

Graveley, B.R., Brooks, A.N., Carlson, J.W., Duff, M.O., Landolin, J.M., Yang, L., Artieri, C.G., van Baren, M.J., Boley, N., Booth, B.W. et al. (2011). The developmental transcriptome of Drosophila melanogaster. Nature 471, 473–479.

Greenspan, R.J., and Ferveur, J.F. (2000). Courtship in Drosophila. Annu. Rev. Genet. 34, 205–232.

Groth, A.C., Fish, M., Nusse, R., and Calos, M.P. (2004). Construction of transgenic Drosophila by using the site-specific integrase from phage phiC31. Genetics 166, 1775–1782.

Gruneberg, H. (1938). An analysis of the “pleiotropic” effects of a new lethal mutation in the rat (Mus norvegicus). Proceedings of the Royal Society of London. Series B, Biological Sciences 125, 123–144.

Hodgkin, J. (1998). Seven types of pleiotropy. Int. J. Dev. Biol. 42, 501–505.

Hofmann, H.A. (2003). Functional genomics of neural and behavioral plasticity. J. Neurobiol. 54, 272–282.

Jacobs, J.R. (2000). The midline glia of Drosophila: a molecular genetic model for the developmental functions of glia. Prog. Neurobiol. 62, 475–508.

Jiang, H., and Edgar, B.A. (2009). EGFR signaling regulates the proliferation of Drosophila adult midgut progenitors. Development 136, 483–493.

Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C. et al. (2012). Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649.

Kellum, R., and Schedl, P. (1991). A position-effect assay for boundaries of higher order chromosomal domains. Cell 64, 941–950.

Lee, Y.S., and Carthew, R.W. (2003). Making a better RNAi vector for Drosophila: use of intron spacers. Methods 30, 322–329.

Lehman, D.A., Patterson, B., Johnston, L.A., Balzer, T., Britton, J.S., Saint, R., and Edgar, B.A. (1999). Cis-regulatory elements of the mitotic regulator, string/Cdc25. Development 126, 1793–1803.

Mackay, T.F.C., and Anholt, R.R.H. (2006). Of flies and man: Drosophila as a model for human complex traits. Annu. Rev. Genomics. Hum. Genet. 7, 339–367.

Mackay, T.F.C., Richards, S., Stone, E.A., Barbadilla, A., Ayroles, J.F., Zhu, D., Casillas, S., Han, Y., Magwire, M.M., Cridland, J.M. et al. (2012). The Drosophila melanogaster genetic reference panel. Nature 482, 173–178.

Markstein, M., Pitsouli, C., Villalta, C., Celniker, S.E., and Perrimon, N. (2008). Exploiting position effects and the gypsy retrovirus insulator to engineer precisely expressed transgenes. Nat. Genet. 40, 476–483.

Matusek, T., Wendler, F., Polès, S., Pizette, S., D’Angelo, G., Fürthauer, M., and Thérond, P.P. (2014). The ESCRT machinery regulates the secretion and long-range activity of Hedgehog. Nature 516, 99–103.

Medawar, P.B. (1952). An unsolved problem of biology. (University College London, H.K. Lewis & Co. Ltd., London).

Murakami, R., and Shiotsuki, Y. (2001). Ultrastructure of the hindgut of Drosophila larvae, with special reference to the domains identified by specific gene expression patterns. J. Morphol. 248, 144–150.

Nègre, N., Brown, C.D., Ma, L., Bristow, C.A., Miller, S.W., Wagner, U., Kheradpour, P., Eaton, M.L., Loriaux, P., Sealfon, R. et al. (2011). A cis-regulatory map of the Drosophila genome. Nature 471, 527–531.

Okada, T., Sakai, T., Murata, T., Kako, K., Sakamoto, K., Ohtomi, M., Katsura, T., and Ishida, N. (2001). Promoter analysis for daily expression of Drosophila timeless gene. Biochem. Biophys. Res. Commun. 283, 577–582.

Orr, H.A. (2000). Adaptation and the cost of complexity. Evolution 54, 13–20.

Paaby, A.B., and Rockman, M.V. (2013). The many faces of pleiotropy. Trends in Genetics 29, 66–73.

Park, J.H., Helfrich-Förster, C., Lee, G., Liu, L., Rosbash, M., and Hall, J.C. (2000). Differential regulation of circadian pacemaker output by separate clock genes in Drosophila. Proc. Natl. Acad. Sci. U.S.A. 97, 3608–3613.

Pavlicev, M., and Wagner, G.P. (2012). A model of developmental evolution: selection, pleiotropy and compensation. Trends Ecol. Evol. (Amst.) 27, 316–322.

Plate, L. (1910). Genetics and evolution, pp. 536-610 in Festschrift zum sechzigsten Geburtstag Richard Hertwigs. Fischer, Jena, Germany.

Powell, J. (1997). Progress and prospects in evolutionary biology : The Drosophila model (Oxford Series in Ecology & Evolution, Oxford University Press).

Pyeritz, R.E. (1989). Pleiotropy revisited: molecular explanations of a classic concept. American Journal of Medical Genetics 34, 124–134.

R Core Team (2013). R: A language and environment for statistical computing (Vienna, Austria: R Foundation for Statistical Computing).

Re, A.C.D. (2013). compute.es: Compute Effect Sizes.

Reaume, C.J., and Sokolowski, M.B. (2009). cGMP-dependent protein kinase as a modifier of behaviour. In cGMP: Generators, effectors and therapeutic implications, H.H.H.W. Schmidt, F. Hofmann, and J.-P. Stasch, eds. (Springer Berlin Heidelberg), pp. 423–443.

Ryner, L.C., Goodwin, S.F., Castrillon, D.H., Anand, A., Villella, A., Baker, B.S., Hall, J.C., Taylor, B.J., and Wasserman, S.A. (1996). Control of male sexual behavior and sexual orientation in Drosophila by the fruitless gene. Cell 87, 1079–1089.

Sarkar, D. (2008). Lattice: Multivariate data visualization with R (New York: Springer).

Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., Preibisch, S., Rueden, C., Saalfeld, S., Schmid, B. et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat. Meth. 9, 676–682.

Schlossmann, J., and Desch, M. (2009). cGK substrates. In cGMP: Generators, effectors and therapeutic implications, H.H.H.W. Schmidt, F. Hofmann, and J.-P. Stasch, eds. (Springer Berlin Heidelberg), pp. 163–193.

Shanbhag, S., and Tripathi, S. (2009). Epithelial ultrastructure and cellular mechanisms of acid and base transport in the Drosophila midgut. J. Exp. Biol. 212, 1731–1744.

Smith, J.A., Francis, S.H., Walsh, K.A., Kumar, S., and Corbin, J.D. (1996). Autophosphorylation of type Iβ cGMP-dependent protein kinase increases basal catalytic activity and enhances allosteric activation by cGMP or cAMP. J. Biol. Chem. 271, 20756–20762.

Sokolowski, M.B. (2001). Drosophila: Genetics meets behaviour. Nat. Rev. Genet. 2, 879–890.

Stern, D.L. (2000). Perspective: Evolutionary developmental biology and the problem of variation. Evolution 54, 1079–1091.

Stork, T., Bernardos, R., and Freeman, M.R. (2012). Analysis of glial cell development and function in Drosophila. Cold Spring Harb Protoc 2012, 1–17.

Swarup, S., Pradhan-Sundd, T., and Verheyen, E.M. (2015). Genome-wide identification of phospho-regulators of Wnt signaling in Drosophila. Development 142, 1502–1515.

Thomas, T., Wang, B., Brenner, R., and Atkinson, N.S. (1997). Novel embryonic regulation of Ca(2+)-activated K+ channel expression in Drosophila. Invert. Neurosci. 2, 283–291.

Veenstra, J.A. (2009). Peptidergic paracrine and endocrine cells in the midgut of the fruit fly maggot. Cell Tissue Res. 336, 309–323.

Warnes, G.R., Bolker, B., Bonebakker, L., Gentleman, R., Liaw, W.H.A., Lumley, T., Maechler, M., Magnusson, A., Moeller, S., Schwartz, M. et al. (2013). gplots: Various R programming tools for plotting data.

Williams, G.C. (1957). Pleiotropy, Natural Selection, and the Evolution of Senescence. Evolution 11, 398–411.

Williams, J.A., and Sehgal, A. (2001). Molecular components of the circadian system in Drosophila. Annu. Rev. Physiol. 63, 729–755.

Yang, Z. (2007). PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol 24, 1586–1591.

Chapter 3 Tables

Table 1: Selection analysis of foraging coding sequence using PAML

A. The log-likelihood and parameter estimates for multiple selection models of an alignment of kinase domain of foraging for the 12 sequenced Drosophila species. B. Likelihood ratio tests of the log likelihood in A, and their associated P-values.

A.

Model                    p   ln L       Estimates of parameters

Random-sites:
M1a: nearly neutral      2   -6044.97   p0=0.99886, p1=0.00114, ω0=0.01298, ω1=1.00000
M2a: positive selection  4   -6044.97   p0=0.99886, p1=0.00114, p2=0.00000, ω0=0.01298, ω1=1.00000, ω2=40.73024
M7: β                    2   -6023.12   p=0.09841, q=5.43934
M8a: β and ω=1           3   -6023.12   p=0.09841, q=5.43934, p0=1.00000, ω=1.00000
M8: β and ω              4   -6023.12   p=0.09841, q=5.43938, p0=1.00000, ω=2.35694

Branch-sites: foreground = D. melanogaster
Model A                  4   -6044.97   p0=0.99886, p1=0.00114, ω0=0.01298, ω1=1.00000, ω2=1.00000
Model A null             3   -6044.97   p0=0.99886, p1=0.00114, ω0=0.01298, ω1=1.00000, ω2=1.00000

Branch-sites: foreground = melanogaster subgroup
Model A                  4   -6044.97   p0=0.99886, p1=0.00114, ω0=0.01298, ω1=1.00000, ω2=1.00000
Model A null             3   -6044.97   p0=0.99886, p1=0.00114, ω0=0.01298, ω1=1.00000, ω2=1.00000

Branch-sites: foreground = melanogaster group
Model A                  4   -6044.97   p0=0.99886, p1=0.00114, ω0=0.01298, ω1=1.00000, ω2=1.00000
Model A null             3   -6044.97   p0=0.99886, p1=0.00114, ω0=0.01298, ω1=1.00000, ω2=1.00000

B.

Tree                                  Compared models          df   2ΔlnL     P-value

Unrooted phylogeny                    M2a-M1a                  2    0.00000   1.00000
Unrooted phylogeny                    M8-M7                    2    0.00105   0.99947
Unrooted phylogeny                    M8-M8a                   1    0.00010   0.99186
foreground = D. melanogaster          Model A-Model A (null)   1    0.00000   0.99840
foreground = melanogaster subgroup    Model A-Model A (null)   1    0.00000   1.00000
foreground = melanogaster group       Model A-Model A (null)   1    0.00000   0.99840
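The likelihood ratio tests in part B can be illustrated with a minimal sketch, assuming the standard chi-square approximation in which 2ΔlnL is compared against a chi-square distribution with df degrees of freedom. The function names below are hypothetical and are not taken from PAML or from the analysis scripts used in this thesis. Because the table reports rounded statistics, P-values recomputed from the tabulated 2ΔlnL can differ slightly from the tabulated P-values.

```python
import math


def lrt_statistic(lnl_alternative: float, lnl_null: float) -> float:
    """Return 2 * (lnL_alternative - lnL_null) for a pair of nested models."""
    return 2.0 * (lnl_alternative - lnl_null)


def lrt_p_value(two_delta_lnl: float, df: int) -> float:
    """Upper-tail chi-square probability for a likelihood ratio statistic.

    Closed forms are used, so this sketch only handles df = 1 and df = 2,
    the two values that appear in Table 1B.
    """
    if two_delta_lnl < 0:
        raise ValueError("2*delta(lnL) must be non-negative")
    if df == 1:
        # Chi-square survival function with 1 degree of freedom
        return math.erfc(math.sqrt(two_delta_lnl / 2.0))
    if df == 2:
        # Chi-square survival function with 2 degrees of freedom
        return math.exp(-two_delta_lnl / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# M2a versus M1a, using the ln L values from Table 1A
stat_m2a_m1a = lrt_statistic(-6044.97, -6044.97)
p_m2a_m1a = lrt_p_value(stat_m2a_m1a, df=2)

# M8 versus M7, using the (rounded) 2*delta(lnL) from Table 1B
p_m8_m7 = lrt_p_value(0.00105, df=2)

print(f"M2a-M1a: 2dlnL = {stat_m2a_m1a:.5f}, P = {p_m2a_m1a:.5f}")
print(f"M8-M7:   P = {p_m8_m7:.5f}")
```

In every comparison the statistic is close to zero, so the more complex model does not fit better than its null. This is consistent with the table's lack of support for positive selection on the kinase domain.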

Table 2: Gal4>RNAi lethality screen

Gal4 lines used in a lethality screen driving UAS-Dcr2; UAS-forRNAi-exon7:8. The name of each driver, its expression, and viability when driving foraging RNAi are listed. Data provided by Ina Anreiter, Lydia To, Aaron Allen, Bryon Hughson, and Dr. Jeff Dason.

Driver        Expression        Viability

tub-GAL4      ubiquitous        lethal
da-GAL4       ubiquitous        lethal
act-GAL4      ubiquitous        lethal

24B-GAL4      muscle            viable
Mef2-GAL4     muscle            viable
MHC-GAL4      muscle            viable
c57-GAL4      muscle            viable

Repo-GAL4     glia              viable
Gli-GAL4      glia              viable
46F-GAL4      glia              viable

Elav-GAL4     neurons           viable
nsyb-GAL4     neurons           viable
OK6-GAL4      neurons           viable
ppk-GAL4      neurons           viable
ppk1.9-GAL4   neurons           viable
smid-GAL4     neurons           viable

btl-GAL4      trachea           viable
DSRF-GAL4     trachea           viable
14D03-GAL4    trachea           viable

nos-GAL4      maternal          viable
ETH-GAL4      Inka cells        viable
en-GAL4       patterning        viable
akh-GAL4      corpora cardiaca  viable

aug21-GAL4    multi-tissue      viable
B119-GAL4     multi-tissue      viable
c563-GAL4     multi-tissue      viable
EcR-GAL4      multi-tissue      viable
esg-GAL4      multi-tissue      viable

Lsp2-GAL4     fat body          viable
ppl-GAL4      fat body          viable
c564-GAL4     fat body          viable
r4-GAL4       fat body          viable
Cg-GAL4       fat body          viable

pr1-GAL4      for-GAL4          viable
pr2-GAL4      for-GAL4          viable
pr3-GAL4      for-GAL4          viable
pr4-GAL4      for-GAL4          viable
eeEI-5        for-GAL4          viable

Chapter 3 Figures

Figure 1: Cloning of for-pr-Gal4s

A. Schematic of cloned nested regions (coloured bars above locus) of the foraging locus (blue) used in the promoter bashing analysis. The regions up to 5kb upstream of and 300bp downstream of the TSS for the four identified minimal promoters were cloned into a gypsy insulated Gal4 vector. B. Example of one of the cloned for-pr-Gal4 constructs. The for-pr-Gal4 segment is flanked by gypsy insulator sequences. An attB site-specific recombination sequence from φC31 was added to the vector.

Figure 2: Site-specific integration of for-pr-Gal4s

A. Diagram depicting the mechanism of φC31 integrase system in Drosophila. Site-specific integration of the pr-Gal4 constructs was used to control for position effects during transgenesis. B. PCR confirmation of the four largest pr-Gal4 integrations. There was a positive integration event in all lines.

Figure 3: for-pr-Gal4s expression in the 3rd instar larval CNS

Immunohistochemical analysis of pr-Gal4 driving UAS-mCD8::GFP in the larval CNS and stained with anti-GFP. A. for-pr1-4.7-Gal4 expressed in neurons throughout the VNC and brain lobes. B. for-pr2-4.0-Gal4 expressed in midline glia in the VNC. C. for-pr3-3.3-Gal4 expressed in the perineurial surface glia of the CNS and PNS. D. for-pr4-2.4-Gal4 expressed in the optic lobes of the CNS, the eye imaginal discs, and the leg imaginal discs.

Figure 4: for-pr-Gal4 expression in the 3rd instar larval gastric system

Immunohistochemical analysis of pr-Gal4 driving UAS-mCD8::GFP in the larval gastric system and stained with anti-GFP. A. for-pr1-4.7-Gal4 expressed in approximately 10 enteroendocrine cells in the anterior portion of the larval midgut. B. for-pr2-4.0-Gal4 expressed in the adult midgut precursor cells (AMP) throughout the midgut. C. Magnified AMP clusters of for-pr2-4.0-Gal4. D. for-pr3-3.3-Gal4 expressed in the anterior midgut enterocytes. E. for-pr3-3.3-Gal4 expressed in the copper cells of the acid region of the midgut. F. for-pr3-3.3-Gal4 expressed in the large flat cells of the middle midgut. G. for-pr3-3.3-Gal4 expressed in the muscle of the midgut. H. for-pr3-3.3-Gal4 expressed in the muscle of the hindgut. I. for-pr3-3.3-Gal4 expressed in enteroendocrine cells of the midgut. J. for-pr3-3.3-Gal4 expressed in the distal region of the anterior larval Malpighian tubules. K. for-pr4-2.4-Gal4 expressed in the h3, h5d, and hv regions of the larval hindgut. L. for-pr4-2.4-Gal4 expressed in the h5d, h6d, hv, and h7 regions of the larval hindgut.

Figure 5: for-pr-Gal4 expression in salivary glands and fat body

Immunohistochemical analysis of pr-Gal4 driving UAS-mCD8::GFP in the salivary glands and fat body and stained with anti-GFP. A. for-pr2-4.0-Gal4 expressed in the salivary gland imaginal ring. B. for-pr3-3.3-Gal4 expressed in the larval salivary gland. C. for-pr3-3.3-Gal4 expressed in the larval fat body. D. for-pr4-2.4-Gal4 expressed in the salivary gland imaginal ring.

Figure 6: pr-Gal4 expression in carcass

Immunohistochemical analysis of pr-Gal4 driving UAS-GFP::nls in the larval carcass and stained with anti-GFP. A. for-pr3-3.3-Gal4 expressed in the body wall muscle of the larva. B. for-pr4-2.4-Gal4 expressed in the denticles of the larval cuticle. C. for-pr4-2.4-Gal4 expressed in the anterior spiracles. D. for-pr4-2.4-Gal4 expressed in the posterior spiracles.

Figure 7: for-pr-Gal4 co-labeling

Immunohistochemical analysis of co-localized pr-Gal4 driving UAS-GFP in the larval carcass. A-C. Expression of for-pr1-4.7-Gal4 driving UAS-mCD8::GFP in the larval CNS co-localized with nuclear anti-ELAV. D-F. Expression of for-pr2-4.0-Gal4 driving UAS-mCD8::GFP in the larval midgut co-localized with cytoplasmic anti-DELTA. G-I. Expression of for-pr3-3.3-Gal4 driving UAS-GFP::nls in the larval CNS co-localized with nuclear anti-REPO.

Figure 8: CREs mapped back onto the foraging locus

In the centre of the figure is the region of the foraging gene (blue bars) corresponding to the cloned pr-Gal4 regions (multi-coloured above the locus). Cloned regions are colour coded and are aligned below the locus. The dashed gray lines correspond to the delineation points of the nested constructs. Mapped larval expression is below the locus. The decimal numbers within each promoter refer to the internal segment to which the expression was mapped.

Figure 9: Anti-FOR IHC of foraging null mutant

Immunohistochemical analysis of fornull and wild-type in the larval CNS and stained with anti-FOR. A. anti-FOR staining of larval sub-oesophageal ganglia (SOG) in fornull mutants. B. anti-FOR staining of larval SOG in wild-type larvae. C. anti-FOR staining of larval central brain in fornull mutants. D. anti-FOR staining of larval central brain in wild-type larvae. E. anti-FOR staining of larval VNC in fornull mutants. F. anti-FOR staining of larval VNC in wild-type larvae.

(B, D, and F data provided by Dr. Amsale Belay, thesis 2009)

Figure 10: in-situ and enhancer trap expression

A. Schematic of the foraging locus with the enhancer trap, forNH48, and the in-situ probe target locations. B. β-galactosidase in-situ hybridization in 3rd instar larval CNS. An isoform-specific probe targeting transcripts of for-pr4-2.4 was used. Staining identified expression patterns in the optic lobes, eye discs, and antennal discs. C. Enhancer trap expression pattern of forNH48 in larval CNS. forNH48 expresses in the optic lobes and eye imaginal discs.

(B and C data provided by Dr. Amsale Belay)

Figure 11: for-pr-Gal4>RNAi western blot analysis

A. Western blot analysis of pr-Gal4s driving foraging RNAi. Blots were probed with a common coding anti-FOR antibody and an anti-ACTIN antibody. for-pr1-4.7-, 3-, and 4-Gal4 driving UAS-Dcr2; UAS-forRNAi-exon10:11:12 were not different from controls. (n = 3 blots with 30 individuals per homogenate)

B. Western blot analysis of for-pr2-4.0-Gal4 driving foraging RNAi. Blots were probed with a common coding anti-FOR antibody and an anti-ACTIN antibody. for-pr2-4.0-Gal4 driving UAS-Dcr2; UAS-forRNAi-exon10:11:12 was not different from controls. (n = 3 blots with 30 individuals per homogenate)

Figure 12: modENCODE TFBSs aligned to the foraging locus.

Regions that were cloned are highlighted: for-pr1-4.7 in purple, for-pr2-4.0 in green, for-pr3-3.3 in red, and for-pr4-2.4 in orange. Although some TFBSs lie within the cloned regions, multiple TFBSs lie outside them. These modENCODE data were collected using extracts from embryos and, as such, represent a set of possible TFBSs (Nègre et al., 2011).

[Panels A-C; y-axis: Path length (mm)]

Figure 13: pr-Gal4 foraging behaviour

A. pr-Gal4 driving UAS-Dcr2; UAS-forRNAi-exon7:8 in larval path length assay. for-pr1-4.7, for-pr3-3.3, and for-pr4-2.4 driving RNAi were not significantly different from their controls. Data plotted as means ± 95% confidence intervals. B. for-pr2-4.0-Gal4 driving UAS-Dcr2; UAS-forRNAi-exon7:8 in larval path length assay. for-pr2-4.0 driving RNAi was not significantly different from its controls. Data plotted as means ± 95% confidence intervals. C. pr-Gal4 driving a UAS-forcDNA in larval path length assay. None of the pr-Gal4s driving cDNA was significantly different from either of its controls. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.3 table S1)
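The "means ± 95% confidence intervals" plotted in these figures can be illustrated with a minimal sketch. The thesis does not state how the intervals were calculated, so the normal approximation used here is an assumption (a t-based interval would be wider for small samples). The function name and the path length values are hypothetical and are not data from this study.

```python
import math
import statistics


def mean_ci95(values):
    """Return (mean, lower, upper) for an approximate 95% confidence interval.

    Assumption: normal approximation, mean +/- z * SEM with z taken from the
    standard normal distribution.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.975)     # two-sided 95% quantile
    half_width = z * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical path lengths in mm, for illustration only
path_lengths_mm = [40.0, 42.0, 44.0, 46.0, 48.0]
mean, lower, upper = mean_ci95(path_lengths_mm)
print(f"mean = {mean:.2f} mm, 95% CI = [{lower:.2f}, {upper:.2f}] mm")
```

Intervals that overlap heavily between an experimental line and its controls are consistent with a non-significant difference, though the formal tests are those reported in the supplementary tables.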

[Panels A-C; y-axis of the plots: Path length (mm)]

Figure 14: Isoform-specific RNAi foraging behaviour

A. A schematic of the foraging gene and associated isoform-specific tools. Above the locus are the breakpoints of the Df(2L)forf0e0 deletion. Below the locus is the target sequence of an isoform-specific RNAi construct generated in the lab. B. da-Gal4 driving forRNAi-exon4:5 in larval path length assay. Path lengths of larvae with da-Gal4 driving isoform-specific RNAi were significantly shorter than those of controls. Data plotted as means ± 95% confidence intervals. C. Crossing Df(2L)forf0e0 to the foraging null mutant fails to complement larval path length behaviour. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.3 table S2)

Figure 15: for-pr-Gal4>RNAi food intake

for-pr-Gal4 driving UAS-Dcr2; UAS-forRNAi-exon7:8 in larval food intake assay. No for-pr-Gal4 was sufficient to reduce food intake. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.3 table S3)

[Panel A y-axis: Lipids (µg glycerol/mg protein); panel B y-axis: Area (pixels^2)]

Figure 16: Fat body>RNAi lipid levels

A. Decreasing foraging expression in larval fat bodies increases fat levels. Fat body specific Lsp2-Gal4 was used to drive the expression of a foraging common coding RNAi (v38320 from VDRC). Dissected fat bodies were used in this analysis. Data plotted as means ± 95% confidence intervals. B. Decreasing foraging expression in larval fat bodies decreases the overall size of the larvae. Fat body enriched Cg-Gal4 was used to drive the expression of a foraging common coding RNAi (v38320 from VDRC). Areas were quantified 48, 72, and 96 hours after egg hatching. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.3 table S4) (data provided by Dr. Scott Douglas)

Chapter 4

Promoter Specific Expression and Function of the foraging Gene in the Adult Fruit Fly (Drosophila melanogaster)

The vast majority of the experimental research in this thesis was performed by Aaron Allen. However, some of the experiments were collaborative, with data contributed by others. Those data and the contributors are listed below:

● Figure 6: Proboscis extension response in adult - Bryon Hughson
● Figure 7: fat body manipulations in adult - Dr. Scott Douglas

Abstract:

Many tissues in Drosophila are rebuilt or built anew during the metamorphic transition from larva to fly. The life history traits of the animal also undergo changes. Can parallels be drawn between the expression and function of genes across development despite these differences? The foraging gene is known to influence similar phenotypes in the larva and the adult fly. In this chapter, I set out to identify the CREs located in the foraging gene that drove expression in the adult. Cloned foraging promoter-Gal4 lines had diverse expression patterns in the adult. Many of the tissues showing expression in the adult were similar to those showing expression in the larva. for-pr1-4.7-Gal4 drove expression in the brain and VNC. for-pr3-3.3-Gal4 drove expression in the perineurial glia on the surface of the CNS. for-pr4-2.4-Gal4 drove expression in the optic lobes of the fly, as in the larva. for-pr2-4.0-Gal4 primarily expressed in the stem cells of the gastric system, and for-pr3-3.3-Gal4 expressed in several of the differentiated cell types of the gut. for-pr4-2.4-Gal4 was restricted to the hindgut of the gastric system, as it was in the larva. As in the larva, foraging was expressed in the adult fat body, and this expression autonomously regulated lipid levels in the fly. foraging also had extensive expression in the reproductive system of the adult male and female fly. foraging has not previously been well characterized for any role in mating and reproductive behaviour, so this finding represents an interesting new avenue of research. This study identified similar expression and regulation of the foraging gene in the adult compared to what I previously showed in the larva. foraging’s larval and adult feeding-related phenotypes may be regulated in a similar fashion.

Introduction:

Drosophila melanogaster, a holometabolous insect, undergoes extensive body remodeling during its metamorphosis from larva to fly. During the transition from larva to pupa to adult, many of the tissues are dissolved and regrown to form the new adult tissues. The behaviours of the animal also change from one developmental stage to another. Larvae spend most of their time foraging and eating to acquire the nutrients that they need during metamorphosis. The energy requirements of increasing body size from a first instar larva to a third, followed by the rebuilding of tissues during pupal development, are very large. As with the larvae, adults need to acquire nutrients in order to maintain their bodies and meet their metabolic requirements. However, adults invest much of their time in mate search and reproduction. Can parallels be drawn between the expression and function of genes across development despite the differences in life histories of these developmental stages?

Conserved expression/CRE across development

Many genes show consistent expression across development (Graveley et al., 2011). Additionally, promoter structure has been shown to be highly conserved between developmental stages (Hoskins et al., 2011). In the case of the slowpoke gene, shared regulatory elements drive expression in both larval and adult muscle, and a separate element drives expression specifically in the adult muscle (Brenner and Atkinson, 1996). Furthermore, there are multiple elements that drive expression in both the adult and larval CNS. Many of these patterns are also seen in the developing embryo (Thomas et al., 1997). Reporter constructs of another gene, pdf, exhibit consistent expression across development in the nervous system (Park et al., 2000). Specifically, a CRE distal from the TSS drives expression in a discrete set of cells at the tip of the abdominal ganglia in both larvae and adults. A separate CRE closer to the TSS drives expression in a set of cells in the brain in both larvae and adults. The patterns of regulation of the paramyosin gene are also consistent across development, from the embryo to larva to adult (Arredondo et al., 2001). Interestingly, foraging expression patterns, as deduced by microarray on dissected tissues, suggest consistent expression across development (Chintapalli et al., 2007). Both the expression levels and the spatial distribution of foraging are similar between tissues of the third instar larva and the adult fly. Similar expression patterns between life history stages do not necessarily reflect similar functional outcomes. foraging’s protein isoforms may interact with different partners in a given tissue in the larva compared to the adult, or a given tissue may have different functions due to the differences between life history stages.

foraging’s adult phenotypes

Although much of the classic literature on the foraging gene has focused on the larva, recent work has expanded to include adult phenotypes of rover and sitter. These include post-feeding locomotion (Pereira and Sokolowski, 1993), sucrose responsiveness (Scheiner et al., 2004; Belay et al., 2007), learning and memory (Mery et al., 2007; Reaume et al., 2011; Kohn et al., 2013; Kuntz et al., 2012; Wang et al., 2008), social behaviour (Foucaud et al., 2013; Donlea et al., 2012), sleep (Donlea et al., 2012; Raizen et al., 2008), and stress tolerance (Dawson-Scully et al., 2007; Dawson-Scully et al., 2010). Many of these phenotypes show parallels between larval and adult rovers and sitters. Here I focus on feeding-related traits, where much of the larval and adult phenotyping has been done. These include foraging behaviour, food intake, and nutrient storage. foraging affects food search behaviour in adult flies (Pereira and Sokolowski, 1993; Kent et al., 2009). Rovers traveled further than sitters after feeding on a drop of sucrose. foraging also affected the hunger state of adult flies (Scheiner et al., 2004; Belay et al., 2007). Rovers showed higher sucrose responsiveness than sitters, and their sucrose responsiveness increased with the duration of starvation (Scheiner et al., 2004). Transgenically increasing foraging expression in neurons, with elav-Gal4, increased sucrose responsiveness (Belay et al., 2007). Although neither the natural alleles nor ubiquitous overexpression of foraging had an effect on adult ingestion (Urquhart-Cronish and Sokolowski, 2014) in the CAFE assay (Ja et al., 2007), differences in the ingestion behaviour of larvae were found (Chapter 2). These findings may reflect differential regulation of the foraging gene.

During short bouts of nutrient stress, as in the case of starvation, flies begin depleting their carbohydrate reserves, primarily in the form of glycogen. During longer bouts of starvation, flies begin to deplete their lipid reserves (Canavoso et al., 2001). Sitters are more resistant to these forms of nutrient stress. When chronically starved, sitter adults survive longer than rovers (Donlea et al., 2012). The natural allelic variants also differ in their effects on adult fat content. Sitters have significantly higher fat levels than rovers (Hughson and Anreiter, unpublished data). This is similar to what we see in larvae (Belay, 2010). These fat levels have been shown to decrease in response to food deprivation (Hughson, unpublished data). This is consistent with sitters surviving longer under nutrient stress because they have more reserves.

Identifying foraging’s adult CREs and their functions

Parallels can be drawn between larval and adult phenotypes associated with foraging through the natural variants of the foraging gene. Some of these foraging-associated phenotypes have been further supported by transgenic manipulations. The question arises: do foraging’s associated phenotypes in the adult fly arise from regulation and expression similar to those reported in the larva? In larvae, I found broad expression in many tissues and mapped multiple CREs (Chapter 3). I found expression in various subsets of neurons and subsets of glia in the CNS. Expression was also seen throughout the gastric system. I asked whether there were CREs within the locus that drove expression in the same tissues in the adult as I reported in the larva. In this chapter, I identified CREs located in the foraging gene that drive expression in the adult. I used the same promoter-Gal4 constructs as in the last chapter to identify the expression patterns of foraging and its associated CREs in the adult fly. Furthermore, I altered foraging expression in some of the identified tissues and assessed feeding-related behaviours in adults.

Materials and Methods:

Fly Strains and Rearing

Strains were kept in 40 mL vials with 10 mL of food and 170 mL bottles with 40 mL of food at 25 ± 1ºC with a 12L:12D photocycle. Our food recipe is 1.5% sucrose, 1.4% agar, 3% glucose, 1.5% cornmeal, 1% wheat germ, 1% soy flour, 3% molasses, 3.5% yeast, 0.5% propionic acid, 0.2% Tegosept and 1% ethanol in water.

Bioinformatics

Bioinformatic analyses were conducted with the Geneious 8.1.7 software package (http://www.geneious.com, Kearse et al., 2012). This included primer design, editing and visualization of Sanger sequencing chromatograms, contig assembly, nucleotide alignments, in silico design of cloning schemes, and restriction endonuclease digestion prediction.

Promoter-Gal4s construction

An attB sequence was amplified with PCR from the pUAST-attB vector (Bischof et al., 2007). NsiI sites were added to the primers. The NsiI fragment was then cloned into the pStinger vector (Barolo et al., 2000). Regions of the foraging gene were amplified with PCR and cloned into the pGEM-T Easy vector (Promega). KpnI and SacI sites were added to the primers used. The KpnI/SacI fragment was digested out of pGEM and inserted into the Gal4-containing pMARTINI-Gal4 vector (Billeter and Goodwin, 2004). The NotI pr-Gal4 fragment was then inserted into the pStinger-attB vector. The resulting vector contained a pr-Gal4 sequence between two gypsy insulators with an attB sequence. The nested constructs were cloned by first inserting the NotI fragment of pMARTINI-Gal4 into the pStinger-attB vector. This insulated Gal4 vector with attB, called pIGa, was digested with KpnI and end-filled with Klenow enzyme (New England Biolabs). Nested regions were amplified with PCR from the larger pr-Gal4 vectors and cloned into the end-filled, KpnI-digested pIGa. All pr-Gal4 constructs were injected into the P{CaryP}attP2 landing site (Groth et al., 2004) by Genetic Services Inc. Successful integration was confirmed with PCR.

Western blot analysis

Twenty larvae were homogenized in 400 μl of lysis buffer (50 mM Tris-HCl pH 7.5, 10% glycerol, 150 mM NaCl, 1% Triton X-100, 5 mM EDTA, 1x Halt Protease Inhibitor Cocktail, 1x Halt Phosphatase Inhibitor Cocktail) on ice. Samples were centrifuged at 16,000 RCF at 4°C, and the supernatant was transferred to a new tube and placed on ice. Protein quantification was performed with the Pierce BCA kit. 20 μg of protein was denatured for 5 min at 100°C. The samples were run on a 4% stacking/7% resolving SDS-polyacrylamide gel at 150 V for 1 h in running buffer (25 mM Tris-HCl, 200 mM glycine, 0.1% SDS). Proteins were transferred onto a nylon membrane at 100 V for 1 h in transfer buffer (25 mM Tris-HCl, 200 mM glycine, 10% methanol). Blots were blocked for 2 h in 5% nonfat milk in 0.1% Tween-20 in 1x TBS (0.1% TBST). Primary antibody was incubated for 1 h at a concentration of 1:10,000 for anti-FOR (Belay et al., 2007) and 1:5,000 for anti-ACTIN (Sigma). The blots were rinsed twice and then washed 3x for 5 min each in 0.1% TBST. Blots were then incubated with HRP-conjugated secondary antibody (Jackson ImmunoResearch Laboratories) at a concentration of 1:10,000 for 45 min. They were then rinsed twice and washed 3x as above. Blots were incubated for 5 min in GE’s ECL Prime Detection reagent and exposed to X-ray film.

Immunohistochemistry

Dissected samples were fixed in 4% paraformaldehyde in 1x PBS for 1 h. The fixed samples were rinsed twice in 0.5% Triton X-100 in 1x PBS (PBT) and then washed 4x for 30 min each in PBT. The samples were blocked in 10% normal goat serum (NGS, Jackson ImmunoResearch Laboratories) and 0.1% BSA (Sigma) in PBT for 2 h at room temperature. Primary antibodies were diluted in blocking solution and incubated overnight at 4°C. After primary incubation, the samples were rinsed twice and then washed 4x for 30 min each in PBT. Secondary antibody was incubated in blocking solution at room temperature for 2 h. Washing was conducted as described above for the primary antibody. Tissues were mounted on slides in Vectashield (product infor). Samples were imaged using a Zeiss Axioscope epifluorescence microscope as well as Zeiss LSM 510 and Leica SP5 confocal microscopes. Images were analysed using the Fiji software package (Schindelin et al., 2012).

Triglyceride analysis

Groups of 10 adults were homogenized in 200 μl of 0.1% Tween 20 in 1x PBS. Samples were incubated at 70°C for 5 min and then chilled for 2 min on ice. Debris was pelleted by centrifugation at 16,000 RCF for 5 min. 25 μl of the supernatant was then mixed with Infinity TAG Reagent (Fisher Scientific) in a 96-well spectrophotometer plate. Samples were incubated at 37°C for 5 min. The absorbance was measured at 540 nm. A separate aliquot was mixed with Pierce BCA reagent to quantify protein levels in the samples. The BCA samples were incubated at 37°C for 30 min. Standard curves were used for both the DAG/TAG and protein analyses. Lipid levels are displayed as μg glycerol/mg protein.
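The conversion from absorbance readings to μg glycerol/mg protein can be sketched as follows. This is a minimal illustration and not the analysis code used in this chapter: it assumes that both standard curves are linear over the working range, and that the glycerol and protein amounts are read per equal volume of supernatant. The function names and all numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to recover an amount (ug) from an absorbance."""
    return (absorbance - intercept) / slope


def glycerol_per_mg_protein(tag_abs, bca_abs, tag_curve, bca_curve):
    """Normalize lipid content to protein content (ug glycerol / mg protein).

    tag_curve and bca_curve are (slope, intercept) pairs from fit_line.
    """
    glycerol_ug = amount_from_absorbance(tag_abs, *tag_curve)
    protein_ug = amount_from_absorbance(bca_abs, *bca_curve)
    return glycerol_ug / (protein_ug / 1000.0)  # convert ug protein to mg


if __name__ == "__main__":
    # Hypothetical standards: known ug amounts and their measured absorbances.
    tag_curve = fit_line([0, 1, 2, 4], [0.05, 0.15, 0.25, 0.45])
    bca_curve = fit_line([0, 100, 200, 400], [0.10, 0.20, 0.30, 0.50])
    print(glycerol_per_mg_protein(0.25, 0.30, tag_curve, bca_curve))
```

Normalizing to protein rather than to the number of flies corrects for differences in body size and homogenization efficiency between samples.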

Sucrose responsiveness

Four to seven day old mated female flies were isolated into groups of nine flies. Flies were deprived of food, but not water, for 48 h in vials containing 1% agar prior to testing. Starved flies were placed inside a trimmed 200 μL pipet tip with only their head and one foreleg protruding. The protocol used for sucrose responsiveness was taken from Scheiner et al. (2004). Briefly, flies were presented with a Kimwipe soaked in different sucrose concentrations. The concentrations were 0.1%, 0.3%, 1%, 3%, 10%, and 30%. Water was presented to the flies between each sucrose sample. The number of proboscis extensions following a sucrose solution was tallied to get the sucrose responsiveness score. Any proboscis extension that was followed by a positive response to the water sample was discounted. The experiments were performed in an environmental chamber between 1200 hours and 1700 hours at 25°C and 60% RH.
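The scoring rule described above can be written out as a short sketch. The data layout (one pair of booleans per sucrose concentration) and the function name are assumptions made for illustration; they are not taken from Scheiner et al. (2004).

```python
# Sucrose concentrations (%) presented in ascending order, as in the protocol.
CONCENTRATIONS = (0.1, 0.3, 1, 3, 10, 30)


def sucrose_responsiveness(trials):
    """Return the sucrose responsiveness score for one fly.

    trials holds one (extended_to_sucrose, extended_to_following_water)
    pair per concentration. A proboscis extension to sucrose is counted
    only if the fly did not also respond to the water presented after it.
    """
    if len(trials) != len(CONCENTRATIONS):
        raise ValueError("expected one trial per sucrose concentration")
    return sum(1 for sucrose, water in trials if sucrose and not water)


if __name__ == "__main__":
    # Hypothetical fly: responds from 1% sucrose upward, but its response
    # at 1% is discounted because it also extended to the following water.
    fly = [(False, False), (False, False), (True, True),
           (True, False), (True, False), (True, False)]
    print(sucrose_responsiveness(fly))
```

Under this rule the score for a single fly ranges from 0 to 6, with higher scores indicating responses at lower sucrose concentrations.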

Starvation resistance

Starvation resistance experiments for fig. 9A were performed using the Trikinetics Drosophila activity monitor. Vials were filled with about 2 cm of 1% agar. Time to no activity was used as the metric for starvation resistance. When a fly died, it no longer moved and as a result did not cross the infrared beams in the monitor. Five to seven day old flies were anesthetized with CO2 and placed into the vials. The monitors downloaded count data every hour. Experiments were performed in an environmental chamber at 25°C, 60% RH, and 12:12 L:D. Starvation resistance experiments for fig. 9B were performed in 40 mL fly culture vials with 10 mL of 1% agar. A 2 mL microfuge tube filled with water and topped with a piece of cotton was inserted into each vial. A Nikon D3100 DSLR connected to a computer running gphoto2 (http://www.gphoto.org/) was used to take photos of the vials every hour. Immobile flies were scored as dead. As above, five to seven day old anesthetized adults were placed into the agar vials. The experiments were performed in an environmental chamber at 25°C, 60% RH, and 12:12 L:D.
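The "time to no activity" metric can be derived from the hourly beam-crossing counts roughly as follows. This is a sketch under stated assumptions: the input is a plain list of counts per hourly bin for one fly, and a fly that is still active in the final bin is treated as censored. The function name and the censoring convention are illustrative and are not part of the Trikinetics software.

```python
def time_to_no_activity(hourly_counts, bin_hours=1.0):
    """Return the time (h) after which a fly never crosses the beam again.

    hourly_counts: beam crossings per bin, in chronological order.
    Returns None when the fly is still active in the last recorded bin
    (or no bins were recorded), i.e. death was not observed.
    """
    last_active = -1
    for i, count in enumerate(hourly_counts):
        if count > 0:
            last_active = i
    if last_active == len(hourly_counts) - 1:
        return None  # censored: no inactive period observed
    # Death is assigned to the end of the last bin with any activity.
    return (last_active + 1) * bin_hours


if __name__ == "__main__":
    # Hypothetical record: a quiet hour (bin 3) is not mistaken for death
    # because activity resumes afterwards.
    print(time_to_no_activity([12, 7, 0, 3, 0, 0, 0]))
```

Scanning for the last active bin, rather than the first inactive one, avoids scoring a sleeping fly as dead.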

Statistical analysis

Statistical analysis was performed in R (R Core Team, 2013). The effects of genotype, as well as blocking factors such as the date of the experiment, were modeled with general linear models using the lm function and the car package (Fox and Weisberg, 2011). ANOVA and Tukey HSD post-hoc tests were run to compute statistical significance. 95% confidence intervals were calculated using the effects package (Fox, 2003). Effect sizes were calculated using the compute.es package (Re, 2013). Power was calculated using the pwr package (Champley, 2012). The packages lattice and gplots were used for plotting (Sarkar, 2008; Warnes et al., 2013, respectively). The survival package was used to calculate and generate the time-to-failure plots for the starvation resistance experiments (Therneau, 2015).
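For readers unfamiliar with time-to-failure curves, the following sketch shows a Kaplan-Meier estimate of the kind typically plotted for starvation resistance data. It is an illustration in plain Python, not the R survival package used for the analyses in this chapter, and the example data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  time to death or to censoring for each fly.
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all flies that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Dead and censored flies both leave the risk set after time t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Hypothetical times (h); the second fly at 20 h was censored.
    for t, s in kaplan_meier([10, 20, 20, 30, 40], [1, 1, 0, 1, 1]):
        print(t, round(s, 3))
```

Censored flies contribute to the risk set up to the time they leave the experiment, which is why they lower the denominator at later time points without producing a step in the curve.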

Results and Discussion:

for-pr-Gal4 expression

for-pr-Gal4 in the CNS

In general, I saw parallels between the expression patterns of the larvae and adults. More specifically, for-pr1-4.7-Gal4 drove expression in a subset of neurons in both the brain lobes and the VNC (fig. 1A, 1B). Levels of expression in adults were dramatically reduced compared to the larvae (Ch. 3, fig. 3A). Interestingly, the neurons in the adult brain were roughly located in positions corresponding to the secondary lineage neurons seen in the larval expression pattern. These secondary lineage neurons are not functional in the larva, but survive metamorphosis and become functional adult neurons (James Truman, personal communication; Truman et al., 2010). for-pr2-4.0-Gal4 expressed in the midline glia in the larva; however, this cell type does not exist in the adult nervous system. for-pr2-4.0-Gal4 did not express in the adult CNS; it expressed in the trachea and air sacs surrounding and integrating into the adult nervous system (fig. 1C). for-pr3-3.3-Gal4 drove expression in the surface glia (fig. 1D, 1E); this expression was highly reminiscent of what I saw in the larva (Ch. 3, fig. 3C). The size and number of the cells indicated that this surface glial pattern corresponded to the perineurial glia (Awasaki et al., 2008). Interestingly, the surface glia are responsible for blood-brain barrier functions in the larval and adult fly (Bainton et al., 2005; Schwabe et al., 2005; Stork et al., 2008). for-pr4-2.4-Gal4 expressed in the optic lobes of the fly brain (fig. 1F). A role for PKG in the visual system of mammals has been reported (Snellman and Nawy, 2008).

for-pr-Gal4 in the gastric system

When I looked at the gastric system of the adults, I also saw parallels with the larval expression patterns, as described below. I found no observable expression from for-pr1-4.7-Gal4 in the gastric system. for-pr2-4.0-Gal4 expressed in some of the same tissues as in the larva, as well as a few additional tissues. The crop duct and the cardia represent novel expression for for-pr2-4.0-Gal4 in the adult as compared to the larva (fig. 2A, 2B). Expression in the intestinal stem cells (ISCs), as well as the enteroblasts, was consistent with the larval pattern (fig. 2D). Some cells stained positive for DELTA, and other non-enterocyte cells stained positive for PROSPERO (fig. 3). DELTA is indicative of ISCs (Ohlstein and Spradling, 2006) and PROSPERO is indicative of enteroblasts (Micchelli and Perrimon, 2006). The ISCs are important for midgut homeostasis. These cells divide symmetrically to replace themselves, or divide asymmetrically to produce enteroblasts. These enteroblasts then differentiate into enterocytes or enteroendocrine cells (Ohlstein and Spradling, 2007). for-pr2-4.0-Gal4 also expressed in the ureter of the malpighian tubules. These cells are very small, smaller than stellate cells. They were first described from the expression of an enhancer trap (Sözen et al., 1997), and have since been identified as malpighian tubule stem cells (Sing and Hou, 2009). The tubules are vital for ion balance in the hemolymph of the fly and are discussed further below. The adult for-pr3-3.3-Gal4 also had expression analogous to that in the larva. It drove expression in the anterior enterocytes, enteroendocrine cells, large flat cells, visceral gut muscle, and the malpighian tubules (fig. 2E-H). The enterocytes and enteroendocrine cells are important for absorption and digestion of ingested nutrients. From the anterior to the posterior of the gut, the specialization of digestion shifts from larger macromolecules to smaller monosaccharides. The visceral muscle is required to push these nutrients along via peristalsis. Mice with foraging's orthologue knocked out in smooth muscle show increased gut passage times relative to controls (Weber et al., 2007).
Larvae and adults carrying the rover and sitter alleles have not been characterized for gut clearing times. However, they have previously been characterized for differences in adult malpighian tubule secretion rate, with rovers secreting less than sitters (MacPherson et al., 2004a). Overexpression of foraging in the tubules led to an increase in secretion rate (MacPherson et al., 2004b), suggesting that rovers have lower foraging gene expression in the tubules than sitters. for-pr4-2.4-Gal4 expressed in the hindgut epithelia as well as the ampulla or rectum of the hindgut (fig. 2I, 2J). Much of the hindgut is important for ion absorption, and the rectum is crucial for water balance (Lemaitre and Miguel-Aliaga, 2013). The rover and sitter variant adults have been shown to differ in their excretion rate (Urquhart-Cronish and Sokolowski, 2014); the hindgut and rectum are strong candidate tissues for mediating this difference.

for-pr-Gal4 in the salivary and fat tissues

I also saw parallels with larval expression in the salivary glands and fat bodies of the adults. for-pr1-4.7-Gal4 did not express in either of these tissues. for-pr2-4.0-Gal4 expressed at the start of the salivary duct as well as in fat tissue associated with the gastric system (fig. 4A, 4B). for-pr3-3.3-Gal4 expressed throughout the salivary glands as well as in the fat body (fig. 4C, 4D). for-pr4-2.4-Gal4 was restricted to the salivary duct of the salivary gland and also expressed in the fat body associated with the gastric system (fig. 4E, 4F).

for-pr-Gal4 in the reproductive systems

I also saw extensive expression in the reproductive tissues of the adults. In males, there were many similarities between the Gal4 lines. for-pr1-4.7-, for-pr2-4.0-, and for-pr4-2.4-Gal4 all expressed in the seminal vesicle and the secondary cells of the accessory glands, albeit to varying extents (fig. 5A, 5E, 5G, respectively). The seminal vesicles are primarily a storage organ for mature sperm prior to copulation, and they also produce glandular secretions for the seminal fluid (Wolfner et al., 2005). The male accessory glands produce many proteins required for fertility, and their secretions have been shown to have strong effects on female post-mating behaviours (Wolfner, 1997). for-pr2-4.0-Gal4 additionally expressed in a small ring of cells at the base of the vas deferens where it joins the ejaculatory duct (fig. 5F). for-pr4-2.4-Gal4 drove expression in the ejaculatory bulb of the male reproductive system (fig. 5H), which not only functions to pump the ejaculate, but also contributes glandular secretions. for-pr3-3.3-Gal4 had no observed expression in the male reproductive system. This lack of expression by for-pr3-3.3-Gal4 further supported the success of the gypsy insulators in our cloning and of the site-specific integration; it allows us to rule out the possibility that a position effect is responsible for the accessory gland expression. I did not observe any expression of for-pr1-4.7-, for-pr3-3.3-, or for-pr4-2.4-Gal4 in the female reproductive system. I did see expression of for-pr2-4.0-Gal4 in the female reproductive system. for-pr2-4.0-Gal4 expressed in the follicle cells of the developing eggs in the ovaries as well as in the spermatheca (fig. 5B, 5C). The maternal loading of foraging in the developing eggs and early embryo is well known (Lécuyer et al., 2007; Graveley et al., 2011; Tomancak et al., 2002; Jambor et al., 2015). foraging is also known to have high levels of expression in the spermatheca (Chintapalli et al., 2007). The spermathecae are important for long-term storage of sperm (Gilbert, 1981), and perturbations in these cells can cause a decrease in egg laying over time (Schnakenberg et al., 2011). Egg laying differences have previously been reported between our rover and sitter strains (Burns et al., 2012). Finally, there was expression in a small segment of the common oviduct in the female reproductive system (fig. 5D).

Mapping CREs

I was able to refine the mapping of the CREs onto the foraging locus by using the nested Gal4 lines that I generated and described in the previous chapter (fig. 6). The adult expression was parsed out more finely than the larval expression. Interestingly, the CREs for the secondary cells of the male accessory glands mapped very close to the TSSs of for-pr1-4.7, for-pr2-4.0, and for-pr4-2.4. It is particularly fascinating that these three promoters are the peaked promoters of the locus. for-pr3-3.3 is a broad promoter and does not have expression in the male accessory glands. The closely linked CRE and the peaked nature of these promoters may suggest a common evolutionary origin. foraging may have undergone multiple partial gene duplications leading to alternative promoters. I speculate that, if so, the accessory gland CRE was so closely linked to the original promoter that the redundant expression between promoters has not yet been lost.

Mapped CREs were not sufficient to alter sucrose responsiveness

foraging has previously been shown to affect the hunger state of adult flies (Scheiner et al., 2004; Belay et al., 2007). I asked whether the CREs identified here were sufficient to alter adult sucrose responsiveness. Driving our common coding RNAi with dicer, UAS-Dcr2; UAS-forRNAi-exon7:8, with the for-pr-Gal4 lines did not affect sucrose responsiveness (fig. 7). Neuronal expression of foraging, with elav-Gal4, was previously shown to be sufficient to alter sucrose responsiveness (Belay et al., 2007). Our for-pr-Gal4s express in very few neurons. Other work in the lab has shown that foraging expression in the corpora cardiaca (CC) is required for wild-type sucrose responsiveness, and our for-pr-Gal4s do not express in the CC (Hughson, unpublished).

Altered foraging expression in the fat body affects DAG/TAG metabolism

foraging had previously been shown to affect the food search behaviour and hunger state of adult flies (Pereira and Sokolowski, 1993; Kent et al., 2009; Scheiner et al., 2004; Belay et al., 2007). Are these behaviour patterns associated with altered lipid metabolism? Previous work in the lab had found differences in the lipid levels of adult flies of our rover and sitter strains (Hughson, unpublished; Anreiter, unpublished). Similar to what I found for larvae, I saw expression in the fat body cells of the adult (fig. 4). When I increased foraging expression levels in the fat body with a fat body driver, ppl-Gal4, I saw decreased fat levels (fig. 8). I localized CREs responsible for expression in the fat body and also showed that locally altered foraging expression in the fat body is sufficient to alter lipid levels. Other research in the lab has shown that expression of foraging in the CC also alters fat levels, in this case non-autonomously (Hughson, unpublished). This represents an interesting situation in which foraging functions in multiple distinct tissues to affect the same phenotype. Flies rely on their lipid reserves to cope with times of inadequate nutrient availability. I have shown that altered foraging expression in the fat body of adult flies results in altered lipid levels. Altered lipid levels are likely to result in altered resistance to starvation.

Increasing foraging expression increases starvation resistance

Since the null mutant is lethal, I was unable to evaluate its phenotypes in the adult. However, when I compared the BAC rescue to the BAC overexpressor, I saw an increase in starvation resistance (fig. 9A). The BAC rescue recapitulated most of the adult expression pattern on a western blot (fig. 10A). The BAC overexpressor showed increased expression relative to wild type (fig. 10B). These data suggested that higher foraging expression conferred increased starvation resistance. When I overexpressed foraging with our pr-Gal4 lines, I saw that for-pr1-4.7-Gal4 also increased starvation resistance (fig. 9B). Rovers have less starvation resistance than sitters, and overexpressing foraging in the mushroom bodies increased starvation resistance (Donlea et al., 2012). This implies that rovers have lower expression in the cells relevant for this phenotype, and the mushroom bodies are a good candidate. Other work in the lab showed that altering foraging in the CC was sufficient to alter starvation resistance (Hughson, unpublished). This suggests that the CC is also a candidate tissue for starvation resistance. Given that for-pr1-4.7-Gal4 did not drive expression in the mushroom bodies or the CC, it is possible that foraging expression in yet another tissue is involved in adult starvation resistance.

Conclusion:

As in the larvae, the for-pr-Gal4s drove expression in multiple different tissue systems in the adults. I used our nested constructs to map the CREs along the locus. Many of these expression patterns were consistent with what I found in the larva, but I did see a few novel tissues appearing in the adult. I identified fat body regulatory elements, and altering foraging’s expression in the fat body altered lipid levels. Much of the expression that I found implicated potential new roles for the foraging gene in the adult fly. This is especially true in the case of the reproductive tissues.

References:

Arredondo, J.J., Ferreres, R.M., Maroto, M., Cripps, R.M., Marco, R., Bernstein, S.I., and Cervera, M. (2001). Control of Drosophila paramyosin/miniparamyosin gene expression: Differential regulatory mechanism for muscle-specific transcription. J. Biol. Chem. 276, 8278–8287.

Awasaki, T., Lai, S.-L., Ito, K., and Lee, T. (2008). Organization and postembryonic development of glial cells in the adult central brain of Drosophila. J. Neurosci. 28, 13742–13753.

Bainton, R.J., Tsai, L.T.-Y., Schwabe, T., DeSalvo, M., Gaul, U., and Heberlein, U. (2005). moody encodes two GPCRs that regulate cocaine behaviors and blood-brain barrier permeability in Drosophila. Cell 123, 145–156.

Barolo, S., Carver, L.A., and Posakony, J.W. (2000). GFP and β-galactosidase transformation vectors for promoter/enhancer analysis in Drosophila. BioTechniques 29, 726–732.

Belay, A.T., Scheiner, R., So, A.K.-C., Douglas, S.J., Chakaborty-Chatterjee, M., Levine, J.D., and Sokolowski, M.B. (2007). The foraging gene of Drosophila melanogaster: Spatial-expression analysis and sucrose responsiveness. J. Comp. Neurol. 504, 570–582.

Billeter, J.-C., and Goodwin, S.F. (2004). Characterization of Drosophila fruitless-Gal4 transgenes reveals expression in male-specific fruitless neurons and innervation of male reproductive structures. J. Comp. Neurol. 475, 270–287.

Bischof, J., Maeda, R.K., Hediger, M., Karch, F., and Basler, K. (2007). An optimized transgenesis system for Drosophila using germ-line-specific φC31 integrases. PNAS 104, 3312–3317.

Brenner, R., and Atkinson, N. (1996). Developmental- and eye-specific transcriptional control elements in an intronic region of a Ca(2+)-activated K+ channel gene. Dev. Biol. 177, 536–543.

Burns, J.G., Svetec, N., Rowe, L., Mery, F., Dolan, M.J., Boyce, W.T., and Sokolowski, M.B. (2012). Gene-environment interplay in Drosophila melanogaster: chronic food deprivation in early life affects adult exploratory and fitness traits. PNAS 109 Suppl 2, 17239–17244.

Chintapalli, V.R., Wang, J., and Dow, J.A.T. (2007). Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat. Genet. 39, 715–720.

Dawson-Scully, K., Armstrong, G.A.B., Kent, C., Robertson, R.M., and Sokolowski, M.B. (2007). Natural variation in the thermotolerance of neural function and behavior due to a cGMP-dependent protein kinase. PLoS ONE 2, e773.

Dawson-Scully, K., Bukvic, D., Chakaborty-Chatterjee, M., Ferreira, R., Milton, S.L., and Sokolowski, M.B. (2010). Controlling anoxic tolerance in adult Drosophila via the cGMP-PKG pathway. J. Exp. Biol. 213, 2410–2416.

Donlea, J., Leahy, A., Thimgan, M.S., Suzuki, Y., Hughson, B.N., Sokolowski, M.B., and Shaw, P.J. (2012). foraging alters resilience/vulnerability to sleep disruption and starvation in Drosophila. PNAS 109, 2613–2618.

Edwards, T.N., and Meinertzhagen, I.A. (2010). The functional organisation of glia in the adult brain of Drosophila and other insects. Prog Neurobiol 90, 471–497.

Foucaud, J., Philippe, A.-S., Moreno, C., and Mery, F. (2013). A genetic polymorphism affecting reliance on personal versus public information in a spatial learning task in Drosophila melanogaster. Proceedings of the Royal Society of London B: Biological Sciences 280, 20130588.

Gilbert, D.G. (1981). Ejaculate esterase 6 and initial sperm use by female Drosophila melanogaster. Journal of Insect Physiology 27, 641–650.

Graveley, B.R., Brooks, A.N., Carlson, J.W., Duff, M.O., Landolin, J.M., Yang, L., Artieri, C.G., van Baren, M.J., Boley, N., Booth, B.W. et al. (2011). The developmental transcriptome of Drosophila melanogaster. Nature 471, 473–479.

Groth, A.C., Fish, M., Nusse, R., and Calos, M.P. (2004). Construction of transgenic Drosophila by using the site-specific integrase from phage phiC31. Genetics 166, 1775–1782.

Hoskins, R.A., Landolin, J.M., Brown, J.B., Sandler, J.E., Takahashi, H., Lassmann, T., Yu, C., Booth, B.W., Zhang, D., Wan, K.H. et al. (2011). Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 21, 182–192.

Ja, W.W., Carvalho, G.B., Mak, E.M., de la Rosa, N.N., Fang, A.Y., Liong, J.C., Brummel, T., and Benzer, S. (2007). Prandiology of Drosophila and the CAFE assay. PNAS 104, 8253–8256.

Jambor, H., Surendranath, V., Kalinka, A.T., Mejstrik, P., Saalfeld, S., and Tomancak, P. (2015). Systematic imaging reveals features and changing localization of mRNAs in Drosophila development. eLife Sciences 4, e05003.

Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C. et al. (2012). Geneious basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649.

Kent, C.F., Daskalchuk, T., Cook, L., Sokolowski, M.B., and Greenspan, R.J. (2009). The Drosophila foraging gene mediates adult plasticity and gene–environment interactions in behaviour, metabolites, and gene expression in response to food deprivation. PLoS Genet 5, e1000609.

Kohn, N.R., Reaume, C.J., Moreno, C., Burns, J.G., Sokolowski, M.B., and Mery, F. (2013). Social environment influences performance in a cognitive task in natural variants of the foraging gene. PLoS ONE 8, e81272.

Kuntz, S., Poeck, B., Sokolowski, M.B., and Strauss, R. (2012). The visual orientation memory of Drosophila requires Foraging (PKG) upstream of Ignorant (RSK2) in ring neurons of the central complex. Learn. Mem. 19, 337–340.

Lécuyer, E., Yoshida, H., Parthasarathy, N., Alm, C., Babak, T., Cerovina, T., Hughes, T.R., Tomancak, P., and Krause, H.M. (2007). Global analysis of mRNA localization reveals a prominent role in organizing cellular architecture and function. Cell 131, 174–187.

Lemaitre, B., and Miguel-Aliaga, I. (2013). The digestive tract of Drosophila melanogaster. Annual Review of Genetics 47, 377–404.

Canavoso, L.E., Jouni, Z.E., Karnas, K.J., Pennington, J.E., and Wells, M.A. (2001). Fat metabolism in insects. Annual Review of Nutrition 21, 23–46.

MacPherson, M.R., Broderick, K.E., Graham, S., Day, J.P., Houslay, M.D., Dow, J.A.T., and Davies, S.A. (2004a). The dg2 (for) gene confers a renal phenotype in Drosophila by modulation of cGMP-specific phosphodiesterase. J. Exp. Biol. 207, 2769–2776.

MacPherson, M.R., Lohmann, S.M., and Davies, S.-A. (2004b). Analysis of Drosophila cGMP-dependent protein kinases and assessment of their in vivo roles by targeted expression in a renal transporting epithelium. J. Biol. Chem. 279, 40026–40034.

Mery, F., Belay, A.T., So, A.K.-C., Sokolowski, M.B., and Kawecki, T.J. (2007). Natural polymorphism affecting learning and memory in Drosophila. PNAS 104, 13051–13055.

Ohlstein, B., and Spradling, A. (2007). Multipotent Drosophila intestinal stem cells specify daughter cell fates by differential notch signaling. Science 315, 988–992.

Park, J.H., Helfrich-Förster, C., Lee, G., Liu, L., Rosbash, M., and Hall, J.C. (2000). Differential regulation of circadian pacemaker output by separate clock genes in Drosophila. Proc. Natl. Acad. Sci. U.S.A. 97, 3608–3613.

Pereira, H.S., and Sokolowski, M.B. (1993). Mutations in the larval foraging gene affect adult locomotory behavior after feeding in Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. 90, 5044–5046.

Raizen, D.M., Zimmerman, J.E., Maycock, M.H., Ta, U.D., You, Y., Sundaram, M.V., and Pack, A.I. (2008). Lethargus is a Caenorhabditis elegans sleep-like state. Nature 451, 569–572.

Reaume, C.J., Sokolowski, M.B., and Mery, F. (2011). A natural genetic polymorphism affects retroactive interference in Drosophila melanogaster. Proc. Biol. Sci. 278, 91–98.

Scheiner, R., Sokolowski, M.B., and Erber, J. (2004). Activity of cGMP-dependent protein kinase (PKG) affects sucrose responsiveness and habituation in Drosophila melanogaster. Learn. Mem. 11, 303–311.

Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., Preibisch, S., Rueden, C., Saalfeld, S., Schmid, B. et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat Meth 9, 676–682.

Schnakenberg, S.L., Matias, W.R., and Siegal, M.L. (2011). Sperm-storage defects and live birth in Drosophila females lacking spermathecal secretory cells. PLoS Biol 9, e1001192.

Schwabe, T., Bainton, R.J., Fetter, R.D., Heberlein, U., and Gaul, U. (2005). GPCR signaling is required for blood-brain barrier formation in Drosophila. Cell 123, 133–144.

Singh, S.R., and Hou, S.X. (2009). Multipotent stem cells in the malpighian tubules of adult Drosophila melanogaster. J Exp Biol 212, 413–423.

Snellman, J., and Nawy, S. (2004). cGMP-dependent kinase regulates response sensitivity of the mouse on bipolar cell. J Neurosci 24, 6621–6628.

Sözen, M.A., Armstrong, J.D., Yang, M., Kaiser, K., and Dow, J.A. (1997). Functional domains are specified to single-cell resolution in a Drosophila epithelium. Proc. Natl. Acad. Sci. U.S.A. 94, 5207–5212.

Stork, T., Engelen, D., Krudewig, A., Silies, M., Bainton, R.J., and Klämbt, C. (2008). Organization and function of the blood–brain barrier in Drosophila. J. Neurosci. 28, 587–597.

Therneau, T.M. (2015). A package for survival analysis in S.

Thomas, T., Wang, B., Brenner, R., and Atkinson, N.S. (1997). Novel embryonic regulation of Ca(2+)-activated K+ channel expression in Drosophila. Invert. Neurosci. 2, 283–291.

Tomancak, P., Beaton, A., Weiszmann, R., Kwan, E., Shu, S., Lewis, S.E., Richards, S., Ashburner, M., Hartenstein, V., Celniker, S.E. et al. (2002). Systematic determination of patterns of gene expression during Drosophila embryogenesis. Genome Biol 3, research0088.1–88.14.

Truman, J.W., Moats, W., Altman, J., Marin, E.C., and Williams, D.W. (2010). Role of Notch signaling in establishing the hemilineages of secondary neurons in Drosophila melanogaster. Development 137, 53–61.

Urquhart-Cronish, M., and Sokolowski, M.B. Gene-environment interplay in Drosophila melanogaster: Chronic nutritional deprivation in larval life affects adult fecal output. Journal of Insect Physiology.

Wang, Z., Pan, Y., Li, W., Jiang, H., Chatzimanolis, L., Chang, J., Gong, Z., and Liu, L. (2008). Visual pattern memory requires foraging function in the central complex of Drosophila. Learn. Mem. 15, 133–142.

Weber, S., Bernhard, D., Lukowski, R., Weinmeister, P., Wörner, R., Wegener, J.W., Valtcheva, N., Feil, S., Schlossmann, J., Hofmann, F. et al. (2007). Rescue of cGMP kinase I knockout mice by smooth muscle specific expression of either isozyme. Circ. Res. 101, 1096–1103.

Wolfner, M.F. (1997). Tokens of love: functions and regulation of Drosophila male accessory gland products. Insect Biochem. Mol. Biol. 27, 179–192.

Wolfner, M.F., Heifetz, Y., and Applebaum, S.W. (2005). Gonadal glands and their gene products. In comprehensive molecular insect science, L.I. Gilbert, ed. (Amsterdam: Elsevier), pp. 179–212.

Chapter 4 Figures

Figure 1: pr-Gal4 expression in adult CNS

Immunohistochemical analysis of for-pr-Gal4 driving UAS-mCD8::GFP in the adult CNS and stained with anti-GFP. A. for-pr1-4.7-Gal4 expressed in neurons in the brain lobes. B. for-pr1-4.7-Gal4 expressed in neurons in the VNC. C. for-pr2-4.0-Gal4 expressed in head tracheal air sacs. D. for-pr3-3.3-Gal4 expressed in the perineurial surface glia of the brain. E. for-pr3-3.3-Gal4 expressed in the perineurial surface glia of the VNC. F. for-pr4-2.4-Gal4 expressed in the optic lobes of the CNS.

Figure 2: pr-Gal4 expression in adult gastric system

Immunohistochemical analysis of pr-Gal4 driving UAS-mCD8::GFP and UAS-GFP::nls in the adult gastric system and stained with anti-GFP. A. for-pr2-4.0-Gal4 driving GFP::nls expressed in the crop duct. B. for-pr2-4.0-Gal4 driving GFP::nls expressed in the cardia. C. for-pr2-4.0-Gal4 driving mCD8::GFP expressed in the ureter of the malpighian tubules (possibly stem cells). D. for-pr2-4.0-Gal4 driving mCD8::GFP expressed in the midgut intestinal stem cells (ISC). E. for-pr3-3.3-Gal4 driving GFP::nls expressed in the distal anterior malpighian tubules. F. for-pr3-3.3-Gal4 driving GFP::nls expressed in the midgut anterior enterocytes. G. for-pr3-3.3-Gal4 driving mCD8::GFP expressed in the midgut enteroendocrine cells. H. for-pr3-3.3-Gal4 driving mCD8::GFP expressed in the large flat cells of the mid midgut. I. for-pr4-2.4-Gal4 driving mCD8::GFP expressed in the hindgut epithelia. J. for-pr4-2.4-Gal4 driving GFP::nls expressed in the ampulla (rectum) of the hindgut.

Figure 3: for-pr-Gal4 co-labeling

Immunohistochemical analysis of for-pr2-4.0-Gal4 driving UAS-mCD8::GFP and UAS-GFP::nls in the adult midgut. A. for-pr2-4.0-Gal4 driving mCD8::GFP expressed in the intestinal stem cells. B. anti-DELTA staining of the intestinal stem cells. C. Merge of A and B, with DNA stained with DAPI in blue. D. for-pr2-4.0-Gal4 driving GFP::nls expressed in enteroblasts. E. anti-PROSPERO staining of the enteroblasts. F. Merge of D and E.

Figure 4: for-pr-Gal4 expression in adult salivary and fat tissues

Immunohistochemical analysis of pr-Gal4 driving UAS-mCD8::GFP and UAS-GFP::nls in the adult salivary glands and fat cells stained with anti-GFP. A. for-pr2-4.0-Gal4 driving mCD8::GFP expressed in the start of the salivary duct. B. for-pr2-4.0-Gal4 driving mCD8::GFP expressed in fat cells associated with the gastric system. C. for-pr3-3.3-Gal4 driving GFP::nls expressed in the salivary gland. D. for-pr3-3.3-Gal4 driving GFP::nls expressed in fat cells associated with the gastric system. E. for-pr4-2.4-Gal4 driving mCD8::GFP expressed in the salivary duct. F. for-pr4-2.4-Gal4 driving mCD8::GFP expressed in fat cells associated with the gastric system.

Figure 5: for-pr-Gal4 expression in adult reproductive systems

Immunohistochemical analysis of for-pr-Gal4 driving UAS-mCD8::GFP and UAS-GFP::nls in the adult reproductive system and stained with anti-GFP. A. for-pr1-4.7-Gal4 driving mCD8::GFP expressed in the seminal vesicle and secondary cells of the accessory glands in the male reproductive system. B. for-pr2-4.0-Gal4 driving mCD8::GFP expressed in the eggs of the female ovaries. C. for-pr2-4.0-Gal4 driving GFP::nls expressed in the spermatheca of the female reproductive system. D. for-pr2-4.0-Gal4 driving GFP::nls expressed in the common oviduct of the female reproductive system. E. for-pr2-4.0-Gal4 driving mCD8::GFP expressed in the seminal vesicle and secondary cells of the accessory glands in the male reproductive system. F. for-pr2-4.0-Gal4 driving GFP::nls expressed in the border of the seminal vesicle and ejaculatory duct in the male reproductive system. G. for-pr4-2.4-Gal4 driving mCD8::GFP expressed in the seminal vesicle and secondary cells of the accessory glands in the male reproductive system. H. for-pr4-2.4-Gal4 driving GFP::nls expressed in the ejaculatory bulb in the male reproductive system.

Figure 6: Mapped CREs back onto the foraging locus.

In the centre of the figure is the region of the foraging gene corresponding to the cloned pr-Gal4 regions. Cloned regions are colour coded and aligned below the locus. The dashed grey lines correspond to the delineation points of the nested constructs. Mapped adult expression is below the locus. The decimal numbers within each promoter refer to the internal segment to which the expression has been mapped.

[Plot: Sucrose Responsiveness (y-axis) by genotype (x-axis)]

Figure 7: for-pr-Gal4>RNAi sucrose responsiveness

Sucrose responsiveness scores of the pr-Gal4 lines driving UAS-Dcr; UAS-forRNAi-exon7:8. Mated female flies were food deprived for 48 hours prior to testing. Sample sizes range from n=30-40 per genotype. Data was collected over 5 days of testing. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.4 table S1)
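
The summary statistic used in these captions (mean ± 95% confidence interval) can be illustrated with a short sketch. This is not the analysis code used in this thesis: the function name mean_ci and the example scores are hypothetical, and the interval uses a normal approximation, which is slightly narrower than a t-based interval at sample sizes of 30 to 40.

```python
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Return (mean, lower, upper) for a normal-approximation confidence interval.

    Assumption: observations are independent and n is large enough for the
    normal approximation; a t quantile would give a slightly wider interval.
    """
    n = len(values)
    m = mean(values)
    sem = stdev(values) / n ** 0.5             # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return m, m - z * sem, m + z * sem


# Hypothetical responsiveness scores for one genotype (not data from this thesis).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
m, lower, upper = mean_ci(scores)
print(f"mean = {m}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Genotypes whose intervals do not overlap are the ones a reader would flag as likely to differ, although the formal comparison remains the statistical test reported in the supplementary table.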

[Plot: y-axis, Lipids (µg glycerol/mg protein), scale 0 to 500]

Figure 8: Fat body>RNAi lipid levels

Lipid content of a fat body driver, ppl-Gal4, driving UAS-forcDNA in adult flies. Increasing foraging expression decreased lipid levels. Data plotted as means ± 95% confidence intervals. (Full statistics in Ch.3 table S2)

[Plots A and B: Percent Alive (y-axis) against Time (hours) (x-axis)]

Figure 9: Starvation resistance

A. Starvation resistance of mated female foraging BAC rescue and BAC overexpressor. Increasing foraging expression increased starvation resistance (p = 0.00096). A Kaplan-Meier survival analysis was used. Proportion alive plotted as a function of time. Dashed lines are 95% confidence intervals. B. Starvation resistance of mated female pr-Gal4 driving foraging cDNA. Increasing foraging expression with for-pr1-4.7-Gal4 increased starvation resistance (p = 0). A Kaplan-Meier survival analysis was used. Proportion alive plotted as a function of time. Dashed lines are 95% confidence intervals. (Full statistics in Ch.3 table S3)
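
The Kaplan-Meier estimate referred to in this caption can be sketched in a few lines. The Python below is a minimal illustration of the product-limit estimator, not the analysis used for this figure (the reference list cites the survival package of Therneau, 2015); the function name kaplan_meier and the example times are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the proportion alive.

    times  : observation time for each fly (e.g. hours of starvation)
    events : 1 if the fly died at that time, 0 if it was censored
    Returns a list of (time, proportion_alive) at each time with a death.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # deaths plus censored flies at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk flies surviving this time point.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]     # remove dead and censored flies from the risk set
    return curve


# Hypothetical starvation data: six flies, one censored at 48 hours.
times = [24, 24, 36, 48, 48, 60]
events = [1, 1, 1, 0, 1, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"{t} h: {s:.3f} alive")
```

Censored flies leave the risk set without lowering the curve, which is why the estimator is preferred over a simple count of survivors when some flies are lost before they die.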

[Western blot images, panels A and B; lane labels: wild type, foraging null, BAC rescue, BAC overexpressor]

Figure 10: BAC adult western blot analysis

A. A western blot analysis of the wild type and BAC rescue adults. foraging null larvae were used as a negative control. The BAC rescue for the most part recapitulated the wild type expression pattern. B. A western blot analysis of the wild type and BAC overexpressor adults. The BAC overexpressor showed increased expression relative to the wild type adults.

Chapter 5

General Discussion

Deciphering foraging's pleiotropic effects

The foraging gene of Drosophila melanogaster is an important example of a gene with natural alleles that result in individual differences in behaviour. foraging is a pleiotropic gene that influences multiple developmental and physiological phenotypes with consequences for the animal's behaviour. The foraging gene is a good candidate to study the mechanisms of pleiotropy, since it has previously been characterized to produce multiple isoforms and is expressed in multiple tissues across development. I hypothesized that the multiple phenotypes of the foraging alleles were achieved through independent regulation of its isoforms' expression. I characterized foraging’s gene structure, its products, their regulation, and their function in Drosophila melanogaster. I focused on feeding-related phenotypes, as this is where most of the work on the foraging gene has occurred. This analysis could be extended to any of the other phenotypes associated with this gene. In this thesis, I conducted an in-depth analysis of foraging's gene products, characterizing its transcription start sites, its transcription termination site, and the alternative splicing patterns of its transcripts (Chapter 2). I generated a null mutation of the foraging gene using homologous recombination (HR), and characterized its multiple phenotypic effects on feeding-related traits and viability. I used recombineering to generate a genomic rescue construct of the entire locus, which rescued the mutant phenotypes (Chapter 2). I generated multiple cloned foraging-Gal4 constructs and identified many regions containing cis-regulatory elements in the foraging locus that drove unique spatial and temporal expression (Chapters 3 and 4). Comparing the phenotypes of the natural allelic variants of the gene with those of the mutants and rescue constructs suggested that many of foraging's phenotypes are differentially regulated.
I also identified isoform-specific functions of the foraging gene products using isoform-specific RNAi to alter foraging’s expression (Chapter 3). Knowledge of the mechanistic basis underlying the generation of foraging’s mRNA and protein products permits investigation of the means by which these products influence foraging’s phenotypes. These experiments are essential for investigating the molecular underpinnings of pleiotropy and the natural variation between rovers and sitters and will be discussed further below.

foraging’s gene structure

Sequencing the foraging locus of our rover and sitter lines identified 425 non-identical sites that included SNPs and InDels. The variations were enriched in the introns of the locus, and there were no non-synonymous mutations in the cGMP-binding or kinase domains of the protein. Two amino acid differences were found in the N-terminus of the P1 isoform. Other isogenized strains in the lab share these two rover-like amino acids, but are phenotypically sitter-like, suggesting that neither of these two amino acid differences is the causal variant responsible for the rover/sitter phenotypic differences. I also found high variation in intron sequence, and minimal amino acid variation in the DGRP population. A selection analysis with PAML of the coding sequence of foraging for 12 Drosophila genomes found strong evidence for purifying selection. Together these data strongly suggest that the underlying allelic differences in our rover and sitter strains are regulatory. To characterize the regulation of the foraging gene, I identified four unique promoters in the locus. When the gene was first characterized (Kalderon and Rubin, 1989), only one promoter had been verified. Subsequent genome-wide promoter identification studies supported our data for the remaining transcription start sites (Hoskins et al., 2011). The foraging promoters included 3 peaked and 1 broad TSS. Peaked promoters associate with more regulated expression patterns. Broad promoters associate with more constitutive expression patterns. Promoters 1, 2, and 4 were all peaked, whereas promoter 3 was broad. All promoters of foraging had strong matches to the Inr (Smale and Baltimore, 1989) and DPE (Burke and Kadonaga, 1996) sequences, and none appeared to have TATA (Lifton et al., 1978) boxes. foraging had been previously characterized as producing multiple transcript and protein isoforms (Kalderon and Rubin, 1989; Stapleton et al., 2002). Here I identify additional complexity.
I identified and verified 21 unique transcripts which represented 9 distinct open reading frames. Several of these newly identified variants were predicted to code for proteins with truncated N-termini. The N-terminus of a PKG is pivotal in regulating its substrate specificity (Pearce et al., 2010). I found no splicing differences between rover and sitter strains, suggesting that the allelic differences are likely in the regulation of expression level of the transcripts. All transcripts shared the four 3’-most exons and one 3’-transcription termination site. The coding regions for the cGMP binding and kinase domains were in these shared exons, and as such all proteins share a C-terminus. foraging's orthologs share a complex gene structure with multiple promoters; this is found in several other taxa, including other Drosophila species, C. elegans, and mammals including mice and humans (Clark et al., 2007; Ørstavik et al., 1997; Stansberry et al., 2001). The “style” of this complexity does differ somewhat between taxa. The other Drosophila species have a very similar foraging gene structure to that in D. melanogaster, but there is a trend for larger introns as you move further out in the phylogeny (data not shown). The P4 protein isoform, generated by a retained intron, is only present in the melanogaster subgroup. More distant Drosophila species have an in-frame stop codon within this region (data not shown). The current annotation of egl-4, foraging’s orthologue in C. elegans, has 7 promoters and 16 transcripts (wormbase.org). The human ortholog, prkg1, has 5 promoters and 5 transcripts (ensembl.org). Having such in-depth characterization of the TSSs and splicing patterns of foraging and its orthologs could allow for interesting evolutionary comparisons of its gene structure. Questions of interest are: do foraging’s orthologs use similar cis-regulatory elements in their promoters? Are there similar splicing patterns seen across taxa? Answers to such questions may indicate if foraging has had a complex structure since its origin, or if foraging and its orthologues have converged on this complex structure by some inherited tendency to have a high mutation rate.
This high level of gene structure complexity generated many possibilities for the differential regulation of foraging’s gene products. Numerous recent advances in Drosophila molecular genetics have made a dissection of foraging’s complexity possible.

Genetic dissection of the locus

The number of sequence differences between the rover and sitter strains made a genetic dissection of both alleles intractable. As a result I focused my efforts on our sitter strain. Using homologous recombination-mediated gene targeting, I generated a null mutation of the foraging gene and showed that this mutation resulted in altered foraging behaviour, ingestion behaviour, and DAG/TAG accumulation. I generated a genomic rescue construct of the entire locus, with recombineering. This construct was sufficient to rescue the null mutation in these phenotypes. Based on the extent of the BAC, more distal regulatory elements did not appear to be necessary for these phenotypes. This suggested that the foraging locus contains sufficient regulatory elements to drive expression of foraging products in the relevant tissues for these phenotypes. I conclusively showed the involvement of the foraging gene in these phenotypes. The BAC data, in conjunction with the lack of splicing variation and minimal amino acid variation between rover and sitter alleles, suggested that the allelic differences between rovers and sitters are regulatory and within the locus. This work represents a significant advancement in the study of the foraging gene and its influence on feeding-related phenotypes. Furthermore, the data presented in this thesis suggest that the classical view of a causal SNP resulting in rovers having higher foraging expression than sitters, which in turn causes all of the phenotypic differences, is too simplistic. We have seen that for foraging path length behaviour, rovers likely do have higher foraging gene expression, yet for food intake, our data suggest that rovers have lower foraging expression than sitters. This finding alone shows that foraging’s pleiotropy, with its multiple associated phenotypes from the natural variants, arises from independent regulation of its gene products. Consequently, variation in foraging behaviour, food intake, and lipid levels likely arises from tissue-specific expression of foraging. Many genes influencing behaviour and metabolism are typically vital and can be mutated to lethality (Hall, 1994). Our work has shown that the stage of lethality seen in the foraging null mutant is pharate adult.
This stage of lethality is consistent with the lethal stage of all other partial deletions of the foraging locus and is supported by our complementation analysis. Reducing foraging expression ubiquitously with RNAi also resulted in lethality at the pharate adult stage. We restricted the window in which foraging is required for survivorship to early/mid pupal development through the use of a GAL80ts transgene (Ina Anreiter, unpublished). These experiments narrowed down the time when foraging is required for viability, but did not establish where foraging must be expressed for survivorship to adulthood. Other reports suggest that foraging may be required in hedgehog-expressing cells of the wing disc during pupal development in order to survive to adulthood (Swarup et al., 2015). Future experiments will test this hypothesis by driving foraging RNAi in the developing wing discs.

Transcriptional regulation

Distal regulatory elements are more common than previously thought, and can be present on separate chromosomes from the gene of interest (Sexton et al., 2012). Distal regulatory elements of foraging do not appear to be necessary for foraging behaviour, food intake and DAG/TAG accumulation, as our genomic BAC rescues the null phenotypes. foraging may still have distal regulatory elements that influence other aspects of the gene’s expression and function. To characterize the expression of foraging’s promoters I employed a promoter bashing strategy. I identified a broad set of tissues where foraging's promoters may be expressed. I did not see ubiquitous expression from any of these promoters. Rather, I saw expression from the promoters in most tissue systems and these promoters expressed in discrete cells of these tissue systems. These expression patterns are supported by this and other studies (Chintapalli et al., 2007; Graveley et al., 2011). Many of these expression patterns, with respect to cell type, were consistent between the larva and the adult fly. Moreover, these shared expression patterns were driven by the same regions of the locus across development. Similar consistencies across development were seen when promoter bashing of the slowpoke, pdf, and paramyosin genes was performed (Brenner and Atkinson, 1996; Park et al., 2000; Arredondo et al., 2001). In addition, previous studies have noted that transcription start sites and core promoter elements are also shared throughout development (Hoskins et al., 2011). Our for-pr-Gal4s may not have encapsulated all of the regulatory elements from the locus, as driving foraging-RNAi with them did not alter the amount of FOR on a whole larva western blot. Given that the cloned regions covered less than half of the foraging gene, this may not be surprising.
Among transcription factor binding sites annotated with modENCODE ChIP-Seq data from embryos, we do see experimentally identified sites that lie outside of our cloned regions (Nègre et al., 2011; Ch 3, fig. 12). The combinatorial nature of some cis-regulatory elements may also be a factor. I may have cloned most of the regulatory elements, but testing them out of the context of the gene could have resulted in their inability to communicate with their “native” TSS. Such CRE-promoter specific interactions have been seen in the case of string (Lehman et al., 1999). To get around these two problems, I have generated a foraging::GFP BAC with recombineering and efforts are underway to generate a common coding foraging-Gal4 BAC. Future expression analyses of these lines should identify the missing tissues not recovered with our cloned promoter-Gal4s. The identification of regulatory elements in foraging provided evidence about where the gene products are expressed, and the location of the DNA elements causing their expression. This allowed us to narrow the search for the causal DNA sequences underlying a given phenotypic difference. The value of this type of analysis is not limited to within-species comparisons. Many studies have swapped regulatory elements between closely related species, resulting in differential expression and morphology (Arredondo et al., 2001; Gompel et al., 2005; Frankel et al., 2011). Previous work in the lab has found larger intrastrain differences in foraging path length for Drosophila simulans compared to Drosophila melanogaster (Sokolowski and Hansell, 1983). Similarly, Drosophila pseudoobscura has been characterized as having longer foraging path lengths than Drosophila melanogaster (Bauer and Sokolowski, 1984). Finding regulatory differences in the foraging gene that result in phenotypic differences between species is an interesting avenue for research.

Promoter specific function

I used isoform-specific RNAi driven by a ubiquitous driver to knock down a subset of foraging’s products. Targeting of exon 4 and 5, the first protein coding exon of the P1 and P3 protein isoforms, was sufficient to alter foraging behaviour. This allowed me to conclude that the P1 or P3 protein isoform plays a role in foraging behaviour in larvae. In the future, the critical isoform could be further narrowed down by using other available RNAi lines that only target the start of exon 4, and therefore only reduce P1 coding transcripts (10033R-1 and 10033R-2; NIG-Fly, National Institute of Genetics, Kyoto Japan). These lines could be used to test for the requirement of P3 in this phenotype. Conversely, I have generated genomic BACs with promoter 1 and promoter 2 each deleted separately; these are the only promoters to produce a P1 isoform. Because the wild-type BAC is sufficient to rescue the foraging path length of the null mutant, testing these promoter deletion mutants will further enable us to identify which, if any, fail to complement the null mutant for larval foraging path length behaviour. To date, the only experiments that have been successful in transgenically manipulating foraging behaviour employed a ubiquitous decrease (this study), or a ubiquitous increase (Osborne et al., 1997) of foraging expression. Despite the expression patterns seen in our for-pr-Gal4 lines, they are not sufficient to alter foraging behaviour in the larvae. Multiple other tissue-specific drivers, including pan-neuronal and muscle-specific, have been tested in the past with overexpression and yielded no effect on larval foraging behaviour (Belay, 2010). As with the expression characterization, the recombineered foraging-Gal4 BAC may be necessary in order to identify the relevant tissue. Since the wild-type BAC is sufficient to rescue foraging behaviour, a foraging-Gal4 BAC driving RNAi should be able to identify candidate tissues required for foraging behaviour. Our promoter bashing experiments identified a region containing a cis-regulatory element that drove expression in the fat body of the larva and adult. Previous experiments supported our finding of foraging in the fat body of the larva (Chintapalli et al., 2007; Belay, 2010; Graveley et al., 2011). In this study, we were able to reduce foraging expression specifically in the fat body and increase lipid content in larvae and adults. Thus, I have located a region of DNA in the locus that is responsible for driving expression in a tissue, and manipulating foraging in this tissue elicits an effect on lipid levels. We now have a more restricted region, with respect to the gene, as well as tissue, to look for potential differences in any natural alleles of the foraging gene. Our rover and sitter strains have multiple SNPs and an InDel in this region. Future studies should generate rover and sitter versions of this regulatory region directly fused to foraging coding sequence. By crossing these transgenics into the null mutant background, we could test if they differentially rescue the lipid phenotype.
Expression of foraging in adipose tissue is well characterized in mammals (Hofmann, 2006) and multiple FOR isoforms are expressed in the fat body (Belay, 2010). Consequently, the CRE identified in the current study may not be restricted to regulating just one isoform. In collaborative experiments in the lab, Dr. Jeff Dason has demonstrated a role for foraging at the larval neuromuscular junction (NMJ). Specifically, the foraging null mutant had increased synaptic transmission and increased nerve terminal growth. The transgenic foraging BAC rescued these phenotypes. Knocking down foraging in both the presynaptic neurons and the postsynaptic muscle increased synaptic transmission and had no effect on nerve terminal growth. Knocking down foraging in glia increased nerve terminal growth. These experiments are vastly expanding our understanding of the previously characterized rover/sitter differences in these phenotypes (Renger et al., 1999). Here we see that even at the NMJ, foraging is required in different tissues to achieve its pleiotropic effects. In other collaborative experiments in the lab led by Dr. Jeff Dason, expression of channelrhodopsin with the for-pr1-4.7-Gal4 elicited larval curling and rolling behaviour reminiscent of the larval nociceptive response (Tracey et al., 2003). This behaviour is like the escape behaviour seen when a larva is under attack from a parasitoid wasp (Hwang et al., 2007). We have previously seen encapsulation differences between the rover and sitter larvae when attacked by the parasitic wasp Asobara tabida (Hughes and Sokolowski, 1996). The cells in the for-pr1-4.7-Gal4 pattern then represent putative candidates for foraging involvement in this nociceptive behaviour. Our for-pr-Gal4 fragments were not sufficient to alter sucrose responsiveness of the adult fly. Previous studies have shown rover/sitter differences in sucrose responsiveness. Work currently underway in the lab investigates a role for the corpora cardiaca (CC) in this sucrose responsiveness. Altering foraging expression in CC cells alone using foraging RNAi was sufficient to alter sucrose responsiveness (Hughson, unpublished). Our for-pr-Gal4s did not express in the CC. I was able to alter starvation resistance by overexpressing the gene both with the genomic BAC and with for-pr1-4.7-Gal4. The expression patterns reported in this thesis generate new hypotheses for the involvement of foraging in novel phenotypes. Expression of foraging in the copper cells of the gut may influence acid secretion. Rovers and sitters differed in acidity levels in this region of the gut (Kaun, unpublished). Furthermore, expression in the visceral muscle may affect peristalsis of nutrients through the gut, and indeed some mutants of foraging have previously been characterized for an increase in gut passage time (Douglas, unpublished).
foraging’s orthologue in mice functions in smooth muscle (Lohmann et al., 1997; Hofmann et al., 2006; Hofmann et al., 2009) and affects gut passage time (Weber et al., 2007).

Rovers and sitters are known to differ in egg laying (Burns et al., 2012). This could result from foraging expression in the spermatheca. Previous studies have supported a role for the spermatheca in egg laying (Schnakenberg et al., 2011). This rover/sitter difference in egg laying may also be due to differences in the male accessory glands, as their secretions have drastic effects on female post-mating behaviours (Gligorov et al., 2013; Wolfner, 1997).

There are interesting interactions between cell types in the adult midgut. The midgut undergoes apoptosis, shortening its length, during food deprivation. Upon refeeding, the intestinal stem cells (ISCs) proliferate and the gut regrows. The JAK/STAT- and EGFR-induced proliferation of ISCs is dependent on diet and insulin receptor-mediated signalling (Choi et al., 2011). ISC proliferation is dependent on visceral muscle-derived dilp3 secreted to the ISCs (O’Brien et al., 2011). Trachea-derived dpp also affects ISC proliferation in the midgut (Li et al., 2013). PKG has already been shown to be involved in JAK/STAT signalling (Huang et al., 2005). With foraging expressed in all of these tissues, it may play a role in phenotypes associated with nutrient-dependent midgut homeostasis.

Other avenues of regulation

In this study I focused on transcriptional regulation. I was interested in finding the CREs associated with promoters that drive expression in distinct tissues. However, there are many other means of differentially regulating the expression of a gene. They are discussed in turn below.

Promoter-proximal pausing

Promoter-proximal pausing occurs when the RNA polymerase II enzyme pauses its progression during transcription near the transcription start site (Wu and Snyder, 2008). This additional step can delay the production of the transcript until it is needed. ChIP-Seq data available from the modENCODE database provide evidence of promoter-proximal pausing at the foraging gene (Nègre et al., 2011). By pulling down RNA polymerase II, the authors were able to identify an enrichment of reads mapping to the foraging gene near its transcription start sites (S.fig. 2); this is indicative of promoter-proximal pausing (Price, 2008). This represents yet another interesting avenue of potential regulation of the foraging gene. Promoter-proximal pausing is associated with environmentally and developmentally responsive genes (Adelman and Lis, 2012; Zeitlinger et al., 2007). The rover and sitter strains are known to be differentially responsive to the environment (Graf and Sokolowski, 1989; Kaun et al., 2008; Kent et al., 2009). Conducting GRO-qPCR and ChIP-qPCR experiments in our rover and sitter strains might reveal different extents of promoter-proximal pausing.
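One common way to quantify the pausing described above is a pausing (or stalling) index: the density of RNA polymerase II signal in a window around the transcription start site divided by the density over the gene body. The sketch below is illustrative only. The function name, the window sizes, and the read counts are hypothetical and are not taken from the modENCODE data or from the experiments in this thesis.

```python
def pausing_index(promoter_reads, promoter_len, body_reads, body_len):
    """Ratio of Pol II read density near the TSS to read density over the gene body."""
    if promoter_len <= 0 or body_len <= 0:
        raise ValueError("window lengths must be positive")
    promoter_density = promoter_reads / promoter_len
    body_density = body_reads / body_len
    if body_density == 0:
        # No signal over the gene body: the ratio is unbounded.
        return float("inf")
    return promoter_density / body_density


# Hypothetical counts: 300 reads in a 300 bp promoter window,
# 1000 reads over 10 kb of gene body.
example_index = pausing_index(300, 300, 1000, 10_000)
```

Under these assumptions, a higher index at one promoter in one strain than in the other would be the kind of difference the proposed GRO-qPCR and ChIP-qPCR experiments could look for.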

Post-transcriptional regulation of foraging

Once transcription is completed, mRNAs are bound by RNA binding proteins (RBPs) and are post-transcriptionally regulated. This regulation can alter the localization and stability of the transcripts (Keene, 2007). By altering the combination of RBPs, different sets of mRNAs involved in a particular pathway can be co-regulated to allow for a rapid response to environmental stimuli (Keene and Lager, 2005; Keene and Tenenbaum, 2002). An RNA operon/regulon consists of a group of mRNAs encoding functionally related products that are post-transcriptionally co-regulated by RBPs and microRNAs to form ribonucleoprotein complexes (Keene, 2007). The PUF RBP family is well characterized for its role in post-transcriptional RNA regulons. In S. cerevisiae, the five PUF proteins bind to more than 10% of the transcriptome (Gerber et al., 2004). PUMILIO (PUM), a PUF RBP in Drosophila melanogaster, is involved in embryo development and can promote the degradation of the transcripts it binds (Lehmann and Nüsslein-Volhard, 1991). Drosophila PUM has also been shown to associate with many mRNAs coding for functionally related proteins (Gerber et al., 2006). PUM binds an 8-nucleotide core consensus sequence, known as a nanos-response element (NRE; Gerber et al., 2006).

Post-transcriptional regulation represents another avenue of regulation of the foraging gene. The mRNA of foraging’s orthologue in C. elegans, egl-4, was shown to be bound by PUM at its 3’-UTR (Kaye et al., 2009). There are eight potential consensus sequences in foraging’s 3’-UTR that may be bound by PUM. I hypothesized that PUM post-transcriptionally regulates foraging in the fly; specifically, that decreased PUM activity could result in higher foraging expression, which in turn could result in increased larval path length. I tested larvae carrying a P-element insertion in the pumilio gene for foraging expression and foraging path length behaviour (S.fig. 3A).
I saw a non-significant trend towards higher foraging mRNA expression levels in this mutant (S.fig. 3B; p=0.105). The insertion also significantly increased path length relative to the control (S.fig. 3A). I did not, however, see any sequence variation between rovers and sitters at the putative NREs in foraging’s 3’-UTR. Nevertheless, pumilio is a strong candidate for the post-transcriptional regulation of foraging, independent of our natural alleles. As with promoter-proximal pausing, post-transcriptional regulation is another means by which foraging may be responsive to environmental stimuli.
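A scan for candidate PUM binding sites of the kind described above can be sketched as follows. The 8-nucleotide core motif used here (UGUANAUA, written as TGTANATA in the DNA alphabet) is an assumption made for illustration, as are the function name and the toy sequence. The thesis does not list the exact consensus it searched for, and the toy sequence is not the foraging 3’-UTR.

```python
import re

# Assumed core consensus in the DNA alphabet; N matches any base.
PUM_CORE = "TGTANATA"


def find_motif_sites(sequence, motif=PUM_CORE):
    """Return 0-based start positions of all (possibly overlapping) motif matches."""
    # Accept RNA or DNA input by converting U to T.
    dna = sequence.upper().replace("U", "T")
    pattern = "".join("[ACGT]" if base == "N" else base for base in motif)
    # A zero-width lookahead lets overlapping sites be reported.
    return [match.start() for match in re.finditer(f"(?=({pattern}))", dna)]


# Toy sequence for illustration only.
toy_utr = "AATGTACATACCGTTGTAAATAGG"
sites = find_motif_sites(toy_utr)
```

Run on a real 3’-UTR, the same scan would list candidate NRE positions that could then be compared between rover and sitter sequences.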

Differential protein function

foraging produces multiple different protein isoforms. I have identified 9 predicted FOR proteins that differ only in their N-termini, a crucial region of PKGs that is involved in substrate specificity and relative kinase activity (Pearce et al., 2010). Based on our western blot results, it seems that the majority of the FOR protein isoforms produced in the larva and adult fly are P1, P3, or their derivatives. There are more than 8 distinct immunoreactive bands on a western blot, which suggests the potential for a significant level of post-translational modification. In vitro studies conducted in our lab have shown that E. coli-expressed P1-4 protein isoforms can have very different enzyme activities (So, unpublished). Isoform-specific kinase activity has been seen for a subset of these isoforms when overexpressed in Malpighian tubules (MacPherson et al., 2004b). The same study showed that these isoforms have different subcellular localizations (MacPherson et al., 2004b). Furthermore, changes in the subcellular localization of a PKG isoform have previously been characterized in both mammalian cell cultures and C. elegans (Gudi et al., 1997; O’Halloran et al., 2012). The differential binding and activity levels of these isoforms, and their differential subcellular localizations, add yet another layer of complexity that can contribute to how foraging achieves its pleiotropic functions.

Pleiotropy

Pleiotropy is loosely defined as the manifold effects of a piece of genetic information at the phenotypic level. “Selectional pleiotropy”, as used by evolutionary biologists and population geneticists, views a mutation as the genetic unit of pleiotropic function. It is a very useful framework for thinking about the adaptive or selective advantage of a novel mutation over its predecessor. In this context, it is useful to think of differential regulation as an escape from, or avoidance of, the constraint of pleiotropy, as two regulatory mutations can be recombined away from each other. However, in the context of molecular genetics it is more useful to think of the gene as being pleiotropic, since it is the gene products that perform the biochemical functions in the cells of an animal to elicit phenotypic effects. As previously stated in this thesis, I use the definition of “molecular gene pleiotropy”. Consequently, I view the gene, and not the mutation, as the unit at the genetic level that has manifold effects on phenotype. I view a mutant, natural or otherwise, as a tool to interrogate gene function.

I hypothesize that foraging’s pleiotropy is achieved through independent regulation of its gene products. I was able to verify the independent regulation of many of foraging’s phenotypes by a combination of Hirschian- and Benzerian-inspired approaches. I used these approaches to investigate our natural rover and sitter alleles and to generate a null allele of the foraging locus. By comparing the rover and sitter strains, and the null, rescue, and overexpressor mutants, I concluded that foraging behaviour and food intake are differentially regulated. Rovers have longer path length and lower food intake than sitters. Higher expression of the foraging gene results in longer path length and higher food intake. Thus, rovers must have higher expression in the cells responsible for path length, and lower expression in the cells responsible for food intake. The foraging gene influences both of these phenotypes, but clearly must function in different cells to exert its influence. Furthermore, manipulating foraging expression in the fat body was sufficient to alter DAG/TAG levels. Preliminary data suggested that altered foraging expression in the fat body is not sufficient to alter path length behaviour, but this needs follow-up in the future. Lipid levels and foraging behaviour may therefore also be independently regulated. These data together suggest that foraging behaviour, food intake, and lipid levels are all influenced by foraging, but have different spatial or temporal requirements for foraging’s expression. As such, I conclude that foraging’s pleiotropy is achieved by the differential regulation of its gene products.

I have identified expression patterns for the multiple promoters of the foraging locus. These expression patterns could underlie numerous potential functions for the locus.
In this study, I was able to identify the role of a few specific isoforms in some of the phenotypes I investigated. Other work in the lab has identified other tissues, some of which were seen in our expression analysis, in which foraging is required for rover and sitter phenotypes (discussed above). These data suggest that there are multiple independent tissues in which foraging is expressed to elicit its multiple phenotypes. As such, foraging represents an example of genuine pleiotropy. Genuine pleiotropy occurs when a gene has independent functions that produce its phenotypic effects. This is in contrast to spurious pleiotropy, where a gene has only one function that produces a phenotype, which in turn leads to another phenotype. This can be especially common if the phenotypes are at different levels of biological organization. Further experiments into the exact tissues and isoforms required for these phenotypes will identify the precise nature of foraging’s pleiotropy.

These data suggest that foraging has both a complex gene structure and complex regulation. The shared attribute of promoters 1, 2, and 4 being peaked, along with the closely linked expression in the male accessory glands, suggests that these promoters may have a common origin. The multiple promoters of the foraging gene may have arisen from multiple sub-gene level duplications. Neofunctionalization occurs when a gene duplicates, allowing one of the paralogues to drift into new roles. Escape from adaptive conflict occurs when a gene that already has multiple functions duplicates, allowing the paralogues to separate the functions (Des Marais and Rausher, 2008). Investigations into the origins of the multiple isoforms of foraging, combined with extension of these analyses to the other PKG paralogues, will further our understanding of why foraging is seen to be so pleiotropic in so many different taxa.

Final thoughts

The results of this thesis are essential for understanding the molecular underpinnings of the natural variation between rovers and sitters. The identification of the foraging gene products that are necessary in particular tissues for foraging’s associated phenotypes will facilitate the identification of the causal sequence difference between the rover and sitter alleles. The question of how foraging achieves its pleiotropy, and the origin of its multiple promoters, may also shed light on the evolutionary origins of rover and sitter. This study has focused on one aspect of the transcriptional regulation of foraging. The post-transcriptional regulation of foraging, its translational regulation, and the differing biochemistries of its isoforms represent many other avenues of research into the pleiotropic functions of foraging. The complicated structure and regulation of the Drosophila melanogaster foraging gene make it an ideal model for the study of the mechanistic underpinnings of pleiotropic gene function in feeding-related behaviours and metabolism, as well as many other phenotypes. The extent of a gene's pleiotropic effects can have large consequences for the rate and path of its evolution, as well as for identifying and treating risk alleles in disease. Identifying these mechanisms may shed light on the evolutionary origins of not only the naturally occurring strategies of the rover and sitter morphs, but of the foraging gene itself. Whether the complex gene structure and regulation of foraging is an example of neofunctionalization or escape from adaptive conflict could further our understanding of this gene and its conserved functions across taxa.

References:

Adelman, K., and Lis, J.T. (2012). Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat Rev Genet 13, 720–731.

Arredondo, J.J., Ferreres, R.M., Maroto, M., Cripps, R.M., Marco, R., Bernstein, S.I., and Cervera, M. (2001). Control of Drosophila paramyosin/miniparamyosin gene expression: Differential regulatory mechanism for muscle-specific transcription. J. Biol. Chem. 276, 8278–8287.

Bauer, S.J., and Sokolowski, M.B. (1984). Larval foraging behavior in isofemale lines of Drosophila melanogaster and D. pseudoobscura. J. Hered. 75, 131–134.

Belay, A.T. (2010). Cellular components of naturally varying behaviours in the fruit fly, Drosophila melanogaster. Thesis. University of Toronto (http://hdl.handle.net/1807/19022).

de Belle, J.S., Hilliker, A.J., and Sokolowski, M.B. (1989). Genetic localization of foraging (for): a major gene for larval behavior in Drosophila melanogaster. Genetics 123, 157–163.

Brenner, R., and Atkinson, N. (1996). Developmental- and eye-specific transcriptional control elements in an intronic region of a Ca(2+)-activated K+ channel gene. Dev. Biol. 177, 536–543.

Burns, J.G., Svetec, N., Rowe, L., Mery, F., Dolan, M.J., Boyce, W.T., and Sokolowski, M.B. (2012). Gene-environment interplay in Drosophila melanogaster: chronic food deprivation in early life affects adult exploratory and fitness traits. PNAS 109 Suppl 2, 17239–17244.

Chintapalli, V.R., Wang, J., and Dow, J.A.T. (2007). Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat. Genet. 39, 715–720.

Choi, N.H., Lucchetta, E., and Ohlstein, B. (2011). Nonautonomous regulation of Drosophila midgut stem cell proliferation by the insulin-signaling pathway. PNAS 108, 18702–18707.

Clark, A.G., Eisen, M.B., Smith, D.R., Bergman, C.M., Oliver, B., Markow, T.A., Kaufman, T.C., Kellis, M., Gelbart, W., Iyer, V.N. et al. (2007). Evolution of genes and genomes on the Drosophila phylogeny. Nature 450, 203–218.

Des Marais, D.L., and Rausher, M.D. (2008). Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454, 762–765.

Frankel, N., Erezyilmaz, D.F., McGregor, A.P., Wang, S., Payre, F., and Stern, D.L. (2011). Morphological evolution caused by many subtle-effect substitutions in regulatory DNA. Nature 474, 598–603.

Gerber, A.P., Herschlag, D., and Brown, P.O. (2004). Extensive association of functionally and cytotopically related mRNAs with Puf family RNA-binding proteins in yeast. PLoS Biol. 2, E79.

Gerber, A.P., Luschnig, S., Krasnow, M.A., Brown, P.O., and Herschlag, D. (2006). Genome-wide identification of mRNAs associated with the translational regulator PUMILIO in Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. 103, 4487–4492.

Gligorov, D., Sitnik, J.L., Maeda, R.K., Wolfner, M.F., and Karch, F. (2013). A novel function for the hox gene Abd-B in the male accessory gland regulates the long-term female post-mating response in Drosophila. PLoS Genet. 9, e1003395.

Gompel, N., Prud’homme, B., Wittkopp, P.J., Kassner, V.A., and Carroll, S.B. (2005). Chance caught on the wing: cis-regulatory evolution and the origin of pigment patterns in Drosophila. Nature 433, 481–487.

Graf, S.A., and Sokolowski, M.B. (1989). Rover/sitter Drosophila melanogaster larval foraging polymorphism as a function of larval development, food-patch quality, and starvation. J. Insect Behav. 2, 301–313.

Graveley, B.R., Brooks, A.N., Carlson, J.W., Duff, M.O., Landolin, J.M., Yang, L., Artieri, C.G., van Baren, M.J., Boley, N., Booth, B.W. et al. (2011). The developmental transcriptome of Drosophila melanogaster. Nature 471, 473–479.

Gruneberg, H. (1938). An analysis of the “pleiotropic” effects of a new lethal mutation in the rat (Mus norvegicus). Proceedings of the Royal Society of London. Series B, Biological Sciences 125, 123–144.

Gudi, T., Lohmann, S.M., and Pilz, R.B. (1997). Regulation of gene expression by cyclic GMP-dependent protein kinase requires nuclear translocation of the kinase: identification of a nuclear localization signal. Mol. Cell. Biol. 17, 5244–5254.

Hall, J.C. (1994). Pleiotropy of behavioral genes. In Flexibility and constraints in behavioral systems, C.P. Kyriacou and R.J. Greenspan, eds. (John Wiley and Sons Ltd), pp. 15–27.

Hodgkin, J. (1998). Seven types of pleiotropy. Int. J. Dev. Biol. 42, 501–505.

Hofmann, F., Feil, R., Kleppisch, T., and Schlossmann, J. (2006). Function of cGMP-dependent protein kinases as revealed by gene deletion. Physiological Reviews 86, 1–23.

Hofmann, F., Bernhard, D., Lukowski, R., and Weinmeister, P. (2009). cGMP regulated protein kinases (cGK). In cGMP: Generators, effectors and therapeutic implications, H.H.H.W. Schmidt, F. Hofmann, and J.-P. Stasch, eds. (Springer Berlin Heidelberg), pp. 137–162.

Hoskins, R.A., Landolin, J.M., Brown, J.B., Sandler, J.E., Takahashi, H., Lassmann, T., Yu, C., Booth, B.W., Zhang, D., Wan, K.H. et al. (2011). Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res. 21, 182–192.

Huang, J.-S., Chuang, L.-Y., Guh, J.-Y., Chen, C.-J., Yang, Y.-L., Chiang, T.-A., Hung, M.-Y., and Liao, T.-N. (2005). Effect of nitric oxide-cGMP-dependent protein kinase activation on advanced glycation end-product–induced proliferation in renal fibroblasts. JASN 16, 2318–2329.

Hughes, K., and Sokolowski, M.B. (1996). Natural selection in the laboratory for a change in resistance by Drosophila melanogaster to the parasitoid wasp Asobara tabida. J. Insect Behav. 9, 477–491.

Hwang, R.Y., Zhong, L., Xu, Y., Johnson, T., Zhang, F., Deisseroth, K., and Tracey, W.D. (2007). Nociceptive neurons protect Drosophila larvae from parasitoid wasps. Curr. Biol. 17, 2105–2116.

Kalderon, D., and Rubin, G.M. (1989). cGMP-dependent protein kinase genes in Drosophila. J. Biol. Chem. 264, 10738–10748.

Kaun, K.R., Chakaborty-Chatterjee, M., and Sokolowski, M.B. (2008). Natural variation in plasticity of glucose homeostasis and food intake. J. Exp. Biol. 211, 3160–3166.

Kaye, J.A., Rose, N.C., Goldsworthy, B., Goga, A., and L’Etoile, N.D. (2009). A 3’UTR pumilio-binding element directs translational activation in olfactory sensory neurons. Neuron 61, 57–70.

Keene, J.D. (2007). RNA regulons: coordination of post-transcriptional events. Nat Rev Genet 8, 533–543.

Keene, J.D., and Lager, P.J. (2005). Post-transcriptional operons and regulons co-ordinating gene expression. Chromosome Res. 13, 327–337.

Keene, J.D., and Tenenbaum, S.A. (2002). Eukaryotic mRNPs may represent posttranscriptional operons. Mol. Cell 9, 1161–1167.

Kent, C.F., Daskalchuk, T., Cook, L., Sokolowski, M.B., and Greenspan, R.J. (2009). The Drosophila foraging gene mediates adult plasticity and gene–environment interactions in behaviour, metabolites, and gene expression in response to food deprivation. PLoS Genet. 5, e1000609.

Lehman, D.A., Patterson, B., Johnston, L.A., Balzer, T., Britton, J.S., Saint, R., and Edgar, B.A. (1999). Cis-regulatory elements of the mitotic regulator, string/Cdc25. Development 126, 1793–1803.

Lehmann, R., and Nüsslein-Volhard, C. (1991). The maternal gene nanos has a central role in posterior pattern formation of the Drosophila embryo. Development 112, 679–691.

Li, Z., Zhang, Y., Han, L., Shi, L., and Lin, X. (2013). Trachea-derived Dpp controls adult midgut Homeostasis in Drosophila. Developmental Cell 24, 133–143.

Lohmann, S.M., Vaandrager, A.B., Smolenski, A., Walter, U., and De Jonge, H.R. (1997). Distinct and specific functions of cGMP-dependent protein kinases. Trends Biochem. Sci. 22, 307–312.

MacPherson, M.R., Lohmann, S.M., and Davies, S.-A. (2004). Analysis of Drosophila cGMP-dependent protein kinases and assessment of their in vivo roles by targeted expression in a renal transporting epithelium. J. Biol. Chem. 279, 40026–40034.

Manning, G., Plowman, G.D., Hunter, T., and Sudarsanam, S. (2002). Evolution of protein kinase signaling from yeast to man. Trends in Biochemical Sciences 27, 514–520.

Nègre, N., Brown, C.D., Ma, L., Bristow, C.A., Miller, S.W., Wagner, U., Kheradpour, P., Eaton, M.L., Loriaux, P., Sealfon, R. et al. (2011). A cis-regulatory map of the Drosophila genome. Nature 471, 527–531.

O’Brien, L.E., Soliman, S.S., Li, X., and Bilder, D. (2011). Altered modes of stem cell division drive adaptive intestinal growth. Cell 147, 603–614.

O’Halloran, D.M., Hamilton, O.S., Lee, J.I., Gallegos, M., and L’Etoile, N.D. (2012). Changes in cGMP levels affect the localization of EGL-4 in AWC in Caenorhabditis elegans. PLoS ONE 7, e31614.

Orstavik, S., Natarajan, V., Taskén, K., Jahnsen, T., and Sandberg, M. (1997). Characterization of the human gene encoding the type I alpha and type I beta cGMP-dependent protein kinase (PRKG1). Genomics 42, 311–318.

Osborne, K.A., Robichon, A., Burgess, E., Butland, S., Shaw, R.A., Coulthard, A., Pereira, H.S., Greenspan, R.J., and Sokolowski, M.B. (1997). Natural behavior polymorphism due to a cGMP-dependent protein kinase of Drosophila. Science 277, 834–836.

Park, J.H., Helfrich-Förster, C., Lee, G., Liu, L., Rosbash, M., and Hall, J.C. (2000). Differential regulation of circadian pacemaker output by separate clock genes in Drosophila. Proc. Natl. Acad. Sci. U.S.A. 97, 3608–3613.

Pearce, L.R., Komander, D., and Alessi, D.R. (2010). The nuts and bolts of AGC protein kinases. Nat. Rev. Mol. Cell Biol 11, 9–22.

Price, D.H. (2008). Poised polymerases: on your mark...get set...go! Mol. Cell 30, 7–10.

Renger, J.J., Yao, W.D., Sokolowski, M.B., and Wu, C.F. (1999). Neuronal polymorphism among natural alleles of a cGMP-dependent kinase gene, foraging, in Drosophila. J. Neurosci. 19, RC28.

Schnakenberg, S.L., Matias, W.R., and Siegal, M.L. (2011). Sperm-storage defects and live birth in Drosophila females lacking spermathecal secretory cells. PLoS Biol 9, e1001192.

Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A., and Cavalli, G. (2012). Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472.

Sokolowski, M.B., and Hansell, R.I. (1983). Drosophila larval foraging behavior. I. The sibling species, D. melanogaster and D. simulans. Behav. Genet. 13, 159–168.

Stansberry, J., Baude, E.J., Taylor, M.K., Chen, P.-J., Jin, S.-W., Ellis, R.E., and Uhler, M.D. (2001). A cGMP-dependent protein kinase is implicated in wild-type motility in C. elegans. Journal of Neurochemistry 76, 1177–1187.

Stapleton, M., Liao, G., Brokstein, P., Hong, L., Carninci, P., Shiraki, T., Hayashizaki, Y., Champe, M., Pacleb, J., Wan, K. et al. (2002). The Drosophila gene collection: identification of putative full-length cDNAs for 70% of D. melanogaster genes. Genome Res. 12, 1294–1300.

Swarup, S., Pradhan-Sundd, T., and Verheyen, E.M. (2015). Genome-wide identification of phospho-regulators of Wnt signaling in Drosophila. Development 142, 1502–1515.

Tracey, W.D., Wilson, R.I., Laurent, G., and Benzer, S. (2003). painless, a Drosophila gene essential for nociception. Cell 113, 261–273.

Weber, S., Bernhard, D., Lukowski, R., Weinmeister, P., Wörner, R., Wegener, J.W., Valtcheva, N., Feil, S., Schlossmann, J., Hofmann, F. et al. (2007). Rescue of cGMP kinase I knockout mice by smooth muscle specific expression of either isozyme. Circ. Res. 101, 1096–1103.

Wolfner, M.F. (1997). Tokens of love: functions and regulation of Drosophila male accessory gland products. Insect Biochem. Mol. Biol. 27, 179–192.

Wu, J.Q., and Snyder, M. (2008). RNA polymerase II stalling: loading at the start prepares genes for a sprint. Genome Biol. 9, 220.

Zeitlinger, J., Stark, A., Kellis, M., Hong, J.-W., Nechaev, S., Adelman, K., Levine, M., and Young, R.A. (2007). RNA polymerase stalling at developmental control genes in the Drosophila melanogaster embryo. Nat. Genet. 39, 1512–1516.

Appendix

Supplemental Figures

Supplemental figure 1: Southern blot of hobo transposable elements.

Southern blot probing for hobo transposable elements was conducted as in Russell and Sambrook (2001) using a probe described in Blackman et al. (1989). The presence of a 2.6 kb fragment indicates a fully functional hobo transposable element coding for its own transposase. Several of our lab strains and common tools showed fully functional hobo elements.

Blackman, R.K., Koehler, M.M., Grimaila, R., and Gelbart, W.M. (1989). Identification of a fully-functional hobo transposable element and its use for germ-line transformation of Drosophila. EMBO J. 8, 211–217.

Supplemental figure 2: RNA PolII ChIP-Seq from modENCODE.

RNA PolII ChIP-Seq data, from modENCODE, aligned to the foraging locus (Nègre et al., 2011). We saw significant peaks at all four of the identified promoters, with variation throughout development.

Supplemental figure 3: Path length behaviour and foraging expression in pumilio mutant.

A. Foraging path length behaviour on yeast of a pumilio mutant, milord, and control, 2u. Sample size of n=90 per genotype (p=0.001). B. qRT-PCR of foraging transcripts of a pumilio mutant, milord, and control, 2u. Sample size of n=3 per genotype. Rp49 was used as a reference gene for standardization (p=0.105).

Chapter 2 Supplemental Tables

Table S1: Statistical analysis for lethality - Chapter 2, figure 11.

A. Chi squared value and p-value for fig. 11B (n=100-120 per genotype). B. Chi squared value and p-value for fig. 11C (n=80 per genotype). C. Chi squared value and p-value for fig. 11D (n=50 per genotype).

A.

Chi-squared test for given probabilities

data: null
X-squared = 0.0047, df = 3, p-value = 0.9999

B.

Chi-squared test for given probabilities

data: frt
X-squared = 0.0035, df = 3, p-value = 0.9999

C.

Chi-squared test for given probabilities

data: hobo
X-squared = 1.0164, df = 2, p-value = 0.6016
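The p-values reported above follow from the chi-squared statistic and its degrees of freedom. As a cross-check, the sketch below computes the upper-tail probability for integer degrees of freedom using only the Python standard library. The function name is hypothetical, and this is an independent reimplementation for illustration, not the code that produced the output above.

```python
import math


def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-squared variable with a positive integer df."""
    if x < 0 or df < 1 or int(df) != df:
        raise ValueError("x must be >= 0 and df a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: erfc(sqrt(x/2))
    #         + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    terms = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


# Statistics as printed in Table S1, parts A and C.
p_a = chi2_upper_tail(0.0047, 3)
p_c = chi2_upper_tail(1.0164, 2)
```

Rounded to four decimal places, both values agree with the p-values printed for parts A and C.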

Table S2: Statistical analysis for null path length - Chapter 2, figure 11.

A. Anova table for fig. 11A (n=120 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 11A. C. Anova table for fig. 11B (n=180 per genotype). D. Means, standard errors, and upper and lower 95% confidence intervals for fig. 11B. E. Anova table for fig. 11C (n=220-240 per genotype). F. Means, standard errors, and upper and lower 95% confidence intervals for fig. 11C. G. Anova table for fig. 11D (n=240-300 per genotype). H. Means, standard errors, and upper and lower 95% confidence intervals for fig. 11D.

A.

               Sum Sq   Df   F value     Pr(>F)
genotype        47506    1  165.8461  < 2.2e-16
date             4064    3    4.7293   0.003192
genotype:date    1996    3    2.3226   0.075834
Residuals       66170  231
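Each F value in the table above is the ratio of a term's mean square (Sum Sq divided by Df) to the residual mean square. The sketch below recomputes the F values from the printed sums of squares. The function and variable names are hypothetical, and small differences from the printed F values are expected because the printed sums of squares are rounded.

```python
def f_values(terms, residual_sum_sq, residual_df):
    """Return {term: F}, where F = (Sum Sq / Df) / (residual Sum Sq / residual Df)."""
    residual_mean_sq = residual_sum_sq / residual_df
    return {
        name: (sum_sq / df) / residual_mean_sq
        for name, (sum_sq, df) in terms.items()
    }


# (Sum Sq, Df) pairs as printed in Table S2, part A.
table_s2a = {
    "genotype": (47506, 1),
    "date": (4064, 3),
    "genotype:date": (1996, 3),
}
f_s2a = f_values(table_s2a, residual_sum_sq=66170, residual_df=231)
```

The same function can be applied to any of the ANOVA tables in this appendix by substituting the corresponding sums of squares and degrees of freedom.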

B.

genotype  fit       se        lower     upper
iB55      51.40066  1.551537  48.34369  54.45764
ie4       23.20195  1.545057  20.15775  26.24616
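The confidence limits above are consistent with fit ± c × se, where c is a critical value close to 1.97. The sketch below recovers c from the printed values and compares it with the normal-approximation value of about 1.96. Reading c as a t quantile on the residual degrees of freedom is an assumption, since the table legends do not state how the intervals were computed; the function name is also hypothetical.

```python
from statistics import NormalDist


def implied_multiplier(lower, upper, se):
    """Half-width of the interval divided by the standard error."""
    return (upper - lower) / (2.0 * se)


# Values as printed in Table S2, part B.
c_ib55 = implied_multiplier(48.34369, 54.45764, 1.551537)
c_ie4 = implied_multiplier(20.15775, 26.24616, 1.545057)

# Normal-approximation critical value for a two-sided 95% interval.
z_975 = NormalDist().inv_cdf(0.975)
```

Both genotypes give the same multiplier, slightly above the normal value, as would be expected for a t-based interval with a few hundred residual degrees of freedom.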

C.

               Sum Sq   Df   F value     Pr(>F)
genotype         3998    1   18.3174   2.42e-05
date            28531    5   26.1465  < 2.2e-16
genotype:date     643    5    0.5894     0.7081
Residuals       75948  348

D.

genotype   fit       se        lower     upper
Df / null  30.54156  1.101112  28.37588  32.70723
Df / wt    37.20623  1.101112  35.04056  39.37190

E.

               Sum Sq   Df   F value     Pr(>F)
Genotype         5927    2     7.340  0.0007028
Date            76938    7    27.224  < 2.2e-16
Genotype:Date   16199   14     2.866  0.0003264
Residuals      269286  667

F.

Genotype  fit       se        lower     upper
08112-3   47.37125  1.331618  44.75659  49.98592
32122-1   45.21382  1.300455  42.66035  47.76730
yw        52.10863  1.343523  49.47059  54.74667

G.

                                  Sum Sq   Df   F value     Pr(>F)
gene.copy.number                   54772    2  105.8390  < 2.2e-16
background                          1296    1    5.0102  0.0254646
date                                8681    3   11.1830  3.374e-07
gene.copy.number:background         4217    2    8.1482  0.0003132
gene.copy.number:date               6759    6    4.3536  0.0002448
background:date                      536    1    2.0725  0.1503565
gene.copy.number:background:date    2419    2    4.6746  0.0095778
Residuals                         213209  824

H.

gene.copy.number  fit       se         lower     upper
none              24.86598  1.0404199  22.82379  26.90816
one               33.92880  0.9447597  32.07438  35.78322
two               45.49528  0.9452859  43.63982  47.35073

Table S3: Statistical analysis for RNAi path length - Chapter 2, figure 12.

A. Anova table for fig. 12A (n=90 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 12A.

A.

               Sum Sq   Df   F value     Pr(>F)
genotype         7089    2   10.8915  2.864e-05
date             6402    2    9.8357  7.616e-05
genotype:date    3713    4    2.8525     0.0243
Residuals       84939  261

B.

genotype        fit       se        lower     upper
daGal4>comRNAi  32.58841  1.901569  28.84404  36.33278
daGal4>wt       37.56446  1.901569  33.82009  41.30882
wt>comRNAi      45.05534  1.901569  41.31098  48.79971

Table S4: Statistical analysis for recombinant path length - Chapter 2, figure 13.

A. Anova table for fig. 13B (n=120 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 13B.

A.

               Sum Sq   Df   F value     Pr(>F)
genotype         2956    3    5.1475  0.0016477
date             3272    3    5.6980  0.0007752
genotype:date    3328    9    1.9315  0.0457641
Residuals       88822  464

B.

genotype  fit       se        lower     upper
R1        26.44167  1.263019  23.95972  28.92361
R2        26.80833  1.263019  24.32639  29.29028
R3        23.34167  1.263019  20.85972  25.82361
R4        20.71667  1.263019  18.23472  23.19861

Table S5: Statistical analysis for null food intake - Chapter 2, figure 14.

A. Anova table for fig. 14A (n=150-190 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 14A. C. Anova table for fig. 14B (n=20 per genotype). D. Means, standard errors, and upper and lower 95% confidence intervals for fig. 14B.

A.

                   Sum Sq   Df   F value     Pr(>F)
genotype       5.4695e+10    4  147.5616  < 2.2e-16
date           7.6108e+09    1   82.1320  < 2.2e-16
genotype:date  2.5396e+09    4    6.8515  1.966e-05
Residuals      8.2750e+10  893

B.

genotype               fit        se         lower      upper
+[ie1];+[iB55];+[ie4]   3287.092   779.7516   1756.733   4817.452
+[ie1];+[ie4];+[ie4]   12340.606   787.9228  10794.210  13887.002
w-;001-018;+            3950.437   861.7408   2259.164   5641.711
w-;001-018;BAC:flag    16299.733  1056.6351  14225.955  18373.510
w-;+;BAC:flag          27819.842   987.7916  25881.179  29758.506

C.

                Sum Sq   Df   F value    Pr(>F)
genotype       2109248    1   10.6892  0.002523
date           2888347    1   14.6375  0.000550
genotype:date    96191    1    0.4875  0.489948
Residuals      6511753   33

D.

genotype   fit       se        lower     upper
control    27494.95  1019.313  25421.15  29568.76
Df(ED243)  22716.00  1047.265  20585.32  24846.67

Table S6: Statistical analysis for null fat levels - Chapter 2, figure 15.

A. Anova table for fig. 15A (n=8 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 15A. C. Anova table for fig. 15B (n=50 per genotype). D. Means, standard errors, and upper and lower 95% confidence intervals for fig. 15B. E. Anova table for fig. 15C (n=19 per genotype). F. Means, standard errors, and upper and lower 95% confidence intervals for fig. 15C.

A.

          Sum Sq Df F value    Pr(>F)
genotype  649218  2  56.139 3.744e-09
Residuals 121426 21
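The F value in this panel can be recovered from the sums of squares and degrees of freedom. The sketch below is an illustrative check only: the function name is hypothetical, and it assumes the standard definition of F as the effect mean square divided by the residual mean square. Because the tabulated sums of squares are rounded, agreement is expected only to about two decimal places.

```python
def f_statistic(ss_effect, df_effect, ss_resid, df_resid):
    """Return F = (effect mean square) / (residual mean square)."""
    ms_effect = ss_effect / df_effect   # mean square for the effect
    ms_resid = ss_resid / df_resid      # mean square for the residuals
    return ms_effect / ms_resid


# Values taken from the genotype and Residuals rows of panel A.
f_genotype = f_statistic(649218, 2, 121426, 21)
print(f"F(genotype) = {f_genotype:.2f}")
```

The result agrees with the tabulated F value of 56.139 to within rounding of the inputs.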

B.

genotype               fit       se    lower    upper
w-;101-053;+      641.0182 26.88444 585.1090 696.9275
w-;101-053;BAC:HA 341.6476 26.88444 285.7384 397.5569
w-;+;BAC:HA       257.8565 26.88444 201.9473 313.7658
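The bounds in this panel are consistent with an interval of the form fit ± t × se, where t is the two-sided 95% critical value of Student's t on the 21 residual degrees of freedom from panel A. This is an inference from the numbers, not a method stated in the text, and other panels in these tables may have been computed with a different multiplier. In the sketch below the function name is hypothetical and the critical value 2.0796 is the standard tabulated value for 21 degrees of freedom, hard-coded to avoid a dependency on a statistics library.

```python
def confidence_interval(fit, se, t_crit):
    """Return (lower, upper) for the interval fit +/- t_crit * se."""
    half_width = t_crit * se
    return fit - half_width, fit + half_width


# Assumed: two-sided 95% critical value of Student's t with 21 df.
T_CRIT_DF21 = 2.0796

# First row of panel B (w-;101-053;+).
lower, upper = confidence_interval(641.0182, 26.88444, T_CRIT_DF21)
print(f"lower = {lower:.3f}, upper = {upper:.3f}")
```

Both bounds fall within 0.01 of the tabulated values 585.1090 and 696.9275.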

C.

              Sum Sq  Df F value  Pr(>F)
Genotype      0.1754   2  4.6133 0.01153
Date          5.1857   4 68.2093 < 2e-16
Genotype:Date 0.1409   8  0.9270 0.49644
Residuals     2.5659 135

D.

Genotype       fit         se     lower     upper
08112-3  0.6568671 0.01949691 0.6212580 0.6924762
32122-1  0.6571083 0.01949691 0.6214992 0.6927173
yw       0.5844552 0.01949691 0.5488461 0.6200643

E.

              Sum Sq Df F value   Pr(>F)
genotype      209900  1  8.4580 0.006555
date          166279  2  3.3501 0.047750
genotype:date    341  2  0.0069 0.993160
Residuals     794133 32

F.

genotype      fit       se    lower    upper
control  398.5322 36.14055 324.9163 472.1480
ED243    547.1753 36.14055 473.5594 620.7911

Chapter 3 Supplemental Tables

Table S1: Statistical analysis for pr>RNAi path length - Chapter 3, figure 13.

A. Anova table for fig. 13A (n=120 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 13A. C. Anova table for fig. 13B (n=120 per genotype). D. Means, standard errors, and upper and lower 95% confidence intervals for fig. 13B. E. Anova table for fig. 13C (n=120 per genotype). F. Means, standard errors, and upper and lower 95% confidence intervals for fig. 13C.

A.

              Sum Sq  Df F value    Pr(>F)
genotype        4210   6  2.4526   0.02343
date            9285   3 10.8169 5.662e-07
genotype:date  17916  18  3.4786 1.459e-06
Residuals     232048 811

B.

genotype              fit       se    lower    upper
for-pr1-4.7>RNAi 32.30179 1.544149 27.95541 36.64816
for-pr1-4.7>wt   35.17251 1.544149 30.82613 39.51888
for-pr3-3.3>RNAi 36.16122 1.544149 31.81485 40.50760
for-pr3-3.3>wt   36.26126 1.544149 31.91489 40.60764
for-pr4-2.4>RNAi 37.56877 1.544149 33.22240 41.91515
for-pr4-2.4>wt   33.46448 1.544149 29.11811 37.81085
wt>RNAi          39.51301 1.550743 35.14807 43.87794

C.

              Sum Sq  Df F value    Pr(>F)
genotype        3267   2  6.0395 0.0026399
date            4808   3  5.9264 0.0005946
genotype:date  10448   6  6.4392 1.888e-06
Residuals      94111 348

D.

genotype              fit       se    lower    upper
for-pr2-4.0>RNAi 28.11887 1.501205 23.87790 32.35985
for-pr2-4.0>wt   28.84364 1.501205 24.60267 33.08461
wt>RNAi          34.84035 1.501205 30.59938 39.08132

E.

              Sum Sq   Df F value    Pr(>F)
genotype       31569    8  7.3832 1.373e-09
date           28391    3 17.7066 3.267e-11
genotype:date  52821   24  4.1178 1.827e-10
Residuals     551580 1032

F.

genotype              fit       se    lower    upper
for-pr1-4.7>cDNA 48.72229 2.110473 44.58098 52.86359
for-pr1-4.7>wt   59.08117 2.128588 54.90432 63.25803
for-pr2-4.0>cDNA 42.50771 2.119686 38.34832 46.66709
for-pr2-4.0>wt   56.02028 2.128453 51.84369 60.19687
for-pr3-3.3>cDNA 57.05788 2.128953 52.88031 61.23545
for-pr3-3.3>wt   55.45431 2.119550 51.29519 59.61343
for-pr4-2.4>cDNA 51.44327 2.138096 47.24776 55.63878
for-pr4-2.4>wt   56.16604 2.110473 52.02474 60.30735
wt>cDNA          45.29187 2.119414 41.13301 49.45072

Table S2: Statistical analysis for isoform specific path length - Chapter 3, figure 14.

A. Anova table for fig. 14B (n=90 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 14B. C. Anova table for fig. 14C (n=50 per genotype). D. Means, standard errors, and upper and lower 95% confidence intervals for fig. 14C.

A.

              Sum Sq  Df F value    Pr(>F)
genotype        6608   2 10.6959 3.431e-05
date           10363   2 16.7732 1.403e-07
genotype:date   2136   4  1.7284     0.144
Residuals      80628 261

B.

genotype             fit       se    lower    upper
daGal4>T1/3RNAi 28.53173 1.852689 24.88361 32.17985
daGal4>wt       37.56446 1.852689 33.91634 41.21258
wt>T1/3RNAi     40.04433 1.852689 36.39621 43.69245

C.

              Sum Sq Df F value    Pr(>F)
genotype       36006  1 43.9395 1.977e-09
date           11652  1 14.2187 0.0002812
genotype:date    764  1  0.9319 0.3367935
Residuals      78667 96

D.

genotype       fit      se    lower     upper
f0e0/null  62.9369 4.04834 54.90101  70.97279
CyOGFP    100.8876 4.04834 92.85167 108.92345

Table S3: Statistical analysis for pr>RNAi food intake - Chapter 3, figure 15.

A. Anova table for fig. 15A (n=144 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 15A.

A.

                  Sum Sq   Df F value  Pr(>F)
genotype      3.0687e+10    8 23.2888 < 2e-16
date          1.4544e+09    2  4.4153 0.01228
genotype:date 3.6221e+09   16  1.3745 0.14570
Residuals     2.0901e+11 1269

B.

genotype              fit       se    lower    upper
for-pr1-4.7>RNAi 34438.46 1069.483 32340.31 36536.61
for-pr1-4.7>wt   31387.44 1069.483 29289.29 33485.59
for-pr2-4.0>RNAi 23913.58 1069.483 21815.43 26011.73
for-pr2-4.0>wt   18615.26 1069.483 16517.11 20713.41
for-pr3-3.3>RNAi 27567.88 1069.483 25469.73 29666.02
for-pr3-3.3>wt   21318.01 1069.483 19219.86 23416.16
for-pr4-2.4>RNAi 30394.59 1069.483 28296.44 32492.74
for-pr4-2.4>wt   22696.03 1069.483 20597.88 24794.18
wt>RNAi          26535.80 1069.483 24437.65 28633.95

Table S4: Statistical analysis for RNAi fat levels - Chapter 3, figure 16.

A. Anova table for fig. 16A (n=16 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 16A. C. Anova table for fig. 16B (n=55-60 per genotype). D. Means, standard errors, and upper and lower 95% confidence intervals for fig. 16B.

A.

              Sum Sq Df F value    Pr(>F)
genotype     1127948  2 29.3633 1.052e-08
sex            45841  1  2.3867   0.12987
genotype:sex  110352  2  2.8728   0.06771
Residuals     806683 42

B.

genotype         fit      se    lower     upper
cross       973.5814 34.6471 909.1622 1038.0007
Lsp2-Gal4   605.3132 34.6471 540.8940  669.7325
UAS-forRNAi 725.9727 34.6471 661.5534  790.3919

C.

              Sum Sq  Df F value    Pr(>F)
genotype      328.09   2  78.819 < 2.2e-16
age          2726.75   2 655.052 < 2.2e-16
genotype:age  469.74   4  56.424 < 2.2e-16
Residuals     339.26 163

D.

genotype    age        fit        se     lower     upper
Cg-Gal4     48hr  3.646158 0.3309733  2.992610  4.299706
UAS-forRNAi 48hr  4.446579 0.3309733  3.793031  5.100127
cross       48hr  4.478444 0.3400428  3.806988  5.149901
Cg-Gal4     72hr  8.346619 0.3148184  7.724971  8.968267
UAS-forRNAi 72hr  8.309059 0.3499011  7.618135  8.999982
cross       72hr  7.319474 0.3309733  6.665926  7.973022
Cg-Gal4     96hr 14.935211 0.3309733 14.281662 15.588759
UAS-forRNAi 96hr 17.583600 0.3225929 16.946600 18.220600
cross       96hr  8.974350 0.3225929  8.337350  9.611350

Chapter 4 Supplemental Tables

Table S1: Statistical analysis for pr>RNAi PER - Chapter 4, figure 6.

A. Anova table for fig. 6 (n=30-40 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 6.

A.

                    Sum Sq  Df F value    Pr(>F)
genotype           124.813   8 11.2487  8.84e-13
time                42.842   8  3.8611 0.0003282
date                 2.268   4  0.4088 0.8021683
genotype:time       52.497  46  0.8228 0.7787844
genotype:date       28.733  15  1.3811 0.1610375
time:date            6.628   7  0.6827 0.6866062
genotype:time:date   5.924   3  1.4237 0.2375125
Residuals          241.333 174

B.

genotype      fit        se      lower    upper
1R       5.671053 1.2993126  3.1066105 8.235495
1W       4.545113 0.9737393  2.6232519 6.466974
2R       4.350251 1.1026126  2.1740336 6.526468
2W       3.554511 0.9156200  1.7473600 5.361663
3R       3.010025 0.9964111  1.0434170 4.976633
3W       1.504386 1.2270765 -0.9174844 3.926256
4R       4.669799 1.1703657  2.3598587 6.979740
4W       4.504386 1.0146368  2.5018060 6.506966
WR       1.463659 0.8607978 -0.2352902 3.162608

Table S2: Statistical analysis for fat levels - Chapter 4, figure 7.

A. Anova table for fig. 7 (n=15 per genotype). B. Means, standard errors, and upper and lower 95% confidence intervals for fig. 7.

A.

          Sum Sq Df F value    Pr(>F)
genotype  114009  4  7.7782 3.107e-05
Residuals 252843 69

B.

genotype        fit       se    lower    upper
ppl>cDNA-2 289.2667 15.62985 258.0860 320.4474
ppl>wt     354.0714 16.17843 321.7963 386.3465
wt>cDNA-2  360.6667 15.62985 329.4860 391.8474

Table S3: Statistical analysis for SR - Chapter 4, figure 8.

A. Chi square table for fig. 8A (n=150-160 per genotype). B. Sample sizes, medians, and upper and lower 95% confidence intervals for fig. 8A. C. Chi square table for fig. 8B (n=100 per genotype). D. Sample sizes, medians, and upper and lower 95% confidence intervals for fig. 8B.

A.

                     N Observed Expected (O-E)^2/E (O-E)^2/V
gene.copy.number=1 159      159      131      5.78        11
gene.copy.number=2 156      156      184      4.14        11

Chisq= 11 on 1 degrees of freedom, p= 0.000926

B.

                   records n.max n.start events median 0.95LCL 0.95UCL
gene.copy.number=1     159   159     159    159     50      48      53
gene.copy.number=2     156   156     156    156     55      54      57

C.

                            N Observed Expected (O-E)^2/E (O-E)^2/V
genotype=for-pr1-4.7>cDNA 100      100    182.4  37.21764  67.35718
genotype=for-pr2-4.0>cDNA 100      100     64.1  20.05133  24.30447
genotype=for-pr3-3.3>cDNA 100      100     82.4   3.75198   4.69850
genotype=for-pr4-2.4>cDNA 100      100    100.3   0.00103   0.00135
genotype=wt>cDNA          100      100     70.7  12.10835  14.92609

Chisq= 84.6 on 4 degrees of freedom, p= 0

D.

                          records n.max n.start events median 0.95LCL 0.95UCL
genotype=for-pr1-4.7>cDNA     100   100     100    100  117.0     113     121
genotype=for-pr2-4.0>cDNA     100   100     100    100   90.5      87      93
genotype=for-pr3-3.3>cDNA     100   100     100    100   93.5      90     100
genotype=for-pr4-2.4>cDNA     100   100     100    100  100.0      96     103
genotype=wt>cDNA              100   100     100    100   91.5      89      95
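The p-values attached to the chi-square statistics in panels A and C can be checked from the upper tail of the chi-square distribution. The sketch below is illustrative only: the function name is hypothetical, and it uses the closed forms available for 1 degree of freedom and for even degrees of freedom so that only the standard library is needed. The p= 0 printed in panel C presumably reflects rounding in the printed output rather than an exact zero, and the statistic of 11 in panel A is itself rounded, so only approximate agreement with p= 0.000926 is expected.

```python
import math


def chisq_upper_tail(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Supports df == 1 and even df only (closed forms).
    """
    if df == 1:
        # With 1 df, X is the square of a standard normal variable.
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
        half = x / 2.0
        term = 1.0
        total = 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    raise ValueError("only df == 1 or even df are handled in this sketch")


# Panel A: chi-square statistic printed as 11 on 1 degree of freedom.
print(f"panel A: p = {chisq_upper_tail(11.0, 1):.2e}")
# Panel C: chi-square statistic 84.6 on 4 degrees of freedom.
print(f"panel C: p = {chisq_upper_tail(84.6, 4):.2e}")
```

For panel C the computed value is very small but nonzero, which is consistent with a printed value of 0 after rounding.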
