Research Collection

Doctoral Thesis

A systems genetics approach to understanding intra-species variation in Drosophila wing and body size

Author(s): Vonesch, Sibylle

Publication Date: 2014

Permanent Link: https://doi.org/10.3929/ethz-a-010419387

Rights / License: In Copyright - Non-Commercial Use Permitted


DISS. ETH NO. 22296

A systems genetics approach to understanding intra-species variation in Drosophila wing and body size

A thesis submitted to attain the degree of

DOCTOR OF SCIENCES of ETH ZURICH

(Dr. sc. ETH Zurich)

presented by

Sibylle Chantal Vonesch

MSc Universität Zürich

born on 11.10.1984

citizen of

Dübendorf ZH

accepted on the recommendation of

Prof. Dr. Ernst Hafen
Prof. Dr. Konrad Basler
Prof. Dr. Sven Bergmann
Prof. Dr. Trudy F. C. Mackay

2014

1. SUMMARY
2. ZUSAMMENFASSUNG
3. INTRODUCTION
3.1. Growth is a regulated process that is sensitive to environmental fluctuations
3.2. Molecular mechanisms governing growth control
3.2.1. The insulin/IGF pathway controls cell, organ and organismal growth and metabolism in response to nutrient availability
3.2.2. Insulin is the main regulator of metabolic homeostasis in vertebrates
3.2.3. Insulin-like growth factors (IGFs) control pre- and postnatal growth in vertebrates
3.2.4. Insulin and IGF binding to their corresponding receptors initiates an intracellular signaling cascade
3.3. Molecular mechanisms of growth control in Drosophila
3.3.1. Systemic control of growth and metabolic homeostasis by IIS/TOR signaling in Drosophila in response to nutrients
3.3.2. The transcription factor FOXO regulates many downstream targets of IIS and is essential for survival under stress conditions
3.3.3. The Target of Rapamycin (TOR) pathway
3.3.3.1. The TOR pathway controls cellular and organismal growth in response to local abundance of growth factors, amino acids, cellular energy levels and stress
3.3.3.2. TORC1 responds to growth factors via IIS, extracellular amino acid levels, intracellular energy status and stress
3.3.3.3. TORC1 promotes translation, ribosome biogenesis and the expression of metabolic genes while inhibiting autophagy
3.3.4. Myc controls cell growth and cell numbers downstream of TOR and other pathways
3.3.5. The Hippo tumor suppressor pathway controls cell number by regulating cell cycle progression and apoptosis in an organ autonomous manner
3.3.6. The Ras/MAPkinase pathway controls cell growth and proliferation
3.4. Steroid hormones affect body size by controlling developmental timing in Drosophila
3.5. Various environmental factors can influence body and organ size in Drosophila
3.6. The Drosophila wing as a model system to study growth control
3.7. Genome-wide association studies
3.7.1. Genome-wide association studies provide a first step towards a systems level understanding of multigenic traits by linking variation in genotype to natural phenotypic variation
3.7.2. Finding the missing heritability of complex traits
3.7.3. GWAS studies of human height imply common variants with small effects account for half of the total heritability and shed light on the sources of missing heritability
3.7.4. Limitations and problems of human GWAS that can be overcome in model organism GWAS
3.7.5. Drosophila as a model system for GWAS of size
3.8. Experimental evolution
3.8.1. Experimental evolution experiments can be used to generate extreme phenotypes and to identify combinations of loci that are causally linked to these phenotypes
3.8.2. Results from selection studies of size in Drosophila
3.9. Summary of the introduction and project motivation
3.10. Aims of this thesis
4. RESULTS
4.1. Novel loci rather than variants in canonical growth pathway genes are associated with wing and body size variation in Drosophila melanogaster (Manuscript in preparation)
4.2. The FlyCatwalk: A high throughput feature-based sorting system for artificial selection (Submitted manuscript)
4.3. Further results
4.3.1. Foodbatch variability is a strong and specific confounder for wing size in Drosophila melanogaster
4.3.2. Experimental evolution of Drosophila wing size
5. DISCUSSION AND OUTLOOK
5.1. Small fluctuations in foodbatch quality cause substantial population-level variation in wing size
5.2. Many loci with small effects and predominantly regulatory variants underlie wing size variation in Drosophila
5.3. Novel candidates fall into diverse functional classes, overlap candidates from other studies and are associated with height or obesity related traits in humans
5.4. Planar cell polarity (PCP) genes and growth control
5.5. Metabolism and growth control
5.6. Conclusions and Outlook
6. MATERIALS AND METHODS
7. REFERENCES
8. ACKNOWLEDGEMENTS
9. CURRICULUM VITAE

1. SUMMARY

All organisms must regulate growth in a coordinated manner. Deregulation of this process affects the survival, fitness and fertility of the organism and may lead to one of the many malignancies subsumed as cancer. To understand both physiological development and malignant growth, it is thus paramount to gain a complete understanding of the mechanisms and genetic networks underlying growth control. The final size of an organism is determined by the interplay between intrinsic and extrinsic factors. Intrinsic factors are the genomic loci that exert their effect on growth via a highly interconnected network of cellular signaling molecules and systemic growth factors and hormones. The best-studied extrinsic factor is nutrient availability, but other factors such as temperature, population density and parasitic infection can also influence growth. The actions of intrinsic and extrinsic factors must be coordinated during development to ensure that growth can be adjusted to changing environments, for example in response to nutrient shortage.

Much has been learned about the control of organismal and tissue growth from single gene studies in model organisms, especially Drosophila. On the systems level there are growth factors, which enable integration of growth with nutritional cues. The evolutionarily conserved insulin/insulin-like growth factor (IIS) and target of rapamycin (TOR) pathways are the main mediators of growth and proliferation in response to nutrient availability. On the level of the individual organs, morphogens and physical forces act to control size. Organs also differ in their sensitivity to environmental cues, indicating that integration with the external environment occurs not only at the systemic but also at the organ level. On the cellular level, multiple pathways affect cell growth, proliferation and apoptosis in response to systemic and organ intrinsic stimuli and need to be integrated with each other. Besides growing, tissues also have to become patterned to form a normally developed organism, which involves dedicated pathways that may also affect growth. On top of the molecular machinery is a layer of hormonal control that delimits the duration of different developmental stages. In light of the multifactorial nature of animal growth it is clear that regulatory networks rather than single genes or pathways govern the control of growth and proper organismal size. To gain a complete understanding of growth control we thus have to apply methods that probe the influence of whole gene networks and eventually the whole genome, rather than single genes, on size. It will ultimately be important to understand how all these genes interact under natural circumstances to create a properly sized organism, and how genetic variation creates phenotypic variability while preserving function. Exploiting natural phenotypic variation in size and studying which genetic loci cause this variation is a promising new approach for shedding more light on growth control at the systems level. Genome-wide association studies (GWAS) are a routinely used tool for studying genotype-phenotype associations in a variety of organisms and have substantially broadened our understanding of the genetic networks underlying multifactorial traits. This work describes the application of genome-wide association methods towards identifying the loci underlying natural wing and body size variation in Drosophila. We successfully reduced the influence of environmental factors during growth by raising flies under a strictly controlled regime, which is necessary to avoid false associations between genotype and phenotype. We show that only very few of the previously known growth pathway genes are associated with variability in size; instead, our study identifies novel regulators of size. This finding illustrates the complementarity of the GWAS approach to classical genetics and highlights the importance of probing natural variants. Our findings suggest that processes like planar cell polarity and metabolism may have a larger role in controlling growth than previously thought. Furthermore, our results highlight the importance of intergenic noncoding and regulatory elements in creating size variability in a population and encourage more efforts towards the investigation of regulatory rather than functional mutations for understanding how phenotypic variability is achieved. Taken together, our findings identify loci relevant for creating variability in size between organisms and add to the expanding knowledge of the processes governing growth.

2. ZUSAMMENFASSUNG

Growth, and with it the attainment of a particular size, is a controlled and coordinated process. Disturbances in the regulation of growth have severe consequences for the health, fertility and survival of any organism and can promote the formation of tumors. To understand both physiological growth and cancer development, it is necessary to elucidate the mechanisms and genetic interactions underlying growth control. The size of an organism is determined by an interplay of intrinsic and extrinsic factors. Intrinsic factors are the genes, which influence growth via a highly interconnected system of signaling molecules and hormones. The best-studied extrinsic factor is nutrition, but other factors such as temperature, population density and parasitic infection can also influence growth. Intrinsic and extrinsic factors must act together to ensure that growth can be adapted to changing environmental conditions, for example during nutrient shortage. Important insights into growth control have been gained through genetic research in cells and tissues of model organisms, especially the fruit fly. These studies have shown that the nutritional state of the organism is communicated to the tissues by systemically acting growth factors. This occurs via the evolutionarily conserved insulin signaling pathway, which, together with the likewise conserved Target of Rapamycin (TOR) pathway, is responsible for adjusting cell growth and cell division to nutrient availability. The growth of individual organs is additionally regulated by morphogens and mechanical forces, and its sensitivity to environmental influences differs from organ to organ.
At the cellular level, systemic and organ-specific signals are integrated via several signaling pathways to control cell growth, cell division and cell death. To generate a normally developed organism, pattern formation must take place alongside organ growth. This occurs via specialized signaling pathways, which can, however, also influence growth. In addition to the cellular processes, hormones play an important role in growth, as they regulate the length of the individual developmental phases. Given the many signaling pathways and extrinsic factors, it seems clear that regulatory networks, and not single genes, determine growth and thus the attainment of a particular size. To understand growth regulation in its entirety, it will therefore be necessary to apply new methods that make it possible to study the influence of genetic networks, and ultimately of the entire genome, on growth. The goal would be to understand how genomic regions interact to generate a perfectly proportioned organism, and how variability in the genome leads to variability in size without disturbing the development of a functional organism. A promising approach to achieve this is to investigate the relationship between genetic polymorphisms and phenotypic variation in natural populations. Genome-wide association studies are routinely applied in a variety of organisms to study such relationships and have yielded important insights into the genetic networks underlying these phenotypes. The present study describes the application of such a genome-wide association study to identify genetic regions underlying variation in wing and body size in the fruit fly. To determine size as accurately as possible, we successfully reduced confounding environmental influences during growth by raising flies under strictly controlled conditions. The results of our association study demonstrate that only a few of the already known growth genes are associated with natural size variation and that the majority of the underlying genes represent novel growth regulators. This shows that the genome-wide approach is complementary to classical genetic studies and underscores the importance of studying natural polymorphisms. Furthermore, our results show that noncoding and regulatory regions are important modules for generating variability in size, and that mutations in such regions should be investigated more intensively to understand how this variability is achieved in a population.
Based on the role of some of our novel growth regulators in processes such as cell polarity and cell metabolism, we think that these processes could have more important roles in growth control than previously assumed. These new insights contribute to a better understanding of the mechanisms and genes underlying size variation between organisms in populations and thus complement our overall picture of growth control.

3. INTRODUCTION

This introduction should provide the reader with an overview of the current state of the field of growth control and illuminate why, complementary to classical genetics studies, more global approaches are necessary to attain a complete understanding of animal growth.

The final size of an organism is determined by interplay between the molecular mechanisms governing cell size and division, the action of systemic growth factors and hormones and environmental cues on growth. In the course of this introduction, each of these components is addressed and their impact on and relevance for growth discussed. After a brief summary of animal growth control as a whole I will provide a description of the molecular mechanisms and most relevant pathways governing growth, the nature of hormonal stimuli and their impact on growth duration and thereby size as well as discuss environmental factors that may affect size. As the aim of this thesis consists in studying genetic factors that affect wing size, a short recapitulation of the development of the Drosophila wing will be given. Together these sections should illuminate the complexity of animal growth, requiring the involvement of many genetic factors that interact with each other and with the environment to create a properly sized animal, and thereby illustrate that approaches studying the effects of single genes on body size might not be sufficient to grasp the whole picture. Two approaches, genome-wide association studies and experimental evolution experiments, are then put forward and their relative advantages and shortcomings in providing a more global view of the loci involved in body size determination discussed.

3.1 Animal Growth is a regulated process that is sensitive to environmental fluctuations: How animals regulate their final size by controlling and coordinating growth among cells and tissues is an important but still incompletely understood issue in biology (Weinkove and Leevers 2000). A deregulation of this process impacts on survival and fertility of the organism and, if occurring in single cells, may lead to malignancies such as cancer. For understanding both physiological development and malignant growth it is thus paramount to gain a complete understanding of the mechanisms and genetic networks underlying growth control.

The size of an organism is ultimately defined by two factors: the duration of the growth period and the growth rate during that period (Stern and Emlen 1999, Nijhout 2003, Nijhout et al. 2014). Considerable size variation exists between species but within naturally breeding species size is relatively homogeneous, suggesting that organismal size, and thus the growth rate and the duration of the growth period, are genetically predetermined. The growth rate of individual organs within an organism can be broken down to the rate of growth and proliferation of single cells that constitute the organ. Numerous studies in single cell systems have revealed important mechanisms underlying cellular growth and division (Rupes 2002, Echave et al. 2007, Aldea et al. 2007, Yanagida et al. 2011). The growth and proliferation rates of unicellular organisms can however not be directly translated to those of cells in a multicellular organism, as here an additional layer of control consisting of systemic growth factors and hormones plays a key role during normal development. Defining the growth rates of cells within a multicellular organism thus not only requires an understanding of how growth and proliferation are controlled at the cellular level but also how growth factors and hormones affect different cell types and tissues in vivo. Furthermore, cell differentiation and pattern formation occur within developing organs and genes involved in these processes may affect growth (Serrano and O’Farrell 1997, Baker 2007, Lecuit and LeGoff 2007). Apart from genetic factors, an organism’s environment can substantially influence its growth: considerable intra-specific variation in size may arise through environmental factors like quantity and quality of nutrition, temperature or population density, which results in competition for resources (Santos et al. 1994, French et al. 1998, Imasheva et al. 1999, Lefranc and Bundgaard 2000, Imasheva and Bubliy 2003). Organismal growth thus needs to be coupled to other control systems that are responsive to and integrate these environmental stimuli with normal development (Van der Have and De Jong 1996, Martin and Hall 2005, Wang T et al. 2006, Wullschleger et al. 2006, Ghosh et al. 2013, Nijhout et al. 2014).
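
The two-factor view of size stated at the start of this section can be written compactly. The following is only an illustrative sketch and not a formula taken from the cited works; the symbols S_0 (size at the start of the growth period), g(t) (absolute growth rate at time t) and T (duration of the growth period) are notation assumed here.

```latex
% Assumed notation: S_0 = initial size, g(t) = growth rate, T = growth duration
S_{\text{final}} = S_0 + \int_0^{T} g(t)\,\mathrm{d}t ,
\qquad\text{and, for a constant rate } g(t) = \bar{g}:\quad
S_{\text{final}} = S_0 + \bar{g}\,T .
```

In this simplified form, a difference in final size must arise from a difference in the growth rate, in the duration of growth, or in both.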

3.2 Molecular mechanisms governing growth control: Mathematically, the size of an organ is determined by the number and size of its constituent cells. Final cell number is the result of the antagonizing actions of cell division and apoptosis, making organ or body size a readout of the activity of cell growth, cell division and apoptosis during development. This view is supported by studies showing that observed wing size differences between Drosophila melanogaster strains were mainly attributable to differences in cell number (Robertson 1959), while the effect of temperature on wing size was found to be mediated by modulating cell size (Partridge et al. 1994, Azevedo et al. 2002). Cell number and cell size are thus genetically separately regulated, respond differentially to environmental cues and both changes in cell size and cell number may affect final organ size (Nijhout 2003). However, despite the fact that animals of different size generally differ in the number of cell divisions they undergo and consequently the number of cells they are composed of (Raff 1996), cell division alone is not sufficient for tissue growth to occur; instead cell growth needs to precede division. Even on the cellular level, as shown in yeast, growth can continue in the absence of cell division, while a certain cell size is an absolute requirement before cell division can follow (Johnston et al. 1977, Jorgensen and Tyers 2004). In developing tissues, altering proliferation rates does not lead to a change in total tissue size; rather cell size is adjusted to maintain the proper size of the tissue. Patterning also seems to occur relatively normally even in total absence of cell division, suggesting that the mechanisms controlling growth and pattern formation do not depend on cell number but rather respond to and regulate distance across or volume of the tissue (Weigmann et al. 1997, Neufeld et al. 1998, Su and O’Farrell 1998). Evidence for a distance sensing mechanism on the cellular level comes from fission yeast, where cells measure their own length via a centrally located sensor protein, Cdr2, that evaluates the gradient of another protein, Pom1, emanating from the cell tips, to control entry into mitosis (Martin 2009). Evidently, when cellular growth and proliferation are coupled, growth typically controls division and not vice versa (Goberdhan and Wilson 2003). Nevertheless, altering cell size by modulating general protein synthesis can affect organismal growth rate but does not necessarily lead to a change in size in the final adult structure, as is apparent from studies on mutants showing the Minute syndrome in Drosophila. Most Minute genes correspond to protein components of the ribosomal machinery, and flies heterozygous for a Minute mutation have slower cellular growth and proliferation rates. However, most of them reach normal cell and adult sizes as they have a proportionately elongated growth period (Morata and Ripoll 1975, Marygold et al. 2007). On the other hand, a mutation in the Drosophila S6 kinase (S6K) gene does affect final organ size.
Apart from having a reduced proliferation rate and a severe delay in development, S6K mutant flies also have smaller cells and, importantly, a concomitant reduction in final body size since cell number is not changed (Montagne et al. 1999). S6K (p70 S6K in mammals) is a downstream component of insulin/insulin-like growth factor signaling (IIS) that, by phosphorylating the ribosomal protein S6, promotes translation initiation and elongation. A great number of genetic studies in Drosophila melanogaster have subsequently shown that several other components of the IIS system regulate organismal size by specifically controlling growth and proliferation of cells and organs, and have greatly advanced our current understanding of organismal growth control (Chen et al. 1996, Leevers et al. 1996, Böhni et al. 1999, Montagne et al. 1999, Weinkove et al. 1999, Brogiolo et al. 2001, Britton et al. 2002, Oldham et al. 2002, Oldham and Hafen 2003). The high similarity of phenotypes of Drosophila and mammalian IIS pathway mutants suggests a highly conserved role of this pathway in growth control through the regulation of cell size, cell number and metabolism (Efstratiadis 1998, Nakae et al. 2001, Maki 2010). The following chapters describe pathways and components that, through extensive classical genetic research in unicellular organisms, mice and Drosophila over the past 30 years, have been identified as evolutionarily highly conserved key molecular players and mechanisms governing growth control. These include the nutrient sensing IIS/TOR pathway, the transcription factor Myc, the Hippo tumor suppressor pathway and the EGFR/Ras/MAP kinase pathway. Myc and components from all these pathways are frequently misregulated in human tumors and cancer, highlighting their crucial role in the regulation of growth.
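
The statement opening this section, that organ size is determined by the number and size of its constituent cells, can be sketched as a simple bookkeeping relation. The notation is assumed for illustration only and does not come from the cited studies: V is organ size, N the final cell number, \bar{v} the mean cell size, N_0 the initial cell number, D the cumulative number of cell divisions and A the number of cells lost to apoptosis. Extracellular space is ignored.

```latex
% Assumed notation: V = organ size, N = final cell number, \bar{v} = mean cell size,
% N_0 = initial cell number, D = number of divisions, A = cells lost to apoptosis
V \approx N\,\bar{v},
\qquad
N = N_0 + D - A .
```

Under this bookkeeping, the S6K mutant phenotype described above corresponds to a smaller \bar{v} at unchanged N, whereas the strain differences in wing size reported by Robertson (1959) correspond mainly to differences in N.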

3.2.1 The insulin/IGF pathway controls cell, organ and organismal growth and metabolism in response to nutrient availability: The insulin/insulin-like growth factor signaling (IIS) pathway is an important mediator of growth control, metabolism, reproduction and longevity in metazoans. Whereas in invertebrates like C. elegans and Drosophila growth control and the regulation of metabolism are coupled and depend on signaling by the same molecule, insulin, through the insulin receptor, vertebrates have two functionally distinct molecules and pathways (Maki 2010). Signaling by the peptide hormone insulin through the insulin receptor (InR) regulates postnatal glucose homeostasis and carbohydrate and lipid levels, whereas pre- and postnatal growth control is mediated by the insulin-like growth factors I and II (IGF-I, IGF-II) through the IGF-I receptor (IGF-IR). All three molecules are involved in the regulation of reproductive physiology. Despite the functional separation, insulin and IGFs show high sequence similarity, as do their receptors, and large parts of the downstream signaling cascades, mediated by insulin receptor substrate proteins (IRSs), are common (Chan and Steiner 2000, Nakae et al. 2001). On the cellular level, insulin/IGF signaling controls growth, proliferation, metabolism and survival of cells. The InR/IGF-IR homologues in invertebrates and lower metazoans activate some of the same downstream signaling components as their vertebrate counterparts, mainly the Ras/MAPK and phosphoinositide-3-kinase (PI3K) pathways (Saltiel and Kahn 2001, Oldham and Hafen 2003, Guo 2014). Complete loss of key components of the insulin/IGF signaling cascade is mostly lethal, while partial malfunction results in a number of growth defects and metabolic disorders, like an imbalance of lipids (dyslipidemia), hypertension, female infertility and loss of glucose sensitivity that may develop into type II diabetes (Reaven 1997).

3.2.2 Insulin is the main regulator of metabolic homeostasis in vertebrates: In humans, insulin mainly functions in the regulation of blood glucose homeostasis by increasing glucose uptake primarily in muscle but also in fat tissue and inhibiting glucose production in the liver. Furthermore, by promoting synthesis and inhibiting catabolism of lipids, proteins and glycogen, insulin has a stimulatory effect on cell growth and differentiation and metabolite storage in muscle, liver and fat tissues. Thus, a disturbance in the insulin response due to insulin resistance or deficiency entails severe metabolic deregulation and underlies the development of diabetes mellitus (Saltiel and Kahn 2001). Type I diabetes results from an insulin deficiency caused by the absence of insulin-producing pancreatic beta cells due to an autoimmune reaction. Insulin resistance in muscle, liver and fat tissues leads to the development of type II diabetes, which is accompanied by hyperinsulinemia, the elevated production and secretion of insulin from pancreatic beta cells to compensate for the resistance. Both type I and type II diabetes lead to chronically elevated blood glucose levels (hyperglycaemia) and consequential organ damage, dysfunction and failure, especially of the eyes, kidneys, heart, nerves and blood vessels (American Diabetes Association 2009).

3.2.3 Insulin-like growth factors (IGFs) control pre- and postnatal growth in vertebrates: The IGFs share strong primary sequence similarity with insulin and have many of the activities of insulin, though with a significantly smaller effect. The small mitogenic polypeptides IGF-I and IGF-II are mainly produced by the liver in response to growth hormone (GH) secreted by the pituitary. They are responsible for pre- and postnatal growth of somatic tissue in an autocrine, paracrine and endocrine fashion in vertebrates. Furthermore, they promote differentiation of muscle progenitor cells in embryonic chicken cell culture (Schmid et al. 1983, Guler et al. 1988, Maki 2010). IGFs mainly bind to and activate the IGF-I receptor (IGF-IR). Via recruitment of insulin receptor substrates (IRSs), IGF-IR activates phosphoinositide-3-kinase (PI3K) and mitogen-activated protein kinase (MAPK) pathways. While IGF-I only binds to IGF-IR, IGF-II also interacts with IGF-IIR, which is essential for controlling IGF-II bioavailability by acting as a scavenger receptor (Wang ZQ et al. 1994, Saltiel and Kahn 2001, Maki 2010). Surprisingly, IGF-I and IGF-II null mutant mice, and even IGF-I/IGF-II double mutants, are viable, though severely reduced in size (40% in the single mutant and 70% in the double mutant situation) due to a reduction in cell numbers. IGF-I mutant survival rate is dependent on the genetic background and survivors maintain a severely reduced growth rate after birth, accompanied by infertility. IGFs are thus crucial for proper pre- and postnatal development of somatic tissue but not strictly essential for viability (Liu et al. 1993, Baker et al. 1993, Efstratiadis 1998). A human patient with a homozygous partial deletion of the IGF-I gene exhibited the same phenotypic hallmarks as IGF-I null mutant mice, with the exception of the effect on the reproductive tract and a more severe growth retardation at birth (40% of normal birth weight) (Woods et al. 1996, Efstratiadis 1998).

IGFs mostly occur in association with IGF binding proteins (IGFBPs) in the serum, which substantially prolongs their half-lives (Efstratiadis 1998). IGFBPs have equal or even higher affinity for IGFs than IGF-IR does and thereby play a crucial role in regulating the bioavailability of IGFs for IGF-IR signaling. On the other hand, some IGFBPs even potentiate IGF signaling by providing extracellular matrix localization and thus promoting ligand-receptor association. IGFBPs are themselves regulated by IGFBP proteases. Cleavage reduces their affinity for IGFs, thereby releasing IGFs from the complex and making them free for binding to the IGF-IR. IGFBPs also have functions apart from regulating signaling through IGF-IR (Claussen et al. 1997, Duan 2002).

3.2.4 Insulin and IGF binding to their corresponding receptors initiates an intracellular signaling cascade: Insulin and IGFs mediate their effects on growth and metabolism by binding to InR and IGF-IR, which are structurally very similar and belong to the family of ligand-activated receptor tyrosine kinases (RTKs). These membrane integrated dimeric molecules undergo auto-phosphorylation upon ligand binding, which recruits adaptor molecules (the insulin receptor substrate proteins (IRSs)) and potentiates their activity, enabling them to phosphorylate further substrate molecules (Ullrich and Schlessinger 1990, White 1998). InR mediates prenatal growth in response to IGF-II and postnatal carbohydrate and lipid metabolism in response to insulin. IGF-IR mediates prenatal growth in response to both IGF-I and IGF-II and postnatal growth in response to IGF-I. IGF-IIR regulates IGF-II levels by acting as a scavenger receptor (Wang et al. 1994, Nakae et al. 2001). InR null mutant mice undergo near normal embryonic development and only show a slight reduction in size (10%) at birth. However, afterwards, metabolic deregulation ensues: mice are highly insulin resistant and die within a few days (Nakae et al. 2001). This severe insulin resistance phenotype, highlighted by increased glucose levels after feeding despite 100- to 1000-fold increased insulin levels, is also found in humans with mutations in the insulin receptor (Taylor 1992). InR mutations in humans range in severity depending on the type and effect of the mutation and, apart from causing insulin resistance, affect prenatal growth. The most severe form of insulin resistance is found in patients with leprechaunism, who mostly carry mutations in both alleles of the InR, one often being a null mutation. Surprisingly, even a complete deletion of the InR gene is homozygous viable in humans, though accompanied by the deregulation of blood glucose homeostasis, a multitude of growth as well as other organ defects and severe developmental delay (Wertheimer et al. 1993). IGF-IR null mice only weigh 45% of the normal weight at birth and die immediately afterwards due to respiratory failure. IGF-I/IGF-IR double mutant mice are phenotypically the same as the IGF-IR single mutants, whereas IGF-II/IGF-IR double mutants have a more severe growth retardation (only 30% of normal weight at birth), which is due to IGF-II affecting fetal growth via binding to both IGF-IR and InR (Efstratiadis 1998). IGF-IIR mutant mice have increased levels of IGF-II and are 35% heavier than normal. This is accompanied by general overgrowth and abnormalities of organs, and most mutants die shortly before or after birth, highlighting the crucial role of the IGF-IIR in regulating bioavailability of IGF-II. Lethality can however be rescued in an IGF-IR null mutant background, indicating that IGF-II signaling through InR is sufficient for fetal growth (Ludwig et al. 1996, Nakae et al. 2001).

Receptor auto-phosphorylation recruits insulin receptor substrate (IRS) proteins to the membrane, where they function as docking molecules for downstream components of insulin/IGF signaling. IRS proteins have N-terminally located pleckstrin homology (PH) and phosphotyrosine binding (PTB) domains, as well as several serine/threonine phosphorylation sites at the C-terminus. Via the PH domain, IRS proteins bind to the downstream effector phosphatidylinositol (3,4,5)-trisphosphate (PIP3) while the PTB domain mediates association with the InR and IGF-IR (White 1998). IRS proteins comprise IRS1-4, of which IRS1 was the first insulin receptor substrate to be discovered (Sun et al. 1991). IRS1 and IRS2 are the main mediators of somatic growth and control of carbohydrate metabolism in most cells. IRS4 is only found in the thymus, brain and kidney and IRS3 is found in adipose tissue in rodents (where it can function analogously to IRS1). IRS proteins thus confer specificity for the downstream signaling cascade (Nakae et al. 2001). IRS1 knockout mice are mildly insulin resistant and weigh, depending on the genetic background, between 40 and 80% of the normal weight at birth and remain small throughout life (Tamemoto et al. 1994). IRS2 null mutants are of normal size but develop diabetes (hyperglycaemia) due to a progressive deregulation of blood glucose homeostasis caused by peripheral insulin resistance (Withers et al. 1998). Despite being born at normal size, IRS2 null mutant mice have some tissue specific defects like a smaller brain due to reduced proliferation of neurons, reduced photoreceptors and female sterility (Burks et al. 2000). IRS3 and IRS4 mutants seem phenotypically relatively normal, though IRS4 mutants show mild growth and reproductive defects and modest insulin resistance. These results suggest that in mice the main growth regulatory effect of the IGF pathway is mediated by IRS1, whereas IRS2 seems to be more important for initiating signaling cascades important for fecundity and the control of metabolism. There are numerous downstream effectors that may bind to IRSs, but the main effect on growth and metabolism is mediated by PI3K activation of Akt, which regulates downstream targets such as mTORC1 and FOXO (Shaw 2011).

3.3 Molecular mechanisms of growth control in Drosophila

3.3.1 Systemic control of growth and metabolic homeostasis by IIS/TOR signaling in Drosophila in response to nutrients: Despite the large evolutionary distance between flies and mammals, the insulin/IGF signaling system in Drosophila melanogaster shows a high degree of conservation to the mammalian system. The Drosophila IIS pathway is much simpler, integrating the functions of the dual mammalian insulin/IGF system within one pathway. The Drosophila insulin receptor (dInR) and one IRS protein, chico, constitute the homologs of the mammalian InR, IGF-IR and IRS1-4 and mediate the complete response to insulin stimulation on growth, metabolism, reproduction and longevity (Rulifson et al. 2002, Ikeya et al. 2002, Partridge et al. 2002).

The Drosophila insulin-like peptides (DILPs) are the ligands for dInR. There are seven DILPs encoded in the Drosophila genome, of which DILP-2 shows the highest sequence similarity to mammalian insulin. The DILPs show a wide range of spatial and temporal expression patterns (Brogiolo et al. 2001) with the strongest expression during larval stages found within two clusters of neurosecretory cells, called the insulin producing cells (IPCs) (Rulifson et al. 2002). Similarities between developmental programs specifying IPCs and mammalian pancreatic beta cells suggest common ancestry of these cell types. The IPCs produce DILP-1, -2, -3 and -5, which are secreted into the hemolymph from IPC projections to the aorta and to specific regions of the brain. DILPs, analogously to mammalian insulin, act systemically and via binding to the insulin receptor activate growth in various target tissues. Apart from the expression in IPCs, specific DILPs are expressed in various other tissues, where they may exert a paracrine and autocrine growth regulatory function. DILP-2 is found in the embryonic midgut and mesoderm and in imaginal discs and the salivary glands during larval development, DILP-4 in the embryonic mesoderm and embryonic and larval midgut, DILP-5 in the adult ovary and, together with DILP-6, in the larval gut and DILP-7 in the embryo and in ten cells of the ventral nerve cord that innervate the adult female reproductive tract (Brogiolo et al. 2001, Ikeya et al. 2002, Yang CH et al. 2008). DILP-6, in response to ecdysteroid levels, additionally becomes highly expressed in the fat body when feeding stops and is essential for growth during the postfeeding stage (Okamoto et al. 2009). DILP-6 has further been shown to be upregulated during starvation, which, like its developmental expression, is dependent on activity of the transcription factor FOXO (Slaidina et al. 2009). All DILPs have growth promoting potential, though DILP-2 and DILP-6 have the largest effects (Ikeya et al. 2002). Single mutants are viable and have no (dilp-3, dilp-4 and dilp-5) or only mild growth defects, though all mutants show reduced female fertility and dilp-2 flies additionally show a slight increase in lifespan and a quite substantial increase in trehalose levels in the hemolymph. Overexpression of DILP-2, on the other hand, leads to a 40% increase in adult size due to increased cell size and number (Brogiolo et al. 2001). In contrast, dilp-2, -3, -5 triple mutants show a much more severe phenotype, with reduced male viability and female fertility, a massive developmental delay of 17 days, a substantial decrease in weight (-42%), and elevated lipid and glycogen levels (Grönke et al. 2010). Taken together, these data suggest that the Drosophila ILPs together mediate the many functions of mammalian insulin and IGFs and do so in a partly redundant manner. Furthermore, production of at least two DILPs (DILP-3 and DILP-5) has been shown to depend on nutrient levels, analogous to the production of insulin in response to elevated blood glucose levels (Ikeya et al. 2002), and DILP secretion from the IPCs is blocked under nutrient restriction (Géminard et al. 2009). A further protein, structurally resembling the DILPs, was recently identified and termed DILP-8. DILP-8 is produced by growing tissues and involved in the coordination between growth and developmental timing (Colombani et al. 2012). DILP-2 action is modified by the IGFBP7-like protein Imp-L2 (Imaginal morphogenesis protein-late 2).
Imp-L2 is secreted from the fat body under nutrient restriction and binds to DILP-2, thereby preventing its interaction with dInR. Conversely, Imp-L2 expression in specific neurons promotes selective uptake of DILP-2 and consequently high IIS activity in these cells (Bader et al. 2013). A second factor regulating DILP action is the secreted decoy of insulin receptor (SDR) protein. SDR is secreted by the central nervous system and the midgut throughout larval development and negatively regulates IIS by binding to DILP-3 (Okamoto et al. 2013).

In contrast to the mild phenotypes of DILP single mutants, complete loss of dInR function is lethal, but partial loss of function leads to a severe reduction in organ and body size as a consequence of both a reduction in cell number and cell size (Chen et al. 1996, Stocker and Hafen 2000, Brogiolo et al. 2001). The growth phenotype is accompanied by female sterility and a two-fold increase in lipid levels. Conversely, tissue-specific overexpression of InR leads to organ autonomous overgrowth due to increased cell numbers and cell size, an outcome reminiscent of the transforming potential of hyperactive IGF-IR signaling in mammalian cells (Valentinis and Baserga 2001). The fact that many other IIS components have been found to be oncogenes or tumor suppressor genes highlights the importance of IIS not only in development but also in cancer (Vogt 2001, Engelman et al. 2006).

The IRS protein CHICO binds the auto-phosphorylated residues of dInR, whereupon it is phosphorylated by the receptor and serves as a docking molecule to assemble downstream signaling components (Poltilove et al. 2000). Flies homozygous for a mutation in CHICO exhibit a phenotype similar to partial InR loss of function mutants. Chico mutants are semi-lethal and have a more than 50% proportionate reduction in body size compared to wild-type flies due to both fewer and smaller cells (Böhni et al. 1999). An explanation why the chico mutant phenotype is less severe than that of dInR mutants could be low levels of IIS signaling from dInR directly to the downstream target PI3K, independent of CHICO. dInR has a C-terminal extension absent in mammalian InR that contains several functional PI3K binding sites resembling the ones in IRS proteins (Ruan et al. 1995). In murine cell culture a chimeric dInR, consisting of the human InR insulin binding domain and the dInR intracellular domain, is able to activate PI3K in the absence of IRS (Yenush et al. 1996). It thus appears that the C-terminal extension of dInR can bypass insulin signaling through CHICO. Chico controls cell and organ size in an autonomous manner. In mosaic tissues, cells that are chico homozygous grow markedly slower than and are often outcompeted by their heterozygous neighbors while also showing a cell-autonomous decrease in cell size. Selective removal of chico function in the eye imaginal disc leads to adults with small heads and eyes, whereas the rest of the body is of normal proportion. Despite the overall reduction in size, chico flies show an approximately two-fold increase in lipid levels and female sterility (Böhni et al. 1999).
Via distinct phosphorylated tyrosine residues in its C-terminus that bind the Src-homology 2 (SH2) domains of various proteins, CHICO mediates association with components of the two main downstream signaling pathways, the Ras/MAPK pathway, which is involved in growth and proliferation, and the Phosphoinositide 3-kinase (PI3K) pathway, the main effector for regulation of growth and metabolism (Poltilove et al. 2000, Oldham and Hafen 2003, Shaw 2011). Binding of CHICO to PI3K is essential for its function in growth control, whereas binding to the Ras-MAPK adaptor Drk/Grb2 is dispensable (Oldham et al. 2002). The absolute requirement of the CHICO-PI3K interaction is surprising given the functional PI3K binding sites in dInR (Yenush et al. 1996). The regulatory subunit of class IA PI3K (p85 in mammals, p60 in Drosophila) is recruited to the cell membrane upon binding of insulin and insulin-like growth factors to the insulin receptor and subsequent phosphorylation of the insulin receptor substrate (IRS). When bound to IRS, the p110 (Dp110 in Drosophila) catalytic subunit of PI3K catalyzes the phosphorylation of phosphatidylinositol-4,5-bisphosphate (PIP2) to phosphatidylinositol-3,4,5-trisphosphate (PIP3), a reaction that is antagonized by the lipid phosphatase PTEN (Maehama and Dixon 1998, Vanhaesebroeck et al. 2001). PTEN is a tumor suppressor gene whose normal function is to inhibit survival signaling via PI3K, and it is the second most frequently mutated gene in human cancers. PTEN loss of function causes overgrowth in Drosophila and transformation in mammals (Kennedy et al. 1997, Gao et al. 2000, DiCristofano and Pandolfi 2000). PI3Ks are evolutionarily conserved proteins that are regulated by cell surface receptors, and have roles in mitogenesis, differentiation, cell survival, control of the cytoskeleton and cell shape, chemotaxis and receptor trafficking (Stephens et al. 1993, Engelman et al. 2006). Modulation of PI3K activity in the heart of mice results in an autonomous decrease or increase in heart size due to a change in cell size (Shioi et al. 2000). In Drosophila, PI3K mutant phenotypes are consistent with those of dInR mutants, demonstrating the crucial role of PI3K in relaying the growth regulatory input from dInR to downstream targets. Modulation of PI3K activity autonomously affects tissue size through a change in both cell size and number (Leevers et al. 1996, Weinkove et al. 1999). The size reduction is even more pronounced in a situation where CHICO is lacking, leading to an additional reduction in wing size by 48%. Similarly, flies carrying a hypomorphic InR allele in a chico homozygous background have further reduced cell numbers (Böhni et al. 1999). These observations support the claim that Drosophila InR can signal in the absence of CHICO, via Dp110/PI3K (Yenush et al. 1996, Böhni et al. 1999).

Conversion of PIP2 to PIP3 leads to the recruitment of PIP3-binding proteins, like protein kinase B (PKB, Akt in mammals), whose location is usually cytoplasmic, and phosphoinositide dependent kinase 1 (PDK1) to the membrane. Activation of PKB requires phosphorylation by PDK1 and the Target of Rapamycin Complex 2 (TORC2) at two sites that become accessible only when PKB is bound to PIP3 (Burgering and Coffer 1995, Franke et al. 1995, Didichenko et al. 1996, Matsuzaki et al. 1996, Sarbassov et al. 2005). The membrane recruitment of PKB and its interaction with PIP3 are crucial for propagating signaling further downstream. A form of PKB that can no longer localize to the membrane due to a mutation in the domain used for association with PIP3 fully rescues the lethality of PTEN null mutants, indicating that PKB is the major mediator of the PI3K pathway (Stocker et al. 2002). Mammals possess three forms of PKB, Akt1-3, that regulate biological processes such as cell proliferation, survival and metabolism through a wide variety of substrates. Many upstream mutations are known that lead to hyperactivation of Akt/PKB signaling, which in turn causes diseases such as cancer, diabetes and cardiovascular and neurological disorders (Hers et al. 2011).

3.3.2 The transcription factor FOXO regulates many downstream targets of IIS and is essential for survival under stress conditions: An important target of Akt/PKB phosphorylation is the forkhead family of transcription factors (FOXO). Whereas Drosophila only has one, mice and humans have four FOXO transcription factors (FOXO1, FOXO3a, FOXO4, FOXO6) that modulate the expression of genes involved in apoptosis, the cell cycle, stress responses such as DNA damage repair and oxidative stress, cell differentiation and glucose metabolism (Huang and Tindall 2007). Phosphorylation by Akt/PKB leads to sequestration of FOXO in the cytoplasm by binding to 14-3-3 proteins and prevents it from regulating target genes with roles in metabolism in the nucleus (van der Heide et al. 2004). 14-3-3 proteins bind phosphoserine motifs on a multitude of substrate proteins involved in signaling, scaffolding and metabolism. 14-3-3 binding can have diverse effects such as inactivation of catalytic enzymes, inducing a conformational change, relocalization of proteins, bridging between proteins and a shielding function (Thomas et al. 2005). FOXO activity promotes gluconeogenesis and negatively regulates expression of genes that promote glucose metabolism, like glycolytic enzymes and enzymes of the pentose phosphate pathway (PPP) and de-novo lipid biosynthesis. Negative regulation of FOXO activity by insulin signaling via Akt/PKB thus promotes glucose utilization and suppresses gluconeogenesis (White et al. 2002, Zhang W et al. 2006). FOXO further positively regulates the transcription of cell-cycle inhibitors, pro-apoptotic enzymes and eukaryotic translation initiation factor 4E-binding protein (4E-BP) (Brunet et al. 1999, Medema et al. 2000, Schmidt et al. 2002, Engelman et al. 2006).
4E-BP, the regulator of eukaryotic translation initiation factor 4E (eIF4E), the mRNA 5’ cap binding protein, is crucial for survival under environmental stress such as dietary restriction and oxidative stress (Jünger et al. 2003, Tettweiler et al. 2005, Teleman et al. 2005). Consistent with the repressive action of FOXO on cell metabolism and proliferation, constitutively active FOXO leads to arrest of cell growth, and tissue specific overexpression of FOXO leads to an autonomous decrease in tissue size due to reduced cell numbers (Puig et al. 2003). FOXO is also involved in feedback regulation on insulin signaling in mammals and Drosophila, as FOXO directs transcription from the InR locus in response to dietary restriction. Active insulin signaling thus leads to a decrease in InR transcription due to its repressive action on FOXO (Puig et al. 2003, Puig and Tjian 2005). Despite its role in transcriptionally regulating many important downstream targets, loss of FOXO function is not detrimental for growth under normal conditions. FOXO may, however, become crucial under stress conditions. Through activation of 4E-BP, FOXO mediates part of the reduction in cell number caused by loss of PKB function and may thus be essential for survival under conditions where insulin signaling is low, like starvation or oxidative stress (Jünger et al. 2003, Tettweiler et al. 2005, Teleman et al. 2005). Apart from being regulated through cytoplasmic retention, FOXO is also targeted for degradation through ubiquitination in response to insulin signaling (Matsuzaki et al. 2003). PI3K signaling via Akt/PKB also directly promotes general translation by inactivating glycogen synthase kinase 3 (GSK3). GSK3 normally phosphorylates and inhibits eukaryotic initiation factor 2B (eIF2B), so insulin signaling activates eIF2B. eIF2B mediates recycling of eukaryotic initiation factor 2 (eIF2), which is required for translation initiation (Proud 2006).
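
The qualitative logic of the cascade described above, from receptor stimulation to FOXO and eIF2B, can be summarized as a small on/off model. The following is only an illustrative sketch under strong simplifying assumptions: every component is either active or inactive, PTEN loss is treated as PIP3 accumulating even without ligand, and feedback, TORC2 input and CHICO-independent signaling are ignored. The names `IISState` and `iis_state` are hypothetical and do not refer to any existing software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IISState:
    pip3: bool           # PIP3 present at the membrane
    pkb_active: bool     # Akt/PKB recruited to PIP3 and activated
    foxo_nuclear: bool   # FOXO free to regulate its target genes
    four_ebp_high: bool  # 4E-BP transcription induced by FOXO
    eif2b_active: bool   # eIF2B relieved from inhibition by GSK3


def iis_state(insulin: bool, pten_lost: bool = False) -> IISState:
    """Propagate a single insulin input through a qualitative on/off cascade."""
    # PI3K produces PIP3 when the receptor is stimulated. PTEN antagonizes
    # this reaction; losing PTEN is modelled here (an assumption) as PIP3
    # being present regardless of ligand.
    pip3 = insulin or pten_lost
    # Akt/PKB is activated only when it is bound to PIP3 at the membrane.
    pkb_active = pip3
    # Active Akt/PKB phosphorylates FOXO, which is then retained in the cytoplasm.
    foxo_nuclear = not pkb_active
    # Nuclear FOXO induces 4E-BP.
    four_ebp_high = foxo_nuclear
    # Active Akt/PKB inactivates GSK3, which otherwise inhibits eIF2B.
    gsk3_active = not pkb_active
    eif2b_active = not gsk3_active
    return IISState(pip3, pkb_active, foxo_nuclear, four_ebp_high, eif2b_active)
```

Under these assumptions, `iis_state(True)` describes a fed state in which FOXO is cytoplasmic and eIF2B is active, whereas `iis_state(False)` describes low insulin signaling with nuclear FOXO and elevated 4E-BP. The model is not quantitative and is meant only as a compact restatement of the text.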

3.3.3 The Target of Rapamycin (TOR) pathway

3.3.3.1 The TOR pathway controls cellular and organismal growth in response to local abundance of growth factors, amino acids, cellular energy levels and stress: The second branch of Akt/PKB controls, via regulation of tuberous sclerosis complex (TSC), the nutrient sensing target of rapamycin (TOR) pathway, an important regulator of metabolic control from yeast to mammals. Together with IIS, TOR is the main pathway for integrating environmental cues with cellular growth rates and metabolism. Whereas IIS mediates the response to systemic hormonal stimuli, which carry information about the organism’s nutritional status, thus coordinating nutrient availability with cellular growth (Ikeya et al. 2002, Grewal 2009), the TOR pathway, which is conserved in all eukaryotes, integrates nutrient availability in terms of amino acid levels, growth factors, stress and the energy status of a cell to control its growth (Zhang H et al. 2000, Dennis et al. 2001, Loewith et al. 2002, Avruch et al. 2006). In contrast to IIS, the TOR pathway mainly senses the cell-autonomous environment, though it may affect cellular growth in a non-autonomous manner, impacting on the growth of more distant cells (Colombani et al. 2003, Martin and Hall 2005). The fat body plays a central role in coupling nutrient sensing by the TOR pathway to regulating IIS signaling in more distant tissues. Via TOR, the fat body is able to sense nutrient fluctuations in the hemolymph. If nutrient levels are low, humoral signals are released that inhibit secretion of DILPs from the IPCs and thereby reduce IIS signaling systemically (Colombani et al. 2003, Géminard et al. 2009). The downstream targets of TOR signaling include transcriptional and translational regulation, ribosome biogenesis, nutrient transport, organization of the actin cytoskeleton and autophagy. 
In addition to its growth regulatory function during development, IIS/TOR signaling functions in the response to starvation and stress as well as in fertility and aging (Schmelzle and Hall 2000, Grewal 2009, Fontana et al. 2010, Goberdhan 2003).

Yeast has two TOR proteins, TOR1 and TOR2, which form two structurally and functionally distinct protein complexes, TORC1 and TORC2. Only TORC2 additionally has a function in the regulation of the cell-cycle dependent polarization of the actin cytoskeleton (Schmidt et al. 1996). In higher eukaryotes like Drosophila, mice and humans there is only one TOR protein (dTOR or mTOR, respectively), but the Ser/Thr kinase TOR associates with different proteins to form two functionally distinct protein complexes, TORC1 and TORC2, which are conserved in their function (Loewith et al. 2002, Jacinto et al. 2004). TORC1 is composed of TOR, Raptor (regulatory associated protein of TOR), LST8 (lethal with Sec-13 protein 8) and other components and is sensitive to inhibition by rapamycin. TORC1 responds to growth factors, nutrients, cellular energy levels and stress and performs functions in protein synthesis, ribosome biogenesis and autophagy, which converge on cell growth, mainly via regulation of S6K and 4E-BP but also other substrates. TORC1 thus has a crucial role in maintaining a balance between cellular energy consumption through biosynthetic processes and cellular energy production (autophagy), while integrating cellular metabolism with external cues to prevent energy wasting under unfavorable conditions (Tumaneng et al. 2012). TORC2 is composed of TOR, Rictor (rapamycin insensitive companion of TOR), LST8 and other components, lacks rapamycin sensitivity and does not regulate growth via the same targets as TORC1, but may modulate insulin signaling. An important point of influence is the phosphorylation of Akt/PKB, which is dispensable for growth in normal conditions but required for tissue overgrowth when IIS is hyperactivated (Inoki and Guan 2006, Wullschleger et al. 2006, Hietakangas and Cohen 2009).
TORC2 is further required for regulation of cell morphology by controlling cytoskeleton dynamics and in yeast is important for survival following DNA damage (Jacinto et al. 2004, Weisman et al. 2014).

3.3.3.2 TORC1 responds to growth factors via IIS, extracellular amino acid levels, intracellular energy status and stress: TORC1 is activated in a cell-autonomous manner either in response to growth factors, via the PI3K pathway, or in response to extracellular nutrients and oxygen and cellular energy levels (Colombani et al. 2003, Nobukuni et al. 2005, De Virgilio and Loewith 2006, Wullschleger et al. 2006). TORC1 activation is coupled to IIS via Akt/PKB and occurs mainly at the lysosome, which is also the cell compartment where amino acid levels are sensed. Akt/PKB phosphorylates and thus inactivates the tuberous sclerosis protein 2 (TSC2), which in a complex together with TSC1 negatively regulates TORC1 signaling (Inoki et al. 2002). Both TSC1 and TSC2 are tumor suppressor genes, and mutations in either gene are associated with benign tumors in humans (Oldham and Hafen 2003). Accordingly, in Drosophila loss of TSC1/2 function results in overgrowth through hyperactivation of the translation promoting kinase S6K, as does overexpression of Rheb, the negatively regulated target of the TSC1/2 complex. It was long believed that the TSC1/2 complex inactivates the small GTPase Rheb (Ras homologue enriched in brain) by acting as its GTPase activating protein (GAP) (Gao et al. 2002, Inoki et al. 2003, Garami et al. 2003, Zhang et al. 2003). Only this year it was found that, in contrast to earlier belief, TSC1/2 inactivation does not occur by altering its GAP activity or its protein or complex stability; rather, it seems to be the targeted dissociation of TSC2 from the lysosomal membrane that prevents Rheb inactivation (Menon et al. 2014). Rheb activity is thus controlled by IIS-dependent shuttling of the TSC1/2 complex to and from the lysosomal compartment, and insulin signaling prevents inactivation of Rheb. Rheb is able to bind to TORC1 in its inactive form, but can only activate it in its active, GTP-bound form, possibly through inducing a GTP dependent conformational change in TORC1 (Long et al. 2005, Wullschleger et al. 2006). The second input of Akt/PKB signaling on TORC1 occurs via FOXO dependent up-regulation of 4E-BP, which also regulates TORC1 negatively (Oldham and Hafen 2003). As FOXO is sequestered in the cytoplasm and also subject to degradation when IIS signaling is high, IIS signaling promotes TORC1 activation by keeping 4E-BP levels low. TORC1 also directly senses extracellular amino acid levels, a response that involves the Ras-related GTPase (Rag). In the absence of amino acids, Rag is present in its inactive, GDP-bound form but assumes its active, GTP-bound form upon amino acid sufficiency. In association with a protein complex termed Ragulator, Rag promotes TORC1 re-localization to lysosomal compartments that contain Rheb, leading to Rheb dependent activation of TORC1, but only if Rheb has itself been activated by upstream growth signals via the PI3K-Akt/PKB axis (Kim et al. 2008, Sancak et al. 2010).
This signal is turned off actively, through recruitment of TSC1/2 to the lysosomal membrane by the inactive form of Rag (Rag-GDP) (Demetriades et al. 2014). Apart from being regulated by Akt/PKB, TSC2 is also a target of regulation through the energy sensor AMPK (AMP activated protein kinase). AMPK senses the energy status of a cell through the AMP to ATP ratio, being activated when cellular energy pools are depleted (high levels of AMP compared to ATP). Active AMPK phosphorylates and activates TSC2, which leads to a down-regulation of translation through inactivation of TORC1 (Inoki et al. 2003). AMPK can also directly phosphorylate Raptor and thereby promote its interaction with 14-3-3 proteins, which leads to an inactivation of TORC1. Via TORC1, AMPK thus mediates an important metabolic checkpoint that enables coupling cellular energy levels to energy demanding processes such as protein biosynthesis, enforcing cell cycle arrest when conditions are unfavorable (Gwinn et al. 2008).
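
The way TORC1 combines these inputs resembles a coincidence detector: amino acids bring TORC1 to the lysosome via Rag, but activation there additionally requires GTP-bound Rheb, which depends on growth factor signaling and sufficient energy. A minimal Boolean sketch of this logic is given below. It is an illustration under stated assumptions rather than a model taken from the cited studies: all components are treated as on/off, AMPK activation of TSC2 is assumed to override Akt/PKB inhibition of TSC2, and the 4E-BP and stress inputs are omitted. The function name `torc1_active` and its parameters are hypothetical.

```python
def torc1_active(insulin: bool, amino_acids: bool, energy_replete: bool = True) -> bool:
    """Return True if TORC1 is predicted to be active in this simplified scheme."""
    akt_active = insulin               # IIS activates Akt/PKB via PI3K
    ampk_active = not energy_replete   # AMPK is active when the AMP to ATP ratio is high

    # TSC1/2 restrains Rheb unless Akt/PKB removes it from the lysosome.
    # Assumption: active AMPK keeps TSC2 active even when Akt/PKB is active.
    tsc_restrains_rheb = (not akt_active) or ampk_active
    rheb_gtp = not tsc_restrains_rheb

    # Amino acid sufficiency converts Rag to its GTP-bound form, which
    # brings TORC1 to the Rheb-containing lysosomal compartment.
    rag_gtp = amino_acids
    torc1_at_lysosome = rag_gtp

    # AMPK also phosphorylates Raptor directly, which inactivates TORC1.
    raptor_inhibited = ampk_active

    return torc1_at_lysosome and rheb_gtp and not raptor_inhibited
```

In this sketch TORC1 is active only when growth factors, amino acids and energy are all available, for example `torc1_active(True, True)`, and withdrawing any single input switches it off. A graded or kinetic model would be needed to capture partial activation.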

Cellular stress, such as hypoxia, elevated levels of reactive oxygen species (ROS) or DNA damage, also leads to down-regulation of TORC1 signaling. The cellular response to hypoxia requires the two homologous proteins Scylla and Charybdis (REDD1 and REDD2 in mammals), which are induced by the hypoxia responsive transcription factor HIF-1 (hypoxia inducible factor 1) and inhibit TORC1 upstream of TSC1/2 but downstream of Akt/PKB (Reiling and Hafen 2004). The response to oxidative stress occurs at the peroxisomes and involves TSC1/2 and Rheb. In response to elevated ROS levels, TSC1/2 are bound by peroxisomal membrane proteins and thus recruited to peroxisomal surfaces, where Rheb is also present. TSC1/2 inactivation of Rheb then leads to inactivation of TORC1 and induction of autophagy (Zhang et al. 2013). Down-regulation of TORC1 in response to DNA damage requires p53 and the AMPK-TSC2 axis (Wullschleger et al. 2006).

3.3.3.3 TORC1 promotes translation, ribosome biogenesis and the expression of metabolic genes while inhibiting autophagy: TORC1 controls translation by phosphorylating and activating S6 kinase (S6K), which, by phosphorylation of the ribosomal protein S6, promotes translation initiation and elongation. S6K activation however additionally requires phosphorylation by several kinases, including PDK1 (Oldham and Hafen 2003). In an S6K independent manner TOR also promotes translation of mRNAs coding for ribosomal proteins and other components of the translational machinery, thereby promoting ribosome biogenesis (Patursky-Polischuk et al. 2008). Further promoting ribosome biogenesis, TORC1 regulates synthesis of ribosomal RNAs by promoting assembly of the Pol I pre-initiation complex and recruitment of the transcription initiation factor IA (TIF-IA) to rDNA loci (Grewal et al. 2007). The second main input of TORC1 on translation initiation is over phosphorylation and inhibition of 4E-BP, which liberates eIF4E to associate with mRNA and other components of the translation machinery to initiate translation (Beretta et al. 1996). TORC1 also promotes translation elongation by inducing dephosphorylation and activation of eEF2 (eukaryotic elongation factor 2), which mediates translocation of the ribosome by three nucleotides along the mRNA (Proud 2006). Activation of TORC1 thus results in a coordinated up-regulation of translation and the translational machinery under favorable growth conditions (Schmelzle and Hall 2000).

TORC1 regulates the expression of target genes involved in glucose and protein metabolism via several transcription factors, among them Myc and DREF (DNA replication related element binding factor) (Peng et al. 2002, Killip and Grewal 2012). In mammals, TORC1 also regulates the expression of the transcription factors hypoxia-inducible factor 1 alpha (HIF1alpha) and sterol regulatory element binding protein (SREBP1/2), which are important for the regulation of genes involved in glucose utilization, lipid synthesis and proliferation (Düvel et al. 2010). Generally, many of the TORC1 dependent genes that influence cell size also have a role in cell division. Most of these genes are conserved in mammals, and several are implicated in human diseases (Guertin et al. 2006).

Autophagy, the controlled lysosomal degradation of cellular organelles and components, is critically relevant for survival under starvation conditions. Eighteen different autophagy-related genes (ATGs) are involved in this process. Autophagy is induced under poor nutrient conditions and requires the catalytic activity of ATG1, which acts in a complex with other ATG proteins. In nutrient rich environments, autophagy is blocked via hyper-phosphorylation of ATG13 by TORC1, which prevents ATG13 from activating ATG1 (Hosokawa et al. 2009, Kamada et al. 2010). Apart from controlling general autophagy, TORC1 also regulates turnover and trafficking of nutrient transporters to control uptake of lipids, amino acids, glucose and iron (Edinger and Thompson 2002, Wullschleger et al. 2006). In summary, the activities of TORC1 result in a coordinated growth program by promoting biosynthetic processes and inhibiting catabolic pathways.

The combined actions of IIS and TOR signaling control cell growth, proliferation and metabolism during development and metabolic homeostasis in adults. The actions of IIS and TOR signaling are carefully orchestrated to ensure quick adjustment to fluctuating environmental and dietary conditions. An important point of crosstalk is the IRS proteins, which can be inactivated by high TORC1 signaling activity, thus suppressing IIS and preventing further activation of TORC1 via PI3K (Manning 2004). IRS proteins are further subject to degradation and inactivation by other mechanisms, highlighting their important role in feedback regulation on insulin/IGF signaling. Several feedback mechanisms thus allow tight control of the strength and duration of signaling through the pathway (Tremblay and Marette 2001, Saltiel and Kahn 2001, Harrington et al. 2004, Shaw 2011, DeStefano and Jacinto 2013). Another brake mechanism on growth occurs through upregulation of FOXO in response to hyperactivation of the TOR pathway (Jünger et al. 2003, Harvey et al. 2008). Approximately 20% of Drosophila genes are nutrient-sensitive, showing a change in transcript levels upon nutrient stimulation, and FOXO may regulate up to one third of all nutrient responsive transcripts (Gershman et al. 2007). Nutrient-responsive FOXO targets are involved in nutrient sensing, protein synthesis, growth, fatty acid oxidation and storage and mitochondrial biogenesis, highlighting FOXO as a key transcriptional regulator of nutrient sensing, utilization and storage systems and enabling it to integrate sensitivity of the cell to systemic nutrients with local nutrient concentrations (Gershman et al. 2007).

Although IIS/TOR signaling can regulate all three main mechanisms involved in body size control (cell growth, proliferation and apoptosis), other pathways are necessary for additional functions and more fine-grained control. One important player that integrates the effects of many pathways is the transcription factor Myc. Myc function is crucial for growth, proliferation and apoptosis, and many of its effects on growth are mediated through its activation by IIS/TOR signaling.

3.3.4 Myc controls cell growth and cell numbers downstream of TOR and other pathways: The Myc family of transcription factors is one of the most studied gene classes, partly due to their broad involvement in a variety of tumors in many species. As classical proto-oncogenes, Myc proteins are involved in cellular processes such as growth, energy metabolism, proliferation, differentiation and apoptosis (Cole 1986, Eilers and Eisenmann 2008). Myc requires its interaction partner Max for activation of target genes, and is antagonized by the transcriptional repressor Mnt. Myc:Max dimers bind to specific sequence elements called E boxes and activate proximal target genes (Bellosta and Gallant 2010). Myc is a downstream target of TORC1 and IIS signaling, with FOXO controlling Myc transcript levels in a tissue-specific manner and TORC1 regulating Myc protein abundance, which ensures a rapid adaptation to changing nutritional conditions. Myc is especially important in promoting up-regulation of the translational machinery through its activation of genes involved in ribosome biogenesis and protein translation (eIF4E, eIF2alpha). The presence of E boxes in more than 90% of TORC1 target genes suggests that Myc is the main mediator by which TORC1 regulates genes involved in ribosome assembly (Teleman et al. 2008). Myc is both necessary and sufficient to regulate rRNA synthesis by increasing the levels of components of the Pol I transcription machinery. Furthermore, this de novo rRNA synthesis is an absolute requirement for Myc-driven growth in the wing disc. Apart from genes involved in ribosome biogenesis, further Myc targets include genes with roles in nucleotide and amino acid metabolism, Pol II transcription, protein folding and targeting, and cell cycle regulation (Grewal et al. 2005). In Drosophila, loss of diminutive (dm, Drosophila Myc) leads to death during early larval stages.
Dm hypomorphs show a proportionate reduction in body size, smaller cells with reduced rRNA content, female sterility and short, thin bristles, whereas Myc overexpression leads to 30% bigger animals with bigger cells and more ribosomes (Gallant et al. 1996, Bellosta and Gallant 2010). In tissues, dm mutant cells have delayed cell growth and decreased growth rates, with the opposite effect observed upon overexpression. Dm overexpression promotes the progression from G1 to S, but not from G2 to M, resulting in no net acceleration of cell division. Apart from its effects on cell growth and cell cycle progression, Myc also has an effect on patterning, as the morphogen Wingless (Wg) patterns growth in the wing disc by modulating Myc expression (Johnston et al. 1999). There is also evidence that Myc plays a beneficial role in tissue regeneration: overexpression of Myc after tissue damage enhances the regenerative process, and Myc is upregulated in regenerating wing discs in a Wg-dependent manner (Smith-Bolton et al. 2009).

Myc not only controls cell number by stimulating proliferation, but is also an important regulator of apoptosis. The pro-apoptotic action of Myc occurs via the apoptosis regulator Head involution defective (Hid), which promotes caspase activation and apoptosis by binding to and inhibiting inhibitors of apoptosis (IAPs). Apoptosis is developmentally extremely important, as it is a mechanism to control cell number and thus proper tissue size (de la Cova et al. 2004). Apoptosis occurs in response to developmental cues, environmental stress or lack of survival signals. During growth, apoptosis is repressed partly via the PI3K pathway. Overexpression of Myc in wing disc cells causes these cells to undergo apoptosis, whereas cells carrying a hypomorphic myc allele show reduced apoptosis upon exposure to ionizing radiation (IR) (Montero et al. 2008). Another way of inducing apoptosis is cell competition, a process observed in mosaic tissues composed of cells with differential Myc levels. Cells that have elevated Myc levels have a growth advantage and outcompete their neighbors, inducing them to undergo apoptosis. The effects of cell competition can be counteracted by survival signals from the transforming growth factor beta (TGF-beta) family member Decapentaplegic (Dpp) (Basler and Moreno 2004, de la Cova et al. 2004). The mechanisms by which cells with reduced levels of Myc are outcompeted and eliminated from the tissue are the same as those observed in Minute mutant cells. Minute genes encode components of the translational machinery, which are also targets of Myc (Bellosta and Gallant 2010).

Cells carrying mutations in components of a third growth regulatory pathway, the Hippo tumor suppressor pathway, have been shown to possess properties of supercompetitors. In these cells, Myc is overexpressed since it is a transcriptional target of Yorkie, the main downstream effector of the Hippo pathway, and mediates the cell competition (Ziosi et al. 2010). Myc thus represents a node where effects of different growth pathways are integrated, which enables systemic coordination. By itself repressing Yorkie levels, Myc feeds back on its own production. This mechanism provides a possibility for coordinating the cellular activities of the Hippo pathway and Myc, and eventually for balancing growth (Neto-Silva et al. 2010).

3.3.5 The Hippo tumor suppressor pathway controls cell number by regulating cell cycle progression and apoptosis in an organ autonomous manner: The Hippo tumor suppressor pathway restricts cell proliferation in developing tissues in metazoans and plays an important role in stem cell proliferation and tissue regeneration, partly through transcriptional activation of Myc (Huang et al. 2005, Dong et al. 2007, Zhao et al. 2011). The high evolutionary conservation of the pathway is evident from the fact that mammalian core pathway components can functionally substitute for their Drosophila counterparts. The Hippo pathway integrates diverse inputs on organ growth, and mutations in pathway components are associated with a number of tumors in humans and promote tissue overgrowth in Drosophila (Saucedo and Edgar 2007, Yu and Guan 2013). The Hippo pathway thus plays an important role in the regulation of organ autonomous size determination in Drosophila and mammals by controlling cell proliferation and apoptosis.

The core of the pathway consists of the kinases Hippo (Hpo, MST1/2 in mammals) and Warts (Wts, LATS in mammals) and the WW domain protein Salvador (Sav). Hpo, Wts and Sav form a complex and both Sav and Wts are phosphorylation targets of Hpo, which in the case of Wts results in its activation. Sav facilitates phosphorylation of Wts by Hpo, possibly by acting as a scaffold. Loss of any of these components causes tissue overgrowth and reduced apoptosis due to elevated levels of the cell cycle regulator Cyclin E and the anti-apoptotic protein DIAP1 (Drosophila inhibitor of apoptosis 1) (Justice et al. 1995, Tapon et al. 2002, Harvey et al. 2003, Pantalacci et al. 2003, Udan et al. 2003, Wu et al. 2003). Cyclin E regulates the progression from G1 to S phase and DIAP1 inhibits apoptosis by inactivating caspases (Saucedo and Edgar 2007). Hpo can control abundance of the apoptosis regulator DIAP1 by phosphorylating and thereby destabilizing it. Sav, which by itself is an unstable protein normally targeted for proteasomal degradation, is stabilized by its interaction with Hpo (Pantalacci et al. 2003, Harvey et al. 2003). Apart from negatively regulating DIAP1 protein and mRNA levels (Wu et al. 2003), the Hippo pathway controls the induction of apoptosis by inducing the pro-apoptotic gene hid (head involution defective) (Udan et al. 2003). A fourth protein, Mats (Mob as tumor suppressor), physically associates with the Hpo-Sav-Wts core by binding to Wts, thereby potentiating its catalytic activity (Lai et al. 2005).

The downstream effector of Hpo/Sav/Wts is the transcriptional co-activator Yorkie (Yki, YAP in mammals). Yki is phosphorylated and thereby inactivated by Wts (Huang et al. 2005), since phosphorylation leads to its cytoplasmic sequestration through binding to 14-3-3 proteins (Dong et al. 2007). Apart from Myc, Cyclin E and DIAP1, Yki regulates expression of the 21-nucleotide microRNA bantam (Thompson and Cohen 2006). Bantam affects tissue growth by both stimulating proliferation and preventing apoptosis, and at least part of its anti-apoptotic function may stem from its repression of the pro-apoptotic gene hid (Brennecke et al. 2003). Yki activates its target genes by associating with the transcription factor Scalloped (Sd) (Zhang et al. 2008).

Hippo signaling is regulated by multiple upstream components and inputs. The two FERM domain proteins Expanded (Ex) and Merlin (Mer, NF2 in mammals) act upstream of Hpo and relay signaling from the membrane to the cytoplasm through interaction with transmembrane and cytoskeletal proteins. Mer and Ex have partly redundant functions in Drosophila and both need to be lost for overgrowth to occur (Boedigheimer and Laughon 1993, Boedigheimer et al. 1997, Hamaratoglu et al. 2006, Saucedo and Edgar 2007, Pan 2007). Mer and Ex do, however, influence distinct downstream pathways. Loss of Mer disturbs normal developmental apoptosis while loss of Ex only has a mild effect. Loss of Ex in contrast leads to delayed cell cycle exit and an upregulation of the morphogen Wingless (Wg) (Pellock et al. 2007). Mer physically associates with the WW domain protein Kibra, which can directly bind Wts, and together they act synergistically with the Ex branch on Hippo pathway activity (Baumgartner et al. 2010, Genevet et al. 2010). The localization of Mer, Ex and Kibra to the apical domain of cells suggests the plasma membrane as the place of activation of the Hippo core. The transmembrane protein Fat is another upstream activator of the Hippo pathway (Bennett and Harvey 2006, Silva et al. 2006, Willecke et al. 2006). Fat belongs to the family of protocadherins, cadherins that have large extracellular domains for interactions in cell-cell adhesion and cytoplasmic domains that are distinct between family members and from those of classical cadherins. Fat is localized at apical adherens junctions and is regulated by other proteins, such as the protocadherin Dachsous (Ds), the casein kinase discs overgrown (Dco), the Golgi-localized kinase Four-jointed (Fj) and the Fat/Ds interacting protein Lowfat (Lft) (Matakatsu et al. 2006, Rogulja et al. 2008, Willecke et al. 2008, Feng and Irvine 2009, Mao et al. 2009, Sopko et al. 2009, Simon et al. 2010, Tumaneng et al. 2012).
Fj and Ds levels are themselves subject to regulation by the Dpp and Wnt/Wg signaling pathways, which are important for patterning of organs, thus providing a link between growth control and pattern formation (Rogulja et al. 2008, Zecca and Struhl 2010). Loss of Fat leads to autonomous tissue overgrowth, elimination of Ex from adherens junctions and generally reduced Ex levels due to decreased translation or stability, while having no effect on localization of Mer (Bennett and Harvey 2006, Silva et al. 2006, Willecke et al. 2006). The effect of Fat on the core components and activation of Wts is thus mainly mediated through Ex, while parallel activation via Mer occurs in a Fat-independent manner (Tyler and Baker 2007). The promoting effect of Ex on Hippo signaling might occur through its action as an adaptor molecule between Hpo and Sav (Saucedo and Edgar 2007). The overgrowth of Fat mutant cells is dependent on the unconventional myosin Dachs, a downstream target of Fat. Fat regulates Dachs by preventing its localization to or affecting its stability at the membrane. Dachs interacts with Wts and destabilizes it, which may be a cause of loss of Hippo signaling (Mao et al. 2006, Saucedo and Edgar 2007). Hippo activity can also be modified by proteins involved in the establishment of cell polarity, such as Scrib, Dlg, Lgl, aPKC and Crumbs (Crb) (Chen et al. 2010, Grzeschik et al. 2010, Ling et al. 2010, Robinson et al. 2010). Further inputs on Hippo pathway activity occur at the level of the core complex, through the PP2A phosphatase complex STRIPAK, which is an inhibitor of Hpo, and the dJub protein (Ajuba LIM proteins in mammals), which antagonizes Yki phosphorylation through interaction with Wts and Sav (Das Thakur et al. 2010, Ribeiro et al. 2010). Lastly, modulation of pathway activity can occur in response to mechanical tension within the tissue (Schroeder and Halder 2012). This regulation involves dJub, which localizes to adherens junctions and there associates with alpha-catenin in a tension-dependent manner. Alpha-catenin links the actin cytoskeleton to adherens junctions and undergoes a conformational change when there is cytoskeletal tension, promoting its interaction with dJub.
The subsequent inhibitory recruitment of Wts to the junction by dJub presents a mechanism by which cytoskeletal tension can be integrated with growth output (Rauskolb et al. 2014). On top of the regulation by these diverse inputs, the Hippo pathway negatively regulates its own activity. Mer, Ex, Kibra and Ds are involved in a negative feedback loop on Hippo signaling, being inhibited by activation of the pathway through transcriptional regulation by Yki (Cho et al. 2006, Hamaratoglu et al. 2006, Willecke et al. 2006, Saucedo and Edgar 2007, Genevet et al. 2010).

3.3.6 The EGFR/Ras/MAPK pathway controls cell growth and proliferation: A further short-range signaling pathway that is crucially important for proper growth and development of organisms is the EGFR pathway. EGFR pathway activity integrates with other pathways at different levels, for example through the Ras/MAPK-dependent activation of Myc or through the EGFR adaptor molecule Drk (Grb2 in mammals), which may bind to IRS/CHICO. EGFR signaling controls cell division, differentiation, survival and migration and is misregulated in a variety of cancers (Rommel and Hafen 1998, Normanno et al. 2005).

Peptide growth factors of the epidermal growth factor (EGF) family bind to the EGF receptor (EGFR), a receptor tyrosine kinase, which undergoes autophosphorylation and recruits Ras to the membrane, where it gets activated. Drosophila has one EGFR homolog (Drosophila EGFR, DER), four activating ligands (Vein, Spitz, Keren, Gurken) and one inhibitory ligand (Argos). The neuregulin Vein is a secreted ligand and has the lowest activation capacity of the four activators. Spitz, Keren and Gurken, of which Spitz is the primary ligand, are transforming growth factor alpha (TGF alpha) homologs and are produced as transmembrane precursors. Production of the active, secreted ligands requires regulated cleavage of the precursors. The ligands differ in their spatial and temporal activation of EGFR. Pathway inhibition occurs through competitive binding of Argos to EGFR, heterodimerization with the transmembrane protein Kekkon or interception of the Ras/MAPK cascade by Sprouty (Livneh et al. 1985, Riese et al. 1998, Shilo et al. 2003, 2005). The activated receptor binds the adaptor protein downstream of receptor kinase (Drk, Grb2 in mammals) (Pawson et al. 2011). Drk interacts with the guanine nucleotide exchange factor (GEF) Sos (Son of sevenless), which recruits the small GTPase Ras. Sos stimulates activation of inactive, GDP-bound Ras by promoting the exchange of GDP for GTP (Buday et al. 1993, Olivier et al. 1993, Schlessinger et al. 1993, Diaz-Benjumea and Hafen 1994, McCormick 1994, Hunter 2000, Prober and Edgar 2002). Activated Ras recruits effector kinases such as Raf and PI3K, presenting a node of interaction between the growth regulatory functions of EGFR and IIS signaling. Many other effector kinases are known, and in all cases, Ras recruitment of the effector is the first step in a kinase cascade resulting in a series of activating phosphorylation events. The Ras/MAPK branch is the best characterized and most relevant for growth control in Drosophila.
Ras binding to Raf promotes activation of MEK/Dsor1 (MAP/ERK kinase/Downstream of Raf1), which then activates MAPK/ERK. Activation of MAPK results in its translocation to the nucleus, where it phosphorylates targets like the transcription factor Pointed, leading to its activation, and inactivates Yan, a negative regulator of Pointed target genes, which control expression of genes involved in proliferation, differentiation and cell survival (Gabay et al. 1996). A negative feedback loop on pathway activity exists in the activation of the EGFR inhibitors Argos, Kekkon and Sprouty (Chong et al. 2003, Shilo 2005, Doroquez and Rebay 2006), which also allows short-range signaling of the pathway. Up-regulation of EGFR pathway activity via Ras in Drosophila tissues leads to accelerated G1/S transitions and Raf/MEK/ERK-dependent tissue overgrowth. One of the downstream targets of Ras signaling is Myc, which increases levels of the cell cycle regulator Cyclin E (G1/S transition) post-transcriptionally (Prober and Edgar 2000). A further interaction node between Ras and IIS/TOR is TSC2, which is a target of both pathways, suggesting that Ras/MAPK signaling and PI3K signaling converge on the control of translation via TOR (Wullschleger et al. 2006).

Systemic growth control occurs via IIS/TOR signaling, ensuring that nutrient availability is coupled to growth in the whole organism. TOR signaling adds a further level of control that restricts growth to favorable conditions: it responds to amino acid levels, cellular energy levels and stress, and can also act systemically by repressing DILP release from the IPCs under unfavorable conditions. An important organ for the control of systemic growth factor levels is the fat body (Colombani et al. 2003). The fat body is the functional homolog of vertebrate liver and fat cells. It is an important tissue for many metabolic functions, primarily the storage and release of energy in response to changing energy demands of the organism. The adipocytes, the main cells of the fat body, contain glycogen and triglyceride stores, which are crucial energy reservoirs during nonfeeding states (Arrese and Soulages 2010). Additionally, during nonfeeding stages, the fat body produces DILP-6, which is necessary for growth and development of organs after the larva has stopped feeding. Upon nutrient shortage during feeding, the fat body releases a humoral signal that suppresses secretion of DILPs from the IPCs, thus systemically reducing IIS signaling (Géminard et al. 2009). IIS and TOR actions are not only carefully orchestrated with each other, but also with the organ- and cell-autonomous actions of other growth regulators like the Hippo tumor suppressor pathway, the Ras/MAPK pathway and the proto-oncogene Myc.

3.4 Steroid hormones affect body size by controlling developmental timing in Drosophila: On top of the molecular machinery controlling growth, there is also a layer of humoral control that regulates the duration of different developmental stages. As final body size is dependent on the duration of the growth period, the mechanisms that regulate hormonal control of development also control body size. Drosophila undergoes three larval stages, during which most of the growth that determines final adult size occurs. Adult organs are present as precursor tissues in the developing larva, the imaginal discs. The wing discs that will give rise to the adult wings, for example, grow from a mere 50 cells to 50’000 cells during larval development. The third and last larval instar is the stage where the most dramatic size increase occurs: Drosophila gains about 70% of its final weight during that period (Mirth et al. 2005). Late in the third instar, upon reaching a critical weight, defined as the size after which starvation no longer delays metamorphosis, larvae stop feeding and begin wandering away from the food to find a place for pupariation (Beadle 1938, Robertson 1963, De Moed et al. 1991, Nijhout and Williams 1974, Shingleton 2010). During metamorphosis, the imaginal discs differentiate and evaginate, giving rise to the adult structures. In Drosophila, as in mammals, maturation is controlled by steroid hormones. Pulses of the steroid hormone 20-hydroxyecdysone (20E) promote the progression to the next developmental stage. A strong increase in 20E levels at the end of larval development is the cue to stop growth and start pupariation. 20E is bound by the ecdysone receptor (EcR) localized in the nuclear membrane, which in its 20E-bound form can directly regulate transcription of target genes involved in the cessation of feeding and puparium formation (Thummel 1996). 20E is produced in the brain, in the cells of the prothoracic gland (PG) upon stimulation by prothoracicotropic hormone (PTTH). PTTH release is coordinated with the attainment of critical weight, in response to decreasing titers of juvenile hormone (JH). If PTTH release is suppressed, larval development is longer and results in bigger adults with more cells (McBrayer et al. 2007). PTTH-dependent 20E release from the PG depends on signaling through the Ras and PI3K pathways, with increased Ras and PI3K signaling in PG cells leading to a shortened larval period and smaller adults due to premature 20E release, and decreased signaling having the opposite effect. PI3K pathway activity directly affects the size of the PG cells, constituting a mechanism through which larvae assess when critical weight has been reached (Caldwell et al. 2005, Mirth et al. 2005, Shingleton 2010). Consistently, starvation, which reduces cell size and IIS activity, leads to a delay in development through IIS-dependent repression of 20E synthesis. 20E can in turn directly affect larval growth rate by repressing IIS signaling in target tissues, suggesting a mechanism by which hormonal regulation of developmental timing and nutrient-dependent growth are interconnected (Colombani et al. 2005, Edgar 2006, Mirth and Shingleton 2012, Colombani et al. 2012, Sarraf-Zadeh et al. 2013).
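
To put the wing disc growth figure quoted in this section (about 50 to 50’000 cells) into perspective, the following minimal sketch converts it into a number of cell doublings. It assumes, purely for illustration, that every cell divides and that no cells are lost to apoptosis; the function name `doublings` is hypothetical and not part of any analysis in this thesis.

```python
import math


def doublings(start_cells: float, end_cells: float) -> float:
    """Return the number of population doublings needed to grow from
    start_cells to end_cells, assuming every cell divides and none die
    (an idealization, not a claim about real discs)."""
    if start_cells <= 0 or end_cells <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(end_cells / start_cells)


# Approximate wing disc cell numbers quoted in the text.
n = doublings(50, 50_000)
print(f"{n:.2f} doublings")  # prints "9.97 doublings"
```

Under these assumptions, the thousand-fold increase in cell number corresponds to roughly ten rounds of division over larval development.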

Imaginal discs can themselves have an effect on developmental timing, delaying development upon tissue damage to create time for repair. Discs adjust their proliferation rate to ensure attainment of correct tissue size and shape (Simpson et al. 1980, Stieper et al. 2008, Smith-Bolton et al. 2009, Colombani et al. 2012). Evidence for discs having a target size stems from experiments where discs were dissected out of the larvae and transferred to the growth-permissive environment of the adult female abdomen. These discs grew to the same size and cell number as undissected discs (Bryant and Levinson 1985). This implies that organs sense their own size and when they have grown enough, and necessitates systems that can coordinate and communicate the size of different organs with each other and with final body size.

3.5 Various environmental factors can influence body and organ size in Drosophila: The effects of variable nutrient levels on body size are very well understood and are integrated with growth through the IIS/TOR pathways. Starvation leads to smaller animals through a reduction in cell size and/or number, whereas nutrient-rich environments promote growth. In humans, the consumption of excess nutrients is associated with obesity and the development of diabetes and cardiovascular disease. In Drosophila, the nutritional input occurs mainly via the fat body, which inhibits DILP-2, -3 and -5 secretion under food scarcity, and via direct repression of DILP-3 and -5 production during starvation (Ikeya et al. 2002, Géminard et al. 2009). On the cellular level, amino acid levels can directly modulate TOR pathway activity. Different levels of IIS/TOR activity directly underlie the effect of the endosymbiotic bacterium Wolbachia on body size in Drosophila. Infected animals have higher levels of IIS/TOR signaling and show a smaller reduction in body size upon nutrient shortage than uninfected individuals (Ikeya et al. 2009). Furthermore, other bacteria, and infections in general, may affect body size by altering IIS/TOR signaling levels (Mirth and Shingleton 2012). The effect of crowding on body size is in essence the same as that of nutrient restriction, as high crowding leads to increased competition for resources and variable nutrient availability. High crowding causes increased phenotypic diversity, as does nutritional stress (Imasheva et al. 2003). In line with this, flies also develop to smaller sizes if they are grown under highly crowded conditions.

Naturally existing latitudinal body size clines and laboratory selection experiments clearly show an effect of developmental temperature on body size. Larger body size is observed with increasing latitudes and colder temperatures, and size clines have been found to exist due to different developmental temperatures (Imasheva et al. 1994). The relative contributions of cell size and cell number to size differences in such clines may however vary (Zwaan et al. 2000). Generally, temperature changes cell size and not number (Partridge et al. 1994, Azevedo et al. 2002). In Drosophila, as in most ectotherms, growth at lower temperatures occurs at a reduced rate and leads to larger adult sizes (Atkinson 1994, Angilletta et al. 2004, Powell et al. 2010). Temperature sensitivity of body size is not uniform across development: shifts of flies to low (16.5°C) or high (29°C) temperatures at different stages of development show that shifts have a smaller effect on size when they occur at later developmental stages. The different discs and adult structures also show differential sensitivity, with the wing being affected until the end of the pupal stage and the thorax only until pupariation, even though they derive from the same disc, the wing disc, which undergoes most proliferation during larval stages. The abdomen, whose precursor tissue grows mostly after pupariation, in contrast shows temperature sensitivity through both larval and pupal stages (French et al. 1998). Flies that had been left to evolve at low temperature (16.5°C) for five years had lower growth rates and longer larval stages irrespective of the experimental temperature they were tested at, whereas the pupal stage was only longer than in the high temperature (25°C) evolved lines when tested at 25°C. Critical weight (CW) and pre-adult survival showed signs of being optimized for the temperature the flies were evolved at: both CW and survival were higher for cold-evolved lines at cold temperatures and for warm-evolved flies at higher temperatures (Partridge, Barrie and Fowler 1994). Ectothermic animals have developed mechanisms to tolerate fluctuations in external temperature over a substantial range without changing intracellular signaling, thus maintaining developmental robustness. Notch signaling, for example, uses different endocytic routes for Notch activation; temperature robustness is achieved by adjusting the flux balance through competing endocytic trafficking routes and is coupled to temperature-dependent ubiquitination of Notch. The combined effects support signaling at lower temperatures and restrain signaling at higher temperatures (Shimizu et al. 2014). The mechanisms by which temperature affects size are, however, unknown apart from a possible influence on cell proliferation rate (Shingleton 2010).

Interestingly, the scaling, or allometry, with body size is distinctive for different organs and may even vary for the same organ under different environmental conditions. Whereas wing size scales proportionately with body size upon changes in nutrient levels, the scaling is hypermetric, meaning wings get bigger relative to the body, with increasing temperature (Shingleton et al. 2009). The differential sensitivity among organs is again best explored in response to nutrients. Animals protect growth of critical organs during periods of nutrient shortage to ensure survival. Genitals, arguably the most important organs for fertility and fitness, are less reduced in size upon nutrient shortage than, for example, wings. Mechanistically, this is caused by insulin insensitivity of cells in the genital discs. Insensitivity arises because these cells express only low levels of FOXO, rendering the braking mechanism of FOXO on growth relatively ineffective (Tang et al. 2011). Similarly, neuronal cells are spared from the down-regulation of growth upon nutrient restriction. This, however, involves a different mechanism than for genitals. In neural progenitor cells, the protein ALK (anaplastic lymphoma kinase) relieves the dependence of the cells on activating cues from amino acid levels and TORC1 and directly activates PI3K, thus maintaining PI3K signaling under nutrient restriction (Cheng et al. 2011).
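
The scaling relationships described above are commonly summarized by the allometric equation y = b·x^a, where x is body size, y is organ size and the exponent a is the slope of log(y) against log(x): a slope of 1 corresponds to proportionate scaling and a slope above 1 to hypermetric scaling. The sketch below estimates this slope by ordinary least squares on log-transformed values. It is an illustration only: the function name `allometric_slope` and all size values are made up (arbitrary units) and do not come from this thesis or from the cited studies.

```python
import math


def allometric_slope(body_sizes, organ_sizes):
    """Least-squares slope of log(organ size) against log(body size)."""
    if len(body_sizes) != len(organ_sizes) or len(body_sizes) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    xs = [math.log(v) for v in body_sizes]
    ys = [math.log(v) for v in organ_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("body sizes must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical body sizes and wing sizes generated from y = b * x**a.
body = [0.6, 0.8, 1.0, 1.2, 1.4]
proportionate_wing = [2.0 * x ** 1.0 for x in body]  # a = 1.0
hypermetric_wing = [2.0 * x ** 1.3 for x in body]    # a = 1.3 (made up)

print(round(allometric_slope(body, proportionate_wing), 3))
print(round(allometric_slope(body, hypermetric_wing), 3))
```

With real measurements the points would scatter around the fitted line, so the slope would come with an uncertainty that this sketch does not estimate.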

Not only growth factors, but also mechanical forces may influence the growth and proliferation of organs. Cultured cells and organs respond to physical forces, increasing proliferation when cytoskeletal tension is high, such as under conditions of high cell density, and decreasing it when tension is low (Huang and Ingber 1999, Aegerter-Wilmsen et al. 2007, Aegerter-Wilmsen et al. 2012, Schluck et al. 2013). The Hippo pathway, through inhibitory recruitment of Wts to the adherens junctions by dJub, is one possible mediator of the effect of physical forces on growth (Rauskolb et al. 2014).

The chapters so far have introduced the molecular pathways and components, the systemically acting growth factors and the hormonal input that interact with each other and with the environment to control organism and tissue growth and proliferation in Drosophila. The following chapter will introduce the Drosophila wing as a model tissue for growth control and give a brief summary of how the wing develops.

3.6 The Drosophila wing as a model system to study growth control: Growth and its control have been studied in various organisms, from simple single-celled systems like the bacterium Escherichia coli or the fungi Saccharomyces cerevisiae and Schizosaccharomyces pombe, through invertebrates like Caenorhabditis elegans and Drosophila melanogaster, to mammals like mice (Mus musculus) and eventually humans. For the single-celled systems, proliferation, and thus the ability to grow fast, is the most important readout for the fitness of the organism. Studies of growth control mainly involve gene knockouts to understand which cellular signaling pathways and processes are most vital for normal growth of the cell. Recently, the rise of techniques that enable quantification of metabolites in cells has brought a shift towards studying the impact of central metabolism and anabolic processes on growth. The single-celled systems are clearly very well suited to studying processes within a cell that are important for that cell’s survival and reproduction, but cannot capture the more complex aspects of growth in tissues. Invertebrate systems have the advantage of being cheap, easy to culture and having a short generation time while possessing distinct cell types and tissues and an endocrine system, which is known to be crucially involved in the growth control of multicellular organisms. However, despite a high level of conservation of signaling pathways, invertebrates are still evolutionarily quite distant from humans. For this reason, mammalian systems like the mouse are very popular organisms for studying growth, since the complexity of the involved pathways, often having several molecules where e.g. Drosophila only has one, is much closer to the situation in humans. As experiments in mice are still rather expensive and time-consuming and the basic principles underlying growth control are conserved throughout metazoans, an invertebrate model like Drosophila is a good compromise for studying basic processes underlying growth. Furthermore, many important growth genes that have been implicated in human cancers are conserved between Drosophila and humans, so studies in Drosophila may help to formulate hypotheses for elucidating their function in humans. Last but not least, Drosophila has a long history as a model organism and thus many genetic tools are readily available for modulating gene activity to study gene function. Tissue-specific knockdown and overexpression of genes is greatly facilitated by the use of the UAS-Gal4 system (Brand and Perrimon 1993), and the method by Xu and Rubin allows the generation of mosaic tissues and thus the study of the effects of loss of essential genes on cell and organ size (Xu and Rubin 1993). Apart from the wealth of genetic tools, numerous bioinformatic resources exist. The annotated genome sequences of Drosophila melanogaster and twelve of its most closely related species are publicly available, along with many databases carrying information on gene expression, protein interaction, intergenic sequence elements and a multitude of other information (Celniker et al. 2009, Franceschini et al. 2013, St Pierre et al. 2014). The Drosophila imaginal discs, sheets of epithelial cells in the developing larvae from which adult structures are formed, are ideally suited organs for studying the processes governing tissue growth (Weinkove and Leevers 2000). The imaginal discs arise as small cell clusters during embryogenesis. Nearly all cell growth and division occurs within the imaginal tissues in the few days between the first instar larva hatching from the egg and its pupariation at the end of the third larval instar. Thus, the size of adult structures such as the wing is a function of the amount of growth occurring in the imaginal discs during larval development.
The gain in mass is substantial; the wing disc, for example, grows from approximately 50 to 50’000 cells between first and third instar while simultaneously undergoing patterning (Serrano and O’Farrell 1997). During metamorphosis, all larval tissues die and only the imaginal cells come together to form the adult exoskeleton. The wing disc evaginates after the start of pupariation and merges with neighboring discs to continue cell division during early pupal stages (Schubiger and Palka 1987) until eventually the epidermis secretes the adult cuticle. During development, each wing disc is divided into specific compartments, with cells in each compartment assuming a specific identity and not mixing with cells of other compartments. Disc growth depends on systemic growth factors and hormones, but also relies on disc-intrinsic growth programs acting through long- and short-range signaling molecules, as is the case in mammalian appendages (Bryant and Levinson 1985). Long-range effects are mediated by components expressed along the compartment boundaries, like Wingless (Wg) along the dorsoventral (DV) boundary of the disc and Dpp along the anterior-posterior (AP) boundary. Wingless expression is induced by Notch signaling activity at the boundary, and from there Wg diffuses to both sides, acting, like Dpp, as a morphogen: a long-range signaling molecule that forms a gradient across tissues and, by directly binding to cells, activates cellular responses in a threshold-dependent manner (Herranz and Milan 2008, Swarup and Verheyen 2012). The formation of compartments and the expression of these long-range acting molecules are crucially important for proper development and tissue growth, as malformation of the DV boundary leads to growth defects in the whole wing disc. While lack of Notch signaling at the DV boundary prevents growth in the whole disc, ectopic Notch activity in non-boundary cells leads to overgrowth of the whole disc. Generally, pathways involved in patterning may also control proliferation in developing discs, thus providing a mechanism for coordinating growth and patterning. In eye discs, the EGFR, Notch, Hh and Dpp signaling pathways control proliferation in a spatial manner through transcriptional regulation of target genes that themselves control Cyclin E and other cell cycle regulators. In wing discs, a gradient of Dpp, the homolog of mammalian bone morphogenetic proteins, emanates from the AP boundary and instructs cells, in a concentration-dependent manner, to assume a position-specific fate, which is important for the later formation of wing veins (Baker 2007). Dpp acts by binding to its receptor thickveins (Tkv), which leads to phosphorylation and activation of the transcriptional activator mothers against dpp (Mad). Mad controls expression of Dpp target genes such as optomotor blind (omb) and spalt. The extracellular Dpp gradient is thus translated into an intracellular gradient of Mad activity across the disc. As Dpp target genes have different activation thresholds, this ensures adoption of distinct cell fates depending on the levels of activated Mad (Restrepo et al. 2014).
Dpp also promotes relatively uniform proliferation across the disc, but it is not well understood how this uniform proliferation is achieved given the gradient of Dpp levels across the disc. The fact that Dpp signaling can act in a non-autonomous manner could be an underlying mechanism ensuring uniform proliferation across the disc despite different perceived Dpp levels. Alternatively, cells could react to the temporal change in Dpp levels, which in relative terms is the same for all cells and could thus explain uniform proliferation (Baker 2007, Hamaratoglu et al. 2014). Dpp action on growth might be only permissive, rather than directly essential, by inhibiting the transcriptional repressor Brinker (Brk) in parts of the disc (Campbell and Tomlinson 1999). Brk, when overexpressed, induces cell death or only mildly reduces proliferation in disc cells, depending on where in the disc the ectopic expression occurs. Brk action on proliferation is likely mediated via its downstream targets Myc and bantam, but Spalt and Omb, two further classical patterning genes, might also play a role. Further highlighting their importance in the regulation of patterning, Wg and Dpp are released by damaged cells upon injury, inducing tissue proliferation to replace the lost cells (Baker 2007, Hamaratoglu et al. 2014). In summary, in the developing wing disc systemic factors and environmental cues are integrated with tissue-intrinsic growth and patterning signals to form a correctly sized and patterned wing.

As has become clear from the studies summarized in the preceding chapters, organismal size is a highly complex phenotype that is influenced by many genes across the genome and is subject to environmental variation (Gockel et al. 2002, Cook and Tyers 2007, Lango-Allen et al. 2010). Much has been learned about the control of organismal and tissue growth from single-gene studies in model organisms, especially Drosophila. At the systems level, growth factors enable integration of growth with environmental cues. At the level of individual organs, morphogens and physical forces act to control size. Organs also differ in their sensitivity to environmental cues, indicating that integration with the external environment occurs not only at the systemic but also at the organ level. At the cellular level, multiple pathways affect cell growth, proliferation and apoptosis in response to systemic and organ-intrinsic stimuli and need to be integrated with each other. In addition to growth control, and largely neglected here, patterning has to be regulated and integrated with growth to form a normally developed organism, suggesting the existence of mechanisms that allow coordination of growth and patterning.

In light of the evidence reviewed above it is clear that regulatory networks rather than single genes or pathways govern the control of growth and proper organismal size. If we ever want to gain a complete understanding of growth control, this then raises the need for methods that evaluate the influence of whole pathways, networks of genes and eventually the whole genome, rather than single genes, on size. Though genome-wide mutagenesis screens are able to probe the whole genome and identify many genes involved in controlling a trait, classical genetics cannot address interactions exceeding two or three molecules at a time. Nor are the mutations introduced artificially likely to reflect the polymorphisms present in natural populations. It will ultimately be important to understand how all these genes interact under natural circumstances to create a phenotype, in this case a properly sized organism, and how genetic variation creates phenotypic variability while preserving function. Exploiting natural phenotypic variation in size and studying which genetic loci cause this variation is a promising new approach for shedding more light on growth control at the systems level. In the next chapters, two approaches are introduced that both offer a more global view on genetic loci underlying a phenotype: Genome-wide association studies and experimental evolution experiments.


3.7 Genome-wide association studies

3.7.1 Genome-wide association studies provide a first step towards a systems level understanding of multigenic traits by linking variation in genotype to natural phenotypic variation

Genome-wide association studies (GWAS) are a relatively new and powerful approach for linking genetic loci to variation in a phenotype or a disease state in a population, thus shedding light on the underlying genetics of the trait (Womack et al. 2012, Korte and Farlow 2013). They have been made feasible by the advent of fast and affordable whole genome sequencing of many species and the development of high-density SNP arrays. The advantages of GWAS are that, in contrast to candidate-based screens, they are unbiased in terms of the loci that are studied and that context-dependent effects can be detected. The motivation for GWAS came from the desire to predict disease development and outcome in humans, or product yield in agriculture, based on the genetic variants present in the organism. Risk prediction requires a detailed molecular understanding of the genes involved, but also information about the genetic architecture of the trait under study. The genetic, or allelic, architecture of a trait is a description of the number and type of alleles involved, their effect sizes and their frequencies in the population. Many genes involved in diseases have been identified through careful and detailed functional studies in cell culture and model organisms. But for all of these traits it was not known whether only a few genes contribute to them or whether many loci with only small effects are involved. Additionally, GWAS offer the advantage of identifying genes relevant for a disease or trait directly in humans, which is very valuable in an organism where genetic screens are impossible. GWAS form the foundation for further experiments by allowing an informed choice of genes to evaluate for involvement in a given trait.
They are complementary to classical mutagenesis screens for several reasons. GWAS typically identify loci with small and possibly context-dependent effects. These types of alleles are often overlooked in classical screens due to their focus on phenotypically dramatic and reproducible effects. Furthermore, the alleles tested in GWAS are natural variants, generated by evolutionary processes acting on the population, rather than artificially induced mutations. Natural polymorphisms are probably more relevant for understanding variability in a trait or disease etiology than artificially induced mutations. GWAS were pioneered in human genetics nearly 10 years ago (Hirschhorn and Daly 2005, McCarthy 2008, The Wellcome Trust Case Control Consortium 2007) and have been widely applied there since. They are now a routinely used tool in model organisms like Arabidopsis, Drosophila and mouse as well as in various crops and cattle species (Bergelson & Roux 2010, Jumbo-Lucioni et al. 2010, Meijon et al. 2013, Lee SH et al. 2013, Huang X. et al. 2011, Flint and Eskin 2012, Lipka et al. 2013, Makvandi-Nejad et al. 2012, Garcia-Gamez et al. 2012, Maxa et al. 2012, Minozzi et al. 2013), where they have substantially broadened our understanding of the underlying genetics of complex traits.

GWAS exploit the natural genetic variation that is present in populations and aim to identify genomic loci that are causal for variation in a phenotype of interest (McCarthy et al. 2008). For each segregating locus in the population, a statistical model is formulated that tests whether the phenotypic distributions of the different genotypes at that locus differ significantly, an indication that the locus is associated with trait variation. Genome-wide significance is achieved if the association p-value is below a specified threshold that has been corrected for multiple testing. SNPs with p-values below the genome-wide significance level are considered candidate loci for the trait. The power of a study to identify causal variants is influenced by several factors. On the SNP side, the minor allele frequency (MAF, the frequency at which the least abundant allele at a locus occurs in a population) and the effect size influence the association p-value. Two other factors that need to be considered are the overall number of loci to be tested and the population size (Mackay et al. 2010, Gibson 2011). Only SNPs that are common, with a MAF above a cutoff of typically 1% to 5%, are amenable to being tested by GWAS. If alleles are less frequent, it is hard to distinguish them from sequencing errors, which occur roughly once every 100 to 1000 base pairs with current standard sequencers (Loman et al. 2012).
Further complicating the use of rare variants in GWAS is their presence in only a few individuals, which poses statistical problems in accurately estimating the phenotypic variance of individuals carrying the variant. The involvement of rare variants in quantitative traits and common diseases can best be assessed using a candidate-based approach, via targeted re-sequencing of candidate regions followed by detailed functional studies (Bodmer and Bonilla 2008). The effect size reflects the allelic difference in phenotypic means. To achieve high power, large sample sizes are crucial, as most variants tested in GWAS are common and thus expected to have only modest effects on the phenotype. Further increasing the need for large sample sizes is the large number of hypotheses tested in a GWAS, which requires a multiple testing correction of the p-values obtained from the tests (Sham and Purcell 2014). If 1000 loci with no true effect are tested at a significance level of α=0.05, then on average 5%, or 50, of these loci will show a nominally significant p-value (p < α) merely by chance. Clearly, when testing many hypotheses, the p-values need to be adjusted to account for such statistical fluctuations and to avoid identifying too many false positives. A very stringent method for correction is the Bonferroni method, which consists of dividing the significance level by the number of tests performed (equivalently, multiplying each p-value by the number of tests). In the case of testing 1000 SNPs, to reach significance after correction we would need a p-value smaller than 0.05/1000, or 5E-05. The generally accepted p-value threshold for genome-wide significance in human studies is 5E-08, equaling a significance level of 0.05 after Bonferroni correction for 1 million tests (Risch and Merikangas 1996). There are less stringent methods for correction (Sham and Purcell 2014), e.g. Bonferroni-Holm, the false discovery rate (FDR) control method of Benjamini-Hochberg, or permutation-based methods, which may be more adequate for GWAS findings, as it is generally acceptable to have some false positive associations in the final candidate set provided that follow-up studies or independent replications are performed.

False positives may arise not only from statistical fluctuations inherent in the testing design but also from systematic biases such as population structure in the sample or technical artifacts (Hirschhorn and Daly 2005). One source of population structure is the presence of multiple ethnic groups in the study population, which is very often the case nowadays (Freedman et al. 2004). The frequency of a variant often differs between ethnic groups. If ethnic groups also differ in the prevalence of a disease or the value of a trait, ethnic subgroups can be overrepresented among affected individuals and false positive associations can ensue, mistakenly linking the ethnically differing variant to the disease or trait. More generally, population structure arises when some individuals in the population are more related to each other than to other individuals. If this coincides with them showing a similar phenotype, then loci that are shared due to relatedness can show association even though they do not affect the phenotype. This problem can be addressed by using a small number of loci, clustering-based methods or, if available, a phylogenetic tree (for example in populations of model organisms) to identify and correct for population structure before performing the GWAS (Pritchard et al. 2000).
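The multiple testing arithmetic discussed above can be made concrete with a small sketch. The code below is an illustration only: the p-values are simulated (hypothetical data, not taken from any study), and the function names `bonferroni_threshold` and `benjamini_hochberg` are chosen here for the example. It compares the number of SNPs passing a nominal threshold, the Bonferroni threshold and the Benjamini-Hochberg FDR procedure.

```python
import numpy as np


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values, fdr=0.05):
    """Boolean mask of tests rejected by the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Find the largest rank k with p_(k) <= (k / m) * fdr;
    # every test ranked at or below k is rejected.
    below = ranked <= (np.arange(1, m + 1) / m) * fdr
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        rejected[order[: k + 1]] = True
    return rejected


# Simulated p-values (assumption for illustration): 990 SNPs without an
# effect plus 10 SNPs with very small p-values standing in for true signals.
rng = np.random.default_rng(0)
p_values = np.concatenate([
    rng.uniform(size=990),
    rng.uniform(0.0, 1e-4, size=10),
])

alpha = 0.05
n_nominal = int(np.sum(p_values < alpha))
n_bonf = int(np.sum(p_values < bonferroni_threshold(alpha, p_values.size)))
n_bh = int(np.sum(benjamini_hochberg(p_values, fdr=alpha)))

print("nominally significant:", n_nominal)
print("Bonferroni significant:", n_bonf)
print("Benjamini-Hochberg significant:", n_bh)
```

Because every p-value below the Bonferroni threshold also satisfies the Benjamini-Hochberg criterion, the FDR procedure never rejects fewer tests than Bonferroni at the same level, which is why it is considered the less stringent of the two.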
More recent methods consist of applying an admixture model (Alexander et al. 2009) or principal components analysis (PCA) to a small number of genotyped SNPs to identify co-variation due to ancestry (Price et al. 2006, Patterson et al. 2006). The PCA method is very efficient in identifying ancestry in human populations, with the first two principal components often separating the study population by region of origin. This works on the worldwide scale but also on the country scale (Paschou et al. 2007). For example, in Switzerland, PCA has been able to assign people in a cohort from Lausanne (CoLaus) to the main language region they originate from. In model organisms, however, population structure is often much more complex than in humans due to elaborate designs of mapping populations, and PCA captures only part of the population structure present. An improvement over the PCA method lies in the application of mixed models that identify and account for variables that cause phenotypic variability in the population, without prior knowledge of which confounders are present (Kang et al. 2008, Segura et al. 2012, Korte et al. 2012, Rakitsch et al. 2013). Technical artifacts can arise due to genotyping errors, inconsistencies in sample preparation and systematically missing genotypes at some loci, issues that are becoming rarer with the development of more accurate sequencers and the formation of community efforts for GWAS (The HapMap Phase II project 2007, The 1000 genomes consortium 2010).

The combined problems of modest effects and multiple testing corrections require large sample sizes to achieve decent power for identifying causal variants. Genotyping large numbers of individuals is, however, still quite expensive, despite a significant drop in sequencing costs and the development of SNP arrays, and is also rather time-consuming. One strategy to circumvent this problem is a stepwise approach, which consists of identifying SNPs in a small, densely genotyped population using a relaxed p-value threshold and re-testing them in a second, larger population at a more stringent threshold. Alternatives are performing GWAS in founder populations or pooling samples from multiple individuals for genotyping (Hirschhorn and Daly 2005, Gang-Shi et al. 2011). All of these approaches can alleviate the need for large samples, and thus the cost and effort needed for genotyping, but suffer from other limitations. A more viable and very powerful approach is meta-analysis, which pools the results of several individual GWAS and can substantially improve power while simultaneously helping to identify and eliminate false positive associations (Zeggini and Ioannidis 2009, Yang J et al. 2012). Meta-analyses combine summary statistics for each variant, either p-values or effect sizes, from multiple studies of the same trait to identify significantly associated SNPs. Studies included in the meta-analysis are given different weights based on their respective power, which mostly translates to the sample size they used. Apart from meta-analyses, approaches that can improve power are gene-based or pathway-based association methods (Liu et al.
2010, Wang K. et al. 2010). Gene-based methods are useful when multiple SNPs within a gene show significant association. These SNPs are redundant in their information in terms of the gene involved and may even be statistically redundant due to LD between them. Replacing the individual SNP p-values by an overall p-value for the gene increases power by decreasing the number of tests. Pathway-based methods build on the logic that proteins do not exert their effects in isolation, and for a given disease it is thus likely that a whole pathway, instead of just one component, is disturbed. GWAS studies have become widely applied in human genetics since 2005: by 2013 more than 1600 publications identified more than 2000 robust associations for more than 300 complex traits (Manolio et al. 2013). Two projects that greatly helped the widespread use of GWAS in humans were the HapMap Phase I and II projects and the human genome diversity project (The International Hapmap Consortium 2003, 2005, 2007, Cavalli-Sforza 2005). These projects provide catalogs of common DNA sequence variation in the human genome.

The HapMap Phase I project, for instance, has genotype information, allele frequencies and LD information for more than one million variants from 269 individuals with African, American, Asian and European ancestry. The HapMap Phase II project extends the number of variants to over 3.1 million. A more recent effort, the 1000 genomes project, provides information about rarer sequence variants (1000 genomes consortium 2010). However, the identification of associated loci is only the first step, and detailed molecular analyses are necessary to bring GWAS findings into clinical use or to predict phenotypes based on SNPs (Wray et al. 2013). More often than not the identified variant lies in the intergenic space (Hindorff et al. 2009), and it is not apparent which gene it influences or whether the variant even lies in a functional noncoding region (Ernst et al. 2011, The ENCODE Consortium 2012). Even for variants located near genes, it is not immediately apparent which gene they tag, due to linkage disequilibrium (LD). LD exists when genomic loci are inherited together because they are in close proximity and recombination rarely occurs between them. The genotype at one position in an LD region, or block, is then sufficient to infer the genotypes at the other loci, since they are inherited together, making the genotypic information of the other loci in the block redundant. LD is the basis for GWAS, allowing the use of one SNP as a tag SNP for a genomic region, thus reducing the number of tests that have to be performed. However, the chosen tag SNP will very rarely be the causal variant, and it may take additional sequencing and functional studies to identify the true causative allele, or even just the gene, as human LD blocks can span several hundred kb, often containing more than 10 genes (McCarthy et al. 2008, Manolio et al. 2013, Pickrell 2014).
Even if the correct gene is identified, it is unclear how the SNP influences the activity or abundance of the corresponding protein. Also, proteins do not act in isolation, and often it will be interactions between proteins from different loci that are causal for disease. All these considerations make it difficult to directly build phenotype or risk predictions, or even to develop drugs for a disease (Kooperberg et al. 2010, Bao et al. 2013). A strategy that is very promising in helping to assign causality to genetic loci is the incorporation of intermediate phenotypes, like transcript levels, protein levels, enzymatic activity and metabolites (Sieberts and Schadt 2007, McCarthy et al. 2008, Schadt 2009, Stunnenberg and Hubner 2014). Indeed, studies associating transcript, protein or metabolite levels with genetic variants have proven successful and often provide clearer insights into the connection between a variant and a phenotype (Kim and Gibson 2010, Wu et al. 2013, Rueedi et al. 2014). Integrating phenotype, protein level, transcript level and metabolite GWAS is thus a promising approach for gaining a causal understanding of complex traits (Schadt et al. 2005, Mackay et al. 2009, Hawkins et al. 2010, Civelek and Lusis 2014). Additionally, integrating prior knowledge about the trait can substantially improve associations, but entails a bias in terms of the analyzed loci (Marjoram et al. 2014). In GWAS for multi-locus interactions, this strategy may however help reduce the search space and thus decrease the multiple testing penalty (Emily et al. 2009).
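The meta-analysis idea described earlier in this chapter, in which per-study summary statistics for a variant are combined with study-specific weights, can be sketched as follows. The text does not specify a weighting scheme; this sketch assumes fixed-effect inverse-variance weighting, one common choice in which studies with smaller standard errors (typically the larger studies) receive more weight. The function name and the input numbers are hypothetical and serve only as illustration.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis for one variant.

    betas: per-study effect size estimates
    ses:   per-study standard errors of those estimates
    Returns the pooled effect, its standard error, the z-score and a
    two-sided p-value based on the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p


# Hypothetical summary statistics for the same SNP from three studies.
study_betas = [0.10, 0.08, 0.12]
study_ses = [0.05, 0.04, 0.06]

beta, se, z, p = inverse_variance_meta(study_betas, study_ses)
print(f"pooled effect = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

Since the pooled standard error is smaller than that of any single study, a variant with a consistent effect across studies can reach a stricter significance threshold in the meta-analysis than in any of the individual studies.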

3.7.2 Finding the missing heritability of complex traits

A feature common to all diseases or traits analyzed to date is the relatively small fraction of the estimated heritability of the phenotype that is explained by the associated loci (Manolio et al. 2009). Heritability provides an estimate of the relative importance of the environment and of genetic factors in explaining variation in a trait (Visscher et al. 2008). Formally, heritability defines the percentage of the variance in a trait (measured in a specific population at a specific time) that can be explained by variation in the genotype. Broad-sense heritability gives the percentage of total phenotypic variance in a population that is due to all genetic variation, that is, additive contributions of single loci and interactions between loci, while narrow-sense heritability encompasses only the additive genetic variation. As the genetic pool and phenotypic spectrum differ among populations of the same species and over time, trait heritability is an estimate that is only valid for the population it was estimated from at that time, and not a generally valid number. Broad-sense heritability can be estimated from empirical phenotypic data using linear mixed models (a type of statistical model that evaluates contributions of different variables to total phenotypic variance) and narrow-sense heritability from selection experiments, using the breeder’s equation. For humans, heritability is estimated from simple parent-offspring regression, the phenotypic correlation between full and half siblings or the difference in correlation between monozygotic and dizygotic twin pairs (Meyer 1985, Falconer and Mackay 1996, Lynch and Walsh 1998). Given the small fractions of explained heritability in most GWAS, the long accepted hypothesis of common disease common variant (CDCV) evidently does not hold (Reich and Lander 2001, Pritchard and Cox 2002).
The CDCV hypothesis states that common diseases, such as diabetes, cancers, and cardiovascular disorders, should be attributable to common variants. However, most identified common variants only confer a 1.1 to 1.5-fold increase in risk for developing the disease and consequently only explain a small part of the estimated heritability (Manolio et al. 2009). For example, many loci have been identified that contribute to human height, a trait with an estimated heritability of 80%, but they only explain about 10% of the phenotypic variance (Lango-Allen et al. 2010). Proposed explanations for this phenomenon include the contribution of many more low-frequency variants (MAF 0.5%-5%) of smaller effects that have missed the significance threshold so far and have yet to be identified, rare variants (MAF<0.5%) with large effects that cannot be assayed so far due to technical and statistical issues, structural variants that are only poorly characterized for most genomes (McCarroll 2008), the possibly large contribution of epistasis (gene-gene interactions) that we have only low power to detect (Mackay and Moore 2014), and unaccounted-for environmental effects. On the other hand, heritability estimates might be inflated due to shared genetic factors that do not contribute to the trait or due to shared environmental effects on the phenotype, two factors that are likely to play a role in families, from which most heritability estimates are derived (Manolio et al. 2009, Gibson 2011). As the genetic architectures of traits differ, there are likely to be different answers to the missing heritability question depending on the trait under study. Likewise, missing heritability for a given trait will probably not be explained by one factor alone but rather by a combination of factors.

Despite all the difficulties, there are some success stories to tell. A common variant in the fat mass and obesity-associated gene FTO has been associated with BMI and an increased risk for childhood and adult obesity (Frayling et al. 2007). Rare variants in the phospholipase D3 gene are risk factors for late-onset Alzheimer’s disease (Cruchaga et al. 2014), and two common variants in the complement factor H gene (CFH) have been identified that confer a two- to three-fold increase in the risk of developing age-related macular degeneration to carriers (Maller et al. 2006, Manolio et al. 2011). For type II diabetes, a low-frequency intronic variant in the Cyclin D2 gene, the key regulator of postnatal pancreatic beta-cell mass, reduces the risk of developing diabetes by half, and a rare deletion variant in the homeodomain transcription factor PDX1 gene, leading to a frameshift and premature stop, is associated with a more than twofold higher risk.
Individuals carrying the protective Cyclin D2 variant have consistently lower blood glucose levels, the variant correlates with elevated Cyclin D2 expression in white blood cells and adipose tissue, and Cyclin D2 knockout mice are less glucose tolerant and develop diabetes 9-12 months after birth. Together, these observations indicate that this variant, which lies in a region with characteristics of a transcriptional regulatory region, is the causal variant. PDX1 is a major regulator of pancreatic hormone expression and is involved in insulin gene expression in response to glucose, which makes it a likely candidate for being causally involved in the pathogenesis of type II diabetes (Steinthorsdottir et al. 2014).
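For reference, the heritability quantities and estimators mentioned at the beginning of this chapter can be written compactly. The notation below is not used elsewhere in the text and is introduced here as an assumption: V_P is the total phenotypic variance, V_G the total genetic variance, V_A the additive genetic variance, V_E the environmental variance, R the response to selection, S the selection differential, and r_MZ and r_DZ the phenotypic correlations within monozygotic and dizygotic twin pairs. The decomposition of V_P ignores genotype-environment covariance and interaction, and the twin formula is the classical approximation that assumes equally shared environments for both twin types.

```latex
\begin{align}
V_P &= V_G + V_E \\
H^2 &= \frac{V_G}{V_P} && \text{(broad-sense heritability)} \\
h^2 &= \frac{V_A}{V_P} && \text{(narrow-sense heritability)} \\
R &= h^2 S && \text{(breeder's equation)} \\
h^2 &\approx 2\,(r_{MZ} - r_{DZ}) && \text{(twin-based estimate)}
\end{align}
```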

3.7.3 GWAS studies of human height imply common variants with small effects account for half of the total heritability and shed light on the sources of missing heritability

Human height is to date one of the traits most successfully investigated by GWAS (Lettre 2011). It is a highly polygenic trait with a high heritability of 80% (Perola et al. 2007). Discovery in 2007 of the first height-associated gene, HMGA2, which encodes a chromatin protein, was followed by several smaller studies that together identified 47 loci associated with height that explain about 5% of trait variability. Genes in these loci grouped into pathways and functional classes such as chromatin proteins, Hedgehog pathway genes, targets of the let-7 microRNA, BMP signaling, extracellular matrix proteins and proteases and genes involved in cancer and cell-cycle control, and several genes were recovered that had previously been implicated in skeletal growth defects (Hirschhorn and Lettre 2009). Despite the importance of insulin signaling in human body size determination, common variants in genes of the growth hormone/IGF axis, IGF-1 among them, were not significantly associated with height variation in humans (Lettre et al. 2007). These and other studies were integrated in 2010 in a large meta-study by the Genetic Investigation of Anthropometric Traits (GIANT) consortium with more than 133’000 individuals of European ancestry, combining data from 46 GWAS. The GIANT study found that hundreds of variants in at least 180 loci, which together explain 10% of the phenotypic variability, are associated with human height. These loci are not randomly distributed but are enriched for genes that interact biologically and, with respect to variant type, for cis-regulatory and non-synonymous SNPs. Of the 180 SNPs, several lie near or in genes that cause severe skeletal growth defects, showing that depending on the type of variant the same genes can be involved in common trait variation and rare diseases (Lango-Allen et al. 2010). Pathway analysis further revealed a nominal enrichment for genes of the Hedgehog and TGF-beta signaling pathways, both involved in bone formation, and the growth hormone pathway. Novel loci have been identified since, indicating that loci contributing to this trait are numerous and spread over the whole genome (Yang J et al. 2012, Hao et al. 2013, Du et al. 2014). The fraction of phenotypic variation explained by these loci is however small compared to studies of height in e.g. the horse, where four loci explain 83% of size variation (Makvandi-Nejad et al.
2012), or domestic dogs, where a single IGF-1 allele substantially contributes to size variation, being present in all of the small breeds analyzed and nearly absent in the large breeds (Sutter et al. 2007). This may, however, be due to the fact that domestic animals like horses and dogs are specifically bred and, unlike humans, do not reproduce naturally, and thus carry only few size loci of large effect. Despite this, three of the horse loci were also found in GWAS for human height, while only one was associated with cattle growth and one with dog size (Makvandi-Nejad et al. 2012). More of the phenotypic variance in human height is explained if the effects of all common autosomal SNPs are taken together. Common autosomal variants explain a good 45% of total phenotypic variation (Yang J et al. 2010), with effects of individual genes or chromosomes on the phenotype being proportional to their length. This suggests that loci influencing height are spread uniformly across the genome and genomic regions explain variation proportional to their gene content (Yang J et al. 2011). Furthermore, gene regions explained proportionally more variation than intergenic regions. The heritability estimate for the 180 loci of the GIANT study showed good correlation with the heritability estimate for the same loci in this study. The fact that common variants explain about half of the total heritability suggests that there should be additional, yet unidentified variants affecting height that are present at low frequencies. This study provides one explanation for the missing heritability problem: common variants of small effect that do not reach genome-wide significance. The missing heritability problem is exacerbated by incomplete LD between causal variants and tag SNPs, possibly due to low MAF of the causal variants (Yang J et al. 2010). Yet another source of missing heritability in height is the presence of multiple alleles in a gene showing independently significant association and contributing to phenotypic variation. Usually only one allele is chosen to represent the significant association and thus significant independent contributions to total variance may be missed (Zhang G et al. 2012). Taken together, these studies put forward human height as a classical quantitative trait that is influenced by many loci across the genome that each only have a small effect. Both common and rare variants contribute to trait variation, and the so far identified common variants do not affect classical growth genes from the IGF axis. Variants in genes coding for Insulin/IGF signaling components, which given the proteins’ roles in fetal and adult growth would be expected to have sizable effects on height, are probably kept at low frequencies in the population. It is thus conceivable that, once low frequency variants can be detected with higher confidence, some will be found in genes of the IGF axis.
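The proportions quoted in this section can be put side by side with a few lines of arithmetic. The sketch below is purely illustrative: it takes the three figures given above at face value (a heritability of 80%, 10% of phenotypic variance explained by the 180 GIANT loci, 45% explained jointly by all common autosomal SNPs), and the variable names are our own.

```python
# Figures quoted in the text, expressed as fractions of total phenotypic variance.
heritability = 0.80   # heritability of human height (Perola et al. 2007)
giant_loci = 0.10     # 180 genome-wide significant loci (Lango-Allen et al. 2010)
common_snps = 0.45    # all common autosomal SNPs taken together (Yang J et al. 2010)

# Express each figure as a share of the heritable variance.
share_significant = giant_loci / heritability
share_common = common_snps / heritability
share_not_captured = 1 - share_common

print(f"significant loci:            {share_significant:.1%} of heritability")
print(f"all common SNPs:             {share_common:.1%} of heritability")
print(f"not captured by common SNPs: {share_not_captured:.1%} of heritability")
```

Under these figures, all common SNPs together account for roughly half of the heritability, which is the basis for the statement that the remainder must be sought in low frequency variants and in variants that are poorly tagged.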

3.7.4 Limitations and problems of human GWAS that can be overcome in model organism GWAS: The evolutionary conservation of core molecular, cellular and physiological processes makes model organisms attractive systems for gaining insights into shared basic principles of life but also into clinically relevant diseases and traits (Mackay 2006, Müller and Grossniklaus 2010, Aitman et al. 2011). Several features of model organisms make them additionally very attractive systems for GWAS, chief among them being the ability to control environmental influences on the phenotype. Complex traits are by definition heavily susceptible to environmental factors (Lynch and Walsh 1998, Batty et al. 2009, Oksenberg et al. 2008, Thomas 2010, Bergelson & Roux 2010, Vilhjálmsson and Nordborg 2013), which can lead to confounding in GWAS. For a GWAS to identify truly associated loci, the phenotype must be as accurate a readout of the genotype as possible. To achieve this, the effect of environmental covariates on the phenotype of interest needs to be minimized or randomized. Controlling the environment in human studies is seldom feasible and thus, in an attempt to avoid systematic environmental effects, large, randomized cohorts are used for quantitative trait GWAS (McCarthy et al. 2008). Nevertheless, a large part of the phenotypic variance in the study population will be due to various environmental factors that affected each person slightly differently during their lifetime. The larger the contribution of the environment to a person's measured phenotype, the weaker the associations with DNA variants will be in terms of significance, and the fewer loci will be identified for the trait. A possible solution is to include known environmental covariates in the association model and thus correct for the variance in phenotype caused by the factor. A relatively recent example of a human GWAS for serum cholesterol levels shows that associations can be substantially improved using this strategy (Igl et al. 2010). However, this only works if the nature of the environmental variable is known and the variable is quantifiable for each study subject. The feasibility of much more stringent environmental control and the possibility of measuring several genetically identical individuals to obtain a high confidence phenotypic value for a given trait and genotype thus make model organisms highly attractive systems for GWAS. For behavioral traits, the same individual can even be assayed multiple times, to obtain a phenotypic value that is averaged (or randomized) over different environments. This is important when factors like the time or day of measurement potentially affect the behavior (Anholt & Mackay 2004). Interactions between genotypes and the environment can be abundant and often account for a large part of total phenotypic variance, as was found in a comprehensive investigation of the effect of a number of covariates on multiple complex traits in mice (Valdar et al. 2006). As a consequence, the predictive power of genes identified in a single environment can be low and phenotypic profiling across multiple environments is necessary to improve the chance of identifying truly causal genes (Gagneur et al. 2013). 
To eliminate confounding effects it is clearly necessary to take the utmost care in identifying potential environmental perturbations for a given phenotype and address each of them as necessary, by either eliminating, randomizing or accounting for the source of variation, a task that is near impossible to achieve in humans but feasible in model organisms. Apart from the ease of environmental control, the possibility for functional validation of target genes makes model organisms superior systems for GWAS. Validation in human GWAS consists of trying to replicate the associations in different cohorts, which is a good strategy for identifying false positives but unsatisfactory for assigning function and causality to loci. For Drosophila and many other models, RNAi constructs and knockout libraries exist for many genes and are publicly available. The decades of intense model organism research have, apart from generating mutant alleles of variable severity for many genes, also created a wealth of genetic tools for functional studies. Furthermore, in recent years panels of inbred lines of different model species have been created, sequenced and made publicly available, enabling GWAS for anyone motivated enough to quantify a phenotype (Peirce et al. 2004, Bennett et al. 2010, Weigel and Mott 2009, Cao et al. 2011, Mackay et al. 2012, King et al. 2012).
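The covariate adjustment described in this section, in which a known environmental factor is included in the association model, can be illustrated with a toy calculation. The following is a minimal sketch under stated assumptions, not the model used in any of the cited studies: the data are hypothetical and noise-free, the phenotype is constructed as 2 x genotype + 3 x environment, and the helper names (`slope_intercept`, `residuals`) are our own. Because genotype and environment are correlated in this toy sample, the unadjusted regression overestimates the genotype effect, whereas regressing out the covariate first recovers it.

```python
def mean(values):
    return sum(values) / len(values)

def slope_intercept(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (b, a)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return b, my - b * mx

def residuals(x, y):
    """Part of y that is not explained by a linear fit on x."""
    b, a = slope_intercept(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical toy data: allele counts, an environmental covariate that happens
# to be correlated with genotype, and a phenotype built from both.
genotype = [0, 0, 1, 1, 2, 2]
environment = [0, 1, 0, 1, 1, 2]
phenotype = [2 * g + 3 * e for g, e in zip(genotype, environment)]

# Unadjusted association: the environmental effect leaks into the estimate.
naive_effect, _ = slope_intercept(genotype, phenotype)

# Adjusted association: remove the covariate from phenotype and genotype,
# then regress the residuals on each other.
adjusted_effect, _ = slope_intercept(residuals(environment, genotype),
                                     residuals(environment, phenotype))

print("unadjusted genotype effect:", naive_effect)
print("covariate-adjusted genotype effect:", adjusted_effect)
```

The adjustment only helps because the covariate was measured for every individual, which mirrors the caveat made above for human cohorts.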

3.7.5 Drosophila as a model system for GWAS of size: Drosophila is a well-suited model organism for GWAS due to its well-characterized genome, manageable genome size, rapidly decaying LD structure and large amount of genetic variation. The recently established Drosophila Genetic Reference Panel (DGRP) makes Drosophila a particularly attractive organism to study traits by GWAS (Mackay et al. 2012). The DGRP is a set of inbred isofemale Drosophila melanogaster lines derived from a North American population (Raleigh, North Carolina). The DGRP genome sequences are publicly available and have been found to harbor substantial natural genetic variation, differing in about every 25th base pair among them. Furthermore, they show an abundance of phenotypic variation for any trait assayed so far and many associated loci have been validated for sleep, alcohol sensitivity, olfaction, oxidative stress, weight and metabolism, shedding light on the underlying genetics of these traits (Ayroles et al. 2009, Harbison et al. 2009, Arya et al. 2010, Jumbo-Lucioni et al. 2010, Mackay et al. 2012, Jumbo-Lucioni et al. 2012, Massouras et al. 2012, Weber et al. 2012, Jordan et al. 2013, Harbison et al. 2013, Swarup et al. 2013). For the same reasons that Drosophila is an excellent system to study body size - high conservation of involved pathways, ease of handling and the availability of many genetic and bioinformatics tools - it is a suitable model for studying body size using a GWAS approach.

3.8. Experimental evolution

3.8.1 Experimental evolution experiments can be used to generate extreme phenotypes and to identify combinations of loci that are causally linked to these phenotypes: GWAS have the limitations that they cannot reliably detect rare variants and that, due to multiple testing penalties, large sample sizes are required to detect variants of small effect. Even in a meta-study combining data from 46 individual GWAS for height, yielding a sample size of over 130’000 individuals, the fraction of phenotypic variability explained by the identified loci was a mere 10%. This proportion is likely to be higher in GWAS for body size in Drosophila due to the controlled and thus much more homogeneous environment and replicate measures of size phenotypes, yielding a very high confidence phenotypic value for a given genotype. Nevertheless, population size, statistical power and rare variants could still be an issue and thus a complementary approach that can overcome these limitations could substantially increase our understanding of the genetic polymorphisms underlying body size variation.

Experimental evolution experiments, where artificial selection for a trait is applied to populations of model organisms over several generations followed by sequencing of the evolved populations, represent such an approach (Zeyl 2006, Bennett and Hughes 2009, Burke and Rose 2009, Kawecki et al. 2012). Over the past 100 years, experimental evolution experiments have yielded insights into how traits evolve and how populations adapt to changing environments (Wichmann et al. 2000, Elena and Lenski 2003, Orozco-TerWengel et al. 2012). More recently, the combination of laboratory selection and population-based re-sequencing offers new possibilities for identifying combinations of loci that are causally linked to selected phenotypes (Fiegna et al. 2006, Barrick et al. 2009, Burke et al. 2010, Parts et al. 2011, Turner and Miller 2012, Baldwin-Brown et al. 2014, Reed et al. 2014). Applying artificial selection to laboratory populations generates highly divergent extreme populations for the phenotype of interest. Sequencing can occur at multiple stages during the experiment and/or at the end of selection. One strategy is to select and pool the most extreme individuals throughout the experiment (Turner and Miller 2012) and sequence those, another to pool-sequence the evolved end-populations (Reed et al. 2014). Comparing the genetic pools of the divergent populations will identify alleles that are present at significantly different frequencies among them. A difference in the frequency of an allele between the extreme populations hints at a role of the locus in controlling trait variation. However, the genetic pool of a population does not only change through directed selection but also through random processes such as genetic drift. 
Furthermore, selection of one variant entails co-selection of linked genomic loci due to LD (called hitchhiking), which in combination with the loci that get randomly enriched during selection makes it difficult to identify the actual selected allele (Tobler et al. 2014). The result is that among all enriched loci, a small minority is causally involved in the expression of the trait in question while the majority of enriched SNPs are false positives due to hitchhiking and genetic drift. Theoretical studies and simulations have shown that the proportion of causal loci in the selected populations depends on the specifications of the experiment (Fuller et al. 2005, Baldwin-Brown et al. 2014, Kofler and Schlötterer 2014). A powerful experimental design should aim at maximizing the proportion of causal loci versus random noise in the selected populations. Larger population sizes and a larger number of replicates both increase the proportion of causal (true positive) loci relative to noise. Using starting populations with high genetic diversity, genome-wide low LD and as few and small inversions as possible further increases the chance of recovering causal loci, as do a high selection pressure and the sequencing of additional generations in between the starting and final populations (Baldwin-Brown et al. 2014, Tobler et al. 2014, Kofler and Schlötterer 2014). While the amount of genetic diversity, LD and presence of inversions clearly depend on the organism and nature of the starting population, the upper limit on population size and replicate number is given by the time and manpower available for sampling and phenotyping the population. It is thus obvious that large population sizes and many replicates can only be achieved in studies with easy and rapid phenotyping. In unicellular organisms such as bacteria and yeast culturing is straightforward, generation time short and phenotyping often partly automated, allowing the screening of large numbers of individuals and replicates over many generations (Nicoloff et al. 2007, Parts et al. 2011, Wiser et al. 2013, Jang et al. 2014, Wang et al. 2014). In bigger and more complex organisms artificial selection experiments have been limited to small population sizes, except in cases where easy and fast, often partially automated phenotyping and selection could be implemented. Successful examples in Drosophila are studies in gravitaxis, phototaxis, alcohol sensitivity, aggression behavior and other behavioral traits that are amenable to high throughput phenotyping (Hirsch and Erlenmeyer-Kimling 1962, Hadler 1964, Dierick and Greenspan 2006, Edwards et al. 2006, Morozova et al. 2007). Growth and size are harder to quantify, and a tradeoff between accuracy and speed is unavoidable. A very fast and large-scale method for phenotyping Drosophila size is the sieving method by Turner (Turner et al. 2011), which sorts flies based on their ability to pass through decreasingly smaller holes when anesthetized. The authors achieve population sizes of 1800 flies per generation with this method, but it suffers from inaccuracies in phenotyping, as flies are randomly oriented when passing through the sieve and outstretched legs or wings may hinder an otherwise small fly from passing. Likewise, it does not take into account the size of individual body parts. 
A fly could end up in the big population due to a big thorax, abdomen or big wings, which clearly makes a difference for subsequent analysis. The highest accuracy for morphometric phenotyping is still achieved by manual phenotyping, which has the drawback of being very time-consuming. This is reflected in the small population sizes in selection studies with manual phenotyping, which often do not exceed 25-50 flies per generation (Partridge et al. 1999, Trotta et al. 2007). The WINGMACHINE device and software developed by Houle provides an intermediate solution, offering relatively high throughput phenotyping of live fly wings with high accuracy, which is however still somewhat time-consuming and cumbersome to operate, and is restricted to measuring wings (Houle et al. 2003). An early version of a wing and body phenotyping machine was already developed by Robertson and Reeve in 1952 and allowed phenotyping of up to 200 flies per day by one operator (Robertson and Reeve 1952). A machine that is able to provide high accuracy phenotyping while substantially improving phenotyping speed could benefit morphometrics selection studies by enabling accurate phenotyping for larger population sizes than previously achieved.
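The dependence of random allele frequency change on population size, which motivates the call for large populations and many replicates above, can be made concrete with a small simulation. The sketch below assumes a neutral Wright-Fisher model with non-overlapping generations, which is a textbook simplification and not the simulation framework of the cited studies; the function names, the population sizes (10 and 500 individuals), the number of generations and the seed are arbitrary choices for illustration.

```python
import random
from statistics import pvariance

def drift_final_frequency(n_individuals, generations, start_freq, rng):
    """Neutral Wright-Fisher drift: resample 2N allele copies every generation."""
    copies = 2 * n_individuals
    freq = start_freq
    for _ in range(generations):
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
    return freq

def spread_across_replicates(n_individuals, replicates=100, generations=10, seed=1):
    """Variance of the final allele frequency among replicate populations."""
    rng = random.Random(seed)
    finals = [drift_final_frequency(n_individuals, generations, 0.5, rng)
              for _ in range(replicates)]
    return pvariance(finals)

# No selection is applied, so all divergence among replicates is due to drift.
small = spread_across_replicates(n_individuals=10)
large = spread_across_replicates(n_individuals=500)

print("variance of final frequency, N = 10: ", small)
print("variance of final frequency, N = 500:", large)
```

In small populations a neutral allele can drift far from its starting frequency within a few generations, so a frequency difference between two selected populations is weak evidence on its own; replicate populations help because drift is not expected to shift the same allele in the same direction in each of them.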

3.8.2 Results from selection studies of size in Drosophila: One of the earliest selection studies for wing and thorax length was published in 1952, describing the selection over 50 generations with two different wild-type stocks derived from single females originating from Nettlebed and Edinburgh (Robertson and Reeve 1952). This study showed that there is abundant genetic variation for wing and body size (as measured by thorax length) and both traits can be selected for, achieving heritabilities as high as 50% over several generations. Selection for either wing or thorax size entails co-selection for the other trait and overall body weight in the same direction. Partridge et al. found that small lines decrease their size through a deceleration of the growth rate, a lower critical weight and on the cellular level a decrease in cell size, whereas bigger size is achieved through longer growth periods and an increase in cell number (Partridge et al. 1999). In a selection experiment for cell number, cell area and total area of the wing on several geographically distinct populations, all traits achieved high heritabilities between 50% and 60% (Trotta et al. 2007). The response to selection was highly dependent on the geographic origin of the population tested, a logical consequence given that the genetic pools and allele frequencies of different populations are distinct, but not on the sex, as males and females show a similar response to selection, indicating a similar genetic architecture of size among sexes. Interestingly, Trotta et al. also found that selection for all traits seems not to have an impact on general fitness as assayed by viability; rather, the largest fitness differences were again observed between populations. This is somewhat surprising, as, due to involvement of IIS/TOR in fertility, developmental time and longevity, one could expect that these traits are correlated to body size and may thus impose a constraint on selection. 
Furthermore, in line with the results of Partridge et al., no decrease in developmental time, which could accelerate reaching of a reproductive age and thus give a fitness advantage, is observed in small lines. However, animals selected for small cell areas seem to have a competitive growth advantage at low temperatures in crowded environments. The smaller flies need fewer nutrients to generate an adult, which could be a deciding factor in a competitive environment. In terms of the genetic loci involved in these selection responses, a large-scale study over more than 100 generations by Turner et al. identified hundreds of loci differentially enriched between large and small populations (Turner et al. 2011). These loci are not randomly distributed across the genome but cluster in peaks that vary in width, from very large regions in the proximity of centromeres down to loci that contain only a few variants. The identified candidates, determined as lying closest to the most significant, i.e. peak variant in each LD window, are enriched for genes implicated in post-embryonic development and metamorphosis, such as the ecdysone induced proteins Eip63E and Eip75B, and for cell morphogenesis. Furthermore, several genes from the EGFR pathway (egfr), the Hippo pathway (salvador and crumbs) and many other growth pathways (dally, E2F, knirps, miniature) lie in close proximity to a peak variant. Canonical IIS/TOR signaling genes are not found among those closest to peak variants but often overlap with other significant variants in the differentially enriched loci. Taken together these studies suggest that there is abundant genetic variability for size present in natural populations that can be selected for and many loci get differentially enriched in this process. The challenge here, as in GWAS, will lie in the identification of truly causal selected loci among the large numbers of loci that get enriched due to linkage.
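What "differentially enriched between large and small populations" means at the level of a single variant can be sketched with a standard contingency test on allele counts. The example below computes the Pearson chi-square statistic for a 2x2 table; it is a generic illustration and not necessarily the test used by Turner et al., and the read counts as well as the function name `allele_chi_square` are hypothetical.

```python
def allele_chi_square(ref_a, alt_a, ref_b, alt_b):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table of
    reference/alternative allele counts in two populations."""
    total = ref_a + alt_a + ref_b + alt_b
    denominator = ((ref_a + alt_a) * (ref_b + alt_b)
                   * (ref_a + ref_b) * (alt_a + alt_b))
    if denominator == 0:
        # An empty row or column means there is no contrast to test.
        return 0.0
    return total * (ref_a * alt_b - alt_a * ref_b) ** 2 / denominator

# Hypothetical pooled read counts at one SNP:
# "large" population: 80 reference, 20 alternative reads
# "small" population: 40 reference, 60 alternative reads
stat = allele_chi_square(ref_a=80, alt_a=20, ref_b=40, alt_b=60)
print("chi-square statistic:", stat)
```

A statistic far above 3.84, the 5% critical value for one degree of freedom, indicates a frequency difference that is unlikely to arise from read sampling alone. It does not distinguish a causal variant from one that changed in frequency through hitchhiking or drift, which is the challenge described above, and a genome-wide scan would additionally require correction for multiple testing.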

3.9 Summary of the introduction and project motivation: The question of how animals control and coordinate growth among tissues has fascinated biologists for decades. Having a detailed mechanistic but at the same time global understanding of the growth processes taking place during normal physiological development is furthermore highly relevant for understanding the basis of the many pathologies subsumed under the term cancer. Numerous classical genetic studies in Drosophila over the past 30 years have given us a substantial understanding of the core developmental processes. They identified two pathways as the main underlying regulators of size, the Insulin/TOR pathway and the Hippo pathway, but also revealed that understanding growth control as a whole includes many layers of control and is highly complex. Many components are known; the complete picture, and especially the many interactions between components of these pathways with each other and other molecules, is however still missing. Most crucially, we likely have to address whole pathways and genetic networks at a time to fully comprehend the multi-faceted problem of growth control. As a quantitative trait such as size is by definition highly multigenic, being influenced by many loci across the genome that are interdependent and each only have a small effect on the final phenotype, studies that look at a single gene at a time might not be sufficient to grasp the whole system of networks underlying this trait. This is especially the case when certain alleles only show a mild effect that may be hard to reproduce or is overlooked in classical forward or reverse genetic screens, or when this effect is dependent on the genetic or environmental context. 
In this case, a more global approach that is able to evaluate the contributions of genes across the whole genome at a time to final size may bridge the gap between the detailed mechanistic understanding that we have from single gene studies and a more complete understanding of growth control at the systems level. Genome-wide associations and experimental evolution experiments are two examples of such global approaches that may each provide further insight into what combinations of loci are relevant for creating phenotypic variation in size and thereby reveal the contextual action of growth pathway components.

3.10 Aims of this thesis: In my thesis I want to address the following set of questions:

1. Can we develop standardized culture conditions that sufficiently minimize environmental influences on Drosophila body size, thus allowing accurate quantification of body size as a readout of the genotype?
2. Can GWAS identify loci significantly associated with size?
3. If so,
   a. what is the genetic architecture of size in Drosophila melanogaster?
   b. do they reflect the genes identified by classical genetics studies?
   c. where do these loci fit into the previously established picture of growth control?

Questions 2 and 3 will further be addressed using artificial selection and re-sequencing of selected populations. However, as the sequencing results will be obtained after the defense of this thesis, only a short summary of the selection process itself will be shown for the experimental evolution part in Results part 3. The potential benefit of the approach over GWAS was discussed in the introduction (chapter 3.8), and future directions for this experiment, and how it may add to the results obtained by GWAS, will be described in the outlook. To be able to perform relatively large-scale selection, an automated phenotyping and selection tool for Drosophila was developed to our specifications by Vasco Medici from the company SciTracks and is described in the Results (chapter 4.2).

To address question 1, we identified and tried to minimize or randomize potential confounders of size. Variance decomposition of the size phenotypes reveals the relative contributions of genotype and environment to size when flies are grown in the thus established conditions. The protocol was established for and tested with the Drosophila Genetic Reference Panel (DGRP) lines, the previously described panel of inbred lines derived from a population in Raleigh, North Carolina. The results of this experiment are described in the manuscript in Results part 4.1.

To answer question 2, we applied GWAS for wing and body size in the DGRP lines. GWAS pipelines were optimized for each phenotype to specifically remove any remaining confounding effects, and different association methods were used to identify loci underlying size variation. To validate the associations we performed tissue-specific RNAi knockdown of candidate genes and tried to link them to existing knowledge of the processes governing animal growth. The GWAS results are discussed in the manuscript in Results (chapter 4.1).
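The idea behind a variance decomposition of phenotypes measured on replicate individuals of inbred lines can be sketched as follows. This is a minimal illustration under simplifying assumptions (a balanced design, one-way ANOVA with line as the only factor) and not the analysis pipeline of the manuscript; the function name and the measurements are hypothetical.

```python
def variance_components(lines):
    """One-way ANOVA variance components for a balanced design.

    lines: one list of replicate measurements per inbred line, all of the
    same length. Returns (genetic variance, environmental variance, and the
    proportion of variance attributable to line identity)."""
    k = len(lines)       # number of lines (genotypes)
    n = len(lines[0])    # replicates per line
    if any(len(reps) != n for reps in lines):
        raise ValueError("balanced design required")
    line_means = [sum(reps) / n for reps in lines]
    grand_mean = sum(line_means) / k
    ms_between = n * sum((m - grand_mean) ** 2 for m in line_means) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for reps, m in zip(lines, line_means)
                    for x in reps) / (k * (n - 1))
    var_genetic = max((ms_between - ms_within) / n, 0.0)
    var_environment = ms_within
    total = var_genetic + var_environment
    proportion = var_genetic / total if total > 0 else 0.0
    return var_genetic, var_environment, proportion

# Hypothetical size measurements: three lines, two replicates each.
measurements = [[10, 12], [14, 16], [18, 20]]
var_g, var_e, h2 = variance_components(measurements)
print("genetic variance:", var_g)
print("environmental variance:", var_e)
print("proportion due to genotype:", h2)
```

The tighter the replicates of a line cluster around their line mean relative to the spread among line means, the closer this proportion is to one, which is what a well-controlled culture protocol aims for.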

Answering question 3 entails a combination of bioinformatic and statistical approaches. For subquestion a), we analyze the genomic distribution of associations and the corresponding p-values as a proxy for effect size. Part b) can be answered by checking for presence of previously known growth genes in the candidate set. Part c) mainly entails literature and database research. Additionally, we performed an analysis of statistical epistasis of variants in previously known growth genes with DGRP SNPs, which reveals those interactions between the two groups of SNPs that are significantly associated with size. To be able to address the potential function of variants located in the intergenic space we searched for overlap with potentially functional elements as annotated by the modENCODE effort and checked for conservation of some sequences among twelve Drosophila species. These results are also presented in the manuscript in Results (chapter 4.1).

4. RESULTS

The results chapter is composed of three different parts:

4.1 Novel loci rather than variants in canonical growth pathway genes are associated with wing and body size variation in Drosophila melanogaster (Manuscript in preparation)

4.2 The FlyCatwalk: A high throughput feature-based sorting system for artificial selection (submitted manuscript)

4.3 Further Results

4.3.1 Foodbatch variability is a strong and specific confounder for wing size in Drosophila melanogaster

4.3.2 Artificial selection for Drosophila wing size

4.1 Novel loci rather than variants in canonical growth pathway genes are associated with wing and body size variation in Drosophila melanogaster (Manuscript in preparation)

Sibylle Chantal Vonesch1, David Lamparter2, Sven Bergmann2, Ernst Hafen1

1 Institute of Molecular Systems Biology (IMSB) ETH Zürich Auguste-Piccard-Hof 1 CH-8093 Zürich

2 Department of Medical Genetics, University of Lausanne Rue de Bugnon 27 CH-1005 Lausanne

ABSTRACT

Knowledge of how processes governing animal growth are modulated to create variation in size is crucial for understanding normal development and tumorigenesis. Numerous involved factors have been identified but the complete picture remains elusive. As many genes may affect size, large-scale methods that evaluate the whole genome instead of focusing on single genes can substantially expand our knowledge of this trait.

Here we present the application of genome-wide association methods to studying developmental traits in Drosophila. We show that we successfully reduced the influence of environmental confounders on size by raising flies under a strict environmentally controlled regime. GWAS for wing and body size revealed a substantial number of loci associated with these traits, but surprisingly the majority of genes located in proximity of a significant SNP had not previously been implicated in growth control. We validated 45 novel genes for a role in size determination in the Drosophila wing and found that significant intergenic SNPs were preferentially located in regions with enhancer signature and overlapped lincRNA loci. Finally, we show that a large part of our novel candidates have a human ortholog, many of which have been associated with height and obesity related traits in humans.

In summary, genome-wide association studies for developmental traits in Drosophila have identified novel regulators of size and may shed light on interactions between these and known genes. The identification of loci that are modulated to create variability in size while proper function is maintained provides a first step towards a systems-level understanding of size determination. The high conservation of basic developmental processes allows findings from Drosophila to serve as a basis for hypothesis driven investigation of the physiological function and role in disease of orthologous genes in humans.

INTRODUCTION:

The question of how animals control and coordinate growth among tissues has fascinated biologists for decades. Having a detailed mechanistic but at the same time global understanding of the growth processes taking place during normal physiological development is furthermore highly relevant for understanding the basis of the many pathologies subsumed under the term cancer. Numerous classical genetic studies in Drosophila over the past 30 years have provided us with a substantial understanding of the core molecular mechanisms of growth control and have shed light on the role of humoral factors and the environment on final adult size (Oldham et al. 2000, Johnston and Gallant 2002, Oldham and Hafen 2003, Mirth and Riddiford 2007, Pan 2007, Shingleton 2010, Tumaneng et al. 2012). From these studies two pathways emerged as the main underlying regulators of size, the Insulin/TOR pathway, which couples systemic growth to nutrient availability, and the Hippo tumor suppressor pathway, which controls cell survival and proliferation in developing organs. However, these studies also reveal the enormous complexity and context-dependence of growth control, and to date the complete picture, especially the many interactions between components of these pathways with each other, with yet unknown molecules, and with extrinsic factors, is still missing. As a quantitative trait such as size is by definition highly multigenic (Gockel et al. 2002, Lango-Allen et al. 2010, Yang et al. 2010, 2011), being influenced by many loci across the genome each having only a small effect on the final phenotype, studies that look at a single gene at a time might not be sufficient to grasp the whole system of networks underlying this trait. 
This is especially the case when certain alleles only show a mild effect that may be hard to reproduce or is overlooked in classical forward or reverse genetic screens, or when this effect is dependent on the genetic or environmental context (Falconer and Mackay 1996, Lynch and Walsh 1998). In this case, a more global approach that is able to probe the contributions of genes across the whole genome at a time to final size may bridge the gap between the detailed mechanistic understanding that we have from single gene studies and a more complete understanding of growth control at the systems level.

Genome-wide association studies (GWAS) are a popular method for linking variation in quantitative traits to underlying genetic loci, thus shedding light on the genetic architecture of such traits (Womack et al. 2012, Korte and Farlow 2013). They were pioneered (Hirschhorn and Daly 2005, McCarthy et al. 2008) and have since been widely applied in humans and are now a routinely used tool in model organisms like Arabidopsis, Drosophila and mouse as well as in various crops and cattle species (Bergelson and Roux 2010, Jumbo-Lucioni et al. 2010, Meijon et al. 2013, Hwan Lee et al. 2013, Huang et al. 2011, Flint and Eskin 2012, Lipka et al. 2013, Makvandi-Nejad et al. 2012, Garcia-Gamez et al. 2012, Maxa et al. 2012, Minozzi et al. 2013), where they have substantially broadened our understanding of the underlying genetics of complex traits. GWAS exploit the natural genetic variation that is present in populations and aim to identify those genomic loci that are causal for variation in the phenotype of interest (McCarthy et al. 2008). GWAS of height have revealed that many loci across the genome, with each locus only having a small effect, contribute to size variation in humans (Lango-Allen et al. 2010, Yang et al. 2010, 2011). This is in contrast to a much simpler genetic architecture of size in domestic animals, where few loci explain a large proportion of size variation (Sutter et al. 2007, Makvandi-Nejad et al. 2012). Though many loci affecting human height have been identified by GWAS, deducing the underlying molecular mechanisms is difficult in humans. The absence of possibilities for functional validation in the organism where associations were discovered, and the presence of large environmental variability make it difficult to identify causal links between genotype and phenotype (Lynch and Walsh 1998, Oksenberg et al. 2008, Thomas 2010, Bergelson and Roux 2010, Vilhjálmsson and Nordborg 2013). In contrast to human studies, GWAS in model organisms benefit from the feasibility of functional validation, much more stringent environmental control and the possibility of measuring several genetically identical individuals to obtain a high confidence phenotypic value for a given trait and genotype. All three factors can substantially improve the power of a GWAS. 
However, as slight environmental fluctuations occur even under controlled laboratory conditions, and interactions between genotypes and the environment can be abundant, often accounting for a large part of total phenotypic variance (Valdar et al. 2006), it is necessary to take the utmost care in identifying potential confounders for a given phenotype and to address each of them as necessary, by either eliminating, randomizing or accounting for the source of variation. GWAS in Drosophila have become accessible to the wider research community since the establishment of the Drosophila Genetic Reference Panel (DGRP) (Mackay et al. 2012, Huang et al. 2014). The DGRP is a set of wild-derived, inbred Drosophila melanogaster lines whose genome sequences are publicly available, which makes them a great resource for studying complex traits by performing GWAS. The DGRP lines harbor the substantial natural genetic variation present in the original wild population and they show an abundance of phenotypic variation for every trait assayed so far (Ayroles et al. 2009, Jumbo-Lucioni et al. 2010, Mackay et al. 2012, Jumbo-Lucioni et al. 2012, Massouras et al. 2012, Swarup et al. 2012).

In this study we use GWAS as an approach for studying developmental traits in Drosophila. We achieve high confidence phenotypic means for these traits by raising flies under strictly controlled environmental conditions. GWAS for wing and body size revealed a substantial number of loci associated with these traits, confirming the highly multigenic nature of size traits. Most of these loci were located in intergenic and regulatory regions, implying that phenotypic variation in size in Drosophila is mainly governed by changes in protein abundance and modulation of regulatory networks, rather than large functional changes in proteins. Surprisingly, the majority of genes that were located in proximity of a significant SNP were not previously implicated in growth control, with many genes not even functionally annotated. Among the few exceptions to this rule were a SNP lying in the regulatory region of the insulin-like peptide dILP-8, which was significantly associated with relative wing size in females, several components of the Hippo tumor suppressor and the EGFR pathway, and genes with a role in planar cell polarity. We validated 45 novel genes for a role in size determination in the Drosophila wing and found that significant intergenic SNPs were preferentially located in regions with enhancer signature and in lincRNA loci. A SNP 2kb upstream of the expanded locus lay within a region of evolutionary conservation, indicating the presence of a putatively functional regulatory element. We found human orthologs for a large part of our novel candidates, many of which have been associated with height and obesity related traits in humans. Genome-wide approaches can thus identify novel regulators of developmental traits and help place them within the context of already established networks. The identification of loci that underlie natural size variation provides a first step towards a systems-level understanding of size determination.

RESULTS:

A standardized culturing regime sufficiently reduces the influence of environmental perturbations on size

We devised a protocol specifically designed towards minimizing environmental contributions to adult Drosophila size and raised ten replicates of four DGRP lines under these conditions (Figure 1A). To assess the effectiveness of the protocol, we estimated the proportion of phenotypic variance attributable to replicates with a linear mixed model. Taking wing size as an example, we found that the replicate term explained less than 1% of total phenotypic variance, a negligible fraction compared to the contribution of the genotype (78%, Table S1). The remaining 21% was accounted for by residual or intra-line variance. These remaining phenotypic differences between flies of the same genotype could be the consequence of unmonitored environmental variables and/or stochasticity of cellular processes (Kilfoil et al. 2009). Since mean phenotypes per line are used in the GWAS, this effect will be cancelled out and will not diminish power. As the analysis of interocular distance showed similar results (Table S1), we are confident that our standardized development protocol sufficiently deals with confounding effects on size phenotypes.
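As an illustration of how such a variance partition can be obtained, the following minimal Python sketch estimates between-line and within-line variance components for a balanced one-way design using the classical ANOVA (method of moments) estimator. It is a simplification and not the linear mixed model used in this study: the replicate term is omitted, and the function names and toy data are hypothetical.

```python
from statistics import mean


def variance_components(groups):
    """Balanced one-way random-effects ANOVA (method of moments).

    groups: dict mapping a line ID to a list of measurements; all lists
    must have the same length (assumption of this sketch).
    Returns (between_line_variance, within_line_variance).
    """
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes a balanced design")
    n = sizes.pop()          # individuals measured per line
    k = len(groups)          # number of lines
    grand = mean(x for values in groups.values() for x in values)
    # Mean squares between and within lines.
    ms_between = n * sum((mean(v) - grand) ** 2 for v in groups.values()) / (k - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v) / (k * (n - 1))
    # Negative estimates are truncated at zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between, ms_within


def genetic_fraction(groups):
    """Share of total variance attributable to differences between lines."""
    var_between, var_within = variance_components(groups)
    return var_between / (var_between + var_within)


# Hypothetical toy data: two lines, two flies each.
toy = {"line_a": [1.0, 3.0], "line_b": [5.0, 7.0]}
print(genetic_fraction(toy))
```

A mixed model fitted by restricted maximum likelihood would additionally handle unbalanced data and further random terms such as replicate; the estimator above is chosen only because it is short and transparent.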

Quantitative genetic analysis of body and wing size variation

To quantify the extent of phenotypic and genetic variation for size among the DGRP lines we cultured 143 of the lines under the above standardized conditions and characterized them for three phenotypes: centroid size (CS), interocular distance (IOD) and thorax length (TL), as representative measurements for wing size and body size (eye disc derived and wing disc derived) respectively (Figure 1B). We observed extensive phenotypic variation in all traits between the lines (Figure 2A-C, Tables S2 – S5) that was to a significant extent due to the different genotypes of the lines (p<<<10E-05), an indication that the environmental control was stringent enough (Table S6). We also found extensive variation between the sexes (p<<<10E-05) and significant genotype by sex interactions (p<<<10E-05) for all traits. The significant line by sex term most likely arose due to differences in sexual dimorphism between the lines rather than a fundamental difference in the genetic architecture of the traits between the sexes as broad-sense heritability estimates were similar for males and females and the cross-sex genetic correlation (rMF) was high for all traits (rMFCS=0.97, rMFIOD=0.97, rMFTL=0.96, Table S7).

To determine what fraction of phenotypic variation in the population was due to the different genotypes of the lines we estimated relative contributions of genotype and environment using a linear mixed model (Figure 2B, Table S8). Broad-sense heritability, the genetic component of phenotypic variance, was high for all three traits ($H^2_{CS}=0.63$, $H^2_{IOD}=0.69$, $H^2_{TL}=0.63$, Table S8). Surprisingly, 15% of total phenotypic variance in centroid size could be attributed to the different food batches the flies were grown on. Even though the effect was markedly lower for the body size measures, we used corrected phenotypic values for all traits in subsequent analyses to remove this effect.

As cosmopolitan inversions are present in the genomes of some DGRP lines and have been found to be associated with other quantitative traits (Huang et al. 2014), we specifically modeled the presence of the two cosmopolitan inversions, In(2L)t and In(3R)Mo, that we found to have the biggest effect on size. We performed subsequent analyses with both the inversion-corrected (CSIC and IODIC) and non-corrected (CS and IOD) phenotypes to evaluate whether accounting for the presence of the inversions would affect the number and identity of identified SNPs. We chose IOD as body size measure as it showed a markedly lower genetic correlation with CS than TL did (Table S9), and we thus expected to largely map SNPs for each trait separately with a few variants common to both traits. To additionally find loci that specifically affected variation in wing size unrelated to the overall body size variation, we defined a measure for relative wing size (rCS) by modeling wing size as a function of body size and taking the deviation from the fit as our relative size phenotype.
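The definition of relative wing size as the deviation from a fit of wing size on body size can be sketched as follows. This Python example assumes a simple ordinary least-squares line with a single body size predictor; the actual functional form and predictor used in this study are not restated here, and the function name and toy values are hypothetical.

```python
from statistics import mean


def relative_size(wing, body):
    """Residuals of the least-squares fit wing = a + b * body.

    wing, body: equally long sequences of per-line mean phenotypes.
    A positive residual marks a wing that is larger than expected
    for the given body size.
    """
    mean_body, mean_wing = mean(body), mean(wing)
    sxx = sum((x - mean_body) ** 2 for x in body)
    sxy = sum((x - mean_body) * (y - mean_wing) for x, y in zip(body, wing))
    slope = sxy / sxx
    intercept = mean_wing - slope * mean_body
    return [y - (intercept + slope * x) for x, y in zip(body, wing)]


# Hypothetical toy values for three lines.
print(relative_size(wing=[2.0, 4.0, 9.0], body=[1.0, 2.0, 3.0]))
```

By construction the residuals sum to zero and are uncorrelated with the body size measure, which is what makes them usable as a phenotype for wing size variation independent of overall size.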

GWAS identifies many novel loci associated with wing and body size variation

To identify common loci that contribute to wing and body size variation in Drosophila, we performed five GWAS, for CS and IOD, the inversion corrected phenotypes CSIC and IODIC and relative wing size rCS, in each sex using the FaST-LMM method. To obtain relatively stable effect size estimates we used 1’319’937 SNPs that occurred in at least ten lines of our dataset (7% of our measured population and subsequently referred to as minor allele count 7 (MAC7) GWAS). However, as rare alleles commonly have larger effects we additionally performed all ten GWAS with SNPs present in at least seven lines (5% of our population, MAC5) and four lines (3% of our population, MAC3) to probe also rarer alleles (Tables S10- S39). Though we expected to pick up more false positive associations in these GWAS due to inaccurate estimation of phenotypic variance with only four data points this does not present a substantial problem in Drosophila, since associations can be validated functionally.
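The SNP inclusion thresholds described above amount to a filter on the number of lines carrying the rarer allele. The following Python sketch shows one way to express this, assuming that each inbred line is coded as homozygous for either the reference (0) or the alternate (1) allele and that missing calls are coded as None; this coding, the function names and the toy data are assumptions for illustration.

```python
def minor_allele_count(calls):
    """Number of lines carrying the rarer allele; missing calls are ignored."""
    alt = sum(1 for c in calls if c == 1)
    ref = sum(1 for c in calls if c == 0)
    return min(alt, ref)


def filter_snps(genotypes, min_lines):
    """Keep SNPs whose rarer allele occurs in at least min_lines lines.

    genotypes: dict mapping a SNP ID to a list of per-line calls.
    """
    return {snp: calls for snp, calls in genotypes.items()
            if minor_allele_count(calls) >= min_lines}


# Hypothetical toy data for 14 lines.
toy = {
    "snp_a": [0] * 10 + [1] * 4,        # rarer allele in four lines
    "snp_b": [0] * 12 + [1, None],      # rarer allele in one line
}
print(sorted(filter_snps(toy, min_lines=4)))
```

With the thresholds used in this study, min_lines would be set to ten, seven and four for the MAC7, MAC5 and MAC3 analyses, respectively.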

In the MAC7 GWAS we identified between 59 and 77 significant (p<10E-05) SNPs for females and between 25 and 43 SNPs for males, depending on the phenotype, with relative centroid size (rCS) showing the highest number of associated SNPs (Table S40, Figure 3A). These largely represented individual associations, as we did not detect any long range LD (Figure 3B). Overall, the number of significant SNPs was comparable between phenotypes and, as expected, got proportionally higher for each phenotype with decreasing inclusion stringency for SNPs. One exception was the MAC3 GWAS of IODIC, where substantially more SNPs were significantly associated than for any other trait. This was due to many SNPs with a low minor allele count reaching significance, which we, however, presume to be false positive associations as the corresponding genes disappeared from the candidate list as soon as the minor allele count threshold was elevated.

Somewhat unexpectedly, given the comparable coefficients of phenotypic variation and broad-sense heritabilities of males and females, and the high cross-sex genetic correlation, consistently more loci reached significance in females than in males. The proportion of significant variants identified in males that were also significant in females ranged from 25% to 88%, being lowest for CSIC and highest for IODIC, with a median overlap of 52%. In contrast, only about 20-30% of female significant SNPs were also identified in males, which reflected the higher number of SNPs identified for females. The significance ranking of SNPs also differed between the sexes; for example, the two most significant loci in females were not the same as those in males.

Between phenotypes (Table S41) the overlap was highest between the inversion corrected and non-corrected GWAS for both wing and body size with additional SNPs reaching significance in the GWAS with corrected phenotypes (Figure 3C). This suggests that the correction for the presence of a confounder enhances power of the GWAS. Nevertheless, a considerable proportion of the loci identified in the GWAS with uncorrected phenotypes remained significant associations in the GWAS of the inversion corrected phenotype, demonstrating that those results were to a large extent not false positives due to confounding with inversion presence. We found no overlap between centroid size and IOD GWAS in terms of identified SNPs as could be expected based on the relatively low genetic correlation between these traits (Table S9). These results also hint at a generally more similar genetic architecture for absolute and relative wing size than for absolute wing size and body size. In both sexes about one third of SNPs identified in the absolute CS GWAS were significant also in the relative CS GWAS, and vice versa, showing that the majority of SNPs is specific for either absolute or relative size and suggesting the existence of partly separate mechanisms for creating variation in the size of individual organs.

To investigate whether the identified SNPs were preferentially located in specific functional elements or randomly distributed across the genome, we annotated the candidate lists with FlyBase genomic features (Tables S10-S39). Not surprisingly, this revealed that across phenotypes the majority of associated SNPs fell into the intergenic space and regulatory regions (introns, UTRs), as was observed in most published GWAS to date. For the female MAC7 relative centroid size GWAS we for example identified 63 intergenic, 38 intronic, 7 UTR, 8 synonymous, 1 nonsense and no missense SNPs (Table S42).

Perhaps the most unexpected result was the low number of canonical growth control genes we identified in our list of significant candidates (Table S43). Given the vast knowledge we already have about large numbers of genes involved in Drosophila organ and body size determination, we would have expected to identify many of these and a few novel loci. Contrary to this expectation, the majority of genes we identified are novel candidates for a role in growth control. Among the exceptions were a SNP 197bp downstream of the Ilp8 gene coding for the Drosophila insulin-like peptide 8 (Fig 3A), two significant SNPs localized near or in the genes coding for two important regulators of TORC1 under stress conditions (Scylla and the gamma subunit of AMPK), some significant hits in or in the proximity of components of the Hippo tumor suppressor pathway (Wts, Ex) and the EGFR pathway (kek-1, rasp, Sos, Gug), and some regulators of processes like tissue polarity and patterning that are important during wing and eye development.

To evaluate whether the candidate sets were enriched for interactions and for distinct functional classes, processes or subcellular locations, we performed GO analysis using DAVID and STRING (Huang et al. 2009a,b, Franceschini et al. 2013). We only included genes in this analysis that had a significantly associated SNP in or 1kb up- or downstream of their transcribed region to be sure to avoid misannotation. The candidate gene set was enriched for interactions for CSF (MAC3), IODF (MAC3), IODFIC (MAC3 and 7), rCSF (MAC3) and rCSM (MAC 3 and 7) (table enrcand). Enriched categories included plasma membrane location (IODF MAC3, IODFIC MAC3), response to stimulus (IODM MAC3) and regulation of Ras protein signal transduction (CSF MAC3). Organ and imaginal disc morphogenesis categories were among the top categories for wing size traits but not significantly (Bonferroni corrected p<0.01) enriched. We also noticed that some candidate genes seemed to have roles in amino acid and sugar metabolism, but again we detected no significant enrichment of any of these pathways.

In summary, we identified many common SNPs associated with either size variation in the wing or the body with little overlap, indicating that modulation of distinct genetic networks in the precursor tissues during growth creates size variation in these two structures. Despite the vast number of genes we know to have a role in growth control from classical genetic studies, the majority of SNPs associated with size variation were located in intergenic regions, making it difficult to deduce the functional impacts of such loci, or lay within and around genes with no previous implication in size determination.

Gene-wise summary statistics identify more and different genes associated with size variation

One reason that we failed to detect canonical growth pathway genes might be that they do not contain SNPs with large effects but rather many SNPs with small effects that each on their own did not reach significance. We think this is a likely explanation, as, given the essential role of many growth genes, we would expect large effect SNPs to be depleted in such genes. We thus calculated gene-wise summary statistics using the VEGAS method (Liu et al. 2010), which sums the effects of all SNPs within a gene and calculates a gene-level p-value while correcting for linkage and gene length. Lacking a formal significance threshold, we selected the 20 top hits for each phenotype as putative candidates (Table S44). The overlap between the VEGAS candidate genes and the significantly associated genes from the GWAS was small. When summed over all phenotypes, however, 24% of the top VEGAS genes contained a SNP that reached significance on its own in one of the GWAS. In contrast to the GWAS gene sets, we did not find any GO enrichment or enrichment for interactions in the top 20 VEGAS gene lists. As in the GWAS, genes identified by VEGAS were largely novel putative candidates for a role in growth control.
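The idea behind such a gene-level statistic can be sketched in a few lines of Python. The example below is not the VEGAS software and makes simplifying assumptions: single-SNP p-values are converted to one-degree-of-freedom chi-square statistics and summed, and the null distribution of the sum is approximated by simulating correlated normal variates from a user-supplied LD (correlation) matrix. All function names, the number of simulations and the toy inputs are hypothetical.

```python
import random
from statistics import NormalDist


def chi2_from_p(p):
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive definite matrix.

    Fails for SNPs in perfect LD (singular matrix); such SNPs would have
    to be pruned beforehand.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = (matrix[i][i] - s) ** 0.5
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def gene_p_value(snp_p_values, ld, n_sim=10000, seed=0):
    """Empirical gene-level p-value for the sum of SNP chi-square statistics."""
    observed = sum(chi2_from_p(p) for p in snp_p_values)
    lower = cholesky(ld)
    rng = random.Random(seed)
    n = len(snp_p_values)
    hits = 0
    for _ in range(n_sim):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Impose the LD structure on independent normal draws.
        correlated = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        if sum(c * c for c in correlated) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


# Hypothetical gene with two SNPs in moderate LD.
ld = [[1.0, 0.5], [0.5, 1.0]]
print(gene_p_value([1e-4, 1e-3], ld, n_sim=2000))
```

Because the null sum is simulated for the same number of SNPs and the same LD structure as the observed gene, longer genes and genes with tightly linked SNPs are not favored simply for contributing more terms to the sum.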

Functional validation of novel genes tagged by associated SNPs:

To confirm whether novel genes have a role in growth control, we performed functional validation applying tissue-specific RNAi. We chose candidates for validation only from the MAC7 GWAS and picked a subset for each phenotype based on immediate availability of VDRC or Bloomington RNAi lines. As in the enrichment analysis, we limited ourselves to testing genes containing a significant SNP in or 1kb around their transcribed regions. To assess the phenotypic effects of genes identified in the wing trait GWAS we performed tissue-specific knockdown using the nubbin-Gal4 driver and measured total wing area as a readout. We tested between 30% and 66% of candidates for each wing phenotype and validated between 64% and 74% of the tested candidates, meaning their knockdown resulted in a significant change in wing size (p<10E-03, Wilcoxon rank sum test, Table S45, S46, S47, Figure 4A). The validation rates we observed for VEGAS candidates were similar to those of the GWAS candidates, ranging from 43% to 80%. However, as we tested a smaller subset of candidates, between 25% and 40%, this estimate could be inaccurate. The validated candidates included the previously known genes tws, stan, chinmo, aPKC and Ilp8 but also 45 novel genes that showed a significant change in wing size upon knockdown in at least one sex. Knockdown of one gene, VhaM8.9, resulted in severely decreased wing size (Figure 4B), whereas knockdown of other genes had milder effects on pigmentation, bristles and vein formation. We annotated validated candidates with functional classes using DAVID, which showed that genes fell into functional classes like metabolism, RNA and protein metabolism, cell and neuron morphogenesis, immune response, signaling, transmembrane transport, inositol phosphate metabolism, wound healing and autophagy (Figure 4C).
To rule out that a similar validation rate could be achieved by knockdown of random genes across the genome, a not unlikely scenario given that size is a phenotype influenced by many genes, we tested an additional 26 randomly selected genes for an effect on wing size. These were chosen based on the absence of a significant SNP in or 1kb around their transcribed region. Only 42% of the random set showed a significant change in wing size upon knockdown (Table S48), suggesting a clear advantage in power of identifying candidate genes for growth control by GWAS over randomly picking genes. We tested a smaller subset of IOD and IODIC GWAS candidate genes for a change in IOD upon knockdown with eyeless-GAL4 only in females, and validated eight of these (62%) (Table S49).

Pairwise significant association lists are significantly enriched for interactions and reveal novel connections between known growth genes

As a strategy to link the novel genes to previously known growth control genes and thus elucidate where in the web of interacting growth pathways they act, we performed pairwise association using FastEpistasis (Schüpbach et al. 2010). To this end we composed a list of SNPs within and 1kb up- or downstream of genes that were previously known to play a role in growth control (14’137 SNPs) or wing development (43’498 SNPs) and used these as focal SNPs (Table S50). Apart from hoping to see many of our newly identified candidates, we expected to find many already known growth pathway genes in the interactor list, as interactions among genes within and between these pathways are abundant. We did not consider this redundant information, as it may, among the thousands of possible interactions, reveal those connections that are most relevant for creating size variation. We tested for pairwise association of each focal SNP, in combination with all other DGRP SNPs present in at least 10 lines (the interactors, 1’100’811 SNPs), with the inversion controlled wing and body size phenotypes (CSIC, IODIC) and with relative wing size in females. This resulted in a total number of 15’562’165’107 and 47’883’076’878 tests, requiring at least a p-value of 1E-13 for formal significance. Only one interaction, in the pairwise association to wing size, reached this threshold: naked cuticle (nkd), a downstream target of Dpp that negatively regulates wingless signaling (Yang L et al. 2013), with the protein tyrosine phosphatase Ptp99A, which has a role in neuron development and has been shown to interact with InR and the Ras signaling pathway (Madan et al. 2011, Carter 2013). As the Bonferroni multiple testing correction is very conservative, we analyzed candidate interactions above a nominal significance threshold of 1E-10 for the presence of genes we had previously identified by GWAS.
Surprisingly, among the 285 interactors we found only 23 previously known growth genes but 42 of the genes that we identified by GWAS (Table S51, Figure 5). The interactions we identified can thus serve as a basis for hypothesis driven investigation into the role of new interactions.
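The scale of the multiple testing burden in the pairwise analysis follows directly from the SNP counts given above. The short Python sketch below reproduces the number of tests per focal set and the corresponding Bonferroni threshold. The family-wise error rate of 0.05 is an assumption of this example; the level underlying the 1E-13 threshold, and whether tests were pooled across focal sets and phenotypes, is not restated here.

```python
# SNP counts as given in the text.
FOCAL_GROWTH = 14_137       # focal SNPs in or near known growth control genes
FOCAL_WING = 43_498         # focal SNPs in or near wing development genes
INTERACTORS = 1_100_811     # SNPs present in at least ten lines


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


tests_growth = FOCAL_GROWTH * INTERACTORS
tests_wing = FOCAL_WING * INTERACTORS

for label, n_tests in (("growth control", tests_growth), ("wing development", tests_wing)):
    print(f"{label}: {n_tests} tests, threshold {bonferroni_threshold(n_tests):.2e}")
```

Under this assumed error rate the per-set thresholds are on the order of 3E-12 and 1E-12; a stricter error rate or pooling of tests across sets and phenotypes lowers them further, towards the 1E-13 used here.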

GO enrichment analysis revealed that the newly identified interactors play important roles in development. The interactor lists from the pairwise associations to all three phenotypes were enriched (p<0.01) for genes with plasma membrane localization and involved in morphogenetic processes, neuron development, differentiation, locomotion and metamorphosis. Both the rCSF and IODFIC lists showed enrichment for genes located at cell junctions and involved in taxis, while the rCSF list was additionally enriched for genes with roles in behavior and the regulation of RNA metabolism, and the IODFIC list for transcription factor activity, cytoskeletal protein binding, tracheal development, gastrulation and muscle differentiation. Furthermore, though both the interactor lists and the focal lists were individually enriched for interactions, the combined lists showed more interactions than expected and than the sum of the interactions from the individual lists, indicating that biological interactions do underlie the statistical interactions (Table S52). In summary, analyzing pairwise interactions may identify putative biologically relevant interactions and shed light on novel connections between already known genes.

Intergenic SNPs are preferentially located in regions with enhancer signatures and overlap lincRNA loci

To elucidate how intergenically located significant associations might affect phenotypic variation in size we investigated whether any of these SNPs lay within modENCODE functionally annotated intergenic regions. The modENCODE database (Celniker et al. 2009) contains extensive information on methylation patterns, transcription factor binding sites, noncoding RNAs, origins of replication and chromatin states across the Drosophila melanogaster genome. Though regions with predicted functionality have not been proven to actually be functional, the localization of a SNP within such a region may provide a basis for investigation of the effect of this association. Specifically, we asked whether there was enrichment among our candidate loci for SNPs lying within regions containing histone 3 lysine 4 monomethylation (H3K4Me1), a signature of active enhancers. We additionally obtained a dataset of long intergenic noncoding RNA (lincRNA) annotations in the Drosophila genome from Young et al. (2012) and searched for enrichment of SNPs localized to lincRNA loci. LincRNAs are of special interest as they have been implicated in developmental regulation and are often enriched for trait-associated loci (Hangauer et al. 2013). We tested for enrichment only in the MAC7 gene lists and for the H3K4Me1 enrichment restricted ourselves to three developmental stages (L2, L3, pupae), which we considered to be the most relevant interval for gene activity affecting growth of imaginal discs. We found significant enrichment (p<0.05, hypergeometric test) of SNPs lying in regions with H3K4Me1 signature in the L2 stage among candidate loci from the absolute wing and body size GWAS in males and females (CSF, CSFIC, CSM, CSMIC, IODM, IODF and IODFIC) but not among loci associated with relative wing size (Table S53). Only IODF candidate loci showed significant enrichment for overlapping H3K4Me1 peaks in the L3 stage, while in the pupal stage enrichment was significant for absolute wing size in both sexes, and for body size and relative wing size in females (CSF, CSFIC, CSM, CSMIC, rCSF, IODFIC). We found a significant enrichment of lincRNA loci overlapping candidate SNPs only for body size GWAS in females (IODF, IODFIC), though there was at least one SNP lying in a lincRNA in all candidate lists except rCSM (Table S54). These data suggest that intergenic loci may affect size variation by introducing changes in the binding sites for transcriptional activators and repressors on the DNA and thus modulating the affinity of these proteins to such regulatory elements, which will in the end affect the efficiency and strength of transcriptional activation or repression, leading to changes in transcript abundance. On the other hand, intergenic loci may be located in noncoding RNAs such as microRNAs and, as shown above, lincRNAs, which have both been shown to regulate cellular processes and growth. As noncoding RNAs work by complementarity to their targets, a nucleotide change may reduce or enhance affinity to their substrates and thus modulate regulatory input.
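For reference, a one-sided hypergeometric enrichment test of the kind used above can be written compactly in Python. The sketch below computes the probability of observing at least a given number of candidate SNPs inside annotated regions, given the total number of tested SNPs and the number of those that fall into such regions. The function name and the toy numbers are hypothetical and are not the counts from Tables S53 and S54.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: all tested SNPs
    successes:  tested SNPs lying in annotated regions (e.g. H3K4Me1 peaks)
    draws:      candidate SNPs from one GWAS
    observed:   candidate SNPs lying in annotated regions
    Exact integer arithmetic is used, which is simple but becomes slow
    for very large inputs.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(comb(successes, k) * comb(population - successes, draws - k)
               for k in range(observed, upper + 1))
    return tail / total


# Hypothetical toy numbers: 10 SNPs, 4 in peaks, 3 candidates, 2 of them in peaks.
print(hypergeom_enrichment_p(population=10, successes=4, draws=3, observed=2))
```

The test conditions on the number of candidates and the number of annotated SNPs, so a list is called enriched only if its overlap exceeds what random draws of the same size from the tested SNPs would produce.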

Based on the observation that candidate lists seemed to be enriched for SNPs lying in regions with putative enhancer functionality, we were interested in whether the SNP 2kb upstream of expanded (position 429144) could represent a regulatory element. 2000 bp is a reasonable distance for a transcriptional regulatory element in Drosophila; however, we did not detect the SNP in regions with H3K4Me1 signature in the L2, L3 or pupal stage. To elucidate in what other putatively functional region the SNP could be located, we checked annotation with a variety of alternative modENCODE tracks. This revealed that the SNP was situated within regions annotated with histone deacetylase binding sites in embryos, origins of replication in Drosophila cell culture cells, H3K4Me1 in embryos and in the adult female, H3K27Ac, another mark of enhancer regions, in the adult female, H3K9Ac, a mark of transcriptional start sites, in the embryo, and was assigned state 4 of the 9 state chromatin model, suggestive of a strong enhancer (Ernst et al. 2011). As the mere presence of these marks is not sufficient to imply functionality, we checked for conservation of the sequence around this SNP across taxa. Specifically, we chose a region, starting at the annotated transcription start site (TSS) of the gene expanded (current FlyBase annotation) and extending 10kb upstream on 2L, from the Drosophila melanogaster genome and performed multiple sequence alignment by BLAST (Altschul et al. 1990) to the same relative sequences from the genomes of seven other Drosophila species (Table S55, supplementary data file 1).

We limited ourselves to including only those species that contained the ortholog of the expanded gene in the same orientation in the genome. The BLAST search produced significant alignment of the 3kb immediately upstream of the expanded TSS to an approximately 3kb sequence piece in the genomes of D. sechellia, D. yakuba and D. erecta (sequence similarity 80% to 90%, Details alignment, Fig 4D). This is rather high for a regulatory element, where usually polymorphisms in the longer sequence are of little consequence and only a few residues are highly conserved. Also speaking against this sequence having conserved regulatory function was its location not immediately upstream of the ex orthologs but much further (8kb) upstream in the other Drosophila species. In summary, we present evidence for a putatively functional region immediately upstream of the Drosophila melanogaster expanded transcription start site based on functional annotation with modENCODE data and conservation across species. Whether this element truly is functional, and whether it represents an enhancer, an additional promoter, an origin of replication or a previously unannotated noncoding RNA or gene locus remains to be seen.

Human orthologs of candidate genes are associated with height, obesity and a variety of other traits

To investigate conservation to humans and further elucidate putative function of our candidates, we searched for orthologous proteins in humans using DIOPT-DIST (Hu Y et al. 2011). DIOPT-DIST combines several algorithms and tools for ortholog prediction and simultaneously accesses curated resources to link DIOPT results with human trait and disease associations. Using DIOPT-DIST, we found human orthologs for 137 of our 187 candidate genes (combined from GWAS for all phenotypes), five of which had a good confidence ortholog (score > 4) associated with height, two with pubertal anthropometrics and an additional three with bone mineral density, which is an important factor during growth (Table S56). As in Drosophila both growth and metabolism are controlled by IIS signaling while in humans there are distinct receptors and downstream cascades for these two functions, we hypothesized that genes associated with traits that often correlate with deregulation of insulin signaling in humans, such as obesity, high BMI and an imbalance in metabolite levels, might play a role in growth regulation in Drosophila. Indeed we found human orthologs for 24 of our candidate genes with an association to BMI, interaction with BMI, metabolite levels, obesity related traits and Type II diabetes. We additionally found that GIPC2, the human ortholog of the PCP regulator kermit, for which we detected association to body size with the VEGAS method and that showed a nominally significant pairwise interaction with the EGFR in the association to wing size, was associated with human height, corroborating its role in growth control. In summary, the finding that most of our candidate genes have orthologs in humans underlines the evolutionary conservation of basic developmental processes, and the evidence for an involvement in growth control from GWAS in both organisms, together with experimental support from validation in Drosophila, corroborates a true biological function of these genes in the determination of body size.

DISCUSSION

Summary: We have applied a novel approach for studying developmental traits in Drosophila by employing GWAS for wing and body size. Here, we show that we could successfully reduce the influence of environmental confounders on size by raising flies under a strictly controlled environmental regime. GWAS for wing and body size revealed a substantial number of loci associated with these traits, confirming the highly multigenic nature of size traits. Most of these loci mapped to intergenic and regulatory regions, implying that phenotypic variation in size in Drosophila is mainly governed by changes in protein abundance and modulation of regulatory networks, rather than large functional changes in proteins. Surprisingly, the majority of loci that overlapped a significant SNP had not previously been implicated in growth control. We validated 45 novel genes for a role in size determination in the Drosophila wing and found that significant intergenic SNPs were preferentially located in regions with enhancer signature and overlapped lincRNA loci. A SNP 2 kb upstream of the expanded locus was found to lie within a region of evolutionary conservation, indicating the presence of a putatively functional regulatory element. Finally, we showed that a large part of our novel candidates have a human ortholog, many of which have been associated with height and obesity related traits in humans.

Small fluctuations in foodbatch quality cause substantial population-level variation in wing size: Our results show that under perfectly controlled conditions 78% of total wing size variation in our population can be explained by the different genotypes, which is reduced to 63% when we vary one environmental variable, food, slightly. We explain this effect with very subtle variations in the cooking protocol, leading to the evaporation of more water from the broth and consequently affecting the texture of the food, which in turn could affect how efficiently and fast larvae can enter and process the food. Physiologically, this would mirror a situation where there are slight differences in food availability between larvae, perhaps comparable to a slightly above-optimum population density. Flies reared under nutrient restriction (NR) or genetically starved flies show a proportional reduction in all body parts, though there are exceptions where essential organs are less sensitive to NR to ensure survival and fitness of the organism under stress (Cheng et al. 2011, Tang et al. 2011).

Furthermore, different body parts showed different scaling relationships with overall body size depending on environmental variables, though in that study the thorax was found to be affected to a larger extent by nutritional variability than the wing (Shingleton 2009). However, as Shingleton et al. used very different experimental settings, including a different medium, rather severe NR and lines with presumably far less genetic variability than our population, this may explain the discrepancy. If the effect we observe was due to texture-dependent variation in accessibility, the resulting variability in nutrient availability would have very different effects on the organism and the individual organs from more severe starvation. As a larger body generally means more resources for reproduction and survival, it could make sense to first place fewer resources into making wings. In contrast, if nutrients become severely limiting, a reduction in body size is essential as the energetic cost of maintaining a large body would be too high. Mechanistically, this could be explained by different thresholds of insulin signaling sensitivity in different organs, along the lines of prior observations of Shingleton et al. that a mutation in the insulin receptor reduces insulin signaling and has a greater impact on wing size than on genital size (Shingleton et al. 2005) and the reduced sensitivity of genital discs and neurons to NR due to reduced levels of FOXO and bypassing amino acid dependence (Cheng et al. 2011, Tang et al. 2011).

In conclusion, it is astonishing how such a slight variation in the environment can be reflected to such an extent in a phenotype. This raises the question of how comparable or reproducible results from different Drosophila laboratories are, where food nutritional content is markedly different due to the absence of one accepted standard medium. For studies dealing with genetic manipulations that produce robust and large effects this may not be an issue, whereas studies using inbred lines and investigating quantitative traits are likely affected. Generally, slight environmental fluctuations may provide an explanation for the irreproducibility of weak phenotypic effects. As the effect we see on size is already considerable in a situation with perfectly controlled environment, we can only imagine it to be much more pronounced in situations where environmental control is imperfect, such as in GWA studies in humans. Admittedly, outbred human individuals may be less sensitive to fluctuations in the environment than inbred strains, but the differences in quality and quantity of the consumed food are far larger in human populations than the differences in our standardized food. Nutrition affects an organism throughout its lifetime and, apart from influencing overall height (see the pronounced differences in size between humans living only a few kilometers apart due to malnutrition during development (Schwekendiek and Pak 2009, Pak 2010)), also affects weight, the development of obesity, type II diabetes and metabolic syndrome and a variety of other diseases (Riccardi et al. 2004, Alexander et al. 2010, Reynolds et al. 2010, Gerber 2012, Arts et al. 2014). As accounting for diet and physical activity can improve the power of a GWAS to recover novel associations (Igl et al. 2010), dietary confounding does seem to be an issue also in humans. The strong confounding effect we see of only small fluctuations in an environmental variable shows that careful evaluation of potential environmental covariates is absolutely essential for the success of a GWA study to identify truly causal associations.

Many loci with small effects and predominantly regulatory variants underlie wing size variation in Drosophila: For the genetic architecture of height, or body size, humans and domestic animals represent the extremes in terms of the allelic spectrum of involved SNPs. In humans, all common variants in the genome taken together explain only about half of the heritability of the trait, while in dogs, horses and cattle species few loci account for a large proportion of size variation, which is a consequence of specific breeding of these species (Sutter et al. 2007, Yang et al. 2010, Makvandi-Nejad et al. 2012). Not surprisingly, as Drosophila populations in the wild are naturally breeding like humans, their genetic architecture of size seems to be more similar to that of humans. Using three different methods we identified largely distinct sets of significant SNPs or candidate genes associated with size. Generalizing to other phenotypes these observations imply that the number of loci controlling trait variation is likely much higher than those that are detected by single locus GWAS, which is the default method for associating phenotype to genotype. With the most stringent SNP inclusion threshold we identify up to 77 SNPs associated with wing size variation. Taking into account the loci identified with the other two methods and considering that we likely lack power for identifying all causal variants we estimate that at least several hundred loci control natural variation in wing size in Drosophila melanogaster. This is not a surprising finding per se, as classical genetic studies have shown that genes from at least three different pathways, IIS/TOR, Hippo and EGFR control cell growth and proliferation, with a multitude of other genes from pathways involved mainly in tissue polarity, patterning and developmental timing also affecting growth. 
Unexpectedly, we identified only a small number of bona fide growth genes, indicating that major growth pathway components either do not contain SNPs contributing to size variability or that they contain SNPs that are too rare to be genotyped or have too small effects to be detected by GWAS. Apart from this, largely novel loci and a minority of previously known loci were associated with size variability in our population. The most significant female MAC7 SNP for the CS and CSIC GWAS (p = 7.15E-08, p = 6.43E-08) mapped to exon 5 of the gene CG6091, an ortholog of the human de-ubiquitination enzyme OTUD5, which has a role in innate immunity, and the most significant male CS, CSIC and rCS SNP (p = 7.00E-08, p = 8.55E-08, p = 3.74E-07) was located in the intron of CG34370, a gene encoding an LDL repeat containing protein of unknown function, which was recently identified as a candidate in a GWAS for lifespan and lifetime fecundity in Drosophila (Durham et al. 2014). Rather unexpectedly, the most significant rCSF SNP (p = 1.31E-07) was situated in the intron of dsx (doublesex), a gene well characterized for its involvement in sex determination, fecundity and courtship behavior. In the body size GWAS, in contrast, we identified a cluster of SNPs lying 12-13 kb upstream of the gene encoding the negative EGFR pathway regulator Kekkon-1 among the top associations (p = 3.45E-09). Not surprisingly, most of our identified variants are regulatory, as could be expected given that genes affecting growth are often essential and may also impact the fecundity and fitness of the organism. The safer and more common way to introduce variability in size while maintaining function seems to be modulation of protein abundance (this study and Stern and Orgogozo 2008). This finding is in line with those of other GWAS, as for most other phenotypes the majority of associations fall within regulatory regions and the intergenic space. As we show in our results, annotating such SNPs with functional intergenic element signatures and looking for conservation of the sequence provides a way of formulating hypotheses about the functional impact of an intergenic association.

Most identified coding SNPs are synonymous substitutions. These have been shown to be functional in some cases (Hunt et al. 2014), but based on our data we cannot exclude that they tag a rare non-probed SNP in their close proximity rather than being causally associated themselves. Among all candidates there was only one significant nonsense SNP, which mapped to exon 8 of the gene Dhc64C, a member of the dynein heavy chain family, important molecules for cellular transport. This SNP was significantly associated with all wing size traits in both sexes and had comparable effect sizes in all these GWAS. Nonsense and missense SNPs are of particular interest as they allow the formulation of testable hypotheses for the elucidation of the functional impact of the polymorphism. We found missense SNPs in genes coding for the mitochondrial protein Cep89, Pi3K68D, the FERM and PH domain containing protein CG34347, the cell shape regulator Mp20 (Kiger et al. 2002), CG5381, the putative superoxide dismutase CG31028 and in Fbp2, a protein with functions in metabolism. We found largely different genes with the VEGAS method for the same trait, but over all traits there was an overlap of 24%. Both the low overlap between methods and the lack of enrichment could be a consequence of only considering genes above our rather arbitrary cut-off of 20 genes, and it is thus possible that more overlap with the GWAS results and significant enrichment could be detected when including progressively more genes. With the exception of VHA M8.9, knockdown of most of the novel genes resulted in a small change in median wing size (-19.4% to 10.1%), indicating that their role in the growth of this tissue is modulatory rather than causal. This implies that they either act redundantly with other genes or as enhancers or mild suppressors of a signal, instead of being involved in generating and propagating the signal. An example of such a gene that we identified is Pox neuro (Awasaki and Kimura 2001), a nonessential gene which has a role in wing hinge formation. However, as we only performed knockdown in the wing we do not know whether these novel genes are essential for other developmental processes or have larger effects upon overexpression. It is furthermore notable that seven of the eight candidate knockdowns that yielded wing size changes exceeding 10% were in the negative direction, indicating that among our candidates, genes with a normally growth supportive function have a stronger input on size than do genes with a growth inhibitory function.

Through statistical epistasis analysis we could find putative biological interactors for 42 of our candidates. Evidence for the applicability of this approach in identifying at least some true biological interactions comes from the detected interaction between InR and the protein tyrosine phosphatase Lar. Lar has been shown to be able to dephosphorylate InR (Madan et al. 2011), which likely affects InR activity. Polymorphisms at these two loci could thus cancel out each other's effects or act synergistically to increase or decrease InR activity, and thus have variable effects on size. Furthermore, we find an enrichment of interactions between GWAS candidates and epistasis interactors (expected 52, observed 100, p = 4.26E-09). This indicates that though different approaches may yield different top associations, these do form gene interaction networks among each other. Combining different approaches for identifying trait associations may thus help in placing the candidates from both approaches into a biological context. These results have to be interpreted with care, as we expect a substantial proportion of false positives in this list of additional novel genes due to the low power for detecting locus-locus interactions with a population size as small as ours and the rather low nominal significance threshold we set. Nevertheless, some of these interactions may provide the basis for further hypothesis driven investigation towards elucidating the roles and connectivity of newly identified genes and discovering new links between already known genes.

Novel candidates fall into diverse functional classes, overlap candidates from other studies and are associated with height or obesity related traits in humans:

The novel candidate genes fall into diverse functional classes, reflecting the multitude of processes that may converge on growth. Apart from genes with roles in PCP and metabolism discussed below, we identify genes involved in signal propagation, transmembrane transport, transcription and translation (the eIF4H homolog Rbp2), immunity and the hormonal cascade. Signaling components include the classical protein kinase C Pkc53E, which has been implicated in alcohol insensitivity in Drosophila (Chen J et al. 2010) like PKC enzymes in humans, and whose human ortholog has been found to be significantly associated with height (Table S56). Fifteen other genes that we identified as candidates but did not validate were found to enhance or suppress major growth pathways and effectors (Schertel et al. 2013, listed in Table S58), and three candidates, CG10249, CG2269 and CG9743, showed significant association to body weight in Drosophila (Jumbo-Lucioni et al. 2010).

Interestingly, we found several transmembrane ion transporters among our candidates. These mediate cellular responses to light, nerve growth factor, and a wide range of chemical and physical stimuli and metabolic stress (Minke and Cook 2002) and are found mutated in tumors and neurodegenerative disorders. As mediators of extracellular signals and stress these are good candidates for an upstream regulatory role in growth. One of our validated candidates is Mid1 (mildly increased wing size), which has been found to function as a stretch activated Ca2+ channel and plays a role in the polarized growth of mating projections and the response to cold stress and iron toxicity in S. cerevisiae (Iida et al. 1994, Levin and Errede 1995, Peiter et al. 2005). As mechanical tension clearly plays a role in growth control of imaginal discs, this channel could act in translating such signals to intracellular signaling pathways via the second messenger Ca2+. Another novel candidate, the human ortholog of the transmembrane channel Trpm (flies with wing-specific knockdown show a mild increase in wing size) was found to be associated with anthropometric traits during puberty, indicating a role during the postnatal growth phase in humans (Table S56).

The only novel candidate gene we find that may have a function in the hormonal cascade regulating developmental timing is CG14258. Its protein product has putative juvenile hormone binding functionality and is conserved among Drosophila species, but so far no studies have systematically investigated its in vivo function and no human orthologs exist. It has, however, been found to have slightly but non-significantly male biased expression (Vanaphan et al. 2012).

Most of our GWAS candidates have human orthologs, some of which have been associated with height or obesity related traits in humans. As IIS signaling in Drosophila performs the dual function of the human Insulin/IGF system, it is likely that growth associated genes in Drosophila could have effects on either growth or metabolic phenotypes in humans, given they feed into or mediate IIS signaling at some point. Evidence for an involvement in growth control from GWAS in both organisms and experimental support from validation in Drosophila corroborates a true biological function of these genes in the determination of body size. For those genes where no correlation with size or metabolism has been detected in humans, our validation of novel candidates playing a role in growth control in Drosophila can provide a basis for elucidating their function in humans.

Planar cell polarity (PCP) genes and wing size:

Among our candidates are several genes with a role in PCP establishment, the polarity of cells within a plane in an epithelium. PCP plays a role in major developmental processes such as cell migration, convergent extension, neurogenesis, axonal guidance, and kidney morphogenesis. PCP signaling components are evolutionarily conserved and reactivation of PCP has been suggested to drive migration of malignant cells during invasion and metastasis and wound healing (Lawrence and Casal 2013, Matis and Axelrod 2013, Hatakeyama et al. 2014). Polarization of the tissue is further important for determining the orientation of cell division, thereby influencing the final shape of the organ. In the Drosophila wing, division occurs preferentially along the PD axis, leading to elongated wings. Mechanistically, oriented cell division involves motor proteins that can orient the spindle apparatus (Baena-Lopez et al. 2005, Segalen and Bellaiche 2009, Mao et al. 2011). The PCP components we identified in our study comprise Wnt4, Lar, aPKC, Fj, Fz, stan, kermit and the microtubule motor proteins Dhc64C and Khc-73, whose human ortholog is significantly associated with height. We identified Kermit, a downstream target of Fz in PCP establishment (Lin and Katanaev 2013), as a candidate both in the VEGAS method for IODF and as an interactor with EGFR in the pairwise association to CSFIC. As EGFR has been shown to act in a combinatorial manner with Fz signaling in the establishment of PCP in the Drosophila eye (Weber et al. 2008), it is not unlikely that the pairwise interaction we identify is real. Kermit is suggested to control trafficking of the transmembrane protein Van Gogh (Vang) by myosin or microtubule motor proteins. 
A dual role in PCP and growth control has been shown for many already known growth genes such as EGFR, Fat, Dachsous, Four-jointed, Frizzled, Crumbs, aPKC and Lgl, which regulate these two processes via largely distinct downstream cascades but sometimes in a coordinated manner (Povelones et al. 2005, Parsons et al. 2010, Hatakeyama et al. 2014). However, Kermit and motor proteins are more downstream in the cascade important for establishing PCP and thus likely have specialized roles for this process. Nevertheless, this can impact on growth, as proper establishment of polarity is important for growth by providing the orientation of cell division, and loss of the Vang ortholog Vangl2 in zebrafish leads to a reduction in body length (Hatakeyama et al. 2014). Kermit, Khc-73 and Dhc64C could thus affect growth via their role in PCP. However, for Kermit and Dhc64C it remains to be seen whether they show an effect on wing size upon knockdown.

Metabolism and growth control:

We found the mitochondrial protein Cep89 as a novel regulator of wing size in Drosophila. Cep89 is a highly conserved gene with a role in mitochondrial metabolism and is required for neuronal function in Drosophila and humans (van Bon et al. 2013). A missense causing SNP with negative effect size showed association to all wing phenotypes (except rCSM) and localized to the second exon, providing a putative mechanistic cause for the phenotypic effect. Cep89 is required for proper formation of complex IV, a component of the electron transport chain involved in oxidative phosphorylation, and cep89 loss of function leads to decreased complex IV activity and growth defects in humans and flies. In contrast to the size reduction observed by van Bon et al., we observed an increase in wing size. Potential explanations for this discrepancy include the use of different RNAi lines, Gal-4 lines, controls and different rearing temperatures. Given the proposed function of Cep89 and the phenotype manifested in humans, the phenotype observed by van Bon et al. is what would be expected from knockdown, whereas our observed size increase, albeit mild, is hard to explain in the context of the gene's function. Furthermore, flies with this SNP had on average smaller wings, as is evident from the negative effect size. Characterization of the effects of an allelic series on wing size and mitochondrial integrity during development would be the best approach to reconcile these results.

In addition to Cep89 we identify several other candidates with putative roles in metabolism. Among these are the amino acid transporter Eaat1; the mitochondrial transmembrane transporters Shawn and Tyler; the protein Fbp2, which contains a missense SNP, causes a mild increase in wing size upon knockdown and is involved in glycolysis, fatty acid and tyrosine metabolism; CG3011, which localizes to the mitochondrion and may have a role in amino acid metabolism; and Dnc and Gycbeta100B, which are predicted to be involved in purine metabolism. Cht7 has a predicted role in amino sugar metabolism, while CG6084 (mild decrease in wing size upon knockdown) may be involved in carbohydrate and glycerolipid metabolism and CG31028, another gene containing a missense causing SNP, in ROS metabolic processes. Corroborating their role in growth or metabolism, Eaat1 and dnc orthologs are associated with metabolite levels and obesity related traits in humans and the Fbp2 ortholog is associated with pathological overgrowth of the long bones.

For a phenotype such as growth that essentially comes down to how much energy and how many biosynthetic precursors are available for cellular growth, it is likely that metabolism and metabolic coordination have to be taken into account to mechanistically understand variability in size. A very recent publication also illustrates this point by showing a mechanism for coupling metabolism to growth via the growth and PCP regulator Fat (Sing et al. 2014). Fat mutants in Drosophila show reduced mitochondrial complex I activity, analogous to the phenotype observed with cep89 mutants, but interestingly cellular metabolism is switched to largely utilizing aerobic glycolysis for energy generation, which is a hallmark of cancer cells (Gatenby and Gillies 2004). The physiological role of Fat in mitochondria consists of promoting flux through oxidative phosphorylation and thus presents a switch for adjusting metabolism to changing energy and precursor requirements. Knockdown of other mitochondrial components caused PCP defects and affected Hippo pathway activity, indicating that metabolic processes can in turn impact on signaling activity (Baker and Jenny 2014, Sing et al. 2014). This novel perspective on the role of a bona fide growth and polarity gene directly on metabolic control and the evidence for causal effects of mitochondrial proteins on growth control provide exciting new insights into the coordination of growth and metabolism. It will be interesting to see whether other such components can be identified, and the characterization of the role of our above mentioned novel candidates in growth control may provide further insight into how this coordination is achieved.

Conclusions: In summary, genome-wide association studies for developmental traits in Drosophila have identified novel regulators of size and may shed light on interactions between these and known genes during growth control. The identification of loci that are modulated to create variability in size while proper function is maintained provides a first step towards a systems-level understanding of size determination. The high conservation of basic developmental processes allows findings from Drosophila to serve as a basis for hypothesis driven investigation of the physiological function and role in disease of orthologous genes in humans. The finding that we identify largely novel genes underlines the complementarity of the GWAS approach to the classical genetics approach and highlights the necessity to probe natural variants. The newly identified loci provide a novel perspective on processes like PCP and metabolism that have so far been understudied in relation to growth control and allow new links to be formed in existing networks. Furthermore, our results highlight the important role of intergenic noncoding and regulatory elements in creating size variability in a population and encourage more efforts towards the investigation of regulatory rather than functional mutations for understanding how phenotypic variability is achieved.

METHODS

Fly media, stock keeping and fly lines: Fly food was prepared according to the following recipe: 100 g fresh yeast, 55 g cornmeal, 10 g wheat flour, 75 g sugar, 8 g bacto-agar and 1 liter tap water.

Experiments were performed with 149 of the Drosophila Genetic Reference Panel lines. RNAi lines used are listed in Supplementary Table xx

Standardized Culture Conditions: Lines were set up in duplicate vials, with five males and five females per vial. After seven days, the parental flies were removed. From the F1, five males and five females were put together in duplicate vials and discarded after seven days of egg laying. From the F2, thirty males and thirty females were mated in a laying cage with an apple juice agar plate plus a yeast drop as food source and allowed to get accustomed to the cage for 24 hours. Then a fresh plate of apple juice agar plus yeast drop was applied and flies were left to lay eggs for another 24 hours. From this plate, F3 L1 larvae were picked with forceps and distributed into three replicate vials, with 40 larvae per vial. The food surface in the vials was scratched and 100µl of ddH2O added prior to larvae transfer. The adult F3 flies were pooled from the three vials and frozen at -20°C approximately 1-2 days after eclosion. The whole experiment was performed in a dedicated incubator (DR-36VL, CLF Plant Climatics GmbH) with a 12-hour day-night cycle, constant humidity of 65 - 68% and constant temperature of 25.5°C +/- 1°C. Vials were shuffled every two days during the first and second round of mating but left at a fixed position in the incubator for the duration of the development of the F3 generation.

Lines were all set up on the same day on the same foodbatch. For the F1 generation matings, different foodbatches had to be used due to different developmental timing of the lines. F2 matings were set up using the same batch of apple agar plates and yeast for all lines. F3 larvae were distributed on four different foodbatches and batch number was recorded for each line.

The control experiment was performed using the same procedure as above, except that the same foodbatch was used for all flies of a generation.

Phenotyping and morphometric measurements: Depending on the number of flies available, between five and twenty-five flies per sex and line were measured (median 12 flies per sex and line). Flies were positioned on a black apple agar plate and photographed using a VHX-1000 digital light microscope (KEYENCE). Morphometric body traits were measured manually using the VHX-1000 dedicated measurement software. If intact the right and otherwise the left wing was removed and mounted in water on a glass slide for wing image acquisition. Morphometric measurements were extracted from the wing images using WINGMACHINE (Houle et al. 2003) and MATLAB (MATLAB version R2010b Natick, Massachusetts: The MathWorks Inc.).

Centroid size was measured as the square root of the summed squared distances of 14 landmarks from the center of the wing. Wing aspect ratio was defined as the squared wing length divided by the wing area ($WL^2/WA$). Thorax length was measured viewed from above, from the midpoint between the left and right humeral bristles to the posterior tip of the scutellum. Interocular distance was measured from eye edge to eye edge along the anterior edge of the posterior ocelli and parallel to the base of the head. For the experimental generation we distributed a total of 19200 larvae in four batches spaced throughout 1.5 weeks according to developmental timing, and the final dataset consisted of morphometric data of 6978 flies, 3500 females and 3478 males. We measured a total of 22 wing traits and the five body traits described above.
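The two wing measures defined above can be illustrated with a minimal sketch. It is written in Python for illustration only (the thesis extracted these values with WINGMACHINE and MATLAB), it assumes that "the center of the wing" is the mean position of the landmarks, and the function names and toy coordinates are hypothetical.

```python
import math

def centroid_size(landmarks):
    """Square root of the summed squared distances of the landmarks
    from their centroid (assumed here to be the mean landmark position)."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in landmarks))

def wing_aspect_ratio(wing_length, wing_area):
    """Squared wing length divided by wing area (WL^2 / WA)."""
    return wing_length ** 2 / wing_area

# Hypothetical toy landmarks (corners of a 2 x 2 square), not real wing data.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
cs = centroid_size(square)
```

With real data, `landmarks` would hold the 14 (x, y) landmark coordinates of one wing.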

Quantitative Genetic analysis: All analyses were performed in R Studio using the R statistical language version 2.15 (http://www.R-project.org).

Total phenotypic variance was partitioned according to the following formula:

$$Y = \mu + S + L + L \times S + \varepsilon$$

Analysis of variance models of this form were fitted using the aov() function to assess significance of the individual terms in explaining phenotypic variation; that is, whether group means were significantly different from each other given the within-group variability. Group factors assessed were the terms in the model above. Only lines with between 15 and 25 individuals per sex were used, in order to have a more balanced design. This removed a total of 12 lines: 25179, 28138, 28140, 28141, 28155, 28173, 28183, 28196, 28245, 28246, 28255, 29651. All p-values given are unadjusted p-values.

Cross-sex genetic correlation was calculated as

$$r_{MF} = \frac{\sigma^2_{L}}{\sqrt{\sigma^2_{LF}\,\sigma^2_{LM}}}$$

where $\sigma^2_{L}$ = variance due to genotype, $\sigma^2_{LF}$ = variance due to genotype in females and $\sigma^2_{LM}$ = variance due to genotype in males.

Cross-trait genetic correlations were calculated as

$$r_{AB} = \frac{\sigma^2_{G(AB)}}{\sqrt{\sigma^2_{GA}\,\sigma^2_{GB}}}$$

where $\sigma^2_{G(AB)}$ = genetic covariance of traits A and B, $\sigma^2_{GA}$ = genetic variance of trait A and $\sigma^2_{GB}$ = genetic variance of trait B. Further,

$$\sigma^2_{G(AB)} = \frac{\sigma^2_{Y} - (\sigma^2_{GA} + \sigma^2_{GB})}{2} \quad \text{and} \quad Y = A + B.$$
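The correlation formulas above can be checked numerically with a small sketch. It is in Python rather than the R used in the thesis, the function names are hypothetical, and the toy values stand in for genetic variance components that would in practice come from the fitted models; plain population variances of made-up line means are used only to show that the sum-of-traits identity recovers the covariance.

```python
import math
from statistics import pvariance

def genetic_covariance(var_sum, var_a, var_b):
    """Covariance of A and B from the variance of Y = A + B."""
    return (var_sum - (var_a + var_b)) / 2.0

def cross_trait_correlation(cov_ab, var_a, var_b):
    """r_AB = cov(A, B) / sqrt(var(A) * var(B))."""
    return cov_ab / math.sqrt(var_a * var_b)

def cross_sex_correlation(var_l, var_lf, var_lm):
    """r_MF = var_L / sqrt(var_LF * var_LM)."""
    return var_l / math.sqrt(var_lf * var_lm)

# Hypothetical line means for two traits (illustration only).
trait_a = [1.0, 2.0, 3.0, 4.0]
trait_b = [2.0, 1.0, 4.0, 3.0]
trait_y = [a + b for a, b in zip(trait_a, trait_b)]

var_a, var_b = pvariance(trait_a), pvariance(trait_b)
cov_ab = genetic_covariance(pvariance(trait_y), var_a, var_b)
r_ab = cross_trait_correlation(cov_ab, var_a, var_b)
```

The identity works because the variance of a sum equals the two variances plus twice the covariance, so no explicit covariance term has to be estimated.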

Linear mixed models of the form

Y = µ + S + L + LxS + F + ε were fitted using the lmer() function from the package lme4 to estimate the variance components for the individual terms. Reduced models were fitted for each sex separately. We treated sex (S) as a fixed effect and genotype (L), genotype by sex (LxS) and foodbatch (F) as random effects. Relative contributions of the variance components to total phenotypic 2 variance (σ P) was calculated as

$\sigma^2_i / (\sigma^2_L + \sigma^2_{LxS} + \sigma^2_F + \sigma^2_E)$

where $\sigma^2_i$ represents any of $\sigma^2_L$, $\sigma^2_{LxS}$, $\sigma^2_F$, $\sigma^2_E$ and $\sigma^2_P = \sigma^2_L + \sigma^2_{LxS} + \sigma^2_F + \sigma^2_E$.

$\sigma^2_P$ = total phenotypic variance, $\sigma^2_L$ = variance due to genotype, $\sigma^2_{LxS}$ = variance due to genotype by sex interactions, $\sigma^2_F$ = variance due to food and $\sigma^2_E$ = residual variance.

Broad sense heritability for each trait was estimated as

$H^2 = \sigma^2_G / \sigma^2_P = (\sigma^2_L + \sigma^2_{LxS}) / (\sigma^2_L + \sigma^2_{LxS} + \sigma^2_F + \sigma^2_E)$

The ratio of food contribution to line contribution (FG Ratio) was calculated as $\sigma^2_F / \sigma^2_L$.
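Once the four variance components have been estimated, the derived quantities follow by simple arithmetic. The Python sketch below illustrates this step only. It does not fit the mixed model (that was done with lmer() in R), and the function name and input values are hypothetical.

```python
def variance_summary(var_l, var_lxs, var_f, var_e):
    """Relative contributions, broad sense heritability and FG ratio
    from the estimated variance components of one trait."""
    var_p = var_l + var_lxs + var_f + var_e  # total phenotypic variance
    components = {"L": var_l, "LxS": var_lxs, "F": var_f, "E": var_e}
    return {
        "var_p": var_p,
        "proportions": {name: v / var_p for name, v in components.items()},
        "H2": (var_l + var_lxs) / var_p,  # broad sense heritability
        "FG_ratio": var_f / var_l,        # food relative to line contribution
    }

# Hypothetical variance components, for illustration only.
summary = variance_summary(var_l=4.0, var_lxs=1.0, var_f=2.0, var_e=3.0)
```

By construction the four proportions sum to one, which gives a quick sanity check on the inputs.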

Correlations among traits were calculated applying the rcorr() function from the Hmisc package using the Pearson estimate.

Phenotype Treatment: We focused our analysis on two traits: centroid size (CS) as a measure of wing size and interocular distance (IOD) as a proxy for overall body size. Since for CS we saw a substantial influence of the foodbatch on the phenotype (Figure 2D), and two relatively common inversions (In(2L)t and In(3R)Mo) were correlated with variability in IOD and, to a lesser extent, CS, we modeled these covariates using a mixed model. The foodbatch was modeled as a random effect and the rearrangements were coded as (0,1,2) depending on whether both, one or no inversion was present in the homozygous state. Specifically, the models used were:

$CS_{raw} = \alpha + X_1\beta_1 + X_2\beta_2 + Fu + \varepsilon$,

where $X_1$ refers to the sex covariate, $X_2$ refers to the inversion covariate, $\varepsilon \sim N_n(0, \sigma^2_\varepsilon I_n)$ with n being the number of lines, $u \sim N_k(0, \sigma^2_u I_k)$ with k being the number of foodbatches and F an (n,k)-indicator matrix, associating each line to its respective foodbatch. The actual analysis was done on the estimated residual of this model (CS = ε). To construct an additional phenotype (resCS) that had the effect of IOD on CS removed via regression we modeled:

$CS_{raw} = \alpha + IOD + X_1\beta_1 + Fu + \varepsilon$,

where $X_1$ and $Fu$ refer again to the sex effect and the foodbatch effect. We saw no need to model the inversions because the residuals of the above model showed no correlation with the inversion variable. The residuals ε from this regression were then used as relative size phenotypes.
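The idea of removing the effect of IOD on CS via regression can be illustrated with a deliberately simplified Python sketch. It uses an ordinary least-squares regression of CS on IOD and omits the sex covariate and the random foodbatch effect, so it is not the mixed model used in this study. Names and values are hypothetical.

```python
def regression_residuals(y, x):
    """Residuals of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical line means in arbitrary units, for illustration only.
iod = [1.0, 2.0, 3.0, 4.0, 5.0]
cs_raw = [2.1, 3.9, 6.2, 7.8, 10.0]
res_cs = regression_residuals(cs_raw, iod)
```

The residuals sum to zero and are uncorrelated with IOD, which is the property that makes them usable as a size-corrected phenotype.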

Association Analysis: Genotypes for 143 of the 149 lines were obtained from the DGRP Freeze 2. For further analysis, we kept only SNPs that were missing in a maximum of ten lines and occurred in at least four (MAC3), seven (MAC5) or ten (MAC7) lines. GWAS was performed using FaST-LMM (Lippert et al. 2011) for separate sexes, and SNPs with an association p-value < 10E-05 were kept as candidates. Using the FaST-LMM SNP p-values we applied the VEGAS method (Liu et al. 2010) to calculate gene-wise statistics. Gene boundaries were defined using FlyBase annotation, but we also included SNPs lying within 1000 base pairs up- or downstream of these margins. The correlation matrix was calculated from the genotypes themselves.
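The SNP filter described above can be sketched in Python as follows. This is an illustration, not the filtering code used in this study. It assumes that each inbred line carries one of two alleles (coded 0 or 1, with None for a missing call) and that "occurred in at least N lines" refers to the number of lines carrying the minor allele. The thresholds are taken from the text, and all names are hypothetical.

```python
# Minimum number of lines carrying the minor allele, as stated in the text.
MIN_LINES = {"MAC3": 4, "MAC5": 7, "MAC7": 10}

def passes_filter(genotypes, min_minor_lines, max_missing=10):
    """genotypes: one entry per line, 0 or 1 for the two alleles, None if missing."""
    called = [g for g in genotypes if g is not None]
    n_missing = len(genotypes) - len(called)
    n_alt = sum(called)
    minor_count = min(n_alt, len(called) - n_alt)  # lines with the rarer allele
    return n_missing <= max_missing and minor_count >= min_minor_lines

# Hypothetical SNP across 143 lines: 8 minor-allele lines, 130 major, 5 missing.
snp = [1] * 8 + [0] * 130 + [None] * 5
kept = {name: passes_filter(snp, n) for name, n in MIN_LINES.items()}
```

In this example the SNP would be retained under MAC3 and MAC5 but dropped under MAC7.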

GO annotation and interaction enrichment: We used the functional annotation chart function from DAVID (Huang DW et al. 2009a, b) and the functional annotation and protein interaction enrichment tools from STRING (Franceschini et al. 2013).

RNAi validation: SNPs with an association p-value of p<10E-05 lying in a gene region or +/- 1kb from a gene were mapped to that gene. From VEGAS, we chose the top 20 genes from each list as candidates. RNAi lines were ordered only for MAC7 candidates based on immediate availability. For the random lines we ordered RNAi lines for a random set of 30 genes that did not contain a significant SNP within 1kb of their transcribed region; four of these lines died in the laboratory, yielding a total of 26 random candidates tested. The lines used are listed in Table S57. For wing size candidates, validation was performed using the nubbin-Gal4 driver, whereas we used the eyeless driver for validation of IOD candidates. The VDRC line 47097 containing a UAS-RNAi construct against the CG1315 gene served as a negative control for the knockdown. We decided to use this line as reference since its knockdown had so far never shown an effect in any setting and, most importantly, because it was in the same background as most of our tester lines, an essential factor to consider when assessing genes that presumably only have a small effect on size upon knockdown. Prior to the experiment, driver lines were bred under controlled density and virgins collected to set up crosses on the same day for all candidate RNAi lines. The random lines arrived later and were thus set up at a later point. Wings and IOD were phenotyped as described above for the dataset. Change in wing size was tested with a Wilcoxon rank sum test (function wilcox.test() in R) between each line and the control for separate sexes.
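The comparison of a knockdown line with the control can be sketched in Python as follows. This is an illustration only. It uses the large-sample normal approximation to the Wilcoxon rank sum test without continuity or tie correction, so its p-values will differ somewhat from those of R's wilcox.test(), which was used in this study. Function names and wing areas are hypothetical.

```python
import math
from statistics import median

def average_ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(sample, control):
    """Return (U statistic of `sample`, two-sided p-value, normal approximation)."""
    n1, n2 = len(sample), len(control)
    ranks = average_ranks(list(sample) + list(control))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))

def percent_change_in_median(sample, control):
    """Percent change of the sample median relative to the control median."""
    return (median(sample) - median(control)) / median(control) * 100

# Hypothetical wing areas (arbitrary units), for illustration only.
knockdown = [1.10, 1.12, 1.08, 1.11, 1.09]
control = [1.20, 1.22, 1.19, 1.21, 1.23]
u_stat, p_value = rank_sum_test(knockdown, control)
change = percent_change_in_median(knockdown, control)
```

For small samples such as these, an exact test is preferable to the normal approximation shown here.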

Epistatic Analysis: We tested for epistatic interactions between focal SNPs, lying within or up to 1kb around genes previously found to be involved in growth or wing development in Drosophila, and all DGRP SNPs with missingness <11 that were present in at least ten lines. Additionally, the phenotypes were normalized to follow a standard normal distribution. We used FastEpistasis (Schüpbach et al. 2010), calculating interactions for all pairs between the focal SNPs and the set of all SNPs satisfying the above criteria.

Intergenic element enrichment analysis: For the enrichment analysis we determined the number of SNPs from each GWAS candidate list and the overall number of SNPs that were located within modENCODE annotated regulatory elements or lincRNA loci. Enrichment was tested using a hypergeometric test (function phyper() in R). The tracks used were H3K4Me1 in L2, L3 and pupae. We obtained a table of lincRNAs in the Drosophila genome from Young et al. (2012).
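A hypergeometric enrichment test of this kind can be sketched in Python as follows. This is an illustration, not the R code used in this study. It assumes a one-sided test for over-representation, that is, the probability of seeing at least the observed number of candidate SNPs inside annotated elements. Names and counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(x, total, in_elements, drawn):
    """P(X >= x) for a hypergeometric variable X.

    total       : all SNPs tested
    in_elements : SNPs among them that lie within annotated elements
    drawn       : SNPs on the candidate list
    x           : candidate SNPs observed within annotated elements
    """
    upper = min(drawn, in_elements)
    numerator = sum(
        comb(in_elements, i) * comb(total - in_elements, drawn - i)
        for i in range(max(x, 0), upper + 1)
    )
    return numerator / comb(total, drawn)

# Hypothetical counts, for illustration only.
p_enrich = hypergeom_upper_tail(x=2, total=10, in_elements=4, drawn=3)
```

Under this assumption the corresponding R call would be phyper(x - 1, in_elements, total - in_elements, drawn, lower.tail = FALSE).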

BLAST alignment: We downloaded the sequence of the region 10kb upstream of the annotated transcription start site of the expanded locus (2L: 421227..431227) from FlyBase (St Pierre et al. 2014). We additionally downloaded the sequence of the same relative region for all of the twelve Drosophila species (Clark et al. 2007; Supplementary data file 1). We performed multiple sequence alignment using the discontiguous megablast option on NCBI BLAST (Altschul et al. 1990).

Annotation with human orthologs: We combined candidate genes from all phenotypes and searched for orthologs in humans using DIOPT-DIST (Hu Y et al. 2011). We only considered ortholog predictions with a score of at least 4. The score reflects the confidence of ortholog prediction by giving the number of prediction methods that support the match, weighted by functional assessment using GO annotation.

FIGURE LEGENDS

Figure 1. Standardized Drosophila Culture Conditions for the Quantification of Morphometric Traits. A) The protocol extends over three generations and efficiently controls known covariates of size, such as temperature, humidity, day-night-cycle and crowding. Additionally, effects of other environmental covariates, such as intra-vial environment, light intensity and incubator position are randomized. B) Illustration of the three phenotypes assayed in detail: interocular distance (IOD), thorax length (TL) and the 14 landmarks that were used in the calculation of centroid size (CS).

Figure 2. Foodbatch variability accounts for a sizable fraction of phenotypic variation in wing size. Phenotypic variation in wing and body size in the DGRP (A-C). Plots show mean phenotypic values for A) centroid size, B) interocular distance and C) thorax length. Each dot represents the mean phenotype per line of males (black) and corresponding females (red), with error bars denoting one standard deviation. CS = centroid size (A), IOD = interocular distance (B), TL = thorax length (C); definitions of trait values are shown in Figure 1B. Lines are ordered on the x-axis according to male trait value, from lowest to highest, meaning that the order of lines is different for each plot. The order for each phenotype and corresponding trait means are given in Tables S2 – S4. D) The plot shows the proportion of phenotypic variance in the population explained by the genotype (dark grey), genotype by sex interactions (light grey), food (red) and residual environmental factors (green) for each phenotype. Variance estimates were obtained using the lmer() function in R.

Figure 3. GWAS for absolute and relative wing size identifies between xx and 77 significant SNPs across the genome that are preferentially located in intergenic and regulatory regions. A) Manhattan plot of the SNP p-values from the GWAS of relative centroid size in females (MAC7) shows that significant SNPs are distributed over all chromosomes. Negative log10 p-values are plotted against genomic position; the black line denotes the nominal significance threshold of 10E-05. The SNP in the 3’UTR of ILP-8 is marked in green. B) Overlap in the number of SNPs identified by GWAS for different wing traits in females. The overlap is bigger for the two absolute wing size phenotypes and a substantial proportion of SNPs shows association to all traits. C) LD plot showing correlation between associated SNPs in the GWAS for relative centroid size (MAC7). LD can extend over bigger regions, as seen by multiple SNPs being correlated on chromosome 2R, but generally identified SNPs represent individual associations. Blue = no correlation, orange = complete correlation. D) Associated SNPs are predominantly located in the intergenic space and in regulatory regions. Boxes show the distribution of negative log10 p-values of the significant SNPs from the relative centroid size GWAS in females (MAC7) among site classes; the numbers of SNPs belonging to each site class are denoted above the boxes. As a SNP can fall into multiple classes, the sum of SNPs from all site classes is higher than the total number of SNPs.

Figure 4. Associated SNPs overlap 45 functionally diverse novel candidate genes for wing size determination and localize within putative enhancer elements. A) Percent change in median wing area compared to CG1315 RNAi upon wing-specific knockdown of the validated candidate genes in females. Median and 25th and 75th percentile for each line can be found in Table Sxx. B) Representative image for the extreme wing phenotype upon knockdown of VhaM8.9 (VDRC line 105281). C) Functional annotation of the 50 validated candidate genes based on DAVID GO annotation. D) Alignment of the 2kb region immediately upstream of the D. melanogaster ex locus that showed sequence conservation across Drosophila species. The position of the SNP significantly associated with male centroid size is indicated by the blue line. Sequences were retrieved from FlyBase and alignments performed using BLAST multiple sequence alignment.

Figure 5. Whole genome pairwise interactions between focal growth genes and DGRP SNPs for body size. The plot shows the focal genes annotated in black and the interactors in red. Interaction lines are colored according to the chromosome the focal gene is located on. The thin bars within the circle denote the location of insertions.

FIGURES

Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

REFERENCES

Alexander, D. D., P. J. Mink, C. A. Cushing, and B. Sceurman, 2010 A review and meta-analysis of prospective studies of red and processed meat intake and prostate cancer. Nutrition journal 9: 50.

Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, 1990 Basic local alignment search tool. Journal of molecular biology 215: 403–410.

Anholt, R. R. H., and T. F. C. Mackay, 2004 Quantitative genetic analyses of complex behaviours in Drosophila. Nature reviews. Genetics 5: 838–849.

Arts, J., M. L. Fernandez, and I. E. Lofgren, 2014 Coronary Heart Disease Risk Factors in College Students. Advances in Nutrition: An International Review Journal 5: 177–187.

Ashton-Beaucage, D., C. M. Udell, P. Gendron, M. Sahmi, M. Lefrançois et al., 2014 A Functional Screen Reveals an Extensive Layer of Transcriptional and Splicing Control Underlying RAS/MAPK Signaling in Drosophila (H. J. Bellen, Ed.). PLoS biology 12: e1001809.

Aune, D., T. Norat, P. Romundstad, and L. J. Vatten, 2013 Dairy products and the risk of type 2 diabetes: a systematic review and dose-response meta-analysis of cohort studies. American Journal of Clinical Nutrition 98: 1066–1083.

Awasaki, T., and K. Kimura, 2001 Multiple function of poxn gene in larval PNS development and in adult appendage formation of Drosophila. Development genes and evolution 211: 20–29.

Baena-López, L. A., A. Baonza, and A. Garcia-Bellido, 2005 The Orientation of Cell Divisions Determines the Shape of Drosophila Organs. Current Biology 15: 1640–1644.

Baker, N. E., and A. Jenny, 2014 Metabolism and the Other Fat: A Protocadherin in Mitochondria. Cell 158: 1240–1241.

Batty, G. D., M. J. Shipley, D. Gunnell, R. Huxley, M. Kivimaki et al., 2009 Height, wealth, and health: An overview with new data from three longitudinal studies. Economics & Human Biology 7: 137–152.

Bergelson, J., and F. Roux, 2010 Towards identifying genes underlying ecologically relevant traits in Arabidopsis thaliana. Nature reviews. Genetics 11: 867–879.

Bi, P., T. Shan, W. Liu, F. Yue, X. Yang et al., 2014 Inhibition of Notch signaling promotes browning of white adipose tissue and ameliorates obesity. Nature Medicine 1–10.

Birdsall, K., E. Zimmerman, K. Teeter, and G. Gibson, 1999 Genetic variation for the positioning of wing veins in Drosophila melanogaster. Evolution & development 2: 16–24.

Celniker, S. E., L. A. L. Dillon, M. B. Gerstein, K. C. Gunsalus, S. Henikoff et al., 2009 Unlocking the secrets of the genome. Nature 459: 927–930.

Chen, J., Y. Zhang, and P. Shen, 2010 Protein kinase C deficiency-induced alcohol insensitivity and underlying cellular targets in Drosophila. NSC 166: 34–39.

Clark, A. G., M. B. Eisen, D. R. Smith, C. M. Bergman, B. Oliver et al., 2007 Evolution of genes and genomes on the Drosophila phylogeny. Nature 450: 203–218.

Ernst, J., P. Kheradpour, T. S. Mikkelsen, N. Shoresh, L. D. Ward et al., 2011 Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473: 43–49.

Falconer, D. S., and T. F. C. Mackay, 1996 Introduction to Quantitative Genetics (Edition 4). Longmans Green, Harlow, Essex, UK.

Flint, J., and E. Eskin, 2012 Genome-wide association studies in mice. Nature reviews. Genetics 13: 807–817.

Franceschini, A., D. Szklarczyk, S. Frankild, M. Kuhn, M. Simonovic et al., 2012 STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic acids research 41: D808–D815.

García-Gámez, E., B. Gutiérrez-Gil, G. Sahana, J.-P. Sánchez, Y. Bayón et al., 2012 GWA Analysis for Milk Production Traits in Dairy Sheep and Genetic Support for a QTN Influencing Milk Protein Percentage in the LALBA Gene (J. C. Nelson, Ed.). PLoS ONE 7: e47782.

Garoia, F., D. Grifoni, V. Trotta, D. Guerra, M. C. Pezzoli et al., 2005 The tumor suppressor gene fat modulates the EGFR-mediated proliferation control in the imaginal tissues of Drosophila melanogaster. Mechanisms of Development 122: 175–187.

Gatenby, R. A., and R. J. Gillies, 2004 Why do cancers have high aerobic glycolysis? Nature Reviews Cancer 4: 891–899.

Gerber, M., 2012 Omega-3 fatty acids and cancers: a systematic update review of epidemiological studies. British Journal of Nutrition 107: S228–S239.

Grubbs, N., M. Leach, X. Su, T. Petrisko, J. B. Rosario et al., 2013 New Components of Drosophila Leg Development Identified through Genome Wide Association Studies (E. E. Schmidt, Ed.). PLoS ONE 8: e60261.

Hangauer, M. J., I. W. Vaughn, and M. T. McManus, 2013 Pervasive Transcription of the Human Genome Produces Thousands of Previously Unidentified Long Intergenic Noncoding RNAs (J. L. Rinn, Ed.). PLoS Genetics 9: e1003569.

Hatakeyama, J., J. H. Wald, I. Printsev, H. Y. H. Ho, and K. L. Carraway, 2014 Vangl1 and Vangl2: planar cell polarity components with a developing role in cancer. Endocrine Related Cancer 21: R345–R356.

Hirschhorn, J. N., and M. J. Daly, 2005 Genome-wide association studies for common diseases and complex traits. Nature reviews. Genetics 6: 95–108.

Hu, Y., I. Flockhart, A. Vinayagam, C. Bergwitz, B. Berger et al., 2011 An integrative approach to ortholog prediction for disease-focused and other functional studies. BMC bioinformatics 12: 357.

Huang, D. W., B. T. Sherman, and R. A. Lempicki, 2009a Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic acids research 37: 1–13.

Huang, D. W., B. T. Sherman, and R. A. Lempicki, 2009b Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature protocols 4: 44–57.

Huang, W., A. Massouras, Y. Inoue, J. Peiffer, M. Ramia et al., 2014 Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines. Genome Research 24: 1193–1208.

Huang, X., Y. Zhao, X. Wei, C. Li, A. Wang et al., 2011 Genome-wide association study of flowering time and grain yield traits in a worldwide collection of rice germplasm. Nature Publishing Group 44: 32–39.

Hunt, R. C., V. L. Simhadri, M. Iandoli, Z. E. Sauna, and C. Kimchi-Sarfaty, 2014 Exposing synonymous mutations. Trends in genetics : TIG 30: 308–321.

Ibrahim, D. M., B. Biehs, T. B. Kornberg, and A. Klebes, 2013 Microarray Comparison of Anterior and Posterior Drosophila Wing Imaginal Disc Cells Identifies Novel Wing Genes. G3 (Bethesda, Md.) 3: 1353–1362.

Igl, W., Å. Johansson, J. F. Wilson, S. H. Wild, O. Polašek et al., 2010 Modeling of Environmental Effects in Genome-Wide Association Studies Identifies SLC2A2 and HP as Novel Loci Influencing Serum Cholesterol Levels (P. Gasparini, Ed.). PLoS Genetics 6: e1000798.

Iida, H., H. Nakamura, T. Ono, M. S. Okumura, and Y. Anraku, 1994 MID1, a novel Saccharomyces cerevisiae gene encoding a plasma membrane protein, is required for Ca2+ influx and mating. Molecular and Cellular Biology 14: 8259–8271.

Jumbo-Lucioni, P., J. F. Ayroles, M. Chambers, K. W. Jordan, J. Leips et al., 2010 Systems genetics analysis of body weight and energy metabolism traits in Drosophila melanogaster. BMC Genomics 11: 297.

Kiger, A. A., B. Baum, S. Jones, M. R. Jones, A. Coulson et al., 2003 A functional genomic analysis of cell morphology using RNA interference. Journal of Biology 2: 27.

Kilfoil, M. L., P. Lasko, and E. Abouheif, 2009 Stochastic variation: From single cells to superorganisms. HFSP Journal 3: 379–385.

Korte, A., and A. Farlow, 2013 The advantages and limitations of trait analysis with GWAS: a review. Plant Methods 9: 1–1.

Krumsiek, J., C. Marr, T. Schroeder, and F. J. Theis, 2011 Hierarchical Differentiation of Myeloid Progenitors Is Encoded in the Transcription Factor Network (M. Pesce, Ed.). PLoS ONE 6: e22649.

Lango Allen, H., K. Estrada, G. Lettre, S. I. Berndt, M. N. Weedon et al., 2010 Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467: 832–838.

Lawrence, P. A., and J. Casal, 2013 The mechanisms of planar cell polarity, growth and the Hippo pathway: some known unknowns. Developmental Biology 377: 1–8.

Lee, S. H., B. H. Choi, D. Lim, C. Gondro, Y. M. Cho et al., 2013 Genome-Wide Association Study Identifies Major Loci for Carcass Weight on BTA14 in Hanwoo (Korean Cattle) (C. Wade, Ed.). PLoS ONE 8: e74677.

Lettre, G., 2011 Recent progress in the study of the genetics of height. Human Genetics 129: 465–472.

Lettre, G., J. L. Butler, K. G. Ardlie, and J. N. Hirschhorn, 2007 Common genetic variation in eight genes of the GH/IGF1 axis does not contribute to adult height variation. Human Genetics 122: 129–139.

Lettre, G., A. U. Jackson, C. Gieger, F. R. Schumacher, S. I. Berndt et al., 2008 Identification of ten loci associated with height highlights new biological pathways in human growth. Nature Genetics 40: 584–591.

Levin, D. E., and B. Errede, 1995 The proliferation of MAP kinase signaling pathways in yeast. Current Opinion in Cell Biology 7: 197–202.

Lin, C., and V. L. Katanaev, 2013 Kermit Interacts with Gαo, Vang, and Motor Proteins in Drosophila Planar Cell Polarity (E. Moreno, Ed.). PLoS ONE 8: e76885.

Lipka, A. E., M. A. Gore, M. Magallanes-Lundback, A. Mesberg, H. Lin et al., 2013 Genome-wide association study and pathway-level analysis of tocochromanol levels in maize grain. G3 (Bethesda, Md.) 3: 1287–1299.

Lippert, C., J. Listgarten, Y. Liu, C. M. Kadie, R. I. Davidson et al., 2011 FaST linear mixed models for genome-wide association studies. Nature Methods 8: 833–835.

Liu, J. Z., A. F. Mcrae, D. R. Nyholt, S. E. Medland, N. R. Wray et al., 2010 A Versatile Gene-Based Test for Genome-wide Association Studies. The American Journal of Human Genetics 87: 139–145.

Lynch, M., and B. Walsh, 1998 Genetics and Analysis of Quantitative Traits. Sinauer Associates, Inc., Sunderland, MA, USA.

Mackay, T. F. C., 2001 The genetic architecture of quantitative traits. Annu. Rev. Genet. 35: 303-339.

Mackay, T. F. C., and R. R. H. Anholt, 2006 Of flies and man: Drosophila as a model for human complex traits. Annual review of genomics and human genetics 7: 339–367.

Mackay, T. F. C., S. Richards, E. A. Stone, A. Barbadilla, J. F. Ayroles et al., 2012 The Drosophila melanogaster Genetic Reference Panel. Nature 482: 173–178.

Mackay, T. F. C., E. A. Stone, and J. F. Ayroles, 2009 The genetics of quantitative traits: challenges and prospects. Nature Publishing Group 10: 565–577.

Madan, L. L., S. Veeranna, K. Shameer, C. C. S. Reddy, R. Sowdhamini, and B. Gopal, 2011a Modulation of Catalytic Activity in Multi-Domain Protein Tyrosine Phosphatases. PLoS ONE 6: e24766.

Mao, Y., A. L. Tournier, P. A. Bates, J. E. Gale, N. Tapon et al., 2011 Planar polarization of the atypical myosin Dachs orients cell divisions in Drosophila. Genes & Development 25: 131–136.

Matis, M., and J. D. Axelrod, 2013 Regulation of PCP by the Fat signaling pathway. Genes & Development 27: 2207–2220.

Maxa, J., M. Neuditschko, I. Russ, M. Förster, and I. Medugorac, 2012 Genome-wide association mapping of milk production traits in Braunvieh cattle. Journal of Dairy Science 95: 5357–5364.

McCarthy, M. I., G. R. Abecasis, L. R. Cardon, D. B. Goldstein, J. Little et al., 2008 Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature reviews. Genetics 9: 356–369.

Meijón, M., S. B. Satbhai, T. Tsuchimatsu, and W. Busch, 2013 Genome-wide association study using cellular traits identifies a new regulator of root development in Arabidopsis. Nature Publishing Group 46: 77–81.

Minke, B., and B. Cook, 2002 TRP channel proteins and signal transduction. Physiological reviews 82: 429–472.

Minozzi, G., E. L. Nicolazzi, A. Stella, S. Biffani, R. Negrini et al., 2013 Genome Wide Analysis of Fertility and Production Traits in Italian Holstein Cattle (L. Chen, Ed.). PLoS ONE 8: e80219.

Neto-Silva, R. M., B. S. Wells, and L. A. Johnston, 2009 Mechanisms of Growth and Homeostasis in the Drosophila Wing. Annual Review of Cell and Developmental Biology 25: 197–220.

Norry, F. M., J. C. Vilardi, and E. Hasson, 1997 Genetic and phenotypic correlations among size-related traits, and heritability variation between body parts in Drosophila buzzatii. Genetica 101: 131–139.

Oksenberg, J. R., S. E. Baranzini, S. Sawcer, and S. L. Hauser, 2008 The genetics of multiple sclerosis: SNPs to pathways to pathogenesis. Nature reviews. Genetics 9: 516–526.

Özkan, E., R. A. Carrillo, C. L. Eastman, R. Weiszmann, D. Waghray et al., 2013 An Extracellular Interactome of Immunoglobulin and LRR Proteins Reveals Receptor-Ligand Networks. Cell 154: 228–239.

Pak, S., 2010 The growth status of North Korean refugee children and adolescents from 6 to 19 years of age. Economics & Human Biology 8: 385–395.

Parsons, L. M., N. A. Grzeschik, M. Allott, and H. Richardson, 2010 Lgl/aPKC and Crb regulate the Salvador/Warts/Hippo pathway. Fly 4: 288–293.

Peiter, E., M. Fischer, K. Sidaway, S. K. Roberts, and D. Sanders, 2005 The Saccharomyces cerevisiae Ca2+ channel Cch1pMid1p is essential for tolerance to cold stress and iron toxicity. FEBS Letters 579: 5697–5703.

Perola, M., S. Sammalisto, T. Hiekkalinna, N. G. Martin, P. M. Visscher et al., 2007 Combined Genome Scans for Body Stature in 6,602 European Twins: Evidence for Common Caucasian Loci. PLoS Genetics 3: e97.

Povelones, M., 2005 Genetic Evidence That Drosophila frizzled Controls Planar Cell Polarity and Armadillo Signaling by a Common Mechanism. Genetics 171: 1643–1654.

Rakitsch, B., C. Lippert, O. Stegle, and K. Borgwardt, 2013 A Lasso multi-marker mixed model for association mapping with population structure correction. Bioinformatics 29: 206–214.

Reynolds, J. V., C. L. Donohoe, and S. L. Doyle, 2010 Diet, obesity and cancer. Irish Journal of Medical Science 180: 521–527.

Riccardi, G., R. Giacco, and A. A. Rivellese, 2004 Dietary fat, insulin sensitivity and the metabolic syndrome. Clinical Nutrition 23: 447–456.

Roff, D. A., and T. A. Mousseau, 1987 Quantitative genetics and fitness: lessons from Drosophila. Heredity 58 ( Pt 1): 103–118.

Schertel, C., D. Huang, M. Björklund, J. Bischof, D. Yin et al., 2013 Systematic Screening of a Drosophila ORF Library In Vivo Uncovers Wnt/Wg Pathway Components. DEVCEL 25: 207–219.

Schüpbach, T., I. Xenarios, S. Bergmann, and K. Kapur, 2010 FastEpistasis: a high performance computing solution for quantitative trait epistasis. Bioinformatics 26: 1468–1469.

Schwekendiek, D., and S. Pak, 2009 Recent growth of children in the two Koreas: A meta-analysis. Economics & Human Biology 7: 109–112.

Segalen, M., and Y. Bellaïche, 2009 Cell division orientation and planar cell polarity pathways. Seminars in Cell and Developmental Biology 20: 972–977.

Sing, A., Y. Tsatskis, L. Fabian, I. Hester, R. Rosenfeld et al., 2014 The Atypical Cadherin Fat Directly Regulates Mitochondrial Function and Metabolic State. Cell 158: 1293–1308.

St Pierre, S. E., L. Ponting, R. Stefancsik, P. McQuilton, FlyBase Consortium, 2014 FlyBase 102--advanced approaches to interrogating FlyBase. Nucleic acids research 42: D780–8.

Stark, A., M. F. Lin, P. Kheradpour, J. S. Pedersen, L. Parts et al., 2007 Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature 450: 219–232.

Thomas, D., 2010 Gene–environment-wide association studies: emerging approaches. Nature reviews. Genetics 11: 259–272.

van Bon, B. W. M., M. A. W. Oortveld, L. G. Nijtmans, M. Fenckova, B. Nijhof et al., 2013 CEP89 is required for mitochondrial metabolism and neuronal function in man and fly. Human Molecular Genetics 22: 3138–3151.

Vilhjálmsson, B. J., and M. Nordborg, 2013 The nature of confounding in genome-wide association studies. Nature reviews. Genetics 14: 1–2.

Weber, U., C. Pataki, J. Mihaly, and M. Mlodzik, 2008 Combinatorial signaling by the Frizzled/PCP and Egfr pathways during planar cell polarity establishment in the Drosophila eye. Developmental Biology 316: 110–123.

Womack, J. E., H.-J. Jang, and M. O. Lee, 2012 Genomics of complex traits. Annals of the New York Academy of Sciences 1271: 33–36.

Wray, N. R., J. Yang, Ben J Hayes, A. L. Price, M. E. Goddard et al., 2013 Pitfalls of predicting complex traits from SNPs. Nature reviews Genetics 14: 507–515.

Yang, J., T. A. Manolio, L. R. Pasquale, E. Boerwinkle, N. Caporaso et al., 2011 Genome partitioning of genetic variation for complex traits using common SNPs. Nature Genetics 43: 519–525.

Yang, L., F. Meng, D. Ma, W. Xie, and M. Fang, 2012 Bridging Decapentaplegic and Wingless signaling in Drosophila wings through repression of naked cuticle by Brinker. Development (Cambridge, England) 140: 413–422.

Young, R. S., A. C. Marques, C. Tibbit, W. Haerty, A. R. Bassett et al., 2012 Identification and Properties of 1,119 Candidate LincRNA Loci in the Drosophila melanogaster Genome. Genome biology and evolution 4: 427

ACKNOWLEDGEMENTS We thank Anna Troller, Benjamin Schlager and Anni Strässle for the great support during experiments. We also thank Katja Köhler and Benjamin Schlager for fruitful discussions and critical reading of the manuscript. This work was funded by grant SXRTX0-123851 from SystemsX.ch, the Swiss National Science Foundation grant 31003AB_135699 and financial support from ETH Zurich to EH.

Supplementary Information

Sibylle Chantal Vonesch1, David Lamparter2, Sven Bergmann2, Ernst Hafen1

1 Institute of Molecular Systems Biology, ETH Zürich, CH-8093 Zürich
2 Department of Medical Genetics, University of Lausanne, CH-1005 Lausanne

All genomic positions in Tables with variants refer to the annotation in the DGRP PopDrowser (http://popdrowser.uab.cat/gb2/gbrowse/dgrp/)

Supplementary files:

Supplementary data file 1. The genomic regions used in the BLAST multiple sequence alignment

Supplementary tables: Table S1. Variance Partitioning of Centroid Size and Interocular Distance in the Control Experiment. Relative contribution of the individual variance components to each phenotype is shown. CS = centroid size, IOD = interocular distance.

Table S2. Phenotypic Variation in morphometric traits in Drosophila melanogaster. Smallest and largest trait values and the percent difference are given per sex for each phenotype. Population means (Mean) and standard deviations (STD) are used to calculate the coefficient of variation (CV) for each trait and sex.

Tables S3-S5. Line ordering according to mean in males, from lowest to highest, for centroid size (CS), interocular distance (IOD) and thorax length (TL).

Table S6. Significance of ANOVA model terms. P-values for the significance of each individual grouping factor in explaining variance between groups are given for all four phenotypes. Models were fitted using the aov() function in R (Methods). Only lines with between 15 and 25 individuals per sex and line were used for a more balanced design. P-values are unadjusted.

Table S7. Genetic variance for all traits in males and females and cross-sex genetic correlation. Genetic variances were determined from model estimates using the lmer() function in R (Methods). rMF = cross-sex genetic correlation, VGF = genetic variance females, VGM = genetic variance males.

Table S8. Variance Partitioning of the Phenotypes. Relative contributions of the individual variance components to each phenotype are shown. CS = centroid size, IOD = interocular distance, TL = thorax length. H2 = broad-sense heritability, calculated as described in the methods section.

Table S9. Genetic correlation between traits.

Table S10. Significant SNPs from GWAS to centroid size in females (CSF) MAC3

Table S11. Significant SNPs from GWAS to centroid size in females (CSF) MAC5

Table S12. Significant SNPs from GWAS to centroid size in females (CSF) MAC7

Table S13. Significant SNPs from GWAS to inversion modeled centroid size in females (CSFIC) MAC3

Table S14. Significant SNPs from GWAS to inversion modeled centroid size in females (CSFIC) MAC5

Table S15. Significant SNPs from GWAS to inversion modeled centroid size in females (CSFIC) MAC7

Table S16. Significant SNPs from GWAS to centroid size in males (CSM) MAC3

Table S17. Significant SNPs from GWAS to centroid size in males (CSM) MAC5

Table S18. Significant SNPs from GWAS to centroid size in males (CSM) MAC7

Table S19. Significant SNPs from GWAS to inversion modeled centroid size in males (CSMIC) MAC3

Table S20. Significant SNPs from GWAS to inversion modeled centroid size in males (CSMIC) MAC5

Table S21. Significant SNPs from GWAS to inversion modeled centroid size in males (CSMIC) MAC7

Table S22. Significant SNPs from GWAS to interocular distance in females (IODF) MAC3

Table S23. Significant SNPs from GWAS to interocular distance in females (IODF) MAC5

Table S24. Significant SNPs from GWAS to interocular distance in females (IODF) MAC7

Table S25. Significant SNPs from GWAS to inversion modeled interocular distance in females (IODFIC) MAC3

Table S26. Significant SNPs from GWAS to inversion modeled interocular distance in females (IODFIC) MAC5

Table S27. Significant SNPs from GWAS to inversion modeled interocular distance in females (IODFIC) MAC7

Table S28. Significant SNPs from GWAS to interocular distance in males (IODM) MAC3

Table S29. Significant SNPs from GWAS to interocular distance in males (IODM) MAC5

Table S30. Significant SNPs from GWAS to interocular distance in males (IODM) MAC7

Table S31. Significant SNPs from GWAS to inversion modeled interocular distance in males (IODMIC) MAC3

Table S32. Significant SNPs from GWAS to inversion modeled interocular distance in males (IODMIC) MAC5

Table S33. Significant SNPs from GWAS to inversion modeled interocular distance in males (IODMIC) MAC7

Table S34. Significant SNPs from GWAS to relative centroid size in females (rCSF) MAC3

Table S35. Significant SNPs from GWAS to relative centroid size in females (rCSF) MAC5

Table S36. Significant SNPs from GWAS to relative centroid size in females (rCSF) MAC7

Table S37. Significant SNPs from GWAS to relative centroid size in males (rCSM) MAC3

Table S38. Significant SNPs from GWAS to relative centroid size in males (rCSM) MAC5

Table S39. Significant SNPs from GWAS to relative centroid size in males (rCSM) MAC7

Table S40. Number of significant SNPs identified in each of the GWAS and percent overlap between sexes

Table S41. Percent overlap in identified SNPs between MAC7 GWAS for different traits

Table S42. Number of SNPs falling into distinct genomic regions for each trait

Table S43. Genes previously known for a role in growth or wing development identified in the GWAS studies

Table S44. Top 20 listed genes for each trait identified by the VEGAS method. The last column indicates which genes were validated by RNAi and shows the corresponding significance of the wing size change upon knockdown.

Table S45. Validation results wing size candidates females. Crosses in the columns pigmentation, bristles and veins indicate a slightly abnormal phenotype for the corresponding feature. N = number of individuals tested, Q25 and Q75 = first and third quartile.

Table S46. Validation results wing size candidates males. Crosses in the columns pigmentation, bristles and veins indicate a slightly abnormal phenotype for the corresponding feature. N = number of individuals tested, Q25 and Q75 = first and third quartile.

Table S47. Overview of the validation results. Number of SNPs that were tested and the number and percentage that were validated for each trait for two different Wilcoxon-test p-value thresholds.

Table S48. Validation results random lines

Table S49. Validation results IOD candidates

Table S50. Genes previously implicated in growth regulation or wing development that were used as focal genes for epistasis

Table S51. Genes found as candidates by both GWAS and epistasis

Table S52. Enrichment of interactions between pairwise epistasis lists

Table S53. Enrichment analysis of enhancer signature

Table S54. Candidate SNPs lying within lincRNAs. LincRNA loci that overlap significant SNPs and their expression during different developmental stages is shown. Data provided by Young et al.

Table S55. Genomic location of expanded homologs in 12 Drosophila species. Name of the homolog, its genomic location and orientation are shown. We only used species that had the gene in the same orientation as D. melanogaster (grey).

Table S56. Human orthologs of putative and validated Drosophila growth regulators and their association to human complex traits.

Table S57. Line numbers of RNAi stocks used for validation

Table S58. Overlap between candidates found in our study and in the study by Schertel et al. Pathway names indicate in which pathway these genes showed suppression or enhancement of effects

4.2 The FlyCatwalk: A high throughput feature-based sorting system for artificial selection

Vasco Medici1, 2, 3, *, Sibylle Chantal Vonesch1, *, Steven N. Fry2, Ernst Hafen1

1 Institute of Molecular Systems Biology (IMSB), ETH Zurich, Auguste-Piccard-Hof 1, CH-8093
2 SciTrackSs GmbH, Pfaffhausen, Switzerland
3 Institute for Applied Sustainability to the Built Environment, University of Applied Sciences and Arts of Southern Switzerland, Campus Trevano, CH-6952 Canobbio
* These authors contributed equally

ABSTRACT

Experimental evolution is a powerful tool to investigate complex traits. Artificial selection can be applied for a specific trait and the resulting phenotypically divergent populations pool-sequenced to identify alleles that occur at substantially different frequencies in the extreme populations. In order to maximize the proportion of loci that are causal to the phenotype among all enriched loci, population size and number of replicates need to be high. These requirements have in fact limited evolution studies in higher organisms, where the time investment required for phenotyping is often prohibitive for large-scale studies. Animal size is a highly multigenic trait that remains poorly understood and an experimental evolution approach may thus aid in gaining new insights into the genetic basis of this trait. To this end, we developed the FlyCatwalk, a fully automated, high throughput system to sort live fruit flies (Drosophila melanogaster) based on morphometric traits. With the FlyCatwalk, we can determine sex and quantify body and wing morphology parameters at a four-fold higher throughput compared to manual processing. The phenotyping results acquired using the FlyCatwalk correlate well with those obtained using the standard manual procedure. We demonstrate that an automated, high throughput feature-based sorting system is able to avoid previous limitations in population size and replicate numbers. Our approach can likewise be applied for a variety of traits and experimental settings that require high throughput phenotyping.

INTRODUCTION

Many traits in natural populations are quantitative; they show roughly Gaussian distributed phenotypic values and are controlled by genetic variation at multiple loci across the genome (Falconer and Mackay 1996, Lynch and Walsh 1998). Our understanding of the causal relationships between genetic variation and quantitative phenotypic variation remains incomplete for most traits, in part due to their highly multigenic nature. The traditional “organism minus one gene” approach has proven to be very powerful for gaining a mechanistic understanding of the functions of individual gene products and their effects on a phenotype. The single gene approach is, however, limited because it ignores environmentally or genetically context-dependent effects of genes and is biased towards large effects; small effects are often overlooked or hard to reproduce. Given that most quantitative traits are influenced by a large number of often highly context-dependent loci, each of which affects the phenotype only slightly, a more global approach is needed to obtain a more complete understanding of the genetic architecture of quantitative traits (Falconer and Mackay 1996, Lynch and Walsh 1998, Mackay et al. 2009, Huang et al. 2012). Experimental evolution represents an example of such an approach (reviewed in Zeyl 2006, Bennett and Hughes 2009, Burke and Rose 2009, Kawecki et al. 2012). While laboratory evolution experiments have been widely applied to study the mechanisms underlying the evolution of traits and the adaptation of populations to new environments, they can also be powerful tools to elucidate the genetic networks underlying quantitative trait variation. Applying artificial selection to laboratory populations generates highly divergent extreme populations for the phenotype of interest. The selected populations can be pool-sequenced to identify alleles that are present at significantly different frequencies in the divergent populations.
A difference in the frequency of an allele between the extreme populations hints at a role of the locus in controlling trait variation. Ideally, the artificial selection process affects only loci that are causally involved in the expression of the trait in question. In reality, however, the majority of enriched SNPs are false positives in the sense that neutrally evolving loci can get enriched merely by chance through random processes such as genetic drift. A powerful experimental design should aim at maximizing the proportion of causal loci versus random noise in the selected populations. Population size and the number of replicates both correlate positively with the proportion of causal (true positive) loci relative to noise (Kofler and Schlötterer 2014). The upper limit on both factors is given by the time and manpower available for sampling and phenotyping the population. Consequently, large population sizes and many replicates can only be achieved in evolution studies with simple maintenance and rapid phenotyping. In less complex organisms like bacteria or yeast, propagation is straightforward and selection and phenotyping are often partly automated, enabling a high throughput and very large population sizes (Wang et al. 2014, Jang et al. 2014, Nicoloff et al. 2007, Wiser et al. 2013). In higher organisms, however, phenotyping is more time-consuming and has so far been the major limiting factor for experimental throughput. Thus, partial or complete automation of the phenotyping step would open up new experimental possibilities for higher organisms, such as Drosophila melanogaster, which represents a powerful model organism for experimental evolution (Burke and Rose 2009). Drosophila has a short generation time, can be easily maintained in the laboratory and is supported by public genomic resources. Additionally, Drosophila has certain attractive genomic features for artificial selection: a manageable genome size, rapidly decaying linkage disequilibrium and high intra-specific genetic variation (Mackay et al. 2012). Artificial selection has been successfully applied to several complex behaviors of Drosophila that are amenable to relatively high-throughput selection by partially automating the phenotyping process (Dierick and Greenspan 2006, Edwards et al. 2006, Mackay et al. 2005). Morozova et al. used an “inebriometer” to quantify alcohol tolerance in a selection experiment for Drosophila alcohol sensitivity and effectively identified genes underlying alcohol tolerance (Morozova et al. 2007). Hirsch developed a maze for easy phenotyping of Drosophila photo- and gravitaxis, which has been applied in selection experiments for positive and negative geo- and phototaxis (Hirsch 1959, Hadler 1964, Hirsch and Erlenmeyer-Kimling 1962). In contrast, large-scale studies on size variation in Drosophila populations have so far been limited by the prohibitive time investment for phenotyping.
There is extensive knowledge about the evolutionarily conserved signaling pathways that control developmental processes from a wealth of single-gene analyses over the last 30 years (Johnston and Gallant 2002, Mirth and Riddiford 2007, Shingleton 2010, Oldham et al. 2000). These identified two pathways as the main underlying regulators of size, the Insulin/TOR and the Hippo pathways (Oldham and Hafen 2003, Tumaneng et al. 2012, Pan 2007). However, the complete picture is still missing. As size is clearly a highly multigenic trait (Lango-Allen et al. 2010, Gockel et al. 2002), evolve and re-sequencing approaches hold promise for revealing further insights into the genetic architecture underlying this trait. Unfortunately, manual measurements have so far remained the gold standard for morphometric quantification (Partridge et al. 1999, Trotta et al. 2007). As a consequence, most published selection studies use population sizes of around 10-50 individuals per generation, which is clearly in the lower range and leads to underpowered experimental designs (Kofler and Schlötterer 2014). Houle et al. developed a phenotyping device for Drosophila wings (Houle et al. 2003) that enables live fly wing quantification with significant improvement in speed compared to current techniques but still involves considerable manual labor. Furthermore, it is restricted to wings; other morphometric traits like thorax or head size need to be quantified separately. Another study made use of a sieving apparatus, which enables screening 1800 flies simultaneously for size (Turner et al. 2011). Though fast and ingenious, this approach is inaccurate, as flies are randomly oriented when passing through the sieve and outstretched legs or wings may prevent an otherwise small fly from passing. Likewise, it does not take into account the size of individual body parts. A fly could end up in the big population due to a big thorax, abdomen or big wings, which clearly makes a difference for subsequent analysis. We posit that an automated phenotyping and selection system for Drosophila size traits would be highly beneficial, as it would enable a more powerful experimental design of artificial selection studies for morphometric traits by decreasing the phenotyping effort. An automated morphometric phenotyping system should increase speed while maintaining a level of accuracy comparable to the current gold standard. Applying selection after phenotyping requires single individual quantification and storage, as population statistics are necessary to determine which individuals are selected. We developed a system for the rapid phenotyping of Drosophila morphometric traits that meets these requirements: the FlyCatwalk. With a throughput of one fly every ~40 s, our system is able to quantify several morphometric features simultaneously. For a set of flies measured both manually and with the FlyCatwalk we achieve a very good correlation of measurements, thus demonstrating the high accuracy of the automated system. Furthermore, the FlyCatwalk is able to distinguish between males and females and allows storing flies individually until the morphometric analysis is complete.
To be able to select specific flies from the measured population, we additionally implemented an automated sorting mechanism. In summary, we present an automated phenotyping system for Drosophila morphometric traits that makes artificial selection experiments, which have so far been extremely time-consuming, feasible by increasing experimental throughput approximately 4-fold while preserving a data quality comparable to standard manual measurements.

RESULTS

The FlyCatwalk: high-throughput automated phenotyping of morphometric traits

We developed the FlyCatwalk (Figure 1, Video S1), a system that enables increased phenotyping throughput while maintaining a measurement accuracy comparable to the current gold standard and additionally provides a sorting function for selection experiments. In the FlyCatwalk, flies are singled out, imaged while walking through a measurement chamber and subsequently collected individually in different wells of a storage device. With the FlyCatwalk we can measure multiple morphometric traits simultaneously: wing area, wing length and wing width, the dimensions of the thorax, abdomen and head and two measures that we chose as proxies for body size: interocular distance and shoulder width. Other measures can easily be implemented based on the body segment dimensions. The FlyCatwalk consists of three modules: the entrance chamber, the measurement tunnel and the storage device, which are interconnected with plastic tubes for transferring flies between the modules using a pneumatic system (Figure 1A, B). Cold-anesthetized flies are loaded approximately 60 at a time into the entrance chamber, where timed air pulses prompt them to enter the measurement tunnel. A gating mechanism ensures that flies are singled out at the entrance. Once in the measurement chamber, each fly is imaged by a high-resolution camera at 20 frames per second while walking vertically through the tunnel. We apply quality filtering criteria to evaluate whether an image is valid, the most important of which requires that the fly be oriented with its wings facing the camera. A fly is kept and transferred to the storage device when at least three images are valid, the minimum required for subsequent in-depth morphometric analysis. Otherwise the fly is blown back to the entrance chamber. This cycle is repeated until all storage wells are full. The user occasionally needs to reload flies but otherwise is free to leave, as the FlyCatwalk is fully self-operating. Flies do not need to be sorted by sex prior to phenotyping, as the analysis software is able to distinguish between males and females. We store the flies individually until the in-depth morphometric analysis is complete. Individual storage has the advantage that specific flies from the population can be identified and further used for experiments or breeding. For a selection experiment it is, for instance, necessary to know the phenotypes of all individuals in a population before the largest 20% can be determined.
Using the pneumatic system of the FlyCatwalk, these can be specifically sorted from the storage device to form the parents of the next generation. In summary, the FlyCatwalk enables live phenotyping of multiple traits simultaneously with minimal user intervention and offers additional functionalities that may be useful for selection studies.
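The selection step described above, in which the largest 20% of a fully phenotyped population are picked from their storage wells, can be illustrated with a minimal sketch. The function name, the well labels and the choice to select within each sex separately are assumptions made for illustration only; they do not describe the FlyCatwalk software itself.

```python
def select_top_fraction(measurements, fraction=0.2):
    """Return the wells holding the largest `fraction` of flies, per sex.

    `measurements` maps a (hypothetical) well label to a (sex, size) tuple.
    Selecting within each sex is an assumption of this sketch.
    """
    by_sex = {}
    for well, (sex, size) in measurements.items():
        by_sex.setdefault(sex, []).append((size, well))

    selected = {}
    for sex, flies in by_sex.items():
        # Keep at least one fly so that small test populations still work.
        n_keep = max(1, round(fraction * len(flies)))
        flies.sort(reverse=True)  # largest size first
        selected[sex] = [well for _, well in flies[:n_keep]]
    return selected


# Toy population: five females and five males with made-up size values.
population = {
    "A1": ("F", 1.10), "A2": ("F", 1.25), "A3": ("F", 1.05),
    "A4": ("F", 1.18), "A5": ("F", 1.21),
    "B1": ("M", 0.95), "B2": ("M", 1.02), "B3": ("M", 0.99),
    "B4": ("M", 0.90), "B5": ("M", 1.00),
}
parents = select_top_fraction(population, fraction=0.2)
```

Because the ranking requires the complete set of measurements, such a step can only run once every fly in the storage device has been analyzed, which is why individual storage is needed.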

Experimental throughput

To determine the realized increase in phenotyping throughput of the automated system, we compared the per day phenotyping performance of our system with that of a standard method, i.e. manual quantification. With the FlyCatwalk we currently achieve a throughput of approximately one fly every 40 s, which yields a total throughput of 720 flies per eight-hour day. Based on experience, this number varies depending on the sex of the flies and the time of day. Throughput is higher for males, which could be a consequence of higher intra-sex aggression, and in the morning and evening compared to the rest of the day, which reflects the natural activity pattern of Drosophila (Allada and Chung 2010, Klarsfeld et al. 2003). Nevertheless, we routinely achieve a throughput of more than 500 mixed-sex flies per day. Using our optimized manual phenotyping procedure, which involves one person positioning and a second person photographing the flies, we can currently process 250 flies per day, but need an additional day for acquiring all morphometric measurements. The FlyCatwalk therefore offers an approximately 4-fold increase in phenotyping throughput compared to manual processing while reducing the manpower required to one person. Beyond the phenotyping functionality, the FlyCatwalk is able to sort flies from the measured population based on user-defined selection criteria, a process that is very cumbersome to do manually, and thus additionally reduces the time and manpower investment for selection experiments.
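The throughput figures quoted above can be checked with a few lines of arithmetic. The sketch assumes that the manual procedure effectively processes 250 flies per two working days (one day of imaging plus the additional day of measurement); this reading of the comparison is an assumption made here for illustration.

```python
# Back-of-the-envelope check of the throughput figures quoted in the text.
SECONDS_PER_FLY = 40   # approximate FlyCatwalk rate (from the text)
HOURS_PER_DAY = 8      # eight-hour working day (from the text)


def flies_per_day(seconds_per_fly, hours_per_day=HOURS_PER_DAY):
    """Number of flies processed in one working day at a constant rate."""
    return hours_per_day * 3600 // seconds_per_fly


max_automated = flies_per_day(SECONDS_PER_FLY)  # upper bound at 40 s per fly
routine_automated = 500                         # routinely achieved, mixed sexes

# Assumption: 250 flies are imaged manually on day one and measured on day
# two, so the effective manual rate is 250 flies per two days.
manual_effective = 250 / 2

fold_increase = routine_automated / manual_effective
print(max_automated, manual_effective, fold_increase)
```

Under this assumption the routine throughput of 500 flies per day corresponds to the approximately 4-fold increase stated above, before the reduction from two operators to one is taken into account.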

Morphometric analysis

We describe the morphometric analysis in detail in the Methods part and in Figures S1-S4. Briefly, the data analysis software extracts morphometric measurements for body segments and wings and simultaneously determines the sex. The body is first segmented into head, thorax and abdomen and the dimensions of the individual segments are determined (Figure S1). The software calculates the interocular distance based on the head model, the ocelli and the eye edges (Figure S3), thus using the same landmarks as in manual quantification. Shoulder width, in contrast, is extracted indirectly from the scaling parameter applied to the thorax template by the template-matching algorithm and not by detecting the humeral bristles, which are the landmarks used for manual phenotyping. A template consisting of the wing outline and veins L2 to L5 is fitted to the wings, whereupon wing length, wing width and wing area are calculated (Figure S4). Even though the system does not in principle require it, a visual verification of the data is always desirable. It not only gives the user control over the accuracy of the fitting but also allows discarding flies with damaged wings. For this reason we provide a Matlab graphical user interface (GUI) that allows visual verification and manual adjustment of the analysis.

Method validation

Increasing throughput through automation often entails a decrease in measurement accuracy. We therefore evaluated the accuracy of the automatic image processing compared to manual handling, which represents the current gold standard for morphometric phenotyping. We chose five test phenotypes that may be of interest for morphometric studies and that derive from different body parts. Wing length (WL), wing width (WW) and wing area (WA) represented the wing phenotypes, while for the head we selected inter-ocular distance (IOD) and for the thorax shoulder width (SW). We observed high correlation between manual and automated measurements for all wing traits (RWL=0.98, RWW=0.95, RWA=0.99) and slightly lower correlation for the two body traits (RIOD=0.83, RSW=0.93) (Figure 2A-E), which may reflect the difficulty of quantifying 3D objects accurately from 2D projections. To estimate the relative error of the automated method, expressed in percent of the manually measured data, we applied bi-square robust linear regression and analyzed the residuals (Figure 3, black boxes). This analysis shows that, with the exception of a few outliers, the errors for all wing traits do not exceed 5%, whereas the error distributions are wider for the body traits (WL: 0.06 ± 1.56%, WW: 0.09 ± 2.39%, WA: 0.06 ± 2.05%, IOD: -0.35 ± 4.83%, SW: -0.15 ± 2.74%). To determine whether these errors arose due to inaccuracies of the automatic processing, we ran the same analysis after the results of the automated fitting had been user-corrected with the verification GUI. Adjustment resulted in a moderate improvement of the correlation coefficients for wing width and interocular distance (RWL=0.98, RWW=0.98, RWA=0.99, RIOD=0.87) (Figure 2F-I) and, for the wing traits, in slightly tighter error distributions (WL: -0.04 ± 1.46%, WW: -0.09 ± 1.61%, WA: 0.02 ± 1.53%) (Figure 3, red boxes). Though the width of the error distribution of IOD did not change considerably (IOD: 0.16 ± 4.04%), we observed a shift of the median towards 0, indicating a moderately beneficial effect of adjustment. Since the current version of the verification software does not include SW correction we did not correct this trait manually. In general, user correction of the fitting on the acquired images did not substantially improve the correlation coefficients, suggesting that the inconsistencies between the manual and automatic measurements do not arise from the image processing and template fitting algorithms, but instead could be a consequence of differences in flies’ body and wing posture between the two methods.
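The error estimate described above, a bi-square robust linear regression of automated on manual measurements followed by an analysis of the residuals in percent of the manual value, can be sketched as follows. This is a simplified, pure-Python illustration using iteratively reweighted least squares with Tukey's bisquare weights; the tuning constant 4.685, the scale estimate from the median absolute residual and all function names are assumptions of this sketch and not the analysis code used for Figure 3.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def bisquare_fit(x, y, c=4.685, n_iter=50, tol=1e-10):
    """Robust line fit by iteratively reweighted least squares."""
    intercept, slope = weighted_line_fit(x, y, [1.0] * len(x))  # OLS start
    for _ in range(n_iter):
        res = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        scale = median(abs(r) for r in res) / 0.6745  # robust sigma estimate
        if scale < 1e-12:  # essentially a perfect fit, nothing to reweight
            break
        cut = c * scale
        # Tukey bisquare weights: points beyond the cut-off get weight 0.
        w = [(1 - (r / cut) ** 2) ** 2 if abs(r) < cut else 0.0 for r in res]
        new_intercept, new_slope = weighted_line_fit(x, y, w)
        done = (abs(new_slope - slope) < tol
                and abs(new_intercept - intercept) < tol)
        intercept, slope = new_intercept, new_slope
        if done:
            break
    return intercept, slope


def relative_errors_percent(manual, automated):
    """Residuals of the robust fit, in percent of the manual measurement."""
    intercept, slope = bisquare_fit(manual, automated)
    return [100.0 * (a - (intercept + slope * m)) / m
            for m, a in zip(manual, automated)]


# Toy data: automated = 2 * manual + 1, with one gross outlier at index 4.
manual = [float(v) for v in range(1, 11)]
automated = [2 * m + 1 for m in manual]
automated[4] = 100.0
errors = relative_errors_percent(manual, automated)
```

In the toy data the outlier receives zero weight after the first reweighting step, so the fitted line is determined by the remaining points and the outlier shows up as a single large relative error instead of distorting the errors of all other flies.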

DISCUSSION

We present an automated system for the rapid phenotyping of Drosophila morphometric traits that makes artificial selection experiments, which have so far been extremely time-consuming, feasible while preserving a data quality comparable to standard manual methods.

Novelty and relevance

A key factor for the design of powerful selection experiments is a large population size, which has so far been limited to only tens of individuals in Drosophila morphometrics studies due to the highly time-demanding phenotyping process. By automating the phenotyping using the FlyCatwalk we can greatly decrease the per-fly time investment and thus achieve population sizes that far exceed the past standard. The application of the FlyCatwalk is not restricted to selection experiments; instead we consider it a valuable tool in all studies requiring intermediate to high-throughput phenotyping or sorting of Drosophila melanogaster and other species based on morphometric features.

To our knowledge, this is the first fully automated large-scale method for the analysis of size traits in Drosophila. The FlyCatwalk is able to quantify several aspects of Drosophila adult morphology simultaneously with high accuracy, and the spectrum of phenotypes can even be extended. With the FlyCatwalk we routinely phenotype 500 flies per day, though the maximum throughput could be as high as ~700 flies/day. We thus estimate that our system provides at least a 4-fold increase in phenotyping throughput compared to manual processing while reducing the manpower required to one person. A further major advantage of the FlyCatwalk is the possibility of measuring flies alive. Live-fly phenotyping greatly simplifies selection protocols by enabling phenotypic analysis before breeding. In selection experiments with manual phenotyping, which is simpler on dead flies, many pairs of males and females have to be mated prior to phenotypic analysis, but only the progeny of the phenotypically most extreme pairs will be kept to form the next generation. This strategy requires more time, resources and space, since many crossing vials need to be prepared and stored, and it may therefore limit the number of flies that can be analyzed for each generation and effectively limit the population size. There are examples of manual live-fly morphometric phenotyping but this is often either inaccurate or cumbersome (Houle et al. 2003, Turner et al. 2011). Automated live-fly phenotyping may also be desirable in other experiments, such as in studies evaluating the influence of fly size on behavioral traits like mating or flying ability.
In summary, we believe that our system represents a valuable tool for various studies addressing growth-related questions in Drosophila melanogaster and especially opens up new possibilities for far more powerfully designed experimental evolution studies of morphometric traits, which have the potential to contribute to a more complete understanding of animal growth.

Are the requirements for an automated measurement system fulfilled?

Speed

Phenotyping with the FlyCatwalk is on average at least four times faster than manual processing. The maximum throughput depends largely on the walking behavior of the flies, which may vary between individuals, sexes, time of day, days and seasons. Generally, a throughput of between 500 and 800 flies per 8 hours is feasible, with an average of 90 flies processed per hour. A factor that could help keep the throughput high is a constant temperature around 20°C, since walking activity markedly drops if the temperature rises towards 25°C. Apart from the improvement in the phenotyping throughput, the FlyCatwalk also reduces the necessary manpower to one person, as it is fully self-operating after the flies have been loaded.

Measurement accuracy

The FlyCatwalk enables very precise quantification of wing traits, as is evident from the high correlation between manually obtained and automated measurements. The correlation is lower for the two body morphometric traits, interocular distance (IOD) and shoulder width (SW), which is likely a consequence of trying to quantify parts of a 3D structure from a projection onto a 2D image. In contrast to the flat wing, whose venation pattern naturally offers well-defined landmarks as reference points for quantification, IOD and SW need to be defined from less distinct landmarks. The distance between the landmarks on the 2D projection may vary depending on the flies’ head or thorax posture on the image. As we control posture only in the manual procedure, this may account for the discrepancies between the two methods. The different definitions of SW in the automated and manual methods are a further plausible reason for the lower observed correlation: the automated SW measure is calculated from the scaling parameter applied to the thorax template by the template-matching algorithm, whereas the manual procedure uses the humeral bristles as reference points. As both IOD and SW are measures that we routinely use to determine head size and body size manually, we used these traits in the FlyCatwalk to be able to make quantitative comparisons between the manual and automated method. As a future perspective, choosing more robust measures for head or thorax size should result in more accurate quantification. A defined partial area is more suited to machine vision and should be easy to implement as we already detect the full head, thorax and abdomen shape in the current analysis.

Single fly measurement and storage

Despite the gating system two flies may occasionally be measured and stored together, but these can be discarded during the analysis. We match flies to their corresponding phenotypes by coding the images and measurements according to the fly's location in the storage device. The high percentage of successfully singled out flies and the labeling system thus ensure that the FlyCatwalk meets this requirement.

Limitations of the FlyCatwalk

The FlyCatwalk is based on voluntary walking behavior, which introduces a bias for measuring the most active flies in the population. We designed the system to exploit the fact that flies on average show negative gravitaxis and positive phototaxis. As there is variability for both these behaviors on the population level, and they are heritable (Hirsch 1959, Toma et al. 2006, Hadler 1964), we preferentially phenotype flies with strong negative gravitaxis and positive phototaxis. Flies that have sustained mutations or injuries that prevent them from walking vertically cannot, in contrast, be measured. We noticed that voluntary walking behavior is strongly dependent on the sex and the time of day. An all-female population takes longer to measure than an all-male one, which could be a consequence of their lower intra-sex aggression levels making them feel more comfortable in the crowded and confined space of the entrance chamber than males. Phenotyping throughput also shows characteristic morning and evening “spikes”, being high in the morning, then markedly dropping throughout the day and increasing again in the evening. This reflects the natural activity curve of Drosophila melanogaster (Allada and Chung 2010, Klarsfeld et al. 2003). On the analysis side, computational time is at the moment a major limiting factor for throughput, and code optimization should be performed to increase analysis speed.

Applications of the FlyCatwalk beyond selection experiments of Drosophila melanogaster Beyond the currently implemented morphometric traits (IOD, SW, WA, WL, WW), other traits that are based on head, thorax, abdomen or wing size and shape can easily be added. Furthermore, our setup can be modified and extended to include quantification of other parts of the body, such as the eyes. Many studies in Drosophila use the eye disc as a model organ (37-40) and require quantification of adult eye size, the usual readout for the amount of growth and proliferation in the eye disc. All features on the body that are amenable to detection and quantification by machine vision, such as bristles and bristle number could also be implemented easily. High throughput quantification of these traits reduces the timespan required for experiments and is especially beneficial for large-scale projects such as forward and reverse genetic screens and genome-wide association studies. A valuable application of our system beyond studies of morphometric traits could be in the production of transgenic flies, where successful transformants with red eyes have to be sorted from among large numbers of white-eyed flies. As this is a binary decision based on a single feature, color, this task is ideally suited to machine vision and automation. Furthermore, the FlyCatwalk could be adapted to quantify the response to various stimuli, such as different odors, light or gravitaxis, and perform selection on this trait. The present setup is not restricted to Drosophila melanogaster; quantification of other species that approximately match its size and weight is possible after minor adjustments in the morphometric fitting, primarily the template. For larger and heavier specimens adjustments on the modular parts, such as the diameter of the entrance, measurement and storage compartments as well as the air pressure, are necessary. 
As the logic of the setup is the same, this should be straightforward based on the existing design templates. We estimate the current size limit somewhere below the size of thoracica (4.51 mm length, 1.06 mm width, 1.27 mm thorax height).

These examples show that the FlyCatwalk, in addition to its application in selection studies, represents a valuable tool in a number of experiments: from large-scale morphometry screens and genome-wide association studies in Drosophila melanogaster and other species to transgenic fly production or, more generally, all types of experiments that require high-throughput phenotyping or sorting based on morphometric features.

MATERIALS & METHODS

Animals: Adult wild-type flies from an outbred population of 176 round-robin mated DGRP lines (14) were used for all validation experiments.

FlyCatwalk Workflow

Measurement framework: The FlyCatwalk software consists of a state machine running in LabVIEW 12 (National Instruments Corporation, Austin, TX, USA) that automatically manages the workflow and integrates the different hardware components with the image processing software. We use an Arduino Uno R3 board (Arduino SA, Chiasso, Switzerland), which allows controlling up to 6 servo motors, 4 solenoid valves and 6 light barriers and communicates with the computer using a USB serial connection. Image processing entails two steps: a fast real-time analysis for image validation followed by a detailed and more time-consuming morphometric analysis (Matlab R2013a, The MathWorks Inc, Natick, MA, USA).

Measurement workflow: The workflow of the FlyCatwalk is illustrated in Figure 1A, B. Approximately 60 flies are introduced into the FlyCatwalk entrance chamber. A pneumatic system consisting of a 2-port solenoid valve (VQ21M1-5YO-C6-Q, SMC Corporation, Tokyo, Japan) and a system of gates operated with servo motors (Modelcraft WG90MG and Modelcraft MC-965DMG, Conrad Electronic SE, Hirschau, Germany) allows singling out and storage of the measured flies. Flies are activated by air pulses to move towards the measurement channel, a narrow vertical tunnel, which they ascend due to their natural negative geotaxis (Hirsch 1959, Beckingham et al. 2005, Toma et al. 2002). Positive phototaxis (Hadler 1964) is also exploited by placing a white light (Ace I, Schott AG, Mainz, Germany) at the far end of the tunnel. At the tunnel entrance, a light barrier detects the presence of a fly and closes a gate behind it. While walking up the tunnel, the singled-out fly is imaged by a high-resolution color camera (Basler Pilot piA2400-12gc, Basler AG, Ahrensburg, Germany) at 20 frames per second. The measurement channel is covered by a standard glass coverslip, which permits a high image quality for filming but can easily be replaced when stained. In the measurement chamber, a two-color illumination strategy is used to image the wings. Blue light provides backlighting where wings do not overlap with the body. A diffusive screen is placed on the tunnel floor to provide homogeneous backlighting. The body itself is used as a diffuser to image wing parts that do overlap the body. This is achieved by illuminating the fly from both sides with red light (LXHL-LD3C red high power LED, wavelength 627 nm, Quadica Developments Inc., Brantford, Canada) channeled by 22 optical fibers. Using different light channels (blue and red) allows analyzing the two channels separately; for instance, when only the silhouette of the fly is required (as for analyzing body morphology), only the blue channel is used. The absence of direct light from above avoids reflections from the wings. In order to minimize heat while maximizing light intensity, both light sources are flashed synchronously with the camera exposure with a flash duration of 400 µs (0.8% duty cycle), using a strobe controller (Gardasoft PP520F, Gardasoft Vision Ltd, Cambridge, England). While the fly is walking up the tunnel, the acquired frames are analyzed in real time to assess image quality and to verify the orientation of the fly. The real-time image processing is performed with a custom-written C++ library, which uses the OpenCV library (Bradski 2000) to perform fast image analysis. For each acquired frame the following quality checks are performed: a) image is in focus, b) body and wings do not touch the image border, c) no direct reflections from the wings, d) fly is walking with its wings facing the camera, e) longitudinal body axis is aligned with the tunnel, f) body and wing symmetry match along the body’s longitudinal axis. If all these criteria are met, the frame is retained and added to a sequence of valid images. When the fly reaches the end of the imaged region, the acquired sequence is analyzed and a decision is made whether to accept or reject the fly based on the number of valid images.
If accepted, the fly is blown into a slot of the storage device, where it is kept until the entire sample population is collected and the in-depth analysis is completed. If the acquired image sequence does not contain enough valid frames (at least 3 are required), the fly is blown back to the entrance chamber to be re-measured. The storage device consists of a rack of 182 wells, 6 mm in diameter and 15 mm deep, each equipped with an independent door mechanism that is automatically opened when a fly is inserted or ejected. The fly container is mounted on a motorized XZ stage (two Newmark ET-150-21 mounted in XZ configuration, Newmark Systems Incorporated, Rancho Santa Margarita, CA, USA), which is controlled using a 2-axis USB controller (Newmark NSC-A2L, Newmark Systems Incorporated, Rancho Santa Margarita, CA, USA) to allow alignment of the container wells with the measurement channel outlet for storage of single individuals.
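The accept/reject logic described above can be illustrated with a minimal Python sketch. It is not the FlyCatwalk implementation (which runs in LabVIEW and C++); the class and function names are hypothetical, and the six boolean fields simply stand in for the outcomes of checks a) to f). The threshold of three valid frames is taken from the text.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class FrameChecks:
    """Outcome of the six per-frame quality checks (a-f) for one video frame."""
    in_focus: bool            # a) image is in focus
    inside_border: bool       # b) body and wings do not touch the image border
    no_reflections: bool      # c) no direct reflections from the wings
    wings_face_camera: bool   # d) fly walks with its wings facing the camera
    axis_aligned: bool        # e) longitudinal body axis aligned with the tunnel
    symmetric: bool           # f) body and wing symmetry match

    def is_valid(self) -> bool:
        # A frame is retained only if every single check passes.
        return all(getattr(self, f.name) for f in fields(self))


MIN_VALID_FRAMES = 3  # threshold stated in the text


def decide(frames: List[FrameChecks], min_valid: int = MIN_VALID_FRAMES) -> str:
    """Return 'store' if enough valid frames were collected, else 're-measure'."""
    n_valid = sum(frame.is_valid() for frame in frames)
    return "store" if n_valid >= min_valid else "re-measure"
```

A fly returning 're-measure' would correspond to being blown back to the entrance chamber, whereas 'store' corresponds to transfer into a well of the storage device.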

Data analysis: The data analysis software extracts morphometric measurements for body segments and wings and simultaneously determines the sex. Head, thorax and abdomen dimensions, interocular distance and wing morphology are calculated.

Body segments extraction: The entire image sequence is scanned to first segment the body into head, thorax and abdomen. The main image processing steps are illustrated in Figure S1. The blue color channel is used to extract the silhouette of the fly for body segmentation. Each frame is subtracted from the background, which results in a bright fly against a dark background. This image is inverted to generate the complement image, a dark fly on a bright background. Only the frames that were considered valid during the acquisition are used. Using the central image moments, the centroid and main axis are extracted from the complement images and used to align the single frames in a stack (Figure S1A). A 95th percentile image is then calculated from the stack to delete the legs, which are constantly moving while the fly is walking and therefore expose the background underneath (Figure S1B). The 95th percentile image is thresholded and an image closing morphological operation (Serra 1983) is applied to remove thin structures, such as wing veins and bristles. The three body segments are extracted from the obtained binary image using the watershed segmentation algorithm (Roerdink and Meijster 2000) (Figure S1C). The dimensions of each segment are subsequently evaluated by fitting templates to their contours.
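Two of the steps above, the moment-based alignment and the percentile image, can be sketched in plain Python as follows. This is an illustrative sketch only: the actual analysis is implemented in Matlab, the function names are hypothetical, images are represented as nested lists, and the nearest-rank percentile definition is an assumption (the text does not state which definition was used).

```python
import math
from typing import List, Tuple

Image = List[List[float]]  # row-major grayscale image


def centroid_and_axis(mass: Image) -> Tuple[float, float, float]:
    """Centroid (x, y) and main-axis angle (radians) from image moments.

    `mass` must be large where the fly is (e.g. a background-subtracted frame).
    """
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(mass):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    if m00 == 0:
        raise ValueError("image contains no foreground mass")
    cx, cy = m10 / m00, m01 / m00
    # Second-order central moments give the orientation of the main axis.
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(mass):
        for x, v in enumerate(row):
            mu20 += (x - cx) ** 2 * v
            mu02 += (y - cy) ** 2 * v
            mu11 += (x - cx) * (y - cy) * v
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, theta


def percentile_image(stack: List[Image], q: float = 95.0) -> Image:
    """Per-pixel q-th percentile (nearest-rank) over a stack of aligned frames."""
    n = len(stack)
    rank = max(1, math.ceil(q / 100.0 * n))  # nearest-rank definition (assumed)
    height, width = len(stack[0]), len(stack[0][0])
    return [
        [sorted(frame[y][x] for frame in stack)[rank - 1] for x in range(width)]
        for y in range(height)
    ]
```

In the complement images the fly is dark on a bright background, so a pixel that a moving leg covers in only a few frames takes the bright background value in the 95th percentile image, while the stationary body stays dark.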

Sex discrimination: Two methods are combined for increased robustness. The luminance in the red color channel along the abdomen is first normalized using mean luminance and abdomen size that were extracted from a set of manually sorted measurements and subsequently cross-correlated with average male and female luminance curves (Figure S2A, B, E, F). The sex of the individual is determined by the curve yielding the higher correlation coefficient. As luminance patterns may vary between flies depending on their genotypic background or their hydration state, this method alone is not sufficient for sex discrimination. We therefore implemented a complementary detection method that scans the single frames for the presence of sex combs. The images are scanned for structures extending anterior to the head along the longitudinal body axis (Figure S2C, G). The sex combs are then identified by applying a threshold to the leg image in the red color channel and scanning for dark spots (Figure S2D, H). The eccentricity of the detected regions is calculated and objects yielding high eccentricity values are rejected to avoid false positives when the leg is crossing the antennae. The confidence level of both methods is used to determine sex. For the abdominal luminance method, confidence is estimated based on how different the two correlation coefficients are. For the sex comb method, confidence is based on the average area of the sex combs when detected and on the proportion of images in which they were detected. When the combined results do not allow a decision with sufficient confidence, the sex is marked as unknown and may be determined by the user in the verification GUI.
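The abdominal luminance method can be illustrated with a short Python sketch. Several simplifications are assumed here: a zero-lag Pearson correlation replaces the cross-correlation, the sex comb evidence is left out, and the confidence threshold `min_margin` as well as all function names are hypothetical rather than taken from the FlyCatwalk code.

```python
import math
from typing import Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0  # flat profiles carry no information


def classify_sex(
    profile: Sequence[float],
    male_curve: Sequence[float],
    female_curve: Sequence[float],
    min_margin: float = 0.1,  # hypothetical confidence threshold
) -> Tuple[str, float]:
    """Pick the better-correlated average curve; report 'unknown' if too close."""
    r_male = pearson(profile, male_curve)
    r_female = pearson(profile, female_curve)
    margin = abs(r_male - r_female)  # confidence: how different the two fits are
    if margin < min_margin:
        return "unknown", margin
    return ("male" if r_male > r_female else "female"), margin
```

Returning the margin alongside the label mirrors the idea that the confidence of this method depends on how different the two correlation coefficients are; a caller could combine it with the sex comb evidence before falling back to 'unknown'.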

Interocular distance extraction: The head position is detected on the brightest frame in the red channel by binarizing the background-subtracted image in the blue channel and aligning it with the previously extracted body model (Figure S3A). The ocelli are detected using a template matching method (Figure S3B, C) and the light intensity along the line intersecting the two posterior ocelli is evaluated (Figure S3D, black line). The eye edges are identified by scanning the derivative of the luminance (Figure S3D, red line) for sharp contrast edges (Figure S3D, dashed lines). Interocular distance is quantified as the length of the line segment crossing the center of the posterior ocelli, from eye edge to eye edge.
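The edge detection on the luminance profile can be sketched as follows. This minimal Python example assumes that the profile is sampled along the line through the posterior ocelli, that its midpoint lies between the two eyes, and that each eye edge is the strongest luminance step on its side; the function names and the calibration parameter `pixel_size_um` are hypothetical.

```python
from typing import Sequence, Tuple


def eye_edges(profile: Sequence[float]) -> Tuple[int, int]:
    """Indices of the strongest luminance step on each side of the profile centre."""
    # Finite-difference derivative of the luminance along the line.
    deriv = [profile[i + 1] - profile[i] for i in range(len(profile) - 1)]
    mid = len(deriv) // 2
    left = max(range(0, mid), key=lambda i: abs(deriv[i]))
    right = max(range(mid, len(deriv)), key=lambda i: abs(deriv[i]))
    return left, right


def interocular_distance(profile: Sequence[float], pixel_size_um: float = 1.0) -> float:
    """Distance between the two eye edges, scaled by an assumed pixel size."""
    left, right = eye_edges(profile)
    return (right - left) * pixel_size_um
```

For a synthetic profile with two sharp steps 20 pixels apart, `interocular_distance` returns 20 pixels, or 100 when called with a pixel size of 5.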

Wing extraction: Wings are detected using a template consisting of the outline and wing veins L2 to L5. Wing Length is defined as the length of the segment between the wing hinge and the intersection of the wing outline and vein L3. Wing Width is measured as the length of the line connecting the intersection of the outline with vein L2 to the intersection of the outline with vein L5 (Figure S4B). The wing-fitting algorithm is implemented in Matlab and fits B-splines to the topological skeleton of the wing outline and veins extracted from the acquired video frames. The brightest frame in the red channel is selected for the procedure to ensure maximum brightness of the abdomen, which allows a sharper contrast between wing and body and facilitates detection of the wings and wing veins in regions where body and wings overlap. The algorithms used to extract the topological skeleton of the wing veins and outline differ slightly between the overlapping and the non-overlapping regions, but the principles are the same. The blue channel is used for the non-overlapping regions, since the backlighting allows clear imaging through the wings, while for the overlapping regions the red channel is used. Contrast-limited adaptive histogram equalization (Zuiderveld 1994) is applied to the background-subtracted image, which enhances the local contrast of the image while minimizing luminance gradients. The image is thresholded and cleaned using a series of morphological operations (Supporting Methods) and the resulting black and white image is skeletonized using the anaskel.m function¹ (Figure S4A, B).
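Once the template has been fitted, Wing Length and Wing Width reduce to distances between landmark points. The following Python sketch illustrates this under the assumption that the landmarks (hinge and the outline intersections of veins L2, L3 and L5) are available as pixel coordinates; the function names are hypothetical, and the aspect ratio uses the definition WL²/WA given elsewhere in this thesis.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) landmark position in pixels


def distance(p: Point, q: Point) -> float:
    """Euclidean distance between two landmarks."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def wing_length(hinge: Point, l3_tip: Point) -> float:
    """WL: wing hinge to the intersection of the outline with vein L3."""
    return distance(hinge, l3_tip)


def wing_width(l2_tip: Point, l5_tip: Point) -> float:
    """WW: outline intersection of vein L2 to outline intersection of vein L5."""
    return distance(l2_tip, l5_tip)


def wing_aspect_ratio(wl: float, wa: float) -> float:
    """WAR = WL^2 / WA, a dimensionless measure of wing shape."""
    return wl ** 2 / wa
```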

User verification of the analysis outcome: A graphical user interface (GUI) for visual verification and manual adjustment of the outcome of the automatic analysis was implemented in Matlab. The following parameters can be verified and changed: sex, IOD, WL, WW and WA. At present, the fit of body segments and the complete wing vein morphology can be visually checked but not modified².

¹ Written by Nicholas R. Howe, available at http://www.mathworks.com/matlabcentral/fileexchange/11123-better-skeletonization

Control measurements: A total of 147 flies (77 males and 70 females) were measured in the FlyCatwalk and then collected individually and frozen at -20°C for manual phenotyping.

Body: Flies were positioned on a black apple agar plate and photographed with a VHX-1000 digital light microscope (KEYENCE, Itasca, IL, USA). Morphometric body traits were measured manually using the VHX-1000 built-in measurement software. Interocular distance was measured from eye edge to eye edge along the center of the posterior ocelli and parallel to the base of the head. Shoulder width was measured as the distance between the left and right humeral bristles.

Wings: The left wing, or the right wing if the left one was not intact, was dissected from the fly and mounted in water on a glass slide for wing image acquisition. Morphometric measurements were extracted from the wing images using the WINGMACHINE software (Houle et al. 2003) and Matlab.

² The possibility of modifying the wing vein morphology has not been implemented yet, as it was not required for the present study.

COMPETING INTERESTS

The authors declare that they have no competing financial, professional or personal interests that might have influenced the performance or presentation of the work described in this manuscript.

AUTHOR CONTRIBUTIONS

EH conceptualized the approach and EH and SCV defined specifications of the system. VM and SNF developed and implemented the FlyCatwalk. SCV and VM designed and conducted the experiments, analyzed the data and wrote the manuscript. EH and SCV revised the manuscript. All authors read and approved the final manuscript.

ACKNOWLEDGEMENTS

We thank Anna Maria Strässle Eugster and members of the Institute of Molecular Systems Biology for the great support during the experiments. We also thank Katja Köhler for critical reading of the manuscript, Francesco Crivelli of the Sensory-Motor Systems Lab at ETH Zurich for participating in the design of the FlyCatwalk and ETH transfer for their support. This work was funded by grant SXRTX0-123851 from SystemsX.ch, the Swiss National Science Foundation grant 31003AB_135699 and financial support from ETH Zurich to EH.

REFERENCES

Allada, R., and B. Y. Chung, 2010 Circadian Organization of Behavior and Physiology in Drosophila. Annual Review of Physiology 72: 605–624.

Beckingham, K. M., M. J. Texada, D. A. Baker, R. Munjaal, and J. D. Armstrong, 2005 Genetics of graviperception in animals. Advances in genetics 55: 105–145.

Bennett, A. F., and B. S. Hughes, 2009 Microbial experimental evolution. American journal of physiology. Regulatory, integrative and comparative physiology 297: R17–25.

Bradski, G., 2000 The OpenCV library. Dr. Dobb's Journal. Available at: http://elibrary.ru/item.asp?id=4934581 [Accessed March 24, 2014].

Burke, M. K., and M. R. Rose, 2009 Experimental evolution with Drosophila. American journal of physiology. Regulatory, integrative and comparative physiology 296: R1847–54.

Dierick, H. A., and R. J. Greenspan, 2006 Molecular analysis of flies selected for aggressive behavior. Nature Genetics 38: 1023–1031.

Edwards, A. C., S. M. Rollmann, T. J. Morgan, and T. F. C. Mackay, 2006 Quantitative genomics of aggressive behavior in Drosophila melanogaster. PLoS Genetics 2: e154.

Falconer, D. S., and T. F. C. Mackay, 1996 Introduction to Quantitative Genetics (Edition 4). Longmans Green, Harlow, Essex, UK.

Fuller, R. C., C. F. Baer, and J. Travis, 2005 How and When Selection Experiments Might Actually be Useful. Integrative and comparative biology 45: 391–404.

Gockel, J., S. J. W. Robinson, W. J. Kennington, D. B. Goldstein, and L. Partridge, 2002 Quantitative genetic analysis of natural variation in body size in Drosophila melanogaster. Heredity 89: 145–153.

Hadler, N. M., 1964 Heritability And Phototaxis In Drosophila Melanogaster. Genetics 50: 1269–1277.

Hirsch, J., 1959 Studies in experimental behavior genetics: II. Individual differences in geotaxis as a function of chromosome variations in synthesized Drosophila populations. Journal of Comparative and Physiological Psychology 52: 304–308.

Hirsch, J., and L. Erlenmeyer-Kimling, 1962 Studies in experimental behavior genetics: IV. Chromosome analysis for geotaxis. Journal of Comparative and Physiological Psychology 55: 732–739.

Houle, D., J. Mezey, P. Galpern, and A. Carter, 2003 Automated measurement of Drosophila wings. BMC Evolutionary Biology 3: 25.

Huang, W., S. Richards, M. A. Carbone, D. Zhu, R. R. H. Anholt et al., 2012 Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proceedings of the National Academy of Sciences 109: 15553–15559.

Jang, Y., Y. Lim, and K. Kim, 2014 Saccharomyces cerevisiae Strain Improvement Using Selection, Mutation, and Adaptation for the Resistance to Lignocellulose-Derived Fermentation Inhibitor for Ethanol Production. Journal of microbiology and biotechnology 24: 667–674.

Johnston, L. A., and P. Gallant, 2002 Control of growth and organ size in Drosophila. BioEssays 24: 54–64.

Kawecki, T. J., R. E. Lenski, D. Ebert, B. Hollis, I. Olivieri et al., 2012 Experimental evolution. Trends in Ecology & Evolution 27: 547–560.

Klarsfeld, A., J.-C. Leloup, and F. Rouyer, 2003 Circadian rhythms of locomotor activity in Drosophila. Behavioural Processes 64: 161–175.

Kofler, R., and C. Schlötterer, 2014 A guide for the design of evolve and resequencing studies. Molecular biology and evolution 31: 474–483.

Koontz, L. M., Y. Liu-Chittenden, F. Yin, Y. Zheng, J. Yu et al., 2013 The Hippo Effector Yorkie Controls Normal Tissue Growth by Antagonizing Scalloped-Mediated Default Repression. Developmental Cell 25: 388–401.

Lango Allen, H., K. Estrada, G. Lettre, S. I. Berndt, M. N. Weedon et al., 2010 Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467: 832–838.

Leevers, S. J., D. Weinkove, L. K. MacDougall, E. Hafen, and M. D. Waterfield, 1996 The Drosophila phosphoinositide 3-kinase Dp110 promotes cell growth. The EMBO journal 15: 6584.

Ling, C., Y. Zheng, F. Yin, J. Yu, J. Huang et al., 2010 The apical transmembrane protein Crumbs functions as a tumor suppressor that regulates Hippo signaling by binding to Expanded. Proceedings of the National Academy of Sciences of the United States of America 107: 10532–10537.

Mackay, T. F. C., S. L. Heinsohn, R. F. Lyman, A. J. Moehring, T. J. Morgan et al., 2005 Genetics and genomics of Drosophila mating behavior. Proceedings of the National Academy of Sciences of the United States of America 102 Suppl 1: 6622–6629.

Mackay, T. F. C., S. Richards, E. A. Stone, A. Barbadilla, J. F. Ayroles et al., 2012 The Drosophila melanogaster Genetic Reference Panel. Nature 482: 173–178.

Mackay, T. F. C., E. A. Stone, and J. F. Ayroles, 2009 The genetics of quantitative traits: challenges and prospects. Nature Reviews Genetics 10: 565–577.

Mirth, C. K., and L. M. Riddiford, 2007 Size assessment and growth control: how adult size is determined in . BioEssays 29: 344–355.

Morozova, T. V., R. R. H. Anholt, and T. F. C. Mackay, 2007 Phenotypic and transcriptional response to selection for alcohol sensitivity in Drosophila melanogaster. Genome Biology 8: R231.

Nicoloff, H., V. Perreten, and S. B. Levy, 2007 Increased Genome Instability in Escherichia coli lon Mutants: Relation to Emergence of Multiple-Antibiotic-Resistant (Mar) Mutants Caused by Insertion Sequence Elements and Large Tandem Genomic Amplifications. Antimicrobial Agents and Chemotherapy 51: 1293–1303.

Nowak, K., G. Seisenbacher, E. Hafen, and H. Stocker, 2013 Nutrient restriction enhances the proliferative potential of cells lacking the tumor suppressor PTEN in mitotic tissues. eLife 2: e00380.

Oldham, S., and E. Hafen, 2003 Insulin/IGF and target of rapamycin signaling: a TOR de force in growth control. Trends in cell biology 13: 79–85.

Oldham, S., R. Bohni, H. Stocker, W. Brogiolo, and E. Hafen, 2000 Genetic control of size in Drosophila. Philosophical Transactions of the Royal Society B: Biological Sciences 355: 945–952.

Pan, D., 2007 Hippo signaling in organ size control. Genes & development 21: 886–897.

Partridge, L., R. Langelan, K. Fowler, B. Zwaan, and V. French, 1999 Correlated responses to selection on body size in Drosophila melanogaster. Genetics Research 74: 43–54.

Roerdink, J., and A. Meijster, 2000 The watershed transform: Definitions, algorithms and parallelization strategies. Fundamenta Informaticae 41: 187–228.

Serra, J., 1983 Image analysis and mathematical morphology. Academic Press, Inc., Orlando, FL, USA.

Shingleton, A. W., 2010 The regulation of organ size in Drosophila: physiology, plasticity, patterning and physical force. Organogenesis 6: 76–87.

Toma, D. P., K. P. White, J. Hirsch, and R. J. Greenspan, 2002 Identification of genes involved in Drosophila melanogaster geotaxis, a complex behavioral trait. Nature Genetics 31: 349–353.

Trotta, V., F. C. F. Calboli, M. Ziosi, and S. Cavicchi, 2007 Fitness variation in response to artificial selection for reduced cell area, cell number and wing area in natural populations of Drosophila melanogaster. BMC Evolutionary Biology 7 Suppl 2: S10.

Tumaneng, K., R. C. Russell, and K.-L. Guan, 2012 Organ Size Control by Hippo and TOR Pathways. Current biology : CB 22: R368–R379.

Turner, T. L., A. D. Stewart, A. T. Fields, W. R. Rice, and A. M. Tarone, 2011 Population-Based Resequencing of Experimentally Evolved Populations Reveals the Genetic Basis of Body Size Variation in Drosophila melanogaster. PLoS Genetics 7: e1001336.

Wang, B. L., A. Ghaderi, H. Zhou, J. Agresti, D. A. Weitz et al., 2014 Microfluidic high-throughput culturing of single cells for selection based on extracellular metabolite production or consumption. Nature biotechnology 32: 473–478.

Wiser, M. J., N. Ribeck, and R. E. Lenski, 2013 Long-term dynamics of adaptation in asexual populations. Science 342: 1364–1367.

Zeyl, C., 2006 Experimental evolution with yeast. FEMS yeast research 6: 685–691.

Zuiderveld, K., 1994 Contrast limited adaptive histogram equalization, pp. 474–485 in Graphics gems IV, edited by Paul S. Heckber

4.3 Further results

4.3.1 Foodbatch variability is a strong and specific confounder for wing size in Drosophila melanogaster

We discovered an unexpectedly large contribution of foodbatch variability to phenotypic variation exclusively in wing size. To investigate whether this was the case for other wing traits, we looked at the magnitude of the effect of food on wing length (WL), wing width (WW) and wing aspect ratio (WAR = WL²/WA), a measure of wing shape. We observed a comparable contribution to total phenotypic variance in wing length and wing width, whereas wing aspect ratio was not affected at all (Table 1). This makes sense biologically, as patterning, and consequently wing shape, is known to be more tightly controlled at the genetic level than wing size and is thus more robust to environmental fluctuations (Birdsall et al. 2000). The estimate for the foodbatch effect was consistent among wing size traits (CS, WL, WW), which could, however, be a consequence of the high correlation between these traits. The foodbatch effect estimate was also consistent among four different body size traits, despite them being less correlated with each other than the wing size traits (Table 2). To assess whether the effect of the foodbatch would significantly impact genome-wide association results, we performed GWAS for all four traits using Fast-LMM (Lippert et al. 2011), once with modeling of the foodbatch variability and once without. Comparing p-values of individual SNPs from the foodbatch-modeled versus non-modeled (naïve) GWAS, we clearly saw a difference for wing size, a slighter effect for IOD and no apparent effect on female wing shape p-values (Figure 1 A-C). Foodbatch variability can thus clearly act as a strong confounder specifically for wing size traits.

Trait     Line    Line x Sex    Food    Residual    H2
CS        0.56    0.07          0.15    0.22        0.63
IOD       0.63    0.05          0.03    0.28        0.69
WAR       0.73    0.03          0.00    0.24        0.76
rel_sp    0.63    0.05          0.04    0.29        0.68
TL        0.56    0.06          0.03    0.35        0.63
Length    0.57    0.06          0.14    0.23        0.63
HW        0.59    0.07          0.03    0.31        0.66
SW        0.47    0.09          0.02    0.42        0.56
WW        0.54    0.06          0.18    0.22        0.60

Table 1. Variance components for wing and body size traits. H2 = broad-sense heritability (see Methods section of RESULTS chapter 4.1 for definition).
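The relationship between the tabulated variance components and H2 can be illustrated with a few lines of Python. The formula below is an assumption, since the definition itself is given in chapter 4.1 and not here: H2 is taken as the share of total variance attributable to the Line and Line x Sex terms, which reproduces the tabulated values to within about 0.01 (rounding of the components). The function name is hypothetical.

```python
def broad_sense_heritability(
    line: float, line_by_sex: float, food: float, residual: float
) -> float:
    """Assumed definition: genetic (Line + Line x Sex) share of total variance."""
    total = line + line_by_sex + food + residual
    return (line + line_by_sex) / total


# Rows of Table 1: trait -> (Line, Line x Sex, Food, Residual, reported H2)
TABLE_1 = {
    "CS": (0.56, 0.07, 0.15, 0.22, 0.63),
    "IOD": (0.63, 0.05, 0.03, 0.28, 0.69),
    "WAR": (0.73, 0.03, 0.00, 0.24, 0.76),
    "WW": (0.54, 0.06, 0.18, 0.22, 0.60),
}

for trait, (line, lxs, food, res, reported) in TABLE_1.items():
    h2 = broad_sense_heritability(line, lxs, food, res)
    print(f"{trait}: computed {h2:.2f}, reported {reported:.2f}")
```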


Figure 1. SNP p-values clearly differ between foodbatch-modelled and naïve GWAS for centroid size. The plot shows the correlation between SNP p-values from GWAS with the foodbatch-modelled phenotype (x-axis) and the non-modelled phenotype (y-axis) for all four phenotypes for females (left) and males (right). SNPs are sorted according to position in the genome.

Trait   CS     WL     WW     WAR     IOD    TL     HW     SW
CS      1      0.99   0.93   0.26    0.76   0.9    0.82   0.85
WL             1      0.92   0.31    0.75   0.9    0.81   0.84
WW                    1      -0.01   0.69   0.84   0.75   0.79
WAR                          1       0.15   0.22   0.18   0.17
IOD                                  1      0.81   0.84   0.82
TL                                          1      0.89   0.89
HW                                                 1      0.89
SW                                                        1

Table 2. Correlation between traits.

4.3.2 Experimental evolution of Drosophila wing size

To generate populations of flies with extreme wing sizes relative to their body, we applied artificial selection to a population of outbred, round-robin mated DGRP lines. With this strategy we hope to specifically enrich for loci underlying wing size determination in Drosophila. As factors such as population size and number of replicates are crucial for having the power to distinguish truly underlying loci from randomly enriched loci, we maximized both these factors within the limits of our current phenotyping ability. With the FlyCatwalk, a specifically designed phenotyping and selection system, we were able to achieve a population size of 300 flies per generation and selected condition. We established three replicate selection lines per selection condition to address whether there would be different strategies for modulating wing size. In total, we performed ten generations of selection for wing size in nine populations: three high lines, three low lines and three control lines. To enrich for loci that underlie size variation specifically in the wing, we applied selection for wing size relative to the body, measured by IOD. To avoid losing most alleles to drift in the first few generations, we kept selective pressure at the low end, selecting the 40% most extreme individuals per population, and kept it constant throughout the experiment. The oppositely selected lines showed a marked divergence in wing size, whereas IOD remained unchanged over the whole course of the experiment (Figure 2). We thus conclude that selective pressure was high enough to shift the phenotypic means of the populations in the direction of selection. Furthermore, our strategy to make inseminated females lose their stored sperm by mating them for a week with selected males before egg laying for the next generation seems to have been efficient.
The generation 10 populations will be frozen and pool-sequenced in order to identify differentially enriched loci in the selected populations. The further outlook for this results part can be found in the outlook of the overall thesis.
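The selection step in each generation can be illustrated with a short Python sketch of truncation selection. Two points are assumptions rather than statements from the text: relative wing size is computed as the ratio of wing size to IOD, and the 40% most extreme individuals are chosen separately within each sex. The dictionary keys and function names are hypothetical.

```python
from typing import Dict, List

Fly = Dict[str, object]  # keys: "sex", "wing_size", "iod"


def relative_wing_size(fly: Fly) -> float:
    # Assumed definition: wing size divided by interocular distance.
    return float(fly["wing_size"]) / float(fly["iod"])


def select_parents(flies: List[Fly], fraction: float = 0.4,
                   direction: str = "high") -> List[Fly]:
    """Keep the most extreme `fraction` of each sex in the chosen direction."""
    selected: List[Fly] = []
    for sex in ("female", "male"):
        group = [f for f in flies if f["sex"] == sex]
        # Largest relative wings first for "high" lines, smallest first for "low".
        group.sort(key=relative_wing_size, reverse=(direction == "high"))
        n_keep = int(round(fraction * len(group)))
        selected.extend(group[:n_keep])
    return selected
```

With 150 measured flies per sex and a fraction of 0.4, this sketch retains 60 females and 60 males as parents of the next generation.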


Figure 2. Divergent selection for relative wing size causes substantial shifts in population means in the selected direction. Plots show the change in population mean over ten generations of selection for A) wing size and B) IOD. Populations clearly diverge in the selected directions for wing size, whereas IOD remains constant. The top part of each subfigure shows females and the bottom part males. Orange = selection for large wings relative to body, blue = selection for small wings relative to body, black = control. Data points are population means of 150 individuals per sex.


5. DISCUSSION

5.1 Small fluctuations in foodbatch quality cause substantial population-level variation in wing size:

We quantified various wing and body size traits in 149 of the Drosophila Genetic Reference Panel lines to be used in a genome-wide association scan for loci controlling Drosophila wing and body size variation (RESULTS, chapter 4.1). As size traits and generally quantitative traits are notoriously prone to be highly influenced by the environment, we raised the flies under standard culture conditions that were specifically tailored to controlling environmental factors previously known to affect morphometric traits. These factors include temperature, humidity, nutritional quality and quantity, crowding (Santos et al. 1994, French et al. 1998, Imasheva et al. 1999, Lefranc and Bundgaard 2000, Imasheva and Bubliy 2003), shared- environment effects (e.g. within-vial effects) and effects due to experimental and developmental timing (Anholt and Mackay 2004, 2010).

We either kept these confounders constant throughout development (temperature, day/night cycle and humidity) or randomized their influence to avoid systematic effects on the phenotypes. In an ideally controlled experiment, the same food source would be used for all lines and for the whole duration of the experiment, as is commonly done in studies using Arabidopsis thaliana as a model (e.g. Lahner et al. 2003). However, this was not feasible in our case as the Drosophila medium dried out too fast for an experiment of such length and this changed the quality of the food considerably. An additional difficulty was the very different developmental timing and fecundity of the lines, resulting in some lines having an average time between generations of 10 days and others of 18 days at 25°C. Assuming that freshly prepared food using a standard recipe and procedure by one person albeit on different days would be more comparable than using food of the same batch but of varying quality, we decided to distribute lines on four such foodbatches. The assumption was based on a prior observation that dry food gave rise to more phenotypic variation among flies of the same genotype, presumably because the larvae could not access the food easily. However, even using this standardized food, quantitative genetic analyses revealed that the variability in foodbatches accounted for a substantial part of the variance in wing size while affecting body size to a lesser extent.

We found that foodbatch variability affected two other wing size traits, wing length (WL) and wing width (WW), in a manner comparable to wing size, which could, however, be due to the high correlation between these traits (RESULTS, chapter 4.3). We only saw the strong food effect for wing size traits, whereas a wing shape trait, wing aspect ratio, seemed to be more robust and showed a pattern comparable to that of body size traits. This makes sense biologically, as wing shape is known to be under stricter genetic control than wing size (Birdsall et al. 2000). Furthermore, SNP-level p-values differed markedly between food-modeled and non-modeled GWAS for wing size, whereas this effect was less pronounced for both body size GWAS and the wing shape GWAS. The food thus seems to act as a confounder only and specifically for the wing size phenotype.

The 15% of wing size variation attributable to food variation matches exactly the difference in broad-sense heritability estimates between the full dataset (0.63) and the smaller control dataset (0.78). In the control experiment all flies of the experimental generation were raised on the same food. A broad-sense heritability estimate is only valid for the specific population, time point and environment for which it was determined and can differ for the same phenotype in different populations and environments (Falconer and Mackay 1996, Lynch and Walsh 1998, Visscher 2008, Anholt and Mackay 2010). Our results show that under perfectly controlled conditions 78% of total wing size variation in our population can be explained by the different genotypes, which is reduced to 63% when one environmental variable, food, is varied slightly.

A possible explanation for how standardized food could produce variability in a phenotype would be very subtle variations in the cooking protocol. Slight differences in the temperature or the cooking time could lead to the evaporation of more water from the broth and consequently affect the texture of the food. As embryos are deposited on the food surface and have to dig their way into the food upon hatching, food texture could affect how efficiently and how fast larvae can enter and process the food. Physiologically, this would mirror a situation where there are slight differences in food availability between larvae, perhaps comparable to a slightly above-optimum population density. Flies reared under nutrient restriction (NR) and genetically starved flies show a proportional reduction in all body parts, though there are exceptions where specific, essential organs are less sensitive to NR, ensuring survival and fitness of the organism under stress (Cheng et al. 2011, Tang et al. 2011). Also, different body parts show different scaling relationships with overall body size, and distinct allometric relationships have been observed between a body part and overall size, dependent on the environmental conditions (Shingleton et al. 2009). These observations show that there is differential sensitivity of organs to external cues and suggest the existence of systems to integrate environmental conditions with growth of individual organs.

Shingleton et al. (2009) systematically analyzed the plasticity of different body parts in response to environmental variables in Drosophila melanogaster. Plasticity variance of the phenotype was defined as the combined variance in the trait caused by the environmental variable and interactions of the environmental variable with the genotype. In contrast to what we find, their results indicated that there is slightly more plasticity variance of the thorax in response to nutritional cues at 25°C than of the wing. However, our experimental settings differ from theirs in a number of ways that may explain the discrepancy. Obviously, the slight differences in food availability between larvae as a consequence of texture variability in our experiment are a very different situation from actual nutrient limitation. The food quantity spectrum used by Shingleton et al. is very broad, ranging from 1% to 100% cornmeal/molasses medium, which covers varying levels of starvation, the mildest form of NR being a 50% medium. Our food accessibility limitation would more realistically mirror very small changes in medium richness, perhaps in the range of a few percent, whereas large effects of NR on phenotypes usually only start to be visible from 50% reduced medium downwards.

Plasticity variance is composed of the variance in the trait attributable to the environmental variable, and the variance attributable to interaction between the environment and the genotype. We cannot calculate plasticity variance in our dataset as we have too few observations to get reliable genotype by food interaction estimates. As a consequence, we can only make statements about the variance in traits caused by the environmental variable food, and this is higher for wing size traits than for body size traits. It is, however, possible that the variance in trait size caused by interactions of the foodbatch with the genotype is much lower for wing size than for body size in our population, which would counteract the larger environmental variance for the wing. Also, we estimated the food variance using data from both sexes (versus only males in Shingleton et al.) and did not estimate the contribution of the foodbatch by sex interaction, again for the purpose of keeping our model simple and the number of estimated parameters small. However, comparing the foodbatch variability estimates from sex-separated datasets shows the same trend of a higher effect on wing size traits than on body size traits, while the proportion of phenotypic variance due to foodbatch variability is in general smaller for males than for females for all traits.

Furthermore, we use a population of 149 inbred lines with a nucleotide diversity of approximately 2.5% among them (Mackay et al. 2012). This contrasts with the only three genotypes used in Shingleton et al. Also, all of their strains were isogenic laboratory strains, two of which only differed at their third chromosome. The genetic variability was thus most likely a lot higher in our dataset, which could have an effect on the amount of plasticity variance that is detectable.

In summary, the differences in experimental settings between our study and the study by Shingleton et al. may explain the different observations of the effect of nutritional variability on different body parts. It is feasible that if the effect we observe is due to texture differences in the media and consequential variation in accessibility, this would have very different effects on the organism and the individual organs from severe starvation. As a larger body generally means more resources for reproduction and survival it could make sense to first place fewer resources into making wings. Of course the wing size spectrum would be constrained at the lower boundary by the ability to fly. In contrast, if nutrients become severely limiting, a reduction in body size is essential as the energetic cost of maintaining a large body would be too high. Mechanistically, this could be explained by variable thresholds of insulin signaling sensitivity among different organs, along the lines of prior observations of Shingleton et al. that a mutation in the insulin receptor reduces insulin signaling and has a greater impact on wing size than on genital size (Shingleton et al. 2005) and the reduced sensitivity of genital discs and neurons to NR due to reduced levels of FOXO and bypassing amino acid dependence (Cheng et al. 2011, Tang et al. 2011).

An alternative, non-biological explanation for the wing size specific foodbatch effect could be that our phenotypic estimates are much more accurate for the wing traits than for the body traits. The wing is basically a two-dimensional structure with very distinct, natural landmarks that lends itself to morphometric analysis. For quantification, we used a sophisticated, semi-automated wing image processing system based on a published wing analysis program (Houle et al. 2003). The body measures were in contrast measured manually and subject to human error. The body being a 3D object makes it harder to reliably identify landmarks used for body trait quantification. As a consequence, the residual environmental contribution is smaller for wing size (22% centroid size) than for the body size traits (28% interocular distance and 35% thorax length). To be able to compare variance components between traits we need to normalize to one, as the estimates are on different scales for the different traits. As a consequence of this normalization, the food contribution to body size variation could appear to be smaller because it is squeezed due to the relatively larger residual component. However, we can refute that explanation by examining the absolute values of the parameter estimates for the food and genotype terms. If the difference in the food effect were merely due to the relatively larger residual component, we should not see a difference in the ratio of the variance estimate for food ($\sigma^2_F$) to the variance estimate for line ($\sigma^2_L$) between the phenotypes. However, the ratio $\sigma^2_F/\sigma^2_L$ is five times higher for the wing size traits than for the body size traits. This clearly shows that the food effect is not a normalization artifact.
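
The invariance that this argument relies on can be written out explicitly. The derivation below is a sketch that assumes a model with exactly three variance components (line, food and residual); the symbol $\sigma^2_E$ for the residual variance and the tilde notation for normalized components are introduced here for illustration only.

```latex
\tilde{\sigma}^2_X = \frac{\sigma^2_X}{\sigma^2_L + \sigma^2_F + \sigma^2_E},
\qquad X \in \{L, F, E\}

\frac{\tilde{\sigma}^2_F}{\tilde{\sigma}^2_L}
  = \frac{\sigma^2_F / (\sigma^2_L + \sigma^2_F + \sigma^2_E)}
         {\sigma^2_L / (\sigma^2_L + \sigma^2_F + \sigma^2_E)}
  = \frac{\sigma^2_F}{\sigma^2_L}
```

Because the common denominator cancels, a larger residual variance shrinks the normalized food and line components by the same factor and leaves their ratio unchanged.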

A third explanation lies in the potential inaccuracy of our foodbatch variability estimate. We treat foodbatch as a random variable, which means that based on the four batches we have we try to predict how different foodbatches in general affect phenotypic variability. However, a mere four batches is probably not a representative sample of different foodbatches overall and too small a sample from which to reliably estimate general foodbatch variability. The foodbatch contribution could thus be this high for wing size or this low for the other traits by chance. A confidence measure for the estimate, which would resolve this issue, is unfortunately not calculated with the lmer() function for statistical reasons. The lme4 package contains functions to calculate confidence intervals using Markov Chain Monte Carlo methods, but this is only recommendable and reliable when more groups are compared. Since we are looking at random effects it does not make much sense to use bootstrapping, as we would never sample new foodbatches and thus not address the general effect of using different standardized foodbatches. We would not account for a possible foodbatch that was even more different from the others and would lead to more phenotypic variability in body size among the lines.
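
How unreliable a variance estimate from only four groups can be is illustrated by the simulation sketch below. It is deliberately simplified and is not the mixed model fitted with lmer(): batch effects are drawn from a normal distribution with a known variance of 1 and are assumed to be observed without residual noise, and the between-batch variance is estimated with the plain sample variance. All function names and parameter values are hypothetical.

```python
import random
import statistics

def batch_variance_estimates(n_batches, true_sd, n_replicates, rng):
    """Repeatedly draw n_batches batch effects and estimate their variance."""
    estimates = []
    for _ in range(n_replicates):
        effects = [rng.gauss(0.0, true_sd) for _ in range(n_batches)]
        estimates.append(statistics.variance(effects))  # unbiased sample variance
    return estimates

def central_90_interval(values):
    """Return the 5th and 95th percentiles of the simulated estimates."""
    ordered = sorted(values)
    lower = ordered[int(0.05 * len(ordered))]
    upper = ordered[int(0.95 * len(ordered)) - 1]
    return lower, upper

rng = random.Random(1)  # fixed seed so the sketch is reproducible
few = batch_variance_estimates(n_batches=4, true_sd=1.0, n_replicates=2000, rng=rng)
many = batch_variance_estimates(n_batches=40, true_sd=1.0, n_replicates=2000, rng=rng)

few_lo, few_hi = central_90_interval(few)
many_lo, many_hi = central_90_interval(many)
print(f"4 batches:  90% of estimates fall in [{few_lo:.2f}, {few_hi:.2f}]")
print(f"40 batches: 90% of estimates fall in [{many_lo:.2f}, {many_hi:.2f}]")
```

With four batches the simulated estimates scatter over a much wider range around the true value than with forty, which is why a single estimate based on four foodbatches could be high or low by chance.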

In conclusion, to be absolutely sure of the differential sensitivity of wing and body size traits to slight foodbatch differences, several genotypes should be measured after being raised on a higher number of different foodbatches each. This would allow getting more solid estimates for the effect of the different foodbatches on population-level phenotypic variance. As our experiment was never designed for detecting foodbatch effects, since we did not expect them to be present, we did not have the sample size to get reliable estimates to address genotype by food interactions. Genotype by food interactions are very likely abundant for size traits, as the organism is critically dependent on nutrient supply and must regulate its growth according to the amount of available resources, a mechanism that involves systemic IIS signaling, hormones and intracellular pathways, and thus is dependent on genotypes at a large number of loci.

Overall, it is astonishing how such a slight variation in the environment can be reflected to such an extent in a phenotype. This raises the question of how comparable or reproducible results are among different Drosophila laboratories, where food nutritional content can be markedly different due to the absence of one accepted standard medium. For studies dealing with genetic manipulations that produce robust and large effects this may not be an issue, whereas studies using inbred lines and investigating quantitative traits are likely affected.

Generally, slight environmental fluctuations may provide an explanation for the irreproducibility of weak phenotypic effects. As can be seen by comparing SNP p-values from GWAS of foodbatch modeled and non-modeled phenotypes, a small variation in food source can act as a major confounder in trait associations. The effect we see is already considerable in a situation with a perfectly controlled environment, and we can only imagine it to be much more pronounced in situations where environmental control is imperfect, such as in GWA studies in humans. Obviously outbred human individuals may be less sensitive to fluctuations in the environment than inbred strains, but then the differences in quality and quantity of the consumed food are immensely larger in human populations than the differences in our standardized food. Nutrition affects an organism throughout its lifetime, and apart from influencing overall height (see the pronounced differences in size between humans living only a few kilometers apart due to malnutrition during development (Schwekendiek and Pak 2009, Pak 2010)), also affects weight, the development of obesity, type II diabetes and metabolic syndrome and a variety of other diseases (Riccardi et al. 2004, Alexander et al. 2010, Reynolds et al. 2010, Gerber 2012, Arts et al. 2014). GWAS aiming at identifying risk loci for these diseases have to account for dietary variation, and a multitude of other confounders, in the population to be able to recover robust associations. As accounting for diet and physical activity can improve the power of a GWAS to recover novel associations (Igl et al. 2010), dietary confounding does seem to be an issue also in humans. In some studies it is now common to collect lifestyle and lifetime data for study subjects and often these are correlated with phenotypic measures and corrected for (Aune et al. 2013, Rueedi et al. 2014).
The strong confounding effect we see of only small fluctuations in an environmental variable shows that careful evaluation of potential environmental covariates is absolutely essential for the success of a GWA study to identify truly causal associations.

5.2 Many loci with small effects and predominantly regulatory variants underlie wing size variation in Drosophila:

To describe the genetic architecture of a trait, and ultimately understand how variation in this trait is created, it is necessary to know the number of loci that underlie this trait, the effect size distribution of these loci, their frequency in the population and their interactions with each other. For the genetic architecture of height, or body size, humans and domestic animals represent the extremes in terms of the allelic spectrum of involved SNPs. In humans, all common variants in the genome taken together explain only about half of the heritability of the trait, while in dogs, horses and cattle species few loci account for a large proportion of size variation, which is a consequence of specific breeding of these species (Sutter et al. 2007, Yang et al. 2010, Makvandi-Nejad et al. 2012). Not surprisingly, as Drosophila populations in the wild are naturally breeding like humans, their genetic architecture of size seems to be more similar to that of humans. Though the population we used for GWAS consisted of inbred lines, and many loci were presumably lost during the inbreeding process, genetic variation among the whole set of lines is still a good approximation of the genetic pool of the original wild population, which means our results are largely representative of the genetic architecture of size in wild Drosophila populations. Obviously, our results are population specific and GWAS using a different population could yield other associations due to different allele frequency distributions among the populations. However, though different SNP loci might be identified in another population, we would expect that truly causative genes would be replicated. Using three different methods, single SNP GWAS, gene-wise summary statistics calculated by the VEGAS method, and locus-locus interaction scans using FastEpistasis, we identify some loci common to all three methods but largely distinct sets of significant SNPs or candidate genes associated with wing or body size (RESULTS, chapter 4.1.). Generalizing to other phenotypes these observations imply that, though these methods are designed to enable identifying partly complementary sets of candidate loci, the number of loci controlling trait variation is likely much higher than those that are detected by single locus GWAS, which is the default method for associating phenotype to genotype. With the most stringent SNP inclusion threshold we identify up to 77 SNPs associated with wing size variation. Taking into account the loci identified with the other two methods and considering that we have low statistical power for identifying all causal variants it is likely that at least several hundred loci control natural variation in wing size in Drosophila melanogaster.
This is not a surprising finding per se, as classical genetic studies have shown that genes from at least three different pathways, IIS/TOR, Hippo and EGFR control cell growth and proliferation, with a multitude of other genes from pathways involved mainly in tissue polarity, patterning and developmental timing also affecting growth. What is unexpected, however, is that only a very small fraction of the candidates we identify overlap with canonical growth pathway genes. There are several possible explanations for this:

On the one hand, it is conceivable that there are no large effect mutations in growth pathway genes but only variants with small effects. Because growth pathway genes are to a large part essential genes or have pleiotropic roles, affecting other fitness related traits such as fecundity, lifespan and mating success, a mutation with a large effect will in most cases be lost from the population due to lethality or reduced fitness of the organism carrying it. As variants with small effects require large population sizes to be detected by GWAS our experiment could simply have been underpowered for identifying such small effects and thus did not recover more growth pathway genes.
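
The dependence of detection power on effect size can be made concrete with a rough calculation. The sketch below uses a standard normal approximation for a two-sided single-SNP test, in which the non-centrality parameter is n·q²/(1 − q²) for a SNP explaining a fraction q² of the phenotypic variance. The sample size of 149 is the number of inbred lines mentioned in this discussion; the significance threshold of 1e-5 and the listed effect sizes are assumptions chosen for illustration, and the function name `gwas_power` is hypothetical.

```python
from statistics import NormalDist

def gwas_power(n_lines, variance_explained, alpha):
    """Approximate power of a two-sided single-SNP test (normal approximation)."""
    std_normal = NormalDist()
    # Non-centrality parameter of the 1-df association test.
    ncp = n_lines * variance_explained / (1.0 - variance_explained)
    shift = ncp ** 0.5
    # Critical value of the two-sided test at significance level alpha.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Assumed effect sizes (fraction of phenotypic variance explained by one SNP).
for q2 in (0.01, 0.05, 0.10, 0.20):
    power = gwas_power(n_lines=149, variance_explained=q2, alpha=1e-5)
    print(f"variance explained {q2:.2f}: approximate power {power:.3f}")
```

Under these assumptions a variant explaining about one percent of the variance would almost never reach significance in a panel of this size, whereas only variants explaining a large share of the variance would be detected reliably.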

On the other hand, variants in essential growth pathway genes with large effects could exist in the population, but at very low frequencies. Assuming that mutations with large effects would be mostly deleterious for the fitness of the organism, such SNPs would only exist in heterozygotes, thus automatically having a lower allele frequency in the population than other SNPs that do not negatively impact viability and fitness of the organism in the homozygous constellation. During the inbreeding process, highly deleterious SNPs would be lost while mildly deleterious ones, for example those affecting fecundity or developmental timing, could get fixed in a few lines. As they were already present at low frequencies in the original population only few lines would end up with such a SNP, and this SNP would have far fewer reads mapping to it than other SNPs. Taking into account also the high frequency of sequencing errors with current standard platforms, the combination of these issues would make it difficult to call such a SNP with high confidence during genotyping, with the result that the SNP would not be detected as being present in the population. Even if enough lines had the SNP to enable high confidence genotyping, it could still be excluded from our GWAS since we do not include SNPs present in fewer than four lines in the least stringent case.
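
The inclusion criterion mentioned above can be sketched as a simple filter. The example assumes that each SNP is coded as 0 or 1 per homozygous inbred line, with `None` for a missing call, and that "present in fewer than four lines" refers to the number of lines carrying the less common allele. The data and function names are hypothetical and do not reproduce the actual genotyping pipeline.

```python
def minor_allele_line_count(alleles):
    """Count the lines carrying the less common allele, ignoring missing calls."""
    called = [a for a in alleles if a is not None]
    ones = sum(called)
    return min(ones, len(called) - ones)

def filter_snps(genotypes, min_lines=4):
    """Keep SNPs whose minor allele is present in at least min_lines lines."""
    return {
        snp: alleles
        for snp, alleles in genotypes.items()
        if minor_allele_line_count(alleles) >= min_lines
    }

# Hypothetical toy data: ten inbred lines, three SNPs.
genotypes = {
    "snp_common": [0, 1, 1, 0, 1, 0, 0, 1, 1, 0],
    "snp_rare": [0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
    "snp_missing_calls": [1, None, 1, 0, 1, None, 0, 1, 0, 0],
}

kept = filter_snps(genotypes)
print(sorted(kept))
```

A rare variant such as `snp_rare`, carried by only two lines, never enters the association test, regardless of how large its effect is.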

Alternatively, SNPs with large effects could be present and genotyped, and even included in the GWAS, but not yield a large effect because the SNP’s deleterious effects are masked by the genotype at another locus. It is not uncommon for organisms to live with lethal alleles as long as these are buffered by counteracting alleles at another locus. An example of this is flies carrying a PTEN mutation, which would normally cause lethality due to highly increased levels of PIP3 (Stocker et al. 2002). However, flies carrying a second mutation in the PH domain of Akt that reduces its affinity for PIP3 are phenotypically normal. In the example above these loci would always be linked, as flies carrying only the PTEN allele would not be viable, thus allowing us to detect such an interaction by LD analysis. However, as we did not detect any long range LD this is apparently not an explanation for the underrepresentation of bona fide growth pathway genes.

Apart from the putative contribution of rare large effect variants and common very small effect variants in known growth genes, largely novel loci and a minority of previously known loci are associated with variability in size in our population. Of the known loci few belong to or regulate the three major pathways governing cell growth and proliferation, among them the Hippo pathway kinase Warts (wts) and the upstream regulator expanded, and from IIS/TOR signaling the insulin-like peptide 8 (dILP8), the negative IIS regulator secreted decoy of InR (Sdr), the TORC1 target and autophagy regulator Atg1, and the TORC1 regulators Scylla (scyl) and SNF4Agamma. From the EGFR pathway we find its negative regulator kekkon-1 as well as rasp, son of sevenless and Grunge. Most of the others have roles also in patterning or otherwise affect wing or eye disc development. The most significant female MAC7 SNP for the CS and CSIC GWAS (p = 7.15E-08, p = 6.43E-08) mapped to exon 5 of the gene CG6091, an ortholog of the human de-ubiquitination enzyme OTUD5, which has a role in innate immunity, and the most significant male CS, CSIC and rCS SNP (p = 7.00E-08, p = 8.55E-08, p = 3.74E-07) was located in the intron of CG34370, a gene encoding an LDL repeat containing protein of unknown function, which was recently identified as a candidate in a GWAS for lifespan and lifetime fecundity in Drosophila (Durham et al. 2014). Rather unexpectedly, the most significant rCSF SNP (p = 1.31E-07) was situated in the intron of dsx (doublesex), a gene well characterized for its involvement in sex determination, fecundity and courtship behavior. In the body size GWAS in contrast, we identified a cluster of SNPs lying 12-13kb upstream of the gene encoding the negative EGFR pathway regulator Kekkon-1 among the top associations (p=3.45E-09). Not surprisingly, most of our identified variants are regulatory, as could be expected given that genes affecting growth are often essential genes and may also impact on fecundity and fitness of the organism. The safer and more common way to introduce variability in size while maintaining function seems to be modulation of protein abundance (Stern and Orgogozo 2008). This finding is in line with those of other GWAS, as for most other phenotypes the majority of associations fall within regulatory regions and the intergenic space.
As we show in our results (RESULTS, chapter 4.1.), annotation of such SNPs with functional intergenic element signatures and looking for conservation of the sequence provides a way for formulating hypotheses towards elucidating the functional impact of an intergenic association.

Most identified coding SNPs are synonymous substitutions. These have been shown to be functional in some cases (Hunt et al. 2014), but based on our data we cannot exclude that they tag a rare non-probed SNP in their close proximity rather than being causally associated themselves. Among all candidates there was only one significant nonsense SNP, which was located in exon 8 of the gene Dhc64C, a member of the dynein heavy chain family, important molecules for cellular transport. This SNP was significantly associated with all wing size traits in both sexes and had comparable effect sizes in all these GWAS. Nonsense and missense SNPs are of particular interest as they allow the formulation of testable hypotheses for the elucidation of the functional impact of the polymorphism. We found missense SNPs in genes coding for the mitochondrial protein Cep89, Pi3K68D, the FERM and PH domain containing protein CG34347, the cell shape regulator Mp20 (Kiger et al. 2002), CG5381, the putative superoxide dismutase CG31028 and in Fbp2, a protein with functions in metabolism. We found largely different genes with the VEGAS method for the same trait, but over all traits there was an overlap of 24%. Both the low overlap between methods and the lack of enrichment could be a consequence of only considering genes above our rather arbitrary cut-off of 20 genes and it is thus possible that more overlap with the GWAS results and significant enrichment could be detected when including progressively more genes. With the exception of VHA M8.9 (CG8444, a subunit of vacuolar H+ ATPase with a role in planar cell polarity that can interact with fz and fz2) knockdown of most of the novel genes resulted in a small change in median wing size (-19.4% to 10.1%), indicating that their role in the growth of this tissue is modulatory rather than causal. This implies that they either act redundantly with other genes or as enhancers or mild suppressors of a signal, instead of being involved in generating and propagating the signal. An example of such a gene that we identified is Pox neuro (Awasaki and Kimura 2001), a nonessential gene which has a role in wing hinge formation. However, as we only performed knockdown in the wing we do not know whether these novel genes are essential for other developmental processes or have larger effects upon overexpression. It is furthermore notable that seven of the eight candidate knockdowns that yielded wing size changes exceeding 10% were in the negative direction, indicating that among our candidates, genes with a normally growth supportive function have a stronger input on size than do genes with a growth inhibitory function.

Through statistical epistasis analysis we could find putative biological interactors for 42 of our candidates. Evidence for the applicability of this approach in identifying at least some true biological interactions comes from the detected interaction between InR and the protein tyrosine phosphatase Lar. Lar has been shown to be able to dephosphorylate InR (Madan et al. 2011), which likely affects InR activity. Polymorphisms at these two loci could thus cancel out each other’s effects or act synergistically to increase or decrease InR activity, and thus have variable effects on size. Furthermore, we find an enrichment of interactions between GWAS candidates and epistasis interactors (expected 52, observed 100, p=4.26E-09). This indicates that though different approaches may yield different top associations, these do form gene interaction networks among each other. Combining different approaches for identifying trait associations may thus help in placing the candidates from both approaches into a biological context. Obviously, these results have to be interpreted with care as we expect a substantial proportion of false positives in this list of additional novel genes due to the low power for detecting locus-locus interactions with a population size as small as ours and the rather low nominal significance threshold we set. Nevertheless, some of these interactions may provide the basis for further hypothesis driven investigation towards elucidating the roles and connectivity of newly identified genes and discovering new links between already known genes.

Emerging from our results is a picture of growth control that suggests even more interconnectedness between growth and morphogenetic processes, and that the mutational targets for creating variability in size are, at least to a larger part than expected, not the core components of IIS/TOR, EGFR and Hippo signaling. Even if more canonical growth pathway genes will in the future be found to be associated with size variation in GWAS with more power, the associations that we have validated still present a substantial number of novel regulators in growth control. Below we will discuss some thus far unknown candidates and their putative roles in growth control.

5.3 Novel candidates fall into diverse functional classes, overlap candidates from other studies and are associated with height or obesity related traits in humans:

The novel candidate genes fall into diverse functional classes, reflecting the multitude of processes that may converge on growth. Apart from genes with roles in PCP and metabolism discussed below we identify genes involved in signal propagation, transmembrane transport, transcription and translation (the eIF4H homolog Rbp2 and the ribosomal proteins RpS3 and RpS16), immunity and the hormonal cascade. Signaling components include Hmgcr (a newly identified Ras signaling component (Ashton-Beaucage et al. 2014) whose human ortholog shows association to metabolite levels) and the classical protein kinase C Pkc53E, which has been implicated in alcohol insensitivity in Drosophila (Chen J et al. 2010) like PKC enzymes in humans, and whose human ortholog has been found to be significantly associated with height. Fifteen other genes that we identified as candidates but did not validate were found to enhance or suppress major growth pathways and effectors (Schertel et al. 2013) and three candidates, CG10249, CG2269 and CG9743, showed significant association to body weight in Drosophila (Jumbo-Lucioni et al. 2010).

Interestingly, we found several transmembrane ion transporters among our candidates. These mediate cellular responses to light, nerve growth factor, and a wide range of chemical and physical stimuli and metabolic stress (Minke and Cook 2002) and are found mutated in tumors and neurodegenerative disorders. As mediators of extracellular signals and stress these are good candidates for an upstream regulatory role in growth. One of our validated candidates is Mid1 (mildly increased wing size), which has been found to function as a stretch activated Ca2+ channel and plays a role in the polarized growth of mating projections and the response to cold stress and iron toxicity in S. cerevisiae (Iida et al. 1994, Levin and Errede 1995, Peiter et al. 2005). As mechanical tension clearly plays a role in growth control of imaginal discs, this channel could act in translating such signals to intracellular signaling pathways via the second messenger Ca2+. For another novel candidate, the transmembrane channel Trpm (flies with wing-specific knockdown show a mild increase in wing size), the human ortholog was found to be associated with anthropometric traits during puberty, indicating a role during the postnatal growth phase in humans. In fungi and nematodes, Mid1 is involved in the trafficking of sodium leak channels, and in Drosophila, neuronal knockdown of Mid1 phenocopied the circadian motor and social clustering effects of loss of the sodium leak channel narrow abdomen (na) (Ghezzi et al. 2014).

The only novel candidate gene we find that may have a function in the hormonal cascade regulating developmental timing is the gene CG14258. The encoded protein has putative juvenile hormone binding functionality and is conserved among Drosophila species, but so far no studies have systematically investigated its in vivo function and no human orthologs exist. It has, however, been found to have slight but nonsignificant male biased expression (Vanaphan et al. 2012).

Most of our GWAS candidates have human orthologs, some of which have been associated with height or obesity related traits in humans. As IIS signaling in Drosophila performs the dual function of the human Insulin/IGF system it is likely that growth associated genes in Drosophila could have effects on either growth or metabolic phenotypes in humans, given they feed into or mediate IIS signaling at some point. It is interesting to note that a predicted human ortholog of Drosophila aPKC (mainly characterized for its role in polarization and asymmetric cell division but found to significantly reduce wing size in our screen, in line with results identifying it as a Hippo pathway regulator (Parsons et al. 2014)) was significantly associated with height. Overall, we found orthologs for five of our candidates that were found to be associated with height in humans (5-HT1A/HTR1D, aPKC/PRKCZ, dally/GPC5, Khc-73/KIF13A, Pkc53E/PRKCA), two with pubertal anthropometrics (trpm/TRPM3, Tsp66E/CD82) and an additional three with bone mineral density (arr/LRP5, Axn/AXIN1, Gug/RERE), which is an important factor during growth. Additionally, we found human orthologs for 24 of our candidate genes with an association to one of the following obesity-related traits: BMI (six genes, 5-HT1A/HTR1A, Ac13E/ADCY9, aret/CELF1, Hmgcr/HMGCR, Sec16/SEC16B, fra/DCC), interaction with BMI (two genes, Bx/LMO1, RhoGAP15B/ARAP1), metabolite levels (five genes, Ance-3/ACE, dally/GPC5, Eaat1/SLC1A4, Hmgcr/HMGCR, stan/CELSR2), obesity related traits (nine genes, aret/CELF2, CG42673/NOS1AP, CG5549/SLC6A5, CG9086/UBR2, dnc/PDE4D, fra/DCC, Lar/PTPRD, Magi/MAGI3, TfIIB/GTF2B) and Type II diabetes (two genes, Lar/PTPRD and tws/PPP2R2C). We additionally found that GIPC2, the human ortholog of the PCP regulator kermit, for which we detected association to body size with the VEGAS method and that showed a nominally significant pairwise interaction with the EGFR in the association to wing size, was associated with human height, further corroborating its role in growth control.

Evidence for an involvement in growth control from GWAS in both organisms and experimental support from validation in Drosophila corroborate a true biological function of these genes in the determination of body size. For those genes where no correlation with size or metabolism has been detected in humans our validation of novel candidates playing a role in growth control in Drosophila can provide a basis for elucidating their function in humans. The study by Schertel et al. corroborates our findings for the role in growth control of some candidates, while at the same time providing a link to known growth pathways (Schertel et al. 2013). Furthermore, it highlights the importance of validating genes with overexpression, as redundancy often precludes some genes from having an effect upon knockdown.

5.4 Planar cell polarity (PCP) genes and growth control:

In our study for identifying loci underlying size variation in the Drosophila wing or body, we identified several genes with a role in PCP. I will thus briefly introduce the topic and discuss how these genes might affect variation in size.

Planar cell polarity (PCP) describes the polarity of cells within a plane in an epithelium and is mechanistically distinct from apico-basal polarity. The orientation of trichomes on the Drosophila wing, the hair orientation in the fur of mice or the cilia in the inner ear all depend on PCP signaling. Polarization is not restricted to epithelial cells forming such distinct outgrowths. Instead, PCP is ubiquitously important for proper development and plays a role in developmental processes such as cell migration, convergent extension (the elongation and narrowing of tissue by migration and intercalation of cells), neurogenesis, axonal guidance, and kidney morphogenesis. PCP signaling components are evolutionarily conserved and reactivation of PCP has been suggested to drive migration of malignant cells during invasion and metastasis and wound healing. Though there is much controversy about the number of systems involved in PCP establishment and about the detailed molecular mechanisms, it is clear that cell-cell interactions are crucial. Morphogen gradients have been implicated as the most upstream cues for establishing a polarity axis (Lawrence and Casal 2013, Matis and Axelrod 2013, Hatakeyama et al. 2014).

In the Drosophila wing, eye and abdomen, PCP establishment involves the Fat/Dachsous/Four-jointed (Ft/Ds/Fj) system and the Starry night/Frizzled/Van Gogh (Stan/Fz/Vang) system. Both systems require contacts between neighboring cells, either through the formation of heterodimeric Ft/Ds bridges or of Stan/Stan homodimers. Within the cell, core PCP components become localized to the proximal and distal sides, where they form complexes that interact with the opposing complex in the neighboring cell. Directional information is necessary for the proper orientation of the core modules and is communicated by opposing Fj and Ds gradients across the tissue. In Drosophila and some vertebrates it has been shown that Wnt family proteins are involved in providing a global cue for polarization, and Wg and Wnt4 control the establishment of a polarity axis along their gradient by interfering with Fz/Vang association (Wu J et al. 2013). PCP regulation in other tissues may involve different proteins. For example, in the elongation of the Drosophila egg chamber the Ft homolog Fat2 and the protein tyrosine phosphatase Lar are the main mediators of polarity cues.

Formation of a Fj/Ds gradient across the tissue provides signals for both growth control via the Hippo pathway and PCP: the direction of the gradient provides cues for PCP, while its slope serves as regulatory input for Hippo pathway activation. Fj acts by phosphorylating Ft, which enhances its affinity for Ds and thereby promotes the formation of Ft/Ds heterodimers. Fj is a transmembrane protein localized in the Golgi and phosphorylates both Ft and Ds when they pass through the Golgi. The output of the Fj/Ds gradient across the tissue is the formation of Ft/Ds heterodimers in a subcellular gradient, which conveys information for the orientation of polarization. In the wing, Dpp and Wg gradients shape the Fj/Ds gradients, whereas in eye discs JAK/STAT and Notch signaling are additionally required (Lawrence and Casal 2013, Matis and Axelrod 2013). During morphogenesis, polarization of the tissue is important for the proper orientation of the mitotic spindle and thus of cell division, which influences the final shape of the organ. In the Drosophila wing, division occurs preferentially along the PD axis, leading to elongated wings, and depends on Ft, Ds and the atypical myosin Dachs. Mechanistically, oriented cell division involves motor proteins that can orient the spindle apparatus (Baena-Lopez et al. 2005, Segalen and Bellaiche 2009, Mao et al. 2011).

PCP components we identified in our study comprise Wnt4, Lar, aPKC, Fj, Fz, stan, kermit and the microtubule motor proteins Dhc64C and Khc-73, whose human ortholog is significantly associated with height. Kermit was identified as a candidate both in the VEGAS method for IODF and as an interactor with EGFR in the pairwise association to CSFIC, and has been found to play a role in the establishment of planar cell polarity (PCP) in the wing downstream of Fz and the G-protein Go (Lin and Katanaev 2013). As EGFR has been shown to act in a combinatorial manner with Fz signaling in the establishment of PCP in the Drosophila eye (Weber et al. 2008), it is plausible that the pairwise interaction we identify is real. Interestingly, Lin and Katanaev propose that Kermit controls whether the transmembrane protein Van Gogh (Vang) is transported by myosin on actin fibers, away from the apical surface and leading to PCP defects, or by dynein and kinesin on microtubule fibers in the apical plane. Kermit overexpression promotes Vang localization to actin fibers and thus PCP defects, an effect that is worsened by the reduction of the microtubule motor proteins dynein (Dhc64C) and kinesin (Khc). Among our candidates we also found a kinesin heavy chain protein (Khc-73) and a dynein heavy chain protein (Dhc64C), and knockdown of Khc-73 resulted in a 3% decrease in wing size in males. It is thus possible that Khc-73 and Dhc64C are involved in both PCP and growth control, a dual role that has been shown for many known growth genes such as EGFR, Fat, Dachsous, Four-jointed, Frizzled, Crumbs, aPKC and Lgl, which often affect the two processes through separate mechanisms, but sometimes in a coordinated manner (Povelones et al. 2005, Parsons et al. 2010, Hatakeyama et al. 2014). Fz, by acting as a receptor for the morphogen Wg, regulates processes involved in tissue growth and cell fate specification as well as PCP. The phenotypic strength of several different fz alleles correlated well for readouts of both pathways, which suggests that, although different downstream players are involved, at least one common mechanism regulates these two distinct functions (Povelones et al. 2005). However, Kermit and the motor proteins act further downstream in the cascade that establishes PCP and thus likely have specialized roles in this process. Nevertheless, this can have an impact on growth, as proper establishment of polarity is important for growth by providing the orientation of cell division, and loss of the Vang ortholog Vangl2 in zebrafish leads to a reduction in body length (Hatakeyama et al. 2014). Kermit, Khc-73 and Dhc64C could thus affect growth via their role in PCP. However, for Kermit and Dhc64C it remains to be seen whether they show an effect on wing size upon knockdown.

One reason why we identify so many PCP genes could be the phenotype we chose for measuring wing size. Centroid size is the sum of the distances of 14 landmarks on the wing from the center of the wing and thus also reflects some aspects of wing shape. Though it correlates well with wing area, the correlation is not perfect. The distances between landmarks can change through subtle changes in the orientation of cell division, so the identification of PCP genes may be a consequence of how we defined wing size. This could be clarified by performing GWAS for wing area or for wing length and width, and comparing the number of PCP candidates identified for each of these phenotypes.
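
The difference between centroid size and area can be illustrated with a minimal Python sketch. It is an illustration only and not the measurement pipeline used in this study: the landmark coordinates are invented (four landmarks instead of 14), and the function name is hypothetical. For comparison it returns both the sum of landmark-to-centroid distances described above and the root of the summed squared distances, which is the definition commonly used in geometric morphometrics.

```python
import math

def centroid_sizes(landmarks):
    """Return (sum of distances, root of summed squared distances)
    of 2D landmarks from their centroid."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in landmarks]
    return sum(dists), math.sqrt(sum(d * d for d in dists))

# Hypothetical outlines with identical area (8 units) but different shape.
compact = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
elongated = [(0.0, 0.0), (8.0, 0.0), (8.0, 1.0), (0.0, 1.0)]

print(centroid_sizes(compact))
print(centroid_sizes(elongated))
```

In this toy example the elongated outline has the larger centroid size under both definitions although the areas are equal, which is the kind of shape dependence discussed above.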

5.5 Metabolism and growth control:

Among our candidates we find the mitochondrial protein Cep89 as a novel regulator of wing size in Drosophila. Cep89 is a highly conserved gene that has been shown to play a role in mitochondrial metabolism and is required for neuronal function in Drosophila and humans (van Bon et al. 2013). A missense SNP with a negative effect size was associated with all wing phenotypes except rCSM and is located in the second exon, providing a putative mechanistic cause for the phenotypic effect. Cep89 is required for proper complex IV formation, and Cep89 loss of function leads to decreased complex IV activity in humans and flies. Muscle biopsy of a patient with a homozygous deletion of exons 15-19 of CEP89 revealed mitochondrial dysfunction characterized by a decrease in the rate of ATP production, a decreased oxidation rate of 14C-pyruvate and complex IV deficiency. Phenotypically, apart from myopathy (muscular weakness due to nonfunctional muscle fibers), the patient manifested intellectual disability, small stature (<3rd percentile) but above average weight, and some morphological malformations with increasing age. Complete ubiquitous (act-Gal4) Cep89 knockdown in flies was late pupal lethal, and mitochondrial fractions from these pupae showed a >50% decrease in complex IV activity, whereas partial knockdown yielded survivors in numbers that depended on knockdown efficiency. Strong muscle-specific Cep89 knockdown was also lethal but resulted in phenotypically normal flies when knockdown was mild, indicating a dose- or threshold-dependent effect of Cep89 activity. Along these lines, neuronal knockdown showed dose-dependent phenotypes, ranging from lethality to motor defects accompanied by a significant decrease in presynaptic active zones, the areas of the synapse where neurotransmitter release occurs. Furthermore, neuronal Cep89 knockdown led to an overall decrease in larval size and a decrease in the size of muscles.
Notably, wing-specific knockdown with the strongest allele under the control of the MS1096-Gal4 driver resulted in a dramatic reduction in wing size, while eye-specific knockdown with GMR-Gal4 led to collapsed ommatidia. In contrast to the size reduction observed by van Bon et al. with the strong knockdown allele, we observed an increase in wing size with an allele of lower knockdown efficiency. Potential explanations for this discrepancy include the use of different RNAi lines, different Gal4 drivers and different rearing temperatures. We only tested the weaker of the alleles, VDRC line GD 24240, which van Bon et al. used for knockdown in other tissues but not in the wing. Furthermore, we induced RNAi using a nubbin-Gal4 driver line, whereas they used MS1096-Gal4. Both lines provide UAS-transgene activation in the wing pouch, but MS1096-Gal4 is inserted into the Bx locus, so these two drivers have different activity patterns. Furthermore, all our experiments were performed at 25°C whereas van Bon et al. reared flies at 28°C. The difference could also lie in the choice of the control, but we tested knockdown against another control line, lacZ-RNAi, which also resulted in flies with significantly smaller wings than Cep89GD24240 knockdown flies. Given the proposed function of Cep89 and the phenotype manifested in humans, the phenotype observed by van Bon et al. is what would be expected from knockdown, whereas our observed size increase, albeit mild, is hard to explain in the context of the gene's function. Furthermore, flies with this SNP had on average smaller wings, as is evident from the negative effect size. Characterization of the effects of an allelic series on wing size and mitochondrial integrity during development would be the best approach to reconcile these results.

In addition to Cep89 we identify several other candidates with putative roles in metabolism. Among these are the amino acid transporter Eaat1 and the mitochondrial transmembrane transporters Shawn and Tyler. Fbp2, which contains a missense SNP and causes a mild increase in wing size upon knockdown, is involved in glycolysis and in fatty acid and tyrosine metabolism. CG3011 localizes to the mitochondrion and may have a role in amino acid metabolism, while dnc and Gycbeta100B are predicted to be involved in purine metabolism. Cht7 has a predicted role in amino sugar metabolism, CG6084 (mild decrease in wing size upon knockdown) may be involved in carbohydrate and glycerolipid metabolism, and CG31028, another gene containing a missense SNP, in ROS metabolic processes. Corroborating their role in growth or metabolism, the orthologs of Eaat1 and dnc are associated with metabolite levels and obesity-related traits in humans, and the Fbp2 ortholog is associated with pathological overgrowth of the long bones.

For a phenotype such as growth, which essentially comes down to how much energy and how many biosynthetic precursors are available for cellular growth, it is likely that metabolism and metabolic coordination have to be taken into account to mechanistically understand variability in size. Growth is ultimately based on cellular growth, which depends on the availability of building blocks for molecules and macromolecules. Amino acids serve as building blocks for proteins, which then act in signaling pathways to maintain growth or assemble into ribosomes to produce more protein. Lipids are necessary to make organelles and more cell membrane, enabling the cell to expand, which is a prerequisite for cell division, and are often components of important signaling and cell surface molecules. At the very base of everything are the nucleotides. They are necessary to produce RNAs, which are important growth regulators themselves and are needed to form proteins, either as components of the ribosome or as the intermediate between DNA and protein, and to copy the genetic information, enabling the cell to divide and the tissue to expand. The availability of these building blocks is dependent on metabolism. The oxidation of glucose to pyruvate during glycolysis, the decarboxylation of pyruvate to acetyl-CoA and the subsequent oxidation of acetyl-CoA to CO2 and water in the TCA cycle in the mitochondria produce precursor metabolites for amino acid and lipid synthesis, like pyruvate, α-ketoglutarate and oxaloacetate, but also NADH molecules. NADH can be oxidized to NAD+ during oxidative phosphorylation via the electron transport chain, which involves protein complexes I to IV, and the resulting proton gradient across the inner mitochondrial membrane can be used to generate ATP via ATP synthase (complex V). Alternatively, cells can generate ATP through glycolysis alone and convert pyruvate to lactate, a strategy thought to have evolved for generating energy in the absence of O2, which is used by cancer cells even in the presence of O2 (aerobic glycolysis; Gatenby and Gillies 2004). The pentose phosphate pathway also oxidizes glucose, but its primary role is the generation of precursors for aromatic amino acid and de novo nucleotide synthesis (erythrose-4-phosphate (E4P) and ribose-5-phosphate (R5P), respectively). Nucleotides can further be synthesized from intermediates of RNA and DNA degradation, a process called nucleotide salvage. Finally, anaplerotic reactions have to occur to replenish the TCA cycle and coordinate anabolism and catabolism (Voet, Voet and Pratt 2002). Depending on the flux through these pathways, more energy or more precursors are produced, so metabolic control and metabolic flux clearly need to be taken into account when trying to understand how organisms regulate growth. A very recent publication illustrates this and shows a mechanism for coupling metabolism to growth via the growth and planar cell polarity (PCP) regulator Fat (Sing et al. 2014). Fat regulates growth via the Hippo pathway and is involved in PCP formation via Ds and other proteins. Sing et al. found that a C-terminal proteolytic product of Fat regulates mitochondrial metabolism in Drosophila.
Ft knockdown leads to morphological defects of mitochondria and enhanced levels of reactive oxygen species (ROS) and lactate, and is accompanied by increased glycolysis and reduced complex I activity despite oxygen being present. Mechanistically, the effect of Fat on mitochondrial metabolism occurs through its cleavage. The resulting soluble fragment is imported into mitochondria, where it binds to components of complexes I and V, and Ft interaction with the complex I component NADH dehydrogenase ubiquinone flavoprotein 2 (Ndufv2) has a stabilizing effect on the protein complex. While Ft cleavage probably blocks its growth inhibitory effect through the Hippo pathway, it enhances complex I (and V) assembly in the mitochondria, which results in a net increase in oxidative phosphorylation in a dose-dependent manner. Fat cleavage thus presents a switch for adjusting metabolism to changing energy and precursor requirements. The authors propose that the switch to aerobic glycolysis in ft mutant cells mechanistically underlies the overgrowth observed in Hippo pathway mutants. Support for this hypothesis comes from the observation that knockdown of other mitochondrial components causes PCP defects and affects Hippo pathway activity. However, as the Ndufv2 binding domain of Ft is not necessary for its function in growth control and PCP, direct regulation of these processes by Fat occurs via separate mechanisms (Baker and Jenny 2014, Sing et al. 2014).

This novel perspective on the direct role of a bona fide growth and polarity gene in metabolic control, together with the evidence for causal effects of mitochondrial proteins on growth control, provides exciting new insights into the coordination of growth and metabolism. It will be interesting to see whether other such components can be identified, and the characterization of the role of our above-mentioned novel candidates in growth control may provide further insight into how this coordination is achieved. However, as metabolic networks are highly interconnected and coregulated, it is challenging to comprehend how changes at single network nodes affect metabolism in its entirety. Even in E. coli, where extensive metabolic research has been performed, metabolic control and coordination is a field where many controversies and uncertainties persist. Nevertheless, defining regulatory modules with a specific input and output, rather than considering single steps, makes it possible to study the mechanisms that feed into these modules and how they affect output (Chubukov et al. 2014).

5.6 Conclusions and outlook:

In summary, this study has shown that genome-wide association studies for developmental traits in Drosophila can identify novel regulators of size and may shed light on interactions between these and known genes during growth control. The identification of loci that are modulated to create variability in size in a population provides a first step towards a systems-level understanding of size determination. Our finding that largely novel loci rather than canonical growth pathway genes underlie size variation underlines the complementarity of the GWAS approach to the classical genetics approach and highlights the necessity to probe natural variants. The newly identified loci provide a novel perspective on processes like PCP and metabolism, which have so far been neglected in relation to growth control, and imply a larger role for these processes in governing growth than previously thought. Furthermore, our results highlight the importance of intergenic noncoding elements and regulatory modules in creating size variability in a population and encourage more efforts towards the investigation of regulatory rather than coding mutations for understanding how phenotypic variability is achieved. Finally, the high conservation of basic developmental processes allows findings from Drosophila to serve as a basis for hypothesis-driven investigation of the physiological function and role in disease of orthologous genes in humans.

Reflecting on our results, the identification of many validated and putative novel regulators of growth, the substantial number of novel regulators identified by Schertel et al., and the large number of genes already known to affect size, it seems that the field of growth control faces the same problems as that of metabolic control. Instead of clarifying our picture of growth control and explaining the effects of alterations in single genes in a broader context, the identification of ever more growth regulators raises new questions and introduces more uncertainty about how all these loci interact to govern growth. It is reasonable to assume that large parts of the genome affect growth, as perturbations in all basic cellular processes may have an impact on cell or organ size. Even the growth of a single cell involves a multitude of processes, such as endocytosis, cytoskeletal organization, trafficking, cell cycle control and metabolism. It is thus logical that knockdown or overexpression of components governing these processes may yield a growth phenotype, which does not imply that the gene in question is involved in varying growth in a natural context. Such experiments are of course crucial for elucidating the physiological function of genes and placing them within the context of the cellular machinery, but it should not be concluded that genes showing a growth phenotype upon genetic manipulation are ultimately relevant for creating variability in size. Along these lines, it may not be essential to know which genes can affect growth, but rather which genes do so in a natural context.

Adopting a strategy analogous to those used in metabolic regulation, we should aim at identifying developmental modules and defining their inputs and the corresponding spectrum of outputs. This reduces the highly multidimensional problem of growth control to fewer modules that each have a defined range of inputs and outputs. Considering the genes in a module together helps to identify biologically relevant consequences of mutations, as the effect on module output, that is, signaling activity through a pathway, is what is ultimately relevant. Genetic variation is the most upstream intrinsic input into a biological system. The genetic variants are translated into effects on the levels, activity and functionality of a multitude of proteins and noncoding regulatory molecules, which produces a cellular or organismal phenotype. This interaction is not static but highly dynamic and nonlinear, involving crosstalk within and between the levels of genome, transcriptome, proteome and metabolome, in response to changing environments and during the course of growth and division of a cell or development of an organism. It will thus be necessary to form and characterize not only genetic modules, but modules across the levels of intermediates leading to phenotypic change (Schadt et al. 2005). Interactions between genetic loci ultimately converge on cellular processes that each receive input from many signaling pathways. Unless the effects of all variants are considered together, it is not clear whether the effect of one variant is ultimately relevant for the output of the signaling pathway it acts in, and thus for the cellular processes governing growth on which this pathway converges. The effects on the cellular processes are the ultimately relevant characteristics that predict the phenotype and phenotypic change. Understanding how information flows from genetic loci through modules of transcripts and proteins, which modules converge on which processes, and how changing the genetic input affects the output on the cellular process will ultimately be a more viable strategy for understanding growth control as a whole than considering the effects of mutations in genes in isolation.

Further complicating the picture is the influence of the environment. Organisms have to face changing environments in nature and have to employ mechanisms that integrate information on nutritional status and biotic and abiotic factors with development. A failure to downregulate growth upon nutrient shortage is fatal and temperature changes affect the thermodynamics of cellular processes. It is thus necessary to define modules that respond to this information and feed it into the basic modules, which will require characterizing molecular and growth phenotypes in different environmental conditions.

The approach we propose for studying development has been proposed for the study of other multifactorial traits, for the same reasons mentioned above (Busser et al. 2008, Schadt 2009, Burga and Lehner 2013, Civelek and Lusis 2014, Marjoram et al. 2014). Such an approach is already applied in single cell systems and is partially successful in reflecting experimentally observed phenotypic effects (Karr et al. 2012, Gagneur et al. 2013). It will be immensely more difficult to implement for higher organisms because of cell-cell interactions, multiple tissues, the nervous system and hormonal signals, which require the integration of many more cell-extrinsic cues than in a single cell system. Nevertheless, in light of the evidence presented in the introduction and results part of this study and discussed above, we think it is crucial to adopt such an approach if we ever want to understand growth control in its entirety and be able to predict how genetic variability is translated into phenotypic variability while maintaining the function of the organism.

I will briefly outline below potential next steps to further analyze our results and an outlook to combining the GWAS data with data from the experimental evolution experiment from RESULTS chapter 4.3.2.

1. Cross-validation and heritability estimation in a different population. In the absence of functional validation possibilities, the strategy for validating GWAS associations in humans is the replication of associations in different populations. Though we have the opportunity to apply functional validation in Drosophila, replication of SNP associations in other populations has the advantage of showing whether the identified SNPs or underlying genes are generally relevant for creating size variability or are confined to our population, either because they are false positive associations or because the genetic architectures of the two populations are very different. Furthermore, estimation of the amount of heritability explained by the significant loci is better performed in a different population, as estimating this parameter in the same population in which the SNPs were identified leads to an upward-biased estimate. At least two other resources of sequenced inbred Drosophila lines exist (DSPR, King et al. 2012; DPGP, www.dpgp.org), and these lines could be rapidly phenotyped using the FlyCatwalk (described in RESULTS chapter 4.2).

2. Overlap of variants identified in the experimental evolution approach. In RESULTS chapter 4.3.2 we describe the artificial selection of Drosophila populations for big and small relative wing size and its phenotypic effects. Though not within the scope of this thesis, the populations will be sequenced and alleles present at different frequencies between high and low populations identified. As discussed in the introduction (chapter 3.8), experimental evolution experiments can be used to identify loci underlying a trait by applying selection for changes in the trait to outbred populations and sequencing the resulting selected populations. A consistent difference in the frequency of an allele between divergent populations indicates a role for this locus in controlling variation in the selected phenotype. We want to determine the overlap between SNPs identified by GWAS and by this approach, thereby cross-validating the respective candidate loci and enhancing our power to distinguish truly causative loci from false positives in both approaches (Turner et al. 2011). Since we created the base population for selection from the DGRP lines, the selection process operated on the same genetic pool present in these lines. Furthermore, through the outbreeding process, alleles that were only present in few lines or were too rare to be genotyped, and could thus not be assessed by GWAS, can be probed with the selection approach and can become enriched during selection if they are linked to size determination and not detrimental. It will thus be interesting to see if we identify SNPs in more canonical growth pathway genes in this approach that were not probed before due to being too rare. Moreover, we established three replicate lines for each selection condition, and it will be interesting to see how different these are in terms of the enriched loci. Given the large number of loci that could affect size, it seems likely that sampling three times at random from the base population already resulted in very different genetic pools in the starting populations. Furthermore, as many processes and pathways affect growth, and these have to be coordinated among themselves, it is likely that different strategies for achieving a big wing will have been favored in different replicates. Along these lines, it will also be interesting to see how related these populations still are to each other, and whether populations experiencing the same selective pressure cluster together in terms of enriched loci, or whether those deriving from the same starting sample still do, which would imply high numbers of nonspecifically enriched SNPs.
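
The idea of a consistent allele frequency difference between divergent replicate populations can be sketched as follows. This is a deliberately simple illustration under stated assumptions, not the planned analysis: the SNP names and frequencies are invented, the function name and the 0.2 threshold are hypothetical, and a real analysis would use a statistical test that accounts for sampling noise and drift.

```python
def consistent_divergence(high_freqs, low_freqs, min_diff=0.2):
    """Return True if every high replicate differs from every low replicate
    in the same direction by at least min_diff (an arbitrary, assumed cutoff)."""
    diffs = [h - l for h in high_freqs for l in low_freqs]
    return all(d >= min_diff for d in diffs) or all(d <= -min_diff for d in diffs)

# Hypothetical allele frequencies in three high and three low replicate populations.
snps = {
    "snp_A": ([0.80, 0.75, 0.90], [0.20, 0.30, 0.25]),  # consistent across replicates
    "snp_B": ([0.80, 0.30, 0.55], [0.50, 0.45, 0.60]),  # replicate-specific change
}

candidates = [name for name, (hi, lo) in snps.items()
              if consistent_divergence(hi, lo)]
print(candidates)
```

Only the first SNP passes, because the second differs between high and low populations in one replicate only, the pattern expected for nonspecifically enriched alleles.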

3. Linking allele frequency differences to differences in metabolite clusters. In addition to pool-sequencing of generation 10 of the selection experiment, we will measure metabolite levels in wing discs of generation 11 third instar wandering larvae, the stage where most of the growth occurs in Drosophila (Mirth et al. 2005). By comparing metabolite levels in populations of flies with big wings and small wings we hope to identify metabolites or metabolite groups that are associated with a difference in wing size. Specifically, we are looking for single metabolites or clusters of metabolites that show consistently different levels between the large and small populations and similar levels within and among populations of one extreme. What we hope to learn from this is whether and where metabolism differs between small and large organs. If we identify such a cluster, we can then go further and search for SNPs that show the same patterns as these metabolites. Since we then know approximately where in metabolism something is changed, we can specifically look for polymorphisms in the enzymes involved in the corresponding reactions. Metabolic rate generally increases with decreasing size in the animal kingdom (Martin and Palumbi 1993), but the link between body size and metabolic rate within a species is not clear. We thus do not know if there is a difference in metabolic rate between flies of our selected populations, and it might be worthwhile to measure this parameter in the inbred lines created from the populations (see next section). A relatively easy method to achieve this at reasonable throughput has recently been published for Drosophila (Yatsenko et al. 2014).

4. Probing genotype, intermediate phenotype and growth in inbred lines derived from selected populations. Since the selected populations showed considerable phenotypic divergence between high and low lines in generation 10, we decided to create isofemale lines from these populations to maintain the genotypic combinations yielding differences in size. Comparing the genomes of different inbred lines derived from the same population might give us an indication of the level of heterogeneity in this population in generation 10. We will not only sequence genomes and measure the size of adult flies with the FlyCatwalk but additionally assay transcriptomes, proteomes and metabolomes of third instar wing discs in these lines in order to identify coregulated modules within each phenotypic layer and find correlations between layers and size. To this end, we will combine several analytic approaches. GWAS between each of the traits and the genome identifies loci associated with transcript, protein and metabolite levels and with size. To reduce the high dimensionality of the intermediate phenotypes we will additionally test principal components and biologically relevant clusters of these phenotypes for association with genetic variants and with size. Integration of such a multitude of data will be challenging and will require additional expertise from statisticians and computational biologists. We are aware that different strategies for the integration of such data exist. They range from very general correlation-based methods, which identify co-varying modules but cannot assign causality, through Boolean network models, which have directionality but limited regulatory possibilities (e.g. Krumsiek et al. 2011, Dallidis and Karafyllidis 2014), to global hierarchical state space models that allow modeling of hierarchical layers of modules and can incorporate complex regulatory interactions (e.g. Liu Z et al. 2014) or models that accommodate interactions between cells (e.g. Kozma and Puljic 2013). Alternatively, integrating several already available and more specialized models that each model only certain aspects of cellular behavior, analogous to the strategy used by Karr et al., might be most promising (Tøndel et al. 2011, Karr et al. 2012, Chew YH et al. 2014).

5. Studying intermediate phenotypes and size in stress adapted populations. Selection for small and large size is one strategy to modulate the genome and enrich for loci controlling size variation. Another approach lies in the adaptation of populations to variable environments followed by sequencing, molecular phenotyping and size determination. This approach would on the one hand reveal both general and specialized mechanisms by which organisms adapt to changing environments and integrate such cues with development, and on the other hand could be used to identify modules involved in relaying environmental effects on size. Obviously, organisms would have to exhibit variable adult sizes in different environments for us to be able to link the cellular and organ level processes to size. Possible environments include variable types (protein, carbohydrate, lipid) and severities of starvation, variable carbon sources, neuroactive substances such as ethanol, caffeine and cocaine, abiotic stresses like high osmolarity, oxidative stress and extreme temperatures, and biotic stresses like crowding.

6. MATERIALS AND METHODS

The materials and methods used in this study are essentially described in the Methods sections of the two manuscripts; only the methods for the Further Results section are given below.

Association analysis: Quantitative genetic analysis, phenotypic modeling and association analysis were performed as described in the methods of RESULTS chapter 4.1.

Base population for artificial selection: We created an outbred base population for artificial selection by two generations of round-robin mating of 176 DGRP lines followed by 12 generations of random mating. This process ensures that linkage between alleles is broken and that all alleles are introduced into the gene pool.
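A round-robin crossing scheme of this kind can be sketched as follows. The exact pairing used to build the base population is not specified here, so the offsets (1 in the first generation, 2 in the second) and the line identifiers are assumptions chosen only so that every second-generation cross combines four distinct founder lines.

```python
def round_robin_crosses(parents, offset):
    """Pair the dam at position i with the sire at position (i + offset) mod n."""
    n = len(parents)
    return [(parents[i], parents[(i + offset) % n]) for i in range(n)]


def founders(cross):
    """Return the set of founder lines contributing to a (possibly nested) cross."""
    if isinstance(cross, str):
        return {cross}
    dam, sire = cross
    return founders(dam) | founders(sire)


# Hypothetical identifiers for the 176 founder lines.
founder_lines = [f"line_{i:03d}" for i in range(176)]

# Generation 1: females of line i crossed to males of line i + 1.
generation_1 = round_robin_crosses(founder_lines, offset=1)

# Generation 2 (assumed offset of 2): F1 females of cross i crossed to
# F1 males of cross i + 2, so the two F1 families share no founder line.
generation_2 = round_robin_crosses(generation_1, offset=2)

print(len(generation_2), sorted(founders(generation_2[0])))
```

In this layout every line is used exactly once as dam and once as sire per generation; the subsequent random mating generations are not modeled.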

Establishment of selection lines: To establish selection lines, we measured wings and interocular distance (IOD) of 150 flies of each sex using the FlyCatwalk and selected the 60 with the biggest wings relative to IOD of each sex to establish the high selection lines and the 60 smallest to establish the low selection lines. To establish the corresponding control line we measured again 150 flies of each sex and chose 60 individuals randomly per sex. We repeated this three times to obtain three populations for each selection regime.

Selection design: In subsequent generations, selection was applied as follows. From each population we measured 150 flies per sex and kept the 60 with the biggest wings relative to IOD (high lines), the 60 with the smallest wings relative to IOD (low lines) or 60 random individuals (control lines). The sixty chosen individuals of each population and sex were then used as parents for the next generation. As we did not separate females from males upon hatching, the females had already been inseminated by random males. For this reason, we left flies to mate for one week, then transferred them to a new bottle and only used progeny from this second bottle for quantification. This process was repeated over 10 generations, and flies that were selected as parents were frozen in each generation after mating. We ensured that bottles were not crowded by restricting the mating time to one day.
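The per-generation selection step for one population and one sex can be illustrated with the sketch below. It assumes that "wing size relative to IOD" is scored as the simple ratio of wing size to IOD, which is an assumption made for illustration only; the measurement values are simulated, and the function and field names are hypothetical.

```python
import random


def relative_wing_size(fly):
    """Score a fly by wing size divided by interocular distance (assumed metric)."""
    return fly["wing"] / fly["iod"]


def select_parents(flies, regime, n_keep=60, rng=None):
    """Choose n_keep parents from the measured flies of one sex and population."""
    if regime == "control":
        rng = rng or random.Random()
        return rng.sample(flies, n_keep)  # random parents, no selection
    ranked = sorted(flies, key=relative_wing_size)
    if regime == "high":
        return ranked[-n_keep:]  # largest relative wing size
    if regime == "low":
        return ranked[:n_keep]   # smallest relative wing size
    raise ValueError(f"unknown selection regime: {regime!r}")


# Simulated measurements for 150 flies of one sex (arbitrary units).
rng = random.Random(1)
measured = [{"wing": rng.gauss(1.50, 0.05), "iod": rng.gauss(0.40, 0.01)}
            for _ in range(150)]

high = select_parents(measured, "high")
low = select_parents(measured, "low")
control = select_parents(measured, "control", rng=rng)
print(len(high), len(low), len(control))
```

With 150 measured flies and 60 parents per regime, the high and low parent groups cannot overlap, while control parents are drawn without regard to phenotype.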

Phenotypic measurements: All phenotypic measurements were performed using the FlyCatwalk described in RESULTS chapter 4.2.

7. REFERENCES

1000 Genomes Project Consortium, G. R. Abecasis, D. Altshuler, A. Auton, L. D. Brooks et al., 2010 A map of human genome variation from population-scale sequencing. Nature 467: 1061–1073. Aegerter-Wilmsen, T., C. M. Aegerter, E. Hafen, and K. Basler, 2007 Model for the regulation of size in the wing imaginal disc of Drosophila. Mechanisms of Development 124: 318–326. Aegerter-Wilmsen, T., M. B. Heimlicher, A. C. Smith, P. B. de Reuille, R. S. Smith et al., 2012 Integrating force-sensing and signaling pathways in a model for the regulation of wing imaginal disc size. Development (Cambridge, England) 139: 3221–3231. Aitman, T. J., C. Boone, G. A. Churchill, M. O. Hengartner, T. F. C. Mackay et al., 2011 The future of model organisms in human disease research. Nature reviews. Genetics 12: 575– 582. Aldea, M., E. Garí, and N. Colomina, 2007 Control of cell cycle and cell growth by molecular chaperones. Cell cycle (Georgetown, Tex.) 6: 2599–2603. Alexander, D. D., P. J. Mink, C. A. Cushing, and B. Sceurman, 2010 A review and meta- analysis of prospective studies of red and processed meat intake and prostate cancer. Nutrition journal 9: 50. Alexander, D. H., J. Novembre, and K. Lange, 2009 Fast model-based estimation of ancestry in unrelated individuals. Genome Research 19: 1655–1664. Allada, R., and B. Y. Chung, 2010 Circadian Organization of Behavior and Physiology in Drosophila. Annual Review of Physiology 72: 605–624. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, 1990 Basic local alignment search tool. Journal of molecular biology 215: 403–410. American Diabetes Association, 2009 Diagnosis and Classification of Diabetes Mellitus. Diabetes Care 32: S62–S67. Angilletta, M. J., T. D. Steury, and M. W. Sears, 2004 Temperature, growth rate, and body size in ectotherms: fitting pieces of a life-history puzzle. Integrative and comparative biology 44: 498–509. Anholt, R. R. H., and T. F. C. 
Mackay, 2004 Quantitative genetic analyses of complex behaviours in Drosophila. Nature reviews. Genetics 5: 838–849. Anholt, R. R. H., and T. F. C. Mackay, 2010 Principles of behavioral genetics. Elsevier Academic Press.

Arrese, E. L., and J. L. Soulages, 2010 Insect Fat Body: Energy, Metabolism, and Regulation. Annual Review of Entomology 55: 207–225. Arts, J., M. L. Fernandez, and I. E. Lofgren, 2014 Coronary Heart Disease Risk Factors in College Students. Advances in Nutrition: An International Review Journal 5: 177–187. Arya, G. H., A. L. Weber, P. Wang, M. M. Magwire, Y. L. S. Negron et al., 2010 Natural variation, functional pleiotropy and transcriptional contexts of odorant binding protein genes in Drosophila melanogaster. Genetics 186: 1475–1485. Ashton-Beaucage, D., C. M. Udell, P. Gendron, M. Sahmi, M. Lefrançois et al., 2014 A Functional Screen Reveals an Extensive Layer of Transcriptional and Splicing Control Underlying RAS/MAPK Signaling in Drosophila (H. J. Bellen, Ed.). PLoS biology 12: e1001809. Atkinson, D., 1994 Temperature and Organism Size—A Biological Law for Ectotherms?, pp. 1–58 in Advances in Ecological Research, Elsevier. Aune, D., T. Norat, P. Romundstad, and L. J. Vatten, 2013 Dairy products and the risk of type 2 diabetes: a systematic review and dose-response meta-analysis of cohort studies. American Journal of Clinical Nutrition 98: 1066–1083. Avruch, J., K. Hara, Y. Lin, M. Liu, X. Long et al., 2006 Insulin and amino-acid regulation of mTOR signaling and kinase activity through the Rheb GTPase. Oncogene 25: 6361–6372. Awasaki, T., and K. Kimura, 2001 Multiple function of poxn gene in larval PNS development and in adult appendage formation of Drosophila. Development genes and evolution 211: 20–29. Ayroles, J. F., M. A. Carbone, E. A. Stone, K. W. Jordan, R. F. Lyman et al., 2009 Systems genetics of complex traits in Drosophila melanogaster. Nature Genetics 41: 299–307. Azevedo, R., V. French, and L. Partridge, 2002 Temperature modulates epidermal cell size in Drosophila melanogaster. Journal of Insect Physiology 48: 231–237. Bader, R., L. Sarraf-Zadeh, M. Peters, N. Moderau, H.
Stocker et al., 2013 The IGFBP7 homolog Imp-L2 promotes insulin signaling in distinct neurons of the Drosophila brain. Journal of Cell Science 126: 2571–2576. Baena-López, L. A., A. Baonza, and A. Garcia-Bellido, 2005 The Orientation of Cell Divisions Determines the Shape of Drosophila Organs. Current Biology 15: 1640–1644. Baker, J., J. P. Liu, E. J. Robertson, and A. Efstratiadis, 1993 Role of insulin-like growth factors in embryonic and postnatal growth. Cell 75: 73–82. Baker, N. E., 2007 Patterning signals and proliferation in Drosophila imaginal discs. Current opinion in genetics & development 17: 287–293. Baker, N. E., and A. Jenny, 2014 Metabolism and the Other Fat: A Protocadherin in Mitochondria. Cell 158: 1240–1241. Baldwin-Brown, J. G., A. D. Long, and K. R. Thornton, 2014 The Power to Detect Quantitative Trait Loci Using Resequenced, Experimentally Evolved Populations of Diploid, Sexual Organisms. Molecular biology and evolution 31: 1040–1055.

Bao, W., F. B. Hu, S. Rong, Y. Rong, K. Bowers et al., 2013 Predicting Risk of Type 2 Diabetes Mellitus with Genetic Risk Models on the Basis of Established Genome-wide Association Markers: A Systematic Review. American Journal of Epidemiology 178: 1197–1207. Barrick, J. E., D. S. Yu, S. H. Yoon, H. Jeong, T. K. Oh et al., 2009 Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461: 1243–1247. Bateman, J. M., and H. McNeill, 2004 Temporal Control of Differentiation by the Insulin Receptor/Tor Pathway in Drosophila. Cell 119: 87–96. Batty, G. D., M. J. Shipley, D. Gunnell, R. Huxley, M. Kivimaki et al., 2009 Height, wealth, and health: An overview with new data from three longitudinal studies. Economics & Human Biology 7: 137–152. Baumgartner, R., I. Poernbacher, N. Buser, E. Hafen, and H. Stocker, 2010 The WW Domain Protein Kibra Acts Upstream of Hippo in Drosophila. Developmental Cell 18: 309–316. Beckingham, K. M., M. J. Texada, D. A. Baker, R. Munjaal, and J. D. Armstrong, 2005 Genetics of graviperception in animals. Advances in genetics 55: 105–145. Bejsovec, A., 2006 Flying at the head of the pack: Wnt biology in Drosophila. Oncogene 25: 7442–7449. Bellosta, P., and P. Gallant, 2010 Myc Function in Drosophila. Genes & Cancer 1: 542–546. Bennett, A. F., and B. S. Hughes, 2009 Microbial experimental evolution. American journal of physiology. Regulatory, integrative and comparative physiology 297: R17–25. Bennett, B. J., C. R. Farber, L. Orozco, H. Min Kang, A. Ghazalpour et al., 2010 A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Research 20: 281–290. Bennett, F. C., and K. F. Harvey, 2006 Fat cadherin modulates organ size in Drosophila via the Salvador/Warts/Hippo signaling pathway. Current Biology 16: 2101–2110. Beretta, L., A. C. Gingras, Y. V. Svitkin, M. N. Hall, and N. Sonenberg, 1996 Rapamycin blocks the phosphorylation of 4E-BP1 and inhibits cap-dependent initiation of translation. The EMBO journal 15: 658–664.
Bergelson, J., and F. Roux, 2010 Towards identifying genes underlying ecologically relevant traits in Arabidopsis thaliana. Nature reviews. Genetics 11: 867–879. Bi, P., T. Shan, W. Liu, F. Yue, X. Yang et al., 2014 Inhibition of Notch signaling promotes browning of white adipose tissue and ameliorates obesity. Nature Medicine 1–10. Bin Zhao, K. Tumaneng, and K.-L. Guan, 2011 The Hippo pathway in organ size control, tissue regeneration and stem cell self-renewal. Nature Cell Biology 13: 877–883. Birdsall, K., E. Zimmerman, K. Teeter, and G. Gibson, 1999 Genetic variation for the positioning of wing veins in Drosophila melanogaster. Evolution & development 2: 16–24.

Bodmer, W., and C. Bonilla, 2008 Common and rare variants in multifactorial susceptibility to common diseases. Nature Genetics 40: 695–701. Boedigheimer, M., and A. Laughon, 1993 Expanded: a gene involved in the control of cell proliferation in imaginal discs. Development (Cambridge, England) 118: 1291–1301. Boedigheimer, M. J., K. P. Nguyen, and P. J. Bryant, 1997 Expanded functions in the apical cell domain to regulate the growth rate of imaginal discs. Developmental genetics 20: 103–110. Bohni, R., J. Riesgo-Escovar, S. Oldham, W. Brogiolo, H. Stocker et al., 1999 Autonomous control of cell and organ size by CHICO, a Drosophila homolog of vertebrate IRS1-4. Cell 97: 865–875. Brand, A. H., and N. Perrimon, 1993 Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development (Cambridge, England) 118: 401–415. Brennecke, J., D. R. Hipfner, A. Stark, R. B. Russell, and S. M. Cohen, 2003 bantam encodes a developmentally regulated microRNA that controls cell proliferation and regulates the proapoptotic gene hid in Drosophila. Cell 113: 25–36. Britton, J. S., and B. A. Edgar, 1998 Environmental control of the cell cycle in Drosophila: nutrition activates mitotic and endoreplicative cells by distinct mechanisms. Development (Cambridge, England) 125: 2149–2158. Britton, J. S., W. K. Lockwood, L. Li, S. M. Cohen, and B. A. Edgar, 2002 Drosophila's insulin/PI3-kinase pathway coordinates cellular metabolism with nutritional conditions. Developmental Cell 2: 239–249. Brogiolo, W., H. Stocker, T. Ikeya, F. Rintelen, R. Fernandez et al., 2001 An evolutionarily conserved function of the Drosophila insulin receptor and insulin-like peptides in growth control. Current biology : CB 11: 213–221. Brunet, A., A. Bonni, M. J. Zigmond, M. Z. Lin, P. Juo et al., 1999 Akt promotes cell survival by phosphorylating and inhibiting a Forkhead transcription factor. Cell 96: 857–868. Bryant, P. J., and P.
Levinson, 1985 Intrinsic growth control in the imaginal primordia of Drosophila, and the autonomous action of a lethal mutation causing overgrowth. Developmental Biology 107: 355–363. Bryant, P. J., and P. Levinson, 2003 Intrinsic Growth Control in the Imaginal Primordia of Drosophila, and the Autonomous Action of a Lethal Mutation Causing Overgrowth. Developmental Biology 355–363. Buday, L., and J. Downward, 1993 Epidermal growth factor regulates p21 ras through the formation of a complex of receptor, Grb2 adapter protein, and Sos nucleotide exchange factor. Cell 73: 611–620.

Burga, A., and B. Lehner, 2013 Predicting phenotypic variation from genotypes, phenotypes and a combination of the two. Current Opinion in Biotechnology 24: 803–809. Burgering, B. M., and P. J. Coffer, 1995 Protein kinase B (c-Akt) in phosphatidylinositol-3-OH kinase signal transduction. Nature 376: 599–602. Burke, M. K., and M. R. Rose, 2009 Experimental evolution with Drosophila. American journal of physiology. Regulatory, integrative and comparative physiology 296: R1847–54. Burke, M. K., J. P. Dunham, P. Shahrestani, K. R. Thornton, M. R. Rose et al., 2010 Genome-wide analysis of a long-term evolution experiment with Drosophila. Nature 467: 587–590. Burks, D. J., J. Font de Mora, M. Schubert, D. J. Withers, M. G. Myers et al., 2000 IRS-2 pathways integrate female reproduction and energy homeostasis. Nature 407: 377–382. Busser, B. W., M. L. Bulyk, and A. M. Michelson, 2008 Toward a systems-level understanding of developmental regulatory networks. Current opinion in genetics & development 18: 521–529. Caldwell, P. E., M. Walkiewicz, and M. Stern, 2005 Ras Activity in the Drosophila Prothoracic Gland Regulates Body Size and Developmental Rate via Ecdysone Release. Current Biology 15: 1785–1795. Campbell, G., and A. Tomlinson, 1999 Transducing the Dpp morphogen gradient in the wing of Drosophila: regulation of Dpp targets by brinker. Cell 96: 553–562. Campos-Ortega, J. A., 1995 Genetic mechanisms of early neurogenesis in Drosophila melanogaster. Molecular neurobiology 10: 75–89. Cao, J., K. Schneeberger, S. Ossowski, T. Günther, S. Bender et al., 2011 Whole-genome sequencing of multiple Arabidopsis thaliana populations. Nature Genetics 43: 956–963. Carter, G. W., 2013 Inferring gene function and network organization in Drosophila signaling by combined analysis of pleiotropy and epistasis. G3 (Bethesda, Md.) 3: 807–814. Cavalli-Sforza, L. L., 2005 The Human Genome Diversity Project: past, present and future. Nature reviews. Genetics 6: 333–340.
Celniker, S. E., L. A. L. Dillon, M. B. Gerstein, K. C. Gunsalus, S. Henikoff et al., 2009 Unlocking the secrets of the genome. Nature 459: 927–930. Chan, S. J., and D. F. Steiner, 2000 Insulin through the ages: phylogeny of a growth promoting and metabolic regulatory hormone. Amer. Zool. 40: 213–222. Chen, C.-L., K. M. Gajewski, F. Hamaratoglu, W. Bossuyt, L. Sansores-Garcia et al., 2010 The apical-basal cell polarity determinant Crumbs regulates Hippo signaling in Drosophila. Proceedings of the National Academy of Sciences of the United States of America 107: 15810–15815.

Chen, J., Y. Zhang, and P. Shen, 2010 Protein kinase C deficiency-induced alcohol insensitivity and underlying cellular targets in Drosophila. Neuroscience 166: 34–39. Cheng, L. Y., A. P. Bailey, S. J. Leevers, T. J. Ragan, P. C. Driscoll et al., 2011 Anaplastic Lymphoma Kinase Spares Organ Growth during Nutrient Restriction in Drosophila. Cell 146: 435–447. Chew, Y. H., B. Wenden, A. Flis, V. Mengin, J. Taylor et al., 2014 Multiscale digital Arabidopsis predicts individual organ and whole-organism growth. Proceedings of the National Academy of Sciences. Chillakuri, C. R., D. Sheppard, S. M. Lea, and P. A. Handford, 2012 Notch receptor–ligand binding and activation: Insights from molecular studies. Seminars in Cell and Developmental Biology 23: 421–428. Cho, E., Y. Feng, C. Rauskolb, S. Maitra, R. Fehon et al., 2006 Delineation of a Fat tumor suppressor pathway. Nature Genetics 38: 1142–1150. Chong, H., H. G. Vikis, and K.-L. Guan, 2003 Mechanisms of regulating the Raf kinase family. Cellular Signalling 15: 463–469. Chubukov, V., L. Gerosa, K. Kochanowski, and U. Sauer, 2014 Coordination of microbial metabolism. Nature Reviews Microbiology 12: 327–340. Civelek, M., and A. J. Lusis, 2014 Systems genetics approaches to understand complex traits. Nature reviews. Genetics 15: 34–48. Clark, A. G., M. B. Eisen, D. R. Smith, C. M. Bergman, B. Oliver et al., 2007 Evolution of genes and genomes on the Drosophila phylogeny. Nature 450: 203–218. Claussen, M., B. Kübler, M. Wendland, K. Neifer, B. Schmidt et al., 1997 Proteolysis of insulin-like growth factors (IGF) and IGF binding proteins by cathepsin D. Endocrinology 138: 3797–3803. Cole, M. D., 1986 The myc oncogene: its role in transformation and differentiation. Annu. Rev. Genet. 20: 361–384. Colombani, J., D. S. Andersen, and P. Léopold, 2012 Secreted peptide Dilp8 coordinates Drosophila tissue growth with developmental timing. Science 336: 582–585. Colombani, J., L. Bianchini, S. Layalle, E. Pondeville, C.
Dauphin-Villemant et al., 2005 Antagonistic actions of ecdysone and insulins determine final size in Drosophila. Science 310: 667–670. Colombani, J., S. Raisin, S. Pantalacci, T. Radimerski, J. Montagne et al., 2003 A nutrient sensor mechanism controls Drosophila growth. Cell 114: 739–749. Cook, M., and M. Tyers, 2007 Size control goes global. Current Opinion in Biotechnology 18: 341–350. Cruchaga, C., C. M. Karch, S. C. Jin, B. A. Benitez, Y. Cai et al., 2014 Rare coding variants in the phospholipase D3 gene confer risk for Alzheimer's disease. Nature 505: 550–554. Dallidis, S., and I. Karafyllidis, 2014 Boolean Network Model of the Quorum Sensing Circuits. IEEE transactions on nanobioscience. De Moed, G. H., C. Kruitwagen, G. De Jong, and W. Scharloo, 1999 Critical weight for the induction of pupariation in Drosophila melanogaster: genetic and environmental variation. Journal of evolutionary biology 12: 852–858. De Virgilio, C., and R. Loewith, 2006 The TOR signalling network from yeast to man. The International Journal of Biochemistry & Cell Biology 38: 1476–1481. Delanoue, R., M. Slaidina, and P. Léopold, 2010 The steroid hormone ecdysone controls systemic growth by repressing dMyc function in Drosophila fat cells. Developmental Cell 18: 1012–1021. Demetriades, C., N. Doumpas, and A. A. Teleman, 2014 Regulation of TORC1 in Response to Amino Acid Starvation via Lysosomal Recruitment of TSC2. Cell 156: 786–799. Dennis, P. B., A. Jaeschke, M. Saitoh, B. Fowler, S. C. Kozma et al., 2001 Mammalian TOR: a homeostatic ATP sensor. Science 294: 1102–1105. DeStefano, M. A., and E. Jacinto, 2013 Regulation of insulin receptor substrate-1 by mTORC2 (mammalian target of rapamycin complex 2). Biochemical Society Transactions 41: 896–901. Di Cristofano, A., and P. P. Pandolfi, 2000 The Multiple Roles of PTEN in Tumor Suppression. Cell 100: 387–390. Diaz-Benjumea, F. J., and E. Hafen, 1994 The sevenless signalling cassette mediates Drosophila EGF receptor function during epidermal development. Development (Cambridge, England) 120: 569–578. Didichenko, S. A., B. Tilton, B. A. Hemmings, K. Ballmer-Hofer, and M. Thelen, 1996 Constitutive activation of protein kinase B and phosphorylation of p47phox by a membrane-targeted phosphoinositide 3-kinase. Current biology : CB 6: 1271–1278. Dierick, H. A., and R. J. Greenspan, 2006 Molecular analysis of flies selected for aggressive behavior. Nature Genetics 38: 1023–1031. Doe, C. Q., and C. S.
Goodman, 1985 Early events in insect neurogenesis. II. The role of cell interactions and cell lineage in the determination of neuronal precursor cells. Developmental Biology 111: 206–219. Dong, J., G. Feldmann, J. Huang, S. Wu, N. Zhang et al., 2007 Elucidation of a Universal Size-Control Mechanism in Drosophila and Mammals. Cell 130: 1120–1133. Doroquez, D. B., and I. Rebay, 2006 Signal Integration During Development: Mechanisms of EGFR and Notch Pathway Function and Cross-Talk. Critical Reviews in Biochemistry and Molecular Biology 41: 339–385.

Du, M., P. L. Auer, S. Jiao, J. Haessler, D. Altshuler et al., 2014 Whole-exome imputation of sequence variants identified two novel alleles associated with adult body height in African Americans. Human Molecular Genetics. Duan, C., 2002 Specifying the cellular responses to IGF signals: roles of IGF-binding proteins. The Journal of endocrinology 175: 41–54. Düvel, K., J. L. Yecies, S. Menon, P. Raman, A. I. Lipovsky et al., 2010 Activation of a Metabolic Gene Regulatory Network Downstream of mTOR Complex 1. Molecular Cell 39: 171–183.

Durham, M. F., M. M. Magwire, E. A. Stone, and J. Leips, 2014 Genome-wide analysis in Drosophila reveals age-specific effects of SNPs on fitness traits. Nature Communications 5, p. 4338.

Echave, P., I. J. Conlon, and A. C. Lloyd, 2007 Cell size regulation in mammalian cells. Cell cycle (Georgetown, Tex.) 6: 218–224. Edgar, B. A., 2006 How flies get their size: genetics meets physiology. Nature reviews. Genetics 7: 907–916. Edinger, A. L., and C. B. Thompson, 2002 Akt maintains cell size and survival by increasing mTOR-dependent nutrient uptake. Molecular biology of the cell 13: 2276–2288. Edwards, A. C., S. M. Rollmann, T. J. Morgan, and T. F. C. Mackay, 2006 Quantitative genomics of aggressive behavior in Drosophila melanogaster. PLoS Genetics 2: e154. Efstratiadis, A., 1998 Genetics of mouse growth. The International Journal of Developmental Biology 42: 955–976. Eilers, M., and R. N. Eisenman, 2008 Myc's broad reach. Genes & development 22: 2755–2766. Elena, S. F., and R. E. Lenski, 2003 Microbial genetics: Evolution experiments with microorganisms: the dynamics and genetic bases of adaptation. Nature reviews. Genetics 4: 457–469. Emily, M., T. Mailund, J. Hein, L. Schauser, and M. H. Schierup, 2009 Using biological networks to search for interacting loci in genome-wide association studies. European journal of human genetics : EJHG 17: 1231–1240. ENCODE Project Consortium, 2012 An integrated encyclopedia of DNA elements in the human genome. Nature 489: 57–74. Engelman, J. A., J. Luo, and L. C. Cantley, 2006 The evolution of phosphatidylinositol 3-kinases as regulators of growth and metabolism. Nature reviews. Genetics 7: 606–619. Ernst, J., P. Kheradpour, T. S. Mikkelsen, N. Shoresh, L. D. Ward et al., 2011 Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473: 43–49. Falconer, D. S., and T. F. C. Mackay, 1996 Introduction to Quantitative Genetics (Edition 4). Longmans Green, Harlow, Essex, UK.

Feng, Y., and K. D. Irvine, 2007 Fat and expanded act in parallel to regulate growth through warts. Proceedings of the National Academy of Sciences 104: 20362–20367. Feng, Y., and K. D. Irvine, 2009 Processing and phosphorylation of the Fat receptor. Proceedings of the National Academy of Sciences 106: 11989–11994. Fiegna, F., Y.-T. N. Yu, S. V. Kadam, and G. J. Velicer, 2006 Evolution of an obligate social cheater to a superior cooperator. Nature 441: 310–314. Flint, J., and E. Eskin, 2012 Genome-wide association studies in mice. Nature reviews. Genetics 13: 807–817. Fontana, L., L. Partridge, and V. D. Longo, 2010 Extending healthy life span--from yeast to humans. Science 328: 321–326. Fortini, M. E., 2009 Notch Signaling: The Core Pathway and Its Posttranslational Regulation. Developmental Cell 16: 633–647. Franceschini, A., D. Szklarczyk, S. Frankild, M. Kuhn, M. Simonovic et al., 2012 STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic acids research 41: D808–D815. Franke, T. F., S. I. Yang, T. O. Chan, K. Datta, A. Kazlauskas et al., 1995 The protein kinase encoded by the Akt proto-oncogene is a target of the PDGF-activated phosphatidylinositol 3-kinase. Cell 81: 727–736. Frayling, T. M., N. J. Timpson, M. N. Weedon, E. Zeggini, R. M. Freathy et al., 2007 A Common Variant in the FTO Gene Is Associated with Body Mass Index and Predisposes to Childhood and Adult Obesity. Science 316: 889–894. Freedman, M. L., D. Reich, K. L. Penney, G. J. McDonald, A. A. Mignault et al., 2004 Assessing the impact of population stratification on genetic association studies. Nature Genetics 36: 388–393. French, V., M. Feast, and L. Partridge, 1998 Body size and cell size in Drosophila: the developmental response to temperature. Journal of Insect Physiology 44: 1081–1089. Fuller, R. C., C. F. Baer, and J. Travis, 2005 How and When Selection Experiments Might Actually be Useful. Integrative and comparative biology 45: 391–404. Gabay, L., H.
Scholz, M. Golembo, A. Klaes, B. Z. Shilo et al., 1996 EGF receptor signaling induces pointed P1 transcription and inactivates Yan protein in the Drosophila embryonic ventral ectoderm. Development (Cambridge, England) 122: 3355–3362. Gagneur, J., O. Stegle, C. Zhu, P. Jakob, M. M. Tekkedil et al., 2013 Genotype-Environment Interactions Reveal Causal Pathways That Mediate Genetic Effects on Phenotype (J. C. Fay, Ed.). PLoS Genetics 9: e1003803. Gao, X., T. P. Neufeld, and D. Pan, 2000 Drosophila PTEN Regulates Cell Growth and Proliferation through PI3K-Dependent and -Independent Pathways. Developmental Biology 221: 404–418. Gao, X., Y. Zhang, P. Arrazola, O. Hino, T. Kobayashi et al., 2002 Tsc tumour suppressor proteins antagonize amino-acid–TOR signalling. Nature Cell Biology 4: 699–704. Garami, A., F. J. T. Zwartkruis, T. Nobukuni, M. Joaquin, M. Roccio et al., 2003 Insulin Activation of Rheb, a Mediator of mTOR/S6K/4E-BP Signaling, Is Inhibited by TSC1 and 2. Molecular Cell 11: 1457–1466. García-Gámez, E., B. Gutiérrez-Gil, G. Sahana, J.-P. Sánchez, Y. Bayón et al., 2012 GWA Analysis for Milk Production Traits in Dairy Sheep and Genetic Support for a QTN Influencing Milk Protein Percentage in the LALBA Gene (J. C. Nelson, Ed.). PLoS ONE 7: e47782. Garoia, F., D. Grifoni, V. Trotta, D. Guerra, M. C. Pezzoli et al., 2005 The tumor suppressor gene fat modulates the EGFR-mediated proliferation control in the imaginal tissues of Drosophila melanogaster. Mechanisms of Development 122: 175–187. Gatenby, R. A., and R. J. Gillies, 2004 Why do cancers have high aerobic glycolysis? Nature Reviews Cancer 4: 891–899. Genevet, A., M. C. Wehr, R. Brain, B. J. Thompson, and N. Tapon, 2010 Kibra is a regulator of the Salvador/Warts/Hippo signaling network. Developmental Cell 18: 300–308. Gerber, M., 2012 Omega-3 fatty acids and cancers: a systematic update review of epidemiological studies. British Journal of Nutrition 107: S228–S239. Gershman, B., O. Puig, L. Hang, R. M. Peitzsch, M. Tatar et al., 2007 High-resolution dynamics of the transcriptional response to nutrition in Drosophila: a key role for dFOXO. Physiological genomics 29: 24–34. Géminard, C., E. J. Rulifson, and P. Léopold, 2009 Remote Control of Insulin Secretion by Fat Cells in Drosophila. Cell Metabolism 10: 199–207. Ghezzi, A., B. J. Liebeskind, A. Thompson, N. S. Atkinson, and H. H.
Zakon, 2014 Ancient association between cation leak channels and Mid1 proteins is conserved in fungi and animals. Frontiers in molecular neuroscience 7:. Ghosh, S. M., N. D. Testa, and A. W. Shingleton, 2013 Temperature-size rule is mediated by thermal plasticity of critical size in Drosophila melanogaster. Proceedings of the Royal Society B: Biological Sciences 280: 20130174–20130174. GIBSON, G., 2011 Rare and common variants: twenty arguments. Nature reviews. Genetics 13: 135–145. Goberdhan, D. C. I., and C. Wilson, 2003 The functions of insulin signaling: size isn't everything, even in Drosophila. Differentiation 71: 375–397.

Gockel, J., S. J. W. Robinson, W. J. Kennington, D. B. Goldstein, and L. Partridge, 2002 Quantitative genetic analysis of natural variation in body size in Drosophila melanogaster. Heredity 89: 145–153. Grewal, S. S., 2009 Insulin/TOR signaling in growth and homeostasis: A view from the fly world. The International Journal of Biochemistry & Cell Biology 41: 1006–1010. Grewal, S. S., J. R. Evans, and B. A. Edgar, 2007 Drosophila TIF-IA is required for ribosome synthesis and cell growth and is regulated by the TOR pathway. The Journal of cell biology 179: 1105–1113. Grewal, S. S., L. Li, A. Orian, R. N. Eisenman, and B. A. Edgar, 2005 Myc-dependent regulation of ribosomal RNA synthesis during Drosophila development. Nature Cell Biology 7: 295–302. Gronke, S., D.-F. Clarke, S. Broughton, T. D. Andrews, and L. Partridge, 2010 Molecular Evolution and Functional Characterization of Drosophila Insulin-Like Peptides (E. Rulifson, Ed.). PLoS Genetics 6: e1000857. Grubbs, N., M. Leach, X. Su, T. Petrisko, J. B. Rosario et al., 2013 New Components of Drosophila Leg Development Identified through Genome Wide Association Studies (E. E. Schmidt, Ed.). PLoS ONE 8: e60261. Grzeschik, N. A., L. M. Parsons, M. L. Allott, K. F. Harvey, and H. E. Richardson, 2010 Lgl, aPKC, and Crumbs Regulate the Salvador/Warts/Hippo Pathway through Two Distinct Mechanisms. Current Biology 20: 573–581. Guertin, D. A., K. V. P. Guntur, G. W. Bell, C. C. Thoreen, and D. M. Sabatini, 2006 Functional Genomics Identifies TOR-Regulated Genes that Control Growth and Division. Current Biology 16: 958–970. Guler, H. P., J. Zapf, E. Scheiwiller, and E. R. Froesch, 1988 Recombinant human insulin-like growth factor I stimulates growth and has distinct effects on organ size in hypophysectomized rats. Proceedings of the National Academy of Sciences of the United States of America 85: 4889–4893.
Guo, S., 2014 Insulin signaling, resistance, and the metabolic syndrome: insights from mouse models into disease mechanisms. Journal of Endocrinology 220: T1–T23. Gwinn, D. M., D. B. Shackelford, D. F. Egan, M. M. Mihaylova, A. Mery et al., 2008 AMPK Phosphorylation of Raptor Mediates a Metabolic Checkpoint. Molecular Cell 30: 214–226. Hadler, N. M., 1964 Heritability and phototaxis in Drosophila melanogaster. Genetics 50: 1269–1277. Hamaratoglu, F., M. Affolter, and G. Pyrowolakis, 2014 Dpp/BMP signaling in flies: From molecules to biology. Seminars in Cell and Developmental Biology 32: 128–136. Hamaratoglu, F., M. Willecke, M. Kango-Singh, R. Nolo, E. Hyun et al., 2006 The tumour-suppressor genes NF2/Merlin and Expanded act through Hippo signalling to regulate cell proliferation and apoptosis. Nature Cell Biology 8: 27–36. Hangauer, M. J., I. W. Vaughn, and M. T. McManus, 2013 Pervasive Transcription of the Human Genome Produces Thousands of Previously Unidentified Long Intergenic Noncoding RNAs (J. L. Rinn, Ed.). PLoS Genetics 9: e1003569. Hao, Y., X. Liu, X. Lu, X. Yang, L. Wang et al., 2013 Genome-wide association study in Han Chinese identifies three novel loci for human height. Human Genetics 132: 681–689. Harbison, S. T., M. A. Carbone, J. F. Ayroles, E. A. Stone, R. F. Lyman et al., 2009 Co-regulated transcriptional networks contribute to natural genetic variation in Drosophila sleep. Nature Genetics 41: 371–375. Harbison, S. T., L. J. McCoy, and T. F. C. Mackay, 2013 Genome-wide association study of sleep in Drosophila melanogaster. BMC Genomics 14: 281. Harrington, L. S., G. M. Findlay, A. Gray, T. Tolkacheva, S. Wigfield et al., 2004 The TSC1-2 tumor suppressor controls insulin-PI3K signaling via regulation of IRS proteins. The Journal of cell biology 166: 213–223. Harvey, K. F., J. Mattila, A. Sofer, F. C. Bennett, M. R. Ramsey et al., 2008 FOXO-regulated transcription restricts overgrowth of Tsc mutant organs. The Journal of cell biology 180: 691–696. Harvey, K. F., C. M. Pfleger, and I. K. Hariharan, 2003 The Drosophila Mst Ortholog, hippo, Restricts Growth and Cell Proliferation and Promotes Apoptosis. Cell 114: 457–467. Hatakeyama, J., J. H. Wald, I. Printsev, H. Y. H. Ho, and K. L. Carraway, 2014 Vangl1 and Vangl2: planar cell polarity components with a developing role in cancer. Endocrine Related Cancer 21: R345–R356. Hawkins, R. D., G. C. Hon, and B. Ren, 2010 Next-generation genomics: an integrative approach. Nature reviews. Genetics 11: 476–486. Herranz, H., and M. Milán, 2008 Signalling molecules, growth regulators and cell cycle control in Drosophila. Cell cycle (Georgetown, Tex.) 7: 3335–3337. Hers, I., E. E.
Vincent, and J. M. Tavaré, 2011 Akt signalling in health and disease. Cellular Signalling 23: 1515–1527. Hietakangas, V., and S. M. Cohen, 2009 Regulation of Tissue Growth through Nutrient Sensing. Annu. Rev. Genet. 43: 389–410. Hindorff, L. A., P. Sethupathy, H. A. Junkins, E. M. Ramos, J. P. Mehta et al., 2009 Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences of the United States of America 106: 9362–9367. Hirschhorn, J. N., and M. J. Daly, 2005 Genome-wide association studies for common

173 diseases and complex traits. Nature reviews. Genetics 6: 95–108. Hirschhorn, J. N., and G. Lettre, 2009 Progress in Genome-Wide Association Studies of Human Height. Hormone Research 71: 5–13. Hosokawa, N., T. Hara, T. Kaizuka, C. Kishi, A. Takamura et al., 2009 Nutrient-dependent mTORC1 association with the ULK1–Atg13–FIP200 complex required for autophagy. Molecular biology of the cell 20: 1981–1991. HOULE, D., J. Mezey, P. Galpern, and A. Carter, 2003 Automated measurement of Drosophila wings. BMC Evolutionary Biology 3: 25. Hu, Y., I. Flockhart, A. Vinayagam, C. Bergwitz, B. Berger et al., 2011 An integrative approach to ortholog prediction for disease-focused and other functional studies. BMC bioinformatics 12: 357. Huang, D. W., B. T. Sherman, and R. A. Lempicki, 2009a Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic acids research 37: 1–13. Huang, D. W., B. T. Sherman, and R. A. Lempicki, 2009b Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature protocols 4: 44–57. Huang, H., and D. J. Tindall, 2007 Dynamic FoxO transcription factors. Journal of Cell Science 120: 2479–2487. Huang, J., S. Wu, J. Barrera, K. Matthews, and D. Pan, 2005 The Hippo Signaling Pathway Coordinately Regulates Cell Proliferation and Apoptosis by Inactivating Yorkie, the Drosophila Homolog of YAP. Cell 122: 421–434. Huang, S., and D. E. Ingber, 1999 The structural and mechanical complexity of cell-growth control. Nature Cell Biology 1: E131–E138. Huang, W., A. Massouras, Y. Inoue, J. Peiffer, M. Ramia et al., 2014 Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines. Genome Research 24: 1193–1208. Huang, W., S. Richards, M. A. Carbone, D. Zhu, R. R. H. Anholt et al., 2012 Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proceedings of the National Academy of Sciences 109: 15553–15559. 
Huang, X., Y. Zhao, X. Wei, C. Li, A. Wang et al., 2011 Genome-wide association study of flowering time and grain yield traits in a worldwide collection of rice germplasm. Nature Genetics 44: 32–39.
Hunt, R. C., V. L. Simhadri, M. Iandoli, Z. E. Sauna, and C. Kimchi-Sarfaty, 2014 Exposing synonymous mutations. Trends in genetics: TIG 30: 308–321.
Hunter, T., 2000 Signaling—2000 and beyond. Cell 100: 113–127.
Ibrahim, D. M., B. Biehs, T. B. Kornberg, and A. Klebes, 2013 Microarray Comparison of Anterior and Posterior Drosophila Wing Imaginal Disc Cells Identifies Novel Wing Genes. G3 (Bethesda, Md.) 3: 1353–1362.
Igl, W., Å. Johansson, J. F. Wilson, S. H. Wild, O. Polašek et al., 2010 Modeling of Environmental Effects in Genome-Wide Association Studies Identifies SLC2A2 and HP as Novel Loci Influencing Serum Cholesterol Levels (P. Gasparini, Ed.). PLoS Genetics 6: e1000798.
Iida, H., H. Nakamura, T. Ono, M. S. Okumura, and Y. Anraku, 1994 MID1, a novel Saccharomyces cerevisiae gene encoding a plasma membrane protein, is required for Ca2+ influx and mating. Molecular and Cellular Biology 14: 8259–8271.
Ikeya, T., S. Broughton, N. Alic, R. Grandison, and L. Partridge, 2009 The endosymbiont Wolbachia increases insulin/IGF-like signalling in Drosophila. Proceedings of the Royal Society B: Biological Sciences 276: 3799–3807.
Ikeya, T., M. Galic, P. Belawat, K. Nairz, and E. Hafen, 2002 Nutrient-dependent expression of insulin-like peptides from neuroendocrine cells in the CNS contributes to growth regulation in Drosophila. Current biology: CB 12: 1293–1300.
Imasheva, A. G., and O. A. Bubliy, 2003 Quantitative variation of four morphological traits in Drosophila melanogaster under larval crowding. Hereditas 138: 193–199.
Imasheva, A. G., D. V. Bosenko, and O. A. Bubli, 1999 Variation in morphological traits of Drosophila melanogaster (fruit fly) under nutritional stress. Heredity 82 (Pt 2): 187–192.
Imasheva, A. G., O. A. Bubli, and O. E. Lazebny, 1994 Variation in wing length in Eurasian natural populations of Drosophila melanogaster. Heredity 72: 508–514.
Inoki, K., and K.-L. Guan, 2006 Complexity of the TOR signaling network. Trends in cell biology 16: 206–212.
Inoki, K., Y. Li, T. Xu, and K.-L. Guan, 2003 Rheb GTPase is a direct target of TSC2 GAP activity and regulates mTOR signaling. Genes & development 17: 1829–1834.
Inoki, K., Y. Li, T. Zhu, J. Wu, and K.-L. Guan, 2002 TSC2 is phosphorylated and inhibited by Akt and suppresses mTOR signalling. Nature Cell Biology 4: 648–657.
Inoki, K., T. Zhu, and K.-L. Guan, 2003 TSC2 mediates cellular energy response to control cell growth and survival. Cell 115: 577–590.
International HapMap Consortium, 2003 The International HapMap Project. Nature 426: 789–796.
International HapMap Consortium, K. A. Frazer, D. G. Ballinger, D. R. Cox, D. A. Hinds et al., 2007 A second generation human haplotype map of over 3.1 million SNPs. Nature 449: 851–861.
Jacinto, E., R. Loewith, A. Schmidt, S. Lin, M. A. Rüegg et al., 2004 Mammalian TOR complex 2 controls the actin cytoskeleton and is rapamycin insensitive. Nature Cell Biology 6: 1122–1128.
Jacobson, M. D., M. Weil, and M. C. Raff, 1997 Programmed cell death in animal development. Cell 88: 347–354.
Jaiswal, M., 2006 Fat and Wingless signaling oppositely regulate epithelial cell-cell adhesion and distal wing development in Drosophila. Development (Cambridge, England) 133: 925–935.
Jang, Y., Y. Lim, and K. Kim, 2014 Saccharomyces cerevisiae Strain Improvement Using Selection, Mutation, and Adaptation for the Resistance to Lignocellulose-Derived Fermentation Inhibitor for Ethanol Production. Journal of microbiology and biotechnology 24: 667–674.
Johnston, G. C., J. R. Pringle, and L. H. Hartwell, 1977 Coordination of growth with cell division in the yeast Saccharomyces cerevisiae. Experimental Cell Research 105: 79–98.
Johnston, L. A., and P. Gallant, 2002 Control of growth and organ size in Drosophila. BioEssays 24: 54–64.
Johnston, L. A., D. A. Prober, B. A. Edgar, R. N. Eisenman, and P. Gallant, 1999 Drosophila myc regulates cellular growth during development. Cell 98: 779–790.
Jordan, K. W., K. L. Craver, M. M. Magwire, C. E. Cubilla, T. F. C. Mackay et al., 2012 Genome-Wide Association for Sensitivity to Chronic Oxidative Stress in Drosophila melanogaster (P. Csermely, Ed.). PLoS ONE 7: e38722.
Jorgensen, P., and M. Tyers, 2004 How Cells Coordinate Growth and Division. Current Biology 14: R1014–R1027.
Jumbo-Lucioni, P., J. F. Ayroles, M. Chambers, K. W. Jordan, J. Leips et al., 2010 Systems genetics analysis of body weight and energy metabolism traits in Drosophila melanogaster. BMC Genomics 11: 297.
Jumbo-Lucioni, P., S. Bu, S. T. Harbison, J. C. Slaughter, T. F. C. Mackay et al., 2012 Nuclear genomic control of naturally occurring variation in mitochondrial function in Drosophila melanogaster. BMC Genomics 13: 1–1.
Justice, R. W., O. Zilian, D. F. Woods, M. Noll, and P. J. Bryant, 1995 The Drosophila tumor suppressor gene warts encodes a homolog of human myotonic dystrophy kinase and is required for the control of cell shape and proliferation. Genes & development 9: 534–546.
Jünger, M. A., F. Rintelen, H. Stocker, J. D. Wasserman, M. Végh et al., 2003 The Drosophila forkhead transcription factor FOXO mediates the reduction in cell number associated with reduced insulin signaling. Journal of Biology 2: 20.
Kamada, Y., K.-I. Yoshino, C. Kondo, T. Kawamata, N. Oshiro et al., 2010 Tor directly controls the Atg1 kinase complex to regulate autophagy. Molecular and Cellular Biology 30: 1049–1058.

Kang, H. M., N. A. Zaitlen, C. M. Wade, A. Kirby, D. Heckerman et al., 2008 Efficient Control of Population Structure in Model Organism Association Mapping. Genetics 178: 1709–1723.
Karr, J. R., J. C. Sanghvi, D. N. Macklin, M. V. Gutschow, J. M. Jacobs et al., 2012 A whole-cell computational model predicts phenotype from genotype. Cell 150: 389–401.
Kawecki, T. J., R. E. Lenski, D. Ebert, B. Hollis, I. Olivieri et al., 2012 Experimental evolution. Trends in Ecology & Evolution 27: 547–560.
Kennedy, S. G., A. J. Wagner, S. D. Conzen, J. Jordan, A. Bellacosa et al., 1997 The PI 3-kinase/Akt signaling pathway delivers an anti-apoptotic signal. Genes & development 11: 701–713.
Kiger, A. A., B. Baum, S. Jones, M. R. Jones, A. Coulson et al., 2003 A functional genomic analysis of cell morphology using RNA interference. Journal of Biology 2: 27.
Kilfoil, M. L., P. Lasko, and E. Abouheif, 2009 Stochastic variation: From single cells to superorganisms. HFSP Journal 3: 379–385.
Killip, L. E., and S. S. Grewal, 2012 DREF is required for cell and organismal growth in Drosophila and functions downstream of the nutrition/TOR pathway. Developmental Biology 371: 191–202.
Kim, E., P. Goraksha-Hicks, L. Li, T. P. Neufeld, and K.-L. Guan, 2008 Regulation of TORC1 by Rag GTPases in nutrient response. Nature Cell Biology 10: 935–945.
Kim, J., and G. Gibson, 2010 Insights from GWAS into the quantitative genetics of transcription in humans. Genetics Research 92: 361–369.
Kimura, K. D., 1997 daf-2, an Insulin Receptor-Like Gene That Regulates Longevity and Diapause in Caenorhabditis elegans. Science 277: 942–946.
King, E. G., S. J. Macdonald, and A. D. Long, 2012 Properties and power of the Drosophila Synthetic Population Resource for the routine dissection of complex traits. Genetics 191: 935–949.
Klarsfeld, A., J.-C. Leloup, and F. Rouyer, 2003 Circadian rhythms of locomotor activity in Drosophila. Behavioural Processes 64: 161–175.
Kofler, R., and C. Schlötterer, 2014 A guide for the design of evolve and resequencing studies. Molecular biology and evolution 31: 474–483.
Koontz, L. M., Y. Liu-Chittenden, F. Yin, Y. Zheng, J. Yu et al., 2013 The Hippo Effector Yorkie Controls Normal Tissue Growth by Antagonizing Scalloped-Mediated Default Repression. Developmental Cell 25: 388–401.
Kooperberg, C., M. LeBlanc, and V. Obenchain, 2010 Risk prediction using genome-wide association studies. Genetic Epidemiology 34: 643–652.
Korte, A., and A. Farlow, 2013 The advantages and limitations of trait analysis with GWAS: a review. Plant Methods 9: 1–1.

Korte, A., B. J. Vilhjálmsson, V. Segura, A. Platt, Q. Long et al., 2012 A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nature Genetics 44: 1066–1071.
Kozma, R., and M. Puljic, 2013 Hierarchical random cellular neural networks for system-level brain-like signal processing. Neural Networks 45: 101–110.
Krumsiek, J., C. Marr, T. Schroeder, and F. J. Theis, 2011 Hierarchical Differentiation of Myeloid Progenitors Is Encoded in the Transcription Factor Network (M. Pesce, Ed.). PLoS ONE 6: e22649.
de la Cova, C., M. Abril, P. Bellosta, P. Gallant, and L. A. Johnston, 2004 Drosophila myc regulates organ size by inducing cell competition. Cell 117: 107–116.
Lahner, B., J. Gong, M. Mahmoudian, E. L. Smith, K. B. Abid et al., 2003 Genomic scale profiling of nutrient and trace elements in Arabidopsis thaliana. Nature biotechnology 21: 1215–1221.
Lai, Z.-C., X. Wei, T. Shimizu, E. Ramos, M. Rohrbaugh et al., 2005 Control of Cell Proliferation and Apoptosis by Mob as Tumor Suppressor, Mats. Cell 120: 675–685.
Lango Allen, H., K. Estrada, G. Lettre, S. I. Berndt, M. N. Weedon et al., 2010 Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467: 832–838.
Lawrence, P. A., and J. Casal, 2013 The mechanisms of planar cell polarity, growth and the Hippo pathway: some known unknowns. Developmental Biology 377: 1–8.
Lecuit, T., and L. Le Goff, 2007 Orchestrating size and shape during morphogenesis. Nature 450: 189–192.
Lee, S. H., B. H. Choi, D. Lim, C. Gondro, Y. M. Cho et al., 2013 Genome-Wide Association Study Identifies Major Loci for Carcass Weight on BTA14 in Hanwoo (Korean Cattle) (C. Wade, Ed.). PLoS ONE 8: e74677.
Leevers, S. J., D. Weinkove, L. K. MacDougall, E. Hafen, and M. D. Waterfield, 1996 The Drosophila phosphoinositide 3-kinase Dp110 promotes cell growth. The EMBO journal 15: 6584.
Lefranc, A., and J. Bundgaard, 2000 Controlled variation of body size by larval crowding in Drosophila melanogaster. Dros. Inf. Serv 83: 171–174.
Lettre, G., 2011a Recent progress in the study of the genetics of height. Human Genetics 129: 465–472.
Lettre, G., 2011b Recent progress in the study of the genetics of height. Human Genetics 129: 465–472.
Lettre, G., J. L. Butler, K. G. Ardlie, and J. N. Hirschhorn, 2007 Common genetic variation in eight genes of the GH/IGF1 axis does not contribute to adult height variation. Human Genetics 122: 129–139.
Lettre, G., A. U. Jackson, C. Gieger, F. R. Schumacher, S. I. Berndt et al., 2008 Identification of ten loci associated with height highlights new biological pathways in human growth. Nature Genetics 40: 584–591.
Levin, D. E., and B. Errede, 1995 The proliferation of MAP kinase signaling pathways in yeast. Current Opinion in Cell Biology 7: 197–202.
Lin, C., and V. L. Katanaev, 2013 Kermit Interacts with Gαo, Vang, and Motor Proteins in Drosophila Planar Cell Polarity (E. Moreno, Ed.). PLoS ONE 8: e76885.
Ling, C., Y. Zheng, F. Yin, J. Yu, J. Huang et al., 2010 The apical transmembrane protein Crumbs functions as a tumor suppressor that regulates Hippo signaling by binding to Expanded. Proceedings of the National Academy of Sciences of the United States of America 107: 10532–10537.
Lipka, A. E., M. A. Gore, M. Magallanes-Lundback, A. Mesberg, H. Lin et al., 2013 Genome-wide association study and pathway-level analysis of tocochromanol levels in maize grain. G3 (Bethesda, Md.) 3: 1287–1299.
Lippert, C., J. Listgarten, Y. Liu, C. M. Kadie, R. I. Davidson et al., 2011 FaST linear mixed models for genome-wide association studies. Nature Methods 8: 833–835.
Liu, J.-P., J. Baker, A. S. Perkins, E. J. Robertson, and A. Efstratiadis, 1993 Mice carrying null mutations of the genes encoding insulin-like growth factor I (Igf-1) and type 1 IGF receptor (Igf1r). Cell 75: 59–72.
Liu, Z., A. R. Cappola, L. J. Crofford, and W. Guo, 2014 Modeling Bivariate Longitudinal Hormone Profiles by Hierarchical State Space Models. Journal of the American Statistical Association 109: 108–118.
Liu, J. Z., A. F. McRae, D. R. Nyholt, S. E. Medland, N. R. Wray et al., 2010 A Versatile Gene-Based Test for Genome-wide Association Studies. The American Journal of Human Genetics 87: 139–145.
Livneh, E., L. Glazer, D. Segal, J. Schlessinger, and B. Z. Shilo, 1985 The Drosophila EGF receptor gene homolog: conservation of both hormone binding and kinase domains. Cell 40: 599–607.
Loewith, R., E. Jacinto, S. Wullschleger, A. Lorberg, J. L. Crespo et al., 2002 Two TOR complexes, only one of which is rapamycin sensitive, have distinct roles in cell growth control. Molecular Cell 10: 457–468.
Loman, N. J., R. V. Misra, T. J. Dallman, C. Constantinidou, S. E. Gharbia et al., 2012 Performance comparison of benchtop high-throughput sequencing platforms. Nature biotechnology 30: 434–439.
Long, X., Y. Lin, S. Ortiz-Vega, K. Yonezawa, and J. Avruch, 2005 Rheb Binds and Regulates the mTOR Kinase. Current Biology 15: 702–713.
Ludwig, T., J. Eggenschwiler, P. Fisher, A. J. D'Ercole, M. L. Davenport et al., 1996 Mouse mutants lacking the type 2 IGF receptor (IGF2R) are rescued from perinatal lethality in Igf2 and Igf1r null backgrounds. Developmental Biology 177: 517–535.
Lynch, M., and B. Walsh, 1998 Genetics and Analysis of Quantitative Traits. Sinauer Associates, Inc., Sunderland, MA, USA.

Mackay, T. F. C., 2010 Mutations and quantitative genetic variation: lessons from Drosophila. Philosophical Transactions of the Royal Society B: Biological Sciences 365: 1229–1239.
Mackay, T. F. C., 2001 The genetic architecture of quantitative traits. Annu. Rev. Genet. 1–37.
Mackay, T. F. C., and R. R. H. Anholt, 2006 Of flies and man: Drosophila as a model for human complex traits. Annual review of genomics and human genetics 7: 339–367.
Mackay, T. F. C., and J. H. Moore, 2014 Why epistasis is important for tackling complex human disease genetics. Genome medicine 6: 42.
Mackay, T. F. C., S. L. Heinsohn, R. F. Lyman, A. J. Moehring, T. J. Morgan et al., 2005 Genetics and genomics of Drosophila mating behavior. Proceedings of the National Academy of Sciences of the United States of America 102 Suppl 1: 6622–6629.
Mackay, T. F. C., S. Richards, E. A. Stone, A. Barbadilla, J. F. Ayroles et al., 2012 The Drosophila melanogaster Genetic Reference Panel. Nature 482: 173–178.
Mackay, T. F. C., E. A. Stone, and J. F. Ayroles, 2009 The genetics of quantitative traits: challenges and prospects. Nature reviews. Genetics 10: 565–577.
Madan, L. L., S. Veeranna, K. Shameer, C. C. S. Reddy, R. Sowdhamini, and B. Gopal, 2011a Modulation of Catalytic Activity in Multi-Domain Protein Tyrosine Phosphatases (V. N. Uversky, Ed.). PLoS ONE 6: e24766.
Madan, L. L., S. Veeranna, K. Shameer, C. C. S. Reddy, R. Sowdhamini, and B. Gopal, 2011b Modulation of Catalytic Activity in Multi-Domain Protein Tyrosine Phosphatases (V. N. Uversky, Ed.). PLoS ONE 6: e24766.
Maehama, T., and J. E. Dixon, 1998 The Tumor Suppressor, PTEN/MMAC1, Dephosphorylates the Lipid Second Messenger, Phosphatidylinositol 3,4,5-Trisphosphate. Journal of Biological Chemistry 273: 13375–13378.
Maki, R. G., 2010 Small Is Beautiful: Insulin-Like Growth Factors and Their Role in Growth, Development, and Cancer. Journal of Clinical Oncology 28: 4985–4995.
Makvandi-Nejad, S., G. E. Hoffman, J. J. Allen, E. Chu, E. Gu et al., 2012 Four Loci Explain 83% of Size Variation in the Horse (M. Hofreiter, Ed.). PLoS ONE 7: e39929.
Maller, J., S. George, S. Purcell, J. Fagerness, D. Altshuler et al., 2006 Common variation in three genes, including a noncoding variant in CFH, strongly influences risk of age-related macular degeneration. Nature Genetics 38: 1055–1059.
Manning, B. D., 2004 Balancing Akt with S6K: implications for both metabolic diseases and tumorigenesis. The Journal of cell biology 167: 399–403.
Manolio, T. A., 2013 Bringing genome-wide association findings into clinical use. Nature reviews. Genetics 14: 549–558.
Manolio, T. A., 2010 Genomewide association studies and assessment of the risk of disease. The New England journal of medicine 363: 166–176.
Manolio, T. A., F. S. Collins, N. J. Cox, D. B. Goldstein, L. A. Hindorff et al., 2009 Finding the missing heritability of complex diseases. Nature 461: 747–753.
Mao, Y., B. Kucuk, and K. D. Irvine, 2009 Drosophila lowfat, a novel modulator of Fat signaling. Development (Cambridge, England) 136: 3223–3233.
Mao, Y., C. Rauskolb, E. Cho, W.-L. Hu, H. Hayter et al., 2006 Dachs: an unconventional myosin that functions downstream of Fat to regulate growth, affinity and gene expression in Drosophila. Development (Cambridge, England) 133: 2539–2551.
Mao, Y., A. L. Tournier, P. A. Bates, J. E. Gale, N. Tapon et al., 2011 Planar polarization of the atypical myosin Dachs orients cell divisions in Drosophila. Genes & development 25: 131–136.
Marjoram, P., A. Zubair, and S. V. Nuzhdin, 2014 Post-GWAS: where next? More samples, more SNPs or more biology? Heredity 112: 79–88.
Martin, A. P., and S. R. Palumbi, 1993 Body size, metabolic rate, generation time, and the molecular clock. Proceedings of the National Academy of Sciences of the United States of America 90: 4087–4091.
Martin, D. E., and M. N. Hall, 2005 The expanding TOR signaling network. Current Opinion in Cell Biology 17: 158–166.
Martin, S. G., 2009 Geometric control of the cell cycle. Cell cycle (Georgetown, Tex.) 8: 3643–3647.
Marygold, S. J., J. Roote, G. Reuter, A. Lambertsson, M. Ashburner et al., 2007 The ribosomal protein genes and Minute loci of Drosophila melanogaster. Genome Biology 8: R216.
Massouras, A., S. M. Waszak, M. Albarca-Aguilera, K. Hens, W. Holcombe et al., 2012 Genomic Variation and Its Impact on Gene Expression in Drosophila melanogaster (B. Oliver, Ed.). PLoS Genetics 8: e1003055.
Matakatsu, H., and S. S. Blair, 2006 Separating the adhesive and signaling functions of the Fat and Dachsous protocadherins. Development (Cambridge, England) 133: 2315–2324.
Matis, M., and J. D. Axelrod, 2013 Regulation of PCP by the Fat signaling pathway. Genes & development 27: 2207–2220.
Matsuzaki, H., H. Daitoku, M. Hatta, K. Tanaka, and A. Fukamizu, 2003 Insulin-induced phosphorylation of FKHR (Foxo1) targets to proteasomal degradation. Proceedings of the National Academy of Sciences of the United States of America 100: 11285–11290.
Matsuzaki, H., H. Konishi, M. Tanaka, Y. Ono, T. Takenawa et al., 1996 Isolation of the active form of RAC-protein kinase (PKB/Akt) from transfected COS-7 cells treated with heat shock stress and effects of phosphatidylinositol 3,4,5-trisphosphate and phosphatidylinositol 4,5-bisphosphate on its enzyme activity. FEBS Letters 396: 305–308.
Maxa, J., M. Neuditschko, I. Russ, M. Förster, and I. Medugorac, 2012 Genome-wide association mapping of milk production traits in Braunvieh cattle. Journal of Dairy Science 95: 5357–5364.
McBrayer, Z., H. Ono, M. Shimell, J.-P. Parvy, R. B. Beckstead et al., 2007 Prothoracicotropic hormone regulates developmental timing and body size in Drosophila. Developmental Cell 13: 857–871.
McCarroll, S. A., 2008 Extending genome-wide association studies to copy-number variation. Human Molecular Genetics 17: R135–42.
McCarthy, M. I., G. R. Abecasis, L. R. Cardon, D. B. Goldstein, J. Little et al., 2008 Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature reviews. Genetics 9: 356–369.
McCormick, F., 1994 Activators and effectors of ras p21 proteins. Current opinion in genetics & development 4: 71–76.
Medema, R. H., G. J. Kops, J. L. Bos, and B. M. Burgering, 2000 AFX-like Forkhead transcription factors mediate cell-cycle regulation by Ras and PKB through p27kip1. Nature 404: 782–787.
Meijón, M., S. B. Satbhai, T. Tsuchimatsu, and W. Busch, 2013 Genome-wide association study using cellular traits identifies a new regulator of root development in Arabidopsis. Nature Genetics 46: 77–81.
Menon, S., C. C. Dibble, G. Talbott, G. Hoxhaj, A. J. Valvezan et al., 2014 Spatial Control of the TSC Complex Integrates Insulin and Nutrient Regulation of mTORC1 at the Lysosome. Cell 156: 771–785.
Meyer, K., 1985 Maximum likelihood estimation of variance components for a multivariate mixed model with equal design matrices. Biometrics 41: 153–165.
Minke, B., and B. Cook, 2002 TRP channel proteins and signal transduction. Physiological reviews 82: 429–472.
Minozzi, G., E. L. Nicolazzi, A. Stella, S. Biffani, R. Negrini et al., 2013 Genome Wide Analysis of Fertility and Production Traits in Italian Holstein Cattle (L. Chen, Ed.). PLoS ONE 8: e80219.
Mirth, C., J. W. Truman, and L. M. Riddiford, 2005 The Role of the Prothoracic Gland in Determining Critical Weight for Metamorphosis in Drosophila melanogaster. Current Biology 15: 1796–1807.
Mirth, C. K., and L. M. Riddiford, 2007 Size assessment and growth control: how adult size is determined in insects. BioEssays 29: 344–355.
Mirth, C. K., and A. W. Shingleton, 2012 Integrating body and organ size in Drosophila: recent advances and outstanding problems. Frontiers in endocrinology 3: 49.
Montagne, J., M. J. Stewart, H. Stocker, E. Hafen, S. C. Kozma et al., 1999 Drosophila S6 kinase: a regulator of cell size. Science 285: 2126–2129.
Montero, L., N. Müller, and P. Gallant, 2008 Induction of apoptosis by Drosophila Myc. genesis 46: 104–111.
Morata, G., and P. Ripoll, 1975 Minutes: mutants of Drosophila autonomously affecting cell division rate. Developmental Biology 42: 211–221.
Moreno, E., and K. Basler, 2004 dMyc transforms cells into super-competitors. Cell 117: 117–129.
Morozova, T. V., R. R. H. Anholt, and T. F. C. Mackay, 2007 Phenotypic and transcriptional response to selection for alcohol sensitivity in Drosophila melanogaster. Genome Biology 8: R231.
Muskavitch, M. A., 1994 Delta-notch signaling and Drosophila cell fate choice. Developmental Biology 166: 415–430.
Müller, B., and U. Grossniklaus, 2010 Model organisms — A historical perspective. Journal of Proteomics 73: 2054–2063.
Nakae, J., Y. Kido, and D. Accili, 2001 Distinct and overlapping functions of insulin and IGF-I receptors. Endocrine reviews 22: 818–835.
Neto-Silva, R. M., S. de Beco, and L. A. Johnston, 2010 Evidence for a growth-stabilizing regulatory feedback mechanism between Myc and Yorkie, the Drosophila homolog of Yap. Developmental Cell 19: 507–520.
Neto-Silva, R. M., B. S. Wells, and L. A. Johnston, 2009 Mechanisms of Growth and Homeostasis in the Drosophila Wing. Annual Review of Cell and Developmental Biology 25: 197–220.
Neufeld, T. P., A. F. A. de la Cruz, L. A. Johnston, and B. A. Edgar, 1998 Coordination of Growth and Cell Division in the Drosophila Wing. Cell 93: 1183–1193.
Nicoloff, H., V. Perreten, and S. B. Levy, 2007 Increased Genome Instability in Escherichia coli lon Mutants: Relation to Emergence of Multiple-Antibiotic-Resistant (Mar) Mutants Caused by Insertion Sequence Elements and Large Tandem Genomic Amplifications. Antimicrobial Agents and Chemotherapy 51: 1293–1303.
Nijhout, H. F., 2003 The control of body size in insects. Developmental Biology 261: 1–9.
Nijhout, H. F., and C. M. Williams, 1974 Control of moulting and metamorphosis in the tobacco hornworm, Manduca sexta (L.): growth of the last-instar larva and the decision to pupate. The Journal of experimental biology 61: 481–491.
Nijhout, H. F., L. M. Riddiford, C. Mirth, A. W. Shingleton, Y. Suzuki et al., 2014 The developmental control of size in insects. Wiley interdisciplinary reviews. Developmental biology 3: 113–134.
Nobukuni, T., M. Joaquin, M. Roccio, S. G. Dann, S. Y. Kim et al., 2005 Amino acids mediate mTOR/raptor signaling through activation of class 3 phosphatidylinositol 3OH-kinase. Proceedings of the National Academy of Sciences of the United States of America 102: 14238–14243.
Normanno, N., A. De Luca, C. Bianco, L. Strizzi, M. Mancino et al., 2006 Epidermal growth factor receptor (EGFR) signaling in cancer. Gene 366: 2–16.
Norry, F. M., J. C. Vilardi, and E. Hasson, 1997 Genetic and phenotypic correlations among size-related traits, and heritability variation between body parts in Drosophila buzzatii. Genetica 101: 131–139.
Nowak, K., G. Seisenbacher, E. Hafen, and H. Stocker, 2013 Nutrient restriction enhances the proliferative potential of cells lacking the tumor suppressor PTEN in mitotic tissues. eLife 2: e00380.
Nüsslein-Volhard, C., and E. Wieschaus, 1980 Mutations affecting segment number and polarity in Drosophila. Nature 287: 795–801.
Okamoto, N., R. Nakamori, T. Murai, Y. Yamauchi, A. Masuda et al., 2013 A secreted decoy of InR antagonizes insulin/IGF signaling to restrict body growth in Drosophila. Genes & development 27: 87–97.
Okamoto, N., N. Yamanaka, Y. Yagi, Y. Nishida, H. Kataoka et al., 2009 A Fat Body-Derived IGF-like Peptide Regulates Postfeeding Growth in Drosophila. Developmental Cell 17: 885–891.
Oksenberg, J. R., S. E. Baranzini, S. Sawcer, and S. L. Hauser, 2008 The genetics of multiple sclerosis: SNPs to pathways to pathogenesis. Nature reviews. Genetics 9: 516–526.
Oldham, S., 2011 Obesity and nutrient sensing TOR pathway in flies and vertebrates: Functional conservation of genetic mechanisms. Trends in Endocrinology & Metabolism 22: 45–52.
Oldham, S., and E. Hafen, 2003 Insulin/IGF and target of rapamycin signaling: a TOR de force in growth control. Trends in cell biology 13: 79–85.
Oldham, S., R. Bohni, H. Stocker, W. Brogiolo, and E. Hafen, 2000 Genetic control of size in Drosophila. Philosophical Transactions of the Royal Society B: Biological Sciences 355: 945–952.
Oldham, S., H. Stocker, M. Laffargue, F. Wittwer, M. Wymann et al., 2002 The Drosophila insulin/IGF receptor controls growth and size by modulating PtdInsP3 levels. Development (Cambridge, England) 129: 4103–4109.
Olivier, J. P., T. Raabe, M. Henkemeyer, B. Dickson, G. Mbamalu et al., 1993 A Drosophila SH2-SH3 adaptor protein implicated in coupling the sevenless tyrosine kinase to an activator of Ras guanine nucleotide exchange, Sos. Cell 73: 179–191.
Orozco-terWengel, P., M. Kapun, V. Nolte, R. Kofler, T. Flatt et al., 2012 Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Molecular ecology 21: 4931–4941.
Özkan, E., R. A. Carrillo, C. L. Eastman, R. Weiszmann, D. Waghray et al., 2013 An Extracellular Interactome of Immunoglobulin and LRR Proteins Reveals Receptor-Ligand Networks. Cell 154: 228–239.
Paaby, A. B., A. O. Bergland, E. L. Behrman, and P. S. Schmidt, 2014 An amino acid polymorphism in the Drosophila insulin receptor demonstrates pleiotropic and adaptive function in life history traits:.
Pak, S., 2010 The growth status of North Korean refugee children and adolescents from 6 to 19 years of age. Economics & Human Biology 8: 385–395.
Pan, D., 2007 Hippo signaling in organ size control. Genes & development 21: 886–897.
Pantalacci, S., N. Tapon, and P. Léopold, 2003 The Salvador partner Hippo promotes apoptosis and cell-cycle exit in Drosophila. Nature Cell Biology 5: 921–927.
Parsons, L., N. Grzeschik, and H. Richardson, 2014 lgl Regulates the Hippo Pathway Independently of Fat/Dachs, Kibra/Expanded/Merlin and dRASSF/dSTRIPAK. Cancers 6: 879–896.
Parsons, L. M., N. A. Grzeschik, M. Allott, and H. Richardson, 2010 Lgl/aPKC and Crb regulate the Salvador/Warts/Hippo pathway. Fly 4: 288–293.
Partridge, L., and D. Gems, 2002 Mechanisms of ageing: public or private? Nature reviews. Genetics 3: 165–175.
Partridge, L., B. Barrie, K. Fowler, and V. French, 1994a Evolution and development of body size and cell size in Drosophila melanogaster in response to temperature. Evolution 1269–1276.
Partridge, L., B. Barrie, K. Fowler, and V. French, 1994b Thermal evolution of pre‐adult life history traits in Drosophila melanogaster. Journal of evolutionary biology 7: 645–663.
Partridge, L., R. Langelan, K. Fowler, B. Zwaan, and V. French, 1999 Correlated responses to selection on body size in Drosophila melanogaster. Genetics Research 74: 43–54.

Parts, L., F. A. Cubillos, J. Warringer, K. Jain, F. Salinas et al., 2011 Revealing the genetic structure of a trait by sequencing a population under selection. Genome Research 21: 1131–1138.
Paschou, P., E. Ziv, E. G. Burchard, S. Choudhry, W. Rodriguez-Cintron et al., 2007 PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations. PLoS Genetics 3: e160.
Patterson, N., A. L. Price, and D. Reich, 2006 Population Structure and Eigenanalysis. PLoS Genetics 2: e190.
Patursky-Polischuk, I., M. Stolovich-Rain, M. Hausner-Hanochi, J. Kasir, N. Cybulski et al., 2009 The TSC-mTOR pathway mediates translational activation of TOP mRNAs by insulin largely in a raptor- or rictor-independent manner. Molecular and Cellular Biology 29: 640–649.
Pawson, T., G. D. Gish, and P. Nash, 2001 SH2 domains, interaction modules and cellular wiring. Trends in cell biology 11: 504–511.
Peirce, J. L., L. Lu, J. Gu, L. M. Silver, and R. W. Williams, 2004 A new set of BXD recombinant inbred lines from advanced intercross populations in mice. BMC genetics 5: 7.
Peiter, E., M. Fischer, K. Sidaway, S. K. Roberts, and D. Sanders, 2005 The Saccharomyces cerevisiae Ca2+ channel Cch1pMid1p is essential for tolerance to cold stress and iron toxicity. FEBS Letters 579: 5697–5703.
Pellock, B. J., E. Buff, K. White, and I. K. Hariharan, 2007 The Drosophila tumor suppressors Expanded and Merlin differentially regulate cell cycle exit, apoptosis, and Wingless signaling. Developmental Biology 304: 102–115.
Peng, T., T. R. Golub, and D. M. Sabatini, 2002 The Immunosuppressant Rapamycin Mimics a Starvation-Like Signal Distinct from Amino Acid and Glucose Deprivation. Molecular and Cellular Biology 22: 5575–5584.
Perola, M., S. Sammalisto, T. Hiekkalinna, N. G. Martin, P. M. Visscher et al., 2007 Combined Genome Scans for Body Stature in 6,602 European Twins: Evidence for Common Caucasian Loci. PLoS Genetics 3: e97.
Pickrell, J. K., 2014 Joint Analysis of Functional Genomic Data and Genome-wide Association Studies of 18 Human Traits. American journal of human genetics 94: 559–573.
Poltilove, R. M. K., A. R. Jacobs, C. R. Haft, P. Xu, and S. I. Taylor, 2000 Characterization of Drosophila Insulin Receptor Substrate. Journal of Biological Chemistry 275: 23346–23354.
Povelones, M., 2005 Genetic Evidence That Drosophila frizzled Controls Planar Cell Polarity and Armadillo Signaling by a Common Mechanism. Genetics 171: 1643–1654.
Powell, A. M., M. Davis, and J. R. Powell, 2010 Phenotypic plasticity across 50MY of evolution: Drosophila wing size and temperature. Journal of Insect Physiology 56: 380–382.

186 Price, A. L., N. J. Patterson, R. M. Plenge, M. E. Weinblatt, N. A. Shadick et al., 2006 Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics 38: 904–909. Pritchard, J. K., and N. J. Cox, 2002 The allelic architecture of human disease genes: common disease–common variant… or not? Human Molecular Genetics 11: 2417–2423. Pritchard, J. K., M. Stephens, and P. Donnelly, 2000 Inference of population structure using multilocus genotype data. Genetics 155: 945–959. PROBER, D., and B. EDGAR, 2000 Ras1 Promotes Cellular Growth in the Drosophila Wing. Cell 100: 435–446. Prober, D. A., and B. A. Edgar, 2002 Interactions between Ras1, dMyc, and dPI3K signaling in the developing Drosophila wing. Genes & development 16: 2286–2299. Proud, C. G., 2006 Regulation of protein synthesis by insulin. Biochemical Society Transactions 34: 213–216. Puig, O., and R. Tjian, 2005 Transcriptional feedback control of insulin receptor by dFOXO/FOXO1. Genes & development 19: 2435–2446. Puig, O., M. T. Marr, M. L. Ruhf, and R. Tjian, 2003 Control of cell number by Drosophila FOXO: downstream and feedback regulation of the insulin receptor pathway. Genes & development 17: 2006–2020. Raff, M. C., 1996 Size control: the regulation of cell numbers in animal development. Cell 86: 173–175. Rakitsch, B., C. Lippert, O. Stegle, and K. Borgwardt, 2013 A Lasso multi-marker mixed model for association mapping with population structure correction. Bioinformatics 29: 206– 214. Rauskolb, C., S. Sun, G. Sun, Y. Pan, and K. D. Irvine, 2014 Cytoskeletal Tension Inhibits Hippo Signaling through an Ajuba-Warts Complex. Cell 158: 143–156. Reaven, G. M., 1997 Banting Lecture 1988. Role of insulin resistance in human disease. 1988. Reed, L. K., K. Lee, Z. Zhang, L. Rashid, A. Poe et al., 2014 Systems Genomics of Metabolic Phenotypes in Wild-Type Drosophila melanogaster. Genetics 197: 781–793. Reich, D. E., and E. S. 
Lander, 2001 On the allelic spectrum of human disease. Trends in genetics : TIG 17: 502–510. Reiling, J. H., and E. Hafen, 2004 The hypoxia-induced paralogs Scylla and Charybdis inhibit growth by down-regulating S6K activity upstream of TSC in Drosophila. Genes & development 18: 2879–2892. Restrepo, S., J. J. Zartman, and K. Basler, 2014 Coordination of Patterning and Growth by the Morphogen DPP. Current Biology 24: R245–R255.

187 Reynolds, J. V., C. L. Donohoe, and S. L. Doyle, 2010 Diet, obesity and cancer. Irish Journal of Medical Science 180: 521–527. Ribeiro, P. S., F. Josué, A. Wepf, M. C. Wehr, O. Rinner et al., 2010 Combined functional genomic and proteomic approaches identify a PP2A complex as a negative regulator of Hippo signaling. Molecular Cell 39: 521–534. Riccardi, G., R. Giacco, and A. A. Rivellese, 2004 Dietary fat, insulin sensitivity and the metabolic syndrome. Clinical Nutrition 23: 447–456. Riese, D. J., and D. F. Stern, 1998 Specificity within the EGF family/ErbB receptor family signaling network. BioEssays 20: 41–48. Risch, N., and K. Merikangas, 1996 The future of genetic studies of complex human diseases. Science 273: 1516–1517. Robertson, F. W., 1959 Studies in Quantitative Inheritance. Xii. Cell Size and Number in Relation to Genetic and Environmental Variation of Body Size in Drosophila. Genetics 44: 869–896. Robertson, F. W., 1963 The ecological genetics of growth in Drosophila 6. The genetic correlation between the duration of the larval period and body size in relation to larval diet. Genetics Research 4: 74–92. Robertson, F. W., and E. Reeve, 1952 Studies in quantitative inheritance I. The effects of selection of wing and thorax length in Drosophila melanogaster. Journal of genetics 50: 414– 448. Robinson, B. S., J. Huang, Y. Hong, and K. H. Moberg, 2010 Crumbs Regulates Salvador/Warts/Hippo Signaling in Drosophila via the FERM-Domain Protein Expanded. Current Biology 20: 582–590. Roff, D. A., and T. A. Mousseau, 1987 Quantitative genetics and fitness: lessons from Drosophila. Heredity 58 ( Pt 1): 103–118. Rogulja, D., C. Rauskolb, and K. D. Irvine, 2008 Morphogen Control of Wing Growth through the Fat Signaling Pathway. Developmental Cell 15: 309–321. Rommel, C., and E. Hafen, 1998 Ras--a versatile cellular switch. Current opinion in genetics & development 8: 412–418. Ruan, Y., C. Chen, Y. Cao, and R. S. 
Garofalo, 1995 The Drosophila insulin receptor contains a novel carboxyl-terminal extension likely to play an important role in signal transduction. The Journal of biological chemistry 270: 4236–4243. Rueedi, R., M. Ledda, A. W. Nicholls, R. M. Salek, P. Marques-Vidal et al., 2014 Genome- Wide Association Study of Metabolic Traits Reveals Novel Gene-Metabolite-Disease Links (G. GIBSON, Ed.). PLoS Genetics 10: e1004132. Rulifson, E. J., 2002 Ablation of Insulin-Producing Neurons in Flies: Growth and Diabetic

188 Phenotypes. Science 296: 1118–1120. Rupes, I., 2002 Checking cell size in yeast. Trends in genetics : TIG 18: 479–485. Saltiel, A. R., and C. R. Kahn, 2001 Insulin signalling and the regulation of glucose and lipid metabolism. Nature 414: 799–806. Sancak, Y., L. Bar-Peled, R. Zoncu, A. L. Markhard, S. Nada et al., 2010 Ragulator-Rag Complex Targets mTORC1 to the Lysosomal Surface and Is Necessary for Its Activation by Amino Acids. Cell 141: 290–303. Santos, M., K. Fowler, and L. Partridge, 1994 Gene-environment interaction for body size and larval density in Drosophila melanogaster: an investigation of effects on development time, thorax length and adult sex ratio. Heredity 72 ( Pt 5): 515–521. Sarbassov, D. D., D. A. Guertin, S. M. Ali, and D. M. Sabatini, 2005 Phosphorylation and regulation of Akt/PKB by the rictor-mTOR complex. Science 307: 1098–1101. Sarraf-Zadeh, L., S. Christen, U. Sauer, P. Cognigni, I. Miguel-Aliaga et al., 2013 Local requirement of the Drosophila insulin binding protein imp-L2 in coordinating developmental progression with nutritional conditions. Developmental Biology 381: 97–106. Saucedo, L. J., and B. A. Edgar, 2007 Filling out the Hippo pathway. Nature Reviews Molecular Cell Biology 8: 613–621. Saucedo, L. J., X. Gao, D. A. Chiarelli, L. Li, D. Pan et al., 2003 Rheb promotes cell growth as a component of the insulin/TOR signalling network. Nature Cell Biology 5: 566–571. Schadt, E. E., 2009 Molecular networks as sensors and drivers of common human diseases. Nature 461: 218–223. Schadt, E. E., J. Lamb, X. Yang, J. Zhu, S. Edwards et al., 2005 An integrative genomics approach to infer causal associations between gene expression and disease. Nature Genetics 37: 710–717. Schertel, C., D. Huang, M. Björklund, J. Bischof, D. Yin et al., 2013 Systematic Screening of a Drosophila ORF Library In Vivo Uncovers Wnt/Wg Pathway Components. DEVCEL 25: 207–219. Schlessinger, J., 1993 How receptor tyrosine kinases activate Ras. 
Trends in biochemical sciences 18: 273–275. Schluck, T., U. Nienhaus, T. Aegerter-Wilmsen, and C. M. Aegerter, 2013 Mechanical Control of Organ Size in the Development of the Drosophila Wing Disc (R. M. H. Merks, Ed.). PLoS ONE 8: e76171. Schmelzle, T., and M. N. Hall, 2000 TOR, a central controller of cell growth. Cell 103: 253– 262. Schmid, C. H., T. H. Steiner, and E. R. Froesch, 1983 Preferential enhancement of myoblast differentiation by insulin-like growth factors (IGF I and IGF II) in primary cultures of chicken

189 embryonic cells. FEBS Letters 161: 117–121. Schmidt, A., J. Kunz, and M. N. Hall, 1996 TOR2 is required for organization of the actin cytoskeleton in yeast. Proceedings of the National Academy of Sciences of the United States of America 93: 13780–13785. Schmidt, M., S. Fernandez de Mattos, A. van der Horst, R. Klompmaker, G. J. P. L. Kops et al., 2002 Cell Cycle Inhibition by FoxO Forkhead Transcription Factors Involves Downregulation of Cyclin D. Molecular and Cellular Biology 22: 7842–7852. Schroeder, M. C., and G. Halder, 2012 Regulation of the Hippo pathway by cell architecture and mechanical signals. Seminars in Cell and Developmental Biology 23: 803–811. Schubiger, M., and J. Palka, 1987 Changing spatial patterns of DNA replication in the developing wing of Drosophila. Developmental Biology 123: 145–153. Schüpbach, T., I. Xenarios, S. Bergmann, and K. Kapur, 2010 FastEpistasis: a high performance computing solution for quantitative trait epistasis. Bioinformatics 26: 1468– 1469. Schwekendiek, D., and S. Pak, 2009 Recent growth of children in the two Koreas: A meta- analysis. Economics & Human Biology 7: 109–112. Segalen, M., and Y. Bellaïche, 2009 Cell division orientation and planar cell polarity pathways. Seminars in Cell and Developmental Biology 20: 972–977. Segura, V., B. J. Vilhjálmsson, A. Platt, A. Korte, Ü. Seren et al., 2012 ng.2314. Nature Publishing Group 44: 825–830. Serrano, N., and P. H. O'Farrell, 1997 Limb morphogenesis: connections between patterning and growth. Current biology : CB 7: R186–R195. Sham, P. C., and S. M. Purcell, 2014 Statistical power and significance testing in large-scale genetic studies. Nature reviews. Genetics 15: 335–346. Shaw, L. M., 2011 The insulin receptor substrate (IRS) proteins: At the intersection of metabolism and cancer. Cell cycle (Georgetown, Tex.) 10: 1750–1756. Shi, G., E. Boerwinkle, A. C. Morrison, C. C. Gu, A. 
Chakravarti et al., 2011 Mining gold dust under the genome wide significance level: a two-stage approach to analysis of GWAS. Genetic Epidemiology 35: 111–118. Shilo, B., 2003 Signaling by the Drosophila epidermal growth factor receptor pathway during development. Experimental Cell Research 284: 140–149. Shilo, B. Z., 2005 Regulating the dynamics of EGF receptor signaling in space and time. Development (Cambridge, England) 132: 4017–4027. Shimizu, H., S. A. Woodcock, M. B. Wilkin, B. Trubenová, N. A. M. Monk et al., 2014 Compensatory Flux Changes within an Endocytic Trafficking Network Maintain Thermal Robustness of Notch Signaling. Cell 157: 1160–1174.

190 Shingleton, A. W., 2010 The regulation of organ size in Drosophila: physiology, plasticity, patterning and physical force. Organogenesis 6: 76–87. Shingleton, A. W., C. M. Estep, M. V. Driscoll, and I. Dworkin, 2009 Many ways to be small: different environmental regulators of size generate distinct scaling relationships in Drosophila melanogaster. Proceedings. Biological sciences / The Royal Society 276: 2625–2633. Shioi, T., P. M. Kang, P. S. Douglas, J. Hampe, C. M. Yballe et al., 2000 The conserved phosphoinositide 3-kinase pathway determines heart size in mice. The EMBO journal 19: 2537–2548. Sieberts, S. K., and E. E. Schadt, 2007 Moving toward a system genetics view of disease. Mammalian Genome 18: 389–401. Silva, E., Y. Tsatskis, L. Gardano, N. Tapon, and H. McNeill, 2006 The Tumor-Suppressor Gene fat Controls Tissue Growth Upstream of Expanded in the Hippo Signaling Pathway. Current Biology 16: 2081–2089. Simon, M. A., A. Xu, H. O. Ishikawa, and K. D. Irvine, 2010 Modulation of Fat:Dachsous Binding by the Cadherin Domain Kinase Four-Jointed. Current Biology 20: 811–817. Simpson, P., P. Berreur, and J. Berreur-Bonnenfant, 1980 The initiation of pupariation in Drosophila: dependence on growth of the imaginal discs. Journal of embryology and experimental morphology 57: 155–165. Sing, A., Y. Tsatskis, L. Fabian, I. Hester, R. Rosenfeld et al., 2014 The Atypical Cadherin Fat Directly Regulates Mitochondrial Function and Metabolic State. Cell 158: 1293–1308. Slaidina, M., R. Delanoue, S. Gronke, L. Partridge, and P. Léopold, 2009 A Drosophila Insulin-like Peptide Promotes Growth during Nonfeeding States. Developmental Cell 17: 874–884. Smith-Bolton, R. K., M. I. Worley, H. Kanda, and I. K. Hariharan, 2009 Regenerative Growth in Drosophila Imaginal Discs Is Regulated by Wingless and Myc. Developmental Cell 16: 797–809. Sopko, R., E. Silva, L. Clayton, L. Gardano, M. 
Barrios-Rodiles et al., 2009 Phosphorylation of the Tumor Suppressor Fat Is Regulated by Its Ligand Dachsous and the Kinase Discs Overgrown. Current Biology 19: 1112–1117. St Pierre, S. E., L. Ponting, R. Stefancsik, P. McQuilton, FlyBase Consortium, 2014 FlyBase 102--advanced approaches to interrogating FlyBase. Nucleic acids research 42: D780–8. Stark, A., M. F. Lin, P. Kheradpour, J. S. Pedersen, L. Parts et al., 2007 Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature 450: 219–232. Steinthorsdottir, V., G. Thorleifsson, P. Sulem, H. Helgason, N. Grarup et al., 2014 Identification of low-frequency and rare sequence variants associated with elevated or

191 reduced risk of type 2 diabetes. Nature Publishing Group 46: 294–298. Stephens, L., A. Eguinoa, S. Corey, T. Jackson, and P. T. Hawkins, 1993 Receptor stimulated accumulation of phosphatidylinositol (3, 4, 5)-trisphosphate by G-protein mediated pathways in human myeloid derived cells. The EMBO journal 12: 2265. Stern, D. L., and D. J. Emlen, 1999 The developmental basis for allometry in insects. Development (Cambridge, England) 126: 1091–1101. Stern, D. L., and V. Orgogozo, 2008 THE LOCI OF EVOLUTION: HOW PREDICTABLE IS GENETIC EVOLUTION? Evolution 62: 2155–2177. Stieper, B. C., M. Kupershtok, M. V. Driscoll, and A. W. Shingleton, 2008 Imaginal discs regulate developmental timing in Drosophila melanogaster. Developmental Biology 321: 18– 26. Stocker, H., and E. Hafen, 2000 Genetic control of cell size. Current opinion in genetics & development 10: 529–535. Stocker, H., M. Andjelkovic, S. Oldham, M. Laffargue, M. P. Wymann et al., 2002 Living with lethal PIP3 levels: viability of flies lacking PTEN restored by a PH domain mutation in Akt/PKB. Science 295: 2088–2091. Stunnenberg, H. G., and N. C. Hubner, 2013 Genomics meets proteomics: identifying the culprits in disease. Human Genetics 133: 689–700. Su, T. T., and P. H. O'Farrell, 1998 Size control: cell proliferation does not equal growth. Current biology : CB 8: R687–R689. Sun, X. J., P. Rothenberg, C. R. Kahn, J. M. Backer, E. Araki et al., 1991 Structure of the insulin receptor substrate IRS-1 defines a unique signal transduction protein. Nature 352: 73–77. Sutter, N. B., C. D. Bustamante, K. Chase, M. M. Gray, K. Zhao et al., 2007 A Single IGF1 Allele Is a Major Determinant of Small Size in Dogs. Science 316: 112–115. Swarup, S., and E. M. Verheyen, 2012 Wnt/Wingless Signaling in Drosophila. Cold Spring Harbor perspectives in biology 4: a007930–a007930. Swarup, S., W. Huang, T. F. C. Mackay, and R. R. H. 
Anholt, 2013 Analysis of natural variation reveals neurogenetic networks for Drosophila olfactory behavior. Proceedings of the National Academy of Sciences of the United States of America 110: 1017–1022. Takahashi, Y., H. Kadowaki, K. Momomura, Y. Fukushima, T. Orban et al., 1997 A homozygous kinase-defective mutation in the insulin receptor gene in a patient with leprechaunism. Diabetologia 40: 412–420. Tamemoto, H., T. Kadowaki, K. Tobe, T. Yagi, H. Sakura et al., 1994 Insulin resistance and growth retardation in mice lacking insulin receptor substrate-1. Tang, H. Y., M. S. B. Smith-Caldas, M. V. Driscoll, S. Salhadar, and A. W. Shingleton, 2011

192 FOXO Regulates Organ-Specific Phenotypic Plasticity In Drosophila (E. Hafen, Ed.). PLoS Genetics 7: e1002373. Tapon, N., K. F. Harvey, D. W. Bell, D. C. Wahrer, T. A. Schiripo et al., 2002 salvador Promotes Both Cell Cycle Exit and Apoptosis in Drosophila and Is Mutated in Human Cancer Cell Lines. Cell 110: 467–478. Taylor, S. I., 1992 Lilly Lecture: molecular mechanisms of insulin resistance: lessons from patients with mutations in the insulin-receptor gene. Diabetes 41: 1473–1490. Teleman, A. A., Y.-W. Chen, and S. M. Cohen, 2005 4E-BP functions as a metabolic brake used under stress conditions but not during normal growth. Genes & development 19: 1844– 1848. Teleman, A. A., V. Hietakangas, A. C. Sayadian, and S. M. Cohen, 2008 Nutritional Control of Protein Biosynthetic Capacity by Insulin via Myc in Drosophila. Cell Metabolism 7: 21–32. Tettweiler, G., M. Miron, M. Jenkins, N. Sonenberg, and P. F. Lasko, 2005 Starvation and oxidative stress resistance in Drosophila are mediated through the eIF4E-binding protein, d4E-BP. Genes & development 19: 1840–1843. Thakur, Das, M., Y. Feng, R. Jagannathan, M. J. Seppa, J. B. Skeath et al., 2010 Ajuba LIM proteins are negative regulators of the Hippo signaling pathway. Current biology : CB 20: 657–662. The International HapMap Consortium, 2005 A haplotype map of the human genome. Nature 437: 1299–1320. Thomas, D., 2010 Gene–environment-wide association studies: emerging approaches. Nature reviews. Genetics 11: 259–272. Thomas, D., M. Guthridge, J. Woodcock, and A. Lopez, 2005 14‐3‐3 Protein Signaling in Development and Growth Factor Responses, pp. 285–303 in Current Topics in Developmental Biology, Current Topics in Developmental Biology, Elsevier. Thompson, B. J., and S. M. Cohen, 2006 The Hippo Pathway Regulates the bantam microRNA to Control Cell Proliferation and Apoptosis in Drosophila. Cell 126: 767–774. Thummel, C. S., 1996 Flies on steroids--Drosophila metamorphosis and the mechanisms of steroid hormone action. 
Trends in genetics : TIG 12: 306–310. Tobler, R., S. U. Franssen, R. Kofler, P. Orozco-terWengel, V. Nolte et al., 2014 Massive Habitat-Specific Genomic Response in D. melanogaster Populations during Experimental Evolution in Hot and Cold Environments. Molecular biology and evolution 31: 364–375. Toma, D. P., K. P. White, J. Hirsch, and R. J. Greenspan, 2002 Identification of genes involved in Drosophila melanogaster geotaxis, a complex behavioral trait. Nature Genetics 31: 349–353. Tremblay, F., and A. Marette, 2001 Amino acid and insulin signaling via the mTOR/p70 S6

193 kinase pathway. A negative feedback mechanism leading to insulin resistance in skeletal muscle cells. The Journal of biological chemistry 276: 38052–38060. Trotta, V., F. C. F. Calboli, M. Ziosi, and S. Cavicchi, 2007 Fitness variation in response to artificial selection for reduced cell area, cell number and wing area in natural populations of Drosophila melanogaster. BMC Evolutionary Biology 7 Suppl 2: S10. Tumaneng, K., R. C. Russell, and K.-L. Guan, 2012 Organ Size Control by Hippo and TOR Pathways. Current biology : CB 22: R368–R379. Turner, T. L., and P. M. Miller, 2012 Investigating Natural Variation in Drosophila Courtship Song by the Evolve and Resequence Approach. Genetics 191: 633–642. Turner, T. L., P. M. Miller, and V. A. Cochrane, 2013 Combining genome-wide methods to investigate the genetic complexity of courtship song variation in Drosophila melanogaster. Molecular biology and evolution 30: 2113–2120. Turner, T. L., A. D. Stewart, A. T. Fields, W. R. Rice, and A. M. Tarone, 2011 Population- Based Resequencing of Experimentally Evolved Populations Reveals the Genetic Basis of Body Size Variation in Drosophila melanogaster (G. GIBSON, Ed.). PLoS Genetics 7: e1001336. Tyler, D. M., and N. E. Baker, 2007 Expanded and fat regulate growth and differentiation in the Drosophila eye through multiple signaling pathways. Developmental Biology 305: 187– 201. Tøndel, K., U. G. Indahl, A. B. Gjuvsland, J. O. Vik, P. Hunter et al., 2011 tondel. BMC Systems Biology 5: 90. Udan, R. S., M. Kango-Singh, R. Nolo, C. Tao, and G. Halder, 2003 Hippo promotes proliferation arrest and apoptosis in the Salvador/Warts pathway. Nature Cell Biology 5: 914– 920. Ullrich, A., and J. Schlessinger, 1990 Signal transduction by receptors with tyrosine kinase activity. Cell 61: 203–212. Valdar, W., L. C. Solberg, D. Gauguier, W. O. Cookson, J. N. P. Rawlins et al., 2006 Genetic and Environmental Effects on Complex Traits in Mice. Genetics 174: 959–984. Valentinis, B., and R. 
Baserga, 2001 IGF-I receptor signalling in transformation and differentiation. Molecular pathology : MP 54: 133–137. van Bon, B. W. M., M. A. W. Oortveld, L. G. Nijtmans, M. Fenckova, B. Nijhof, J. Besseling, M. Vos, J. M. Kramer, N. de Leeuw, A. Castells-Nobau, L. Asztalos, E. Viragh, M. Ruiter, F. Hofmann, L. Eshuis, L. Collavin, M. A. Huynen, Z. Asztalos, P. Verstreken, R. J. Rodenburg, J. A. Smeitink, B. B. A. de Vries, and A. Schenck, 2013a CEP89 is required for mitochondrial metabolism and neuronal function in man and fly. Human Molecular Genetics 22: 3138– 3151.

194 van Bon, B. W. M., M. A. W. Oortveld, L. G. Nijtmans, M. Fenckova, B. Nijhof et al. 2013b CEP89 is required for mitochondrial metabolism and neuronal function in man and fly. Human Molecular Genetics 22: 3138–3151. Van der Have, T. M., and G. De Jong, 1996 Adult size in ectotherms: temperature effects on growth and differentiation. Journal of Theoretical Biology 183: 329–340. Van Der Heide, L. P., M. F. M. Hoekman, and M. P. Smidt, 2004 The ins and outs of FoxO shuttling: mechanisms of FoxO translocation and transcriptional regulation. Biochemical Journal 380: 297–309. Vanaphan, N., B. Dauwalder, and R. A. Zufall, 2012 Diversification of takeout, a male-biased gene family in Drosophila. Gene 491: 142–148. Vanhaesebroeck, B., S. J. Leevers, K. Ahmadi, J. Timms, R. Katso et al., 2001 Synthesis and function of 3-phosphorylated inositol lipids. Annual Review of Biochemistry 70: 535–602. Vijendravarma, R. K., S. Narasimha, and T. J. Kawecki, 2011 Chronic malnutrition favours smaller critical size for metamorphosis initiation in Drosophila melanogaster. Journal of evolutionary biology 25: 288–292. Vilhjálmsson, B. J., and M. Nordborg, 2013 The nature of confounding in genome-wide association studies. Nature reviews. Genetics 14: 1–2. Visscher, P. M., W. G. HILL, and N. R. Wray, 2008 Heritability in the genomics era — concepts and misconceptions. Nature reviews. Genetics 9: 255–266. Voet, D., J.G. Voet, and C.W. Pratt, 2001. Fundamentals of Biochemistry. New York: Wiley (2nd ed). Vogt, P. K., 2001 PI 3-kinase, mTOR, protein synthesis and cancer. Trends in molecular medicine 7: 482–484. Wang, B. L., A. Ghaderi, H. Zhou, J. Agresti, D. A. Weitz et al., 2014 Microfluidic high- throughput culturing of single cells for selection based on extracellular metabolite production or consumption. Nature biotechnology 32: 473–478. Wang, K., M. Li, and H. Hakonarson, 2010 Analysing biological pathways in genome-wide association studies. Nature reviews. Genetics 11: 843–854. 
Wang, T., C. C. Y. Hung, and D. J. Randall, 2006 The comparative physiology of food deprivation: From Feast to Famine. Annual Review of Physiology 68: 223–251. Wang, Z. Q., M. R. Fung, D. P. Barlow, and E. F. Wagner, 1994 Regulation of embryonic growth and lysosomal targeting by the imprinted Igf2/Mpr gene. Nature 372: 464–467. Weber, A. L., G. F. Khan, M. M. Magwire, C. L. Tabor, T. F. C. Mackay et al., 2012 Genome- wide association analysis of oxidative stress resistance in Drosophila melanogaster. PLoS ONE 7: e34745.

195 Weber, U., C. Pataki, J. Mihaly, and M. Mlodzik, 2008 Combinatorial signaling by the Frizzled/PCP and Egfr pathways during planar cell polarity establishment in the Drosophila eye. Developmental Biology 316: 110–123. Weigel, D., and R. Mott, 2009 The 1001 Genomes Project for Arabidopsis thaliana. Genome Biology 10: 107. Weigmann, K., S. M. Cohen, and C. F. Lehner, 1997 Cell cycle progression, growth and patterning in imaginal discs despite inhibition of cell division after inactivation of Drosophila Cdc2 kinase. Development (Cambridge, England) 124: 3555–3563. Weinkove, D., and S. J. Leevers, 2000 The genetic control of organ growth: insights from Drosophila. Current opinion in genetics & development 10: 75–80. Weinkove, D., T. P. Neufeld, T. Twardzik, M. D. Waterfield, and S. J. Leevers, 1999 Regulation of imaginal disc cell size, cell number and organ size by Drosophila class IA phosphoinositide 3-kinase and its adaptor. Current biology : CB 9: 1019–1029. Weisman, R., A. Cohen, and S. M. Gasser, 2014 TORC2--a new player in genome stability. EMBO Molecular Medicine 6: 995–1002. Wellcome Trust Case Control Consortium, 2007 Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447: 661–678. Wertheimer, E., S. P. Lu, P. F. Backeljauw, M. L. Davenport, and S. I. Taylor, 1993 Homozygous deletion of the human insulin receptor gene results in leprechaunism. Nature Genetics 5: 71–73. White, M. F., 2002 IRS proteins and the common path to diabetes. American journal of physiology. Endocrinology and metabolism 283: E413–22. White, M. F., 1998 The IRS-signalling system: a network of docking proteins that mediate insulin action. Molecular and cellular biochemistry 182: 3–11. Wichman, H. A., L. A. Scott, C. D. Yarber, and J. J. Bull, 2000 Experimental evolution recapitulates natural evolution. Philosophical Transactions of the Royal Society B: Biological Sciences 355: 1677–1684. Willecke, M., F. Hamaratoglu, M. 
Kango-Singh, R. Udan, C.-L. Chen et al., 2006 The Fat Cadherin Acts through the Hippo Tumor-Suppressor Pathway to Regulate Tissue Size. Current Biology 16: 2090–2100. Willecke, M., F. Hamaratoglu, L. Sansores-Garcia, C. Tao, and G. Halder, 2008 Boundaries of Dachsous Cadherin activity modulate the Hippo signaling pathway to induce cell proliferation. Proceedings of the National Academy of Sciences 105: 14897–14902. Wiser, M. J., N. Ribeck, and R. E. Lenski, 2013 Long-term dynamics of adaptation in asexual populations. Science 342: 1364–1367.

196 Withers, D. J., J. S. Gutierrez, H. Towery, D. J. Burks, J. M. Ren et al., 1998 Disruption of IRS-2 causes type 2 diabetes in mice. Nature 391: 900–904. Womack, J. E., H.-J. Jang, and M. O. Lee, 2012 Genomics of complex traits. Annals of the New York Academy of Sciences 1271: 33–36. Woods, K. A., C. Camacho-Hübner, M. O. Savage, and A. J. Clark, 1996 Intrauterine growth retardation and postnatal growth failure associated with deletion of the insulin-like growth factor I gene. The New England journal of medicine 335: 1363–1367. Wray, N. R., J. Yang, Ben J Hayes, A. L. Price, M. E. Goddard et al., 2013 Pitfalls of predicting complex traits from SNPs. Nature reviews. Genetics 14: 507–515. Wu, J., A.-C. Roman, J. M. Carvajal-Gonzalez, and M. Mlodzik, 2013 wu. Nature Cell Biology 15: 1045–1055. Wu, L., S. I. Candille, Y. Choi, D. Xie, L. Jiang et al., 2013 Variation and genetic control of protein abundance in humans. Nature 498: 79–82. Wu, S., J. Huang, J. Dong, and D. Pan, 2003 hippo Encodes a Ste-20 Family Protein Kinase that Restricts Cell Proliferation and Promotes Apoptosis in Conjunction with salvador and warts. Cell 114: 445–456. Wullschleger, S., R. Loewith, and M. N. Hall, 2006 TOR Signaling in Growth and Metabolism. Cell 124: 471–484. Xu, T., and G. M. Rubin, 1993 Analysis of genetic mosaics in developing and adult Drosophila tissues. Development (Cambridge, England) 117: 1223–1237. Yanagida, M., N. Ikai, M. Shimanuki, and K. Sajiki, 2011 Nutrient limitations alter cell division control and chromosome segregation through growth-related kinases and phosphatases. Philosophical Transactions of the Royal Society B: Biological Sciences 366: 3508–3520. Yang, C.-H., P. Belawat, E. Hafen, L. Y. Jan, and Y.-N. Jan, 2008 Drosophila egg-laying site selection as a system to study simple decision-making processes. Science 319: 1679–1683. Yang, J., B. Benyamin, B. P. McEvoy, S. Gordon, A. K. 
Henders et al., 2010 Common SNPs explain a large proportion of the heritability for human height. Nature Publishing Group 42: 565–569. Yang, J., T. Ferreira, A. P. Morris, S. E. Medland, P. A. F. Madden et al., 2012 Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nature Genetics 44: 369–375. Yang, J., T. A. Manolio, L. R. Pasquale, E. Boerwinkle, N. Caporaso et al., 2011 Genome partitioning of genetic variation for complex traits using common SNPs. Nature Genetics 43: 519–525.

197 Yang, L., F. Meng, D. Ma, W. Xie, and M. Fang, 2012 Bridging Decapentaplegic and Wingless signaling in Drosophila wings through repression of naked cuticle by Brinker. Development (Cambridge, England) 140: 413–422. Yatsenko, A. S., A. K. Marrone, M. M. Kucherenko, and H. R. Shcherbata, 2014 Measurement of Metabolic Rate in Drosophila using Respirometry. Journal of Visualized Experiments. Yenush, L., R. Fernandez, M. G. Myers, T. C. Grammer, X. J. Sun et al., 1996 The Drosophila insulin receptor activates multiple signaling pathways but requires insulin receptor substrate proteins for DNA synthesis. Molecular and Cellular Biology 16: 2509–2517. Young, R. S., A. C. Marques, C. Tibbit, W. Haerty, A. R. Bassett et al., 2012 Identification and Properties of 1,119 Candidate LincRNA Loci in the Drosophila melanogaster Genome. Genome biology and evolution 4: 427–442. Yu, F. X., and K. L. Guan, 2013 The Hippo pathway: regulators and regulations. Genes & development 27: 355–371. Zecca, M., and G. Struhl, 2010 A Feed-Forward Circuit Linking Wingless, Fat-Dachsous Signaling, and the Warts-Hippo Pathway to Drosophila Wing Growth (M. Affolter, Ed.). PLoS biology 8: e1000386. Zeggini, E., and J. P. Ioannidis, 2009 Meta-analysis in genome-wide association studies. Pharmacogenomics 10: 191–201. Zeyl, C., 2006 Experimental evolution with yeast. FEMS yeast research 6: 685–691. Zhang, G., R. Karns, G. Sun, S. R. Indugula, H. Cheng et al., 2012 Finding Missing Heritability in Less Significant Loci and Allelic Heterogeneity: Genetic Variation in Human Height (M. Xiong, Ed.). PLoS ONE 7: e51211. Zhang, H., J. P. Stallock, J. C. Ng, C. Reinhard, and T. P. Neufeld, 2000 Regulation of cellular growth by the Drosophila target of rapamycin dTOR. Genes & development 14: 2712–2724. Zhang, J., J. KIM, A. Alexander, S. Cai, D. N. Tripathi et al., 2013 A tuberous sclerosis complex signalling node at the peroxisome regulates mTORC1 and autophagy in response to ROS. 
Nature Cell Biology 15: 1186–1196. Zhang, L., F. Ren, Q. Zhang, Y. Chen, B. Wang et al., 2008 The TEAD/TEF Family of Transcription Factor Scalloped Mediates Hippo Signaling in Organ Size Control. Developmental Cell 14: 377–387. Zhang, W., S. Patil, B. Chauhan, S. Guo, D. R. Powell et al., 2006 FoxO1 regulates multiple metabolic pathways in the liver: effects on gluconeogenic, glycolytic, and lipogenic gene expression. The Journal of biological chemistry 281: 10105–10117.

198 Zhang, Y., X. Gao, L. J. Saucedo, B. Ru, B. A. Edgar et al., 2003 Rheb is a direct target of the tuberous sclerosis tumour suppressor proteins. Nature Cell Biology 5: 578–581. Ziosi, M., L. A. Baena-López, D. Grifoni, F. Froldi, A. Pession et al., 2010 dMyc Functions Downstream of Yorkie to Promote the Supercompetitive Behavior of Hippo Pathway Mutant Cells (G. S. Barsh, Ed.). PLoS Genetics 6: e1001140. Zwaan, B. J., R. B. Azevedo, A. C. James, J. Van 't Land, and L. Partridge, 2000 Cellular basis of wing size variation in Drosophila melanogaster: a comparison of latitudinal clines on two continents. Heredity 84 ( Pt 3): 338–347.

8. ACKNOWLEDGEMENTS

Here I would like to thank the many people who have supported me over the past almost five years. First, I would like to thank my doctoral advisor Ernst Hafen for giving me the opportunity to work in his group and on this fascinating project, for introducing me to people who have been key to the success of this thesis, and for providing me with great opportunities to develop myself and my scientific career.

I would also like to thank my thesis committee, Konrad Basler, Sven Bergmann and Trudy Mackay, who have given me valuable input for my project and introduced me to people in their labs from whom I could learn a lot.

Many thanks also go to the Hafen lab members, past and present, for good times at and after work. Special thanks go to Anni Strässle, who has supported me greatly in the past 1.5 years and without whom I am not sure it would have been possible to run two large projects at the same time. Thanks also to Maria Skoura and Vasco Medici for supporting me with the experiments and always finding a solution. Furthermore, I would like to thank Anna Troller, who took over the huge job of phenotyping all the DGRP lines with me, and Katja Köhler and Benjamin Schlager for their great help with any kind of writing-related work and for discussions about results.

9. CURRICULUM VITAE

Name: Vonesch

First Name: Sibylle

Date of birth: 11.10.1984

Place of birth: Zürich, Switzerland

Nationality: Swiss

Education:

PhD thesis at ETH Zürich 05/2010 – 09/2014

Supervised by Prof. Dr. Ernst Hafen

Title: A systems genetics approach to understanding intra-species variation in wing and body size

Master's thesis in Molecular and Cellular Biology at Universität Zürich 04/2008 – 07/2009

Supervised by Prof. Dr. Michael O. Hengartner

Title: “Marasmius oreades agglutinin: A nematotoxic fungal lectin and its target glycan structure in C. elegans and Xerocomus chrysenteron lectin toxicity in the nematode C. elegans”

Matura

Kantonsschule Glattal (Switzerland) 08/1999 – 09/2003
