Genetical Genomics in Arabidopsis:
from natural variation to regulatory networks

Joost Keurentjes

Promotors:
Prof. dr. ir. M. Koornneef, Personal chair at the Laboratory of Genetics, Wageningen University
Prof. dr. L.H.W. van der Plas, Professor of Plant Physiology, Wageningen University

Co-promotor:
Dr. D. Vreugdenhil, Associate professor at the Laboratory of Plant Physiology, Wageningen University

Doctoral committee:
Prof. dr. W.J. Stiekema, Wageningen University
Prof. dr. R.G.F. Visser, Wageningen University
Prof. dr. J.C.M. Smeekens, Utrecht University
Prof. dr. M. Stitt, Max Planck Institute of Molecular Plant Physiology, Golm, Germany

This research was conducted within the graduate school Experimental Plant Sciences.

Thesis
submitted to obtain the degree of doctor on the authority of the Rector Magnificus of Wageningen University, Prof. dr. M.J. Kropff, to be defended in public on Friday 7 September 2007 at 13:30 in the Aula

Joost J.B. Keurentjes
Genetical Genomics in Arabidopsis: from natural variation to regulatory networks (2007)
PhD thesis, Wageningen University, Wageningen, The Netherlands.
With references; with summaries in English and Dutch.
ISBN 978-90-8504-704-9

CONTENTS

Chapter 1  General introduction
Chapter 2  Development of a Near-Isogenic Line population of Arabidopsis thaliana and comparison of mapping power with a Recombinant Inbred Line population
Chapter 3  Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci
Chapter 4  The genetics of plant metabolism
Chapter 5  Integrative analyses of genetic variation in enzyme activities of primary carbohydrate metabolism reveal distinct modes of regulation in Arabidopsis thaliana
Chapter 6  General discussion
Summary
Samenvatting (summary in Dutch)
Publications
Curriculum vitae
Nawoord (afterword)
Education Statement

Chapter 1

General introduction

Natural variation and quantitative traits

For most organisms variation between individuals can be observed in nature. Plants are no exception, and naturally occurring variation can be observed between and within species. Although part of the within-species variation observed in nature can be attributed to environmental influences, genetic variation can be observed when plants of different origins are grown together in the same environment (Nordborg et al., 2005). The contribution of genetic factors to the total observed variation between different genotypes is often expressed as the heritability of a trait.

Natural variation exhibited by genotypically different accessions can be classified as qualitative or quantitative. Qualitative traits are characterized by distinct phenotypic classes, e.g. presence or absence of a property, often resulting from genetic differences at single genes. Such traits can be dissected genetically with relative ease because of their clear segregation pattern in the progeny of crosses. Quantitative traits, on the other hand, often display a more continuous variation in phenotypes because of the multiplicity of genes involved and a relatively large effect of environmental factors on the expression of the trait. Because different genes can contribute positively or negatively to a quantifiable trait, recombination of genes results in a large number of phenotypic classes which can
not unambiguously be associated with genotypic classes (Kearsey et al., 2003; Weigel and Nordborg, 2005; Holland, 2007). The complexity of quantitative traits is further increased by the presence of epistatic interactions and interactions between genes and the environment (Carlborg and Haley, 2004; Kroymann and Mitchell-Olds, 2005).

Although much more difficult to dissect, quantitative variation is found for many agronomically important traits such as biomass formation, plant height, flowering time, reproductive yield and seed dormancy (Koornneef et al., 2004; Ross-Ibarra, 2005; Ashikari and Matsuoka, 2006; Semel et al., 2006; Zhao et al., 2006). Furthermore, quantitative natural variation controls adaptive strategies to cope with biotic and abiotic influences, and understanding it can provide insight into ecological mechanisms and the evolutionary history of plants (Tonsor et al., 2005; Mitchell-Olds and Schmitt, 2006).
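
The notion of heritability introduced in this section can be made concrete with a small calculation. The following is only a minimal sketch, assuming a balanced design in which every genotype (for example a homozygous line) is measured in the same number of replicates, and assuming that broad-sense heritability is estimated as V_G / (V_G + V_E) from one-way ANOVA variance components. The function names, the simulated data and the parameter values are illustrative assumptions, not taken from the thesis.

```python
import random
from statistics import mean


def broad_sense_heritability(phenotypes):
    """Estimate H2 = V_G / (V_G + V_E) from a balanced replicated design.

    phenotypes maps a genotype label to its list of replicate measurements.
    """
    groups = list(phenotypes.values())
    g = len(groups)
    r = len(groups[0]) if groups else 0
    if g < 2 or r < 2 or any(len(v) != r for v in groups):
        raise ValueError("need >= 2 genotypes, each with the same number (>= 2) of replicates")
    grand_mean = mean(x for v in groups for x in v)
    # One-way ANOVA mean squares with genotype as the only factor.
    ms_between = r * sum((mean(v) - grand_mean) ** 2 for v in groups) / (g - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in groups for x in v) / (g * (r - 1))
    v_e = ms_within                               # environmental (residual) variance
    v_g = max((ms_between - ms_within) / r, 0.0)  # genetic variance, truncated at zero
    total = v_g + v_e
    return v_g / total if total > 0 else 0.0


def simulate_lines(n_lines=500, n_reps=4, sd_genetic=1.0, sd_env=1.0, seed=1):
    """Simulate replicated measurements for homozygous lines (hypothetical data)."""
    rng = random.Random(seed)
    data = {}
    for i in range(n_lines):
        genetic_value = rng.gauss(0.0, sd_genetic)
        data[f"line_{i}"] = [genetic_value + rng.gauss(0.0, sd_env) for _ in range(n_reps)]
    return data


h2 = broad_sense_heritability(simulate_lines())
print(f"estimated H2: {h2:.2f}")  # should lie near 0.5 when both variances are equal
```

With equal simulated genetic and environmental variances the estimate should fall close to one half; a trait dominated by environmental noise would give a value near zero.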

Arabidopsis thaliana as a model plant

The study of quantitative traits is often contrasted with the analysis of qualitative traits, which are mostly represented by single gene mutants or single gene natural variants. For the study of such single genes Arabidopsis thaliana has proven to be a very efficient model plant because of a number of biological properties that facilitate genetic analyses (Somerville and Koornneef, 2002). Although it is self-fertilizing it can easily be out-crossed, and it combines short generation times with high reproductive yield. Moreover, it has a small, fully sequenced genome (120 Mbp) made up of only five chromosomes and approximately 30,000 genes (The Arabidopsis Genome Initiative, 2000). The accumulation of knowledge, biological resources and available molecular tools adds to the attractiveness of Arabidopsis as a model system (Alonso and Ecker, 2006).

These advantages also make Arabidopsis very suitable for the genetic analysis of natural variation. The plant is broadly distributed throughout the northern hemisphere on different continents, including America, Africa, Europe and Asia (Schmid et al., 2006). Moreover, it is found at different latitudes and altitudes, ranging from sea level in Scandinavia to high up in the Asian Himalayas. At many locations accessions, or ecotypes, have been collected that display a broad spectrum of natural variation for numerous traits (Alonso-Blanco and Koornneef, 2000; Koornneef et al., 2004). Many of those accessions have been deposited in stock centers, making them publicly available for genetic analyses.

Genetic analysis of quantitative traits

Despite the complexity of the genetic regulation of quantitative traits, much progress has been made over the past decades in dissecting these traits by the use of molecular markers. The increasing ease with which molecular
markers can be generated (Borevitz and Chory, 2004), in combination with the application of sophisticated mapping methods (Jansen, 1993), has led to a strong interest in the use of natural variation for studying quantitative traits (Slate, 2005). Mutant screens, often directed at a specific trait, and the subsequent mapping and cloning of the affected gene, have been a very effective strategy to analyze the function of genes in Arabidopsis (Meinke et al., 2003). However, specific advantages are associated with the study of multiple natural perturbations in the same mapping population, which allows the analysis of an almost infinite number of traits (Doerge, 2002). For this type of study so-called immortal mapping populations, consisting in most cases of homozygous genotypes that can be tested in replicates and in different experiments, have proven very useful.

Although various types of such mapping populations have been developed for a variety of species (Eshed and Zamir, 1995; Rae et al., 1999; Yoon et
al., 2006), the relative ease of generating recombinant inbred lines (RILs) has led to their preferred use for quantitative trait locus (QTL) analysis in Arabidopsis and many other plants (Jansen, 2003b). RILs are produced by crossing two distinct genotypes and propagating a random set of F2 individuals by selfing and single seed descent until inbred lines are obtained. While the accuracy of QTL mapping depends on statistical factors such as the size of the mapping population, it has been shown to be quite accurate in many cases (Price, 2006). However, there is often still a need for confirmation and further fine mapping (Paran and Zamir, 2003; Weigel and Nordborg, 2005). For these aspects, which are the basis of the cloning of genes underlying QTLs, near isogenic lines (NILs) are often used to isolate a QTL. A set of NILs consists of lines with an identical genetic background that differ in genotype at a limited number of loci. NILs are generally constructed by introgressing genomic regions of a donor accession into the genetic background of another accession by crossing and repeated back-crossing with the recurrent accession. NILs allow the effect of Mendelized QTLs to be studied and can refine the position of a QTL by varying the position and size of introgressions.

Despite the fact that RIL populations have been developed for an increasing number of different genotypes, the development of NILs has lagged behind. Upon the detection of a QTL, NILs are often not available for its confirmation and fine mapping. Valuable time is often lost in developing NILs before the necessary follow-up experiments can be continued. The Landsberg erecta (Ler) x Cape Verde Islands (Cvi) RIL population (Alonso-Blanco et al., 1998b) is one of the most frequently used populations in quantitative
genetics and several NILs have been developed at distinct loci for these genotypes (Alonso-Blanco et al., 1998a; Swarup et al., 1999; Alonso-Blanco et al., 2003; Bentsink et al., 2003; Edwards et al., 2005; Juenger et al., 2005; Teng et al., 2005). However, most of these NILs were developed after the detection of QTLs in the RIL population, and these studies could have benefited much from the direct availability of NILs. To increase the efficiency of the path from mapping quantitative traits to the actual cloning of the causal genes it would therefore be advantageous to have a NIL for every possible genomic location at one's disposal. Moreover, collections of NILs with genome-wide coverage can serve as mapping populations, which differ in effectiveness from RILs, mainly because the complexity of epistasis is strongly reduced (Eshed and Zamir, 1995).

Genetical genomics: variation in genome sequence and expression

In Arabidopsis as well as in other species, genome-wide analyses of genomic polymorphisms in large collections of accessions have revealed extensive sequence variation (Borevitz et al., 2003; Han and Xue, 2003; Schmid et al., 2003;
Nordborg et al., 2005; Vigouroux et al., 2005). Polymorphisms, when converted to molecular markers, are indispensable for (fine) mapping of quantitative traits in experimental populations. When surveyed in natural populations at high density, polymorphisms even enable high resolution mapping through linkage disequilibrium (Remington et al., 2001; Nordborg et al., 2002; Aranzana et al., 2005; Kim et al., 2006). The best marker, however, is the polymorphism causal for the observed variation. By definition natural genetic variation is a result of genomic differences, and therefore the extent of variation in quantitative traits is largely dependent on the level of DNA sequence variation. Although many of the polymorphisms will be neutral, there is little doubt that the study of quantitative traits can benefit enormously from genomic analyses (Borevitz and Nordborg, 2003; Maloof, 2003; Gilad and Borevitz, 2006). Non-synonymous polymorphisms in coding sequences of genes might alter protein function or stability, introducing phenotypic variation. Polymorphisms in regulatory sequences, on the other hand, might result in differences in the transcriptional efficiency of genes. It is conceivable that expression differences, or variation in mRNA stability caused by coding sequence polymorphisms, contribute heavily to natural variation in Arabidopsis (Chen et al., 2005). Given the extensive variation in phenotype and genomic sequence within Arabidopsis, it is therefore not surprising that expression differences between accessions can be observed for many genes (Vuylsteke et al., 2005; Kliebenstein et al., 2006a; West et al., 2006).

The genetic regulation of natural variation in gene expression is presumably not different from that of any other 'classical' quantitative trait. Therefore, gene expression can be treated
like any other quantitative trait, to which all statistical tools of quantitative genetics can be applied. Moreover, the effect of this variation may be reflected at the phenotypic level, thereby explaining the genetic component of natural phenotypic variation. This combination of linkage analysis (genetics) and expression profiling (genomics) was termed 'genetical genomics' (Jansen and Nap, 2001); experiments were first reported in yeast (Brem et al., 2002), soon followed by data from higher eukaryotes (Schadt et al., 2003). Because of the available high quality mapping populations and the commercially available genome-wide microarrays, Arabidopsis is ideally suited for these kinds of analyses. However, when the first genetical genomics studies were published no genome-wide data for Arabidopsis were available yet, and only recently have a number of studies in various RIL populations indicated extensive genetic regulation of gene expression (DeCook et al., 2006; Vuylsteke et al., 2006; Keurentjes et al., 2007; West et al., 2007).

Genome-wide expression analysis of fully sequenced genomes, like that of Arabidopsis, offers the unique possibility to compare genomic positions of genes
with the map position(s) of their detected expression QTL(s) (eQTL). Such comparative analyses reveal either local or distant regulation of gene expression. Local regulatory variation is observed when genes and their respective eQTLs co-locate, and distant regulatory variation is observed when genes and their respective eQTLs are positionally separated on the genome (Rockman and Kruglyak, 2006). Local regulatory variation is often a result of polymorphisms within the gene for which the eQTL was observed. When such polymorphisms reside in cis-acting regulatory elements this might affect transcriptional activity. Regulation in cis could also act post-transcriptionally by altering mRNA stability when polymorphisms reside in coding sequences of the gene. However, polymorphisms within the gene itself might also act in trans by altering auto-regulation and feedback loops. Furthermore, local regulatory variation might occasionally act in trans due to polymorphisms in a tightly linked gene that regulates the gene for which the eQTL was detected. To determine whether local regulatory variation acts in cis or trans, further experimentation, such as allele specific expression analysis, is necessary (Ronald et al., 2005; Zhang et al., 2007). Distant regulatory variation most likely acts in trans when polymorphisms in another gene (e.g. a transcription factor) affect transcription of the gene for which the distant eQTL was detected. Nonetheless, other mechanisms of distant regulation, both in cis and trans, are imaginable (Rockman and Kruglyak, 2006).

Genetic regulation of plant metabolic content

The impact of gene expression variation on quantitative traits is now widely acknowledged and the use of high throughput genomic analyses has become an important tool in genetic analyses of natural variation (Gibson and
Weir, 2005). Transcription, however, is only a first link in the chain from genotype to phenotype, and successive entities like proteins and metabolites (quality and quantity) are expected to be causal sources of natural phenotypic variation but have been largely under-exploited. Yet high-throughput technologies, i.e. proteomics and metabolomics, have shown that much variation is observed upon physiological perturbation and between genetic variants (Fiehn et al., 2000; Chevalier et al., 2004). Moreover, small-scale targeted analyses and subsequent QTL analysis revealed strong genetic regulation in a number of studies (Kliebenstein et al., 2001; Consoli et al., 2002).

Analogous to genetical genomics, the combination of high-throughput proteomics and metabolomics with multifactorial genetic analyses would therefore allow the functional consequences of natural genetic variation to be studied at a much larger scale (Jansen, 2003a). However, full-scale analyses for proteins and metabolites, equivalent to genome-wide expression analysis, are not available yet.
This is mainly because proteins and metabolites are much more diverse in their properties than nucleic acids, making it difficult to extract and analyze all different classes using a single protocol. Even with a fully sequenced genome one cannot predict all protein variants and metabolites that a plant can contain. Moreover, the dynamic range of protein and metabolite abundance is far greater than for nucleic acids and no amplification techniques are available for these entities, making sample volume and detection range (sensitivity vs. saturation) critical limitations. Nevertheless, several complementary high-throughput technologies have been developed that together cover a large part of the proteome (Peck, 2005) and metabolome (Ward et al., 2003; Lisec et al., 2006; De Vos et al., 2007).

The progress made in proteomics and metabolomics now also enables the large-scale genetic analysis of these entities, which has only recently been demonstrated for primary metabolites (Schauer et al., 2006). However, variation in secondary metabolism is probably more extensive and determines much of the phenotypic variation that can be observed. Plants are especially rich in secondary metabolites, possibly as a consequence of their sessile nature. Since plants are unable to move away from biotic and abiotic threats they have adapted to cope with many environmental influences. In Arabidopsis alone, hundreds of secondary metabolites representing numerous chemical classes have already been discovered (D'Auria and Gershenzon, 2005). Given the wide global distribution range of Arabidopsis and the diverse range of sites from which plants have been collected, it is conceivable that metabolites play an important role in local adaptation strategies. It is therefore likely that the high level of natural variation in
Arabidopsis is also reflected in metabolite composition and content (Fiehn, 2002).

A large drawback of metabolomic analyses is the lack of compound identification. Unlike microarrays, where each signal can be traced to a specific gene, most large-scale metabolomic techniques are untargeted. The output of a metabolic sample analysis typically consists of a complex chromatogram of many, often anonymous, peaks, where compounds can be represented by multiple peaks depending on adduct formation, fragmentation and isotopes. For genetic analysis it is essential that chromatograms of different genotypes are qualitatively comparable. This alignment problem can be solved by adding reference compounds, standardization and proper alignment software (Lisec et al., 2006; De Vos et al., 2007). Although each peak represents a specific chemical compound, the order, retention time and intensity of peaks can differ substantially depending on analytical differences and sample properties. Such inconsistencies in data output make it difficult to compare analyses performed in different labs or experiments. Although some efforts have been made in constructing identification libraries
(Schauer et al., 2005; Moco et al., 2006; Ward et al., 2007), such libraries do not entirely cover the still expanding number of detected compounds. Moreover, the different methodologies applied in various labs make it difficult to implement such libraries. The scientific community would therefore benefit much from a commonly adopted standard for metabolomic analyses (Jenkins et al., 2004).

Regulatory network construction

To functionally link the large data sets obtained in 'omic' experiments as an order of events that ultimately results in a specific phenotype, network construction provides a useful tool. Biological networks describe relationships between individual components of a biological process (Barabasi and Oltvai, 2004). Such components can be genes, proteins, metabolites or a combination thereof. Depending on the data source, networks can be constructed in various ways, but all of them serve to elucidate the, often complex, regulation of biological processes.

A special type of network does not rely on experimental data but rather predicts connections in silico based on genome-wide sequence information. Most notable are genome-scale metabolic connectivity networks, in which metabolites are connected when the genome contains a gene encoding an enzyme able to catalyze the conversion of one of the metabolites into the other (Jeong et al., 2000). However, genetic networks have also been predicted in silico by analyzing regulatory elements of genes for binding sites of known transcription factors (Palaniswamy et al., 2006). Although powerful in hypothesis formation, such studies require empirical data for confirmation of predicted pathways and interactions. Therefore, many approaches for network construction are based on experimental data, which also allows the identification of relationships that cannot be predicted from genomic
information only. Protein-protein interactions, for instance, are difficult to deduce from sequence information and instead require immunoprecipitation or two-hybrid screens. Similar analyses, like chromatin immunoprecipitation (ChIP-chip), can also be used to identify and confirm transcriptional regulation of target genes by transcription factors or other known regulators (Lee et al., 2002). In yeast, much progress in regulatory network construction has been made by expression and metabolic profiling of deletion strains (Forster et al., 2002; Hu et al., 2007) and genetic interaction analyses of double mutants (synthetic lethals) (Tong et al., 2004). However, for most higher eukaryotes such genome-wide analyses are not realistic because of the much higher gene number, the presumably more complex genetic architecture, and aspects of sub-cellular and tissue specific compartmentation. Many attempts at regulatory network construction therefore rely on more indirect approaches of establishing associations between network components.
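
The rule behind the genome-scale metabolic connectivity networks mentioned above (two metabolites are connected when the genome encodes an enzyme that converts one into the other) can be illustrated with a toy example. This is only a sketch under stated assumptions: the annotation table, the gene labels and the metabolite labels are hypothetical placeholders, and a real analysis would start from a curated genome annotation.

```python
from collections import defaultdict

# Hypothetical annotation: enzyme-encoding gene -> (substrate, product).
# Gene and metabolite names are placeholders, not real Arabidopsis identifiers.
ENZYME_ANNOTATION = {
    "gene1": ("M1", "M2"),
    "gene2": ("M2", "M3"),
    "gene3": ("M2", "M4"),
    "gene4": ("M1", "M2"),  # a second enzyme annotated for the same conversion
}


def build_connectivity_network(annotation):
    """Connect two metabolites when at least one annotated enzyme converts one into the other."""
    edges = defaultdict(set)  # unordered metabolite pair -> supporting genes
    for gene, (substrate, product) in annotation.items():
        if substrate == product:
            continue  # ignore degenerate annotations
        edges[frozenset((substrate, product))].add(gene)
    return dict(edges)


def node_degrees(edges):
    """Count, for every metabolite, the number of metabolites it is connected to."""
    degrees = defaultdict(int)
    for pair in edges:
        for metabolite in pair:
            degrees[metabolite] += 1
    return dict(degrees)


network = build_connectivity_network(ENZYME_ANNOTATION)
for pair, genes in network.items():
    print(sorted(pair), "supported by", sorted(genes))
print(node_degrees(network))
```

Such a network is a prediction only; as noted above, the connections would still need empirical confirmation.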

A straightforward approach is correlation analysis over a large set of data compiled from numerous perturbation experiments (de la Fuente et al., 2004). Exemplary are the widely applied gene co-expression analyses, in which correlation in gene expression patterns is surveyed under a large number of diverse conditions (Stuart et al., 2003; Gachon et al., 2005). The rationale for this kind of analysis is that genes participating in the same biological process are often co-regulated and hence exhibit similar expression patterns. Following the same line of reasoning, metabolic correlation networks have been constructed (Steuer et al., 2003). However, correlation does not necessarily imply functional relatedness, nor does it address causality. The reliability of, and information contained in, constructed networks would therefore gain much strength from integrated analyses of interdisciplinary approaches (Fiehn et al., 2001; Winnacker, 2003). Such integrated studies can either combine experimental data with in silico analyses (Segal et al., 2003) or benefit from multi-parallel analyses of diverse biological samples (Urbanczyk-Wochniak et al., 2003; Hirai et al., 2005; Joosen et al., 2007).

Although demonstrably effective, correlation analyses depend on large compendia of publicly available data or suffer from the limited number of physiological conditions that can be analyzed in dedicated experiments. However, co-regulation is sometimes displayed only under particular conditions (Gachon et al., 2005), and may even remain undiscovered in large data sets due to dilution effects. The largest drawback of correlation analyses, however, is that no information can be retrieved about the nature of the underlying genetic regulation. Correlation may result from co-regulation by a common regulator or from independent pathways that occur in
parallel, possibly due to developmental or spatial control. A highly correlated cluster of biological elements, such as genes, proteins and metabolites, can also result from downstream effects of the regulation of a single member, but no information about cause and consequence can be extracted from genetic correlations.

Mapping populations combine a high number of genetic perturbations, by which numerous quantitative traits can segregate in a single experiment. Moreover, genetic analysis offers the unique possibility of identifying genomic loci causal for observed variation in, and possible correlation between, traits. When applied to genome-wide expression analysis or other large-scale 'omic' analyses this therefore allows the identification of true gene-to-gene or gene-to-function regulation. Unfortunately, mapping resolution is often not high enough to identify the causal genes underlying detected QTLs directly, and further analysis is required, such as fine mapping and the study of overexpressors and mutants of candidate genes. However, cis-regulated genes are obvious candidates and co-regulated traits can effectively be identified through co-location of detected QTLs. Still, not all
coinciding QTLs necessarily represent the same causal gene, because effects of closely linked genes are difficult to distinguish from true pleiotropic effects of a single gene. Without further experimentation genetic interactions can be predicted computationally by comparing QTL profiles and by correlation analyses (Zhu et al., 2004; Bing and Hoeschele, 2005; Li et al., 2005; Lan et al., 2006; Fu et al., 2007). However, the accuracy of constructed networks can benefit tremendously from the integration of additional information like gene ontology (Kliebenstein et al., 2006b; Keurentjes et al., 2007), sequence data (Hitzemann et al., 2003) and related quantitative trait data (Consoli et al., 2002; Hubner et al., 2005).

Although much progress has been made in the construction of regulatory networks, any information inferred from such networks should be interpreted with caution. Whereas many studies have demonstrated the identification of correct interactions, most approaches cannot exclude the assignment of false positives. Predicted interactions and regulatory steps should therefore be considered hypotheses only, and confirmation of such relationships should come from additional experimentation.
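
The correlation-based approach discussed earlier in this section can be sketched in a few lines. The example below is a minimal illustration, assuming Pearson correlation over a small set of conditions and an arbitrary threshold of 0.9 on the absolute correlation; the gene labels and expression values are invented for the example. In line with the caveats above, the resulting edges indicate association only and say nothing about cause and consequence.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / sqrt(sxx * syy)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


# Hypothetical expression profiles over four conditions.
expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],    # rises together with geneA
    "geneC": [4, 3, 2, 1],    # mirrors geneA
    "geneD": [1, -1, -1, 1],  # unrelated pattern
}
edges = coexpression_edges(expression)
for edge in edges:
    print(edge)
```

Including negative correlations (via the absolute value) is a design choice made here; restricting the network to positive correlations would only require dropping `abs`.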

Scope of the thesis

In Arabidopsis natural variation exists for many quantitative traits. The genetic regulation of quantitative traits can effectively be analyzed in mapping populations by way of quantitative trait locus (QTL) analyses. This thesis describes the large-scale genetic analysis of 'omics' data and their use in dissecting the genetic regulation of quantitative traits.

Chapter two describes the development of a near isogenic line (NIL) population and its use in mapping and fine mapping of QTLs. NILs are widely used in the confirmation of QTLs detected in recombinant inbred line (RIL) populations. However, when a population of NILs with genome-wide coverage is available, such a population can also be used for mapping purposes. A genome-wide NIL population was generated by introgressing genomic regions of an accession from the Cape Verde Islands (Cvi) into the genetic background of the commonly used laboratory accession Landsberg erecta (Ler). The mapping power and resolution of this population were compared with those of the previously developed Ler x Cvi RIL population.

Chapter three describes the genome-wide expression analysis of the Ler x Cvi RIL population. Similar to 'classical' quantitative traits, natural variation also exists for the expression levels of many genes. QTL mapping of expression variation therefore reveals genomic loci controlling the expression of genes. This information can then be used to construct genetic regulatory networks and help to elucidate the genetic control of many physiological traits.

Chapter four describes the large-scale untargeted metabolomic analyses in the Ler x Cvi RIL population. Subsequent mapping revealed substantial genetic control of metabolite composition and content. Identification of anonymous mass peaks enabled the reconstruction of metabolic pathways and revealed novel
biosynthetic steps.

Chapter five describes the integrated analysis of gene expression, enzyme activities and metabolite content in primary carbohydrate metabolism. QTL and correlation analyses identified different modes of control of primary carbohydrate metabolism, including regulation of structural gene expression and metabolic control.

Finally, in chapter six, the work described in this thesis is summarized and discussed.
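
As a concrete illustration of the eQTL comparison outlined in this introduction, the distinction between local and distant regulation can be expressed as a simple positional test. The sketch below is not the procedure used in the thesis; it assumes that each eQTL is summarized by a chromosome and a support interval, that positions are given in Mb, and that an optional margin (a hypothetical parameter) allows for uncertainty in the interval boundaries.

```python
def classify_eqtl(gene_chr, gene_pos, qtl_chr, qtl_start, qtl_end, margin=0.0):
    """Label an eQTL 'local' if the gene lies within its support interval, else 'distant'.

    Positions are physical positions in Mb; margin widens the interval on both sides.
    """
    if qtl_start > qtl_end:
        raise ValueError("qtl_start must not exceed qtl_end")
    same_chromosome = gene_chr == qtl_chr
    inside_interval = qtl_start - margin <= gene_pos <= qtl_end + margin
    return "local" if same_chromosome and inside_interval else "distant"


# Hypothetical examples: a gene at 5.2 Mb on chromosome 1.
print(classify_eqtl(1, 5.2, 1, 4.0, 6.5))  # gene inside the interval
print(classify_eqtl(1, 5.2, 3, 4.0, 6.5))  # eQTL on another chromosome
```

A 'local' label from such a test does not establish regulation in cis: as discussed above, a tightly linked regulator can produce the same pattern, so allele specific expression analysis or similar experiments remain necessary.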

REFERENCES

Alonso-Blanco, C., El-Assal, S.E., Coupland, G. and Koornneef, M. (1998a). Analysis of natural allelic variation at flowering time loci in the Landsberg erecta and Cape Verde Islands ecotypes of Arabidopsis thaliana. Genetics 149, 749-764.
Alonso-Blanco, C., Peeters, A.J., Koornneef, M., Lister, C., Dean, C., van den Bosch, N., Pot, J. and Kuiper, M.T. (1998b). Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J 14, 259-271.
Alonso-Blanco, C. and Koornneef, M. (2000). Naturally occurring variation in Arabidopsis: an underexploited resource for plant genetics. Trends Plant Sci 5, 22-29.
Alonso-Blanco, C., Bentsink, L., Hanhart, C.J., Blankestijn-de Vries, H. and Koornneef, M. (2003). Analysis of natural allelic variation at seed dormancy loci of Arabidopsis thaliana. Genetics 164, 711-729.
Alonso, J.M. and Ecker, J.R. (2006). Moving forward in reverse: genetic technologies to enable genome-wide phenomic screens in Arabidopsis. Nat Rev Genet 7, 524-536.
Aranzana, M.J., Kim, S., Zhao, K., Bakker, E., Horton, M., Jakob, K., Lister, C., Molitor, J., Shindo, C., Tang, C. et al. (2005). Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes. PLoS Genet 1, e60.
Ashikari, M. and Matsuoka, M. (2006). Identification, isolation and pyramiding of quantitative trait loci for rice breeding. Trends Plant Sci 11, 344-350.
Barabasi, A.L. and Oltvai, Z.N. (2004). Network biology: understanding the cell's functional organization. Nat Rev Genet 5, 101-113.
Bentsink, L., Yuan, K., Koornneef, M. and Vreugdenhil, D. (2003). The genetics of phytate and phosphate accumulation in seeds and leaves of Arabidopsis thaliana, using natural variation. Theor Appl Genet 106, 1234-1243.
Bing, N. and Hoeschele,
I.ȱ (2005).ȱ Geneticalȱ genomicsȱ analysisȱ ofȱ aȱ yeastȱ segregantȱ populationȱ forȱ transcriptionȱnetworkȱinference.ȱGeneticsȱ170,ȱ533Ȭ542.ȱ Borevitz,ȱJ.O.,ȱLiang,ȱD.,ȱPlouffe,ȱD.,ȱChang,ȱH.S.,ȱZhu,ȱT.,ȱWeigel,ȱD.,ȱBerry,ȱC.C.,ȱWinzeler,ȱE.ȱandȱ Chory,ȱ J.ȱ (2003).ȱ LargeȬscaleȱ identificationȱ ofȱ singleȬfeatureȱ polymorphismsȱ inȱ complexȱ genomes.ȱGenomeȱResȱ13,ȱ513Ȭ523.ȱ Borevitz,ȱJ.O.ȱandȱNordborg,ȱM.ȱ(2003).ȱTheȱimpactȱofȱgenomicsȱonȱtheȱstudyȱofȱnaturalȱvariationȱinȱ Arabidopsis.ȱPlantȱPhysiolȱ132,ȱ718Ȭ725.ȱ Borevitz,ȱ J.O.ȱ andȱ Chory,ȱ J.ȱ (2004).ȱ Genomicsȱ toolsȱ forȱ QTLȱ analysisȱ andȱ geneȱ discovery.ȱ Currȱ Opinȱ PlantȱBiolȱ7,ȱ132Ȭ136.ȱ Brem,ȱ R.B.,ȱ Yvert,ȱ G.,ȱ Clinton,ȱ R.ȱ andȱ Kruglyak,ȱ L.ȱ (2002).ȱ Geneticȱ dissectionȱ ofȱ transcriptionalȱ regulationȱinȱbuddingȱyeast.ȱScienceȱ296,ȱ752Ȭ755.ȱ Carlborg,ȱ O.ȱ andȱ Haley,ȱ C.S.ȱ (2004).ȱ Epistasis:ȱ tooȱ oftenȱ neglectedȱ inȱ complexȱ traitȱ studies?ȱ Natȱ Revȱ Genetȱ5,ȱ618Ȭ625.ȱ Chen,ȱW.J.,ȱChang,ȱS.H.,ȱHudson,ȱM.E.,ȱKwan,ȱW.K.,ȱLi,ȱJ.,ȱEstes,ȱB.,ȱKnoll,ȱD.,ȱShi,ȱL.ȱandȱZhu,ȱT.ȱ (2005).ȱContributionȱofȱtranscriptionalȱregulationȱtoȱnaturalȱvariationsȱinȱArabidopsis.ȱGenomeȱ Biolȱ6,ȱR32.ȱ Chevalier,ȱF.,ȱMartin,ȱO.,ȱRofidal,ȱV.,ȱDevauchelle,ȱA.D.,ȱBarteau,ȱS.,ȱSommerer,ȱN.ȱandȱRossignol,ȱM.ȱ (2004).ȱProteomicȱinvestigationȱofȱnaturalȱvariationȱbetweenȱArabidopsisȱecotypes.ȱProteomicsȱ 4,ȱ1372Ȭ1381.ȱ Consoli,ȱL.,ȱLefevre,ȱA.,ȱZivy,ȱM.,ȱdeȱVienne,ȱD.ȱandȱDamerval,ȱC.ȱ(2002).ȱQTLȱanalysisȱofȱproteomeȱ andȱtranscriptomeȱvariationsȱforȱdissectingȱtheȱgeneticȱarchitectureȱofȱcomplexȱtraitsȱinȱmaize.ȱ PlantȱMolȱBiolȱ48,ȱ575Ȭ581.ȱ


D'Auria, J.C. and Gershenzon, J. (2005). The secondary metabolism of Arabidopsis thaliana: growing like a weed. Curr Opin Plant Biol 8, 308-316.
de la Fuente, A., Bing, N., Hoeschele, I. and Mendes, P. (2004). Discovery of meaningful associations in genomic data using partial correlation coefficients. Bioinformatics 20, 3565-3574.
De Vos, R.C., Moco, S., Lommen, A., Keurentjes, J.J.B., Bino, R.J. and Hall, R.D. (2007). Untargeted large-scale plant metabolomics using liquid chromatography coupled to mass spectrometry. Nat Protoc 2, 778-791.
DeCook, R., Lall, S., Nettleton, D. and Howell, S.H. (2006). Genetic regulation of gene expression during shoot development in Arabidopsis. Genetics 172, 1155-1164.
Doerge, R.W. (2002). Mapping and analysis of quantitative trait loci in experimental populations. Nat Rev Genet 3, 43-52.
Edwards, K.D., Lynn, J.R., Gyula, P., Nagy, F. and Millar, A.J. (2005). Natural allelic variation in the temperature-compensation mechanisms of the Arabidopsis thaliana circadian clock. Genetics 170, 387-400.
Eshed, Y. and Zamir, D. (1995). An introgression line population of Lycopersicon pennellii in the cultivated tomato enables the identification and fine mapping of yield-associated QTL. Genetics 141, 1147-1162.
Fiehn, O., Kopka, J., Dormann, P., Altmann, T., Trethewey, R.N. and Willmitzer, L. (2000). Metabolite profiling for plant functional genomics. Nat Biotechnol 18, 1157-1161.
Fiehn, O., Kloska, S. and Altmann, T. (2001). Integrated studies on plant biology using multiparallel techniques. Curr Opin Biotechnol 12, 82-86.
Fiehn, O. (2002). Metabolomics--the link between genotypes and phenotypes. Plant Mol Biol 48, 155-171.
Forster, J., Gombert, A.K. and Nielsen, J. (2002). A functional genomics approach using metabolomics and in silico pathway analysis. Biotechnol Bioeng 79, 703-712.
Fu, J., Swertz, M.A., Keurentjes, J.J.B. and Jansen, R.C. (2007). MetaNetwork: a computational protocol for the genetic study of metabolic networks. Nat Protoc 2, 685-694.
Gachon, C.M., Langlois-Meurinne, M., Henry, Y. and Saindrenan, P. (2005). Transcriptional co-regulation of secondary metabolism enzymes in Arabidopsis: functional and evolutionary implications. Plant Mol Biol 58, 229-245.
Gibson, G. and Weir, B. (2005). The quantitative genetics of transcription. Trends Genet 21, 616-623.
Gilad, Y. and Borevitz, J. (2006). Using DNA microarrays to study natural variation. Curr Opin Genet Dev 16, 553-558.
Han, B. and Xue, Y. (2003). Genome-wide intraspecific DNA-sequence variations in rice. Curr Opin Plant Biol 6, 134-138.
Hirai, M.Y., Klein, M., Fujikawa, Y., Yano, M., Goodenowe, D.B., Yamazaki, Y., Kanaya, S., Nakamura, Y., Kitayama, M., Suzuki, H. et al. (2005). Elucidation of gene-to-gene and metabolite-to-gene networks in Arabidopsis by integration of metabolomics and transcriptomics. J Biol Chem 280, 25590-25595.
Hitzemann, R., Malmanger, B., Reed, C., Lawler, M., Hitzemann, B., Coulombe, S., Buck, K., Rademacher, B., Walter, N., Polyakov, Y. et al. (2003). A strategy for the integration of QTL, gene expression, and sequence analyses. Mamm Genome 14, 733-747.
Holland, J.B. (2007). Genetic architecture of complex traits in plants. Curr Opin Plant Biol 10, 156-161.
Hu, Z., Killion, P.J. and Iyer, V.R. (2007). Genetic reconstruction of a functional transcriptional regulatory network. Nat Genet 39, 683-687.
Hubner, N., Wallace, C.A., Zimdahl, H., Petretto, E., Schulz, H., Maciver, F., Mueller, M., Hummel, O., Monti, J., Zidek, V. et al. (2005). Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat Genet 37, 243-253.
Jansen, R.C. (1993). Interval mapping of multiple quantitative trait loci. Genetics 135, 205-211.


Jansen, R.C. and Nap, J.P. (2001). Genetical genomics: the added value from segregation. Trends Genet 17, 388-391.
Jansen, R.C. (2003a). Studying complex biological systems using multifactorial perturbation. Nat Rev Genet 4, 145-151.
Jansen, R.C. (2003b). Quantitative trait loci in inbred lines. In Handbook of Statistical Genetics, D.J. Balding, M. Bishop and C. Cannings, eds (Chichester, UK: John Wiley & Sons), pp. 445-476.
Jenkins, H., Hardy, N., Beckmann, M., Draper, J., Smith, A.R., Taylor, J., Fiehn, O., Goodacre, R., Bino, R.J., Hall, R. et al. (2004). A proposed framework for the description of plant metabolomics experiments and their results. Nat Biotechnol 22, 1601-1606.
Jeong, H., Tombor, B., Albert, R., Oltvai, Z.N. and Barabasi, A.L. (2000). The large-scale organization of metabolic networks. Nature 407, 651-654.
Joosen, R., Cordewener, J., Supena, E.D., Vorst, O., Lammers, M., Maliepaard, C., Zeilmaker, T., Miki, B., America, T., Custers, J. et al. (2007). Combined transcriptome and proteome analysis identifies pathways and markers associated with the establishment of Brassica napus microspore-derived embryo development. Plant Physiol 144, 155-172.
Juenger, T.E., McKay, J.K., Hausmann, N., Keurentjes, J.J.B., Sen, S., Stowe, K.A., Dawson, T.E., Simms, E.L. and Richards, J.H. (2005). Identification and characterization of QTL underlying whole-plant physiology in Arabidopsis thaliana: delta13C, stomatal conductance and transpiration efficiency. Plant Cell Environ 28, 697-708.
Kearsey, M.J., Pooni, H.S. and Syed, N.H. (2003). Genetics of quantitative traits in Arabidopsis thaliana. Heredity 91, 456-464.
Keurentjes, J.J.B., Fu, J., Terpstra, I.R., Garcia, J.M., van den Ackerveken, G., Snoek, L.B., Peeters, A.J., Vreugdenhil, D., Koornneef, M. and Jansen, R.C. (2007). Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc Natl Acad Sci U S A 104, 1708-1713.
Kim, S., Zhao, K., Jiang, R., Molitor, J., Borevitz, J.O., Nordborg, M. and Marjoram, P. (2006). Association mapping with single-feature polymorphisms. Genetics 173, 1125-1133.
Kliebenstein, D.J., Kroymann, J., Brown, P., Figuth, A., Pedersen, D., Gershenzon, J. and Mitchell-Olds, T. (2001). Genetic control of natural variation in Arabidopsis glucosinolate accumulation. Plant Physiol 126, 811-825.
Kliebenstein, D.J., West, M.A., van Leeuwen, H., Kim, K., Doerge, R.W., Michelmore, R.W. and St Clair, D.A. (2006a). Genomic survey of gene expression diversity in Arabidopsis thaliana. Genetics 172, 1179-1189.
Kliebenstein, D.J., West, M.A., van Leeuwen, H., Loudet, O., Doerge, R.W. and St Clair, D.A. (2006b). Identification of QTLs controlling gene expression networks defined a priori. BMC Bioinformatics 7, 308.
Koornneef, M., Alonso-Blanco, C. and Vreugdenhil, D. (2004). Naturally occurring genetic variation in Arabidopsis thaliana. Annu Rev Plant Physiol Plant Mol Biol 55, 141-172.
Kroymann, J. and Mitchell-Olds, T. (2005). Epistasis and balanced polymorphism influencing complex trait variation. Nature 435, 95-98.
Lan, H., Chen, M., Flowers, J.B., Yandell, B.S., Stapleton, D.S., Mata, C.M., Mui, E.T., Flowers, M.T., Schueler, K.L., Manly, K.F. et al. (2006). Combined expression trait correlations and expression quantitative trait locus mapping. PLoS Genet 2, e6.
Lee, T.I., Rinaldi, N.J., Robert, F., Odom, D.T., Bar-Joseph, Z., Gerber, G.K., Hannett, N.M., Harbison, C.T., Thompson, C.M., Simon, I. et al. (2002). Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298, 799-804.


Li, H., Lu, L., Manly, K.F., Chesler, E.J., Bao, L., Wang, J., Zhou, M., Williams, R.W. and Cui, Y. (2005). Inferring gene transcriptional modulatory relations: a genetical genomics approach. Hum Mol Genet 14, 1119-1125.
Lisec, J., Schauer, N., Kopka, J., Willmitzer, L. and Fernie, A.R. (2006). Gas chromatography mass spectrometry-based metabolite profiling in plants. Nat Protoc 1, 387-396.
Maloof, J.N. (2003). Genomic approaches to analyzing natural variation in Arabidopsis thaliana. Curr Opin Genet Dev 13, 576-582.
Meinke, D.W., Meinke, L.K., Showalter, T.C., Schissel, A.M., Mueller, L.A. and Tzafrir, I. (2003). A sequence-based map of Arabidopsis genes with mutant phenotypes. Plant Physiol 131, 409-418.
Mitchell-Olds, T. and Schmitt, J. (2006). Genetic mechanisms and evolutionary significance of natural variation in Arabidopsis. Nature 441, 947-952.
Moco, S., Bino, R.J., Vorst, O., Verhoeven, H.A., de Groot, J., van Beek, T.A., Vervoort, J. and de Vos, C.H. (2006). A liquid chromatography-mass spectrometry-based metabolome database for tomato. Plant Physiol 141, 1205-1218.
Nordborg, M., Borevitz, J.O., Bergelson, J., Berry, C.C., Chory, J., Hagenblad, J., Kreitman, M., Maloof, J.N., Noyes, T., Oefner, P.J. et al. (2002). The extent of linkage disequilibrium in Arabidopsis thaliana. Nat Genet 30, 190-193.
Nordborg, M., Hu, T.T., Ishino, Y., Jhaveri, J., Toomajian, C., Zheng, H., Bakker, E., Calabrese, P., Gladstone, J., Goyal, R. et al. (2005). The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol 3, e196.
Palaniswamy, S.K., James, S., Sun, H., Lamb, R.S., Davuluri, R.V. and Grotewold, E. (2006). AGRIS and AtRegNet. A platform to link cis-regulatory elements and transcription factors into regulatory networks. Plant Physiol 140, 818-829.
Paran, I. and Zamir, D. (2003). Quantitative traits in plants: beyond the QTL. Trends Genet 19, 303-306.
Peck, S.C. (2005). Update on proteomics in Arabidopsis. Where do we go from here? Plant Physiol 138, 591-599.
Price, A.H. (2006). Believe it or not, QTLs are accurate! Trends Plant Sci 11, 213-216.
Rae, A.M., Howell, E.C. and Kearsey, M.J. (1999). More QTL for flowering time revealed by substitution lines in Brassica oleracea. Heredity 83 (Pt 5), 586-596.
Remington, D.L., Thornsberry, J.M., Matsuoka, Y., Wilson, L.M., Whitt, S.R., Doebley, J., Kresovich, S., Goodman, M.M. and Buckler, E.S., IV (2001). Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc Natl Acad Sci U S A 98, 11479-11484.
Rockman, M.V. and Kruglyak, L. (2006). Genetics of global gene expression. Nat Rev Genet 7, 862-872.
Ronald, J., Brem, R.B., Whittle, J. and Kruglyak, L. (2005). Local regulatory variation in Saccharomyces cerevisiae. PLoS Genet 1, e25.
Ross-Ibarra, J. (2005). Quantitative trait loci and the study of plant domestication. Genetica 123, 197-204.
Schadt, E.E., Monks, S.A., Drake, T.A., Lusis, A.J., Che, N., Colinayo, V., Ruff, T.G., Milligan, S.B., Lamb, J.R., Cavet, G. et al. (2003). Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297-302.
Schauer, N., Steinhauser, D., Strelkov, S., Schomburg, D., Allison, G., Moritz, T., Lundgren, K., Roessner-Tunali, U., Forbes, M.G., Willmitzer, L. et al. (2005). GC-MS libraries for the rapid identification of metabolites in complex biological samples. FEBS Lett 579, 1332-1337.
Schauer, N., Semel, Y., Roessner, U., Gur, A., Balbo, I., Carrari, F., Pleban, T., Perez-Melis, A., Bruedigam, C., Kopka, J. et al. (2006). Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nat Biotechnol 24, 447-454.


Schmid, K.J., Sorensen, T.R., Stracke, R., Torjek, O., Altmann, T., Mitchell-Olds, T. and Weisshaar, B. (2003). Large-scale identification and analysis of genome-wide single-nucleotide polymorphisms for mapping in Arabidopsis thaliana. Genome Res 13, 1250-1257.
Schmid, K.J., Torjek, O., Meyer, R., Schmuths, H., Hoffmann, M.H. and Altmann, T. (2006). Evidence for a large-scale population structure of Arabidopsis thaliana from genome-wide single nucleotide polymorphism markers. Theor Appl Genet 112, 1104-1114.
Segal, E., Yelensky, R. and Koller, D. (2003). Genome-wide discovery of transcriptional modules from DNA sequence and gene expression. Bioinformatics 19 Suppl 1, i273-282.
Semel, Y., Nissenbaum, J., Menda, N., Zinder, M., Krieger, U., Issman, N., Pleban, T., Lippman, Z., Gur, A. and Zamir, D. (2006). Overdominant quantitative trait loci for yield and fitness in tomato. Proc Natl Acad Sci U S A 103, 12981-12986.
Slate, J. (2005). Quantitative trait locus mapping in natural populations: progress, caveats and future directions. Mol Ecol 14, 363-379.
Somerville, C. and Koornneef, M. (2002). Timeline: A fortunate choice: the history of Arabidopsis as a model plant. Nat Rev Genet 3, 883-889.
Steuer, R., Kurths, J., Fiehn, O. and Weckwerth, W. (2003). Observing and interpreting correlations in metabolomic networks. Bioinformatics 19, 1019-1026.
Stuart, J.M., Segal, E., Koller, D. and Kim, S.K. (2003). A gene-coexpression network for global discovery of conserved genetic modules. Science 302, 249-255.
Swarup, K., Alonso-Blanco, C., Lynn, J.R., Michaels, S.D., Amasino, R.M., Koornneef, M. and Millar, A.J. (1999). Natural allelic variation identifies new genes in the Arabidopsis circadian system. Plant J 20, 67-77.
Teng, S., Keurentjes, J.J.B., Bentsink, L., Koornneef, M. and Smeekens, S. (2005). Sucrose-specific induction of anthocyanin biosynthesis in Arabidopsis requires the MYB75/PAP1 gene. Plant Physiol 139, 1840-1852.
The Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.
Tong, A.H., Lesage, G., Bader, G.D., Ding, H., Xu, H., Xin, X., Young, J., Berriz, G.F., Brost, R.L., Chang, M. et al. (2004). Global mapping of the yeast genetic interaction network. Science 303, 808-813.
Tonsor, S.J., Alonso-Blanco, C. and Koornneef, M. (2005). Gene function beyond the single trait: natural variation, gene effects, and evolutionary ecology in Arabidopsis thaliana. Plant Cell Environ 28, 2-20.
Urbanczyk-Wochniak, E., Luedemann, A., Kopka, J., Selbig, J., Roessner-Tunali, U., Willmitzer, L. and Fernie, A.R. (2003). Parallel analysis of transcript and metabolic profiles: a new approach in systems biology. EMBO Rep 4, 989-993.
Vigouroux, Y., Mitchell, S., Matsuoka, Y., Hamblin, M., Kresovich, S., Smith, J.S., Jaqueth, J., Smith, O.S. and Doebley, J. (2005). An analysis of genetic diversity across the maize genome using microsatellites. Genetics 169, 1617-1630.
Vuylsteke, M., van Eeuwijk, F., Van Hummelen, P., Kuiper, M. and Zabeau, M. (2005). Genetic analysis of variation in gene expression in Arabidopsis thaliana. Genetics 171, 1267-1275.
Vuylsteke, M., Daele, H., Vercauteren, A., Zabeau, M. and Kuiper, M. (2006). Genetic dissection of transcriptional regulation by cDNA-AFLP. Plant J 45, 439-446.
Ward, J.L., Harris, C., Lewis, J. and Beale, M.H. (2003). Assessment of 1H NMR spectroscopy and multivariate analysis as a technique for metabolite fingerprinting of Arabidopsis thaliana. Phytochemistry 62, 949-957.
Ward, J.L., Baker, J.M. and Beale, M.H. (2007). Recent applications of NMR spectroscopy in plant metabolomics. FEBS J 274, 1126-1131.


Weigel, D. and Nordborg, M. (2005). Natural variation in Arabidopsis. How do we find the causal genes? Plant Physiol 138, 567-568.
West, M.A., van Leeuwen, H., Kozik, A., Kliebenstein, D.J., Doerge, R.W., St Clair, D.A. and Michelmore, R.W. (2006). High-density haplotyping with microarray-based expression and single feature polymorphism markers in Arabidopsis. Genome Res 16, 787-795.
West, M.A., Kim, K., Kliebenstein, D.J., van Leeuwen, H., Michelmore, R.W., Doerge, R.W. and St Clair, D.A. (2007). Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics 175, 1441-1450.
Winnacker, E.L. (2003). Interdisciplinary sciences in the 21st century. Curr Opin Biotechnol 14, 328-331.
Yoon, D.B., Kang, K.H., Kim, H.J., Ju, H.G., Kwon, S.J., Suh, J.P., Jeong, O.Y. and Ahn, S.N. (2006). Mapping quantitative trait loci for yield components and morphological traits in an advanced backcross population between Oryza grandiglumis and the O. sativa japonica cultivar Hwaseongbyeo. Theor Appl Genet 112, 1052-1062.
Zhang, X., Richards, E.J. and Borevitz, J.O. (2007). Genetic and epigenetic dissection of cis regulatory variation. Curr Opin Plant Biol 10, 142-148.
Zhao, J., Becker, H.C., Zhang, D., Zhang, Y. and Ecke, W. (2006). Conditional QTL mapping of oil content in rapeseed with respect to protein content and traits related to plant development and grain yield. Theor Appl Genet 113, 33-38.
Zhu, J., Lum, P.Y., Lamb, J., GuhaThakurta, D., Edwards, S.W., Thieringer, R., Berger, J.P., Wu, M.S., Thompson, J., Sachs, A.B. et al. (2004). An integrative genomics approach to the reconstruction of gene networks in segregating populations. Cytogenet Genome Res 105, 363-374.

Chapter 2

Development of a Near-Isogenic Line population of Arabidopsis thaliana and comparison of mapping power with a Recombinant Inbred Line population

Joost J. B. Keurentjes, Leónie Bentsink, Carlos Alonso-Blanco, Corrie J. Hanhart, Hetty Blankestijn-De Vries, Sigi Effgen, Dick Vreugdenhil and Maarten Koornneef

Published in Genetics (2007) 175, 891-905.

ABSTRACT

In Arabidopsis, Recombinant Inbred Line (RIL) populations are widely used for Quantitative Trait Locus (QTL) analyses. However, mapping analyses with this type of population can be limited by masking effects of major QTLs and by epistatic interactions among multiple QTLs. An alternative type of immortal experimental population commonly used in plant species consists of sets of introgression lines. Here we describe the development of a Near Isogenic Line (NIL) population of Arabidopsis thaliana with genome-wide coverage, obtained by introgressing genomic regions from the Cape Verde Islands (Cvi) accession into the Landsberg erecta (Ler) genetic background. We have empirically compared the QTL mapping power of this new population with that of an existing RIL population derived from the same parents. For that, we analyzed and mapped QTLs affecting six developmental traits with different heritabilities. Overall, QTLs of smaller effect could be detected in the NIL population than in the RIL population, although the localization resolution was lower. Furthermore, we estimated the effect of population size and of the number of replicates on the power to detect QTLs affecting the developmental traits. In general, population size is more important than the number of replicates for increasing the mapping power of RILs, whereas for NILs several replicates are absolutely required. These analyses are expected to facilitate experimental design for QTL mapping using these two common types of segregating populations.


INTRODUCTION

Quantitative traits are characterized by continuous variation. Establishing the genetic basis of quantitative traits, commonly referred to as Quantitative Trait Locus (QTL) mapping, has been hampered by their multigenic inheritance and their often strong interaction with the environment. The principle of QTL mapping in segregating populations is based on the genotyping of progeny derived from a cross between genotypes that differ for the trait under study. Phenotypic values for the quantitative trait are then compared with the molecular marker genotypes of the progeny to search for particular genomic regions showing statistically significant associations with the trait variation, which are then called QTLs (Broman, 2001; Slate, 2005). Over the past few decades, the field has benefited enormously from the progress made in molecular marker technology. The ease with which such markers can be developed has enabled the generation of dense genetic maps and the performance of QTL mapping studies of the most complex traits (Borevitz and Nordborg, 2003).

QTL analyses make use of the natural variation present within species (Alonso-Blanco and Koornneef, 2000; Maloof, 2003) and have been successfully applied to various types of segregating populations. In plants, the use of ‘immortal’ mapping populations consisting of homozygous individuals is preferred because it allows replication and multiple analyses of the same population. Homozygous populations can be obtained by repeated selfing, as for Recombinant Inbred Lines (RILs), but also by induced chromosomal doubling of haploids, as for Doubled Haploids (DHs) (Han et al., 1997; Rae et al., 1999; von Korff et al., 2004). Depending on the species, one can in principle also obtain immortality by vegetative propagation, although this is often more laborious. RILs are advantageous over DHs because of their higher recombination frequency in the population, resulting from the multiple meiotic events that occur during repeated selfing (Jansen, 2003).

Another type of immortal population consists of Introgression Lines (ILs) (Eshed and Zamir, 1995), which are obtained through repeated backcrossing and extensive genotyping. These are also referred to as Near Isogenic Lines (NILs) (Monforte and Tanksley, 2000) or Backcross Inbred Lines (BILs) (Jeuken and Lindhout, 2004; Blanco et al., 2006). Such populations consist of lines containing a single or a small number of genomic introgression fragments from a donor parent in an otherwise homogeneous genetic background. Although no essential differences exist between these populations, we use the term Near Isogenic Lines for the materials described here. A special case of ILs is that of Chromosomal Substitution Strains (CSSs) (Nadeau et al., 2000; Koumproglou et al., 2002), in which the introgressions span complete chromosomes. All immortal populations, except those that can only be propagated vegetatively, share the advantage that they can easily be maintained through seeds, which allows the analysis of different environmental influences and the study of multiple, even invasive or destructive, traits. The statistical power of such analyses is increased because replicate measurements of genetically identical individuals can be made.

In plants, RILs and NILs are the most common types of experimental populations used for the analysis of quantitative traits. In both cases the accuracy of QTL localization, referred to as mapping resolution, depends on population size. For RILs, recombination frequency within existing lines is fixed and can therefore only be increased within the population by adding more lines (i.e. more independent recombination events). Alternatively, recombination frequency can be increased by intercrossing lines before they are fixed as homozygous lines by inbreeding (Zou et al., 2005). In NIL populations, resolution can be improved by minimizing the introgression size of each NIL. Consequently, a larger number of lines is needed to maintain genome-wide coverage. Despite the similarities between these two types of mapping populations, large differences exist in the genetic makeup of the respective individuals and in the resulting mapping approach. In general, recombination frequency in RIL populations is higher than in equally sized NIL populations, which allows the analysis of fewer individuals. Each RIL contains several introgression fragments and, on average, each genomic region is represented by an equal number of both parental genotypes in the population.
Therefore, replication of individual lines is often not necessary, because the effect of each genomic region on phenotypic traits is tested by comparing the two genotypic RIL classes (each comprising approximately half the number of lines in the population). In addition, the multiple introgressions per RIL allow detection of genetic interactions between loci (epistasis). However, epistasis, together with unequal recombination frequencies throughout the genome and segregation distortions caused by lethality or reduced fitness of particular genotypes, may bias the power to detect QTLs. Furthermore, the wide variation in morphological and developmental traits present in most RIL populations may hamper the analysis of traits that require the individual lines to be at the same growth and developmental stage. When many traits segregate simultaneously, this often affects the expression of other traits due to genetic interactions. Moreover, large-effect QTLs may mask the detection of QTLs with a small additive effect.

In contrast to RILs, NILs preferably contain only a single introgression per line, which increases the power to detect small-effect QTLs. However, the presence of a single introgression segment does not allow testing for genetic interactions and thereby precludes the detection of QTLs that are expressed only in specific genetic backgrounds (epistasis). In addition, because most of the genetic background is identical for all lines, NILs show more limited developmental and growth variation, which increases the homogeneity of growth stage within experiments. Nevertheless, lethality and sterility might sometimes hinder obtaining specific single introgression lines.

The choice of one mapping population over another depends on the plant species and the specific parents of interest. In cases where different cultivars or wild accessions are studied, preference is often given to RILs. However, when different species, or wild and cultivated germplasm, are combined, NILs are preferred (Eshed and Zamir, 1995; Jeuken and Lindhout, 2004; von Korff et al., 2004; Blair et al., 2006; Yoon et al., 2006). For instance, in tomato the high sterility in the offspring of crosses between cultivated and wild species made NIL populations preferable, because genome-wide coverage cannot be obtained with RIL populations owing to sterility and related problems (Eshed and Zamir, 1995). Furthermore, the analysis of agronomically important traits (such as fruit characters) cannot be performed when many genes conferring reduced fertility segregate. In Arabidopsis, the ease of generating fertile RIL populations with complete genome coverage, helped by its short generation time, has led to their extensive use in mapping quantitative traits.

NILs have been developed in various Arabidopsis studies to confirm and fine map QTLs previously identified in RILs (Alonso-Blanco et al., 1998a, 2003; Swarup et al., 1999; Bentsink et al., 2003; Edwards et al., 2005; Juenger et al., 2005a; Teng et al., 2005); Heterogeneous Inbred Families (HIFs) (Tuinstra et al., 1997) have also been used for this purpose (Loudet et al., 2005; Reymond et al., 2006). A set of chromosomal substitutions of the Landsberg erecta (Ler) accession into Columbia (Col) has been developed to serve as starting material for making smaller introgressions (Koumproglou et al., 2002). In mice, CSSs are widely used for mapping purposes and have proven to be a valuable complement to other population types (Stylianou et al., 2006). However, no genome-wide set of NILs that allows mapping to subparts of the chromosome has been described in Arabidopsis and, to our knowledge, no empirical comparative study has been performed between the two population types within a single species.

In this study we aim to compare a RIL population with a NIL population in terms of QTL detection power and localization resolution. For that, we generated a new genome-wide population of NILs using the same Ler and Cvi parental accessions as were used earlier to generate a RIL population (Alonso-Blanco et al., 1998b). The two experimental populations were grown simultaneously in the same experimental setup, including multiple replicates. QTL mapping analyses were performed on six different traits and the results of these analyses were compared between the two populations.

RESULTS

Construction of a genome-wide Near Isogenic Line population

We constructed a population of 92 introgression lines carrying between one and four Cvi introgression fragments in a Ler genetic background. Lines were genotyped using 349 AFLP and 95 PCR markers to determine the number, position and size of the introgressions (see Materials and Methods). This set of lines was selected to provide, together, almost complete genome-wide coverage (Figure 1). Forty lines contained a single introgression, while 52 lines carried several Cvi fragments. Of those 52 lines, 32 carried two, 19 carried three, and 1 carried four introgressions. The genetic length of the introgression fragments was estimated using the map positions of the introgressed markers in the genetic map constructed from the existing RIL population derived from the same Ler and Cvi parental accessions (Alonso-Blanco et al., 1998b). The average genetic size of the main, second, third, and fourth introgression fragment was 31.7, 11.1, 6.7, and 5.2 cM, respectively. Thus, lines with multiple Cvi fragments carried one large main introgression and several much smaller Cvi fragments. Additionally, we selected a core set of 25 lines that together covered more than 90% of the genome (supplemental Table 1 at http://www.genetics.org/supplemental/).

Genetic analyses of developmental traits

Six traits were measured and analyzed in the RIL and NIL populations (Table 1). Although plants were grown in four replicated blocks, block effects were negligible and block was therefore not used as a factor in subsequent analyses. In both populations, among-genotype variance was highly significant (P < 0.0001) for all traits. In the RIL population, broad sense heritability estimates ranged from 0.34 (basal branch number) to 0.92 (total plant length) (Table 1). Statistical parameters of most traits were similar to those described by Alonso-Blanco et al. (1998a, 1999) and Juenger et al. (2005b). However, Ungerer et al. (2002) reported much lower average values for plant height and branch number, although time to flower was similar. Moreover, in that study among-genotype variance estimates were lower and within-genotype variance estimates higher, resulting in lower heritability values than in our analyses.


Figure 1: Graphical genotype of the Ler x Cvi NIL population.
Bars represent introgressions. Solid bars represent the genetic position of Cvi introgressions in individual NILs. Shaded bars represent crossover regions between markers used for the genotyping of the lines. Numbers at the top indicate the five linkage groups.

Table 1: Descriptive statistics for six developmental traits analyzed in two mapping populations and their parents.

Trait        Mean (SD)          VG^a     VE^b     H2^c    CVG^d

Parents
FT (days)    24.30 (1.03)^e      8.74     3.57    0.71    10.85
             30.21 (2.47)^f
SL (cm)       9.58 (0.98)^e      3.27     3.14    0.51    15.87
             13.21 (2.30)^f
TL (cm)      23.59 (1.92)^e     26.81    10.53    0.72    17.99
             33.95 (4.17)^f
IB            2.21 (0.46)^e      0.02     0.33    0.05     5.53
              2.49 (0.67)^f
BB            1.54 (0.68)^e      0.00     0.65    0.00     0.00
              1.48 (0.91)^f
TB            3.75 (0.77)^e      0.01     0.82    0.01     1.88
              3.97 (1.02)^f

RIL population
FT (days)    26.06 (6.03)       32.59     3.82    0.90    21.91
SL (cm)       9.89 (3.39)        9.70     1.80    0.83    31.49
TL (cm)      26.13 (9.22)       78.53     6.52    0.92    33.91
IB            2.34 (1.22)        0.99     0.50    0.67    42.66
BB            1.43 (0.93)        0.30     0.57    0.34    37.98
TB            3.77 (1.27)        0.78     0.84    0.48    23.36

NIL population
FT (days)    23.68 (3.60)       10.78     2.21    0.83    13.87
SL (cm)       9.81 (2.18)        3.17     1.58    0.65    18.15
TL (cm)      24.50 (5.95)       31.24     4.10    0.87    22.82
IB            2.26 (0.88)        0.51     0.27    0.65    31.42
BB            1.56 (0.84)        0.18     0.53    0.24    26.92
TB            3.82 (1.06)        0.48     0.64    0.42    18.25

FT, flowering time; SL, length at first silique; TL, total plant length; IB, main inflorescence branch number; BB, basal branch number; TB, total branch number. ^a Among-genotype variance component from ANOVA; tests whether genetic differences exist among genotypes for specified traits (P < 0.0001). ^b Residual variance component from ANOVA. ^c Measure of total phenotypic variance attributable to genetic differences among genotypes (broad sense heritability), calculated as VG/(VG+VE). ^d Coefficient of genetic variation, calculated as (100 × √VG)/X̄, where X̄ is the trait mean. ^e Landsberg erecta parent. ^f Cape Verde Islands parent.
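
The two derived columns of Table 1 follow directly from the variance components, as the footnotes state. The short Python sketch below is an illustration added here, not part of the original analysis; the function names are hypothetical. It recomputes both quantities for flowering time in the RIL population from the tabulated values.

```python
import math

def broad_sense_heritability(v_g, v_e):
    """H2 = VG / (VG + VE), as defined in footnote c of Table 1."""
    return v_g / (v_g + v_e)

def coeff_genetic_variation(v_g, mean):
    """CVG = 100 * sqrt(VG) / mean, as defined in footnote d of Table 1."""
    return 100.0 * math.sqrt(v_g) / mean

# Flowering time in the RIL population (values copied from Table 1)
ft_ril = {"mean": 26.06, "v_g": 32.59, "v_e": 3.82}

h2 = broad_sense_heritability(ft_ril["v_g"], ft_ril["v_e"])
cvg = coeff_genetic_variation(ft_ril["v_g"], ft_ril["mean"])
print(f"H2 = {h2:.2f}, CVG = {cvg:.2f}")
```

Both results round to the values listed in the table (0.90 and 21.91), which is a quick consistency check on how the columns were derived.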


For the NIL population, mean trait values were closer to those measured for Ler due to the genetic structure of the population, which consists of lines carrying only small Cvi introgressions in a Ler background. Furthermore, variance components from ANOVA were lower in the NIL population, but heritability estimates differed only slightly from those of the RIL population (Table 1).

Strong and similar genetic correlations were observed between traits in the two Ler x Cvi populations, indicating partial genetic co-regulation (Table 2). Flowering time shows the highest correlation with the number of main inflorescence branches but is negatively correlated with basal branch number. Flowering time is also, but to a lesser degree, correlated with plant height. Correlations were also found between plant height and branching, again with positive values for the number of main inflorescence branches and negative values for basal branch number. These results contrasted with those of Ungerer et al. (2002), who found negative correlations between flowering time, plant height and branching in all pair-wise comparisons, which is probably due to the different environmental setups in the two laboratories.

Table 2: Genetic correlations among developmental traits analyzed in two mapping populations.

Trait     FT        SL        TL        IB        BB        TB
FT        -         0.63*     0.38*     0.97*    -0.49*     0.80*
SL        0.39*     -         0.90*     0.52*    -0.39*     0.35*
TL        0.21*     0.88*     -         0.18*    -0.32*     0.00
IB        0.91*     0.31*     0.09*     -        -0.54*     0.95*
BB       -0.26*    -0.28*    -0.26*    -0.35*     -         0.12*
TB        0.77*     0.15*    -0.07      0.85*     0.31*     -

The top right and the bottom left halves of the table represent values calculated for the RIL and the NIL populations, respectively. FT, flowering time; SL, length at first silique; TL, total plant length; IB, main inflorescence branch number; BB, basal branch number; TB, total branch number. * Significant at P < 0.001.

Mapping quantitative traits in the Ler x Cvi RIL population

Each trait was subjected to QTL analysis and three to eight QTLs were detected for each trait (Figure 2, Table 3). Major QTLs for flowering time, plant height and branching were in concordance with previously reported studies (Alonso-Blanco et al., 1998a, 1999; Ungerer et al., 2002, 2003; Juenger et al., 2005b), although slight differences for minor QTLs were also found. Total explained variance for each trait ranged from 38.5% for basal branch number to 86.3% for total plant height. LOD scores for the largest-effect QTL ranged from 5.7 for basal branch number up to 60.7 for total plant height, with corresponding explained variances of 11.0 and 64.0%, respectively. The average genetic length of 2-LOD support intervals was 11.6 cM, ranging from 2.3 (length at first silique) to 33.3 cM (total branch number).


Table 3: QTLs detected in the RIL population.

                 LOD     Support             Explained                  Total explained    Interaction^f
Trait   Chr^a    score   interval^b (cM)     variance^c (%)   Effect^d  variance^e (%)     (%)
FT      1        11.9      1.5-9.8*          13.0             -3.9      68.4                9.6
        5        18.9    388.4-394.5*        22.2              5.7
        5        11.9    408.2-413.7*        13.0              4.4
SL      1         9.3      0.0-9.3            6.3             -1.7      79.5               15.0
        1         4.8    103.1-126.0          3.1             -1.3
        2        39.7    173.2-175.5         43.2              4.5
        3         2.9    234.2-253.6          1.9              1.0
        3         5.0    281.5-287.8          3.2             -1.2
        5        15.7    387.9-392.4*        11.8              2.9
        5        10.2    403.6-409.7*         7.2              2.0
TL      1         6.5      0.0-9.8*           2.8             -3.1      86.3               11.5
        1         5.0     73.9-84.6           2.1             -2.7
        1         3.3    116.3-126.0          1.2             -2.3
        2        60.7    173.2-176.0*        64.0             14.8
        3         6.0    207.3-225.7*         2.6             -3.0
        4         5.2    287.8-307.5*         2.2             -2.7
        5         7.8    383.1-392.5*         3.6              4.1
        5         5.1    403.6-411.7          2.2              3.0
IB      1         5.0      0.0-13.5*          5.3             -0.4      65.0               20.5
        2         2.7    154.9-171.0*         2.8             -0.3
        5        15.3    387.0-391.9*        19.7              0.9
        5        10.4    398.8-411.7*        12.3              0.7
        5         3.1    472.2-485.3          3.2             -0.3
BB      1         5.7     72.4-91.0*         11.0              0.4      38.5                3.1
        2         3.2    167.0-200.2*         6.2             -0.3
        4         4.6    360.7-373.5*         9.1              0.4
        5         5.5    385.6-406.1*        11.3             -0.5
TB      1        15.5      5.3-12.4*         16.1             -0.8      71.1               16.2
        1         4.9     81.7-93.8*          4.6              0.4
        2         9.5    169.0-180.0*         9.1             -0.6
        5         9.7    386.5-392.4*         9.4              0.6
        5        10.9    403.3-412.2*        10.8              0.7
        5         5.2    472.2-485.3          4.7             -0.4

FT, flowering time; SL, length at first silique; TL, total plant length; IB, main inflorescence branch number; BB, basal branch number; TB, total branch number. ^a Chromosome number. ^b 2-LOD support interval. ^c Percentage of total variation explained by individual QTLs. ^d Effect of QTLs calculated as μB - μA, where A and B are RILs carrying Ler and Cvi genotypes at the QTL positions, respectively. μB and μA were estimated by MapQTL®. Effects are given in days (flowering time), centimeters (length at first silique and total length) or numbers (elongated axils, basal branch number and total branch number). ^e Percentage of total variance explained by genetic factors estimated by MapQTL®. ^f Percentage of total variation explained by interaction between individual QTLs. * QTLs showing significant epistatic interactions (P < 0.05) and used to estimate the percentage of explained variance by genetic interactions.


Figure 2: Genome-wide QTL profiles of traits analyzed in the RIL population.
(A) Flowering time, (B) Length at first silique, (C) Total plant length, (D) Number of main inflorescence branches, (E) Basal branch number and (F) Total branch number. Solid lines represent the QTL effect calculated as described in Materials and Methods. Shaded lines represent LOD scores. Shaded dashed lines represent genome-wide significance threshold levels for LOD scores determined by permutation testing.

Opposing effect QTLs were found for all traits, explaining the observed transgressive segregation within the population (data not shown). Genetic interaction among the detected QTLs was also tested. The proportion of variance explained by epistatic interactions ranged from 3.1 (basal branch number) to 20.5% (number of main inflorescence branches) and involved two to five of the detected QTLs (Table 3). Using a complete pairwise search of all markers (Chase et al., 1997), a number of additional interactions were detected between loci not co-locating with major QTL positions (supplemental Figure 1 at http://www.genetics.org/supplemental/).

The smallest significant absolute effect detected was 4.4 days for flowering time, 1.0 and 2.3 cm for length at first silique and total plant length, respectively, and 0.3, 0.3, and 0.4 for the number of main inflorescence branches, basal branch number and total branch number, respectively. Relative effects, expressed as the fold difference between genotypes and calculated as (|μB - μA| + μA)/μA, then equaled 1.15-, 1.09-, 1.09-, 1.13-, 1.59-, and 1.10-fold, respectively (Tables 3 and 5). As expected, the total explained variance of a trait correlated positively with the smallest significantly detectable effect for that particular trait. In general, smaller effects could be detected with increasing total explained variance. When the chromosome-wide threshold for significance was used instead of the genome-wide threshold, one additional suggestive QTL was detected for main inflorescence branch number and total branch number and two for length at first silique.

Mapping quantitative traits in the Ler x Cvi NIL population

To search for QTLs in the NIL population, we divided the Arabidopsis genetic map into adjacent genomic fragments that were individually tested. The complete genome was subdivided into 97 regions, defined by the positions of the recombination events of the main introgressions of the 92 NILs (supplemental Table 2 at http://www.genetics.org/supplemental/). These regions are referred to as bins, and each NIL was then assigned to those adjacent bins spanned by its Cvi introgression fragment. Thus, each bin contains a unique subset of lines with overlapping Cvi introgressions in that particular region, which were used to test the phenotypic effects of that bin. The average genetic length of the bins was 5.0 cM, ranging from 0.1 to 26.3 cM. The number of NILs per bin ranged from 0 to 13 with an average of 5.1 NILs. Because NILs were only assigned to bins when the complete bin was covered by the introgression, three bins remained empty [viz. bins 66 (26.3 cM), 73 (3.3 cM) and 77 (5.4 cM)]. On average each NIL was assigned to 5.4 adjacent bins. One NIL (LCN4-2) was not assigned to any bin because its introgression included only a single marker. Two NILs corresponded to complete chromosomal substitutions: line LCN3-8 (chromosome 3) and line LCN1-8 (chromosome 1), the latter carrying the largest introgression, assigned to 27 adjacent bins.

To map QTLs in the NIL population, all bins were tested individually by comparing the phenotypes of the NILs assigned to each bin with that of Ler. As shown in Figure 3 and Table 4, one to nine QTLs were detected for each trait. The total explained variance for each trait ranged from 26.7% for basal branch number up to 87.7% for total plant height. Explained variances for the largest-effect QTL for each trait ranged from 19.3% for basal branch number to 91.9% for total plant height, as calculated from a restricted ANOVA using only lines from the most significant bin and Ler. To show the relative effect of Mendelizing QTLs with respect to the total population variance, we also calculated the explained variances when all lines of the population were subjected to ANOVA using the most significant bin as fixed factor (Table 4). Relative effects of QTLs were much lower in this unrestricted analysis because all other QTLs in the population increase the residual variation, which is not corrected for as it is in MQM mapping in the RIL population. Moreover, lines partly overlapping the QTL bin are not assigned to that bin but can still contain the QTL Cvi allele, further increasing the residual variation in the population.

The smallest significant QTL effect detected was 0.7 days for flowering time, 1.1 and 2.1 cm for length at first silique and total plant length, respectively, and 3.8, 0.5, and 0.4 for the number of main inflorescence branches, basal branch number and total branch number, respectively. Relative effects, expressed as the fold difference between genotypes and calculated as (|μB - μA| + μA)/μA, then equaled 1.03-, 1.11-, 1.09-, 2.71-, 1.30-, and 1.11-fold, respectively (Tables 4 and 5).

For a number of traits several QTLs were found that could not be significantly detected in the RIL population. In total 12 such small-effect QTLs were detected for flowering time (3), length at first silique (5), total plant length (2), and basal branch number (2). None of those met the lower chromosome-wide significance threshold for suggestive QTLs in the RIL population. Although two were close to this threshold, ten of them did not reach LOD scores >1.0 in the RIL population (supplemental Table 3 at http://www.genetics.org/supplemental/).
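
The bin construction described above (bins bounded by the recombination breakpoints of the introgressions, with a line assigned to a bin only when it covers that bin completely) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure used to build supplemental Table 2: the line names and map positions are hypothetical, each line carries a single introgression, and crossover regions between markers are ignored.

```python
def build_bins(introgressions):
    """Split the map into adjacent bins at every introgression breakpoint."""
    breakpoints = sorted({pos for start, end in introgressions.values()
                          for pos in (start, end)})
    return list(zip(breakpoints[:-1], breakpoints[1:]))

def assign_lines_to_bins(introgressions, bins):
    """Assign a line to a bin only if its introgression covers the whole bin."""
    assignment = {b: [] for b in bins}
    for line, (start, end) in introgressions.items():
        for b_start, b_end in bins:
            if start <= b_start and b_end <= end:
                assignment[(b_start, b_end)].append(line)
    return assignment

# Hypothetical main introgressions (start cM, end cM) on one chromosome
introgressions = {
    "NIL_a": (0.0, 12.0),
    "NIL_b": (8.0, 20.0),
    "NIL_c": (15.0, 30.0),
}

bins = build_bins(introgressions)
bin_lines = assign_lines_to_bins(introgressions, bins)
for (b_start, b_end), lines in bin_lines.items():
    print(f"{b_start:5.1f}-{b_end:5.1f} cM: {lines}")
```

In this toy example three overlapping introgressions yield five bins, each with its own subset of lines; the phenotypes of the lines in a bin would then be pooled and tested against Ler. A bin that no line covers completely stays empty, which is the situation reported for bins 66, 73 and 77.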


Table 4: QTLs detected in the NIL population.

                                                     Explained variance (%)                     Total explained
Trait   Chr^a   Support interval^b   Support bin (cM)^c   Restricted^d   Unrestricted^e   Effect^f   variance^g (%)
FT      1         0.0-21.6             3.9-7.8            70.3            3.2             -3.2       83.2
        1        31.4-40.6            33.4-40.7           18.0            0.5             -1.0
        1        73.3-122.0           83.6-87.0            7.1            0.7             -0.7
        2       174.4-204.7          200.9-201.8          22.3            0.6              1.5
        5       388.4-434.2          392.3-395.0          52.1           42.8             15.7
SL      1        10.8-27.4            17.3-21.7           64.0            4.8             -3.1       66.1
        1        31.4-40.6            33.4-40.7           17.1            0.6             -1.1
        1        73.3-125.9          122.1-126.0          34.9            2.8             -1.7
        2       160.8-207.2          162.0-174.5          73.4            5.3              4.9
        3       270.1-288.4          287.1-288.4          37.1            1.6             -1.7
        4       359.5-375.7          368.2-375.7          32.2            1.7             -1.6
        5       388.3-418.9          392.3-395.0          32.2            0.7              2.7
        5       434.2-436.0          434.3-436.1          29.6            3.8             -1.4
        5       441.4-459.3          454.3-459.4          28.2            1.1             -1.1
TL      1         0.0-33.3            17.3-21.7           66.2            1.7             -6.3       87.7
        1        64.7-125.9          122.1-126.0          48.8            3.8             -3.8
        2       160.8-207.2          174.5-178.8          91.9           10.5             18.5
        3       287.0-288.4          287.1-288.4          19.0            0.4             -2.1
        5       389.9-416.1          411.7-416.2          34.1            1.7              3.7
        5       434.2-454.3          434.3-436.1          45.0            1.4             -3.9
IB      5       388.3-434.2          392.3-395.0          46.3           37.7              3.8       66.1
BB      1         0.0-15.1             3.9-7.8            17.7            1.8             -0.6       26.7
        1        40.6-125.9           94.5-101.6          17.9            9.0              0.8
        2       174.4-189.1          179.7-189.2          11.4            2.4             -0.5
        5       388.3-434.2          392.3-395.0          14.4            1.7             -0.7
        5       483.2-487.8          483.2-487.8          19.3            1.1             -0.8
TB      1         0.0-15.9             7.8-9.9            24.1            2.2             -0.8       44.1
        1        40.6-125.9           94.5-101.6          14.0            4.1              0.8
        2       174.4-189.1          179.7-189.2           7.6            1.5             -0.4
        5       388.3-434.2          392.3-395.0          43.2           17.4              3.1

FT, flowering time; SL, length at first silique; TL, total plant length; IB, main inflorescence branch number; BB, basal branch number; TB, total branch number. ^a Chromosome number. ^b The region spanned by consecutive bins, significantly (P < 0.001) differing from Ler and sharing the same direction of effect, was taken as support interval. ^c Position of the bin within the QTL support interval showing the largest effect. ^d Within the QTL support interval, the bin showing the largest effect was compared to Ler in an ANOVA analysis. The among-genotype component of ANOVA was taken as an estimator of explained variance. ^e All lines in the population were subjected to ANOVA using the bin described in footnote d as fixed factor. The among-genotype component of ANOVA was taken as an estimator of explained variance. ^f Effect of QTLs calculated as μB - μA, where μA is the mean value of all Ler lines and μB is the mean value of all lines in the bin described in footnote d. Effects are given in days (flowering time), centimeters (length at first silique and total length) or numbers (main inflorescence branch number, basal branch number and total branch number). ^g All bins together with Ler were analyzed by ANOVA and the among-genotype component was taken as a measure of totally explained variance.
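
The relation between the absolute effect in Table 4 (μB - μA) and the relative effect quoted in the text, (|μB - μA| + μA)/μA, can be illustrated with a small worked example. The sketch below is an added illustration; it assumes that the Ler parental means from Table 1 can stand in for μA, which the source does not state explicitly, and the function name is hypothetical.

```python
def relative_effect(effect, mu_a):
    """Fold difference (|muB - muA| + muA) / muA, where effect = muB - muA."""
    return (abs(effect) + mu_a) / mu_a

# Smallest significant NIL effects quoted in the text, paired with the Ler
# parental means from Table 1 (an assumption standing in for muA)
examples = {
    "FT": (0.7, 24.30),   # days
    "SL": (1.1, 9.58),    # cm
    "TL": (2.1, 23.59),   # cm
}

folds = {trait: round(relative_effect(eff, mu_a), 2)
         for trait, (eff, mu_a) in examples.items()}
print(folds)
```

Under that assumption the three values reproduce the 1.03-, 1.11- and 1.09-fold differences reported for flowering time, length at first silique and total plant length. The branching traits are left out because their small means make the result sensitive to rounding of the tabulated inputs.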


Figure 3: QTL profiles of traits analyzed in the NIL population.
(A) Flowering time, (B) Length at first silique, (C) Total plant length, (D) Number of main inflorescence branches, (E) Basal branch number and (F) Total branch number. Solid lines represent the QTL effect calculated as described in Materials and Methods. Shaded lines represent significance scores. Shaded dashed lines represent significance threshold levels applied in this study.

Table 5: Comparative summary of QTL mapping parameters in the Ler x Cvi RIL and NIL populations.

                        QTLs^b   Support^c      Explained         Total explained              Relative
Trait   Population^a    (no.)    (cM)           variance^d (%)    variance (%)      Effect^e   effect^f
FT      RIL             3         6.6           16.1              68.4              4.7        1.15
        NIL             5        35.5 (3.6)     34.0              83.2              4.4        1.03
SL      RIL             7        10.1           11.0              79.5              2.1        1.09
        NIL             9        23.3 (5.2)     38.7              66.1              2.1        1.11
TL      RIL             8        11.1           10.1              86.3              4.5        1.09
        NIL             6        31.4 (3.4)     50.8              87.7              6.4        1.09
IB      RIL             5        12.1            8.7              65.0              0.5        1.13
        NIL             1        45.9 (2.7)     46.3              66.1              3.8        2.71
BB      RIL             4        21.3            9.4              38.5              0.4        1.59
        NIL             5        33.1 (5.6)     16.1              26.7              0.7        1.30
TB      RIL             6         9.7            9.1              71.1              0.6        1.10
        NIL             4        40.5 (5.4)     22.2              44.1              1.3        1.11

FT, flowering time; SL, length at first silique; TL, total plant length; IB, main inflorescence branch number; BB, basal branch number; TB, total branch number. ^a Population type. ^b Number of QTLs detected. ^c Average length of support interval. In parentheses: average length of largest-effect bin. ^d Average explained variance for each QTL. ^e Average absolute effect for each QTL. Effects are given in days (flowering time), centimeters (length at first silique and total length) or numbers (elongated axils, basal branch number and total branch number). ^f Smallest relative effect significantly detected, expressed as fold difference compared to Ler, calculated as (|μB - μA| + μA)/μA.

We defined the support interval in the NIL mapping population as the region spanned by consecutive bins that differed significantly from Ler (P < 0.001) and shared the same direction of effect. The length of support intervals estimated in this way ranged from 1.4 (total plant length) to 85.3 cM (basal branch number) with an average of 30.9 cM. Alternatively, we also searched for QTLs in the NIL population by comparing the phenotype of each NIL individually against Ler (supplemental Figures 2-7 at http://www.genetics.org/supplemental/). In this case, support intervals can be estimated as the length of the overlapping regions between the Cvi introgression fragments of NILs significantly differing from Ler in a particular genomic region. This second method increases the QTL localization resolution, but reduces statistical power. For each bin on average 116 plants could be tested against Ler, whereas only 24 plants were available for analysis of individual NILs. Moreover, individual lines may contain multiple opposing-effect QTLs, resulting in nonsignificant differences compared to Ler. Therefore, lines spanning the bin support interval were occasionally not significantly different from Ler. Likewise, lines bearing introgressions outside the bin support intervals sometimes differed significantly from Ler, probably due to multiple additive small-effect QTLs. Together, the loss of power and the complexity of the traits under study hindered a confident estimation of a NIL support interval. Nevertheless, all QTLs detected in the bin analysis could also be detected by analyzing individual NILs. As a compromise between the two methods of support interval estimation, we recorded the position of the largest-effect bin within the bin support interval (Table 4). However, it must be noted that bin support intervals may contain multiple QTLs of similar direction. The average size of these largest-effect bins was 4.6 cM. Within those bins, at least one individual NIL significantly differing from Ler was always found.

Power in RIL vs. NIL QTL mapping

The power to detect a QTL at a specific locus basically depends on the difference in mean trait values between A and B genotypes for that particular locus, although other parameters like trait heritability, genetic interactions, and genetic map quality should not be ignored. Because power increases when the variance of mean values decreases, QTL analyses can benefit greatly from multiple measurements. In a RIL population this can be achieved in two ways. First, because segregation of both alleles occurs randomly and each locus is represented equally by the A and B genotype, provided there is no segregation distortion (Doerge, 2002), increasing the number of RILs to be analyzed will increase the number of observations of each genotype at a given genomic position. A further advantage of increasing the RIL population size is that the number of recombination events increases, which can improve resolution. Second, when the number of lines is fixed, more accurate trait values of lines can be achieved by measuring replicate individuals of the same line. In addition, accurate trait values based on replicate measurements improve the possibility of detecting smaller-effect QTLs.

To test the effect of replicated measurements and population size on the QTL detection power of the two Ler x Cvi populations, we analyzed the phenotypic data obtained in these populations by varying both parameters. For the RIL population we performed QTL analyses on different numbers of RILs (population size) and used mean line values obtained with different numbers of replicates (replicate size). The total explained variance in the population, the LOD score of the largest-effect QTL, and the number of detected QTLs were then recorded for each trait (Figure 4). When the population size was kept constant (161 lines), the recorded statistics increased when increasing the replicate number from one to four, but this increase leveled off rapidly when measuring more replicates (Figure 4, A-C).
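
One way to see why the gain from extra replicates levels off is the standard quantitative-genetics expression for the heritability of a line mean based on n replicate plants, VG/(VG + VE/n). The sketch below is an added illustration and was not part of the analyses reported here; it plugs the RIL variance components for basal branch number from Table 1 into that expression, and the function name is hypothetical.

```python
def line_mean_heritability(v_g, v_e, n_reps):
    """Heritability of a line mean based on n_reps replicate plants."""
    return v_g / (v_g + v_e / n_reps)

# Basal branch number in the RIL population (VG and VE from Table 1)
v_g, v_e = 0.30, 0.57

for n in (1, 2, 4, 8, 16):
    print(n, round(line_mean_heritability(v_g, v_e, n), 2))
```

Going from one to four replicates roughly doubles the line-mean heritability of this low-heritability trait (about 0.34 to 0.68), whereas going from four to sixteen adds much less (about 0.68 to 0.89). For traits that are already highly heritable on a single-plant basis the curve flattens even sooner.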

Figure 4: QTL detection power analysis of the Ler x Cvi RIL population.
(A) Effect of replicate number on total explained variance. (B) Effect of replicate number on LOD score of the largest-effect QTL. (C) Effect of replicate number on the number of detected QTLs. (D) Effect of population size on total explained variance. (E) Effect of population size on LOD score of the largest-effect QTL. (F) Effect of population size on the number of detected QTLs. Ƞ, Flowering time; ǐ, Length at first silique; NJ, Total plant length; x, Main inflorescence branch number; ż, Basal branch number; and +, Total branch number. Error bars represent SEM of ten independent analyses.

In contrast, when the number of replicates was kept constant (16 replicated measurements per RIL) and population size was increased, the QTL detection power improved more drastically. However, the total explained variance remained more or less constant over all population sizes (Figure 4D). This phenomenon is commonly known as the Beavis effect and is due to the fact that estimated explained variances of detected QTLs are sampled from a truncated distribution, because QTLs are only taken into account when the test statistics reach a predetermined critical value (Xu, 2003). As a result, the expectations of detected QTL effects are biased upward. A second effect of increasing population size is the nearly linear increase of LOD scores, observed for all analyzed QTLs (Figure 4E). Significance thresholds, determined by permutation tests for each population size, were steady around 2.7 LOD for population sizes >30 RILs and increased slightly with smaller population sizes (data not shown). The largest-effect QTL could be significantly detected at all population sizes for all traits except basal branch number, whose largest-effect QTL could not be significantly detected in population sizes <80 RILs.

To evaluate the NIL population, we studied the effect of increasing the number of replicates per line by estimating the relative difference between line mean values that could still be significantly detected with different replicate numbers (see Materials and Methods). As shown in Figure 5A, the power to detect significant phenotypic differences greatly increases when increasing the number of replicate individuals of NILs measured. Furthermore, the lower the heritability of the trait, the larger the increase in detection power achieved by increasing the number of replicates per NIL. When a bin analysis was carried out using increasing replicate numbers, a similar increase in the number of detected QTLs was observed (Figure 5B). Overall, the results presented in Figures 4 and 5 show that the number of replicates used in our analyses (16 individuals for each RIL and 24 individuals for each NIL) approximated the maximum QTL detection power of both Ler x Cvi populations.
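
The permutation-based significance thresholds mentioned above for the RIL analysis can be illustrated with a minimal sketch. It uses a plain single-marker scan rather than the MQM mapping of MapQTL, and all data are simulated, so the marker matrix, sample sizes and function names are hypothetical. Phenotypes are shuffled relative to genotypes, the maximum LOD over all markers is recorded for each shuffle, and the 95th percentile of those maxima is taken as the genome-wide 5% threshold.

```python
import math
import random

def lod_score(genotypes, phenotypes):
    """LOD of a single-marker test: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            m = sum(group) / len(group)
            rss_marker += sum((y - m) ** 2 for y in group)
    return (n / 2.0) * math.log10(rss_null / rss_marker)

def permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.05, seed=1):
    """Genome-wide threshold: (1 - alpha) quantile of the max LOD under shuffling."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        max_lods.append(max(lod_score(m, shuffled) for m in markers))
    max_lods.sort()
    return max_lods[int(math.ceil((1 - alpha) * n_perm)) - 1]

# Simulated data: 100 lines, 50 unlinked markers, no real QTL (all hypothetical)
rng = random.Random(0)
markers = [[rng.randint(0, 1) for _ in range(100)] for _ in range(50)]
phenotypes = [rng.gauss(0.0, 1.0) for _ in range(100)]

threshold = permutation_threshold(markers, phenotypes)
print(f"5% genome-wide LOD threshold: {threshold:.2f}")
```

Because the threshold is the quantile of a maximum over markers, it depends on the number of effectively independent tests and on the sample size, which is consistent with the slightly higher thresholds reported for the smallest population sizes. The value printed by this sketch should not be compared directly with the 2.7 LOD reported above, since the marker map and the mapping method differ.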

Figure 5: QTL detection power analysis of the Ler x Cvi NIL population.
(A) Effect of replicate number on significantly detectable relative differences, expressed as fold difference between two lines. (B) Effect of replicate number on the number of detected QTLs. Ƞ, Flowering time; ǐ, Length at first silique; NJ, Total plant length; x, Main inflorescence branch number; ż, Basal branch number; and +, Total branch number. Error bars represent SEM of ten independent analyses.


DISCUSSIONȱ ȱ Experimentalȱ mappingȱ populationsȱ areȱ aȱ basicȱ resourceȱ toȱ elucidateȱ theȱ geneticȱ basisȱ ofȱ quantitativeȱ multigenicȱ traits.ȱ Inȱ thisȱ work,ȱ weȱ haveȱ developedȱ theȱ firstȱ genomeȬwideȱ populationȱ ofȱ NILsȱ ofȱ Arabidopsisȱ thalianaȱ consistingȱ ofȱ 92ȱ linesȱ carryingȱgenomicȱintrogressionȱfragmentsȱfromȱtheȱparentalȱaccessionȱCviȱintoȱtheȱ geneticȱ backgroundȱ ofȱ theȱ commonȱ laboratoryȱ accessionȱ Landsbergȱ erecta.ȱ Inȱ additionȱ weȱ haveȱ empiricallyȱ comparedȱ theȱ mappingȱ powerȱ ofȱ thisȱ populationȱ withȱ anȱ existingȱ populationȱ ofȱ recombinantȱ inbredȱ lines,ȱ derivedȱ fromȱ theȱ sameȱ parentalȱ accessions.ȱ RILȱ andȱ NILȱ populationsȱ haveȱ beenȱ usedȱ extensivelyȱ inȱ geneticȱ studiesȱ (Eshedȱ andȱ Zamir,ȱ1995;ȱ Raeȱ etȱ al.,ȱ1999;ȱ Monforteȱ andȱ Tanksley,ȱ 2000;ȱKoumproglouȱetȱal.,ȱ2002;ȱHanȱetȱal.,ȱ2004;ȱKoornneefȱetȱal.,ȱ2004;ȱSingerȱetȱal.,ȱ 2004;ȱvonȱKorffȱetȱal.,ȱ2004)ȱdueȱtoȱtheȱadvantagesȱderivedȱfromȱtheirȱhomozygosityȱ andȱ immortality:ȱ theyȱ canȱ beȱ usedȱindefinitely;ȱ variousȱ traitsȱ canȱ beȱ analyzedȱ inȱ differentȱexperimentsȱandȱenvironmentalȱsettings;ȱandȱreplicatesȱofȱtheȱindividualȱ linesȱcanȱbeȱanalyzed,ȱenablingȱaȱmoreȱaccurateȱestimateȱofȱtheȱline’sȱphenotypicȱ meanȱvalue.ȱHowever,ȱtheȱmainȱdifferenceȱbetweenȱtheȱtwoȱpopulationsȱliesȱinȱtheȱ natureȱ ofȱ theirȱ geneticȱ makeup.ȱ Inȱ aȱ RILȱ populationȱ multipleȱ genomicȱ regionsȱ differȱ betweenȱ mostȱ pairsȱ ofȱ RILsȱ andȱ severalȱ segregatingȱ QTLsȱ contributeȱ toȱ phenotypicȱdifferencesȱbetweenȱpairsȱofȱlines,ȱmakingȱitȱimpossibleȱtoȱassignȱtheȱ observedȱvariationȱbetweenȱpairsȱofȱlinesȱtoȱaȱspecificȱgenomicȱregion.ȱTherefore,ȱ toȱdetectȱQTLsȱoneȱmustȱperformȱtheȱsimultaneousȱanalysisȱofȱaȱlargeȱnumberȱofȱ lines.ȱInȱcontrast,ȱinȱaȱNILȱpopulation,ȱtheȱphenotypicȱvariationȱobservedȱbetweenȱ pairsȱofȱlinesȱcanȱbeȱassignedȱdirectlyȱtoȱtheȱdistinctȱgenomicȱregionsȱintrogressedȱ 
inȱanȱotherwiseȱsimilarȱgeneticȱbackground.ȱDependingȱonȱtheȱdesiredȱresolutionȱ oneȱ canȱ minimizeȱ theȱ numberȱ ofȱ linesȱ byȱ analyzingȱ linesȱ carryingȱ largeȱ introgressionsȱorȱevenȱchromosomeȱsubstitutionȱstrainsȱ(Nadeauȱetȱal.,ȱ2000).ȱȱ Aȱ summaryȱ ofȱ theȱ differencesȱ observedȱ betweenȱ theȱ RILȱ andȱ NILȱ populationsȱ derivedȱ fromȱ Lerȱ andȱ Cviȱ isȱ shownȱ inȱ Tableȱ 5ȱ andȱ inȱ supplementalȱ Figureȱ 8ȱ atȱ http://www.genetics.org/supplemental/.ȱ Theȱ totalȱ numberȱ ofȱ QTLsȱ detectedȱdidȱnotȱdifferȱmuchȱbetweenȱtheȱtwoȱpopulations.ȱHowever,ȱdifferentȱlociȱ wereȱ detectedȱ inȱ bothȱ typesȱ ofȱ populations,ȱ showingȱ theirȱ complementaryȱ properties.ȱForȱbothȱpopulationsȱtheȱdetectionȱofȱQTLsȱwasȱhighlyȱdependentȱonȱ theȱtraitȱunderȱconsiderationȱandȱitsȱgeneticȱarchitectureȱ(e.g.ȱeffectȱandȱpositionȱofȱ QTL,ȱepistasis).ȱTheȱpowerȱofȱtheȱnewȱNILȱpopulationȱtoȱdetectȱtheȱlargeȬeffectȱlociȱ wasȱ closeȱ toȱthatȱ ofȱ theȱ existingȱ RILȱ populationȱ sinceȱ mostȱ largeȬeffectȱ lociȱ wereȱ detectedȱinȱbothȱpopulations.ȱHowever,ȱaȱfewȱrelativelyȱlargeȬeffectȱlociȱshowingȱ significantȱepistaticȱinteractionsȱcouldȱonlyȱbeȱdetectedȱinȱtheȱRILȱpopulation,ȱbutȱ

35ȱ Chapterȱ2ȱ

not in the NILs (supplemental Table 3 at http://www.genetics.org/supplemental/). Moreover, localization resolution was higher in the RIL population than in the bin analysis of the NIL population, allowing separation of linked QTLs. This was best illustrated by the two major QTLs for flowering time detected in the RIL population on the top of chromosome five, which are not only linked but also showed strong epistatic interaction. Consequently, these two QTLs could not be separated in the NIL population. Nevertheless, the QTL resolution in the NIL population can be increased by analyzing individual lines, although this will be at the cost of mapping power. In total, 14 QTLs detected in the RIL population could not be detected in the NIL population, of which 10 showed significant epistatic interactions with other QTLs and all others were closely linked to another significant QTL.

In contrast, the average explained variance of single QTLs was higher in the NIL population, increasing the power to detect small-effect QTLs. This difference can be attributed to the level of transgression, which is stronger in the RIL population, thereby increasing total phenotypic variance. As a result, 13 small-effect QTLs that were not detected in the RIL population could be detected in the NIL population. Nevertheless, some of the small-effect QTLs detected in the NILs were close to the significance threshold in the RIL population when using the lower chromosomal LOD thresholds (supplemental Table 3 at http://www.genetics.org/supplemental/). As expected, the power to detect small-effect QTLs in the NIL population was higher for the more heritable traits (flowering time and plant height) than for traits with low heritability (branching traits). The difference between the two populations in power to detect small-effect QTLs is due to the segregation of multiple QTLs in the RIL population, which increases the residual variance at each QTL under study.

The analyses of the RIL and NIL populations performed in this work were probably close to the maximum statistical power for the given population sizes, since the number of detected QTLs leveled off at higher replication sizes (Figures 4 and 5). The power analyses presented here could guide decisions on the number of plants to be analyzed when experiments are costly, laborious, or time consuming and therefore may require the analysis of fewer plants. Overall, for RILs, the effect of population size on mapping power was larger than the effect of replicated measurements of individual lines. Therefore, to reduce the number of plants to be analyzed, it is preferable to first reduce the number of replicates per line, and only thereafter, if required, the number of lines. In our analyses major-effect QTLs for most traits could still be significantly detected when only 50 lines were analyzed without replicates (data not shown). However, due to the Beavis effect (Xu, 2003) the explained variances obtained with small population sizes were strongly overestimated. In the NIL population, the number of replicated measurements has a larger impact on mapping power and at least five replicated plants should be analyzed to obtain enough statistical power (Figure 5). However, fewer lines can be analyzed as long as genome-wide coverage is maintained. In this NIL population this can be achieved using a core set of 25 lines, although localization resolution is then diminished. Nevertheless, most QTLs detected in the full set could still be detected in the core set (supplemental Figure 9 at http://www.genetics.org/supplemental/). Once a QTL has been identified in a particular region, one can zoom in with a minimal set of lines carrying smaller introgressions defined by crossovers in the support interval of the QTL of interest (Fridman et al., 2004).

The Ler x Cvi NIL population developed in this work provides a useful resource that will facilitate the genetic dissection of quantitative traits in Arabidopsis in various ways. First, as shown here, it can be analyzed as an alternative segregating population to perform genome-wide QTL mapping, with the particular advantage of detecting small-effect QTLs. Second, this population can be used to confirm QTLs previously detected in the Ler x Cvi RIL population. Third, individual lines of this population can serve as a starting point for the rapid Mendelization of particular QTLs and for their fine mapping and cloning (Paran and Zamir, 2003). Finally, the single introgression lines of this population may also strongly facilitate the fine mapping of artificially induced mutant alleles in the common laboratory Ler genetic background (or transferred to this accession). The fine mapping of mutant loci affecting quantitative adaptive traits is often hampered by the confounding effects of QTLs segregating in the mapping populations derived from crosses between the mutant and another Arabidopsis wild accession. Once the approximate genetic location of the mutant locus within a chromosomal arm is known, specific lines of this NIL population that carry a single introgression spanning the map position of the locus of interest can be selected. These lines can then be used to derive the required monogenic mapping population, as has been illustrated with the flowering-time locus FVE (Ausin et al., 2004). In conclusion, the elucidation of quantitative traits can benefit from the parallel analysis of both populations.


MATERIALS AND METHODS

Mapping populations

Two types of mapping populations were used to analyze six developmental traits. The first population consists of a set of 161 recombinant inbred lines (RILs) derived from a cross between the accessions Cape Verde Islands (Cvi) and Landsberg erecta (Ler). The F10 generation has been extensively genotyped (Alonso-Blanco et al., 1998b) and is available from the Arabidopsis Biological Resource Center. All lines were advanced to the F13 generation and residual heterozygous regions, estimated at 0.71% in the F10 generation, were genotyped again with molecular PCR markers to confirm that they were practically 100% homozygous.

The second population consists of a set of 92 near isogenic lines (NILs). NILs were generated by selecting appropriate Ler x Cvi RILs and repeated backcrossing with Ler as recurrent female parent. A number of these lines have been described previously (Alonso-Blanco et al., 1998a, 2003; Swarup et al., 1999; Bentsink et al., 2003; Edwards et al., 2005; Juenger et al., 2005a; Teng et al., 2005). The progeny of backcrosses was genotyped with PCR markers and lines containing a homozygous Cvi introgression into an otherwise Ler background were selected. The set of selected lines was then extensively genotyped by AFLP analysis using the same restriction enzymes and primer combinations as those used for the genotyping of the RILs (Alonso-Blanco et al., 1998b). The NILs will be made available through the Arabidopsis stock centers.

In both populations each line is almost completely homozygous and therefore individuals of the same line are genetically identical, which allows the pooling of replicated individuals and repeated measurements to obtain a more precise estimate of phenotypic values. For the RIL and NIL populations, 16 and 24 genetically identical plants were grown per line, respectively. Additionally, 96 replicates were grown for each parental accession Ler and Cvi. All plants were grown in a single experiment with four completely randomized blocks containing 4, 6, and 24 replicates per RIL, NIL, and parent, respectively.

Plant growing conditions

Seeds were sown in petri dishes on water-soaked filter paper and incubated for five days in a cold room at 4°C in the dark to promote uniform germination. Subsequently, petri dishes were transferred to a climate chamber (24°C, 16 hr light per day) for two days before planting. Germinated seedlings were transferred to clay pots, placed in peat, containing a sandy soil mixture. A single plant per pot was grown under long-day light conditions in an air-conditioned greenhouse from July until October. Plants were fertilized every two weeks using a liquid fertilizer.

Quantitative traits

A total of six developmental traits, which were known to vary within the populations for the number of QTLs and heritability, were measured on all individuals. We quantified flowering time (FT); main inflorescence length at first silique (SL); total length of the main inflorescence (TL); basal branch number (BB), which is the number of side shoots growing out from the rosette; main inflorescence branch number (IB), which is the number of elongated axillary (secondary) inflorescences along the main inflorescence; and total number of side shoots (TB; basal plus main inflorescence). Flowering time was recorded as the number of days from the date of planting until the opening of the first flower. All other traits were measured at maturity.

Quantitative genetic analyses

For both populations and for each trait, total phenotypic variance was partitioned into sources attributable to genotype ($V_G$; i.e. the line effect) and error ($V_E$) using a random-effects analysis of variance (ANOVA, SPSS version 11.0) according to the model $Y = \mu + G + E$. Variance components were used to estimate broad sense heritability according to the formula $H^2 = V_G / (V_G + V_E)$, where $V_G$ is the among-genotype variance component and $V_E$ is the residual (error) variance component.
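The heritability estimate above can be illustrated with a minimal sketch. It assumes a balanced design (the same number of replicate plants for every line) and uses the method-of-moments estimators of a one-way random-effects ANOVA; the original analysis was run in SPSS, so this is only an approximation of that procedure. The function name and the simulated data are hypothetical.

```python
import numpy as np


def broad_sense_heritability(values):
    """Estimate V_G, V_E and H^2 from a balanced lines x replicates table.

    values: 2-D array-like of shape (n_lines, n_reps), one row per line.
    """
    values = np.asarray(values, dtype=float)
    n_lines, n_reps = values.shape
    line_means = values.mean(axis=1)
    grand_mean = values.mean()

    # Mean squares of the one-way ANOVA for the model Y = mu + G + E.
    ms_between = n_reps * np.sum((line_means - grand_mean) ** 2) / (n_lines - 1)
    ms_within = np.sum((values - line_means[:, None]) ** 2) / (n_lines * (n_reps - 1))

    # Method-of-moments variance components; a negative V_G is truncated at 0.
    v_e = ms_within
    v_g = max((ms_between - ms_within) / n_reps, 0.0)
    return v_g, v_e, v_g / (v_g + v_e)


# Hypothetical data: 161 lines x 16 replicates simulated with V_G = 4 and
# V_E = 1, i.e. a true H^2 of 0.8 (the estimate varies with the random seed).
rng = np.random.default_rng(1)
line_effects = rng.normal(0.0, 2.0, size=(161, 1))
trait = 50.0 + line_effects + rng.normal(0.0, 1.0, size=(161, 16))
v_g, v_e, h2 = broad_sense_heritability(trait)
print(f"V_G = {v_g:.2f}, V_E = {v_e:.2f}, H2 = {h2:.2f}")
```

For unbalanced data the divisor for the among-line component is no longer simply the number of replicates, so a mixed-model fit would be the safer choice in that case.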

Genetic correlations ($r_G$) were estimated as $r_G = \mathrm{cov}_{1,2} / \sqrt{V_{G1} \times V_{G2}}$, where $\mathrm{cov}_{1,2}$ is the covariance of trait means and $V_{G1}$ and $V_{G2}$ are the among-genotype variance components for those traits. The coefficient of genetic variation ($CV_G$) was estimated for each trait as $CV_G = 100 \times \sqrt{V_G} / \bar{X}$, where $V_G$ is the among-genotype variance component and $\bar{X}$ is the trait mean of the genotypes.

QTL analyses in the RIL population

To map QTLs using the RIL population, a set of 144 markers equally spaced over the Arabidopsis genetic map was selected from the RIL Ler x Cvi map (Alonso-Blanco et al., 1998b). These markers spanned 485 cM, with an average distance between consecutive markers of 3.5 cM and the largest genetic distance being 11 cM. The phenotypic values recorded, except basal branch number, were transformed ($\log_{10}(x+1)$) to improve the normality of the distributions and the values of 16 plants per RIL were used to calculate the means of each line for all traits. These means were used to perform the QTL analyses unless otherwise stated. The computer program MapQTL version 5.0 (Van Ooijen, 2004) was used to identify and locate QTLs linked to the molecular markers using both interval mapping and multiple QTL mapping (MQM). In a first step, putative QTLs were identified using interval mapping. Thereafter, a marker closely linked to each putative QTL was selected as a cofactor and the selected markers were used as genetic background controls in the approximate MQM of MapQTL. LOD statistics were calculated at 0.5 cM intervals. Tests of 1000 permutations were used to obtain an estimate of the number of type 1 errors (false positives). The genome-wide LOD score, which 95% of the permutations did not exceed, ranged from 2.6 to 2.8 and chromosome-wide LOD thresholds varied between 1.8 and 2.1 depending on trait and linkage group. The genome-wide LOD score was then used as the significance threshold to declare the presence of a QTL in MQM mapping, while the chromosome-wide thresholds were used to detect putative small-effect QTLs. In the final MQM model the genetic effect ($\mu_B - \mu_A$) and percentage of explained variance were estimated for each QTL and 2-LOD support intervals were established as an ~95% confidence level (Van Ooijen, 1992), using restricted MQM mapping.

Epistatic interactions between QTLs were estimated using factorial analysis of variance. For each trait, the mean phenotypic values were used as dependent variable and cofactors, corresponding to the detected QTLs, were used as fixed factors. The general linear model module of the statistical package SPSS version 11.0 was used to perform a full factorial analysis of variance or analysis of main effects only. Differences in $R^2$ values, calculated from the Type III sum of squares, were assigned to epistatic interaction effects of detected QTLs. Additionally we performed a complete pairwise search (P < 0.001, determined by Monte Carlo simulations) for conditional and coadaptive epistatic interactions for each trait using the computer program EPISTAT (Chase et al., 1997).

The effect of replication on statistical power was analyzed by performing MQM mapping on means of trait values from 1, 2, 4, 8, 12, and 16 replicate plants, respectively. Analyses were performed on ten independent, stochastically sampled, data sets for each replication size and trait using automated cofactor selection (P < 0.02). Total explained variance, LOD score of the largest-effect QTL, and number of significant QTLs were recorded for each analysis.

The effect of population size on statistical power was analyzed by performing MQM mapping on increasing population sizes. Analyses were performed on ten independent, stochastically sampled, data sets for each population size. Subpopulations of increasing size, with a step size of 20 lines, were analyzed for each trait using automated cofactor selection (P < 0.02). Total explained variance, LOD score of the largest-effect QTL, and number of significant QTLs were recorded for each analysis.
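The permutation-based genome-wide LOD threshold described in this section can be sketched as follows. This is not the MapQTL procedure: for brevity it substitutes single-marker regression for interval mapping and MQM, and it uses simulated, unlinked markers rather than the Ler x Cvi map. The function names, the marker coding (0 = Ler, 1 = Cvi) and all data are assumptions made for illustration.

```python
import numpy as np


def marker_lod(genotypes, phenotype):
    """LOD score per marker from single-marker regression.

    genotypes: (n_lines, n_markers) array coded 0/1, all markers polymorphic.
    phenotype: (n_lines,) array of line means.
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    g = g - g.mean(axis=0)
    y = y - y.mean()
    # Squared marker-trait correlation gives RSS1 / RSS0 = 1 - r^2.
    r = (g.T @ y) / np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    return -(len(y) / 2.0) * np.log10(1.0 - r ** 2)


def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold: the (1 - alpha) quantile of the maximum LOD
    obtained after shuffling phenotypes over lines n_perm times."""
    rng = np.random.default_rng(seed)
    max_lods = np.empty(n_perm)
    for i in range(n_perm):
        max_lods[i] = marker_lod(genotypes, rng.permutation(phenotype)).max()
    return np.quantile(max_lods, 1.0 - alpha)


# Hypothetical example: 161 lines, 144 unlinked markers, one QTL at marker 10.
rng = np.random.default_rng(42)
genotypes = rng.integers(0, 2, size=(161, 144)).astype(float)
phenotype = 1.5 * genotypes[:, 10] + rng.normal(0.0, 1.0, size=161)
lods = marker_lod(genotypes, phenotype)
threshold = permutation_threshold(genotypes, phenotype)
print(f"threshold = {threshold:.2f}, peak LOD = {lods.max():.2f}")
```

Restricting the maximum to the markers of one linkage group would give the corresponding chromosome-wide threshold, which is lower because fewer tests contribute to the maximum.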


Statistical analyses NILs

Differences in mean trait values of Ler and NILs were analyzed by univariate analysis of variance, using the general linear model module of the statistical package SPSS version 11.0. Dunnett's pairwise multiple comparison t-test was used as a post hoc test to determine significant differences. For each analysis, trait values were used as dependent variable and NILs were used as fixed factor. Tests were performed 2-sided with a Bonferroni-corrected significance threshold level of 0.05 and Ler as control category. In order to increase statistical power, similar analyses were conducted for bins (see results section). For this, trait values of all introgression lines assigned to a certain bin were pooled and compared to values of the Ler parental line. Because each NIL can be a member of more than one bin, the significance threshold was lowered to 0.001 to correct for multiple testing. The genetic effect of Cvi bins significantly differing from Ler was calculated as $\mu_B - \mu_A$, where $\mu_A$ and $\mu_B$ are the mean trait values of Ler and the Cvi bin, respectively. Explained variance was estimated from the partial $\eta^2$ of the univariate analysis of variance, where $\eta^2$ is the proportion of total variance attributable to factors in the analysis. The total percentage of explained variance was then estimated by using trait values as dependent variable and NILs as fixed factor, where all NILs were included as subjects. The percentage of explained variance of individual QTLs was estimated as a fraction of the total variation in the population (including all lines), using a single bin as fixed factor, and as a fraction of the total variation in a comparison of a single bin with Ler only.

To determine the effect of replicated measurements we calculated the power of detecting significant differences between Ler and NILs using various replicate numbers. For each trait we calculated the minimal relative difference in mean trait values that could still be significantly detected. Calculations were performed using a normal distribution two-sample equal variance power calculator from the UCLA department of statistics (http://calculators.stat.ucla.edu/). We first calculated for each trait the mean phenotypic value of 96 Ler replicate plants ($\mu_A$) and for each line the standard deviation of 24 replicate plants. The mean line standard deviation of each trait was taken as a measure of variation ($\sigma$) in all subsequent calculations. The significance level, the probability of falsely rejecting the null hypothesis ($H_0: \mu_A = \mu_B$) when it is true, was set to 0.05 and power, the probability of correctly rejecting the null hypothesis when the alternative ($H_1: \mu_A \neq \mu_B$) is true, was set to 0.95. The sample size of Ler ($N_A$) was always identical to the sample size of NILs ($N_B$) and ranged from 2 to 24 individuals. For each trait and sample size the mean trait value ($\mu_B$) for NILs was then calculated as the minimum value to meet the alternative hypothesis ($H_1: \mu_A \neq \mu_B$) in a two-sided test. These minimum values were then converted into a fold-difference value compared to the Ler value, calculated as $(|\mu_B - \mu_A| + \mu_A) / \mu_A$, to obtain a relative estimate independent of trait measurement units.

The effect of replication on statistical power was also analyzed by performing bin mapping using 2, 4, 8, 12, and 16 replicate plants, respectively. Analyses were performed on ten independent, stochastically sampled, data sets for each replication size and trait and the number of significant QTLs was recorded for each analysis.

Acknowledgements

We thank Kieron Edwards for sharing NILs, Johan van Ooijen for helpful assistance in the QTL mapping, and Piet Stam for critical reading of the manuscript. This work was supported by a grant from The Netherlands Organization for Scientific Research, Program Genomics (050-10-029).
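As an illustration of the replicate power calculation described under the statistical analyses of the NILs, the following sketch computes the minimal detectable fold difference between Ler and a NIL. It assumes the standard normal-approximation formula for a two-sided, two-sample test with equal variance and equal group sizes; whether the UCLA calculator used exactly this formula is an assumption. The function name and the example values of $\mu_A$ and $\sigma$ are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimal_detectable_fold(mu_a, sigma, n, alpha=0.05, power=0.95):
    """Smallest fold difference (|mu_B - mu_A| + mu_A) / mu_A detectable with
    n plants per group, given a common standard deviation sigma."""
    z = NormalDist()
    # Minimal |mu_B - mu_A| for a two-sided test at level alpha and given power.
    delta = (z.inv_cdf(1.0 - alpha / 2.0) + z.inv_cdf(power)) * sigma * sqrt(2.0 / n)
    return (delta + mu_a) / mu_a


# Hypothetical trait: Ler mean of 20 units with a mean line SD of 2 units.
for n in (2, 4, 8, 12, 16, 24):
    print(n, round(minimal_detectable_fold(20.0, 2.0, n), 3))
```

Because the detectable difference scales with the square root of 2/n, doubling the number of replicates shrinks it by a factor of about 1.4, so the gain per added plant diminishes as replication increases.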


REFERENCES

Alonso-Blanco, C., El-Assal, S.E., Coupland, G. and Koornneef, M. (1998a). Analysis of natural allelic variation at flowering time loci in the Landsberg erecta and Cape Verde Islands ecotypes of Arabidopsis thaliana. Genetics 149, 749-764.
Alonso-Blanco, C., Peeters, A.J., Koornneef, M., Lister, C., Dean, C., van den Bosch, N., Pot, J. and Kuiper, M.T. (1998b). Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J 14, 259-271.
Alonso-Blanco, C., Blankestijn-de Vries, H., Hanhart, C.J. and Koornneef, M. (1999). Natural allelic variation at seed size loci in relation to other life history traits of Arabidopsis thaliana. Proc Natl Acad Sci U S A 96, 4710-4717.
Alonso-Blanco, C. and Koornneef, M. (2000). Naturally occurring variation in Arabidopsis: an underexploited resource for plant genetics. Trends Plant Sci 5, 22-29.
Alonso-Blanco, C., Bentsink, L., Hanhart, C.J., Blankestijn-de Vries, H. and Koornneef, M. (2003). Analysis of natural allelic variation at seed dormancy loci of Arabidopsis thaliana. Genetics 164, 711-729.
Ausin, I., Alonso-Blanco, C., Jarillo, J.A., Ruiz-Garcia, L. and Martinez-Zapater, J.M. (2004). Regulation of flowering time by FVE, a retinoblastoma-associated protein. Nat Genet 36, 162-166.
Bentsink, L., Yuan, K., Koornneef, M. and Vreugdenhil, D. (2003). The genetics of phytate and phosphate accumulation in seeds and leaves of Arabidopsis thaliana, using natural variation. Theor Appl Genet 106, 1234-1243.
Blair, M.W., Iriarte, G. and Beebe, S. (2006). QTL analysis of yield traits in an advanced backcross population derived from a cultivated Andean x wild common bean (Phaseolus vulgaris L.) cross. Theor Appl Genet 112, 1149-1163.
Blanco, A., Simeone, R. and Gadaleta, A. (2006). Detection of QTLs for grain protein content in durum wheat. Theor Appl Genet 112, 1195-1204.
Borevitz, J.O. and Nordborg, M. (2003). The impact of genomics on the study of natural variation in Arabidopsis. Plant Physiol 132, 718-725.
Broman, K.W. (2001). Review of statistical methods for QTL mapping in experimental crosses. Lab Anim (NY) 30, 44-52.
Chase, K., Adler, F.R. and Lark, K.G. (1997). Epistat: a computer program for identifying and testing interactions between pairs of quantitative trait loci. Theor Appl Genet 94, 724-730.
Doerge, R.W. (2002). Mapping and analysis of quantitative trait loci in experimental populations. Nat Rev Genet 3, 43-52.
Edwards, K.D., Lynn, J.R., Gyula, P., Nagy, F. and Millar, A.J. (2005). Natural allelic variation in the temperature-compensation mechanisms of the Arabidopsis thaliana circadian clock. Genetics 170, 387-400.
Eshed, Y. and Zamir, D. (1995). An introgression line population of Lycopersicon pennellii in the cultivated tomato enables the identification and fine mapping of yield-associated QTL. Genetics 141, 1147-1162.
Fridman, E., Carrari, F., Liu, Y.S., Fernie, A.R. and Zamir, D. (2004). Zooming in on a quantitative trait for tomato yield using interspecific introgressions. Science 305, 1786-1789.
Han, F., Ullrich, S.E., Kleinhofs, A., Jones, B.L., Hayes, P.M. and Wesenberg, D.M. (1997). Fine structure mapping of the barley chromosome-1 centromere region containing malting-quality QTLs. Theor Appl Genet 95, 903-910.


Han, F., Clancy, J.A., Jones, B.L., Wesenberg, D.M., Kleinhofs, A. and Ullrich, S.E. (2004). Dissection of a malting quality QTL region on chromosome 1 (7H) of barley. Mol Breed 14, 339-347.
Jansen, R.C. (2003). Quantitative trait loci in inbred lines. In Handbook of Statistical Genetics, D.J. Balding, M. Bishop and C. Cannings, eds (Chichester, UK: John Wiley & Sons), pp. 445-476.
Jeuken, M.J. and Lindhout, P. (2004). The development of lettuce backcross inbred lines (BILs) for exploitation of the Lactuca saligna (wild lettuce) germplasm. Theor Appl Genet 109, 394-401.
Juenger, T.E., McKay, J.K., Hausmann, N., Keurentjes, J.J.B., Sen, S., Stowe, K.A., Dawson, T.E., Simms, E.L. and Richards, J.H. (2005a). Identification and characterization of QTL underlying whole-plant physiology in Arabidopsis thaliana: delta13C, stomatal conductance and transpiration efficiency. Plant Cell Environ 28, 697-708.
Juenger, T.E., Sen, S., Stowe, K.A. and Simms, E.L. (2005b). Epistasis and genotype-environment interaction for quantitative trait loci affecting flowering time in Arabidopsis thaliana. Genetica 123, 87-105.
Koornneef, M., Alonso-Blanco, C. and Vreugdenhil, D. (2004). Naturally occurring genetic variation in Arabidopsis thaliana. Annu Rev Plant Physiol Plant Mol Biol 55, 141-172.
Koumproglou, R., Wilkes, T.M., Townson, P., Wang, X.Y., Beynon, J., Pooni, H.S., Newbury, H.J. and Kearsey, M.J. (2002). STAIRS: a new genetic resource for functional genomic studies of Arabidopsis. Plant J 31, 355-364.
Loudet, O., Gaudon, V., Trubuil, A. and Daniel-Vedele, F. (2005). Quantitative trait loci controlling root growth and architecture in Arabidopsis thaliana confirmed by heterogeneous inbred family. Theor Appl Genet 110, 742-753.
Maloof, J.N. (2003). Genomic approaches to analyzing natural variation in Arabidopsis thaliana. Curr Opin Genet Dev 13, 576-582.
Monforte, A.J. and Tanksley, S.D. (2000). Development of a set of near isogenic and backcross recombinant inbred lines containing most of the Lycopersicon hirsutum genome in a L. esculentum genetic background: a tool for gene mapping and gene discovery. Genome 43, 803-813.
Nadeau, J.H., Singer, J.B., Matin, A. and Lander, E.S. (2000). Analysing complex genetic traits with chromosome substitution strains. Nat Genet 24, 221-225.
Paran, I. and Zamir, D. (2003). Quantitative traits in plants: beyond the QTL. Trends Genet 19, 303-306.
Rae, A.M., Howell, E.C. and Kearsey, M.J. (1999). More QTL for flowering time revealed by substitution lines in Brassica oleracea. Heredity 83 (Pt 5), 586-596.
Reymond, M., Svistoonoff, S., Loudet, O., Nussaume, L. and Desnos, T. (2006). Identification of QTL controlling root growth response to phosphate starvation in Arabidopsis thaliana. Plant Cell Environ 29, 115-125.
Singer, J.B., Hill, A.E., Burrage, L.C., Olszens, K.R., Song, J., Justice, M., O'Brien, W.E., Conti, D.V., Witte, J.S., Lander, E.S. et al. (2004). Genetic dissection of complex traits with chromosome substitution strains of mice. Science 304, 445-448.
Slate, J. (2005). Quantitative trait locus mapping in natural populations: progress, caveats and future directions. Mol Ecol 14, 363-379.
Stylianou, I.M., Tsaih, S.W., DiPetrillo, K., Ishimori, N., Li, R., Paigen, B. and Churchill, G. (2006). Complex genetic architecture revealed by analysis of high-density lipoprotein cholesterol in chromosome substitution strains and F2 crosses. Genetics 174, 999-1007.
Swarup, K., Alonso-Blanco, C., Lynn, J.R., Michaels, S.D., Amasino, R.M., Koornneef, M. and Millar, A.J. (1999). Natural allelic variation identifies new genes in the Arabidopsis circadian system. Plant J 20, 67-77.


Teng, S., Keurentjes, J.J.B., Bentsink, L., Koornneef, M. and Smeekens, S. (2005). Sucrose-specific induction of anthocyanin biosynthesis in Arabidopsis requires the MYB75/PAP1 gene. Plant Physiol 139, 1840-1852.
Tuinstra, M.R., Ejeta, G. and Goldsbrough, P.B. (1997). Heterogeneous inbred family (HIF) analysis: a method for developing near-isogenic lines that differ at quantitative trait loci. Theor Appl Genet 95, 1005-1011.
Ungerer, M.C., Halldorsdottir, S.S., Modliszewski, J.L., Mackay, T.F. and Purugganan, M.D. (2002). Quantitative trait loci for inflorescence development in Arabidopsis thaliana. Genetics 160, 1133-1151.
Ungerer, M.C., Halldorsdottir, S.S., Purugganan, M.D. and Mackay, T.F. (2003). Genotype-environment interactions at quantitative trait loci affecting inflorescence development in Arabidopsis thaliana. Genetics 165, 353-365.
Van Ooijen, J.W. (1992). Accuracy of mapping quantitative trait loci in autogamous species. Theor Appl Genet 84, 803-811.
Van Ooijen, J.W. (2004). MapQTL 5, Software for the mapping of quantitative trait loci in experimental populations (Wageningen, The Netherlands: Kyazma B.V.).
von Korff, M., Wang, H., Leon, J. and Pillen, K. (2004). Development of candidate introgression lines using an exotic barley accession (Hordeum vulgare ssp. spontaneum) as donor. Theor Appl Genet 109, 1736-1745.
Xu, S. (2003). Theoretical basis of the Beavis effect. Genetics 165, 2259-2268.
Yoon, D.B., Kang, K.H., Kim, H.J., Ju, H.G., Kwon, S.J., Suh, J.P., Jeong, O.Y. and Ahn, S.N. (2006). Mapping quantitative trait loci for yield components and morphological traits in an advanced backcross population between Oryza grandiglumis and the O. sativa japonica cultivar Hwaseongbyeo. Theor Appl Genet 112, 1052-1062.
Zou, F., Gelfond, J.A., Airey, D.C., Lu, L., Manly, K.F., Williams, R.W. and Threadgill, D.W. (2005). Quantitative trait locus analysis using recombinant inbred intercrosses: theoretical and empirical considerations. Genetics 170, 1299-1311.

Chapter 3

Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci

Joost J. B. Keurentjes*, Jingyuan Fu*, Inez R. Terpstra*, Juan M. Garcia, Guido van den Ackerveken, L. Basten Snoek, Anton J. M. Peeters, Dick Vreugdenhil, Maarten Koornneef and Ritsert C. Jansen

Published in Proceedings of the National Academy of Sciences USA (2007) 104, 1708-1713.

* Equal contribution.

ABSTRACT

Accessions of a plant species can show considerable genetic differences that are effectively analyzed using Recombinant Inbred Line (RIL) populations. Here we describe the results of genome-wide expression variation analysis in an RIL population of Arabidopsis thaliana. For many genes, variation in expression could be explained by expression Quantitative Trait Loci (eQTLs). The nature and consequences of this variation are discussed based on additional genetic parameters, such as heritability and transgression, and by examining the genomic position of eQTLs versus gene position, polymorphism frequency, and gene ontology. Furthermore, we developed a novel approach for genetic regulatory network construction by combining eQTL mapping and regulator candidate gene selection. The power of our method was shown in a case study of genes associated with flowering time, a well-studied regulatory network in Arabidopsis. Results that revealed clusters of co-regulated genes and their most likely regulators were in agreement with published data, and unknown relationships could be predicted.


INTRODUCTION

Analogous to classical traits, quantitative genetic variation is often observed for transcript levels of genes. Jansen and Nap (2001), therefore, introduced the concept of genetical genomics, in which Quantitative Trait Locus (QTL) analysis is applied to levels of transcript abundance and identifies genomic loci controlling the observed variation in expression (eQTLs). One of the best studied organisms with regard to gene expression regulation nowadays is yeast (Brem et al., 2002, 2005; Yvert et al., 2003; Bing and Hoeschele, 2005; Brem and Kruglyak, 2005; Ronald et al., 2005; Storey et al., 2005). However, in recent years several studies have demonstrated the feasibility of this approach in different organisms and diverse types of populations (Brem et al., 2002; Schadt et al., 2003; Morley et al., 2004; Bystrykh et al., 2005; Hubner et al., 2005; DeCook et al., 2006).

A logical next step would be the construction of genetic regulatory networks (Kendziorski and Wang, 2006), which only a few studies have addressed up to now (Bing and Hoeschele, 2005; Kliebenstein et al., 2006). Although many studies on higher eukaryotes suffered from small populations or only analyzed a subset of genes present on the genome of the organism under study, the main reason holding back the identification of gene-by-gene regulation has been the lack of a reliable identification of candidate regulators. Although powerful in detecting loci controlling the observed variation for trait values, support intervals of QTLs are still of considerable width, often covering hundreds of genes. Consequently, the molecular dissection of quantitative trait regulation is still in its infancy and would greatly benefit from approaches reducing the number of candidate genes in a QTL support interval.

Promising results have been obtained by combining QTL analyses of physiological and gene expression traits, based on co-localization of (e)QTLs (Wayne and McIntyre, 2002; Hubner et al., 2005; DeCook et al., 2006). However, when expression differences in genes are caused by differences in expression of their regulator, it is likely that they show correlation in expression (Bing and Hoeschele, 2005). Moreover, multiple functionally related genes with co-inciding eQTLs, which might be members of a common pathway, are likely to have one and the same regulator. We therefore developed a novel approach for the assignment of maximum-likelihood regulators by combining QTL analysis of gene expression profiling and iterative Group Analysis (iGA) (Breitling et al., 2004) of functionally related genes with co-inciding eQTLs.

To apply the concept of genetical genomics to higher plants we analyzed genome-wide gene expression variation in a large, well-studied Recombinant Inbred Line (RIL) population of Arabidopsis thaliana. We show that for many genes the variation in transcript level can be explained by genetic factors. By integrating current knowledge of the genetics of a specific trait, we demonstrate the construction of genetic regulatory networks, which can serve to form hypotheses about as-yet-unknown regulatory steps.

RESULTS

Genetic control of gene expression in plants is highly complex

To determine the effect of genetic factors involved in the regulation of expression, we analyzed genome-wide gene expression in the parents and an RIL population of a cross between the distinct accessions Landsberg erecta (Ler) and Cape Verde Islands (Cvi), consisting of 160 lines (Alonso-Blanco et al., 1998b). Transcript levels of 24,065 genes were analyzed by DNA microarrays, of which 922 showed significant differential expression between the parents [$P < 2.5 \times 10^{-3}$; false-discovery rate (FDR) = 0.05]. Subsequent mapping resulted in 4,523 eQTLs detected for 4,066 genes ($P < 5.29 \times 10^{-5}$; FDR = 0.05, corresponding to a q value of 0.01) (Storey and Tibshirani, 2003).

Figure 1: Frequency distributions of heritability values of gene expression.
(A) Data from a microarray comparison of the parents. (B) Data from a microarray analysis of the Ler x Cvi RIL population. Solid and shaded bars represent the number of genes that could and could not be mapped, respectively. The solid line depicts the number of mapped genes as a proportion of the total number of genes for a given heritability class.

Because the microarray probe set was designed on the sequenced accession Columbia (Col), we performed hybridizations of genomic DNA of the parental lines and found relatively few hybridization differences (supplemental Table 1 at www.pnas.org/cgi/content/full/0610429104/DC1). However, the low power to

49 Chapter 3

detect differences, due to the small number of replicates, might have led to an underestimation, as indicated by other studies (Borevitz, 2006).

Figure 2: Effect of expression level and transgression on eQTL detection.
(A) Frequency distribution of the mean expression level of analyzed genes in the RIL population. Solid and shaded bars represent the number of genes that could and could not be mapped, respectively. The solid line depicts the number of mapped genes as a proportion of the total number of genes for a given class. (B) Diagram of the number of genes showing linkage and transgression. Circles are proportional to the number of genes. Increasing shading represents, respectively, the total number of genes analyzed (24,065), the number of genes whose expression showed significant linkage (4,066) and the number of genes whose expression showed transgressive segregation (10,849).

Heritability values calculated from the parental data and the RIL population reached median values of 28.6% and 74.7%, respectively (Figure 1), which is in agreement with the discrepancy between the number of differentially expressed and mapped genes (i.e. genes for which an eQTL was found). Although the fraction of mapped genes increased with higher heritability values, for many genes showing high heritability no significant eQTL could be detected. These findings suggest that the expression of many genes is controlled by multiple eQTLs, many of which might not have passed the significance test because of their small effect. Likewise, only 65.6% of the genes differentially expressed between the parents could be mapped. However, for 15.0% of the genes for which the parents did not show a significant difference in expression levels, eQTLs could be detected. These observations and the much lower heritabilities calculated from the parental data, compared with those from the RIL population, indicate that eQTLs for a given gene might exert opposite additive effects, leading


to a balanced expression in the parents but a transgressive expression pattern among the segregants of the population. To test this hypothesis, we tested each gene for transgression and found significant transgression of expression for 10,849 genes (45.1%). No relationship was found between the number of mapped genes and transgression (Figure 2B). These data indicate that the regulation of gene expression in plants is largely under genetic control but is highly complex because of the involvement of multiple genes.

Distribution of eQTLs identifies regulatory hot spots
To characterize in more detail the genes whose expression showed significant linkage, we determined several features. We first analyzed the distribution of eQTLs along the genome of Arabidopsis and found a number of genomic regions containing numbers of eQTLs significantly deviating from what can be expected by chance, as determined by permutation tests (Figure 3). These hot spots may reflect local gene-dense regions, in contrast to cold spots, which may reflect low-gene-density regions such as centromeres. Alternatively, hot spots may contain master regulators: genes controlling the expression of many other genes. The large number of genes mapping to the ERECTA gene, which was included as a phenotypic marker, illustrates this possibility. An empirical threshold for assessing a hot spot, providing a 0.05 genome-wide error rate, was generated using a permutation procedure, which defined a hot spot as any marker associated with 43 or more genes. Because 176 genes mapped to the ERECTA marker, this locus was considered to be an eQTL hot spot. Polymorphisms in ERECTA, a receptor protein kinase (Torii et al., 1996), are well known for their pleiotropic effect on many traits, including morphological differences (Koornneef et al., 2004).
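The idea behind an empirical hot-spot threshold can be sketched as follows. The details of the permutation procedure are not given in this section, so the null model below (each eQTL assigned to a marker uniformly at random) and the function name are assumptions made for illustration only; the sketch is not expected to reproduce the reported cutoff of 43 genes.

```python
import random
from collections import Counter


def hotspot_threshold(n_eqtls, n_markers, n_perm=200, alpha=0.05, seed=0):
    """Smallest per-marker eQTL count that the busiest marker reaches in at
    most a fraction `alpha` of random reassignments (genome-wide error rate)."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        # Null model (assumption): every eQTL lands on a marker uniformly at random.
        counts = Counter(rng.randrange(n_markers) for _ in range(n_eqtls))
        maxima.append(max(counts.values()))
    maxima.sort()
    n_allowed = int(alpha * n_perm)          # permutations allowed to exceed the cutoff
    cutoff = maxima[n_perm - n_allowed - 1]  # at most n_allowed maxima are larger
    return cutoff + 1                        # counts >= this value are called hot spots


# Sizes taken from the text: 4,523 eQTLs distributed over 144 markers.
threshold = hotspot_threshold(n_eqtls=4523, n_markers=144)
print(threshold)
```

Taking the maximum over all markers in each permutation is what makes the error rate genome-wide rather than per marker: a marker is only called a hot spot if its count is rarely matched by the busiest marker anywhere in a random assignment.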

Figure 3: Genomic distribution of eQTLs.
Bars represent the number of distant (solid) and local (shaded) eQTLs detected at each marker position. Each eQTL was positioned at its best controlling marker. The dashed horizontal line represents the significance threshold value for defining a hot spot. Shaded vertical lines depict chromosomal borders.


Distant gene expression regulation occurs more frequently but local regulation is stronger
Genomic differences responsible for eQTLs either occur in regulatory genes affecting the transcript level of other genes (trans-regulation) or in the genes encoding the mRNA for which the eQTL was found (cis-regulation) (Rockman and Kruglyak, 2006). To compare the position of genes and their eQTLs, we anchored the genetic map to the physical map and found an almost linear genome-wide relation of 4.1 cM per Mbp (supplemental Figure 8 at www.pnas.org/cgi/content/full/0610429104/DC1). When the position of each eQTL was plotted against the position of the gene for which that eQTL was found, a strong enrichment along the diagonal of the graph was observed (Figure 4). This enrichment indicates that many genes, the majority of which are expected to be cis-regulated (Ronald et al., 2005), map to their own physical position.

Figure 4: Distribution of mapped genes versus the position of their accompanying eQTL.
Positions of detected eQTLs are plotted against the position of the gene for which that eQTL was found. Chromosomal borders are depicted as horizontal and vertical lines. Mbp, megabase pairs.

To quantify this result, we defined local/distant regulation in terms of the positional coincidence of genes and their accompanying eQTL(s). Of 4,066 mapped genes, 1,875 (46.1%) co-located with the support interval of one of their eQTLs, defined as the region in which -Log10P exceeds max{-Log10P} - 1.5 (where P expresses the significance of association (Keurentjes et al., 2006)), and were therefore classified as locally regulated. Genes outside such intervals (1,958; 48.1%) were classified as distantly regulated. A minor number of 198 genes (4.9%) with multiple eQTLs showed both local and distant regulation, whereas the physical position of 35 genes (0.9%) was unknown (Table 1).
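The local/distant classification described above can be sketched in a few lines. This is a minimal illustration, assuming that the support interval is the contiguous stretch of markers around the peak where -Log10P stays at or above max{-Log10P} - 1.5; the function names, marker positions and scores are hypothetical.

```python
def support_interval(positions, neg_log10_p, drop=1.5):
    """Return (start, end) of the contiguous region around the peak in which
    -log10(P) stays at or above its maximum minus `drop`."""
    peak = max(range(len(neg_log10_p)), key=neg_log10_p.__getitem__)
    cutoff = neg_log10_p[peak] - drop
    left = peak
    while left > 0 and neg_log10_p[left - 1] >= cutoff:
        left -= 1  # extend the interval to the left while scores stay high
    right = peak
    while right < len(neg_log10_p) - 1 and neg_log10_p[right + 1] >= cutoff:
        right += 1  # extend the interval to the right while scores stay high
    return positions[left], positions[right]


def classify(gene_position, interval):
    """Call a gene 'local' if it lies inside the eQTL support interval."""
    start, end = interval
    return "local" if start <= gene_position <= end else "distant"


# Hypothetical marker positions (Mbp) and association scores along one chromosome.
marker_mbp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [2.0, 5.8, 7.1, 6.0, 3.0, 5.9]

interval = support_interval(marker_mbp, scores, drop=1.5)
print(interval, classify(3.5, interval), classify(6.0, interval))
```

Passing `drop=2.0` gives the wider interval used in the third block of Table 1; a wider interval can only move genes from the distant class to the local class, never the other way round.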


Table 1: The number of genes showing linkage, classified according to the position of eQTLs relative to the gene. Shown are the number of genes with a single or multiple eQTL(s) for different significance thresholds (P) and eQTL support intervals (max{-Log10P} - x, where x = 1.5 and 2.0 respectively).

Position            Single eQTL    Multiple eQTLs

P < 5.29 × 10⁻⁵; max{-Log10P} - 1.5
Local                   1875              -
Distant                 1752            206
Local + distant            -            198
Unknown                   31              4

P < 6.50 × 10⁻⁴; max{-Log10P} - 1.5
Local                   2167              -
Distant                 3671            916
Local + distant            -            794
Unknown                   45             11

P < 5.29 × 10⁻⁵; max{-Log10P} - 2.0
Local                   2007              -
Distant                 1676            156
Local + distant            -            192
Unknown                   31              4

Because cis-regulation is often much stronger than trans-regulation (Bing and Hoeschele, 2005), as also indicated by the median -Log10P values of 7.1 and 5.3 and the median explained variance of 30.3% and 22.6% for local and distant eQTLs, respectively, the ratio of detected local versus distant eQTLs depends on the applied significance threshold (Schadt et al., 2003; Morley et al., 2004; Hubner et al., 2005). The stringent threshold applied here, corrected for multiple testing, might therefore have underestimated distant regulation. When the threshold was relaxed from P < 5.29 × 10⁻⁵ to P < 6.5 × 10⁻⁴ (FDR = 0.25, q = 0.05), 7,604 transcripts showed at least one linkage, with 2,167 (28.5%) being locally regulated, 4,587 (60.3%) being distantly regulated, and 794 (10.4%) being both locally and distantly regulated. Based on their P-value distributions (Storey and Tibshirani, 2003), the overall proportions of locally and distantly regulated genes were estimated at 40.5% and 15.3%, respectively.

A second parameter affecting the assignment of locally versus distantly regulated transcripts is the setting of the eQTL support interval. However, when a wider interval of max{-Log10P} - 2.0 was used at P < 5.29 × 10⁻⁵, results were similar, with 2,007 (49.4%), 1,832 (45.1%), and 192 (4.7%) genes classified as locally, distantly, and both locally and distantly regulated, respectively.


Local regulation correlates with SNP frequency and is less frequent in regulatory genes
To determine whether a relationship exists between SNP or gene density and the number of mapped genes, we performed a sliding-window regression analysis. A strong correlation was observed between gene density and the number of locally and distantly regulated genes (r² = 0.88, P < 0.0001 and r² = 0.91, P < 0.0001, respectively) (Figure 5A).
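A sliding-window regression of this kind can be sketched as below. The window size, step and gene positions are hypothetical (the values used in the study are not stated in this section), and the helper names are illustrative. Positions are given as integers in kbp to avoid floating-point window boundaries.

```python
def window_counts(positions, window, step, length):
    """Count positions falling in half-open windows [start, start + window),
    with the window start moved along the chromosome in increments of `step`."""
    counts = []
    for start in range(0, length - window + 1, step):
        counts.append(sum(start <= p < start + window for p in positions))
    return counts


def r_squared(x, y):
    """Squared Pearson correlation, equal to r² of a simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical positions (kbp) of all genes and of the mapped subset.
all_genes = [100, 300, 500, 700, 900, 1100, 1500, 2500, 2700, 2900, 3100, 3900]
mapped_genes = [300, 700, 1100, 2700, 3100]

gene_density = window_counts(all_genes, window=1000, step=500, length=4000)
mapped_density = window_counts(mapped_genes, window=1000, step=500, length=4000)
r2 = r_squared(gene_density, mapped_density)
print(gene_density, mapped_density, round(r2, 2))
```

Because neighbouring windows overlap, their counts are not independent, so a P-value attached to such an r² should be read with some caution.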

Figure 5: Relationship between gene and SNP frequency and the number of mapped genes.
(A) Relationship between gene frequency (solid lines) and the number of mapped genes, divided into locally (shaded lines) and distantly (dotted shaded lines) regulated genes. (B) Relationship between SNP frequency (solid lines) and the number of mapped genes, divided into locally (shaded lines) and distantly (dotted shaded lines) regulated genes, corrected for gene density. Gaps represent chromosomal borders. Mbp, megabase pairs.

A weaker but significant correlation was also found between gene and SNP frequency (r² = 0.34, P < 0.0001). Even when the number of mapped genes in a window was corrected for gene density, a significant correlation was still found between SNP frequency and the number of locally regulated genes (r² = 0.32, P <


0.0001), although incidental differences in hybridization efficiency might have contributed to an overestimation. Such a relationship was not found for distantly regulated genes (r² = -0.003, P = 0.89) (Figure 5B).

To assess whether there was a functional enrichment for genes whose variation in expression could be genetically explained, we computed the proportion of these genes for each Gene Ontology biological process and molecular function category (The Arabidopsis Information Resource; www.arabidopsis.org) (Figure 6). Genes involved in regulatory processes showed significantly less genetically explainable variation in expression (Al-Shahrour et al., 2004) (supplemental Table 3 at www.pnas.org/cgi/content/full/0610429104/DC1). However, small changes in expression level, which may be more frequent in regulatory genes, are more difficult to detect but can nevertheless be very relevant biologically, because they may result in large changes in expression of target genes. Furthermore, many regulatory genes display pleiotropic effects. A change in expression of such key regulators can affect the expression of many more target genes, which may skew the distribution of differentially expressed genes in favor of classes containing predominantly target genes.

Figure 6: Frequency distribution of the proportion of mapped genes versus function.
(A) Proportion of genes that could be mapped in different Gene Ontology categories of biological processes. (B) Proportion of genes that could be mapped in different Gene Ontology categories of molecular functions. Solid, shaded, and white bars represent local, distant, and both local and distant regulation, respectively.


Interestingly, when these analyses were performed separately for locally and distantly regulated genes, regulatory categories showed a proportion of distantly regulated genes comparable with other classes but a much smaller proportion of locally regulated genes (Figure 6). Comparing locally to distantly regulated genes (Al-Shahrour et al., 2004) resulted in significant overrepresentation of distantly regulated genes in ten Gene Ontology biological process categories, all involved in regulation. Only one category was detected in which locally regulated genes were overrepresented (Table 2). This finding agrees with the general assumption that regulatory genes are much more strongly conserved than other genes because of their often pleiotropic effects.

Table 2: Gene Ontology categories with significantly different proportions of locally versus distantly regulated genes. The second and fourth columns give, for each category, the number of genes in the test set that were locally and distantly regulated, respectively. The third and fifth columns give, for each category, the proportion of the total number of annotated genes in the test set that were locally and distantly regulated, respectively. The sixth column gives, for each category, the P-value of the observed difference between locally and distantly regulated genes. n.s., not significant.

                                                   Local           Distant
Gene Ontology category                          Genes     %     Genes     %     P-value

Biological process
regulation of cellular process                     69    7.6      135   14.4    1.68E-03
regulation of cellular metabolism                  58    7.2      120   14.2    1.68E-03
regulation of nucleic acid metabolism              57    8.2      118   16.1    1.68E-03
regulation of transcription                        56   10.4      117   20.7    1.68E-03
regulation of metabolism                           59    7.0      121   13.4    2.02E-03
regulation of cellular physiological process       69    8.2      135   15.0    2.02E-03
transcription                                      63    9.1      123   16.8    2.52E-03
regulation of physiological process                74    8.2      136   14.5    2.99E-03
RNA processing                                     31    5.7        9    1.6    3.86E-02
transcription, DNA-dependent                       29    5.4       65   11.5    3.86E-02
regulation of transcription, DNA-dependent         29    8.6       64   17.9    3.86E-02

Molecular function
transcription factor activity                      64    6.2      125   11.5    1.60E-02

Cellular component
n.s.

A dual approach for the construction of regulatory networks reveals novel regulatory steps for flowering time
Genetic regulatory networks consist of a collection of genes, which are interconnected because one gene regulates the transcription of another directly or


indirectly. The analysis of gene expression in a mapping population can greatly enhance the construction of such networks. If an eQTL results from differences in expression of a regulator, this regulator is likely to show correlation in expression levels with the gene that mapped to its position (Bing and Hoeschele, 2005). When multiple genes involved in the same biological process map to the same position, many of them might be under the control of the same gene. We reasoned that the best candidate within an eQTL interval is the gene whose expression best correlates with multiple genes mapping to the position of that gene. We therefore combined expression trait profiling with eQTL mapping, gene annotation, and extended iterative Group Analysis (iGA) (Breitling et al., 2004) to sort candidate regulators based on their PC-value (Possibility of Change), which tells, for a given regulator, how likely it is to observe a strong correlation with multiple members of a selected group of genes. This novel approach enabled us to drastically narrow down the number of candidate genes in an eQTL interval and select the best candidate for the construction of genetic regulatory networks.

To verify our approach, we focused on one of the best studied and most complete genetic regulatory networks available in plants: the regulation of flowering in Arabidopsis. Flowering time is highly variable between accessions of Arabidopsis (Koornneef et al., 2004). Variation in flowering time also exists between Ler and Cvi, and several studies have reported QTLs for this trait (Alonso-Blanco et al., 1998a; Ungerer et al., 2002; Juenger et al., 2005). Although flowering starts much later, the expression of genes that indicate commitment to flowering is already apparent at a very early stage and peaks in the seedling stage (Kobayashi et al., 1999; Zimmermann et al., 2004). We selected a set of 192 genes known to be involved in the control of flowering from recent literature (see supplemental Table 5 at www.pnas.org/cgi/content/full/0610429104/DC1 for a full list) and keyword searching in The Arabidopsis Information Resource database; 175 of these genes were analyzed in our study. Analysis of their expression level in the parental accessions identified eight of them as differentially expressed. However, 83 genes showed at least one eQTL at a genome-wide threshold of 2.23 × 10⁻³. We calculated PC-values for correlation in expression profiles, using the group of 83 mapped flower genes and all candidate genes within their eQTL support intervals. We then selected the genes within the eQTL support interval of a given flower gene with significant PC-values (FDR = 0.05) as candidates for this eQTL (supplemental Table 5 at www.pnas.org/cgi/content/full/0610429104/DC1). Regulators were predicted for 51 genes, whereas for 32 genes no significant PC-value was obtained.
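The ranking idea behind a PC-value can be illustrated with a simplified sketch of iGA. It assumes that, for one candidate regulator, all genes have already been ranked by the strength of their correlation with that candidate; the PC-value is then taken as the smallest hypergeometric tail probability of seeing the observed number of group members among the top-ranked genes. The extensions used in this study are not reproduced, and the function and gene names are hypothetical.

```python
from math import comb


def hypergeom_tail(population, members, draws, hits):
    """P(X >= hits) when `draws` genes are taken without replacement from a
    population that contains `members` genes of the group of interest."""
    total = comb(population, draws)
    favourable = sum(
        comb(members, i) * comb(population - members, draws - i)
        for i in range(hits, min(members, draws) + 1)
    )
    return favourable / total


def pc_value(ranked_genes, group):
    """Minimum tail probability over all list cut-offs ending at a group member.

    `ranked_genes` is ordered from strongest to weakest correlation with the
    candidate regulator; `group` holds the functionally related target genes.
    """
    population = len(ranked_genes)
    members = sum(1 for gene in ranked_genes if gene in group)
    best, seen = 1.0, 0
    for rank, gene in enumerate(ranked_genes, start=1):
        if gene in group:
            seen += 1
            best = min(best, hypergeom_tail(population, members, rank, seen))
    return best


# Ten hypothetical genes ranked by correlation with two candidate regulators.
group = {"g1", "g2", "g4"}
ranking_candidate_a = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
ranking_candidate_b = ["g10", "g9", "g8", "g7", "g6", "g5", "g3", "g4", "g2", "g1"]

pc_a = pc_value(ranking_candidate_a, group)
pc_b = pc_value(ranking_candidate_b, group)
print(pc_a, pc_b)
```

In this toy example candidate A, whose expression correlates best with the group members, receives the lower PC-value and would be preferred as the regulator. In practice such values would still need correction for the number of candidates tested.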


Figure 7: Regulatory network of genes involved in the transition to flowering.
Flower genes are connected to their most likely regulator by directional edges. Arrows and bars represent stimulative and repressive regulation, respectively.

Figure 7 shows a network of flower genes and their most likely regulators. The most significant regulator detected was GIGANTEA (GI) with a PC-value of 1.01 × 10⁻¹². Thirteen genes mapped to GI, including GI itself, and all of them contributed to the lowest PC-value. GI is the first member of an output pathway of the circadian clock that controls flowering time and has been shown to regulate circadian rhythms in Arabidopsis (Mizoguchi et al., 2005). At the position of GI, a minor flowering-time QTL (Alonso-Blanco et al., 1998a) and a circadian period length QTL (Swarup et al., 1999; Michael et al., 2003) were identified, which indicates the physiological consequences of this complex pattern of gene expression variation. Indeed many of the genes, like CCA1 (see supplemental Table


5 at www.pnas.org/cgi/content/full/0610429104/DC1 for details), LHY1, ELF4, and TOC1, for which GI was identified as their most likely regulator, belong to the core circadian oscillator (Boss et al., 2004). Others are involved in the regulation of the circadian clock, such as PCL1, APRR9, and FKF1 (Michael et al., 2003; Onai and Ishiura, 2005), or play a role in floral transition, such as ELF7 and the CONSTANS-LIKE family members COL1, COL2, and COL9 (Ledger et al., 2001; He et al., 2004; Cheng and Wang, 2005). A second cluster of co-regulated genes is involved in floral repression and mapped to FLG, another major QTL for flowering time. Whereas the floral repressors FLC, MAF1, MAF4, MAF5, and TOE1 (Boss et al., 2004) are up-regulated, the floral promoter CRY2 (Boss et al., 2004) is down-regulated by this locus, in agreement with findings that FLC expression negatively correlates with CRY2 (El-Din El-Assal et al., 2003). In addition to FLG, CRY2 and FLC are major-effect QTLs for flowering time in the Ler x Cvi population, and significant epistasis has been found between CRY2 and FLC (El-Din El-Assal et al., 2003) and between the FLC region and the FLG locus (Alonso-Blanco et al., 1998a). Although HUA2 was previously suggested as a candidate for the FLG locus (Doyle et al., 2005), we did not identify it as such and found a gene with unknown function (At5g23460) to be the most likely candidate. Other clusters are predominantly involved in hormonal pathways (MYB33, ARF6, ARF8, RD29B, and SHI) (Mouradov et al., 2002; Nagpal et al., 2005) and the photoperiod pathway (PIE1, CAM1, PHYE, and ESD4) (Levy and Dean, 1998; Boss et al., 2004) of flowering.

To identify other possible target genes of the most significant regulator (GI), we calculated the correlation coefficient between the genes of the GI regulatory cluster and all other genes. Strong correlation was observed for 280 transcripts at an empirical correlation coefficient cutoff of 0.55, corresponding to an FDR of 9.5 × 10⁻⁵ (supplemental Table 6 at www.pnas.org/cgi/content/full/0610429104/DC1). Many of these genes showed no significant linkage at the position of GI but several displayed a suggestive eQTL. Although correlation can be a result of linked genetic effects, only 32 locally regulated genes were located within 2.5 Mbp of GI. The highest correlation coefficient (0.75) was found for a CONSTANS-LIKE PROTEIN encoding gene (At1g07050). The long day integrator CONSTANS (CO) has been shown to be a direct target of GI (Mizoguchi et al., 2005), although it was not identified as such in our study. Two other genes associated with circadian rhythms, APRR5 and WNK1, were detected, and both showed a suggestive eQTL at the position of GI. APRR genes are paralogs of TOC1 and have been shown to be regulated by the protein kinase WNK1 (Nakamichi et al., 2002). These results suggest that the feedback regulation of the circadian clock by GI acts, at least partly, through WNK1 and APRR5.
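Screening for additional targets by correlation can be sketched as follows. How the cluster was summarized is not specified in this section, so the sketch makes two labeled assumptions: the cluster is represented by the mean expression profile of its members, and a gene is retained when the absolute Pearson correlation reaches the cutoff. Only the cutoff of 0.55 comes from the text; all names and profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_targets(cluster_profiles, other_profiles, cutoff=0.55):
    """Return {gene: r} for genes whose |r| with the cluster mean >= cutoff."""
    n_lines = len(cluster_profiles[0])
    # Assumption: the regulatory cluster is summarized by its mean profile.
    mean_profile = [
        sum(profile[i] for profile in cluster_profiles) / len(cluster_profiles)
        for i in range(n_lines)
    ]
    hits = {}
    for gene, profile in other_profiles.items():
        r = pearson(mean_profile, profile)
        if abs(r) >= cutoff:
            hits[gene] = r
    return hits


# Hypothetical expression values across five lines.
cluster = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 7]]
others = {
    "geneA": [2, 4, 6, 8, 11],   # follows the cluster
    "geneB": [5, 1, 4, 2, 3],    # largely unrelated
    "geneC": [10, 8, 6, 4, 1],   # mirrors the cluster
}
targets = correlated_targets(cluster, others, cutoff=0.55)
print(sorted(targets))
```

Keeping the sign of the correlation in the result makes it possible to separate putatively activated from putatively repressed targets afterwards.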


DISCUSSION

Genetic variation in gene expression is abundant and complex
We determined differences in gene expression between two distinct accessions of Arabidopsis and within an RIL population derived from these accessions.

Our data suggest that variation in gene expression among genetically different plants of the same species is for a large part genetically controlled and highly complex. Although eQTLs were detected for >4,000 genes, only 922 were differentially expressed between the parents, which suggests that the expression of many genes is controlled by multiple loci with opposing effects, avoiding large differences between natural accessions but generating strong transgression in a segregating population. This suggestion is supported by the differences in heritability, as calculated from the parental and population expression analyses. This difference between the two heritability estimates might have several reasons. First, statistical issues might bias the outcome of the analyses. False negatives might bias the number of genes differentially expressed between the parents downwards, because statistical power was limited to ten replicate measurements of each parent. On the other hand, false positives due to low signal-to-noise ratios for low-expressed genes might bias the number of mapped genes upwards. However, most mapped genes had medium-to-high expression levels (Figure 2A).

A second and more likely reason why mapped genes were not significantly differentially expressed between the parents, given the extent of the difference in number, might be the complex genetic inheritance of gene expression. This is illustrated by the fact that, although the median heritability of mapped genes was 82.4%, only a median 28.4% of the variation observed for mapped genes could be explained by significant eQTLs. Furthermore, although the proportion of mapped genes increased with higher heritability values, many genes with a high heritability could not be mapped significantly. Together with the strong transgression observed for many genes, these data imply that regulation of expression often occurs through the added effect of numerous small-effect loci, each of which fails to pass the significance threshold.

Because two-color arrays were used in this study, a dye effect can be expected in subsequent analyses. In our experiment, the dye effect was controlled and corrected at two levels. At the level of the experimental design, we balanced the dye effect between two alleles by optimizing for the number of Ler/Cvi and Cvi/Ler comparisons at each marker position (Fu and Jansen, 2006). At the analysis level, we included the gene-specific differential effect between the two dyes in the QTL analysis model (Dobbin et al., 2005).


Molecular background of expression variation
Many factors, ranging from abiotic external influences to direct active control of transcriptional activity, influence the level of transcript abundance of a given gene. Here, we focused on genetic factors contributing to whole-genome transcript levels. Our data showed that genes whose transcript variation could be mapped are not equally distributed over the Arabidopsis genome. Although a strong correlation between the total number of genes per unit of chromosome and those that could be mapped was observed, other explanations, such as differences in chromatin structure or SNP frequency, cannot be excluded. This is illustrated by the correlation observed between SNP frequency and the proportion of mapped genes. Anchoring of the genetic map enabled us to define local versus distant regulation. Although, in general, local regulation seems stronger, distant regulation occurs more frequently. These findings were demonstrated by decreasing the stringency of the significance threshold; only a minor number of additional locally regulated genes were detected, whereas the number of distantly regulated genes increased more than two-fold. Because the vast majority of genes showing local linkage are expected to be cis-regulated (Ronald et al., 2005), this difference in increase can be explained by the direct influence of cis-polymorphisms on expression, whereas trans-polymorphisms exert their effect indirectly through a change in expression or coding sequence of a second gene. Taken together, the strong transgression observed for many genes and the number of distantly versus locally regulated genes make it conceivable that many cis-regulated genes exert pleiotropic effects on the expression of other genes and are causal for many of the eQTLs acting in trans.

Regulatory networks
For many biological processes, the genes contributing to a certain phenotype are often well known. However, in many cases, little is known about the regulation and interaction of these genes. We combined expression information with eQTL mapping, gene annotation, and iterative Group Analysis to identify likely regulators. This approach enabled, for the first time, the construction of maximum-likelihood genetic regulatory networks from a genome-wide genetical genomics experiment. A case study that used genes involved in the well-known process of transition from a vegetative state to a flowering state confirmed many of the interactions identified previously. Moreover, numerous novel interactions that can serve to form hypotheses for future studies were predicted. It must be noted, however, that analyses were performed on data from a single time point. It is not unlikely that regulation occurs differently at other developmental stages or diurnal


phases, or even in an organ-specific manner. Especially for pathways influenced by the circadian clock, such as flowering time, expression differences at one time point can be caused by differences in circadian phase (Michael et al., 2003; Darrah et al., 2006). Accuracy and reliability would therefore benefit from gene expression analysis at multiple developmental stages and time points. Nevertheless, confidence in the approach followed was gained from the fact that many functionally related genes grouped together, indicating common and simultaneous regulation. We assigned the gene with the lowest PC-value as the most likely candidate responsible for this regulation, although other genes with significant PC-values cannot be ruled out a priori. Moreover, due to coincidental genetic linkage of regulators, independently regulated genes may show a high correlation in expression. This potential source of false candidate assignment is especially relevant for hot spots of locally regulated genes. Subsequent in-depth analysis should be performed to unambiguously identify genes underlying eQTLs, but the number of candidate genes decreased substantially with the described method.


MATERIALS AND METHODS

Plant material and tissue collection
Aerial parts of seedlings from the accessions Ler and Cvi and a population of 160 recombinant inbred lines derived from a cross between these parents (Alonso-Blanco et al., 1998b; Keurentjes et al., 2006) were grown and collected as described previously (Keurentjes et al., 2006). In brief, seeds of lines were sown in petri dishes on 1/2 MS agar and placed in a cold room for seven days. Petri dishes were then transferred to a climate chamber and seedlings were collected after seven days. Plant material was stored at -80°C until further processing.

Linkage map construction and anchoring to the physical map
The genetic map was constructed from a subset of the markers available at http://nasc.nott.ac.uk/, with a few new markers added. The computer program JoinMap 4 (van Ooijen, 2006) was used for the calculation of linkage groups and genetic distances. In total, 144 markers were used, with an average spacing of 3.5 cM. The largest distance between two markers was 10.8 cM.

To anchor the genetic map to the physical map of Arabidopsis, the total set of 291 available markers was analyzed. First, a genetic map that comprised all 291 markers was constructed. Physical positions of molecular PCR markers were obtained from The Arabidopsis Information Resource, release 6.0 (www.arabidopsis.org). Sequences of amplified fragment length polymorphism markers were obtained by in silico amplification of Col markers that were polymorphic between Ler and Cvi (Peters et al., 2001) or by sequencing fragments polymorphic between Ler and Cvi but absent in Col. The retrieved marker sequences were then aligned by BLAST against the completely sequenced Col genome, and center positions of positive hits were taken as the physical position. Physical positions could be established for 179 markers; positions of the remaining markers were inferred by interpolation, using the closest nearby markers for which a physical position was known. The largest gap between two markers with confirmed physical position comprised 3.5 Mbp, which corresponded to a genetic distance of approximately 15 cM.

Sample preparation
Total RNA of each line was isolated from two biological replicates by using phenol-chloroform extraction (Jones et al., 1985). Extracts were then combined and purified with RNeasy (Qiagen, Valencia, CA), amplified with the MessageAmp aRNA kit (Ambion, Austin, TX) incorporating 5-(3-aminoallyl)-UTP, and labeled

63ȱ Chapterȱ3ȱ

withȱ Cy3ȱ orȱ Cy5ȱ monoȬreactiveȱ dyeȱ (Amersham,ȱ Piscataway,ȱ NJ.).ȱ Allȱ RNAȱ productsȱ wereȱ purifiedȱ byȱ usingȱ theȱ Rneasyȱ kitȱ (Qiagen).ȱ Labeledȱ RNAȱ wasȱ fragmentedȱ forȱ 15ȱ minutesȱ beforeȱ hybridizationȱ (fragmentationȱ reagentȱ obtainedȱ fromȱAmbion).ȱ ȱ Microarrayȱanalysesȱ Arabidopsisȱ DNAȱ microarraysȱ wereȱ providedȱ byȱ theȱ Galbraithȱ laboratoryȱ (Universityȱ ofȱ Arizona,ȱ Tucson,ȱ AZ)ȱ andȱ wereȱ producedȱ fromȱ aȱ setȱ ofȱ 70Ȭmerȱ oligonucleotides,ȱ representingȱ 24,065ȱ uniqueȱ genesȱ (ArrayȬReadyȱ Oligoȱ Set,ȱ versionȱ1.0,ȱQiagenȬOperon).ȱ DNAȱprobeȱimmobilizationȱandȱhybridizationȱwasȱperformedȱaccordingȱtoȱ instructionsȱ fromȱ theȱ Galbraithȱ laboratory.ȱ Arraysȱ wereȱ scannedȱ byȱ usingȱ aȱ ScanArrayȱ Expressȱ HTȱ (PerkinElmer,ȱ Wellesley,ȱ MA)ȱ andȱ quantifiedȱ byȱ usingȱ Imageneȱ6.0ȱ(BioDiscovery,ȱElȱSegundo,ȱCA).ȱ ȱ Experimentalȱdesignȱ GenomeȬwideȱgeneȱexpressionȱanalysisȱwasȱcarriedȱoutȱforȱLerȱandȱCviȱandȱanȱRILȱ populationȱ derivedȱ fromȱ aȱ crossȱ betweenȱ theseȱ twoȱ accessions.ȱ Tenȱ replicatesȱ ofȱ theȱ parentalȱ linesȱ wereȱ comparedȱ inȱ directȱ hybridizationsȱ byȱ usingȱ aȱ dyeȱ swapȱ design.ȱ Theȱ 160ȱ RILsȱ wereȱ analyzedȱ byȱ directȱ hybridizationȱ ofȱ twoȱ geneticallyȱ distantȱ linesȱ onȱ eachȱ array,ȱ leadingȱ toȱ aȱ totalȱ ofȱ 80ȱ slides.ȱ Aȱ novelȱ distantȱ pairȱ design,ȱwhichȱwasȱproposedȱspecificallyȱforȱgeneticȱstudiesȱonȱgeneȱexpressionȱ(Fuȱ andȱJansen,ȱ2006)ȱwasȱused.ȱAnȱoptimalȱdesignȱwasȱobtainedȱthroughȱsimulatedȱ annealing,ȱinȱwhichȱpairsȱofȱgeneticallyȱdistantȱlinesȱwereȱhybridizedȱtoȱmaximizeȱ theȱdirectȱcomparisonsȱbetweenȱtwoȱdifferentȱallelesȱatȱeachȱmarker.ȱTheȱnumbersȱ ofȱLer/CviȱandȱCvi/Lerȱcomparisonsȱatȱeachȱmarkerȱwereȱoptimizedȱforȱequalȱratioȱ toȱ balanceȱ dyeȱ effects,ȱ andȱ theirȱ totalȱ numberȱ wasȱ optimizedȱ forȱ minimalȱ extraȱ variationȱacrossȱotherȱmarkers.ȱTheȱobservedȱsignalȱintensitiesȱonȱtheȱarraysȱwereȱ subjectedȱ toȱ generalȱ normalizationȱ proceduresȱ (Yangȱ etȱ al.,ȱ 2002;ȱ 
Smyth,ȱ 2004).ȱ Resultingȱ logȱ signalȱ intensitiesȱ andȱ logȱ ratiosȱ betweenȱ coȬhybridizedȱ RILsȱ wereȱ usedȱforȱfurtherȱanalyses.ȱ ȱ Statisticalȱanalysesȱ Differentialȱ expressionȱ ofȱ genesȱ betweenȱ theȱ twoȱ parentsȱ wasȱ testedȱ forȱ significance.ȱForȱeachȱgene,ȱtheȱPȬvalueȱofȱaȱtȬtestȱandȱtheȱcorrespondingȱqȬvaluesȱ (Storeyȱ andȱ Tibshirani,ȱ 2003)ȱ wereȱ computedȱ (Smyth,ȱ 2004).ȱ Theȱ PȬvalueȱ significanceȱthresholdȱwasȱ2.5ȱxȱ10Ȭ3ȱatȱaȱqȬvalueȱcutoffȱofȱ0.05.ȱ
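The relation between per-gene P-values and q-values used above can be illustrated with a short sketch. This is not the procedure applied in this chapter: it assumes that the proportion of true null hypotheses (pi0) is fixed at 1, in which case the q-value reduces to the Benjamini-Hochberg adjusted P-value, whereas Storey and Tibshirani (2003) estimate pi0 from the data. The function name `q_values` and the example P-values are hypothetical.

```python
def q_values(p_values, pi0=1.0):
    """Simplified q-values: pi0 * p * m / rank, made monotone in p.

    Assumption: pi0 (fraction of true nulls) is supplied by the caller;
    with pi0 = 1 this equals the Benjamini-Hochberg adjusted P-value.
    """
    m = len(p_values)
    # indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # walk from the largest P-value down so each q-value is the minimum
    # estimated FDR over all thresholds at or above that P-value
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        candidate = pi0 * p_values[i] * m / rank
        running_min = min(running_min, candidate)
        q[i] = running_min
    return q


# Hypothetical P-values for four genes (illustration only).
example_p = [0.01, 0.04, 0.03, 0.5]
example_q = q_values(example_p)
# genes called differentially expressed at a q-value cutoff of 0.05
significant = [q <= 0.05 for q in example_q]
```

In this toy example only the first gene passes the cutoff, which shows how a fixed q-value cutoff translates into a data-dependent P-value threshold.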


Log signal intensities of gene expression were used to test for genetic variance of expression traits. Spot effects were removed by treating them as a random effect in a linear mixed model.

Heritability of expression in the parental accessions was calculated as follows (Hegmann and Possidente, 1981):

$$H^2_P = \frac{0.5 \times V_g}{0.5 \times V_g + V_e}$$

where $V_g$ and $V_e$ represent the components of variance among and within accessions, respectively. The factor 0.5 was applied to adjust for the 2-fold overestimation of additive genetic variance among inbred strains.

Heritability of expression within the RIL population was calculated by using the pooled variance of the parents as an estimate of the within-line variance:

$$H^2_{RIL} = \frac{V_{RIL} - V_e}{V_{RIL}}$$

where $V_{RIL}$ and $V_e$ are the variance among adjusted expression intensities in the segregants and the pooled variance within parental measurements, respectively. To prevent overestimation, we removed outliers more than three standard deviations away from the mean values. We discarded 1,470 (6.1%) negative heritability values.

Transgressive segregation was determined in terms of the pooled standard deviation of the parents (Brem and Kruglyak, 2005). We calculated the number of RILs, $n$, whose expression level lay beyond the region μ ± 2 × SD, where μ and SD are the mean and the standard deviation of parental phenotypic values, respectively. To determine significance, phenotype values of parents and segregants were reassigned at random to null parents and segregants for each transcript. The number of transgressive individuals, $n_0$, was then recorded. The total number of transcripts with $n_0$ greater than a given threshold $m$ represented the genome-wide false-positive count at $m$. The FDR was computed as the ratio between the estimated false-positive count at $m$ and the number of non-permuted transcripts with $n > m$. Results were averaged over 20 permutations. The FDR = 0.05 cutoff corresponded to $m$ = 33.

Multiple QTL analysis
Gene expression in the mapping population was analyzed for significant eQTLs. For each gene the log-ratios of signal intensities were subjected to multiple QTL mapping (MQM). Cofactors were selected by using a backward elimination process (Jansen, 1993) (see supplemental information at www.pnas.org/cgi/content/full/0610429104/DC1). For every marker-by-gene combination, the MQM model can be given as:

$$y = \mu + b_k x_k + \sum_{i=1}^{m_k} b_i x_i$$

where $y$ is the expression ratio of a transcript; $\mu$ is the gene-specific differential effect between Cy3 and Cy5 dyes (characterized as consistent across samples) (Dobbin et al., 2005); $x$ denotes the genotype comparison and takes the following values: 1 for Ler/Cvi, -1 for Cvi/Ler and 0 for Ler/Ler and Cvi/Cvi; $b$ is the substitution effect; $k$ is the kth marker under study; and $i$ denotes the cofactors from 1 to $m_k$, outside a 30-cM interval of the kth marker. The P-value from a t-test of the hypothesis that $b_k = 0$ was used as a measure of significance of the association.

A genome-wide P-value threshold of 2.23 × 10⁻³ at α = 0.05 for a single trait was estimated by a 10,000 permutation test (Churchill and Doerge, 1994). For a study with 24,065 gene transcripts, however, we controlled the false discovery rate (FDR) based on the pool of P-values for all markers and all transcripts. Because the P-values are correlated when markers are linked, the FDR increases depending on the number of markers on a chromosome (Benjamini and Yekutieli, 2001). In our experiment, the maximum number of markers reached 35 (chromosome 5), and a simulation analysis (data not shown) that used Storey's algorithm to control the FDR (Storey, 2002) at a desired level indeed showed a 4.4-fold increase of the actual FDR. To account for this increase, we corrected the FDR by a factor of 5 and calculated the genome-wide P-value threshold at Storey's FDR of 0.01 for all gene-marker P-values, to make sure that the real FDR is <0.05 (corrected FDR = 0.05). The estimated P-value threshold then corresponded to 5.29 × 10⁻⁵, and this threshold was used as a significance threshold for the detection of eQTLs.

Explained variance of detected eQTLs was estimated by fitting a linear model of the expression ratios on all detected eQTLs and their interactions. We used ANOVA to estimate the fraction of variance explained by each eQTL and by eQTL interactions.

Local and distant regulation
We determined the physical position of each eQTL by anchoring the genetic map of the Ler × Cvi population to the physical map of the sequenced accession Col. Support intervals were then calculated by setting left and right border positions associated with max{−log₁₀P} − 1.5, where P represents the significance value for linkage (Keurentjes et al., 2006).

The physical positions of genes (The Arabidopsis Information Resource, version 2005.12.8) showing significant linkage of expression values were then compared with the positions of their respective eQTL(s); a gene was classified as locally regulated when its position coincided with the support interval and as distantly regulated when it did not.

Distribution of hot spots
eQTL hot spots are shown by the frequency distribution of the number of significant eQTLs detected. Each eQTL is represented by the marker showing the most significant linkage. The frequency distribution of eQTLs by chance was empirically estimated by 250 permutations (de Koning and Haley, 2005). The 95th percentile, corresponding to 43 eQTLs, was used as a confidence threshold for the occurrence of a hot spot.

Sliding-window analyses
All 24,065 genes analyzed were positioned on the Arabidopsis physical map, and the ATG start codon was used as the start of each gene. Each gene was classified as locally regulated, distantly regulated, or non-regulated. The frequency of the total number of genes and the number of locally and distantly regulated genes along each chromosome was determined in a 5-Mbp sliding window by using a 50-Kbp step size.

Polymorphisms between Ler and Cvi in 875 sequenced loci (Nordborg et al., 2005) were downloaded from the MSQT website (http://msqt.weigelworld.org) and filtered for unique positions. INDELs were recorded as a single polymorphism by using the physical position of the first nucleotide difference. A total of 4,032 polymorphisms were subjected to further analysis. A sliding-window analysis for SNP frequency was then carried out as described above.

Observed gene and SNP frequencies per window were standardized by using the genome-wide average and standard deviation, and resulting z-scores were plotted at the physical position of the center of each window.

Genetic network construction
A group of 83 functionally related genes and their potential regulators were used for the construction of a genetic regulatory network. All of the genes that were physically located in an eQTL interval were assigned as regulator candidates for the gene for which that eQTL was detected. The candidates were sorted by using iterative Group Analysis (iGA) (Breitling et al., 2004). We postulated that, among all possible regulators, the best candidates are those that correlate particularly well with a large number of their potential target genes.

To test that postulation, we calculated all pair-wise Spearman rank correlations on expression profiles (80 log-ratios of co-hybridized RILs) between each of the 83 functionally related genes and all potential regulators in their eQTL intervals. These values were then rank-ordered so that the strongly correlated gene-candidate pairs were at the top of the list. For each given candidate, we determined the iGA probability of change value (PC-value, supplemental Table 5 at www.pnas.org/cgi/content/full/0610429104/DC1). The PC-value threshold was Bonferroni-adjusted as 0.05/m, where m is the total number of candidate genes. Any candidate with a significant PC-value is a putative regulator, and all genes contributing to this value are putative target genes. We defined the regulatory relation in terms of the sign of the correlation coefficient. If the correlation coefficient is negative, regulation is repressive; otherwise it is stimulative.

Potential target genes outside the initial group of functionally related genes were identified by using expression trait correlations (Lan et al., 2006), for which we used the regulators and target genes obtained from the iGA study as seed transcripts. We then split the log-ratio gene-expression profile matrix ($a \times b$) into two parts, $a_1 \times b$ and $a_2 \times b$, where $a$ is the total number of gene transcripts ($a$ = 24,065 in our case); $a_1$ is the number of seed transcripts; $a_2$ is the number of other genes ($a_1 + a_2 = a$); and $b$ is the number of arrays ($b$ = 80 in our case). We then computed the Spearman correlation coefficient and its corresponding P-value between each of the $a_1$ seed genes and each of the $a_2$ other transcripts. A 95th percentile empirical threshold (r = 0.55) and its corresponding FDR (Storey and Tibshirani, 2003) (FDR = 9.5 × 10⁻⁵) were estimated by performing 1,000 permutations.

Acknowledgements
We thank David Galbraith for providing microarrays and protocols, Janny Peters for providing data of in silico AFLP analysis and Linus van der Plas for critical reading of the manuscript. This work was supported by a grant from the Netherlands Organization for Scientific Research, Program Genomics (050-10-029).


REFERENCES

Al-Shahrour, F., Diaz-Uriarte, R. and Dopazo, J. (2004). FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics 20, 578-580.
Alonso-Blanco, C., El-Assal, S.E., Coupland, G. and Koornneef, M. (1998a). Analysis of natural allelic variation at flowering time loci in the Landsberg erecta and Cape Verde Islands ecotypes of Arabidopsis thaliana. Genetics 149, 749-764.
Alonso-Blanco, C., Peeters, A.J., Koornneef, M., Lister, C., Dean, C., van den Bosch, N., Pot, J. and Kuiper, M.T. (1998b). Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J 14, 259-271.
Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29, 1165-1188.
Bing, N. and Hoeschele, I. (2005). Genetical genomics analysis of a yeast segregant population for transcription network inference. Genetics 170, 533-542.
Borevitz, J. (2006). Genotyping and mapping with high-density oligonucleotide arrays. Methods Mol Biol 323, 137-145.
Boss, P.K., Bastow, R.M., Mylne, J.S. and Dean, C. (2004). Multiple pathways in the decision to flower: enabling, promoting, and resetting. Plant Cell 16 Suppl, S18-31.
Breitling, R., Amtmann, A. and Herzyk, P. (2004). Iterative Group Analysis (iGA): a simple tool to enhance sensitivity and facilitate interpretation of microarray experiments. BMC Bioinformatics 5, 34.
Brem, R.B., Yvert, G., Clinton, R. and Kruglyak, L. (2002). Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752-755.
Brem, R.B. and Kruglyak, L. (2005). The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci U S A 102, 1572-1577.
Brem, R.B., Storey, J.D., Whittle, J. and Kruglyak, L. (2005). Genetic interactions between polymorphisms that affect gene expression in yeast. Nature 436, 701-703.
Bystrykh, L., Weersing, E., Dontje, B., Sutton, S., Pletcher, M.T., Wiltshire, T., Su, A.I., Vellenga, E., Wang, J., Manly, K.F. et al. (2005). Uncovering regulatory pathways that affect hematopoietic stem cell function using 'genetical genomics'. Nat Genet 37, 225-232.
Cheng, X.F. and Wang, Z.Y. (2005). Overexpression of COL9, a CONSTANS-LIKE gene, delays flowering by reducing expression of CO and FT in Arabidopsis thaliana. Plant J 43, 758-768.
Churchill, G.A. and Doerge, R.W. (1994). Empirical threshold values for quantitative trait mapping. Genetics 138, 963-971.
Darrah, C., Taylor, B.L., Edwards, K.D., Brown, P.E., Hall, A. and McWatters, H.G. (2006). Analysis of phase of LUCIFERASE expression reveals novel circadian quantitative trait loci in Arabidopsis. Plant Physiol 140, 1464-1474.
de Koning, D.J. and Haley, C.S. (2005). Genetical genomics in humans and model organisms. Trends Genet 21, 377-381.
DeCook, R., Lall, S., Nettleton, D. and Howell, S.H. (2006). Genetic regulation of gene expression during shoot development in Arabidopsis. Genetics 172, 1155-1164.
Dobbin, K.K., Kawasaki, E.S., Petersen, D.W. and Simon, R.M. (2005). Characterizing dye bias in microarray experiments. Bioinformatics 21, 2430-2437.
Doyle, M.R., Bizzell, C.M., Keller, M.R., Michaels, S.D., Song, J., Noh, Y.S. and Amasino, R.M. (2005). HUA2 is required for the expression of floral repressors in Arabidopsis thaliana. Plant J 41, 376-385.
El-Din El-Assal, S., Alonso-Blanco, C., Peeters, A.J., Wagemaker, C., Weller, J.L. and Koornneef, M. (2003). The role of cryptochrome 2 in flowering in Arabidopsis. Plant Physiol 133, 1504-1516.
Fu, J. and Jansen, R.C. (2006). Optimal design and analysis of genetic studies on gene expression. Genetics 172, 1993-1999.
He, Y., Doyle, M.R. and Amasino, R.M. (2004). PAF1-complex-mediated histone methylation of FLOWERING LOCUS C chromatin is required for the vernalization-responsive, winter-annual habit in Arabidopsis. Genes Dev 18, 2774-2784.
Hegmann, J.P. and Possidente, B. (1981). Estimating genetic correlations from inbred strains. Behav Genet 11, 103-114.
Hubner, N., Wallace, C.A., Zimdahl, H., Petretto, E., Schulz, H., Maciver, F., Mueller, M., Hummel, O., Monti, J., Zidek, V. et al. (2005). Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat Genet 37, 243-253.
Jansen, R.C. (1993). Interval mapping of multiple quantitative trait loci. Genetics 135, 205-211.
Jansen, R.C. and Nap, J.P. (2001). Genetical genomics: the added value from segregation. Trends Genet 17, 388-391.
Jones, J.D., Dunsmuir, P. and Bedbrook, J. (1985). High level expression of introduced chimaeric genes in regenerated transformed plants. Embo J 4, 2411-2418.
Juenger, T.E., Sen, S., Stowe, K.A. and Simms, E.L. (2005). Epistasis and genotype-environment interaction for quantitative trait loci affecting flowering time in Arabidopsis thaliana. Genetica 123, 87-105.
Kendziorski, C. and Wang, P. (2006). A review of statistical methods for expression quantitative trait loci mapping. Mamm Genome 17, 509-517.
Keurentjes, J.J.B., Fu, J., de Vos, C.H., Lommen, A., Hall, R.D., Bino, R.J., van der Plas, L.H., Jansen, R.C., Vreugdenhil, D. and Koornneef, M. (2006). The genetics of plant metabolism. Nat Genet 38, 842-849.
Kliebenstein, D.J., West, M.A., van Leeuwen, H., Loudet, O., Doerge, R.W. and St Clair, D.A. (2006). Identification of QTLs controlling gene expression networks defined a priori. BMC Bioinformatics 7, 308.
Kobayashi, Y., Kaya, H., Goto, K., Iwabuchi, M. and Araki, T. (1999). A pair of related genes with antagonistic roles in mediating flowering signals. Science 286, 1960-1962.
Koornneef, M., Alonso-Blanco, C. and Vreugdenhil, D. (2004). Naturally occurring genetic variation in Arabidopsis thaliana. Annu Rev Plant Biol 55, 141-172.
Lan, H., Chen, M., Flowers, J.B., Yandell, B.S., Stapleton, D.S., Mata, C.M., Mui, E.T., Flowers, M.T., Schueler, K.L., Manly, K.F. et al. (2006). Combined expression trait correlations and expression quantitative trait locus mapping. PLoS Genet 2, e6.
Ledger, S., Strayer, C., Ashton, F., Kay, S.A. and Putterill, J. (2001). Analysis of the function of two circadian-regulated CONSTANS-LIKE genes. Plant J 26, 15-22.
Levy, Y.Y. and Dean, C. (1998). The transition to flowering. Plant Cell 10, 1973-1990.
Michael, T.P., Salome, P.A., Yu, H.J., Spencer, T.R., Sharp, E.L., McPeek, M.A., Alonso, J.M., Ecker, J.R. and McClung, C.R. (2003). Enhanced fitness conferred by naturally occurring variation in the circadian clock. Science 302, 1049-1053.
Mizoguchi, T., Wright, L., Fujiwara, S., Cremer, F., Lee, K., Onouchi, H., Mouradov, A., Fowler, S., Kamada, H., Putterill, J. et al. (2005). Distinct roles of GIGANTEA in promoting flowering and regulating circadian rhythms in Arabidopsis. Plant Cell 17, 2255-2270.
Morley, M., Molony, C.M., Weber, T.M., Devlin, J.L., Ewens, K.G., Spielman, R.S. and Cheung, V.G. (2004). Genetic analysis of genome-wide variation in human gene expression. Nature 430, 743-747.
Mouradov, A., Cremer, F. and Coupland, G. (2002). Control of flowering time: interacting pathways as a basis for diversity. Plant Cell 14 Suppl, S111-130.
Nagpal, P., Ellis, C.M., Weber, H., Ploense, S.E., Barkawi, L.S., Guilfoyle, T.J., Hagen, G., Alonso, J.M., Cohen, J.D., Farmer, E.E. et al. (2005). Auxin response factors ARF6 and ARF8 promote jasmonic acid production and flower maturation. Development 132, 4107-4118.
Nakamichi, N., Murakami-Kojima, M., Sato, E., Kishi, Y., Yamashino, T. and Mizuno, T. (2002). Compilation and characterization of a novel WNK family of protein kinases in Arabiodpsis thaliana with reference to circadian rhythms. Biosci Biotechnol Biochem 66, 2429-2436.
Nordborg, M., Hu, T.T., Ishino, Y., Jhaveri, J., Toomajian, C., Zheng, H., Bakker, E., Calabrese, P., Gladstone, J., Goyal, R. et al. (2005). The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol 3, e196.
Onai, K. and Ishiura, M. (2005). PHYTOCLOCK 1 encoding a novel GARP protein essential for the Arabidopsis circadian clock. Genes Cells 10, 963-972.
Peters, J.L., Constandt, H., Neyt, P., Cnops, G., Zethof, J., Zabeau, M. and Gerats, T. (2001). A physical amplified fragment-length polymorphism map of Arabidopsis. Plant Physiol 127, 1579-1589.
Rockman, M.V. and Kruglyak, L. (2006). Genetics of global gene expression. Nat Rev Genet 7, 862-872.
Ronald, J., Brem, R.B., Whittle, J. and Kruglyak, L. (2005). Local regulatory variation in Saccharomyces cerevisiae. PLoS Genet 1, e25.
Schadt, E.E., Monks, S.A., Drake, T.A., Lusis, A.J., Che, N., Colinayo, V., Ruff, T.G., Milligan, S.B., Lamb, J.R., Cavet, G. et al. (2003). Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297-302.
Smyth, G.K. (2004). Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 3, Article3.
Storey, J.D. (2002). A direct approach to false discovery rates. J. R. Statist. Soc. B 64, 479-498.
Storey, J.D. and Tibshirani, R. (2003). Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100, 9440-9445.
Storey, J.D., Akey, J.M. and Kruglyak, L. (2005). Multiple locus linkage analysis of genomewide expression in yeast. PLoS Biol 3, e267.
Swarup, K., Alonso-Blanco, C., Lynn, J.R., Michaels, S.D., Amasino, R.M., Koornneef, M. and Millar, A.J. (1999). Natural allelic variation identifies new genes in the Arabidopsis circadian system. Plant J 20, 67-77.
Torii, K.U., Mitsukawa, N., Oosumi, T., Matsuura, Y., Yokoyama, R., Whittier, R.F. and Komeda, Y. (1996). The Arabidopsis ERECTA gene encodes a putative receptor protein kinase with extracellular leucine-rich repeats. Plant Cell 8, 735-746.
Ungerer, M.C., Halldorsdottir, S.S., Modliszewski, J.L., Mackay, T.F. and Purugganan, M.D. (2002). Quantitative trait loci for inflorescence development in Arabidopsis thaliana. Genetics 160, 1133-1151.
van Ooijen, J.W. (2006). JoinMap 4, Software for the calculation of genetic linkage maps in experimental populations. In JoinMap (Wageningen, The Netherlands: Kyazma B.V.).
Wayne, M.L. and McIntyre, L.M. (2002). Combining mapping and arraying: An approach to candidate gene identification. Proc Natl Acad Sci U S A 99, 14903-14906.
Yang, Y.H., Buckley, M.J., Dudoit, S. and Speed, T.P. (2002). Comparison of methods for image analysis on cDNA microarray data. J Comput Graph Stat 11, 108-136.
Yvert, G., Brem, R.B., Whittle, J., Akey, J.M., Foss, E., Smith, E.N., Mackelprang, R. and Kruglyak, L. (2003). Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet 35, 57-64.
Zimmermann, P., Hirsch-Hoffmann, M., Hennig, L. and Gruissem, W. (2004). GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiol 136, 2621-2632.


Chapter 4

The genetics of plant metabolism

Joost J. B. Keurentjes*, Jingyuan Fu*, C. H. Ric de Vos*, Arjen Lommen, Robert D. Hall, Raoul J. Bino, Linus H. W. van der Plas, Ritsert C. Jansen, Dick Vreugdenhil and Maarten Koornneef

Published in Nature Genetics (2006) 38, 842-849.

* Equal contribution.

ABSTRACT

Variation for metabolite composition and content is often observed in plants. However, it is poorly understood to what extent this variation has a genetic basis. Here, we describe the genetic analysis of natural variation in the metabolite composition in Arabidopsis thaliana. Instead of focusing on specific metabolites, we have applied empirical untargeted metabolomics using Liquid Chromatography–Time of Flight Mass Spectrometry (LC-QTOF MS). This uncovered many qualitative and quantitative differences in metabolite accumulation between A. thaliana accessions. Only 13.4% of the mass peaks were detected in all 14 accessions analyzed. Quantitative Trait Locus (QTL) analysis of more than 2,000 mass peaks, detected in a Recombinant Inbred Line (RIL) population derived from the two most divergent accessions, enabled the identification of QTLs for about 75% of the mass signals. More than one-third of the signals were not detected in either parent, indicating the large potential for modification of metabolic composition through classical breeding. Combining partial interpretation of mass signals and QTL profiles allowed us to confirm biochemical pathways known from the literature and also to elucidate novel biosynthesis steps. This can lead to the identification of the underlying genes and the construction of biochemical networks in relation to other phenotypic traits.


INTRODUCTION

Metabolites are critical in biology, and plants are especially rich in diverse biochemical compounds. It has been estimated that over 100,000 metabolites can be found in plants, and each species may contain its own chemotypic expression pattern (Wink, 1988). Moreover, substantial quantitative and qualitative variation in metabolite composition is often observed within plant species (Windsor et al., 2005).

Although knowledge on the regulation of metabolite formation is increasing, for thousands of metabolites their function in the plant, their biosynthetic pathway and the regulation thereof are still unknown. QTL analysis of natural variation present in segregating populations, which can also concern metabolites (Jansen and Nap, 2001), can identify loci explaining the observed variation (Jansen, 1993). In recent years, a few studies have focused on identifying QTLs regulating a specific group of known metabolites using detection methods directed toward specific metabolite groups (Mita et al., 1997; Bentsink et al., 2000; Kliebenstein et al., 2001a; Loudet et al., 2003; Hobbs et al., 2004). However, recent advances in mass spectrometry-based metabolomics and data processing techniques should now allow large-scale QTL analyses of untargeted metabolic profiles, which may uncover previously unknown regulatory functions of loci in metabolic pathways. Using dedicated alignment software, it is now possible to perform an unbiased comparison of large numbers of metabolite-derived masses detectable in large numbers of samples, arising from inherently large sets of genotypes (which are required for accurate mapping of QTLs) in an RIL population (Tikunov et al., 2005; Vorst et al., 2005). QTL mapping will result in the localization of loci, and ultimately genes, causal for the observed variation and will allow the discovery of co-regulated compounds. In this way, genome-wide genetic correlative metabolic analysis now becomes feasible, as we demonstrate here.

Relationships between biological traits are often inferred from correlation analysis within a given data set. However, many of these correlations need not be causal or the result of pleiotropic effects of a common set of regulators. In studies focusing on a small number of traits this is usually not a problem because additional experiments can easily address this. In large data sets, such as those from gene expression analysis and metabolomics, more sophisticated approaches are needed to reduce the number of 'false positive' correlations (Kose et al., 2001; Stuart et al., 2003). Such methods are powerful in detecting relevant relationships and can be applied to any given data set even when data were acquired in different experiments. However, no information can be obtained about the underlying genetic regulation responsible for the observed correlation. The use of a mapping population to create comprehensive data sets, on the other hand, allows the identification of common regulators causal for the observed correlation between traits. Yvert et al. (2003) combined both approaches by first clustering traits based on segregation variation and subsequently mapping the mean cluster values. Although this reduced multiple testing of traits and markers, noise may be introduced when multiple QTLs segregate in the cluster, thereby reducing mapping power. Moreover, to separate chance correlation from true coordinate regulation, stringent thresholds need to be applied to define clusters; thus individual outliers and less tightly linked genes are not included.

We chose to map each mass peak separately and determine relationships by correlation analysis of QTL profiles. This enables a pairwise genetic correlation analysis of each individual mass peak, identifying related masses on the basis of co-regulation. Although this rules out experimental error and other non-genetical variation, we cannot exclude developmental control of metabolite formation as the cause for the observed correlation when developmental traits segregate in the population.
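The idea of relating mass peaks through their QTL profiles can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline of this chapter: a QTL profile is taken here to be a vector of per-marker linkage scores (for example signed −log₁₀P values), the Pearson coefficient is used as the correlation measure, and the names `pearson`, `correlated_pairs`, the threshold and the toy profiles are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long QTL profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation information
    return sxy / sqrt(sxx * syy)


def correlated_pairs(profiles, threshold):
    """Return (mass_a, mass_b, r) for all pairs with |r| >= threshold."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Toy QTL profiles over four markers (hypothetical values).
profiles = {
    "mass_A": [0.1, 3.2, 0.4, 0.2],
    "mass_B": [0.2, 6.4, 0.8, 0.4],  # same locus as mass_A, larger effect
    "mass_C": [2.9, 0.1, 0.3, 0.2],  # QTL at a different marker
}
pairs = correlated_pairs(profiles, threshold=0.9)
```

In the toy data only mass_A and mass_B share a QTL at the second marker, so they form the single pair that passes the threshold; mass_C maps elsewhere and is left out.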


RESULTS

Metabolite variation is abundant and genetically controlled

To assess the natural variation in metabolite content present in Arabidopsis, we performed HPLC-QTOF MS-based untargeted metabolic fingerprinting of acidified aqueous methanol extracts from seedlings of 14 different accessions originating from various parts of the global distribution range of Arabidopsis (supplemental Table 1 at http://www.nature.com/naturegenetics).

Considerable quantitative and qualitative variation was observed in the mass profiles of the different accessions. Although a metabolite may be represented by one to several mass signals in these analyses, depending on its chemical structure and abundance, each mass signal was treated as a separate element in subsequent analyses. On average, 964 mass peaks were detected per accession, with a minimum of 826 (Col) and a maximum of 1,337 (Cvi). We detected a total of 2,475 different mass peaks; 706 were unique to single accessions, and only 331 were present in all 14 accessions (Figure 1A). On average, 50 mass peaks per accession were found to be unique, with a minimum of 14 (Bay-0) and a maximum of 235 (Cvi). Although there might be a slight bias toward an overestimation of the number of accession-specific mass peaks owing to low-abundance peaks detected around the threshold level, the observed frequency distribution pattern was similar when the threshold level was increased from six to ten times local noise. It can therefore be assumed that many of the differences observed between accessions are due to qualitative differences. For most masses, a large part of the observed variation can be assigned to genetic factors, as concluded from their often high broad-sense heritabilities (Figure 1B). This, together with the substantial variation in metabolite composition observed within a single plant species, promises great opportunities for metabolic engineering by classical breeding (Dixon, 2005).
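The broad-sense heritability reported per mass peak is defined in the Materials and Methods as H^2 = V_G/(V_G + V_e). The sketch below shows one way such an estimate could be obtained from replicate measurements. It assumes a balanced design and the usual method-of-moments estimator from a one-way ANOVA; the function name and the toy numbers are hypothetical and are not taken from the thesis data.

```python
from statistics import fmean


def broad_sense_heritability(values_by_genotype):
    """Estimate H^2 = V_G / (V_G + V_e) from replicates per genotype.

    Assumption: balanced design (same number of replicates everywhere)
    and non-zero total variance.
    """
    k = len(values_by_genotype)            # number of genotypes
    n_rep = len(values_by_genotype[0])     # replicates per genotype
    if any(len(g) != n_rep for g in values_by_genotype):
        raise ValueError("balanced design expected")

    group_means = [fmean(g) for g in values_by_genotype]
    grand_mean = fmean(group_means)

    # Mean squares of a one-way ANOVA with genotype as the only factor.
    ms_between = n_rep * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2
        for g, m in zip(values_by_genotype, group_means)
        for x in g
    ) / (k * (n_rep - 1))

    v_e = ms_within                                    # residual variance
    v_g = max((ms_between - ms_within) / n_rep, 0.0)   # among-genotype variance
    return v_g / (v_g + v_e)


# Hypothetical peak intensities: three genotypes, two replicates each.
h2 = broad_sense_heritability([[10, 12], [20, 22], [30, 32]])
```

Negative variance-component estimates are truncated at zero here, which is a common convention but a choice of this sketch.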


Figure 1: Natural variation in Arabidopsis metabolite accumulation.
(A) Frequency distribution of the number of different accessions in which each mass peak was detected. (B) Frequency distribution of the broad-sense heritability of each mass peak detected in the different accessions. Data are based on at least two biological replicates per accession.

Most of the metabolic variation can be mapped

To uncover loci controlling the observed variation in metabolic profiles, we subsequently analyzed an RIL population derived from a cross between Landsberg erecta (Ler) and Cape Verde Islands (Cvi) (Alonso-Blanco et al., 1998). These were the two biochemically most distinct accessions for which such a mapping population was available (Figure 2).

Strikingly, 853 of a total of 2,129 mass peaks identified in the RIL population were not detected in either parent (Figure 3). The number of lines analyzed in the RIL population (160 lines measured in duplicate) exceeded the number of parental lines (5 replicates of each parent measured in duplicate), which raises the chance of detecting mass peak intensities around the threshold level. Nevertheless, the observed ratio did not differ much when the threshold was increased modestly (data not shown). This suggests that many metabolites not present in either parent are produced as a result of the recombination of the genomes of the two parents.
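The choice of Ler and Cvi as the most distinct pair rests on distances between accessions computed from standardized peak intensities (see Statistical analyses). As a minimal sketch of that idea, the code below applies the transformation (x_ij - u_i)/sd_i per mass and then picks the pair with the largest Euclidean distance. The function names and toy data are hypothetical, the sample standard deviation (n - 1) is an assumption, and the average linkage clustering and bootstrap steps are not reproduced.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, stdev


def standardize(peaks_by_accession):
    """Scale each mass peak to (x_ij - u_i) / sd_i across accessions.

    Assumption: every mass varies between accessions (sd_i > 0).
    """
    names = list(peaks_by_accession)
    n_masses = len(peaks_by_accession[names[0]])
    scaled = {name: [] for name in names}
    for i in range(n_masses):
        column = [peaks_by_accession[name][i] for name in names]
        mean_i, sd_i = fmean(column), stdev(column)
        for name in names:
            scaled[name].append((peaks_by_accession[name][i] - mean_i) / sd_i)
    return scaled


def most_distinct_pair(peaks_by_accession):
    """Return the pair of accessions with the largest Euclidean distance."""
    scaled = standardize(peaks_by_accession)

    def distance(a, b):
        return sqrt(sum((p - q) ** 2 for p, q in zip(scaled[a], scaled[b])))

    return max(combinations(scaled, 2), key=lambda pair: distance(*pair))


# Hypothetical intensities for three accessions and three masses.
toy = {"A": [0, 0, 0], "B": [1, 1, 1], "C": [10, 10, 10]}
pair = most_distinct_pair(toy)
```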


Figure 2: Hierarchical clustering of accessions for metabolite content.
The dendrogram depicts Euclidean distance between groups after transformation of the data. Numbers represent confidence percentages after bootstrap analysis. Clustering on metabolite content for the different accessions shows the clear separation of Cvi from Ler, indicating large genetic differences for metabolite content.

For 1,592 mass signals (74.8%), at least one significant (P < 0.0001) QTL was detected using a two-part parametric model (Broman, 2003). This P-threshold corresponded to a q value of 0.0002 in Storey's genome-wide false discovery rate (FDR) method (Storey and Tibshirani, 2003). On average, we found nearly 2.0 QTLs per analyzed mass, leading to a total of 4,213 QTLs (supplemental Figure 2 at http://www.nature.com/naturegenetics). Thus, after crossing these two distinct genotypes, variation in the presence and abundance of ~75% of the detected masses in their offspring could at least partly be explained by mappable genetic factors (Figure 3), consistent with the relatively high heritabilities found for many masses (supplemental Figure 3 at http://www.nature.com/naturegenetics). At more stringent P-value thresholds of 5.0 x 10^-5, 1 x 10^-5, and 1 x 10^-6, corresponding to q values of 1 x 10^-4, 2.9 x 10^-5, and 4.1 x 10^-6, respectively, 1,500 (70.5%), 1,306 (61.3%), and 1,068 (50.2%) mass signals showed at least one significant linkage.
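To illustrate how a P-value threshold translates into a q value, the following sketch computes q values for a list of P-values. It is a simplified variant of Storey's approach: the proportion of true null hypotheses (pi0) is estimated with a single fixed tuning parameter `lam`, which is an assumption of this example, rather than with the smoothing procedure of Storey and Tibshirani (2003). The function name and the P-values are hypothetical.

```python
def storey_q_values(p_values, lam=0.5):
    """Simplified q values with a fixed-lambda estimate of pi0."""
    m = len(p_values)
    # Share of tests assumed to be true nulls, estimated from P-values > lam.
    pi0 = min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))

    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that q values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q[i] = running_min
    return q


# Hypothetical P-values for five tests.
q_values = storey_q_values([0.02, 0.001, 0.9, 0.01, 0.6])
```

With thousands of mass signals tested at many markers, most P-values lie far from the threshold, which is why a P-threshold of 10^-4 can correspond to a q value of the same order of magnitude.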


Figure 3: Number of masses detected in the RIL population and its parents.
The triangle is subdivided into masses not detected in either parent (upper part), detected in one parent only (left and right) and detected in both parents (lower part). The number of masses for which at least one significant (P < 0.0001) QTL was detected is shown in parentheses. Data represent two biological replicates per RIL and five biological replicates for each parent measured in two replicate extractions.

Analysis of the genomic distribution of the detected QTLs shows that these are not evenly distributed over the Arabidopsis genome. Instead, hot and cold spots for the regulation of metabolic content were observed (Figure 4). This unequal distribution of QTLs may occur for a number of reasons. Many of the metabolites detected by the approach chosen may be biochemically related and therefore have similar genetic control. In addition, genetic factors such as degree of genetic differentiation and effects of differential recombination rates might contribute to this heterogeneity. Finally, hot spots may reflect false-positive QTLs of traits highly correlated owing to technical or environmental factors (de Koning and Haley, 2005). We therefore computed empirical confidence levels by permutation tests (supplemental methods at http://www.nature.com/naturegenetics) and found that in most cases, the frequency of QTLs occurring at hot spots was much higher than was expected by chance (Figure 4).
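The logic of an empirical hot spot threshold can be sketched as follows. This example assumes a simple null model in which every detected QTL is placed at a uniformly random marker, and it takes the 95th percentile of the per-permutation maximum count as the genome-wide threshold. The authors' actual permutation scheme is described in their supplemental methods and may differ; the function name, the null model and the marker count used below are assumptions of this sketch.

```python
import math
import random


def hotspot_threshold(n_qtls, n_markers, n_perm=1000, quantile=0.95, seed=0):
    """Genome-wide threshold for the number of QTLs at a single marker.

    Assumed null model: QTLs fall on markers uniformly at random.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        counts = [0] * n_markers
        for _ in range(n_qtls):
            counts[rng.randrange(n_markers)] += 1
        # Keep the busiest marker of this permutation.
        maxima.append(max(counts))
    maxima.sort()
    index = min(n_perm - 1, math.ceil(quantile * n_perm) - 1)
    return maxima[index]


# 4,213 QTLs as reported in the text; 100 markers is a hypothetical count.
threshold = hotspot_threshold(n_qtls=4213, n_markers=100, n_perm=200)
```

Markers with more QTLs than the returned value would be flagged as hot spots under this null model. Because the null model ignores correlation between traits, it is likely to give different thresholds from those shown in Figure 4.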


Figure 4: Frequency distribution of the number of significant QTLs detected at each marker position at four significance levels.
When, for a certain mass signal, consecutive markers showed significant linkage, only the most significant marker was counted. Markers were evenly spaced over the genome with an average distance of 5 cM between them. Chromosomal borders are indicated by vertical shaded lines. The dashed horizontal lines represent the 95% genome-wide frequency confidence thresholds for regulation hotspots obtained from 1,000 permutations. The corresponding values are 31, 23, 8, and 2 QTLs per marker expected by chance for significance levels of 10^-4, 5 x 10^-5, 10^-5, and 10^-6 in increasing intensity, respectively. Data represent two biological replicates per RIL.

Map positions can reveal metabolic pathways

Co-location of QTLs coincides with clusters of highly correlated mass peaks, which are assumed to be enriched for masses regulated by the same genes. Co-regulated metabolites may indicate that a specific biological function controls different components or that a specific step in a biochemical pathway is affected (Mitchell-Olds and Pedersen, 1998). To demonstrate the latter possibility, we first focused on the mass signals corresponding to glucosinolates, for which over 30 different structures have already been identified in Arabidopsis (Reichelt et al., 2002). The largest class comprises the aliphatic glucosinolates, which are all derived from methionine (Figure 5).

Figure 5: Genetic regulation of aliphatic glucosinolate accumulation in Arabidopsis.
Corresponding loci of enzymatic steps are shown in bold next to the arrows.

Previous studies, targeted towards this class of metabolites, have shown large quantitative and qualitative differences in accumulation of aliphatic glucosinolates between Arabidopsis accessions (Kliebenstein et al., 2001b). In addition, QTL analysis of these glucosinolates in the Ler x Cvi RIL population uncovered two major loci explaining the observed variation for most aliphatic glucosinolates (Kliebenstein et al., 2001a). The MAM locus at the top of chromosome 5 is responsible for the observed variation in chain length (Kroymann et al., 2001), whereas the AOP locus at the top of chromosome 4 is responsible for the observed variation in side chain modification (Kliebenstein et al., 2001c). Moreover, both loci, which contain multiple copies of genes having different biochemical functions, seem to control the quantitative variation in glucosinolate accumulation, with substantial interaction between the two loci.

The MAM locus harbors a family of methylthioalkylmalate synthase (MAM) genes. In addition to a MAM-L (MAM-like) gene, the locus may harbor two further genes, MAM1 and MAM2 (Figure 5). Synthesis of C4 glucosinolates is completely dependent on the presence of a functional MAM1 gene. Without this gene, C3 glucosinolates are synthesized. The occurrence of a MAM-L gene is responsible for the formation of glucosinolates with longer chain lengths. Both Ler and Cvi contain a functional MAM-L gene, whereas Cvi contains two MAM1 genes arranged in tandem and Ler contains a functional MAM2 gene in addition to a truncated, non-functional MAM1 gene (Kroymann et al., 2001).

The AOP locus is also a complex region containing genes encoding 2-oxoglutarate-dependent dioxygenases. At least three paralogs have been identified. The function of AOP1 is still unknown, but AOP2 and AOP3 functions have been described (Figure 5). All three AOP genes are present in both Ler and Cvi, but where AOP1 is expressed at similar levels, AOP2 is only expressed in Cvi and AOP3 is only expressed in Ler (Kliebenstein et al., 2001c). Because the specific genes of the two loci, which are phylogenetic paralogs, are physically placed at the same genomic position, they segregate as alleles of each other.

By making use of the mass accuracy of the TOF-MS, we were able to identify most of the aliphatic glucosinolates reported for Arabidopsis. Subsequent QTL analysis showed that all masses corresponding to an aliphatic glucosinolate indeed mapped to the AOP and/or MAM loci (Figure 6), thus confirming previous findings. Epistatic analysis of the two loci revealed strong interactions for many of the detected glucosinolates (supplemental methods and supplemental Table 2 at http://www.nature.com/naturegenetics).

Figure 6: QTL likelihood profiles of aliphatic glucosinolates detected in the RIL population.
The first QTL, at 303.3 cM, is at the AOP locus; the second, at 409.4 cM, is at the MAM locus. The sign of the value is related to the additive effect at each marker position (+, Cvi; -, Ler). Solid lines represent glucosinolates before side chain modification and dotted lines glucosinolates after side chain modification. Chromosomal borders are indicated by vertical shaded lines. Colors represent different chain lengths (black, 3C; shaded, >4C).


The fact that we did not detect all glucosinolate QTLs found in another study (Kliebenstein et al., 2001a) is most likely explained by the use of a different stage of plant development and differences in growing conditions. This is supported by the fact that they found different QTLs in seeds versus leaves. The observation that our MAM QTL was much stronger than in their study provides another example of such a genotype x environment or genotype x developmental stage interaction, which can be expected also for metabolites. Furthermore, we mapped individual glucosinolates whereas Kliebenstein et al. (2001a) showed the mapping of total aliphatic glucosinolate content.

Figure 7: Second-order genetic correlations between aliphatic glucosinolates detected in the RIL population.
The upper panel contains glucosinolates before side chain modification; the lower panel contains glucosinolates after side chain modification. All edges depicted are significant at α = 0.05, as determined by permutation. Corresponding correlation values are placed next to edges.

To assess the extent of genetic overlap between any two masses, we computed the correlation coefficients between QTL profiles (vectors of P-values associated with markers along the genome for each mass). Strong genetic correlations among aliphatic glucosinolates were observed due to the co-location of QTLs (data not shown). To extract the most relevant relationships between different glucosinolates, we also calculated second-order correlations, defined as the correlation between two glucosinolates independent of co-variance with any other pair (de la Fuente et al., 2004). The significance threshold for the second-order correlations was empirically estimated by permutation (supplemental methods at http://www.nature.com/naturegenetics). Significant coefficients are shown in Figure 7 as edges between metabolites; 0.1 false positive edges are expected by chance. The resulting network is essentially a reconstruction of a known pathway for glucosinolate formation (Figure 5) and groups glucosinolates according to their specific biosynthesis steps. The fact that the reconstructed network has similarities to the known pathway validates our methods, and the dissimilarities suggest possible previously unknown steps in the formation of glucosinolates.

Even if no prior information had been available, our mapping data alone suggest that at least two loci contribute to the observed variation in aliphatic glucosinolate formation. The fact that most MAM-regulated compounds do not show a QTL at the AOP locus, whereas all AOP-regulated compounds also show a QTL at the MAM locus (Figure 6), suggests that AOP acts downstream of MAM. Furthermore, we observed high levels of side chain-modified compounds in unexpected genotypic classes (Table 1). In contrast to previous findings (Kliebenstein et al., 2001c), this suggests that both AOP2 and AOP3 are expressed in seedlings, indicating that regulation of glucosinolate formation is dependent on developmental stage. The reverse additive effect of the AOP locus for 4-hydroxybutyl, 2-propenyl and 4-benzoyloxybutyl formation shows that regulation can be completely different for different growth stages, although Kliebenstein et al. (2001c) also suggested alternative loci for 4-hydroxybutyl formation. These results validate our combined genetic and metabolomic approach to identify co-regulated masses and provide an independent line of evidence to validate or modify current knowledge. An untargeted approach should therefore facilitate the annotation of metabolites to existing or even to as-yet-unknown pathways.
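The second-order correlations behind Figure 7 can be illustrated with a small sketch. It computes, for every pair of QTL profiles, the zeroth-, first- and second-order partial correlations and keeps the one with the smallest absolute value, so that an edge survives only if no other single profile or pair of profiles explains the association. This follows the general idea attributed to de la Fuente et al. (2004); the function names, the choice of the minimum as summary and the simulated data are assumptions of this example, and the permutation-based significance threshold is not reproduced.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(r, i, j, cond):
    """Partial correlation of i and j given cond, via the recursive formula."""
    if not cond:
        return r[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(r, i, j, rest)
    r_ik = partial_corr(r, i, k, rest)
    r_jk = partial_corr(r, j, k, rest)
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def weakest_partial_corr(profiles, max_order=2):
    """For each pair, the partial correlation of smallest magnitude (order 0..2)."""
    n = len(profiles)
    r = [[pearson(a, b) for b in profiles] for a in profiles]
    result = {}
    for i, j in combinations(range(n), 2):
        others = [k for k in range(n) if k not in (i, j)]
        candidates = [
            partial_corr(r, i, j, cond)
            for order in range(max_order + 1)
            for cond in combinations(others, order)
        ]
        result[(i, j)] = min(candidates, key=abs)
    return result


# Simulated chain x -> y -> z plus an unrelated profile w (hypothetical data).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [a + 0.5 * rng.gauss(0, 1) for a in x]
z = [b + 0.5 * rng.gauss(0, 1) for b in y]
w = [rng.gauss(0, 1) for _ in range(2000)]
edges = weakest_partial_corr([x, y, z, w])
```

In the simulated chain, the direct links x-y and y-z keep a clear correlation, while the indirect association x-z shrinks towards zero once y is conditioned on. This is the behaviour that lets such a network recover the order of steps in a pathway.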

Table 1: Phenotypic and mapping data for aliphatic glucosinolates detected in the RIL population and its parents.
For each glucosinolate, the significance (-log10 P) and additive effect (MC) of the QTLs detected at the MAM and AOP locus are given. Relative abundance (mean ± SD) in the parents Ler (n=5) and Cvi (n=5) and in the RIL genotype classes AA (n=43), AB (n=49), BA (n=27) and BB (n=38) is given as mass signal intensities (MC, counts at maximum peak height). Genotype classes denote the alleles at the MAM and AOP locus, respectively; A=Ler, B=Cvi.
Glucosinolates listed: 4-methylthiobutyl, 3-methylthiopropyl, 5-methylthiopentyl, 6-methylthiohexyl, 7-methylthioheptyl, 3-methylsulfinylpropyl, 7-methylsulfinylheptyl, 3-hydroxypropyl, 4-methylsulfinylbutyl, 5-methylsulfinylpentyl, 6-methylsulfinylhexyl, 3-benzoyloxypropyl, 4-hydroxybutyl, 3-butenyl, 4-benzoyloxybutyl, 2-propenyl, 5-benzoyloxypentyl, 6-benzoyloxyhexyl.
[Table 1 body omitted]


Untargeted metabolomics uncovers new biosynthetic steps

To demonstrate the power of our untargeted metabolomics approach in uncovering previously unknown potential regulatory relationships between metabolites, we focused on a locus on chromosome 1 at 88.6 cM, where a number of mass signals could be mapped with high significance. We first determined the extent of QTL overlap, expressed as the correlation coefficient, of the mass with the most significant QTL with all other masses. Next, masses showing significant correlation were identified by calculating their accurate mass, interpreting their absorbance spectra (photodiode array (PDA) signals) and using MS/MS fragmentation techniques (supplemental Table 4 at http://www.nature.com/naturegenetics). Most of the mass signals sharing this single QTL on chromosome 1 corresponded to different glycosylated flavonols (Figure 8A). The direction of the additive effect, however, suggests that genotypic variation at this locus exerts opposite effects on the glycosylation pattern. Lines carrying the Ler allele(s) at this locus accumulate flavonols containing dihexosyl glycosides, whereas lines carrying the Cvi allele(s) at this position do not. Ler genotypes, however, are able to synthesize all flavonols detected in Cvi genotypes (Table 2 and Figure 8, B and C). The present findings suggest that a specific, not previously identified glycosyltransferase, catalyzing the production of flavonol-dihexosides, is active in Ler but not in Cvi, thus affecting total flavonol composition.

Table 2: Characteristics of putatively identified flavonols.
Each flavonol is presented as its aglycone with its distinguishing glycosylation pattern. Significance of the detected QTL on chromosome 1 at 88.6 cM for each flavonol is shown as -log10 P values, and additive effect and relative abundance of each flavonol in the parental lines are given as mass signal intensities (MC, counts at maximum peak height).

Aglycone      Glycosylation             Sign. (-log10 P)  Effect (MC)  Ler (MC ± SE)  Cvi (MC ± SE)
Isorhamnetin  Deoxyhexosyl-hexoside     30.7                199         247 ± 54       212 ± 10
Isorhamnetin  Deoxyhexosyl-dihexoside   24.0               -123         258 ± 18         4 ± 0
Kaempferol    Dideoxyhexosyl-hexoside   39.1                197          13 ± 2        329 ± 40
Kaempferol    Deoxyhexosyl-dihexoside   29.5              -1326        1334 ± 164        7 ± 0
Quercetin     Deoxyhexosyl-hexoside     50.7               2659        1293 ± 291     4928 ± 517
Quercetin     Deoxyhexosyl-dihexoside   24.3              -1721        3031 ± 167        4 ± 0

Two genes putatively annotated as UDP-glucose:glycosyltransferases (UGTs) based on consensus sequence homology with Family 1 UGTs coincide with the support interval of the QTL (viz. UGT79B10 and UGT79B11) (Li et al., 2001). UGT79B10 has been expressed as recombinant protein in Escherichia coli, but it showed no activity against quercetin glucosides in an in vitro analysis (Lim et al., 2004). However, the coding sequence was obtained from the Columbia accession, which might harbor allelic differences compared with Ler or Cvi. No information about activity of UGT79B11 is currently available, but its sequence is highly homologous to UGT79B10, and the two genes probably arose from a duplication event. Therefore, neither gene can be ruled out a priori as a candidate for the observed QTL. Another possibility might be the presence of a gene in Ler that is absent in Cvi and Col and therefore is not annotated in the Col sequence. Fine-mapping of this locus should demonstrate whether the QTL represents an encoding structural gene or a regulator thereof.

Figure 8: Genetic variation in flavonol-glycoside accumulation in Arabidopsis.
(A) QTL likelihood profiles of putatively identified flavonol glycosides in the RIL population. The sign of the value is related to the additive effect at each marker position (+, Cvi; -, Ler). Dotted and solid lines represent flavonols with and without dihexosyl residues, respectively. Chromosomal borders are indicated by vertical shaded lines. (B) Typical example of relative levels of flavonol-dihexoside versus flavonol-monohexoside in the RIL population. Each symbol represents the average of two measurements per RIL. Squares and triangles represent lines carrying a Cvi or Ler genotype at the QTL position, respectively. (C) Typical example of flavonol dihexoside and flavonol monohexoside accumulation in the parental lines Ler (black) and Cvi (shaded). Data represent five biological replicates for each parent measured in two replicate extractions. In (B) and (C), values represent mass signal intensities (MC, counts at maximum peak height). Error bars represent s.e.m.

Thus, the untargeted detection and subsequent mapping of metabolites enabled us to identify a number of putative flavonol-glycosides not previously reported in Arabidopsis (D'Auria and Gershenzon, 2005). Co-location of QTLs suggests that variation in the accumulation of these flavonol species is attributable to a single locus affecting glycosylation of the basic flavonoid backbone.


DISCUSSION

The framework proposed here involves the untargeted detection of hundreds to potentially thousands of metabolites in a mapping population, thus enabling the mapping of QTLs for individual metabolites. This creates new opportunities for pathway elucidation and identification even when background knowledge is highly limited. We show that the biochemical variation in Arabidopsis is extensive but is nevertheless largely under genetic control, as concluded from the observation that genomic loci could be assigned for 75% of the LC-MS-detected mass peaks. Untargeted metabolomics is particularly useful in this context because it allows the detection of previously unidentified metabolites. When such metabolites are co-regulated with known metabolites, this may facilitate the functional assignment of those unknown metabolites. Similarly, unexpected co-occurrence of well-known metabolites can be discovered that would have been missed if detection had been targeted to a specific subset of compounds. Genetic variation for metabolite composition might be important in adaptation to the specific environmental conditions in which the different accessions grow. In addition, metabolites determine many nutritional, sensory, and other aspects of crop plant quality.

Biological systems are often regulated at various molecular levels, including the influence of metabolites on plant development. A number of studies have indicated the influence of metabolites on whole plant morphology during early stages of development (Alba et al., 2005; Lumba and McCourt, 2005). Thus, our understanding of biological function would benefit greatly from quantitative measurements of different classes of compounds (such as proteins and metabolites) and various processes (such as gene expression) carried out in parallel, preferably combined with other classical phenotypic analyses (Oksman-Caldentey and Saito, 2005). The implementation of different technologies then enables association analyses based on similar genetic control, as shown by similar QTL positions. In particular, the use of a perpetual mapping population such as an RIL population will have added value because co-locating QTLs can identify the genetic basis for these associations even when different experiments have been performed (Lall et al., 2004; DeCook et al., 2006). Our study can therefore easily be extended by using different extraction and analysis methods or by examining contrasting plant developmental stages. Moreover, the recent progress made in genetic analyses of gene expression (Brem et al., 2002; Schadt et al., 2003) can also readily be exploited, and this will further aid the construction of genetic regulatory networks (Jansen, 2003).


In the past, numerous studies have shown the usefulness of natural biodiversity for the elucidation of agronomically important traits, and pleiotropic loci have been identified controlling different traits simultaneously (Koornneef et al., 2004). The parallel genetic analysis of physiological, transcriptional, and biochemical profiling can greatly enhance our understanding of metabolic regulatory circuitry and its relationship with phenotypic traits that segregate in the same population. The definitive identification of the most interesting chemical compounds represented by the various mass peaks would require additional chemical analysis. However, setting priorities for these analyses can now be performed effectively on the identified map positions of QTLs controlling such phenotypic traits.

Understanding the mechanisms that explain natural variation in metabolite profiles and how this correlates with phenotype is a primary challenge for evolutionary research and research geared to defining natural biodiversity and maximizing its use through directed plant breeding approaches. The strategy described here has universal application and can be used for any set of metabolites analyzed in mapping populations of any organism.


MATERIALSȱANDȱMETHODSȱ ȱ Arabidopsisȱaccessionsȱandȱmappingȱpopulationȱ Fourteenȱ accessionsȱ ofȱ A.ȱ thalianaȱ representingȱ differentȱ regionsȱ ofȱ theȱ globalȱ distributionȱ ofȱ theȱ speciesȱ wereȱ analyzedȱ forȱ quantitativeȱ geneticȱ variationȱ inȱ metaboliteȱcontent.ȱAȱpopulationȱofȱ160ȱrecombinantȱinbredȱlinesȱderivedȱfromȱaȱ crossȱbetweenȱtheȱaccessionsȱCapeȱVerdeȱIslandsȱ(Cvi)ȱandȱLandsbergȱerectaȱ(Ler)ȱ wasȱ usedȱ forȱ QTLȱ mappingȱ ofȱ metaboliteȱ content.ȱ Theȱ F10ȱ generationȱ hasȱ beenȱ extensivelyȱ genotypedȱ (AlonsoȬBlancoȱ etȱ al.,ȱ 1998)ȱ andȱ isȱ availableȱ fromȱ theȱ Arabidopsisȱ Biologicalȱ Resourceȱ Center.ȱ Allȱ linesȱ wereȱ advancedȱ toȱ theȱ F13ȱ generation,ȱ andȱ residualȱ heterozygousȱ regions,ȱ estimatedȱ toȱ beȱ 0.71%ȱ inȱ theȱ F10ȱ generation,ȱwereȱgenotypedȱagainȱusingȱmolecularȱPCRȱmarkers.ȱInȱaddition,ȱallȱ linesȱ wereȱ genotypedȱ withȱ aȱ fewȱ extraȱ markersȱ toȱ improveȱ theȱ qualityȱ ofȱ theȱ geneticȱmap.ȱBecauseȱeachȱlineȱisȱalmostȱcompletelyȱhomozygous,ȱindividualsȱofȱ theȱ sameȱ lineȱ areȱ geneticallyȱ identical,ȱ whichȱ allowsȱ theȱ poolingȱ ofȱ replicateȱ individualsȱ andȱ repeatedȱ measurementsȱ toȱ obtainȱ aȱ moreȱ preciseȱ estimateȱ ofȱ phenotypeȱvaluesȱandȱbroadȱsenseȱheritabilities.ȱ ȱ Germination,ȱgrowthȱconditionsȱandȱharvestingȱ Seedsȱ ofȱ accessionsȱ andȱ RILsȱ wereȱ sownȱ onȱ 10ȱ mlȱ twiceȬdilutedȱ Murashigiȱ andȱ Skoogȱmediumȱcontainingȱ2%ȱagarȱinȱ6ȬcmȱPetriȱdishes.ȱForȱeachȱline,ȱfiveȱreplicateȱ dishesȱwereȱsownȱonȱfiveȱconsecutiveȱdaysȱwithȱaȱdensityȱofȱaȱfewȱhundredȱseedsȱ perȱPetriȱdish.ȱPetriȱdishesȱwereȱplacedȱinȱaȱcoldȱroomȱatȱ4°Cȱforȱ7ȱdaysȱinȱtheȱdarkȱ toȱ promoteȱ uniformȱ germination.ȱ Subsequently,ȱ dishesȱ wereȱ randomlyȱ placedȱ inȱ fiveȱblocksȱinȱaȱclimateȱchamberȱwhereȱeachȱblockȱcontainedȱoneȱreplicateȱdishȱofȱ eachȱline.ȱGrowingȱconditionsȱwereȱ16ȱhrȱlightȱ(30ȱW.mȬ2)ȱatȱ20°C,ȱ8ȱhrȱdarkȱatȱ15°Cȱ andȱ75%ȱrelativeȱhumidity.ȱAfterȱ6ȱdaysȱtheȱlidsȱofȱtheȱPetriȱdishesȱwereȱremovedȱ 
toȱensureȱseedlingsȱwereȱfreeȱofȱcondensedȱwaterȱonȱtheȱdayȱofȱharvesting.ȱOnȱdayȱ 7,ȱseedlingsȱwereȱharvestedȱbyȱsubmergingȱtheȱcompleteȱPetriȱdishȱbrieflyȱinȱliquidȱ nitrogenȱandȱscrapingȱoffȱtheȱaerialȱpartsȱwithȱaȱrazorȱblade.ȱHarvestingȱstartedȱ7ȱ hoursȱintoȱtheȱlightȱperiodȱandȱallȱlinesȱwereȱharvestedȱinȱrandomȱorderȱwithinȱ2ȱ hours.ȱPlantȱmaterialȱwasȱstoredȱatȱȬ80°Cȱuntilȱfurtherȱprocessing.ȱ ȱ ExtractȱpreparationȱandȱLCȬMSȱanalysisȱ Forȱeachȱline,ȱplantȱmaterialȱfromȱtwoȱdishesȱwasȱharvestedȱtoȱmakeȱoneȱreplicateȱ sampleȱ andȱ materialȱ fromȱ theȱ otherȱ threeȱ dishesȱ wasȱ harvestedȱ forȱ theȱ secondȱ sample.ȱSamplesȱwereȱgroundȱinȱliquidȱnitrogen,ȱandȱ100ȱmgȱofȱeachȱsampleȱwasȱ weighedȱinȱ2.2ȱmlȱEppendorfȱtubes.ȱAqueousȬmethanolȱextractsȱwereȱpreparedȱbyȱ

90ȱ Theȱgeneticsȱofȱplantȱmetabolismȱ

adding 400 µl of ice-cold 92% methanol acidified with 0.1% (vol/vol) formic acid to the plant sample (final methanol concentration 75%, assuming 90% water in tissues). After sonication for 15 min and centrifugation (20,000g) for 10 min, the extracts were transferred to 96-well protein filtration plates (Captiva 0.45 µm, Ansys Technologies), vacuum-filtered and collected in 700-µl glass inserts in 96-well autosampler plates (Waters Corporation), using a Genesis Workstation (Tecan Systems Inc.). Samples were automatically injected (5 µl) and separated using an Alliance 2795 HT system (Waters Corporation) equipped with a Luna C18 reversed-phase column (150 x 2.1 mm, 3 µm; Phenomenex, CA). Separation was performed at 40°C by applying a 20 min gradient from 5 to 75% acetonitrile in water, acidified with 0.1% formic acid, at a flow rate of 0.2 ml/min. Compounds eluting from the column were detected online, first by a Waters 996 photodiode array detector at 200-600 nm and then by a Q-TOF Ultima MS (Waters) with an electrospray ionization (ESI) source. Ions were detected in negative mode in the range of m/z 100 to 1,500, using a scan time of 900 msec and an interscan delay of 100 msec. Desolvation temperature was 250°C with a nitrogen gas flow of 500 l/h, capillary spray voltage was 2.75 kV, source temperature was 120°C, cone voltage was 35 V with 50 l/h nitrogen gas flow, and collision energy was 10 eV. The mass spectrometer was calibrated using 0.05% phosphoric acid in 50% acetonitrile. Leucine enkephalin (Sigma), detected online through a separate ESI interface every 10 sec, was used as a lock mass for exact mass measurements. MassLynx software version 4.0 (Waters) was used to control all instruments and for calculation of accurate masses.

Data pre-processing
The dedicated software program METALIGN (http://www.metAlign.nl) was used for unbiased and unsupervised comparison of all LC-MS datasets (Tikunov et al., 2005; Vorst et al., 2005). In short, the program performs automated peak centering, local noise calculation, baseline correction and extraction of all relevant mass signals (i.e. signal-to-noise ratio of 3 or higher) from all LC-MS datasets, and it subsequently uses landmark-dependent alignment algorithms to correct for local chromatographic drifts and obtain an ordered data matrix ('aligned mass peaks' versus samples). Mass peak signals generated are calculated as mass intensities (ion counts) at maximum peak height.

Quality improvement by reduction of the dataset
For each sample, the number of detected masses was reduced to improve the quality of the data set. Only masses that were detected in the optimized gradient phase (Vorst et al., 2005) (between 3 and 20 min retention time) and that had a signal intensity higher than six times local noise were selected for further data

91ȱ Chapterȱ4ȱ

analysis. For the RIL population, masses that had a signal intensity higher than six times local noise but that were detected in fewer than ten lines were discarded as well.

Statistical analyses
Total phenotypic variance was partitioned into sources attributable to genotype and error. Components of variance were used to estimate broad-sense heritability according to the formula $H^2 = V_G/(V_G + V_e)$, where $V_G$ is the among-genotype variance component and $V_e$ is the residual (error) variance component of the analysis of variance (ANOVA).
The distance between accessions, based on metabolic content, was calculated by hierarchical clustering. Data were first transformed as $(x_{ij} - u_i)/sd_i$, where $x_{ij}$ is the peak intensity of the ith mass in the jth accession, $u_i$ is the mean intensity of the ith mass over all accessions, and $sd_i$ is the standard deviation of the intensity of the ith mass over all accessions. Distance was then calculated using Euclidean methods and clusters were constructed using average linkage clustering. To verify the clustering, we performed 1,000 bootstrap runs by using approximately-unbiased multistep-multiscale bootstrap resampling (Shimodaira, 2004). The P-values computed indicate how strongly each cluster was supported by the data.

Linkage map construction
Genotype data for the Ler x Cvi population individuals are available at http://nasc.nott.ac.uk/. The genetic map was constructed from a subset of the markers available with a few new markers added. The computer program JOINMAP 3.0 (Stam, 1993) (http://www.kyazma.com) was used for the calculation of linkage groups and genetic distances. Recombination frequencies were converted to centiMorgan distances using the Kosambi mapping function.

QTL analysis
For many masses, a spike in the phenotype distribution was observed, causing a departure from the assumption of normal distribution. The spike was caused by the absence of a mass peak in a considerable number of RILs, consequently leading to signal intensities equal to the detection threshold value (four times local noise). Because distributions were normal if only RILs with signal intensities above the detection threshold were taken into account, we carried out a single-marker analysis using a two-part parametric model (Broman, 2003).
The first part describes a binomial model that tests for association of markers with presence or absence of mass peaks. For each mass peak, let $y_i$ denote the mass intensity for the ith RIL. Let $z_i = 0$ if $y_i = 4$, and $z_i = 1$ if $y_i > 4$. We then tested each marker for significant differences between the two genotypes for the probability of presence of the mass peak: H0: P{z = 1 | g = Ler} = P{z = 1 | g = Cvi} versus the alternative hypothesis H1: P{z = 1 | g = Ler} ≠ P{z = 1 | g = Cvi}, where g is the genotype (Ler or Cvi) of a marker under analysis.
The second part describes a parametric model that tests for association of markers with intensity of the mass signal for those lines where $y_i > 4$. Under the assumption of normal distribution, we tested each marker for significant differences in the mean values between two genotypes: H0: µ{g = Ler} = µ{g = Cvi} versus the alternative hypothesis H1: µ{g = Ler} ≠ µ{g = Cvi}. The P-value of the two-part model was then determined as the product of the P-values from the two separate analyses (P1 and P2, respectively).
To calculate significance thresholds, we performed a simulation study following Broman (2003). Each individual had probability 40% (the median proportion of null phenotypes observed in mass data) of having a null phenotype and probability 60% of having a phenotype drawn from a normal distribution with mean 13 (the median value of mass phenotype data) and standard deviation 1. For each of 10,000 replicates, we simulated such data under the null hypothesis of no QTL, applied the two-part model and stored the genome-wide minimum P-value. The 98th percentile of the P-values corresponded to 0.0001. With the real data, the q-values corresponding to P-values were estimated using Storey's genome-wide false discovery rate (FDR) method (Storey and Tibshirani, 2003).
We next calculated the proportion of QTL significance explained by the binomial part by $\log P_1/(\log P_1 + \log P_2)$, where $P_1$ and $P_2$ are the P-values from the two separate parts of the model, respectively (supplemental Figure 4 at http://www.nature.com/naturegenetics). The variance explained by QTLs was calculated for both parts separately (supplemental Figure 5 at http://www.nature.com/naturegenetics). In the quantitative model (part II), we used ANOVA to estimate the total sum of squares ($SS_{total}$) and the sum of squares between QTL genotypes ($SS_{QTL}$). The proportion of variance explained by the QTL was then calculated as $SS_{QTL}/SS_{total}$. For the binomial model (part I), we used the deviance instead of the sum of squares. We fitted the binomial data to a generalized linear (probit) model to estimate the deviances (dev) (McCullagh and Nelder, 1989). The proportion of variance explained by the QTL in the binomial model was then calculated as $dev_{QTL}/dev_{total}$.
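The two-part test described above can be illustrated with a short sketch. It is not the implementation used for the analysis: as stand-ins it assumes a Pearson chi-square test (1 d.f.) for the presence/absence part and a large-sample normal approximation for the comparison of means, and all function names are hypothetical. It then combines the two parts as the product of P1 and P2 and computes the share of significance carried by the binomial part, log P1/(log P1 + log P2).

```python
import math
from statistics import mean, variance


def presence_p(present_ler, n_ler, present_cvi, n_cvi):
    """Part I (assumed stand-in): Pearson chi-square test, 1 d.f., on the
    2x2 table of peak presence/absence by marker genotype."""
    a, b = present_ler, n_ler - present_ler
    c, d = present_cvi, n_cvi - present_cvi
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:  # an empty row or column carries no information
        return 1.0
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / margins
    # Upper tail of a chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def intensity_p(ler_values, cvi_values):
    """Part II (assumed approximation): two-sided z-test on mean intensity,
    restricted to lines in which the peak was detected. Each group needs
    at least two values and the pooled standard error must be non-zero."""
    se = math.sqrt(variance(ler_values) / len(ler_values)
                   + variance(cvi_values) / len(cvi_values))
    z = (mean(ler_values) - mean(cvi_values)) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def two_part_p(p1, p2):
    """Combined P-value: product of the P-values of the two parts."""
    return p1 * p2


def binomial_share(p1, p2):
    """Proportion of significance explained by the binomial part,
    log P1 / (log P1 + log P2); requires 0 < P <= 1 for both parts."""
    total = math.log(p1) + math.log(p2)
    return math.log(p1) / total if total != 0 else 0.0
```

A marker whose two genotype classes differ mainly in how often the peak is present gives a small P1 and a share close to 1; a marker that only shifts the intensity of detected peaks gives a share close to 0.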

93ȱ Chapterȱ4ȱ
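The broad-sense heritability estimate given under "Statistical analyses" can be illustrated with a minimal sketch. It assumes a balanced one-way design (the same number of replicates for every genotype) and the usual method-of-moments estimate of the among-genotype variance component; the text does not state how the components were estimated, so both choices, and the function name, are assumptions.

```python
from statistics import mean


def broad_sense_heritability(groups):
    """Estimate H^2 = V_G / (V_G + V_e) from a balanced one-way ANOVA.

    groups: dict mapping genotype -> list of replicate measurements.
    """
    reps = len(next(iter(groups.values())))
    if reps < 2 or any(len(v) != reps for v in groups.values()):
        raise ValueError("balanced design with >= 2 replicates assumed")
    n_geno = len(groups)
    grand = mean(v for vals in groups.values() for v in vals)

    # Sums of squares between and within genotypes
    ss_between = reps * sum((mean(vals) - grand) ** 2
                            for vals in groups.values())
    ss_within = sum((v - mean(vals)) ** 2
                    for vals in groups.values() for v in vals)

    ms_between = ss_between / (n_geno - 1)
    ms_within = ss_within / (n_geno * (reps - 1))

    # E[MS_between] = V_e + reps * V_G, so solve for V_G (floored at zero)
    v_g = max((ms_between - ms_within) / reps, 0.0)
    v_e = ms_within
    return v_g / (v_g + v_e)
```

For example, three genotypes measured twice each with values (1, 3), (5, 7) and (9, 11) give a between-genotype mean square of 32 and a within-genotype mean square of 2, hence V_G = 15, V_e = 2 and H^2 = 15/17.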

Calculation of genetic correlations
Various methods have been developed and applied to uncover gene regulatory networks from expression profiles (de la Fuente et al., 2004; Bing and Hoeschele, 2005; Schadt et al., 2005) or from QTL profiles (Zhu et al., 2004). We combined and modified the methods of Bing and Hoeschele (2005) and Zhu et al. (2004) and calculated the second-order partial correlation on QTL profiles between any pair of masses to assess the strength of their genetic relationship.
The calculation took three steps: (i) for each QTL significant at P < 0.0001, the QTL support interval was determined by setting left and right border positions associated with max{−log10 P} ± 1.5; that is, the 1.5-LOD drop-off interval. Subsequently, −log10 P values for positions outside the support intervals were set to zero. (ii) Pairwise correlation coefficients between any two masses were then calculated as:

$$ r_{xy} = \frac{2 \sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2} $$

where $r_{xy}$ is the correlation coefficient between mass x and y, and i (i = 1…n) is a marker. $x_i$ and $y_i$ represent −log10 P values for marker i. (iii) Finally, second-order partial correlations were calculated. The first-order correlation between variable x and y conditional on a single variable z is given by:

$$ r_{xy \cdot z} = \frac{r_{xy} - r_{xz} r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}} $$

where $r_{xy}$, $r_{xz}$ and $r_{yz}$ are correlation coefficients on mass expression profiles between x and y, x and z, and y and z, respectively. The second-order partial correlation between x and y, conditional on a pair of variables z and k, is a function of first-order coefficients:

$$ r_{xy \cdot zk} = \frac{r_{xy \cdot z} - r_{xk \cdot z} r_{yk \cdot z}}{\sqrt{(1 - r_{xk \cdot z}^2)(1 - r_{yk \cdot z}^2)}} $$

For each pair x and y, the second-order partial correlations were calculated conditional on each pair z and k, and the minimal value was stored. Having calculated these minimal values for all pairs x and y for aliphatic glucosinolates, the empirical threshold was obtained by permutation (supplemental methods at http://www.nature.com/naturegenetics). The second-order partial correlation coefficients between QTL profiles were computed in each of 20,000 permutations and sorted to derive the threshold of 0.14 at α = 0.05, Bonferroni-adjusted for 17, the number of correlation tests for each glucosinolate. We did not correct the α level for the number of all pairwise analyses (17 x 18/2) to avoid over-correction. At this threshold, on average 0.1 correlation coefficients are significant by chance.

Acknowledgements
This work was supported by grants from the Netherlands Organization for Scientific Research, Program Genomics (050-10-029) and the Centre for Biosystems Genomics (CBSG, Netherlands Genomics Initiative).
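The partial-correlation procedure described under "Calculation of genetic correlations" can be sketched as follows. The pairwise coefficient is read here as 2Σ(x_i·y_i) / (Σx_i² + Σy_i²) on −log10 P profiles, followed by the standard first- and second-order partial correlation recursions. Function names and the matrix layout are hypothetical, and the sketch assumes that no conditioning coefficient equals ±1 and that at least two other masses are available to condition on.

```python
import math
from itertools import combinations


def profile_coefficient(x, y):
    """Pairwise coefficient between two QTL profiles (-log10 P per marker,
    set to zero outside support intervals): 2*sum(xy) / (sum(x^2) + sum(y^2))."""
    num = 2.0 * sum(a * b for a, b in zip(x, y))
    den = sum(a * a for a in x) + sum(b * b for b in y)
    return num / den if den > 0 else 0.0


def first_order(r, x, y, z):
    """Partial correlation of x and y given z; r is a square matrix
    (list of lists) of pairwise coefficients indexed by mass."""
    den = math.sqrt((1 - r[x][z] ** 2) * (1 - r[y][z] ** 2))
    return (r[x][y] - r[x][z] * r[y][z]) / den


def second_order(r, x, y, z, k):
    """Partial correlation of x and y given both z and k, built from
    first-order coefficients that are all conditional on z."""
    rxy_z = first_order(r, x, y, z)
    rxk_z = first_order(r, x, k, z)
    ryk_z = first_order(r, y, k, z)
    den = math.sqrt((1 - rxk_z ** 2) * (1 - ryk_z ** 2))
    return (rxy_z - rxk_z * ryk_z) / den


def minimal_second_order(r, x, y):
    """Minimum second-order partial correlation of x and y over every
    pair (z, k) of other masses; the value is symmetric in z and k."""
    others = [m for m in range(len(r)) if m not in (x, y)]
    return min(second_order(r, x, y, z, k)
               for z, k in combinations(others, 2))
```

Storing the minimum over all conditioning pairs means a link between two masses is kept only if no pair of other masses can account for their shared QTL profile, which is the conservative choice described in the text.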


REFERENCES

Alba, R., Payton, P., Fei, Z., McQuinn, R., Debbie, P., Martin, G.B., Tanksley, S.D. and Giovannoni, J.J. (2005). Transcriptome and selected metabolite analyses reveal multiple points of ethylene control during tomato fruit development. Plant Cell 17, 2954-2965.
Alonso-Blanco, C., Peeters, A.J., Koornneef, M., Lister, C., Dean, C., van den Bosch, N., Pot, J. and Kuiper, M.T. (1998). Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J 14, 259-271.
Bentsink, L., Alonso-Blanco, C., Vreugdenhil, D., Tesnier, K., Groot, S.P. and Koornneef, M. (2000). Genetic analysis of seed-soluble oligosaccharides in relation to seed storability of Arabidopsis. Plant Physiol 124, 1595-1604.
Bing, N. and Hoeschele, I. (2005). Genetical genomics analysis of a yeast segregant population for transcription network inference. Genetics 170, 533-542.
Brem, R.B., Yvert, G., Clinton, R. and Kruglyak, L. (2002). Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752-755.
Broman, K.W. (2003). Mapping quantitative trait loci in the case of a spike in the phenotype distribution. Genetics 163, 1169-1175.
D'Auria, J.C. and Gershenzon, J. (2005). The secondary metabolism of Arabidopsis thaliana: growing like a weed. Curr Opin Plant Biol 8, 308-316.
de Koning, D.J. and Haley, C.S. (2005). Genetical genomics in humans and model organisms. Trends Genet 21, 377-381.
de la Fuente, A., Bing, N., Hoeschele, I. and Mendes, P. (2004). Discovery of meaningful associations in genomic data using partial correlation coefficients. Bioinformatics 20, 3565-3574.
DeCook, R., Lall, S., Nettleton, D. and Howell, S.H. (2006). Genetic regulation of gene expression during shoot development in Arabidopsis. Genetics 172, 1155-1164.
Dixon, R.A. (2005). Engineering of plant natural product pathways. Curr Opin Plant Biol 8, 329-336.
Hobbs, D.H., Flintham, J.E. and Hills, M.J. (2004). Genetic control of storage oil synthesis in seeds of Arabidopsis. Plant Physiol 136, 3341-3349.
Jansen, R.C. (1993). Interval mapping of multiple quantitative trait loci. Genetics 135, 205-211.
Jansen, R.C. and Nap, J.P. (2001). Genetical genomics: the added value from segregation. Trends Genet 17, 388-391.
Jansen, R.C. (2003). Studying complex biological systems using multifactorial perturbation. Nat Rev Genet 4, 145-151.
Kliebenstein, D.J., Gershenzon, J. and Mitchell-Olds, T. (2001a). Comparative quantitative trait loci mapping of aliphatic, indolic and benzylic glucosinolate production in Arabidopsis thaliana leaves and seeds. Genetics 159, 359-370.
Kliebenstein, D.J., Kroymann, J., Brown, P., Figuth, A., Pedersen, D., Gershenzon, J. and Mitchell-Olds, T. (2001b). Genetic control of natural variation in Arabidopsis glucosinolate accumulation. Plant Physiol 126, 811-825.
Kliebenstein, D.J., Lambrix, V.M., Reichelt, M., Gershenzon, J. and Mitchell-Olds, T. (2001c). Gene duplication in the diversification of secondary metabolism: tandem 2-oxoglutarate-dependent dioxygenases control glucosinolate biosynthesis in Arabidopsis. Plant Cell 13, 681-693.
Koornneef, M., Alonso-Blanco, C. and Vreugdenhil, D. (2004). Naturally occurring genetic variation in Arabidopsis thaliana. Annu Rev Plant Physiol Plant Mol Biol 55, 141-172.
Kose, F., Weckwerth, W., Linke, T. and Fiehn, O. (2001). Visualizing plant metabolomic correlation networks using clique-metabolite matrices. Bioinformatics 17, 1198-1208.


Kroymann, J., Textor, S., Tokuhisa, J.G., Falk, K.L., Bartram, S., Gershenzon, J. and Mitchell-Olds, T. (2001). A gene controlling variation in Arabidopsis glucosinolate composition is part of the methionine chain elongation pathway. Plant Physiol 127, 1077-1088.
Lall, S., Nettleton, D., DeCook, R., Che, P. and Howell, S.H. (2004). Quantitative trait loci associated with adventitious shoot formation in tissue culture and the program of shoot development in Arabidopsis. Genetics 167, 1883-1892.
Li, Y., Baldauf, S., Lim, E.K. and Bowles, D.J. (2001). Phylogenetic analysis of the UDP-glycosyltransferase multigene family of Arabidopsis thaliana. J Biol Chem 276, 4338-4343.
Lim, E.K., Ashford, D.A., Hou, B., Jackson, R.G. and Bowles, D.J. (2004). Arabidopsis glycosyltransferases as biocatalysts in fermentation for regioselective synthesis of diverse quercetin glucosides. Biotechnol Bioeng 87, 623-631.
Loudet, O., Chaillou, S., Merigout, P., Talbotec, J. and Daniel-Vedele, F. (2003). Quantitative trait loci analysis of nitrogen use efficiency in Arabidopsis. Plant Physiol 131, 345-358.
Lumba, S. and McCourt, P. (2005). Preventing leaf identity theft with hormones. Curr Opin Plant Biol 8, 501-505.
McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models. (New York: Chapman & Hall).
Mita, S., Murano, N., Akaike, M. and Nakamura, K. (1997). Mutants of Arabidopsis thaliana with pleiotropic effects on the expression of the gene for beta-amylase and on the accumulation of anthocyanin that are inducible by sugars. Plant J 11, 841-851.
Mitchell-Olds, T. and Pedersen, D. (1998). The molecular basis of quantitative genetic variation in central and secondary metabolism in Arabidopsis. Genetics 149, 739-747.
Oksman-Caldentey, K.M. and Saito, K. (2005). Integrating genomics and metabolomics for engineering plant metabolic pathways. Curr Opin Biotechnol 16, 174-179.
Reichelt, M., Brown, P.D., Schneider, B., Oldham, N.J., Stauber, E., Tokuhisa, J., Kliebenstein, D.J., Mitchell-Olds, T. and Gershenzon, J. (2002). Benzoic acid glucosinolate esters and other glucosinolates from Arabidopsis thaliana. Phytochemistry 59, 663-671.
Schadt, E.E., Monks, S.A., Drake, T.A., Lusis, A.J., Che, N., Colinayo, V., Ruff, T.G., Milligan, S.B., Lamb, J.R., Cavet, G. et al. (2003). Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297-302.
Schadt, E.E., Lamb, J., Yang, X., Zhu, J., Edwards, S., Guhathakurta, D., Sieberts, S.K., Monks, S., Reitman, M., Zhang, C. et al. (2005). An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet 37, 710-717.
Shimodaira, H. (2004). Approximately unbiased tests of regions using multistep-multiscale bootstrap resampling. Ann Statist 32, 2616-2641.
Stam, P. (1993). Construction of integrated genetic linkage maps by means of a new computer package: JoinMap. Plant J 3, 739-744.
Storey, J.D. and Tibshirani, R. (2003). Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100, 9440-9445.
Stuart, J.M., Segal, E., Koller, D. and Kim, S.K. (2003). A gene-coexpression network for global discovery of conserved genetic modules. Science 302, 249-255.
Tikunov, Y., Lommen, A., de Vos, C.H., Verhoeven, H.A., Bino, R.J., Hall, R.D. and Bovy, A.G. (2005). A novel approach for nontargeted data analysis for metabolomics. Large-scale profiling of tomato fruit volatiles. Plant Physiol 139, 1125-1137.
Vorst, O., de Vos, C.H.R., Lommen, A., Staps, R.V., Visser, R.G.F., Bino, R.J. and Hall, R.D. (2005). A non-directed approach to the differential analysis of multiple LC-MS-derived metabolic profiles. Metabolomics 1, 169-180.


Windsor, A.J., Reichelt, M., Figuth, A., Svatos, A., Kroymann, J., Kliebenstein, D.J., Gershenzon, J. and Mitchell-Olds, T. (2005). Geographic and evolutionary diversification of glucosinolates among near relatives of Arabidopsis thaliana (Brassicaceae). Phytochemistry 66, 1321-1333.
Wink, M. (1988). Plant breeding: importance of plant secondary metabolites for protection against pathogens and herbivores. Theor Appl Genet 75, 225-233.
Yvert, G., Brem, R.B., Whittle, J., Akey, J.M., Foss, E., Smith, E.N., Mackelprang, R. and Kruglyak, L. (2003). Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet 35, 57-64.
Zhu, J., Lum, P.Y., Lamb, J., GuhaThakurta, D., Edwards, S.W., Thieringer, R., Berger, J.P., Wu, M.S., Thompson, J., Sachs, A.B. et al. (2004). An integrative genomics approach to the reconstruction of gene networks in segregating populations. Cytogenet Genome Res 105, 363-374.

Chapter 5

Integrative analyses of genetic variation in enzyme activities of primary carbohydrate metabolism reveal distinct modes of regulation in Arabidopsis thaliana

Joost J. B. Keurentjes, Ronan Sulpice, Yves Gibon, Jingyuan Fu, Maarten Koornneef, Mark Stitt and Dick Vreugdenhil

ABSTRACT

Plant primary carbohydrate metabolism is complex and flexible, and is regulated at many levels. Changes of transcript levels do not always lead to changes in enzyme activities, and these may not always affect metabolite levels and fluxes. To analyze interactions between these three levels of function, we have performed parallel genetic analyses of 15 enzymatic activities involved in primary carbohydrate metabolism, the transcript levels for their encoding structural genes, and their substrate and product metabolites, as well as a number of other related metabolites. Quantitative analyses of each trait were performed in the Arabidopsis Ler x Cvi recombinant inbred line (RIL) population and subjected to correlation and quantitative trait locus (QTL) analysis. Specific regulation was often accompanied by correlations between traits, possibly due to developmental control affecting several genes, enzymes, or metabolites. For a number of enzymes, activity QTLs co-localized with expression QTLs (eQTLs) of their structural genes, or with metabolite accumulation QTLs of their substrate and product. However, regulation often occurred through multiple loci, due both to post-transcriptional control and to cis- and trans-acting transcriptional control of structural genes, as well as independently of the structural genes. Although many of the regulatory processes in primary carbohydrate metabolism remain to be resolved, it is clear that such studies will benefit from the integrative genetic analysis of gene transcription, enzyme activity, and metabolite content. The multiparallel QTL analyses of the various interconnected transducers of biological information flow, described here for the first time, can assist in determining the cause and consequences of genetic regulation at different levels of complex biological systems.


INTRODUCTION

Carbon is probably the most prevalent and important element in any life form. Unlike most other organisms, which are dependent on uptake of organic forms of carbon, plants fix inorganic carbon through photosynthesis. Upon fixation, most of the inorganic carbon is converted into sucrose, which then acts as the major source of organic carbon for further metabolism. Some of the fixed carbon is temporarily stored as starch, and remobilized at night to support respiration and continued synthesis and export to other tissues. To meet the various demands of a growing plant for specific purposes, carbohydrates need to be allocated within the plant, and converted into a plethora of compounds (Koch, 2004). Carbohydrate metabolism in plants is more complex than in most other organisms; for example, there are alternative routes for the mobilization and metabolization of diverse components (Carrari et al., 2003). Furthermore, depending on the tissue, part or all of the glycolytic pathway is present in the plastid as well as the cytosol (Lunn, 2007). Moreover, most enzymes in plant central metabolism are encoded by small gene families (The Arabidopsis Genome Initiative, 2000; Martienssen, 2000). As a result, a given substrate may be converted into different products, and products can be formed from different substrates. This versatility of enzymatic reactions in combination with substrate competition enables different metabolic routes and creates a dense metabolic network with short pathway lengths. Perturbations in subparts of the network can therefore have strong consequences for other parts and even affect plant growth and development (Sturm and Tang, 1999; Roessner et al., 2001; Fernie et al., 2002). The complexity of the metabolic network may allow the plant to compensate for disturbance in one route by enhancing the flux through an alternative route (Rontein et al., 2002). To ensure a balanced carbon allocation throughout a plant's life cycle, a strong and tight regulation is therefore essential. At the same time, this complexity means that there may be considerable redundancy, at least under standardized growth conditions. There are several reports where major changes in the expression of individual enzymes lead to little change in metabolism (e.g. Neuhaus et al., 1989).

Given the huge diversity in plant species, with large differences in their energy metabolism, growth and storage of reserves, it can be expected that there will be considerable variation in primary carbohydrate metabolism between species, and most likely also within species. For a thorough understanding of the role of natural variation in plant metabolism and development it is of pivotal importance to identify the genetic basis of variation in metabolic pathways and processes within species. The identification of genes affecting metabolic processes

100ȱ Integrativeȱanalysesȱofȱgeneticȱvariationȱinȱenzymeȱactivityȱ

might also increase our knowledge about the regulatory control of pathways in general.

The genetic control of primary carbohydrate metabolism is highly complex because many biochemical steps are involved, together with environmental and developmental factors. The finding, in Arabidopsis, that large differences in many enzyme activities and metabolite contents exist between accessions (Mitchell-Olds and Pedersen, 1998; Cross et al., 2006), growing conditions (Gibon et al., 2006; Morcuende et al., 2007; Osuna et al., 2007), developmental stages (Meyer et al., 2007), time of day (Gibon et al., 2004b), and tissues (Sergeeva et al., 2004, 2006) illustrates this complexity. Cross et al. (2006) analyzed 24 Arabidopsis accessions for biomass production, metabolite content, and enzyme activity. Positive correlations were observed between biomass, enzyme activities, and carbohydrates. Further evidence for developmental control of plant metabolism is derived from a study by Meyer et al. (2007). The authors used GC-MS metabolic profiling of the Col x C24 RIL population in parallel with biomass determinations. No strong correlations between individual metabolites and biomass production could be observed, but a strong canonical correlation was observed when all metabolites were taken into account. Among the metabolites contributing most to the observed correlation were intermediates of the hexose phosphate pool: fructose-6-phosphate, glucose-6-phosphate, and glucose-1-phosphate. Both positive and negative correlations between biomass and metabolites were observed, although the large majority of metabolites, including sucrose, hexose phosphates and members of the TCA cycle, showed negative correlations. This, and the results of Cross et al. (2006), indicates that high rates of biomass production deplete pools of metabolites resulting in higher enzyme activities, as was also concluded from the relationship between tomato fruit size and metabolite content (Schauer et al., 2006). Natural variation in, and spatial and temporal control of, primary carbohydrate metabolism, therefore, suggest a tight relationship with plant development, although it is difficult to assess cause and consequence and regulation is highly complex.

Natural variation can be effectively analyzed in mapping populations, offering the possibility of locating genetic factors causal for the observed variation (Koornneef et al., 2004). Although genetics has been successfully used to analyze quantitative variation in plant metabolism (Causse et al., 1995; Mitchell-Olds and Pedersen, 1998; Prioul et al., 1999; Hirel et al., 2001; Rauh et al., 2002; Loudet et al., 2003; Fridman et al., 2004; Harrison et al., 2004; Sergeeva et al., 2004, 2006; Calenge et al., 2006; Keurentjes et al., 2006; Schauer et al., 2006), most studies addressed only a limited number of enzymes or metabolites, and did not integrate this with information about changes in transcript levels. Given the strong interdependency of enzyme activities and metabolites, genetic studies can benefit enormously from


multidisciplinary approaches (Fiehn et al., 2001; Winnacker, 2003). To gain insight into connectivity in metabolic networks it is therefore advisable to analyze as many enzymes and metabolites involved in such a network as possible. The parallel analysis of gene expression would further enhance our understanding of genetic regulation (Urbanczyk-Wochniak et al., 2003; Gachon et al., 2005; Hirai et al., 2005; Gibon et al., 2006).

In the present study, we analyzed the activity of 15 different enzymes involved in primary carbohydrate metabolism as well as the transcript levels for their structural genes, in parallel with quantification of the most important carbohydrates and related metabolites in the Landsberg erecta (Ler) x Cape Verde Islands (Cvi) recombinant inbred line (RIL) population of Arabidopsis thaliana (Alonso-Blanco et al., 1998). RIL populations offer unique possibilities for such integrative studies because different types of experiments can be performed in replicates on the same genotypes. Furthermore, a large number of genetic perturbations segregate in populations derived from crosses of distinct accessions. A relatively large set of lines can then be analyzed for correlations between traits as well as for quantitative trait loci (QTLs) controlling variation observed for these traits. The advantage of Arabidopsis is that the genome has been sequenced (The Arabidopsis Genome Initiative, 2000) and genes have been (putatively) annotated for nearly all enzymes in primary metabolism (The Arabidopsis Information Resource at http://www.arabidopsis.org/), allowing analysis of transcriptional regulation of these genes.

We show that genetically controlled variation exists for the activity of many enzymes as well as for transcript levels of their structural genes and for the metabolites they interconvert. By comparing the localization and responses of structural genes encoding the enzymes, eQTLs for their transcript levels, and QTLs for enzyme activities and metabolite contents, we demonstrate that genetically controlled regulation occurs through different modes of action and at multiple levels.


RESULTS

Natural variation in primary carbohydrate metabolism
To determine the extent of natural variation in primary carbohydrate metabolism in Arabidopsis we analyzed a Recombinant Inbred Line (RIL) population of a cross between the two distinct accessions Landsberg erecta (Ler) and Cape Verde Islands (Cvi) (Alonso-Blanco et al., 1998). Metabolic conversion rates attributable to enzyme activity were established for 15 specific enzymatic reactions in parallel with determinations of pools of metabolic carbon sources (Table 1, Figure 1).

Figure 1: Enzymatic conversions in primary carbohydrate metabolism.
Reactions are given in the biologically most relevant direction, although several enzymes can catalyze reversible reactions. Metabolites are depicted in solid typeface and converting enzymes are depicted in shaded typeface on the right side of arrows.


Table 1: Summary of enzymes and metabolites analyzed and the abbreviations used. Reactions are given in the direction in which they were assayed, although several enzymes can also catalyze the reverse reactions.

Trait | Full name | Reaction
----- | --------- | --------
Inv | Acid soluble invertase, vacuolar | Sucrose + H2O → α-D-glucose + fructose
AGP | ADP-glucose pyrophosphorylase | ADP-D-glucose + PPi → α-D-glucose-1-phosphate + ATP
FBP | Fructose-1,6-bisphosphate phosphatase, cytosolic isoform | Fructose-1,6-bisphosphate + H2O → D-fructose-6-phosphate + Pi
G6PDH | Glucose-6-phosphate 1-dehydrogenase | β-D-glucose-6-phosphate + NADP+ → D-glucono-δ-lactone-6-phosphate + NADPH
PFK | ATP dependent phosphofructokinase | D-fructose-6-phosphate + ATP → fructose-1,6-bisphosphate + ADP
PFP | Pyrophosphate: fructose-6-phosphate 1-phosphotransferase | D-fructose-6-phosphate + PPi → fructose-1,6-bisphosphate + Pi
PGM | Phosphoglucomutase | α-D-glucose-1-phosphate → α-D-glucose-6-phosphate
PGI | Phosphoglucose isomerase, cytosolic and plastidial isoforms | D-fructose-6-phosphate → β-D-glucose-6-phosphate
SPS | Sucrose phosphate synthase | D-fructose-6-phosphate + UDP-D-glucose → sucrose-6-phosphate + UDP
SuSy | Sucrose synthase | Sucrose + UDP → UDP-D-glucose + fructose
GK | Glucokinase | α-D-glucose + ATP → α-D-glucose-6-phosphate + ADP
FK | Fructokinase | Fructose + ATP → D-fructose-6-phosphate + ADP
UGP | UDP-glucose pyrophosphorylase | UDP-D-glucose + PPi → α-D-glucose-1-phosphate + UTP
Rubisco | Ribulose bisphosphate carboxylase/oxygenase, initial and upon max activation | H2O + CO2 + D-ribulose-1,5-bisphosphate → 2 3-phosphoglycerate + 2 H+
ChlA | Chlorophyll A |
ChlB | Chlorophyll B |
AA | Total amino acids |
Protein | Total protein content |
Starch | Starch |
Suc | Sucrose |
Glu | Glucose |
Fru | Fructose |
G1P | α-D-glucose-1-phosphate |
G6P | α-D-glucose-6-phosphate |
UDPG | UDP-D-glucose |

104ȱ Integrativeȱanalysesȱofȱgeneticȱvariationȱinȱenzymeȱactivityȱ

Considerable variation was observed within the population for most of the analyzed traits, with coefficients of variation (CV) ranging from 13.7% (ChlA) to 54.2% (GK) (Table 2). In general, CV values were higher for enzyme activity measurements than for contents of metabolites. A substantial part of the observed variation could be attributed to genetic factors, as concluded from QTL analyses. Significant QTLs were detected for ten of the enzyme activity traits and for nine metabolite traits (Table 2, Figure 2). In a number of cases, multiple QTLs were detected, sometimes with opposite effects, explaining the large variation and transgression that was observed, although in general the overall effect of QTLs was in concordance with the phenotypic differences observed between the parents. Very few co-locating QTLs were detected for the different enzyme activities, where co-location is defined as an overlap in 2 Mbp support intervals, even though several of them are from the same or related pathways (Table 2, Figure 3). Co-location of QTLs was more often the case for metabolic content due to the higher number of QTLs detected for the metabolic traits.

Despite this evidence for strong independent regulation, suggested by the detection of trait-specific QTLs, a positive correlation was observed between the activity levels of all the enzymes analyzed when the values were compared across all the RILs (Figure 4). There was also a positive correlation between many enzyme activities and the structural metabolites protein and chlorophyll. A weaker positive correlation was observed between many enzyme activities and sucrose, amino acids, and starch, and a weak negative correlation with reducing sugars. This group of metabolites represents the end products of photosynthesis and the primary compounds resulting from nitrogen incorporation. They are exported to the remainder of the plant or, in the case of starch, temporarily stored in the leaf and remobilized for export in the night. Stronger negative correlations were observed with intermediates of metabolic pathways, such as glucose-1-phosphate, glucose-6-phosphate, and UDP-glucose. These findings suggest that higher enzyme activities may allow higher fluxes, while lowering the levels of the intermediary substrates in the pathways.
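The coefficient of variation used above to compare traits can be sketched as follows. This is a minimal illustration, assuming that CV is the sample standard deviation divided by the mean and expressed as a percentage; the function name and the example values are hypothetical and are not taken from the thesis data.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return stdev(values) / m * 100.0


# Hypothetical trait values for a handful of lines (arbitrary units).
activity = [8.0, 10.0, 12.0, 9.0, 11.0]
print(round(coefficient_of_variation(activity), 1))
```

Because the CV is scaled by the mean, it allows traits measured in different units (enzyme activities and metabolite contents) to be compared on one scale.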


Table 2: Genetic analyses of analyzed traits. The second to eighth column represent, respectively, the coefficient of variation for trait values within the RIL population, the chromosome number on which a QTL was detected, the position of the QTL on the chromosome in Mbp, the LOD score, percentage explained variance and direction of effect (+, Ler > Cvi; -, Ler < Cvi) of the QTL and the 2Log ratio of trait values for the parental accessions. PC1-8, principal components.

Trait           CV    Chr.  Mb     LOD    %Expl. Var  Effect  2Log Ler/Cvi
Inv             29.1  1     4.1    5.3    13.7        -       -0.13
AGP             21.3  4     12.4   3.1    8.0         +       -0.02
FBP             34.7  5     14.0   3.5    9.6         -       -1.00
G6PDH           39.0                                          -0.94
PFK             32.2                                          -0.38
PFP             26.0                                          0.36
PGM             37.8  1     26.9   16.0   17.5        +       -0.37
                      5     20.9   36.4   56.3        -
PGI(cyt)        22.8  1     16.8   3.1    6.8         +       0.35
                      2     11.2   5.4    12.7        +
                      5     17.2   4.0    8.9         +
PGI(pla)        22.9  5     16.7   3.1    8.4         -       0.33
PGI(tot)        15.5  1     14.9   3.2    8.8         +       0.34
SPS             20.6  5     7.0    6.4    18.0        +       0.36
SuSy            29.8                                          0.07
GK              54.2                                          ND
FK              47.8  5     16.6   3.6    9.4         -       ND
UGP             21.8  3     0.8    17.1   37.8        -       0.12
                      5     5.2    5.1    9.3         +
Rubisco(ini)    24.9                                          0.16
Rubisco(max)    20.9  3     20.5   3.1    9.0         +       0.21
Rubisco(ratio)  33.2                                          -0.50
ChlA            13.7  2     11.2   3.7    7.4         +       0.43
                      3     0.3    3.4    6.8         +
                      4     10.6   3.4    6.7         +
                      5     1.7    3.8    7.6         +
ChlB            14.0                                          0.32
AA              15.0  2     8.5    5.3    8.9         -       -0.53
                      2     16.2   3.9    6.2         -
                      3     0.3    4.7    7.5         +
                      4     13.9   5.1    8.6         -
                      5     14.0   4.1    6.6         -
Protein         14.2  2     12.9   3.2    7.6         +       0.35
                      3     7.4    3.2    7.6         +
Starch          17.8                                          -0.04
Suc             15.2  3     15.6   3.4    8.5         -       0.39
                      3     23.3   5.8    15.1        +
Glu             20.4  1     4.9    8.5    19.2        -       0.10
                      2     11.2   4.4    9.1         -
                      3     13.0   5.8    13.8        -


Table 2: Continued.

Trait           CV    Chr.  Mb     LOD    %Expl. Var  Effect  2Log Ler/Cvi
Fru             19.4  1     5.4    5.0    10.9        -       0.03
                      3     7.9    11.7   27.5        +
                      3     13.0   6.2    15.3        -
G1P             32.7  3     0.3    4.5    12.1        -       -0.56
                      5     7.2    3.3    8.8         +
G6P             35.8  3     1.3    4.0    13.0        -       -0.38
UDPG            24.7  3     0.8    35.9   64.9        -       -0.71

PC1                   2     11.2   4.7    11.6        +
PC2                   3     0.3    28.2   54.6        -
PC3                   1     4.4    4.7    13.0        -
PC4
PC5                   5     8.6    4.1    11.9        -
PC6                   3     7.0    7.1    19.0        +
PC7                   5     18.2   10.8   28.5        -
PC8                   5     1.3    4.2    11.9        +
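The last column of Table 2 gives the parental difference as a 2Log (base-2 logarithm) ratio. A short sketch, assuming the ratio is computed as log2(Ler/Cvi), shows how such a value translates back into a fold difference; the function names are illustrative only.

```python
import math


def log2_ratio(ler_value, cvi_value):
    """Base-2 logarithm of the Ler/Cvi ratio of two positive trait values."""
    return math.log2(ler_value / cvi_value)


def fold_difference(log2_value):
    """Convert a 2Log ratio back into the Ler/Cvi fold difference."""
    return 2.0 ** log2_value


# FBP has a 2Log Ler/Cvi of -1.00 in Table 2: Ler activity is half that of Cvi.
print(fold_difference(-1.00))
# UDPG has -0.71: Ler content is roughly 0.61 times that of Cvi.
print(round(fold_difference(-0.71), 2))
```

On this scale a value of 0 means equal parental values, and equal fold changes in either direction give values of equal magnitude and opposite sign.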


Figure 2: Heatmap of QTL profiles of each analyzed trait. Shading intensities represent LOD scores. Positive effect loci are projected in decreasing intensity and negative effect loci in increasing intensity. Chromosomal borders are indicated by vertical shaded lines and the position of structural genes for the enzyme by triangles. Transcriptional regulation of structural genes is indicated by shading intensity of the triangles: solid, local eQTL; shaded, distant eQTL; open, no eQTLs detected or gene not analyzed.


Figure 3: QTL co-location network of analyzed genes, enzymes and metabolites. Edges between planes represent, respectively: between genes and enzymes: solid, position of structural gene co-locating with enzyme activity QTL; dashed, cis-eQTL co-locating with enzyme activity QTL; dotted, trans-eQTL co-locating with enzyme activity QTL; between enzymes and metabolites: solid, enzyme activity QTL co-locating with metabolite content QTL; dashed, enzymes connected to their substrate and/or product metabolites. Solid edges within planes connect traits with co-locating QTLs. Co-location was defined as an overlap in QTL support intervals.
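The co-location criterion used for the network can be sketched as an interval-overlap test. The sketch below assumes that each QTL is represented by its peak position and that a support interval of 2 Mbp is centred on that peak (1 Mbp on each side); the text does not spell out how the intervals were constructed, so the half-width and all names here are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTL:
    trait: str
    chromosome: int
    peak_mbp: float
    half_width_mbp: float = 1.0  # assumed: 2 Mbp interval centred on the peak

    def interval(self):
        """Return the (start, end) of the assumed support interval in Mbp."""
        return (self.peak_mbp - self.half_width_mbp,
                self.peak_mbp + self.half_width_mbp)


def co_locate(a, b):
    """True if both QTLs lie on the same chromosome and their intervals overlap."""
    if a.chromosome != b.chromosome:
        return False
    a_start, a_end = a.interval()
    b_start, b_end = b.interval()
    return a_start <= b_end and b_start <= a_end


# Peak positions taken from Table 2.
ugp = QTL("UGP", 3, 0.8)
udpg = QTL("UDPG", 3, 0.8)
pgm = QTL("PGM", 5, 20.9)
fk = QTL("FK", 5, 16.6)

print(co_locate(ugp, udpg))  # identical peak positions
print(co_locate(pgm, fk))    # peaks more than 4 Mbp apart
```

With real data the support interval would normally come from the QTL mapping itself (for example a LOD drop-off interval) rather than from a fixed width.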


To determine whether we could identify a common factor explaining the observed correlations, we performed a principal component analysis (PCA) on all traits analyzed. For most traits a large part of the variation could be extracted in eight principal components (PC), together explaining 68% of the observed variation (Table 3). By far the best representative of all traits was PC1, which explained over 28% of the variance. Interestingly, in PC1 positive values were obtained for the enzyme activity traits and some end products, while negative values were obtained for the hexose pools, which is in line with the observed correlations between these traits. When the corresponding PC values for the individual RILs were subjected to QTL analysis, a strong QTL for PC1 was observed at 11.2 Mbp on chromosome 2, which corresponds to the position of ERECTA (Table 2). This locus was also identified as a QTL for cytosolic phosphoglucose isomerase activity, chlorophyll A and glucose content. The ERECTA gene is polymorphic between the population’s parental accessions Ler and Cvi (Alonso-Blanco et al., 1998) and causal for many of the morphological and developmental differences observed between these accessions (Torii et al., 1996; Juenger et al., 2005; Masle et al., 2005). Moreover, ERECTA has been shown to exert pleiotropic effects on many growth-related and metabolic traits (El-Lithy et al., 2004; Keurentjes et al., 2006, 2007a). It is therefore conceivable that ERECTA is responsible for a subtle simultaneous regulation of primary carbon metabolism, in parallel with its effects on development. It has been suggested earlier that there may be such links, but without any specific suggestions as to which genes might be involved (Cross et al., 2006; Meyer et al., 2007). Other PCs merely explain variation in a specific subset of traits; e.g. PC2 best explains the variation observed for UDP-glucose pyrophosphorylase, glucose-1-phosphate, glucose-6-phosphate and UDP-glucose. All of these traits show a QTL at the same position at the top of chromosome three, where a QTL for PC2 was also detected (Table 2) (see below for further discussion).


Table 3: Principal component analysis. Columns represent respectively the proportion of variance that could be explained by all components and by each component separately for the different traits analyzed. The last row represents the percentage of explained variance of all traits by all components and by each component separately.

Trait           Extraction   PC1    PC2    PC3    PC4    PC5    PC6    PC7    PC8
Inv              0.44        0.22   0.27   0.41  -0.24   0.06   0.26  -0.17  -0.01
AGP              0.64        0.78   0.06   0.10  -0.05   0.09  -0.01   0.03   0.05
FBP              0.53        0.48   0.21   0.15  -0.04   0.17   0.22   0.31  -0.24
G6PDH            0.59        0.70  -0.11   0.09  -0.06   0.23   0.09  -0.14  -0.10
PFK              0.42        0.56   0.02  -0.04   0.04   0.00   0.28  -0.15  -0.08
PFP              0.82        0.83   0.21  -0.06  -0.11   0.03  -0.05   0.13  -0.21
PGM              0.65        0.54   0.04   0.15  -0.19   0.09  -0.02   0.49   0.20
PGI(cyt)         0.70        0.76   0.12  -0.09   0.01  -0.08   0.13  -0.25  -0.10
PGI(pla)         0.84        0.33  -0.11   0.30  -0.54  -0.19  -0.54  -0.06   0.01
PGI(tot)         0.89        0.71  -0.04   0.16  -0.38  -0.22  -0.34  -0.23  -0.01
SPS              0.58        0.65   0.30   0.11   0.07  -0.15  -0.13  -0.04  -0.06
SuSy             0.35        0.45   0.14  -0.01  -0.12  -0.01   0.15  -0.05  -0.30
GK               0.51        0.60   0.02   0.02  -0.31   0.04   0.17  -0.11   0.06
FK               0.54        0.49  -0.23   0.06  -0.28   0.15   0.17   0.31   0.13
UGP              0.72        0.51   0.57   0.16   0.25   0.05  -0.18   0.08  -0.04
Rubisco(ini)     0.91        0.51  -0.20   0.10   0.33   0.53  -0.41  -0.16   0.13
Rubisco(max)     0.73        0.54   0.01   0.07   0.40  -0.29  -0.37  -0.10   0.21
Rubisco(ratio)   0.93        0.09  -0.24   0.05   0.02   0.91  -0.10  -0.10  -0.03
chlA             0.83        0.73  -0.24  -0.14   0.20  -0.15   0.25  -0.02   0.32
chlB             0.78        0.68  -0.19  -0.17   0.11  -0.05   0.36  -0.06   0.33
AA               0.70        0.13  -0.52  -0.01   0.13  -0.08  -0.25   0.51  -0.26
Protein          0.74        0.80  -0.13  -0.10   0.14  -0.10   0.06   0.02   0.18
Starch           0.59        0.55  -0.25   0.02   0.31  -0.18  -0.10   0.15  -0.25
Suc              0.70        0.24  -0.23   0.50   0.48  -0.11   0.19   0.02  -0.24
Glu              0.86       -0.39  -0.30   0.78   0.01  -0.11   0.06  -0.03   0.00
Fru              0.79       -0.27  -0.39   0.68  -0.01  -0.03   0.22  -0.07   0.23
G1P              0.69       -0.14   0.57   0.13  -0.01   0.04   0.00   0.39   0.43
G6P              0.48       -0.16   0.54   0.22   0.09   0.02   0.23   0.01  -0.23
UDPG             0.70       -0.13   0.69   0.26   0.27   0.09  -0.19  -0.07   0.15

% of Variance   67.82       28.25   9.08   6.64   5.47   5.36   5.25   4.00   3.77
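Two internal consistency checks on Table 3 can be sketched in a few lines. They assume that the PC columns are component loadings and that the Extraction column is the communality, i.e. the sum of squared loadings over the eight retained components; this reading is an assumption based on the values themselves, not something stated in the caption. Because the table is rounded to two decimals, agreement is only expected to within rounding.

```python
# Values copied from Table 3.
agp_loadings = [0.78, 0.06, 0.10, -0.05, 0.09, -0.01, 0.03, 0.05]
agp_extraction = 0.64

pc_variance_percent = [28.25, 9.08, 6.64, 5.47, 5.36, 5.25, 4.00, 3.77]
total_variance_percent = 67.82


def communality(loadings):
    """Sum of squared loadings: share of a trait's variance captured by the PCs."""
    return sum(x * x for x in loadings)


# Communality of AGP next to its tabulated Extraction value.
print(round(communality(agp_loadings), 2), agp_extraction)
# Sum of the per-component percentages next to the tabulated total.
print(round(sum(pc_variance_percent), 2), total_variance_percent)
```

The second check reproduces the 68% quoted in the text as the sum of the eight per-component percentages.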


Figure 4: Correlation matrix of analyzed enzymes and metabolites. Values and shading intensities represent Spearman rank correlation coefficients between two traits.

Relationship between structural gene expression and enzyme activity

The structural genes encoding enzymes capable of specific conversions are known for most steps in the metabolic pathways of primary carbohydrate metabolism in Arabidopsis. As noted in the introduction, in most cases multiple genes have been annotated. This redundancy in structural genes possibly results from a number of genome duplications during the evolutionary history of Arabidopsis (The Arabidopsis Genome Initiative, 2000). Empirical evidence for biological activity exists only for a limited number of genes, although for many, two or more genes


may be needed as a minimum to encode the enzymes in different tissues and subcellular compartments. Many of the annotations are based on homology with genes with known biological activity, but functional analyses have not been performed. Furthermore, homologous and paralogous genes might have lost or modified functions, or their expression pattern might have changed.

Several cases were found where the position of structural genes co-locates with QTLs for activity of their encoded enzymes (Figure 2, Table 4), including invertase, phosphoglucomutase, phosphoglucose isomerase, sucrose phosphate synthase, and UDP-glucose pyrophosphorylase. In these cases, the variation observed in enzyme activity is most likely to be due to polymorphisms in the encoding structural genes. Such polymorphisms may occur (i) in the coding region of genes, leading to an alteration of the specific activity or stability, or (ii) in promoter regions that affect transcription efficiency and subsequently protein levels. In the former case the changes of activity should be independent of changes of the transcript levels, whereas in the latter case they will be accompanied by qualitatively similar changes of transcript levels. To distinguish between these possibilities, we analyzed transcript levels for all of the putative structural genes, in parallel with the aforementioned enzyme activity assays. Samples were analyzed on full genome arrays (Keurentjes et al., 2007b); signal intensities for each RIL were used to calculate the correlation coefficient between individual transcript levels and enzyme activities, and signal ratios of pairs of RILs on the same slide were used for QTL analyses.

Only a weak to medium correlation between enzyme activities and the transcript levels of the putative structural genes was observed (Table 4; see later for a discussion of possible reasons). However, in some cases significant correlations were found. The strongest correlations were observed for structural genes co-locating with enzyme activity QTLs, indicating that part of the variation observed in enzyme activity can be explained by differential expression of structural genes. This is further supported by the fact that nearly all correlations of transcript levels of these genes with enzyme activities were positive. The only exception was a small non-significant negative correlation of a phosphoglucomutase gene (At1g70820). Negative correlations possibly result from phase shifts in transcription and translation, although other explanations are also possible (see discussion).
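The rank correlation between transcript level and enzyme activity referred to here can be sketched in plain Python. This is a minimal illustration of the Spearman coefficient (the Pearson correlation of average ranks), not the software used in the study; the function names and the six data points are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical transcript signal and enzyme activity for six lines.
transcript = [1.2, 2.5, 2.0, 3.1, 4.0, 3.6]
activity = [10.0, 14.0, 15.0, 13.0, 22.0, 19.0]
print(round(spearman(transcript, activity), 2))
```

A rank-based coefficient only asks whether lines with higher transcript levels also tend to have higher activity, so it does not require the relationship to be linear.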


Table 4: Statistics of structural genes. Columns represent respectively the encoded enzymes, the AGI gene codes of structural genes, the position of the structural gene on the chromosome indicated in the AGI code, the Spearman rank correlation coefficient between enzyme activity and gene transcript levels, the P-value of the correlation coefficient, the chromosome number and, in parentheses, the position in Mbp, the LOD score, and the direction of effect (+, Ler > Cvi; -, Ler < Cvi) of detected eQTLs. Genes and eQTL positions in boldface co-locate with QTLs detected for enzyme activity. When more than one eQTL was detected, positions, LOD scores, and effects of the different eQTLs are separated by a semicolon. NA, Not Analyzed; NS, Not Significant.

Enzyme    Gene       Mb     R      P        eQTL                      LOD             Effect
Inv       At1g12240  4.15    0.19  1.8E-02  1(4.1)                    6.4             -
          At1g62660  23.20  -0.06  4.8E-01  1(7.9); 3(20.0)           3.7; 3.3        -; -

AGP       At1g27680  9.63   -0.08  3.2E-01  1(10.2)                   3.0             -
          At1g05610  1.67   -0.04  6.1E-01  1(28.8); 3(20.5)          4.0; 3.5        -; +
          At1g74910  28.14   0.24  3.2E-03  1(22.3); 1(26.4)          3.7; 4.0        +; +
          At2g04650  1.62    0.03  7.4E-01  NS
          At2g21590  9.25   -0.02  8.5E-01  NS
          At3g03250  0.75   -0.26  1.1E-03  1(12.5); 3(1.4); 3(20.5)  3.2; 15.1; 3.2  -; -; -
          At4g39210  18.26  -0.20  1.4E-02  3(18.6)                   3.5             -
          At5g17310  5.70   -0.23  3.9E-03  3(4.1)                    9.6             -
          At5g19220  6.46    0.23  3.4E-03  5(8.1)                    6.2             +
          At5g48300  19.59   0.15  6.2E-02  NS

FBP       At1g43670  16.47   0.01  9.1E-01  1(12.2)                   3.1             -
          At3g54050  20.03  -0.15  7.2E-02  NS
          At5g64380  25.76   0.03  7.5E-01  5(22.4)                   4.3             -

G6PDH     At1g09420  3.04   -0.06  4.5E-01  1(3.1)                    4.8             -
          At1g24280  8.61    0.31  8.4E-05  2(6.9)                    3.1             +
          At3g27300  10.08   0.17  3.7E-02  4(0.3)                    3.5             -
          At5g13110  4.16   -0.02  7.6E-01  NS
          At5g35790  13.97   0.12  1.3E-01  4(0.3); 4(13.9); 5(16.7)  3.1; 3.2; 4.7   -; -; -
          At5g40760  16.33   0.06  4.9E-01  5(16.7)                   8.9             +

PFK       At1g43766  16.55   NA
          At1g59810  22.01   NA
          At2g22480  9.55    0.13  1.1E-01  1(18.0); 2(18.3); 5(2.5)  3.7; 4.9; 3.6   +; +; -
          At4g26270  13.30   0.25  1.4E-03  2(10.0); 2(11.2)          3.1; 3.5        +; +
          At4g29220  14.40  -0.08  3.4E-01  NS
          At5g03300  0.80    0.10  2.0E-01  5(0.8)                    21.7            +
          At5g47810  19.37   0.04  6.4E-01  NS
          At5g56630  22.94  -0.01  9.0E-01  NS
          At5g61580  24.78   0.09  2.8E-01  NS


Table 4: Continued.

Enzyme    Gene       Mb     R      P        eQTL                      LOD             Effect
PFP       At1g12000  4.05    0.01  9.3E-01  NS
          At1g20950  7.30    0.01  9.3E-01  NS
          At1g76550  28.73   0.35  9.5E-06  NS
          At2g05150  1.86    NA
          At4g04040  1.94   -0.15  6.3E-02  1(3.8)                    3.5             +
          At4g08876  5.68    NA
          At4g32840  15.84   0.27  7.7E-04  NS

PGM       At1g23190  8.22    0.11  1.7E-01  NS
          At1g70730  26.67   0.04  6.3E-01  NS
          At1g70820  26.71  -0.13  1.1E-01  1(28.0)                   5.6             -
          At5g17530  0.58    NA
          At5g51820  21.08   0.69  4.4E-23  5(1.7); 5(21.0)           7.4; 36.6       +; -

PGI(Cyt)  At1g30560  10.82  -0.02  8.4E-01  4(6.6); 4(10.6)           3.4; 3.4        +; +
          At4g25220  12.92   0.32  4.6E-05  2(11.2)                   3.1             +
          At5g42740  17.15   0.19  1.6E-02  NS

PGI(Pla)  At4g24620  12.71  -0.17  3.0E-02  NS

PGI(Tot)  At1g30560  10.82   0.01  9.3E-01  4(6.6); 4(10.6)           3.4; 3.4        +; +
          At4g24620  12.71  -0.23  3.5E-03  NS
          At4g25220  12.92   0.15  5.8E-02  2(11.2)                   3.1             +
          At5g42740  17.15   0.17  3.1E-02  NS

SPS       At1g04920  1.39    0.10  2.2E-01  NS
          At1g16570  5.67    0.12  1.3E-01  NS
          At4g10120  6.31   -0.08  3.1E-01  4(6.2)                    7.0             +
          At5g11110  3.54    0.13  1.2E-01  5(3.7)                    4.5             +
          At5g20280  6.84    0.23  4.2E-03  5(7.2)                    9.2             +

SuSy      At1g73370  27.59   0.16  4.0E-02  5(14.0)                   8.2             -
          At3g43190  15.19  -0.07  4.0E-01  NS
          At4g02280  0.99    0.07  3.8E-01  NS
          At5g20830  7.05    0.10  2.1E-01  NS
          At5g37180  14.74   0.14  7.7E-02  NS
          At5g49190  19.96   0.27  5.5E-04  NS


Table 4: Continued.

Enzyme    Gene       Mb     R      P        eQTL                      LOD             Effect
GK        At1g30660  10.88   0.18  2.5E-02  NS
          At1g47840  17.62   0.04  6.5E-01  1(16.8)                   3.8             -
          At1g50460  18.70   0.02  7.9E-01  1(18.0)                   17.0            -
          At2g19860  8.58    0.10  2.3E-01  NS
          At3g20040  6.99    0.22  6.3E-03  NS
          At4g29130  14.35   0.07  4.0E-01  NS
          At4g37840  17.79   0.19  2.0E-02  NS

FK        At1g06020  1.82   -0.09  2.7E-01  NS
          At1g06030  1.83    0.01  8.8E-01  NS
          At1g30660  10.88   0.17  3.1E-02  NS
          At1g47840  17.62  -0.11  1.6E-01  1(16.8)                   3.8             -
          At1g50390  18.67   NA
          At1g50460  18.70  -0.09  2.9E-01  1(18.0)                   17.0            -
          At1g66430  24.78  -0.11  1.6E-01  1(28.8); 2(16.8)          3.4; 3.5        -; -
          At1g69200  26.02  -0.07  4.0E-01  NS
          At2g19860  8.58   -0.03  7.2E-01  NS
          At2g31390  13.39  -0.15  7.2E-02  2(12.5)                   5.0             -
          At3g20040  6.99    0.22  5.3E-03  NS
          At3g54090  20.04   0.26  1.2E-03  3(11.0)                   3.3             +
          At3g59480  21.99   0.05  5.6E-01  NS
          At4g10260  6.37    NA
          At4g29130  14.35   0.18  2.5E-02  NS
          At4g37840  17.79   0.19  1.7E-02  NS
          At5g51830  21.09   0.27  5.8E-04  5(21.0)                   26.4            -

UGP       At3g03250  0.75    0.41  1.3E-07  1(12.5); 3(1.4)           4.5; 43.9       -; -
          At5g17310  5.70    0.42  7.1E-08  1(12.5); 3(1.9)           4.6; 30.3       -; -

Rubisco   At1g34630  12.69   0.19  1.9E-02  1(13.4)                   3.0             -
          At1g67090  25.05  -0.03  7.3E-01  NS
          At5g38410  15.39   0.02  8.4E-01  NS
          At5g38420  15.40   NA
          At5g38430  15.40   0.05  5.2E-01  NS
          At5g58240  23.58   0.20  1.3E-02  NS

We next subjected the observed transcript levels of the structural genes to QTL analyses. For each encoded enzyme, we found significant QTLs for at least one of the encoding structural genes (eQTLs) (Table 4). Both local and distant regulation was observed, as judged from the position of genes and their respective eQTLs; locally observed eQTLs indicate that regulation occurs in cis, whereas distant eQTLs suggest that regulation occurs in trans (Rockman and Kruglyak, 2006).


Examples of strong local regulation include UDP-glucose pyrophosphorylase (At3g03250), phosphoglucomutase (At5g51820), phosphofructokinase (At5g03300), and hexokinase (At1g50460). As noted above, enzyme activity correlated with the transcript level for several of these genes. Moreover, strong local regulation of structural genes co-locating with a QTL for activity of their encoded enzyme was observed [e.g. invertase (At1g12240), phosphoglucomutase (At1g70820 and At5g51820), sucrose phosphate synthase (At5g20280), and UDP-glucose pyrophosphorylase (At3g03250)]. The only exception was a structural gene for UDP-glucose pyrophosphorylase (At5g17310), which showed strong distant regulation. These findings again suggest that cis-regulatory variation in expression of structural genes is at least partly responsible for observed variation in enzyme activity.

In other cases, both locally and distantly acting significant eQTLs were found for structural genes that did not co-locate with QTLs for enzyme activity, even though significant correlation was sometimes observed between transcript levels of these genes and enzyme activity. In the case of cytosolic phosphoglucose isomerase, a trans-acting eQTL for a structural gene (At4g25220) co-locates with a QTL for enzyme activity (Table 4). Moreover, of all genes annotated as a phosphoglucose isomerase, the transcript levels of this gene showed the highest correlation with enzyme activity. This indicates that trans-acting regulatory variation in structural gene transcription can also explain variation observed in enzyme activity. For two structural genes co-locating with the activity QTLs of their encoded enzymes (viz. At1g70730, phosphoglucomutase, and At5g42740, cytosolic phosphoglucose isomerase), no significant eQTL was observed.

Finally, both locally and distantly acting significant eQTLs were detected for structural genes without coinciding positions of genes and activity QTLs or co-locating (e)QTLs, and for which no significant correlation between transcript level and enzyme activity was found. These findings suggest that not all annotated genes actually contribute to the observed activity of the putatively encoded enzyme and might serve other functions independently regulated from their current annotation. However, our results do not exclude other explanations, such as spatial and temporal control, post-transcriptional and (post-)translational regulation, additive effects of multiple genes, and temporal shifts between transcription and translation (see also discussion).
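The distinction between local and distant eQTLs used in this section can be sketched as a simple position comparison. The 2 Mbp window below is an assumed threshold for illustration only (the text does not give the cut-off that was used), and the function name is hypothetical.

```python
def classify_eqtl(gene_chr, gene_mbp, eqtl_chr, eqtl_mbp, window_mbp=2.0):
    """Label an eQTL 'local' if it maps near the gene itself, else 'distant'.

    window_mbp is an assumed cut-off, not a value taken from the thesis.
    """
    same_chromosome = gene_chr == eqtl_chr
    if same_chromosome and abs(gene_mbp - eqtl_mbp) <= window_mbp:
        return "local"
    return "distant"


# Gene and eQTL positions taken from Table 4.
# At3g03250 (UGP): gene at 0.75 Mbp on chromosome 3, eQTL at 3(1.4).
print(classify_eqtl(3, 0.75, 3, 1.4))
# At5g17310 (UGP): gene at 5.70 Mbp on chromosome 5, eQTL at 3(1.9).
print(classify_eqtl(5, 5.70, 3, 1.9))
# At4g25220 (PGI): gene at 12.92 Mbp on chromosome 4, eQTL at 2(11.2).
print(classify_eqtl(4, 12.92, 2, 11.2))
```

A "local" label is only consistent with cis regulation; a tightly linked trans-acting factor would produce the same pattern, so position alone cannot separate the two.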


Different modes of action in the genetic control of enzymatic activity

Although variation in activity was observed for many of the analyzed enzymes, the strongest genetically controlled variation was found for phosphoglucomutase (PGM) and UDP-glucose pyrophosphorylase (UGP). For these two enzymes, we also investigated substrate and product levels. When combined with the parallel analysis of transcript levels of the structural genes, this offers the opportunity of gaining deeper insight into the mechanisms of genetic regulation of these traits.

For PGM-activity two highly significant QTLs were detected with opposite effects (Figure 5A). One strong activity QTL for PGM was detected at the lower arm of chromosome five, with activity being strongly decreased in Ler genotypes for this locus, compared to Cvi genotypes. This activity QTL co-located with a structural gene for the plastidic PGM (At5g51820; PGM1) (Kofler et al., 2000; Periappuram et al., 2000). QTL analysis of transcript levels of this gene revealed an equally significant eQTL at the identical position of this structural gene and the enzyme activity QTL. Since the direction of the additive effect of both QTLs is also identical, this suggests that cis-regulatory variation in the expression of a structural gene is causal for the observed variation in enzyme activity. The second activity QTL for PGM is located on the lower arm of chromosome 1 and coincides with two putatively annotated structural genes for cytosolic isoforms of PGM (At1g70730 and At1g70820; The Arabidopsis Information Resource). No eQTL could be detected explaining variation in transcript levels of At1g70730, but a minor eQTL was detected explaining transcript level variation of At1g70820. This minor eQTL was located at a similar position as the QTL for PGM-activity, although with an opposite additive effect. There are several alternative explanations why an eQTL and an activity QTL have different signs. One is that a polymorphism in the structural gene leads to increased activity or protein stability, which results in changes of metabolites that weakly repress the transcription of the structural gene (negative feedback). Another is that there are actually two cis polymorphisms, one affecting transcription and one affecting protein function, which interact to regulate the eventual level of enzyme activity. The Ler allele, compared to the Cvi allele, then leads to lower transcript levels but the encoded enzyme shows higher activity for the conversion of G1P into G6P. For At1g70730, functional polymorphisms in the coding sequence alone could explain the observed variation in enzyme activity since no genetically regulated variation in transcript levels was observed for this gene.


Figure 5: QTL profiles and boxplots of PGM related traits. (A) LOD scores plotted against genomic position; the sign of the LOD score is determined by the direction of effect (+, Ler > Cvi; -, Ler < Cvi). Solid line, PGM activity; dotted line, G1P content; dashed line, G6P content; shaded solid line, At1g70820 expression level; shaded dotted line, At5g51820 expression level. Shaded triangles indicate positions of structural genes: 1, At1g70820; 5, At5g51820. (B) Boxplots for four genotypic classes. Each class represents genotypically identical individuals for the two QTLs at chromosome one and five (from left to right: A1A5, A1B5, B1A5, B1B5; A = Ler, B = Cvi). Boxplots show the median, interquartile range, outliers (o) and extreme cases (*) of individual variables. All traits are plotted in arbitrary units.

The levels of substrate and product of PGM were not affected by PGM-activity QTLs (Figure 5B). Although minor QTLs were detected for glucose-1-phosphate (G1P) and glucose-6-phosphate (G6P) content, these did not co-locate with QTLs for PGM-activity, suggesting that the size of the hexose phosphate pool is not determined by flux rates, as catalyzed by PGM, but regulated independently. Note that G1P and G6P are present in the plastid and the cytosol, with larger pools in the cytosol. As the strong PGM-activity QTL is likely to be caused by the


plastidic PGM, quite large changes in the pools of the plastid might not have been seen in the overall measurements.

In contrast, strong co-regulation was observed for the activity of UGP and its metabolite substrate UDP-glucose (UDPG) and to a lesser degree its product G1P. Two QTLs with opposite effect were detected for UGP-activity, each of them co-locating with a putatively annotated structural gene (Figure 6A). The UGP-activity QTL at the top of chromosome three co-locates with the position of the structural gene At3g03250, for which an eQTL with the same direction of effect was detected at the identical position. This suggests that variation in UGP-activity can be explained by cis-regulated differences in transcript levels of At3g03250. The second QTL for UGP-activity maps to the upper arm of chromosome five, and co-locates with the structural gene At5g17310. When the At5g17310 transcript levels were subjected to QTL analysis, a highly significant trans-acting eQTL was detected at the same position as the chromosome three UGP-activity QTL and the At3g03250 eQTL, and with the same direction of effect. This implies that the UGP-activity QTL at chromosome five cannot be explained by transcription differences of At5g17310, but might result from cis polymorphisms in the coding sequence. Instead, transcript level differences of At5g17310 might contribute to the chromosome three UGP-activity QTL. Although the encoded enzyme of the Cvi allele of At5g17310, compared to the Ler allele, might have a lower specific activity, it is much more strongly transcribed in lines carrying the Cvi genotype at the chromosome three locus (Figure 6B). Given the strong homology in sequence and function between At3g03250 and At5g17310, and the fact that for both genes a highly significant eQTL was detected at an identical position, it is likely that they are co-regulated by the same genetic factor. This could imply that At3g03250 is not cis-regulated, as suggested earlier, but, like At5g17310, is regulated in trans by a tightly linked locus.

Interestingly, a QTL for both UDPG- and G1P-content was detected at the chromosome three locus (Figure 6A), each with the same direction of effect as the (e)QTLs for UGP-activity and gene transcript levels. The direction of effect and the position of the G1P QTL can be explained by product accumulation (G1P) upon higher conversion rates of UGP. However, the direction of the highly significant QTL for the substrate UDPG is against expectations, since increasing conversion rates are incompatible with accumulation of substrate (UDPG). It is therefore unlikely that UDPG content is controlled by the activity level of UGP. Instead, we hypothesize that accumulation of UDPG triggers upregulation of the expression of UGP encoding genes, leading to higher enzyme activity and accumulation of G1P.


Figure 6: QTL profiles and boxplots of UGP related traits. (A) LOD scores plotted against genomic position; the sign of the LOD score is determined by the direction of effect (+, Ler > Cvi; -, Ler < Cvi). Solid line, UGP activity; dotted line, UDPG content; dashed line, G1P content; shaded solid line, At3g03250 expression level; shaded dotted line, At5g17310 expression level. Shaded triangles indicate positions of structural genes: 3, At3g03250; 5, At5g17310. (B) Boxplots for four genotypic classes. Each class represents genotypically identical individuals for the two QTLs at chromosome three and five (from left to right: A3A5, A3B5, B3A5, B3B5; A = Ler, B = Cvi). Boxplots show the median, interquartile range, outliers (o) and extreme cases (*) of individual variables. All traits are plotted in arbitrary units.
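The four genotypic classes shown in the boxplots can be formed by grouping lines on their genotype at the two QTL positions. The following is a minimal sketch using made-up trait values and the A = Ler, B = Cvi coding from the caption; the data and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_by_class(records, first_chr, second_chr):
    """Group trait values by two-locus genotype class and return class medians.

    records: iterable of (genotype_at_first_qtl, genotype_at_second_qtl, value),
    with genotypes coded 'A' (Ler) or 'B' (Cvi).
    """
    groups = defaultdict(list)
    for g1, g2, value in records:
        label = f"{g1}{first_chr}{g2}{second_chr}"  # e.g. 'A3B5'
        groups[label].append(value)
    return {label: median(vals) for label, vals in sorted(groups.items())}


# Hypothetical activity values (arbitrary units) for eight lines.
lines = [
    ("A", "A", 4.0), ("A", "A", 5.0),
    ("A", "B", 3.0), ("A", "B", 3.5),
    ("B", "A", 9.0), ("B", "A", 8.0),
    ("B", "B", 7.0), ("B", "B", 6.0),
]
print(median_by_class(lines, 3, 5))
```

Comparing class medians in this way shows whether two QTLs with opposite effects add up or partly cancel in lines carrying different allele combinations.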

121ȱ Chapterȱ5ȱ

DISCUSSION

Natural variation in primary carbohydrate metabolism
Natural diversity provides a rich source of genetic perturbations which has been effectively analyzed for carbohydrate metabolism by quantitative genetics in a number of studies and a variety of plant species (Causse et al., 1995; Eshed and Zamir, 1995; Mitchell-Olds and Pedersen, 1998; Prioul et al., 1999; Chen et al., 2001; Fridman et al., 2004; Sergeeva et al., 2004, 2006; Li et al., 2005; Cross et al., 2006; Schauer et al., 2006). However, most of these studies did not incorporate transcription analysis of relevant genes or even combine enzyme activity and metabolite content measurements. Here we present, for the first time, a comprehensive genetic analysis of all intermediate entities of the path from genotype to phenotype, including gene transcription, enzyme activity, and metabolite content. We have shown that natural variation in primary carbohydrate metabolism is extensive in Arabidopsis. A substantial part of this variation was attributable to genetic regulation, resulting in many QTLs detected for the analyzed traits, including 15 QTLs for the 15 enzyme activities and 23 QTLs for the 11 metabolites analyzed in this study. Many of those QTLs could be explained by genetic variation in structural genes.

Several other studies in Arabidopsis have reported QTL analyses of carbohydrate metabolism traits in RIL populations. Mitchell-Olds and Pedersen (1998) analyzed activities of ten enzymes, among which phosphoglucose isomerase (PGI), phosphoglucomutase (PGM) and fructose-1,6-bisphosphate phosphatase (FBP), in the Col x Ler RIL population. No QTL was found for FBP, in contrast to our findings. For PGI two QTLs were found at other positions than the three loci identified in our study. The single QTL for PGM on chromosome five, however, co-located with one of the QTLs identified in our study. PGM activity was also analyzed in the Ler x Cvi population by Sergeeva et al. (2004), who reported at least three QTLs, of which two co-located with the two QTLs found in our study. In another study by Sergeeva et al. (2006), soluble acid invertase (Inv) activity was analyzed in the Ler x Cvi population, revealing several QTLs, including the one that was confirmed in our analyses.

With respect to metabolite QTLs, amino acid content was analyzed in the Bay-0 x Sha population by Loudet et al. (2003). Similar to our results, a high number of QTLs were detected, of which a few co-located. However, no co-location was observed between the most significant QTLs in both studies. The extracts used in the study of Loudet et al. (2003) were also analyzed for starch, glucose, fructose, and sucrose content (Calenge et al., 2006). Multiple QTLs were detected for each


analyzed trait under the two different environmental conditions that were tested. QTLs for starch content were not detected in our study, possibly due to differences in sampling time point and growth stage. For glucose, fructose, and sucrose multiple QTLs were also detected in our study. However, co-location with QTLs detected by Calenge et al. (2006) was only observed for the strongest QTL for glucose content on chromosome 1 and for a minor QTL for fructose content on chromosome 3. The evident dissimilarities between the different studies might reflect genotypic differences between populations or differences in developmental stage, timing of sampling, or environmental growth conditions. Loudet et al. (2003) and Calenge et al. (2006) showed large differences in regulation of carbohydrate content when plants were grown under different nitrogen supply regimes. Moreover, Sergeeva et al. (2004, 2006) showed organ-specific regulation of enzyme activity. These results illustrate that genetic regulation of primary carbohydrate metabolism is under spatial and temporal control involving a multitude of loci, which can be revealed depending on genotype, environment, development stages, and their mutual interactions.

Nevertheless, residual fractions of variance could often not be explained by detected QTLs due to minor environmental and developmental differences between samples, and sampling and analytical variation. When high fractions of unexplained residual variation are observed this might also reflect the complex regulation of primary carbohydrate metabolism due to the genetic regulation by many QTLs, each with a relatively small effect. Such minor QTLs may fail to pass the QTL significance threshold. Segregation of these small-effect QTLs, however, in addition to possible epistatic interactions, may contribute to transgression and to the large genetic variation that is observed. Another indication of the complex regulation of primary carbohydrate metabolism was the finding that specific QTLs were detected for most analyzed traits. When co-location of QTLs for different traits was observed this might often be due to the direct inter-dependence of the traits. For instance, UDP-glucose pyrophosphorylase converts UDP-glucose into glucose-1-phosphate and all three traits map to a similar position on the genome.

Despite the seemingly specific independent regulation of many traits, indicated by the position of the identified QTLs, there was a striking correlation pattern between many traits. Positive correlations were observed between different enzyme activity levels and between enzyme activities and the structural components, such as chlorophyll and proteins, and weaker correlations with some end products, such as sucrose, starch, and amino acids. Negative correlations, however, were observed between enzyme activities and dynamic (phosphorylated) intermediates of carbohydrate metabolic pathways. These results suggest that in addition to the often specific independent regulation of metabolic pathways a more


general level of regulation is acting on carbohydrate metabolism in plants, which could be related to the growth and developmental status of the plant. Subsequent data analysis suggested developmental differences to be causal for the observed correlations. The principal component best explaining the variation observed for all traits mapped to the position of ERECTA (AT2G26330), a gene well known for its involvement in developmental control of Arabidopsis. The entwinement of plant growth with carbohydrate metabolism was also reported in other studies for enzyme activities (Cross et al., 2006) and metabolite content (Cross et al., 2006; Schauer et al., 2006; Meyer et al., 2007).

Relationship between structural gene expression and enzyme activity
Many metabolic conversions in plants are catalyzed by enzymes and variation in enzymatic activity can have a high impact on metabolic fluxes and metabolite content. It is conceivable that natural variation in enzyme activity is caused by genomic variation in the structural genes encoding these enzymes.

We found strong evidence that natural variation for enzyme activity levels is at least partially regulated by variation in structural genes or regulatory loci controlling the transcription of these genes. First, co-location of structural genes and enzyme activity QTLs suggests natural variation for these genes to be causal for the observed variation in enzyme activity. When cis-acting eQTLs were detected for these genes, regulation is likely to occur on the transcriptional level; otherwise regulation might act post-transcriptionally, possibly due to altered specific activity or protein stability. Secondly, co-location of trans-acting eQTLs for structural genes and enzyme activity QTLs suggests trans-regulatory variation of these genes to be causal for the observed variation in enzyme activity. Such regulation is likely to occur through transcriptional regulation of the structural gene due to variation for a distant regulator. Both cis- and trans-acting transcriptional as well as cis-acting post-transcriptional regulation of structural genes were identified as potential causes for observed variation in enzyme activity. However, for many enzymes, QTLs were also detected which did not co-locate with structural genes or their eQTLs, suggesting that regulation occurs at multiple levels, partly independent of variation in (transcript levels of) structural genes. Likewise, for many structural genes eQTLs were detected which did not co-locate with QTLs for enzyme activity, which often explained the low correlation observed between transcript levels and activity. Apparently variation observed in the transcript levels of these genes does not contribute to the variation observed in enzyme activity, suggesting that their encoded proteins might serve other functions than their current annotation, or that variation at the transcriptional level is ‘overruled’ by other regulating mechanisms or by temporal differences between


gene expression and subsequent processes. Finally, for a number of structural genes no significant eQTL could be detected, which can be the result of low (variation in) transcript levels that could not be detected in the microarray experiment.

Often only a weak to medium correlation exists between levels of enzyme activity and transcript levels of structural genes. This can be partly explained by the redundancy in structural genes when different genes each contribute only partially to the eventual level of enzyme. However, different genes of a gene family might have different specific activities for the metabolic conversions under study, for which natural variation might also be present between accessions. In a segregating population this diversity of genetic variants and possible epistatic interactions between them can severely complicate correlation analyses. On the other hand, correlations might be difficult to establish when relationships between transcript levels and protein levels are not linear. Deviations from perfect correlations and linearity can be caused by delays in protein formation and/or activation upon transcription. Moreover, regulation of enzyme activity can occur post-transcriptionally through mRNA- and protein-stability, protein-folding, activation by or dependency on co-factors, (de)-phosphorylation, etc. Finally, lack of correlation can be simply a result of non-functionality at the sampled developmental stage or due to a dilution effect when genes are only transcribed in specific cells or tissues. Negative correlations might result from negative feedback due to high transcription levels of redundant genes, or phase shifts in diurnal rhythms of transcription and translation (Gibon et al., 2004b, 2006; Blasing et al., 2005).

Different modes of action in the genetic control of enzymatic activity
For many enzymes natural variation was observed in their level of activity. In many cases enzyme activity was related to metabolite content, including substrates and products of the analyzed enzymes. In several cases QTLs for enzyme activity co-located with structural genes encoding these enzymes or eQTLs for those genes. Differences in correlation pattern and QTL profiles between gene expression, enzyme activity and metabolite content indicate, however, that genetic regulation causal for observed variation is not similar for all analyzed traits. Instead, various modes of genetic control, using different mechanisms, seem to act in the regulation of carbohydrate metabolism.

For phosphoglucomutase, one of the enzymes for which the highest variation in activity was observed, it was shown that most of this variation could be explained by genetic factors. Parallel analysis of enzyme activity and structural gene expression suggested cis-regulatory variation in transcription of one of the


structural genes (At5g51820) to be causal for the major PGM activity QTL. However, differences in the variation in transcription of structural genes and enzyme activity also indicated polymorphisms in coding regions of structural genes at a second locus to account for the observed variation in enzyme activity. Furthermore, although significant negative correlations were observed between PGM activity and its substrate and product G1P and G6P, these correlations are not caused by any of the detected QTLs. This suggests that other levels of regulation are also active for which no genomic variation could be detected within the analyzed population.

In contrast, the combined analysis of variation in the activity of UGP, its substrate and product and transcription of its encoding structural genes suggested trans-regulated transcription differences to be the major cause for variation in enzyme activity. In this case the strong positive correlation between UDPG and UGP suggests UDPG levels to be the driving force for this trans-acting regulation. This would mean that plants are able to sense and respond to changes in UDPG accumulation, which has been suggested and shown also for other sugars (Rolland et al., 2002; Halford et al., 2003; Avonce et al., 2005; Gonzali et al., 2006). Although it remains speculative to assign which genetic factor(s) determine(s) the variation observed in UDPG accumulation, it is interesting to note that the inorganic phosphate status in Arabidopsis affects the transcription of UGP-encoding genes (Ciereszko et al., 2001, 2005). Moreover, natural variation for phosphate and phytate, the major source of inorganic phosphate in plants, has been observed in the Ler x Cvi population and a common QTL explaining most of the variation co-locates with the QTL for UDPG-content and UGP-activity (Bentsink et al., 2003). Furthermore, a QTL for the accumulation of the phosphorylated hexoses G1P and G6P was detected at this position, which might indicate that high levels of inorganic phosphate result in elevated levels of phosphorylated sugars. In conclusion, variation in phosphorus levels would then regulate the accumulation of UDPG, which in turn triggers the expression of UGP-encoding structural genes, leading to higher activity of UGP.
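Much of the reasoning above rests on co-location: of QTLs with each other, of eQTLs with enzyme activity QTLs, and of structural genes with either. Assuming each support interval is available as a chromosome number with start and end positions in Mbp, the two tests can be sketched in Python as follows. The `Interval` type, the function names and the example positions are hypothetical and are not taken from the thesis.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A QTL support interval on one chromosome, in Mbp (hypothetical layout)."""
    chromosome: int
    start_mbp: float
    end_mbp: float


def intervals_colocate(a: Interval, b: Interval) -> bool:
    """Two (e)QTLs co-locate when their support intervals overlap."""
    return (
        a.chromosome == b.chromosome
        and a.start_mbp <= b.end_mbp
        and b.start_mbp <= a.end_mbp
    )


def gene_colocates(gene_chromosome: int, gene_position_mbp: float, qtl: Interval) -> bool:
    """A gene co-locates with an (e)QTL when it lies inside the support interval."""
    return (
        gene_chromosome == qtl.chromosome
        and qtl.start_mbp <= gene_position_mbp <= qtl.end_mbp
    )


# Made-up positions, for illustration only.
activity_qtl = Interval(chromosome=3, start_mbp=0.2, end_mbp=2.2)
expression_qtl = Interval(chromosome=3, start_mbp=1.5, end_mbp=3.5)
print(intervals_colocate(activity_qtl, expression_qtl))  # overlapping intervals
print(gene_colocates(3, 0.8, activity_qtl))              # gene inside the interval
```

Both tests are inclusive at the interval boundaries; whether a shared boundary should count as co-location is a design choice the text does not settle.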


MATERIALS AND METHODS

Plant material and tissue collection
Aerial parts of seedlings from the accessions Ler and Cvi and a population of 160 recombinant inbred lines derived from a cross between these parents (Alonso-Blanco et al., 1998; Keurentjes et al., 2006) were grown and collected as described previously (Keurentjes et al., 2006). In brief, seeds of lines were sown in petri dishes on 1/2MS agar and placed in a cold room for seven days. Petri dishes were then transferred to a climate chamber and seedlings were collected after seven days. Plant material was stored at -80°C until further processing.

Linkage map construction and anchoring to the physical map
The genetic map was constructed from a subset of the markers available at http://nasc.nott.ac.uk/, as described in Keurentjes et al. (2007b). In total, 144 markers were used, with an average spacing of 3.5 cM. The largest distance between two markers was 10.8 cM. The genetic map was anchored to the physical map as described in Keurentjes et al. (2007b), with an almost linear genome-wide relation of 4.1 cM per Mbp.

Metabolite and enzyme measurements
Metabolites were extracted and analyzed as described previously: ChlA, ChlB, AA, protein, sucrose, glucose, and fructose (Cross et al., 2006); starch, G1P, and G6P (Gibon et al., 2002); UDPG (Keurentjes et al., 2006). Enzymes were extracted as described in Gibon et al. (2004a) and analyzed as described previously: Inv, AGP, FBP, G6PDH, PFK, PFP, SPS, SuSy, GK, FK, and UGP (Gibon et al., 2004a); PGI (Cross et al., 2006); PGM (Manjunath et al., 1998); Rubisco (Sulpice et al., 2007). Samples were randomized during extraction and analysis, and two biological replicates were analyzed for each trait.

Microarray analyses
Transcript levels of genes were analyzed on two-color DNA-microarrays as described previously (Keurentjes et al., 2007b). Resulting log2 signal intensities were used for correlation analyses and log2 ratios between co-hybridized RILs were used for QTL analyses.

Statistical analyses
Spearman rank correlations were determined in Excel (Microsoft) for mean trait values as follows:


$$R_{jk} = \frac{n(n^2-1) - 6\sum_{i=1}^{n}\left(y_{ij}-y_{ik}\right)^2 - \tfrac{1}{2}\left(T_j+T_k\right)}{\sqrt{\left[n(n^2-1)-T_j\right]\left[n(n^2-1)-T_k\right]}}, \qquad j, k = 1, 2, \ldots, m;$$

where $n$ is the number of observations, $y_{ij}$ is the rank of observation $i$ for variable $j$ (variables 1 to $m$), and $T_j = \sum t_j\left(t_j^2-1\right)$, $t_j$ being the number of ties of a particular value of variable $j$, and the summation being over all tied values of variable $j$ (Siegel, 1956).

QTL analyses for gene transcript levels were performed as described in Keurentjes et al. (2007b). For QTL analyses of metabolite and enzyme traits the computer program MapQTL version 5.0 (Van Ooijen, 2004) was used to identify and locate QTLs linked to the molecular markers using multiple QTL mapping (MQM). LOD statistics were calculated at 0.5 cM intervals. Tests of 1000 permutations were used to obtain an estimate of the number of type 1 errors (false positives). The genome-wide LOD score, which 95% of the permutations did not exceed, ranged from 2.4 to 2.7. A LOD score of 3.0, to correct for multiple testing, was then used as the significance threshold to declare the presence of a QTL. In the MQM model the genetic effect (μB-μA) and percentage of explained variance were estimated for each QTL, and 2 Mbp-support intervals were established as an ~95% confidence level (Van Ooijen, 1992). Co-location of (e)QTLs was defined as an overlap in the 2 Mbp-support intervals.

Genomic positions of genes were inferred from the Arabidopsis information resource (The Arabidopsis Genome Initiative, 2000). When physical positions of genes fell in the 2 Mbp-support interval of (e)QTLs this was considered as co-location.

Principal component and box plot analyses were performed in SPSS (version 12.0).

Acknowledgements
We thank Linus van der Plas for critical reading of the manuscript. This work was supported by grants from the Netherlands Organization for Scientific Research, Program Genomics (050-10-029), the Centre for Biosystems Genomics (CBSG, Netherlands Genomics Initiative) and a Genomics Fellowship from The Netherlands Genomics Initiative (050-72-412).
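As an illustration of the tie-corrected Spearman rank correlation given under Statistical analyses, the following Python sketch computes $R_{jk}$ for two variables directly from that formula. The thesis reports that the correlations were calculated in Excel, so this is not the authors' implementation; the function names are assumptions, and tied observations are given their average rank.

```python
from collections import Counter
from math import sqrt


def average_ranks(values):
    """Rank observations from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def tie_term(values):
    """T = sum of t * (t^2 - 1) over all tied values; untied values add 0."""
    return sum(t * (t * t - 1) for t in Counter(values).values())


def spearman_tie_corrected(x, y):
    """Spearman rank correlation with tie correction (after Siegel, 1956)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    t_x, t_y = tie_term(x), tie_term(y)
    base = n * (n * n - 1)
    denominator = sqrt((base - t_x) * (base - t_y))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return (base - 6 * sum_d2 - (t_x + t_y) / 2) / denominator
```

With no ties both tie terms are zero and the expression reduces to the familiar $1 - 6\sum d^2 / [n(n^2-1)]$.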


REFERENCES

Alonso-Blanco, C., Peeters, A.J., Koornneef, M., Lister, C., Dean, C., van den Bosch, N., Pot, J. and Kuiper, M.T. (1998). Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J 14, 259-271.
Avonce, N., Leyman, B., Thevelein, J. and Iturriaga, G. (2005). Trehalose metabolism and glucose sensing in plants. Biochem Soc Trans 33, 276-279.
Bentsink, L., Yuan, K., Koornneef, M. and Vreugdenhil, D. (2003). The genetics of phytate and phosphate accumulation in seeds and leaves of Arabidopsis thaliana, using natural variation. Theor Appl Genet 106, 1234-1243.
Blasing, O.E., Gibon, Y., Gunther, M., Hohne, M., Morcuende, R., Osuna, D., Thimm, O., Usadel, B., Scheible, W.R. and Stitt, M. (2005). Sugars and circadian regulation make major contributions to the global regulation of diurnal gene expression in Arabidopsis. Plant Cell 17, 3257-3281.
Calenge, F., Saliba-Colombani, V., Mahieu, S., Loudet, O., Daniel-Vedele, F. and Krapp, A. (2006). Natural variation for carbohydrate content in Arabidopsis. Interaction with complex traits dissected by quantitative genetics. Plant Physiol 141, 1630-1643.
Carrari, F., Urbanczyk-Wochniak, E., Willmitzer, L. and Fernie, A.R. (2003). Engineering central metabolism in crop species: learning the system. Metab Eng 5, 191-200.
Causse, M., Rocher, J.P., Henry, A.M., Charcosset, A., Prioul, J.L. and De Vienne, D. (1995). Genetic dissection of the relationship between carbon metabolism and early growth in maize, with emphasis on key enzyme loci. Mol Breed 1, 259-272.
Chen, X., Salamini, F. and Gebhardt, C. (2001). A potato molecular-function map for carbohydrate metabolism and transport. Theor Appl Genet 102, 284-295.
Ciereszko, I., Johansson, H., Hurry, V. and Kleczkowski, L.A. (2001). Phosphate status affects the gene expression, protein content and enzymatic activity of UDP-glucose pyrophosphorylase in wild-type and pho mutants of Arabidopsis. Planta 212, 598-605.
Ciereszko, I., Johansson, H. and Kleczkowski, L.A. (2005). Interactive effects of phosphate deficiency, sucrose and light/dark conditions on gene expression of UDP-glucose pyrophosphorylase in Arabidopsis. J Plant Physiol 162, 343-353.
Cross, J.M., von Korff, M., Altmann, T., Bartzetko, L., Sulpice, R., Gibon, Y., Palacios, N. and Stitt, M. (2006). Variation of enzyme activities and metabolite levels in 24 Arabidopsis accessions growing in carbon-limited conditions. Plant Physiol 142, 1574-1588.
El-Lithy, M.E., Clerkx, E.J., Ruys, G.J., Koornneef, M. and Vreugdenhil, D. (2004). Quantitative trait locus analysis of growth-related traits in a new Arabidopsis recombinant inbred population. Plant Physiol 135, 444-458.
Eshed, Y. and Zamir, D. (1995). An introgression line population of Lycopersicon pennellii in the cultivated tomato enables the identification and fine mapping of yield-associated QTL. Genetics 141, 1147-1162.
Fernie, A.R., Tauberger, E., Lytovchenko, A., Roessner, U., Willmitzer, L. and Trethewey, R.N. (2002). Antisense repression of cytosolic phosphoglucomutase in potato (Solanum tuberosum) results in severe growth retardation, reduction in tuber number and altered carbon metabolism. Planta 214, 510-520.
Fiehn, O., Kloska, S. and Altmann, T. (2001). Integrated studies on plant biology using multiparallel techniques. Curr Opin Biotechnol 12, 82-86.
Fridman, E., Carrari, F., Liu, Y.S., Fernie, A.R. and Zamir, D. (2004). Zooming in on a quantitative trait for tomato yield using interspecific introgressions. Science 305, 1786-1789.


Gachon, C.M., Langlois-Meurinne, M., Henry, Y. and Saindrenan, P. (2005). Transcriptional co-regulation of secondary metabolism enzymes in Arabidopsis: functional and evolutionary implications. Plant Mol Biol 58, 229-245.
Gibon, Y., Vigeolas, H., Tiessen, A., Geigenberger, P. and Stitt, M. (2002). Sensitive and high throughput metabolite assays for inorganic pyrophosphate, ADPGlc, nucleotide phosphates, and glycolytic intermediates based on a novel enzymic cycling system. Plant J 30, 221-235.
Gibon, Y., Blaesing, O.E., Hannemann, J., Carillo, P., Hohne, M., Hendriks, J.H., Palacios, N., Cross, J., Selbig, J. and Stitt, M. (2004a). A Robot-based platform to measure multiple enzyme activities in Arabidopsis using a set of cycling assays: comparison of changes of enzyme activities and transcript levels during diurnal cycles and in prolonged darkness. Plant Cell 16, 3304-3325.
Gibon, Y., Blasing, O.E., Palacios-Rojas, N., Pankovic, D., Hendriks, J.H., Fisahn, J., Hohne, M., Gunther, M. and Stitt, M. (2004b). Adjustment of diurnal starch turnover to short days: depletion of sugar during the night leads to a temporary inhibition of carbohydrate utilization, accumulation of sugars and post-translational activation of ADP-glucose pyrophosphorylase in the following light period. Plant J 39, 847-862.
Gibon, Y., Usadel, B., Blaesing, O.E., Kamlage, B., Hoehne, M., Trethewey, R. and Stitt, M. (2006). Integration of metabolite with transcript and enzyme activity profiling during diurnal cycles in Arabidopsis rosettes. Genome Biol 7, R76.
Gonzali, S., Loreti, E., Solfanelli, C., Novi, G., Alpi, A. and Perata, P. (2006). Identification of sugar-modulated genes and evidence for in vivo sugar sensing in Arabidopsis. J Plant Res 119, 115-123.
Halford, N.G., Hey, S., Jhurreea, D., Laurie, S., McKibbin, R.S., Paul, M. and Zhang, Y. (2003). Metabolic signalling and carbon partitioning: role of Snf1-related (SnRK1) protein kinase. J Exp Bot 54, 467-475.
Harrison, J., Hirel, B. and Limami, A.M. (2004). Variation in nitrate uptake and assimilation between two ecotypes of Lotus japonicus and their recombinant inbred lines. Physiol Plant 120, 124-131.
Hirai, M.Y., Klein, M., Fujikawa, Y., Yano, M., Goodenowe, D.B., Yamazaki, Y., Kanaya, S., Nakamura, Y., Kitayama, M., Suzuki, H. et al. (2005). Elucidation of gene-to-gene and metabolite-to-gene networks in arabidopsis by integration of metabolomics and transcriptomics. J Biol Chem 280, 25590-25595.
Hirel, B., Bertin, P., Quillere, I., Bourdoncle, W., Attagnant, C., Dellay, C., Gouy, A., Cadiou, S., Retailliau, C., Falque, M. et al. (2001). Towards a better understanding of the genetic and physiological basis for nitrogen use efficiency in maize. Plant Physiol 125, 1258-1270.
Juenger, T.E., McKay, J.K., Hausmann, N., Keurentjes, J.J.B., Sen, S., Stowe, K.A., Dawson, T.E., Simms, E.L. and Richards, J.H. (2005). Identification and characterization of QTL underlying whole-plant physiology in Arabidopsis thaliana: delta13C, stomatal conductance and transpiration efficiency. Plant Cell Environ 28, 697-708.
Keurentjes, J.J.B., Fu, J., de Vos, C.H., Lommen, A., Hall, R.D., Bino, R.J., van der Plas, L.H., Jansen, R.C., Vreugdenhil, D. and Koornneef, M. (2006). The genetics of plant metabolism. Nat Genet 38, 842-849.
Keurentjes, J.J.B., Bentsink, L., Alonso-Blanco, C., Hanhart, C.J., Blankestijn-De Vries, H., Effgen, S., Vreugdenhil, D. and Koornneef, M. (2007a). Development of a near-isogenic line population of Arabidopsis thaliana and comparison of mapping power with a recombinant inbred line population. Genetics 175, 891-905.


Keurentjes, J.J.B., Fu, J., Terpstra, I.R., Garcia, J.M., van den Ackerveken, G., Snoek, L.B., Peeters, A.J., Vreugdenhil, D., Koornneef, M. and Jansen, R.C. (2007b). Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. Proc Natl Acad Sci U S A 104, 1708-1713.
Koch, K. (2004). Sucrose metabolism: regulatory mechanisms and pivotal roles in sugar sensing and plant development. Curr Opin Plant Biol 7, 235-246.
Kofler, H., Hausler, R.E., Schulz, B., Groner, F., Flugge, U.I. and Weber, A. (2000). Molecular characterisation of a new mutant allele of the plastid phosphoglucomutase in Arabidopsis, and complementation of the mutant with the wild-type cDNA. Mol Gen Genet 263, 978-986.
Koornneef, M., Alonso-Blanco, C. and Vreugdenhil, D. (2004). Naturally occurring genetic variation in Arabidopsis thaliana. Annu Rev Plant Physiol Plant Mol Biol 55, 141-172.
Li, L., Strahwald, J., Hofferbert, H.R., Lubeck, J., Tacke, E., Junghans, H., Wunder, J. and Gebhardt, C. (2005). DNA variation at the invertase locus invGE/GF is associated with tuber quality traits in populations of potato breeding clones. Genetics 170, 813-821.
Loudet, O., Chaillou, S., Merigout, P., Talbotec, J. and Daniel-Vedele, F. (2003). Quantitative trait loci analysis of nitrogen use efficiency in Arabidopsis. Plant Physiol 131, 345-358.
Lunn, J.E. (2007). Compartmentation in plant metabolism. J Exp Bot 58, 35-47.
Manjunath, S., Lee, C.H., VanWinkle, P. and Bailey-Serres, J. (1998). Molecular and biochemical characterization of cytosolic phosphoglucomutase in maize. Expression during development and in response to oxygen deprivation. Plant Physiol 117, 997-1006.
Martienssen, R.A. (2000). Weeding out the genes: the Arabidopsis genome project. Funct Integr Genomics 1, 2-11.
Masle, J., Gilmore, S.R. and Farquhar, G.D. (2005). The ERECTA gene regulates plant transpiration efficiency in Arabidopsis. Nature 436, 866-870.
Meyer, R.C., Steinfath, M., Lisec, J., Becher, M., Witucka-Wall, H., Torjek, O., Fiehn, O., Eckardt, A., Willmitzer, L., Selbig, J. et al. (2007). The metabolic signature related to high plant growth rate in Arabidopsis thaliana. Proc Natl Acad Sci U S A 104, 4759-4764.
Mitchell-Olds, T. and Pedersen, D. (1998). The molecular basis of quantitative genetic variation in central and secondary metabolism in Arabidopsis. Genetics 149, 739-747.
Morcuende, R., Bari, R., Gibon, Y., Zheng, W., Pant, B.D., Blasing, O., Usadel, B., Czechowski, T., Udvardi, M.K., Stitt, M. et al. (2007). Genome-wide reprogramming of metabolism and regulatory networks of Arabidopsis in response to phosphorus. Plant Cell Environ 30, 85-112.
Neuhaus, H.E., Kruckeberg, A.L., Feil, R., Gottlieb, L. and Stitt, M. (1989). Dosage mutants of phosphoglucose isomerase in the cytosol and chloroplasts of Clarkia xantiana. II. Study of the mechanisms which regulate photosynthate partitioning. Planta 178, 110-122.
Osuna, D., Usadel, B., Morcuende, R., Gibon, Y., Blasing, O.E., Hohne, M., Gunter, M., Kamlage, B., Trethewey, R., Scheible, W.R. et al. (2007). Temporal responses of transcripts, enzyme activities and metabolites after adding sucrose to carbon-deprived Arabidopsis seedlings. Plant J 49, 463-491.
Periappuram, C., Steinhauer, L., Barton, D.L., Taylor, D.C., Chatson, B. and Zou, J. (2000). The plastidic phosphoglucomutase from Arabidopsis. A reversible enzyme reaction with an important role in metabolic control. Plant Physiol 122, 1193-1199.
Prioul, J.L., Pelleschi, S., Sene, M., Thevenot, C., Causse, M., de Vienne, D. and Leonardi, A. (1999). From QTLs for enzyme activity to candidate genes in maize. J Exp Bot 50, 1281-1288.
Rauh, L., Basten, C. and Buckler, S.t. (2002). Quantitative trait loci analysis of growth response to varying nitrogen sources in Arabidopsis thaliana. Theor Appl Genet 104, 743-750.
Rockman, M.V. and Kruglyak, L. (2006). Genetics of global gene expression. Nat Rev Genet 7, 862-872.


Roessner, U., Luedemann, A., Brust, D., Fiehn, O., Linke, T., Willmitzer, L. and Fernie, A. (2001). Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. Plant Cell 13, 11-29.
Rolland, F., Moore, B. and Sheen, J. (2002). Sugar sensing and signaling in plants. Plant Cell 14 Suppl, S185-205.
Rontein, D., Dieuaide-Noubhani, M., Dufourc, E.J., Raymond, P. and Rolin, D. (2002). The metabolic architecture of plant cells. Stability of central metabolism and flexibility of anabolic pathways during the growth cycle of tomato cells. J Biol Chem 277, 43948-43960.
Schauer, N., Semel, Y., Roessner, U., Gur, A., Balbo, I., Carrari, F., Pleban, T., Perez-Melis, A., Bruedigam, C., Kopka, J. et al. (2006). Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nat Biotechnol 24, 447-454.
Sergeeva, L.I., Vonk, J., Keurentjes, J.J.B., van der Plas, L.H., Koornneef, M. and Vreugdenhil, D. (2004). Histochemical analysis reveals organ-specific quantitative trait loci for enzyme activities in Arabidopsis. Plant Physiol 134, 237-245.
Sergeeva, L.I., Keurentjes, J.J.B., Bentsink, L., Vonk, J., van der Plas, L.H., Koornneef, M. and Vreugdenhil, D. (2006). Vacuolar invertase regulates elongation of Arabidopsis thaliana roots as revealed by QTL and mutant analysis. Proc Natl Acad Sci U S A 103, 2994-2999.
Siegel, S. (1956). Non-parametric statistics for the behavioral sciences. (New York: McGraw-Hill).
Sturm, A. and Tang, G.Q. (1999). The sucrose-cleaving enzymes of plants are crucial for development, growth and carbon partitioning. Trends Plant Sci 4, 401-407.
Sulpice, R., Tschoep, H., von Korff, M., Bussis, D., Usadel, B., Hoehne, M., Witucka-Wall, H., Altmann, T., Stitt, M. and Gibon, Y. (2007). Description and applications of a rapid and sensitive non-radioactive microplate-based assay for maximum and initial activity of ribulose-1,5-bisphosphate carboxylase. Plant Cell Environ In press, doi: 10.1111/j.1365-3040.2007.01679.x.
The Arabidopsis Genome Initiative. (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.
Torii, K.U., Mitsukawa, N., Oosumi, T., Matsuura, Y., Yokoyama, R., Whittier, R.F. and Komeda, Y. (1996). The Arabidopsis ERECTA gene encodes a putative receptor protein kinase with extracellular leucine-rich repeats. Plant Cell 8, 735-746.
Urbanczyk-Wochniak, E., Luedemann, A., Kopka, J., Selbig, J., Roessner-Tunali, U., Willmitzer, L. and Fernie, A.R. (2003). Parallel analysis of transcript and metabolic profiles: a new approach in systems biology. EMBO Rep 4, 989-993.
Van Ooijen, J.W. (1992). Accuracy of mapping quantitative trait loci in autogamous species. Theor Appl Genet 84, 803-811.
Van Ooijen, J.W. (2004). MapQTL 5, Software for the mapping of quantitative trait loci in experimental populations (Wageningen, The Netherlands: Kyazma B.V.).
Winnacker, E.L. (2003). Interdisciplinary sciences in the 21st century. Curr Opin Biotechnol 14, 328-331.

Chapter 6

General discussion

One of the intriguing observations in nature is the enormous diversity in characteristic properties of various species. However, natural variation can also be observed within species (Alonso-Blanco and Koornneef, 2000), which is thought to be one of the driving forces of species formation. Farmers and breeders have used naturally occurring genetic variation for centuries to improve crop species (Koornneef and Stam, 2001; Zamir, 2001). The identification of the genetic factors controlling natural variation would therefore improve our understanding of genetic regulatory processes and give insight into the evolutionary significance of variation (Mitchell-Olds and Schmitt, 2006). For many traits quantitative variation is observed, suggesting that it is controlled by multiple genes. Possible interactions between genes and between genes and the environment further add to the complexity of quantitative traits, making the genetic dissection of such traits difficult.

A classical first step in the genetic analysis of traits is the determination of inheritance patterns in the progeny of a cross between distinct varieties. Quantitative traits do not segregate in distinct classes but instead display a more continuous variation in trait values as a result of the segregation of multiple independent loci. To relate these quantitative trait loci (QTLs) to genomic positions it is pivotal to be able to determine the genotype of segregants. The development of molecular markers greatly enhanced the ease with which mapping populations can be genotyped. Molecular markers represent genomic polymorphisms between genotypically different lines. By crossing distinct accessions, numerous polymorphisms will segregate in the progeny, enabling the construction of genetic maps. Because quantitative traits may also segregate in the same offspring population, QTLs for these traits can then be mapped by analyzing co-segregation of trait values with the molecular markers used for the construction of the genetic map (Broman, 2001; Doerge, 2002; Jansen, 2003).

Depending on the species and the ease with which mapping populations can be generated, several approaches have been applied. A relatively fast approach, requiring a minimal number of generations, is the generation of an F2 or backcross (BC) population. However, such populations still contain a high level of heterozygosity, which may compromise the construction of genetic maps, especially when dominant markers are being used. Moreover, such populations cannot be propagated sexually without further segregation of the heterozygous regions, making additional genotyping in later generations necessary. Populations consisting of homozygous lines, on the other hand, need several rounds of selfing or backcrossing to reach full homozygosity. Alternatively, homozygosity can be obtained by generating doubled haploids. When fully homozygous, lines can be propagated without introducing further genotypic changes in their progeny. At this stage the population has become immortal and a single round of genotyping is sufficient to generate a genetic map for any further experimentation. Homozygous lines offer the advantage of replicated measurements on genotypically identical individuals and also allow the comparison of different experiments in time and environment.

In Arabidopsis, recombinant inbred lines (RILs) have become the mapping population of choice because of the selfing nature and short generation time of the species (Somerville and Koornneef, 2002). In other species, however, near-isogenic lines (NILs) are sometimes favorable because of sterility problems and intolerance towards inbreeding (Eshed and Zamir, 1995). Both types of population can be used for mapping purposes, although they differ markedly in their genetic make-up due to differences in the crossing scheme. RILs are generated from an F1 without backcrossing and therefore contain on average equal contributions of both parental genomes. NILs, on the other hand, are generated from an F1 through repeated backcrossing with a recurrent parent and contain only a limited amount of the donor genome. As a result RILs can contain multiple introgressions whereas NILs preferably contain only a single introgression into an otherwise isogenic background. Because of these differences, different mapping strategies are required for the two types of population.

Chapter two of this thesis describes the development of the first genome-wide coverage NIL population in Arabidopsis, allowing the comparison of its mapping properties with those of an already existing RIL population derived from the same parental accessions (Alonso-Blanco et al., 1998). These comparisons revealed a higher mapping power for small-effect loci but lower mapping resolution in the NIL population compared to the RIL population. However, results depended greatly on the genetic architecture of traits and on population size and structure. For RIL populations both mapping power and resolution can be increased by increasing population size, which would also diminish the need for replicated measurements. For NIL populations resolution can also be improved by adding more lines but, depending on the size of introgressions and the amount of overlap, power has to be increased by replicated measurements. Because of the much higher recombination frequency, RIL populations are often favorable over NIL populations when mapping experiments are limited by the number of plants that can be analyzed. The segregation of multiple loci in RIL populations might mask small-effect QTLs, but allows the detection of genetic interactions, which is not possible in NIL populations. NILs have been shown to be very useful for the confirmation of QTLs and as starting material for the fine-mapping of so-called Mendelized QTLs.

Although natural phenotypic variation can be observed for many quantitative traits in Arabidopsis, which can be effectively analyzed in RIL populations (Alonso-Blanco and Koornneef, 2000; Koornneef et al., 2004), QTL analysis often reveals only a limited number of steps in the complex regulatory pathways of quantitative traits. The path from genotype to phenotype often involves multiple intermediate steps and it is therefore difficult to determine whether QTLs regulate traits directly or indirectly. Moreover, regulation can occur at different levels, ranging from variation in presence or expression of genes to variation in protein function. Until a QTL has been cloned and the causal polymorphism(s) identified, it therefore remains uncertain at which point in a pathway traits are regulated. To fully understand the complex regulation of quantitative traits it is therefore advisable to genetically analyze the different levels and intermediates at which genetic control might act (Fiehn et al., 2001; Winnacker, 2003).

The recent advance in analytical technologies (transcriptomics, proteomics and metabolomics) now enables the large-scale genetic analysis of different entities in the circuitry from gene to phenotype. The expression of genes often determines the onset of pathways resulting in a particular phenotype. Therefore, phenotypic variation might be caused by variation in gene expression. At the other end of the information flow from DNA sequence to gene function, metabolites, as products of the encoded proteins, stand closest to the eventual phenotype. It is conceivable that genetically controlled variation in metabolite composition and accumulation determines, at least partly, the observed phenotypic variation. In chapters three and four high-throughput ‘omic’ technologies were used for the analysis of natural variation in gene expression (transcriptomics) and metabolite composition and content (metabolomics).

Analogous to ‘classical’ quantitative phenotypic traits, natural variation can also be observed for gene expression, when that expression is under genetic control (Borevitz et al., 2003; Kliebenstein et al., 2006). Chapter three describes the genetic analysis of genome-wide gene expression variation in Arabidopsis (genetical genomics) (Jansen and Nap, 2001). These analyses revealed extensive genetic control of gene expression, judged from the fact that for more than 4,000 genes expression QTLs (eQTLs) could be detected. However, many more genes showed high heritability values, even though no eQTL could be detected. This suggests that their expression is regulated by multiple eQTLs, of which many might not have passed the stringent significance threshold due to their small effect. Both local and distant regulation (Rockman and Kruglyak, 2006) were observed, although local regulation was often much stronger. Local regulation might result from polymorphisms in cis-regulatory elements which directly affect the expression of the gene under study. Interestingly, local regulation correlated with polymorphism frequency, further supporting the suggestion that expression variation can result from local sequence differences. Moreover, regulatory genes showed much less local regulation, which can be explained by much stronger conservation due to their pleiotropic effects.

Distant regulation, on the other hand, is most likely a result of polymorphisms in a regulator, affecting expression in trans, possibly through multiple intermediates. Since regulators may exert pleiotropic effects on numerous genes, directly or indirectly, multiple eQTLs would map to the position of this regulator. Indeed, several eQTL hot spots were identified, possibly containing such master regulators. The detection of distant eQTLs indicates that the expression of the gene under study is controlled by genetic factors in trans. When the causal gene underpinning the trans eQTL can be identified, this opens the possibility of establishing gene regulation networks (Jansen, 2003). However, QTL support intervals often contain hundreds of genes, each of which can be a candidate regulator. Positive confirmation of a candidate can only be obtained upon cloning of the eQTL, which is practically difficult to achieve for genome-wide expression studies. The assignment of candidate genes therefore relies on additional information, such as co-expression and gene ontology. The power of such a computational approach was demonstrated by the reconstruction of a regulatory network for genes involved in the regulation of flowering time. Nonetheless, variation in expression of genes cannot always be explained by expression differences of their regulator, especially when expression of the regulator is not cis-regulated. When the causal polymorphism(s) reside(s) in the coding sequence of the regulator this might alter protein function or stability, and the expression of target genes then depends on the allelic form of the regulator rather than on its expression level. The identification of such relationships from additional information such as genome sequences, e.g. binding site data, and experimentation, e.g. protein-DNA interaction data, can further improve the assignment of candidate regulators and ultimately the construction of regulatory networks. The recognition of genetically controlled expression of genes and the reduction in the number of candidate regulators through QTL analysis should therefore further guide the detailed analysis of gene-by-gene regulation.
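The distinction between local and distant eQTLs, and the notion that many distant eQTLs mapping to one position mark a hot spot, can be illustrated with a small sketch. The Python code below is not the procedure used in chapter three. The `EQTL` record, the 2 Mb window, the bin size and the minimum count are all hypothetical choices made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class EQTL:
    """One detected expression QTL (hypothetical record layout)."""
    gene: str        # gene whose expression was mapped
    gene_chr: int    # chromosome carrying the gene itself
    gene_pos: float  # physical position of the gene (Mb)
    qtl_chr: int     # chromosome of the eQTL peak
    qtl_pos: float   # position of the eQTL peak (Mb)


def classify(eqtl, window_mb=2.0):
    """Return 'local' if the eQTL peak lies within window_mb of the gene,
    otherwise 'distant'. The window size is an assumed value."""
    same_chr = eqtl.gene_chr == eqtl.qtl_chr
    if same_chr and abs(eqtl.gene_pos - eqtl.qtl_pos) <= window_mb:
        return "local"
    return "distant"


def hotspots(eqtls, bin_mb=2.0, min_count=3, window_mb=2.0):
    """Count distant eQTLs per genomic bin and keep bins holding at least
    min_count of them. Bins are keyed as (chromosome, bin index)."""
    counts = Counter(
        (e.qtl_chr, int(e.qtl_pos // bin_mb))
        for e in eqtls
        if classify(e, window_mb) == "distant"
    )
    return {bin_key: n for bin_key, n in counts.items() if n >= min_count}
```

A bin flagged in this way only marks a region in which a common regulator might reside. As noted above, the support interval can still contain hundreds of candidate genes, so additional information remains necessary to assign a regulator.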


Unlike microarrays for the genome-wide analysis of the transcriptome, no single platform exists for the simultaneous analysis of the complete metabolome. In contrast to mRNA, which has the same basic chemical structure for all genes, metabolites represent a plethora of different chemical classes and no universal analytical tools are available yet. However, advances in mass spectrometry have made the detection and quantification of hundreds of compounds of specific chemical classes possible (Fiehn et al., 2000). Plants are especially rich in secondary metabolites, which is possibly a consequence of their sessile nature. Since plants are unable to migrate they need to adapt to local environments for their survival. The wide range of habitats makes it plausible that natural variation in secondary metabolite composition and accumulation plays an important role in the diversification of plants (Fiehn, 2002).

Chapter four describes the genetic analysis of metabolite composition in Arabidopsis using large-scale untargeted liquid chromatography-quadrupole time-of-flight mass spectrometry (LC-QTOF MS). Although untargeted, LC-MS predominantly detects semi-polar secondary metabolites, which are amongst the most variable compounds in nature. When applied to accessions originating from various parts of the global distribution range of A. thaliana, considerable quantitative and qualitative differences were observed. The majority of compounds could only be detected in a limited number of accessions and each accession analyzed contained unique compounds not found in any other accession. However, a substantial number of compounds, presumably representing essential metabolites, could be detected in all accessions analyzed. The extensive natural variation in metabolite content together with the often observed high heritabilities indicates that metabolite composition is largely under genetic control. Indeed, when metabolite levels quantified in a RIL population were subjected to QTL analysis, significant QTL(s) could be assigned for 75% of the detected compounds. The impact of genetic factors on the dynamic range of metabolite content was further demonstrated by the fact that a high number of compounds that were not found in either of the parents could be detected in RILs. This suggests that metabolic pathways in the parents are blocked at different steps, which can be overcome by complementation due to recombination of their genomes. Natural variation therefore offers a large potential for metabolic engineering of crop species through classical breeding.

Similar to the distribution of eQTLs along the genome, hot and cold spots could be observed for metabolite accumulation QTLs. Interestingly, for both analyses a hot spot was observed at the position of the ERECTA gene, which encodes a receptor protein kinase well known for its pleiotropic effects (Torii et al., 1996). The ERECTA gene is polymorphic between the parental accessions of the RIL population and causal for many of the morphological differences observed between the parents. Co-location of QTLs implies that accumulation of the metabolites mapping to the same position might be controlled by a common regulator. Although it cannot be inferred whether this control acts directly or indirectly through downstream effects of a regulatory step, co-regulated metabolites are likely to be part of a common pathway or involved in the same biological process. When those metabolites can be identified, information can be obtained about the mechanism of regulation, and the number and order of metabolites in a metabolic pathway. These features were demonstrated by the reconstruction of the aliphatic glucosinolate formation pathway and the discovery of variation in glycosyl transferase activity affecting flavonol composition. However, untargeted metabolomic approaches detect anonymous compounds and the identification of these compounds is still in its infancy (Schauer et al., 2005; Moco et al., 2006). The unraveling of metabolic pathways would therefore benefit much from the development of mass identification libraries.

The large-scale genetic analyses of gene expression and metabolite content have clearly shown their usefulness in constructing genetic regulatory networks. Yet, none of these approaches can fully explain the complex regulation of phenotypic quantitative traits. Moreover, interactions and cross-talk between components of the various regulatory levels are probably prominent and it is not always possible to distinguish cause and consequence of natural variation without further experimentation. However, most of the tools, including the genome sequence, are now available to study biological systems as a genetic system in its entirety. The integration of data collected from multiparallel analyses of the various interconnected transducers of biological information flow will therefore advance our understanding of complex biological systems.

Chapter five describes the integrative analysis of genetic variation in enzyme activities of primary carbohydrate metabolism. Carbohydrates are essential for many biological processes ranging from growth to energy metabolism and plants contain a multitude of enzymes for the allocation and conversion of the necessary compounds. Perturbations affecting the functionality of these enzymes can therefore have large effects on plant growth. The activities of 15 enzymes involved in carbohydrate metabolism were analyzed in the Landsberg erecta x Cape Verde Islands RIL population and subjected to QTL analyses. In addition, the expression of the structural genes encoding those enzymes and a number of carbohydrate metabolites, as substrates and products of the enzymes, were analyzed in parallel.

The natural variation observed for a large number of enzymes and metabolites could partly be explained by detected QTLs. Moreover, both positive and negative correlations were observed between enzyme activities and metabolite contents, although only few co-locating QTLs were detected. These findings suggested that genetic control of primary carbohydrate metabolism acts at different levels: a direct independent regulation of individual components and a more general simultaneous regulation of all components. Principal component analyses further suggested that such simultaneous regulation of carbohydrate metabolism might be under developmental control.

The parallel analysis of structural gene expression and enzyme activity also revealed distinct modes of regulation. From the position of structural genes, their eQTLs and enzyme activity QTLs, together with correlation analyses of gene expression and enzyme activity levels, the involvement of structural gene variation could be evaluated. In a number of cases cis-regulated expression variation of structural genes was suggested to be causal for observed variation in enzyme activity. However, trans-regulated expression variation was also observed and might have contributed to the observed variation in activity for some enzymes. Furthermore, the lack of expression variation in some instances indicated that altered protein function affects the specific activity of enzymes. Finally, the detection of enzyme activity QTLs not co-locating with structural genes encoding the enzyme under study suggests that other regulatory mechanisms, independent of structural genes, are active (e.g. post-translational control). The different regulatory mechanisms, including the role of metabolites, were demonstrated by the detailed analysis of phosphoglucomutase and UDP-glucose pyrophosphorylase activity, their structural genes and their respective substrates and products.

The work described in this thesis has shown the extensive variation in quantitative traits in Arabidopsis, including variation in gene expression and metabolite content. The use of natural variation in combination with genetic approaches such as QTL analyses has further shown its power in elucidating the often complex genetic regulation of traits. Moreover, the application of high-throughput ‘omic’ technologies enabled the construction of regulatory networks which were unlikely to be uncovered by targeted small-scale approaches. However, QTL analyses are limited by the amount of natural variation segregating in the mapping population, and the genetic make-up of the employed RIL population consists of only two genotypes. The analysis of traits in multiple populations, generated from different accessions, is likely to reveal additional regulatory steps. Alternatively, multiple distinct accessions could be intercrossed in a single mapping population, thereby increasing the amount of segregating natural variation. The ultimate mapping population, however, consists of the worldwide collection of accessions, and the recent advances in linkage disequilibrium mapping have only just begun to make the exploration of this comprehensive reservoir of natural variation possible (Nordborg et al., 2002).
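Because every line of a RIL population carries one of only two parental alleles at a given marker, the core of a QTL scan in such a population reduces to asking whether trait values differ between the two genotype classes. The sketch below illustrates this with a plain single-marker regression. It is a simplification and not the mapping procedure applied in this thesis. The allele codes 'A' and 'B', the function names and the input layout are assumptions, and missing genotypes are not handled.

```python
import math


def lod_single_marker(genotypes, trait):
    """LOD score for one marker in a homozygous, two-genotype population.

    genotypes: sequence of 'A' or 'B' (the two parental alleles), one per line
    trait:     sequence of trait values (e.g. line means) in the same order
    """
    n = len(trait)
    grand_mean = sum(trait) / n
    # Residual sum of squares without a QTL (one common mean).
    rss_null = sum((y - grand_mean) ** 2 for y in trait)
    if rss_null == 0.0:
        return 0.0  # no trait variation at all, nothing to explain

    # Residual sum of squares with a separate mean per genotype class.
    rss_marker = 0.0
    for allele in ("A", "B"):
        values = [y for g, y in zip(genotypes, trait) if g == allele]
        if not values:
            return 0.0  # marker does not segregate: no evidence for a QTL
        class_mean = sum(values) / len(values)
        rss_marker += sum((y - class_mean) ** 2 for y in values)

    if rss_marker == 0.0:
        return math.inf  # genotype explains the trait perfectly
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def scan(marker_table, trait):
    """Return (marker name, LOD) pairs for every marker, in the given order.

    marker_table: list of (name, genotypes) tuples, assumed to be in map order
    """
    return [(name, lod_single_marker(g, trait)) for name, g in marker_table]
```

Since only two alleles segregate, a scan of this kind can only detect loci at which the two parents differ, which is the limitation that multi-parent populations and linkage disequilibrium mapping aim to overcome.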


Another impediment to the comprehensive elucidation of genetic regulation is the often observed spatial and temporal control of quantitative traits. Due to cost and time considerations analyses are often limited in the number of developmental stages and tissues that can be sampled. However, to get a full understanding of the complex regulatory mechanisms of traits it is advisable to analyze traits in multiple developmental stages and tissues. Likewise, for many traits genetic interactions with the environment are observed and analyzing traits under different conditions might therefore reveal specific regulatory steps.

Although QTL analysis is powerful in mapping genomic regions, its resolution is often not high enough to identify the causal gene (QTG), and ultimately the causal changes at the nucleotide level (QTN), affecting the trait of interest. Due to the often small effects of QTLs and the complex regulation of traits, it is not always easy to obtain definitive proof for the identification of a QTG or QTN. Initial QTL mapping is usually followed by confirmation in NILs, which can also be used for fine-mapping. Once a select set of candidate genes has been defined, several lines of experimentation can be followed to provide evidence for the identification of a QTG or QTN. Such lines include natural variation surveys within and between species, comparative sequence analyses, gene expression analyses, functional (in vitro) gene analyses, knockout or mutational analyses, and (transgenic) complementation tests (Borevitz and Nordborg, 2003; Weigel and Nordborg, 2005). Although some of these analyses will provide stronger evidence than others, usually several tests are needed to demonstrate a causal link between allelic variation and a particular phenotype.

Finally, here Arabidopsis was chosen as a model plant for the genetic analyses of quantitative variation. The availability of the complete genome sequence, commercially available genome-wide microarrays and the publicly available high quality mapping populations together with the numerous tools and techniques developed for this species make Arabidopsis the perfect choice for these kinds of analyses (Alonso and Ecker, 2006). However, due to the developments in sequencing technologies and comparative genomics many of the findings in Arabidopsis can be readily ‘translated’ to other species (Gale and Devos, 1998; Hall et al., 2002). Moreover, given the rapid progress in genomic technologies and the increasing number of mapping populations for other (crop) species, the analyses described in this thesis need no longer be restricted to model species. Many of the tools and techniques developed in Arabidopsis can readily be applied to other species and the time has now come for applied sciences to benefit from the groundbreaking work in model species such as Arabidopsis thaliana.


REFERENCES

Alonso, J.M. and Ecker, J.R. (2006). Moving forward in reverse: genetic technologies to enable genome-wide phenomic screens in Arabidopsis. Nat Rev Genet 7, 524-536.
Alonso-Blanco, C., Peeters, A.J., Koornneef, M., Lister, C., Dean, C., van den Bosch, N., Pot, J. and Kuiper, M.T. (1998). Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J 14, 259-271.
Alonso-Blanco, C. and Koornneef, M. (2000). Naturally occurring variation in Arabidopsis: an underexploited resource for plant genetics. Trends Plant Sci 5, 22-29.
Borevitz, J.O., Liang, D., Plouffe, D., Chang, H.S., Zhu, T., Weigel, D., Berry, C.C., Winzeler, E. and Chory, J. (2003). Large-scale identification of single-feature polymorphisms in complex genomes. Genome Res 13, 513-523.
Borevitz, J.O. and Nordborg, M. (2003). The impact of genomics on the study of natural variation in Arabidopsis. Plant Physiol 132, 718-725.
Broman, K.W. (2001). Review of statistical methods for QTL mapping in experimental crosses. Lab Anim (NY) 30, 44-52.
Doerge, R.W. (2002). Mapping and analysis of quantitative trait loci in experimental populations. Nat Rev Genet 3, 43-52.
Eshed, Y. and Zamir, D. (1995). An introgression line population of Lycopersicon pennellii in the cultivated tomato enables the identification and fine mapping of yield-associated QTL. Genetics 141, 1147-1162.
Fiehn, O., Kopka, J., Dormann, P., Altmann, T., Trethewey, R.N. and Willmitzer, L. (2000). Metabolite profiling for plant functional genomics. Nat Biotechnol 18, 1157-1161.
Fiehn, O., Kloska, S. and Altmann, T. (2001). Integrated studies on plant biology using multiparallel techniques. Curr Opin Biotechnol 12, 82-86.
Fiehn, O. (2002). Metabolomics--the link between genotypes and phenotypes. Plant Mol Biol 48, 155-171.
Gale, M.D. and Devos, K.M. (1998). Plant comparative genetics after 10 years. Science 282, 656-659.
Hall, A.E., Fiebig, A. and Preuss, D. (2002). Beyond the Arabidopsis genome: opportunities for comparative genomics. Plant Physiol 129, 1439-1447.
Jansen, R.C. and Nap, J.P. (2001). Genetical genomics: the added value from segregation. Trends Genet 17, 388-391.
Jansen, R.C. (2003). Studying complex biological systems using multifactorial perturbation. Nat Rev Genet 4, 145-151.
Kliebenstein, D.J., West, M.A., van Leeuwen, H., Kim, K., Doerge, R.W., Michelmore, R.W. and St Clair, D.A. (2006). Genomic survey of gene expression diversity in Arabidopsis thaliana. Genetics 172, 1179-1189.
Koornneef, M. and Stam, P. (2001). Changing paradigms in plant breeding. Plant Physiol 125, 156-159.
Koornneef, M., Alonso-Blanco, C. and Vreugdenhil, D. (2004). Naturally occurring genetic variation in Arabidopsis thaliana. Annu Rev Plant Physiol Plant Mol Biol 55, 141-172.
Mitchell-Olds, T. and Schmitt, J. (2006). Genetic mechanisms and evolutionary significance of natural variation in Arabidopsis. Nature 441, 947-952.
Moco, S., Bino, R.J., Vorst, O., Verhoeven, H.A., de Groot, J., van Beek, T.A., Vervoort, J. and de Vos, C.H. (2006). A liquid chromatography-mass spectrometry-based metabolome database for tomato. Plant Physiol 141, 1205-1218.
Nordborg, M., Borevitz, J.O., Bergelson, J., Berry, C.C., Chory, J., Hagenblad, J., Kreitman, M., Maloof, J.N., Noyes, T., Oefner, P.J. et al. (2002). The extent of linkage disequilibrium in Arabidopsis thaliana. Nat Genet 30, 190-193.
Rockman, M.V. and Kruglyak, L. (2006). Genetics of global gene expression. Nat Rev Genet 7, 862-872.
Schauer, N., Steinhauser, D., Strelkov, S., Schomburg, D., Allison, G., Moritz, T., Lundgren, K., Roessner-Tunali, U., Forbes, M.G., Willmitzer, L. et al. (2005). GC-MS libraries for the rapid identification of metabolites in complex biological samples. FEBS Lett 579, 1332-1337.
Somerville, C. and Koornneef, M. (2002). Timeline: A fortunate choice: the history of Arabidopsis as a model plant. Nat Rev Genet 3, 883-889.
Torii, K.U., Mitsukawa, N., Oosumi, T., Matsuura, Y., Yokoyama, R., Whittier, R.F. and Komeda, Y. (1996). The Arabidopsis ERECTA gene encodes a putative receptor protein kinase with extracellular leucine-rich repeats. Plant Cell 8, 735-746.
Weigel, D. and Nordborg, M. (2005). Natural variation in Arabidopsis. How do we find the causal genes? Plant Physiol 138, 567-568.
Winnacker, E.L. (2003). Interdisciplinary sciences in the 21st century. Curr Opin Biotechnol 14, 328-331.
Zamir, D. (2001). Improving plant breeding with exotic genetic libraries. Nat Rev Genet 2, 983-989.

Summary

Plants show considerable genetic differences between accessions of the same species, which are reflected in phenotypic variation. This natural variation is often displayed as a continuous distribution of trait values and is therefore called quantitative variation. Quantitative variation is the result of the interplay of multiple genes and environmental factors. Because the contribution of each gene to the eventual phenotype can be quite small, sophisticated statistical methods are needed to associate genomic regions with the trait of interest. Such an approach is known as quantitative trait locus (QTL) analysis. In QTL analysis, the trait of interest is quantified in a genotyped mapping population derived from a cross between distinct genotypes.

Arabidopsis thaliana is the leading model species in modern plant sciences. Its short generation time, small and fully sequenced genome, and wide global distribution range make Arabidopsis especially suited for the analysis of quantitative traits. Because of its natural self-pollination, recombinant inbred lines (RILs) have become the mapping population of choice in Arabidopsis. In other species, however, near-isogenic lines (NILs) are favored because of intolerance to inbreeding and fertility problems. NILs are also useful for the confirmation and fine-mapping of QTLs identified in RIL populations. Chapter two describes the development of a genome-wide coverage NIL population from a cross between the distinct accessions Landsberg erecta (Ler) and Cape Verde Islands (Cvi), for which a RIL population was developed previously. The genetic make-up of these two types of populations differs in the number of introgressions that segregate in the population. RILs contain multiple introgressions, whereas NILs preferably contain only a single introgression per line. As a consequence, epistatic interactions between loci can be detected in RIL populations, in contrast to NIL populations. Furthermore, owing to the higher recombination frequency, fewer replications per line need to be analyzed in RIL populations. However, the simultaneous segregation of multiple QTLs diminishes the power to detect small-effect QTLs in RIL populations compared to NIL populations.

The segregation of phenotypic variation can be observed for numerous traits, including quantitative ones, in populations derived from crosses between Arabidopsis accessions. However, quantitative traits are often the result of many intermediary steps from genotype to phenotype. To fully understand the complex regulatory circuitry of quantitative traits it is therefore advisable to analyze genetic regulation at different levels of the biological information flow. Recent advances in 'omic' technologies now make the large-scale analysis of gene expression (transcriptomics), protein content (proteomics), and metabolite content (metabolomics) feasible.

The expression of genes often determines the onset of biological pathways, and it is conceivable that variation in expression levels is reflected in phenotypic variation. The genetic analysis of genome-wide gene expression variation in the Ler x Cvi RIL population in chapter three revealed high heritabilities for many genes, indicating that their expression is under genetic control. Indeed, for a substantial number of genes expression QTLs (eQTLs) could be detected. In-depth analysis uncovered both cis-regulated expression, resulting from polymorphisms in the gene itself, and trans-regulated expression, resulting from genetic differences in distant regulators. Identifying trans-regulators offers the possibility to determine gene-to-gene regulation and ultimately to construct regulatory networks. For a number of genomic regions, unexpectedly high numbers of trans-eQTLs were detected. Such hot spots are possibly caused by pleiotropic effects of regulators (e.g. transcription factors). When multiple genes involved in the same biological process map to the same position, this indicates that many of them might be regulated by the same gene. This information was successfully used to demonstrate the construction of a regulatory network for flowering time.

On the other end of the information chain, metabolites stand closest to physiological phenotypes. It is therefore likely that genetic variation leading to physiological differences is also the cause of differences in metabolite content. The untargeted metabolic analyses described in chapter four uncovered extensive natural variation in metabolite composition in 14 different accessions of Arabidopsis. QTL analysis of more than 2,000 high-quality mass peaks, detected in the Ler x Cvi RIL population, enabled the identification of QTLs for about 75% of the mass signals. The finding that more than one-third of the mass signals detected in the RILs were not detected in either parent suggests that many metabolites are formed due to the recombination of the parental genomes. The identification of anonymous mass peaks that appear to be co-regulated, based on the positions of their QTLs, enabled the (re)construction of metabolic pathways and uncovered novel biosynthetic steps and compounds in Arabidopsis. These results indicate the large potential for modification of metabolic composition through classical breeding.

Although each of the different entities in the path from genotype to phenotype can be effectively analyzed, a thorough understanding of the interaction between these different levels can only be obtained from the integrated study of multi-parallel analyses. In chapter five, the complex regulation of primary carbohydrate metabolism was analyzed in a case study. The activities of 15 enzymes involved in carbohydrate metabolism, in parallel with the expression of their structural genes and the contents of their metabolic substrates and products, were genetically analyzed in the Ler x Cvi RIL population. For many enzymes, QTLs explaining the variation observed in their activity were detected. A number of these QTLs co-located with the position of structural genes, indicating that natural variation in structural genes can be the cause of variation in enzyme activity. From the expression analyses of these structural genes it was concluded that both expression variation and variation in protein function determine the differences in observed enzyme activity. However, not all enzyme activity QTLs co-located with structural genes, suggesting that regulation occurs at multiple levels. Adding to the complexity of the regulation of carbohydrate metabolism, significant correlations between enzyme activities and metabolite contents were observed, although these were not always accompanied by co-locating QTLs. Further analysis suggested a relationship between the regulation of carbohydrate metabolism and plant development.

The results of this thesis demonstrate the power of combining genetic approaches with large-scale high-throughput technologies for the construction of genetic regulatory networks and metabolic pathways. The integration of multi-parallel analyses will further enhance our understanding of the complex circuitry of genetic regulation of quantitative traits.
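The core idea of QTL analysis summarized above, associating marker genotypes in a mapping population with a quantified trait, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the population is simulated, the function names (`simulate_rils`, `lod_scan`) are hypothetical, and the plain single-marker scan is a deliberately simple stand-in, since this summary does not state which mapping procedure was used in the thesis. Each RIL carries either the Ler allele (0) or the Cvi allele (1) at every marker, and the LOD score at a marker compares a model with one trait mean per allele against a model with a single overall mean.

```python
import math
import random


def simulate_rils(n_lines=160, n_markers=50, qtl_marker=20, effect=1.0, seed=1):
    """Simulate a toy RIL population (hypothetical data, not from the thesis)."""
    rng = random.Random(seed)
    genotypes, traits = [], []
    for _ in range(n_lines):
        line = [rng.randint(0, 1)]  # 0 = Ler allele, 1 = Cvi allele
        for _ in range(n_markers - 1):
            # Keep the previous allele most of the time to mimic linkage.
            line.append(1 - line[-1] if rng.random() < 0.1 else line[-1])
        genotypes.append(line)
        # One simulated QTL at qtl_marker plus Gaussian noise.
        traits.append(effect * line[qtl_marker] + rng.gauss(0.0, 1.0))
    return genotypes, traits


def rss(values):
    """Residual sum of squares around the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def lod_scan(genotypes, traits):
    """Single-marker scan: LOD = (n / 2) * log10(RSS_null / RSS_marker)."""
    n = len(traits)
    rss_null = rss(traits)
    lods = []
    for m in range(len(genotypes[0])):
        groups = {0: [], 1: []}
        for line, y in zip(genotypes, traits):
            groups[line[m]].append(y)
        if not groups[0] or not groups[1]:
            lods.append(0.0)  # monomorphic marker carries no information
            continue
        rss_marker = rss(groups[0]) + rss(groups[1])
        lods.append((n / 2.0) * math.log10(rss_null / rss_marker))
    return lods


if __name__ == "__main__":
    geno, trait = simulate_rils()
    lods = lod_scan(geno, trait)
    peak = max(range(len(lods)), key=lods.__getitem__)
    print(f"peak LOD {lods[peak]:.1f} at marker {peak}")
```

Because several QTLs segregate at once in a real RIL population, a scan like this one absorbs the effects of all other loci into the residual variance, which is the loss of power for small-effect QTLs that the summary contrasts with NIL populations.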


Samenvatting (Dutch summary)

Genetic differences between accessions of plants often manifest themselves as phenotypic variation. In many cases this natural variation shows a continuous distribution and is therefore also called quantitative variation. Quantitative variation is the result of the interplay of multiple genes and the influence of environmental factors. Because the contribution of each gene to the eventual phenotype can be very small, advanced statistical methods are needed to demonstrate an association between genomic regions and a particular trait. Such an approach is known as quantitative trait locus (QTL) analysis. In QTL analyses the trait of interest is quantified in a genetic mapping population obtained from a cross between different genotypes.

Arabidopsis thaliana (thale cress) is the most widely used model plant in modern plant sciences. The combination of a short life cycle, a small and fully sequenced genome, and a wide distribution across the world makes Arabidopsis eminently suited for the genetic analysis of quantitative traits. Because it is a self-fertilizer, recombinant inbred lines (RILs) are the most common genetic mapping population in Arabidopsis. In other species, however, near-isogenic lines (NILs) are more useful because of inbreeding and fertility problems in RILs. NILs are also very useful for confirming and precisely positioning QTLs found in RIL populations. Chapter two describes the development of a genome-wide coverage NIL population obtained from a cross between the distinct accessions Landsberg erecta (Ler) and Cape Verde Islands (Cvi). A RIL population had been developed earlier from this cross. The genetic make-up of these two types of population differs in the number of introgressions. RILs contain multiple introgressions, whereas NILs preferably contain only a single introgression per line. As a result, epistatic interactions can be demonstrated in RIL populations, in contrast to NIL populations. Moreover, fewer replicates per line can be analyzed in RIL populations because the recombination frequency is higher than in NIL populations. However, the chance of detecting small-effect QTLs is smaller in RIL populations than in NIL populations, because multiple QTLs segregate simultaneously.

For many traits, including quantitative ones, segregation of phenotypic variation can be observed in populations obtained from crosses between different Arabidopsis accessions. Quantitative traits, however, are often the result of many intermediate steps on the path from genotype to phenotype. To obtain a complete picture of the complex regulatory circuits of quantitative traits it is therefore advisable to analyze genetic regulation at different levels of the biological information flow. Recent progress in so-called 'omic' technologies now makes it possible to analyze the expression of genes (transcriptomics) and the presence of proteins (proteomics) and metabolites (metabolomics) on a large scale.

The expression of genes often determines the start of biological pathways, and it is plausible that variation in expression levels is reflected in phenotypic variation. The genetic analysis of genome-wide gene expression variation in the Ler x Cvi RIL population in chapter three showed that for many genes the variation found is heritable. This indicates that the expression of these genes is genetically regulated. Indeed, expression QTLs (eQTLs) were found for a large number of genes. Detailed analyses revealed both cis-regulated expression, resulting from polymorphisms in the gene itself, and trans-regulated expression, resulting from genetic differences in regulators elsewhere in the genome. The identification of trans-regulators offers the possibility to demonstrate gene-to-gene regulation and ultimately to construct regulatory networks. For a number of genomic regions, unexpectedly high numbers of trans-eQTLs were found. These hot spots are possibly caused by pleiotropic effects of regulators (e.g. transcription factors). When an eQTL is found at the same position for multiple genes that are each involved in the same biological process, this indicates that many of them may be regulated by the same gene. This information was successfully applied to demonstrate the construction of a regulatory network for flowering time.

At the other end of the information chain, metabolites stand closest to the eventual physiological phenotype. It is therefore likely that genetic variation leading to physiological differences is also the cause of differences in metabolite levels. The untargeted metabolite analyses described in chapter four demonstrated extensive natural variation in metabolite composition among 14 different accessions of Arabidopsis. QTL analyses of more than 2,000 high-quality mass peaks, detected in the Ler x Cvi RIL population, resulted in the identification of QTLs for about 75% of the mass peaks. More than one-third of the mass peaks that could be detected in the RILs were found in neither of the parents. This suggests that many metabolites were formed through the recombination of the parental genomes. The identification of mass peaks that appear to be identically regulated on the basis of QTL positions made it possible to (re)construct metabolic pathways and to demonstrate novel biosynthetic steps and metabolites in Arabidopsis. These results reflect the high potential for modifying metabolite composition through classical breeding.

Although each of the different steps on the path from genotype to phenotype can be analyzed effectively, a full understanding of the interaction between the different levels can only be obtained through integrated studies of multi-parallel analyses. In chapter five the complex regulation of primary carbohydrate metabolism was analyzed in a model study. The activities of 15 enzymes that all play a role in this metabolism were genetically analyzed in the Ler x Cvi RIL population, in parallel with the expression of their structural genes and the accumulation of their metabolic substrates and products. For many enzymes, QTLs could be found that explained the variation in activity. Some of these QTLs were found at positions of structural genes. This indicates that natural variation in structural genes can be the cause of the variation in enzyme activity. From expression analyses of these structural genes it could be concluded that both expression variation and variation in enzyme function determine the differences in enzyme activity. However, not all enzyme activity QTLs were found at positions of structural genes, which suggests that regulation takes place at multiple levels. The complex regulation of carbohydrate metabolism was further illustrated by the significant correlations between enzyme activities and metabolite accumulations in the absence of QTLs at identical positions. Statistical analyses suggested a relationship between carbohydrate metabolism and plant development as a possible cause of the correlations.

The results of this thesis demonstrate the power of combining genetic analyses with large-scale 'omics' technologies to construct genetic regulatory networks and metabolic pathways. The integration of multi-parallel analyses will further increase our understanding of the complex circuits of genetic regulation of quantitative traits.

Publications

Publications from this thesis

Keurentjes, J.J.B., J. Fu, I.R. Terpstra, J.M. Garcia, G. van den Ackerveken, L.B. Snoek, A.J.M. Peeters, D. Vreugdenhil, M. Koornneef and R.C. Jansen. Regulatory network construction in Arabidopsis using genome-wide gene expression QTLs. Proc Natl Acad Sci U S A. 104: 1708-13. 2007.

Keurentjes, J.J.B., L. Bentsink, C. Alonso-Blanco, C.J. Hanhart, H. Blankestijn-De Vries, S. Effgen, D. Vreugdenhil and M. Koornneef. Development of a Near Isogenic Line population of Arabidopsis thaliana and comparison of mapping power with a Recombinant Inbred Line population. Genetics. 175: 891-905. 2007.

Keurentjes, J.J.B., J. Fu, C.H. de Vos, A. Lommen, R.D. Hall, R.J. Bino, L.H.W. van der Plas, R.C. Jansen, D. Vreugdenhil and M. Koornneef. The genetics of plant metabolism. Nat Genet. 38: 842-9. 2006.

Fu, J., M.A. Swertz, J.J.B. Keurentjes and R.C. Jansen. MetaNetwork: a computational protocol for the genetic study of metabolic networks. Nat Protoc. 2: 685-94. 2007.

De Vos, C.H., S. Moco, A. Lommen, J.J.B. Keurentjes, R.J. Bino and R.D. Hall. Untargeted large-scale plant metabolomics using liquid chromatography coupled to mass spectrometry. Nat Protoc. 2: 778-91. 2007.

Keurentjes, J.J.B., M. Koornneef and D. Vreugdenhil. Genome studies and molecular genetics. In preparation for Curr Opin Plant Biol.

Keurentjes, J.J.B., R. Sulpice, Y. Gibon, J. Fu, M. Koornneef, M. Stitt and D. Vreugdenhil. Integrative analyses of genetic variation in enzyme activities of primary carbohydrate metabolism reveal distinct modes of regulation in Arabidopsis thaliana. In preparation for Genome Biol.

Fu, J., M. Dijkstra, J.J.B. Keurentjes, M. Koornneef, R. Breitling and R.C. Jansen. Systems biology through system genetics. In preparation.

Publications related to this thesis

Sergeeva, L.I., J.J.B. Keurentjes, L. Bentsink, J. Vonk, L.H.W. van der Plas, M. Koornneef and D. Vreugdenhil. Vacuolar invertase regulates elongation of Arabidopsis thaliana roots as revealed by QTL and mutant analysis. Proc Natl Acad Sci U S A. 103: 2994-9. 2006.

Teng, S., J.J.B. Keurentjes, L. Bentsink, M. Koornneef and S. Smeekens. Sucrose-specific induction of anthocyanin biosynthesis in Arabidopsis requires the MYB75/PAP1 gene. Plant Physiol. 139: 1840-52. 2005.

Juenger, T.E., J.K. McKay, N. Hausmann, J.J.B. Keurentjes, S. Sen, K.A. Stowe, T.E. Dawson, E.L. Simms and J.H. Richards. Identification and characterization of QTL underlying whole-plant physiology in Arabidopsis thaliana: δ13C, stomatal conductance and transpiration efficiency. Plant Cell Env. 28: 697-702. 2005.

Sergeeva, L.I., J. Vonk, J.J.B. Keurentjes, L.H.W. van der Plas, M. Koornneef and D. Vreugdenhil. Histochemical analysis reveals organ-specific quantitative trait loci for enzyme activities in Arabidopsis. Plant Physiol. 134: 237-45. 2004.

Sicard, O., O. Loudet, J.J.B. Keurentjes, T. Candresse, O. Le Gall, F. Revers and V. Decroocq. Identification of QTLs controlling symptom development during viral infection in Arabidopsis thaliana. Submitted to Plant Physiol.

Tessadori, F., M. van Zanten, P. Pavlova, B.L. Snoek, F.F. Millenaar, R.K. Schulkes, J.J.B. Keurentjes, R. van Driel, L.A.C.J. Voesenek, P. Fransz and A.J.M. Peeters. Natural variation in heterochromatin content among Arabidopsis accessions is controlled by light-intensity. In preparation for Plant Cell.

Other publications

Léon-Kloosterziel, K.M., B.W.M. Verhagen, J.J.B. Keurentjes, J.A. Van Pelt, M. Rep, L.C. Van Loon and C.M.J. Pieterse. Colonization of the Arabidopsis rhizosphere by fluorescent Pseudomonas spp. activates a root-specific, ethylene-responsive PR-5 gene in the vascular bundle. Plant Mol. Biol. 57: 731-48. 2005.

De Boer, M., P. Bom, F. Kindt, J.J.B. Keurentjes, I. Van der Sluis, L.C. Van Loon and P.A.H.M. Bakker. Control of Fusarium wilt of radish by combining Pseudomonas putida strains that have different disease-suppressive mechanisms. Phytopathology. 93: 626-32. 2003.

Léon-Kloosterziel, K.M., B.W.M. Verhagen, J.J.B. Keurentjes, L.C. Van Loon and C.M.J. Pieterse. Identification of genes involved in rhizobacteria-mediated induced systemic resistance in Arabidopsis. In: Induced Resistance in Plants Against Insects and Diseases (A. Schmitt and B. Mauch-Mani, eds), IOBC/wprs Bulletin. 25: 71-4. 2002.

Pieterse, C.M.J., J.A. van Pelt, S.C.M. van Wees, J. Ton, K.M. Léon-Kloosterziel, J.J.B. Keurentjes, B.W.M. Verhagen, M. Knoester, I. Van der Sluis, P.A.H.M. Bakker and L.C. van Loon. Rhizobacteria-mediated induced systemic resistance: triggering, signalling, and expression. Eur. J. Plant Pathol. 107: 51-61. 2001.

De Boer, M., I. van der Sluis, J.J.B. Keurentjes, L.C. van Loon and P.A.H.M. Bakker. Modes of action of suppression of fusarium wilt of radish by the combination of Pseudomonas putida RE8 and P. fluorescens RS111. In: Proceedings of the Fifth International Plant-Growth Promoting Rhizobacteria Workshop. Cordoba, Argentina. 2000.

De Boer, M., I. Van der Sluis, J.J.B. Keurentjes, L.C. van Loon and P.A.H.M. Bakker. Verbetering van biologische beheersing van Fusarium-verwelkingsziekte in radijs door het gebruik van combinaties van Pseudomonas-stammen. Gewasbescherming. 31: 56-7. 2000.

Pieterse, C.M.J., S.C.M. van Wees, J. Ton, K.M. Léon-Kloosterziel, J.A. van Pelt, J.J.B. Keurentjes, M. Knoester and L.C. van Loon. Rhizobacteria-mediated induced systemic resistance (ISR) in Arabidopsis: involvement of jasmonate and ethylene. In: Biology of Plant-Microbe Interactions, Volume 2 (P.J.G.M. De Wit, T. Bisseling and W.J. Stiekema, eds), The International Society for Molecular Plant-Microbe Interactions, St. Paul, MN., 291-6, 1999.

De Boer, M., P. Bom, F. Kindt, I. Van der Sluis, J.J.B. Keurentjes, L.C. van Loon and P.A.H.M. Bakker. Het gebruik van de combinatie van Pseudomonas putida stammen RE8 en WCS358 kan biologische bestrijding van Fusarium verwelkingsziekte in radijs verbeteren. Gewasbescherming. 30: 85-6. 1999.

Falque, M., J.J.B. Keurentjes, J.M.T. Bakx-Schotman and P.J. van Dijk. Development and characterisation of microsatellite markers in the sexual-apomictic complex Taraxacum officinale (dandelion). Theor. Appl. Genet. 97: 283-92. 1998.

Gorissen, A., J.H. van Ginkel, J.J.B. Keurentjes and J.A. van Veen. Grass root decomposition is retarded when grass has been grown under elevated CO2. Soil Biol. Biochem. 27: 117-20. 1995.

Bonnier, F.J.M., J.J.B. Keurentjes and J.M. van Tuyl. Ion leakage as a criterion for viability of lily bulb scales after storage at -2°C for 0.5, 1.5 and 2.5 years. HortScience. 29: 1332-4. 1994.

Honours, Awards and Fellowships

CBSG Special Achievement Award. 2005.

ZonMw, National Genomics Initiative. International research fellowship. 2006.

Curriculum vitae

Joost Keurentjes was born on 25 December 1968 in Doetinchem. After obtaining the HAVO diploma at the ISALA college in Silvolde in 1988, he completed the MLO in Arnhem in 1992. He then worked for two years as a research assistant at AB-DLO before starting the study of plant biotechnology at the IAHL in Velp in 1994. This study was completed in 1997 and was followed by several appointments as a research assistant at IPO-DLO and Utrecht University (Phytopathology), and as a senior researcher at Hercules b.v. In 2002 he started the PhD research described in this thesis at the chair groups Genetics and Plant Physiology of Wageningen University. As of 1 May 2007 he is employed as a postdoctoral researcher at the aforementioned chair groups.

Nawoord (Afterword)

By the time you read this, I hope that you have also taken the trouble, or will yet take it, to go through the preceding chapters. The content of this thesis has, after all, been put together with care and dedication, and fortunately not by me alone. Mentioning everyone who contributed to its completion would only lead to a dry enumeration. It is beyond dispute, however, that this booklet would have looked very different without the help of many. Although the contribution of one may have been more extensive or more meaningful than that of another, I do not want to make any distinction in appreciation. I consider myself fortunate to have been part of an extensive network of specialists whose expertise I have gratefully used. I have tried to depict the complexity of this network in the figure on the next page. Connoisseurs will immediately notice that it is a topologically robust, modular and scale-free hierarchical network with a high degree of connectivity. In practice this is synonymous with a high-quality collaboration with short lines between the participants, the so-called 'small-world effect' (everyone knows someone who knows someone else).

Still, I would like to single out a number of people who have meant a great deal to me over the past years. In the first place my promotors Maarten and Linus and co-promotor Dick. This synergistic trinity piloted me through my PhD trajectory virtually without problems. A special word of thanks also goes to my two paranymphs. Jingyuan, without whose help and tireless dedication a large part of this thesis would not have come about. Judith, with whom, as fellow PhD student and office mate, I spent more than four years. We shared many joys and sorrows, and fortunately more joys than sorrows.

It remains for me to mention that I accomplished it all with great pleasure. The confidence placed in me to add another four years therefore pleases me greatly.

Joost



Education Statement of the Graduate School Experimental Plant Sciences

Issued to: Joost J.B. Keurentjes
Date: 7 September 2007
Group: Laboratories of Plant Physiology and Genetics, Wageningen University

1) Start-up phase (activity | date)
- First presentation of your project
    Using natural variation for dissecting pathways of plant performance traits | Apr 07, 2003
- Writing or rewriting a project proposal
- Writing a review or book chapter
- MSc courses
- Laboratory use of isotopes
Subtotal Start-up Phase: 1.5 credits*

2) Scientific Exposure (activity | date)
- EPS PhD student days
    EPS PhD student day, Utrecht University | Mar 27, 2003
    EPS PhD student day, Vrije Universiteit Amsterdam | Jun 03, 2004
    EPS PhD student day, Radboud University Nijmegen | Jun 02, 2005
    EPS PhD student day, Wageningen University | Sep 19, 2006
- EPS theme symposia
    EPS Theme 3 symposium 'Metabolism and Adaptation', Wageningen university | Mar 23, 2003
    EPS Theme 3 symposium 'Metabolism and Adaptation', Wageningen university | Oct 25, 2004
    EPS Theme 3 symposium 'Metabolism and Adaptation', Utrecht university | Nov 24, 2005
- NWO Lunteren days and other National Platforms
    ALW meeting Experimental Plant Sciences, Lunteren | Apr 07-08, 2003
    ALW meeting Experimental Plant Sciences, Lunteren | Apr 05-06, 2004
    ALW meeting Experimental Plant Sciences, Lunteren | Apr 04-05, 2005
    ALW meeting Experimental Plant Sciences, Lunteren | Apr 02-03, 2007
- Seminars (series), workshops and symposia
    Frontiers in Plant Science, Wageningen university | 2003
    Flying seminars, Wageningen university | 2003-2007
    CBSG Cluster meeting Arabidopsis, Wageningen (3x) | 2004-2005
    CBSG Summit, Wageningen (2x) | 2005 & 2007
    15th symposium ALW-Discussion Group "Secondary Metabolism in Plant and Plant Cell", Zeist | May 20, 2005
    16th symposium ALW-Discussion Group "Secondary Metabolism in Plant and Plant Cell", Leiden | Oct 6, 2006
    Netherlands BioInformatics Centre Workshop Bioinformatics for Metabolomics, Wageningen | Nov 29, 2006
- Seminar plus
- International symposia and congresses
    7th International Congress of Plant Molecular Biology (ISPMB), Barcelona, Spain | Jun 23-28, 2003
    Keystone symposia, Biological discovery using diverse high-throughput data, Steamboat Springs, USA | Mar 30-Apr 4, 2004
    16th International Conference on Arabidopsis Research, Madison, USA | Jun 15-19, 2005
    4th Plant Genomics European Meetings, Amsterdam, The Netherlands | Sep 20-23, 2005
    15th Crucifer Genetics Workshop: Brassica 2006, Wageningen, The Netherlands | Sep 30-Oct 4, 2006
    5th Plant Genomics European Meetings, Venice, Italy | Oct 11-14, 2006
    18th International Conference on Arabidopsis Research, Beijing, China | Jun 20-23, 2007
- Oral presentations
    7th International Congress of Plant Molecular Biology (ISPMB), Barcelona, Spain | Jun 24, 2003
    Heidelberg Institute of Plant Science, Heidelberg University, Heidelberg, Germany | Sep, 2003
    CBSG Cluster meeting Arabidopsis, Wageningen (3x) | 2004-2005
    EPS Theme 3 symposium Metabolism and Adaptation, Wageningen university | Oct 25, 2004
    Plant Research International, Bioscience, Plant Development Systems, Wageningen | Nov, 2004
    Plant Research International, Bioscience, Metabolic Regulation, Wageningen | Dec, 2004
    ALW meeting Experimental Plant Sciences, Lunteren | Apr 5, 2005
    EPS/VLAG/CBSG Workshop Metabolomics, Wageningen University | May 3, 2005
    16th International Conference on Arabidopsis Research, Madison, USA | Jun 17, 2005
    4th Plant Genomics European Meetings, Amsterdam, The Netherlands | Sep 23, 2005
    EPS Theme 3 symposium Metabolism and Adaptation, Utrecht university | Nov 24, 2005
    Max-Planck-Institute of Molecular Plant Physiology, Golm, Germany | May 5, 2006
    De Ruiter Seeds, Bergschenhoek | Jul 25, 2006
    15th Crucifer Genetics Workshop: Brassica 2006, Wageningen, The Netherlands | Oct 4, 2006
    16th symposium ALW-Discussion Group "Secondary Metabolism in Plant and Plant Cell", Leiden | Oct 6, 2006
    5th Plant Genomics European Meetings, Venice, Italy | Oct 14, 2006
    Netherlands BioInformatics Centre, Workshop Bioinformatics for Metabolomics, Wageningen | Nov 29, 2006
    CBSG Summit, Wageningen | Feb 6, 2007
    ALW meeting Experimental Plant Sciences, Lunteren | Apr 2-3, 2007
    Utrecht Genetic Seminar Series, Hubrecht laboratory, Utrecht | Apr 12, 2007
    Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, China | Jun 20, 2007
    18th International Conference on Arabidopsis Research, Beijing, China (2x) | Jun 21, 2007
    PhD summerschool; Environmental signaling: Arabidopsis as a model, Utrecht University | Aug 27, 2007
- IAB interview | May, 2005
- Excursions
Subtotal Scientific Exposure: 37.2 credits*

3) In-Depth Studies (activity | date)
- EPS courses or other PhD courses
    International Summerschool, The analysis of natural variation within crop and model plants, Wageningen | Apr 22-25, 2003
    EPS Summerschool, Functional Genomics: theory and hands-on data analysis, Utrecht university | Aug 25-28, 2003
    EPS/VLAG/CBSG Workshop Metabolomics, Wageningen University | May 2-4, 2005
    ABIES/PE&RC/EPS/SdV Workshop Mathematics in Plant Biology, Paris, France | Jun 30-Jul 1, 2005
    PhD summerschool; Environmental signaling: Arabidopsis as a model, Utrecht University | Aug 27-29, 2007
- Journal club
    Member of literature discussion group | 2002-2003
- Individual research training
    Netherlands Genomics Initiative fellowship, MPI for Molecular Plant Physiology, Golm, Germany | Feb 1-May 5, 2006
Subtotal In-Depth Studies: 9.6 credits*

4) Personal development (activity | date)
- Skill training courses
- Organisation of PhD students day, course or conference
    Wageningen International, Training programme on the conservation, management and use of plant genetic resources in agriculture; Biotechnology for genetic resources conservation and crop improvement | May 9-Jul 1, 2005
- Membership of Board, Committee or PhD council
    Member of PhD council | 2003-2006
Subtotal Personal Development: 2.9 credits*

TOTAL NUMBER OF CREDIT POINTS*: 51.2

Herewith the Graduate School declares that the PhD candidate has complied with the educational requirements set by the Educational Committee of EPS, which comprise a minimum total of 30 credits.

* A credit represents a normative study load of 28 hours of study.

The research described in this thesis was performed in the project 'QTL express: identification of plant performance traits in Arabidopsis by combining high-throughput mapping and expression profiling', financially supported by a grant from the Netherlands Organization for Scientific Research, Program Genomics (050-10-029).

Printed by: Ponsen & Looijen BV, Wageningen, The Netherlands.