Validation of QTL related to Soluble Solid Content and Quantitative Behaviour of Sugars in a Tomato F2 Population

MSc. Thesis Report – Plant Breeding

Lorena Guardia Velarde

Daily supervisor: Dr. Yury Tikunov
Supervisors: Dr. Arnaud Bovy, Dr. ir. Sjaak van Heusden

April, 2015

Confidential

Table of Contents

List of Tables ______3

List of Figures ______3

List of acronyms ______4

Abstract ______5

Introduction ______6
  Tomato flavour ______6
  Wild tomato accessions related to flavour traits ______6
  Soluble solids content trait (Brix content) ______7
  QTLs related to brix content ______7
  Previous results on Brix QTL analysis in high Brix interspecific introgression lines ______8
  QTLs associated to brix content found by Petit (2014) in Chromosome 6 ______9

Aim ______10

Materials & Methods ______11
  DNA extraction ______11
  DNA quality analysis ______11
  Genotyping by Sequencing ______11
  Primers design ______11
  PCR ______12
  Sequencing ______12
  QTL analysis ______13
  CAPS Assay ______13
  Enzyme digestion ______13
  KASP assay ______13
  Metabolic profiling ______14

Results ______15
  Selection of plants from F2 lines ______15
  Verification of the introgression length and marker position in region VIIa and VIIb ______18
  QTL and linkage-map of Chromosome 6 and 9 ______22
  Genotyping the complete population by CAPS and KASP ______24
  CAPS assay ______24
  KASP assay ______25
  Metabolomic analysis of the predominant and their comparison with brix content ______30

Wageningen UR


  Comparison between sugars ______33

Discussion ______39
  Comparison between genotyping methods ______39
  Confirmation of the location, length of the introgression and gene involved in the QTL related to brix content in Chromosome 6 ______39
  Higher content was found in samples with higher brix content ______40
  Ratification of number of QTLs present in Chromosome 9 related to brix content ______40

Acknowledgments ______42

References ______43

Appendix ______45
  Sequencing data ______45
  Previous markers from Petit's thesis ______59
  Sugars values ______61


List of Tables
- Table 1. Description of agarose gel and master mix
- Table 2. Reaction Setup
- Table 3. Thermo-cycling conditions for a routine PCR
- Table 4. Marker seq-rs9017_b
- Table 5. Lines for sequencing
- Table 6. Detail of 14 primers
- Table 7. Positions of marker seq-rs9017 (region VIIa)
- Table 8. Markers in region VIIb
- Table 9. Final position of markers at region VIIb
- Table 10. Alleles of the 25 selected plants and 7 controls
- Table 11. Markers related to brix content in Chromosome 6
- Table 12. Kruskal-Wallis results, markers significantly linked to brix content trait
- Table 13. Kruskal-Wallis results, markers significantly linked to brix content trait
- Table 14. Samples for metabolomic study
- Table 15. Selected samples for metabolomics with alleles and brix values
- Table 16. List of the identity of the sugars and other metabolites present in tomato fruit samples
- Table 17. Groups for the metabolomic assay
- Table 18. T-test results between groups and sugar
- Table 19. T-test results between alleles and sugars, brix

List of Figures
- Figure 1. KASP primers.
- Figure 2. List of 14 positions found between seq-rs6813 and seq-rs6622
- Figure 3. Amplification of primers in region VIIa to evaluate their specificity
- Figure 4. Regions selected for sequencing
- Figure 5. Alignments for 4 markers in samples: MM8, S40, F1 and S ch
- Figure 6. Analysis of marker seq-rs9017_a
- Figure 7. Markers in Chromosome 9
- Figure 8. Markers in Chromosome 6
- Figure 9. Gel picture of the digestion
- Figure 10. KASP assay results for the whole population
- Figure 11. Complete set of samples with the values for each marker
- Figure 12. Markers at Chromosome 9
- Figure 13. Representation of Chromosome 9
- Figure 14. Markers in Chromosome 6
- Figure 15. Main sugars by groups
- Figure 16. Brix versus
- Figure 17. Brix versus
- Figure 18. Brix versus
- Figure 19. Average of sugars and brix content between the two groups and the standard errors (bars)
- Figure 20. Correlation between KASP markers with Brix content


List of acronyms
- bp: Base pairs
- °C: Degrees Celsius
- CAPS: Cleaved amplified polymorphic sequence
- Chr: Chromosome
- cM: centiMorgan
- F1 population: First filial population
- F2 population: Second filial population
- Fw: Forward
- g: Gram
- IL: Introgression Line
- KASP: Kompetitive Allele Specific PCR
- Lin: Acid invertase gene
- Mbp: Mega base pair
- ml: Millilitre
- mM: Millimolar
- mq: MilliQ water
- MM: MoneyMaker
- “N”: No call
- N: Nitrogen
- NTC: Non template control
- PCR: Polymerase Chain Reaction
- pH: Potential of hydrogen
- QTL: Quantitative Trait Locus
- Rv: Reverse
- RIL: Recombinant Inbred Lines
- rpm: Revolutions per minute
- SNP: Single Nucleotide Polymorphism
- S ch: Solanum chmielewskii
- SSC: Soluble Solid Content
- S40: Sweet 40
- µl: Microlitre
- V: Volt


Abstract

Tomato fruit composition involves several chemical compounds that together determine flavour. Sweetness depends on the type and amount of sugars present, and also on soluble solid content, pH, titratable acidity and fruit size. To estimate soluble solid content, the refractive index (°Brix) of the fruit juice is measured. Improving brix levels is an attractive target for breeding and for QTL research. Studies based on RILs or ILs derived from crosses between wild species and cv. MoneyMaker have been carried out to locate QTLs related to brix content in the genome. A previous study by Petit reported that, in an F2 population derived from the Sweet40 line and MoneyMaker, two QTLs linked to brix content were found on chromosomes 6 and 9. In the current thesis it was shown that the QTL previously assigned to chromosome 6 is located on chromosome 9. In addition, a correlation between sugars and alleles was found.

Keywords: °Brix, sugars, QTL, Sweet40, MoneyMaker, Chromosome 6 and 9.


Introduction

Tomato flavour

A general consumer opinion is that tomato flavour is poor. This might be due to the fact that flavour has not been the main focus in tomato breeding and the emphasis has been mainly on processing performance, yield, fruit size, firmness, lack of defects and disease resistance (Bucheli et al., 1999). This has led to an erosion of flavour quality in tomato (Fulton et al., 2002).

Tomato fruit contains more than 400 different volatile and non-volatile chemical compounds, which contribute to its odour and taste, respectively (Tiemen et al., 2006). The most prominent taste-related compounds are sugars (sucrose, glucose and fructose), organic acids (citric and malic acid) and amino acids. Humans perceive them through taste receptors on the tongue and olfactory receptors in the nose. Together, these components represent about 60% of the tomato dry matter weight (Causse et al., 2004).

Soluble sugar content is an important factor in fruit quality, for both processing and fresh market tomatoes. Firstly, tomato paste yield depends on solids content: fruits with high soluble solids content contain less water and need less processing. Secondly, flavour depends on the concentration of sugars and acids (Chetelat et al., 1993; Baxter et al., 2005). To be well accepted in the market, tomato fruit should combine a high total fruit yield with a high soluble solids content (Baxter et al., 2005).

Sweetness is a key determinant of quality and marketability in fruits and vegetables. This trait depends on the type and the source of the sugars present, which in turn depend on the genotype. Furthermore, sugar content also depends on total solids, pH, titratable acidity and fruit size (Nookaraju et al., 2010).

Wild tomato accessions related to flavour traits

Wild tomato germplasm is considered a useful source of many important agronomic traits for cultivated tomato. Several interspecific crossing populations now exist to study the genetics of these traits, including tomato fruit flavour. The wild tomato relative Solanum chmielewskii produces small yellow-green fruit with approximately twice the soluble sugar concentration of S. lycopersicum fruit. This species can be hybridized with cultivated tomato, and the genes that determine the high soluble solids content have been incorporated into tomato via backcrossing and selection (Chetelat et al., 1993). The fruit of S. chmielewskii, for instance, stores soluble sugars as sucrose, while S. lycopersicum accumulates glucose and fructose (Chetelat et al., 1993).

Although backcrossing has some drawbacks, it is still a good way to incorporate different traits into commercial varieties. The trait of sucrose accumulation might favour high soluble sugar accumulation, since the osmolarity of sucrose is half that of the hexoses. Moreover, sucrose may be less available for respiratory loss, and fruits that store high levels of sugars accumulate sucrose (Chetelat et al., 1993).


Soluble solids content trait (Brix content)

The cost of fruit processing to produce tomato paste could be reduced by increasing soluble solids content. Soluble solids content (sugar content) can be estimated by measuring the refractive index (Brix) of fruit juice: 1 degree Brix corresponds to 1 g of sucrose per 100 g of solution (Fulton et al., 2002). Enhancing the brix levels in raw tomato fruit is a significant goal of breeding programs and QTL studies (Fulton et al., 2002). However, this is a challenging task, since the trait appears to be controlled by many genetic factors, environmental factors and pleiotropic relationships, which can interfere with other important traits such as fruit yield, fruit size and plant canopy weight. For instance, a higher soluble solid content appears to be negatively related to fruit yield (Stevens et al., 1979; Bernacchi et al., 1998; Fridman et al., 2000).

QTLs related to brix content

In 1991, Yelle et al. reported a sucrose synthase on chromosome 3 that leads to sucrose accumulation, together with a loss of acid invertase activity during fruit development, in the sucrose-accumulating S. chmielewskii. In the same year, Miron and Schaffer reported that S. hirsutum (a wild relative) accumulated about 118 micromoles of sucrose per gram fresh weight in the final stages of development, while S. lycopersicum stored less than 15 micromoles of sucrose per gram fresh weight at the ripe stage. They also observed, in populations segregating for sucrose accumulation, that genotypes with high acid invertase activity do not store sucrose. Nevertheless, genotypes that possess low acid invertase activity do not necessarily accumulate sucrose, suggesting that low acid invertase activity is not the only factor responsible for sucrose accumulation.

Chetelat et al. (1993) stated that wild relatives producing green fruits, such as S. chmielewskii, S. hirsutum, S. hirsutum f. glabratum and S. peruvianum, store sucrose and are characterized by low invertase activity, while red (or orange) fruited species (S. cheesmanii, S. esculentum, S. pimpinellifolium) accumulate hexoses and have high invertase activity. Eshed and Zamir (1995) detected 23 QTLs for total soluble solids content in a population of 50 introgression lines (ILs) with S. lycopersicum x S. pennellii as parents. Finally, a negative relationship between total fruit yield and soluble solids concentration was reported in these studies.

Bernacchi et al. (1998) identified 5 QTLs related to soluble solids content: ssc3.1, ssc3.2, ssc5.1, ssc6.1 and ssc9.1. In this study, the introgression ssc6.1 from S. lycopersicum x S. hirsutum LA1777 was reported to decrease solids content in tomato fruit. The other four QTLs (ssc3.1, ssc3.2, ssc5.1 and ssc9.1) increased soluble solids content but were also associated with reduced yield. Thus, the rise in soluble solids content may be associated with a yield reduction and a concentration of photosynthates.

Fridman et al. (2000) identified 23 QTLs after crossing S. lycopersicum x S. pennellii. The most remarkable QTL was Brix9-2-5 on chromosome 9. This introgression from S. pennellii increased the soluble solids content. The total soluble solids in fruits of the wild relative can reach up to 15% of the fruit fresh weight. Brix9-2-5 carried an invertase gene, Lin5. Their objective was to elucidate whether Brix9-2-5 was the only QTL on the introgression. They found another one, PW9-2-5, involved in an alteration of the growth habit that resulted in an increase in plant weight, yield and Brix. Lin5 has been reported to be expressed only in flowers and in young fruits, with a transcript detectable in petal, stamen, ovary and small fruit. To study the function of the Lin5 gene, Zandor in 2009 performed an experiment silencing the expression of the gene to evaluate the consequences for floral and fruit development. The role of Lin5 in brix and soluble solids content could be confirmed. In addition, the morphological effects in the transgenic lines were confirmed. In 2005, Baxter and colleagues demonstrated the importance of Lin5 in sink organ development. Mutants and transgenic plants without the gene showed slower or aborted sink organ development.

Fulton et al. (2002) worked with four wild relatives (S. hirsutum, S. peruvianum, S. parviflorum and S. pimpinellifolium), with S. lycopersicum as recurrent parent. Around 65 QTLs involved in fructose, glucose, sucrose and total sugar traits were found. Fructose and glucose were highly correlated, because both originate from the breakdown of sucrose. Sucrose was also highly correlated with fructose and glucose. These three sugar traits and brix content were reported in a cluster of QTLs located at the top of chromosome 6. Twenty-two QTLs were found to be associated with brix content. The QTLs brx6.1/PM, brx8.1/PM and brx8.1/PR were reported to enhance brix levels without altering sugar levels.

An introgression from S. chmielewskii on chromosome 1 was reported by Frary et al. (2003). This introgression (56 cM) enhanced the soluble solids content.

Causse et al. (2004) reported sixty-three genes involved in carbon metabolism. Based on their functions, these constituted candidate genes for sugar and acid content. Some of these genes participate in the Calvin cycle, glycolysis, the TCA cycle, sugar metabolism, transport and other functions related to primary metabolism. The QTL Brix9-2-5 (located on chromosome 9) reported by Fridman et al. (2000) was not identified in this study, suggesting that it may be specific to S. pennellii.

Wild tomato relatives have been a rich source of genetic variation. These wild species have been crossed with a recurrent parent, e.g. S. lycopersicum, followed by analysis of the offspring so that their beneficial introgressions can be used in breeding.

Previous results on Brix QTL analysis in high Brix interspecific introgression lines

Recombinant Inbred Lines (RILs) with high Brix values in ripe fruits (“Sweet” RILs) were previously derived from an interspecific S. lycopersicum x S. chmielewskii crossing population. The population for the present study comes from a backcross between S. lycopersicum cv. Moneymaker as recurrent parent and a “Sweet” RIL as donor parent, reported by Hunk (2014). From this backcross, four lines were selected and tested for sweetness and brix content of the fruits. The Sweet40 line was reported as the best-performing one and was previously genotyped using a custom Infinium array (Petit, 2014; Víquez-Zamora et al., 2013).

Subsequently, an additional backcross was made between Sweet40 and Moneymaker. The F1 plants were genotyped with KASP markers. Then, this F1 (A2) was selfed and the offspring (F2) were analysed to detect the introgressions involved in brix content. The F2 population was genotyped by KASP with 58 markers from the previous array (Petit, 2014).

Petit's (2014) results showed two QTLs for Brix in the population. One QTL is located at the bottom of chromosome 6 (marker seq-rs9017, position 33.933 Mb).


QTLs associated to brix content found by Petit (2014) in Chromosome 6

In the QTL found by Petit, 3 genes were involved: a MADS-box transcription factor, a transposon and hexokinase 2. The most significant marker found by Petit was located in the transposon (seq-rs9017 at 33932733 bp). For this reason, the most attractive region for this study was the transposon area.

Transposons, also known as transposable elements, are DNA segments able to replicate themselves and move to another location in the genome. They are widely distributed throughout the genome and are an important component of eukaryotic genomes (Oliver et al., 2013). They are one of the factors that generate genomic variation, through the chromosome rearrangements they cause in genes (Oliver et al., 2013; Bennetzen and Wang, 2014). Transposons are divided into two classes. Class I elements need an RNA intermediate: the RNA transcript is used as a template and converted into DNA, which is then integrated. Class II elements move as DNA sequences (Chénais et al., 2012) and are also associated with genes (Lisch and Bennetzen, 2011). Their mechanism is to excise the DNA sequence from the chromosome and insert it elsewhere in the genome (Bennetzen and Wang, 2014).

Finally, to analyse the differences between the main sugars involved and their relation with brix content, metabolomic analyses have to be performed.


Aim

The aims of the present study were:

1. To validate the QTLs for soluble solids (Brix) content found previously on chromosomes 6 of the tomato.
2. To identify candidate genes located in the QTL in chromosome 6 that are related to brix content.
3. To provide insights in quantitative behaviour of individual sugars contributing to Brix in the population segregating for these QTLs.


Materials & Methods

DNA extraction

The DNA extraction was performed for the entire population and the controls: MoneyMaker 8 (MM8), Sweet40 (S40), Solanum chmielewskii (S. ch) and F1. Plant material was provided by Petit. The youngest leaves from all samples were used (approximately 3 leaves).

The leaves were collected in 2 ml Eppendorf tubes and kept in liquid nitrogen. After that, 2 bullets were added per tube and the samples were ground in the TissueLyser for 1 minute. Then, 950 µl of extraction solution was added to each sample (Table 1) and mixed by vortexing. The samples were incubated in a water bath (65 °C for 60 minutes), mixed by inversion every 15 minutes, and then put on ice. To extract the DNA, 450 µl of chloroform:amyl alcohol (24:1) was added and mixed by inversion for 5 minutes. Samples were centrifuged at 13000 rpm for 5 minutes. The supernatant was then collected in a new 2 ml tube, cold isopropanol (800 µl) was added to each sample, and the samples were mixed by inversion. Samples were centrifuged at 13000 rpm for 5 minutes and the supernatant was discarded. At this step, the DNA formed a visible pellet at the bottom of the tube. This step was repeated once. The pellet was rinsed in 250 µl of cold 76% ethanol for 20 minutes, followed by centrifugation at 13000 rpm for 5 minutes, and the ethanol was removed carefully. The pellets were dried in the oven at 37 °C for 30 minutes. Afterwards, the DNA was re-suspended in 100 µl of TE and 1 µl of RNase. Finally, samples were incubated at 37 °C for 20 minutes to dissolve the pellet.

DNA quality analysis

Once the DNA was dissolved, its concentration was measured using a Nanodrop. The instrument was first calibrated with MilliQ water (H2O mq), and then 1 µl of DNA was loaded onto the reader. To verify the quality, an agarose gel was used.

Table 1. Description of agarose gel and master mix.

Agarose gel          Sample master mix
1.2 g agarose        1 µl of DNA sample
200 ml TBE           1 µl of loading buffer
2 µl ethidium bromide   8 µl H2O MilliQ

Genotyping by Sequencing

Primers design

Fourteen suitable primer pairs were designed for PCR analysis with the Primer3Plus program and checked with BLAST to confirm their specificity.


PCR

DNA samples were diluted to a concentration of 100 ng/µl in 50 µl. A PCR (polymerase chain reaction) was performed to amplify the DNA samples. The Q5 High-Fidelity DNA Polymerase kit (New England BioLabs) was used with the following protocol (Tables 2 and 3):

Table 2. Reaction Setup.

Component                          1 reaction (25 µl)   Final concentration
5X Q5 Reaction Buffer              5 µl                 1X
10 mM dNTPs                        0.2 µl               200 µM
10 µM Forward Primer               1.25 µl              0.5 µM
10 µM Reverse Primer               1.25 µl              0.5 µM
Template DNA                       1 µl                 100 ng/µl
Q5 High-Fidelity DNA Polymerase    0.5 µl               0.02 U/µl
H2O mq                             15.75 µl             -
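The final concentrations in a reaction setup follow from the dilution relation C1·V1 = C2·V2. The sketch below is only an illustration of that arithmetic, applied to the primer and buffer rows of Table 2; the function name is hypothetical and not part of any protocol.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution, in the same unit as stock_conc (C1*V1 = C2*V2)."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Primer row: 1.25 µl of a 10 µM stock in a 25 µl reaction.
primer_final_um = final_concentration(10.0, 1.25, 25.0)
# Buffer row: 5 µl of a 5X stock in a 25 µl reaction.
buffer_final_x = final_concentration(5.0, 5.0, 25.0)

print(f"Primer: {primer_final_um} µM, buffer: {buffer_final_x}X")
```

The same function can be used to check any other row of a master mix before pipetting.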

Table 3. Thermo-cycling conditions for a routine PCR.

Step                   Temperature (°C)   Time
Initial denaturation   98                 30 seconds
7 cycles               98                 10 seconds
                       61                 20 seconds
                       72                 30 seconds
32 cycles              98                 10 seconds
                       58                 20 seconds
                       72                 30 seconds
Final extension        72                 7 minutes
Hold                   10                 ∞
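As a rough planning aid, the programmed hold time of the thermo-cycling conditions in Table 3 can be added up as sketched below. This is a minimal sketch: it ignores the ramping time between temperatures and the final hold, so the real run on a thermocycler will take longer.

```python
INITIAL_DENATURATION_S = 30
FINAL_EXTENSION_S = 7 * 60

# (number of cycles, [(temperature in °C, seconds), ...]) as listed in Table 3
CYCLING_BLOCKS = [
    (7, [(98, 10), (61, 20), (72, 30)]),
    (32, [(98, 10), (58, 20), (72, 30)]),
]


def programmed_hold_seconds():
    """Sum of all programmed steps, excluding ramping and the final hold."""
    cycling = sum(
        n_cycles * sum(seconds for _temp, seconds in steps)
        for n_cycles, steps in CYCLING_BLOCKS
    )
    return INITIAL_DENATURATION_S + cycling + FINAL_EXTENSION_S


total_s = programmed_hold_seconds()
print(f"{total_s} s, about {total_s / 60:.1f} min")
```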

Finally, PCR products were visualized on a 0.6% agarose gel. The gel was made as described before. Three µl of the PCR product was mixed with 2 µl of loading buffer and 5 µl of MilliQ water. These 10 µl were loaded on the gel and run for 45 minutes at 80 V.

Sequencing

PCR products of three primer pairs (II, VIIb and XII) were sent to GTAC Sequence Service for sequencing, to verify the length of the introgression on chromosome 6. Before all samples were sent, a couple of tubes were sent to check the resolution.

To sequence each sample the following elements were needed:

0.5 µl PCR product + 5 µl Forward primer + H2O mq

0.5 µl PCR product + 5 µl Reverse primer + H2O mq


QTL analysis

MapQTL 6 software was used to carry out the QTL analysis, using the Kruskal-Wallis test as a non-parametric quantitative test. This test was used to correlate the brix trait with the introgression selected on chromosome 6. Furthermore, JoinMap 4 software was used to calculate the genetic linkage map of the F2 population.
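To make the single-marker Kruskal-Wallis test concrete, the sketch below computes the H statistic for brix values grouped by genotype class at one marker. It is an illustration only, not the MapQTL implementation: it omits the correction for ties, and the brix values and genotype labels are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)


# Hypothetical brix values per genotype class at a single marker.
brix_by_genotype = {
    "MM": [4.4, 4.9, 5.1],
    "H": [5.4, 5.8, 5.9],
    "S40": [6.0, 6.1, 6.7],
}
h_statistic = kruskal_wallis_h(list(brix_by_genotype.values()))
print(f"H = {h_statistic:.2f}")
```

A large H relative to the chi-square distribution with (number of classes - 1) degrees of freedom indicates that the trait differs between genotype classes at that marker.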

CAPS Assay

The CAPS assay was performed on the whole population to confirm the sequencing results. Primers from region VIIb were used for this assay. The enzyme BbvCI cuts the region VIIb amplicon into two fragments in the MM genome, but not in the S40 genome. The cut occurs at position 33932614 bp, in marker seq-rs9017_b (Table 4). The exact length of the fragments and the position were obtained using Emboss.

Table 4. Marker seq-rs9017_b. Description of marker seq-rs9017_b located in chromosome 6, position 33932614 bp, in MM and S40. Sequence, in red the SNP.

Enzyme digestion

PCR products were digested with the BbvCI enzyme, which cuts at the following recognition sequence: CCTCAGC. To perform the digestion, 1 µl of buffer, 1 µl of enzyme and 3 µl of PCR product were combined in an Eppendorf tube and incubated overnight. The next morning, 2 µl of loading buffer was added to the digestion products. This mixture was loaded on a 1.5% agarose gel and run for 1 hour at 50 V to visualize the fragments. Finally, a picture was taken with the Octopus program.
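The logic of the CAPS assay can be sketched as an in-silico digestion: an amplicon that carries the CCTCAGC site is cut into two fragments, while an allele in which the SNP disrupts the site stays uncut. The example below is a minimal sketch with short hypothetical sequences, not the real region VIIb amplicon. The cut position inside the site (two bases in) is an assumption that should be checked against the enzyme supplier's documentation, and overhangs are ignored.

```python
SITE = "CCTCAGC"   # BbvCI recognition sequence, as given in the text
CUT_OFFSET = 2     # assumed cut position within the site (illustrative)

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def digest_fragments(amplicon):
    """Approximate fragment lengths (bp) after digestion of a linear amplicon."""
    seq = amplicon.upper()
    cuts = set()
    # The site is not palindromic, so both orientations are searched.
    for site in (SITE, reverse_complement(SITE)):
        start = seq.find(site)
        while start != -1:
            cuts.add(start + CUT_OFFSET)
            start = seq.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical toy amplicons: one with the site, one where a SNP removes it.
with_site = "AAAACCTCAGCTTTTT"
without_site = "AAAACCTCAACTTTTT"
print(digest_fragments(with_site))
print(digest_fragments(without_site))
```

On a gel, a plant homozygous for the cut allele would show the two fragments, a plant homozygous for the uncut allele the full-length amplicon, and a heterozygote all three bands.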

KASP assay

The KASP genotyping chemistry assay (www.lgcgroup.com) was used, like the CAPS analysis, to validate the sequencing results in the complete population. The primers were designed based on marker seq-rs9017_b (33932614 bp) located in region VIIb. The FAM dye was assigned to the MM genotype and the HEX dye to S40 (Figure 1).


Figure 1. KASP primers. Three primers for KASP; MM Fw, S40 Fw and Rv. In red FAM dye, in blue HEX dye. SNP located at the end of Fw primers.
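The way the two dyes translate into genotype calls can be illustrated with a simple rule-based sketch: a strong FAM signal points to the MM allele, a strong HEX signal to the S40 allele, both together to a heterozygote, and no signal to a no call (“N”). The thresholds and function name below are hypothetical; real KASP data are normally called from cluster plots in the instrument software.

```python
def call_kasp_genotype(fam, hex_signal, min_signal=0.2, ratio_cutoff=3.0):
    """Call a genotype from normalized FAM (MM allele) and HEX (S40 allele) signals.

    min_signal and ratio_cutoff are illustrative values, not validated thresholds.
    """
    if fam < min_signal and hex_signal < min_signal:
        return "N"  # no call, e.g. failed reaction or non template control
    if hex_signal == 0 or fam / hex_signal >= ratio_cutoff:
        return "MM"
    if fam == 0 or hex_signal / fam >= ratio_cutoff:
        return "S40"
    return "H"  # both dyes present: heterozygote


# Hypothetical normalized fluorescence readings (FAM, HEX).
for fam, hex_signal in [(1.0, 0.1), (0.1, 1.0), (0.8, 0.9), (0.05, 0.05)]:
    print(fam, hex_signal, call_kasp_genotype(fam, hex_signal))
```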

Metabolic profiling

Ripe fruits were harvested from each F2 plant of the population (5 - 6 fruits per plant). Fruits were cut into quarters and one quarter of each fruit was immediately frozen in liquid N. The frozen fruit material was then ground using an A11 IKA analytical mill and the powder was stored at -80 °C before metabolite extraction.

The fruits were crushed with a hammer and triturated by a grinder. Then, they were collected in plastic flasks and stored at -80°C.

A quantity of 300 mg of fruit powder was weighed into a 2 ml Eppendorf tube under liquid N. Then 0.7 ml of 100% methanol was added to the tubes with tomato material, and the tubes were sonicated in an ultrasonic bath for 10 min, followed by centrifugation at 14000 rpm for 10 min. Then, 500 µl of supernatant was collected in a new tube. After this, 1 ml of MilliQ water and 400 µl of chloroform were added to each sample and the samples were shaken for 2 minutes. Afterwards, samples were centrifuged for 10 minutes. Thirty µl of supernatant was transferred to 2 ml glass vials with inserts. These extracts were fully dried overnight in a Speedvac equipped with a -100 °C cryotrap. The vials were then capped with magnetic caps under nitrogen gas flushing and subjected to online derivatization using the methoxyamine hydrochloride : MSTFA method as described in Lisec et al. (2006).

The derivatives were then analysed by gas chromatography coupled to time-of-flight mass spectrometry (GC-TOF-MS) (Lisec et al., 2006).


Results

Selection of plants from F2 lines

Twenty-five plants were selected from Petit's data (2014). These plants came from a backcross between the Sweet 40 line and S. lycopersicum L. cv. MoneyMaker, which resulted in F1 plants. The F1 was then selfed to obtain F2 plants. These F2 plants are the ones used in this research.

The most important introgressions for brix content and plant vigour were found in chromosome 6 (6C introgression) and in chromosome 9 (9B introgression) (Petit, 2014).

The 25 plants carried both introgressions, except for two lines: #6 and #89 (Table 5). These 2 lines carried only introgression 6C. The analysis of these two lines makes it possible to elucidate the influence of chromosome 9 on brix and soluble solids content. Lines #6 and #89 showed an intermediate value for brix content.

Table 5. Lines for sequencing. 25 plants and 7 controls. Samples in green have the lowest brix content and vigour. Samples in red have an intermediate brix content and do not contain introgression 9B. Samples in blue have the highest brix content and vigour. Samples in white are considered controls.

Lines      Introgressions      Brix content
           6C      9B
27                             4.42
53                             4.87
80                             4.89
47                             5.06
24                             5.09
49                             5.19
6                              5.44
89                             5.55
73                             5.79
5                              5.82
83                             5.83
40                             5.86
39                             5.88
42                             5.91
81                             5.91
26                             5.92
43                             5.95
69                             5.95
16                             5.95
12                             6.02
86                             6.08
70                             6.11
65                             6.13
9                              6.23
64                             6.72
Controls
48                             4.66
77                             5.81
79                             5.72
MM8
F1
S40
S. ch.


Petit (2014) reported 6C as the most significant introgression related to brix content. The QTL found on Chr 6 was assigned to the marker seq-rs9017 (position 33932085 bp). Three regions between the markers seq-rs6813 (position 29257740 bp) and seq-rs6622 (position 37129439 bp) (both in red, Figure 2) were analysed to find other markers that could be linked with the QTL in addition to seq-rs9017. The markers were selected according to candidate genes (Table 6) that could be related to sugar content and plant vigour.

Figure 2. List of 14 marker positions found between seq-rs6813 and seq-rs6622. 3 groups of markers were defined. In red, markers seq-rs6813 and seq-rs6622. Marker seq-rs9017, in green, is where the QTL related to brix content was found (Petit, 2014).

Fourteen markers were chosen to cover the probable region of introgression in region 6C. These markers were designed by contrasting the genomes of S. lycopersicum var. MM, S. chmielewskii LA2663 and LA2695. Sequences of the Tomato genome project were used as templates to verify that the 14 primers did not include any SNP (Table 6).


Table 6. Detail of 14 primers. Marker position, exact location in bp of each marker. Number of the 14 regions (in roman numerals) into which introgression 6C has been divided. Primer, name of the forward and reverse primer. Sequence, nucleotides. Length of the amplicon and length of the primer in bp. Tm, melting temperature in °C. GC %, guanine-cytosine content. Physical map (bp) of chr 6 including the 14 regions.


Verification of the introgression length and marker position in region VIIa and VIIb

Performance of markers designed in the hypothetical Brix QTL region of chromosome 6 (Table 6) was evaluated. After the design of the primers, a confirmation of their specificity in an agarose gel was required. In order to verify the amplification of the 14 designed primers, several PCRs were made using gDNA of the selected 25 F2 plants as templates. All the sets of primers produced amplification products of predicted sizes, except for one set in region VIIa (Figure 3) - the region where a Brix QTL was found (Petit, 2014). The agarose gel showed different bands per sample and did not show the band of 1051 bp that was estimated.

Figure 3. Amplification of primers in region VIIa to evaluate their specificity. The first and last lanes are the ladder. Each band represents the amplification of the delimited amplicon in region VIIa for the 25 plants and one control. Numbers at the top are the sample numbers, and MM8 is MoneyMaker 8, the control. The green line (1051 bp) shows where the PCR products were expected. The different positions of the bands indicate that the primer pair was unspecific. The bands of #40 and #64 are circled in red. These two samples were chosen for further analysis as representatives of the upper and lower bands.

To verify whether the two PCR products of different sizes in region VIIa correspond to the sequences of the tomato genome used as template to design the primers, two samples, #40 and #64, were chosen and analysed by sequencing. These samples were representative of the lower and the upper band; each sample showed only one of the bands in the gel (Figure 3, red circles).

For the other regions, two samples per primer pair were sequenced to elucidate whether the introgression was present and to delimit its length. The selected samples were #5 and #89 (both samples showed heterozygosity in Petit's results, data not shown) for regions II and XII (Figure 4). These regions were chosen because of their proximity to the edges of the hypothetical introgression 6C.


Figure 4. Regions selected for sequencing. PCR products of the primer pairs for regions II, VIIa and XII (in green) were sent for sequencing to elucidate the size of the introgression (primers of regions II and XII) and to find out whether the two PCR products correspond to the tomato genome templates used to design the primers (region VIIa).

The sequencing results for samples #5 and #89 at regions II and XII showed the MM genotype for the markers. Subsequent sequencing of the next markers towards the QTL centre also revealed the MM alleles. Thus marker seq-rs9017, which had previously been shown to be linked with the Brix QTL on chromosome 6, remained the only marker for this particular QTL.

The sequencing result of region VIIa, which was expected to include SNP seq-rs9017, did not show the marker in its proposed position. This marker was expected to be located at position 33932085 on chromosome 6 (Petit, 2014). However, when the marker sequence obtained by sequencing in this study was aligned with the tomato genome sequence, it did not map to the expected position. Moreover, the marker sequence did not map anywhere else in region VIIa either. The marker sequence at that position in the tomato genome project differed from the one in the Infinium array (Table 7).

Table 7. Positions of marker seq-rs9017 (region VIIa). Sequence of the marker in the tomato genome project and in the Infinium array. In red, the SNP.

Tomato genome project sequence    Infinium array sequence
TTATGTCTG                         CCAAGTCTCT

To clarify the position of the marker, the Infinium array sequence was analysed. This analysis revealed that the annotation of the marker was wrong: the correct position is 33932614 bp and not 33932085 bp as specified by Petit (2014).

Region VIIa spans 33931263 to 33932314 bp, so the new position of the marker (33932614 bp) is not included in this region. Region VIIb, however, spans 33932288 to 33933041 bp and therefore includes position 33932614.
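The coordinate comparison above can be reproduced with a short script. This is only a minimal sketch: the region boundaries and marker positions are those given in the text, while the function name and the use of closed intervals are assumptions made for illustration.

```python
# Region boundaries on chromosome 6 (bp), as reported in the text.
REGIONS = {
    "VIIa": (33931263, 33932314),
    "VIIb": (33932288, 33933041),
}


def regions_containing(position_bp, regions=REGIONS):
    """Return the names of all regions whose closed interval contains the position."""
    return [
        name
        for name, (start, end) in regions.items()
        if start <= position_bp <= end
    ]


annotated = 33932085  # position of seq-rs9017 given by Petit (2014)
corrected = 33932614  # position of seq-rs9017 found in this study

print("annotated:", regions_containing(annotated))  # ['VIIa']
print("corrected:", regions_containing(corrected))  # ['VIIb']
```

The two regions overlap by a few base pairs (33932288 to 33932314), so a position in that stretch would be reported for both regions.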

The complete region VIIb is part of a transposon gene, protein CACTA En/Spm sub-class (Solyc06g054690.2.1), which has 5 exons. This region includes 2 markers: transposon protein CACTA at 33932441 bp and seq-rs9018 at 33932756 bp. With its corrected position (33932614 bp), marker seq-rs9017 also falls within region VIIb (Table 8).

Table 8. Markers in region VIIb. Region VIIb contains 3 markers: transposon protein CACTA (33932441 bp), seq-rs9018 (33932756 bp) and seq-rs9017 (33932614 bp), which was previously considered part of region VIIa. In red, the SNP.

Marker                     Position    Sequence
Transposon protein CACTA   33932441    AAGGGCTAA
seq-rs9017                 33932614    TGAGGCTTG
seq-rs9018                 33932756    CATTCGTGG

Because region VIIb contained 3 markers, including seq-rs9017, the marker significantly associated with the Brix QTL, it was selected as the next region for sequencing. Figure 5 shows the 3 markers in that region. To verify the positions of the markers, the sequences at the SNPs were compared with the controls: MM, S40, F1 and S ch. For all markers, MM and S40 carried different alleles, confirming the presence of the SNP.

For the first marker, transposon protein CACTA, the SNP was found at the nucleotide A and not at the G as expected (Table 8, Figure 5). Thus, the new position of the marker is 33932438 bp (Table 9). For this marker, F1 and S ch. carried the same allele as MM (Figure 5).

For the second marker, seq-rs9017, another SNP was found 4 base pairs before the original one (Figure 5, the 2 red arrows in the yellow sequence). Therefore, this marker was split in two: seq-rs9017_a at 33932610 bp and seq-rs9017_b at 33932614 bp (Table 9). For seq-rs9017_a, F1 showed one MM allele and one “N” (no call), whereas both alleles of S ch. were identical to MM. For seq-rs9017_b, F1 showed two “N” calls and S ch. carried the S40 alleles (Figure 5).

Finally, marker seq-rs9018 was heterozygous in F1 and homozygous for the S40 allele in S ch. (Figure 5).

Table 9. Final positions of the markers in region VIIb. Four markers were located in region VIIb instead of 3. Marker seq-rs9017 was split in 2 (seq-rs9017_a and seq-rs9017_b) because another SNP was found 4 bp before the original one. Additionally, marker transposon protein CACTA was located 3 bp before the expected position. New positions in bp and SNPs are in red.

Marker                     Position    Sequence
Transposon protein CACTA   33932438    AAGGGCTAA
seq-rs9017_a               33932610    TGAGGCTTG
seq-rs9017_b               33932614    TGAGGCTTG
seq-rs9018                 33932756    CATTCGTGG

Figure 5. Alignments for the 4 markers in samples MM8, S40, F1 and S ch. Markers transposon protein CACTA, seq-rs9017_a, seq-rs9017_b and seq-rs9018 are represented in the three upper rows. The following 8 rows correspond (in pairs) to the MM8, S40, F1 and S ch. sequences. The red arrows indicate the exact position of each marker.

To resolve the “N” calls in F1, the chromatograms at marker seq-rs9017 were evaluated manually. For marker seq-rs9017_a, the allele on the reverse strand was analysed (Figure 6).

Figure 6. Analysis of marker seq-rs9017_a. On the left side, in yellow, the marker sequence compared with sample F1. Rows indicate the exact position of the SNP (“N”). On the right side are the sequence chromatograms. At the bottom, two peaks are visible at the SNP position, indicating heterozygosity.

Figure 6 shows two different peaks at the “N” position. Thus, heterozygosity was confirmed for marker seq-rs9017_a in F1. The same procedure was followed for marker seq-rs9017_b to resolve the two “N” nucleotides. Two peaks were present at the marker position, indicating that F1 was also heterozygous for seq-rs9017_b. Every sample with an “N” at a marker position was analysed in the same way (Appendix, Table 1, Figures 1-14). The compiled results for the 25 plants and 7 controls are presented in Table 10.
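The manual inspection of the chromatograms was done by eye. As an illustration only, the same decision rule (one peak means homozygous, two peaks mean heterozygous) could be written as follows. The function name, the input format and the 0.3 peak-ratio cutoff are hypothetical and were not part of this study.

```python
def call_snp(peak_heights, min_secondary_ratio=0.3):
    """Call a SNP position from the chromatogram peak heights of the four bases.

    peak_heights: dict mapping base ('A', 'C', 'G', 'T') to signal height.
    min_secondary_ratio: hypothetical cutoff; a second peak at least this
        fraction of the main peak is treated as a real second allele.
    Returns 'A' for a single peak, 'A/G' for two peaks, 'N' if there is no signal.
    """
    ranked = sorted(peak_heights.items(), key=lambda item: item[1], reverse=True)
    (base1, height1), (base2, height2) = ranked[0], ranked[1]
    if height1 <= 0:
        return "N"  # no usable signal at this position
    if height2 / height1 >= min_secondary_ratio:
        return "/".join(sorted((base1, base2)))  # two peaks: heterozygous
    return base1  # one dominant peak: homozygous


# Invented peak heights, for illustration only.
print(call_snp({"A": 900, "C": 20, "G": 850, "T": 10}))
print(call_snp({"A": 900, "C": 20, "G": 30, "T": 10}))
```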

Table 10. Alleles of the 25 selected plants and 7 controls, with the position of each marker. In blue, alleles from MM8 and S40. Both controls were used to determine whether each sample was homozygous or heterozygous. The DNA of control #79 did not yield a call for any of the markers.

QTL and linkage-map of Chromosome 6 and 9

A marker-trait association analysis was performed using the new marker information and the non-parametric Kruskal-Wallis procedure (Table 12). The data used for the analysis were provided by Petit (2014) (Appendix, Figure 15), and the values of the new markers were added (Table 11).

Table 11. Markers related to brix content on chromosome 6. The data provided by Petit included markers on all tomato chromosomes. The most significant marker related to brix content in his study was seq-rs9017, which is represented as 33.932733+6 on chromosome 6. In this study, 4 new markers were included to validate their association with the brix QTL on chromosome 6 found by Petit. The positions of the markers are on the right side.
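For readers unfamiliar with the Kruskal-Wallis procedure, the calculation for a single marker can be sketched as below. This is not the software used in this study, only an illustration of the statistic under stated assumptions: plants are grouped by genotype (a, h, b), the H statistic is corrected for ties, and the p-value comes from the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(-H/2). The function name and the example values are hypothetical.

```python
import math
from collections import Counter


def kruskal_wallis_3(groups):
    """Kruskal-Wallis H test for exactly three groups.

    groups: dict mapping genotype label -> list of trait values.
    Returns (H, p). The p-value uses the chi-square approximation with
    2 degrees of freedom, whose survival function is exp(-H / 2).
    """
    labels = list(groups)
    if len(labels) != 3 or any(len(groups[g]) == 0 for g in labels):
        raise ValueError("exactly three non-empty groups are required")

    pooled = sorted(v for g in labels for v in groups[g])
    n = len(pooled)

    # Assign the average rank to each distinct value (handles ties).
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(
        sum(rank_of[v] for v in groups[g]) ** 2 / len(groups[g]) for g in labels
    )
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction; it is zero only when all values are identical.
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0
    h /= correction
    return h, math.exp(-h / 2)


# Invented brix values per genotype class, for illustration only.
example = {"a": [4.4, 4.9, 5.0], "h": [5.1, 5.4, 5.6], "b": [5.9, 6.2, 6.7]}
h_stat, p_value = kruskal_wallis_3(example)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

Where SciPy is available, `scipy.stats.kruskal` performs the same test for any number of groups.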

                           Marker at Chr 6            Position
New markers (this study)   Transposon protein CACTA   33.932438-6
New markers (this study)   seq-rs9017_a               33.932610-6
New markers (this study)   seq-rs9017_b               33.932614-6
New markers (this study)   seq-rs9018                 33.932756-6
Petit’s marker             seq-rs9017                 33.932733+6

Table 12. Kruskal-Wallis results, markers significantly linked to brix content trait. First column presents the brix content trait; in the second and third column are the two chromosomes related to brix content (Chr 6 and 9) and the position of each marker. Genotype “a”, “b” and “h” indicate a homozygous MoneyMaker genotype, homozygous Sweet40 genotype and heterozygous genotype, respectively. The last two columns contain the significance. Only the markers linked to traits with a level of significance of at least 0.005 were considered.

Trait   Group   Position (Mbp)   Mean a   Mean h   Mean b   Significance   Level of sign.
Brix    Chr09   14.930818-9      25.5     45.8     56.9     ******         0.0005
Brix    Chr09   45.660292-9      25.5     45.8     56.9     ******         0.0005
Brix    Chr09   50.644852-9      25.5     45.8     56.9     ******         0.0005
Brix    Chr09   5.127715-9       26.9     47.2     54.2     *****          0.001
Brix    Chr09   11.230376-9      25.3     44.4     55.1     *****          0.001
Brix    Chr09   23.384812-9      24.1     42.4     52.9     *****          0.001
Brix    Chr06   33.932733+6      24.7     47.4     44.8     ****           0.005
Brix    Chr09   29.483028-9      24.8     41.4     51.8     ****           0.005

The test showed a significant association between Brix and the markers on chromosome 9. On chromosome 6, only marker seq-rs9017 reported by Petit was found to be significantly associated with Brix.

JoinMap 4 was used to determine the linkage groups. Two groups were obtained. The markers on chromosomes 6 and 9 were reorganized according to how closely they were linked to each other.

Figure 7. Markers on chromosome 9. Markers located on chromosome 9 and linked to the brix content trait. 5 markers that were previously located on chr 6 were relocated to chr 9. Markers at positions 33.932438-6 and 14.930818-9 (green line) present 2 identical values (in green). On the right side, a list of the 14 markers.

Figure 7 shows that the chromosome 6 Brix marker seq-rs9017 (Petit, 2014) (underlined in light green) and the 4 new markers of chromosome 6 (in the dark green rectangle) were placed in the linkage group of chromosome 9. All these markers were significantly linked to the brix content trait.

The rest of the markers of chromosome 6 were placed in group 2 (Figure 8). None of these markers was found to be associated with Brix.

Values:

Group   Locus
2       4.306678-6
2       13.419647-6
2       29.25774-6
2       37.129439-6
2       38.340271-6

Figure 8. Markers on chromosome 6. Markers from Petit's previous data (2014). On the left side, a genetic map of the putative positions of the 5 markers on chr 6. On the right side, a list of the markers.

Genotyping the complete population by CAPS and KASP

CAPS assay

To genotype the whole population, a CAPS assay was performed. PCR products of samples MM8, #48, S40, F1 and #64 were selected as controls to confirm and validate the results of the assay. Samples #48 and #64 were chosen as genotypic duplicates of MM8 and F1, respectively. The DNA was digested with the restriction enzyme BbvCI, which was expected to cut the 754 bp PCR product of MM and #48 at the seq-rs9017_b marker sequence into two fragments of 432 and 322 bp. The results of the digestion are presented in Figure 9.

MoneyMaker showed 3 bands instead of the expected 2. The digestion was repeated with different enzyme concentrations and with overnight incubation, but the result stayed the same. This might be due to the presence of the transposon: different copies could be present in the genome, producing a pattern that resembles a heterozygous marker.

For S40, the picture shows a single band at 754 bp, as predicted, indicating the homozygous non-MoneyMaker allele. Finally, F1 and #64 showed 3 bands, as expected for heterozygotes (Figure 9).
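The expected banding patterns of the CAPS assay can be summarised in a few lines of code. This is a minimal sketch based only on the fragment sizes given above (754 bp uncut, 432 and 322 bp after digestion); the function names and the genotype codes a, h and b (as used in the Kruskal-Wallis tables) are assumptions made for illustration.

```python
AMPLICON_BP = 754              # undigested PCR product
CUT_FRAGMENTS_BP = (432, 322)  # fragments after BbvCI digestion of the MM allele

# The two fragments must add up to the full amplicon.
assert sum(CUT_FRAGMENTS_BP) == AMPLICON_BP


def expected_bands(genotype):
    """Return the expected band sizes (bp, largest first) for a genotype.

    'a' = homozygous MoneyMaker (both copies cut),
    'b' = homozygous Sweet40 (no copy cut),
    'h' = heterozygous (one copy cut, one uncut).
    """
    if genotype == "a":
        return sorted(CUT_FRAGMENTS_BP, reverse=True)
    if genotype == "b":
        return [AMPLICON_BP]
    if genotype == "h":
        return sorted((AMPLICON_BP, *CUT_FRAGMENTS_BP), reverse=True)
    raise ValueError(f"unknown genotype: {genotype!r}")


def genotypes_matching(observed_bands):
    """Return the genotype codes whose expected pattern equals the observed one."""
    observed = sorted(observed_bands, reverse=True)
    return [g for g in "abh" if expected_bands(g) == observed]


# The three bands observed for MoneyMaker match the heterozygous pattern only.
print(genotypes_matching([754, 432, 322]))
```

This makes explicit why the three-band result for MM8 and #48 cannot be distinguished from a heterozygote on the gel alone.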

Figure 9. Gel picture of the digestion. The first lane contains the ladder. The numbers at the top identify each sample. The upper band represents the PCR product not digested by the enzyme and the lower bands represent the fragments cut by the enzyme. The upper bands (754 bp) were not expected in samples MM8 and #48 (red box).

KASP assay

As an alternative method, a KASP assay was performed. The first assay included 6 samples and 2 non-template controls (NTC). The samples were MM8, #43 (as MM8 replicate), S40, #9 (as S40 replicate), F1 and #64 (as F1 replicate). Five of the 6 samples performed as expected, whereas Sweet40 clustered with the NTC group. Gel electrophoresis showed that the DNA of control S40 was degraded. The DNA might have been damaged by repeated alternation between low temperatures and room temperature. The assay was repeated under the same conditions with the same result. Sample #9 was therefore used as the S40 control in the next assay. Afterwards, the whole population was genotyped by KASP. The results are presented in Figure 10.

Figure 10. KASP assay results for the whole population. On the x axis, allele 1 (FAM dye) represents the MM8 allele (orange circles); on the y axis, allele 2 (HEX dye) represents the S40 allele (blue squares). The heterozygous plants are located in the middle (green triangles) and the NTC in the bottom left corner (black diamonds). In red circles, samples #47 and #88.

Eight samples presented outlier values. Samples S40 and #60 were scored as NTC, samples #47 and #88 were scored as heterozygous but were located out of the heterozygous zone and close to the homozygous alleles (Figure 10, red circles) and finally samples #5, #39, #42 and #89 showed a different result from the previous sequencing analysis.
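The clustering shown in Figure 10 is produced by the KASP analysis software. Purely as an illustration of how a well can be assigned to a cluster from its two fluorescence signals, a simple angle-based rule is sketched below. The function name, the normalised signal scale and both thresholds are hypothetical and were not used in this study.

```python
import math


def call_kasp(fam, hex_signal, ntc_radius=0.2, homozygous_angle_deg=30.0):
    """Assign a genotype from normalised FAM (x, MM8 allele) and HEX (y, S40 allele).

    ntc_radius: hypothetical cutoff; wells closer than this to the origin are
        treated as no amplification (NTC cloud).
    homozygous_angle_deg: hypothetical width of each homozygous sector,
        measured from the corresponding axis.
    Returns 'NTC', 'a' (MM8 homozygous), 'b' (S40 homozygous) or 'h'.
    """
    if math.hypot(fam, hex_signal) < ntc_radius:
        return "NTC"
    # 0 degrees = pure FAM signal, 90 degrees = pure HEX signal.
    angle = math.degrees(math.atan2(hex_signal, fam))
    if angle < homozygous_angle_deg:
        return "a"
    if angle > 90.0 - homozygous_angle_deg:
        return "b"
    return "h"


# Invented signal values, for illustration only.
for well in [(1.0, 0.05), (0.05, 1.0), (0.8, 0.8), (0.05, 0.05), (1.0, 0.5)]:
    print(well, call_kasp(*well))
```

With a rule of this kind, samples such as #47 and #88 would fall near a sector boundary, where a small change in the threshold changes the call. That is why such samples need confirmation by an independent method.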

Samples S40 and #60 were located in the NTC cloud. For sample #60, the PCR might have failed because of the low amount of DNA present. Sample S40, however, is one of the control samples and had been used in all the previous assays. This was the same DNA that Hunk (2014) used for her research. After a year, the small amount of DNA left and its constant use could have damaged the sample. To verify the quality of the DNA, gel electrophoresis was performed. The gel confirmed the degradation of the sample, since it did not show any band.

For samples #5, #39, #42 and #89, the current data contradicted the previous results. In the first sequencing, all these samples were scored as homozygous for the S40 allele, but the KASP results scored them as heterozygous. To confirm which results were correct, these samples were re-analysed. For each sample, the alignment of the SNP sequences was examined and the SNP position was evaluated by checking the sequence chromatograms (as in Figure 6). The same procedure was followed for sample #47. Sample #88 could not be checked because it had not been sequenced.

Finally, a compiled result was made based on the rectification using the sequencing results and the KASP assay (Figure 11).

Figure 11. Complete set of samples with the values for each marker. On the left side, the first table presents the 25 plants selected at the beginning and the 7 controls; MM8 and S40 are in blue. These plants were sequenced and show the alleles for the 4 markers designed on Chr 6, with marker seq-rs9017_b in red (A). The other two columns on the right side present the rest of the samples, with marker seq-rs9017_b in red. These plants were genotyped only by KASP (B).

After this result, a new QTL analysis including the whole population was performed. The data of Petit (2014) were updated with the new values (Appendix, Table 16) and the Kruskal-Wallis test was used. The aim of this test was to check whether the segregating Brix QTL was linked to the same markers (Table 11).

The results showed the same association with the markers on chromosome 9 as the previous analysis. In addition, two markers from chromosome 6 were significant: the one that Petit presented as significant (33.932733+6) and the new marker 33.932614-6 (Table 13).

Table 13. Kruskal-Wallis results, markers significantly associated to brix content trait. First column presents the brix content trait; in the second and third column are the two chromosomes related to brix content (Chr 6 and 9) and the position of each marker. Genotype “a”, “b” and “h” indicate a homozygous MoneyMaker genotype, homozygous Sweet40 genotype and heterozygous genotype respectively. The last two columns contain the significance. Only the markers linked to traits with a level of significance of at least 0.005 were considered.

Trait   Group   Position (Mbp)   Significance   Mean a   Mean h   Mean b   Level of sign.
Brix    Chr09   14.930818-9      ******         25.5     45.8     56.9     0.0005
Brix    Chr09   45.660292-9      ******         25.5     45.8     56.9     0.0005
Brix    Chr09   50.644852-9      ******         25.5     45.8     56.9     0.0005
Brix    Chr09   5.127715-9       *****          26.9     47.2     54.2     0.001
Brix    Chr09   11.230376-9      *****          25.3     44.4     55.1     0.001
Brix    Chr09   23.384812-9      *****          24.1     42.4     52.9     0.001
Brix    Chr06   33.932614-6      ****           28.5     42.8     53.5     0.005
Brix    Chr06   33.932733+6      ****           24.7     47.4     44.8     0.005
Brix    Chr09   29.483028-9      ****           24.8     41.4     51.8     0.005

As in the previous analysis, two linkage groups were identified, containing the same markers. The positions of the markers changed on chromosome 9 and on chromosome 6 (Figures 12, 13 and 14).

Figure 12. Markers on chromosome 9. Markers located on chromosome 9 and linked to the brix content trait. On the left side, a representation of the putative positions of the 14 markers. 5 markers that were previously located on chr 6 were relocated to chr 9.

Figure 13. Representation of chromosome 9. Markers in green are the ones previously located on Chr 6. In the middle, in green, marker seq-rs9017 from Petit's thesis. At the bottom, in green, the new markers used in this study. In blue, markers from Chr 9.

Values:

Group   Locus
2       4.306678-6
2       13.419647-6
2       29.25774-6
2       37.129439-6
2       38.340271-6

Figure 14. Markers on chromosome 6. Markers from Petit's previous data (2014). On the left side, a representation of the putative positions of the 5 markers on Chr 6. On the right, a list of the markers.

Furthermore, Figure 13 shows that marker seq-rs9017 maps far from the other markers that were originally located on chromosome 6. The distance between them allows recombination in this region. To identify which of the markers is associated with the trait, the correlation with brix was calculated for markers seq-rs9017 and seq-rs9017_b. The correlation value was 0.26841 for seq-rs9017 and 0.38456 for seq-rs9017_b, so seq-rs9017_b was more strongly correlated with brix. These results suggest that the QTL related to brix content is located on chromosome 9, from 81.3 to 91.2 cM.

Metabolomic analysis of the predominant sugars and their comparison with brix content

The aim of the metabolomic assay was to identify which sugars were predominant in the lower and higher brix content groups. 14 samples and 2 controls were chosen from the 25 plants selected at the beginning (Tables 5, 14). The selected samples carried introgressions 6C and 9B identified by Petit (2014). Even though these samples carried the same introgressions, they differed in brix content. The main sugars present in each sample could therefore give a clue about the different performance of the samples. Samples #6 and #89, which did not contain introgression 9B, were also part of this assay. Evaluating them could help to elucidate the difference in sugars between plants with two introgressions and plants with one.

Table 14. Samples for the metabolomic study. Samples in green have the lowest brix content values; the ones in red have an intermediate value and do not carry introgression 9B; and the lines in blue have the highest values. MM7 and MM8 (in white) are the controls.

                  Introgressions
Line              6C     9B     Brix content
27                yes    yes    4.42
53                yes    yes    4.87
80                yes    yes    4.89
47                yes    yes    5.06
24                yes    yes    5.09
49                yes    yes    5.19
6                 yes    no     5.44
89                yes    no     5.55
40                yes    yes    5.86
39                yes    yes    5.88
16                yes    yes    5.95
12                yes    yes    6.02
9                 yes    yes    6.23
64                yes    yes    6.72
MM7 (control)
MM8 (control)

Sample #27, which had the lowest brix content, had the same allele combination (for these 4 markers) as sample #64 (Table 15), which was considered the most vigorous plant and had the highest brix content. Samples #6 and #89, selected because of the absence of introgression 9B, were heterozygous for each marker and had an intermediate brix content. F1 was heterozygous for 3 of the 4 new markers. In S ch., 2 markers showed the MM genotype and the other 2 were identified as S40 alleles. The other samples carried Sweet40 alleles or were heterozygous (Table 15).

Table 15. Samples selected for metabolomics, with alleles and brix values. 14 samples were selected for the metabolomic assay. The first column shows the division into high, intermediate and low brix content. The second column shows the number of each plant. The following 4 columns show the alleles of each marker for each plant. The last column shows the brix content. Samples #6 and #89, in blue, did not carry introgression 9B. Alleles in yellow are heterozygous and alleles in green are homozygous for S40.

The metabolomic assay was performed by gas chromatography mass spectrometry. The different sugars and related compounds present in the samples are listed in Table 16. In this research, the most important sugars taken into account were fructose, glucose and sucrose. Fructose was the sugar with the highest score in all the samples (Figure 15). Controls MM7.1, MM8.1 and MM8.2 were measured for sugars but not for brix content. Two groups were identified and tested (Table 17): group 1 with low brix content (samples #27, #53, #80, #47, #24 and #49) and group 2 with high brix content (samples #40, #39, #16, #12, #9 and #64). Samples #6 and #89 were not included in these two groups because they lack introgression 9B.

Table 16. List of the identity of the sugars and other metabolites present in tomato fruit samples. In bold, the three main sugars for this study.

Sugars and related compounds

Fructose
Glucose
Sucrose
Citrate
Glutamate
Malate
GABA
Phosphate
Glucopyranose
Aspartate
Myo-Inositol
Pyroglutamate

Figure 15. Main sugars by group. Fructose, glucose and sucrose were considered the major sugars in the samples. Two groups of 6 samples each were made: on the left side, Group 1 (low brix content) and Group 2 (high brix content). Samples #6 and #89 were not assigned to a group because they lack introgression 9B. On the right side, the controls (MM7, MM8.1 and MM8.2). Each bar represents the score for fructose, glucose or sucrose in a plant. Fructose is more abundant than the other two sugars. Glucose is more abundant than sucrose in all samples except 2 controls (MM7 and MM8.1).

Sample #6 had the highest sugar content (Figure 15). Sample #89, which shares the same characteristic, did not perform as well as sample #6. Control MM8.2 was the only control that performed as expected, scoring low for all the main sugars. In contrast, controls MM7.1 and MM8.1 performed better although they carry none of the introgressions (Figure 15).

Comparison between sugars

The brix content was compared with the relevant sugars found in the samples. The data are provided in the Appendix (Table 2).

Figure 16. Brix versus Fructose. The R2 of the trend line in this graph is 0.1479. The data of this chart can be found in Appendix, Table 2.

Figure 17. Brix versus Glucose. The R2 of the trend line in this graph is 0.2058. The data of this chart can be found in Appendix, Table 2.

Figure 18. Brix versus Sucrose. No trend is apparent from the values of this graph. The data of this chart can be found in the Appendix, Table 2.

Figures 16, 17 and 18 show the comparison of brix content with fructose, glucose and sucrose. The R2 values of the trend lines in these graphs were 0.2291, 0.2058 and 0.0074, respectively. Therefore, a low correlation was found between brix and both fructose and glucose, whereas sucrose does not seem to correlate with brix.

The low correlation observed was probably due to the presence of an outlier, sample #6, which showed very high levels of all three sugars and an average Brix value. Removing this sample from the correlation analysis increases the correlation coefficient with Brix up to 0.7 for fructose and glucose, while for sucrose it remains very low (0.08).
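The effect of a single outlier on the R2 of a trend line can be illustrated with a small least-squares fit. This is a minimal sketch: the function name is hypothetical and the values below are invented for illustration, since the measured sugar values are only given in the Appendix.

```python
def linear_fit_r2(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_res / syy


# Invented values for illustration only; the last point plays the role of
# an outlier with an average brix value and a very high sugar level.
brix = [4.4, 4.9, 5.1, 5.9, 6.2, 6.7, 5.4]
sugar = [10.0, 11.5, 12.0, 14.0, 15.0, 16.5, 30.0]

r2_all = linear_fit_r2(brix, sugar)[2]
r2_without_outlier = linear_fit_r2(brix[:-1], sugar[:-1])[2]
print(f"R2 with outlier:    {r2_all:.3f}")
print(f"R2 without outlier: {r2_without_outlier:.3f}")
```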

Table 17. Groups for the metabolomic assay. Each group has 6 samples. Group 1 is for samples with low content of brix and group 2 is for samples with high brix content.

Group 1   Brix content   Group 2   Brix content
#27       4.42           #40       5.86
#53       4.87           #39       5.88
#80       4.89           #16       5.95
#47       5.06           #12       6.02
#24       5.09           #9        6.23
#49       5.19           #64       6.72

To determine whether the sugar contents differed significantly between the two groups, a t-test was performed for each compound (Table 18; Appendix, Table 2). A significant difference between the groups was found for only four compounds, including fructose and glucose: fructose (p = 0.048557), glucose (p = 0.049872), mannose (p = 0.046907) and myo-inositol (p = 0.006123).

In addition, a second t-test was performed to see whether brix and the 3 main sugars (fructose, glucose and sucrose) differed between the genotypes (AB and BB) of the markers linked with the Brix QTL. The results in Table 19 revealed no significant differences between genotypes.
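The arithmetic behind a two-group t-test can be sketched as follows. Because the sugar values are only listed in the Appendix, the example applies the calculation to the Brix values of Table 17. These differ between the groups by construction, so the example only illustrates the computation. The equal-variance (Student) form is an assumption, as the variant of the test used is not stated, and the function name is hypothetical.

```python
import math
from statistics import mean, variance

# Brix values of the two groups, taken from Table 17.
LOW_BRIX = [4.42, 4.87, 4.89, 5.06, 5.09, 5.19]    # group 1
HIGH_BRIX = [5.86, 5.88, 5.95, 6.02, 6.23, 6.72]   # group 2


def student_t(group1, group2):
    """Two-sample Student t statistic, assuming equal variances.

    Returns (t, degrees_of_freedom); t is positive when group2 has the
    higher mean.
    """
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled sample variance of both groups.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group2) - mean(group1)) / standard_error, df


t_stat, dof = student_t(LOW_BRIX, HIGH_BRIX)
print(f"mean group 1 = {mean(LOW_BRIX):.2f}, mean group 2 = {mean(HIGH_BRIX):.2f}")
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The p-value follows from the t distribution with the given degrees of freedom; where SciPy is available, `scipy.stats.ttest_ind` returns both the statistic and the p-value.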

Table 18. T-test results between groups and sugar. The results show four compounds with a significant difference between the two groups. In green, the significant values.

T-test
Sugar and related compounds   p-value
Fructose                      0.048557
Glucose                       0.049872
Sucrose                       0.346927
Mannose                       0.046907
Citrate                       0.759206
Glutamate                     0.069641
Malate                        0.065766
GABA                          0.130136
Phosphate                     0.144154
Glucopyranose                 0.831822
Aspartate                     0.075702
Myo-Inositol                  0.006123
Pyroglutamate                 0.775038

Table 19. T-test results between alleles and sugars, brix. The results show no significant difference was present between alleles (AB and BB).

T-test
Brix and main sugars   p-value
Brix                   0.777178
Fructose               0.414974
Glucose                0.442538
Sucrose                0.577235

Figure 19. Average sugar and brix content of the two groups, with standard errors (bars). Samples were divided into two groups: group 1 (low brix content) and group 2 (high brix content). For all sugars and for brix, the content in group 2 was higher.

In all the charts, the average of Group 2 was greater than that of Group 1. Nevertheless, the standard error per group was large, except for brix content (Figure 19).

Finally, a correlation test was performed to determine how strong the relation is between brix content and the genotype at marker seq-rs9017_b (the KASP marker). The 14 plants selected for the metabolomic assay were not enough to obtain a reliable result, so the whole population was considered in order to obtain a better overview. The alleles were coded as A=0 and B=1, so the genotype values were AA=0, AB=1 and BB=2. The correlation value was 0.38456. Therefore, a low correlation was found between brix and genotype (Figure 20).
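The genotype coding and the correlation described above can be sketched as follows. The allele coding (A=0, B=1) is taken from the text; the use of the Pearson coefficient, the function names and the example values are assumptions made for illustration.

```python
import math

ALLELE_VALUE = {"A": 0, "B": 1}  # coding used in the text


def genotype_score(genotype):
    """Convert a two-letter genotype into the number of B alleles (AA=0, AB=1, BB=2)."""
    return sum(ALLELE_VALUE[allele] for allele in genotype)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented genotypes and brix values, for illustration only.
genotypes = ["AA", "AB", "AB", "BB", "BB", "AA", "AB", "BB"]
brix = [4.6, 5.1, 5.9, 5.4, 6.3, 5.2, 4.9, 6.0]

scores = [genotype_score(g) for g in genotypes]
print(f"r = {pearson_r(scores, brix):.3f}")
```

Coding the genotype as the number of B alleles assumes an additive effect of the allele on brix; a dominant effect would not be captured equally well by this coding.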

Figure 20. Correlation between the KASP marker and Brix content. On the y axis, the KASP marker values: 0 (AA), 1 (AB) and 2 (BB). On the x axis, the Brix content of each sample. R=0.38456 represents a low correlation between the marker and brix content.

Discussion

Comparison between genotyping methods

In this research, 3 different genotyping methods were used. The first and most effective one was sequencing by the GTAC Sequence Service. The results were delivered in 3 different files, which made it possible to re-check the results in case of doubt.

The second technique was CAPS. This technique is based on the action of a restriction enzyme, so the cut has to occur at the location of the SNP. This was the first limitation: of the 4 SNPs in region VIIb, only one was cut by the enzyme BbvCI, and only in the MoneyMaker genotype. After running the digestion gel, MM showed 3 bands instead of 2. Even though the digestion was repeated with different protocols, the result was the same on each occasion.

Finally, a KASP assay was carried out. The execution and interpretation of the test were simple. The results were clear, except for two samples (#47 and #88) (Figure 10) which fell between two genotypes. Sample #47 had been sequenced before, so a re-analysis was possible. For sample #88, the KASP result was the only one available, so it was not possible to compare or analyse it again. This technique by itself does not allow a confirmation. Furthermore, 4 additional samples (#5, #39, #42 and #89) gave a result different from that of the first method. These samples were re-analysed and the real genotype was elucidated.

Confirmation of the location, length of the introgression and gene involved in the QTL related to brix content in Chromosome 6

The QTL analysis gave an unexpected result. After the genetic linkage analysis, the marker used by Petit (2014) and the other 4 markers developed in this research (surrounding Petit's marker) were located on chromosome 9. Petit also reported that marker seq-rs9017 was part of a transposon gene, protein CACTA En/Spm sub-class (Solyc06g054690.2.1), which has 5 exons.

The movement of transposable DNA has had an effect on the evolution of eukaryotic genomes. Transposon activity can lead to mutations, and the insertion of a transposon into a gene can alter or even disrupt the gene function (Chénais et al., 2012). The relocation of the QTL found by Petit from chromosome 6 to chromosome 9 left chromosome 6 without a brix content QTL in region 6C. Some transposons are able to copy a fragment of the flanking sequence to a different position in the genome. This change can be responsible for an increase in the expression of the transposed genes (Bennetzen and Wang, 2014).

Chromosome 6 contains a large number of Ty3/Gypsy and Ty1/Copia retrotransposons in unequal proportions: the ratio between them is between 2:1 and 3:1, respectively (Peters et al., 2009). Ty3/Gypsy retrotransposons belong to a subclass of Class I elements and resemble retroviruses (Marín and Lloréns, 2000). This element is a common retrotransposon in modern angiosperms and is correlated with large genome sizes (Marín and Lloréns, 2000). Ty1/Copia retrotransposons are different and are located in the centromeres and telomeres (Heslop-Harrison et al., 1997; Peters et al., 2009). While constructing a physical map, Peters et al. (2009) detected seven expressed sequence tag (EST)-derived markers that scored not only on chromosome 6 but also on chromosomes 3 and 9. This result suggests gene duplication between the chromosomes mentioned, and a higher abundance of retrotransposons and repeats across the complete chromosome 6. This annotation can be considered a valuable sample of the whole genome (Peters et al., 2009). It has thus been known since 2009 that chromosome 6 contains transposons and has a particular relation with chromosome 9, for which some duplications had already been found. The QTL at 33.93 Mbp could be another example of this finding.

The focus of this research was the most significant marker that Petit found: marker seq-rs9017. To identify the length of the QTL, 14 primer pairs were developed. For each primer pair, two key samples (one identified as heterozygous and the other as S40) were sent for sequencing. For 13 primer pairs, the samples showed neither S40 alleles nor heterozygosity, so these regions were not taken into account. Only the markers in region VIIb were significant. This result suggests that the QTL lies between positions 33.932438 and 33.932756 Mbp.

Higher hexose content was found in samples with higher brix content

The metabolomic analysis revealed the content of the 3 main sugars studied in this research (glucose, fructose and sucrose) in each sample. A greater amount of fructose and glucose than of sucrose was observed in each sample. Sucrose is composed of these two hexoses. Invertases, like Lin5, catalyse the cleavage of sucrose and release glucose and fructose. Chetelat et al. (1993) stated that plants with red fruits accumulate hexoses and have a higher invertase activity. Furthermore, it is known that Solanum lycopersicum varieties store hexose sugars (Baxter, 2005). The plants in this research accumulate hexoses instead of sucrose and might have a greater Lin5 activity due to the modifications that transposons can cause, such as increasing the expression of the transposed genes (Chénais et al., 2012).

The sugars were compared with brix to evaluate the correlation between them. Once the outlier sample #6 was excluded, the correlation between brix and fructose and between brix and glucose was high (up to 0.7). For sucrose, the correlation was too low to be considered. This result suggests a relation between the hexoses and the brix content, in line with the previous result of a higher hexose content than sucrose content.

The correlation analysis was supported by a statistical test. This test showed that the amount of glucose and fructose in the high Brix group differed from that in the low Brix group: the hexose content in group 2 was higher than in group 1. However, no significant difference was found between groups formed using the genotypic information of the markers linked to Brix in the QTL analysis. This might indicate the complexity of the mechanism leading to the accumulation of sugars in tomato fruit, and that there are other factors, different from those controlled by the QTL found in this study.

Ratification of number of QTLs present in Chromosome 9 related to brix content

Baxter et al. (2005) focused their investigation on the tomato line IL9-2-5, derived from a cross between S. lycopersicum and S. pennellii. They recognized 2 QTLs related to Brix content: Brix9-2-5 and PW9-2-5. The first QTL contains an apoplastic invertase (Lin5), which catalyses the cleavage of sucrose into glucose and fructose. Lin5 is located in the apoplast and is involved in 2 processes. The first is to maintain a favourable sucrose gradient for unloading sucrose from the phloem, which enhances the capacity of the sink organ to take up sugar. In addition, Lin5 could act as a mechanism that regulates, amplifies and integrates the sugar signals linking source and sink metabolism (Baxter et al., 2005; Fridman et al., 2000).

Markers from chromosomes 6 and 9 related to brix content were displayed in a linkage map (Figures 12 and 13). Because marker seq-rs9017 (49.4 cM) and the other 4 markers designed in this study (87.5 and 91.2 cM) were far apart on the genetic map, a correlation test between markers seq-rs9017 and seq-rs9017_b and brix content was done to elucidate the position of the QTL. The test revealed a higher correlation between Brix and marker seq-rs9017_b. This finding suggests that the QTL related to brix content could be located on Chromosome 9 between 81.3 and 91.2 cM, in addition to QTL Brix9-2-5 (37.5 cM) and QTL PW9-2-5, which are also present on Chromosome 9.
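The comparison of two markers against brix content can be sketched as follows. The Python example computes, for each marker, the fraction of brix variance explained by genotype class (between-class sum of squares divided by total sum of squares). This is one common way to express a marker-trait association and is an assumption here, since the exact correlation test is not specified. The genotype calls (a = MM, h = heterozygous, b = S40) and brix values are hypothetical.

```python
from collections import defaultdict


def variance_explained(genotypes, trait):
    """Fraction of trait variance explained by genotype class."""
    if len(genotypes) != len(trait) or not trait:
        raise ValueError("genotypes and trait must be non-empty and equal in length")
    grand_mean = sum(trait) / len(trait)
    ss_total = sum((v - grand_mean) ** 2 for v in trait)
    if ss_total == 0:
        raise ValueError("trait has no variance")
    # Collect trait values per genotype class.
    classes = defaultdict(list)
    for genotype, value in zip(genotypes, trait):
        classes[genotype].append(value)
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in classes.values()
    )
    return ss_between / ss_total


# HYPOTHETICAL data for nine plants (illustration only).
brix = [4.0, 4.2, 4.4, 5.0, 5.2, 5.4, 6.0, 6.2, 6.4]
seq_rs9017 = list("ahbahbahb")      # genotype classes barely differ in brix
seq_rs9017_b = list("aaahhhbbb")    # genotype classes separate brix well

r2_a = variance_explained(seq_rs9017, brix)
r2_b = variance_explained(seq_rs9017_b, brix)
best = "seq-rs9017_b" if r2_b > r2_a else "seq-rs9017"
print(f"seq-rs9017:   {r2_a:.3f}")
print(f"seq-rs9017_b: {r2_b:.3f}")
print(f"Marker more strongly associated with brix: {best}")
```

In this made-up example the second marker explains most of the brix variance, so the QTL would be placed closer to it.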

The Lin5 allele is fruit specific and its expression only influences fruit metabolism. It is not related to photosynthetic metabolism because Lin5 is expressed only in non-photosynthetic tissues (Baxter et al., 2005). After the cleavage of sucrose into glucose and fructose, the hexoses are transferred to the fruit cells by transporters. It has been shown that the actions of the monosaccharide transporter and the apoplastic invertase are co-regulated. Additionally, the expression of Lin5 might enhance the expression of sugar transporters, especially the hexose transporter LeHT3 and the sucrose transporter LeSUT4 (Baxter et al., 2005). Brix9-2-5, with Lin5, seems to be responsible for the greater brix content in the plants. Lin5 expression, or its overexpression due to the transposon, can lead to higher monosaccharide transporter activity and an increased content of hexoses in the fruits.

In the majority of cases, fruits import sugars directly from the leaves (Fridman et al., 2000). Plants expressing QTL PW9-2-5 showed semi-determinate growth. As a result, plant weight increases and the number of leaves, which feed the developing fruits, also increases, leading to a rise in yield and brix content (Fridman et al., 2002).

The purpose of the current thesis was to corroborate the QTL related to soluble solids content located on chromosome 6. In the end, the QTL on chromosome 6 was discarded. The region containing the markers and the presumed QTL was found to be part of a transposon. Since the markers were located on chromosome 9, a misassembly of the genome due to repetitive elements could be the cause of this discrepancy. Additionally, Fridman et al. (2000, 2002) identified 2 QTLs related to brix content on chromosome 9: Brix9-2-5 and PW9-2-5. Both QTLs enhance the brix content of the fruits: the first through Lin5 expression, and the second by increasing the number of leaves, which are the nutrition source of the fruit. Furthermore, a low correlation was found between the main sugars (fructose, glucose and sucrose) and the alleles of the whole population.

Acknowledgments

I would first like to thank Dr. Arnaud Bovy and Dr. ir. Sjaak van Heusden for the opportunity they gave me to work in their group during the last 7 months. My sincere thanks also go to my daily supervisor, Dr. Yury Tikunov, for the valuable help, guidance and support he provided from the beginning until the completion of my thesis. Last but not least, I would like to express my gratitude to Jos Molthoff, Marcela Víquez-Zamora and Jordi Petit for their continuous advice and help during all this time, and to everyone in the Breeding for Quality group.

References

Baxter, C., Carrari, F., Bauke, A., Overy, S., Hill, S., Quick, P., Fernie, A. and Sweetlove, L. (2005). Fruit metabolism in an introgression line of tomato with increased fruit soluble solids. Plant Cell Physiol. 46(3): 425-437.

Bennetzen, J. L. and Wang, H. (2014). The contributions of transposable elements to the structure, function, and evolution of plant genomes. Annu. Rev. Plant Biol. 65: 505-530.

Bernacchi, D., Beck-Bunn, T., Emmatty, D., Eshed, Y., Inai, S., Lopez, J., Petiard, V., Sayama, H., Uhlig, J., Zamir, D., and Tanksley, S. (1998). Advanced backcross QTL analysis of tomato. II. Evaluation of near-isogenic lines carrying single-donor introgressions for desirable wild QTL-alleles derived from Lycopersicon hirsutum and L. pimpinellifolium. Theor Appl Genet, 97: 170-180.

Bucheli, P., Voirol, E., de la Torre, R., López, J., Rytz, A., Tanksley, S. D., & Pétiard, V. (1999). Definition of nonvolatile markers for flavor of tomato (Lycopersicon esculentum Mill.) as tools in selection and breeding. Journal of agricultural and food chemistry, 47(2), 659-664.

Causse, M., Duffe, P., Gomez, M. C., Buret, M., Damidaux, R., Zamir, D., Gur, A., Chevalier, C., Lemaire-Chamley, M. and Rothan, C. (2004). A genetic map of candidate genes and QTLs involved in tomato fruit size and composition. Journal of experimental botany, 55(403): 1671-1685.

Chénais, B., Caruso, A., Hiard, S. and Casse, N. (2012). The impact of transposable elements on eukaryotic genomes: From genome size increase to genetic adaptation to stressful environments. Gene, 509: 7-15.

Chetelat, R. T., Klann, E., DeVerna, J. W., Yelle, S. and Bennett, A. B. (1993). Inheritance and genetic mapping of fruit sucrose accumulation in Lycopersicon chmielewskii. The plant journal, 4(4): 643-650.

Eshed, Y. and Zamir, D. (1995). An introgression line population of Lycopersicon pennellii in the cultivated tomato enables the identification and fine mapping of yield-associated QTL. Genetics, 141: 1147-1162.

Fridman, E., Pleban, T. and Zamir, D. (2000). A recombination hotspot delimits a wild-species quantitative trait locus for tomato sugar content to 484 bp within an invertase gene. PNAS, 97(9): 4718-4723.

Fridman, E., Liu, Y.S., Carmel-Gorel, L., Gur, A., Shoresh, M., Pleban, T., Eshed, Y. and Zamir, D. (2002). Two tightly linked QTLs modify tomato sugar content via different physiological pathways. Mol Genet Genomics, 266: 821-826.

Fulton, T. M., Bucheli, P., Voirol, E., López, J., Pétiard, V. and Tanksley, S. D. (2002). Quantitative trait loci (QTL) affecting sugars, organic acids and other biochemical properties possibly contributing to flavor, identified in four advanced backcross populations of tomato. Euphytica, 127: 163-177.

Heslop-Harrison, J. S., Brandes, A., Taketa, S., Schmidt, T., Vershini, A., Alkhimova, E., Kamm, A., Doudrick, R., Schwarzacher, T., Katsiotis, A., Kubis, S., Kumar, A., Pearce, S., Flavell, A. and Harrison, G. (1997) Genetica, 100: 197-204.

Hunk, M. (2014). Master thesis: Tracking down some sugar and motivation. Wageningen University.

Lisch, D. and Bennetzen, J. (2011). Transposable element origins of epigenetic gene regulation. Plant Biology, 14: 156-161.

Lisec, J., Schauer, N., Kopka, J., Willmitzer, L. and Fernie, A. R. (2006). Gas chromatography mass spectrometry-based metabolite profiling in plants. Nature Protocols, 1(1): 387-396.

Marín, I. and Lloréns, C. (2000). Ty3/Gypsy Retrotransposons: Description of new Arabidopsis thaliana elements and evolutionary perspectives derived from comparative genomic data. Mol. Biol. Evol. 17(7): 1040-1049.

Miron, D. and Schaffer, A. A. (1991). Sucrose phosphate synthase, sucrose synthase, and invertase activities in developing fruit of Lycopersicon esculentum Mill. and the sucrose accumulating Lycopersicon hirsutum Humb. and Bonpl. Plant Physiol. 95, 623-627.

Nookaraju, A., Upadhyaya, C. P., Pandey, S. K., Young, K. E., Hong, S. J., Park, S. K. and Park, S. W. (2010). Molecular approaches for enhancing sweetness in fruits and vegetables. Scientia horticulturae, 127: 1-15.

Oliver, K., McComb, J. and Greene, W. (2013). Transposable elements: Powerful contributors to angiosperm evolution and diversity. Genome Biol. Evol. 5(10): 1886-1901.

Peters, S. A., Datema, E., Szinay, D., J. van Staveren, M., Schijlen, E., C. van Haarst, J., Hesselink, T., Abma-Henkens, M., Bai, Y., de Jong, H., Stiekema, W., Lankhorst, R. and J. van Ham, R. (2009). Solanum lycopersicum cv. Heinz 1706 chromosome 6: distribution and abundance of genes and retrotransposable elements. The plant journal, 58, 857-869.

Petit, J. (2014). Master thesis: Identification of introgression region(s) related to brix content with increased fruit soluble solids. Wageningen University.

Stevens, M. A., Kader, A. A. and Albright, M. (1979). Potential for increasing tomato flavour via increased sugar and acid content. J. Amer. Soc. Hort. Sci. 104(1):40-42.

Tieman, D. M., Zeigler, M., Schmelz, E. A., Taylor, M. G., Bliss, P., Kirst, M. and Klee, H. J. (2006). Identification of loci affecting flavour volatile emissions in tomato fruits. Journal of experimental botany, 57(4): 887-896.

Víquez-Zamora, M., Vosman, B., van de Geest, H., Bovy, A., Visser, R., Finkers, R. and van Heusden, A. (2013). Tomato breeding in the genomics era: Insights from a SNP array. BMC Genomics, 14: 354.

Yelle, S., Chetelat, R. T., Dorais, M., DeVerna, J. W. and Bennett, A. B. (1991). Sink metabolism in tomato fruit. Plant physiol, 95 : 1026-1035.

Zanor, M. I., Osorio, S., Nunes-Nesi, A., Carrari, F., Lohse, M., Usadel, B., Kühn, C., Bleiss, W., Giavalisco, P., Willmitzer, L., Sulpice, R., Zhou, Y. H. and Fernie, A. R. (2009). RNA interference of Lin5 in tomato confirms its role in controlling brix content, uncovers the influence of sugars on the levels of fruit hormones, and demonstrates the importance of sucrose cleavage for normal fruit development and fertility. Plant physiology, 150: 1204-1218.

Appendix

Sequencing data

Table 1. Results of the sequencing of 25 samples and 7 controls for 4 markers. No-call values are shown in orange.

Figure 1. Chromatogram of sample 5 to elucidate “N”.

Figure 3. Chromatogram of sample 12 to elucidate “N”.

Figure 4. Chromatogram of sample 16 to elucidate “N”.

Figure 5. Chromatogram of sample 24 to elucidate “N”.

Figure 6. Chromatogram of sample 27 to elucidate “N”.

Figure 7. Chromatogram of sample 49 to elucidate “N”.

Figure 8. Chromatogram of sample 64 to elucidate “N”.

Figure 9. Chromatogram of sample 73 to elucidate “N”.

Figure 10. Chromatogram of sample 80 to elucidate “N”.

Figure 11. Chromatogram of sample 83 to elucidate “N”.

Figure 12. Chromatogram of sample 86 to elucidate “N”.

Figure 13. Chromatogram of sample 89 to elucidate “N”.

Figure 14. Chromatogram of sample F1 to elucidate “N”.

Previous markers from Petit's thesis

Figure 15. Complete set of markers from Petit's thesis, including seq-rs9017 in blue and the 4 new markers developed in this research in purple (on the right side). The column on the right side contains all the markers. Alleles: a in red (MM genotype), h in yellow (heterozygous genotype) and b in green (S40 genotype). The numbers above are the plant numbers.

Figure 16. Complete set of markers from Petit's thesis with the KASP results, including seq-rs9017 in blue and the 4 new markers developed in this research in purple (on the right side). The column on the right side contains all the markers. Alleles: a in red (MM genotype), h in yellow (heterozygous genotype) and b in green (S40 genotype). The numbers above are the plant numbers.

Sugars values

Table 2. List of sugars and other metabolites from the metabolomic assay, with the values of each component. In green, samples with low brix content; in red, samples #6 and #89 with intermediate brix values and samples without the 9B introgression; in blue, samples with high brix content.
