<<

1 The Draft Genome of Tropical () 2 1,2,3,4,5,6# 2,7 2,7 3 3 Bin Tean Teh , Kevin Lim *, Chern Han Yong *, Cedric Chuan Young Ng *, Sushma Ramesh 8,14,15,16 3 2,4, 7 9 10 4 Rao , Vikneswari Rajasegaran , Weng Khong Lim , Choon Kiat Ong , Ki Chan , Vincent Kin 11 12 8,14,15,16,17 2,4,7 13 5 Yuen Cheng , Poh Sheng Soh , Sanjay Swarup , Steven G Rozen , Niranjan Nagarajan , 1,2,4,5,13# 6 Patrick Tan 7 8 1 9 Thorn Biosystems Pte Ltd, Singapore 2 10 Program in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore 3 11 Laboratory of Cancer Epigenome, Division of Medical Science, National Cancer Centre, Singapore 4 12 SingHealth/Duke-NUS Institute of Precision Medicine, National Heart Centre, Singapore 5 13 Cancer Science Institute of Singapore, National University of Singapore, Singapore 6 14 Institute of Molecular and Cellular Biology, Singapore 7 15 Centre for Computational Biology, Duke-NUS Medical School, Singapore 8 16 Department of Biological Sciences, National University of Singapore, Singapore 9 17 Lymphoma Genomic Translational Research Laboratory, National Cancer Centre, Singapore 10 18 Global Databank, Singapore 11 19 Verdant Foundation, Hong Kong 12 20 Samsoney Group, 13 21 Genome Institute of Singapore, Singapore 14 22 Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University,

23 Singapore 15 24 Metabolites Biology Lab, National University of Singapore, Singapore 16 25 NUS Synthetic Biology for Clinical and Technological Innovation, Life Sciences Institute, National

26 University of Singapore, Singapore 17 27 NUS Environmental Research Institute, National University of Singapore, Singapore

28 29 30 * Denotes equal contribution 31 32 # Address correspondence: [email protected] (B.T.T.) or [email protected]

33 (P.T.)

34 2

35 Abstract

36 Durian (Durio zibethinus) is a South East Asian tropical , well-known for its hefty spine-

37 covered fruit and notorious sulfury and onion-like odor. Here, we present a draft genome assembly of D.

38 zibethinus—the third plant in the order and first in the subfamily. Single-

39 molecule sequencing and chromosome contact maps enabled assembly of the highly heterozygous

40 durian genome at chromosome-scale resolution. Transcriptomic analysis revealed upregulation of sulfur,

41 ethylene, and lipid-related pathways in durian , consistent with its striking phenotypic traits. We

42 observed paleopolyploidy events shared between durian and cotton, and durian-specific gene expansions

43 in methionine gamma lyase (MGL), associated with production of volatile sulfur compounds (VSCs).

44 MGL and the ethylene-related gene aminocyclopropane-1-carboxylic acid synthase (ACS) were

45 upregulated in fruits, along concomitantly with their downstream metabolites (VSCs and ethylene)

46 suggesting a potential association between ethylene biosynthesis and methionine regeneration via the

47 Yang cycle. The durian genome provides a resource for tropical fruit biology and agronomy. 3

48 Introduction

49 Durian (Durio zibethinus) is a tropical edible fruit endemic to South East . In the region, durian is

50 extolled as the “king of fruits” for its formidable spiny husk, overpowering flavor, and unique odor

51 described as a mix of an onion-like sulfury aroma with notes of sweet fruitiness and savory soup- 1 52 seasoning . Well-known for igniting opposing passions of devotion and revulsion, the famed naturalist

53 Alfred Russel Wallace once remarked on durian - “the more you eat of it the less you feel inclined to 2 54 stop” . In contrast, durian is banned in public transportation and many hotels due to its characteristic and 3 55 pungent smell, described by some detractors as “ and onions, garnished with a gym sock” .

56

57 Among the thirty known species in the Durio genus, D. zibethinus is the most prized as a major South

58 East Asian food crop. The three leading durian producing countries are Thailand, Malaysia, and 4 59 Indonesia, with more than 250,000 hectares cultivated in 2008 . Durian is also of significant economic

60 value as it has recently gained market penetrance in China - in 2016 alone, durian imports into China

61 accounted for about US$600M compared to US$200M for oranges, one of China’s other main fruit 5 62 imports . More than 200 different of durian exist, encompassing a range of fruit textures,

63 flavors, and aromas. Distinct regional demands for different cultivars reflect local idiosyncrasies in

64 consumer tastes: pungent and bitter varieties are prized in Malaysia and Singapore (e.g., Musang King),

65 while sweeter cultivars with a mild odor are popular in Thailand (e.g., Monthong). The distinctive odors

66 of different durian cultivars have also been biochemically studied, and characterized as a complex suite 4,6,7 67 of odor-active compounds including sulfur volatiles, esters, alcohols, and acids .

68

69 Despite being an important tropical fruit crop, durian-related genetic research is almost non-existent.

70 Important resources such as genetic maps are not publicly available for durian, and no plant species in

71 the Helicteroideae subfamily have been genome-sequenced and assembled. Close relatives whose

72 genomes are available include only the more distantly-related cash crops within the larger

73 family, i.e., Theobroma cacao (chocolate) and Gossypium (cotton) spp. Besides D. zibethinus, several

74 other Durio species can produce edible fruits (e.g. D. gravolens and D. testudinarum), occupying

75 specific ecological niches where they share important relationships with specific pollinators, primarily 1,8 76 fruit and birds . However, many of these other Durio species are currently listed as vulnerable or

77 endangered. 4

78

79 Here, we report a draft whole-genome assembly of the D. zibethinus ‘Musang King’ , employing

80 both second- and third- generation sequencing. We used the durian genome to assess phylogenetic

81 relationships with other Malvaceae members, and to compare transcriptome data between different plant

82 organs (fruit aril, , stem, and ), and between fruit arils from different cultivars. Genomic and

83 transcriptomic analyses provided insights into the evolution and regulation of fruit-related processes in

84 durian, including , production, and sulfur metabolism. The durian genome provides a

85 valuable resource for biological research and agronomy of this tropical fruit.

86 5

87 Results

88

89 Genome assembly

90 We extracted DNA from a D. zibethinus ‘Musang King’ durian fruit stalk, generating 18.3 million

91 PacBio single-molecule long reads (6.2 kb average read length) corresponding to ~153X coverage of the

92 ~738 Mb durian genome, estimated by k-mer distribution analysis (Supplementary Figure 1,

93 Supplementary Table 1). We performed a PacBio-only assembly using an Overlap-Layout-Conensus 9 94 method implemented in FALCON , and as the durian genome is highly heterozygous possibly due to

95 frequent out-crossing during cultivation, we additionally phased the contigs using diploid-aware 10 96 FALCON-Unzip . The subsequent haplotig-merged assembly was polished using Arrow over three

97 iterations. For details on validation of genome assembly, see Supplementary Note.

98

99 The durian assembly was further refined in two stages using chromosome-contact maps. Chicago 11,12 100 libraries and Hi-C libraries (~280M read pairs each) were used to improve scaffold N50 to 22.7Mb,

101 with the longest scaffold being 36.3Mb (Figure 1a; GC content 32.5%). The final reference assembly

102 comprised chromosome-scale pseudomolecules, with 30 pseudomolecules greater than 10Mb and

103 covering 95% of the 712Mb assembly (Table 1; thereafter referred to as chromosomes, numbered

104 according to size). These figures correspond closely to previous estimates of the haploid chromosome 11 105 number of durian (1n=28, 2n=56) .

106

107 We experimentally estimated the durian genome size about 800 Mb by flow cytometry, close to our k-

108 mer analysis (738 Mb; Supplementary Table 2). K-mer distributions revealed two distinct peaks,

109 indicative of either a diploid or an autotetraploid genome with low heterozygosity (Supplementary

110 Figure 1). Mapping of syntenic regions within the assembly revealed extensive shuffling of syntenic

111 regions, rather than a one-to-one correspondence expected of an autotetraploid genome. SNP calling on

112 the final assembly also yielded a heterozygosity rate of 1.14%, supporting k-mer estimates (1%) and

113 consistent with a highly heterozygous organism.

114

115 We identified and masked 54.8% of the assembly as repeat regions (Figure 1a). LTR/Gypsy repeats

116 were the most abundant, comprising 26.2% of the genome, followed by LTR/Copia (3.2%; Figure 1b, 6

117 Supplementary Table 3). Compared to other genomes in the Malvaceae family, LTR/Gypsy families of

118 repeats appear to have expanded in D. zibethinus (26.2% of genome), G. raimondii (33.1%) and G.

119 arboreum (55.8%) compared to T. cacao (9%). Conversely, LTR/Copia repeats are contracted in D.

120 zibethinus (3.2% of genome) and G. arboreum (5.5%) compared to T. cacao (7%) and G. raimondii

121 (11.1%).

122

123 We annotated the remaining unmasked D. zibethinus genome using a comprehensive strategy combining 13 124 evidence-based and ab initio gene prediction (Figure 1a). Using the Maker pipeline , we incorporated

125 151,593 sequences from four plant species and 43,129 transcripts assembled from D. zibethinus

126 ‘Musang King’ RNA-seq data (described later). 45,335 gene models were identified, with an average

127 coding sequence length of 1.7 kb and 5.8 exons per gene. The vast majority of gene predictions (42,747)

128 were supported by either homology to known , existence of known functional domains, or the 14 129 presence of expressed transcripts (Supplementary Figure 2). Using Blast2GO , we annotated 35,975

130 predicted gene models with Gene Ontology terms. The annotated gene models included processes such

131 as defense response, fruit development, and and lipid metabolism (Figure 1c), which may

132 be of interest in the study of genetic variation between domesticated cultivars.

133

134 To verify the sensitivity of our gene predictions and the completeness and proper haplotig-merging of 15 135 our assembly, we checked core gene statistics using BUSCO . Our gene predictions recovered 1300 of

136 1440 (90.3%) highly-conserved core proteins in the embryophyta lineage, of which 68.1% were single

137 copy and 22.2% were duplicated. We checked if this high duplication number (22.2%) indicated

138 unmerged haplotigs in our assembly, using coverage statistics from the Illumina short reads. Among

139 these duplicated genes, 93.1% had mean read coverage within 1 standard deviation of the mean read

140 coverage in single-copy core genes, showing that these duplicated genes likely exist as independent and

141 distinct copies in the genome (Supplementary Figure 3). The high number of independent duplicate

142 genes is suggestive of a recent whole-genome duplication (WGD) event in the durian lineage, similar to

143 G. raimondii (11.5%) and G. arboreum (12.2%) - both with recent WGDs, versus 1.2% in T.

144 cacao, which has not undergone a recent WGD. Visualization of syntenic blocks between durian and

145 cacao confirmed that chromosome-level blocks in cacao are duplicated across multiple durian

146 chromosomes (Figure 1a). 7

147

148 Comparative phylogenomics reveals durian paleopolyploidy

149 To investigate the evolution of distinct durian traits, we compared the durian genome to ten other

150 sequenced plant species. These included three plants in the same Malvales order (G. raimondii, G.

151 arboreum, T. cacao), six plants in the same clade (Arabidopsis thaliana, Populus trichocarpa

152 [poplar], Glycine max [], Vitis vinifera [grape], Coffea canephora [], Carica papaya), and

153 Oryza sativa japonica (rice) as an outgroup.

154 16 155 Gene family clustering with OrthoMCL identified 32,159 gene families consisting of 327,136 genes

156 (Figure 2a; for clarity, only durian, T. cacao, G. arboreum, G. raimodii and A. thaliana gene families are

157 shown). 607 gene families, consisting of 1764 genes, were unique to durian (Supplementary Table 4).

158 Durian shared the most gene families with the other Malvales (cotton and cacao), consistent with the

159 placement of these three species in the same taxonomic order. Durian, cotton, and cacao also shared

160 similar numbers of gene families with A. thaliana (within 99% of each other), further validating the

161 accuracy and completeness of our gene predictions at the gene family level.

162

163 We derived 47 single-copy orthologous genes among the eleven plant species for phylogeny analysis,

164 taking the intersection of 344 single-copy gene families found by OrthoMCL and 395 single-copy gene 17 17 165 families present across diverse plant species (Supplementary Table 5). We used BEAST2 to generate

166 and date phylogenetic trees based on Bayesian analysis. Consistent with the phylogenetic ordering of 18 167 durian, cotton, and cacao first proposed by Alverson et al , our results suggest that cacao first diverged

168 from the shared durian and cotton lineage 62 – 85 million years ago (MYA; 95% highest posterior

169 density interval), followed by the divergence of durian and cotton at 60 – 77 MYA (Figure 2b). For the

170 other plant species, phylogenetic placement and estimated speciation dates were mostly in broad 19,20 171 consensus with other studies .

172

173 Ancient whole-genome duplications (WGDs, also known as paleopolyploidy events) are widespread in

174 plant lineages, and constitute a powerful evolutionary force for the development of novel gene functions 21 22 175 and the emergence of new species . All core eudicots share an ancient WGD termed the γ-event , and

176 among the eleven plant species analyzed, at least eight additional ancient WGDs have been previously 8

23 177 identified . To investigate WGDs in the durian lineage, we identified syntenic regions across the durian,

178 cotton, and cacao genomes, with each region consisting of at least five collinear homologous genes

179 (Supplementary Table 6). We found 60% of the cacao genome represented by syntenic regions in the

180 durian genome. Of these syntenic regions in durian, 25% were present in one copy, 37% in two copies,

181 36% in three copies, and the remainder (2%) in four or more copies (Supplementary Figure 4). The

182 observation that 75% of genomic regions syntenic between durian and cacao are present at multiple

183 copies in durian strongly suggests that the durian lineage underwent a WGD after speciation from cacao.

184 The Ks (synonymous substitution rate) distribution between syntenic durian genes also exhibited a peak

185 characteristic of a recent WGD (Ks = 0.24; Figure 2c).

186 24,25 187 A WGD has also been previously shown for the cotton lineage, after its speciation with cacao . We

188 investigated if the WGDs associated with cotton and durian reflect the same event or represent two

189 distinct evolutionary events, by resolving the ordering and dates of the WGDs alongside the durian-

190 cotton divergence event. We extracted paralogous pairs of durian and cotton genes arising from their

191 respective WGDs, established orthologous relationships between them, and estimated their phylogenetic

192 relationships and divergence times (Supplementary Table 7). Our analysis showed that paralogs within

193 cotton and durian (originating from the WGD) each pre-dated their orthologs (originating from the

194 durian-cotton divergence; posterior probability of the phylogenetic tree branches ≥ 0.9981; Figure 2d).

195 This provides evidence that durian and cotton share a common WGD that likely occurred before their

196 lineages diverged, and places the known cotton-specific WGD as also part of the durian lineage.

197

198 Upregulated gene expression in sulfur and ripening

199 To investigate biological processes associated with durian fruit traits, we sequenced expressed RNAs

200 (RNA-seq) from different plant organs of the Musang King cultivar, including ripe fruit arils and stem,

201 leaf, and . In addition, we sequenced RNAs from ripe fruit arils of two other durian cultivars

202 (Monthong and Puang Manee). We performed three independent comparative transcriptomic analyses –

203 i) we compared durian transcriptomic data from fruit arils (Musang King, Monthong, and Puang Manee

204 combined, 3 arils each) against root, stem, and leaf transcriptomic profiles (combined); ii) durian fruit

205 arils (Musang King, Monthong, and Puang Manee combined, 3 arils each) against fruits of other plant

206 species combined, specifically Musa acuminata (, SRA259656), Mangifera indica (mango, 9

207 SRA289054), Persea americana (avocado, SRA172282), Solanum lycopersicum (tomato, SRA049915),

208 and corymbosum (, SRA145423); and iii) Musang King fruit arils (3 arils) against

209 Monthong and Puang Manee fruit arils combined (3 arils each).

210 26 211 Gene Set Enrichment Analysis (GSEA) revealed that genes upregulated in durian fruits relative to non-

212 fruit organs were associated with distinct cellular pathways, such as sulfur-related, ripening-related, and

213 flavor-related processes (Figure 3a, Supplementary Table 8). Enrichment in gene sets related to sulfur

214 comprised five clusters, of which acid-thiol ligase enzymes (containing a number of key enzymes in

215 carbon-sulfur reactions and flavonoid production) and methionine metabolic pathways (of which MGL is 27 216 a key member) contained the most significant upregulated leading edge genes. Ripening-related gene 28,29 217 sets included genes regulated by the MADS-BOX transcription factor family , SEPALLATA

218 transcription factor family and ethylene-related genes such as aminocyclopropane-1-carboxylic acid 30 219 synthase (ACS), a key ethylene production enzyme involved in ripening . Other upregulated processes

220 related to taste and odor included triterpenoid metabolism which have been associated with the intense 31 221 bitter taste of Ganoderma lucidum () and lipid-related genes involved in the production of C6

222 volatile compounds (hexanal and hexanol) which have been associated with green odor profiles of 32,33 223 apples, tomatoes and , as well as undesirable rancid odors when present at high levels .

224 Conversely, processes enriched in non-fruit organs included photosynthetic pathways and the regulation

225 of nitrogen metabolism, likely required by roots and to manufacture amino acids.

226

227 To determine if these fruit-enriched processes are specific to durian, we next compared the

228 transcriptomes of durian fruits versus the fruits of five other species. GSEA showed that sulfur-related

229 pathways, lipid oxidation pathways and ripening pathways related to ethylene, and protein degradation

230 were upregulated in durian compared to other fruits; whereas some flavor-related pathways were

231 downregulated (Figure 3b). These results suggest that both sulfur-related pathways and specific ripening

232 processes related to ethylene are likely to be highly-regulated in durian fruits, even compared to other

233 climacteric fruits. A prominent role for these pathways in durian is supported in the literature, as volatile 6,7,34 234 sulfur compounds (VSCs) have been identified as important components in durian odor (see next

235 section). Ethylene production and cellular respiration are also increased in durian fruits during the 10

236 climacteric stage of ripening, and associated with important physiological changes including odor 1,35 237 production, pulp softening, release, and loss followed by dehiscence .

238

239 We also investigated if differences in sulfur, ripening and flavor related pathways might be associated

240 with the distinct and multifarious odor descriptors ascribed to different durian cultivars. Our results

241 revealed that all three pathways were significantly upregulated (Sulfur metabolic process, p = 0.0019;

242 Response to ethylene, p = 0.0009; Flavonoid metabolic process, p = 0.0009) in Musang King compared

243 to Monthong and Puang Manee (Figure 3c). These findings correlate well with the stronger perceived

244 taste and smell of the Musang King cultivar.

245

246 Genomic expansions of volatile sulfur compound genes

247 Durian aroma comprises at least two major groups of volatile compounds: sulfur-containing volatiles,

248 such as thiols, di- and tri-sulfides, which may contribute to the distinct roasty, onion-like odor of durian; 6,7,36 249 and esters, which may contribute to a sweeter, fruity odor . The former group (VSCs) is of particular

250 interest, as its odor descriptors reflect most closely the distinctive nature of the durian aroma.

251 Interestingly, analysis of the durian genome revealed evolutionary gene family expansions of the MGL

252 gene (4 copies in durian, 3 in G. arboreum, 2 in G. raimondii, and 1 in T. cacao, A. thaliana, and O.

253 sativa; Figure 4a). Studies in microbes and plants have shown that MGL is a major functional

254 contributor to VSC biosynthesis, degrading the sulfur-containing amino acids cysteine and methionine 1,37,38 255 into methanethiol, which is further broken down into di- and tri-sulfides (Figure 4b). Phylogenetic

256 analysis of MGL proteins from durian and other species revealed two evolutionary MGL subgroups

257 (posterior probability > 0.9999; Figure 4c). The first group, referred to as MGLa, included singular MGL

258 genes from T. cacao and both cotton species, as well as two durian MGL genes, MGLa1

259 (Duzib1.0C006G000732) and MGLa2 (Duzib1.0C011G000528). This group likely represents the

260 ancestral, conserved MGL gene family. The second MGL group included only the two remaining

261 duplicated copies of MGL from durian, suggesting that it comprises MGL genes that have diverged in

262 sequence after genome duplication. We denoted the MGL genes in this second group as MGLb1

263 (Duzib1.0C010G000484) and MGLb2 (Duzib1.0C010G000488).

264 11

265 To investigate functions of MGLa and MGLb genes in durian, we examined their expression differences

266 between durian fruit arils versus non-fruit organs (stem, root, and leaf). MGLb1 and MGLb2 were

267 consistently upregulated in fruits, with MGLb2 in particular showing a dramatic increase in expression -4 -100 268 (12.73 and 2241 fold increase, p = 2.07x10 and 9.26x10 respectively; Figure 4d; Supplementary

269 Table 9). In contrast, among MGLa genes, MGLa1 exhibited decreased expression in fruits (7.14-fold -3 270 decrease, p = 4.44x10 ). These results suggest a novel function of durian MGLb in durian fruits.

271

272 To investigate potential mechanisms as to how MGL might have expanded from a single copy in the

273 durian-cacao ancestor to four copies in durian, we compared MGL-containing genomic regions in cacao

274 and durian. The two durian MGLa genes were present in two separate scaffolds, with both showing

275 regional synteny to the cacao MGL region. In contrast, the two durian MGLb genes were present in close

276 proximity within the same scaffold, which also displayed synteny with the cacao MGL region (Figure

277 4e). This suggests that MGL initially expanded in the durian lineage via a large-scale genomic

278 duplication event (such as a WGD), followed by a subsequent tandem duplication event that doubled the

279 durian MGLb genes.

280

281 Besides MGL, we also observed significant genomic expansions at the pathway level in gene families

282 related to sulfur metabolism (FDR = 0.032), fatty acid metabolism (FDR = 0.018), and ethylene -7 283 processes (FDR = 1.76 x 10 ) (Figure 4a, Supplementary Table 10). These included the ATP-

284 sulfurylase (APS) family, involved in hydrogen sulfide biosynthesis; cysteine synthase (OASA1),

285 involved in maintaining sulfur levels; the 3-ketoacyl-COA thiolase (THIK) family, involved in the

286 production of lipid volatiles; E3 ubiquitin-protein ligase (XBAT32), a regulator of ethylene biosynthesis;

287 and ethylene receptor (ETR2) and histidine kinase (AHK5), involved in ethylene signaling. Interestingly,

288 some genes in these pathways were expanded in both durian and cotton, supporting the importance of 39 289 ethylene processing in cotton . Together, these results suggest that a WGD event in durian’s lineage led

290 to the expansion and diversification of pathways related to durian volatiles, such as those involved in

291 sulfur processing (including MGL), lipid volatiles, and ethylene (see below). Upregulation of these

292 genes in durian may be linked to increased VSC production, thereby contributing to durian odor.

293 12

294 Potential association between VSC production and ripening

295 In climacteric plant species, ethylene sensing has been shown to act as a major trigger of fruit ripening.

296 Ethylene production is a three step reaction, beginning with transfer of the adenosyl group in ATP to L-

297 methionine by SAM synthase, followed by conversion of L-methionine to 1-aminocyclopropane-1- 21 298 carboxylic acid and ethylene by the enzymes ACS and ACO respectively . Our genomic and

299 transcriptomic findings suggest a potential association between durian odor and fruit ripening, based on

300 the central role of methionine in both processes. In this model (Figure 5a), methionine acts as a

301 precursor amino acid for both VSCs and ethylene via MGL and ACS activity respectively, and both

302 MGL and ACS are upregulated in durian fruits (Figures 4d, 5b). We also witnessed upregulation of other

303 genes plausibly related to durian ripening, including PME40, a pectinesterase that plays an important 40 41 304 role in fruit softening of ; C71BV, a cytochrome P450 upregulated in banana fruits ; and 42 305 BXL1, a beta-D-xylosidase involved in cell wall modification of ripening tomatoes . To study

306 associations between MGL, ACS, VCSs and ethylene, we profiled six durian fruits at different ripening

307 stages (6 arils for each stage): 1 month post-anthesis, 2 months post-anthesis, 2.5 months post-anthesis

308 (ripe), and over 3 days post-abscission (fully-ripe). Quantitative RT-PCR analysis of MGLa and MGLb

309 in durian arils over the post-anthesis period demonstrated a coordinated spike over the ripening peroid,

310 in concert with rising VSCs detected by global mass spectrometry profiling (GC-MS) and rising

311 ethylene levels (Figure 5c, Supplementary Tables 11 , 12 and 13). The availability of global mass

312 spectrometry profiles also allowed us to identify different types of VSCs such as sulfide complexes and

313 disulfide analogs (details of different VSCs are provided in Supplementary Table 11), the most abundant

314 being diethyl disulfide (DEDS), a compound produced as a result of redox reactions from organic thiols.

315 The levels of MGL expression, DEDS, and ethylene remained high for post-abscission fruits over 3 days

316 at room temperature (Figure 5d).

317

318 The dual usage of methionine as a precursor for both durian odor (VCSs) and ripening (ethylene)

319 suggests a rapid sulfur-depleted environment in durian fruits, since i) sulfur is a scarce nutrient, ii) VCS

320 production via MGL is a sulfur sink, and iii) 5-methylthioadenosine (MTA), the byproduct of ACS,

321 contains a reduced sulfur group. Consistent with a low sulfur state, we observed upregulation in fruits of

322 several genes related to sulfur sensing and scavenging, including SDI1, a sulfur sensing gene and the 43 323 upregulation of BGL, a beta glucosidase that cleaves sulfur from glucosinolate . An additional pathway 13

324 for salvaging sulfur from MTA is the Yang cycle, which allows sulfur moieties from MTA to be 44,45 325 recycled back to methionine . In summary, our results suggest that the complex aroma of durian is

326 possibly linked to durian fruit ripening as both ethylene production and VSC production are connected

327 to the same regulatory processes and metabolites. However, we emphasize that in the current study, the

328 potential associations between MGL expression, VSC generation, and ethylene production are

329 correlative, and that further functional experiments will be required to establish a direct causal link

330 between these molecular entities.

331 14

332 Discussion

333 Although well-known as a tropical food delicacy, durian research has been hampered by an absence of

334 genetic resources. The D. zibethinus Musang King draft genome thus provides a useful resource for

335 durian agronomy, and is to our knowledge the first and most complete genome assembly of durian. The

336 D. zibethinus assembly also represents the first from the Helicteroideae subfamily, and is thus also

337 valuable for evolutionary phylogenomic studies, similar to cotton and cacao whose genomes are

338 available and genetic research is active.

339

340 Ancient polyploidy events are important evolutionary drivers for the emergence of new phenotypes and 46 341 new species . Once duplicated, novel or specialized functions can arise in redundant alleles, and new

342 regulatory mechanisms can develop through genomic rearrangements. Our observation that durian likely

343 shares the previously-described cotton WGD has two implications. First, other subfamilies within

344 Malvaceae, after divergence from the Byttnerioideae subfamily (including cacao), are likely to also

345 share the cotton-specific WGD. Second, this WGD event, which has been proposed to have driven the 22 346 evolution of unique Gossypium traits , may also be involved in driving the evolution of unique durian 26,47 347 traits, as well as the radiation of different Malvaceae subfamilies around the same time .

348

349 Our current study focused on analyzing biological processes related to durian odor. VSCs have been

350 identified as major contributors to durian smell, and we observed upregulation of sulfur-related

351 pathways in the durian fruit arils compared against other durian plant organs, and against fruits from

352 other species. At the genomic level, we also identified expansions in gene families in sulfur-related

353 pathways, most noteworthy of which was the MGL gene, which has been shown to catalyze the 37,38,48 354 breakdown of methionine into VSCs in microbes and plants . More specifically, our analysis

355 suggests a distinct role for MGLb genes in the durian fruit, where it was found to be significantly

356 upregulated. MGL functions as a homotetramer where each subunit requires a pyridoxal 5’-phosphate 39,49 357 (PLP) cofactor, with the active sites located in the vicinity of PLP and the substrate binding site .

358 While these active sites are conserved in both durian MGLa and MGLb sequences compared to other

359 plant and bacterial MGL sequences, regions proximal to the active sites, which are conserved in other

360 plant species, exhibited more mutations in MGLb than in MGLa (four versus two respectively; 15

361 Supplementary Figure 5). The impact of these mutations on the function of durian MGL genes requires

362 further investigation.

363

364 Our analysis also suggests a potential association between MGL and ACS in the durian fruit, whose

365 coordination may involve the Yang Cycle. Previous durian studies have also noted correlations between 50 366 VSC production and increased ethylene production , however a genetic link behind these observations

367 has not been demonstrated. It is possible that linking odor and ripening may provide an evolutionary

368 advantage for durian in facilitating fruit dispersal. Certain plants, whose primary dispersal vectors are 36,51,52 369 primates with more advanced olfactory systems, show a shift in odor at ripening . Similarly, durian

370 – by emanating an extremely pungent at ripening, appears to have the characteristic of a plant whose

371 main dispersal vectors are odor-enticed primates, rather than visually-enticed animals.

372

373 The durian genome assembly enables a large scope for further studies. As an example, rapid

374 commercialization of durian has led to a proliferation of cultivars with a wide discrepancy in prices,

375 with little way to verify the authenticities of the fruit product at scale. A high-quality genome assembly

376 may aid in identifying cultivar-specific sequences, including SNPs related to important cultivar-specific

377 traits (such as taste, texture, and odor), and allow molecular barcoding of different durian cultivars for

378 rapid quality control. At a more basic level, such genetic information is vital to better understand durian

379 biodiversity. The Durio genus comprises more than thirty known species, some of which do not

380 produce edible fruit, e.g. D. singaporesis, and others that are outcompeted in nature and face extinction.

381 Further studies will help to elucidate the ecological roles of these important and fascinating tropical

382 plants.

383

384 Acknowledgements and Funding

385 This work was supported by Thorn Biosystems Pte Ltd through funding by an anonymous donor, and by

386 Dovetail Genomics through their End-of-Year Matching Funds Award. We thank all co-authors and

387 colleagues for donating their after-work hours to support this study. We thank Professor Lee Chuen

388 Neng for his support. We thank Agnes Tan from 101 Fruits for their education and guidance in durian

389 cultivars.

390 16

391 Author contributions

392 B.T.T. and P.T. conceived the project. B.T.T., P.T., N.N., and S.S. directed the study and supervised the

393 research. K.L., C.H.Y., C.C.Y.N., N.N., B.T.T., and P.T. designed and interpreted the data, and wrote

394 the manuscript. C.C.Y.N., S.R.R., and V.R. performed the experiments. K.C. and V.K.Y.C. provided

395 support for the experiments. K.L., C.H.Y., N.N., and W.K.L. performed the data analysis. S.G.R.

396 contributed to the data analysis. C.K.O. and P.S.S. contributed to sample collection. All authors have

397 read and approved the final manuscript.

398

399 Competing financial interest statement

400 No competing financial interest declared.

401 17

402 References

403 1. Li, J.X., Schieberle, P. & Steinhaus, M. Characterization of the major odor-active compounds in 404 Thai durian ( Durio zibethinus L. 'Monthong') by aroma extract dilution analysis and headspace 405 gas chromatography-olfactometry. J Agric Food Chem 60, 11253-62 (2012). 406 2. Wallace, A.R. On the and durian of Borneo. Hooker's Journal of Botany 8, 225-230 407 (1856). 408 3. Winokur, J. The Traveling Curmudgeon: Irreverent Notes, Quotes, and Anecdotes on Dismal 409 Destinations, Excess Baggage, the Full Upright Position, and Other Reasons Not to Go There, 410 (Sasquatch Books, 2003). 411 4. Siriphanich, J. Postharvest Biology and Technology of Tropical and Subtropical Fruits, 412 (Woodhead Publishing Limited, Cambridge, UK, 2011). 413 5. UNTradeStatistics. http://unstats.un.org/. 414 6. Jaswir, I., Che Man, Y.B., Selamat, J., Ahmad, F. & Sugisawa, H. RETENTION OF VOLATILE 415 COMPONENTS OF DURIAN FRUIT LEATHER DURING PROCESSING AND STORAGE. Journal of 416 Food Processing and Preservation 32, 740-750 (2008). 417 7. Chin, S.T. et al. Analysis of volatile compounds from Malaysian (Durio zibethinus) using 418 headspace SPME coupled to fast GC-MS. Journal of Food Composition and Analysis 20, 31-44 419 (2007). 420 8. Bumrungsri, S., Sripaoraya, E., Chongsiri, T., Sridith, K. & Racey, P.A. The Ecology of 421 Durian (Durio zibethinus, ) in Southern Thailand. Journal of Tropical Ecology 25, 422 85-92 (2009). 423 9. Yumoto, T. Bird-pollination of three Durio species (Bombacaceae) in a tropical rainforest in 424 Sarawak, Malaysia. American Journal of Botany 87, 1181-1188 (2000). 425 10. Chin, C.-S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. 426 Nat Meth 13, 1050-1054 (2016). 427 11. Lieberman-Aiden, E. et al. Comprehensive Mapping of Long-Range Interactions Reveals Folding 428 Principles of the Human Genome. Science 326, 289-293 (2009). 429 12. Putnam, N.H. et al. Chromosome-scale shotgun assembly using an in vitro method for long- 430 range linkage. Genome Research 26, 342-350 (2016). 431 13. Mangenot, S. & Mangenot, G. Enquête sur les nombres chromosomiques dans une collection 432 d'espèces tropicales. Bulletin de la Société Botanique de France 109, 411-447 (1962). 433 14. Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool 434 for second-generation genome projects. BMC Bioinformatics 12, 1-14 (2011). 435 15. Conesa, A. et al. Blast2GO: a universal tool for annotation, visualization and analysis in 436 functional genomics research. Bioinformatics 21, 3674-3676 (2005). 437 16. Li, L., Stoeckert, C.J., Jr. & Roos, D.S. OrthoMCL: identification of ortholog groups for 438 eukaryotic genomes. Genome Res 13, 2178-89 (2003). 439 17. Duarte, J.M. et al. Identification of shared single copy nuclear genes in Arabidopsis, Populus, 440 Vitis and Oryza and their phylogenetic utility across various taxonomic levels. BMC Evol Biol 10, 441 61 (2010). 442 18. Bouckaert, R.R. et al. BEAST 2: a software platform for bayesian evolutionary analysis. PLoS 443 Comput Biol 10(2014). 444 19. Alverson, W.S., Whitlock, B.A., Nyffeler, R., Bayer, C. & Baum, D.A. Phylogeny of the core 445 Malvales: evidence from ndhF sequence data. Am J Bot 86, 1474-86 (1999). 18

446 20. Magallón, S., Gómez-Acevedo, S., Sánchez-Reyes, L.L. & Hernández-Hernández, T. A 447 metacalibrated time-tree documents the early rise of phylogenetic diversity. New 448 Phytologist 207, 437-453 (2015). 449 21. Li, F. et al. Genome sequence of the cultivated cotton Gossypium arboreum. Nat Genet 46, 450 567-72 (2014). 451 22. Otto, S.P. The evolutionary consequences of polyploidy. Cell 131, 452-62 (2007). 452 23. Jiao, Y. et al. A genome triplication associated with early diversification of the core eudicots. 453 Genome Biology 13, R3 (2012). 454 24. Vanneste, K., Baele, G., Maere, S. & Van de Peer, Y. Analysis of 41 plant genomes supports a 455 wave of successful genome duplications in association with the Cretaceous–Paleogene boundary. 456 Genome Research 24, 1334-1347 (2014). 457 25. Wang, K. et al. The draft genome of a diploid cotton Gossypium raimondii. Nat Genet 44, 1098- 458 103 (2012). 459 26. Paterson, A.H. et al. Repeated polyploidization of Gossypium genomes and the evolution of 460 spinnable cotton fibres. Nature 492, 423-427 (2012). 461 27. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for 462 interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545-50 (2005). 463 28. Joshi, V. & Jander, G. Arabidopsis Methionine γ-Lyase Is Regulated According to 464 Isoleucine Biosynthesis Needs But Plays a Subordinate Role to Threonine Deaminase. Plant 465 Physiology 151, 367-378 (2009). 466 29. Giovannoni, J.J. Fruit ripening mutants yield insights into ripening control. Curr Opin Plant Biol 467 10, 283-9 (2007). 468 30. Seymour, G., Poole, M., Manning, K. & King, G.J. Genetics and epigenetics of fruit development 469 and ripening. Curr Opin Plant Biol 11, 58-63 (2008). 470 31. Burg, S.P. & Burg, E.A. Role of Ethylene in Fruit Ripening. Plant Physiol 37, 179-89 (1962). 471 32. Tsuyoshi Nishitoba, H.S., Takanori Kasai, Hirokazu Kawagishi, Sadao Sakamura. New Bitter C27 472 and C30 Terpenoids from the Fungus Ganoderma lucidum (Reishi). Agricultural and Biological 473 Chemistry 49, 5 (1984). 474 33. Eskin, N.A., Grossman, S. & Pinsky, A. Biochemistry of lipoxygenase in relation to food quality. 475 CRC Crit Rev Food Sci Nutr 9, 1-40 (1977). 476 34. Rowan, D.D., Allen, J.M., Fielder, S. & Hunt, M.B. Biosynthesis of straight-chain ester volatiles in 477 delicious and granny smith apples using deuterium-labeled precursors. J Agric Food Chem 478 47, 2553-62 (1999). 479 35. Ketsa, S. & Daengkanit, T. Firmness and activities of polygalacturonase, pectinesterase, b- 480 galactosidase and cellulase in ripening durian harvested at different stages of maturity. Scientia 481 Horticulturae 80, 181-188 (1999). 482 36. Maninang, J.S., Wongs-Aree, C., Kanlayanarat, S., Sugaya, S. & Gemma, H. Influence of 483 maturity and postharvest treatment on the volatile profile and physiological properties of the 484 durian (Durio zibethinus Murray) fruit. International Food Research Journal 18, 1067-1075 485 (2011). 486 37. Landaud, S., Helinck, S. & Bonnarme, P. Formation of volatile sulfur compounds and 487 metabolism of methionine and other sulfur compounds in fermented food. Appl Microbiol 488 Biotechnol 77, 1191-205 (2008). 489 38. Rébeillé, F. et al. Methionine catabolism in Arabidopsis cells is initiated by a γ-cleavage process 490 and leads to S-methylcysteine and isoleucine syntheses. Proceedings of the National Academy 491 of Sciences 103, 15687-15692 (2006). 19

492 39. Gonda, I. et al. Catabolism of l–methionine in the formation of sulfur and other volatiles in 493 melon (Cucumis melo L.) fruit. The Plant Journal 74, 458-472 (2013). 494 40. Rodrigues, M.A., Bianchetti, R.E. & Freschi, L. Shedding light on ethylene metabolism in higher 495 plants. Frontiers in Plant Science 5, 665 (2014). 496 41. Castillejo, C., de la Fuente, J.I., Iannetta, P., Botella, M.Á. & Valpuesta, V. esterase gene 497 family in fruit: study of FaPE1, a ripening‐specific isoform*. Journal of Experimental 498 Botany 55, 909-918 (2004). 499 42. Asif, M.H. et al. Transcriptome analysis of ripe and unripe fruit tissue of banana identifies major 500 metabolic networks involved in fruit ripening process. BMC Plant Biology 14, 1-15 (2014). 501 43. Itai, A., Ishihara, K. & Bewley, J.D. Characterization of expression, and , of beta-D- 502 xylosidase and alpha-L-arabinofuranosidase in developing and ripening tomato (Lycopersicon 503 esculentum Mill.) fruit. J Exp Bot 54, 2615-22 (2003). 504 44. Zheng, Z.-L., Zhang, B. & Leustek, T. Transceptors at the boundary of nutrient transporters and 505 receptors: a new role for Arabidopsis SULTR1;2 in sulfur sensing. Frontiers in Plant Science 5, 506 710 (2014). 507 45. Baur, A.H. & Yang, S.F. METHIONINE METABOLISM IN APPLE TISSUE IN RELATION TO 508 ETHYLENE BIOSYNTHESIS. Phytochemistry 11, 3207-3214 (1972). 509 46. Miyazaki, J.H. & Yang, S.F. Metabolism of 5-Methylthioribose to Methionine. Plant Physiology 84, 510 277-281 (1987). 511 47. Salman-Minkov, A., Sabath, N. & Mayrose, I. Whole-genome duplication as a key factor in crop 512 domestication. 2, 16115 (2016). 513 48. Soltis, P.S. & Soltis, D.E. Ancient WGD events as drivers of key innovations in angiosperms. 514 Current Opinion in Plant Biology 30, 159-165 (2016). 515 49. Kudou, D. et al. Structure of the antitumour enzyme L-methionine gamma-lyase from 516 Pseudomonas putida at 1.8 A resolution. J Biochem 141, 535-44 (2007). 517 50. Hacham, Y., Avraham, T. & Amir, R. The N-terminal region of Arabidopsis cystathionine 518 gamma-synthase plays an important regulatory role in methionine metabolism. Plant Physiol 519 128, 454-62 (2002). 520 51. Hodgkison, R. et al. Fruit bats and fruits: the evolution of fruit scent in relation to the 521 foraging behaviour of bats in the New and Old World tropics. Functional Ecology 27, 1075-1084 522 (2013). 523 52. Borges, R.M., Bessière, J.M. & Hossaert-McKey, M. The chemical ecology of seed dispersal in 524 monoecious and dioecious figs. Functional Ecology 22, 484-493 (2008).

525

526

527

528

529

530 20

531

532

533 21

534 Figure legends

535 Figure 1. Characterization of the Durio zibethinus genome. (a) Circos plot of multi-dimensional

536 topography of the Durio zibethinus genome (right), comprising 30 pseudomolecules that cover ~95% of

537 the assembly. (I) Density of genes, (II) density of repeat elements, (III) density of GC content, (IV)

538 syntenic regions with Theobroma cacao (left), the closest sequenced relative in the Malvaceae family

539 that did not undergo a recent WGD event. (b) Distribution of repeat classes in the durian genome. (c)

540 Distribution of predicted genes among different high-level Gene Ontology Biological Process terms.

541

542 Figure 2. Comparative genomic analysis of durian with other plants. (a) Shared gene families

543 between durian and three other Malvales, with Arabidopsis as an outgroup. Number in parenthesis

544 indicates durian-specific gene families among all eleven plants considered. (b) Inferred phylogeny tree 53 545 across eleven plant species. Established WGD events placed according to Vanneste et al. (2014) .

546 Posterior probabilities for all branches > 0.9999. (c) Frequency distributions of synonymous substitution

547 rates (Ks) between collinear genes in syntenic blocks, in durian and G. raimondii. Peak boundaries

548 corresponding to recent WGDs derived from Gaussian mixture modeling. (d) Inferred phylogeny tree of

549 paralogous durian and G. raimondii genes originating from their respective WGDs. Italicized numbers

550 represent branching posterior probabilities. The tree topology shows that the divergence of durian and

551 cotton paralogs (derived from WGD) pre-dates the divergence of durian-cotton orthologs (derived from

552 speciation) with high branching posterior probabilities, suggesting a shared WGD occurring before the

553 durian-cotton speciation (60 – 65 MYA, green circle).

554

555 Figure 3. Transcriptome analysis reveals differentially-expressed pathways in durian fruit. 556 Differentially-expressed pathways in (a) durian fruit arils versus stem, root and leaf, (b) durian fruit arils

557 versus five other plant species’ fruits (banana, mango, avocado, tomato, blueberry), (c) Musang King

558 cultivar versus other cultivars’ fruit arils (Monthong and Puang Manee). Circle sizes represent relative

559 sizes of leading edge genes in enriched pathways. The 45-degree diagonal in each plot corresponds to

560 the region where the difference in gene expression is not different between the two groups. Regions

561 above this line depict gene sets that are prominently upregulated in durian fruit.

562

563 Figure 4. Gene family analysis shows expansion in sulfur-related pathways associated with volatile 564 sulfur compounds (VSCs). (a) Expanded durian gene families from sulfur, ethylene, and lipid 22

565 metabolism pathways, including an expansion of the MGL family. (b) The VSC pathway, which breaks

566 down cysteine and methionine into methanethiol and other VSCs, via the MGL enzyme. (c) Phylogeny

567 of MGL genes in durian, cotton, cacao, Arabidopsis, and rice, shows two groups of durian MGL genes,

568 denoted as MGLa and MGLb. Italicized numbers show branching posterior probabilities. (d) MGLb

569 durian genes show upregulation in durian fruit. MSK Musang King, MT Monthong, PM Puang Manee.

570 (e) Three genomic regions containing durian MGLa and MGLb genes show clear synteny with cacao

571 genome, while the two MGLb genes occur in tandem. This suggests the involvement of both WGD and

572 tandem duplication events in the MGL family expansion.

573

574 Figure 5. Yang cycle may link ethylene production and VSC production. (a) The Yang cycle (black

575 arrows) salvages methionine consumed for ethylene production (pink), while sulfur-related processes

576 (green) replenish and consume the sulfur moiety from methionine. (b) Upregulation of ripening-related

577 and sulfur-related genes in durian fruit. MSK Musang King, MT Monthong, PM Puang Manee. (c) Mass

578 spectrometry of VSC diethyl-disulfide (DEDS), qRT-PCR of MGL, and measurement of ethylene across

579 durian fruits at three ripening stages (1 month, 2 months, and 2.5 months post-anthesis) shows

580 coordinated spike towards later ripening stages. Shown are individual samples and mean ± standard

581 error. (d) The increased levels of DEDS, MGL and ethylene remain for post-abscission fruits (fully

582 ripened) over 3 days at room temperature. Shown are individual samples and mean ± standard error.

583 23

584 Tables

585 Table 1. Statistics of D. zibethinus draft genome.

Assembly statistics

Estimated genome size (by k-mer analysis) 738 Mb

Number of scaffolds 677

Scaffold N50 22.7 Mb

Longest scaffold 36.3 Mb

Assembly length 715 Mb

Assembly % of genome 96.88%

Repeat region % of assembly 54.8%

Predicted gene models 45,335

Average coding sequence length 1700.4 bp

Average exons per gene 5.8

586

587

588 Online methods 589 590 Sample collection

591 Durian samples used for genomic (Musang King fruit stalk) and transcriptomic analysis (3 Musang King

592 fruit arils, 3 Monthong fruit arils, and 3 Puang Manee fruit arils, as well as 3 vegetative parts (stem, root

593 and leaf)) were from a plantation in the Bentong region of Pahang, West Malaysia, while durian fruits

594 used for validation studies (mass spectrometry profiling of VSCs, qRT-PCR of MGL, and measurement

595 of ethylene) were from a separate plantation in Sendenak, Johor, West Malaysia at GPS coordinates – 1°

596 41’ 55.8”N; 103° 28’ 57.7”E. For the different ripening stages, we sourced for fruiting trees in this

597 plantation in July 2016 and isolated 3 trees of ages 28, 45 and 45 years respectively. Two durian fruits

598 were collected per tree in each ripening stage – 1 month post-anthesis, 2 months post-anthesis, 2.5

599 months post-anthesis, and post-abscission. In each ripening stage (1 month post-anthesis, 2 months post-

600 anthesis, 2.5 months post-anthesis, and post-abscission), we profiled 6 arils from 6 durian fruits for qRT- 24

601 PCR and mass spectrometry and profiled ethylene for 6 durian whole fruits. For ethylene measurements,

602 samples that did not obtain stable profiles were discarded.

603

604 DNA extraction and library preparation

605 Genomic DNA was extracted from a fruit stalk using Plant DNAzol reagent (Life Technologies)

606 following the manufacturer’s recommendations.

607 SMRTbell DNA library preparation and sequencing with P6-C4 chemistry was performed in accordance

608 with manufacturer’s protocols (Pacific Biosciences). Fifty ug of high quality genomic DNA was used to

609 generate a SMRTbell 20kb library using a 10kb lower end size selection on the BluePippin (Sage

610 Science). Initial titration runs were performed to optimize SMRT cell loading and yield. The genome

611 was sequenced employing 52 SMRT cells on the Pacbio RSII platform (Pacific Biosciences) by

612 sequencing provider Icahn MSSM.

613 The short read genomic library was prepared using the TruSeq Nano DNA Library Preparation Kit

614 (Illumina) in accordance with the manufacturer’s recommendations. Briefly, 1 ug of extracted intact

615 genomic DNA was sheared (Covaris), with 200 – 300 bp insert size, and adapter ligated. The DNA

616 fragments were subjected to PCR cycles, and the DNA library was used in paired-end sequencing

617 (2x100 bp, ~750 M paired-end reads, ~202X coverage) using the Illumina Hiseq 2500 sequencer

618 (Illumina).

619

620 RNA extraction and library preparation

621 RNA-seq experiments were conducted for Musang King fruit arils (three arils, 51.6 M paired end reads

622 each on average), and a combined non-fruit group comprising stem (55.6 M), leaf (56.9 M), and root

623 (53.2 M) profiles. We also sequenced fruit arils of Monthong and Puang Manee cultivars (three arils

624 each, 90 M each on average). Our sample sizes and read depths enabled significance testing of 54 625 differentially-expressed genes and gene sets (by DESeq2 and GSEA, described below).

626 Plant RNAs were extracted using the Purelink RNA mini kit (Life Technologies) following the

627 manufacturer’s recommended protocol.

TM 628 For the construction of RNA-seq libraries, 2 ug of total RNA was processed using the TruSeq RNA

629 Sample Preparation Kit (Illumina) followed by using the Illumina HiSeq2500 platform (Illumina).

630

631 25

632 Genome assembly 633 The main assembly was performed on full Pacbio long reads using two diploid assembly approaches. 55 634 CANU was run with both default and high sensitivity settings, and FALCON with length_cutoff=5000

635 for initial mapping of seed reads for the error correction phase. The CANU assembly produced a less

636 contiguous N50 and did not significantly improve even after applying Redundans, a tool that detects

637 redundant heterozygous contigs and attempts to reduce assembly fragmentation. We used FALCON

638 subsequently because it was better able to phase the diploid genome as well as better contiguity from

639 N50 evaluations. The PacBio reads were error corrected within the FALCON pipeline. Contamination

640 screening was handled with PacBio’s whitelisting pipeline. We subsequently compared a number of

641 length_cutoff values in FALCON for the mapping of seed reads for pre-assembly ranging from 2000 to

642 11000. Based on the contig N50 results, we chose length_cutoff=9500 for the pre-assembly step. We

643 additionally configured DBsplit (-x500, –s400), daligner (-B128, -l4800, –s100, -k18, -h480) and

644 overlap_filtering (--max_diff 100, --max_cov 100) with parameters optimized for our plant assembly.

645 The results from this step were used to perform phased diploid genome assembly using FALCON-Unzip

646 (default parameters). Subsequently the draft assembly was polished using Arrow

647 (https://github.com/PacificBiosciences/GenomicConsensus) over three iterations and finally corrected

648 using Illumina short reads with Pilon.

649 For additional details about genome assembly validation, see Supplementary Note.

650

651 Estimation of genome size and 56 652 We estimated genome size estimation by flow cytometry and obtained an estimate of 800 Mb

653 (Supplementary Table 2). We additionally estimated genome size by k-mer distribution analysis with the

654 program Jellyfish (k=31), using Illumina short reads, and obtained an estimate of 738 Mb. For additional

655 details about ploidy estimation see Supplementary Note.

656

657 Chicago library and HiC library preparation and sequencing

57 658 A Chicago library and a dovetail HiC library was prepared as described previously . Briefly, ~500 ng of

659 HMW gDNA (mean fragment length = 150 kb) was reconstituted into chromatin in vitro and fixed with

660 formaldehyde for the Chicago library and chromatin was fixed in place with formaldehyde in the

661 nucleus for the HiC library. Fixed chromatin was digested with DpnII, the 5’ overhangs filled in with

662 biotinylated nucleotides, and then free blunt ends were ligated. After ligation, crosslinks were reversed

663 and the DNA purified from protein. Purified DNA was treated to remove biotin that was not internal to

664 ligated fragments. The DNA was then sheared to ~350 bp mean fragment size and sequencing libraries 26

665 were generated using NEBNext Ultra enzymes and Illumina-compatible adapters. Biotin-containing

666 fragments were isolated using streptavidin beads before PCR enrichment of each library. The libraries

667 were sequenced on an Illumina HiSeqX to produce 233 million (Chicago) and 280 million (HiC) 2x150

668 bp paired end reads, which provided 379.52X (1-100 kb pairs) and 4,370.55X (100-1000 kb pairs)

669 physical coverage of the genome respectively.

670

671 Scaffolding with HiRise

672 The input de novo assembly, shotgun reads, Chicago library reads, and Dovetail HiC library reads were

673 used as input data for HiRise, a software pipeline designed specifically for using proximity ligation data

674 to scaffold genome assemblies. An iterative analysis was conducted. First, Shotgun and Chicago library

675 sequences were aligned to the draft input assembly using a modified SNAP read mapper. The

676 separations of Chicago read pairs mapped within draft scaffolds were analyzed by HiRise to produce a

677 likelihood model for genomic distance between read pairs, and the model was used to identify and break

678 putative misjoins, to score prospective joins, and make joins above a threshold. After aligning and

679 scaffolding Chicago data, Dovetail HiC library sequences were aligned and scaffolded following the

680 same method. After scaffolding, shotgun sequences were used to close gaps between contigs.

681 Scaffolding with Chicago libraries made 167 breaks and 2,359 joins to the contig assembly. We also

682 additionally used Hi-C libraries to detect 7 breaks and 706 joins.

683

684 Transcriptome assembly

685 RNA-seq reads were preprocessed by trim_galore to remove contaminating sequences from adapters as 58 686 well as sequences with low base quality. They were then aligned to the reference genome using HiSat .

687 Reference guided assembly was performed on the resulting alignment to produce plant organ specific set 59 688 of transcripts using StringTie . A non-redundant set of transcripts observed in all samples assembled

689 was merged using the merge functionality in StringTie.

690

691 Genome annotation

60 692 We generated a de novo repeat library from our assembly using RepeatModeler . We removed de novo -5 693 repeats that matched known Arabidopsis genes (Blastn E < 10 ). We annotated our de novo repeats

694 using the RepeatClassifier module of RepeatModeler. We combined our de novo library with the 61 62 695 RepBase plant repeat database , and ran RepeatMasker on the assembly (default parameters). 27

63 696 We used the Maker genome annotation pipeline for gene prediction. Gene evidence was provided in

697 the form of 151,594 protein sequences from four plant species (Arabidopsis, grape, rice, and soybean),

698 and 178,840 transcripts assembled from our RNA-seq data. For details on how Maker was run, see

699 Supplementary Note.

700 We transferred Gene Ontology annotations to our predicted genes using Blast2GO, incorporating -5 701 sequence homology with Arabidopsis genes using Blastp (best hit with E<10 ) and functional domain

702 search with InterProScan.

703

704 Short read alignment and SNP calling

705 Previously preprocessed Illumina short reads were used to for paired-end remapping to the reference 64 65 706 assembly using BWA-MEM . The resulting alignments were sorted using SAMtools and PCR and

707 optical duplicates were marked and removed using Picardtools (http://broadinstitute.github.io/picard/). 66 708 The processed alignments were then used to infer SNPs using FreeBayes with parameters, -C 5, -m 20

709 –F 0.01. Resulting SNP calls were further filtered of quality less than 20.

710

711 Molecular phylogeny analysis

712 We performed gene family clustering between 11 plant species using OrthoMCL (default parameters) on

713 protein sequences. Molecular phylogeny analysis was performed using a strict set of 47 single-copy

714 orthologous genes (Supplementary Table 5), derived from intersecting 344 single-copy gene families

715 found by OrthoMCL with 395 single-copy gene families published. We codon-aligned each gene family 67 716 with Muscle, and curated the alignments with Gblocks . Phylogeny analysis was performed with 68 717 BEAST2 . For details about model selection for phylogeny, see Supplementary Note.

718

719 WGD analysis

69 70 720 We used the MCScan and the CoGe Comparative Genomics Platform to detect syntenic blocks

721 (regions with at least 5 collinear genes) and calculate Ks rates between syntenic genes. To analyze the

722 durian and G. raimondii WGDs, we first identified paralogous durian and G. raimondii gene pairs

723 originating from their respective WGDs. In each of durian and G. raimondii, we modeled the

724 distribution of Ks rates as a mixture model, and identified syntenic gene pairs falling within the first

725 peak ± 1SD as paralogs likely derived from the most recent WGD event. Next, we identified OrthoMCL

726 gene families consisting of one such paralogous durian pair, one such paralogous G. raimondii pair, and

727 one gene from each of cacao and Arabidopsis. In each of these 62 families, we matched each durian 28

728 paralog with its cotton ortholog (among the two cotton paralogs), by setting the durian-cotton pair with

729 the lowest synonymous distance as orthologs (Supplementary Table 7). We then used BEAST2 to

730 estimate the phylogenetic tree and divergence times of the paralogous and orthologous genes in durian

731 and G. raimondii, along with cacao and Arabidopsis. BEAST2 was run as described above.

732

733 Transcriptome analysis

71 734 Previously aligned RNA-seq data was counted against the predicted gene models using HTSeq-count .

735 Raw counts were fitted on a negative binomial distribution and tested for differential expression using

736 DESeq2. Hidden variables were removed using the sva package and added to the study design prior to

737 running DESeq2. Genes were sorted according to their log2 fold change values after shrinkage in 738 DESeq2 and used for gene set analysis using fgsea’s implementation of GSEA. Significant gene sets

739 were used to perform leading edge analysis. We detected clusters of pathways that shared many leading

740 edge genes using a community detection algorithm, and manually curated these clusters to elucidate the

741 important phenotype associated pathway groups visualized on the bubble plots. Leading edges that also

742 had a fold change of 2 were used to generate bubble plots. DESeq normalized gene expression was used

743 in all bubble plots and heatmaps. For additional details about transcriptomic comparison against other

744 plant species, see Supplementary Note.

745

746 Gene family expansion analysis

747 For gene family expansion analysis, we considered only 6 plants from the OrthoMCL gene family

748 results: durian, two cotton species (G. raimondii and G. arboretum), cacao, Arabidopsis, and rice. We

749 considered a gene family expanded in durian if (i) the number of durian gene members ≥ the number of

750 gene members for every other considered plant, (ii) the number of durian gene members > the number of

751 gene members for cacao, Arabidopsis, and rice, and (iii) the number of gene members for all considered

752 plants > 0. These criteria were chosen to detect gene family expansions in durian, as well as expansions

753 in both durian and the closely-related cotton species. At the pathway level, we considered if each

754 pathway was expanded in durian by checking if its durian gene members were enriched in belonging to

755 expanded gene families (Fisher’s exact test, with Benjamini-Hochberg multiple hypothesis correction).

756

757 Quantitative real-time PCR

758 cDNA synthesis was performed on 200ng of total RNA with SensiFAST™ cDNA Synthesis Kit

759 (Bioline) followed by preAmplification with SsoAdvanced™ PreAmp Supermix (Biorad) according to 29

760 the manufacturer’s recommendations. The cDNA was diluted 10 times and 1 ul aliquot of cDNA was

761 used for each quantitative RT–PCR reaction using SsoAdvanced™ SYBR® Green Supermix (Bio-Rad).

762 All samples were run in triplicate for each primer combination. Thermal cycling was performed on a

763 CFX C1000 System (Bio-Rad) in 96 well plates with an initiation step with 95 °C for 30s, followed by

764 40 cycles of denaturation at 95 °C for 5s and 60°C for 15 s and completed with a melting curve analysis

765 on each reaction to confirm the specificity of amplification. Relative quantification of gene expression

766 was performed using 3 durian housekeeping gene i.e with CAC, DNAJ and APT as reference. The Ct

767 values were determined using CFXTM manager software (BioRad) and exported into MS Excel

768 (Microsoft Inc) for statistical analysis. Real-time efficiencies (E) were calculated from the slopes of

769 standard curves for each gene [E = 10( 1/slope)].

770

771 Mass spectrometry

772 Headspace-solid-phase micro extraction (HS-SPME) was used to identify sulfur-containing volatiles in

773 the durian fruits. Harvested fruits were kept in a room maintained at 25 °C with a relative humidity of

774 80-90%. Five hundred milligrams of each sample aril were ground under liquid nitrogen and was

775 transferred into a 20 ml brown glass head-space vial (Restek). Fifty nmol of the internal standard,

776 Thiophene (Sigma Aldrich) was added to each sample vials and were locked with air-tight headspace-

777 screw caps immediately. An 85-um carboxen on polymethylsiloxane (CAR/PDMS) (Supelco,

778 Bellefonte) was used as the SPME. The SPME fibre was conditioned as per the manufacturer’s

779 instructions. Online headspace sampling, extraction and GC-MS analysis was performed on the Agilent

780 7200 GC-QTOF mass spectrometer equipped with a PAL CTC Headspace autosampler. Headspace

781 extraction and desporption was carried out accordingly: Each sample was incubated for 30 minutes at 40

782 °C, followed by extraction of the volatiles for 30 minutes at 40 °C and 250 rpm. Conditioning was

783 performed for 5 minutes at 250 °C followed by injection into the GC-MS for 1 minute at 230 °C. The

784 injector was operated in a “Split” mode at a ratio of 50:1 and the flow rate was maintained at 1 ml/min.

785 An Agilent 19091S-433HP-5MS 5% Phenyl Methyl Silox (30 m x 250 μm x 0.25 μm) column was used

786 to separate the volatiles. The oven temperature was increased from 40 °C to 200 °C for at 5 °C/min

787 followed by further increase to 280 °C at 40 °C/min for 5 minutes. The total run time was 39 minutes.

788 Other parameters are as follows: Ion source- EI, Source temperature- 230 °C, Electron energy- 70 eV,

789 Acquisition rate- 2 spectr/s, Acquisition time- 500 ms/spectrum, Mass range- 40 to 250 amu. Mass

790 spectra was analysed using the Agilent MassHunter Qualitative Analysis software version B.05.00.

791 Identification of metabolites was done using the NIST version 11 library based on spectral matching.

792 Quantification of sulfur-containing metabolites are expressed as follows: Peak area of metabolite/Peak

793 area of Internal standard/ 0.5 g = Abundance/g. 30

794

795 Ethylene gas measurement

796 Harvested fruits were all kept in a room maintained at 25 °C with a relative humidity of 80-90%. All

797 daily samplings for the post-abscission fruits were conducted at the same starting time to ensure

798 consistency. Each durian fruit was placed in an airtight PVC 8L container fitted with gas sampling ports

799 connected to the ethylene gas analyzer (model F-900, based on electrochemical sensors; Felix

800 Instruments). After 60 minutes, the headspace air was sampled and measured under polarcept settings

801 following the manufacturer’s recommendations. Ethylene measurements were obtained as a mean of 3

802 readings sampled at 10 minute intervals.

803

804 Statistics

805 To determine differentially expressed genes, we used DESeq2 to model gene expression raw gene

806 counts on a negative binomial model, the p-values for a two-tailed test was used with Benjamini-

807 Hochberg multiple hypothesis correction. To determine disregulated pathways based on gene

808 expression, we used GSEA which uses a Kolmogorov-smirnov like statistic, the p-values were derived

809 from a permutation test to generate empirical null distributions. To determine significant pathways with

810 gene family expansion, we used a two-tailed Fisher’s exact test, with Benjamini-Hochberg multiple

811 hypothesis correction. All p-values were considered significant at 0.05 threshold.

812

813

814 31

815 Accession codes

816 All custom scripts have been made available at https://github.com/chernycherny/FSFix. Raw reads and

817 genome assembly have been deposited at the NCBI BioProject under the accession number

818 PRJNA400310.

819

820 Methods-only references

821 53. Lomáscolo, S.B., Levey, D.J., Kimball, R.T., Bolker, B.M. & Alborn, H.T. Dispersers shape fruit 822 diversity in Ficus (Moraceae). Proceedings of the National Academy of Sciences 107, 14668- 823 14672 (2010). 824 54. Ching, T., Huang, S. & Garmire, L.X. Power analysis and sample size estimation for RNA-Seq 825 differential expression. RNA 20, 1684-1696 (2014). 826 55. Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting 827 and repeat separation. Genome Research 27, 722-736 (2017). 828 56. Marie, D. & Brown, S.C. A cytometric exercise in plant DNA histograms, with 2C values for 70 829 species. Biol Cell 78, 41-51 (1993). 830 57. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of 831 occurrences of k-mers. Bioinformatics 27, 764-770 (2011). 832 58. Zaharia, M. et al. Faster and More Accurate Sequence Alignment with SNAP. arXiv, 1111.5572 833 (2011). 834 59. Kim, D., Langmead, B. & Salzberg, S.L. HISAT: a fast spliced aligner with low memory 835 requirements. Nat Meth 12, 357-360 (2015). 836 60. Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq 837 reads. Nat Biotech 33, 290-295 (2015). 838 61. Smit, A.F.A. & Hubley, R. RepeatModeler Open-1.0. 2008-2010. 839 62. Bao, W., Kojima, K.K. & Kohany, O. Repbase Update, a database of repetitive elements in 840 eukaryotic genomes. Mobile DNA 6, 11 (2015). 841 63. Smit, A.F.A., Hubley, R. & Green, P. RepeatMasker at http://repeatmasker.org/. 842 64. Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 843 1236-40 (2014). 844 65. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. ArXiv, 845 1303.3997 (2013). 846 66. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078-2079 847 (2009). 848 67. Edgar, R.C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. 849 Nucleic Acids Res 32, 1792-7 (2004). 850 68. Castresana, J. Selection of Conserved Blocks from Multiple Alignments for Their Use in 851 Phylogenetic Analysis. Molecular Biology and Evolution 17, 540-552 (2000). 852 69. Tang, H. et al. Synteny and Collinearity in Plant Genomes. Science 320, 486-488 (2008). 853 70. Lyons, E. & Freeling, M. How to usefully compare homologous plant genes and chromosomes as 854 DNA sequences. Plant J 53, 661-73 (2008). 855 71. Anders, S., Pyl, P.T. & Huber, W. HTSeq—a Python framework to work with high-throughput 856 sequencing data. Bioinformatics 31, 166-169 (2015). 32

857

858

859

860

861

a

I

II o a III c

a c IV D

. .

T z i b

e

t

h

i

n

u s

b c Nucleic acid CMC-EnSpm 4.0% metabolic hAT-Ac 1.0% process 12% Other DNA 0.9%

6% hAT-Tag1 0.4% Copia PIF-Harbinger 0.3% 7% Small Zisupton 0.3% molecule

10% Defense 0.7% Caulimovirus response metabolic MULE-MuDR 0.2% process Maverick 0.1% Protein hAT-Tip100 0.1% metabolic 5% Fruit Gypsy process 8% development LTR L1 1.0% 48% Repeats Gene 6% Organelle 3% Carbohydrate 30% Unknown RTE-BovB 0.3% Ontology metabolic process L2 0.02% organization 5% development 3% Lipid DNA rRNA 5.0%

LINE

RC metabolic process

Helitron 5.0% 4% Secondary metabolic process Regulation of biological Satellite 1% process 35% Simple repeat 0.2% 2% Cell cycle process tRNA 0.005% SINE 0.0002% s e s t

D. zibethinus e id s o id a a b s 60 - 77 va l G. v arboreu G. arboreum o m 1 - 15 R Ma l Eudi c a 62 - 85 Ma l n 640 G. raimondii alia h T. cacao . t A 94 19 G. β α 307 A. thaliana 970 3406 rai 25 C. papaya 27 246 m 67 82 ondii 779 s P. trichocarpa 190 13 125 10710 282 G. max 46 16 197 γ 22 V. vinifera 2151 358 45 67 64 C. canephora D 90 . 838 165 σ ρ O. sativa z 401 ibethinu(607s ) 721 200 150 100 50 0 Million years ago cacao T. Estimated divergence (95% HPD) Estimated durian WGD WGD (from literature) c D. zibethinus d Recent WGD Cotton-durian G. raimondii (paralog 1) divergence: 55 - 64 1.000

Frequency D. zibethinus (paralog 1) WGD: 60 - 65 1.000 0.0 0.5 1.0 1.5 2.0 Cotton-durian G. raimondii (paralog 2) divergence: 57 - 65 G. raimondii 0.998 Recent WGD D. zibethinus (paralog 2) 1.000 T. cacao Frequency 75 50 25 0 0.0 0.5 1.0 1.5 2.0 Million years ago Synonymous substitutions per site (Ks) Estimated ortholog divergence (95% HPD) Estimated paralog divergence (95% HPD) Estimated durian WGD SULFUR RIPENING FLAVOR HORMONES OTHERS ) 2 12.5 GLUCOSINOLATE RESPIRATION TRANSPORT LIGHT SUGAR.1 PROTEIN SUGAR.3 METHIONINE CELL WALL.2 GIBBERELLIN TRANSPORT.2 10.0 DEGRADATION SUGAR.2 FLAVONOID AUXIN LIPID.2 APETALLA TF LIGHT.1 TRANSPORT.3 FLAVONOID.2 OXIDOREDUCTASE CYSTEINE CELL WALL IMMUNE LIGHT.2 CAROTENOID 7.5 ETHYLENE LIPID LIPID.1 SEPALLATA TF TERPENOID TRANSPORT.1 SULFUR AMINO ACID a PHENYLPROPANOID IAA PHOSPHATE 5.0 FLAVONOID.1 PHLOEM MYB TF XYLEM 2.5 PHOTOSYNTHESIS

0.0

Average gene expression in durian arils (log 0.0 2.5 5.0 7.5 10.0 12.50.0 2.5 5.0 7.5 10.0 12.50.0 2.5 5.0 7.5 10.0 12.50.0 2.5 5.0 7.5 10.0 12.50.0 2.5 5.0 7.5 10.0 12.5

Average gene expression in durian vegetative parts (log2) SULFUR RIPENING FLAVOR HORMONES OTHERS ) 2

METHIONINE PROTEIN DEGRADATION.1 LIPID ETHYLENE SULFUR PROTEIN DEGRADATION 10 AMINO ACID PHENYLPROPANOIDS

ANTHOCYANIN b TERPENOID TRANSPORT.1 GIBBERELLIN LIGHT PHOSPHATE 5 AUXIN ISOPRENOID TRANSPORT PHOTOSYNTHESIS.1

0

Average gene expression of durian arils (log 0 5 10 0 5 10 0 5 10 0 5 10 0 5 10

Average gene expression of other fruits (log2) SULFUR RIPENING FLAVOR HORMONES OTHERS ) 2 10.0

7.5

SUGAR c 5.0 SULFUR AMINO ACID MADS-BOX TF OXIDOREDUCTASE ACTIVITY GLUCOSINOLATE TERPENOID ABSCISIC ACID PROTEIN DEGRADATION CELL WALL PHOTOSYNTHESIS 2.5 FLAVONOID MYB TF IMMUNE RESPONSE TRANSPORT

0.0

$YHUDJHJHQHH[SUHVVLRQLQPVZDULOV ORJ 0.0 2.5 5.0 7.5 10.0 0.0 2.5 5.0 7.5 10.0 0.0 2.5 5.0 7.5 10.0 0.0 2.5 5.0 7.5 10.0 0.0 2.5 5.0 7.5 10.0

$YHUDJHJHQHH[SUHVVLRQLQRWKHUFXOWLYDUV·DULOV ORJ2) a Gene family expansion b VSC pathway

MGL Row Z-score CYSTEINE APS1-4

OASA1 −2 0 2 CGS GLY3 APK3 # CYSTATHIONINE

Sulfur DPNPH SIR CYSK4 CBL SAHH1-2 SAMDC4 HOMOCYSTEINE

XBAT32 HMT1-3 ETR2 METE1-3 AHK5 VOLATILE MPK6 METHIONINE SULFUR EIL3 COMPOUNDS Ethylene ACO4# MGL (VSCs) BXL1

THIK1,2,5 METHANETHIOL #

Lipid ACX1,1.2

T. cacao O. sativa A. thaliana D. zibethinusG. raimondiiG. arboreum #Expanded in Gossypium and durian c d Durian MGL gene expression O. sativa MGL Row Z−Score A. thaliana MGL MGLa1 1 G. arboreum MGL −1 0 1 0.926 G. arboreum MGL MGLa2 1 1 G. raimondii MGL D. zibethinus MGLa1 MGLb1 0.992 1 D. zibethinus MGLa2 MGLb2 0.348 G. arboreum MGL 1 1 0.380 G. raimondii MGL StemRootLeaf MT PM T. cacao MGL MSK D. zibethinus MGLb1 Non-fruit Fruit 0.03 0.921 D. zibethinus MGLb2 Amino acid substitution rate e

D. zibethinus

MGLb1 MGLb2 MGLa2 MGLa1

MGL

T. cacao a

SDI1 DPNPH Sulfur

metabolism S-Adenosylmethionine ACS

Methylthioadenosine MGL

Methionine 1-Aminocyclopropane-1-carboxylate

Methanethiol

Ethylene

Volatile

4-Methylthio-2-oxobutanoate sulfur compounds Ripening

PME40 AGT23 BXL1 ZIP1 CYP71B34

b c MGLa - qRT-PCR (fold) Row Z−Score MGLb - qRT-PCR (fold) DEDS - Mass spec (ratio) Ethylene 40 35 −1 0 1 30 2.0 25 20 ACS 15 10 1.5 5

CYP71B34 0 Ethylene (ppm/kg)

1.0 PME40 Ripening

0.5 BXL1

SDI1 0.0 MGLa MGLb DEDS MGLa MGLb DEDS MGLa MGLb DEDS Month 1 Month 2 Month 2.5 Sulfur DPNPH

d 40 3.5 StemRootLeaf MSK MT PM 35 30 3.0 25 Non-fruit Fruit 20 15 2.5 10 5

0 Ethylene (ppm/kg) 2.0

1.5

1.0

0.5

0.0 MGLa MGLb DEDS MGLa MGLb DEDS MGLa MGLb DEDS Day 1 Day 2 Day 3